
Big Textual Data: Lessons and Challenges for Statistics

Murtagh, Fionn (2017) Big Textual Data: Lessons and Challenges for Statistics. In: SIS 2017 Statistics and Data Science: new challenges, new generations. Proceedings of the Conference of the Italian Statistical Society. Firenze University Press, Florence, Italy, pp. 719-730. ISBN 978-88-6453-521-0

PDF - Published Version
Available under License Creative Commons Attribution.



At issue are a few early-stage case studies relating to: research publishing and research impact; literature, narrative and foundational emotional tracking; and social media, here Twitter, with a social science orientation. Central relevance and importance are attached to the following aspects of analytical methodology: context, leading to the use of semantics; focus, motivating homology between fields of analytical orientation; resolution scale, which can incorporate a concept hierarchy and aggregation in general; and acknowledgement of all that is implied by the expression "correlation is not causation". Application areas are: research publishing and qualitative assessment; narrative analysis and assessing impact; and baselining and contextualizing, statistically and in related aspects such as visualization.

Item Type: Book Chapter
Additional Information: Conference keynote presentation.
Uncontrolled Keywords: mapping narrative, emotion tracking, significance of style, Correspondence Analysis, chronological hierarchical clustering
Subjects: H Social Sciences > H Social Sciences (General)
H Social Sciences > HA Statistics
Q Science > QA Mathematics
Q Science > QA Mathematics > QA76 Computer software
Z Bibliography. Library Science. Information Resources > ZA Information resources
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources
Schools: School of Computing and Engineering



Depositing User: Fionn Murtagh
Date Deposited: 09 Aug 2017 13:58
Last Modified: 09 Aug 2017 14:08


