News Release

Critical text mining in historical newspapers from Luxembourg, Germany, France and Switzerland

Impresso: Media monitoring of the past

Grant and Award Announcement

University of Luxembourg

Luxembourg, 4 July 2017 - The aim of the project "Impresso: Media monitoring of the past. Mining 200 years of historical newspapers" is to link digitised corpora of newspapers from Switzerland, Luxembourg, France and Germany and to develop new methods to analyse them.

Over the next three years, the Luxembourg Centre for Contemporary and Digital History (C2DH) at the University of Luxembourg will work with the DHLAB at the École polytechnique fédérale de Lausanne (EPFL) and the Institute for Computational Linguistics at the University of Zurich on this project. The project will receive 1.7 million Swiss francs (1.55 million euros) in funding from the Swiss National Science Foundation (SNSF).

Improve usage of digital technologies for research

Historical newspapers represent a wealth of archival material, and many have already been digitised. However, conducting research using these sources raises a number of problems, including insufficient text searchability as a result of poor text recognition and missing metadata, the relative isolation of digitised newspapers within their respective archives, search functions that are difficult to use, and poorly designed user interfaces. Recent progress in text analysis has also opened up new possibilities for conducting research on large collections of texts.

The project will develop "deep learning" methods, drawing on a subfield of machine learning, in order to correct errors in text recognition, improve the identification of people, institutions and places, and enhance this entity recognition using external data repositories. The C2DH will be responsible for developing a user interface that will incorporate new search functions and facilitate the critical analysis of the newspaper corpora. This may include providing information on the provenance of the data and the quality of automatically generated annotations, as well as indicating any gaps in the inventory.

A comprehensive and collaborative project

To boost the relevance of the project for history, the humanities and social sciences in general, the C2DH will coordinate a series of workshops that will provide a forum for users and developers to exchange their ideas. "Further links between history, computer science and design will be developed via an associated C2DH-based research project on resistance to European unification in the late 19th and early 20th centuries", explains Dr Marten Düring, who coordinates the project at the University of Luxembourg. "Finally, the project will also be used for University teaching, giving young scholars the opportunity to explore automated methods for the extraction and representation of information from historical sources."

The project will not only lead to academic publications; at the end of the project, the individual processing, analysis and storage systems will also be made available on an open source basis for others to reuse and develop.

Associated project partners include the Luxembourg National Library, the Swiss National Library, the Swiss newspapers Le Temps and Neue Zürcher Zeitung, Swiss archives, and researchers from the University of Lausanne. In Luxembourg the project will be coordinated by Dr Marten Düring, Dr Lars Wieneke and Prof. Dr Andreas Fickers, in collaboration with Daniele Guido and Estelle Bunout.


Notes to editors

For further information, please contact: Dr Marten Düring, T. +352 46 66 44 9029

Copyright for the photo: © University of Luxembourg / Michel Brumat

The Sinergia programme

The SNSF's Sinergia Programme offers exclusive support to interdisciplinary collaborative research groups working on pioneering research. Projects are eligible for funding under Sinergia if they draw on theories and methods from two or more disciplines, with a similar degree of importance being attached to all the disciplines involved, and if the respective partners can provide complementary skills and knowledge.
