Relevance & Research Question: One of the NRW-Innovativ projects aims to fill a gap in communication studies by creating a harmonized longitudinal dataset (from 1954 onward) on media use in Germany, based on the Media-Analysis data. The relevance of the project lies in making large-scale media use data accessible for academic research with high standards of data documentation. The research question is therefore: how can the Media-Analysis data, as big data, be made accessible for academic research in a transparent way?
Methods & Data: This paper presents the theoretical and practical approaches used over the course of this project, together with the digital harmonization software CharmStats. The goal of the harmonization was to create a scientific use file that sets excellent documentation standards with the help of CharmStats, and to continue the harmonization already completed for the data up to 2009. We review the challenges encountered and the solutions developed as a case study in large-scale data harmonization. There are two harmonized datasets in total, each with more than 1.5 million cases and almost 30,000 variables, covering over 60 years for press media and almost 40 years for radio. The Media-Analysis data can therefore be regarded as the largest dataset on media use in Germany available to academics.
Results: The aim of the project is to make the complex process of harmonizing large-scale data as transparent and replicable as possible. CharmStats makes it possible to meet the project's goals, as it produces syntax for data harmonization as well as a report for documentation. In the presentation we will portray the steps taken to reach the project's goals and to answer the research question:
- Finding a structure to work with
- Setting standards for data documentation with CharmStats
- Producing a harmonized dataset
- Making the dataset replicable and, moreover, an accessible and sustainable source for academic research through the Library of Online Harmonization (scheduled for release in 2019)
Added Value: The methodological approach of this project can serve as a use case for documenting and harmonizing big data for academic research.