Managing and improving European sampling practice
Obtaining good probability samples that represent the population is a key challenge for European cross-national studies. This report gives an overview of the sampling frames used in countries participating in the four cross-European surveys cooperating in SERISS: the European Social Survey (ESS), the European Values Study (EVS), the Generations and Gender Programme (GGP), and the Survey of Health, Ageing, and Retirement in Europe (SHARE). The overview shows where possibilities exist to jointly build and share sampling frames and where studies not using an existing population register can profit from the experience of other studies which do have access to such a register in the same country. It provides a valuable knowledge database of national sampling procedures and accessible population registers across Europe and, in addition, offers a way to improve harmonization of sampling frames and sample data across European surveys. The report is accompanied by an Excel file which provides a full listing of the registers used by the four surveys across 24 countries.
This report addresses the quality of the population registers which are currently being used as sampling frames in countries participating in the four cross-European surveys cooperating in SERISS: the European Social Survey (ESS), the European Values Study (EVS), the Generations and Gender Programme (GGP), and the Survey of Health, Ageing, and Retirement in Europe (SHARE). It summarizes what efforts have been undertaken by register authorities to improve and update the registers and presents an inventory of the main problems encountered in the field by survey sampling experts. In addition, it discusses the quality of alternative methods of sampling and possible improvements.
Learning from administrative data
The potential for using auxiliary or contextual data for sample-based nonresponse adjustments has recently gained more attention. However, identifying and accessing auxiliary data for nonresponse analysis, especially data of sufficiently high quality, presents a challenge. Data availability and access conditions can vary across countries and across organisations. This deliverable provides an inventory of auxiliary data that are available in registers used as sampling frames in the four major cross-European social surveys participating in SERISS: SHARE, ESS, GGP and EVS. Information is based primarily on findings from an expert survey among the researchers of these studies’ country teams. Findings are augmented with information on auxiliary data sources on the European level. In addition to this summary report, an accompanying Excel file provides detailed information about the auxiliary data available in registers used by the four surveys across 24 countries. This resource provides an opportunity to compare and learn from the experiences of the four major cross-national surveys being conducted in Europe today and to identify potential sources of auxiliary data for future use.
Weighting for complex survey designs
This report reviews two broad classes of weighting methods to compensate for nonresponse errors in sample surveys: the nonresponse calibration approach and the propensity score approach. We first show that arbitrary choices of the distance function characterizing the calibration methodology correspond to assuming, at least implicitly, alternative parametric models of the nonresponse process. As a natural extension of the nonresponse calibration approach, we then introduce the propensity score approach, which allows us to improve the robustness of the survey weights by estimating an explicit model for the nonresponse process. Since the choice between these two approaches is not always clear-cut, we also consider a two-step procedure which involves a calibration adjustment in the first step and a propensity score adjustment in the second step. Finally, the report concludes with a discussion of the important role played by the auxiliary information which is available to compensate for nonresponse errors.
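The two adjustment families named above can be illustrated with a minimal sketch. It is not the report's estimator: the propensity step uses weighted response rates within classes as a stand-in for a fitted response model, the calibration step is plain post-stratification (the simplest calibration case), and the sample, variable names and population totals are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample: every unit has a design weight, a response indicator,
# and auxiliary variables known for respondents and nonrespondents alike.
SAMPLE = [
    {"id": 1, "design_weight": 10.0, "region": "A", "sex": "f", "responded": True},
    {"id": 2, "design_weight": 10.0, "region": "A", "sex": "m", "responded": True},
    {"id": 3, "design_weight": 10.0, "region": "A", "sex": "f", "responded": False},
    {"id": 4, "design_weight": 10.0, "region": "A", "sex": "m", "responded": False},
    {"id": 5, "design_weight": 10.0, "region": "B", "sex": "f", "responded": True},
    {"id": 6, "design_weight": 10.0, "region": "B", "sex": "m", "responded": True},
]


def propensity_weights(sample, class_key):
    """Divide each respondent's design weight by the weighted response
    rate of its class (a very simple response-propensity estimate)."""
    total = defaultdict(float)      # weighted sample size per class
    responded = defaultdict(float)  # weighted respondent count per class
    for unit in sample:
        total[unit[class_key]] += unit["design_weight"]
        if unit["responded"]:
            responded[unit[class_key]] += unit["design_weight"]
    weights = {}
    for unit in sample:
        if unit["responded"]:
            rate = responded[unit[class_key]] / total[unit[class_key]]
            weights[unit["id"]] = unit["design_weight"] / rate
    return weights


def calibrate_to_totals(sample, weights, stratum_key, totals):
    """Rescale weights so they sum to known population totals per stratum."""
    stratum_of = {unit["id"]: unit[stratum_key] for unit in sample}
    sums = defaultdict(float)
    for unit_id, weight in weights.items():
        sums[stratum_of[unit_id]] += weight
    return {
        unit_id: weight * totals[stratum_of[unit_id]] / sums[stratum_of[unit_id]]
        for unit_id, weight in weights.items()
    }


# Propensity adjustment by region, then calibration to assumed totals by sex.
pw = propensity_weights(SAMPLE, "region")
cw = calibrate_to_totals(SAMPLE, pw, "sex", {"f": 33.0, "m": 27.0})
```

In this toy example the order of the two steps is chosen only for illustration; either function can be applied on its own.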
Including the institutional population
The large European social surveys usually exclude residents living in institutions from their samples. In 2016, the SERISS project started to investigate the possible consequences of this exclusion. The project examines the feasibility of sampling and surveying the institutionalized population. This report introduces the first release of an inventory of surveys that include the institutional population and describes the contents and sampling approaches of national and cross-national surveys that interviewed institutionalized respondents. Moreover, the report advances a detailed definition of institutions and the institutionalized population and briefly describes the size and statistical distinctiveness of this subgroup in European countries.
Survey experiments to compare two translation approaches: the ‘stay close to the source’ approach & the adaptive approach
D3.1 – Standards for the implementation of the two survey translation approaches: the ‘stay close to the source’ approach & the adaptive approach
Under SERISS an experiment has been set up to test two different approaches to questionnaire translation. The so-called ‘ask-the-same-question’ approach, a rather close translation method that has been the basic rule in most multilingual surveys, will be tested against an approach that allows a higher degree of adaptation. The major surveys have so far not applied the adaptive approach out of concern that too large a distance between the source and the target versions may hamper comparability between the different language versions.
This experiment will be carried out in two languages: Estonian in Estonia and Slovene in Slovenia. In each of these countries, three translation teams will work on a total of 60 questionnaire items, subdivided into three sets of 20 items each, to be translated from English into the national language according to both methods. An elaborate research design has been worked out in order to minimise team effects: each item will be translated three times into each of the languages, twice following one method and once following the other method. The resulting translations will be fielded on the CRONOS web panel (WP7) in late 2017.
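One allocation that satisfies the constraints described above (three teams, three sets of 20 items, each item translated three times, two translations under one method and one under the other) can be sketched as a simple rotation. This is an illustration only, not the allocation used in the deliverable: the team and set labels, the rotation rule, and the choice of which method is applied twice are all assumptions.

```python
TEAMS = ["team_1", "team_2", "team_3"]

# Items 1-60 split into three consecutive sets of 20 (hypothetical numbering).
ITEM_SETS = {
    f"set_{s + 1}": list(range(20 * s + 1, 20 * s + 21)) for s in range(3)
}


def allocate(teams, item_sets):
    """Every team translates every set; for each set exactly one team
    (rotating across sets) uses the adaptive method, the other two the
    close method."""
    plan = []
    for s, (set_name, items) in enumerate(item_sets.items()):
        for t, team in enumerate(teams):
            method = "adaptive" if t == s % len(teams) else "close"
            plan.append(
                {"set": set_name, "team": team, "method": method, "items": items}
            )
    return plan


# The same plan would be applied once per target language.
PLAN = allocate(TEAMS, ITEM_SETS)
```

Under this rotation each team applies the adaptive method to exactly one set, so method and team are not fully confounded.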
Feasibility of applying computational linguistic methods to survey translation
D3.5 – First SERISS Symposium on synergies between survey translation and developments in translation sciences. Minutes of the expert meeting
This report summarises the discussion of the expert meeting organized as part of the “1st SERISS Symposium on synergies between survey translation and developments in translation sciences” which took place on 1-2 June 2017 at Universitat Pompeu Fabra (UPF) in Barcelona. The expert meeting brought together 15 experts from the public and private sectors in the fields of comparative survey methodology, computational linguistics, and translation and language sciences to discuss how current advances in computational linguistics could potentially improve survey translation.
Comparative assessment of thesaurus key words
This deliverable describes two methods that were used to assess the translation quality of the European Language Social Science Thesaurus (ELSST). ELSST is a social science multilingual thesaurus that was developed to aid cross-language information retrieval of social science datasets, including cross-national survey datasets, in the Consortium of Social Science Data Archives (CESSDA) data portal. It is available in 12 languages, including English, the source language, from which target language versions are derived.
The first evaluation method is re-translation (referred to in this deliverable as ‘back-translation’), which is a standard evaluation method used for assessing the translation quality of thesauri and related vocabularies. A subset of French and German ELSST terms were back-translated, and results analysed to detect errors in the target language terms, including unintended ambiguity. The second evaluation method compares the set of ELSST index terms in all languages that were assigned to the same cross-national datasets by different CESSDA archives and associated archives. Differences in the sets of index terms assigned were analysed to see if they were due to differences in the interpretation of the terms by the indexers who assigned them.
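The second evaluation method, comparing the index terms that different archives assigned to the same dataset, amounts to a set comparison. The sketch below assumes that terms in every language have already been mapped to language-independent concept identifiers; the archive names, the identifiers and the use of a Jaccard-style overlap score are hypothetical choices, not taken from the deliverable.

```python
def compare_index_terms(assignments):
    """assignments maps an archive name to the set of concept identifiers
    it assigned to one dataset. Returns the terms shared by all archives,
    the terms used by only one archive, and an overlap score."""
    union = set().union(*assignments.values())
    shared = set(union)
    for terms in assignments.values():
        shared &= terms
    unique = {}
    for archive, terms in assignments.items():
        others = set().union(
            *(t for other, t in assignments.items() if other != archive)
        )
        unique[archive] = terms - others
    overlap = len(shared) / len(union) if union else 1.0
    return {"shared": shared, "unique": unique, "overlap": overlap}


# Hypothetical assignments for one cross-national dataset.
RESULT = compare_index_terms(
    {
        "archive_x": {"c1", "c2", "c3"},
        "archive_y": {"c2", "c3", "c4"},
        "archive_z": {"c2", "c3"},
    }
)
```

Terms that appear under `unique` are the candidates for closer inspection, since they may reflect differing interpretations by the indexers.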
These guidelines describe how to manage the content of the European Language Social Science Thesaurus (ELSST), following international standards and best practice. They are aimed primarily at ELSST translators and content developers, but will also be of interest to end-users. The guidelines take account of ISO 25964-1, the latest international standard on thesaurus construction, published by the International Organization for Standardization, as well as the work of other knowledge organisation experts and thesaurus developers.
Updating the Translation Management Tool
Programmers from CentERdata, University of Tilburg have been working with several survey infrastructures to develop the Translation Management Tool (TMT), a web environment that centrally stores translations for large international multilingual studies. The TMT was originally developed for the Survey of Health, Ageing and Retirement in Europe. Under SERISS the TMT has been adapted to support other large scale studies that need to translate survey questionnaires including ESS, EVS and GGP. The program has been modularized for easier adaptation to other research infrastructures.
This document provides further information on the tool and details of how it can be configured for different surveys. A video demonstration of the tool and user guides for the ESS and EVS configurations of the TMT are available online.
Testing the Translation Management Tool (TMT) in ESS Round 8
D4.3 – Testing specifications for real-time testing of Translation Management Tool (TMT) in ESS Round 8
The Translation Management Tool (TMT) is a web-based tool specially designed to allow translators to translate questionnaires without the burden of understanding complex routing and programming codes for large multi-lingual questionnaires. Developed by CentERdata (University of Tilburg), it has been used in several large international studies including the Survey of Health, Ageing and Retirement in Europe (SHARE). Under SERISS the TMT has been extended to accommodate European Social Survey (ESS) translation procedures, which use TRAPD (Translation – Review – Adjudication – Pretesting – Documentation) as well as verification and SQP coding. The extended TMT was tested by three countries when carrying out their translations for ESS Round 8 (2016/17). This deliverable provides details of how those involved in the translation process, particularly the national teams and the verifiers CApStAn, were briefed on how to use and test the TMT.
Extending SHARE’s fieldwork management tools
SHARE has developed a set of tools, the Sample Distributor/Sample Management System (SD/SMS), that are used to manage all fieldwork processes. These distribute addresses to interviewers, monitor their contact attempts, install the correct version of the questionnaire, give access to pre-loaded data, record fieldwork success, and transmit the data back to the agency and central coordination. This toolbox is very complex since it has to accommodate many different fieldwork situations, different survey instruments according to respondent type, different languages, and varying levels of electronic hardware/software sophistication in different countries. Under SERISS, SHARE is rewriting the tools to make the adaptation to different situations more transparent and easier to handle, and to facilitate their take-up by other surveys.
Researchers on the Survey of Health, Ageing and Retirement in Europe have been working with programmers from CentERdata, University of Tilburg to develop a new tablet-based Sample Management System (tablet SMS) to replace the current server/desktop/laptop version. The main reason for this update is that the current client-server based software (originating from 2002) is outdated. Replacing it by modern web-based applications will reduce maintenance cost, will make the software easier to set up and use (and provide it as a service), will open up the use of the software on modern devices like tablets and smartphones, and finally, through designing and developing the software in a more generic way, it opens up the possibility for other cross national survey projects like ESS, EVS and GGP to use the software. The prototype tablet SMS is available to view online.
A Fieldwork Management and Monitoring System (FMMS) for ESS
A key challenge facing social surveys, especially those operating cross-nationally, is to monitor and manage fieldwork effectively. As part of the SERISS project, a new electronic fieldwork management and monitoring system (FMMS) has been developed for the European Social Survey. The FMMS consists of two components: a mobile app to be used by interviewers in the field to manage their caseloads and complete contact records on the doorstep and a centralised case management system (CCMS) to manage the exchange of data between the survey agency and interviewers and maintain a central database which can be used for fieldwork progress monitoring.
D4.7 – Fieldwork Management and Monitoring System (FMMS) for the European Social Survey: Report on the feasibility of cross-national implementation
The FMMS offers clear potential benefits in terms of providing ESS stakeholders with access to consistent and timely data on fieldwork progress. However, there are implementation issues which need to be addressed before the FMMS can be rolled out cross-nationally. This deliverable reports on the results of a consultation exercise carried out with ESS National Coordinators and survey agencies to identify potential legal, technological and organisational barriers to implementing the FMMS on the ESS. Issues were identified around the need to transfer and store personal data across national borders, a lack of handheld mobile devices available within agencies, insufficient IT and end-user support in some countries, and a lack of support for a centralised monitoring tool among many ESS survey agencies, particularly those with their own well-developed in-house monitoring systems.
D4.8 – Fieldwork Management and Monitoring System (FMMS) for the European Social Survey: Report on test case scenarios
Researcher testing of the FMMS was conducted in June 2016 to test how far the tool currently under development meets the business needs of the ESS and to detect missing or defective functionalities. Issues identified during this first round of testing were then addressed as far as possible by developers prior to retesting in September 2016. Feedback from researcher testing fed into a final version of the FMMS prototype scheduled for finalisation and validation in early 2017.
D4.9 – Fieldwork Management and Monitoring System (FMMS) for the European Social Survey: Report on interviewer testing of the final mobile application and central database
Classroom based testing of the FMMS app and data exchange with the CCMS took place with UK-based ESS interviewers in October 2016. Its purpose was to gather end-user feedback on whether the app’s workflow and user interface met interviewer needs.
Building a survey network for Europe
The ‘SERISS Survey Experts Network’ is a series of workshops thematically based around SERISS work packages. The aim of the workshops is to bring together survey practitioners and researchers (e.g. representatives from national statistics institutes, cross-national European surveys, survey agencies and survey methodologists) in order to facilitate a productive exchange of knowledge and practices in state-of-art survey research, to initiate a discussion on how to tackle specific challenges in survey methodology and data harmonisation, and to encourage future cooperation between different organisations.
This report provides a summary of the first Survey Network workshop ‘Representing the population in surveys’ which draws on work done in SERISS Work Package 2 dealing with sampling approaches and challenges in cross-national surveys. The workshop took place in December 2016 and was hosted by Munich Center for the Economics of Aging (MEA). The main purpose of the first workshop was to review sampling practices across Europe, to exploit synergies to be gained from exchanging knowledge and to discuss possible cooperation in gaining better access to registers or sharing sampling frames across surveys.
The second Survey Experts Network Meeting took place at the University of Amsterdam, 4-5 September 2017. The main purpose of the workshop was to bring together researchers, survey practitioners (e.g. cross-national survey infrastructures, commercial survey agencies, representatives of non-profit organisations conducting social surveys) and other stakeholders (e.g. national statistics institutes, employment agencies) involved in designing, coding and analysing socio-economic questions in order to demonstrate coding and harmonisation tools developed under SERISS, to offer the participants an opportunity to try out the tools during the workshop, and to provide feedback and suggestions for tool upgrades and training materials.
Modern research in the social sciences increasingly requires recording innovative variables such as objective health information. The inclusion of so-called biomarkers, i.e. objective measures of biological and physical functions, has recently gained importance in field surveys as a supplement to traditional self-reported survey data. However, while being able to complement population-based survey data collection with biomarkers is of great scientific value, the collection of these types of data is associated with legal and ethical challenges.
This deliverable addresses central legal requirements and ethical issues related to the collection of Dried Blood Spot (DBS) samples and provides a synopsis of policy rules for collecting biomarkers in social surveys. By describing experiences when implementing the collection of biological samples in various European countries and Israel as part of the SHARE study, this deliverable also demonstrates the concrete legal and ethical challenges associated with the collection of DBS samples cross-nationally.
In recent years, there has been an upsurge in the use of biological specimens as objective health measurements in socio-economic surveys. High participation rates in the collection of biomarkers are desirable to enhance the statistical power by increasing the number of observations available for statistical analyses. Consent rates may depend on many factors. This report looks at the consent rates for the collection of dried blood spots (DBS) samples in the context of the sixth Wave of the Survey of Health, Ageing and Retirement in Europe (SHARE) and focusses on two of them: different legal or ethical requirements in the participating countries and the expectations of SHARE interviewers regarding the success of the DBS collection. The analyses of these factors in correlation with actual consent rates suggest that the interviewers’ expectations have been more important for the success of the blood collection in terms of higher consent rates than specific deviations regarding the collection process based on legal and ethical requirements. This is an interesting result for survey practitioners who are concerned with the integration of the collection of biomarkers in socio-economic surveys: whereas legal or ethical requirements cannot be changed easily, the expectations of the interviewers can be influenced by good interviewer training.
The Cross-National Online Survey (CRONOS) panel designed and implemented as part of the SERISS project is the first attempt to establish a probability-based cross-national online panel following an input harmonisation approach. CRONOS is a pilot study to evaluate the effectiveness of panel recruitment off the back of an existing cross-national face-to-face survey (European Social Survey Round 8) in terms of costs, sample representativeness, participation rates, panel attrition over time and data quality. It serves as a ‘proof of concept’ from which to produce a blueprint for an online probability-based panel that a range of cross-sectional survey infrastructures could consider adopting in the future.
To inform the development of CRONOS seven existing general population web-based random probability panels were reviewed: LISS (Longitudinal Internet Studies for the Social sciences, The Netherlands), GIP (German Internet Panel, Germany), GESIS Panel (Germany), ELIPSS (Étude Longitudinale par Internet Pour les Sciences Sociales, France), NCP (Norwegian Citizen Panel, Norway), ATP (American Trends Panel, USA) and FFRISP (Face-to-Face Recruited Internet Survey Platform, USA). This report summarises the findings and recommendations from that review.
One of the main goals of this pilot study is to explore the challenges associated with cross-national recruitment and implementation. This deliverable summarises the recruitment plans and decision process related to setting up CRONOS; these plans were informed by a literature review, practical and empirical evidence from similar projects, and cooperation with the numerous survey experts and organisations involved in the project.
SERISS WP8 develops a cross-country harmonised, fast, high-quality and cost-effective coding module for the core variables: Occupation, industry, employment status, educational attainment and field of education. The module uses a large multi-lingual dictionary with tens of thousands of entries about job titles, industry names, fields of education and training, and employment status categories. Additionally, the module will include country-specific, structured lists of educational qualifications. The module will provide up-to-date codes to classify the variables, using international standardized classification systems.
For further information see www.surveycodings.org.
This deliverable includes the XML software for the survey questions to produce a coding module for five core socio-economic variables: Occupation, industry, employment status, educational attainment and field of education. It also includes the code for those survey answers not available in an API.
Deliverable D8.14 summarizes all survey questions and answers for the five core variables and their translation into 47 languages. Deliverable D8.15 shows how the survey questions and answers appear in web survey mode.
Many questionnaires have a question “Please write the main business activity of the organisation where you work”. The answer is commonly collected in an open text field, leaving the survey holder to code the response into an industry classification. Alternatively, in web surveys respondents can self-identify their industry from a database.
This paper builds on previous work of the author for the industry question in the WageIndicator web survey (since 2001). For this survey a database and a search tree was developed, all coded according to the European Community NACE Statistical Classification of Economic Activities, commonly referred to as NACE Rev. 2.0. WageIndicator did not aim for a question with a short aggregated list, because of the related aggregation bias whereby respondents may classify their detailed industry into different aggregated categories. Therefore, a database of industry names and a two-level search tree was developed, offering 300 industry categories to the survey respondents. The database has English and national industry names. The national labels are shown to the survey respondent.
For deliverable D8.10 “Database of industries + explanatory note” the WageIndicator database was extended to 47 languages, suitable for use in 99 countries. The deliverable’s accompanying database consists of the industry codes, the English master label and the national labels for all 99 countries and 47 languages. Note that the codes correspond fully to the 3- or 4-digit codes of NACE Rev. 2.0. For any database, current Internet technologies allow for an API (Application Programming Interface), which means that the database can also be used offline, for example on tablets. From February 2017 the industry database will be available.
Many questionnaires have a question “Please write the main business activity of the organisation where you work”. Experience of the WageIndicator survey suggests respondents tend to skip the question about industry relatively more often compared to other questions, presumably because they judge answering the question as cognitively too demanding. An occupation>industry prediction tool has therefore been developed, providing survey respondents with a limited set of industries from which to pick based on their stated occupation. The report explains how the predictions were arrived at whilst the accompanying database gives the predictions derived for all 4 digit ISCO codes.
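The idea of an occupation>industry prediction can be sketched as a lookup built from previously observed occupation and industry pairs. The sketch below ranks industries by how often they co-occur with an occupation code; whether the deliverable derived its predictions this way is an assumption, and the observations and code values shown are illustrative placeholders rather than entries from the accompanying database.

```python
from collections import Counter, defaultdict


def build_predictor(observations):
    """Count how often each industry code co-occurs with each
    occupation code in (occupation, industry) pairs."""
    counts = defaultdict(Counter)
    for occupation, industry in observations:
        counts[occupation][industry] += 1
    return counts


def predict_industries(counts, occupation, k=3):
    """Return up to k industry codes most often seen with the occupation,
    most frequent first; an unseen occupation yields an empty list."""
    ranked = counts.get(occupation, Counter()).most_common(k)
    return [industry for industry, _ in ranked]


# Illustrative placeholder pairs of (4-digit occupation code, industry code).
OBSERVATIONS = (
    [("2221", "86.10")] * 3
    + [("2221", "87.10")] * 2
    + [("2221", "86.90")]
    + [("2330", "85.31")] * 2
)
COUNTS = build_predictor(OBSERVATIONS)
SHORTLIST = predict_industries(COUNTS, "2221", k=2)
```

A respondent who reported the occupation would then be shown only the shortlist, with a fallback to the full search tree when the list is empty or none of the options fits.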
Employment status is a measure of an individual’s position in the labour market. The International Labour Organisation (ILO) maintains the International Classification by Status in Employment (ICSE). In 1993 ICSE was defined as a classification with six categories. In 2013, the ILO planned to revise the classification, but its statisticians postponed final decision making until 2018. This paper develops survey questions for the measurement of ICSE-93. It then adds greater detail to the initial six categories, building on the suggestions proposed in 2013. The revised ICSE classification is a three-level classification with the six ICSE-93 groups at the first level, eight categories at the second and 13 at the third level. Measuring the full revised ICSE classification requires 28 variables. These are detailed in the paper.
This deliverable reports on survey questions designed to measure the revised ICSE classification. These survey questions and answers have been translated into 47 languages, facilitating the measurement of the ICSE classification in 99 countries. The deliverable provides the coding scheme and the syntax needed to convert the data from the survey questions into the revised ICSE classification.
Socio-economic status (SES) is a measure of an individual’s economic and social position. Over the past decades numerous studies have elaborated the measurement and its effect on a set of outcomes, with a predominant focus on the United States and the United Kingdom. In the early 2000s the European Socio-Economic Classification (ESeC) was developed. The 2008 revision of the ISCO occupational coding challenged the ESeC classification, and Eurostat called for an update, which was called the European Socio-Economic Groups (ESeG-2014). The ESeG-2014 classification is a two-level classification of nine groups and 42 subgroups, to ensure a quick and uncomplicated implementation in all statistical sources. Four variables are needed to measure ESeG-2014, notably the two core variables ISCO08 occupation and employment status (employee / self-employed), and two additional variables for people not in paid employment, notably status (retired / student / disabled) and age.
This report details survey questions designed to measure ESeG-2014 at a detailed two-digit level. These survey questions and answers have been translated into 47 languages, facilitating the measurement of the ESeG-2014 classification in 99 countries. The one-digit ESeG-2014 classification can be measured with fewer survey questions. The deliverable provides the coding scheme and the syntax needed to convert the data from the survey questions into the ESeG-2014 classification.
This report summarizes all survey questions and answers used to produce a coding module for five core socio-economic variables: Occupation, industry, employment status, educational attainment and field of education. For each variable separate deliverables detail the arguments why the questions and answers are phrased as proposed. This report has an accompanying database that contains the translations of all survey questions and answers detailed in this deliverable. Translations are provided for 99 countries in 47 languages.
Building on D8.14, this deliverable shows how the module to collect data on the five core socio-economic variables (occupation, industry, employment status, educational attainment and field of education) appears when programmed in a web survey.
This deliverable reports on a workshop which took place to introduce coding tools developed under SERISS to a wider group of researchers, survey practitioners (e.g. cross-national survey infrastructures, commercial survey agencies, representatives of non-profit organisations conducting social surveys) and other stakeholders (e.g. national statistics institutes, employment agencies) involved in designing, coding and analysing socio-economic questions, to offer them an opportunity to try out the tools during the workshop and to provide feedback and suggestions for tool upgrades and training materials.
Presentations from the workshop are included in an appendix.
Measuring social networks
D8.20 – Name generator and questionnaire items for Social Network Module
D8.21 – Translated name generator and questionnaire items
Social networks are the collection of personal ties that individuals variously maintain and from which they gain a range of benefits, supports and services. Given the significance of the social network construct for both science and policy, SHARE is developing a unique module for the measurement of social networks that can serve as a model for other surveys. The SHARE Social Network Module (SN) is based principally on the approach that was employed in the National Social Life, Health, and Aging Project in the United States in 2005-2006 (Cornwell et al., 2009). The module applies a name-generating mechanism in which respondents identify the people who are important to them and then add information on each person named (via “name interpreter questions”). It also allows the tracing of changes in respondents’ social networks over time and is programmed to avoid respondents having to duplicate information provided.
D8.20 describes the basic structure of the name generator, as it is developing, and its mode of operation.
D8.21 showcases the production of the name generator tool in a wide range of languages that are spoken in Europe. The SHARE country teams have translated the SN module into the different national languages of the participating countries. This deliverable presents the generic questionnaire in English and the available translations from Austria, Belgium (French and Dutch), Switzerland (German, French and Italian), the Czech Republic, Germany, Denmark, Spain, Spain (Girona), France, Croatia, Italy, Luxembourg (German, French and Portuguese), Poland, Sweden and Slovenia.