WP8: A coding module for socio-economic survey questions

Work package leaders:

Stephanie Stuck



Kea Tijdens

University of Amsterdam, WageIndicator Survey

Occupation, industry, employment status, educational attainment and field of education are core variables in many socio-economic and health surveys, as are the size and intensity of social networks.

However, their measurement is cumbersome, not sufficiently standardised and often expensive. This work package develops a cross-country harmonised, fast, high-quality and cost-effective coding module for these variables.

The module uses a large multi-lingual dictionary with tens of thousands of entries about job titles, industry names, and fields of education and training. Additionally, the module includes country-specific, structured lists of educational qualifications and employment status categories, and provides up-to-date codes for these classifications.

It thereby supports surveys in the ESS, GGP and SHARE countries and their associated networks, and serves infrastructures that reach a global audience. In total the module covers 34 languages servicing 99 countries, including the most widely spoken language groups outside the EU28 area (such as Russian, Mandarin, Arabic, Hindi and Bahasa).

WP8 tasks

The WP is divided into eight tasks:

Programming the module

This task involves the programming of a web-based module to capture all required variables across multiple platforms, using search tree navigation or semantic text matching techniques. The module will incorporate API databases from other tasks.
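As an illustration only, the lookup step of such a module can be sketched with approximate string matching, which is a simpler stand-in for the semantic text matching mentioned above and not the project's actual implementation. The dictionary entries, the codes (OCC-001 and so on), the function names and the threshold are all hypothetical placeholders.

```python
from difflib import SequenceMatcher

# Hypothetical mini-dictionary: job title -> occupation code.
# Titles and codes are illustrative placeholders, not entries
# from the WP8 database or from any official classification.
DICTIONARY = {
    "primary school teacher": "OCC-001",
    "secondary school teacher": "OCC-002",
    "software developer": "OCC-003",
    "registered nurse": "OCC-004",
}


def normalise(text):
    """Lower-case the answer and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def suggest(answer, limit=3, threshold=0.6):
    """Return up to `limit` (title, code) pairs, best match first.

    Similarity is the difflib ratio between the normalised answer and
    each dictionary title; candidates below `threshold` are dropped.
    """
    query = normalise(answer)
    scored = [
        (SequenceMatcher(None, query, title).ratio(), title, code)
        for title, code in DICTIONARY.items()
    ]
    scored = [item for item in scored if item[0] >= threshold]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(title, code) for _, title, code in scored[:limit]]


if __name__ == "__main__":
    # A misspelt free-text answer still finds the intended entry.
    print(suggest("Software Developper"))
```

In a survey setting the returned candidates would typically be shown to the respondent for confirmation rather than coded automatically; a search tree would instead let the respondent drill down through nested categories.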

Compile the API-database of occupations

This task aims to compile a multilingual ‘training set’ for developing machine learning algorithms to code occupation, drawing on job titles and codes from a range of sources. Codings will be checked, corrected and annotated in collaboration with IER.

Compile the API-databases of educational attainment and field of education

This task aims to compile country-specific databases of educational attainment for 99 countries and a general database of field of education and training. Both will draw on existing resources from the CAMCES project, UNESCO and Eurostat.

Compile the API-database of industries

This task extends WageIndicator’s database of industries from 80 to 99 countries. It also draws on resources such as Forbes and Fortune to compile a database of company names to aid text matching in the API.

Compile the API-database of employment status

This task extends WageIndicator’s employment status Q&A into a database covering 99 countries, with coding schemes to ICSE-93 and ESEG-2014.

Design survey questions

This task aims to design survey questions for the socio-economic variables covered in Tasks 8.2 to 8.5 (occupation, education, industry and employment status), as well as auxiliary variables required for occupational and socio-economic class coding.

Consultation and dissemination

The API will be offered widely and free of charge to survey practitioners inside and outside research infrastructures. Additionally, this task will organise newsletters and webinars to involve stakeholders, run a workshop for intending users of the coding module and collaborate with GESIS on training events.

Measuring social networks

Building on previous work by SHARE, this task involves designing a name generator for cross-national surveys that allows respondents to describe the intensity of contact they have with named individuals. This will in turn feed standard classifications of network type.

Key Outputs

– Database of industries and employment status to support cross-national coding of core socio-economic variables
– Documentation of SHARE’s Social Network module