Semantic Annotation, Representation and Linking of Survey Data

Video Forum

Semantic technologies offer significant potential for improving data search applications. Ongoing work thrives to equip data catalogs with new semantic search features to supplement existing keyword search and browsing capabilities. In particular within the social sciences, searching and reusing data is essential to foster efficient research. In this paper, we introduce an approach and experimental results aimed at improving interoperability and findability of social sciences survey items. Our contributions include a conceptual model for semantically representing survey items and questions, detailing meaningful dimensions of items, as well as experimental results geared towards the automated prediction of such item features using state-of-the-art machine learning models. Dimensions of interest include, for instance, references to geolocation and time periods or the scope and style of particular questions. We define classification tasks using neural and traditional machine learning models combined with sentence structure features. Applications of our work include semantic and faceted search for questions as part of our GESIS Search. We also provide the lifted data as a knowledge graph via a SPARQL endpoint for further reuse and sharing.

Video