Constructing and evolving a knowledge graph about the recruitment and labour market domain is a big challenge, not only because the domain is quite broad but also because it is very heterogeneous (different industries and business areas, languages, labour markets, educational systems etc.) and changes at a very fast pace. Equally challenging is the task of incorporating and deploying such a graph in an organization’s existing products, and for its existing customers, in a way that achieves only positive disruption. In this presentation we reflect upon our journey so far in developing and putting to use a large labour market knowledge graph at Textkernel, a leading company in the semantic recruitment software sector, and we share relevant best practices and lessons learned that are applicable to any domain, not just the recruitment one.
In the years since Google announced that their knowledge graph allowed searching for things, not strings, knowledge graphs have been gaining momentum in the world’s leading organisations as a means to integrate, share and exploit the data and knowledge that they need in order to stay competitive. At Textkernel, we have been developing and using such a knowledge graph for the recruitment and labour market domain for the last couple of years, aiming to significantly improve the way our semantic software modules parse, retrieve and match CVs and Vacancies.
Over these two years, we started with the inception and specification of the knowledge graph, continued with its design and population, and reached an important milestone with the graph’s first pilot deployments in production settings. Reflecting on this journey, we have identified several best practices, lessons learned, and challenges concerning the task and process of building, applying and evolving knowledge graphs in organizational settings, some of which we would like to share with the SEMANTiCS community.
Our proposed presentation has two parts. In the first part, we describe the organizational context of our knowledge graph (what we build and why) and the three key principles we follow for its development: 1) Scope and structure definition based on business and system needs, 2) Content population based on data-driven ontology mining, and 3) Human-in-the-loop for quality assurance and continuous improvement. In the second part, we focus on two important lessons that we learned while developing and using the graph, and describe how we dealt with them:
1) Vagueness and subjectivity are (necessary) features, not bugs: Much of the domain knowledge that a knowledge graph contains is most likely vague and subjective, i.e., it can be interpreted differently by different users and in different contexts. This is a reality that, when not acknowledged and tackled, can make the knowledge graph unusable in the real world.
2) Domain semantics are not always directly usable by applications: While a knowledge graph may contain concepts and relations that represent the meaning of the domain very well, the applications that are to use this knowledge are often designed to work with knowledge that is seemingly similar to it, yet in many ways incompatible with it. This calls for a more inclusive consideration of applications in knowledge graph development, and a balancing act between representing the domain accurately and satisfying the different application requirements.
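To make the two lessons more concrete, the following minimal Python sketch shows one possible way to treat them together. It is an illustration under stated assumptions, not a description of how the Textkernel graph is actually modelled: the `Relation` fields `context` and `agreement`, the `view_for` helper, and all example statements and scores are hypothetical. The idea is that a vague relation carries the context in which it is meant to hold and a measure of how much reviewers agreed on it, and that each application consumes only the view of the graph that fits its own requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    subject: str
    predicate: str
    obj: str
    context: str      # hypothetical: where the statement is meant to hold
    agreement: float  # hypothetical: share of reviewers who accepted it (0..1)


def view_for(relations, context, min_agreement):
    """Select the relations suitable for one application.

    A relation is kept when it holds in the requested context (or in
    any context) and reviewer agreement reaches the given threshold.
    """
    return [
        r for r in relations
        if r.context in (context, "any") and r.agreement >= min_agreement
    ]


# Made-up example statements; the scores are illustrative only.
GRAPH = [
    Relation("Java", "is_skill_for", "Software Developer", "any", 0.95),
    Relation("Data Scientist", "similar_to", "Data Analyst", "search", 0.70),
    Relation("Data Scientist", "similar_to", "Statistician", "search", 0.55),
]

# A recall-oriented application may tolerate vaguer relations ...
search_view = view_for(GRAPH, context="search", min_agreement=0.5)
# ... while a precision-oriented one may accept only well-agreed ones.
matching_view = view_for(GRAPH, context="matching", min_agreement=0.9)
```

In this sketch the graph itself keeps every statement, including the vague ones, and the balancing act between domain accuracy and application needs is moved into the per-application selection step.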