How can man and job be matched for the perfect date?

It is extremely difficult to match one person with another by using technology to send them on a date. There will be numerous factors and expectations that have to be taken into account. Do they have similar interests? Are they living in the same area? What are their goals? And then there are plenty of hidden expectations regarding things such as appearance. Matching has always been a complex task.

The same is true when it comes to bringing together the right person for the right job. Even for specialists with years of experience, matching jobs and skills is a huge challenge. Who works well with what? How can you be sure of making a good decision? Every day, such questions have to be correctly answered in order to be able to successfully match person to job. This requires thorough knowledge and good information. The expectations of employers and potential employees are high. Can a machine or an algorithm more than satisfy these expectations?

How to match this complex data? Source: Getty Images.


Is good matching possible?

Firstly, let’s determine whether good matching is even possible. Matching is the act of combining complementary attributes of two entities, in our case job and person. However, even in this context the word ‘matching’ can have various meanings. In some jobs, whether a candidate is suitable for a given job is merely a question of whether he or she is able to work. If you are physically healthy, for example, you should be able to pick strawberries. There are other jobs, however, that require a variety of certificates, specializations and experience. Try to match a neonatal surgeon to a job in a hospital department and this becomes clear.

Although HR specialists realize that the tiniest details have to be considered during the matching process, their task remains highly complex. This is because the prevailing conditions are constantly changing. Requirements that were commonplace yesterday no longer apply today, and in turn today’s requirements will no longer be valid tomorrow. How we define job, prospective employee and labor market shifts all the time. Who would have needed a Director of Digital Development a few years ago? And who would have cited such a specialization in his or her CV?

Matching becomes far more complex when a machine has to deal with the task. A machine has to apply all the experience and knowledge of the specialist in the same way, pay attention to the smallest details and react to changes in the labor market. Suppliers of such machines focus on different data in order to overcome this highly complex problem. For example, former job titles of applicants or their skills are taken into account. An algorithm then compares job requests and CVs, and a match is made. Successful?


Bricklayer equals bricklayer. Sales consultant equals sales consultant?

Some algorithms, as we have heard, will make a match based on former job titles. If the candidate had position X at company A, he or she can also hold position X at company B, no? This may have held true in the past, yes. We used to be general practitioners, secretaries, lawyers, bricklayers, etc. Today we are sales consultants, data ninjas, facility managers, etc. Is a sales consultant someone who works in a retail shop and advises customers? Or someone who prepares offers, takes up orders and negotiates contracts with customers? Such questions are already being asked by specialists when they look at CVs. And now machines should be able to make such distinctions efficiently.

Job titles are therefore often too generic. Or too specific, as internal company terms influence job titles and therefore tend to describe functions. Nowadays, everyone is some kind of manager. Without a more detailed description of the jobs we would often be lost and would not know whether an applicant is really suitable for a position – or vice versa.


Comparing skills

A job title is not enough for good matching. So other job matching providers solve the matching problem by using other parameters – they look at skills and competences, since these represent the ‘content’ behind descriptions of sometimes cryptic job titles. Skills-based or competence-based matching is more meaningful and promising because it takes into account not only a title previously held by an applicant, but also that person’s knowledge, talents, insights and education. Thus, one considers the candidate’s skills and the skills required for a job, and matches them.

This sounds logical: I want a manager who is open-minded, communicative, strong in leadership and good at solving problems. I find someone who outlines such qualities in his/her résumé and thus corresponds to my criteria. So, are skills now a reliable factor for machines to evaluate the perfect match for my vacancy?

Let’s take a closer look at skills. Skills result from knowledge. Aristotle said that knowledge is the absolute truth. Absolute truth can only be attained if one has experienced and tested the knowledge oneself. Knowledge that I have acquired from others through communication and study must be verified and therefore is not necessarily the absolute truth. If someone tells me something new, how can I be sure it is true?

So, as long as I have not experienced this new knowledge – and applied it accordingly – it remains incomplete. There is no doubt that good education is of great value, but until I know how someone has used the acquired knowledge, it is not proven and does not give me the opportunity to benefit from it. Only when it has been tested does it give me an advantage or a certain scope for action and, to some extent, power.

Let’s return to my manager who is open-minded, communicative, strong in leadership and good at solving problems. Couldn’t it be that our potential candidates are managers in the construction, finance or clothing industry? Without their experience, the vacancy would probably have been matched to all three positions, although each job requires its own industry insights. There is a lack of relevant experience to put the skills into a meaningful context.


Real knowledge needs experience

This has been recognized by other job matching experts. The criterion of skills is not sufficient for good matching. If I want to match a job seeker with a certain profession, I cannot only take into account knowledge of the person’s skills based on his/her CV and cover letter. I will also need to know about experience. Only with experience can relationships and industries be developed.

In addition, no one lists only skills in a CV; very often there is other relevant information that can contribute to a good match. Similarly, in a job advertisement, a company does not specify all the skills it is looking for – and this is a hindrance to matching. If a job advertisement appears for a “Data Scientist,” the employer will probably not mention “IT usage” or “data processing,” as he/she will assume that such skills are evident from the job title. Likewise, a data scientist would probably list more specific skills in his or her CV than those associated with previous job titles. But if that person is to be matched according to skills, then information relevant to this matching parameter will be missing.

If we only matched on the basis of skills, I’m sure we would get different results than if we just compared job titles. However, such an approach is ultimately not good enough to guide people to jobs, applicants to positions and employees to employers. We need more.


Good education does not mean good manners

Knowledge of skills and experience cannot determine whether the new copywriter will fit into the team well, or whether the new nurse will arrive at the hospital on time, or whether the new procurement officer will negotiate well. Who would confide in a CV nowadays that he/she is a poor team player or is unreliable? Yet it is precisely these soft skills and the personality of the applicant that are incredibly important for a good match. A consultant must be punctual for a customer appointment, whereas a programmer can keep flexible working hours. Likewise, the programmer’s appearance is of less importance than that of the consultant. However, if the consultant is unable to speak openly to customers, his company will soon lose them. Accordingly, a match only becomes truly successful if the applicant’s personality is also considered. My CV details a wide range of things I have done, but how I have done them is also crucial.


Pulling together?

Now, even if this CV fits that vacancy perfectly, it is not yet certain that we will have a perfect match. After all, the skills and personality of a new employee have to complement a network of skills and personalities of colleagues. If I’m the only software engineer in a company, I have to be an all-rounder and take the initiative with ease. If I am hired into a team with two others – one of them more familiar with field X, the other with field Y – our skills complement each other and the collaboration creates something completely new. I can ask for help more often, and at the same time I am expected to fit in well with the team. The colleagues involved also influence the perfect match. To be precise, the CVs of the staff have to be matched as well.

Whoever still thinks that you can match an employee to a job by taking into account just one parameter (job title, skills, experience or personality) may realize that this can only work well if you are very lucky. If an algorithm is supposed to solve such a complex problem, the chances of a successful match are like those of finding the proverbial needle in the haystack.

So, are we at the end of the road?

Not yet. Confucius made the statement, “Experience is like a lantern in the background; it always illuminates only the piece of road that we already have behind us.”

We have tested our knowledge and used it to gain advantages for ourselves and others; we may be punctual and reliable. We are in possession of the required soft skills. This means that we are sure to secure our ongoing business. All deadlines are met, the customers are treated well, and the employees occupy their seats punctually each morning. Everything should be settled by now.


What truly strengthens a business?

But if everyone always conforms to the requirements, then business remains “only” secured. We don’t create anything new. Creating something new calls for good knowledge and often a great deal of experience. Above all, however, one needs – literally and semantically – creativity.

The Cambridge dictionary defines creativity as “The ability to produce original and unusual ideas, or to make something new or imaginative.”¹ Basically, by going beyond knowledge and experience, creativity gives us a third way of looking at something, which one could call “thinking outside of the box.” As an approach, creativity is therefore less artistic than rule-breaking: radical reworking, thinking outside the box, or throwing the box out altogether. The creative act produces something new, different and maybe a little frightening.

Albert Einstein said that “Creativity is intelligence having fun,”² so a creator is someone who enjoys turning the business around rather than someone who merely meets a catalogue of requirements. Creativity is of the greatest value in times when a great deal of change is happening. After all, anyone who simply adapts during the course of digitalization will not keep up, and certainly will not move forward. We need the people who keep an overview. We need the employees who secure the business. And we also need people who show us new ways of doing things, especially nowadays. Creativity is the most important skill today.

Creativity, intuition, emotions and anything contrary to logical, analytical and rational thinking (which could be considered analogous to knowledge and experience) are often attributed to the right side of the brain. You may have heard the theory that people think more with either the left or right side of the brain. However, researchers have found out that this is a myth. Even if some functions may be attributed more to one side of the brain, results are greatest when both sides of the brain work together in complex networks.³

If I want to create a new product, knowledge of the production processes and the materials required helps me. My experience in planning a new product also helps me. My organizational talent supports the process. But the idea of creating a new product results from my creativity. So, if you are good at something, then you get the best results because all factors are involved at the same time: knowledge, experience, personality and creativity.


Farewell to the notion of the perfect match

Let’s put it in a nutshell: matching cannot be competence-based, skills-based, or based on an ad hoc approach because the problem is too complex. Matching is driven by expectations and expectations change constantly.

Accordingly, there is simply no such thing as a perfect match, because it is impossible to overcome expectations. Expectations are very subjective and can never be fulfilled equally well for all. So we can only evaluate all the factors as far as possible in order to get as close as possible to the perfect match.

The results of today’s culture of matching with data shreds, such as a few skills or cryptic job titles, will undermine the quality of machine matching again and again. Matching with data shreds is a stab in the dark. Those who believe that they can match data fragments with arbitrary keywords will never approach the perfect match. As we have mentioned, such an approach ignores other parameters that are crucial for high-quality allocation.

With complex algorithms you can only create the greatest possible approximation if you distance yourself from data fragments and try to include all factors, as does the brain when creating something new: skills, experience, personality and, if treated appropriately, also former job titles. The machine takes all these criteria into account, evaluates them in turn, and gives each a weighting. If these criteria are represented with an adequate weighting, a good starting point will have been reached to bring person and job together using technology. All determinants, including expectations, are aligned and the chances for the perfect match are thereby optimized.
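The weighting idea can be illustrated with a minimal sketch. It assumes that each criterion has already been scored between 0 and 1 by some upstream comparison; the weight values and the function name `match_score` are purely hypothetical and are not taken from any actual matching product.

```python
# Hypothetical weights for illustration only; the article gives no values.
WEIGHTS = {
    "skills": 0.35,
    "experience": 0.30,
    "personality": 0.20,
    "job_title": 0.15,
}


def match_score(criterion_scores, weights=WEIGHTS):
    """Return the weighted average of per-criterion scores in [0, 1].

    Criteria missing from criterion_scores count as 0, so a match built
    on a single data fragment can never reach a high overall score.
    """
    total_weight = sum(weights.values())
    weighted_sum = sum(
        weight * criterion_scores.get(name, 0.0)
        for name, weight in weights.items()
    )
    return weighted_sum / total_weight


# A candidate who matches on job title alone versus one who matches broadly.
title_only = match_score({"job_title": 1.0})
broad = match_score(
    {"skills": 0.8, "experience": 0.7, "personality": 0.9, "job_title": 0.5}
)
print(title_only < broad)
```

With these assumed weights, a perfect job-title match on its own contributes far less than a reasonable fit across all four criteria, which is the point the paragraph above makes about data fragments.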

Even with JANZZ.technology’s well-designed, developed and improved matching processes, it is difficult to consider all factors to the correct extent. Expectations can be mapped on a large scale, but one part always remains hidden. For example, if unemployed people are to be placed in positions, a large part of the expectation is that they will actually be employed. If engineers are to be matched, there is the expectation that the salary band will correspond with that in previous positions. Further expectations can be mapped if it is clear that they exist. Accordingly, we can only approximate the perfect match. However, we are not fumbling in the dark with data fragments. The process may not end with the perfect date – but maybe with an invitation to another one.



¹ Cambridge Dictionary (2017). Creativity. Accessed from: [2017.11.02].

² Einstein, Albert (1930). Mein Weltbild. Wie ich die Welt sehe.

³ Nielsen JA, Zielinski BA, Ferguson MA, Lainhart JE, Anderson JS (2013). An Evaluation of the Left-Brain vs. Right-Brain Hypothesis with Resting State Functional Connectivity Magnetic Resonance Imaging. PLoS ONE 8(8): e71275.

Sahoo, Anadi (2017). Knowledge, Experience & Creativity. Accessed from: [2017.11.03.].


ESCO: We expected an ontology – we got a disappointing term collection

Almost four years have passed. We waited for a long time, curious to see what the EU had so grandly announced and eager to find out whether it would solve the well-known problems of classification systems. The European Union’s classification for occupational data is called “ESCO” (European Skills, Competences, Qualifications and Occupations). Up to now, each state has maintained its own classification, such as ROME in France, KLdB in Germany or CP in Italy. These are usually based on the mother of all classifications, the International Standard Classification of Occupations (ISCO), created by the International Labour Organisation around 1960, but they are not necessarily comparable: the classifications can differ in their numbers, letters and taxonomy levels.


Other classification systems were developed first and foremost for statistical purposes. They made it possible to gather occupations into groups by identification number and then compile statistics, but they did not deepen the understanding of the individual occupations. The groupings were often far too broad, too generic. For example, all medical specialists are grouped together and this group is described with only one set of skills for all specialists. This means that an oncologist is described as having exactly the same skills as a gastroenterologist, gynecologist or pathologist. So, according to the taxonomies, they have exactly the same knowledge; their specializations can only be recognized by their job titles. With such inaccurate descriptions you certainly can’t understand individual job titles any better.

The EU did not want to build ESCO as another far too vague skeleton but rather create a common understanding of occupations, skills, knowledge and qualifications across 26 languages so that employers, employees and educational institutions better understand each other’s needs and requirements. In this way, freedom of movement could make up for skills gaps and unemployment in the different member states, as Juncker says¹.

Almost four years of work have gone by since the trial version. All possible stakeholders were involved, such as employment offices, career advisors, statisticians and scientists, to create this classification in 26 languages. Almost four years of testing, extending, modifying, reworking… And now I’m sitting here at my PC, typing “Word” as a skill into the online database, and the database does not recognize the term. The only alternative suggestion is WordPress, which is not really related. If I type “PowerPoint”, nothing happens at all: the database does not recognize the term, it is not stored².

All right, let’s try Indeed. In Germany alone, I find more than 13000 job advertisements with the search term “PowerPoint”, and about 8000 in France and the United Kingdom. Yet PowerPoint is not classified as a skill across Europe: it has no place among the 13485 skills in ESCO. Is an employee supposed to understand a potential employer better by concluding that PowerPoint is not an important skill for employment?

Admittedly, the database does recognize “use microsoft office” when “Microsoft” is entered, but its semantic understanding goes no further. After all, “use word processing software” is stored as a stand-alone skill with no connection to Microsoft Office, and neither of the two skills is suggested as a synonym of the other.
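To make the missing link concrete, here is a minimal sketch of how differently worded skill terms could be mapped to one shared concept. The alias table, the concept labels and the function `normalize_skill` are hypothetical and do not reproduce ESCO’s data or any real ontology.

```python
# Hypothetical alias table: surface terms (lower-case) -> shared concept.
SKILL_ALIASES = {
    "word": "word processing",
    "ms word": "word processing",
    "microsoft word": "word processing",
    "word processing": "word processing",
    "powerpoint": "presentation software",
    "microsoft powerpoint": "presentation software",
}


def normalize_skill(term):
    """Return the shared concept for a skill term, or None if unknown.

    Only exact (case-insensitive) matches count, so "WordPress" is not
    mistaken for "Word".
    """
    return SKILL_ALIASES.get(term.strip().lower())


print(normalize_skill("MS Word") == normalize_skill("Microsoft Word"))
print(normalize_skill("WordPress"))
```

A real ontology would go further and relate such concepts to broader ones (for example, to an office suite), but even this flat mapping already avoids the unrelated “WordPress” suggestion described above.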

ESCO states that it recognizes 2942 occupations. Interestingly, the system identifies a “rail logistics coordinator” and even offers certain alternative spellings, but it does not know the logistician. Every now and then, occupations with similar problems turn up. In addition, “public relations agent” is suggested as an alternative term for “political party agent”, to give just one example of a mistaken alternative job title.

ESCO is now supposed to be available in 26 languages. Yes and no, as I find out. Yes, the job titles are available in 26 languages, and yes, the skills are as well. The explanation of a term is always in English, though, which means that a title can be translated into all languages but the job description cannot. It always remains in English. It is questionable whether an employer from France will understand the profession of his Swedish applicant any better without a definition in his native French, or whether he will be able to tell if the classification really matches his vacancy.

And that is quite apart from the fact that qualifications are available in only one language: Greek. The detailed descriptions can be found only in this language. In any case, an employer in another member state will not understand his or her applicant better, even if the applicant comes from Greece. ESCO itself reports that the qualifications have to be supplied by the member states and will be integrated over time. However, the 27 member states are taking quite a lot of time over this.

To sum up, I am more than just marginally disappointed. I have waited almost four years since I, along with others, explained the manifold possibilities of ontologies at the ESCO Congress. But no ontology was built, only a taxonomy or collection of terms. 2942 professions, 13485 skills and 672 (Greek) qualifications were integrated into ESCO. ESCO has apparently invested a great deal of time and probably a great deal of money in this development. But whether this is the breakthrough towards Juncker’s goal is fundamentally questionable.

And the question is: What do we do now? Hope and wait another four years until ESCO might be able to meet the needs of HR and Public Employment Services? Or rather look for an alternative? How about an alternative that represents a true ontology with semantic recognition? One that recognizes that a party employee does not do the same job as a PR employee. One that knows that MS Word is the same skill as Microsoft Word or word processing. And one that covers many languages completely. Who knows, maybe such a solution already exists. Perhaps some online research could be successful in this respect. For example on


[1] ESCO (2015). ESCO strategic framework. Vision, mission, position, added value and guiding principles. Brussels.

[2] For this research only the online database of ESCO was used.

Our Article on the „Google for Jobs“ Initiative Now Also Available in French, Spanish and Arabic!


The landscape of online job search has gained a significant addition with wide-ranging implications. In line with its recently announced initiative Google for Jobs, Google launched a new jobs search feature right on its search result pages that lets you search for jobs across virtually all of the major online job boards. Google’s new initiative not only has the potential to disrupt the online job search market, but the initiative’s underlying data model, an occupation ontology, may change the nature of job and candidate searches altogether.

Google’s advance into the domain of job search threatens many of the existing players. Not just because the feature will likely focus more user searches on Google’s own site but also because Google’s search quality will likely surpass other services due to the occupation ontology Google has built. Other online job services will have to carefully consider how to go ahead in face of Google’s move. Some can partner with Google. Others will have to look elsewhere for solutions. For the latter, we can offer an even more extensive ontology of occupations and skills.

Read the full article about Google for Jobs by clicking on one of the links below.

English Version: Google launches its ontology-powered jobs search engine. What now?

French Version: Google lance son moteur de recherche d’emploi par ontologie. Et alors?

Spanish Version: Google lanza su buscador de trabajo apoyado en la ontología. ¿Y ahora qué?

Arabic Version: Google launches its ontology-powered jobs search engine. What now? (Arabic Version)

Google Launches its Ontology-powered Jobs Search Engine. What Now?


This week, the landscape of online job search has gained a significant addition with wide-ranging implications. In line with its recently announced initiative “Google for Jobs”, Google launched a new jobs search feature right on its search result pages that lets you search for jobs across virtually all of the major online job boards. Google’s new initiative not only has the potential to disrupt the online job search market, but the initiative’s underlying data model, an occupation ontology, may change the nature of job and candidate searches altogether. We should know, as we have been working with occupation ontologies for the past 8 years.

Google’s advance into the domain of job search threatens many of the existing players. Not just because the feature will likely focus more user searches on Google’s own site but also because Google’s search quality will likely surpass other services due to the occupation ontology Google has built. Other online job services will have to carefully consider how to go ahead in face of Google’s move. Some can partner with Google. Others will have to look elsewhere for solutions. For the latter, we can offer an even more extensive ontology of occupations and skills.

Many people already start their job search on Google. But with the new feature, they will have a very different experience on Google from now on. Previously, Google search queries for jobs, such as “retail jobs”, produced a list of links to websites like Indeed and ZipRecruiter. People would click on one of the top links and continue their search on their chosen site. The new feature, however, will keep significantly more search traffic on Google’s own site, as it will list single job postings in a box above the traditional web search results. The information will come from the websites of job search specialists like Glassdoor and LinkedIn, and directly from the career sections of many other company websites. Job seekers will click on the new listings and Google will show more information about the position. A “Read More” button will take them to the job site or mobile app where the listing originated.

Job Board Woes
While Google initially partners with some of the biggest players in job search, including CareerBuilder, Monster, LinkedIn and Glassdoor, Google’s new initiative also injects great uncertainty into the business models of many players in the recruitment market. Most directly affected are job aggregators like Indeed. Chris Russell, a recruiting technology and job site consultant with RecTech Media, said to SHRM that “just like that, Indeed can no longer call itself the ‘Google for jobs’.” Other recruiting technology experts go even further, saying that “it may take another 10 years for Indeed to become an afterthought, but it’s fooling itself if it thinks this isn’t a DEFCON 1 moment” (SHRM). Indeed’s SEO traffic will certainly drop as Google takes over the top spots in search results, prized online real estate Indeed currently holds. Furthermore, companies may be encouraged to list fewer jobs on job boards, where they often have to pay, as Google picks up the listings directly from their career sites.

But that is just to sum up briefly a few ways in which Google’s new initiative may affect the status quo in online job search. There would be a lot of other things to consider and Google is of course also tied to many of the existing players from its other revenue streams, in particular its ad business – relations which Google may not want to upset. In any case, the more significant shift may lie in the underlying data model and approach Google has used to build its solution: the occupational ontology. While ontologies have been around for some time, they have never been used on a large scale. Google’s initiative has put occupation ontologies centre stage, which may finally alter the way companies and technology providers approach the problem of matching people and jobs.

The Problem of Matching the Right People and Jobs
The most challenging problem in business is still matching the right person to the right job. There are many reasons for this. First, many of the criteria that determine whether a position is right for a person, such as personality and lifestyle, are not embedded in job descriptions. Second, many job descriptions are limited, out of date and often poorly written. Third, employers each use different language to describe the same jobs. For any given job, there are hundreds of different job titles, which often makes job or candidate searches inaccurate and misleading. This has led to a mismatch on the job market: Employers say they still have issues filling open positions. Meanwhile, job seekers often do not know there is a job opening just around the corner from them because search engines have trouble detecting what job postings really mean.

Enter the Occupation Ontology.
This is where the occupation ontology comes in. An occupation ontology functions like a Rosetta Stone between job seeker and employers: it aggregates similar job titles, competences, educations and so on, and thereby helps understand the nuances of CVs and vacancies. An ontology aggregates similar job titles into families of jobs to build a truly useful, searchable, “universe” of jobs, organized by discipline and functional domain. By understanding the relationships between job content, competences, experience, and education, an occupation ontology helps deliver more relevant search results and recommendations.


An occupation ontology therefore offers maximal support both for job seekers and employers. When it is integrated into a job platform, it allows users to get the search results they are looking for without having to worry about their search criteria being too broad or too narrow. In contrast to keyword-based / Boolean search, an ontology based search will deliver results for things you did not explicitly search for but are related to your search criteria.

For instance, even a simple job such as “truck driver” appears in a myriad of different wordings, depending on the company. FedEx Express calls drivers Couriers, FedEx Office calls them SameDay City Couriers and FedEx Freight calls them City Drivers. They each use different language to describe the very same job. In a normal job search for “truck driver”, these jobs would not surface. However, an ontology knows that these jobs are highly similar and can include all of these and more in the results of a job search.
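The difference between the two search styles can be sketched in a few lines. This is only an illustration under simple assumptions: the ontology is reduced to a hand-written table of related titles, the postings are toy data (the bakery entry is invented as a non-matching control), and the function names are hypothetical.

```python
# Toy ontology: a canonical job title mapped to related titles (lower-case).
RELATED_TITLES = {
    "truck driver": {"courier", "sameday city courier", "city driver"},
}

# Toy postings as (employer, job title); the last entry is an invented control.
POSTINGS = [
    ("FedEx Express", "Courier"),
    ("FedEx Office", "SameDay City Courier"),
    ("FedEx Freight", "City Driver"),
    ("Example Bakery", "Baker"),
]


def keyword_search(query, postings):
    """Return postings whose title literally contains the query string."""
    needle = query.lower()
    return [p for p in postings if needle in p[1].lower()]


def ontology_search(query, postings, related=RELATED_TITLES):
    """Return postings whose title is the query or a related title."""
    needle = query.lower()
    accepted = {needle} | related.get(needle, set())
    return [p for p in postings if p[1].lower() in accepted]


print(keyword_search("truck driver", POSTINGS))
print(ontology_search("truck driver", POSTINGS))
```

The keyword search finds nothing because no posting contains the words “truck driver”, while the ontology-based search returns the three FedEx postings and leaves the unrelated one out.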

What Now?
Building an ontology is an enormous undertaking, but it will benefit our economy and society as a whole if it is applied correctly. Google’s initial customers have been thrilled with the successes they have achieved by using Google’s job search. Likewise, our customers have been surprised by the extensive improvements they were able to make with our knowledge graph (JANZZon!). Google has built an occupation ontology in English so far. In the past 8 years, we have built an occupation ontology in 8 languages that captures job universes from regions as diverse as the United States, Germany, Norway, and the Middle East. We have learnt that as the dimension of the ontology grows, both its complexity and value multiply.

Now, while Google’s job feature is a great step towards better job search quality, it is not a suitable solution for every player in the job market. Some, like Indeed, are notably excluded from working with Google for competitive reasons; others, such as public employment services and HCM system providers, may have greater needs for data security and customization; and finally, Google’s feature only serves the English language so far. So where should companies look for solutions to keep in step with the rapid technological advancement?

Licensing an Ontology of their Own
Some companies have tried to build their own ontology but have failed to maintain it because of a shortage of specialist knowledge and insufficient resources or financial means. In recent years, many digital graveyards have emerged in the area of occupation data. The easiest solution is to license an occupation ontology as a cloud service. The occupation ontology JANZZon! offers this possibility and gives companies and public services the chance to connect to a wealth of knowledge about occupations and skills and to use it for their existing applications. JANZZon! is currently available in 9 languages (working on up to 40) and extends over all industries and job families. It is the most comprehensive ontology available today. And why not give someone other than Google a shot 😉?

Building a Job Matching Engine for the Global Labor Market

Understanding resumes and job ads, and finding the best matches among a great number of them is probably one of the most challenging tasks for machines today. The results and the precision of search and matching processes are dependent upon the scope and depth as well as the quality and comprehensiveness of the applied contextual and background knowledge.

Job postings are often worded in industry- and company-specific jargon that job seekers do not search for and would not use when writing their resumes. While resumes are often structured in a similar way, job postings can take pretty much any form and are often filled with information about the company and its culture rather than a specific list of requirements for the post to be filled. Furthermore, the labor market domain is inherently very complex for machines to understand. For example, data scientist is a profession and Hadoop is a technology that most data scientists need to know. Microsoft Excel and PowerPoint are both components of the MS Office Suite. A mechanic can work with cars or with industrial machinery, and an investigator can be a scientific researcher or someone investigating crime.

To understand the labor market domain, and thus to understand resumes and job ads, a lot of contextual and background knowledge is needed. Recruiters and job seekers already have this knowledge. Machines, on the other hand, need to be fed this kind of knowledge so that they can apply it when parsing or matching job ads and resumes. The acquisition of such knowledge is beyond the scope of smart algorithms. Instead, semantic knowledge graphs are required, which represent this domain knowledge in digital form so it can be processed by a matching engine.
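As a rough illustration of what such a knowledge graph might hold, the examples above can be written as typed subject-relation-object triples. The relation names (is_a, requires_skill, part_of, may_mean) and the tiny data set are assumptions made for this sketch; they do not reflect the actual schema of any production ontology.

```python
from collections import defaultdict

# Hypothetical triples built from the examples in the text.
TRIPLES = [
    ("data scientist", "is_a", "occupation"),
    ("Hadoop", "is_a", "technology"),
    ("data scientist", "requires_skill", "Hadoop"),
    ("Microsoft Excel", "part_of", "MS Office Suite"),
    ("Microsoft PowerPoint", "part_of", "MS Office Suite"),
    ("mechanic", "may_mean", "car mechanic"),
    ("mechanic", "may_mean", "industrial machinery mechanic"),
    ("investigator", "may_mean", "scientific researcher"),
    ("investigator", "may_mean", "criminal investigator"),
]


def build_index(triples):
    """Index triples by (subject, relation) for fast lookup."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject, relation)].append(obj)
    return index


def lookup(index, subject, relation):
    """Return all objects linked to `subject` by `relation`."""
    return index.get((subject, relation), [])


INDEX = build_index(TRIPLES)
print(lookup(INDEX, "mechanic", "may_mean"))
print(lookup(INDEX, "data scientist", "requires_skill"))
```

A matching engine could consult such a structure to notice that the bare word "mechanic" is ambiguous and needs context before it is matched.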


The idea that data, not algorithms, is crucial to developing human-level artificial intelligence has been gaining momentum for some time. The use of knowledge graphs and ontologies in particular was referenced prominently when Google announced at the end of last year that they had built a knowledge graph of occupations and skills for powerful and precise job search and discovery with their Cloud Jobs API.

JANZZ’s semantic matching engine has the most comprehensive multilingual knowledge graph in the area of occupations and skills at its disposal. When the matching engine JANZZsme! performs a query expansion, or searches or matches job ads and resumes, it accesses the ontology’s concepts, the lexical terms and synonyms by which these concepts may appear in CVs and vacancies in up to 40 languages, and the connections to related concepts. One of the essential parts of building a matching engine for the global labor market is therefore building a knowledge base that covers the contextual knowledge required to understand the specificities of a labor market and its participants.
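To make the idea of query expansion more concrete, here is a minimal sketch. It is not the JANZZsme! implementation: the concept ids, the two-language label lists and the function names are all hypothetical and serve only to show how a single search term could be widened to every label of its concept and of directly related concepts.

```python
# Hypothetical mini-ontology: concept id -> labels per language and related concepts.
CONCEPTS = {
    "c_nurse": {
        "labels": {
            "en": ["nurse", "registered nurse"],
            "de": ["Krankenpfleger", "Pflegefachfrau"],
        },
        "related": ["c_patient_care"],
    },
    "c_patient_care": {
        "labels": {"en": ["patient care"], "de": ["Patientenpflege"]},
        "related": [],
    },
}


def find_concept(term):
    """Return the id of the first concept that lists `term` as a label, else None."""
    needle = term.casefold()
    for concept_id, concept in CONCEPTS.items():
        for labels in concept["labels"].values():
            if needle in (label.casefold() for label in labels):
                return concept_id
    return None


def expand_query(term, include_related=True):
    """Expand a term into all labels of its concept and, optionally, related concepts."""
    concept_id = find_concept(term)
    if concept_id is None:
        return [term]  # unknown term: fall back to the literal query
    concept_ids = [concept_id]
    if include_related:
        concept_ids += CONCEPTS[concept_id]["related"]
    expanded = []
    for cid in concept_ids:
        for labels in CONCEPTS[cid]["labels"].values():
            expanded.extend(labels)
    return expanded


print(expand_query("Registered Nurse"))
```

With this toy data, a search for "Registered Nurse" would also retrieve documents that only mention "Pflegefachfrau" or "patient care".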

To find out more about the powerful application possibilities of our matching engine JANZZsme!, contact us for a demo anytime.

How an Ontology Can Help with Content-based Matching

Job and candidate search, job recommendations and automated candidate evaluations have one thing in common: they are all matching problems.

Simply put, given a set of CVs and a set of vacancies, the most similar items should match, that is, these items should come out at the top of the search, recommendation or evaluation. Most applications use either of two high-level approaches to achieve this: behavior-based or content-based. They each have pros and cons, and there are also ways to combine the approaches to take advantage of both techniques.


Behavior-based approaches leverage user behavior to generate recommendations or suggestions. These approaches are domain agnostic, meaning the same algorithms that work on music or movies can be applied to the jobs domain. Behavior-based approaches do suffer from a cold start problem. If you have little user activity, it is much harder to generate good quality results.
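A behavior-based recommender can be sketched with simple co-occurrence counts ("users who applied to this job also applied to ..."). The interaction log and function names below are invented for illustration, and co-occurrence counting is only one of many possible behavior-based techniques. The last call shows the cold start problem: a job nobody has interacted with yet yields no recommendations.

```python
from collections import Counter
from itertools import permutations

# Hypothetical interaction log: which jobs each user applied to.
APPLICATIONS = {
    "user_1": {"job_a", "job_b"},
    "user_2": {"job_a", "job_b", "job_c"},
    "user_3": {"job_b", "job_c"},
}


def co_occurrence(applications):
    """Count, for every ordered job pair, how many users applied to both."""
    counts = {}
    for jobs in applications.values():
        for job, other in permutations(sorted(jobs), 2):
            counts.setdefault(job, Counter())[other] += 1
    return counts


def recommend(job, counts, top_n=2):
    """Return the jobs most often applied to together with `job`."""
    ranked = sorted(counts.get(job, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


COUNTS = co_occurrence(APPLICATIONS)
print(recommend("job_a", COUNTS))    # ranked by number of shared applicants
print(recommend("job_new", COUNTS))  # cold start: no interactions, empty result
```

Nothing in this code knows that the items are jobs, which is what makes the approach domain agnostic.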

Content-based approaches use data, such as user preferences and features of the items being matched or recommended, to determine the best matches. For recommending jobs, using keywords of the job description to match keywords in a user’s resume is one content-based approach. Using keywords in a job to find other similar jobs is another way to implement content-based recommendations.
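The keyword variant of content-based matching can be sketched as a set overlap between the words of a job ad and the words of a resume. The stop-word list, the sample texts and the choice of Jaccard similarity are assumptions for this example; real systems typically use weighting schemes and far larger vocabularies.

```python
import re

# A small, illustrative stop-word list; real systems use much larger ones.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "with", "for", "to", "we", "are"}


def keywords(text):
    """Lower-case the text, split it into word tokens and drop stop words."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return {token for token in tokens if token not in STOP_WORDS}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


job_ad = "We are looking for a Python developer with SQL experience"
resume = "Developer experienced in Python and SQL"

score = jaccard(keywords(job_ad), keywords(resume))
print(round(score, 2))
```

Note that "experience" and "experienced" count as different keywords here, which already hints at the limits of purely literal matching.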

However, the real issue in this process is determining the similarity between two items. How can the similarity between, for instance, a resume and a vacancy be determined effectively even though they are often structured extremely heterogeneously? All too often, simple keyword-based matching is used for this, which means that many similarities go undetected, as keyword variations, synonyms and alternative phrases are not matched. With a content-based approach, it is important that the semantics (the underlying meaning) of two items be compared rather than the wording. This is where ontologies come into play. They can provide a relational model that can detect the underlying meanings and similarities in CVs and job descriptions. Ontologies enable a digital representation of implicit knowledge: humans usually understand the correct meaning of a term thanks to their background knowledge and the context in which a specific term is used. A machine, on the other hand, lacks this ability. It can, however, learn the semantic meaning of a term by means of the concepts and relations stored in an ontology. By using an occupation and skills ontology as an intermediary, content-based approaches for job recommendations, job and candidate search and automated candidate evaluations can achieve much more.
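The difference between comparing wording and comparing meaning can be shown with a small sketch. The phrase-to-concept table and the concept ids are hypothetical stand-ins for an ontology lookup; the point is only that the same overlap measure gives very different results once phrases are first mapped to shared concepts.

```python
# Hypothetical lookup from surface phrases to ontology concept ids.
PHRASE_TO_CONCEPT = {
    "software engineer": "occ:software_developer",
    "programmer": "occ:software_developer",
    "ms excel": "skill:spreadsheets",
    "spreadsheet skills": "skill:spreadsheets",
    "team lead": "skill:leadership",
    "leadership experience": "skill:leadership",
}


def overlap(a, b):
    """Share of the items in `a` that also occur in `b`."""
    return len(a & b) / len(a) if a else 0.0


def to_concepts(phrases):
    """Map each phrase to its concept id; unknown phrases are kept as-is."""
    return {PHRASE_TO_CONCEPT.get(p.lower(), p.lower()) for p in phrases}


vacancy = {"Software Engineer", "MS Excel", "Team Lead"}
resume = {"Programmer", "Spreadsheet skills", "Leadership experience"}

keyword_score = overlap({p.lower() for p in vacancy}, {p.lower() for p in resume})
concept_score = overlap(to_concepts(vacancy), to_concepts(resume))
print(keyword_score, concept_score)
```

On literal phrases the vacancy and the resume share nothing, while at the concept level every requirement of the vacancy is covered.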

The comprehensive ontology of occupations and skills JANZZon!, for example, offers a large number of poly-directional concepts pertaining to the global labor market. With its extraordinary range of concepts, this ontology offers essential context as well as intelligent evaluation and enhancement options for applications such as information systems, matching engines, job portals, CV parsers, and statistical analysis and modelling tools.

Industry Taxonomies Enhanced by JANZZ’s Occupation Ontology

At the heart of JANZZ’s ontology of occupations and skills are over 35 taxonomies, among them occupation, skills and industry taxonomies like O*Net, ESCO, NAICS and ISCO-08. They are mapped by the JANZZ curation team to form a single entity that serves as a relational model for a great part of the world’s economic activity. As part of the latest additions to the occupational ontology JANZZon!, the curation team has inserted the two industry classifications GICS and ICB into the ontology, thereby extending the scope of the ontology as well as enhancing the intelligence of the two industry classifications.

GICS refers to the Global Industry Classification Standard and is a standardized classification system for equities developed jointly by Morgan Stanley Capital International (MSCI) and Standard & Poor’s. The GICS methodology is used by the MSCI indexes, which include domestic and international stocks, as well as by a large portion of the professional investment management community. The GICS hierarchy begins with 10 sectors and is followed by 24 industry groups, 67 industries and 147 sub-industries. Each stock that is classified will have a coding at all four of these levels.
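The statement that every classified stock carries a coding at all four levels can be illustrated with a short sketch. It relies on a convention not spelled out in the text above: GICS codes are commonly published as 8-digit numbers in which each level extends its parent's code by two digits (2 digits for the sector, 4 for the industry group, 6 for the industry, 8 for the sub-industry). The function name and the sample code are chosen for this example only.

```python
from typing import NamedTuple


class GicsLevels(NamedTuple):
    sector: str
    industry_group: str
    industry: str
    sub_industry: str


def split_gics_code(code: str) -> GicsLevels:
    """Split an 8-digit GICS code into its four nested level codes."""
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"expected an 8-digit code, got {code!r}")
    return GicsLevels(code[:2], code[:4], code[:6], code)


levels = split_gics_code("45102010")
print(levels)
```

Because each level's code is a prefix of the next, two companies can be compared at any level of granularity by comparing code prefixes.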

ICB (Industry Classification Benchmark) is an industry classification taxonomy launched by Dow Jones and FTSE in 2005. It is used to segregate markets into sectors within the macro economy and categorizes over 290,000 securities worldwide. The ICB uses a system of 10 industries, partitioned into 18 supersectors, which are further divided into 41 sectors, which in turn contain 114 subsectors.

The two industry classifications allocate each company to the subsector that most closely represents the nature of its principal business activity. Thereby, the classifications allow a comparison of companies across national and linguistic boundaries.

Industry taxonomy enhanced by the ontology network

Once mapped to the semantic network of JANZZon!, however, the intelligence of the two classifications and their potential uses multiply exponentially. As part of the semantic database JANZZon!, the taxonomies are connected to a dense web of relations between occupations, skills, specializations and industries. The information on individual companies from the two taxonomies and the relational model of occupations, skills, specializations and industries are intertwined to form an even greater knowledge database. With the added knowledge about companies and how they relate to industry sectors, the ontology JANZZon! can serve its purpose even better, namely to provide an accurate relational model for parsing, matching, benchmarking and classification.

Why leading employment services and software providers are betting on ontologies.

Algorithms are out, datasets are in. Perhaps one of the crucial findings in data science today is that datasets – not algorithms – might be the key limiting factor to developing human-level artificial intelligence. This contention is especially true in the case of solutions for the labor and recruitment market. Many companies in the recruitment market and public employment services are taking notice and are investing in the ontology-based solutions of JANZZ.

Therefore, we have taken a brief moment to lay out the underlying reasons why datasets have become so important, and what qualities these datasets must have in order for businesses to take full advantage of them.

Over the past years, machine learning, deep learning and evidence-based systems have achieved remarkable breakthroughs. These have, in turn, driven performance improvements across AI components. Perhaps the two branches of machine learning that have contributed most to this are deep learning, particularly in perception tasks, and reinforcement learning, especially in decision making. Interestingly, these advancements have arguably been driven mostly by the exponential growth of high-quality annotated datasets rather than by algorithms. And the results are staggering: continuously better performance in increasingly complex tasks, often at super-human levels.

Machine learning thrives on patterns. Unfortunately, our world is full of an almost limitless number of outliers. The labor market in particular is intrinsically a very tough market for automated solutions. Neither job titles nor skills nor educations are in any way standardized across the world or even within a particular country. There is a lot of company-, culture- and geography-specific language involved in the description of jobs and qualifications. Furthermore, implicit phrases like “relevant education” or “relevant experience” are all too common, making job descriptions hard for machines to decipher. Algorithms – even sophisticated ones – have a hard time dealing with such an amount of heterogeneity, implicitness and inconsistency. When one adds the factor of language, it gets even more complicated. Currently, most algorithms struggle to deal with any language other than English.

Datasets on the other hand, in particular annotated datasets of high quality, can reflect and understand the full bandwidth of the labor market vocabulary. They can also deal with cultural and geographical particularities. However, not all datasets are annotated and of a high quality. Most datasets that companies or public employment services have at their disposal are legacies of the past and therefore often messy, incomplete or inconsistent. Nevertheless, they want to leverage the power of data to improve their applications. Therefore, they need a way to enhance their data with standardized, intelligent meta-data.


This is where ontologies come into play. An ontology formally represents knowledge as a hierarchy of concepts within a domain. Concepts are linked to each other through different relations. Through the relations that have been set and the location of a term in the ontology, the meaning of that term becomes interpretable for a machine. An ontology is a dataset, but one of such high quality that it can also help improve the quality of other datasets. JANZZ focuses solely on the labor market, and its ontology is the largest multilingual encyclopedic knowledge database in the area of occupation data, in particular jobs, job classifications, hard and soft skills and qualifications. The occupation and skills ontology can help companies and public employment services in many respects. It can serve as the basis of matching engines, parsing tools, natural language processing or classification tools, significantly improving the results and the learning of these tools. More specifically, it can enhance job and candidate matching processes, CV parsing, benchmarking, statistical analyses and much more.
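How the location of a term in a hierarchy makes its meaning interpretable can be sketched with parent links. The hierarchy below is a made-up fragment, not an excerpt from JANZZon!, and it models only a single "broader concept" relation, whereas a real ontology links concepts through many relation types.

```python
# Hypothetical "broader concept" links: child -> parent.
BROADER = {
    "neonatal surgeon": "surgeon",
    "surgeon": "physician",
    "physician": "health professional",
    "nurse": "health professional",
    "health professional": "occupation",
}


def path_to_root(concept):
    """Follow broader-concept links from `concept` up to the top of the hierarchy."""
    path = [concept]
    while path[-1] in BROADER:
        path.append(BROADER[path[-1]])
    return path


def lowest_common_ancestor(a, b):
    """Return the most specific concept that both `a` and `b` fall under, or None."""
    ancestors_of_b = set(path_to_root(b))
    for concept in path_to_root(a):
        if concept in ancestors_of_b:
            return concept
    return None


print(path_to_root("neonatal surgeon"))
print(lowest_common_ancestor("neonatal surgeon", "nurse"))
```

From the path alone, a machine can tell that a neonatal surgeon is a kind of physician, and that the closest thing a neonatal surgeon and a nurse have in common is being health professionals.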

Convinced that high data quality will give them a competitive advantage, many stakeholders in the global labor market are currently investing in the solutions offered by JANZZ, above all its ontology. Data quality is becoming a focal point of competition in the digital labor and recruitment market.

Lost in Big Data?
The Misguided Idea Ruling the Data Universe.


“. . . In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it.[…]”

“On Exactitude in Science”
Jorge Luis Borges

Borges’ story imagines an Empire addicted to the idea of creating a perfect representation of its world. The fictional Empire has immersed itself completely in the task of creating a map that coincides with its land point for point. Today, I cannot help but think that we find ourselves in a very similar environment: data is profoundly changing our world and how we perceive it. We find ourselves in the midst of a data revolution so vast, pervasive and young that it is hard to take it all in. The impact of data is extending on a truly massive scale; we are striving to use big data to transform whole industries, from marketing and sales to weather forecasts, from medical diagnoses to food packaging and from the storage of documents and the use of software to communication. Indeed, very much like Borges’ fictional Empire, we have come to believe that the more data we collect and analyze, the more knowledge we gain of the world and the people living in it. What foolish data maniacs we have become.

The conviction now prevails that big data delivers actionable insights into nearly every aspect of life. Philip Evans and Patrick Forth contend that “information is comprehended and applied through fundamentally new methods of artificial intelligence that seek insights through algorithms using massive, noisy data sets. Since larger data sets yield better insights, big is beautiful” (from their joint article in bcg.perspectives). Along these lines, our hunger for data is steadily increasing and our digital ecosystem is fueling it: sensors, connected devices, social media and a growing number of clouds continually produce new data for us to collect and analyze. According to a study by the International Data Corporation (IDC), the digital universe will roughly double every two years. From 2005 to 2020, the volume of data will grow by a factor of 300, to 40 zettabytes of data. A zettabyte has 21 zeros. In this world of exponential data growth, the ambition to accumulate data goes unchecked. As in Borges’ fictional empire, the outer limit is the scale of 1:1, a complete digital representation of our world.
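The figures quoted above can be checked against each other with a few lines of arithmetic. The sketch assumes steady exponential growth between 2005 and 2020, which is a simplification made for this example; the variable names are likewise chosen here.

```python
import math

GROWTH_FACTOR = 300             # from the text: growth between 2005 and 2020
YEARS = 2020 - 2005             # 15 years
FINAL_ZETTABYTES = 40           # from the text
BYTES_PER_ZETTABYTE = 10 ** 21  # "a zettabyte has 21 zeros"

# Number of doublings needed to grow by the given factor, and the time per doubling.
doublings = math.log2(GROWTH_FACTOR)
doubling_time_years = YEARS / doublings

# Volume implied for 2005 if 2020 ends at 40 ZB after 300-fold growth.
initial_zettabytes = FINAL_ZETTABYTES / GROWTH_FACTOR

print(f"doublings: {doublings:.2f}")
print(f"doubling time: {doubling_time_years:.2f} years")
print(f"implied 2005 volume: {initial_zettabytes:.3f} ZB")
print(f"2020 volume in bytes: {FINAL_ZETTABYTES * BYTES_PER_ZETTABYTE:.1e}")
```

Under this assumption, a 300-fold increase over 15 years corresponds to a doubling roughly every 1.8 years, which agrees with the "about every two years" figure.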

Today, companies like IBM or LinkedIn are already pushing towards that limit. IBM is training its cognitive computing system called Watson to be able to answer virtually any question. In order to do so, IBM Watson is collecting unprecedented amounts of data to form an impressive corpus of information. The company just acquired Truven Health Analytics for $2.6 billion in cash, bringing to its health unit a major repository of health data from thousands of hospitals, employers and state governments across the US. It was the fourth major acquisition of a health data company in IBM Watson’s 10 month life span, showing just how important a digital representation of patients, diagnoses, treatments and hospitals is to the computer giant’s artificial intelligence system. LinkedIn’s vision is equally ambitious: they are creating an Economic Graph, which is nothing less than a digital mapping of the global economy. It aims to include a profile for every one of the 3 billion members of the global workforce. It intends to digitally represent every company, their products and services, the economic opportunities they offer and the skills required to obtain those opportunities. And it plans to include a digital presence for every higher education organization in the world. Yet, the endeavors of the two companies are but the tip of the iceberg. Their pursuit of building a complete digital representation of their respective fields is emblematic of a more general aspiration today towards a state of ubiquitous information.

The visions of companies like IBM Watson and LinkedIn are thus already evoking Borges’ imagined world. The forces of big data are converging and recreating the cartographic ambitions of the Empire of his story. The world is becoming self-referential. The digital representation of our world is expanding fast and at the outer limits, representation and reality are starting to coincide. The world and our picture of it are converging. Suddenly, we find ourselves in a world bearing a startling resemblance to Borges’ Empire.

How foolish – Borges’ story continues, calling into question the very purpose of such an immense representation. Whether cartographic or digital, a map of the scale 1:1 might not be as valuable as thought.

“[…] The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.”

In Borges’ fictional world, the next generations disposed of their forefathers’ map, as they had not been gripped by the same ambition as their ancestors and recognized that a map of the scale 1:1 was useless. They left it to decompose, and all that remained were the “tattered ruins” of the forebears’ map. The realization that a map of the scale 1:1 is practically pointless also echoes our experience with the expanding data universe. Professor Patrick Wolfe, Executive Director of University College London’s Big Data Institute, warns that “the rate at which we are generating data is rapidly outpacing our ability to analyze it.” Just about 0.5% of all data is currently analyzed, and Wolfe says that percentage is shrinking as more data is being collected. So we are also beginning to realize the impracticality of the masses of data that we are wielding. Rather than gaining exponentially more knowledge about our world through data, we are creating an entity that is in danger of slipping into oblivion through its sheer size.

In order to prevent our perpetually accumulating digital collection from suffering the same fate as Borges’ map – being left in tattered ruins by subsequent generations – it is essential to draw actionable intelligence from it. Hence, the capacity to really understand the full complexity of the masses of collected data and to produce relevant knowledge from them will be the ultimate competitive advantage, today and even more so in the future.

While turning big data into smart or intelligent data is already being advocated by many, no ready-made solution has yet emerged for how to actually achieve this transformation. Today, applied mathematics, natural language processing and machine learning dominate the discussion and displace every other tool that might be brought to bear. Behind this lies the idea that with enough data, the numbers speak for themselves. To reiterate what Evans and Forth said, “big is beautiful”. This idea informs the culture of Silicon Valley and by extension that of many ventures around the world.

Other methodologies like ontologies, taxonomies and semantics are completely disregarded in the current spirit of discovery. Where applied mathematics, machine learning and predictive analytics stand for size, ontologies, taxonomies and semantics stand for meaning and understanding. And while the latter might seem insignificant compared to the dimensions of the former, they will play no lesser part in determining the competitive fitness of companies. After the exponential growth of the digital universe over the last years, we have reached a degree of complexity that requires a deep understanding of the matters at hand, something that will not be achieved by collecting yet more data or by implementing an algorithm. Ironically, then, it is a change of direction away from “big is beautiful” that could really leverage the full power of big data.

Effective Data Curation for Occupation Related Data: How We Are Dealing with NAICS and ISIC.

The North American Industry Classification System (NAICS) and the International Standard Industrial Classification (ISIC) are two landmarks on our way to mastering occupation data. The way we curate the data from these two classifications exemplifies our approach of putting a deep understanding of jobs, skills and industries at the center of our recruitment and employment solutions. Hence, we felt it was about time to give you a little more insight into how we deal with occupation related data, showing you the inherent complexity of the labor market and the difficulty of preparing occupation related data so that it can go on to drive some of today’s most powerful applications, such as public employment services, applicant tracking systems, statistical tools or job boards: solutions that help alleviate some of today’s hardest problems on the global labor market.


The two industrial classifications are fairly complex structures in themselves. They also take different approaches to classifying industries. When looking at an industry like street construction, for example, NAICS lists a total of 38 different activities under “Highway, street and bridge construction”, among which you will find airport runway construction, highway line painting, pothole filling and guardrail construction. ISIC, on the other hand, is less detailed; it sums up the same industry in only three bullet points: asphalt paving of roads, road painting, and installation of crash barriers and traffic signs. While ISIC contains less detailed information about activities, the underlying structure of the two classifications is the same. The International Standard Industrial Classification has provided guidance to countries in developing national activity classifications; hence most national taxonomies took over its general structure and filled it with country-specific activities.

How JANZZ enriches data from standard classifications

Now, what do we do with the thousands of activities and industries in these classifications? We connect each of the terms within the classifications with terms that are already in our ontology JANZZon!: not only related industries, for example other types of civil engineering in the case of “street construction”, but also occupations, skills, specializations and educations that belong within the realm of a particular industry. SSIC, the Singapore Standard Industrial Classification, also adopts the basic framework and principles of ISIC. Incorporating each of these industrial classifications into our ontology means having a greater level of detail and comprehensiveness at our fingertips than any of the taxonomies could provide on its own.


Not only industries and activities are curated in this way, but also skills, educations, job titles and so on. All these “data trees” are in turn interconnected. “Street construction”, for example, is related to the “road construction engineer”, the “roller driver”, “infrastructure planning” and “road surface marking”.

Sometimes, the denomination of skills, industries and specializations can be the same: for instance, “street construction” could also be a skill or specialization of a construction worker. In these cases, NAICS, ISIC and SSIC intersect with taxonomies of skills and competencies such as ESCO. Our ontology curation team adds these intersections, thereby creating yet more cross-relations and making the ontology even smarter.

On the one hand, the ontology enriches the data from the standard classifications by establishing meaningful connections between occupations, skills, industries and so on, and it does so in multiple languages. On the other hand, another layer of detail is added to the taxonomies by also including real-life data, for instance data from job boards. Taxonomies like NAICS and ISIC have become important tools for comparing statistical data on economic activities, but the denominations they use are not necessarily the ones used in CVs or job postings. By adding a wealth of synonyms, we make the data harvested from the taxonomies fit to be used not only for statistical purposes but also for job matching.

Finally, the effective curation of occupation related data is ensured not only by the breadth and detail of the data that is entered into our ontology JANZZon! but also by the industry-specific expertise of our team. Establishing meaningful relations between occupations, skills and education requires human experts in order to guarantee the high quality of the knowledge base. At a time when machine learning, smart algorithms and predictive analytics are often held up as universal solutions to everything, we put a deep understanding of occupations, skills and industries back at the center of solving some of today’s hardest labor market issues.