Ontology and taxonomy – stop comparing things that are incomparable

To many people, the word ‘ontology’ might sound abstract. The term comes from philosophy, but in computing it is closely tied to Tim Berners-Lee’s vision for the World Wide Web. This vision included a so-called ‘Semantic Web’ in which computers become capable of analyzing all Web data, including content, links and transactions between people and computers. In the Semantic Web, the Resource Description Framework (RDF) and the Web Ontology Language (OWL) have been established as standard formats for sharing and integrating both data and knowledge, the latter in the form of rich conceptual schemes called ontologies. [1] This article uses the word ‘ontology’ throughout; it is worth mentioning, however, that in today’s IT world the term ‘knowledge graph’ is also widely used to refer to this concept.

Why care about ontologies

With regard to artificial intelligence (AI), the terms ‘big data’, ‘machine learning’ and ‘deep learning’ are slowly replacing the term ‘AI’ itself. However, to quote Adrian Bowles, “there is no machine intelligence without (knowledge) representation.” In other words, AI requires some elements of knowledge engineering, information architecture and a significant amount of human work to do its ‘magical neural work’. Fittingly, Alexander Wissner-Gross argues that, perhaps most importantly, we need to recognize that intelligent datasets, not algorithms, are likely to be the key limiting factor in the development of human-level artificial intelligence.

An ontology is a structured and formal representation of the relevant knowledge in a certain domain. This is necessary because, unlike a human, a machine cannot directly rely on background knowledge about a term’s correct usage. What a machine can do with an ontology, however, is “learn” the semantic meaning of a term through the links between the concepts in the ontology. Powerful ontologies already exist in specific domains. Examples include the Financial Industry Business Ontology (FIBO) as well as numerous ontologies for healthcare, geography or occupations.

Another important part of AI is semantic reasoning. Beyond identifying potentially fraudulent transactions, determining users’ intent based on their browser history and making product recommendations, AI can also execute tasks that require explicit reasoning based on general and domain-specific knowledge, such as understanding news articles, preparing food or buying a car. Such tasks require information that is not part of the input data and must be dynamically combined with existing knowledge. This type of machine reasoning can only be achieved with ontologies and the way their knowledge is modeled. [2]

Taxonomy and ontology are fundamentally different

Ontology is often confused with taxonomy. Apart from the fact that both belong to the fields of AI, the Semantic Web and systems engineering, there is really not much that would justify treating them as synonyms. Taxonomy classifications such as O*NET (Occupational Information Network) and ESCO (European Skills/Competences, qualifications and Occupations) simply cannot be compared to ontologies. They provide a much simpler approach to classifying objects: they have a hierarchical structure and use only parent-child relations, without any additional, more sophisticated links. Ontologies, on the other hand, are a much more complex form of categorization. Speaking metaphorically, a taxonomy is a tree, whereas an ontology comes closer to a forest.

For example, the term ‘golf’ could appear in several taxonomies. It might be located under a ‘Human Activities’ tree (human activities -> leisure activities -> sports -> golf). It could also be found under a taxonomy concerning apparel (apparel -> casual/active apparel -> sporting apparel -> golf clothing and accessories). It could even appear in something quite different, for example an automobile taxonomy (automobile -> Germany -> VW -> Golf). Each of these taxonomies can be considered a tree whose branches touch at their ‘golf’-related nodes. [3]

Put differently, taxonomies represent a collection of topics linked by ‘is-a’ relationships, while ontologies allow for much more complex connections, such as ‘has-a’ and ‘uses-a’ relations. [4] Hence, to return to the classifications mentioned above, taxonomies lack the capability to compare child concepts with one another.
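The contrast between ‘is-a’ hierarchies and typed relations can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular ontology system is built: the concept names, the relation labels (`is_a`, `uses`, `has_a`) and the helper functions `ancestors` and `related` are all hypothetical.

```python
from collections import defaultdict

# A taxonomy: each concept has at most one parent, i.e. 'is-a' links only.
taxonomy_parent = {
    "golf": "sports",
    "sports": "leisure activities",
    "leisure activities": "human activities",
}

def ancestors(concept, parent_of):
    """Walk the 'is-a' chain from a concept up to the root of its tree."""
    path = []
    while concept in parent_of:
        concept = parent_of[concept]
        path.append(concept)
    return path

# An ontology-style structure: typed relations stored as
# (subject, relation, object) triples. All facts here are illustrative.
triples = [
    ("golf", "is_a", "sports"),
    ("golf", "uses", "golf club"),
    ("golf", "has_a", "handicap system"),
    ("golf clothing", "used_in", "golf"),
]

def related(concept, triples):
    """Group everything a concept points to by relation type."""
    out = defaultdict(set)
    for subject, relation, obj in triples:
        if subject == concept:
            out[relation].add(obj)
    return dict(out)

golf_path = ancestors("golf", taxonomy_parent)
golf_links = related("golf", triples)
```

The taxonomy can only answer where ‘golf’ sits in its tree, while the triple list can also answer what golf uses or has, because each link carries its own relation type.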

In the ESCO classification, almost all medical specialists are grouped under the heading ‘Specialist Medical Practitioners’. Furthermore, specialist skill sets are simply grouped in lists without any links to the respective specialist occupations. Why is that? One reason is that classifications are mainly used for statistical purposes. From this viewpoint there is no need to further classify all individual medical specialists according to their skill sets and training background. Therefore, in a taxonomy, specializations can only be recognized by their job titles, and one needs to refer to other sources to better understand what each of them means.

Building an ontology of occupations, qualifications and skills makes it possible to automatically recognize similarities and differences between job titles. For example, pediatricians and neonatologists have similar jobs, both of which concern the medical care of newborn infants. With the ontology modeling approach, it is possible to determine that a pediatrician shares a very high percentage of skills with a neonatologist. However, pediatricians can only take over the neonatologist’s job after further training. All this information can be represented in an ontology through the interrelationships between concepts. This goes beyond the capacity of a simple taxonomy.
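As a rough sketch of this idea, the Python snippet below compares two occupations through the skills linked to them. The skill lists are invented for illustration and are not taken from ESCO, the JANZZ ontology or any medical source; the functions `skill_coverage` and `missing_skills` are likewise hypothetical helpers.

```python
# Hypothetical occupation-to-skill links; a real ontology would hold far
# more skills and distinguish between different kinds of relations.
skills = {
    "pediatrician": {
        "child health assessment",
        "newborn examination",
        "vaccination",
        "parent counselling",
        "growth monitoring",
    },
    "neonatologist": {
        "child health assessment",
        "newborn examination",
        "parent counselling",
        "growth monitoring",
        "neonatal intensive care",
    },
}

def skill_coverage(source, target):
    """Share of the target occupation's skills that the source already has."""
    shared = skills[source] & skills[target]
    return len(shared) / len(skills[target])

def missing_skills(source, target):
    """Skills the source would need to acquire, e.g. through further training."""
    return skills[target] - skills[source]

coverage = skill_coverage("pediatrician", "neonatologist")
gap = missing_skills("pediatrician", "neonatologist")
```

With these made-up sets, the pediatrician covers four of the neonatologist's five skills, and the remaining skill marks the gap that further training would have to close. A flat list of job titles carries neither piece of information.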

Ontologies enable matching datasets

When it comes to matching, say matching CVs with vacancies, there is no better way than to use an ontology. All too often, simple keyword-based matching or fuzzy machine learning methods are used for this, which means that many similarities, such as keyword variations, synonyms and alternative phrases, go undetected and cannot be matched. When matching, it is important to compare the semantics (the underlying meaning) of two items rather than the wording. This is where ontologies come into play. They provide semantic modeling that can detect the underlying meanings and similarities in CVs and job descriptions.
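The difference between comparing wording and comparing meaning can be shown with a small Python sketch. It assumes a hypothetical lookup table, `term_to_concept`, that maps surface terms to shared concept identifiers; the identifiers and terms are invented, and a real ontology would resolve terms through far richer relations than a flat synonym table.

```python
# Hypothetical mapping from surface terms to concept identifiers.
term_to_concept = {
    "software engineer": "C_SOFTWARE_DEVELOPER",
    "software developer": "C_SOFTWARE_DEVELOPER",
    "programmer": "C_SOFTWARE_DEVELOPER",
    "js": "C_JAVASCRIPT",
    "javascript": "C_JAVASCRIPT",
}

def keyword_match(cv_terms, vacancy_terms):
    """Literal overlap of the lower-cased strings."""
    return {t.lower() for t in cv_terms} & {t.lower() for t in vacancy_terms}

def to_concepts(terms):
    """Replace each known term by the concept it denotes; skip unknown terms."""
    return {
        term_to_concept[t.lower()]
        for t in terms
        if t.lower() in term_to_concept
    }

def semantic_match(cv_terms, vacancy_terms):
    """Overlap of the underlying concepts rather than of the wording."""
    return to_concepts(cv_terms) & to_concepts(vacancy_terms)

cv = ["Programmer", "JS"]
vacancy = ["Software Engineer", "JavaScript"]

literal_hits = keyword_match(cv, vacancy)
concept_hits = semantic_match(cv, vacancy)
```

In this example the keyword comparison finds nothing in common, while the concept comparison finds both the occupation and the skill, because the different wordings resolve to the same concepts.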

Ontology matching is a fundamental technique in many areas, such as ontology merging. In domains with very complex rules (and complex interactions between rules), there is no substitute for ontologies. This becomes clear, for instance, when integrating disparate domains. Suppose there are two separate ontologies, a weather ontology and a geographic ontology. When considering navigation or insurance risks, creating a third ontology that integrates and leverages the other two is a manageable proposition. [5]
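The weather and geography case can be sketched as follows. The snippet is a deliberately simplified illustration: the triples, the relation names (`increases_risk_of`, `prone_to`) and both helper functions are hypothetical, and it assumes that the two ontologies already use the same label for the concept they share, which real ontology matching cannot take for granted.

```python
# Two small, hypothetical ontologies as sets of (subject, relation, object).
weather = {
    ("heavy rain", "is_a", "weather event"),
    ("heavy rain", "increases_risk_of", "flooding"),
}
geography = {
    ("river valley", "is_a", "terrain type"),
    ("river valley", "prone_to", "flooding"),
}

def concepts(triples):
    """All concepts that appear as subject or object."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}

def shared_concepts(a, b):
    """Concepts through which two ontologies can be connected."""
    return concepts(a) & concepts(b)

# The integrated ontology is, in this sketch, simply the union of both.
merged = weather | geography

def flood_risk_pairs(triples):
    """Pair weather events with terrain types via the shared 'flooding' concept."""
    causes = {s for s, r, o in triples
              if r == "increases_risk_of" and o == "flooding"}
    places = {s for s, r, o in triples
              if r == "prone_to" and o == "flooding"}
    return {(cause, place) for cause in causes for place in places}

bridge = shared_concepts(weather, geography)
risks = flood_risk_pairs(merged)
```

Neither source ontology alone can relate heavy rain to river valleys; the connection only appears once both are combined through the concept they have in common, which is the kind of question an insurance risk application might ask.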

True value of ontologies

A semantic system relies on explicit, human-understandable representations of concepts, relationships and rules to develop the desired domain knowledge. It is impossible to rely solely on programmers to build such a system with machine learning, as they lack the knowledge needed to define the relationships between concepts in specific domains. Therefore, the domain knowledge must be learned from domain experts with various backgrounds (e.g. intellectual property law, fluid dynamics, car repair, open-heart surgery, or educational and vocational systems). This process is crucial for creating a comprehensive knowledge representation.

For the multilingual JANZZ ontology, language skills are key. In many cases, a one-to-one translation of a concept into multiple languages isn’t possible. However, thanks to Switzerland being small and integrated, all the JANZZ ontology curators are fluent in at least two languages, and some even speak more than four (including Chinese and Arabic). This advantage guarantees the ontology’s consistency and quality across different languages.

About a decade ago, JANZZ started building its ontology on various occupation taxonomies, namely ISCO-08, ESCO and all country-specific classifications. Over the years, JANZZ has added thousands of new professions and functions that did not previously exist in any of the known taxonomies (e.g. Market Research Data Miner, Millennial Generational Expert and Social Media Manager). Besides job titles, the ontology also includes up-to-date skills, education, experience and specializations. Rather than being a mere collection of terms like a taxonomy, it recognizes the similarities and ambiguities among job titles, which makes it the right tool for HR and Public Employment Services. Today, the JANZZ ontology is by far the largest, most complex and most complete occupation data ontology in the world.

For private corporations and public employment services trying to choose between a classification system based on a taxonomy and one based on an ontology, we hope this article helps you make the right decision and realize that investing in a non-semantic system (without content) will not get you any further. Luckily, some governments and corporations have chosen the right path and have already benefited from our newest technology. If you would like to know more about the JANZZ ontology, please write to sales@janzz.technology.

[1] Ian Horrocks. 2008. Ontologies and the Semantic Web. URL: http://www.cs.ox.ac.uk/ian.horrocks/Publications/download/2008/Horr08a.pdf [2019.02.01]

[2] Larry Lefkowitz. 2018. Semantic Reasoning: The (Almost) Forgotten Half of AI. URL: https://aibusiness.com/semantic-reasoning-ai/ [2019.02.01]

[3] New Idea Engineering. 2018. What’s the difference between Taxonomies and Ontologies? URL: http://www.ideaeng.com/taxonomies-ontologies-0602 [2019.02.01]

[4] Daniel Tunkelang. 2017. Taxonomies and Ontologies. URL: https://queryunderstanding.com/taxonomies-and-ontologies-8e4812a79cb2 [2019.02.01]

[5] Nathan Winant. 2014. What are the advantages of semantic reasoning over machine learning? URL: https://www.quora.com/What-are-the-advantages-of-semantic-reasoning-over-machine-learning [2019.02.01]