‘So clever I don’t understand a word of what I am saying’ – AI’s potential for handling text-based data is far from unlimited


The often-expressed fear that AI robots are on the verge of infiltrating and taking control of every aspect of our lives is admittedly understandable, given the capabilities of AI already proclaimed today: writing guest articles for newspapers, answering basic customer service queries, diagnosing medical conditions, solving long-standing scientific problems in biology and much more – if you believe the sources [1],[2]. But are these purported successes really evidence of unlimited potential? Will AI systems really be able to solve any task given enough time and data? You may not want to hear this, but the answer is a resounding NOPE. At least not if researchers and developers stick to the knowledge-lean approaches they are currently so fixated on.

To start with, artificial intelligence is not even remotely comparable to human intelligence. Any AI system is a sophisticated ‘bag of tricks’ designed to lull humans into believing it has some kind of understanding of the task at hand. So, to develop AI-based technologies that are intelligent in any sense of the word, there is no way around feeding these technologies with extensive human knowledge – which involves a substantial amount of human labor and will continue to do so for the foreseeable future. As a consequence – and again, you may not want to hear this – relying on solutions for HR and labor market management based solely on deep learning (DL) or other statistical/machine learning (ML) approaches is a bad investment. And it is not simply a waste of money, it is also an ethical issue: Especially in the field of HR, the reliability of data and evaluations is crucial, as it can deeply affect human lives. For instance, all too often perfectly suitable candidates are screened out by AI-based tools such as applicant tracking systems (ATS) just because their resume does not contain the exact keywords specified in the filter or is associated with false contexts – something a human recruiter would have realized had they seen the resume themselves. This is just one of many examples of how real people can be affected by underperforming AI technology.
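
To see how blunt such keyword filters can be, consider a toy sketch of the kind of screening logic described above. Everything here – the required keywords, the resume text, the function name – is invented for illustration; this is not any actual ATS vendor’s implementation.

```python
# Toy sketch of naive keyword-based resume screening.
# Keywords and resume text are invented for illustration only.

REQUIRED_KEYWORDS = {"project management", "stakeholder management", "agile"}

def keyword_screen(resume_text: str) -> bool:
    """Pass the resume only if every required keyword appears verbatim."""
    text = resume_text.lower()
    return all(kw in text for kw in REQUIRED_KEYWORDS)

# A clearly qualified candidate who simply phrases things differently:
resume = ("Led cross-functional Scrum teams, coordinated budgets and "
          "timelines, and managed expectations of senior stakeholders.")

print(keyword_screen(resume))  # False – rejected despite matching skills
```

A human recruiter reads ‘led cross-functional Scrum teams’ and immediately infers agile project management experience; the filter, lacking any notion of meaning, sees only missing strings.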

Artificial – as in fake

While researchers define AI systems as ones that perceive their environment and take actions that maximize their chance of achieving their goals, the popular perception of AI is that it aims to approach human cognition. Intelligence is typically defined as the ability to learn, understand, and form judgments or opinions based on reason, or to deal with new or trying situations. However, this requires a key cognitive ability: storing and using commonsense knowledge, which we humans develop through a combination of learning and experience – and which, so far, AI systems simply do not have and will not acquire in the foreseeable future. These limitations are most obvious in natural language processing (NLP) and natural language understanding (NLU) techniques based on ML, because commonsense knowledge is absolutely essential when it comes to understanding natural language. As an example, consider the following statement:

Charlie drove the bus into a tree.

Nowhere in this sentence does it explicitly state that Charlie is a human being, was in the bus, or that this is uncommon behavior. And yet our commonsense knowledge allows us to draw these and many other conclusions from this simple sentence without much effort. This ability, which the linguist Noam Chomsky termed ‘linguistic competence’, fundamentally distinguishes human cognition from computer systems trained in NLP and NLU. While we humans acquire this linguistic competence at an early age and can use it to discern the meaning of arbitrary linguistic expressions, knowledge-lean AI models will never be able to do so to the same extent because they work on a purely quantitative basis: their ‘intelligence’ is based on statistical approximations and (occasionally pointless) memorization of text-based data. ML systems can, at times, sidestep the problem of understanding and give the impression that they are behaving intelligently – provided they are fed enough data and the task is sufficiently narrowed down. But they will never actually understand the meaning of words; they simply lack the connection between form (language) and content (relation to the real world) [1].

This is precisely why even the most advanced AI models still struggle with these types of statements: because they contain so much implicit, often important information and causalities. For example, GPT-3, a state-of-the-art AI-based language model (which wrote the newspaper article cited at the beginning), was unable to correctly answer the simple question of whether a toaster or a pencil was heavier [1]. This is somewhat reminiscent of a quote from Oscar Wilde’s The Remarkable Rocket: “I am so clever that sometimes I don’t understand a single word of what I am saying”…

A major reason for this problem is that commonsense knowledge comprises an inconceivable number of facts about how the world works. We humans have internalized these facts through lived experience and can use them in expressing and understanding language without ever having to encode this staggering amount of knowledge into a written form. And precisely because this tacit knowledge is not captured systematically, AI systems have no access to it – or at least knowledge-lean AI systems don’t, i.e., systems based purely on statistical/ML approaches. So these systems face insurmountable challenges when tasked with understanding language, because so much of what language conveys is ‘unexpected’ – never spelled out in the data they were trained on.

Another simple example: In a statistical analysis of words related to the English word pen, an ML system may spit out the words Chirac and Jospin, because these names are often mentioned together with the French politician Marine Le Pen – who, of course, has nothing to do with writing tools. It gets even more complicated when the same expression takes on different meanings depending on the context – think writing pen versus sheep pen. Systems based purely on ML often have great difficulty in discerning the nuances of such everyday language because they do not store the meanings of a word; connections are based purely on co-occurrence. So, in the knowledge-lean world, there is still a long way to go to reliable NLU.

No AI without HI

Having been around since the 1950s, AI has cycled through phases of hype and disillusionment many times. And right now, at least in the subfield of NLU, we are cycling back into the ‘trough of disillusionment’, as Gartner so aptly calls it. Nevertheless, many are still clinging to the great promises, blithely publishing, touting and investing in knowledge-lean AI technologies. But relying completely on ML-based algorithms for any application that requires language understanding is nothing but an expensive mistake. As we have already explained, it is a huge leap from automated processing of textual data (NLP) to meaningful, human-like understanding (NLU) of this information by machines. Thus, many automation plans will remain an illusion. It is high time to switch to a strategy that can succeed in these challenging tasks by effectively creating artificial intelligence through human intelligence.

In our area of expertise here at JANZZ, where we (re)structure and match job-related data, we understand that many automated tasks in big data require a significant amount of human labor and intelligence. Our job and resume parsing tool JANZZparser! has relied on NLP and NLU since the beginning – but always combined with human input: Our data analysts and ontology curators carefully and continuously train and adapt the language-specific deep learning models. NLP tasks are trained using our in-house, hand-selected corpus of gold-standard training data. Parsed information is standardized and contextualized using our hand-curated ontology JANZZon!, the most comprehensive multilingual knowledge representation for job-related data worldwide.

This machine-readable knowledge base contains millions of concepts such as occupations, skills, specializations, education programs and experiences that are manually linked by our domain-specialized experts according to their relations with each other. JANZZon! integrates both data-driven knowledge from real job postings and resumes and expert information from international taxonomies such as ESCO or O*NET. This is the only way to ensure that our technologies can develop the kind of language understanding that actually deserves the name artificial intelligence. Generic phrases such as flexibility are given the relevant context, be it in terms of time management, thinking, or other aspects. As a result, false matches such as Research and Ontology Management Specialist with occupations like those in the figure below – matches due to overlap in wording but not in content – are excluded from matching results in our knowledge-based systems. This unique combination of technology and human intelligence in machine-readable form achieves highly accurate, reliable and cross-linguistic/cross-cultural results when processing job-related data. Errors like the one in the pen example simply do not occur because each word is conceptually linked to the correct and relevant meanings and points of association.
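
The difference to the co-occurrence world can be sketched in a few lines. The following is a deliberately minimal, invented illustration of the knowledge-based principle – explicit concepts with hand-curated relations deciding which meaning applies – and not the actual JANZZon! data model or API.

```python
# Minimal sketch of knowledge-based disambiguation: each surface form is
# linked to explicit concepts, and curated relations pick the right one.
# All concepts and relations here are invented for illustration.

ONTOLOGY = {
    "pen:writing_tool": {"write", "ink", "paper", "desk"},
    "pen:enclosure":    {"sheep", "livestock", "farm", "fence"},
}

def disambiguate(surface: str, context: set) -> str:
    """Pick the concept whose curated relations overlap the context most."""
    candidates = {c: rels for c, rels in ONTOLOGY.items()
                  if c.startswith(surface + ":")}
    return max(candidates, key=lambda c: len(candidates[c] & context))

print(disambiguate("pen", {"sheep", "farm"}))  # pen:enclosure
print(disambiguate("pen", {"ink", "paper"}))   # pen:writing_tool
```

Because the relations are explicit, the result is also explainable: the system can state exactly which curated links drove the decision, rather than pointing to opaque statistical weights.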


Throwing good money after bad

The fact that we are on the right track with our hybrid, knowledge-based method of combining human intelligence with state-of-the-art ML/DL methods is not only confirmed by our own experience and successful cooperation with businesses and public employment services (PES) across the globe, but also widely recognized by non-commercial NLU researchers. The outlined problems around the missing cognitive component in knowledge-lean AI systems will not be resolved in the next 50 years. As soon as human language is involved, there will always be countless cases where a 3-year-old child can make the correct semantic connection while a machine-learned tool either fails or succeeds only with absurdly high effort. Although knowledge-based systems like ours provide reliable and explainable analysis of language, they fell from grace because researchers and developers perceived the manual effort of knowledge engineering as a bottleneck. The search for other ways to deal with language processing led to the knowledge-lean paradigm. Nowadays, supported by the immense speed and storage capacity of computers, most have come to rely on applying generic machine learning algorithms to ever-growing datasets for very narrow tasks. Since this paradigm shift, many developers and consumers have invested a lot of time and money in these systems. Being so heavily invested financially, they are simply not prepared to admit that this approach cannot produce the results they are looking for, despite the growing evidence against it.

However, the hybrid, knowledge-based approach of combining ML-based features with human-generated semantic representations can significantly improve the performance of systems that depend on language understanding. In our case, by adopting this approach, our technology can avoid the pitfalls of knowledge-lean systems based on uncontrolled AI processes, simple keyword matching and meaningless extractions of intrinsically context-poor and quickly outdated taxonomies. Instead, our matching and analytics solutions can draw on the smart data generated by our ontology. This context-based, constantly updated knowledge representation can be used in a variety of ways for intelligent classification, search, matching and parsing operations, as well as countless other processes in the area of job-related data. Especially in HR analytics, our solutions achieve above-average results that far exceed the performance of comparable offerings on the market. Thanks to these insights, employers are able to make well-informed decisions in talent management and strategic workforce planning based on smart, reliable data.

Do the right thing and do it right

Finally, there are the ethical concerns of applying AI to textual data. There are numerous examples that illustrate what happens when the use of machine learning systems goes awry. In 2016, for example, a software manufacturing giant’s chatbot caused a public controversy because, after an unsolicited, brief training session by Internet trolls, it almost immediately started spouting sexist and racist insults instead of demonstrating the company’s NLP technology in an entertaining and interactive way as planned. The challenge of developing AI that shares and reliably acts in accordance with humanity’s moral values is an extremely complex (and possibly unsolvable) task. However, given the trend toward entrusting machine learning systems with real-world responsibilities, this is an urgent and serious matter. In industries such as law enforcement, credit or HR, the inadequate use of AI and ML is all the more delicate. Talent and labor market management, for instance, directly affects the lives of real people. Therefore, every decision must be justifiable in detail; faulty, biased or any kind of black-box automation with a direct impact on essential decisions in these matters must be weeded out. This stance is also taken by the European Commission in its white paper on AI and the associated future regulations, especially in the HR sector. As a matter of fact, almost all of the highly praised AI systems for recruiting and talent management on the market, mainly originating from the USA, would be banned under these planned regulations. JANZZ.technology’s approach is currently the only one that will be compatible with these planned regulatory adjustments. And this has a great deal to do with our knowledge representation and how it allows us to produce not just AI technology that comes very close to understanding language, but in fact explainable AI.
So ultimately, the way forward is to appreciate that – in the words of NLU researcher McShane – there is no bottleneck, there is simply work that needs to be done.

At JANZZ.technology, we have done this work for you, with experts from diverse backgrounds in terms of language, experience, education and culture. Their pooled knowledge is incorporated into our ontology JANZZon! and made readable and processable for both machines and humans. Together, our experts have created and continuously curate the best possible and most comprehensive representation of the ever-growing heterogeneity of job-related knowledge in the field of human resources and labor market administration. Enabling multilingual, modular and bias-free solutions for all HR processes – and bringing you a step closer to truly intelligent HR and labor market management solutions. If you would like to learn more about our expertise and our products, or benefit from advice tailored to your organization’s individual situation, please contact us at info@janzz.technology or via contact form, visit our product page for PES and listen in on our new podcast series.


[1] Toews, Rob. 2021. What Artificial Intelligence Still Can’t Do. URL: https://www.forbes.com/sites/robtoews/2021/06/01/what-artificial-intelligence-still-cant-do/amp/
[2] GPT-3 (Guardian). 2020. A robot wrote this entire article. Are you scared yet, human? URL: https://www.theguardian.com/commentisfree/2020/sep/08/robot-wrote-this-article-gpt-3

Teleworking, teletravail, teletrabajo… Who is working remotely?

In Washington D.C., metro ridership is only 30 percent of its 2019 level. The hustle and bustle of the city has not returned, as employers are uncertain about when and how to reopen offices due to the Delta variant and, at present, Omicron. A Capital Covid survey conducted by the Greater Washington Partnership revealed that less than half of employees were expected to be back in the office on an average workday this fall. The slow return to work in the Washington region highlights new trends in work-from-home and hybrid arrangements becoming the business norm.

Across the world, employers and workers alike are coming to terms with more flexible working arrangements. In 2020, employers were not ready for their entire workforce to work remotely. Prior to the pandemic, about 17 percent of Americans worked remotely 5 days a week. As of September 2021, 45 percent of full-time employees in the United States were partly or fully remote, per Gallup’s monthly employment trends update. About two-thirds of white-collar workers continue to work from home, either fully (41 percent) or in a hybrid arrangement (26 percent).[1]

In Europe, highly skilled professionals were more likely to be working from home (WFH) pre-pandemic than other workers. Approximately 5 percent of EU nationals worked from home before Covid-19; per Statista data, that figure now stands at 12.3 percent doing “home office”, as it is called in Europe. These figures vary depending on where in the European Union workers find themselves.

Home-based work is nearly non-existent in Eastern European countries such as Bulgaria and Romania with less than 3 percent working remotely. In comparison, one in four Finnish workers do home office (25 percent) followed by Luxembourg and Ireland with about 20 percent teleworking. In countries such as France, Germany, Spain, and Portugal between 10 to 15 percent partake in WFH.

Unsurprisingly, the prevalence of remote-based work also varied by industry and profession pre-pandemic. Knowledge workers or those in ICT-intensive sectors in the Netherlands and Sweden (about 60 percent) did some form of telework, while less than 30 percent did so in Italy, Austria, and Germany.[2]

Yet this is not an option for workers in professions that require face-to-face interactions such as healthcare, hospitality, retail, and education. The gap between those who are WFH and those working in person appears to create societal cleavages, making society more unequal – as is currently seen in public debates in Switzerland.

On the employer-side, remote work brings new challenges to companies that rely on knowledge and creativity to spark new ideas and drive innovation. Workers miss out on face-to-face contact or “water cooler” chats that foster collaboration and help employees share information in ways that are limited or siloed by Slack channels, chat rooms, and email. Many executives also believe WFH cannot replace personal interactions that foster company culture. Productivity gains may also suffer in the long-term as collaboration declines amongst workers.

But so far, the Economist Intelligence Unit finds divergent views on workplace productivity: some 39 percent of executives believe WFH has increased productivity, while 33 percent find it has declined.

Globally, the study finds that company size and nature of the business impact productivity more than geographic location. Larger firms have more resources and digital tools to allow business continuity remotely – so, perhaps smaller firms without ICT uptake witnessed a productivity decline during the pandemic.

The uptake of remote working was accelerated by the pandemic, yet it remains more pronounced in the United States than in Europe – even as EU countries encourage home-based work a few days per week due to the nascent Omicron variant. Overall, American workers report being happier with the more flexible WFH lifestyle and improved well-being, coupled with reduced commute times. Gallup’s State of the Global Workplace reveals that 91 percent of U.S. workers who work at least partially remote hope to continue splitting their time between the office and home. Hybrid work is favored by 70 percent of workers who are partially on WFH, and by almost half of those fully on-site whose jobs could feasibly be performed from home. Only 6 percent of fully remote workers stated wanting to return to the office full time.[3]

Reevaluating work and the hybrid paradox

It has been nearly two years since the world first heard about Covid-19. In that time, organizations and employees have been nimble in adapting to all the surrounding complexities and disruptions to work life. The pandemic upended individuals’ relationship with work and made many rethink not only how they work but also when and where.

While the United States witnessed the “Great Resignation”, worldwide about 40 percent of workers considered leaving their current job in 2021. Microsoft’s Work Trend Index points towards a new social contract between organizations and employees. Successful organizations are those likely to accommodate individuals’ different work styles. Globally, a “hybrid paradox” appears to be gaining momentum with workers – people want to work from anywhere yet crave in-person connection.

At JANZZ.technology we strive to connect job seekers with jobs and businesses with talent powered by cognitive computing to find the best fit in labor market solutions.


[1] Saad and Wigert. October 2021. Remote Work Persisting and Trending Permanent. Gallup News. URL: https://news.gallup.com/poll/355907/remote-work-persisting-trending-permanent.aspx
[2] European Commission. 2020. Telework in the EU before and after the COVID-19: Where we were, where we head to. Science for Policy Briefs. Brussels. URL: https://ec.europa.eu/jrc/sites/default/files/jrc120945_policy_brief_-_covid_and_telework_final.pdf
[3] Gallup. 2021. State of the Global Workplace: 2021 Report. URL: https://www.gallup.com/workplace/349484/state-of-the-global-workplace.aspx?thank-you-report-form=1