Getting Topical

One of the most common needs we hear about when talking with partners is expanding the topics extracted from user-created text, whether job postings, resumes, or curricula, to include their synonyms, near-synonyms, and other highly related topics.

Put simply, a human reading a resume or job posting might know that "ms word" refers to "Microsoft Word", that "Word" in a list of software on a resume probably refers to the same Microsoft product, and that all of these are part of the "Microsoft Office" suite. But computers and keyword-based matching algorithms can find this simple task extremely challenging.

One of the classic ways of addressing this challenge is to build a massive ontology of topics and their relatedness, then search through it every time a new phrase is presented to try to find matches. But this approach is fraught with problems of its own. The string "Microsoft Word" doesn't match "msword", "word", "ms word", "m$ word", "microsft word", or "micosoft word", and it certainly doesn't match the phrase "I am experienced with the entire Office suite of products", which never mentions "Microsoft Word" in the first place. Add to that the problem of "freshness": your ontology has to be continually updated, not only with new topics that arise in the world of work, but also with every possible human-entered permutation of how those topics are mentioned. This is a daunting task. So text-matching against an ontology, even with sophisticated regular expressions and deep libraries of constantly updated associations, can only get you so far.
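To make that brittleness concrete, here is a minimal sketch in Python of lookup-based matching. It is not our implementation or any real ontology: the `ALIASES` table, `normalize`, and `lookup` are hypothetical names invented for illustration, and the table is deliberately tiny.

```python
import re

# Hypothetical, hand-maintained alias table (an assumption for illustration):
# every surface form must be listed explicitly to be recognized.
ALIASES = {
    "microsoft word": "Microsoft Word",
    "ms word": "Microsoft Word",
    "msword": "Microsoft Word",
    "word": "Microsoft Word",
}


def normalize(phrase: str) -> str:
    """Lowercase the phrase and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", phrase.strip().lower())


def lookup(phrase: str):
    """Return the canonical topic for a phrase, or None if it is not listed."""
    return ALIASES.get(normalize(phrase))


if __name__ == "__main__":
    samples = [
        "MS  Word",        # listed, after normalization
        "microsft word",   # misspelling, not listed
        "m$ word",         # creative spelling, not listed
        "I am experienced with the entire Office suite of products",
    ]
    for text in samples:
        print(repr(text), "->", lookup(text))
```

Only the first sample resolves to a topic. Each of the others would need its own entry in the table, which is the maintenance burden described above, and the last one has no keyword to hang an entry on at all.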

We've taken a new approach, which we call Related Topic Expansion. It combines our curated SkillsEngine data library with a state-of-the-art neural network that learns, over the course of reading millions of work-related documents, which textual phrases are related to one another. We then use that neural net's inferences to help us surface the most relevant Tools, Techs, Knowledges, Skills, Abilities, and Workplace Essentials from our library. We also provide the raw tokens inferred by the neural net, for use as search metadata in downstream matching applications or in advanced authoring applications.

We are very excited to roll out this new Related Topic Expansion endpoint, and even more interested to see how our partners ultimately use the new service. Our hope is that this endpoint, coupled with our other advanced API services, will help application developers of all stripes build truly smart matching, authoring, and suggestion features into their own apps.