This endpoint analyzes one or more short phrases to find related topics and matching elements from the SkillsEngine core library.

Explanation

The Expander endpoint lets a requestor find synonyms and closely related topics for the short phrases they provide. These tokens are returned in a standard JSON response, along with an array of rich matches from our curated data library whenever a surfaced token corresponds to something we already know about. This makes it possible, for example, to match "ms word" with "Microsoft Word" and to show that it is closely related to "Microsoft Office" in many occupational contexts.

In general, this endpoint works best when given a small number (three or fewer) of very short phrases, each consisting of a small number (three or fewer) of words, e.g. ["ms word", "excel", "powerpint"]. (The spelling mistake there is intentional: it shows that the neural net can frequently comprehend misspelled input in the same way that a human might.)
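As a rough illustration of the sizing guidance above, the following Python sketch checks a list of phrases before it is sent. The helper name check_texts and the two constants are hypothetical and are not part of the Expander API; the limits simply restate the "three or fewer" suggestion, which is guidance rather than a hard rule.

```python
# Hypothetical client-side helper; not part of the Expander API.
MAX_PHRASES = 3  # suggested upper bound on phrases per request
MAX_WORDS = 3    # suggested upper bound on words per phrase


def check_texts(texts):
    """Return a list of warnings for input that exceeds the suggested sizes."""
    warnings = []
    if len(texts) > MAX_PHRASES:
        warnings.append(
            f"{len(texts)} phrases given; {MAX_PHRASES} or fewer is suggested"
        )
    for phrase in texts:
        n_words = len(phrase.split())
        if n_words > MAX_WORDS:
            warnings.append(
                f"{phrase!r} has {n_words} words; {MAX_WORDS} or fewer is suggested"
            )
    return warnings


# The misspelling is kept on purpose, as in the example above.
print(check_texts(["ms word", "excel", "powerpint"]))
```

An empty list means the input falls within the suggested sizes; longer input is still allowed, it just may not produce the best results.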

We anticipate that use cases for this endpoint will focus on expanding sparse texts, where the author of a resume or job posting has mentioned certain tools, technologies, or characteristics that a human reader would infer to be representative of a much wider list of topics. Expanding the user's text into a list of raw tokens, our canonical library elements, or both can provide downstream matching algorithms with much richer text for keyword matching or other applications.

The socs Parameter

As with many of our endpoints, providing one or more 8-digit O*NET SOCs helps Expander better contextualize its results, boosting matches that relate directly to the referenced occupations and discarding or downsampling those that do not. Although this parameter is optional, it is highly recommended for higher-quality results.
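A minimal Python sketch of assembling request parameters with optional SOCs might look like the following. The names texts and socs come from this page, but whether they are sent as a JSON body or as query parameters is an assumption here, and build_payload is a hypothetical helper. The pattern checks only the general O*NET-SOC shape (two digits, a hyphen, four digits, a period, two digits), not whether the code actually exists.

```python
import json
import re

# Shape of an 8-digit O*NET-SOC code, e.g. "43-6014.00".
SOC_PATTERN = re.compile(r"^\d{2}-\d{4}\.\d{2}$")


def build_payload(texts, socs=None):
    """Build a parameter dict; 'socs' is included only when provided.

    Hypothetical helper: the transport format (JSON body) is an assumption.
    """
    payload = {"texts": list(texts)}
    if socs:
        bad = [s for s in socs if not SOC_PATTERN.match(s)]
        if bad:
            raise ValueError(f"not in 8-digit O*NET-SOC format: {bad}")
        payload["socs"] = list(socs)
    return payload


print(json.dumps(build_payload(["ms word"], socs=["43-6014.00"])))
```

Validating the format on the client side is optional; it only catches obviously malformed codes before a request is made.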

Response Data Structure

Root Key

The root key in a successful JSON response is result (the root key in a failure is error).
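One way to handle the two root keys on the client side is sketched below in Python. Only the result and error root keys come from this page; the ExpanderError class and unwrap function are hypothetical, and the sample body, including the tokens and matches key names inside result, is hand-written for illustration rather than captured from the service.

```python
import json


class ExpanderError(Exception):
    """Hypothetical exception raised when the response root key is 'error'."""


def unwrap(body):
    """Return the value under 'result', or raise if the root key is 'error'."""
    data = json.loads(body)
    if "error" in data:
        raise ExpanderError(data["error"])
    return data["result"]


# Hand-written sample body; the inner structure is an assumption.
sample = '{"result": {"tokens": ["adobe photoshop"], "matches": []}}'
print(unwrap(sample)["tokens"])
```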

Matches

A list of highly relevant matched elements from our curated library. Currently, this list can include Tools/Techs, Knowledges, Skills, Abilities, and Workplace Essentials. It's important to understand that this list does not cover everything that might be related to the input text; rather, it provides additional detail for surfaced tokens and phrases that we have more information about. New technologies or concepts in particular (or things that simply don't fit into our current taxonomy) may not be represented in the matches array, even though raw tokens for them appear in the tokens array.

Tokens

This is the raw list of tokens and short phrases surfaced by the neural net that underpins the topic expansion system. The tokens are not guaranteed to be perfectly clean or grammatically clear; rather, they represent the arrangements of words that the neural net has learned are common insofar as they relate to your input text. For example, "photoshop" may surface tokens that reference the exact product, similar products, the parent company of the product, or general concepts around the product, such as "adobe photoshop", "adobe illustrator", "indesign", and two-token phrases like "photoshop illustrator" and "photoshop indesign", suggesting that the neural net has learned that these words are very often found together in work-related documents. We believe that this 'raw' tokens list can have direct application in matching systems, where the tokens serve as a hidden list of meta-topics attached to both sides of the match (e.g. resumes and job postings).
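To make the "hidden meta-topics on both sides" idea concrete, here is a small Python sketch that scores the overlap between two token lists with Jaccard similarity. Jaccard is a technique chosen for this illustration, not something Expander prescribes, and the two lists are hand-written from the example tokens above rather than real endpoint output.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hand-written stand-ins for expanded tokens attached to each document.
resume_tokens = ["adobe photoshop", "adobe illustrator", "indesign"]
posting_tokens = ["adobe photoshop", "indesign", "photoshop illustrator"]

# Two shared tokens out of four distinct tokens overall.
print(jaccard(resume_tokens, posting_tokens))  # prints 0.5
```

A production matcher would likely weight tokens rather than treat them equally, but the set overlap shows how expanded tokens can connect documents that share no literal keywords.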

❗️

Performance Caveat

Matching a single word or short phrase generally takes Expander a small fraction of a second. Each additional text in the texts array increases this latency, because the system has to process every text and handle an increasingly large set of results. For this reason, we suggest that users of Expander experiment with how many phrases they send in a single request to the endpoint. In cases where a multi-second delay is acceptable, one request with many texts may be fine. In other cases, it may be more advantageous to send only one or two phrases at a time and parallelize multiple requests to get maximum performance for your downstream application.
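The batching and parallelizing approach described above could be sketched in Python as follows. The expand function here is a local placeholder that does not call the service, and its return shape is an assumption; in real use it would be replaced by one HTTP request per batch. The batch_size and max_workers defaults are arbitrary starting points to experiment with.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def expand(texts):
    """Placeholder for one Expander request; replace with a real HTTP call."""
    return {"tokens": [t.lower() for t in texts], "matches": []}


def expand_parallel(texts, batch_size=2, max_workers=4):
    """Send small batches concurrently; results come back in batch order."""
    batches = chunked(texts, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(expand, batches))


results = expand_parallel(["ms word", "excel", "powerpint", "photoshop", "indesign"])
print(len(results))  # one result per batch of up to two phrases
```

Threads suit this case because each worker spends most of its time waiting on network I/O rather than computing.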

🚧

Authorization Header Required

This endpoint requires an Authorization header whose value has the form Bearer <access-token>, where <access-token> is a valid access token.
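As a sketch of attaching the header, the Python snippet below builds (but does not send) a request using the standard library. The URL is a made-up placeholder, and the POST method and JSON body are assumptions; only the Authorization: Bearer <access-token> header comes from this page.

```python
import json
import urllib.request

# Hypothetical placeholder; substitute the real Expander URL.
EXPANDER_URL = "https://api.example.com/expander"


def build_request(access_token, payload):
    """Build a request object carrying the required Authorization header."""
    return urllib.request.Request(
        EXPANDER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumed; check the endpoint reference for the method
    )


req = build_request("YOUR_ACCESS_TOKEN", {"texts": ["ms word"]})
print(req.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen would send it; that step is left out here because the URL is not real.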
