The Expander endpoint lets a requestor find synonyms and highly related topics for the short phrases they provide. These tokens are returned in a standard JSON response, along with an array of rich matches from our curated data library for any surfaced tokens that we already know about. This makes it possible, for example, to match "ms word" with "Microsoft Word" and to show that it is closely related to "Microsoft Office" in many occupational contexts.
In general, the topics endpoint works best when given a small number (3 or fewer) of very short phrases, each consisting of a very small number (3 or fewer) of words. For example:
["ms word", "excel", "powerpint"] (the spelling mistake is intentional; it shows that the neural net can frequently interpret misspelled input the same way a human might).
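The size guidance above can be checked on the client side before a request is sent. The following is a minimal sketch and not part of the Expander API: the function name `check_phrases` and the two constants are assumptions made for illustration, and the limits are treated as suggestions because the text describes them as a recommendation, not a hard rule.

```python
MAX_PHRASES = 3  # suggested number of phrases per request (guidance, not an enforced limit)
MAX_WORDS = 3    # suggested number of words per phrase (guidance, not an enforced limit)


def check_phrases(phrases):
    """Return warnings for input that exceeds the suggested sizes (empty list if none)."""
    warnings = []
    if len(phrases) > MAX_PHRASES:
        warnings.append(
            f"{len(phrases)} phrases given; {MAX_PHRASES} or fewer is suggested"
        )
    for phrase in phrases:
        n_words = len(phrase.split())
        if n_words > MAX_WORDS:
            warnings.append(
                f"'{phrase}' has {n_words} words; {MAX_WORDS} or fewer is suggested"
            )
    return warnings


# The example input from the text is within the suggested sizes.
print(check_phrases(["ms word", "excel", "powerpint"]))  # prints []
```

Misspellings such as "powerpint" are deliberately not flagged here, since the endpoint is described as frequently able to handle them.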
We anticipate that use cases for this endpoint will focus on expanding sparse texts, where the author of a resume or job posting has mentioned certain tools, technologies, or characteristics that a human reader would infer to represent a much wider list of topics. Expanding the user's text into a list of raw tokens, our canonical library elements, or both can give downstream matching algorithms much richer text for keyword matching or other applications.
As with many of our endpoints, providing one or more O*NET 8-digit SOCs can help Expander better contextualize the end results, boosting matches that directly relate to the referenced occupations and discarding or downsampling those that do not. While this parameter is optional, supplying it is highly recommended for higher-quality results.
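As a rough illustration of a request body that carries both phrases and occupational context, the sketch below assembles a JSON payload. Only the `texts` name comes from this documentation. The `socs` key, the `build_payload` helper, and the assumption that SOCs are sent in the punctuated O*NET-SOC form (two digits, a hyphen, four digits, a period, two digits) are hypothetical and should be checked against the actual parameter reference.

```python
import json
import re

# O*NET-SOC codes have 8 digits, conventionally written like "15-1252.00".
# Whether the endpoint expects this punctuated form is an assumption.
SOC_PATTERN = re.compile(r"^\d{2}-\d{4}\.\d{2}$")


def build_payload(texts, socs=None):
    """Build a request body; 'socs' is a hypothetical key name for the optional SOC list."""
    payload = {"texts": list(texts)}
    if socs:
        bad = [soc for soc in socs if not SOC_PATTERN.match(soc)]
        if bad:
            raise ValueError(f"not in O*NET-SOC 8-digit format: {bad}")
        payload["socs"] = list(socs)
    return payload


body = build_payload(["ms word", "excel"], socs=["15-1252.00"])
print(json.dumps(body))
```

Validating the SOC format locally is a design choice in this sketch: it catches malformed codes before a request is spent on them.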
The root key in a successful JSON response is
result (the root key in a failure is
The matches array contains highly relevant matched elements from our curated library. Currently, this list can include Tools/Techs, Knowledges, Skills, Abilities, and Workplace Essentials. It's important to understand that this list of matches does not cover everything that might be related to the input text; rather, it provides additional detail for surfaced tokens and phrases that we have more information about. New technologies or concepts in particular (or things that simply don't fit into our current taxonomy) may not be represented in the matches array, even though tokens for them show up in the raw tokens list.
This is the raw list of tokens and short phrases surfaced by the neural net that underpins the topic expansion system. The tokens in this list are not guaranteed to be perfectly clean or grammatically clear; rather, they represent the arrangements of words that the neural net has learned are common in relation to your input text. For example, "photoshop" may surface tokens that reference the exact product, similar products, the parent company of the product, or general concepts around the product, such as "adobe photoshop", "adobe illustrator", "indesign", and two-token phrases like "photoshop illustrator" and "photoshop indesign", suggesting that the neural net has learned that these words are very often found together in work-related documents. We believe this 'raw' tokens list can be applied directly in matching systems as a hidden list of meta-topics attached to both sides of the match (e.g. resumes and job postings).
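To make the relationship between the two lists concrete, the sketch below separates raw tokens that have a curated match from those that do not. It rests on several assumptions that this documentation does not confirm: that both lists sit directly under the `result` root key, that the raw list is keyed `tokens`, and that each match is an object with a `name` field. The sample response is hand-written for illustration and is not real endpoint output.

```python
def unmatched_tokens(response):
    """Return raw tokens that have no curated match with the same name (case-insensitive)."""
    if "result" not in response:
        raise ValueError("no 'result' root key; treat the response as a failure")
    result = response["result"]
    # 'name' is a hypothetical field on each match object.
    matched_names = {m["name"].lower() for m in result.get("matches", [])}
    # 'tokens' is an assumed key name for the raw token list.
    return [t for t in result.get("tokens", []) if t.lower() not in matched_names]


# Hand-written sample shaped like the description above (not real output).
sample = {
    "result": {
        "matches": [{"name": "Adobe Photoshop", "type": "Tools/Techs"}],
        "tokens": ["adobe photoshop", "photoshop illustrator", "indesign"],
    }
}
print(unmatched_tokens(sample))  # prints ['photoshop illustrator', 'indesign']
```

Tokens left over this way can still be useful as hidden meta-topics even though no curated element describes them.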
Matching a single word or short phrase generally takes Expander a small fraction of a second. Additional texts provided in the texts array increase this latency, as the system has to iterate over and process all of these texts and handle an increasingly large set of results. For this reason, we suggest that users of Expander experiment with how many phrases to send in a single request. Where a multi-second delay is acceptable, one request with many texts may be fine. In other cases, it may be better to send only one or two phrases at a time and parallelize multiple endpoint requests to get maximum performance for your downstream application.
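The trade-off between one large request and several small parallel ones can be sketched with a thread pool. In the example below, `send_request` is a placeholder for whatever function actually calls the endpoint; `fake_send` stands in for it so the sketch runs without network access, and the helper names, batch size, and worker count are illustrative assumptions, not recommended values.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def expand_in_parallel(texts, send_request, batch_size=2, max_workers=4):
    """Send small batches concurrently; results come back in batch order."""
    batches = chunk(list(texts), batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_request, batches))


def fake_send(batch):
    """Stand-in for a real HTTP call; echoes the batch back under a 'result' key."""
    return {"result": {"tokens": list(batch)}}


responses = expand_in_parallel(["ms word", "excel", "powerpint"], fake_send)
print(len(responses))  # prints 2 (one batch of two phrases, one batch of one)
```

Threads suit this case because the work is dominated by waiting on network responses. The best batch size depends on measured latency, so it is worth timing a few settings against the real endpoint.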
Authorization Header Required
This endpoint requires an Authorization header attribute with a valid