Token
A token is a single occurrence of a linguistic item in a text (for example, in speech or writing). It is contrasted with a type, which is the abstract category or class of a linguistic item, as distinct from its individual occurrences.
The number of tokens is the total number of words in a text or corpus, regardless of how often they are repeated. The number of types is the number of different words in a text or corpus. Hence, the sentence “a good wine is a wine that you like” contains nine tokens but only seven types, because “a” and “wine” each occur twice [ResearchGate].
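The count for the example sentence can be checked with a short Python sketch. It assumes the simplest possible tokenization, splitting on whitespace, which is enough for this sentence but not for text with punctuation; the variable names are illustrative only.

```python
from collections import Counter

sentence = "a good wine is a wine that you like"

# Assumption: whitespace splitting is an adequate tokenizer for this sentence.
tokens = sentence.split()

# Each distinct word form counts as one type, however often it occurs.
counts = Counter(tokens)
types = set(counts)

# Word forms that occur more than once account for the gap between the two counts.
repeated = sorted(word for word, n in counts.items() if n > 1)

print(len(tokens))  # 9
print(len(types))   # 7
print(repeated)     # ['a', 'wine']
```

The two repeated word forms each contribute one extra token, which is why nine tokens reduce to seven types.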
A token is the smallest unit that a corpus is composed of. It typically refers to:
Word forms.
Punctuation.
Digits.
Abbreviations, product names.
Anything else between spaces.
There are two types of tokens: words and non-words [Sketch Engine].
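The split into words and non-words can be illustrated with a minimal tokenizer sketch. The regular expression, the helper names `tokenize` and `is_word`, and the rule that a word is a token beginning with a letter are all assumptions made for this example; real corpus tools use more elaborate, language-specific rules, for instance to keep abbreviations in one piece.

```python
import re

# Assumption: a token is either a run of letters/digits or a single
# punctuation character. Whitespace only separates tokens.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Return the tokens of text in order of appearance."""
    return TOKEN_PATTERN.findall(text)


def is_word(token):
    """Assumed rule: a word is a token that begins with a letter."""
    return token[0].isalpha()


text = "I paid 25 dollars, then left."
tokens = tokenize(text)

words = [t for t in tokens if is_word(t)]
non_words = [t for t in tokens if not is_word(t)]

print(tokens)     # ['I', 'paid', '25', 'dollars', ',', 'then', 'left', '.']
print(words)      # ['I', 'paid', 'dollars', 'then', 'left']
print(non_words)  # ['25', ',', '.']
```

Under these assumptions the sentence has eight tokens: five words and three non-words (one number and two punctuation marks).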
A token is an individual occurrence of a linguistic item in speech or writing that is contrasted with type [Lexico: Oxford English Dictionary].
Token. Lexico: Oxford English Dictionary. Retrieved from: https://www.lexico.com/definition/token.
What is the difference between Word Type and Token? ResearchGate. Retrieved from: https://www.researchgate.net/post/What_is_the_difference_between_Word_Type_and_Token.