Tokens

The atomic units that LLMs read and write — sub-word pieces produced by a tokenizer. Pricing and context limits are measured in tokens, not words.

A token is the basic unit of text an LLM processes. Tokenizers break text into sub-word pieces using algorithms such as byte-pair encoding (BPE), implemented in libraries such as SentencePiece and tiktoken. As a rough rule, 1 token ≈ 0.75 English words or 4 characters. Numbers, code and non-English languages often use more tokens per character.
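The rule of thumb above can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer: the function name `estimate_tokens` is hypothetical, and the result is only an approximation for English prose. For exact counts, use the tokenizer that matches your model.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate for English prose (heuristic, not a tokenizer)."""
    # Heuristic 1: about 4 characters per token.
    by_chars = math.ceil(len(text) / 4)
    # Heuristic 2: about 0.75 words per token.
    by_words = math.ceil(len(text.split()) / 0.75)
    # Take the larger value so budgets err on the safe side.
    return max(by_chars, by_words)


sample = "The quick brown fox jumps over the lazy dog"
print(estimate_tokens(sample))
```

Expect the estimate to undercount for code, numbers and non-English text, which tend to use more tokens per character.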

Every input and output is billed in tokens, with prices almost always quoted per million tokens (Mtok). Input tokens are typically 4–8× cheaper than output tokens. Cached or batched input is cheaper still on most providers.
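Because prices are quoted per million tokens, the cost of a single call is simple arithmetic. The sketch below assumes separate input and output rates. The function name and the prices in the example are placeholders, not any provider's actual rates.

```python
def call_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Cost of one call in USD, given prices quoted per million tokens."""
    total = (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    )
    return total / 1_000_000


# Hypothetical prices: $3 per Mtok input, $15 per Mtok output.
cost = call_cost_usd(10_000, 1_000, 3.0, 15.0)
print(f"${cost:.4f}")
```

With these placeholder rates, 1,000 output tokens cost half as much as 10,000 input tokens, which is why long responses often dominate the bill.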

Knowing your token counts matters: they determine what a call costs, whether the prompt and response fit within the context window, and how much retrieved or attached content you can include in a single call.
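One way to apply this is to treat the context window as a budget: subtract the prompt and the space reserved for the response, then add retrieved chunks until the remainder runs out. The sketch below is illustrative only. The function name, parameters and numbers are assumptions, and the chunk token counts are expected to come from a real tokenizer.

```python
def select_chunks(
    chunk_token_counts: list[int],
    context_window: int,
    prompt_tokens: int,
    reserved_output_tokens: int,
) -> tuple[list[int], int]:
    """Return indices of chunks that fit, in order, and the tokens they use."""
    # Tokens left for retrieved content after the prompt and the reply.
    budget = context_window - prompt_tokens - reserved_output_tokens
    selected: list[int] = []
    used = 0
    for index, count in enumerate(chunk_token_counts):
        if used + count > budget:
            break  # stop at the first chunk that would overflow the budget
        selected.append(index)
        used += count
    return selected, used


# Hypothetical numbers: 2,000-token window, 600-token prompt, 500 reserved for output.
print(select_chunks([300, 500, 400], 2000, 600, 500))
```

Stopping at the first chunk that does not fit keeps the chunks in ranked order. A variant could skip oversized chunks and keep trying smaller ones.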
