omniAI processes text by breaking it down into tokens. Tokens can be whole words or just chunks of characters. For example, the word “hamburger” gets broken up into the tokens “ham”, “bur”, and “ger”, while a short and common word like “pear” is a single token. Many tokens start with whitespace, for example “ hello” and “ bye”.
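
To see these splits yourself, the snippet below is a minimal sketch using the open-source tiktoken library. It assumes omniAI uses a tiktoken-compatible encoding; the exact splits vary by encoding, so treat the output as illustrative.

```python
# Minimal sketch: inspect token splits with the open-source tiktoken library.
# Assumption: omniAI uses a tiktoken-compatible encoding; exact splits vary
# by encoding, so the output below is illustrative.
import tiktoken

enc = tiktoken.get_encoding("r50k_base")  # a classic GPT-3-era encoding

for text in ["hamburger", "pear", " hello", " bye"]:
    token_ids = enc.encode(text)
    pieces = [enc.decode([t]) for t in token_ids]
    print(f"{text!r} -> {len(token_ids)} token(s): {pieces}")

# With this encoding, 'hamburger' splits into pieces like ['ham', 'bur', 'ger'],
# while 'pear' stays a single token.
```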


The total number of tokens processed in a given request depends on the length of your input, your output, and your request parameters. The number of tokens processed also affects response latency and throughput for the models.


There are two types of tokens:

  1. Prompt tokens - used when you prompt omniAI with a question.
  2. Completion tokens - used when omniAI generates a response.
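
As an illustration of how the two kinds add up, a completion request typically reports both counts in its usage metadata. The field names below are a hypothetical sketch modeled on common LLM APIs, not a documented omniAI response shape.

```python
# Hypothetical usage metadata for a single request. The field names are an
# assumption modeled on common LLM APIs, not documented omniAI fields.
usage = {
    "prompt_tokens": 12,      # tokens in the prompt you sent
    "completion_tokens": 35,  # tokens in the generated response
}

# Both kinds count toward the total processed for the request.
total_tokens = usage["prompt_tokens"] + usage["completion_tokens"]
print(total_tokens)  # 47
```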


Tokens can be thought of as pieces of words. Before the API processes a prompt, the input is broken down into tokens. These tokens are not cut exactly where words start or end; tokens can include trailing spaces and even sub-words. Here are some helpful rules of thumb for understanding tokens in terms of length (a rough estimator based on these ratios is sketched after the list):


1 token ~= 4 characters in English
1 token ~= ¾ of a word
100 tokens ~= 75 words

Or, equivalently:

1-2 sentences ~= 30 tokens
1 paragraph ~= 100 tokens
1,500 words ~= 2,048 tokens
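
As a quick sanity check, these ratios can be turned into a rough estimator. The sketch below is a heuristic for English text only, not the tokenizer itself; use a real tokenizer for exact counts.

```python
# Rough token estimates for English text, using the rules of thumb above.
# Heuristic only; use a real tokenizer for exact counts.
def estimate_tokens_from_chars(text: str) -> int:
    # ~1 token per 4 characters
    return max(1, round(len(text) / 4))

def estimate_tokens_from_words(text: str) -> int:
    # ~3/4 of a word per token, i.e. roughly 4/3 tokens per word
    return max(1, round(len(text.split()) / 0.75))

sample = "You miss 100% of the shots you don't take"
print(estimate_tokens_from_chars(sample))  # 10
print(estimate_tokens_from_words(sample))  # 12
# The actual count for this quote (see below) is 11, so both estimates land close.
```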


To get additional context on how tokens stack up, consider this:


Wayne Gretzky’s quote "You miss 100% of the shots you don't take" contains 11 tokens.


omniAI’s charter contains 476 tokens.


The transcript of the US Declaration of Independence contains 1,695 tokens.


How words are split into tokens is also language-dependent. For example, ‘Cómo estás’ (‘How are you’ in Spanish) contains 5 tokens (for 10 characters). The higher token-to-character ratio can make it more expensive to use the API for languages other than English.
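
You can check this yourself with the same tiktoken-based approach as earlier. This again assumes a tiktoken-compatible encoding; the exact counts depend on the encoding you pick.

```python
# Compare token-to-character ratios across languages with tiktoken.
# Assumption: omniAI uses a tiktoken-compatible encoding; counts are
# encoding-dependent.
import tiktoken

enc = tiktoken.get_encoding("r50k_base")

for phrase in ["How are you", "Cómo estás"]:
    n_tokens = len(enc.encode(phrase))
    print(f"{phrase!r}: {n_tokens} tokens for {len(phrase)} characters "
          f"(ratio {n_tokens / len(phrase):.2f} tokens per character)")
```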


To further explore tokenization, you can use OpenAI’s interactive Tokenizer tool, which allows you to calculate the number of tokens and see how text is broken into tokens.