You can set a limit on the number of tokens per AI Chat response. omniAI supports a maximum of 32,768 tokens shared between the prompt (including the system message, examples, message history, and user query) and the AI Chat response. One token is roughly 4 characters of typical English text.


The omniAI model processes text by dividing it into tokens, which can be whole words or groups of characters.


For example, the word fantastic would be split into the tokens fan, tas, and tic, while a shorter word like gold would be a single token. Many tokens begin with a leading space, such as " hi" and " greetings".


The number of tokens processed in a single API request depends on the length of the input and output text.


As a general guideline, one token is roughly equivalent to 4 characters or 0.75 words for English text.


1 token ~= 4 chars in English

1 token ~= ¾ words

100 tokens ~= 75 words

Or, equivalently:


1-2 sentences ~= 30 tokens

1 paragraph ~= 100 tokens

1,500 words ~= 2048 tokens
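The rules of thumb above can be turned into a quick estimate in code. This is a minimal sketch: `estimate_tokens` is a hypothetical helper, not part of any omniAI API, and it only averages the characters-per-token and words-per-token heuristics from the list above.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text, averaging two heuristics:
    ~4 characters per token and ~0.75 words per token."""
    by_chars = len(text) / 4        # 1 token ~= 4 chars
    by_words = len(text.split()) / 0.75  # 1 token ~= 3/4 of a word
    return round((by_chars + by_words) / 2)

sample = "The quick brown fox jumps over the lazy dog."
print(estimate_tokens(sample))  # 12
```

Real tokenizers can differ noticeably from this estimate, especially for code, non-English text, or unusual formatting, so treat the result as a ballpark figure only.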


It's important to note that the combined length of the text prompt and generated completion must not exceed the AI's maximum context length of 32,768 tokens, which at roughly 0.75 words per token is approximately 24,500 words of English text.
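Because the prompt and the response share one budget, the room left for a response is simply the context limit minus the prompt size. A small sketch, assuming a hypothetical helper name (`max_response_tokens` is not an omniAI API call):

```python
CONTEXT_LIMIT = 32_768  # tokens shared between prompt and response

def max_response_tokens(prompt_tokens: int, limit: int = CONTEXT_LIMIT) -> int:
    """Return how many tokens remain for the response given the prompt size."""
    if prompt_tokens >= limit:
        raise ValueError("Prompt alone exceeds the context limit")
    return limit - prompt_tokens

print(max_response_tokens(30_000))  # 2768 tokens left for the response
```

If the prompt already fills most of the window, the response will be cut short no matter how high the Number of Tokens parameter is set.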


Impact of Number of Tokens Value on Generated Content


The Number of Tokens parameter controls the maximum number of tokens that can be generated in AI Chat.


A token is a discrete unit of meaning in natural language processing.


In GPT, each token is represented as a unique integer, and the set of all possible tokens is called the vocabulary.



When generating text with AI Chat, the model takes in a prompt (also called the "seed text"), which serves as the starting point for the generated text. The model then uses this prompt to generate a sequence of tokens, one at a time.
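The one-token-at-a-time loop can be illustrated with a toy stand-in for the model. Everything here is an assumption for demonstration: `VOCAB`, `next_token`, and `generate` are invented names, and a real LLM would sample the next token from a learned probability distribution rather than a seeded random choice.

```python
import random

# Toy vocabulary; a real model's vocabulary has tens of thousands of tokens.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]

def next_token(context: list[str]) -> str:
    """Stand-in for the model: picks one token given the context so far."""
    random.seed(len(context))  # deterministic for the demo only
    return random.choice(VOCAB)

def generate(prompt: list[str], max_tokens: int) -> list[str]:
    """Append tokens one at a time until max_tokens or a stop token."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        tok = next_token(tokens)
        tokens.append(tok)
        if tok == ".":  # a stop token ends generation early
            break
    return tokens

print(generate(["the"], max_tokens=5))
```

Note that `max_tokens` bounds only the generated portion; the prompt tokens are counted separately, just as in the shared 32,768-token budget described earlier.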


The Max Tokens parameter acts as a limit on the model's output: if the model would otherwise generate very long text, setting this parameter lets you control the size of the generated text.


This parameter can be useful in different scenarios like:


  • When you want to generate a specific amount of text, regardless of how much context the model has.
  • When you want to constrain the generated text to a specific size so it fits a particular format or application.
  • When you want to reduce response time and cost, since shorter outputs take less time to generate.

However, keep in mind that setting the Number of Tokens parameter too low can prevent the model from fully expressing its ideas or completing a thought. It can also lead to incomplete sentences or grammatically incorrect output.
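A simple way to spot such truncation is to check whether the output ends with sentence-final punctuation. This is only a heuristic sketch (`looks_truncated` is an invented helper, and punctuation alone cannot catch every cut-off), but it can flag responses that likely hit the token limit mid-sentence:

```python
def looks_truncated(text: str) -> bool:
    """Heuristic: output cut off mid-sentence usually lacks end punctuation."""
    return not text.rstrip().endswith((".", "!", "?"))

print(looks_truncated("The answer is"))      # True
print(looks_truncated("The answer is 42."))  # False
```

When truncation is detected, raising the Number of Tokens value (within the shared context budget) usually lets the model finish its thought.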