2. A crash course in generative AI

To get the best out of AITiny you need at least a passing understanding of how generative AI works.

Just to be clear, knowing all of this is NOT required to use AITiny. The default settings are perfectly fine for the casual user who just wants to get things done, in much the same way as the AI writing tools offered by your phone or operating system.

That said, the better you understand how generative AI works, the more you will get out of it. Being able to fine-tune AITiny for your specific use case will let you be far more productive, even when the information given to you by your client is, let's say, not great to begin with.

At the very least, read the first two paragraphs below. As you're using AITiny you can refer back to this document to gain a better understanding of why things happen the way they do, and figure out what you can do to help generative AI work the way you intended it to.

Generative AI is maths, not actual intelligence

Generative AI uses Large Language Models (LLMs) to process text into other text. These LLMs are statistical models which pretty much allow the computer to guess which words are likely to go together, based on what other words are already there. This produces generally high quality, structured text which gives the illusion of intelligence. The reality is that the computer cannot “think” in the way we humans, or even our pets, can. It's maths all the way down. This is very important to remember, especially because of our human tendency to anthropomorphise machines (attribute human traits to inanimate objects).

Models dictate the observed behaviour

Different models have been trained on different sets of source data, using different methods. As a result, they will behave very differently given the same input. This means that not all models work equally well for any given type of problem. For example, we've found that Microsoft's Phi models work much better for content summarisation and changing the tone of voice, but fail gloriously at fact finding and translation. Conversely, the Aya models work great for translation, but are not as concise when asked to summarise text. Google's Gemma models are great at rewriting text, as long as you are asking for dry corporate-speak. Meta's Llama models, on the other hand, will rewrite content in a friendly tone without sounding insincere, something Phi failed to do. The only way to find out which model fits your use case is experimentation.

Everything is a token

LLMs don't understand language the way we humans do. They take the input and convert it into tokens. Tokens are numbers assigned to different parts of words. The number of tokens may vary across languages. In English, you have roughly one token for every 4 characters, whereas in Spanish you have roughly one token for every 2 characters. The rule of thumb you can use in most cases is that the number of tokens is approximately 1.5 to 2 times the number of words. This is useful to remember because commercial companies charge you by the number of tokens, the maximum input (“context”) of an LLM is measured in tokens, and you restrict the maximum output length by specifying a maximum number of output tokens.
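
If you want a ballpark figure before sending text to a model, the rules of thumb above are easy to turn into code. This is only a rough sketch; real tokenizers will give somewhat different counts, and the function below is made up for this example.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text, using the rules of thumb above.

    Real tokenizers will produce different numbers; this is only good enough
    for ballpark cost and length calculations.
    """
    by_characters = len(text) / 4            # roughly 1 token per 4 characters
    by_words = len(text.split()) * 1.75      # roughly 1.5 to 2 tokens per word
    return round((by_characters + by_words) / 2)


print(estimate_tokens("Generative AI uses Large Language Models to process text into other text."))
```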

System prompts set up expectations

As should be pretty obvious by now, generative AI cannot think. The only way to use it to produce something useful is to give it enough information for its statistical model to start picking the right words that fit our use case. We do that with so-called system prompts.

A system prompt is a phrase, a short paragraph, or even a fairly long piece of text which gives the LLM the context in which it is operating. When coming up with a system prompt, keep in mind that LLMs have no intelligence and no life experience. The system prompt “You are the company's marketing director” gives us humans great context, but it leaves too much wiggle room for an LLM. What is the company's market? What is the marketing director's objective? Are there any constraints on what they can say? A better system prompt would be:

You are the marketing director of a small factory producing PVC plumbing fittings. Your company is selling stock to distributors, big box stores, and smaller plumbing supply shops directly. You are tasked with writing marketing material for the company's products keeping a professional, matter of fact tone. You can only use the information provided to determine the properties, such as size and color, and intended use cases of the company's products. You are not allowed to compare the company's products directly against specific competitors and their products, but you are allowed and encouraged to talk about how the company's products are better than competitors' products in general.

Think of it like giving instructions to a new hire. They have no idea what you are doing or how you are doing it. You need to give them guidance to accomplish what you want, the way you want it.
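
To make this a little more concrete, here is a minimal sketch of how a system prompt and a user request are commonly packaged together, assuming an OpenAI-compatible chat message format. AITiny builds this structure for you from its settings; the prompt below is an abridged version of the example above, and the product in the user message is made up for illustration.

```python
# A system prompt (abridged from the example above) paired with a user
# request. The user message content is a made-up example.
messages = [
    {
        "role": "system",
        "content": (
            "You are the marketing director of a small factory producing "
            "PVC plumbing fittings. You are tasked with writing marketing "
            "material for the company's products keeping a professional, "
            "matter of fact tone."
        ),
    },
    {
        "role": "user",
        "content": "Write a 100-word product description for our 50mm white elbow fitting.",
    },
]
```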

More prompts get things done

The generic system prompt usually sets up the “who”: we are telling the LLM who it is role playing as, so to speak. We still need to tell it what to do and, if it's acting on a piece of content, what that content is. This is where the user prompt and additional system prompts come in.

If we are making a generic query, e.g. to generate content or answer a question, we just need a user prompt. The same rule about being explicit in what we want applies. For example, “What is Joomla?” is a bad prompt. It could result in a short phrase or a thousand-word reply. A better prompt would be “Write two to three paragraphs, each paragraph between 30 and 50 words, about what Joomla is. Use a friendly, excited tone of voice, but keep it factual. Avoid direct comparisons with other CMS”.

When we need to process text, e.g. to create the abstract of a long article, our content will be the user prompt. So, how do we tell the LLM what to do with it? The answer is another system prompt, one which describes the intended action. Again, we need to be explicit in what we want, e.g. “Create an abstract, between 50 and 80 words, of the following text. The text is an article intended for experienced systems administrators with little to no prior knowledge of CMS. Use a professional tone. Lead with the reason they might be interested in it”.
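
Here is a minimal sketch of that pattern, again assuming an OpenAI-compatible chat message format that accepts more than one system message (some services instead expect all instructions merged into a single system prompt). The role prompt and the article variable are placeholders for illustration.

```python
# Placeholder for the real content you want summarised.
article = "...the full text of the long article goes here..."

messages = [
    # Who the LLM is role playing as (placeholder role for this example).
    {"role": "system", "content": "You are a technical copywriter for a CMS-focused publication."},
    # What to do with the content that follows.
    {
        "role": "system",
        "content": (
            "Create an abstract, between 50 and 80 words, of the following text. "
            "The text is an article intended for experienced systems administrators "
            "with little to no prior knowledge of CMS. Use a professional tone. "
            "Lead with the reason they might be interested in it."
        ),
    },
    # The content itself goes in as the user prompt.
    {"role": "user", "content": article},
]
```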

Fine-tuning the responses

Since LLMs are statistical models, there is a degree of randomness in their replies – what we humans interpret as “creativity” and “accuracy”. We can modify the randomness using two model parameters called Temperature and Top_p.

The Temperature is a decimal number that goes from 0 to 1, or 0 to 2 for most models. It controls the predictability of the words used in the generated text. A low setting (0.1 to 0.5) gives high reliability and accuracy. This is best for rewriting text, as you do not want to introduce falsehoods which are not present in the original text (“hallucinations”). A medium setting (0.5 to 1.0) gives a balance between accuracy and a bit more creativity, making it a good choice for generating new content. A high setting (1.0 to 2.0) is very creative, but factual accuracy is thrown out of the window. This is only useful for creative writing and inspiration, understanding that the LLM is now very much lying and making stuff up.

The Top_p parameter affects how many words the model considers before choosing one. It is a decimal number from 0.0 to 1.0. Lower values restrict the model to only the most likely words, giving safer but more repetitive text. Values closer to 1.0 allow less likely words to be considered as well, giving more varied and creative text, but running the risk of generating unintelligible garbage.

The default settings for the text generation use case are a temperature of 0.7 and a top_p of 0.9, giving a good balance between a modicum of creativity and fairly good accuracy. It is recommended that you only change one of these two parameters at a time. While you can change both, be advised that their interaction may exacerbate their effects, leading to either very stiff phrasing or an unintelligible word salad. Do keep in mind that small changes can have a big impact. Do not overdo it.
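
For the curious, here is a rough sketch of where Temperature, Top_p, and the maximum output tokens sit in a request to an OpenAI-compatible chat completions endpoint. The endpoint URL, model name, and API key are placeholders; AITiny exposes these settings in its configuration, so you would not normally write this code yourself.

```python
import requests

messages = [
    {"role": "system", "content": "You are a friendly but factual technical writer."},
    {"role": "user", "content": "Write two to three paragraphs, each between 30 and 50 words, about what Joomla is."},
]

payload = {
    "model": "your-model-name",   # placeholder: whichever model you have configured
    "messages": messages,
    "temperature": 0.7,           # default for text generation: a modicum of creativity
    "top_p": 0.9,                 # default for text generation: trims the unlikeliest words
    "max_tokens": 300,            # caps the length of the reply, measured in tokens
}

response = requests.post(
    "https://api.example.com/v1/chat/completions",    # placeholder endpoint URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder API key
    json=payload,
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```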