= Introduction =
At the Interaction Station you have the ability to run various machine learning models on the station's server. Our server hosts a service named LocalAI, which can be used as a drop-in replacement compatible with OpenAI's API specification. It allows you to use large language models (LLMs), transcribe audio, generate images and generate audio. The only thing you need to know is how to ask the server which model to run so that the server can give you a response. Using the Python programming language, this tutorial will walk you through the process. To follow along, it is advised that you have some understanding of Python.
As mentioned before, LocalAI is compatible with OpenAI's API specification. That means you can also read [https://platform.openai.com/docs/api-reference OpenAI's API Reference] for more information. This guide borrows heavily from their documentation.
= Installation =
OpenAI has created an easy-to-use Python library to interact with their service. Luckily, with some minor modifications, we are able to use this library with the station's service. To install their Python library, run the following command in your terminal.
<syntaxhighlight lang="bash">
pip install openai
</syntaxhighlight>
= Authentication =
== API Keys ==
The server uses API keys for authentication, so that not everyone outside of WdKA can also use the service. You can think of an API key as the key you use to open your front door or the password for your email account. For security reasons, this API key changes automatically every week on Sunday. You can check what the current API key is by [https://api.ml.interactionstation.wdka.hro.nl/api-key/ clicking on this link].

If you have a project that requires a static API key for a longer period of time, you can write me an email at [mailto:b.smeenk@hr.nl b.smeenk@hr.nl] or ask me (Boris) at the station.
At the time of writing this article, the <code>api_key</code> is <code>8195436a-9add-4281-ba7d-8595d266aab4</code>. If you see this key in any of the code examples, swap it out with the current <code>api_key</code> you got from the URL above.
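If you do not want to paste the key by hand every week, you could also fetch it from the link above inside your script. The snippet below is only a sketch and not part of the official setup: it assumes you have the <code>requests</code> library installed (<code>pip install requests</code>) and that the endpoint returns the key as plain text, which you should verify yourself.

<syntaxhighlight lang="python">
import requests

# Hypothetical helper (not part of the official setup): fetch the current API key
# from the station's endpoint. This assumes the endpoint returns the key as plain
# text; check the actual response format before relying on it.
def fetch_api_key():
    response = requests.get("https://api.ml.interactionstation.wdka.hro.nl/api-key/", timeout=10)
    response.raise_for_status()
    return response.text.strip()

print(fetch_api_key())
</syntaxhighlight>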
== Connection ==
In a new Python script, create a variable named <code>client</code> and set it to an instance of the <code>OpenAI</code> class to establish a connection with the service. The current API key and the service's URL should be entered as the values for the parameters <code>api_key</code> and <code>base_url</code>.
<syntaxhighlight lang="python">
from openai import OpenAI

client = OpenAI(
    api_key="8195436a-9add-4281-ba7d-8595d266aab4",
    base_url="https://ml-api.interactionstation.wdka.hro.nl"
)
</syntaxhighlight>
The created instance of <code>OpenAI</code>, which we stored in a variable named <code>client</code>, is now connected to the station's LocalAI service using the passed <code>api_key</code>. We can now use the <code>client</code> variable to make all sorts of requests to the server. Let's start by writing some code to chat with an LLM.
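If you first want to verify that the connection works, you can ask the server which models it currently exposes. This check is optional and assumes the LocalAI service implements the OpenAI-compatible models endpoint; the exact model names it returns may differ.

<syntaxhighlight lang="python">
# Optional: list the models the server exposes to verify the connection works
for model in client.models.list():
    print(model.id)
</syntaxhighlight>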
= Chat =
== Completion ==
At the moment of writing this tutorial, the LocalAI instance on the server is running [https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct Meta's Llama 3 8B Instruct] LLM to perform various chat instructions. We will always name the currently loaded model <code>gpt-3.5-turbo</code>. This may seem a bit strange, as <code>gpt-3.5-turbo</code> is a model made by OpenAI, but in actuality another model is loaded. We chose to do this in order to be compatible with various libraries and plugins that rely on this naming scheme. Just remember that whenever you see <code>gpt-3.5-turbo</code>, we're actually using another model.
The following code is used to start a chat with the model. Note that we use the <code>client</code> variable we created earlier to call a function named <code>create</code>. We save the result in a variable named <code>response</code>. If you would like to see the result, you can <code>print()</code> it to the console.
<syntaxhighlight lang="python">
# Ask the model a question and save the result in a variable
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Are you feeling cold?"}
    ]
)

# Show the result in the console
print(response)
</syntaxhighlight>
Chat models take a list of messages as input and return a generated response as output. To keep the context of a conversation, always send the entire list of messages to the chat model. The printed response in the code above will look something like this:
<syntaxhighlight lang="python">
ChatCompletion(
    id="247e11d7-e594-4333-8b12-f3103b29dc9e",
    choices=[
        Choice(
            finish_reason="stop",
            index=0,
            logprobs=None,
            message=ChatCompletionMessage(
                content="I'm just an AI, I don't have a physical body, so I don't feel cold or any other temperature. But I'm happy to chat with you about the weather or anything else you'd like to talk about!",
                refusal=None,
                role="assistant",
                function_call=None,
                tool_calls=None,
            ),
        )
    ],
    created=1724536813,
    model="gpt-3.5-turbo",
    object="chat.completion",
    service_tier=None,
    system_fingerprint=None,
    usage=CompletionUsage(completion_tokens=0, prompt_tokens=0, total_tokens=0),
)
</syntaxhighlight>
As you can see, there is a lot of information the chat model returns to us. Most of the time, however, we are only interested in the textual response of the model. In Python, you can grab specific parts of a class (in this case the <code>ChatCompletion</code> class) by using something called "dot notation". So in order to grab the value of <code>content</code>, we can <code>print(response.choices[0].message.content)</code>. Note how we use <code>[0]</code> at <code>choices</code> because we want to grab the first item in the <code>choices</code> list. Swapping this print statement with your older print statement will result in the output below.
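For reference, the updated print statement in your script looks like this:

<syntaxhighlight lang="python">
# Print only the textual content of the first choice
print(response.choices[0].message.content)
</syntaxhighlight>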
"I'm just an AI, I don't have a physical body, so I don't feel cold or any other temperature. But I'm happy to chat with you about the weather or anything else you'd like to talk about!"
Great! Right now, all we have is the chat model's textual response. Looking at our response, however, we can see that the content generated is very uninteresting. How can we instruct the chat model to follow our instructions? This is where system prompts come into play.
== System prompts ==
To influence how the chat model responds, you can introduce a system prompt. Taking the code example from above, we add a new object to the <code>messages</code> list with the role <code>system</code>. You can add system prompts at any time during the conversation. Keep in mind that the order of the objects in the list matters and influences the flow of the conversation.
<syntaxhighlight lang="python">
# Ask the model a question and save the result in a variable
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        # Add a system prompt object
        {"role": "system", "content": "You are not an AI. You behave as a human, have a physical body and can feel emotions. At all times you behave like this."},
        {"role": "user", "content": "Are you feeling cold?"}
    ]
)

# Show the result in the console
print(response)
</syntaxhighlight>
The response will now look something like this. See how different it is compared to the previous example? System prompts are very useful to give the model context or make it [https://platform.openai.com/docs/guides/prompt-engineering/tactic-ask-the-model-to-adopt-a-persona adopt a persona].
"Brrr! Yeah, I'm feeling a bit chilly. It's a crisp autumn day outside, and I'm not exactly dressed for the weather. I could use a warm cup of coffee or tea to thaw out a bit."
== Conversation history ==
Including the history of the conversation is important when you refer back to previous messages. In the example below, the final question of <code>"Is that why you are wearing winter clothes?"</code> only makes sense in the context of the prior messages. Because chat models don't have a memory of previous messages, the entire conversation must be sent with each request.
<syntaxhighlight lang="python">
# Ask the model a question and save the result in a variable
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are not an AI. You behave as a human, have a physical body and can feel emotions. At all times you behave like this."},
        {"role": "user", "content": "Are you feeling cold?"},
        {"role": "assistant", "content": "Brrr! Yeah, I'm feeling a bit chilly. It's a crisp autumn day outside, and I'm not exactly dressed for the weather. I could use a warm cup of coffee or tea to thaw out a bit."},
        # This question does not make sense without the entire conversation history above
        {"role": "user", "content": "Is that why you are wearing winter clothes?"}
    ]
)

# Show the result in the console
print(response)
</syntaxhighlight>
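If you want to build an ongoing conversation, you can keep the <code>messages</code> list in a variable and append every new message to it before making the next request. The loop below is only a minimal sketch of that idea and not part of the examples above; it reuses the <code>client</code> created earlier, and the prompt texts are just placeholders.

<syntaxhighlight lang="python">
# A minimal chat loop: the messages list is the conversation's memory
messages = [
    {"role": "system", "content": "You are not an AI. You behave as a human, have a physical body and can feel emotions. At all times you behave like this."}
]

while True:
    # Read a question from the terminal and add it to the history
    user_input = input("You: ")
    messages.append({"role": "user", "content": user_input})

    # Send the entire history so the model keeps the context
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages
    )

    # Show the answer and store it so the next request includes it
    answer = response.choices[0].message.content
    print("Model:", answer)
    messages.append({"role": "assistant", "content": answer})
</syntaxhighlight>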
= Next steps =

* [https://interactionstation.wdka.hro.nl/wiki/Generating_Images Learn how to generate images with Python]
* [https://interactionstation.wdka.hro.nl/wiki/Text_To_Speech_(TTS) Learn how to generate audio with Python]
* [https://interactionstation.wdka.hro.nl/wiki/Speech_To_Text Learn how to transcribe audio with Python]