OpenAI just released the ChatGPT API (the gpt-3.5-turbo model). This tutorial goes into the various ways you can use the ChatGPT API to turbo-charge your application. The new API is a game changer for developers and businesses that are actively using the ChatGPT web app or serving customers with the older GPT-3 APIs like Davinci.
And the best part — it is 1/10th the cost of their GPT-3 Davinci model!
Let’s dive into how to use the ChatGPT API for different use cases.
import openai

openai.api_key = "blah"  # your OpenAI API key

# Send a single user message to the gpt-3.5-turbo chat endpoint
completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(completion['choices'][0]["message"]["content"])
This is the response:
As an AI language model, I don’t have feelings so I cannot experience emotions like humans. However, I am functioning properly and ready to assist you with any questions or tasks you may have. How may I assist you today?
Pretty cool, right? You can imagine integrating this into a website where customers ask questions. Snapchat, for example, just released its AI chatbot powered by ChatGPT.
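Because the endpoint takes a list of messages with roles, you can keep a multi-turn conversation going by appending each exchange to the history before the next call. Here is a minimal sketch; the system prompt and messages are just placeholders:

import openai

openai.api_key = "blah"

# Conversation history: append each new user and assistant turn before the next call
messages = [
    {"role": "system", "content": "You are a helpful customer support assistant."},
    {"role": "user", "content": "Hi, I have a question about my order."},
]
completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
reply = completion['choices'][0]["message"]["content"]

messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "Can I change the shipping address?"})
# ...then call openai.ChatCompletion.create again with the extended history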
We know that ChatGPT is not just a chatbot — but a powerful tool for many language tasks. You can prompt the ChatGPT API to answer document-specific questions as below:
prompt = """Answer the question as truthfully as possible using the provided text, and if the answer is not contained within the text below, say "I don't know"Context:
Apple quietly included a feature called Clean Energy Charging in iOS 16.1 and turned it on by default. Here's what you need to know about the environmentally conscious feature.
Apple was vocal about Clean Energy Charging when it announced iOS 16 in September 2022. It didn't launch to the public until October 24 with iOS 16.1, and it seemed the change went by without many users noticing - until a recent social media storm.
Sunday, February 26, saw a slew of posts getting attention on Twitter about Clean Energy Charging. Users were vocal and angry, stating that they didn't want Apple to decide how they used energy for them.
The feature is opt-out, so if you don't want to participate in Clean Energy Charging, there is a simple toggle in settings to turn it off.
Q: What is clean energy charging?
A:"""
completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}]
)
print(completion['choices'][0]["message"]["content"])
Here is the response:
Clean Energy Charging is a feature included in iOS 16.1 by Apple that is environmentally conscious. It is turned on by default but can be easily turned off with a toggle in settings.
The ChatGPT API, however, has a context limit of 4,096 tokens (shared between the prompt and the completion), which is not ideal for larger documents. But there are ways around this, like chunking large documents into smaller pieces and selecting the chunks most likely to contain the answer to a given question.
Let’s say you wanted to answer questions about content in a PDF document. First, you need to extract the text from the PDF. In the example below, I’m extracting text from the original BERT paper using the PyMuPDF library:
import fitz  # PyMuPDF

doc = fitz.open('bert.pdf')
text = ""
for page in doc:
    text += page.get_text()
Next, you tokenize the text with OpenAI’s tiktoken tokenizer and compute embeddings for the chunks, as in this example given by OpenAI. The entire paper is more than 17k tokens and can’t be processed by the ChatGPT API in one call, so we break it into chunks of at most 500 tokens each, as sketched below.
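Here is a minimal chunking sketch, assuming the extracted text from the PDF above; the naive split on sentence boundaries is an illustrative simplification of what OpenAI’s example does:

import tiktoken

# gpt-3.5-turbo and text-embedding-ada-002 both use the cl100k_base encoding
tokenizer = tiktoken.get_encoding("cl100k_base")

max_tokens = 500
chunks, current, current_len = [], [], 0
for sentence in text.split(". "):
    n_tokens = len(tokenizer.encode(" " + sentence))
    if current and current_len + n_tokens > max_tokens:
        chunks.append(". ".join(current) + ".")
        current, current_len = [], 0
    current.append(sentence)
    current_len += n_tokens
if current:
    chunks.append(". ".join(current) + ".")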
Now we need to compare the question embedding to the embeddings of the various context chunks in the dataframe to find where the answer is likely located: embed both the question and the chunks, then use cosine (or another) similarity to pick the top-k sub-contexts.
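A minimal sketch of building that dataframe, assuming the chunks and tokenizer from the step above; the column names (text, n_tokens, embeddings) are chosen to match what the functions below expect:

import openai
import pandas as pd

df = pd.DataFrame({"text": chunks})
df["n_tokens"] = df["text"].apply(lambda t: len(tokenizer.encode(t)))
# Embed each chunk with the same model used to embed the question later
df["embeddings"] = df["text"].apply(
    lambda t: openai.Embedding.create(input=t, engine="text-embedding-ada-002")["data"][0]["embedding"]
)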
The OpenAI example does this using the GPT-3 Davinci API. Here’s the modified code using the new ChatGPT (gpt-3.5-turbo) model endpoint:
from openai.embeddings_utils import distances_from_embeddings

def create_context(question, df, max_len=1800, size="ada"):
    """
    Create a context for a question by finding the most similar context from the dataframe
    """
    # Get the embeddings for the question
    q_embeddings = openai.Embedding.create(input=question, engine='text-embedding-ada-002')['data'][0]['embedding']

    # Get the distances from the embeddings
    df['distances'] = distances_from_embeddings(q_embeddings, df['embeddings'].values, distance_metric='cosine')

    returns = []
    cur_len = 0

    # Sort by distance and add the text to the context until the context is too long
    for i, row in df.sort_values('distances', ascending=True).iterrows():
        # Add the length of the text to the current length
        cur_len += row['n_tokens'] + 4

        # If the context is too long, break
        if cur_len > max_len:
            break

        # Else add it to the text that is being returned
        returns.append(row["text"])

    # Return the context
    return "\n\n###\n\n".join(returns)
def answer_question(
    df,
    question="Am I a monkey?",
    max_len=1800,
    size="ada",
    debug=False,
    max_tokens=150,
    stop_sequence=None
):
    """
    Answer a question based on the most similar context from the dataframe texts
    """
    context = create_context(
        question,
        df,
        max_len=max_len,
        size=size,
    )

    prompt = f"""Answer the question as truthfully as possible using the provided text, and if the answer is not contained within the text below, say "I don't know"

Context: {context}

Q: {question}
A:"""

    # If debug, print the raw context passed to the model
    if debug:
        print("Context:\n" + context)
        print("\n\n")

    try:
        # Create a chat completion using the question and context
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}]
        )
        return response['choices'][0]["message"]["content"]
    except Exception as e:
        print(e)
        return ""
When I ask a question based on the context as below:
answer_question(df, question="What is BERT?", debug=False)
It gives me a good answer:
“BERT stands for Bidirectional Encoder Representations from Transformers, and it is a new language representation model designed to pretrain deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. The pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT achieves new state-of-the-art results on eleven natural language processing tasks.”
If the above logic seems too complex to implement and you just want an API that outputs the answer from ChatGPT given a custom context and question, you can check out this API:
Again, you can use PyMuPDF to first extract the PDF text and then send a POST request as below to obtain the answer.
Apart from questions, I can also ask it to do document tasks like “Summarize in 3 sentences” as below.
import requests
import fitz  # PyMuPDF

doc = fitz.open('bert.pdf')
text = ""
for page in doc:
    text += page.get_text()

url = "https://chatgpt-powered-question-answering-over-documents.p.rapidapi.com/qa877"

payload = {
    "text": text,
    "query": "Summarize in 3 sentences"
}
headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "BLAH",
    "X-RapidAPI-Host": "chatgpt-powered-question-answering-over-documents.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
"answer": "BERT is a deep bidirectional transformer pre-trained for language understanding. It is trained on a 3.3 billion word corpus with a batch size of 256 sequences for 1,000,000 steps. The pre-training tasks include Masked LM and Next Sentence Prediction, which are illustrated with examples.",
"id": "1875011f-1a4e-40af-9b5c-77e6ba1729fd",
"message": "Successful",
"status": true
Which gives the following response:
BERT is a deep bidirectional transformer pre-trained for language understanding. It is trained on a 3.3 billion word corpus with a batch size of 256 sequences for 1,000,000 steps. The pre-training tasks include Masked LM and Next Sentence Prediction, which are illustrated with examples.
Notice that the response above also contains an id field. This is a unique document id. You can keep asking questions about the same document as long as you provide that unique document id, as below:
import requests

url = "https://chatgpt-powered-question-answering-over-documents.p.rapidapi.com/qa877"

payload = {
    "id": "1875011f-1a4e-40af-9b5c-77e6ba1729fd",
    "query": "What is BERT trained on?"
}
headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "BLAH",
    "X-RapidAPI-Host": "chatgpt-powered-question-answering-over-documents.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
"answer": "BERT is trained on unlabeled data over different pre-training tasks.",
"id": "1875011f-1a4e-40af-9b5c-77e6ba1729fd",
"message": "Successful",
"status": true
This is the answer you get for the question “What is BERT trained on?”
BERT is trained on unlabeled data over different pre-training tasks.
Hopefully this article has made you excited about the many possibilities for using the ChatGPT API within your organization to solve complex problems that were previously unsolvable by machines. This could have a massive impact, reducing the time and resources spent combing through large volumes of documents.
However, there are multiple challenges along the way in getting products like this to scale well. First, searching through large documents takes a few seconds or more and is not ideal from a latency perspective. The solution to this might be caching document contexts in a vector database and efficiently computing similarity metrics between question and context embeddings to quickly find contexts where the answer is embedded.
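As an illustration of that caching idea, here is a minimal sketch using FAISS as the vector index over the chunk embeddings built earlier; the library choice and the top-k of 3 are assumptions, not part of the original pipeline:

import faiss
import numpy as np
import openai

# Index the precomputed chunk embeddings once (text-embedding-ada-002 vectors are 1536-dimensional)
chunk_vectors = np.array(df["embeddings"].tolist(), dtype="float32")
faiss.normalize_L2(chunk_vectors)  # normalize so inner product equals cosine similarity
index = faiss.IndexFlatIP(chunk_vectors.shape[1])
index.add(chunk_vectors)

# At query time, embed the question and retrieve the top-k most similar chunks
q_embedding = openai.Embedding.create(input="What is BERT?", engine="text-embedding-ada-002")["data"][0]["embedding"]
q_vec = np.array([q_embedding], dtype="float32")
faiss.normalize_L2(q_vec)
scores, ids = index.search(q_vec, 3)
top_chunks = [df.iloc[i]["text"] for i in ids[0]]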
Another challenge is LLM hallucination. The director of AI at Meta has warned that autoregressive LLMs make up text and lack reliability. However, I think the best way to evaluate these generative LLMs is to test them on your specific application. If they perform well in most (but not all) cases, that is a great starting point; if not, maybe LLMs are not the right solution for you.
The code on GitHub is provided below: