In my latest post, we looked at how to get structured, machine-readable outputs from an LLM using JSON Mode, function calling, and structured outputs. In that post, we briefly touched on the idea of function calling, approaching it as a method for obtaining structured responses. However, function calling goes well beyond just getting structured data back from a model, since it’s essentially the backbone of agentic AI workflows. So, in today’s post, we’re going to take a closer look at exactly this topic.
In all the examples we’ve covered so far, the LLM is just used as a passive responder, meaning it receives a question and then generates an answer, and that’s it. But what if we want the LLM not just to respond with something but instead to do something? Or to put it more precisely, what if we want an action to be triggered based on the model’s response? This action may be anything: look up live data, send a message, query a database, call an external API, and so on.
This is made possible with tool calling. Tool calling is what transforms an LLM from a very smart text generator into something that can actually trigger actions and interact with the world around it.
So, let’s take a look!
What’s Tool Calling?
Tool calling (also called function calling) is the mechanism by which an LLM can request the execution of external functions or APIs as part of generating its response. In other words, instead of just returning text, the model can ask for a specific function to be called with specific arguments, in response to the user’s request.
The key thing to understand here is that the model itself doesn’t execute the tool. It only decides which tool to call and with what arguments. The actual execution of the chosen tool happens in our own code, the same code that sends the request to the AI model. We then feed the tool’s result back to the AI model, which uses it to generate a final response to the user.
This is the tool calling loop, which includes the following steps:
- The user submits a message
- The AI model takes the message as input and produces an output, which is essentially a decision on which tool to use and with which arguments
- The model’s response, containing the tool selection and the respective arguments to be used, is passed back to the code. The code, with no involvement of the AI model, executes the chosen tool with the chosen arguments. This execution produces some kind of result (e.g., a calculation, information obtained from an API, etc.), and this result is then passed back to the AI model.
- The AI model takes the result of the tool as input and produces a final response to the user based on that.
Again, the model generates a tool call, not a tool execution. The two are very different things, and conflating them is one of the most common sources of confusion.
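The four steps above can be sketched without any API at all. The snippet below is a minimal illustration, assuming a hypothetical `fake_model` function that stands in for the LLM and a hypothetical `add` tool; neither is part of any real library. It only shows who does what: the model returns a request, and our code performs the execution.

```python
import json

def add(a: float, b: float) -> float:
    """A toy tool; in a real application this could be an API call."""
    return a + b

def fake_model(messages: list) -> dict:
    """Hypothetical stand-in for an LLM, used only to illustrate the loop.

    If no tool result is in the history yet, it asks for the `add` tool;
    otherwise it turns the tool result into a final text answer.
    """
    last = messages[-1]
    if last["role"] == "tool":
        return {"content": f"The result is {last['content']}.", "tool_call": None}
    return {
        "content": None,
        "tool_call": {"name": "add", "arguments": json.dumps({"a": 2, "b": 3})},
    }

# Step 1: the user submits a message
messages = [{"role": "user", "content": "What is 2 + 3?"}]

# Step 2: the model decides which tool to use and with which arguments
decision = fake_model(messages)

# Step 3: our code, not the model, executes the chosen tool
call = decision["tool_call"]
result = add(**json.loads(call["arguments"]))
messages.append({"role": "tool", "content": json.dumps(result)})

# Step 4: the model uses the tool result to write the final answer
final = fake_model(messages)
print(final["content"])
```

Note that `fake_model` never calls `add` itself; it only names the tool and supplies the arguments as a JSON string.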
But what exactly is a tool call? In practice, it means that the model returns a structured, machine-readable response using Function Calling, as we saw in the previous post. In this response, content is None; there is no natural language answer, just a structured instruction indicating which tool to call and with what arguments. It is only after we execute the tool and pass the result back that the model generates an actual text response for the user.
But let’s see this in practice!
We’ll start with a simple example using just one tool and one call, and then gradually build up to some more interesting scenarios.
1. A single tool: weather API
I think that the most common example of tool use with AI that comes to mind is a weather API (the cornerstone of custom, live data), so let’s imagine we’re building a weather assistant. In particular, we want to create a mechanism by which the user asks about the weather, and instead of just letting the AI model make something up (which the model would very happily do 🙃), we want it to call a real weather function and get actual data about the weather from elsewhere, outside the LLM. To get the weather data, I will be using Open-Meteo, a free, open-source weather API that happily requires no API key.
To use a tool, we first have to declare it in tools.
from openai import OpenAI
import json

client = OpenAI(api_key="your_api_key")

# Step 1: define the tool
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The name of the city, e.g. Athens"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use"
                    }
                },
                "required": ["city"]
            }
        }
    }
]
Notice how the actual tool to be used (the weather API) is mentioned nowhere up to this point. Instead, the model decides which tool to call based on three things: the function description (“Get the current weather for a given city”), the parameter descriptions (“The name of the city, e.g. Athens”), and the enforced schema. It’s purely from this information that the model figures out whether this is the right tool to call for a given user message, and with what arguments. Thus, writing clear and accurate descriptions when defining our tools is of key importance for the model to successfully identify and call the right tool based on the user’s input.
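Because the schema is the contract between the model and our code, it can be useful to check the model’s arguments against it before executing anything. The snippet below is a minimal sketch, assuming a hypothetical helper `validate_arguments` that only covers required keys, a few primitive types, and `enum` values; a full JSON Schema validator would cover much more.

```python
# Same shape as the "parameters" object of get_current_weather
WEATHER_PARAMETERS = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

# Only the primitive JSON types needed for this illustration
JSON_TYPES = {"string": str, "number": (int, float), "boolean": bool}

def validate_arguments(arguments: dict, parameters: dict) -> list:
    """Hypothetical helper: return a list of problems (empty means valid)."""
    problems = []
    properties = parameters.get("properties", {})
    # every required key must be present
    for key in parameters.get("required", []):
        if key not in arguments:
            problems.append(f"missing required argument: {key}")
    # every supplied key must be declared, well-typed, and within its enum
    for key, value in arguments.items():
        spec = properties.get(key)
        if spec is None:
            problems.append(f"unexpected argument: {key}")
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"wrong type for {key}")
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid value for {key}: {value!r}")
    return problems

print(validate_arguments({"city": "Athens", "unit": "celsius"}, WEATHER_PARAMETERS))
print(validate_arguments({"unit": "kelvin"}, WEATHER_PARAMETERS))
```

If the list of problems is not empty, one option is to send it back to the model as the tool result, so that it can correct its own call.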
So, after we’ve defined the tools variable, we can then make a request to the AI model:
# Step 2: send the user message along with the tool definition
messages = [
    {"role": "user", "content": "What's the weather like in Athens right now?"}
]
response = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=tools,
    messages=messages
)
print(response.choices[0].message)
Here’s what happens when we make this request. The model reads the user’s message, “What’s the weather like in Athens right now?”, and understands that the available tool get_current_weather can help answer this question with real, live data. So, rather than generating a text response directly, it decides to call the tool first. More specifically, the model’s response at this point looks like this:
ChatCompletionMessage(
    content=None,
    role='assistant',
    tool_calls=[
        ChatCompletionMessageToolCall(
            id='call_abc123',
            type='function',
            function=Function(
                name='get_current_weather',
                arguments='{"city": "Athens", "unit": "celsius"}'
            )
        )
    ]
)
Notice how content is None, because the model isn’t returning a text response, but a tool call. Now it’s our job to actually execute the tool the model selected and return the result back to it. In our case, this means making the API request to the weather API, using the arguments (that is, the city and unit of measurement) provided in the AI model’s response:
# Step 3: execute the tool using the Open-Meteo API
import requests

def get_current_weather(city: str, unit: str = "celsius"):
    # geocode the city name to coordinates
    geo = requests.get(
        "https://geocoding-api.open-meteo.com/v1/search",
        params={"name": city, "count": 1}
    ).json()
    lat = geo["results"][0]["latitude"]
    lon = geo["results"][0]["longitude"]
    # fetch current weather
    weather = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={
            "latitude": lat,
            "longitude": lon,
            "current": "temperature_2m,weather_code",
            "temperature_unit": unit
        }
    ).json()
    temp = weather["current"]["temperature_2m"]
    return {"city": city, "temperature": temp, "unit": unit}

# extract the tool call from the response
tool_call = response.choices[0].message.tool_calls[0]
arguments = json.loads(tool_call.function.arguments)
# call the actual function
weather_result = get_current_weather(**arguments)
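One thing the function above takes for granted is that the geocoding lookup finds a match. The sketch below is an illustration only, assuming that an unmatched city yields a response without a usable `results` list; the helper name `extract_coordinates` is hypothetical. Returning the failure as data lets us pass it back to the model as the tool result, instead of letting the whole script crash.

```python
def extract_coordinates(geo: dict):
    """Hypothetical helper: return (lat, lon), or None if no city matched.

    Assumes `geo` is the parsed JSON of the geocoding response, with the
    same "results" -> "latitude"/"longitude" layout used above.
    """
    results = geo.get("results") or []
    if not results:
        return None
    first = results[0]
    return first["latitude"], first["longitude"]

# Hand-written sample payloads (not real API output)
found = {"results": [{"latitude": 37.98, "longitude": 23.73}]}
not_found = {}

print(extract_coordinates(found))
print(extract_coordinates(not_found))
```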
We can then append the tool’s result to the message history and send everything back to the model:
# Step 4: add the assistant's tool call AND the tool result to the message history
messages.append(response.choices[0].message)  # important: append the tool call first
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,  # links the result back to the specific tool call
    "content": json.dumps(weather_result)
})
# Step 5: send everything back to the model for a final response
final_response = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=tools,
    messages=messages
)
print(final_response.choices[0].message.content)
And now, we finally get a proper text response:
It's currently 29°C in Athens. Sounds like a great day to be outside!
2. Letting the model choose from multiple tools
Now let’s take a look at a more realistic example. In a real-world agentic application, the model typically has access to not one, but multiple tools, and as a result, it needs to figure out which one (or ones) must be used based on what the user is asking.
Let’s extend our initial weather API example by adding an additional tool for currencies. For this, we’ll use Frankfurter, a currency API providing European Central Bank daily rates, again with no API key requirement. So, let’s update our tools variable by adding a second tool for converting currencies:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "The name of the city"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "convert_currency",
            "description": "Convert an amount from one currency to another",
            "parameters": {
                "type": "object",
                "properties": {
                    "amount": {"type": "number", "description": "The amount to convert"},
                    "from_currency": {"type": "string", "description": "The source currency code, e.g. USD"},
                    "to_currency": {"type": "string", "description": "The target currency code, e.g. EUR"}
                },
                "required": ["amount", "from_currency", "to_currency"]
            }
        }
    }
]
And also set up the actual convert_currency function using the Frankfurter API:
def convert_currency(amount: float, from_currency: str, to_currency: str):
    # fetch the exchange rate for the requested currency pair
    response = requests.get(
        f"https://api.frankfurter.dev/v2/rate/{from_currency}/{to_currency}"
    ).json()
    rate = response["rate"]
    converted = round(amount * rate, 2)
    return {
        "amount": amount,
        "from_currency": from_currency,
        "to_currency": to_currency,
        "converted_amount": converted,
        "rate": rate
    }
This way, the model can handle a much wider range of user requests; it can now also answer questions about currencies, on top of the weather 😋. Now, if the user asks “What’s the weather in Athens?”, the model should call get_current_weather. If they ask “How much is 100 USD in EUR?”, it should call convert_currency. And if we ask something unrelated to both weather and currencies, for which neither of the available tools can help, the model will simply respond in text without calling any tool at all.
But let’s see this in action:
messages = [
    {"role": "user", "content": "How much is 200 USD in EUR?"}
]
response = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=tools,
    messages=messages
)
tool_call = response.choices[0].message.tool_calls[0]
Let’s take a look at the response:
print(tool_call.function.name)
from which we get convert_currency. So, the model understood that the question “How much is 200 USD in EUR?” is related to the convert_currency tool. Let’s also take a look at the arguments:
print(tool_call.function.arguments)
from which we get
'{"amount": 200, "from_currency": "USD", "to_currency": "EUR"}'
So, the model correctly identifies convert_currency as the right tool and fills in the appropriate arguments, without us doing anything other than providing appropriate tool descriptions, and the user providing an appropriate message. This exact decision-making mechanism is what makes tool calling the foundation of agentic systems.
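Once more than one tool is available, our code needs a way to route each tool call to the right Python function. One possible approach, sketched below, is a plain dictionary keyed by tool name. The stub functions and the `execute_tool_call` helper are hypothetical and return canned values so that the snippet runs offline; in the real application, the registry would point to the get_current_weather and convert_currency functions defined earlier.

```python
import json

def get_current_weather_stub(city: str, unit: str = "celsius") -> dict:
    # Canned value for illustration; no API request is made
    return {"city": city, "temperature": 29, "unit": unit}

def convert_currency_stub(amount: float, from_currency: str, to_currency: str) -> dict:
    # Canned rate for illustration; not a real exchange rate
    return {"converted_amount": round(amount * 0.5, 2), "rate": 0.5}

# Maps the tool name chosen by the model to the function our code runs
TOOL_REGISTRY = {
    "get_current_weather": get_current_weather_stub,
    "convert_currency": convert_currency_stub,
}

def execute_tool_call(name: str, arguments_json: str) -> dict:
    """Hypothetical helper: look up the tool by name and run it.

    Errors are returned as data so they can be sent back to the model
    as the tool result.
    """
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    try:
        arguments = json.loads(arguments_json)
        return tool(**arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        return {"error": f"bad arguments for {name}: {exc}"}

print(execute_tool_call(
    "convert_currency",
    '{"amount": 200, "from_currency": "USD", "to_currency": "EUR"}',
))
print(execute_tool_call("send_email", "{}"))
```

Keeping the routing in one place also means that adding a third tool only requires a new schema entry in tools and a new key in the registry.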
3. Calling multiple tools at once
Another interesting tool calling scenario is that many models, like gpt-4o, can call multiple tools in a single response when the user’s request requires it. This is known as parallel tool calling.
For example, let’s imagine a scenario where the user asks, in a single request, something that requires the use of both the get_current_weather and convert_currency tools to obtain the required info:
messages = [
    {"role": "user", "content": "What's the weather in Athens and how much is 100 USD in EUR?"}
]
response = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=tools,
    messages=messages
)
for tool_call in response.choices[0].message.tool_calls:
    print(tool_call.function.name)
    print(tool_call.function.arguments)
In this case, the response we get is the following:
get_current_weather
{"city": "Athens"}
convert_currency
{"amount": 100, "from_currency": "USD", "to_currency": "EUR"}
Notice how both tools are called in a single model response. We can then execute the respective tools with the provided arguments and pass the tool results back to the model together. This is much more efficient than sequential calls, since it saves a round trip to the model for each additional tool, and it’s how more advanced agents handle multi-part requests.
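Passing the results back together means appending one tool message per tool call, each carrying the matching `tool_call_id`. The sketch below illustrates this under some assumptions: `SimpleNamespace` objects stand in for the SDK’s tool call objects (mirroring only the `id`, `function.name`, and `function.arguments` attributes used above), the IDs are made up, and `run_tool` is a hypothetical placeholder that returns canned data.

```python
import json
from types import SimpleNamespace

def run_tool(name: str, arguments: dict) -> dict:
    """Hypothetical placeholder: a real version would call the actual tools."""
    return {"tool": name, "received": arguments}

def build_tool_messages(tool_calls) -> list:
    """Return one tool-result message per tool call, in order."""
    tool_messages = []
    for tool_call in tool_calls or []:  # tool_calls may be None
        arguments = json.loads(tool_call.function.arguments)
        result = run_tool(tool_call.function.name, arguments)
        tool_messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,  # links each result to its call
            "content": json.dumps(result),
        })
    return tool_messages

# Stand-ins for the two tool calls shown above (IDs are made up)
fake_tool_calls = [
    SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(
            name="get_current_weather",
            arguments='{"city": "Athens"}',
        ),
    ),
    SimpleNamespace(
        id="call_2",
        function=SimpleNamespace(
            name="convert_currency",
            arguments='{"amount": 100, "from_currency": "USD", "to_currency": "EUR"}',
        ),
    ),
]

for message in build_tool_messages(fake_tool_calls):
    print(message["tool_call_id"], message["content"])
```

In the real flow, the assistant message containing the tool calls is appended to messages first, followed by these tool messages, before everything is sent back to the model.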
On my mind: So, what makes this agentic?
One thing that has always gotten on my nerves is the term “agentic” being slapped on everything. Agents, agentic workflows, anything originating from the word agent is very sexy these days, but as you may have already discovered yourself, not everything sold as agentic really is.
So let’s take a step back and think about what an agent actually is in the first place. At its core, an agent is something that perceives its environment, processes that information in some way, has a goal, and then decides what action to take in order to achieve it. Think about what our tool calling mechanism is doing: it perceives the tools available, decides which one is appropriate to address the user’s request (if any), and passes that decision on to the rest of the code for execution. That, in its simplest form, is agency.
In real-world agentic applications, the tool calling loop runs not once but multiple times, with the model using the results of one tool call to decide whether to call another tool, and which one. This is often called a ReAct loop (Reason + Act), and it’s what allows agents to handle complex, multi-step tasks that can’t be solved in a single call.
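A multi-step loop of this kind can be sketched as a loop that keeps going until the model stops asking for tools. The snippet below is a toy illustration, not a production agent: `scripted_model` is a hypothetical stand-in that follows a fixed script instead of calling an LLM, the two tools return canned values, and `max_steps` is an assumed safety cap against endless loops.

```python
import json

def lookup_price(item: str) -> dict:
    return {"item": item, "price_usd": 100}  # canned value

def convert(amount: float, rate: float) -> dict:
    return {"converted": round(amount * rate, 2)}

TOOLS = {"lookup_price": lookup_price, "convert": convert}

def scripted_model(messages: list) -> dict:
    """Hypothetical stand-in for an LLM: picks the next step from the history."""
    tool_results = [json.loads(m["content"]) for m in messages if m["role"] == "tool"]
    if len(tool_results) == 0:
        return {"tool": "lookup_price", "arguments": {"item": "book"}}
    if len(tool_results) == 1:
        # Uses the first tool's result to build the second call
        price = tool_results[0]["price_usd"]
        return {"tool": "convert", "arguments": {"amount": price, "rate": 0.5}}
    return {"tool": None, "content": f"The book costs {tool_results[1]['converted']} EUR."}

def run_agent(user_message: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        decision = scripted_model(messages)
        if decision["tool"] is None:  # no tool requested: final answer
            return decision["content"]
        # our code executes the requested tool and records the result
        result = TOOLS[decision["tool"]](**decision["arguments"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "Stopped: step limit reached."

print(run_agent("How much does the book cost in EUR?"))
```

The second tool call depends on the output of the first, which is exactly what a single parallel call cannot express.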
Ultimately, what I find most fascinating about tool calling is how it changes the nature of what an LLM is. Up to this point, a language model was essentially a very sophisticated input-output function, which takes text as input and generates text as output. But with tool calling, we gain access to an endless collection of additional functionalities, which we can combine with the reasoning power of the LLM to create systems that are far more capable than either alone.
✨ Thanks for reading! ✨
All images by the author, unless mentioned otherwise.
