Example: Setting up Tool Calling#
Tool calling enables LLMs to interact with external functions, APIs, and databases, so your applications can take actions and fetch live data rather than only generate text.
Why Tool Calling Matters#
- Enhanced Capabilities: Models can perform actions, not just generate text
- Real-time Data: Access current information from APIs and databases
- Workflow Automation: Integrate AI into existing business processes
- Interactive Applications: Build chatbots that can actually do things
Example: Weather Assistant with Tool Calling#
Let’s deploy a model that can look up weather information through tool calls:
# serve_my_qwen3.py
import os  # needed if you pass HF_TOKEN via runtime_env below

from ray.serve.llm import LLMConfig, build_openai_app

llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="my-qwen3",
        model_source="Qwen/Qwen3-32B",
    ),
    accelerator_type="L40S",
    deployment_config=dict(
        autoscaling_config=dict(
            min_replicas=1,
            max_replicas=2,
        )
    ),
    ### Uncomment if your model is gated and needs your Hugging Face token to access it.
    # runtime_env=dict(env_vars={"HF_TOKEN": os.environ.get("HF_TOKEN")}),
    engine_kwargs=dict(
        tensor_parallel_size=4,
        max_model_len=32768,
        # vLLM flags: parse Qwen3's reasoning output and its Hermes-style tool calls.
        reasoning_parser="qwen3",
        enable_auto_tool_choice=True,
        tool_call_parser="hermes",
    ),
)
app = build_openai_app({"llm_configs": [llm_config]})
Deploy the app:
!serve run serve_my_qwen3:app --non-blocking
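Optionally, verify the endpoint is up before sending requests. A minimal check, assuming the server listens on the default http://localhost:8000 that the client below also uses:

# check_deployment.py
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="FAKE_KEY")

# The `model_id` from the config, "my-qwen3", should appear in this list.
for model in client.models.list():
    print(model.id)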
Using Tool Calling#
Now let’s test the tool-calling model. The model decides when to call tools; the client executes them and feeds the results back:
# tool_call_client.py
import json
import random

from openai import OpenAI

# Dummy APIs standing in for real weather services
def get_current_temperature(location: str, unit: str = "celsius"):
    temperature = random.randint(15, 30) if unit == "celsius" else random.randint(59, 86)
    return {
        "temperature": temperature,
        "location": location,
        "unit": unit
    }

def get_temperature_date(location: str, date: str, unit: str = "celsius"):
    temperature = random.randint(15, 30) if unit == "celsius" else random.randint(59, 86)
    return {
        "temperature": temperature,
        "location": location,
        "date": date,
        "unit": unit
    }
# Tools schema definitions
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_temperature",
            "description": "Get current temperature at a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The location to get the temperature for, in the format \"City, State, Country\"."
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit to return the temperature in. Defaults to \"celsius\"."
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_temperature_date",
            "description": "Get temperature at a location and date.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The location to get the temperature for, in the format \"City, State, Country\"."
                    },
                    "date": {
                        "type": "string",
                        "description": "The date to get the temperature for, in the format \"Year-Month-Day\"."
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit to return the temperature in. Defaults to \"celsius\"."
                    }
                },
                "required": ["location", "date"]
            }
        }
    }
]
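# Note: each schema above follows the OpenAI function-calling format; the
# "parameters" block is a JSON Schema object, and its property names must
# match the keyword arguments of the Python functions above so the model's
# arguments can be unpacked directly into them with `**`.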
######################### Sending request for tool calls #########################
client = OpenAI(base_url="http://localhost:8000/v1", api_key="FAKE_KEY")

messages = [
    {
        "role": "system",
        "content": "You are a weather assistant. Use the given functions to get weather data and provide the results."
    },
    {
        "role": "user",
        "content": "What's the temperature in San Francisco now? How about tomorrow? Current Date: 2025-07-29."
    }
]

response = client.chat.completions.create(
    model="my-qwen3",
    messages=messages,
    tools=tools,
    tool_choice="auto"  # let the model decide whether to use tools
)
######################### Process tool calls #########################
# This example assumes the model requested tool calls; if it answered
# directly, `message.tool_calls` would be None.
for tc in response.choices[0].message.tool_calls:
    print(f"Tool call id: {tc.id}")
    print(f"Tool call function name: {tc.function.name}")
    print(f"Tool call arguments: {tc.function.arguments}")
    print("\n")
# Helper tool map (str -> Python callable for your APIs)
helper_tool_map = {
    "get_current_temperature": get_current_temperature,
    "get_temperature_date": get_temperature_date
}
######################### Add your model's tool calls to the chat history #########################
# `response` is the model's last response, containing the tool calls it requested.
messages.append(response.choices[0].message.model_dump())
######################### Add `tool` messages to your chat history #########################
# Loop through the tool calls, execute each one, and record its output
for tool_call in response.choices[0].message.tool_calls:
    call_id, fn_call = tool_call.id, tool_call.function
    fn_callable = helper_tool_map[fn_call.name]
    fn_args = json.loads(fn_call.arguments)
    output = json.dumps(fn_callable(**fn_args))
    # Create a new message of role `"tool"` containing the output of your tool
    messages.append({
        "role": "tool",
        "content": output,
        "tool_call_id": call_id
    })
######################### Sending final request #########################
response = client.chat.completions.create(
    model="my-qwen3",
    messages=messages
)
print(response.choices[0].message.content)
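The request, execute, append, re-request cycle above generalizes to conversations where the model chains several rounds of tool calls. Below is a compact sketch of that loop, reusing the `client`, `tools`, and `helper_tool_map` defined earlier; the `run_with_tools` helper and its `max_rounds` cap are illustrative additions, not part of the example above:

def run_with_tools(messages, max_rounds=5):
    """Keep executing requested tools until the model answers in plain text."""
    for _ in range(max_rounds):
        response = client.chat.completions.create(
            model="my-qwen3",
            messages=messages,
            tools=tools,
            tool_choice="auto",
        )
        message = response.choices[0].message
        messages.append(message.model_dump())
        if not message.tool_calls:
            # No further tool requests: this is the final answer.
            return message.content
        for tool_call in message.tool_calls:
            fn_args = json.loads(tool_call.function.arguments)
            output = helper_tool_map[tool_call.function.name](**fn_args)
            messages.append({
                "role": "tool",
                "content": json.dumps(output),
                "tool_call_id": tool_call.id
            })
    return None  # Gave up after max_rounds of tool calls.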
Shut down the service when you're done:
!serve shutdown -y
Key Benefits#
- Intelligent Tool Selection: The model decides when and which tools to use
- Structured Parameters: Tools receive properly formatted arguments
- Seamless Integration: Natural conversation flow with tool execution
- Extensible: Easy to add new tools and capabilities
Learn More#
For comprehensive tool-calling guides, see:
- LLM deployment with tool and function calling on Anyscale - Complete tool calling setup