Function calling

TL;DR

Function calling lets you declare tool schemas alongside your LLM request; the model decides when to invoke them and returns structured JSON you execute on your side.
It replaces fragile "parse JSON from text" patterns with schema-enforced, protocol-level structured output.
Parallel tool calls (GPT-4o, Claude 3.5+) let the model invoke multiple independent tools in a single round-trip. For multi-tool queries this removes sequential round-trips: two calls that would otherwise run back to back take roughly half the model latency.
Tool descriptions are load-bearing: the model reads them to decide whether a tool applies. Bad descriptions cause wrong tool selection more than any other factor.
OpenAI's structured output feature uses constrained decoding: at each sampling step, tokens that would violate the schema are masked out, so a completed response conforms to the schema without parse-and-retry loops.
Function calling is the bridge between "chatbot" and "agent." Nearly every production AI system that takes actions relies on it or on an equivalent mechanism.

Ask an LLM to return a JSON object with a temperature field. Eight times out of ten, you get valid JSON. The other two times, the model wraps it in a markdown code fence, prepends "Sure! Here is the JSON:", or invents extra fields your parser does not expect. Your downstream code breaks.

You can throw regex at it. Teams do. They write string-stripping logic, retry on parse failure, and add increasingly desperate system prompt instructions like "RESPOND WITH ONLY JSON. NO EXPLANATION." It works until it does not.
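To make that pattern concrete, here is a minimal sketch of the strip-and-retry approach. The names `extract_json`, `parse_with_retry`, and `call_model` are hypothetical, chosen for illustration; `call_model` stands for any zero-argument callable that returns the model's raw text.

```python
import json
import re

# Three backticks, built at runtime so this example stays easy to embed.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def extract_json(raw: str) -> dict:
    """Best-effort extraction of a JSON object from free-form model text."""
    match = FENCE_RE.search(raw)
    candidate = match.group(1) if match else raw
    # Drop any chatter before the first "{" and after the last "}".
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found")
    return json.loads(candidate[start:end + 1])


def parse_with_retry(call_model, max_attempts: int = 3) -> dict:
    """Call the model again each time its output fails to parse."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return extract_json(call_model())
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            last_error = exc
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Even when this succeeds, it only proves the text parsed as JSON. It says nothing about whether the object has the fields your code expects, and every retry costs another full model call.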

The deeper problem is that text generation is probabilistic. Every token is a dice roll. Without structural enforcement, there is no guarantee the model's output will conform to any schema. You cannot build reliable systems on "usually works."

Feature	What it does	When it applies	Schema enforcement
JSON mode	Forces valid JSON output	Every response	None (any shape)
Function calling	Returns tool invocations matching declared schemas	When the model decides to call a tool	Per-tool JSON Schema
Structured output	Every response matches a provided JSON Schema	Every response	Constrained decoding (schema-violating tokens are masked during sampling)
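The function calling row can be sketched without any network calls. The example below assumes the OpenAI Chat Completions shape for tool declarations and tool calls (other providers use different field names), and the `get_weather` tool, its fields, and the `check_arguments` helper are hypothetical. The checker covers only a small subset of JSON Schema and is not a full validator.

```python
import json

# Tool declaration in the OpenAI-style "tools" format (hypothetical tool).
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current temperature for a city. "
                       "Use only for questions about current weather.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
            "additionalProperties": False,
        },
    },
}


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Stand-in implementation; a real one would query a weather service."""
    return {"city": city, "temperature": 21.5, "unit": unit}


# Registry: tool name -> (parameter schema, local handler).
TOOLS = {"get_weather": (GET_WEATHER_TOOL["function"]["parameters"], get_weather)}


def check_arguments(schema: dict, args: dict) -> None:
    """Check required keys, unknown keys, and enum values only."""
    props = schema.get("properties", {})
    missing = [k for k in schema.get("required", []) if k not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    if schema.get("additionalProperties") is False:
        unknown = [k for k in args if k not in props]
        if unknown:
            raise ValueError(f"unexpected arguments: {unknown}")
    for key, value in args.items():
        allowed = props.get(key, {}).get("enum")
        if allowed is not None and value not in allowed:
            raise ValueError(f"{key!r} must be one of {allowed}")


def execute_tool_call(tool_call: dict) -> str:
    """Run one tool call and return the JSON string to send back to the model."""
    name = tool_call["function"]["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    schema, handler = TOOLS[name]
    # Arguments arrive as a JSON-encoded string, not a parsed object.
    args = json.loads(tool_call["function"]["arguments"])
    check_arguments(schema, args)
    return json.dumps(handler(**args))
```

A call such as `execute_tool_call({"id": "call_1", "type": "function", "function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'}})` returns the handler's result as a JSON string. Checking arguments on your side is still worthwhile: unless the provider enforces the schema with constrained decoding, a tool call can name an unknown tool or carry arguments that do not match it.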

