Grok 4.3 is now live on the xAI API – It is best for Agentic tool calling

xAI officially announced Grok 4.3 on the API on May 5, 2026, and notified developers on May 6 that eight earlier models will be retired from the API on May 15, 2026.

The model itself first launched on April 17 as a consumer beta on grok.com for SuperGrok Heavy subscribers at $300/month. API access opened quietly on April 30 before the public announcement two weeks later. Teams running on the older endpoints have until May 15 to migrate.

This article covers what Grok 4.3 is, what it can do, what it leads on, how it’s priced, how it compares to Claude, GPT and Gemini, and the use cases where it actually fits.

What Is Grok 4.3

Grok 4.3 is xAI’s flagship reasoning model. It is a large language model that accepts text, images and video as input, generates text as output, and runs always-on chain-of-thought reasoning before every response. It supports a 1 million token context window, function calling, web search, code execution, and remote MCP server connections.

It is the direct successor to Grok 4.20 and replaces Grok 4 as xAI’s default flagship. The model is drop-in compatible with the OpenAI SDK and available through the xAI API, OpenRouter and Vercel AI Gateway.

What It’s Best At

Per xAI’s official announcement, Grok 4.3 tops Artificial Analysis leaderboards in agentic tool calling and instruction following, and ranks #1 on Vals AI enterprise benchmarks in case law and corporate finance.

The verifiable benchmark wins, per Vals AI and Artificial Analysis:

#1 on Vals Case Law (v2): 79.3%
#1 on Vals Corporate Finance (v2): 68.5%
Tied #1 on τ²-Bench Telecom: 98% (agentic customer support)
81% on IFBench: frontier-class instruction following
+321 Elo on GDPval-AA: largest single-version jump on real-world agentic tasks

It does not lead the headline composite. Grok 4.3 scores 53 on the Artificial Analysis Intelligence Index, behind GPT-5.5 (60), Claude Opus 4.7 (57) and Gemini 3.1 Pro (57).

Pricing Of Grok 4.3

grok price

xAI released Grok 4.3 alongside two companion APIs at the same launch: the Voice API for real-time conversations and speech, and the Imagine API for image and video generation. Full pricing is on the xAI models page.


Grok 4.3	Voice API	Imagine API
Excels at agentic reasoning, knowledge work, and tool use.	Real-time conversations, speech-to-text, and text-to-speech.	Turn ideas into reality with image and video generation.
Context: 1 million tokens	Agent: $3.00 / hour	Modes: Generation & editing
Input: $1.25 / 1M tokens	TTS: $4.20 / 1M characters	Image: $0.02 / image
Output: $2.50 / 1M tokens	STT: $0.10 / hour	Video: $0.05 / second

Cached input drops to roughly $0.31 per million tokens. Requests above 200K tokens are billed at a higher tier.

Grok 4.3 vs Claude Opus 4.7, GPT-5.5 and Gemini 3.1 Pro


	Grok 4.3	Claude Opus 4.7	GPT-5.5	Gemini 3.1 Pro
Input ($/1M)	$1.25	$5.00	$5.00	$2.00
Output ($/1M)	$2.50	$25.00	$30.00	$12.00
Context	1M	1M	1M	1M
AA Intelligence Index	53	57	60	57
SWE-bench Verified	not top	87.6%	high	80.6%
Modalities in	Text, image, video	Text, image	Text, image	Text, image, audio, video
Reasoning control	Always on	Adaptive	none → xhigh	Low/Med/High

Claude Opus 4.7 leads coding agents and self-verification. GPT-5.5 leads frontier reasoning. Gemini 3.1 Pro leads multimodal breadth (only model with native audio input). Grok 4.3 leads cost-per-capability and enterprise legal/finance benchmarks.

Best Use Cases of Grok 4.3

Where Grok 4.3 fits:

Legal tech. Contract review, case law research, jurisdictional reasoning. The Vals #1 on Case Law (v2) makes this the strongest fit.
Fintech and compliance. Credit agreement analysis, financial document QA, lending workflows. #1 on CorpFin (v2).
Customer support agents. 98% on τ²-Bench Telecom translates to reliable production tool use.
High-volume agentic pipelines. When running millions of inferences daily, the cost gap versus Opus 4.7 is decisive.
Long-document RAG. 1M context plus cached input pricing makes large-corpus retrieval economically viable.
Video analysis at scale. First xAI model with native video input.

Where it does not fit:

Production coding agents. Opus 4.7’s SWE-bench lead and self-verification still matter for unsupervised code work.
Latency-sensitive chat. Always-on reasoning means a time-to-first-token around 19 seconds.
Frontier scientific reasoning. GPT-5.5 leads ARC-AGI-2 and FrontierMath-class problems.
Workflows requiring persistent memory. Grok still doesn’t remember context across sessions.

Models Being Retired May 15, 2026 by xAI

xAI is sunsetting eight models alongside the Grok 4.3 release. Requests to these endpoints will fail after the cutoff:

grok-4-1-fast-reasoning
grok-4-1-fast-non-reasoning
grok-4-fast-reasoning
grok-4-fast-non-reasoning
grok-4-0709
grok-code-fast-1
grok-3
grok-imagine-image-pro

What you should know about Grok 4.3

Grok 4.3 is not the smartest model on the market. It is the cheapest frontier-class reasoning model with a 1M context window, and it leads the benchmarks that legal-tech and fintech buyers actually run.

For enterprise teams running agentic workloads at scale, that combination is the only metric that matters.

Grok 4.3 is now live on the xAI API – It is best for Agentic tool calling

What Is Grok 4.3

What It’s Best At

Pricing Of Grok 4.3

Grok 4.3 vs Claude Opus 4.7, GPT-5.5 and Gemini 3.1 Pro

Best Use Cases of Grok 4.3

Models Being Retired May 15, 2026 by xAI

What you should know about Grok 4.3

AIFreeForever Team

Other readers also enjoyed…

eBook as a business in 2026…We built an eBook generator. You might need it!

One Chat Assistant, Multiple Models, 100% Free

Tesla Reclaims Global EV Sales Lead from BYD in Q1 2026