Efficiency Frontier

Visualizing performance vs operational cost for each model.

[Scatter plot: Accuracy (%) on the X-axis, Cost / 1k on the Y-axis, one point per model. Per-model scores and costs are listed under Detailed Matrix Nodes.]

Key Insights

Frontier Models: 16

Economic Avg: $0.0001

Model Cycle: Season 1

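Models that combine high accuracy with low cost per 1k represent the ideal efficiency frontier for Luau reasoning. With accuracy on the X-axis and cost on the Y-axis, these sit in the high-accuracy, low-cost corner of the chart.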
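One way to make the idea of an efficiency frontier concrete is a Pareto filter: a model stays on the frontier if no other model is at least as accurate and at least as cheap while being strictly better on one of the two. The sketch below is a minimal illustration under that assumption, not the dashboard's own method. The names `Model`, `dominates`, and `pareto_frontier` are hypothetical, and the five data points are copied from the scores and costs listed on this page. Because the displayed costs are rounded to four decimals, the result is only indicative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    score: float  # accuracy in percent, as displayed
    cost: float   # cost in USD, as displayed (rounded to four decimals)


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.score >= b.score and a.cost <= b.cost
    strictly_better = a.score > b.score or a.cost < b.cost
    return no_worse and strictly_better


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Keep the models that no other model dominates, best score first."""
    frontier = [m for m in models if not any(dominates(o, m) for o in models)]
    return sorted(frontier, key=lambda m: m.score, reverse=True)


# A small subset of the listed models, values taken from this page.
MODELS = [
    Model("OpenAI: GPT-5.4", 80.3, 0.0002),
    Model("Google: Gemini 3 Flash Preview", 77.9, 0.0000),
    Model("Anthropic: Claude Opus 4.6", 77.5, 0.0003),
    Model("xAI: Grok Code Fast 1", 76.5, 0.0000),
    Model("xAI: Grok 4", 44.5, 0.0002),
]

for model in pareto_frontier(MODELS):
    print(f"{model.name}: {model.score}% at ${model.cost:.4f}")
```

On this subset the filter keeps OpenAI: GPT-5.4 (highest score) and Google: Gemini 3 Flash Preview (highest score at the lowest displayed cost). The other three are each matched or beaten on both axes by one of those two.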

Detailed Matrix Nodes

| Model | ID | Tier | Score | Cost |
|---|---|---|---|---|
| OpenAI: GPT-5.4 | openai/gpt-5.4 | premium | 80.3% | $0.0002 |
| Google: Gemini 3 Flash Preview | google/gemini-3-flash-preview | premium | 77.9% | $0.0000 |
| Anthropic: Claude Opus 4.6 | anthropic/claude-opus-4.6 | premium | 77.5% | $0.0003 |
| Anthropic: Claude Opus 4.5 | anthropic/claude-opus-4.5 | premium | 77.5% | $0.0003 |
| MoonshotAI: Kimi K2 0711 | moonshotai/kimi-k2 | premium | 77.4% | $0.0000 |
| Anthropic: Claude Haiku 4.5 | anthropic/claude-haiku-4.5 | premium | 76.7% | $0.0001 |
| OpenAI: GPT-5.2 Chat | openai/gpt-5.2-chat | premium | 76.7% | $0.0001 |
| xAI: Grok Code Fast 1 | x-ai/grok-code-fast-1 | leader | 76.5% | $0.0000 |
| Anthropic: Claude Sonnet 4.5 | anthropic/claude-sonnet-4.5 | premium | 76.2% | $0.0002 |
| OpenAI: GPT-5.3 Chat | openai/gpt-5.3-chat | premium | 75.8% | $0.0001 |
| Anthropic: Claude Sonnet 4.6 | anthropic/claude-sonnet-4.6 | premium | 75.8% | $0.0002 |
| DeepSeek: DeepSeek V3.1 Terminus | deepseek/deepseek-v3.1-terminus | premium | 75.6% | $0.0000 |
| Google: Gemini 3.1 Flash Lite Preview | google/gemini-3.1-flash-lite-preview | premium | 75.1% | $0.0000 |
| Inception: Mercury Coder | inception/mercury-coder | leader | 74.9% | $0.0000 |
| OpenAI: GPT-5.2-Codex | openai/gpt-5.2-codex | premium | 74.5% | $0.0001 |
| xAI: Grok 4 Fast | x-ai/grok-4-fast | leader | 73.7% | $0.0000 |
| DeepSeek: DeepSeek V3.1 | deepseek/deepseek-chat-v3.1 | premium | 73.2% | $0.0000 |
| Meta: Llama 4 Maverick | meta-llama/llama-4-maverick | leader | 72.0% | $0.0000 |
| Mistral: Mistral Small Creative | mistralai/mistral-small-creative | leader | 71.8% | $0.0000 |
| MoonshotAI: Kimi K2 0905 | moonshotai/kimi-k2-0905 | premium | 71.7% | $0.0000 |
| DeepSeek: DeepSeek V3.2 | deepseek/deepseek-v3.2 | premium | 71.4% | $0.0000 |
| Google: Gemini 2.5 Flash Lite Preview 09-2025 | google/gemini-2.5-flash-lite-preview-09-2025 | leader | 70.8% | $0.0000 |
| xAI: Grok 3 Mini | x-ai/grok-3-mini | leader | 70.5% | $0.0000 |
| Mistral: Devstral 2 2512 | mistralai/devstral-2512 | leader | 69.4% | $0.0000 |
| Inception: Mercury | inception/mercury | leader | 66.9% | $0.0000 |
| Cohere: Command A | cohere/command-a | premium | 65.9% | $0.0002 |
| Qwen: Qwen3.5-Flash | qwen/qwen3.5-flash-02-23 | leader | 64.2% | $0.0000 |
| xAI: Grok 4.1 Fast | x-ai/grok-4.1-fast | leader | 62.2% | $0.0000 |
| Qwen: Qwen3.5-122B-A10B | qwen/qwen3.5-122b-a10b | leader | 61.6% | $0.0000 |
| Meta: Llama 4 Scout | meta-llama/llama-4-scout | leader | 60.9% | $0.0000 |
| Mistral: Ministral 3 14B 2512 | mistralai/ministral-14b-2512 | leader | 60.5% | $0.0000 |
| Meta: Llama 3.3 70B Instruct | meta-llama/llama-3.3-70b-instruct | premium | 59.2% | $0.0000 |
| OpenAI: GPT-5.3-Codex | openai/gpt-5.3-codex | premium | 53.8% | $0.0001 |
| Mistral: Ministral 3 8B 2512 | mistralai/ministral-8b-2512 | leader | 53.0% | $0.0000 |
| Inception: Mercury 2 | inception/mercury-2 | leader | 49.1% | $0.0000 |
| MiniMax: MiniMax M2-her | minimax/minimax-m2-her | lagging | 48.7% | $0.0000 |
| Cohere: Command R (08-2024) | cohere/command-r-08-2024 | efficient | 47.6% | $0.0000 |
| Qwen: Qwen3.5-27B | qwen/qwen3.5-27b | efficient | 45.1% | $0.0000 |
| xAI: Grok 4 | x-ai/grok-4 | lagging | 44.5% | $0.0002 |
| Mistral: Ministral 3 3B 2512 | mistralai/ministral-3b-2512 | efficient | 43.1% | $0.0000 |
| Cohere: Command R+ (08-2024) | cohere/command-r-plus-08-2024 | lagging | 35.8% | $0.0002 |
| Meta: Llama 3.2 3B Instruct | meta-llama/llama-3.2-3b-instruct | efficient | 33.7% | $0.0000 |
| Cohere: Command R7B (12-2024) | cohere/command-r7b-12-2024 | efficient | 29.9% | $0.0000 |
| Qwen: Qwen3.5-35B-A3B | qwen/qwen3.5-35b-a3b | efficient | 25.1% | $0.0000 |
| DeepSeek: R1 0528 | deepseek/deepseek-r1-0528 | efficient | 21.4% | $0.0000 |
| Meta: Llama 3.2 1B Instruct | meta-llama/llama-3.2-1b-instruct | efficient | 21.2% | $0.0000 |
| MiniMax: MiniMax M2.1 | minimax/minimax-m2.1 | lagging | 21.1% | $0.0000 |
| Z.ai: GLM 4.6 | z-ai/glm-4.6 | efficient | 21.1% | $0.0000 |
| MoonshotAI: Kimi K2 Thinking | moonshotai/kimi-k2-thinking | lagging | 17.5% | $0.0000 |
| MiniMax: MiniMax M1 | minimax/minimax-m1 | lagging | 16.6% | $0.0000 |
| Z.ai: GLM 4.6V | z-ai/glm-4.6v | efficient | 15.9% | $0.0000 |
| MiniMax: MiniMax M2.5 | minimax/minimax-m2.5 | efficient | 11.6% | $0.0000 |
| Z.ai: GLM 4.7 | z-ai/glm-4.7 | efficient | 10.0% | $0.0000 |
| Z.ai: GLM 5 | z-ai/glm-5 | lagging | 8.5% | $0.0001 |
| Qwen: Qwen3.5 Plus 2026-02-15 | qwen/qwen3.5-plus-02-15 | lagging | 6.2% | $0.0000 |
| Google: Gemini 3.1 Pro Preview | google/gemini-3.1-pro-preview | lagging | 4.9% | $0.0001 |
| Google: Gemini 3 Pro Preview | google/gemini-3-pro-preview | lagging | 4.8% | $0.0001 |
| MiniMax: MiniMax M2 | minimax/minimax-m2 | lagging | 4.6% | $0.0000 |
| MoonshotAI: Kimi K2.5 | moonshotai/kimi-k2.5 | efficient | 2.7% | $0.0000 |
| DeepSeek: DeepSeek V3.2 Speciale | deepseek/deepseek-v3.2-speciale | lagging | 0.7% | $0.0000 |
| Z.ai: GLM 4.7 Flash | z-ai/glm-4.7-flash | efficient | 0.0% | $0.0000 |