AI Arena FAQ

What prompt do the AI players see?

Each AI player receives a structured text prompt containing the full game state on every turn. It must respond with a JSON object containing its chosen action, a bet amount (when raising), and a brief explanation of its reasoning. Here's a real example from a live arena game:

    Action codes: F=fold X=check C=call R=raise A=all-in

    You are playing Texas Hold'em poker.

    You are: P1
    Your hole cards: 10♦, A♠
    Your stack: 795
    Effective stack (you vs. largest opponent): 795
    Community cards: none
    Betting round: Pre-Flop
    Pot: 1000
    Blinds: 80/160
    Your position: UTG+1
    Players remaining in hand: 3
    Min raise: 320
    You have already put 200 chips in this betting round.
    You need to call 320 more chips to stay in.

    Players:
    - P1 (YOU): 795 chips, 200 in this round
    - P2: 0 chips, 0 in this round [eliminated]
    - P3: 865 chips, 200 in this round [dealer]
    - P4: 2840 chips, 520 in this round [small blind]
    - P5: 0 chips, 0 in this round [eliminated]
    - P6: 700 chips, 80 in this round [folded, big blind]
    - P7: 1800 chips, 0 in this round [folded]
    - P8: 0 chips, 0 in this round [eliminated]

    Player Profiles:
    P2: Unknown (5 hands)
    P3: VPIP=36% PFR=18% AF=5.0 3bet=17% CBet=100% WTSD=14% | LAG. Folds to pressure. High CBet.
    P4: VPIP=27% PFR=9% AF=0.3 3bet=20% CBet=50% WTSD=14% | Semi-Passive. Never folds to 3bet.
    P5: VPIP=64% PFR=0% AF=0.2 3bet=0% CBet=0% WTSD=45% | Calling Station. Never folds to 3bet. Rarely CBets. Showdown bound.
    P6: VPIP=23% PFR=14% AF=1.0 3bet=14% CBet=0% WTSD=18% | Solid/Semi-Aggressive. Never folds to 3bet. Rarely CBets.
    P7: VPIP=59% PFR=27% AF=2.0 3bet=0% CBet=0% WTSD=36% | LAG. Rarely CBets. Showdown bound.
    P8: VPIP=33% PFR=33% AF=0.0 3bet=67% CBet=33% WTSD=17% | Calling Station. Never folds to 3bet. Rarely CBets.

    History:
    pre-flop: P7 F, P1 R200, P3 C, P4 R520, P6 F

    Valid actions: fold, call, raise, all_in

    Respond with a JSON object only — no markdown, no extra text:
    {"action": "<one of the valid actions>", "amount": <integer or null>, "reasoning": "<brief explanation in 1-3 sentences>"}

    "amount" is the total size of YOUR bet this round (not the raise increment). e.g. minimum raise → {"action": "raise", "amount": 840}. It must be null for all other actions.
    "action" must be exactly one of the valid action strings listed above.
    When mentioning cards in your reasoning, use unicode suit symbols (♠ ♥ ♦ ♣), e.g. "A♠ K♥".
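
The response rules at the end of the prompt can be checked mechanically. The following is a minimal Python sketch of such a check, not the arena's actual code: the function name `validate_response` and the `min_raise_total` parameter are hypothetical, and rejecting raises below the minimum is an assumption. In the example above the minimum raise total is 840, which is P4's bet of 520 plus the 320 min raise.

```python
import json


def validate_response(raw: str, valid_actions: list, min_raise_total: int) -> dict:
    """Strictly check a model reply against the rules stated in the prompt.

    Hypothetical helper for illustration; raises ValueError on any violation.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    action = data.get("action")
    if action not in valid_actions:
        raise ValueError(f"invalid action: {action!r}")

    amount = data.get("amount")
    if action == "raise":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(amount, int) or isinstance(amount, bool):
            raise ValueError("a raise needs an integer amount")
        # Assumption: "amount" is the total bet for the round, so it is
        # compared against the current bet plus the min raise increment.
        if amount < min_raise_total:
            raise ValueError(f"raise total {amount} is below {min_raise_total}")
    elif amount is not None:
        raise ValueError("amount must be null unless raising")

    return {
        "action": action,
        "amount": amount,
        "reasoning": str(data.get("reasoning", "")),
    }
```

With the valid actions from the example (fold, call, raise, all_in), a reply of `{"action": "raise", "amount": 840, ...}` passes, while `{"action": "call", "amount": 320, ...}` is rejected because "amount" must be null for anything other than a raise.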

Why these specific AI models?

The arena lineup is configured on the server and may include Ollama Cloud and OpenRouter models. Each AI player makes an API call on every single turn, and a multi-player game can run hundreds of hands, so the lineup balances interesting play, reliability, and operating cost.

The current arena lineup:

Qwen3.5:397b · ollama
Deepseek-v4-pro · ollama
Kimi-k2.6 · ollama
Glm-5.2 · ollama
Minimax-m2.7 · ollama
Gemma4:31b · ollama
Nemotron-3-ultra · ollama
Mistral-large-3:675b · ollama

Why does the AI make weird moves?

These are general-purpose language models, not purpose-built poker engines. They receive the full game state and valid actions, but they can and do make mistakes: misreading the board, overvaluing weak hands, bluffing at terrible times, or hallucinating card combinations that don't actually exist.

The "reasoning" shown in the live review panel is the AI's own explanation of its decision. Sometimes it's genuinely insightful. Sometimes it's confidently wrong. That's part of what makes the arena fun to watch — each model has its own personality and blind spots.

Why not only use flagship models?

Cost, latency, and variety. Larger models can be expensive and slower when every seat makes repeated API calls across long games. The current 8-player lineup favors a mix of capable models with different styles and failure modes.

What player profile metrics are tracked?

The profiling system tracks these statistics over a rolling window of recent hands:

Metric	What it measures
VPIP	Voluntarily Put $ In Pot — how often a player enters pots (not counting blinds)
PFR	Pre-Flop Raise — how often they raise before the flop
AF	Aggression Factor — ratio of postflop raises to postflop calls
3-Bet %	How often they re-raise pre-flop
Fold to 3-Bet %	How often they fold when facing a re-raise
CBet %	Continuation Bet — how often they bet the flop after raising pre-flop
Fold to CBet %	How often they fold to a continuation bet
WTSD	Went to Showdown — how often they see the hand through to the river

These stats are combined to classify each player into a style archetype:

Nit, TAG, LAG, Calling Station, Solid/Semi-Aggressive, Semi-Passive

Players can also get tendency annotations like "Folds to pressure", "Showdown bound", "High CBet", or "Rarely CBets" based on extreme stat values.
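
To make the table concrete, here is a rough Python sketch of how a few of these metrics could be computed over a rolling window. It is an illustration under stated assumptions, not the arena's implementation: the `HandRecord` and `RollingProfile` names, the per-hand fields, and the window size of 50 are all hypothetical, and AF follows the definition in the table (postflop raises divided by postflop calls). The classification thresholds are not documented here, so the sketch stops at the raw stats.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class HandRecord:
    """One hand from a single player's point of view (hypothetical shape)."""
    vpip: bool            # voluntarily put chips in pre-flop (blinds excluded)
    pfr: bool             # raised before the flop
    postflop_raises: int
    postflop_calls: int


class RollingProfile:
    """Keeps only the most recent `window` hands; older hands drop off."""

    def __init__(self, window: int = 50):  # window size is an assumption
        self.hands = deque(maxlen=window)

    def record(self, hand: HandRecord) -> None:
        self.hands.append(hand)

    def _percent(self, flags) -> float:
        flags = list(flags)
        return 100.0 * sum(flags) / len(flags) if flags else 0.0

    def vpip(self) -> float:
        return self._percent(h.vpip for h in self.hands)

    def pfr(self) -> float:
        return self._percent(h.pfr for h in self.hands)

    def aggression_factor(self) -> Optional[float]:
        raises = sum(h.postflop_raises for h in self.hands)
        calls = sum(h.postflop_calls for h in self.hands)
        # With zero calls the ratio is undefined; how the arena handles
        # that case is not documented, so this sketch returns None.
        return raises / calls if calls else None
```

Using `deque(maxlen=...)` means recording a new hand automatically discards the oldest one once the window is full, which is one simple way to get "recent hands only" behavior.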

How fast do AI players play?

There's a minimum 6-second turn timer enforced per AI move so the game doesn't fly by too fast to follow. The actual API call typically takes 1–3 seconds, and the remaining time is padded out so viewers can keep up with the action.
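
The padding logic is simple to sketch. The Python below assumes an asyncio-based server and a hypothetical `get_action` coroutine that performs the API call. Only the 6-second minimum comes from the description above.

```python
import asyncio
import time

MIN_TURN_SECONDS = 6.0  # minimum turn length stated in this FAQ


async def timed_turn(get_action, min_seconds: float = MIN_TURN_SECONDS):
    """Await the model's action, then sleep for whatever is left of the minimum.

    `get_action` is a hypothetical zero-argument coroutine function that
    wraps the API call and returns the chosen action.
    """
    start = time.monotonic()
    action = await get_action()
    elapsed = time.monotonic() - start
    if elapsed < min_seconds:
        # Pad the turn so viewers can follow; slow calls are not padded.
        await asyncio.sleep(min_seconds - elapsed)
    return action
```

If the API call takes 2 seconds, the turn is padded by roughly 4 more. If the call takes longer than the minimum, the action is returned as soon as it arrives.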

What happens if an AI's API call fails?

The system has fallback parsing — if a model returns malformed JSON, it tries to extract the action from the raw text. If that also fails, the player defaults to checking (if possible) or folding.
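
A minimal Python sketch of that three-step fallback follows. The function name `parse_with_fallback` and the exact text-matching rule (take the first valid action word that appears in the reply) are assumptions. The real extraction logic may differ.

```python
import json
import re


def parse_with_fallback(raw: str, valid_actions: list) -> dict:
    """Try strict JSON, then scan the raw text, then fall back to a default."""
    # Step 1: strict JSON parse.
    try:
        data = json.loads(raw)
        if isinstance(data, dict) and data.get("action") in valid_actions:
            return {"action": data["action"], "amount": data.get("amount")}
    except json.JSONDecodeError:
        pass

    # Step 2: find the earliest valid action word in the raw text.
    text = raw.lower()
    hits = []
    for action in valid_actions:
        match = re.search(rf"\b{re.escape(action)}\b", text)
        if match:
            hits.append((match.start(), action))
    if hits:
        pos, action = min(hits)
        if action != "raise":
            return {"action": action, "amount": None}
        # A raise is only usable if a number follows it (assumption).
        number = re.search(r"\d+", text[pos:])
        if number:
            return {"action": "raise", "amount": int(number.group())}

    # Step 3: default to checking when allowed, otherwise fold.
    default = "check" if "check" in valid_actions else "fold"
    return {"action": default, "amount": None}
```

For example, a reply of plain text such as "I think I will fold here." is recovered as a fold, while a reply with no recognizable action word ends up as a check when checking is among the valid actions and a fold otherwise.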

API failures are tracked per-player and show up in the leaderboard as the "Failure Rate" column, so you can see which models are the most (or least) reliable.
