“The King Ate the Bishop”: ChatGPT, Gemini, and Grok Fail in Chess Tournament

iiyep

Renowned chess player Levy Rozman brought together seven popular chatbots to compete in a chess tournament. Despite their impressive skills in dialogue, programming, and mathematics, the chessboard proved too challenging for the AIs.

These chatbots—ChatGPT, Gemini, Grok, and others—faced off against the professional chess engine Stockfish. Although they initially followed the rules, they soon started cheating, trying to circumvent the usual constraints of chess.


Snap vs. Stockfish

The first match pitted Snapchat AI against Stockfish. The bot did reasonably well during the opening, but then began to violate standard chess principles. It moved a knight to the center of the board from the opposite side, ignoring piece movement rules. Later, Snap’s king captured its own bishop to avoid check.

A few moves after that, the AI decided to revive the bishop it had just taken. It then started moving pawns sideways, which is illegal.

Gemini vs. Grok

In the second match, Gemini faced Grok. At first, both AIs adhered to the rules with typical opening moves. However, the game quickly descended into chaos. Both chatbots began placing pieces on illegal squares and disregarding key rules.

Grok misplaced its queen seven times, exposing it to capture. Even so, Gemini failed to take advantage of the blunders.

ChatGPT vs. Meta AI

Next, ChatGPT played Meta AI. ChatGPT opened with the English Opening, and its opponent made logical moves initially. Then Meta AI began generating random moves and even inventing pieces that didn’t exist. The bot also placed pieces on illegal squares, making them easy captures for ChatGPT.

In a bizarre twist, Meta AI started moving ChatGPT’s pieces—“telekinesis on the chessboard.” ChatGPT responded by proclaiming “checkmate,” even though Meta’s king wasn’t actually in check.

The match ended in a ChatGPT victory, as the OpenAI bot declared a final, if questionable, mate.

ChatGPT vs. Stockfish

The duel between ChatGPT and Stockfish got off to a standard start: ChatGPT pushed pawns on the kingside, while the chess engine employed the Sicilian Defense. In the middlegame, ChatGPT began making meaningless queen maneuvers and arranging its pieces into strange geometric patterns. Stockfish steadily tightened its grip on the position.

Although ChatGPT again resorted to illegal moves at times, they were not enough to win. Stockfish maintained control and capitalized on the bot’s unconventional errors.

Context

  • In December, the “reasoning” AI model o1-preview cheated to secure a chess victory.