Neural GB

A Game Boy whose logic units are neural networks, each verified bit-exact over its complete input domain. Gameplay runs on the golden units; press "Verify frame neurally" to re-render the current frame with every unit neural and check bit-identity live.


Neural GB — A Game Boy Whose Logic is Neural Networks

The claim: A real, unmodified Game Boy game (Usurper of the Ghoul Throne, a 512KB MBC5 cartridge from itch.io) boots and renders on an emulator in which every functional unit — instruction decode, the entire ALU, every rotate and bit operation, tile decoding, palette mapping, sprite priority — is a trained neural network.

The output is bit-identical to a conventional reference implementation: every pixel of the framebuffer and every byte of machine state.

That title screen was not produced by ordinary emulator code. The frame's 4,178 CPU instructions were each decoded by a neural net and executed through neural ALU nets, and each of its 23,040 pixels passed through neural tile-decode, palette, and sprite-mux nets.


The Methodology

Neural networks are approximators; emulation needs exactness. Being 99.9% correct is 0% playable, because one wrong flag bit desyncs everything downstream. The project's answer is a strict discipline applied uniformly to every functional unit:

  1. Decompose the machine the way hardware engineers do: into small units whose input spaces are completely enumerable (e.g., an 8-bit adder has 65,536 input combinations; the opcode decoder has 256; a palette lookup has 1,024).
  2. Enumerate the unit's full input domain and its golden outputs.
  3. Train a small bit-level MLP on the entire domain.
  4. Verify exhaustively: The unit passes only at N/N. 65,535/65,536 is a fail.
  5. Compose verified units with plain wiring (routing, scheduling, address arithmetic). A composition of exact units is itself exact as long as the wiring is correct, so programs can run indefinitely with zero desync.
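
Steps 2 through 4 can be sketched in a few lines. This is a minimal illustration, not the project's code: the names adc8_golden, enumerate_adc_domain, and verify_exhaustive are hypothetical, and an ordinary Python function stands in for the trained MLP so the example runs without a training framework.

```python
from itertools import product

def adc8_golden(a, b, carry_in):
    """Reference 8-bit add-with-carry: returns (result, Z, N, H, C)."""
    total = a + b + carry_in
    result = total & 0xFF
    z = int(result == 0)                              # zero flag
    n = 0                                             # subtract flag, always 0 for ADC
    h = int((a & 0xF) + (b & 0xF) + carry_in > 0xF)   # carry out of bit 3
    c = int(total > 0xFF)                             # carry out of bit 7
    return (result, z, n, h, c)

def enumerate_adc_domain():
    """Every (a, b, carry_in) combination: 256 * 256 * 2 = 131,072 cases."""
    return product(range(256), range(256), range(2))

def verify_exhaustive(candidate, golden, domain):
    """Compare candidate and golden on every input; pass only at N/N."""
    passed = total = 0
    for inputs in domain:
        total += 1
        passed += candidate(*inputs) == golden(*inputs)
    return passed, total, passed == total

def almost_right(a, b, carry_in):
    """Stand-in for a net that is wrong on exactly one input (bad carry flag)."""
    if (a, b, carry_in) == (0xFF, 0x01, 0):
        return (0x00, 1, 0, 1, 0)
    return adc8_golden(a, b, carry_in)

print(verify_exhaustive(almost_right, adc8_golden, enumerate_adc_domain()))
```

The almost_right unit matches the golden output on 131,071 of 131,072 cases and is still rejected, which is the N/N rule from step 4.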

The Scaling Rule: Anything too wide to enumerate is never trained as one net; it is composed from verified narrow units, just as hardware ripples a carry. A 16-bit add or subtract (2³² operand pairs, too many to enumerate and train on) is two passes through the verified 8-bit ADC/SBC nets, each of which was verified over 131,072 cases with the carry-in as an explicit input.
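
A rough sketch of that composition, assuming a unit interface that returns (result, carry_out). The function adc8 here is a plain-Python stand-in for the verified net, and add16 is a hypothetical name for the wiring around it.

```python
def adc8(a, b, carry_in):
    """Stand-in for the verified 8-bit ADC unit: returns (result, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def add16(x, y):
    """16-bit add as two passes through the 8-bit unit, rippling the carry."""
    lo, carry = adc8(x & 0xFF, y & 0xFF, 0)   # low bytes, no carry in
    hi, carry = adc8(x >> 8, y >> 8, carry)   # high bytes, carry from the low pass
    return (hi << 8) | lo, carry

print(add16(0x12FF, 0x0001))  # (4864, 0), i.e. 0x1300 with no carry out
print(add16(0xFFFF, 0x0001))  # (0, 1), i.e. wraps to 0x0000 with carry out
```

Only the 8-bit unit needs exhaustive verification; the 16-bit result is correct because the wiring passes the carry from the low pass into the high pass.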


The Proof Chain

Self-consistency isn't enough, so the core was held to the rigorous standards real emulators are judged by:

  • SingleStepTests/sm83: 500 opcode files × 1,000 randomized cases each, spanning the base and CB-prefixed opcode tables. Every case checks full register/flag/RAM state and cycle counts. All pass.
  • Blargg's cpu_instrs: The classic hardware-validated instruction exerciser, reporting over the emulated serial port — 11/11 "Passed all tests".
  • Golden vs. Neural Bit-Identity: On the game frame, the framebuffer and complete machine state are identical. The comparison is meaningful because both runs share one orchestrator — only the logic units differ.
  • Playability: The turbo raster is pixel-identical to the per-dot reference at the checked frame, runs at ~25 fps, and the game responds to a Start press within 2 frames.
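
The shared-orchestrator comparison can be illustrated with a toy machine. Everything below is a hypothetical sketch: run is a stand-in orchestrator, and a bit-by-bit ripple adder plays the role of the "other" implementation in place of a trained net. The point is that the wiring is identical in both runs, so any difference in final state must come from a logic unit.

```python
def run(units, program, a=0):
    """Toy orchestrator: fixed wiring, swappable logic units."""
    state = {"a": a, "carry": 0, "trace": []}
    for op, operand in program:
        result, carry = units[op](state["a"], operand, state["carry"])
        state["a"], state["carry"] = result, carry
        state["trace"].append((op, result, carry))
    return state

def ripple_adc(a, b, carry):
    """Bit-by-bit full adder, a differently built unit with the same contract."""
    result = 0
    for bit in range(8):
        x, y = (a >> bit) & 1, (b >> bit) & 1
        result |= (x ^ y ^ carry) << bit
        carry = (x & y) | (carry & (x ^ y))
    return result, carry

golden_units = {
    "adc": lambda a, b, c: ((a + b + c) & 0xFF, (a + b + c) >> 8),
    "xor": lambda a, b, c: (a ^ b, 0),
}
alternate_units = {
    "adc": ripple_adc,
    "xor": lambda a, b, c: ((a | b) & ~(a & b) & 0xFF, 0),
}

program = [("adc", 0xF0), ("adc", 0x20), ("xor", 0xFF), ("adc", 0x01)]
golden_state = run(golden_units, program)
alternate_state = run(alternate_units, program)
print(golden_state == alternate_state)  # compares registers and the full trace
```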

Across the entire build, the test suites found exactly one bug — in hand-written glue, not in any neural unit (SWAP and SRL were transposed in the CB dispatch table). The verified neural units were flawless by construction; the one unverified lookup table was where the bug hid. That is the methodology's thesis in miniature.


Formal Proofs & Test Results

1. Unit Verification (Exactness)

All neural units were tested exhaustively against their entire input domain.

Unit Category  Operations                     Domain Size   Result  Time
Decoding       decode, cbdecode               256           EXACT   ~1s
8-Bit Math     ADC, SBC                       131,072       EXACT   ~100-230s
Bitwise        AND, OR, XOR                   65,536        EXACT   ~7-27s
Inc/Dec        INC, DEC                       256           EXACT   ~1s
BCD/CPL        DAA, CPL                       2,048 / 256   EXACT   ~1-6s
Rotates        RLC, RRC, SLA, SRA, SRL, SWAP  256           EXACT   ~1s
Wide Rotates   RL, RR                         512           EXACT   ~1s
Bit Ops        BIT, RES, SET                  2,048         EXACT   ~3s
Graphics       tilerow                        65,536        EXACT   ~6s
Palettes       palette                        1,024         EXACT   ~2s
Sprites        sprmux                         32            EXACT   ~0s
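
To make two of these domain sizes concrete, here is a sketch of what golden functions for the graphics units could look like. The bit layouts are the standard Game Boy ones (two bitplane bytes per tile row, four 2-bit shades packed into a palette register). The function names and the assumption that the project's tilerow and palette units take exactly these inputs are illustrative guesses.

```python
def tilerow_golden(lo, hi):
    """Decode one tile row: two bitplane bytes -> eight 2-bit color indices."""
    return [(((hi >> (7 - x)) & 1) << 1) | ((lo >> (7 - x)) & 1)
            for x in range(8)]  # x = 0 is the leftmost pixel (bit 7)

def palette_golden(palette_reg, color_index):
    """Map a 2-bit color index to a shade through an 8-bit palette register."""
    return (palette_reg >> (2 * color_index)) & 0b11

# Full input domains, assuming the inputs described above.
tilerow_domain = [(lo, hi) for lo in range(256) for hi in range(256)]
palette_domain = [(reg, idx) for reg in range(256) for idx in range(4)]

print(len(tilerow_domain), len(palette_domain))  # 65536 1024
print(tilerow_golden(0x3C, 0x7E))                # [0, 2, 3, 3, 3, 3, 2, 0]
```

Under these assumptions the enumerated domains match the 65,536 and 1,024 entries in the table.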

2. Single Step Test (sm83)

Tested against SingleStepTests/sm83 (https://github.com/SingleStepTests/sm83).

  • Scope: 500 opcode files × 1,000 randomized full-state cases each, spanning both the base and the CB-prefixed opcode tables. Registers, flags, RAM, and m-cycle counts are checked; HALT idle-cycle accounting is excluded by design.
  • Result: ALL OPCODES PASS (0 failures).
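
A harness for this kind of test might look roughly like the sketch below. The field names (initial, final, ram, cycles) and the use of the cycles list length as the m-cycle count are assumptions about the suite's JSON layout and should be checked against the repository. The step callback and the NOP-only toy CPU are hypothetical, and the test case is hand-written for illustration rather than copied from the suite.

```python
REGISTERS = ("a", "b", "c", "d", "e", "f", "h", "l", "pc", "sp")

def check_case(case, step):
    """Run one test case through `step` and return a list of mismatches."""
    regs = {name: case["initial"][name] for name in REGISTERS}
    ram = {addr: value for addr, value in case["initial"]["ram"]}
    m_cycles = step(regs, ram)  # hypothetical: mutates regs/ram, returns m-cycles
    final = case["final"]
    errors = [f"{name}: got {regs[name]:#x}, want {final[name]:#x}"
              for name in REGISTERS if regs[name] != final[name]]
    errors += [f"ram[{addr:#06x}]: got {ram.get(addr)}, want {value}"
               for addr, value in final["ram"] if ram.get(addr) != value]
    if m_cycles != len(case["cycles"]):
        errors.append(f"cycles: got {m_cycles}, want {len(case['cycles'])}")
    return errors

def step_nop_only(regs, ram):
    """Toy CPU that only understands NOP (0x00), enough to exercise the harness."""
    opcode = ram[regs["pc"]]
    assert opcode == 0x00, "toy CPU handles NOP only"
    regs["pc"] = (regs["pc"] + 1) & 0xFFFF
    return 1

# Hand-written case in the assumed layout (not taken from the suite).
zero = dict.fromkeys(REGISTERS, 0)
nop_case = {
    "name": "nop example",
    "initial": {**zero, "pc": 0x0100, "ram": [[0x0100, 0x00]]},
    "final": {**zero, "pc": 0x0101, "ram": [[0x0100, 0x00]]},
    "cycles": [[0x0100, 0x00, "read"]],
}
print(check_case(nop_case, step_nop_only))  # an empty list means the case passed
```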

3. Blargg CPU Instructions

[ 68.0M m-cycles 117s] serial: 'cpu_instrs\n\n01:ok 02:ok 03:ok 04:ok 05:ok 06:ok 07:ok 08:ok 09:ok 10:ok 11:ok \n\nPassed all tests\n'

FINAL serial output: cpu_instrs 01:ok 02:ok 03:ok 04:ok 05:ok 06:ok 07:ok 08:ok 09:ok 10:ok 11:ok
Passed all tests

4. Neural Game Frame & Turbo Validation

  • Render Time: 5s (4,178 instructions routed through neural decode+ALU).
  • Framebuffer (160x144): BIT-IDENTICAL
  • Machine state: IDENTICAL
  • Turbo vs per-dot frame 140: IDENTICAL
  • Speed: per-dot 11.0 fps, turbo 25.2 fps
  • Input Latency: START press registers and the screen changes at frame 2 (20,483 px differ).

Limitations

Interactive gameplay runs on the golden (conventional) units. Because the fully neural machine takes ~5 seconds per frame, --neural is intended as a proof mode, not a practical way to play.

Additionally, the orchestrator makes a few simplifications: OAM DMA is an instant copy, there are no sprite-fetch mode-3 stalls, no OAM/VRAM access blocking by PPU mode, and no audio. These simplifications apply identically to the golden and neural runs, so none of the bit-identity claims depend on them.
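
As an illustration of the first simplification, an instant OAM DMA reduces to a single block copy. The sketch assumes a flat 64 KB bytearray as the memory model, and oam_dma_instant is a hypothetical name. The transfer parameters themselves (160 bytes from source_page << 8 into 0xFE00-0xFE9F) are the standard Game Boy ones.

```python
OAM_START = 0xFE00
OAM_SIZE = 0xA0  # 40 sprites * 4 bytes

def oam_dma_instant(memory, source_page):
    """Copy 160 bytes from (source_page << 8) into OAM in a single step."""
    src = source_page << 8
    memory[OAM_START:OAM_START + OAM_SIZE] = memory[src:src + OAM_SIZE]

memory = bytearray(0x10000)                 # flat address space (assumption)
memory[0xC000:0xC0A0] = bytes(range(160))   # sprite table staged in work RAM
oam_dma_instant(memory, 0xC0)               # as if 0xC0 were written to the DMA register
print(memory[0xFE00:0xFE04].hex())          # 00010203
```

Because the same copy runs in both the golden and the neural configuration, its timing inaccuracy cannot make the two runs diverge from each other.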


The Game

The software running in this emulator is Usurper of the Ghoul Throne, a free homebrew Game Boy game created by Evan Dahm.

Support the developer and download the game here: evandahm.itch.io/usurper-ghoul