Neural GB
A Game Boy whose logic units are neural networks — each verified bit-exact over its complete input domain. Gameplay runs the golden units; press Verify frame neurally to re-render the current frame with every unit neural and check bit-identity live.
Neural GB — A Game Boy Whose Logic is Neural Networks
The claim: A real, unmodified Game Boy game (Usurper of the Ghoul Throne, a 512KB MBC5 cartridge from itch.io) boots and renders on an emulator in which every functional unit — instruction decode, the entire ALU, every rotate and bit operation, tile decoding, palette mapping, sprite priority — is a trained neural network.
The output is bit-identical to a conventional reference implementation: every pixel of the framebuffer and every byte of machine state.
The game's title screen was not produced by ordinary emulator code. The frame's 4,178 CPU instructions were each decoded by a neural net and executed through neural ALU nets, and each of its 23,040 pixels passed through neural tile-decode, palette, and sprite-mux nets.
The Methodology
Neural networks are approximators; emulation needs exactness. Being 99.9% correct is 0% playable, because one wrong flag bit desyncs everything downstream. The project's answer is a strict discipline applied uniformly to every functional unit:
- Decompose the machine the way hardware engineers do: into small units whose input spaces are completely enumerable (e.g., an 8-bit adder has 65,536 inputs; the opcode decoder has 256; a palette lookup has 1,024).
- Enumerate the unit's full input domain and its golden outputs.
- Train a small bit-level MLP on the entire domain.
- Verify exhaustively: The unit passes only at N/N. 65,535/65,536 is a fail.
- Compose verified units with plain wiring (routing, scheduling, address arithmetic). Composition of exact units is exact — so programs run forever with zero desync.
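The enumerate-and-verify steps above can be sketched in a few lines. This is a minimal illustration, not the project's code: `golden_adc`, `verify_exhaustive`, and `almost_adc` are hypothetical names, and `almost_adc` is a stand-in for a trained net that is wrong on exactly one input, to show why N-1 out of N counts as a failure.

```python
from itertools import product

def golden_adc(a, b, carry_in):
    """Reference 8-bit add-with-carry: returns (result, Z, H, C)."""
    total = a + b + carry_in
    result = total & 0xFF
    z = int(result == 0)
    h = int((a & 0xF) + (b & 0xF) + carry_in > 0xF)  # carry out of bit 3
    c = int(total > 0xFF)                            # carry out of bit 7
    return result, z, h, c

def adc_domain():
    """Every (a, b, carry_in) combination: 256 * 256 * 2 = 131,072 cases."""
    return product(range(256), range(256), range(2))

def verify_exhaustive(candidate, golden, domain):
    """Return (passed, total). A unit is accepted only when passed == total."""
    passed = total = 0
    for inputs in domain:
        total += 1
        if candidate(*inputs) == golden(*inputs):
            passed += 1
    return passed, total

def almost_adc(a, b, carry_in):
    """Hypothetical stand-in for a net that gets one flag bit wrong once."""
    result, z, h, c = golden_adc(a, b, carry_in)
    if (a, b, carry_in) == (0x0F, 0x01, 0):
        h = 1 - h
    return result, z, h, c

exact = verify_exhaustive(golden_adc, golden_adc, adc_domain())
near = verify_exhaustive(almost_adc, golden_adc, adc_domain())
print(exact, "PASS" if exact[0] == exact[1] else "FAIL")
print(near, "PASS" if near[0] == near[1] else "FAIL")
```

In a real run the `candidate` argument would be the trained network's forward pass, rounded to bits; the harness itself does not change.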
The Scaling Rule: Anything too wide to enumerate is never trained as one net; it is composed from verified narrow units (just as hardware ripples a carry). 16-bit arithmetic (2³² cases, untrainable) is two passes through the verified 8-bit ADC/SBC nets, each of which is verified over 131,072 cases (256 × 256 operand pairs × 2 carry-in values) with the carry-in as an explicit input.
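As a rough sketch of the scaling rule, a 16-bit add can be wired from two calls to an 8-bit add-with-carry. Here `adc8` is plain Python standing in for the verified 8-bit net, and `add16` is a hypothetical name for the glue; flag handling is omitted to keep the carry ripple visible.

```python
import random

def adc8(a, b, carry_in):
    """Stand-in for the verified 8-bit ADC unit: returns (result, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def add16(x, y):
    """16-bit add built from two 8-bit passes, rippling the carry."""
    lo, carry = adc8(x & 0xFF, y & 0xFF, 0)       # low bytes, no carry-in
    hi, carry_out = adc8(x >> 8, y >> 8, carry)   # high bytes plus carry
    return (hi << 8) | lo, carry_out

# Spot-check the composition against ordinary integer arithmetic.
rng = random.Random(0)
samples = [(0x00FF, 0x0001), (0xFFFF, 0x0001), (0x1234, 0xEDCC)]
samples += [(rng.randrange(0x10000), rng.randrange(0x10000)) for _ in range(10000)]
mismatches = [
    (x, y) for x, y in samples
    if add16(x, y) != ((x + y) & 0xFFFF, (x + y) >> 16)
]
print(len(mismatches))
```

The wiring contains no learned parameters, so its correctness rests only on the 8-bit unit being exact and on the glue being right.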
The Proof Chain
Self-consistency isn't enough, so the core was held to the rigorous standards real emulators are judged by:
- SingleStepTests/sm83: 500 opcode files × 1,000 randomized cases, each checking full register/flag/RAM state and cycle counts; all pass.
- Blargg's cpu_instrs: The classic hardware-validated instruction exerciser, reporting over the emulated serial port — 11/11 "Passed all tests".
- Golden vs. Neural Bit-Identity: On the game frame, the framebuffer and complete machine state are identical. The comparison is meaningful because both runs share one orchestrator — only the logic units differ.
- Playability: The turbo raster is pixel-identical to the per-dot reference at the checked frame, runs at ~25 fps, and the game responds to a Start press within 2 frames.
Across the entire build, the test suites found exactly one bug — in hand-written glue, not in any neural unit (SWAP and SRL were transposed in the CB dispatch table). The verified neural units were flawless by construction; the one unverified lookup table was where the bug hid. That is the methodology's thesis in miniature.
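For context on where that bug lived, the CB-prefixed opcode space has a regular layout: the top two bits pick the group, the middle three bits pick the operation or bit index, and the low three bits pick the operand. The sketch below is an illustrative decoder, not the project's dispatch table; `decode_cb` and the list names are hypothetical. Swapping the last two entries of `CB_ROTATE_OPS` would reproduce the kind of SWAP/SRL transposition described above.

```python
# Order of the eight shift/rotate operations in CB opcodes 0x00-0x3F.
CB_ROTATE_OPS = ["RLC", "RRC", "RL", "RR", "SLA", "SRA", "SWAP", "SRL"]
# Operand encoded in the low three bits of every CB opcode.
CB_OPERANDS = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]

def decode_cb(opcode):
    """Decode a CB-prefixed opcode (0x00-0xFF) into a mnemonic string."""
    group = opcode >> 6           # 0: shifts/rotates, 1: BIT, 2: RES, 3: SET
    field = (opcode >> 3) & 7     # operation index or bit number
    operand = CB_OPERANDS[opcode & 7]
    if group == 0:
        return f"{CB_ROTATE_OPS[field]} {operand}"
    name = ("BIT", "RES", "SET")[group - 1]
    return f"{name} {field},{operand}"

for op in (0x37, 0x3F, 0x7C, 0x87):
    print(f"CB {op:02X}: {decode_cb(op)}")
```

A table like this is only 256 entries, so it can be checked exhaustively in the same way as the neural units.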
Formal Proofs & Test Results
1. Unit Verification (Exactness)
All neural units were tested exhaustively against their entire input domain.
| Unit Category | Operations | Domain Size | Result | Time |
|---|---|---|---|---|
| Decoding | decode, cbdecode | 256 | EXACT | ~1s |
| 8-Bit Math | ADC, SBC | 131,072 | EXACT | ~100-230s |
| Bitwise | AND, OR, XOR | 65,536 | EXACT | ~7-27s |
| Inc/Dec | INC, DEC | 256 | EXACT | ~1s |
| BCD/CPL | DAA, CPL | 2,048 / 256 | EXACT | ~1-6s |
| Rotates | RLC, RRC, SLA, SRA, SRL, SWAP | 256 | EXACT | ~1s |
| Wide Rotates | RL, RR | 512 | EXACT | ~1s |
| Bit Ops | BIT, RES, SET | 2,048 | EXACT | ~3s |
| Graphics | tilerow | 65,536 | EXACT | ~6s |
| Palettes | palette | 1,024 | EXACT | ~2s |
| Sprites | sprmux | 32 | EXACT | ~0s |
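To make the graphics rows concrete, here is a sketch of what golden functions for tilerow and palette might look like. It follows the standard Game Boy 2bpp tile format and palette register layout. The function signatures are assumptions rather than the project's actual interfaces; they are chosen so that the enumerated domains come out to the 65,536 and 1,024 cases listed in the table.

```python
from itertools import product

def tilerow(lo, hi):
    """Decode one 8-pixel tile row from its two bitplane bytes.

    Pixel 0 is the leftmost pixel and comes from bit 7 of each byte;
    `hi` supplies the upper bit of each 2-bit colour index.
    """
    return [
        (((hi >> (7 - i)) & 1) << 1) | ((lo >> (7 - i)) & 1)
        for i in range(8)
    ]

def palette(reg, color):
    """Map a 2-bit colour index to a 2-bit shade via a palette register."""
    return (reg >> (2 * color)) & 3

# Domain sizes under the assumed input layouts.
tilerow_domain = list(product(range(256), range(256)))  # (lo, hi)
palette_domain = list(product(range(256), range(4)))    # (reg, color)

print(len(tilerow_domain), len(palette_domain))
print(tilerow(0x3C, 0x7E))
print([palette(0xE4, c) for c in range(4)])
```

With a palette register of 0xE4, each colour index maps to itself, which makes it a convenient sanity check.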
2. Single Step Test (sm83)
Tested against SingleStepTests/sm83 (https://github.com/SingleStepTests/sm83).
- Scope: 500 opcode files × 1,000 randomized full-state cases each, covering the base and CB-prefixed opcode tables. Registers, flags, RAM, and m-cycle counts are checked; HALT idle-cycle accounting is excluded by design.
- Result: ALL OPCODES PASS (0 failures).
3. Blargg CPU Instructions
[ 68.0M m-cycles 117s] serial: 'cpu_instrs\n\n01:ok 02:ok 03:ok 04:ok 05:ok 06:ok 07:ok 08:ok 09:ok 10:ok 11:ok \n\nPassed all tests\n'
FINAL serial output:
cpu_instrs
01:ok 02:ok 03:ok 04:ok 05:ok 06:ok 07:ok 08:ok 09:ok 10:ok 11:ok
Passed all tests
4. Neural Game Frame & Turbo Validation
- Render Time: 5s (4,178 instructions routed through neural decode+ALU).
- Framebuffer (160x144): BIT-IDENTICAL
- Machine state: IDENTICAL
- Turbo vs per-dot frame 140: IDENTICAL
- Speed: per-dot 11.0 fps, turbo 25.2 fps
- Input Latency: START press registers and screen changes at frame 2 (20,483 px).
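A bit-identity check of this kind amounts to a byte-for-byte comparison of two buffers. The sketch below assumes one byte per pixel and uses hypothetical names (`first_mismatch`, `golden_fb`, `neural_fb`); the buffers are synthetic placeholders, not real emulator output.

```python
WIDTH, HEIGHT = 160, 144

def first_mismatch(a, b):
    """Return the index of the first differing byte, or None if identical."""
    if len(a) != len(b):
        return min(len(a), len(b))
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None

# Synthetic framebuffers: one byte per pixel, shade values 0-3.
golden_fb = bytes((x ^ y) & 3 for y in range(HEIGHT) for x in range(WIDTH))
neural_fb = bytes(golden_fb)

# A deliberately corrupted copy, to show how a single-pixel fault is reported.
broken = bytearray(golden_fb)
broken[5 * WIDTH + 17] ^= 1
broken_fb = bytes(broken)

same = first_mismatch(golden_fb, neural_fb)
diff = first_mismatch(golden_fb, broken_fb)
print("BIT-IDENTICAL" if same is None else f"differs at {same}")
print(f"first differing pixel: x={diff % WIDTH}, y={diff // WIDTH}")
```

The same function applies unchanged to a serialized machine state, since it only compares bytes.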
Limitations
Playable-speed gameplay runs on the golden (conventional) units. Because the fully neural machine takes ~5 seconds per frame, --neural is intended as a proof mode, not a practical way to play.
Additionally, the orchestrator makes a few simplifications: OAM DMA is an instant copy, there are no sprite-fetch mode-3 stalls, no OAM/VRAM access blocking by PPU mode, and no audio. All simplifications are strictly identical in both the golden and neural runs, ensuring every bit-identity claim remains unaffected by them.
The Game
The software running in this emulator is Usurper of the Ghoul Throne, a free homebrew Game Boy game created by Evan Dahm.
Support the developer and download the game here: evandahm.itch.io/usurper-ghoul