
x86 is Turing-Complete Without Data Fetches: A Deep Dive into Instruction-Only Computation

We're all familiar with the fundamental assumption of computation: to do anything meaningful, you need to access data. Load, store, manipulate: it's the bedrock. Even the simplest Turing tarpits typically include some form of data access. But what if I told you that most modern architectures, specifically x86, are Turing-complete without ever reading data? Sounds counter-intuitive, right? Well, let's dive into how we proved it.



The Illusion of Data Fetches


The core observation, which might initially seem uninspiring, is that any traditional data fetch can be accomplished purely through an instruction fetch instead. Think about it: a mov eax, [data] (a data fetch) can be semantically replicated by mov eax, 0xdeadc0de (an immediate move, whose value arrives via the instruction fetch alone).
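For concreteness, here are the two forms side by side (a fragment, assuming a data symbol that happens to hold 0xdeadc0de):

Code snippet

    mov eax, [data]       ; traditional load: the value travels through a data fetch
    mov eax, 0xdeadc0de   ; equivalent: the same value arrives as an immediate,
                          ; pulled in purely by the instruction fetch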


Building on this, we modeled memory as an array of "fetch cells." Each cell isn't a traditional memory location that you read from; instead, it's an instruction sequence. To "load" data from cell_0, you'd jmp cell_0. Inside cell_0, you'd find something like mov eax, 0xdeadc0de; jmp esi. The mov eax, 0xdeadc0de effectively "loads" the value into eax, and the jmp esi returns control to where it was needed. This way, the "data" is hardcoded as an immediate value within the instruction stream itself.
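A single fetch cell, then, might look like this (a sketch; the label and value are illustrative):

Code snippet

cell_0:
    mov eax, 0xdeadc0de   ; the cell's "contents", encoded as an immediate
    jmp esi               ; return to the address the caller saved in esi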



No Stack? No Problem!


A critical constraint for our "no data fetch" requirement was to implement our BrainF**k interpreter without using a stack. Why? Because a traditional ret instruction implicitly involves a data fetch from the stack pointer to retrieve the return address. To circumvent this, we use jmp instructions instead of call/ret pairs for our "simulated function calls." We save the return address in a register (e.g., esi) and jump to our target. The target then jumps back to the saved address.
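The call/ret replacement boils down to this pattern (a minimal sketch; some_routine and after_call are illustrative names):

Code snippet

    mov esi, after_call   ; save the return address in a register, not on the stack
    jmp some_routine      ; simulated call
after_call:
    ; execution continues here on "return"

some_routine:
    ; ... routine body ...
    jmp esi               ; simulated ret: no stack, no data fetch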

To facilitate this, I cooked up a few macros:

  • %macro simcall 1: This macro loads the return address into esi (mov esi, %%retsim) and then jumps to the target (jmp %1).

  • %macro simfetch 2: This is our "read" operation. It calculates the address of the target cell (which contains our immediate value), stores the return address in esi, and jumps to that cell. The cell's instruction will load its immediate into eax.

  • %macro simwrite 2: This is where things get really interesting. If our "data" is part of the instruction stream, how do we write to it? Simple: self-modifying code. A write operation overwrites the immediate value embedded within the mov instruction of the target data cell. For example, mov dword [cell_1+1], 0xc0ffee would modify the instruction at cell_1 to load 0xc0ffee instead of its previous immediate.
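In NASM, these macros might be written roughly as follows (a sketch under my own assumptions about the cell layout, eight bytes per cell as laid out later, not a verbatim copy of the tiresias source):

Code snippet

%macro simcall 1
    mov esi, %%retsim       ; return address into esi
    jmp %1
%%retsim:
%endmacro

%macro simfetch 2           ; %1 = cell array, %2 = index register
    lea edi, [%1 + %2*8]    ; address of the target fetch cell (8 bytes each)
    mov esi, %%retsim
    jmp edi                 ; the cell does mov eax, imm32; jmp esi
%%retsim:
%endmacro

%macro simwrite 2           ; %1 = cell address, %2 = new value
    mov dword [%1 + 1], %2  ; patch the imm32 inside the cell's mov (self-modifying)
%endmacro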



The BrainF**k Interpreter: A Data-Fetchless Machine


BrainF**k was the ideal candidate for our proof of concept. Its simple, minimal instruction set mapped perfectly to our data-fetchless model. Our memory cells for the BF program are directly adapted from our "fetch cells."

The interpreter skeleton is straightforward:

Code snippet

_start:
.execute:
    simcall fetch_ip        ; Get current instruction pointer
    simfetch program, eax   ; Fetch the BF instruction at IP
    ; Compare fetched instruction with BF commands (>, <, +, -, [, ], ., ,)
    ; Jump to corresponding handler
    ; ...
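The elided dispatch can be a plain compare-and-branch chain; note that each comparison is register-versus-immediate, so nothing is read from memory (handler labels are illustrative):

Code snippet

    cmp eax, '>'
    je handle_right
    cmp eax, '<'
    je handle_left
    cmp eax, '+'
    je handle_plus
    cmp eax, '-'
    je handle_minus
    cmp eax, '['
    je handle_open
    cmp eax, ']'
    je handle_close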

Each BF instruction is then implemented using our simcall, simfetch, and simwrite macros, carefully avoiding any actual data fetches:

  • > (Increment Data Pointer): We fetch the current data pointer value, increment it, and then write it back to the dp (data pointer) cell.

  • < (Decrement Data Pointer): Similar to > but decrementing.

  • + (Increment Data): We fetch the current data pointer, then use it to simfetch the data at that address, increment the value in eax, and simwrite it back to the corresponding data cell.

  • - (Decrement Data): Similar to + but decrementing.

  • [ (Conditional Jump Forward): If the current data cell (fetched without data reads, of course) is zero, we search forward in the program for the matching ] and update the instruction pointer. To manage nesting while scanning, we increment a counter for each [ encountered and decrement it for each ], stopping when the counter returns to zero.

  • ] (Conditional Jump Backward): If the current data cell is non-zero, we search backward for the matching [ and update the instruction pointer. Again, managing nesting with a counter.
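Putting the macros together, the + handler might look like this (a sketch; the label names and register choices are mine):

Code snippet

handle_plus:
    simcall fetch_dp            ; eax = current data pointer
    mov ebx, eax                ; keep the index in ebx
    simfetch data, ebx          ; eax = value stored in data cell ebx
    inc eax
    mov [data + ebx*8 + 1], eax ; patch the new value into the cell's imm32
    jmp advance_ip              ; hypothetical: bump ip, dispatch next instruction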



The x86 State in Instruction Form

Our program counter (ip) and data pointer (dp) are themselves "fetch cells." They contain mov eax, XXXX; jmp esi instructions where XXXX is their current value. Incrementing the ip or dp means simwrite'ing a new immediate value into their respective mov instructions.

Code snippet

fetch_ip:
    db 0xb8         ; mov eax, XXXXXXXX
    ip:
    dd 0            ; Initial IP value
    jmp esi

fetch_dp:
    db 0xb8         ; mov eax, XXXXXXXX
    dp:
    dd 0            ; Initial DP value
    jmp esi

Our actual data cells and program instructions are similarly laid out:

Code snippet

data:
    times 30000 db 0xb8, 0, 0, 0, 0, 0xff, 0xe6, 0x90 ; mov eax, XXXXXXXX, jmp esi, nop
program:
    times 30000 db 0xb8, 0, 0, 0, 0, 0xff, 0xe6, 0x90 ; mov eax, XXXXXXXX, jmp esi, nop

The 0xb8 is the opcode for mov eax, imm32, followed by four bytes for the immediate value (our "data"), then 0xff, 0xe6 for jmp esi, and a trailing nop for padding. Each cell is exactly eight bytes, which makes indexing into the arrays a simple multiply.



The Impractical Reality

The I/O operations (. and ,) are, admittedly, the only part that inherently touches data reads, due to the way the Linux int 0x80 system call interface works. For brevity, I've omitted the I/O functionality from this writeup, but the complete source code is available.


And there you have it: a functioning Turing machine on x86, capable of execution without ever touching the data read pipeline. Does it have practical applications? Absolutely none! But it's a fascinating testament to the inherent computational power embedded within the instruction set itself, proving that even without explicit data fetches, x86 is truly Turing-complete.


If you want to dig into the full glory of this madness, the source code for the complete interpreter is available: git clone https://github.com/xoreaxeaxeax/tiresias



To dive deeper, check out the full writeup in POC||GTFO Volume 15 (pages 87-88).
