browser-train

full LLM training in one HTML file. no server, no deps, just open it.

This is a 345K-parameter GPT (3 layers, 4 heads, d=96) that trains from scratch in your browser using plain JS. Character-level tokenizer, tape-based autograd, AdamW, the whole thing. It fits in ~500 lines because I didn't try to make it pretty.
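For a feel of what "tape-based autograd" means, here is a minimal scalar sketch. It is not the code from the page: the names (`Tape`, `leaf`, `add`, `mul`, `backward`) are made up for illustration, and a real implementation would work on arrays rather than single numbers. Every operation appends a node to a list, and backprop visits that list from the end.

```javascript
// Minimal scalar tape autograd. All names here are hypothetical, not the page's own.
class Tape {
  constructor() {
    this.nodes = []; // nodes in the order they were created
  }
  leaf(value) {
    const node = { value, grad: 0, backward: null };
    this.nodes.push(node);
    return node;
  }
  add(a, b) {
    const out = this.leaf(a.value + b.value);
    // d(a+b)/da = 1, d(a+b)/db = 1
    out.backward = () => { a.grad += out.grad; b.grad += out.grad; };
    return out;
  }
  mul(a, b) {
    const out = this.leaf(a.value * b.value);
    // d(a*b)/da = b, d(a*b)/db = a
    out.backward = () => { a.grad += b.value * out.grad; b.grad += a.value * out.grad; };
    return out;
  }
  backward(loss) {
    loss.grad = 1;
    // Creation order is a valid topological order, so reverse order is safe.
    for (let i = this.nodes.length - 1; i >= 0; i--) {
      const node = this.nodes[i];
      if (node.backward) node.backward();
    }
  }
}

const tape = new Tape();
const x = tape.leaf(3);
const w = tape.leaf(2);
const b = tape.leaf(1);
const y = tape.add(tape.mul(w, x), b); // y = w*x + b
tape.backward(y);
console.log(y.value, w.grad, x.grad, b.grad);
```

Gradients accumulate with `+=` so a value used in more than one place gets the sum of its contributions.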

The default corpus is 14KB of Shakespeare. In a run of a few minutes the model doesn't get anywhere near memorizing it; what it picks up is actual patterns: word structure, line breaks, which characters tend to follow which. Give it a few minutes and it starts producing recognizable iambic-ish dialogue.

You can also paste your own text. Anything works as long as it's over 100 chars.
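Pasted text works because a character-level tokenizer can build its vocabulary from whatever it is given. A rough sketch of that idea, with hypothetical names (`buildCharTokenizer`, `encode`, `decode`) that are not taken from the page:

```javascript
// Character-level tokenizer sketch. Helper names are assumptions, not the page's API.
function buildCharTokenizer(text) {
  // Unique characters in a stable (sorted) order define the vocabulary.
  const chars = Array.from(new Set(text)).sort();
  const stoi = new Map(chars.map((ch, i) => [ch, i]));
  return {
    vocabSize: chars.length,
    // Characters not seen in `text` map to undefined; this sketch does not handle them.
    encode: (s) => Array.from(s, (ch) => stoi.get(ch)),
    decode: (ids) => ids.map((i) => chars[i]).join(""),
  };
}

const tok = buildCharTokenizer("to be, or not to be");
const ids = tok.encode("to be");
console.log(tok.vocabSize, ids, tok.decode(ids));
```

The vocabulary size is just the number of distinct characters in the corpus, so a different corpus gives a different embedding table size.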

Nerd details: causal self-attention (each position only sees previous positions), cross-entropy loss, gradient clipping at norm 1.0, LR warmup then cosine decay. Backprop walks a tape in reverse — same idea as PyTorch autograd but dumber. Open multiple tabs and they sync weights via BroadcastChannel.
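The schedule and clipping can be sketched in a few lines. The text above only fixes the clip norm (1.0) and the shape of the schedule (warmup, then cosine decay), so the peak and floor learning rates, warmup length, and total step count below are placeholder assumptions, and the function names are hypothetical.

```javascript
// Linear warmup followed by cosine decay. All default values are assumed, not from the page.
function learningRate(step, { maxLr = 3e-3, minLr = 3e-4, warmupSteps = 50, totalSteps = 1000 } = {}) {
  if (step < warmupSteps) {
    return maxLr * (step + 1) / warmupSteps; // ramp up linearly to maxLr
  }
  const progress = Math.min(1, (step - warmupSteps) / (totalSteps - warmupSteps));
  return minLr + 0.5 * (maxLr - minLr) * (1 + Math.cos(Math.PI * progress));
}

// Rescale all gradients in place so their global L2 norm is at most maxNorm.
// `grads` is an array of typed arrays, one per parameter tensor.
function clipGradNorm(grads, maxNorm = 1.0) {
  let sumSq = 0;
  for (const g of grads) {
    for (let i = 0; i < g.length; i++) sumSq += g[i] * g[i];
  }
  const norm = Math.sqrt(sumSq);
  if (norm > maxNorm) {
    const scale = maxNorm / norm;
    for (const g of grads) {
      for (let i = 0; i < g.length; i++) g[i] *= scale;
    }
  }
  return norm; // norm before clipping, handy for logging
}

const grads = [new Float64Array([3, 4])];
const preClipNorm = clipGradNorm(grads, 1.0);
console.log(preClipNorm, grads[0], learningRate(0), learningRate(1000));
```

Clipping by the global norm keeps the direction of the update and only shrinks its length, which is why it is computed across all tensors at once rather than per tensor.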



samples (generated every 20 steps)


what to expect: the first ~50 steps are noise (loss starts near ln(vocab) ≈ 3.5, i.e. a uniform guess over the character set). by step 100-200 you'll see spaces in the right places. by 400+ it's writing recognizable words and dialogue structure. on a laptop it does 4-8 steps/sec, on iPhone Safari 2-4. the loss won't hit 0 in a run like this: that would mean reproducing all 14KB character for character. the 1.3MB of float32 weights could in principle hold that much text, but a short run is nowhere near enough to memorize it, so it learns rules instead.
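The two numbers above, ln(vocab) and 1.3MB, can be checked with a few lines of arithmetic. The vocabulary size of 33 is not taken from the page; it is simply the size for which ln(V) comes out near 3.5.

```javascript
// Cross-entropy of a uniform guess over V symbols is ln(V), measured in nats.
const uniformLoss = (vocabSize) => Math.log(vocabSize);

// float32 storage is 4 bytes per parameter.
const weightBytes = (numParams) => numParams * 4;

const startLoss = uniformLoss(33);   // assumed vocab size, chosen to match ~3.5
const impliedVocab = Math.exp(3.5);  // vocab size that a starting loss of 3.5 implies
const bytes = weightBytes(345000);   // 345K params as float32
const mib = bytes / (1024 * 1024);

console.log(startLoss.toFixed(3), impliedVocab.toFixed(1), bytes, mib.toFixed(2));
```

If your own corpus has more distinct characters, expect the starting loss to be higher by the same formula.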

src: view source. multi-tab: open another tab to add a training node.
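A rough sketch of how tabs could share weights over BroadcastChannel. The page does not say how it merges weights, so the averaging rule, the channel name, and the message shape here are all assumptions, and `averageInto` and `startWeightSync` are hypothetical names.

```javascript
// Element-wise mean of two weight vectors, written back into `local`.
// Averaging is an assumed merge rule, not necessarily what the page does.
function averageInto(local, remote) {
  for (let i = 0; i < local.length; i++) {
    local[i] = 0.5 * (local[i] + remote[i]);
  }
  return local;
}

// Wire a flat Float32Array of weights to other same-origin tabs.
function startWeightSync(weights, channelName = "browser-train-sync") {
  const channel = new BroadcastChannel(channelName);
  channel.onmessage = (event) => {
    const { type, data } = event.data;
    if (type === "weights" && data.length === weights.length) {
      averageInto(weights, data);
    }
  };
  return {
    // postMessage structured-clones the typed array, so receivers get a copy.
    publish: () => channel.postMessage({ type: "weights", data: weights }),
    stop: () => channel.close(),
  };
}

const merged = averageInto(new Float32Array([1, 3]), new Float32Array([3, 5]));
console.log(merged);
```

In a page you would call `startWeightSync(weights)` once and `publish()` every so many steps. A BroadcastChannel does not deliver a message back to the object that posted it, so a tab never averages with itself.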