How Jane Street built a custom LLM for a $30B trading engine.
Off-the-shelf LLMs have seen very little OCaml, and Jane Street runs its entire trading operation in it. So the firm trained its own model on workspace snapshots rather than commits, and wired it into every editor.
Workflow
1. Identify the gap. Off-the-shelf LLMs have almost no OCaml training data. Code completions require understanding a codebase LLMs have never seen.
2. Build a custom dataset. Snapshot developer workstations throughout the day, capturing small real edits and build results. Commits and PRs were too large and too vague to train on effectively.
3. Fine-tune with reinforcement learning. Any generated code must parse, type-check, and pass existing tests before it counts.
4. Build a sidecar proxy between each editor and the model. It manages context, constructs prompts, monitors build status, and collects latency metrics.
5. Roll out to the full engineering org. Track code acceptance rates. Iterate.
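Steps 2 and 3 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Jane Street's pipeline: the `Snapshot` record, the `small_edits` filter, the 20-line threshold, and the `gated_reward` function are all hypothetical names chosen for this example, and the checks passed to `gated_reward` are toy stand-ins for a real OCaml parser, type checker, and test runner.

```python
import difflib
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Snapshot:
    """Hypothetical record of one file in a developer workspace."""
    timestamp: float
    source: str
    build_ok: bool


def small_edits(
    snapshots: Sequence[Snapshot], max_changed_lines: int = 20
) -> List[Tuple[Snapshot, Snapshot]]:
    """Pair consecutive snapshots whose diff is non-empty but small."""
    pairs = []
    for before, after in zip(snapshots, snapshots[1:]):
        diff = difflib.ndiff(before.source.splitlines(), after.source.splitlines())
        # ndiff marks removed lines with "- " and added lines with "+ ".
        changed = sum(1 for line in diff if line.startswith(("- ", "+ ")))
        if 0 < changed <= max_changed_lines:
            pairs.append((before, after))
    return pairs


# A check takes candidate source code and returns True if it passes.
Check = Callable[[str], bool]


def gated_reward(code: str, checks: Sequence[Tuple[str, Check]]) -> Tuple[float, str]:
    """Return (reward, name of first failed check).

    The reward is 1.0 only if every check passes, in order; otherwise 0.0.
    """
    for name, check in checks:
        if not check(code):
            return 0.0, name
    return 1.0, ""


# Toy stand-ins (assumptions): a real setup would call the OCaml toolchain here.
toy_checks = [
    ("parse", lambda c: c.count("(") == c.count(")")),
    ("typecheck", lambda c: "TODO" not in c),
    ("tests", lambda c: True),
]
reward, failed_stage = gated_reward("let f x = (x + 1)", toy_checks)
```

Running the checks in order and stopping at the first failure keeps the cheap check (parsing) ahead of the expensive one (tests), and the returned stage name shows where rejected generations tend to fail.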
Tools
Custom fine-tuned OCaml LLM, workspace snapshotting pipeline, sidecar proxy for editor integration
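The sidecar proxy's responsibilities (context management, prompt construction, build status, latency metrics) and the acceptance-rate tracking from the rollout step can be sketched as one small class. Everything here is an assumption made for illustration: the `Sidecar` class, its method names, the character-based context budget, and the prompt layout are hypothetical, and the real proxy runs as a separate process rather than in-process like this.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Sidecar:
    """Hypothetical in-process stand-in for the editor-to-model proxy."""
    model: Callable[[str], str]      # assumed interface: prompt in, completion out
    max_context_chars: int = 4000    # assumed context budget
    latencies_ms: List[float] = field(default_factory=list)
    shown: int = 0
    accepted: int = 0

    def build_prompt(self, context_files: List[str], request: str, build_ok: bool) -> str:
        """Join context, keep only the most recent characters, and add build status."""
        context = "\n".join(context_files)
        if len(context) > self.max_context_chars:
            context = context[len(context) - self.max_context_chars:]
        status = "passing" if build_ok else "failing"
        return f"(* build: {status} *)\n{context}\n(* task: {request} *)\n"

    def complete(self, context_files: List[str], request: str, build_ok: bool) -> str:
        """Call the model and record how long the call took."""
        prompt = self.build_prompt(context_files, request, build_ok)
        start = time.perf_counter()
        completion = self.model(prompt)
        self.latencies_ms.append((time.perf_counter() - start) * 1000.0)
        self.shown += 1
        return completion

    def record_acceptance(self, was_accepted: bool) -> None:
        """Record whether the developer kept the last completion."""
        if was_accepted:
            self.accepted += 1

    def acceptance_rate(self) -> float:
        """Fraction of shown completions that were accepted."""
        return self.accepted / self.shown if self.shown else 0.0


# Usage with a fake model (assumption: any callable from prompt to text works).
sidecar = Sidecar(model=lambda prompt: "let y = 2")
suggestion = sidecar.complete(["let x = 1"], "define y", build_ok=True)
sidecar.record_acceptance(True)
```

Keeping this logic in a proxy rather than in each editor plugin means the prompt format and the model behind it can change without touching the editors.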
