[{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/colorlight/","section":"Tags","summary":"","title":"Colorlight","type":"tags"},{"content":"TL;DR: during COVID I turned a $15 ColorLight LED-panel board (ECP5) into a small LiteX SoC with Ethernet and a couple of GPIOs, wired a cheap relay module to it, and used it to power-cycle a PC in the office from home. Built entirely with the open-source Yosys/NextPnr toolchain. At the end: the trick I use to reverse-engineer an unknown board from the FPGA side.\nThe problem, in 2020 # This one is a few years old. Back in the strange early-COVID days I needed to power a machine in the office on and off from home, and I really did not want to drag all the lab gear back to the flat. I had a pile of $15 ColorLight LED-panel boards on the desk. They are meant to drive big RGB LED matrices, but under the hood they are a perfectly good little FPGA board: a Lattice ECP5, a Gigabit Ethernet PHY, SDRAM, SPI flash, and rows of HUB75 connectors I could use as GPIO.\nSo I put a LiteX SoC on one, gave it Ethernet, and wired a two-channel relay module across the PC\u0026rsquo;s power and reset buttons. Could I have done this with a Raspberry Pi or an ESP32 for about the same money? Of course. But the fun was doing it with our own tools, on a board that costs less than lunch, with a fully open-source FPGA flow.\nThe whole thing: a ColorLight 5A-75B (ECP5), a $3 two-channel relay module wired into the HUB75 connector to fake the PC\u0026rsquo;s power and reset switches, and an Ethernet cable to reach it from home. A $15 board and a fully open toolchain # The board is a ColorLight 5A-75B (an LFE5U-25F ECP5). The nice part is that the entire build uses open-source tools: Yosys for synthesis, NextPnr for place-and-route, and Project Trellis for the ECP5 bitstream. 
No vendor login, no license server.\nAt the time, the ColorLight boards were not supported in litex-boards yet, so adding the platform and a target for them was part of the project. That turned out to be one of the more useful outcomes: the support written then is still in litex-boards today, and these boards have since become a popular cheap way into open-source FPGA work.\nThe SoC # It is a SoCMini with three things bolted on: a LiteEth RGMII PHY plus Etherbone so I can reach every register over the network, a couple of GPIOs on the HUB75 connector, and a LED chaser because why not. The core of it is short:\nfrom liteeth.phy.ecp5rgmii import LiteEthPHYRGMII from litex.soc.cores.gpio import GPIOOut # Ethernet + Etherbone: every CSR reachable over UDP. self.ethphy = LiteEthPHYRGMII( platform.request(\u0026#34;eth_clocks\u0026#34;), platform.request(\u0026#34;eth\u0026#34;), tx_delay = 0e-9) self.add_etherbone(phy=self.ethphy, ip_address=ip_address) # GPIO 0 = power switch (short pulse), GPIO 1 = reset switch. self.gpio0 = GPIOOut(power_sw_gpio) self.gpio1 = GPIOOut(reset_sw_pads) LiteEthPHYRGMII is the LiteEth ECP5 RGMII PHY: getting Gigabit Ethernet up on a cheap Lattice part is one import and a tx_delay you tune in the 0 to 2 ns range for your board revision. add_etherbone then exposes the whole SoC bus over UDP, so the relay GPIOs (and the SPI flash, and anything else) are reachable with a Python script from my laptop.\nThe architecture is about as simple as a networked SoC gets: an Etherbone bridge over Ethernet into the LiteX bus, and a few GPIOs out to the relay. Driving it from home # Build and flash with the IP you want, then point a LiteX server at it over UDP:\n./colorlite.py --ip-address=192.168.1.20 --flash litex_server --udp --udp-ip=192.168.1.20 After that, the control scripts are tiny RemoteClient programs. 
The relay module\u0026rsquo;s two channels are wired to the R0 and G0 pins of the HUB75 connector, emulating the power and reset buttons, so:\ncd scripts ./test_blink.py # blink the LED, confirm the link is alive ./test_gpios.py # toggle the two GPIOs ./test_power_on.py # short pulse on the power switch A short pulse on GPIO 0 is a power button press. A longer pulse is a hard power-off. GPIO 1 is the reset line. That is the whole \u0026ldquo;remote power control\u0026rdquo; feature: a relay, two pins, and Etherbone.\nThe protocol is not secured in any way, so this belongs on a local network. Exposing it to the internet through a router is possible but it is on you. Do not put it in front of anything you care about. Reversing a board from the FPGA side # One more thing, because it is a nice illustration of how flexible LiteX gets when you go off the beaten path. The trick itself is not mine, it is a known way to reverse a board: when you have an FPGA but no schematic and no usable pinout, you can recover the pin map from the FPGA itself, by driving every package ball as a slow bit-banged UART that transmits its own name, then walking the board with a scope or a USB-UART dongle and reading back the name of the pin you touch.\nWhat LiteX adds is how little code that takes. A platform is just Python, so you build the whole thing programmatically. On the ECP5 you do not even need a vendor file: the package pinout comes straight from the open-source Project Trellis IO database. Read it, drop the clock pin, create the platform on the fly, and attach one streamer per ball:\nimport json # Package balls straight from the open-source Project Trellis IO database. 
ios = list(json.load(open(\"iodb.json\"))[\"packages\"][\"CABGA256\"].keys())\nios.remove(\"P6\") # P6 is the 25 MHz clock, keep it for the PLL\nfor io in ios: # add every ball as a 1-bit output, then stream its name\n    platform.add_extension([(io, 0, Pins(io), IOStandard(\"LVCMOS33\"))])\n    self.submodules += IOStreamer(io, platform.request(io), sys_clk_freq, baudrate=9600)\nIOStreamer is just a small RS232 TX PHY plus a 4-byte ROM holding the ball name, cycled out forever, so each pin endlessly announces itself:\nclass IOStreamer(Module):\n    def __init__(self, identifier, pad, sys_clk_freq, baudrate=9600):\n        phy = RS232PHYTX(...) # 1-bit UART TX on this pad\n        mem = Memory(8, 4, init=[ord(c) for c in identifier]) # the ball name\n        # ... cycle the four characters out of `pad`, forever ...\nTwo practical notes from the real sessions:\nThe clock. The ColorLight has a 25 MHz crystal on P6, so the harness runs a PLL off it and excludes P6 from the streamed set. If you did not know the clock pin, the ECP5 has an internal oscillator (OSCG) you can fall back on, the same idea as the Xilinx STARTUPE2 configuration clock.\nBuild size. Driving every ball at once is a lot for a small ECP5, so the harness streams the IOs in chunks, walking the package a quarter at a time. On some boards you also hit a pin that halts the gateware when driven; you find it by bisection and add it to the excludes.\nThe point is not the trick, it is that going from an open IO database to a live, self-describing bitstream is a short Python script in LiteX. That is the same flexibility I used to map the ColorLight variants.
Slow, but it always works, and it needs nothing but the board and a dongle.\nWhy I still like it # It is a small thing, but it is a good snapshot of what LiteX is for, and a compact LiteEth example to point people at: a Gigabit RGMII PHY up, the whole SoC bus exposed over Etherbone, and a useful gadget at the end, in not many lines. A $15 board meant for LED panels became a networked SoC on a toolchain that costs nothing, and it solved a real and slightly silly problem. The obvious next step has been sitting in the README for years: a remote logic analyzer over the Gigabit link. One day. 🙂\nA second life: LinuxCNC # The part I did not expect: the same idea (a cheap Colorlight, LiteX, and Etherbone over Ethernet) turned out to be a great base for open-source motion control, and a small ecosystem of LinuxCNC FPGA cards grew on exactly these boards.\nColorCNC, the original idea on the LinuxCNC forum: use a Colorlight 5A-75E/5A-75B as the FPGA controller, born from how hard MESA cards were to get hold of. forum thread LiteX-CNC by Peter van Tol: a generic LinuxCNC firmware and driver for LiteX-supported FPGA cards, with first-class support for the Colorlight 5A-75B and 5A-75E over Etherbone. GitHub / docs Lcnc by faeboli: another LiteX-plus-LinuxCNC combination. GitHub A lockdown hack to power-cycle a PC became, in other hands, a way to run real machines. That is a good outcome for a $15 board. 🙂\nEnd of the tutorial. Time to power off, from home. The board is open: github.com/enjoy-digital/colorlite.\nBuilt on LiteX and LiteEth, on a ColorLight 5A-75B with the open-source Yosys / NextPnr / Trellis ECP5 flow.\nWork and ideas by enjoy-digital; written up with AI in the loop.\n","date":"2 June 2026","externalUrl":null,"permalink":"/posts/colorlite/","section":"Posts","summary":"A $15 ColorLight ECP5 board, a LiteX SoC with Ethernet, and a cheap relay module: remote power-on/off for an office PC during COVID, built with the fully open-source Yosys/NextPnr toolchain. 
With a note on reversing boards from the FPGA side.","title":"ColorLite: remote power control on a $15 FPGA board","type":"posts"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/categories/demos/","section":"Categories","summary":"","title":"Demos","type":"categories"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/ecp5/","section":"Tags","summary":"","title":"Ecp5","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/fpga/","section":"Tags","summary":"","title":"Fpga","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/liteeth/","section":"Tags","summary":"","title":"Liteeth","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/litex/","section":"Tags","summary":"","title":"Litex","type":"tags"},{"content":"Notes on building open-source FPGA SoCs with LiteX: cores, peripherals, simulation, and the occasional debugging story.\nMostly things I wanted to write down while they were still fresh. 🙂\n","date":"2 June 2026","externalUrl":null,"permalink":"/","section":"LiteX Notes","summary":"Notes on building open-source FPGA SoCs with LiteX: cores, peripherals, simulation, and the occasional debugging story.\nMostly things I wanted to write down while they were still fresh. 
🙂\n","title":"LiteX Notes","type":"page"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/nextpnr/","section":"Tags","summary":"","title":"Nextpnr","type":"tags"},{"content":"Everything I\u0026rsquo;ve written so far, newest first.\n","date":"2 June 2026","externalUrl":null,"permalink":"/posts/","section":"Posts","summary":"Everything I’ve written so far, newest first.\n","title":"Posts","type":"posts"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/reverse-engineering/","section":"Tags","summary":"","title":"Reverse-Engineering","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/yosys/","section":"Tags","summary":"","title":"Yosys","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/demoscene/","section":"Tags","summary":"","title":"Demoscene","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/framebuffer/","section":"Tags","summary":"","title":"Framebuffer","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/javascript/","section":"Tags","summary":"","title":"Javascript","type":"tags"},{"content":"TL;DR: mquickjs on LiteX runs Fabrice Bellard\u0026rsquo;s JavaScript engine bare-metal on a VexRiscv softcore inside a LiteX SoC. JavaScript calls litex.* and framebuffer.* to drive real hardware. It runs in litex_sim with no board, and on real FPGAs it does framebuffer demos and a live browser editor served from the board itself.\nJavaScript, on the FPGA, for real # I wanted to know if I could run JavaScript on the FPGA itself. Not cross-compiled to C, not on the host, but actually interpreted by the softcore. 
So I took mquickjs, Fabrice Bellard\u0026rsquo;s small JavaScript engine, and dropped it into the firmware of a LiteX SoC.\nHere is what actually runs: LiteX boots a VexRiscv firmware, the firmware sets up an mquickjs heap, parses your script, and runs the VM. Your JavaScript reaches the hardware through a small litex object. The RISC-V core on the board is the thing reading your for loop.\nJavaScript file -\u0026gt; firmware.bin -\u0026gt; VexRiscv -\u0026gt; mquickjs -\u0026gt; LiteX CSRs It runs in litex_sim, so you do not need a board to try it, and I have run it on a Digilent Arty A7 and a LambdaConcept ECPIX-5.\nHow it fits together # There is not much to it, which is sort of the point. LiteX gives you the SoC, a bit of C exposes a few hardware bindings, and the JavaScript does the rest. The script is baked into the firmware at build time (a small embed_script.py turns your.js into a header), the LiteX BIOS loads the firmware into main_ram, and VexRiscv runs the VM on bare metal.\nThe firmware is a few small files. main.c boots, creates the context and heap, and either runs your script or drops into a UART REPL. mqjs_port.c holds the console.log, litex.* and framebuffer.* bindings, plus an optional SDCard load(). For the live demo there is a tiny HTTP server on top of LiteEth and lwIP. The JavaScript heap is static, 1 MiB by default, and mquickjs never calls malloc() for JavaScript objects, which keeps things predictable on a bare-metal target.\nIf you read the git log it is basically that, in order: get mquickjs running, make it draw to a framebuffer, add the browser runtime, then bring it up on the ECPIX-5 and the SDCard. Nothing surprising in there, which is how I like a bring-up to go.\nTry it in simulation # No board needed. 
The first run builds the LiteX simulator; after that it only rebuilds the firmware and your script:\ngit clone --recursive https://github.com/enjoy-digital/mquickjs-on-litex cd mquickjs-on-litex ./make.py sim ./make.py sim examples/demo.js ./make.py sim-video examples/plasma.js The plain run prints what you would hope:\n--========= mquickjs on LiteX =========-- running embedded script... hello from mquickjs on LiteX! [mqjs] done examples/demo.js is the one I show first, because you can watch the JavaScript drive the LiteX LEDs while it times itself over the UART:\nfunction leds(value) { litex.setLeds(value \u0026amp; MASK); } function pause(ms) { litex.delay(ms); } show(\u0026#34;knight rider\u0026#34;, function () { for (var round = 0; round \u0026lt; 4; round++) { for (var i = 0; i \u0026lt; LEDS; i++) { leds(1 \u0026lt;\u0026lt; i); pause(80); } for (var i = LEDS - 2; i \u0026gt; 0; i--) { leds(1 \u0026lt;\u0026lt; i); pause(80); } } }); litex.setLeds() is just a CSR write, so that knight-rider loop is JavaScript, interpreted on the softcore, toggling real LED pins through the LiteX bus. 🙂\nThe JavaScript API # I kept the API small on purpose. litex.* is generic board and CSR access, and optional peripherals return harmless values when they are missing, so a script written for one board tends to run on another:\nlitex.getIdentifier() // SoC identifier string litex.clockFrequency() // system clock in Hz litex.millis() // monotonic milliseconds litex.setLeds(mask) // no-op when LEDs are absent litex.csrRead32(addr) // raw CSR/bus access litex.csrWrite32(addr, value) On a board with a video framebuffer you also get framebuffer.*, with the usual primitives for demoscene-style effects: clear, fillRect, line, circle, text, fade, and a blit family for software-rendered buffers. Colours are plain 0xRRGGBB. 
For animation, the live runtime looks for a setup() / frame(t) pair, where t comes from litex.millis():\nfunction frame(t) { if (framebuffer.doubleBuffered) framebuffer.begin(); framebuffer.clear(0x020406); framebuffer.text(24, 24, \u0026#34;mquickjs on LiteX\u0026#34;, 0x12bdf2, 2); if (framebuffer.doubleBuffered) framebuffer.present(); } Demos on real hardware # On a board with video, this is the fun part. A single firmware cycles through a plasma, fire and tunnel playlist, all drawn from JavaScript into the LiteX framebuffer:\n./make.py board-build --target litex_boards.targets.lambdaconcept_ecpix5 \\ --build-dir build/ecpix5-video -- --with-video-framebuffer --uart-baudrate=1000000 ./make.py firmware examples/showcase.js --build-dir build/ecpix5-video ./make.py board-load --target litex_boards.targets.lambdaconcept_ecpix5 --build-dir build/ecpix5-video The standalone version is even less fuss. With --with-sdcard the LiteX BIOS loads boot.bin and mquickjs loads main.js off the card. Edit main.js on the SDCard, reset the board, and it runs the new script. No toolchain in the loop.\nA live editor served by the FPGA # My favourite bit: with Ethernet and a framebuffer enabled, the board serves its own editor. The firmware runs a small HTTP server, so you point a browser at http://192.168.1.50/, type some JavaScript, hit Run, and it gets sent back to the board and run by mquickjs on the LiteX CPU. The editor page itself is served by the FPGA.\nThe live editor on a LambdaConcept ECPIX-5 (ECP5): the page, the editor UI, and the running demo are all served and executed by the FPGA. Edit JavaScript in the browser, press Run, and the softcore runs it. The HTTP side is small and easy to read: GET / for the editor page, POST /run to reset the context and run a fresh script, POST /eval for a snippet in the live context, POST /control to stop, pause or resume, and GET /load / POST /save for main.js on the SDCard. 
If setup() throws, the board keeps the previous script and rejects the new one, which saves you when you are live-coding a demo and fat-finger something.\nWhy I like it # It is a toy, and that is fine. But it is also a decent advert for LiteX. Adding a CPU, a framebuffer, an Ethernet MAC and an SDCard to the SoC is a handful of flags on a board target, and every one of them is reachable from software in an obvious way. That is why a thing like this finishes in a weekend instead of turning into a project. And it is a much better way to show someone a softcore than blinking an LED: hand them the keyboard and let them type something the chip draws. 🙂\nTry it # Simulation needs no hardware:\ngit clone --recursive https://github.com/enjoy-digital/mquickjs-on-litex cd mquickjs-on-litex ./make.py sim-video examples/plasma.js The code, the examples and the full docs (architecture, JS API, hardware bring-up, the demoscene notes) are at github.com/enjoy-digital/mquickjs-on-litex.\nBuilt on LiteX and Fabrice Bellard\u0026rsquo;s mquickjs.\nWork and ideas by enjoy-digital; written up with AI in the loop.\n","date":"2 June 2026","externalUrl":null,"permalink":"/posts/mquickjs-on-litex/","section":"Posts","summary":"JavaScript running bare-metal on a RISC-V softcore inside a LiteX SoC, with framebuffer demos and a live browser editor served straight from the FPGA. 
No host, no transpilation.","title":"mquickjs on LiteX: JavaScript on an FPGA softcore","type":"posts"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/quickjs/","section":"Tags","summary":"","title":"Quickjs","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/vexriscv/","section":"Tags","summary":"","title":"Vexriscv","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/agents/","section":"Tags","summary":"","title":"Agents","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/ai/","section":"Tags","summary":"","title":"Ai","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/series/ai-era-fpga/","section":"Series","summary":"","title":"AI-Era FPGA","type":"series"},{"content":"TL;DR: LiteX was text and commands long before agents existed, so an agent can now drive the whole FPGA loop: build, load, test on hardware, probe with LiteScope, fix, repeat. Two real examples below, LiteNVMe and LiteX-M2SDR.\nA tool design that aged well # You can now put an FPGA board on the bench, give an agent a goal, and let it work: write the HDL, build it, load it, test it on real hardware, watch the bus with a logic analyzer, fix what is broken, and go round again. That works because LiteX has always been command-line and text first. No big IDE, no click-through GUI, no proprietary project format. You describe an SoC in Python, build it from the shell, and talk to the board with small command-line tools.\nThat was a deliberate, slightly boring choice, made for cost reasons. The goal at enjoy-digital was cost-effective hardware for clients: reuse common work, avoid reinventing the same blocks, avoid sinking time into heavy tooling. Text and a terminal travel everywhere, script easily, and need no license.\nWhat we did not see coming is how well that fits the AI era. 
An agent is, more or less, very good with text and very happy running commands, and LiteX hands it a whole FPGA workflow that is already text and commands. It turns out we had been building the perfect interface for an agent for years, for entirely different reasons. 🙂\nWhy LiteX fits an agent # The reason it works is that every step of FPGA development in LiteX has a text-and-command form, and an agent can reach all of them:\nBuild and load: ./target.py --build --load. One command from Python SoC to bitstream on the FPGA. Etherbone over Ethernet: litex_server --udp ... plus RemoteClient scripts. Etherbone tunnels the SoC\u0026rsquo;s internal bus over Ethernet, so you can read and write any CSR (control/status register) from a Python one-liner. PCIe bridge and rescan: litex_server --pcie ... and a rescan.py to re-enumerate the device after a reload, so a fresh bitstream comes back without a reboot. JTAG and UART: litex_term, crossover UART through LiteX, JTAG access to the same bus. CSR read/write: the whole register map is named and reachable, by software, by name. LiteScope: an in-fabric logic analyzer you configure from Python and trigger with litescope_cli, dumping a .vcd you can read back. Simulation: pytest and Verilator, so a change can be checked before it ever touches hardware. (Co-simulation makes that loop faster still: see QEMU co-simulation in LiteX.) No GUI and no proprietary IDE in the loop. Etherbone, PCIe, JTAG, UART, CSR access and LiteScope are all text and commands, so the agent can drive every one of them. None of this was built for AI. It was built so a human could work fast from a terminal.\nThe loop # Once the tools are reachable, the workflow is a loop. 
You give the agent a high-level goal and a spec, you put the target board on the bench, and it goes:\nThe agent writes the HDL, simulates it, builds and loads it, tests on the real board over the bridge, and when something misbehaves it adds a CSR or a LiteScope probe, captures, and goes round again. The important part is the bottom of that loop. When the design does not behave, the agent does not just stare at it. It adds a status CSR or a counter, or it adds a narrow LiteScope probe, rebuilds, reloads, triggers a capture over the bridge, reads back the waveform, and forms the next hypothesis. It writes its own observability. Then, when the bug is understood, it locks the fix in with a regression test so it cannot come back. (If you want the checklist version of this loop, the LiteX-M2SDR debugging guide is it.)\nThis matters because most of that loop is waiting and small decisions. An FPGA build is minutes, sometimes the better part of an hour. A bring-up is dozens or hundreds of these cycles. Very little of it needs senior-engineer judgement at each step. It needs patience and care, which is exactly the part you want to hand off. An agent does not get bored, wander off for coffee, or quietly give up at iteration 60. 🙂\nExample 1: LiteNVMe # LiteNVMe is an open-source NVMe host core: the PCIe RootPort side that brings up an off-the-shelf SSD (enable the controller, Identify, create the I/O queues) and exposes a simple block interface, with no host CPU, OS or driver on the data path. Open-source NVMe host cores are rare, and this is the LiteX-native one. It is hardware-validated at PCIe Gen3 x4, about 2.7 GB/s.\nThe bench for LiteNVMe: an Alibaba Cloud KU3P card driving a commercial NVMe SSD over PCIe, with the agent reaching the SoC over Ethernet (Etherbone). The interesting thing for this post is not the core, it is the git history. With 284 commits, it reads like a lab notebook of exactly the loop above. 
A few real commit subjects:\nbench: hostmem-write LiteScope probe + Gen3-256b HW harnesses \u0026 results\nLever B diagnosed: 128B MPS read ceiling is gateware-bound, not firmware-raisable\nT6 coalescing on HW: functional (errors=0) but NO read speedup -- honest negative result\nHONEST CORRECTION: pipelined-write fix was NEVER applied (dee81ba claim was false)\nRead bottleneck fixed: pipelined hostmem write path ~2x reads on HW (~0.9-1.06 GB/s)\ndebug: nvme_mmiotest sets root bridge memory window; capture MMIO no-data completion\nbench: prune one-off HW harnesses; factor shared boilerplate into hw_common.sh\nYou can see the whole method in there. Add a LiteScope probe to watch the host-memory write path. Run a hardware harness, record the numbers. Diagnose a read ceiling and pin it to the gateware rather than firmware. Try a coalescing change, measure it on hardware, and report an honest negative result when it does not help. Catch a false claim and correct it (yes, that HONEST CORRECTION commit is the agent fact-checking its own earlier commit, which is a genre of git history I did not know I needed). And once a diagnostic probe has done its job, prune it so it does not bloat the design. That last habit, adding a probe to answer one question and then removing it, is the agent keeping its own workspace clean.\nHere is one of those probes, written by the agent itself mid-debug. The comment is its own hypothesis (“does a 16-byte write to a non-32-byte-aligned offset get shifted or dropped at 256 bits?”), and the code wires up exactly the signals needed to answer it:\nAn agent adding its own LiteScope probe, unprompted, to capture the host-memory completer write path. It picks the PCIe status, the write handshake, the address, byte-enables and low data words, builds the analyzer, captures, and once the question is answered, prunes it back out.
🙂 The end state is a real result: Gen3 x4, about 2.69 GB/s on 8 KiB reads and 2.74 GB/s on writes, errors zero, write then read-back bit-exact, on an Alibaba KU3P driving a commercial SSD. The remaining gap to the link ceiling is the SSD\u0026rsquo;s 512-byte MaxPayloadSize, not the core, which is itself something the loop figured out and wrote down.\nNVMe Identify over the firmware console: the SSD model, capacity and queue configuration read back during bring-up. Example 2: M2SDR, over the bridge # The LiteX-M2SDR board (Artix-7 plus an AD9361 RFIC) is where we first wrote the methodology down and started testing it more extensively. Its debugging guide opens, literally, with \u0026ldquo;a practical starting point for developers or agents.\u0026rdquo; It is the loop as a checklist: start from git status, reproduce on hardware, verify the host path with m2sdr_util info, inspect the generated Verilog, add a CSR if polling can answer the question, add a narrow LiteScope probe if the issue is temporal, capture, fix in the smallest layer, and add a regression test. There is even a rule of thumb to keep captures under ~128 signal bits so the build still closes timing.\nThat guide is really the team\u0026rsquo;s debugging instincts written down: the kind of skills a senior engineer carries in their head and applies without thinking. Which probe to reach for. When to drop the clock to 100 MHz before blaming the logic. Why a host write can pile up while the target quietly ignores it. As plain text, those skills stop being tacit. The agent reads them once and applies them on every run, instead of rediscovering the same workflow each time, and the guide itself improves as we learn new patterns. It is institutional knowledge that an agent can actually use.\nThe LiteSATA and Ethernet work on M2SDR is a good example of the unglamorous-but-necessary kind of iteration. 
Here is the Ethernet throughput, as it improved across the history that is recorded in the repo:\nEthernet path | 8 MiB SigMF, host to SATA | SATA to host\nOriginal Etherbone path | ~9.1 MiB/s | ~7.3 MiB/s\nPipelined Etherbone reads | ~9.1 MiB/s | ~26 MiB/s\n128 KiB SATA host buffer | ~47.8 MiB/s | ~25.5 MiB/s\nThat is the shape of timing-and-throughput work: a long series of small changes, each measured on hardware, most of them boring, a few of them decisive. The same history also records the things that did not work and why, which is just as valuable:\nA 256 KiB host buffer was tried first and rejected because Vivado DRC failed on cascaded RAMB36 address pins. So the buffer is 128 KiB.\nAn Etherbone burst-size sweep (1, 2, 4 … 128 words) showed where the throughput knee is.\nLinux sendmmsg/recvmmsg batching was tested and dropped: it landed within measurement noise, sometimes slower, so it was not kept.\nWriting the dead ends down is the polite version of “stop suggesting this to me at 2am.”\nA person can do all of that. But it is exactly the work where the value per decision is low and the number of iterations is high. It is the work you most want to delegate.\nWhat this removes # Put the two examples together and the pattern is clear. The expensive part of FPGA development is rarely the one clever idea. It is the long tail around it: waiting on builds, making small mechanical decisions, running the next iteration, reading back a capture, trying the next variant. The intelligence needed per step is low, but the steps are many (often well past a hundred for a real bring-up or a timing-closure campaign), and each one carries a multi-minute build.\nThat is precisely what an agent is good at now, because the FPGA loop is long and the agent is patient. You can let it run for hours, or across days, with only occasional guidance, and come back to a working prototype or a debugged design.
Coming back to more progress than you left, instead of a stalled build and a cold coffee, takes some getting used to. The phase it eats into is the one where most of a project\u0026rsquo;s time actually goes: the early proof of concept and the debug grind, where you spend days chasing small details rather than designing anything.\nWith recent models (Claude Opus 4.8 and OpenAI GPT 5.5, both in high thinking) the guidance is needed less often than I expected. I mostly step in when the agent clearly loses the thread, and that is rarer than it used to be. More than once it has reached for a debug approach I would not have thought of. 🙂\nThe role this leaves me with is different, and honestly more pleasant. As a single developer I now work more like a project lead or an architect than someone typing every line: I decide what to build and why, set the architecture and the interfaces, write the spec, prepare the hardware, and review what comes back, while the agents handle the long mechanical middle. It also frees up time for things that are not engineering at all. Lately I have had more time to prepare the LiteX-M2SDR package and the hardware side of what we sell. 🙂\nWhere this goes # The two examples are bring-up and throughput work, but the same loop covers a lot more, and this is the part I find most exciting to imagine:\nEarly proof of concept. Give a spec and a board, get a first working prototype overnight, exactly the phase that usually eats weeks. Timing closure. Many iterations, low intelligence per step, long builds: change a directive or a pipeline, rebuild, read WNS/WHS (the worst-case timing slack), repeat. An agent can grind a hundred of these while you do something else. New-board bring-up. Point it at a new platform, let it work through clocking, resets and the first peripherals. Driver and software co-development. Bring the driver up against the gateware on real hardware, with the same CSR and capture access. Test farms. 
Several boards on the bench, an agent fanning regressions across all of them, bisecting a hardware regression to the commit that caused it. Porting cores across FPGA families, where most of the work is mechanical adaptation and re-validation. Exploration runs. Let it try variants of an architecture overnight and report back the ones that closed timing and passed, with the captures to prove it. The common thread: in all of these, the human contribution is the goal, the spec, the hardware on the bench, and the occasional course-correction. The grind in between is now something you can hand off.\nA caveat # The split is the obvious one: the ideas and the engineering are mine, and I still steer when it matters. The agent runs the loop, not the project, and it runs it on real hardware with real captures and numbers.\nIt also only works because LiteX was text and commands from the start. A GUI-first tool could not be driven this way.\nLiteNVMe: github.com/enjoy-digital/litenvme. LiteX-M2SDR: github.com/enjoy-digital/litex_m2sdr, and its debugging guide.\nWork and ideas by enjoy-digital; written up with AI in the loop.\n","date":"2 June 2026","externalUrl":null,"permalink":"/posts/ai-era-fpga/","section":"Posts","summary":"LiteX was always command-line and text first, for cost reasons. It turns out that’s exactly what an AI agent needs to drive an FPGA. 
Here’s the methodology we use now, with LiteNVMe and M2SDR as real examples.","title":"FPGA development with LiteX in the AI era","type":"posts"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/litescope/","section":"Tags","summary":"","title":"Litescope","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/categories/methodology/","section":"Categories","summary":"","title":"Methodology","type":"categories"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/methodology/","section":"Tags","summary":"","title":"Methodology","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/nvme/","section":"Tags","summary":"","title":"Nvme","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/sdr/","section":"Tags","summary":"","title":"Sdr","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/series/","section":"Series","summary":"","title":"Series","type":"series"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/linux/","section":"Tags","summary":"","title":"Linux","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/series/litex-co-simulation/","section":"Series","summary":"","title":"LiteX Co-Simulation","type":"series"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/qemu/","section":"Tags","summary":"","title":"Qemu","type":"tags"},{"content":" The 1 MHz problem # If you build FPGA SoCs, you know the simulation tax. You change one register in a peripheral, start Verilator, and then you wait. Not for your peripheral, but for the CPU. 
Verilator simulates every gate of the RISC-V core while it boots, sets up a stack, runs the BIOS and parses a boot prompt, all at roughly one simulated megahertz, before it ever touches the thing you actually changed.\nSo most of the time I spend \u0026ldquo;in simulation\u0026rdquo; isn\u0026rsquo;t spent on my design. It\u0026rsquo;s spent waiting for a CPU I\u0026rsquo;m not even debugging. 😅\nPR #2468 takes the shortcut: don\u0026rsquo;t simulate the CPU. Run it in QEMU at native speed, and keep simulating the SoC (the interconnect and the peripherals) in Verilator. Each tool does the part it\u0026rsquo;s good at, and the two talk over a small bridge.\nWhat the PR adds # The split is simple:\nQEMU owns the CPU. The RISC-V core, its local ROM/RAM, the RISC-V timer and software interrupts (ACLINT/CLINT) and the PLIC all live inside QEMU and run at native speed.\nVerilator owns the SoC. The LiteX interconnect and every peripheral (UART, Ethernet, DMA, whatever IP you\u0026rsquo;re working on) are simulated cycle-accurately, exactly as before.\nWhenever the CPU in QEMU touches an address that belongs to the LiteX SoC (a control/status register, or CSR; a peripheral buffer; or shared main RAM), QEMU forwards the access over a TCP bridge to the Verilator side, which turns it into a real Wishbone/AXI bus cycle, lets the SoC react, and sends the result back.\nQEMU runs the CPU, Verilator runs the SoC, and a TCP bridge carries every MMIO access between them.\nWhy it\u0026rsquo;s faster, and where it isn\u0026rsquo;t # A Verilated soft CPU runs at maybe 0.1 to 1 MIPS, because every pipeline stage is simulated. QEMU runs the same code at host speed. The peripherals don\u0026rsquo;t get any faster (they stay in Verilator, which is the whole point), but the time you used to spend simulating the CPU on every boot, every memcpy, every driver probe is mostly gone.\nIt\u0026rsquo;s worth being clear about what this is not. 
It\u0026rsquo;s co-simulation, not cycle-accurate CPU verification. The v1 bridge does blocking, single-beat MMIO (no AXI bursts yet), it doesn\u0026rsquo;t model CPU caches, and exact cycle-level interrupt timing isn\u0026rsquo;t the goal. If you\u0026rsquo;re verifying the CPU pipeline, use a Verilated core. If you\u0026rsquo;re working on the peripherals, drivers and software around the CPU (for me, most of the time), co-simulation is what you want.\nBooting the BIOS # The QEMU CPU shows up as just another --cpu-type. Build the patched QEMU once (it fetches QEMU v8.2.4, applies the litex-sim machine, and drops the binaries in build/qemu-litex/bin/):\npython3 litex/build/sim/qemu/build_qemu_litex.py\nThen run the simulator with qemu as the CPU:\npython3 -m litex.tools.litex_sim \\\n  --cpu-type=qemu \\\n  --cpu-variant=rv32 \\\n  --qemu-binary build/qemu-litex/bin/qemu-system-riscv32\nLiteX builds the SoC, Verilator compiles it, and once the bridge is listening LiteX starts QEMU pointed at it. Then the familiar banner shows up, except the CPU on the other end is QEMU:\n[qemu_wishbone] client connected\n        __   _ __      _  __\n       / /  (_) /____ | |/_/\n      / /__/ / __/ -_)\u0026gt;  \u0026lt;\n     /____/_/\\__/\\__/_/|_|\n   Build your hardware, easily!\n(c) Copyright 2012-2026 Enjoy-Digital\n(c) Copyright 2007-2015 M-Labs\nBIOS built on Jun 2 2026 12:03:24\nBIOS CRC passed (ed75e705)\nLiteX git sha1: 01ac04ff5\n--================ SoC =================--\nCPU:\tQEMU RISC-V @ 1MHz\nBUS:\twishbone 32-bit data/32-bit addr\nCSR:\t32-bit data big ordering\nROM:\t128.0KiB\nSRAM:\t8.0KiB\n--================ Boot ================--\nBooting from serial...\nPress Q or ESC to abort boot completely.\nNo boot medium found\n--============== Console ===============--\nlitex\u0026gt;\nTwo things to notice. [qemu_wishbone] client connected is the Verilator-side bridge module reporting that QEMU has connected. And QEMU runs with -serial none, so there is no QEMU console. 
The UART you\u0026rsquo;re reading is the LiteX UART, simulated in Verilator and reached over the bridge. The peripheral under test is doing the talking. (@ 1MHz is just the simulated SoC clock; the CPU itself runs as fast as QEMU likes.)\nA first benchmark # The prompt is nice but it doesn\u0026rsquo;t show the speed-up: a tiny BIOS boots fast either way. So let\u0026rsquo;s make the CPU work: give the SoC some main RAM and let the BIOS run its memory test, which is a tight read/write loop, exactly the kind of CPU-bound code a Verilated core struggles with.\nSame SoC, same BIOS, same 2 MiB of main RAM. Only the CPU implementation changes.\n# CPU in QEMU (peripherals + RAM still in Verilator, via the shared-RAM file)\npython3 -m litex.tools.litex_sim --cpu-type=qemu --cpu-variant=rv32 \\\n  --qemu-binary build/qemu-litex/bin/qemu-system-riscv32 \\\n  --integrated-main-ram-size=0x200000\n# CPU simulated cycle-by-cycle in Verilator\npython3 -m litex.tools.litex_sim --cpu-type=vexriscv \\\n  --integrated-main-ram-size=0x200000\nWall-clock time between Memtest at 0x40000000 (2.0MiB)... and Memtest OK:\nCPU implementation | 2 MiB BIOS memtest (wall-clock) | Speed-up\nQEMU co-simulation | ~0.15 s | ~100×\nVexRiscv in Verilator | ~15.3 s | 1× (baseline)\nThe BIOS memspeed numbers say the same thing from the other side: 285.7 MiB/s write / 400.0 MiB/s read with QEMU, versus 1.6 MiB/s write / 918.9 KiB/s read on the Verilated core. Same SoC, same machine; the only difference is who executes the instructions.\nOne detail: in the QEMU run, main RAM is a shared file (memory-backend-file) mapped by both QEMU and Verilator, so a DMA-capable peripheral in Verilator and the CPU in QEMU see the same bytes. The peripheral side stays accurate; only the CPU got faster. 🚀\nHow the bridge works # The protocol is small. QEMU opens one TCP connection and does one blocking request at a time. 
Every message is a fixed 32 bytes, little-endian:\nOffset | Field | Size | Meaning\n0 | magic | 4 | 0x3051584c (LXQ0) request / 0x3052584c (LXR0) response\n4 | version | 2 | 1\n6 | op | 2 | 0 read · 1 write · 2 IRQ poll\n8 | size | 4 | 1, 2, 4, 8 (or 0 for an IRQ poll)\n16 | addr | 8 | byte address\n24 | data | 8 | write data / read data\nWhen the CPU reads or writes an address in the LiteX MMIO window, QEMU sends a READ/WRITE and the Verilator-side qemu_wishbone module runs a real bus cycle, replying with the data plus the current LiteX interrupt bitmask. The IRQ poll (op=2) lets a LiteX interrupt reach the CPU even when it isn\u0026rsquo;t otherwise touching the bus.\nThe memory map follows the same split. This is the command LiteX builds and launches for you (rv32):\nqemu-system-riscv32 -M litex-sim,xlen=32,\\\nbridge-host=127.0.0.1,bridge-port=1235,\\\nbridge-base=0x80000000,bridge-size=0x80000000,\\\nreset-addr=0x0,rom-base=0x0,sram-base=0x10000000,main-ram-base=0x40000000,\\\nclint-base=0xf0010000,plic-base=0xf0c00000,csr-base=0xf0000000,... \\\n  -m 67108864B -nographic -serial none -monitor none \\\n  -bios build/sim/software/bios/bios.bin\nROM, RAM, CLINT and PLIC are QEMU-local (fast). Everything in the bridge-base/bridge-size window (CSRs, LiteEth buffers, a framebuffer) is forwarded to Verilator. You don\u0026rsquo;t write any of that command; LiteX does.\nQEMU is just another CPU # None of this needed special-casing in the SoC code, and that\u0026rsquo;s the part I find satisfying.\nIn LiteX a CPU is an abstraction: a Python class that inherits from a common CPU base, declares a few attributes (family, variants, register width), and exposes one or more bus masters to the SoC. VexRiscv does it, Rocket does it, NaxRiscv, CVA6 and PicoRV32 do it. Even the hard cores do it: the Zynq and ZynqMP processing systems are LiteX CPU types too. 
The integration code doesn\u0026rsquo;t care what\u0026rsquo;s behind the bus master; it wires it into the interconnect like any other.\nSoft cores, hardened processing systems, and now QEMU all implement the same CPU contract (a bus master, an interrupt vector, a reset line). Swap the core on top and the SoC underneath doesn\u0026rsquo;t change.\nMaking QEMU \u0026ldquo;a CPU\u0026rdquo; is the same job. Here\u0026rsquo;s the core of litex/soc/cores/cpu/qemu/core.py, trimmed down:\nclass QEMU(CPU):\n    category = \u0026#34;emulator\u0026#34;\n    family = \u0026#34;riscv\u0026#34;\n    name = \u0026#34;qemu\u0026#34;\n    human_name = \u0026#34;QEMU RISC-V\u0026#34;\n    variants = [\u0026#34;standard\u0026#34;, \u0026#34;rv32\u0026#34;, \u0026#34;rv64\u0026#34;]\n\n    def __init__(self, platform, variant=\u0026#34;standard\u0026#34;):\n        # Pick the bus standard the SoC asked for...\n        self.bus_standard = _get_qemu_bus_standard(platform) # wishbone / axi-lite / axi\n        self.bus = _qemu_bus_interface(self.bus_standard)\n        # ...and present it to the SoC as a regular bus master.\n        self.periph_buses = [self.bus]\n        self.memory_buses = []\n        self.interrupt = Signal(32)\n        self.reset = Signal()\n        ...\nNo Verilog, no pipeline. The class hands the SoC a Wishbone (or AXI-Lite, or AXI) master, an interrupt vector and a reset line, and the simulation glue connects that master to the external QEMU process through the bridge module.\nFrom the SoC\u0026rsquo;s side there\u0026rsquo;s nothing special about it. The build log shows QEMU being added exactly the way a soft core would be:\nINFO:SoC:CPU qemu added.\nINFO:SoC:CPU qemu adding IO Region 0 at 0x80000000 (Size: 0x80000000).\nINFO:SoC:CPU qemu setting reset address to 0x00000000.\nINFO:SoC:CPU qemu adding Bus Master(s).\nINFO:SoCBusHandler:cpu_bus0 added as Bus Master.\nINFO:SoC:CPU qemu adding SoC components.\nINFO:SoCBusHandler:clint Region added at Origin: 0xf0010000 ...\nINFO:SoCBusHandler:plic Region added at Origin: 0xf0c00000 ...\n
A whole emulator gets added the same way as a 2000-LUT soft core, because in LiteX a CPU is a contract, not a block of HDL.\nThis looks a lot like Zynq and SoC FPGAs # If the shape feels familiar, it should. A Xilinx Zynq / Zynq MPSoC, or an Intel/Altera SoC FPGA, is a hardened processing system (the Arm complex, its caches, its DDR controller, all fixed silicon) next to programmable fabric where you build your peripherals. You don\u0026rsquo;t redesign the CPU; you build around it. That\u0026rsquo;s the same picture as the diagram above, with the hardened PS sitting in the slot where QEMU sat.\nSo the same way of working carries over: model the hardened PS with an emulator, simulate your fabric peripherals cycle-accurately, and bring up the drivers and software against the pair, with the visibility of simulation, before or alongside real hardware. Whether the \u0026ldquo;hard\u0026rdquo; part is a QEMU process or a vendor PS, the rest of the SoC doesn\u0026rsquo;t care.\nBooting Linux # A BIOS is a warm-up. The reason I wanted this is bigger software. The linux-on-litex-vexriscv project can already boot Linux in the LiteX simulator, and if you\u0026rsquo;ve run ./sim.py there and watched a Verilated VexRiscv-SMP work through OpenSBI and a kernel decompress, you know it takes a while. 
Booting an OS is hundreds of millions of instructions; at ~1 simulated MHz, that\u0026rsquo;s the go-get-a-coffee range.\nSo let\u0026rsquo;s boot a real rv64 Linux with the kernel in QEMU and its console coming out of the LiteX UART in Verilator:\npython3 -m litex.tools.litex_sim \\\n  --cpu-type=qemu --cpu-variant=rv64 \\\n  --qemu-ram-size=0x10000000 \\\n  --qemu-binary build/qemu-litex/bin/qemu-system-riscv64 \\\n  --qemu-firmware .../opensbi-riscv64-generic-fw_dynamic.bin \\\n  --qemu-kernel .../Image \\\n  --qemu-dtb .../litex-sim-rv64.dtb\nQEMU owns the CPU, RAM and the RISC-V local-interrupt machinery Linux expects (the ACLINT/CLINT timer and the SiFive-compatible PLIC); OpenSBI is the stock generic build. The device tree describes those plus one Verilator-side peripheral, the litex,liteuart console at 0x12001800, which sits in the bridge window. So every byte Linux prints goes CPU → bridge → Verilator → your terminal:\nOpenSBI v1.3.1\nPlatform IPI Device : aclint-mswi\nPlatform Timer Device : aclint-mtimer @ 1000000Hz\nBoot HART Base ISA : rv64imafdch\n[ 0.000000] Linux version 5.11.0 ... (riscv64 ... GCC 13.4.0) ...\n[ 0.000000] earlycon: liteuart0 at MMIO 0x0000000012001800 (options \u0026#39;\u0026#39;)\n[ 0.000000] printk: bootconsole [liteuart0] enabled\n[ 0.000000] Kernel command line: console=liteuart rdinit=/init earlycon\n[ 0.000000] Memory: 176352K/260096K available ...\n[ 0.114190] devtmpfs: initialized\n[ 0.405614] 12001800.serial: ttyLXU0 at MMIO 0x0 ... is a liteuart\n[ 0.419278] printk: console [liteuart0] enabled\n[ 0.535703] Freeing unused kernel memory: 2092K\n[ 0.543183] Run /init as init process\n========================================================\n[init] Hello from Linux userspace!\n[init] CPU executed by QEMU, peripherals by Verilator.\n[init] This console is the LiteX UART (liteuart), reached\n[init] through the QEMU \u0026lt;-\u0026gt; Verilator co-simulation bridge.\n
======================================================== That liteuart line is the point: OpenSBI and Linux are both talking through a Verilator-simulated peripheral, over the bridge, while the CPU runs at QEMU speed. From the OpenSBI banner to the /init prompt was about 1.3 seconds of wall-clock. 🐧\nFor comparison: the same boot is hundreds of millions of instructions, and the Verilated VexRiscv runs ~100× slower than QEMU on the same SoC (that\u0026rsquo;s the memtest number from earlier), which puts a cycle-simulated boot in the many-minutes range. A second or so versus several minutes is the difference between iterating on a driver and waiting on one.\nA note on reproducibility, since I\u0026rsquo;d rather be honest about it: this is a real boot from my machine, but getting there meant extending the PR. The litex-sim machine originally only loaded a -bios; I added the usual virt-style -kernel/-dtb loading (riscv_load_firmware / riscv_load_kernel / riscv_load_fdt plus the reset vector) so OpenSBI and a kernel actually boot, wrote the matching rv64 device tree, and added an earlycon to the in-tree liteuart driver to see early output. Follow-up-commit material, but the path works end to end today.\nWhat\u0026rsquo;s next # This is the first post in a series on LiteX co-simulation, and it\u0026rsquo;s really the foundation for the fun stuff. A few threads I want to pull on:\nSoftware CI that actually boots. Wire the QEMU CPU into linux-on-litex-vexriscv so the full boot-to-shell runs in CI in seconds instead of the many minutes a Verilated core needs. Once a boot is that cheap, you can gate every commit on \u0026ldquo;does the software still come up and pass its tests\u0026rdquo;, with no FPGA and no overnight simulation. More software-modeled peripherals. Today the one peripheral on the bridge is the LiteX UART. 
Imagine more co-sim modules on the Verilator side: Ethernet bridged to a host TAP, a virtual block device, a framebuffer in a window, fake sensors on I2C/SPI. The drivers get realistic peripherals to talk to, while the CPU stays in QEMU. Verilator → real silicon. The PR already has a qemu_remote CPU and an Etherbone bridge (litex_qemu_etherbone_bridge.py): same protocol, but the MMIO window is forwarded over Etherbone / PCIe-Bone to a LiteX SoC on an actual FPGA. CPU and software in QEMU on the laptop, peripherals on real hardware. Debugging and visibility. Where co-simulation really earns its keep: $display from Verilator, LiteScope captures of the bus, GDB on the software running in QEMU, and waveforms, all at once. Agents make this loop even more useful: see FPGA development with LiteX in the AI era. Other hacks. Snapshots, fault injection, fuzzing a driver against a deliberately flaky peripheral model\u0026hellip; a lot gets easier once the CPU is basically free. :) The short version: LiteX already treats the CPU as an abstraction, so QEMU slotted in without hacks. What you get back is iteration speed. Stop simulating the CPU, and simulate the part you\u0026rsquo;re actually building. 🙂\nCode: LiteX PR #2468. The LiteX community is on Discord and #litex on Libera if you want to talk about it.\nWork and ideas by enjoy-digital; written up with AI in the loop.\n","date":"2 June 2026","externalUrl":null,"permalink":"/posts/qemu-litex-cosim/","section":"Posts","summary":"PR #2468 lets LiteX run the CPU in QEMU while Verilator keeps simulating the SoC. 
Here’s how it works, a first benchmark, and a real rv64 Linux boot.","title":"QEMU co-simulation in LiteX","type":"posts"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/riscv/","section":"Tags","summary":"","title":"Riscv","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/categories/simulation/","section":"Categories","summary":"","title":"Simulation","type":"categories"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/simulation/","section":"Tags","summary":"","title":"Simulation","type":"tags"},{"content":"","date":"2 June 2026","externalUrl":null,"permalink":"/tags/verilator/","section":"Tags","summary":"","title":"Verilator","type":"tags"},{"content":"Hi, I\u0026rsquo;m Florent Kermarrec. I started LiteX and I\u0026rsquo;m its main developer at enjoy-digital, though by now it\u0026rsquo;s very much a community project, with a lot of people contributing cores, ports and fixes.\nLiteX is an open-source framework that lets you build FPGA SoCs (CPUs, DRAM controllers, PCIe, Ethernet, all the fun stuff) in Python, and deploy them on more than a hundred boards. The motto has been the same from day one:\nBuild your hardware, easily!\nThis blog is where I write up things I\u0026rsquo;m experimenting with. Some are features landing in LiteX, some are detours, some are just me checking whether an idea actually works.\nIf you want to follow along or build something yourself:\nLiteX: https://github.com/enjoy-digital/litex enjoy-digital: https://github.com/enjoy-digital Chat: the LiteX community hangs out on Discord and IRC (#litex on Libera). Email: florent@enjoy-digital.fr Everything here is open source and BSD-2-Clause, like LiteX itself.\nThe work and ideas here are my own. 
I write up the posts and draw the diagrams with AI in the loop, which I\u0026rsquo;ve come to think is just part of how RTL gets built now.\nIt also helps with a more practical constraint: I have four kids, I try to keep weekends for them, and my time is tight. Without a bit of help like this, the blog would probably just not exist. 🙂\n","externalUrl":null,"permalink":"/about/","section":"LiteX Notes","summary":"Hi, I’m Florent Kermarrec. I started LiteX and I’m its main developer at enjoy-digital, though by now it’s very much a community project, with a lot of people contributing cores, ports and fixes.\n","title":"About","type":"page"}]