Just-in-time programming

Hey there! We're Riza and we make it safe and easy to run untrusted code generated by LLMs or humans, in our cloud or on your infra.

There's a pattern emerging across Riza's early customers. They're using LLMs to build highly-adaptable software systems with a technique that we've started to call "just-in-time programming."

They take some dynamic input at runtime and show it to an LLM asking for runnable code to manipulate that input. They then take the code from the LLM and run it immediately, in production. No code review. No tests. Just run it and handle the result. The LLM writes code "just-in-time" to process the input.

Here's a concrete example. Let's say you're building an application that helps users tabulate their gains and losses from stock trades. The app accepts individual stock trade data submitted by users, but it only needs the ticker symbol, the transaction date, the number of shares traded and the share price for each trade. You expect users to provide a CSV export from their brokerage platform that includes this information, but you won't know the exact CSV format until it hits your system, and you don't want to make users manipulate the CSV before submission.

Without an LLM you'd have to build a multi-step workflow that takes the CSV and presents a field mapping interface to users along with a format selector for the date and price fields.

Or you could dispense with all that and just prompt an LLM to write a script that extracts the relevant data from the user's CSV, then run that script immediately to get the data you need for further processing.
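To make the shape of this concrete, here's a minimal sketch of the CSV case in Python. The `generated_code` string stands in for what the LLM might return (the prompt and the model call are elided), and the bare `exec` stands in for a sandboxed runtime; in production you'd execute untrusted code in isolation, not in your app's process.

```python
import csv, io

# A CSV export in whatever format the user's brokerage produced.
# Column names, order and date format are unknown ahead of time.
user_csv = """Symbol,Trade Date,Qty,Price/Share
AAPL,01/15/2024,10,185.50
MSFT,02/03/2024,5,410.25
"""

# In production you'd show the LLM the header (and a few sample rows)
# and ask for a script that emits normalized trade records. This string
# is an illustrative stand-in for code the model might return.
generated_code = '''
import csv, io
from datetime import datetime

def extract_trades(raw):
    trades = []
    for row in csv.DictReader(io.StringIO(raw)):
        trades.append({
            "ticker": row["Symbol"],
            "date": datetime.strptime(row["Trade Date"], "%m/%d/%Y").date().isoformat(),
            "shares": int(row["Qty"]),
            "price": float(row["Price/Share"]),
        })
    return trades
'''

# Run the generated code immediately and use the result.
namespace = {}
exec(generated_code, namespace)  # stand-in for an isolated runtime
trades = namespace["extract_trades"](user_csv)
```

Each user's CSV gets its own bespoke extractor; your application only ever sees the normalized records.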

Using an LLM, you can create bespoke code for each input at runtime rather than constrain input to meet the expectations of your code. It's not too hard to see the same approach working for JSON or XML data, arbitrary log data, even HTML or PDF data. And once you have the code, you can cache it and run it on similar-looking future input without an inference round-trip.
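One way to sketch that caching step, assuming the CSV header row is a reasonable proxy for "similar-looking input": key a cache of generated scripts on the header, and only pay the inference round-trip on a cache miss. The `generate_extractor` callable here is a hypothetical stand-in for your LLM call.

```python
# Cache of generated extraction scripts, keyed by input "shape".
code_cache = {}

def get_extractor_code(raw_csv, generate_extractor):
    # Use the header row (column names) as the cache key.
    header = raw_csv.split("\n", 1)[0]
    if header not in code_cache:
        # Cache miss: ask the LLM for a new extraction script.
        code_cache[header] = generate_extractor(raw_csv)
    return code_cache[header]
```

For other formats you'd pick a different shape key, say the set of top-level keys for JSON or the tag structure for XML.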

This is pretty wild, but it works because LLMs can write code fast. When I first started making web applications it was common to just edit some scripts in your production server's cgi-bin directory and hit save. The next page view on your site would then immediately run the new code. The pace of iteration was incredible, but you'd endure a few seconds (or even minutes) of downtime each time you forgot a semicolon. Every software professional knows you shouldn't edit code live in production.

But when you think about it, this principle became dogma because humans are unpredictable, unreliable and most importantly slow. It takes time to realize our mistakes, and even more time to correct them. An LLM can write code, run it, see an error, rewrite the code and retry all in the span of a second or two. That's less time than it takes VS Code to start up on my machine. No matter how careful we are, we'll forget a semicolon once in a while, and when we do it may take a few seconds (or even minutes) to realize our mistake. That much downtime is unacceptable for most production services.

Of course, fast as they are relative to humans, LLMs aren't exactly known for their reliability. But when specifically tasked with writing code, they're already quite good and getting better. Still, they won't give you correct code 100% of the time. What we've learned is that a simple retry loop, where you include the error message in a follow-on LLM prompt, works remarkably well. And when that still fails after a couple of attempts, you'll typically fall back to a user-facing error message explaining the failure and suggesting a manual reformat of the input.
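A minimal sketch of that retry loop, with hypothetical `generate` and `run` callables standing in for your LLM client and your sandboxed runtime:

```python
def run_with_retries(task_prompt, user_input, generate, run, max_attempts=3):
    prompt = task_prompt
    for attempt in range(max_attempts):
        code = generate(prompt)  # ask the LLM for runnable code
        try:
            return run(code, user_input)  # execute in an isolated runtime
        except Exception as err:
            # Feed the failing code and its error message back to the
            # model and ask for a corrected version.
            prompt = (
                f"{task_prompt}\n\n"
                f"This code failed:\n{code}\n"
                f"Error: {err}\n"
                f"Please fix the code and try again."
            )
    # Out of attempts: surface a user-facing error instead.
    raise RuntimeError(
        "Could not process this input automatically; "
        "please reformat it and resubmit."
    )
```

The whole loop, including a failed attempt or two, can finish in the time a human would spend reading the first stack trace.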

We've found ourselves at the center of this just-in-time programming system design because LLM-generated code is properly treated as untrusted, and we provide an isolated, locked-down runtime for untrusted code execution. But the runtime is only one piece of a larger just-in-time programming platform that we're really excited to build. We've learned from our early customers that they'd also need help with:

  • A framework for integrating the "write code, see error, retry" loop
  • End-to-end testing
  • Runtime observability
  • Tailored LLM code output evaluation

We're just getting started on building the best-possible toolkit to put just-in-time programming into your production applications.