The RTX Spark is NVIDIA's bet that your next AI agent runs on your desk, not in a distant data centre — a Windows-on-Arm machine with a petaflop of compute and 128GB of memory.
On 31 May 2026, at Computex, NVIDIA unveiled RTX Spark, a new class of personal computer built to run AI agents locally. The headline specifications: roughly one petaflop of AI compute and 128GB of unified memory in a desktop-class machine, with NVIDIA claiming up to twice the inference speed of current hardware on leading agentic models — about 2x on Qwen 3.6 27B in its own figures. NVIDIA puts named software partners behind it, including Microsoft, Adobe, Blender and H Company, and says more than 30 laptops and around 10 desktops will follow from Dell, HP, Lenovo, Asus and MSI.
Read against NVIDIA's own data-centre pitch from the same keynote, RTX Spark is the other half of a single argument. The cloud builds the factories. The desk gets an agent of its own.
What NVIDIA showed at Computex
RTX Spark sits in the lineage NVIDIA started with DGX Spark — a small machine carrying serious AI silicon — but aimed at a wider audience and built on Windows-on-Arm. Early hands-on coverage framed it as NVIDIA turning Windows into something closer to an agentic operating system: a personal computer whose default workload is running models and agents, not opening documents.
The 128GB of unified memory is the detail that matters. Large models are memory-bound; you cannot run a capable agent locally if the weights do not fit. Putting 128GB on the desk means models that previously needed a cloud GPU now fit on a machine you own. NVIDIA's GeForce announcements rounded out the consumer side, but RTX Spark is the strategic piece — the agent leaves the data centre.
💡Why 128GB is the headline
Memory, not raw speed, is what kept capable agents in the cloud. Put 128GB on the desk and the maths changes: the model fits, the data stays local, and the agent answers without a round trip.
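The memory argument can be checked with back-of-envelope arithmetic. The sketch below estimates the size of a model's weights from its parameter count and bits per weight, then compares that against 128GB. The 20% overhead allowance (for context cache, activations and the operating system) and the function names are illustrative assumptions, not NVIDIA figures, and real footprints vary by runtime and quantisation format.

```python
def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits(params_billions: float, bits_per_weight: int,
         memory_gb: float = 128.0, overhead_fraction: float = 0.2) -> bool:
    """True if weights plus an assumed overhead allowance fit in memory.

    overhead_fraction is a rough placeholder, not a measured value.
    """
    needed = weights_gb(params_billions, bits_per_weight) * (1 + overhead_fraction)
    return needed <= memory_gb


if __name__ == "__main__":
    # A 27B-parameter model at 16 bits per weight: 27e9 * 2 bytes = 54 GB.
    print(weights_gb(27, 16), fits(27, 16))
    # A hypothetical 70B-parameter model at 16 bits per weight: 140 GB.
    print(weights_gb(70, 16), fits(70, 16))
    # The same 70B model quantised to 8 bits per weight: 70 GB.
    print(weights_gb(70, 8), fits(70, 8))
```

Under these assumptions a 27B-parameter model fits with room to spare, while a 70B-parameter model only fits once its weights are quantised below 16 bits.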
The agent moves to the edge
For two years, using a capable AI agent meant renting one. Your prompt travelled to a data centre, a model you did not control answered, and your data sat — however briefly — on someone else's machine. RTX Spark proposes a different default: the agent runs where you are, on hardware you bought.
This is the counterweight to the sovereign-factory story NVIDIA told on the same stage. One pitch concentrates intelligence in vast plants; the other distributes it to the desk. Both are true at once, and the tension between them is the interesting part. NVIDIA is selling the centre and the edge, and betting the edge grows fast.
There is a personal-agency angle here that the spec sheet hides. An agent on your own machine is one you can, in principle, inspect, constrain and switch off. An agent rented from a cloud is one whose rules are set elsewhere. Owned intelligence and rented intelligence are not the same relationship — and which one becomes the default shapes how much control ordinary people keep.
Why local matters: privacy, latency, ownership
Three concrete advantages follow from running the agent locally. Privacy: sensitive data — medical notes, legal drafts, source code — never leaves the machine. Latency: no network round trip, so the agent responds at the speed of local memory. Ownership: the cost is the hardware you bought, not a metered bill that grows with use.
The most private agent is the one that never sends your data anywhere. Local compute is not just faster — it changes who holds your information.
— On the case for on-device AI, 31 May 2026 (https://blogs.nvidia.com/blog/rtx-ai-garage-computex-spark-local-agents/)
None of this is free. Local models still trail the largest cloud models on raw capability, and a petaflop on the desk is not a data centre. But for a large class of everyday work — drafting, coding, search over your own files — local is now good enough, and good enough on your own machine beats excellent on someone else's terms.
The new risk surface
A capable agent with access to your files and your operating system is a new kind of risk, not just a new convenience. The same autonomy that makes a local agent useful makes a compromised one dangerous: it can read what you can read and do what you can do. The agentic-development world has been wrestling with this; I wrote about it in the context of Google Antigravity 2.0 and the agentic stack.
Putting that capability on millions of personal machines widens the attack surface considerably. Local agents need real sandboxing, clear permission models, and an off switch that ordinary users can find. RTX Spark also points at where physical AI is heading — capable models running close to the world they act on, which is the same instinct behind on-device robotics work like NVIDIA's own Cosmos 3 push into physical AI.
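To make "clear permission models" and "an off switch" concrete, here is a minimal sketch of a policy check an agent runtime could consult before touching a file. Everything in it is hypothetical: `AgentPolicy`, its fields and the example paths are invented for illustration and do not describe RTX Spark, Windows or any NVIDIA software. It is also only a policy layer. Real sandboxing needs operating-system isolation underneath it.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class AgentPolicy:
    """Hypothetical allowlist policy for a local agent (illustration only)."""
    allowed_roots: tuple[PurePosixPath, ...]
    allowed_actions: frozenset[str] = frozenset({"read"})
    enabled: bool = True  # the user-facing off switch

    def switch_off(self) -> None:
        """Disable the agent: every later request is denied."""
        self.enabled = False

    def permits(self, action: str, path: str) -> bool:
        """Deny by default; allow only listed actions under listed folders."""
        if not self.enabled:
            return False
        if action not in self.allowed_actions:
            return False
        target = PurePosixPath(path)
        # Reject parent-directory hops instead of trying to resolve them.
        if ".." in target.parts:
            return False
        return any(target.is_relative_to(root) for root in self.allowed_roots)


if __name__ == "__main__":
    policy = AgentPolicy(allowed_roots=(PurePosixPath("/home/user/projects"),))
    print(policy.permits("read", "/home/user/projects/notes.txt"))
    print(policy.permits("write", "/home/user/projects/notes.txt"))
    print(policy.permits("read", "/home/user/projects/../.ssh/id_rsa"))
    policy.switch_off()
    print(policy.permits("read", "/home/user/projects/notes.txt"))
```

The design choice worth noting is deny-by-default: the agent gets nothing unless a folder and an action are both explicitly listed, and the off switch overrides everything else.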
Frequently Asked Questions
These are the questions buyers, developers and security teams have been asking since NVIDIA unveiled RTX Spark at Computex 2026. Short answers follow, drawn from NVIDIA's announcement and early hands-on coverage.
What is RTX Spark?
RTX Spark is a Windows-on-Arm personal computer NVIDIA built to run AI agents locally, with about one petaflop of AI compute and 128GB of unified memory. It is a desktop-class machine that holds large models in memory without the cloud. The key is the 128GB: memory capacity, not clock speed, is what kept capable agents off the desk.
How does RTX Spark run AI agents on-device?
RTX Spark loads a model's weights into its 128GB of unified memory and runs inference on NVIDIA silicon, so prompts never leave the machine. NVIDIA claims up to 2x the inference speed of current hardware on agentic models, citing roughly 2x on Qwen 3.6 27B in its own figures. It names Microsoft, Adobe, Blender and H Company as software partners, a sign that the software stack is being tuned for local agents.
Why does on-device AI matter?
Cloud agents send your data to a machine you do not control. RTX Spark, in NVIDIA's pitch, changes the default: the agent runs where you are. That matters for three concrete reasons: local inference keeps sensitive data private, removes network latency, and converts a metered cloud bill into a one-time hardware cost. Owned intelligence is a different relationship from rented intelligence.
Who is RTX Spark for?
RTX Spark is for developers, creators and privacy-sensitive professionals: anyone who needs a capable agent without sending work to the cloud. That means coders, writers, designers and people in regulated fields like law and medicine. NVIDIA says more than 30 laptops and around 10 desktops from Dell, HP, Lenovo, Asus and MSI will carry the platform.
What are the real risks of a local AI agent?
A local agent carries three durable risks. First, security: an agent with access to your files and operating system is dangerous if compromised, so sandboxing and permissions are essential. Second, fragmentation across many hardware and software combinations. Third, a capability gap: local models still trail the largest cloud models. The risks are manageable, but only if local agents ship with real off switches.
RTX Spark will not replace the data centre, and NVIDIA is not pretending it will. What it does is reopen a question the cloud era had quietly closed: should your intelligence live on a machine you own, under rules you set? For everyday work, the answer is starting to be yes. That is a dignity question as much as a hardware one — keeping people in command of the tools that act for them is the whole point of building with care, which is what the .person Protocol is for.