NVIDIA DGX Spark Launch


It's been 7 months since Nvidia announced the DGX Spark (previously DIGITS), calling it a "desktop AI supercomputer" based on their Blackwell architecture. In the time since that announcement, AMD has released their Strix Halo (Zen 5/RDNA 3.5/XDNA 2 based) Ryzen AI MAX+ APU systems with up to 128 GB of system memory that can be split between the CPU and GPU. These systems interest me because they look like a possible solution to the anemic VRAM available at the consumer GPU level, which hampers our ability to run large LLMs and image generation models at home and in the office.

The downside with these systems (IMO) is that they are an all-in-one design: you won't be able to upgrade or swap out the CPU, GPU, or even the RAM. Most people don't care and never do crazy things like that, but I'm not most people, so it bothers me. Just like I'm never really going to be okay with having liquid cooling in any of my systems, because liquid + electronics = a bad time.

TL;DW NetworkChuck


NetworkChuck released his video comparing the performance of the DGX Spark (128 GB of unified RAM) against his dual RTX 4090 (48 GB of VRAM total) AI server, which cost him over $5,000 to put together. The custom, purpose-built machine beats the DGX Spark in many of the tests, which is understandable since the Nvidia system's performance is likely comparable to an RTX 5060 or RTX 5070, while having much more VRAM than any dedicated GPU available at the consumer level today. Where the DGX Spark shines is in being able to load much larger "AI" models into unified memory than you could even with a pair of $2,000+ RTX 5090s and their 64 GB of combined VRAM.

The DGX Spark has hardware support for FP4 compute formats, which you're not going to get on consumer-level GPUs right now. That makes the DGX Spark possibly a better choice if your goal is to train models. The DGX Spark also wins on power consumption (albeit with reduced performance) at 240 watts max, compared to a dual high-end GPU machine at 1,100 watts max or more.
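A rough back-of-the-envelope shows why memory capacity and precision matter together: a model's weight footprint scales with bits per parameter. A minimal sketch (the 70B parameter count is illustrative, and this ignores KV cache and activation overhead):

```python
def model_weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight-only memory footprint in GB.

    Ignores KV cache, activations, and framework overhead, so real
    usage will be somewhat higher.
    """
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A hypothetical 70B-parameter model at different precisions:
for bits in (16, 8, 4):
    print(f"FP{bits}: ~{model_weight_gb(70, bits):.0f} GB")
# FP16 (~140 GB) won't fit in 128 GB of unified memory;
# FP8 (~70 GB) and FP4 (~35 GB) will, with room to spare at FP4.
```

That's the DGX Spark's pitch in one calculation: at FP4 (which it accelerates in hardware), models that can't fit in any consumer GPU's VRAM fit comfortably in 128 GB of unified memory.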

I don't think this feels like a supercomputer. Maybe a mini supercomputer.

-- NetworkChuck

TL;DW Level1Techs


Wendell put out his video on the DGX Spark, saying he can get similar FP8 inference performance on AMD and the Spark, but the Spark's FP4 performance should be unmatched. He also points out that another option would be dual RTX Pro 6000s, giving you 192 GB of VRAM combined. That's probably going to set you back at least $20,000, but it would be an order of magnitude faster. He also believes Nvidia is working on an option between the dual RTX Pro 6000 workstation and the Spark, and I presume something like that would come in at the $10,000+ level.

I'm disappointed there was no real test/comparison between the DGX Spark and any of the other systems L1Techs has that are capable of handling "AI" workloads at reasonable-to-excellent performance. I suspect Nvidia may have strongly discouraged such comparisons in exchange for early access to the DGX Spark. NetworkChuck's comparison was Nvidia hardware vs. Nvidia hardware, so cross-vendor comparisons may have been explicitly forbidden. Hopefully Wendell does another video soon-ish where he runs benchmarks against alternative systems and setups.

This is a developer tool. You've got to figure this out. This is the hallucination. This is the 0.1. This is the vacuum tubes of this generation.

-- Wendell of Level1Techs

TL;DW ServeTheHome


ServeTheHome also has a video; it's not a channel I've seen before. He comes at this from much more of an IT perspective than I'm interested in, talking a lot about the networking features and capabilities the DGX Spark has and how you could take advantage of them with enterprise-grade hardware. The AMD solution has two 10 gigabit Ethernet ports, while the DGX Spark has one 10 GbE port, and that's the low-speed port! Nvidia's little box also has dual QSFP56 ConnectX-7 ports, which he says run at around 200 gigabit, in an OCP NIC 3.0 form factor.
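To make the 10 GbE vs. 200 gigabit gap concrete, here's an idealized transfer-time calculation for moving a large model checkpoint between machines (the 70 GB file size is illustrative, and this ignores protocol overhead and disk speed, so real transfers will be slower):

```python
def transfer_seconds(size_gb: float, link_gbit: float) -> float:
    """Idealized time to move size_gb gigabytes over a link_gbit
    gigabit/s link. Ignores protocol overhead and storage limits."""
    return size_gb * 8 / link_gbit

size = 70  # e.g. a ~70 GB model checkpoint (illustrative)
print(f"10 GbE:   {transfer_seconds(size, 10):.0f} s")   # 56 s
print(f"200 Gbit: {transfer_seconds(size, 200):.1f} s")  # 2.8 s
```

That order-of-magnitude difference is why the high-speed ConnectX-7 ports matter if you're shuttling models or datasets between a Spark and other machines, or clustering two Sparks together.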

If you look at the front of the system, you're going to see this kind of foam vent. It looks like it's foam, but it's actually pretty hard. I don't even know how to describe it, but it is actually a hard surface. The idea is that it looks kind of cool and there's also air flow. So, it is designed so you get air flow through the chassis this way.

-- ServeTheHome

EOD

At the end of the day, the DGX Spark seems to be a small form factor machine targeting "AI" developers more than consumers. There are probably some very good and interesting use cases for machines like this in small business settings that I haven't thought of or stumbled onto yet. The lower power consumption is definitely a benefit if the machine is going to see constant, heavy use, but the reduced performance that comes with it could nullify that benefit - time is money. And 128 GB of RAM, while nothing to sneeze at, isn't that impressive to me in these little boxes, because we've been able to build consumer-grade PCs with that much RAM for a few years now. Honestly, I'm hoping to see 256 to 512 GB options in these little "AI" computing boxes very soon. Currently, my main workstation (which is over 6 years old as of today) and my 2-year-old laptop both have 64 GB of RAM, and yes, I do utilize that much sometimes! Free/available RAM is wasted RAM, as they say.

If you want generation speed (whether for images or LLM responses), then high-end consumer GPUs are going to be faster (and consume more power) than Nvidia's micro "supercomputer". With an MSRP of $3,999, I was hoping to see a little more to convince me the DGX Spark is going to be much better than AMD's ~$2,000 solution. Only time will tell if my initial skepticism is warranted 😆

Addendum


Digital Spaceport has tested his quad RTX 3090 system against lmsys.org's benchmarks, and the results are not surprising but still interesting.