Archive link: https://web.archive.org/web/20241218092954/https://www.ft.co...
Better title would be "Analysts at Omdia, a technology consultancy, estimate that Microsoft acquires twice as many Nvidia AI chips as tech rivals"
"Omdia analyses companies’ publicly disclosed capital spending, server shipments and supply chain intelligence to calculate its estimates."
It is wild that the number of GPUs purchased by a company has become, like, an infrastructure investment or something. Like the count itself is worth reporting.
What will they accomplish with the things? Why even think about that part? Probably AI. Selling premium GEMMs, what a trick. Bah. Hopefully TSMC got a really good cut, they are at least doing some interesting engineering.
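(Aside, for anyone unfamiliar: a GEMM is just a general matrix multiply, C = alpha*A*B + beta*C, the kernel these chips spend most of their cycles on. A toy numpy sketch, purely for illustration, nothing vendor-specific:)

    import numpy as np

    def gemm(A, B, C, alpha=1.0, beta=0.0):
        # General matrix multiply: the core primitive behind both training and inference.
        return alpha * (A @ B) + beta * C

    A = np.random.rand(512, 256)
    B = np.random.rand(256, 128)
    C = np.zeros((512, 128))
    print(gemm(A, B, C).shape)  # (512, 128)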
>It is wild that the number of GPUs purchased by a company has become, like, an infrastructure investment or something. Like the count itself is worth reporting.
An estimated count is newsworthy for the journalists and the readers because it's an indirect proxy for outsiders -- who are not privy to the internal plans of FAANG companies -- to try and figure out what's happening. Basically trying to "read the tea leaves" of the AI industry.
Demand exceeds supply. NVIDIA has a limited number of chips to sell and TSMC factory time is overbooked. In the current zero-sum situation, NVIDIA picking and choosing who to sell to may be a signal of something. And/or Microsoft/OpenAI's willingness to spend billions on 2x the NVIDIA chips is a signal of something.
https://fortune.com/2024/02/21/nvidia-earnings-ceo-jensen-hu...
https://fortune.com/2024/09/12/nvidia-jensen-huang-ai-traini...
Aren't Amazon, Google, and Meta running their own silicon for some training and inference? Does MS have an equivalent?
That could explain a large part of the gap.
Edit: it looks like Microsoft announced their own last year, but I can imagine they may be behind the curve in capability and scale out compared to the others
My outsider's understanding is Google is the only one whose custom silicon is the primary compute for their flagship foundation models. I didn't see any messaging about the Nova models being trained on Trainium (AWS), and Meta still talks about the number of H100s training their Llama models.
Meta is still GPUs. Amazon Trainium 1 was a failure and is trying an upgrade with Trainium 2. Google is a TPU shop but still buys GPUs for cloud.
I was writing a comment saying the same thing when your comment appeared. Yes, Meta, Google, and Amazon all have custom silicon, and it seems Microsoft's similar efforts came later. None of these companies want to give Nvidia all of the money, so going forward, I think Nvidia is going to see more competition from these efforts. The big players aren't going to sell their chips to others (I don't think), but they'll make them available to cloud customers.
Amazon and Meta are at an early stage with their TPU equivalents and I don't think they're ready for production loads. Only Google has comparable silicon, but I suspect even Google's TPUs are mostly for internal products rather than consumers.
Maybe some other company will catch up. But it is hard.
Intel is better at chip design than any of those companies. They spent a lot of effort coming up with a very clever chip that competed well against the current generation of Nvidia chips, while still running your old x86 codes.
Nvidia continued increasing memory bandwidth, and nobody cared about Knights Whatever.
Yeah, the number of TPUs Google has alone massively dwarfs the compute anybody else has.
Google has custom made TPUs.
Yeah, if you're looking at this from a 'who has more compute' angle, Google beats everyone by a mile.
Genuine question: is that true? It seems bonkers that they'd be cranking out more proprietary processors than they could acquire from an established GPU manufacturer.
Not more than Nvidia total. But more than any Nvidia buyer? Sure. Don't forget they _also_ have Nvidia GPUs.
This is a bug not a feature:
https://finance.yahoo.com/news/microsoft-stock-receives-rare...
This is what they agreed to, in order to win the OpenAI partnership. In exchange, OpenAI doesn't have to build or support their own infra. In theory, a win-win, but only if MSFT can effectively sell OpenAI-on-Azure.
When will we break past Nvidia's dominance of the field? Is AMD anywhere close to catching up? Other players? TPUs?
Not soon, it seems. Nvidia is so far ahead that they're artificially limiting their 4090s to half speed to fill a market niche.
https://x.com/realGeorgeHotz/status/1868356459542770087
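If you want to sanity-check the "half speed" claim yourself, the rough idea is to time FP16 matmuls with FP32 vs FP16 accumulation. A hypothetical PyTorch sketch (assumes a CUDA build of PyTorch; the flag below only *permits* reduced-precision accumulation rather than forcing it, so treat the numbers as indicative only):

    import time
    import torch

    def tflops_fp16_matmul(n=8192, iters=50, allow_fp16_accum=False):
        # Permit (or forbid) reduced-precision accumulation in FP16 GEMMs.
        torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = allow_fp16_accum
        a = torch.randn(n, n, device="cuda", dtype=torch.float16)
        b = torch.randn(n, n, device="cuda", dtype=torch.float16)
        torch.cuda.synchronize()
        t0 = time.time()
        for _ in range(iters):
            a @ b
        torch.cuda.synchronize()
        secs = time.time() - t0
        return 2 * n**3 * iters / secs / 1e12  # ~2*n^3 FLOPs per matmul

    print("fp32 accumulate:", tflops_fp16_matmul(allow_fp16_accum=False), "TFLOPS")
    print("fp16 accumulate:", tflops_fp16_matmul(allow_fp16_accum=True), "TFLOPS")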
Is there a possible way to unblow the fuse? I imagine it depends on the type of e-fuse used. The Athlon XP pencil trick probably won't work, haha. Curious if anyone has more information.
Fascinating, I had no idea this was even a thing. Simultaneously badass that team green is far ahead, and also a bummer that it's artificially limited / segmented.
I wonder if the upcoming 5090 core will mostly be a fuse-intact 4090. I imagine nearly all of NV's current focus is on H200 and Blackwell and whatever else is in the pipeline rather than these "silly" little gamer cards, which bring in comparatively trivial financial resources.
/me *cries a tear*
Mixed messages on that answer, but I'd like to know as well.
https://x.com/cognitivecompai/status/1868399108924592391
https://x.com/cognitivecompai/status/1868401738706993301
From my research it seems that restoring full GPU capabilities by repairing or circumventing a deliberately blown eFuse on an NVIDIA AD102 die is, for all practical purposes, impossible.
> Is there a possible way to unblow the fuse?
Maybe, but it's also possible that it wouldn't do anything. Other Nvidia boards (like the Tegra in the Switch) also come with arbitrarily disabled "dark silicon", but enabling the extra SOC hardware only causes the board to crash when using everything at once. It wouldn't surprise me if this was a binning measure, even though I also wouldn't be surprised if it was an arbitrary limit.
Also interested in this ...
I assumed this was to make GPUs more affordable for users with gaming use cases in mind.
In a way... I can actually see this as fair. What's the difference between the RTX 4090 and the RTX 6000 Ada? 5x the price for 2x the memory? Ridiculous. But then you factor in all the R&D dollars Nvidia poured into their compute/non-graphics ecosystem, which now easily eclipses the gaming one, probably by a factor of 10 or more, and suddenly it doesn't seem so ridiculous. You either a) don't get a 4090-level graphics card anymore... or b) you do get it, but only if it's nerfed for non-graphics uses. Nvidia wants its big R&D bucks back (and then some), and it's gonna get 'em.
No, it is to make GPUs more expensive for users with professional use cases in mind.
It won’t happen by another company coming up with faster chips. It will happen by another company coming up with cheaper, less energy-demanding chips, dominating the low end and then reaching higher.
While Nvidia might have won the training market, it’s inference where the real money is.
Screw performance, just give me a consumer priced card that has oodles of RAM.
Strange to see no mention of TPUs...
They are overhyped and not as performant as Nvidia regardless of marketing.
TPUs are like the NPU of the training world. You take a bunch of extra time, money and dedicated silicon and end up with an ASIC that barely competes on equal terms with a similarly priced GPU. Unless you've got access to Nvidia's TSMC supply, you're probably not going to make a dent in their demand.
Additionally - TPUs are completely useless if AI goes out of style, unlike CUDA GPUs. The great thing about Nvidia's hardware right now is that you can truly use the GPU for whatever you want. Maybe AI falls through in 2026, and now those GPUs can be used for protein folding or crypto mining. Maybe crypto mining and protein folding falls through - you can still use most of those GPUs for raster renders and gaming too! TPUs are just TPUs - if AI demand goes away, your dedicated tensor hardware is dead weight.
Also, TPU v1, v2 and v3 were ASICs, but since v4 they have added new features, so their performance/watt is lower and their power draw is quite near Nvidia's: I think Hopper is at 700W and TPUs are around 600W.
Power draw doesn't matter in cloud. All that matters is performance/price for the task in hand.
In the cloud, opex == energy.
Which is why you don't send 4 TPUs to do 1 GPU's job.
If the TPUs cost less, that implies that they draw less power. If they cost more then nobody will use them.
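Back-of-the-envelope, with every number below a made-up placeholder (not a vendor figure), this is how you'd check whether purchase price or electricity actually dominates:

    # Hypothetical 3-year cost split for one accelerator (all inputs are placeholders).
    chip_price_usd = 30_000      # assumed purchase price
    power_kw       = 0.7         # ~700 W board power
    pue            = 1.2         # assumed datacenter overhead
    usd_per_kwh    = 0.08        # assumed electricity rate
    hours          = 3 * 365 * 24

    energy_cost = power_kw * pue * usd_per_kwh * hours
    total       = chip_price_usd + energy_cost
    print(f"energy: ${energy_cost:,.0f}  ({energy_cost/total:.0%} of total)")
    print(f"capex:  ${chip_price_usd:,.0f}  ({chip_price_usd/total:.0%} of total)")

Whichever side of the argument you're on, the split depends heavily on the assumed chip price and electricity rate.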
The article says Microsoft and Meta bought more than 100k AMD chips each, so maybe they're making inroads.
Meta is an Nvidia shop for training and an AMD shop for inference. Strict separation between vendors.
Never? AMD can't write proper drivers... Most don't realize how far ahead NVIDIA is.
I think this is a vast understatement. Google has been using their own TPUs for a very long time now. I think they still have some GPUs from Nvidia, but it's marginal compared to their own silicon. Other big players are behind the curve on this front, but very much working to close the gap. These are companies with nearly infinite pockets, Google has shown that you can make it work without Nvidia, it's only a matter of time before others do it too.
See, I absolutely dislike this thought that hyperscalers can easily beat Nvidia. It is not their domain of expertise. TPUs are nowhere near GPUs in performance. People really underestimate Nvidia's expertise and strengths.
They don't need to beat them on performance though. If you get half the performance at third the price you can just make more chips and be fine. It's not like Google is gonna run out of datacenter space.
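The arithmetic is simple enough to spell out; numbers below are hypothetical placeholders, not benchmarks:

    # Hypothetical: chip B has half the performance of chip A at a third of the price.
    perf_a, price_a = 1.0, 3.0   # arbitrary units
    perf_b, price_b = 0.5, 1.0

    print("perf/$ for A:", perf_a / price_a)  # ~0.33
    print("perf/$ for B:", perf_b / price_b)  # 0.50 -> B wins per dollar; just buy more of them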
TPU support is very minimal. Also XLA just does not compare to CUDA yet.
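For context, the XLA path usually means JAX (or TensorFlow) code that gets jit-compiled for whatever backend is available (CPU, GPU or TPU). A minimal sketch, assuming jax is installed; nothing here is TPU-specific:

    import jax
    import jax.numpy as jnp

    @jax.jit                      # traced and compiled by XLA for the available backend
    def step(w, x):
        return jnp.tanh(x @ w)

    key = jax.random.PRNGKey(0)
    w = jax.random.normal(key, (256, 256))
    x = jax.random.normal(key, (32, 256))
    print(step(w, x).shape)       # (32, 256)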
And yet Google seems to be doing just fine while spending less on GPUs than other big tech, so reality doesn't seem to align with your comment.
https://www.hpcwire.com/2024/10/30/role-reversal-google-teas... Well they are still buying GPUs.
Mostly for external cloud customers, so it doesn't change my point - TPUs have been more than enough for Google to lead the pack.