Marvell Refocuses Thunder Server Platforms Towards Custom Silicon Business
by Andrei Frumusanu on August 28, 2020 4:00 PM EST
Yesterday, during Marvell’s quarterly earnings call, the company made a surprise announcement: it plans to restructure its server processor development team towards fully custom solutions, abandoning plans for “off-the-shelf” product designs.
The relevant earnings call statements are as follows:
"Very much aligned with our growing emphasis on custom solutions, we are evolving our ARM-based server processor efforts toward a custom engagement model.
[…]
Having worked with them for multiple generations, it has become apparent that the long-term opportunity is for ARM server processors customized to their specific use cases rather than the standard off-the-shelf products. The power of the ARM architecture has always been in its ability to be integrated into highly customized designs optimized for specific use cases, and we see hyperscale data center applications is no different. With our breadth of processor know-how and now our custom ASIC capability, Marvell is uniquely positioned to address this opportunity. The significant amount of unique ARM server processor IP and technology we have developed over the last few years is ideal to create the custom processors hyperscalers are requesting.
Therefore, we have decided to target future investments in the ARM server market exclusively on custom solutions. The business model will be similar to our ASIC and custom programs where customers contribute engineering and mask expenses through NRE for us to develop and produce products specifically for them. We believe that this is the best way for us to continue to drive the growing adoption of ARM-based compute within the server market."
We’ve had the opportunity to make a follow-up call with the teams at Marvell to get a little more background on the reasoning for such a move, given that only 6 months ago during the launch of the ThunderX3, the company had stated they were planning to ship products by the end of this year.
Effectively, as we’ve come to understand it, Marvell views the Arm server market at this moment in time as purely concentrated around the big hyperscaler customers, which have specific workload requirements that in turn call for specific architecture optimisations.
Marvell does not see the market beyond these hyperscaler customers as significant enough to be worth engaging in, and thus the company prefers to refocus its efforts towards closer collaboration with hyperscaler customers and fulfilling their needs.
The statement paints a relatively bleak view of the open Arm server market right now. Marvell says it does not rule out off-the-shelf products and designs in several years’ time, when and if Arm servers become ubiquitous, but that such products are not the best financial strategy for the current ecosystem. That’s quite a harsh view of the market and calls into question the ambitions of other Arm server vendors such as Ampere.
The company seemed very upbeat about the custom semicon design business, and they’re seemingly seeing large amounts of interest in the latest generation 5nm custom solutions they’re able to offer.
The company stated during the earnings call that it still plans to ship the ThunderX3 by the end of this year; however, it will only be available through customer-specific engagements. Beyond that, it’s looking for custom opportunities with its hyperscaler customers.
That also means we won't be seeing public availability of the dual-die TX3 product, nor of the in-design TX4 chip. That is unfortunate, given that the company presented its chip roadmap through 2022 only a few weeks ago at Hot Chips; that roadmap is now defunct.
Although the company states that it’ll continue to leverage its custom IP for the future, I do wonder if the move has anything to do with Arm’s recent rise in the datacentre, and their very competitive Neoverse CPU microarchitectures and custom interconnects, essentially allowing anybody to design highly customizable products in-house, creating significant competition in the market.
From Marvell’s perspective, this all seems to make perfect sense, as the company is simply readjusting towards where the money and the maximum revenue growth opportunities lie. Winning a hyperscaler and keeping it already amounts to a significant slice of the total market, and I think that’s what Marvell’s goal is for the next several years.
Related Reading:
- Hot Chips 2020: Marvell Details ThunderX3 CPUs - Up to 60 Cores Per Die, 96 Dual-Die in 2021
- Hot Chips 2020 Live Blog: Marvell ThunderX3 (10:30am PT)
- Marvell Announces ThunderX3: 96 Cores & 384 Thread 3rd Gen Arm Server Processor
- Assessing Cavium's ThunderX2: The Arm Server Dream Realized At Last
- Marvell Unveils its Comprehensive Custom ASIC Offering
- Next Generation Arm Server: Ampere’s Altra 80-core N1 SoC for Hyperscalers against Rome and Xeon
- Ampere’s Product List: 80 Cores, up to 3.3 GHz at 250 W; 128 Core in Q4
- Amazon's Arm-based Graviton2 Against AMD and Intel: Comparing Cloud Compute
42 Comments
demian_thorne - Sunday, August 30, 2020 - link
Wilco1,
before getting to the main argument here let me point out some of the discrepancies in your post (at least thats how it seems)
1. The only performance test that I have seen between Graviton2 and EPYC is with EPYC1 which is 3 years old. EPYC3 will be released by end of 2020, don't you think this is not a fair comp?
2. The size of the silicon that you mentioned I believe refers to EPYC2 which like I said was not part of the comp test.
3. One very large part of the said chip is the L3 cache - EPYC2 has basically twice as much
Those arguments on tech web sites can go on forever so let me pass to the main point. There is not yet an apples to apples comparison test, is there? I mean these SPECint tests are not irrelevant - not calling them apples to oranges but they are not apples to apples either.
You know why? Because there is no comparable profitable software that has been tuned to run on both platforms. If that was the case then we could talk about chip sizes (it's not just the size of the core, it is the cache, the mem controllers the intra coms etc) ... If we could run SAP on both platforms then we could see how each chip is finally architected with all the bells and whistles to perform. And then we could make a fair comp.
And that is what you are missing. Nobody will save hardware money only to spend an order of magnitude more on re-architecting software.
>any company which spends many millions a year on x86 servers would be stupid not to try Arm
yet they don't because they will spend billions on the software side to get the benefits out of it.
See the story of AMD vs INTC. Much more clear apples to apples comparison there and yet AMD just got to 10% of the market which is of the total market not just the hyperscalers (say what 6% there?)
Same ISA - but yet the hyperscalers didn't jump to this cost saving opportunity as much as you would think, did they?
There is a reason those MBAs are still needed in SV. LMK if you still disagree...
DT
Wilco1 - Monday, August 31, 2020 - link
1. Where did I compare with EPYC 1? Next year we can compare next-gen EPYC with next-gen Arm, but today we've got EPYC 2, Graviton 2 and Ampere Altra. That's a fair comparison.
2. EPYC 1, like anything Intel, is so far behind it's not worth looking at.
3. Actually EPYC 3 has 8 times as much L3 cache than Graviton 2. It improves performance but it also has a huge area penalty. Graviton 2 is able to get similar performance using a fraction of the silicon and power. How is that not a huge advantage?
A lot of server software has already been ported and optimized for Arm. You can find many benchmarks using Graviton 2 if you cared to look. The general conclusion is that most software works efficiently out of the box. So the old "it's prohibitively expensive to port software to Arm" excuse is simply false. Years ago Cloudflare blogged on how easy it was to port all their software to Arm (it took one guy a few weeks, including full optimization). Claiming it costs billions is ridiculous and only shows how biased you are.
So yes I disagree. AMD can't offer the same cost savings as Arm, high throughput per Watt or more than 64 cores, so it's not nearly as attractive for hyperscalers. Graviton is only the first step, I expect many of the hyperscalers to move to Arm in the coming years.
demian_thorne - Monday, August 31, 2020 - link
The only comparison I have seen between EPYC and Graviton2 is on Anandtech and that was EPYC1. But do you care to backup with facts your statement about lots of server software ported to Arm. Either links or websites or google phrases?
Hell it is difficult to make an apples to apples comparison between Android and iPhone and you say there is lots to compare between Arm and x86?
Links - Websites - Google phrases so you can educate me :)
Wilco1 - Monday, August 31, 2020 - link
AnandTech also reviewed EPYC 7742. The new EPYC 2 instances didn't get much coverage or benchmarking however.
Arm has various blogs, a recent one I thought was interesting: https://community.arm.com/developer/tools-software...
That shows both Mentor and Cadence EDA tools have been ported to Arm and look fast on Graviton 2. So Arm now uses Arm servers to make better Arm CPUs! A few more:
https://docs.keydb.dev/blog/2020/03/02/blog-post/
https://idk.dev/improving-performance-of-php-for-a...
https://blog.treasuredata.com/blog/2020/03/27/high...
https://extricate.org/2020/05/13/benchmarking-the-...
So a wide variety of server software runs faster on Graviton 2 and is cheaper as well.
Industry_veteran - Monday, August 31, 2020 - link
Wilco1,
Do you have total power consumption comparison for entire SoC of server grade ARM vs Intel?
Giving single core simulated numbers doesn't show the whole picture.
Also SPECint numbers are just one parameter. If you ask hyper scale customers they will give you a long list of applications that they consider for comparison.
Also to note here - there is an important difference between having any software just ported to ARM and having software optimized for ARM. These are two different things. Intel has been working with software community for decades. There is no question ARM can do it but unfortunately no ARM hardware vendor is willing to stay in the game for decades and sustain big losses year after year with the hope that someday ARM will have equal footing with Intel in software development community. For this reason, almost all big companies backed out of ARM server one by one like AMD, Broadcom, Qualcomm and now Cavium/Marvell. It makes no sense to spend 100 million dollars a year to make less than 10 million dollar revenue unless there is a path that clearly shows company is going to turn it around in future.
Wilco1 - Tuesday, September 1, 2020 - link
Industry_veteran,
That graph proves that your assertion that a beefy core uses the same power as Intel/AMD is clearly false. A core can be beefy and beat Intel/AMD using far less power. Graviton 2 uses about 110W max, which is about half the power of a 64-core EPYC 2 and about a quarter Intel needs for equivalent performance.
There are lots of Arm developers working on optimizing compilers and tuning software so Arm code works well on Linux, Android, iOS, Windows etc. This is not something a single company is doing, and it certainly doesn't take many decades nor cost billions (I already mentioned that Cloudflare did port and tune all their internal code in a few weeks).
Besides grossly exaggerating the costs/effort, you're also overestimating the gains from tuning. If, as expected, next-generation Arm servers have significantly higher per-core performance than Intel/AMD, how does the last 10% from tuning matter? It's the icing on the cake, nice but not essential.
It sounds to me both you and Demian are desperately trying to change the goal posts to spin Intel/AMD in a positive light. AMD and Intel simply cannot compete with Arm's lightning pace of innovation and annual 20-30% performance gains. I bet we'll see 5nm Arm servers with DDR5 before Intel/AMD.
Industry_veteran - Tuesday, September 1, 2020 - link
Wilco1,
I think you are comparing apples with oranges when you compare so called custom SoC like Graviton 2 with general purpose CPUs from Intel and AMD.
I have already mentioned that custom ARM chips made by hyper scale customer like Amazon have inherent advantages because they are made for specific narrow workloads and have large guaranteed market.
Another point is that for general purpose chips you need to see power consumption per SoC under variety of different workloads. Your statements like "Graviton 2 uses about 110W max, which is about half the power of a 64-core EPYC 2 and about a quarter Intel needs for equivalent performance" makes me wonder, under what conditions, benchmarks and workloads etc.?
I do not have any affinity towards Intel. However, I have been through this at a major ARM chip maker. I know how these numbers games are played.
Just think about it for a minute. If general purpose ARM server SoCs were so good then customers would be jumping up and down to get them given the lower costs of ARM cpu SoCs. For hyper scale customers in particular who buy hundreds of thousands of servers each year, just few dollar savings per chip will result in massive amount of savings. However, you see all major chip companies like AMD, Broadcom, Qualcomm, and now Marvell got out of that market one by one. All these companies had made massive R&D investment in ARM server SoCs. Do you think they were not thinking properly when they got out of that business?
Wilco1 - Wednesday, September 2, 2020 - link
Claims that an Arm SoC isn't suitable for X is literally from Intel's FUD playbook - apparently only x86 can access the internet and is suitable for phones! We know how that one played out...
Yes Graviton 2 is custom but server CPUs are not designed for one specific workload. Neoverse N1 does particularly well on anything you throw at it (as my links show) - if anything it is more general purpose than any previous microarchitecture! So calling it special purpose without a shred of evidence is Intel-style FUD.
We can debate history forever, but the bottom line is that designing a server is expensive and needs volume to be sustainable. Previous attempts used older processes and the microarchitectures were not as advanced, so Neoverse N1 on 7nm is a huge leap forward. That makes it much easier to get design wins, and hyperscalers is where the volumes, money and growth are. Once you've got volume, you can branch out into the wider server market.
Industry_veteran - Wednesday, September 2, 2020 - link
Wilco1,
If you are claiming that general purpose CPU vendors don't make chips with particular workloads in mind then I don't know what to say!
Instead of answering my common sense question about why all big name, deep pocket companies got out of general purpose ARM server market and that too after spending significant sums of money? you chose to punt and call it FUD.
No one is denying hyperscales is where the volumes are but that doesn't mean that is where the money is, if you are a chip maker like Marvell. As I said hyper scale customers like Amazon and perhaps Google or Facebook can easily put together a team and get their own custom SoC chips designed using ARM cores. They do not need likes of Broadcom, Marvell etc. It validates everything I said about the lack of interest in general purpose ARM server SoCs.
Wilco1 - Thursday, September 3, 2020 - link
Let me give you a clear example: A64FX is not a general purpose server chip. Everything is designed around HPC, its microarchitecture implies it will be terrible at SPEC, but give it proper vectorized and optimized FP code and it will beat anything else. Obviously Graviton 2 is not like that at all. Throw any random code at it and it will do well and compete with EPYC 2. So in what way is it not general purpose server chip?
I did answer your question: "the bottom line is that designing a server is expensive and needs volume to be sustainable. Previous attempts used older processes and the microarchitectures were not as advanced, so Neoverse N1 on 7nm is a huge leap forward. That makes it much easier to get design wins, and hyperscalers is where the volumes, money and growth are. Once you've got volume, you can branch out into the wider server market."
Yes some of the large hyperscalers may design their own SoC since that is now feasible with recent high-end Arm cores and 7nm/5nm TSMC. But smaller hyperscalers will use TX3 or Ampere Altra, and Nuvia in the future. The market is large and growing.