An Interview with AMD’s Forrest Norrod: Naples, Rome, Milan, & Genoa
by Dr. Ian Cutress on June 24, 2019 8:00 AM EST

There's no getting away from the fact that AMD's big revenue potential exists in the server space. While the glitz and the glamor is all about Ryzen, the big money is with EPYC. Heading up AMD's EPYC efforts, as well as their extended collaborations in the embedded and semi-custom segments, is Forrest Norrod, the SVP and GM of the Datacenter and Embedded Solutions Group at AMD. We spent some time at Computex talking with Forrest about the upcoming launch of Rome, and how AMD now sees itself in the server and enterprise ecosystem.
Forrest Norrod, SVP & GM, AMD DESC
Dr. Ian Cutress, AnandTech Senior Editor
Also present was a journalist from PCWatch.
As with our other interviews, the transcription has been lightly edited for ease of reading, and questions have been rearranged into topical segments.
Rome and the Future of EPYC
Ian Cutress: From our perspective, we see Rome being the next step-up in AMD's revenues. As good as all the Ryzen and Zen 2 announcements for the consumer market are, the argument is that the consumer market is flat to declining, with every competitor chasing the same overall volume but at higher ASPs. By contrast, how do you see the enterprise market growing as AMD looks forward to bringing out Rome in Q3?
Forrest Norrod: We are super excited. From our perspective, Naples was always about reintroducing ourselves to the business, the ecosystem, the customers, the partners, and building credibility that we were back in the game. We set relatively modest share gain ambitions for Naples, and we're pretty much on track to do what we thought we could do. From day one, I always said to the team that Naples doesn't matter if we don't deliver Rome, if we can't look at our customers and say Rome is on time. By the way, Rome doesn't matter if we can't look at our customers and say Milan is on time, and Milan doesn't matter if we can't tell customers that we're ready to deliver Genoa. I'm going to stop there because otherwise I'm going to give away code names we haven't said yet!
But for the removal of doubt, Genoa is the Zen 4 design. Here's an exclusive: the one after that is another Italian city. I know you're shocked. We always knew that having a regular cadence in the roadmap was extremely important, and so the strategy was to have Rome be a part that could further accentuate our throughput leadership, and get rid of as many asterisks as possible in terms of single-thread performance, memory latency, and so on. That was always the strategy. Then Milan was designed to further erase any asterisks that remained - in the original strategy, Milan was where we expected to be back to IPC parity (or better) across all workloads. That was the thought process behind the strategy.
Even with Intel's roadmap delays, we have continued to execute within a quarter of when we said we would - we've always come out within one quarter of where we planned it three years ago. Milan is right on track as well, but Intel is not, and so the relative competitiveness of Rome is actually better than we had originally planned. We're seeing tremendous excitement in our customer base around it, and I think that's really a function of it being more competitive, but it's also a function of the fact that our partners now believe us. Now that AMD has delivered the original part, and it works, and it's good quality, they can deploy it at scale. We told customers two years ago that we were going to come out with Rome in the middle of 2019, and now we're telling them that we're going to come out with Milan at a certain point in time, and they believe us. That belief helps a lot - the proof points and the predictable execution are very important.
IC: When Naples was announced, early access partners and cloud providers got it first. It was still another two quarters until general availability, and what we're hearing from our sources is that Rome will be on a similar cadence, for a variety of reasons. Can you explain what AMD's plan is here?
Forrest Norrod: No. I would say that normally, over the last 10 years, when a processor gets introduced, most of the OEMs have most of their platforms available within 30-60 days. Some will have some platforms available later, but that's normal. Then volume ships to lead HPC and cloud customers maybe 3 or 4 months earlier. I'd say we're right on track for that. So we're going to introduce Rome in Q3, and we're anticipating pretty good availability at launch and within that 30-60 day window.
IC: Just to put this into perspective, does AMD have enough FAEs* to support its customers?
(An FAE is a Field Application Engineer, whose duties may include assisting partners in system and product design.)
Forrest Norrod: That's interesting. What you might be hearing is that, compared to the current situation with Naples, we probably have quadruple the requests for Rome. The reality is there is so much interest in Rome that we are turning away platform ideas right now, and we have been turning away platforms for the last 6 months. But plotting it out in terms of the platforms that we are supporting, we're going to see a doubling of platforms with Rome.
PCWatch: I saw that market research from Mercury shows AMD has almost 2% share of the server market. When do you expect to reach double-digit market share?
Forrest Norrod: We normally look at IDC rather than Mercury, because Mercury includes a whole bunch of add-on parts that go into embedded applications, which I don't think are representative of the real server market. So if you peel those out, on IDC, or even on Mercury for the standard parts, we're roughly 5%. I think on that basis, what we've said in the past is that we think it's 4 to 6 quarters to get to 10%, from Q4 of last year. So call it Q4 this year to Q2 next year.
Some Core Facts
PCWatch: With the upcoming Rome CPUs, are they compatible with the Naples platform?
Forrest Norrod: That's right - you can take a Rome and drop it into your existing Naples platform with a new BIOS, and it'll work. However, you will only have PCIe Gen3 and DDR4-2666. If you upgrade the platform and re-do the design on the I/O channels, then you can use PCIe Gen4. So it's very similar to what we do on the client side. We aren't stating the memory speed yet, but on a new platform you do get more than DDR4-2666.
PCWatch: For the upcoming Rome platform, what about memory capacity? The Naples platform supports up to 2TB per socket - will Rome be over double that, and will you support the new ICs from Samsung (and Micron)?
Forrest Norrod: Rome will support more, both in memory speed and in memory capacity.
PCWatch: Support for DDR5?
Forrest Norrod: DDR5 is a different design. It will be on a different socket. We've already said Milan is a mid-2020 platform, and we've already said that's socket SP3, so DDR4 will still be used for Milan.
Frontier and Semi-Custom Solutions
PCWatch: For the Frontier supercomputer, powered by AMD, is that using the Milan platform? With that processor, and with NVIDIA as the GPU*, Frontier is a brand new system, but I don't fully understand the current remarks about the CPU. Is it Zen 3?
Forrest Norrod: Frontier is not Milan. It’s a custom CPU.
*It should be noted that Frontier uses AMD GPUs, not NVIDIA GPUs.
IC: When Intel says they have a custom CPU, all they are doing is changing the core count, frequency, the cache, and perhaps the TDP. Are you using custom CPU in the same context?
Forrest Norrod: Intel talks big about having custom CPUs for the cloud guys and it's just different TDPs or configurations. When I say we are going to use a custom CPU for Frontier, that is not what I mean. Under Intel’s definition, we already supply custom CPUs, for example, Amazon needed a different model so we provided it to them - you can't buy it anywhere else. But no, when I say custom CPU, I mean it's a different piece of silicon.
IC: Would you classify it as coming from the semi-custom division?
Forrest Norrod: You know, there's a blurred line there - some of the semi-custom resources are working on it. Here's one way to put it: we move engineers around between the semi-custom group and the Datacenter and Embedded Solutions Group.
PCWatch: AMD disclosed that Infinity Fabric is being used between the CPU and GPU. Is it going to become a common interface topology?
Forrest Norrod: Again, the thing about Frontier is I don't want to say anything to that level of specificity about something that is 2 years out. So let me hang onto that. It is a true statement with regards to Frontier that we use Infinity Fabric to connect the CPUs and the GPUs together, as well as GPU to GPU. We have also stated that Frontier is one CPU and four GPUs per system.
PCWatch: CCIX seems very suitable for connecting the CPU and GPU, but in Frontier you're not using CCIX, you're using Infinity Fabric. Why did you not choose CCIX?
Forrest Norrod: CCIX is a good interface, but was designed as a general purpose interface and is a standard. Defining standards requires you to accommodate a wide variety of devices on both sides of the wire. If you want to go for maximum performance, and can guarantee specific connectivity between two parts, you can increase the performance, but it won’t be a standard connection. And so the Infinity Fabric interconnect between the CPUs and the GPUs is lower latency, higher bandwidth, and it adds a number of attributes to really maximise the performance of that machine. But it's not a standard, you couldn't easily hook another device up to that.
IC: Is there anything about the Stadia that you'd like to tell us that we don't already know? Or do we have to ask Google?
Forrest Norrod: You would really have to ask Google. I'll say, just as a gamer myself, that when we first started talking to Google, and there were plans for a streaming game service, it was interesting, it was exciting and all that, but as you know the idea was not particularly new. But we were excited because the time may have finally come when we can do this properly. I'd say that as a gamer, when I first learned of the completeness of the vision they have, it wasn't just a streaming game service - they have rethought the social aspects of gaming. Also, if you decide to write games that only work in a cloud gaming environment, they can be extremely fine-tuned. The completeness and vision of Stadia in the way that you can find the games, play the games, interact with the games, and totally new ways to do that - I think that's the most powerful thing about it. The graphics technology is cool and all that, but I have to give props to Google on really thinking through those aspects.
PCWatch: Instinct is based on 7nm, so what about moving to the Navi architecture, and what about performance?
Forrest Norrod: There's going to be some overlap between the two. I think Lisa alluded to this earlier, where GCN and Vega will stick around for some parts and some applications, but Navi is really our new gaming architecture, so I don't want to go beyond that. You'll see us have parts for both gaming applications and non-gaming applications.
Power, Packaging, and Storage Class Memory
IC: With the new CPUs, looking from Rome to Milan to Genoa, AMD is obviously going to use the chiplet design and leverage it to the fullest. The push to more and more performance is going to be borne out not only through IPC but also through cores, and that in turn involves a lot of interconnect - we're seeing reports that the Infinity Fabric is slowly becoming half, if not more, of the power consumption of these chips. How does AMD manage that going forward?
Forrest Norrod: You know, I don't think it's that high - I don't want to give you the number, but it's not that high. But certainly interconnect on a multi-die architecture is a significant concern. We do build our own PHYs to make sure that we're optimising the performance and the pJ per data transition. Then there are other things that are factored in, like channel capacitance and general characteristics. So you should expect to see us continue to evolve all aspects of the system package and chiplet architecture to try to continue to manage power, while continuing to increase performance.
The thing that I will readily admit is that every watt of power you're spending on something that's not the CPU could be spent on the CPU cores instead. So we do completely believe that managing power is absolutely key. Actually, very early on when I first got to AMD, I said to the power management group that they're not the power management group - they are the performance management group. Because managing power in every other aspect, and being able to divert that power to clock frequency in the CPU, is where the performance gains come from.
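(To put "pJ per data transition" into perspective, here is a back-of-the-envelope sketch of how energy-per-bit turns into package-level power. The helper function and all figures below are illustrative assumptions on our part, not AMD numbers.)

```python
# Back-of-the-envelope interconnect power from energy-per-bit.
# All figures are illustrative assumptions, not AMD data.

def link_power_watts(pj_per_bit: float, gigabits_per_s: float) -> float:
    """Power (W) = energy per bit (J/bit) x bit rate (bit/s)."""
    return pj_per_bit * 1e-12 * gigabits_per_s * 1e9

# Hypothetical on-package link: 2 pJ/bit at 100 GB/s (= 800 Gbit/s).
per_link = link_power_watts(2.0, 800)
print(f"One link:    {per_link:.2f} W")      # 1.60 W

# A multi-die package with eight such links:
print(f"Eight links: {8 * per_link:.1f} W")  # 12.8 W
```

(At these assumed figures, a multi-die package spends double-digit watts on links alone, which is why PHY efficiency gets so much attention: every watt saved there can go to core clocks instead.)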
IC: So how is AMD investigating innovative packaging technologies?
Forrest Norrod: We continue to look at all of these aspects to make sure that we're pushing the technology forward.
PCWatch: In your future roadmaps, do you have any plans to use 3D packaging technology?
Forrest Norrod: We already use traditional 2.5D in GPUs, and you should expect us to continue to use it. I gave a talk at a university a couple of months ago and said that we're pushing forward on all sorts of packaging innovation. But you have to go up as well, and going up is not a panacea - if you stack, depending on how you stack, you can create thermal and power delivery issues. There is a large range of interesting problems that have to be solved.
IC: Intel has extended out to all the other aspects of the enterprise market in order to address more TAM than ever before, because that is the goal for investors. One of those aspects is non-volatile memory and Optane. What can AMD do in this space, or is it worth it, for the customers that you're going after, to do some sort of non-volatile memory?
Forrest Norrod: With all these forms of NVM, I do think that there are two value propositions that people have been talking about.
One is the non-volatile aspect, to blur the lines between memory and storage, so that customers get much better large-memory database machines and so on. On the whole non-volatile aspect, I think people are doing the software work to enable that on a broader range of applications, but at the end of the day, the fact that you still have a failure domain at the node level means that the value is relatively smaller. In theory you could trust a commit to just one machine, to the SCM on one machine, but realistically you're not going to commit until you have got a commit on multiple nodes. And so that tends to somewhat degrade the value that people were thinking about.
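(To make the failure-domain point concrete, here is a minimal sketch using a hypothetical replication API - not any specific database. Even though the node-local SCM write is fast, the commit cannot be acknowledged until replicas on other nodes confirm it, so network round trips still dominate the commit path.)

```python
# Minimal sketch of the failure-domain argument: even with fast local
# storage-class memory (SCM), a durable commit still waits on remote
# replicas. The node/replica API here is hypothetical.

def commit(txn, local_scm, replicas, quorum=2):
    """Acknowledge a transaction only once a quorum of copies exists."""
    local_scm.persist(txn)       # microseconds: node-local SCM write
    acks = 1                     # the local copy counts as one
    for node in replicas:
        if node.replicate(txn):  # network round trip dominates latency
            acks += 1
        if acks >= quorum:
            return True          # durable beyond one failure domain
    return False                 # too few replicas: do not acknowledge
```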
So the other aspect, of course, is lower cost per bit. They'll use it as a DRAM replacement, but the fact is that it has longer latency and non-uniform latency, and so there are a bunch of issues there. Notwithstanding that, it is close enough that it can be used as a DRAM replacement - I'd say there was probably more interest in that 12 months ago, when DRAM prices were at historically high levels. There is less interest now that DRAM prices have come way down, and I do think that DRAM/memory is a commodity market, and commodity markets have a very particular set of economic rules. The cure for high oil prices is high oil prices, right? Because that increases production, and that brings oil prices back down. The prospect of Optane being a replacement for DRAM would, in and of itself, bring the cost of DRAM down, regardless of the current market factors in play today.
But there are a lot of other storage class memory (SCM) technologies in development. I think you will see SCM settle into a niche of the memory hierarchy over the next 2-3 years, and I think there will be a lot of choices, not just Optane. But I don't think it's the be-all and end-all. I think that Intel has made a horrific mistake hitching their system architecture to a proprietary memory interface. I think they've made a key strategic mistake.
I think that in general, Intel may be forgetting what got them here. Truly having an open ecosystem, where others could add value to that ecosystem and the platform, is a key part of the success of the x86 market. Intel still talks about it in that way, but that's not what they are doing - any pico-acre of silicon that doesn't belong to Intel is something that they covet. But I think acting that way is to the detriment of the health of their platform ecosystem long term.
Many thanks to Forrest and his team for their time. Also thanks to Gavin Bonshor for the transcription.
Related Reading
- Naples, Rome, Milan, Zen 4: An Interview with AMD CTO, Mark Papermaster (Q4 2018)
- An Exclusive Media Interview at AMD Tech Day, with CEO Dr. Lisa Su (Q1 2018)
- Making AMD Tick: A Very Zen Interview with Dr. Lisa Su, CEO (Q1 2017)
- An AnandTech Exclusive: The Jim Keller Interview
- Talking Snapdragon: An Interview with Cristiano Amon, President of Qualcomm
- An Interview with Lisa Spelman, VP of Intel’s DCG
- Interview with Rod O’Shea, EMEA Embedded Group Director and Site Manager for Intel UK
- Interview with Jackson Hsu, Product Management Director at GIGABYTE
Comments
edzieba - Monday, June 24, 2019
Though rather hilarious given he had just finished excusing AMD for eschewing a standardised interconnect in favour of one customised for a better fit to the task. Exactly what Intel did with the Optane DIMMs once it became clear DDR4 was not suitable for an NVRAM interface without significant protocol modification. DDR5 was ratified long before any of that could be fed back upstream, so DDR6 would likely be the first viable standard that could be designed around native NVRAM support. The terribly named 'Optane Memory' on the consumer side uses standard NVMe over PCIe and can be treated as any other NVMe drive.

edzieba - Monday, June 24, 2019
"given he had just finished excusing AMD for eschewing a standardised interconnect in favour of one customised for a better fit to the task."Skipping over CCIX for Infinity Fabric, for those who skipped over the article to the comments.
IanCutress - Monday, June 24, 2019
That customized interconnect for Frontier is designed for a custom CPU + GPU system isolated to one supercomputer. Optane is designed to be used throughout the industry - at least on Intel.

edzieba - Tuesday, June 25, 2019
There are also Radeon Instinct cards that run over IF rather than CCIX (MI50 & MI60). The two situations are almost the inverse: AMD have selected their own proprietary interconnect over an existing standard for performance reasons. Intel have selected their own proprietary interconnect because no existing standard exists and existing interfaces simply did not work (same as Everspin did for their MRAM when facing the same issue with DDR).
deltaFx2 - Tuesday, June 25, 2019
Radeon Instinct, I believe, CAN run on IF if hooked up to AMD CPUs. Otherwise, they default to PCIe 4 (or maybe CCIX, not sure). The test for an open system is whether there is vendor lock-in. If a standard did not exist, Intel has the clout to define one, and people would build for it (like it finally did with Thunderbolt after keeping it proprietary for close to a decade). Obviously that would allow Micron and others to be compatible with Intel CPUs, and AMD and others to be compatible with Optane. Intel is clearly using Optane as a competitive advantage. Can AMD or ARM build an Optane-compatible interface to allow Optane modules to be plugged in? Can anyone else use Optane? I think you know the answer.

AMD could make IF open like they did with HyperTransport. However, there were already 3 open coherent interconnect proposals out there well before IF came about: CAPI, GenZ, CCIX. Intel has its own version called Omnipath. Does the world need another interconnect specification, and if so, what problem does it uniquely solve? What uptake do you expect to see if IF were made public? Could they have used CCIX instead of IF as their interconnect? Probably, but these designs were probably done 5-6 years ago.
deltaFx2 - Tuesday, June 25, 2019
Oh, forgot, Intel is now backing CXL. So 4 open standards. Why does Intel think the world needs CXL? My guess is, not invented here.

jamescox - Thursday, June 27, 2019
It is in AMD's and Intel's best interest to use a proprietary connection that is only available to their own GPUs. The two planned exascale supercomputers in the US are single vendor: one is AMD CPUs and GPUs, while the other will be Intel CPUs and GPUs. If they wanted to use NVIDIA, they would need to fall back to PCI Express for use with AMD or Intel CPUs. This is why NVIDIA has been talking about ARM for supercomputers: they have ARM processors, such that they could make high-performance ARM cores with NVLink.

extide - Monday, June 24, 2019
Yeah, I think Frontier is really a separate specific case here. I think Forrest would have liked to see Intel create an open NVDIMM standard that others could easily use, instead of just one which is proprietary to them.

Kevin G - Monday, June 24, 2019
Lots of interesting discussion. Just as much was left unsaid as was said, because you were asking the right questions.

Obviously it's not directly stated, but I would say it is implied that AMD is going to be pursuing more advanced packaging techniques for their future server CPUs. Leveraging interposers does lower power consumption vs. the traditional wire bonding that is used in the Rome packaging. It is interesting to hear that the amount of power being spent on IO is not as large as is being portrayed outside of AMD. This would imply that a change to interposers or similar packaging technologies is further down the road than believed.
The Frontier comments are interesting, and I wish more pressure had been put on this point. With AMD's chiplet strategy, the semi-custom part may only be the IO die, to account for the desired number of memory channels, PCIe lanes, and Infinity Fabric links. The CPU die side of Frontier may just be the commodity die. To follow up my comments on packaging, this also could be what makes Frontier custom: the CPU and IO dies could be part of a massive interposer package that includes the four GPU dies and HBM stacks. Cooling could be a problem, but such packaging is feasible for a project like Frontier. AMD has options here, and that should be followed up.
The other interesting take is the thoughts on non-volatile memory. The pricing scenario vs. DRAM today is correct, but I would question how much impact Optane would have had if DRAM pricing were near its peak, as it was ~18 months ago. Intel certainly would have had greater demand than they do right now, but the flip side is their capability to match that demand: DRAM pricing would only be impacted by Optane if Intel could meet their own demand.
I would disagree with the comments regarding non-volatile memory and commits. It is indeed true that for enterprise DB systems you want that commit to be replicated to provide redundancy. However, that is true for Optane *AND* existing systems today: data should never rest at a single point. Thus the cost equation between Optane and existing technologies wouldn't change in this regard, as the comparison on both sides should inherently include redundancy as part of the baseline. How that is done may change, but that has recently happened already with the transition from SAS-based primary storage pools to NVMe.
guycoder - Monday, June 24, 2019
Maybe AMD are directly integrating the Cray Slingshot interconnect into the IO die for Frontier? Also, given the acquisition of Cray by HPE, that might lead to some interesting possibilities for HPE in the server space to separate themselves commodity-wise.