Memory Scaling on Ryzen 7 with Team Group's Night Hawk RGB
by Ian Cutress & Gavin Bonshor on September 27, 2017 11:05 AM EST
A large number of column inches have been devoted to describing and explaining AMD's new underlying scalable interconnect: the Infinity Fabric. A superset of HyperTransport, this interconnect is designed to let AMD's CPUs and GPUs communicate quickly, at high bandwidth, low latency, and low power, with the ability to scale out to large systems. One result of the Infinity Fabric implementation on the processor side is that it runs at the frequency of the DRAM in the system, giving a secondary potential uplift in performance when using faster memory. Ryzen's memory performance has been an ever-raging topic of debate among enthusiasts, consumers and the general populace since the AGESA 1.0.0.6 BIOS updates were introduced several weeks ago. We dedicated some time to testing the effect of high-performance memory on Ryzen using Team Group's latest Night Hawk RGB memory.
Memory Scaling on Ryzen 7: AMD's Infinity Fabric
Often overlooked when outlining components for a new system, memory can play a key role in system operation. For the last ten years, memory speed has been generally inconsequential to performance for consumers: we tested this with DDR3 on Haswell and DDR4 on Haswell-E, and three major conclusions came out of that testing:
- As long as a user buys something above the bargain basement specification, performance is better than the worst,
- Performance tapers to a point with memory, very quickly hitting large price increases for little gain,
- The only major performance gain that scales comes from integrated gaming
So it is perhaps not surprising to read in forums that the general pervasive commentary is that “memory speed over DDR4-2400 does not matter and is a con by manufacturers”. This has the potential to change with AMD's Infinity Fabric, where the interconnect speed between sets of cores is directly linked with the memory speed. For any workload that transfers data between cores or out to main memory, the speed of the Infinity Fabric can potentially directly influence the performance. Despite the fact that pure speed isn’t always the ‘be all and end all’ of establishing performance gains, it has the potential to provide some gains with this new interconnect design.
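To make that link concrete, here is a minimal sketch that converts a DDR4 speed rating into the memory clock it implies. It assumes that on these first-generation Ryzen parts the fabric clock tracks the memory clock one-to-one, which is half the DDR transfer rate because DDR moves data twice per clock. The function names and the list of speed ratings are our own, chosen for illustration, and do not come from any AMD tool.

```python
def memory_clock_mhz(data_rate_mts: float) -> float:
    """Return the DRAM clock in MHz for a DDR data rate given in MT/s."""
    # DDR memory transfers data on both clock edges: two transfers per clock.
    return data_rate_mts / 2


def fabric_clock_mhz(data_rate_mts: float) -> float:
    """Return the fabric clock in MHz, ASSUMING it runs 1:1 with the memory clock."""
    return memory_clock_mhz(data_rate_mts)


if __name__ == "__main__":
    # Illustrative speed ratings only, not necessarily the straps tested here.
    for rate in (2133, 2400, 2666, 2933, 3200):
        print(
            f"DDR4-{rate}: memory clock {memory_clock_mhz(rate):.1f} MHz, "
            f"assumed fabric clock {fabric_clock_mhz(rate):.1f} MHz"
        )
```

Under that assumption, moving from DDR4-2400 to DDR4-3200 raises the fabric clock by a third, which is why memory speed may matter more on this platform than the forum commentary suggests.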
The Infinity Fabric (hereafter shortened to IF) consists of two fabric planes: the Scalable Control Fabric (SCF) and the Scalable Data Fabric (SDF).
The SCF is all about control: power management, remote management, security, and IO. Essentially, when data has to flow to elements of the processor other than main memory, the SCF is in control.
The SDF is where main memory access comes into play. There's still management here - being able to organize buffers and queues in order of priority assists with latency, and the organization also relies on a speedy implementation. The slide below is aimed more towards the IF implementation in AMD's server products, with features such as power control on individual memory channels, but it is still relevant to accelerating consumer workflows.
AMD's goal with IF was to develop an interconnect that could scale beyond CPUs, groups of CPUs, and GPUs. In the EPYC server product line, IF connects not only cores within the same piece of silicon, but silicon within the same processor and also processor to processor. Two important factors come into the design here: power (usually measured in energy per bit transferred) and bandwidth.
The bandwidth of the IF is designed to match the bandwidth of each channel of main memory, creating a solution that should potentially be unified without resorting to large buffers or delays.
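For a sense of the numbers involved, the sketch below works out the peak theoretical bandwidth of a DDR4 memory channel, which is the figure the fabric is designed to match. It relies only on the standard 64-bit (8-byte) width of a DDR4 channel; the function names and the dual-channel default are our own illustrative choices, and real sustained bandwidth will be lower than these peak values.

```python
BYTES_PER_TRANSFER = 8  # a standard DDR4 channel is 64 bits wide


def channel_bandwidth_gbs(data_rate_mts: float) -> float:
    """Peak theoretical bandwidth of one DDR4 channel in GB/s (10^9 bytes/s)."""
    # MT/s * bytes per transfer gives MB/s; divide by 1000 for GB/s.
    return data_rate_mts * BYTES_PER_TRANSFER / 1000


def platform_bandwidth_gbs(data_rate_mts: float, channels: int = 2) -> float:
    """Peak theoretical bandwidth across all channels (dual channel by default)."""
    return channel_bandwidth_gbs(data_rate_mts) * channels


if __name__ == "__main__":
    for rate in (2400, 3200):
        print(
            f"DDR4-{rate}: {channel_bandwidth_gbs(rate):.1f} GB/s per channel, "
            f"{platform_bandwidth_gbs(rate):.1f} GB/s dual channel"
        )
```

These are upper bounds set by the memory interface alone; they say nothing about how much of that bandwidth a given workload can actually use.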
Discussing IF in the server context is a bit beyond the scope of what we are testing in this article, but the point we're trying to get across is that IF was built with a wide scope of products in mind. On the consumer platform, while IF isn't necessarily used to such a large degree as in server, the potential for the speed of IF to affect performance is just as high.
AGESA 1.0.0.6 (aka AGESA 1006) and Memory Support
At the time of the launch of Ryzen, a number of industry sources privately disclosed to us that the platform side of the product line was rushed. There was little time to build full DRAM compatibility lists, even with standard memory kits in the marketplace, and this led to a few issues for early adopters trying to find matching kits that worked well without some tweaking. Within a few weeks this was ironed out once the memory vendors and motherboard vendors had time to test and adjust their firmware.
Overriding this was a lower than expected level of DRAM frequency support. During the launch, AMD had promised that Ryzen would be compatible with high speed memory, however reviewers and customers were having issues with higher speed memory kits (3200 MT/s and above). These issues have been addressed via a wave of motherboard BIOS updates built upon an updated version of the AGESA (AMD Generic Encapsulated Software Architecture), specifically up to version 1.0.0.6.
Given that the Ryzen platform itself has matured over the last couple of months, now is the time for a quick test of scalability on AMD's Zen architecture to see if performance scales consistently with raw memory frequency, or if any performance gains are achieved at all. For this testing we are using Team Group's latest Night Hawk RGB memory kit at several different memory straps under our shorter CPU and CPU gaming benchmark suites.
- The AMD Zen and Ryzen 7 Review: A Deep Dive on 1800X, 1700X and 1700
- The AMD Ryzen 5 1600X vs Core i5 Review: Twelve Threads vs Four at $250
- The AMD Ryzen 3 1300X and Ryzen 3 1200 CPU Review: Zen on a Budget
- The AMD Ryzen Threadripper 1950X and 1920X: CPUs on Steroids
- Retesting AMD Ryzen Threadripper’s Game Mode: Halving Cores for More Performance
Drumsticks - Wednesday, September 27, 2017
Interesting findings. I've seen Ryzen hailed on other simple forums like Reddit as having great scaling. There's definitely some at play, but not as much as I'd have thought.
How does this compare to Intel? Are there any plans to do an Intel version of this article?
ScottSoapbox - Thursday, September 28, 2017
2nd!
I'd like to see how much quad channel helps the (low end) X299 vs the dual channel Z370. With overlapping CPUs in that space it could be really interesting.
blzd - Thursday, September 28, 2017
Yes please compare to Intel memory gains, would be very interested to see if it sees less/more boost from higher speed memory.
Great article BTW.
jospoortvliet - Saturday, September 30, 2017
While I wouldn't mind another test, there have been plenty over the last years, as the authors also pointed out in the opening of the article, and the results were simple - it makes barely any difference, far less even than for Ryzen.
.vodka - Wednesday, September 27, 2017
Default subtimings in Ryzen are horribly loose, and there's lots of performance left on the table apart from IF scaling through memory frequency and more bandwidth. You've got B-die here, you could try these, thanks to The Stilt:
This has also been explored by AMD in one of their community updates, at least in games:
Ian Cutress - Wednesday, September 27, 2017
The sub-timings are determined by the memory kit at hand, and how aggressive the DRAM module manufacturer wants to make their ICs. So when you say 'default subtimings on Ryzen are horribly loose', that doesn't make sense: it's determined by the DRAM here. Sure there are adjustments that could be made to the kit. We'll be tackling sub-timings in a later piece, as I wanted Gavin's first analysis piece for us to be a reasonable task but not totally off the deep end (as our Haswell scaling piece showed, 26 different DRAM/CL combinations can take upwards of a month of testing). I'll be working with Gavin next week, when I'm back in the office from an industry event the other side of the world and I'm not chasing my own deadlines, to pull percentile data from his numbers and bring parity with some of our other testing.
xTRICKYxx - Wednesday, September 27, 2017
.vodka is right. Please investigate!
looncraz - Wednesday, September 27, 2017
AMD sets its own subtimings as memory kits were designed for Intel's IMC and the subtimings are set accordingly.
The default subtimings are VERY loose... sometimes so loose as to even be unstable.
.vodka - Wednesday, September 27, 2017
Sadly, that's the situation right now. We'll see if the upcoming AGESA 22.214.171.124 does anything to get things running better at default settings.
This article as is, isn't showing the entire picture.
notashill - Wednesday, September 27, 2017
There's a new AGESA 126.96.36.199b but AMD has said very little about what changed in it.