Following a relative lull in the desktop memory industry in the previous decade, the past few years have seen a flurry of new memory standards and form factors enter development. Joining the traditional DIMM/SO-DIMM form factors, we've seen the introduction of space-efficient DDR5 CAMM2s, their LPDDR5-based counterpart the LPCAMM2, and the high-clockspeed optimized CUDIMM. But JEDEC, the industry organization behind these efforts, is not done there. In a press release sent out at the start of the week, the group announced that it is working on standards for DDR5 Multiplexed Rank DIMMs (MRDIMM) for servers, as well as an updated LPCAMM standard to go with next-generation LPDDR6 memory.

Just last week Micron introduced the industry's first DDR5 MRDIMMs, which are timed to launch alongside Intel's Xeon 6 server platforms. But while Intel and its partners are moving full steam ahead on MRDIMMs, the MRDIMM specification has not been fully ratified by JEDEC itself. All told, it's not unusual to see Intel pushing the envelope here on new memory technologies (the company is big enough to bootstrap its own ecosystem). But as MRDIMMs are ultimately meant to be more than just a tool for Intel, a proper industry standard is still needed – even if that takes a bit longer.

Under the hood, MRDIMMs continue to use DDR5 components, form-factor, pinout, SPD, power management ICs (PMICs), and thermal sensors. The major change with the technology is the introduction of multiplexing, which combines multiple data signals over a single channel. The MRDIMM standard also adds RCD/DB logic in a bid to boost performance, increase capacity of memory modules up to 256 GB (for now), shrink latencies, and reduce power consumption of high-end memory subsystems. And, perhaps key to MRDIMM adoption, the standard is being implemented as a backwards-compatible extension to traditional DDR5 RDIMMs, meaning that MRDIMM-capable servers can use either RDIMMs or MRDIMMs, depending on how the operator opts to configure the system.
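To make the multiplexing concept a bit more concrete, here is a minimal, purely conceptual sketch (not the JEDEC specification or any vendor's actual data buffer logic) of how two ranks running at the base DRAM rate could be interleaved onto a single host channel running at twice that rate; the function names and the 8-beat example are illustrative assumptions only.

```python
# Conceptual sketch only: an MRDIMM-style data buffer multiplexes two ranks,
# each running at the base DRAM rate, onto one host channel running at twice
# that rate by interleaving data beats from the two ranks.

def mux_ranks(rank_a_beats, rank_b_beats):
    """Interleave data beats from two ranks onto one host-side stream."""
    host_stream = []
    for a, b in zip(rank_a_beats, rank_b_beats):
        host_stream.extend([a, b])  # host side carries 2x the beats per unit time
    return host_stream

def demux_host(host_stream):
    """Split the host-side stream back into per-rank streams (the reverse path)."""
    return host_stream[0::2], host_stream[1::2]

# Example: each rank supplies 8 beats; the host sees 16 beats at double the rate.
rank_a = [f"A{i}" for i in range(8)]
rank_b = [f"B{i}" for i in range(8)]
host = mux_ranks(rank_a, rank_b)
assert demux_host(host) == (rank_a, rank_b)
```

The key point is simply that the host-facing side carries twice the data beats per unit time, which is where the bandwidth doubling comes from.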

The MRDIMM standard aims to double peak bandwidth to 12.8 Gbps by increasing pin speeds and supporting more than two ranks. Additionally, a "Tall MRDIMM" form factor is in the works (and pictured above), which is designed to allow for higher-capacity DIMMs by providing more area for laying down memory chips. Currently, ultra-high-capacity DIMMs require expensive, multi-layer DRAM packages that use through-silicon vias (3DS packaging) to attach the individual DRAM dies; a Tall MRDIMM, on the other hand, can simply use a larger number of commodity DRAM chips. Overall, the Tall MRDIMM form factor enables twice the number of single-die DRAM packages on the DIMM.
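As a rough back-of-the-envelope check (assuming a standard 64-bit data path per module and ignoring ECC bits and protocol overhead), the jump from DDR5-6400 to a 12.8 GT/s MRDIMM works out to a doubling of peak per-module bandwidth:

```python
# Rough peak-bandwidth arithmetic; assumes a 64-bit (8-byte) data path per
# module and ignores ECC bits and any protocol/refresh overhead.

def peak_bandwidth_gb_s(transfer_rate_gt_s: float, bus_width_bits: int = 64) -> float:
    return transfer_rate_gt_s * bus_width_bits / 8  # GB/s

print(peak_bandwidth_gb_s(6.4))   # DDR5-6400 RDIMM:       51.2 GB/s
print(peak_bandwidth_gb_s(12.8))  # 12.8 GT/s MRDIMM goal: 102.4 GB/s
```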

Meanwhile, this week's announcement from JEDEC offers the first significant insight into what to expect from LPDDR6 CAMMs. And despite LPDDR5 CAMMs having barely made it out the door, some significant shifts in LPDDR6 itself mean that JEDEC will need to make major changes to the CAMM standard to accommodate the newer memory type.


JEDEC Presentation: The CAMM2 Journey and Future Potential

Besides the higher memory clockspeeds allowed by LPDDR6 – JEDEC is targeting data transfer rates of 14.4 GT/s and higher – the new memory form factor will also incorporate an altogether new connector array. This is to accommodate LPDDR6's wider memory bus, which sees the channel width of an individual memory chip grow from 16-bits wide to 24-bits wide. As a result, the current LPCAMM design, which is built to match the PC standard of a cumulative 128-bit bus (eight 16-bit subchannels), needs to be reconfigured to match LPDDR6's new layout.

Ultimately, JEDEC is targeting a 24-bit subchannel/48-bit channel design, which will result in a 192-bit wide LPCAMM. Meanwhile, the LPCAMM connector itself is set to grow from 14 rows of pins to possibly as many as 20. New memory technologies typically require new DIMMs to begin with, so none of this is unexpected, but it does mean that the LPCAMM will be undergoing a bigger generational change than what we usually see.
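For a quick sense of what that reconfiguration means in raw numbers, here is a back-of-the-envelope sketch assuming the LPDDR6 LPCAMM keeps the same eight-subchannel arrangement as today's 128-bit LPDDR5 LPCAMM2 and runs at JEDEC's stated 14.4 GT/s starting point (overheads ignored; these are illustrative figures, not spec values):

```python
# Illustrative module-width and bandwidth math; assumes eight subchannels per
# LPCAMM and ignores any command/address or protocol overhead.

SUBCHANNELS = 8
LPDDR5_SUBCHANNEL_BITS = 16   # today's LPCAMM2: 8 x 16-bit = 128-bit module
LPDDR6_SUBCHANNEL_BITS = 24   # proposed:        8 x 24-bit = 192-bit module

print(SUBCHANNELS * LPDDR5_SUBCHANNEL_BITS)   # 128
print(SUBCHANNELS * LPDDR6_SUBCHANNEL_BITS)   # 192

# Peak throughput of a 192-bit module at 14.4 GT/s:
print(SUBCHANNELS * LPDDR6_SUBCHANNEL_BITS * 14.4 / 8)   # ~345.6 GB/s
```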

JEDEC is not saying at this time when they expect either memory module standard to be completed. But with MRDIMMs already shipping for Intel systems – and similar AMD server parts due a bit later this year – the formal version of that standard should be right around the corner. Meanwhile, LPDDR6 CAMMs will be a bit farther out, particularly as the memory standard itself is still under development.

Source: JEDEC

Comments

  • dotjaz - Wednesday, July 24, 2024 - link

    It's cute you think DDR5 can actually just fetch 32 bits to transfer 32 bits of data per clock per channel.

    Also why would you assume per channel width would go backwards from 32-bit to 24-bit?

    Literally for one channel you already get 96 bits per cycle. On any sensible system you would get 192/384-bits per cycle.

    That's not to mention there's absolutely no way you are waiting 40ns - that's 200 CPU cycles or over 100 DDR cycles - just to transfer one cycle of data, especially when the next 7 cycles of data have zero additional latency thanks to 16n prefetch.

    That means for a 96-bit module (2x 48-bit), you would typically get 2x48x16=2x768 bits or 2x96 bytes per "burst" over 8 cycles or 16 transfers.

    Why would a "word" - that's 16 bits BTW - or even a quadword (64 bits) be of any concern?
  • erinadreno - Wednesday, July 24, 2024 - link

    I don't think the burst has much impact here. A 192-bit bus used to mean hexa-channel (although I don't see any), meaning the DDR controller can get six 32-bit words per cycle from 6 different addresses. Now it's only quad-channel, which means you can only get four 48-bit words from 4 addresses. It's okay if the data is contiguous, but that seems to limit the flexibility of the DRAM a lot, if I understand correctly.

    And DDR5 can fetch 32 bits per cycle, just not from the address issued in the current cycle. But that's another story, not related to the channel organization.
  • dotjaz - Wednesday, July 24, 2024 - link

    Obviously I'm only assuming 16n for DDR6; with 32n it would be even more absurd to worry about the impact of fetching just one quadword.
  • lightningz71 - Friday, July 26, 2024 - link

    While your individual program MIGHT issue a fetch for a single byte, word, or even DWORD, once that filters down to the memory controller, through multiple levels of cache and memory managers, I can assure you that whatever request gets sent to the actual memory modules is NOT, in almost any case, a request for a single byte or word of information. Instead, the memory manager and cache controllers are passing block requests to the memory controller, which fulfills them as quickly as possible. Widening the bus to the modules by 50% will just allow those same block requests to resolve in about 33% less time.

    In a bigger picture, the memory arrangement of a modern computer is MASSIVELY abstracted away from what an individual program can see.
  • James5mith - Wednesday, July 24, 2024 - link

    Dumb question, but can LPCAMM support multiple channels?

    I.e., current-gen Ryzen has quad-channel memory controllers. Are the mobile systems with LPCAMM that are launching getting the full quad-channel bandwidth equivalent?

    I honestly haven't seen much discussion about how it's addressed in the general tech media's coverage of LPCAMM so far.
  • Ryan Smith - Thursday, July 25, 2024 - link

    "Dumb question, but can LPCAMM support multiple channels?"

    LPDDR5 LPCAMM2 features a 128-bit memory bus. So yes, it technically already incorporates multiple "channels", though that concept is slowly losing meaning.

    As such, one module is sufficient for current and next-generation AMD and Intel CPUs.
  • back2future - Thursday, July 25, 2024 - link

    "though that concept is slowly losing meaning"
    Does this require emphasizing bandwidth throughput on IO lines (instead of channel counts, versus multiplexed rank access) towards a memory socket (including memory controller and power supply, or even a dedicated memory (AI) optimizer)?
  • back2future - Thursday, July 25, 2024 - link

    LPDDR5 with Samsung is now at 10.7 GT/s, which is ~10.7 Gbps per pin since there's no protocol overhead like with PCIe(?). For a 64-bit interface that's ~85.6 GB/s, and possibly double that for a 4x32-bit wide data bus (LPDDR4 is 64-bit with a 6-bit SDR command/address bus width; LPDDR5 is 32-bit with a 7-bit DDR cmd/address bus width for each channel?)
  • Diogene7 - Friday, July 26, 2024 - link

    Is there (still) any ongoing work to enable emerging Non-Volatile Memory (NVM) / Persistent Memory (PM) like SOT-MRAM to be used as Persistent DRAM?

    I understand that making an 8 GB or larger SOT-MRAM memory module to be used as Persistent DRAM would likely be crazy expensive (likely even more than HBM), but PM would enable dramatically lower latencies, likely first in some niche server / military use cases, but if volume can be scaled up to significantly lower the cost, then in consumer use cases (ex: smartphones and IoT devices with Non-Volatile Memory as main memory,…)
