X-Gene 1, Atom C2000 and Xeon E3: Exploring the Scale-Out Server World
by Johan De Gelas on March 9, 2015 2:00 PM EST
HP Moonshot
We discussed the HP Moonshot back in April 2013. The Moonshot is HP's answer to SeaMicro's SM15000: a large 4.3U chassis with no fewer than 45 cartridges that share three different fabrics: network, storage, and clustering. Each cartridge can contain one to four micro servers or "nodes". Just like a blade server, cooling (five fans), power (four PSUs), and uplinks are shared.
Back in April 2013, the only available cartridge was based on the anemic Atom S1260, a real shame for such an excellent chassis. Since Q4 2014, HP has offered six different cartridges ranging from the Opteron X2150 (m700) to the rather powerful Xeon E3-1284Lv3 (m710). The different models are all tailored to specific workloads. The m700 is meant to be used in a Citrix virtual desktop environment while the m710 is targeted at video transcoding. We tested the m400 (X-Gene, 2.4GHz), m300 (Atom C2750), and m350 (four Atom C2730 nodes) cartridges.
The m400 is the first server we have seen that uses the 64-bit ARMv8 AppliedMicro X-Gene. HP positions the m400 as the heir of mobile computing, and touts its energy efficiency. Other differentiators are memory bandwidth and capacity. The X-Gene has a quad-channel memory controller and as a result is the only cartridge with eight DIMMs. We were very interested in understanding how X-Gene would compare to the Intel Xeons. HP positions the m400 as the micro server for web caching (memcached) and web applications (LAMP). The m400 also comes with beefy storage: you can order a 480GB SSD with a SATA or M.2 interface.
The m300 cartridge is based on the Atom C2750 with support for up to 32GB of RAM. HP positions this cartridge as "web infrastructure in a box". The m400 is mostly about web caching and the web front-end while the m300 seems destined to run the complete stack (front- and back-end). However, it is clear that there is some overlap between the m300 and m400 as there's nothing to stop you from running a complete "web infrastructure" on the m400 if it runs well in 32GB or less.
The m350 cartridge is all about density: you get four nodes in one cartridge. There is a trade-off, however: you are limited to 16GB of RAM and can only use M.2 flash storage, limited to 64GB.
Each node of the m350 is powered by one of Intel's most interesting SKUs, the 1.7GHz 8-core Atom C2730, which has a very low 12W TDP. The m350 is positioned as a way to offer managed hosting on physical (as opposed to virtualized) servers in a cost-effective way.
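To put the m350's density in perspective, the figures quoted above can be multiplied out. The sketch below assumes a chassis filled entirely with m350 cartridges, which is a hypothetical configuration used only for the arithmetic. The function name `chassis_totals` is illustrative, and the TDP sum covers the CPUs only, not memory, storage, fabric, or fans.

```python
# Figures taken from the article text. A chassis populated only with
# m350 cartridges is an assumption made for this back-of-the-envelope sum.
CARTRIDGES_PER_CHASSIS = 45  # cartridges per 4.3U Moonshot chassis
NODES_PER_M350 = 4           # nodes per m350 cartridge
CORES_PER_C2730 = 8          # cores per Atom C2730
TDP_WATTS_PER_C2730 = 12     # TDP per Atom C2730, in watts


def chassis_totals(cartridges=CARTRIDGES_PER_CHASSIS,
                   nodes_per_cartridge=NODES_PER_M350,
                   cores_per_node=CORES_PER_C2730,
                   tdp_watts_per_node=TDP_WATTS_PER_C2730):
    """Return node count, core count, and summed CPU TDP for one chassis."""
    nodes = cartridges * nodes_per_cartridge
    return {
        "nodes": nodes,
        "cores": nodes * cores_per_node,
        "cpu_tdp_watts": nodes * tdp_watts_per_node,
    }


if __name__ == "__main__":
    for name, value in chassis_totals().items():
        print(f"{name}: {value}")
```

Under those assumptions a single chassis would hold 180 nodes and 1,440 Atom cores, with a combined CPU TDP of 2,160W.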
47 Comments
JohanAnandtech - Tuesday, March 10, 2015
Are you sure this is up to date? gcc tells me -march=native is not supported.
JohanAnandtech - Tuesday, March 10, 2015
Update: -march=native does not work. I have tried -march=armv8-a, but it does not do much (it is probably the default). -O3 makes the biggest difference. Omit it and you get 5.7 GB/s. With -O3, I am at 18 GB/s and more (stream m400).
Alone-in-the-net - Tuesday, March 10, 2015
Apologies. For AArch64 the only -march value is "armv8-a"; for Intel, -march=native sets it to use the one for your CPU.
https://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/AArch...
https://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/i386-...
From version 4.9.x and above of GCC, you can really start to add tuning for the CPU.
https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/AArch...
-mtune=name
Specify the name of the target processor for which GCC should tune the performance of the code. Permissible values for this option are: ‘generic’, ‘cortex-a53’, ‘cortex-a57’.
Additionally, this option can specify that GCC should tune the performance of the code for a big.LITTLE system. The only permissible value is ‘cortex-a57.cortex-a53’.
Where none of -mtune=, -mcpu= or -march= are specified, the code will be tuned to perform well across a range of target processors.
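Since the accepted values differ between architectures and GCC releases, one way to settle the question on a given machine is simply to ask the compiler. The following is a minimal sketch, not anything used in the article: the helper name `gcc_accepts_flag` is hypothetical, and it assumes a compiler called `gcc` is on the PATH. It compiles a trivial C file with the flag and reports whether the compiler exited successfully.

```python
import os
import subprocess
import tempfile


def gcc_accepts_flag(flag, compiler="gcc"):
    """Return True if `compiler` builds a trivial C file when given `flag`.

    Hypothetical helper: it only checks the compiler's exit status and
    returns False when the compiler cannot be run at all.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as handle:
            handle.write("int main(void) { return 0; }\n")
        try:
            result = subprocess.run(
                [compiler, flag, "-c", src, "-o", obj],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # Compiler missing or not executable.
            return False
        return result.returncode == 0


if __name__ == "__main__":
    # Flags discussed in this thread; results depend on the local GCC.
    for flag in ("-march=native", "-march=armv8-a", "-mtune=cortex-a57"):
        print(flag, "accepted" if gcc_accepts_flag(flag) else "rejected")
```

The output depends entirely on the installed compiler and target, so no expected result is shown here.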
Alone-in-the-net - Tuesday, March 10, 2015
Also, support for the XGene1 as a compilation target is only available from GCC 5.
https://gcc.gnu.org/gcc-5/changes.html
Support has been added for the following processors (GCC identifiers in parentheses): ARM Cortex-A72 (cortex-a72) and initial support for its big.LITTLE combination with the ARM Cortex-A53 (cortex-a72.cortex-a53), Cavium ThunderX (thunderx), Applied Micro X-Gene 1 (xgene1). The GCC identifiers can be used as arguments to the -mcpu or -mtune options, for example: -mcpu=xgene1
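The version dependence described in this thread can be captured in a small build-script helper. This is a sketch under stated assumptions: the function names `parse_gcc_version` and `xgene1_flags` are hypothetical, the version string is assumed to look like the output of `gcc -dumpversion` (for example "4.8.2"), and the choice to always include -O3 simply follows the STREAM observation earlier in the comments.

```python
def parse_gcc_version(text):
    """Turn a version string such as '4.8.2' or '5' into a tuple of ints."""
    parts = text.strip().split(".")
    return tuple(int(part) for part in parts if part.isdigit())


def xgene1_flags(version_text):
    """Pick compiler flags for an X-Gene 1 build based on the GCC version.

    Per the thread: -mcpu=xgene1 is only recognized from GCC 5 onwards;
    older releases fall back to the generic -march=armv8-a.
    """
    version = parse_gcc_version(version_text)
    if version >= (5,):
        return ["-O3", "-mcpu=xgene1"]
    return ["-O3", "-march=armv8-a"]


if __name__ == "__main__":
    for version in ("4.8.2", "4.9.2", "5.1.0"):
        print(version, " ".join(xgene1_flags(version)))
```

Falling back to -march=armv8-a for anything unrecognized keeps the build working on older toolchains, at the cost of CPU-specific tuning.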
The_Assimilator - Monday, March 9, 2015
So AMD, how's that bet on ARM you made looking now?
extide - Monday, March 9, 2015
Don't count them out yet. I really wish that Intel hadn't abandoned ARM for the Atom. I bet they could come out with a sweet ARMv8 core if they had to, and on their process it would be sweet.
BlueBlazer - Monday, March 9, 2015
That AMD Opteron A1100 is looking more like abandonware as time passes, and that was like 8 months ago. So far there has not been a single real-world deployment, nor has it been used in any of AMD's own SeaMicro servers. It is currently available as a development kit with a rather steep price tag.
tuxRoller - Monday, March 9, 2015
You REALLY should be using GCC 5. It includes many improvements for the ARMv8 ISA. I'd suggest grabbing a nightly of Fedora 22, but Ubuntu 15.04 may be using GCC 5 as well.
Wilco1 - Monday, March 9, 2015
Agreed, nobody doing anything on AArch64 should contemplate using GCC 4.8. Even 4.9 is way out of date. GCC 5.0 with the latest GLIBC gives major speedups across the board.
JohanAnandtech - Tuesday, March 10, 2015
"Way out of date?" We tried out 4.9.2, which was released on October 30th, 2014. That is about 4 months old. See https://www.gnu.org/software/gcc/releases.html
Latest version is 4.8.4; 5.0 has not even been released AFAIK.