X-Gene 1, Atom C2000 and Xeon E3: Exploring the Scale-Out Server World
by Johan De Gelas on March 9, 2015 2:00 PM EST
The recent announcements from ARM, HP, AppliedMicro, AMD, and Intel make it clear that something is brewing in the low-end and micro server world. For those of you following the IT news, that is nothing new, but although there have been lots of articles discussing the new trends, very little has been quantified. Yes, the Xeon E3 and Atom C2000 have found a home in many entry-level and micro servers, but how do they compare in real-world server applications? And how about the first incarnation of the 64-bit ARMv8 ISA, the AppliedMicro X-Gene 1?
As we could not find much more than some vague benchmarks and statements that are hard to relate to the real world, we thought it would be useful to discuss and quantify this part of the enterprise market a bit more. We wanted to measure performance and power efficiency (performance/watt) of all current low-end and micro server offerings, but that proved to be a bit more complex than we initially thought. So we set out on a long journey: from testing basic building blocks such as the ASRock Atom board, through trying out an affordable Supermicro cloud server with 8 nodes, to ultimately testing the X-Gene ARM cartridge inside an HP Moonshot chassis.
It did not end there. Micro servers are supposed to run scale-out workloads, so we also developed a new scale-out test based on Elasticsearch. It's time for some in-depth analysis, based on solid real-world benchmarks.
Target Audience?
This article – as with most of my articles – is squarely targeted at professionals like system administrators and web hosting professionals. However, if you are a hardware enthusiast, the low-end server market does have quite a bit to offer. For example, if you want to make your system more robust, the use of ECC RAM may help. Also, if you want to experiment/work with virtual machines, the Xeons offer VT-d that allows you to directly access I/O devices in a virtual machine (and in some cases, the GPU). Last but not least, server boards support out-of-band remote management which allows you to turn the machine on and off remotely. That can be quite handy if you use your desktop as a file server as well.
Micro and Scale-Out Servers?
The business model of many web companies is based on delivering a service to a large number of users in order to be profitable. This is because the income (advertising or something else) per user is relatively low. Dropbox, Facebook, and Google are the prime examples that everybody knows, but even AnandTech is no exception to this rule. The result is that most web companies need lots of infrastructure but do not have the budget of an IT department that is running a traditional transactional system. Web (hosting) companies need dense, power efficient, and cheap servers to keep the hosting costs low.
Small 1U servers: dense but terrible for administration & power efficiency
Just a few years ago, they had few options. One possible option was half-width or short-depth 1U servers; another was the denser forms of blade servers. As we have shown more than once, 1U servers are not power efficient and need too much cabling, too many PSUs, etc. Blade servers are more power efficient and reduce both the cabling complexity and the total number of PSUs and fans. But most blade servers have lots of features web companies do not use and are also too expensive to be the ideal solution for all the web companies out there.
The result of the above limitations is that both Facebook and Google developed their own servers, a clear indication that there was a need for a different kind of server. In the process, a new kind of dense server chassis was introduced. At first, the ones with low power nodes with "wimpy" cores were called "micro servers". The more beefy servers targeted at more demanding scale-out software were called "scale-out servers". Since then, SeaMicro, HP, and Supermicro have been developing these simplified blade server chassis that offer density, low power, and low(er) costs.
Each vendor took a different angle. SeaMicro focused on density, capacity, and bandwidth. Supermicro focused on keeping the costs and complexity down. HP went for a flexible solution that could address the largest possible market – from ultra dense micro servers to beefier scale-out servers to specialized purpose servers (video transcoding, VDI etc.). Let's continue with a closer look at the components and servers we tested.
47 Comments
JohanAnandtech - Tuesday, March 10, 2015 - link
Thanks! It has been a long journey to get all the necessary tests done on different pieces of hardware and it is definitely not complete, but at least we were able to quantify a lot of paper specs. (25 W TDP of Xeon E3, 20 W Atom, X-Gene performance, etc.)
enzotiger - Tuesday, March 10, 2015 - link
"SeaMicro focused on density, capacity, and bandwidth."
How did you come to that statement? Have you ever benchmarked (or even played with) any SeaMicro server? What capacity or bandwidth are you referring to? Are you aware of their plan down the road? Did you read AMD's Q4 earnings report?
BTW, AMD doesn't call their servers micro servers anymore. They use the term dense server.
Peculiar - Tuesday, March 10, 2015 - link
Johan, I would also like to congratulate you on a well written and thorough examination of subject matter that is not widely evaluated.
That being said, I do have some questions concerning the performance/watt calculations. Mainly, I'm concerned as to why you are adding the idle power of the CPUs in order to obtain the "Power SoC" value. The Power Delta should take into account the difference between the load power and the idle power and therefore you should end up with the power consumed by the CPU in isolation. I can see why you would add in the chipset power since some of the devices are SoCs and do not require a chipset and some are not. However, I do not understand the methodology in adding the idle power back into the Delta value. It seems that you are adding the load power of the CPU to the idle power of the CPU and that is partially why you have the conclusion that they are exceeding their TDPs (not to mention the fact that the chipset should have its own TDP separate from the CPU).
Also, if one were to get nitpicky on the power measurements, it is unclear if the load power measurement is peak, average, or both. I would assume that the power consumed by the CPUs may not be constant since you state that "the website load is a very bumpy curve with very short peaks of high CPU load and lots of lows." If possible, it may be more beneficial to measure the energy consumed over the duration of the test.
JohanAnandtech - Wednesday, March 11, 2015 - link
Thanks for the encouragement. Regarding your concerns about the perf/watt calculations: Power delta = average power (high web load measured at 95th percentile = 1 s, an average of about 2 minutes) - idle power. Since idle power = total idle of node, it also contains the idle power of the SoC. So you must add it to get the power of the SoC. If you still have doubts, feel free to mail me.
jdvorak - Friday, March 13, 2015 - link
The approach looks absolutely sound to me. The idle power will be drawn in any case, so it makes sense to add it in the calculation. Perhaps it would also be interesting to compare the power consumed by the different systems at the same load levels, such as 100 req/s, 200 req/s, ... (clearly, some higher loads will not be achievable by all of them).
Johan, thanks a lot for this excellent, very informative article! I can imagine how much work has gone into it.
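The calculation discussed in this thread can be written out as a short sketch. It is only an illustration of the arithmetic as described in the comments above, not the article's actual measurement scripts: the function names, the separate idle SoC estimate, the optional chipset term, and all of the numbers are hypothetical.

```python
def soc_power(load_node_w, idle_node_w, idle_soc_w, chipset_w=0.0):
    """Estimate SoC power under load (in watts).

    Power delta = average node power under load - node idle power.
    The node idle power already includes the SoC's own idle draw, so
    subtracting it removes that share too; it is added back here.
    chipset_w is an assumed extra term for platforms that need a
    separate chipset (0 for a true SoC).
    """
    power_delta = load_node_w - idle_node_w
    return power_delta + idle_soc_w + chipset_w


def perf_per_watt(requests_per_s, soc_w):
    """Performance per watt for a given throughput and SoC power."""
    return requests_per_s / soc_w


# Hypothetical numbers, chosen only to show the arithmetic.
estimate_w = soc_power(load_node_w=60.0, idle_node_w=40.0,
                       idle_soc_w=5.0, chipset_w=2.0)
print(estimate_w)                        # 20 W delta + 5 W idle + 2 W chipset
print(perf_per_watt(270.0, estimate_w))  # throughput divided by SoC power
```

Comparing systems at a fixed load level, as suggested above, would amount to calling `soc_power` with the node power measured at that load for each system.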
nafhan - Wednesday, March 11, 2015 - link
If these had 10 Gbit - instead of Gbit - NICs, these things could do some interesting stuff with virtual SANs. I'd feel hesitant shuttling storage data over my primary network connection without some additional speed, though.
Looking at that Moonshot machine, for instance: 45 x 480 SSDs is a decent-sized little SAN in a box if you could share most of that storage amongst the whole Moonshot cluster.
Anyway, with all the stuff happening in the virtual SAN space, I'm sure someone is working on that.
Casper42 - Wednesday, April 15, 2015 - link
Johan, do you have a full Moonshot 1500 chassis for your testing? Or are you using a PONK?