CPU Performance & Power

On the CPU side of things, the Tensor SoC, as we discussed, does have some larger configuration differences to what we’ve seen on the Exynos 2100, and is actually more similar to the Snapdragon 888 in that regard, at least from the view of a single Cortex-X1 cores. Having double the L2 cache, however being clocked 3.7%, or 110MHz lower, the Tensor and the Exynos should perform somewhat similarly, but dependent on the workload. The Snapdragon 888 showcases much better memory latency, so let’s also see if that actually plays out as such in the workloads.

In the individual subtests in the SPEC suite, the Tensor fares well and at first glance isn’t all too different from the other two competitor SoCs, albeit there are changes, and there are some oddities in the performance metrics.

Pure memory latency workloads as expected seem to be a weakness of the chip (within what one call weakness given the small differences between the different chips). 505.mcf_r falls behind the Exynos 2100 by a small amount, the doubled L2 cache should have made more of a difference here in my expectations, also 502.gcc_r should have seen larger benefits but they fail to materialise. 519.lbm_r is bandwidth hungry and here it seems the chip does have a slight advantage, but power is still extremely high and pretty much in line with the Exynos 2100, quite higher than the Snapdragon 888.

531.deepsjeng is extremely low – I’ve seen this behaviour in another SoC, the Dimensity 1200 inside the Xiaomi 11T, and this was due to the memory controllers and DRAM running slower than intended. I think we’re seeing the same characteristic here with the Tensor as its way of controlling the memory controller frequency via CPU memory stall counters doesn’t seem to be working well in this workload. 557.xz_r is also below expectations, being 18% slower than the Snapdragon 888, and ending up using also more energy than both Exynos and Snapdragon. I remember ex-Arm’s Mike Filippo once saying that every single clock cycle the core is wasting on waiting on memory has bad effects on performance and efficiency and it seems that’s what’s happening here with the Tensor and the way it controls memory.

In more execution bound workloads, in the int suite the Tensor does well in 525.x264 which I think is due to the larger L2. On the FP suite, we’re seeing some weird results, especially on the power side. 511.povray appears to be using a non-significant amount lesser power than the Exynos 2100 even though performance is identical. 538.imagick also shows much less power usage on the part of the Tensor, at similar performance. Povray might benefit from the larger L2 and lower operating frequency (less voltage, more efficiency), but I can’t really explain the imagick result – in general the Tensor SoC uses quite less power in all the FP workloads compared to the Exynos, while this difference isn’t as great in the INT workloads. Possibly the X1 cores have some better physical implementation on the Tensor chip which reduces the FP power.

In the aggregate scores, the Tensor / GS101 lands slightly worse in performance than the Exynos 2100, and lags behind the Snapdragon 888 by a more notable 12.2% margin, all whilst consuming 13.8% more energy due to completing the task slower. The performance deficit against the Snapdragon should really only be 1.4% - or a 40MHz difference, so I’m attributing the loss here just to the way Google runs their memory, or maybe also to possible real latency disadvantages of the SoC fabric. In SPECfp, which is more memory bandwidth sensitive (at least in the full suite, less so in our C/C++ subset), the Tensor SoC roughly matches the Snapdragon and Exynos in performance, while power and efficiency is closer to the Snapdragon, using 11.5% less power than the Exynos, and thus being more efficient here.

One issue that I encountered with the Tensor, that marks it being extremely similar in behaviour to the Exynos 2100, is throttling on the X1 cores. Notably, the Exynos chip had issues running its cores at their peak freq in active cooling under room temperature (~23°C) – the Snapdragon 888 had no such issues. I’m seeing similar behaviour on the Google Tensor’s X1 cores, albeit not as severe. The phone notably required sub-ambient cooling (I tested at 11°C) to get sustained peak frequencies, scoring 5-9% better, particularly on the FP subtests.

I’m skipping over the detailed A76 and A55 subscores of the Tensor as it’s not that interesting, however the aggregate scores are something we must discuss. As alluded to in the introduction, Google’s choice of using an A76 in the chip seemed extremely hard to justify, and the practical results we’re seeing the testing pretty much confirm our bad expectations of this CPU. The Tensor is running the A76 at 2.25GHz. The most similar data-point in the chart is the 2.5GHz A76 cores of the Exynos 990 – we have to remember this was an 7LPP SoC while the Tensor is a 5LPE design like the Eynos 2100 and Snapdraogn 888.

The Tensor’s A76 ends up more efficient than the Exynos 990’s – would would hope this to be the case, however when looking at the Snapdragon 888’s A78 cores which perform a whopping 46% better while using less energy to do so, it makes the Tensor’s A76 mid-cores look extremely bad. The IPC difference between the two chips is indeed around 34%, which is in line with the microarchitectural gap between the A76 and A78. The Tensor’s cores use a little bit less absolute power, but if this was Google top priority, they could have simply clocked a hypothetical A78 lower as well, and still ended up with a more performant and more efficient CPU setup. All in all, we didn’t understand why Google chose A76’s, as all the results end up expectedly bad, with the only explanation simply being that maybe Google just didn’t have a choice here, and just took whatever Samsung could implement.

On the side of the Cortex-A55 cores, things also aren’t looking fantastic for the Tensor SoC. The cores do end up performing the equally clocked A55’s of the Snapdragon 888 by 11% - maybe due to the faster L3 access, or access to the chip’s SLC, however efficiency here just isn’t good, as it uses almost double the power, and is more characteristic of the higher power levels of the Exynos chips’ A55 cores. It’s here where I come back to say that what makes a SoC from one vendor different to the SoC from another is the very foundations and fabric design - for the low-power A55 cores of the Tensor, the architecture of the SoC encounters the same issues of being overshadowed in system power, same as we see on Exynos chips, ending up in power efficiency that’s actually quite worse than the same chips own A76 cores, and much worse than the Snapdragon 888. MediaTek’s Dimensity 1200 even goes further in operating their chip in seemingly the most efficient way for their A55 cores, not to mention Apple’s SoCs.

GeekBench 5

While we don’t run multi-threaded SPEC on phones, we can revert back to GeekBench 5 which serves the purpose very well.

Although the Google Tensor has double as many X1 cores as the other Android SoCs, the fact that the Cortex-A76 cores underperform by such a larger degree the middle cores of the competition, means that the total sum of MT performance of the chip is lesser than that of the competition.

Overall, the Google Tensor’s CPU setup, performance, and efficiency is a mixed bag. The two X1 cores of the chip end up slightly slower than the competition, and efficiency is most of the time in line with the Exynos 2100’s X1 cores – sometimes keeping up with the Snapdragon 888 in some workloads. The Cortex-A76 middle cores of the chip in my view make no sense, as their performance and energy efficiency just aren’t up to date with 2021 designs. Finally, the A55 behavioural characteristic showcases that this chip is very much related to Samsung’s Exynos SoCs, falling behind in efficiency compared to how Qualcomm or MediaTek are able to operate their SoCs.

Memory Subsystem & Latency GPU Performance & Power
Comments Locked

108 Comments

View All Comments

  • Alistair - Wednesday, November 3, 2021 - link

    It's the opposite, the iPhone is massively ahead in performance, but every high end phone takes the same high end photos... you got the same photos but a lot less performance...
  • aclos3 - Saturday, November 6, 2021 - link

    I took some time to really test the camera and you are simply wrong. I have been photographing with it heavily for the last couple of days and the camera is incredible. Call it a gimmick or whatever, but the way they do their photo stacking puts this phone in a league of its own. If your main use case for a phone is benchmarking, I guess this is not your device.
  • Lavkesh - Thursday, November 11, 2021 - link

    Everyone and their grand mother do image stacking. iPhone is almost as good if not better even with a smaller sensor when compared to the latest Pixel. How's that for "in a league of its own"?
  • Amandtec - Wednesday, November 3, 2021 - link

    I don't doubt the veracity of your comment but I find the hostile undertone somewhat curious.
  • damianrobertjones - Wednesday, November 3, 2021 - link

    But... but... they said that it's amazing!! Who do I believe? /s
  • Zoolook - Saturday, November 6, 2021 - link

    As long as they use Samsung process they will be hopelessly behind Apples Socs in efficiency unfortunately, would be interesting to see SD back on TSMC process for a direct comparison with Apple silicon.
  • Tigran - Tuesday, November 2, 2021 - link

    Performance looks very disappointing. Google promised 4.7x GPU performance improvement vs Pixel 5.
  • singular9 - Tuesday, November 2, 2021 - link

    I was enjoying how the speculation about the GS101 were claiming its "not far behind" the SD888. I was never expecting google to make another high end device, let alone one that undercuts most of the competition, as its just not what trends would say.

    I am not impressed. As someone who was rather hopeful that google would take control and bring us android users a true apple chip equivalent some day, this is definitely not the case with google silicon.

    Considering how cookie cutter this design is, and how google made some major amateur decisions, I do not see google breaking away from the typical android SOC mold next generation.

    Looking back at how long it took apple to design a near 100% solo design for the iPhone (A8X was the first A chip to use a complete inhouse GPU and etc design, other than ARM cores), that is a whopping 4 and a half years. Suppose this first google "designed" chip is following the same trend, an initial "brand name" break away yet still using a lot of help from other designs, and then slowly fixing one part at a time till its all fixed, while also improving what is already good, I could see google getting there by the Pixel X (10?). But as it stands, unless google dedicates a lot of time to actually altering Arm's own designs and simply having samsung make it, I don't see Tensor every surpassing qualcomm (unless samsung has some big breakthrough in their own CPU/GPU IP which may or may not come with AMD's help).

    As the chip stands today, its "passable", but not impressive. Considering Google can get android to run really well on a SD765G, this isn't at all surprising. The TPU seems like a nice touch, since honestly, focusing on voice is more important than on "raw" cpu performance or something. I have always been frustrated with speech to text not being "perfect" and constantly having to correct it manually and "working around" its limitations. As for my own experience with the 6 Pro, its bloody good.

    Now to specifics.
    The X1 chips do get hot, as does the 5G modem. I switched the device to LTE for now. I do get 5G at home and pretty much most places I go, and it is fast, its not something I need right now. I even had a call drop over 5G because I walked around a buildings corner. Not fun.

    The A76 excuse I have heard floating around, is that it takes up less physical die space, by A LOT. And apparently, there was simply no room for an A77 or A78 because the TPU and GPU took up so much room. I don't understand this compromise, when the GPU performance is this mediocre. Why not simply use the same GPU size as the S21 (Ex2100) and give the A78's more room? Don't know, but an odd choice for sure.

    The A55 efficiency issues are noticeable. Try playing spotify over bluetooth for an hour, and watch the battery drain. I get consistently great standby time, and very good battery life when heavily using my device, but its these background screen off tasks that really chug the battery more than expected.

    Over all though I haven't noticed any serious issues with my unit. The finger print scanner works as intended, and is better than my 8T. The camera does just as well if not better than the previous pixels. And over all...no complaints. But I wonder how much of this UX comes from google literally brute forcing their way with 2 X1 cores and a overkill GPU, and how much of it is them actually trying.

    As for recommendations to google for Tensor V2, they need to not compromise efficiency for performance. This phone isn't designed to game, cut the GPU down, or heck, partner with AMD (who is working with samsung) to bring competitive graphics to mobile to compete with Adreno from QComm. 2 X1 cores, if necessary, can stay, but at that point, might as well just have 4 of them and get rid of all the other cores entirely and simply build a very good kernel to modulate the frequency. Or make it a 2+6 design with A57 cores. As someone who codes kernels for pixels and nexus devices for a long time, trying to optimize the software to really get efficiency out of the big.LITTLE system is near impossible, and in my opinion, worthless unless your entire scheduler is "screen on/off" based, which is literally BS. I doubt google has any idea how to build a good CPU governor nor scheduler to truly make this X+X+X system even work properly, since I have yet see qcomm or samsung do it "well" to call commendable.

    The rest of the phone is fine. YoY improvements are always welcome, but I think the pixel 6/pro just really show how current mobile chips are so far behind apple that you might as well give up. YoY improvements have imo halted, and honestly no one seems to be having the thought that maybe we should cut power consumption in half WITHOUT increasing performance. I mean...the phones are fast enough.

    Who knows. We will see next year.

    PS: I also am curious what google will do with the Pixel 6A (if they make one at all). Will it use a cut down GS101 or will it get the whole chip? It would seem overkill to shove this into a 399$ phone. Wonder what cut downs will be made, or if there will be improvements as well.
  • sharath.naik - Tuesday, November 2, 2021 - link

    Good thoughts, there is one big issue you missed. Pixel camera sensors 50mp/48mp being binned to 12mp yet Google labeled them as 50mp/48mp. Every shot outside the native 1x,4x is just a crop of the 12mp image including pottaitk3mp crop) and 10x(2.5mp crop}.
  • teldar - Thursday, November 4, 2021 - link

    You are absolutely a clueless troll and should go back to your cave. Your stupidity is unwanted.

Log in

Don't have an account? Sign up now