Machine observation room: an in-depth analysis of Kirin 980 performance

In China's huge mobile phone market, many brands can build phones, but few can build processors. In the high-end segment there is only Huawei's Kirin: HiSilicon Kirin is the only self-developed high-end processor in China. It has always drawn attention and, after several years of accumulation, has established a foothold in the high-end mobile processor market.

Like Apple, Huawei currently uses its Kirin chips only in its own products. This differentiates Huawei's devices and lets the chip be tightly integrated with the handset, so Huawei phones can be adapted and improved more flexibly according to users' actual needs. Of course, this strategy also means Huawei must be cautious with every generation of Kirin, because a chip that falls short of the original plan will hurt the flagship products for a whole year. That is why the annual Kirin flagship chip is the focus of our attention. Today we take a look at Huawei's latest, the Kirin 980.

The new Cortex-A76

The biggest highlight of the Kirin 980 is the 7nm process, which integrates 6.9 billion transistors. According to TSMC's official figures, compared with the 10nm process used for the previous-generation flagship Kirin 970, the 7nm process of the Kirin 980 delivers about 20% higher performance, 40% better energy efficiency, and 60% higher logic circuit density (1.6 times). It is well known that traditional chips follow Moore's Law, raising the number of transistors per unit area to improve performance. The core factors that determine a processor's performance are mainly process and architecture. The more advanced the process, the finer the processor's features and the more transistors can be integrated. More importantly, a finer manufacturing process shortens the distance between components as far as possible and reduces power consumption. This matters especially for today's smartphones, whose increasingly 'brute-force' architectures depend ever more heavily on the process. At present, chip manufacturing generally remains at 10nm, and the 7nm process was once called 'the closest to the physical limit of silicon-based semiconductor processes'. The Kirin 980's 7nm process therefore demonstrates, first of all, Kirin's progress in manufacturing.
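The density figure can be turned into a rough area estimate. The sketch below is only a back-of-the-envelope calculation, assuming the quoted 1.6x logic density applied uniformly to a whole die (an idealization, since not every part of a chip scales like logic). The function name is illustrative.

```python
def relative_area(density_gain: float) -> float:
    """Area needed for the same logic, relative to the older process."""
    if density_gain <= 0:
        raise ValueError("density_gain must be positive")
    return 1.0 / density_gain


# 1.6x logic density is the TSMC figure quoted in the article.
area = relative_area(1.6)
print(f"Same logic would fit in {area:.1%} of the old area")
```

Read the other way, the same area could hold about 1.6 times as much logic, which is where the room for a larger CPU and GPU comes from.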

Interestingly, ARM has three main product design teams: the European team, the Austin team in Texas, and the Cambridge team. Looking at the product line, the Cortex-A57 and A72 come from ARM's Austin team in Texas, the A53 and A73 were designed by ARM's team in Europe, and small cores such as the A55 come from Cambridge, England. The different styles of the teams show through as well: the Texan temperament, the artistic flair of the Europeans, and the gentlemanly manner of the British. The A76 comes from the Austin team in Texas, which tells us its basic character: it follows in the line of the A57 and A72.

On the CPU side, the Kirin 980 is based on the ARM Cortex-A76 architecture. In the mobile SoC field, flagship chips such as the Snapdragon 845 and Apple A12 Bionic have already adopted self-developed microarchitectures, while Kirin has kept using ARM's public architecture. Huawei's position is that every field has its specialists, and a self-developed architecture would not give Kirin the best result. Moreover, the instruction set and architecture that define the CPU are ARM's inventions, so unless a custom design brings a significant improvement, there is little difference between a self-developed core and the public one. Compared with competing products, Kirin is pursuing an improvement in overall performance.

In terms of performance, according to ARM's official statement, the Cortex-A76 can run at 3GHz on the latest 7nm process. Compared with the A73 used in the previous-generation Kirin 970, integer performance is improved by 90% and floating-point performance by 150%, and a performance improvement of 80% is also quoted. Because this comparison is not made on the same process, it does not fully reflect the performance of the new architecture, so we have to look at the figures for the actual chip. At the launch of the Kirin 980, Huawei's official data showed performance 75% higher than the previous-generation Kirin 970 and energy efficiency improved by 58%. These figures also reflect the Kirin 980's maximum frequency of 2.6GHz, chosen to balance performance against power consumption and heat. Compared with ARM's official reference data, they are more reliable.
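To make Huawei's two launch figures more concrete, here is a minimal sketch. It assumes the 58% figure means 58% more work per unit of energy; the article does not define the metric, so treat that reading as an assumption. The function names are illustrative.

```python
def time_for_same_task(perf_gain_pct: float) -> float:
    """Run time relative to the old chip, given a performance gain in percent."""
    return 1.0 / (1.0 + perf_gain_pct / 100.0)


def energy_for_same_task(efficiency_gain_pct: float) -> float:
    """Energy relative to the old chip, given an efficiency gain in percent.

    Assumes efficiency means work done per unit of energy.
    """
    return 1.0 / (1.0 + efficiency_gain_pct / 100.0)


print(f"Time for the same task:   {time_for_same_task(75):.1%} of the Kirin 970")
print(f"Energy for the same task: {energy_for_same_task(58):.1%} of the Kirin 970")
```

Under that reading, a fixed task would take a little over half the time and roughly two thirds of the energy.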

The CPU built around the new A76 consists of two super-large cores, two large cores, and four small cores, forming a three-tier energy-efficiency architecture that uses DynamIQ scheduling technology. Compared with big.LITTLE, DynamIQ redefines the multi-core microarchitecture. Its biggest feature is that a single DynamIQ cluster can contain anywhere from one to eight Cortex-A CPUs, and it also supports mixing heterogeneous CPUs. The Kirin 980 can therefore schedule different cores flexibly for different usage scenarios and run more efficiently.
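The idea of matching a workload to a core tier can be sketched in a few lines. This is a toy model only: the load thresholds and the pick_tier function are hypothetical and do not describe Huawei's or ARM's actual scheduler; only the 2 + 2 + 4 core layout comes from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    core: str
    count: int


# Core layout as described in the article (2 + 2 + 4).
TIERS = [
    Tier("small", "Cortex-A55", 4),
    Tier("large", "Cortex-A76", 2),
    Tier("super-large", "Cortex-A76", 2),
]

# Hypothetical load thresholds on a 0.0-1.0 scale, for illustration only.
LIGHT_LOAD = 0.3
HEAVY_LOAD = 0.7


def pick_tier(load: float) -> Tier:
    """Return the tier a task with the given load would be placed on."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    if load < LIGHT_LOAD:
        return TIERS[0]
    if load < HEAVY_LOAD:
        return TIERS[1]
    return TIERS[2]


for load in (0.1, 0.5, 0.9):
    tier = pick_tier(load)
    print(f"load {load:.1f} -> {tier.name} ({tier.core})")
```

A real scheduler would also weigh temperature, battery state, and task history; the point here is only that three tiers give a finer choice than two.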

In addition, compared with the A72, the previous design from the same team, the A76 has widened from three-issue to four-issue out-of-order execution, and the A72's 15-stage core pipeline has been shortened to 13 stages, which raises execution efficiency. In general, a wider out-of-order design needs more registers and usually a deeper pipeline, so it draws more power. But thanks to the advance in process technology, power consumption can be held down, which makes a larger architecture possible.

The low-key Cortex-A55

For its energy-efficiency cores, the Kirin 980 chose the A55. Any discussion of the A55 has to mention its predecessor, the A53. This famous architecture from 2012 still dominates low-end processors. Its excellent energy efficiency and strong scalability have put the A53 into most low-end and mid-range processors. The A55 is an upgraded version of the A53.

The Cortex-A55 uses the latest ARMv8.2 architecture. According to ARM's data, at the same frequency and on the same process, memory performance can be up to twice that of the Cortex-A53, and performance is 15% higher. It is worth mentioning that ARM gave the A55 a private L2 cache for each core, and L2 access latency has been reduced by more than 50% compared with the Cortex-A53. Moreover, the L2 cache is designed to run at the same frequency as the CPU. The lower latency greatly improves the CPU's results in various benchmarking tools.

In addition, ARM has introduced an L3 cache for the A55, which can be shared by all Cortex-A55 CPUs in the cluster. Cores in a DynamIQ cluster in particular benefit from the larger memory capacity close to the CPU, which improves performance and reduces system power. At the same performance, the Cortex-A55 uses up to 30% less power than the A53. This is very important for low-end and mid-range processors.

Can the new G76 turn things around for Kirin?

The Kirin 980 uses a Mali-G76 MP10 clocked at 720MHz. Compared with the previous generation, the Mali-G76 improves performance per watt and performance per unit area: performance density is 30% higher than the previous-generation Mali-G72, architectural efficiency is up by 30%, and machine learning processing capacity is up by 2.7 times. In fact, compared with Qualcomm's Adreno, Mali has always trailed in single-core performance. So whenever a Mali series launches, you often see 'scary' numbers such as MP20 or MP32. The problem is that a phone cannot stack cores indefinitely, so Mali has always looked slightly weak next to Adreno.

Its history shows up in a comparison of parameters. Of the two main Mali users, Samsung's strategy is low frequency with many cores, as in the Exynos 9810, which has 18 cores but runs at only 546MHz, while Kirin's strategy is high frequency with fewer cores: the previous Kirin 970's MP12 was clocked at 746MHz. In the G76, ARM has therefore reduced Mali's maximum core count to 20 in order to raise performance density. This is also in line with the current trend across mobile SoCs.
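The two strategies can be compared with a crude "cores times clock" product. The sketch below uses only the core counts and clocks quoted in this article; the metric itself is an illustrative simplification that ignores per-core differences between GPU generations, so it says nothing about real performance.

```python
# (core count, clock in MHz), figures as quoted in the article.
GPUS = {
    "Exynos 9810": (18, 546),
    "Kirin 970": (12, 746),
    "Kirin 980": (10, 720),
}


def core_mhz(cores: int, mhz: int) -> int:
    """Crude aggregate: number of cores multiplied by clock frequency."""
    return cores * mhz


for name, (cores, mhz) in GPUS.items():
    print(f"{name}: {cores} cores x {mhz} MHz = {core_mhz(cores, mhz)} core-MHz")
```

The Kirin 980 has the smallest product of the three, which is why the per-core improvements of the G76 matter so much for it.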

The Mali-G76 uses the latest Bifrost-based architecture. Compared with the previous Midgard architecture, Bifrost's biggest innovation is the clause-based shader (Claused Shader). Bifrost also uses quad-based vectorization. Where the previous SIMD vectorization could execute only a single thread at a time, quad vectorization lets up to four threads execute together and share control logic, bringing utilization close to 100%. Compared with the G72, the G76's performance density is up by 30% and its energy efficiency by 20%.
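Why grouping four threads raises utilization can be shown with a toy lane-counting model. This is a simplified illustration, not a description of the real hardware: it assumes a four-lane unit, treats each operation as needing between one and four lanes, and uses made-up operation widths.

```python
LANES = 4  # lanes per issue slot in this toy model


def simd_utilization(widths):
    """One thread per issue: an op of width w fills only w of the 4 lanes."""
    if not widths or any(w < 1 or w > LANES for w in widths):
        raise ValueError("widths must be a non-empty list of values from 1 to 4")
    return sum(widths) / (LANES * len(widths))


def quad_utilization(thread_count):
    """Scalar ops from up to 4 threads share one issue, one lane per thread."""
    if thread_count < 1:
        raise ValueError("thread_count must be at least 1")
    issues = -(-thread_count // LANES)  # ceiling division
    return thread_count / (LANES * issues)


# Hypothetical mix of scalar, vec2, vec3, and vec4 operations.
print(f"SIMD, mixed widths: {simd_utilization([1, 2, 3, 4]):.1%}")
print(f"Quad, 64 threads:   {quad_utilization(64):.1%}")
```

In this model the single-thread approach wastes lanes whenever an operation is narrower than four, while the quad approach only loses lanes when the thread count is not a multiple of four.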

For power consumption, we quote AnandTech's energy efficiency data from GFXBench Manhattan 3.1 off-screen. On the 7nm process, the average power of the G76 MP10 is good at only 4.08W, lower even than the 5.01W of the S9+, and a large improvement over the 6.33W of the previous-generation Kirin 970. In terms of efficiency the progress is very obvious (note that this refers to energy consumption, not absolute performance). This progress in energy consumption will make the Kirin 980 another classic generation after the Kirin 950.
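The size of the improvement is easier to judge as a percentage. The sketch below computes the relative reduction in average power from the three wattage figures quoted above; the helper name is illustrative.

```python
def power_reduction(old_watts: float, new_watts: float) -> float:
    """Fractional reduction in average power going from old to new."""
    if old_watts <= 0:
        raise ValueError("old_watts must be positive")
    return (old_watts - new_watts) / old_watts


KIRIN_980 = 4.08  # W, figures as quoted in the article
S9_PLUS = 5.01
KIRIN_970 = 6.33

print(f"vs Kirin 970: {power_reduction(KIRIN_970, KIRIN_980):.1%} lower")
print(f"vs S9+:       {power_reduction(S9_PLUS, KIRIN_980):.1%} lower")
```

That is a drop of a little over a third against the Kirin 970 and a little under a fifth against the S9+.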

Benchmark tests:

First is Geekbench. Averaged over three runs, the single-core score is 3308 points and the multi-core score is 9752 points. Thanks mainly to the A76 architecture and the 7nm process, the Kirin 980 shows a significant improvement over the Snapdragon 845.

In the pi-calculation test of CPU compute power and stability, the Kirin 980 also gave us a pleasant surprise and performed very well. On average, it was slightly ahead of the 845 in each run.

In the GFXBench test, the Kirin 980's Mali-G76 does trail Adreno, although the gap has narrowed compared with the previous generation. In terms of GPU, Adreno is still ahead of Mali.

In the 3DMark test, the OpenGL ES 3.1 score is 3524 and the Vulkan score is 4018. Under Vulkan, Mali's performance is significantly higher than under OpenGL ES 3.1, and it nearly closes the GPU gap. This also reflects Kirin's optimization work on Vulkan in recent years. Popular games such as "Honor of Kings" have been gradually moving from OpenGL to Vulkan, and Vulkan is the future of large Android games.

Other aspects:

For the ISP, the Kirin 980 adopts a fourth-generation self-developed design with 46% higher pixel throughput than the previous generation and 23% better energy efficiency. It supports more cameras, offers HDR color reproduction, and can adjust image color and grayscale region by region to balance color and detail in a photo.

This also enables the powerful night shooting of the Mate 20 series. The Kirin 980 is designed with dark scenes in mind: the chip uses new multi-pass noise reduction technology to remove noise accurately in night scenes while retaining more complete detail, so night shots come out cleaner and clearer. With its dual-core NPU, the Kirin 980 can recognize 4,500 images per minute, a recognition speed 120% higher than the previous generation. In the official video, the Kirin 980 can also trace the joints and outline of the human body in real time and accurately identify a variety of objects, achieving the leap from image recognition to object detection.
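The NPU figures can be restated per second and used to back out the previous generation's rate. This is a minimal sketch that assumes "120% higher" means 2.2 times the previous speed; the function names are illustrative.

```python
def per_second(images_per_minute: float) -> float:
    """Convert a per-minute recognition rate to a per-second rate."""
    return images_per_minute / 60.0


def previous_rate(current_rate: float, gain_pct: float) -> float:
    """Rate before a gain of gain_pct percent (120% higher means 2.2x)."""
    return current_rate / (1.0 + gain_pct / 100.0)


CURRENT = 4500  # images per minute, as quoted in the article

print(f"{per_second(CURRENT):.0f} images per second")
print(f"Implied previous generation: {previous_rate(CURRENT, 120):.0f} images per minute")
```

Under that assumption, the quoted rate works out to 75 images per second, against roughly 2,000 images per minute for the previous generation.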

For the all-important baseband, the Kirin 980 is the first in the world to support LTE Cat.21, with the industry's fastest downlink rate of 1.4Gbps, and it handles the band combinations of different operators around the world more flexibly. It is also worth mentioning that Huawei has already prepared the Balong 5000 modem for Kirin, making it a 5G-ready solution.
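To put the 1.4Gbps figure in everyday terms, the sketch below converts it to megabytes per second and to an ideal download time. It assumes the peak rate is sustained with no protocol overhead, which real networks will not deliver, and the 1 GB file size is just an example.

```python
def megabytes_per_second(gbps: float) -> float:
    """Convert a link rate in gigabits per second to megabytes per second."""
    return gbps * 1e9 / 8 / 1e6


def download_seconds(file_bytes: float, gbps: float) -> float:
    """Ideal transfer time at a sustained link rate, ignoring all overhead."""
    if gbps <= 0:
        raise ValueError("gbps must be positive")
    return file_bytes * 8 / (gbps * 1e9)


PEAK_GBPS = 1.4   # downlink rate quoted in the article
ONE_GB = 1e9      # example file size in bytes

print(f"{megabytes_per_second(PEAK_GBPS):.0f} MB/s at peak")
print(f"1 GB in about {download_seconds(ONE_GB, PEAK_GBPS):.1f} s (ideal case)")
```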

Summary:

Since the Kirin 950, the Kirin processor has gradually joined the ranks of high-end processors. It is not perfect, but over the past few years Kirin has clearly kept making progress. With this generation, whether in the key core indicators or in the other assorted features, the Kirin 980 has no obvious shortcomings. Building on that balance, it can exploit its close integration with the handset and go further in meeting users' actual needs on the mobile device. For Huawei and for Kirin, this is the most important strategic significance.