Huawei released the P20 Pro and P20 last month. On announcement day the P20 Pro shot straight to the top of DxOMark's mobile rankings, and not by one or two points: it scored a full 10 points higher than the previous leader, the Samsung Galaxy S9 Plus. That is not an incremental win but a generational lead. Admittedly, the P20 Pro's 1/1.73-inch CMOS gives it an almost unfair hardware advantage, but the P20's 1/2.7-inch CMOS is only marginally smaller than those of the Galaxy S9 Plus and Pixel 2, yet its image quality also shows a clear edge. Where does this advantage come from?
There are two schools of thought on improving photo quality. The first is brute-force hardware: a physically larger CMOS with larger pixels, adding mechanical optical image stabilization, widening the aperture. The second is soft power: software algorithms and optimization.
The textbook example of the first approach is the Samsung Galaxy S9 Plus, whose predecessor, the Galaxy S8, had made little camera progress over the Galaxy S7. The Galaxy S9 Plus was the first to use a new 1.4μm CMOS with a three-layer stack that includes DRAM, but that is a secondary factor in its photo improvement. The real progress came from a great leap in the optics: the Galaxy S9 Plus adopted a breakthrough F1.5 aperture. Such a large aperture comes at a high cost, however. First, lenses faster than F1.8 are difficult to manufacture and yields are low, driving up lens-module production costs. Second, with so large an aperture, bright scenes tend to overexpose; the Galaxy S9 even introduced a mechanical variable aperture (a physical iris) to solve this, which greatly increases the complexity of the camera assembly and reduces reliability.
The other approach is software algorithm optimization, and the textbook example here is the Pixel 2. Its hardware (a single 12.2MP camera with 1.4μm pixels and an f/1.8 lens) is nothing exaggerated, yet it still overwhelms hardware monsters like the iPhone X. The credit goes to the DeepLab-v3+ algorithm, an AI deep-learning neural network that analyzes the scene in real time, identifies its content, optimizes it accordingly, and applies HDR+ processing, largely avoiding overexposure. The most critical trick in the DeepLab-v3+ pipeline is producing a depth-of-field effect from a single camera purely through the algorithm. This requires heavy encoding and decoding, repeated sampling and conversion, and an enormous amount of computation; to pull it off, Google even developed a dedicated accelerator chip for the Pixel 2, the Pixel Visual Core, whose FPU floating-point performance is said to be five times that of the Apple A11 Bionic. Google also open-sourced DeepLab-v3+, apparently sharing it freely with the world. Of course, Google has its own calculus: on the one hand, it can draw on the free development resources of the open-source community to keep improving the algorithm; on the other, its main rival Qualcomm's Snapdragon 835/845 lacks A11-class AI performance and cannot run such a demanding algorithm anyway, so Google need not worry about handing competitors something they lack the performance to use.
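For a feel of what segmentation-driven processing looks like, here is a minimal sketch using the DeepLab-v3 model that ships with torchvision (an approximation, not Google's exact DeepLab-v3+ release); the file name and the use of the "person" class are illustrative assumptions.

```python
# Minimal sketch of segmentation-driven photo processing. torchvision ships
# DeepLab-v3 (not the exact DeepLab-v3+ Google open-sourced), so treat this
# as an approximation of the idea, not the Pixel 2's actual pipeline.
import torch
from torchvision import models, transforms
from PIL import Image

model = models.segmentation.deeplabv3_resnet101(pretrained=True).eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("photo.jpg").convert("RGB")   # hypothetical input file
batch = preprocess(img).unsqueeze(0)           # shape: (1, 3, H, W)

with torch.no_grad():
    out = model(batch)["out"][0]               # per-pixel class logits
mask = out.argmax(0)                           # class index per pixel

# Class 15 is "person" in the PASCAL VOC label set this model uses; the
# resulting mask can then drive per-region exposure or background blur.
person = (mask == 15)
```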
P20 sample comparison and analysis
On this point, however, Google miscalculated. Huawei's Kirin 970 was the first mobile SoC to integrate a dedicated NPU, which provides enough performance to run deep-learning algorithms of the same class for improving image quality. So how does an AI algorithm based on deep learning actually improve picture quality? Let's look at DxO's comparison samples. (Source: https://www.dxomark.com/huawei-p20-pro-camera-review-innovative-technologies-outstanding-results/; the original comparison images can be viewed on that page.)
Low light and anti-glare
This is a typical night scene. The black canopy in the lower-left corner is rendered faithfully by the Huawei P20 Pro, while the iPhone X and Pixel 2 clearly overexpose it. Around the street lights on the right, the iPhone X and Pixel 2 also show obvious glare. The P20 Pro performs scene detection through deep learning, which largely prevents the overexposure. The trees in the upper-right corner also look better on the P20 Pro, and not because of noise raised by excessive sharpening: it is the combined result of the large sensor's better high-ISO sensitivity and the post-processing algorithm.
Zoom and optical image stabilization
I used to naively believe that a bigger sensor is always justice, but that holds only for DSLRs, not for phones. Sony and Meizu once used large 1/2.7-inch sensors, but precisely because the sensor was so big, there was no room left in the phone for OIS, and the trade was not worth it. I once came to think that a big sensor and optical image stabilization were like fish and bear's paw: you cannot have both.
Huawei's P20 Pro introduces an innovative AIS (AI image stabilization) feature, which lets it have both the large sensor and stabilization. Roughly, it works by using the framing from the third camera to estimate the hand-shake motion, then performing multi-frame stabilization through AI post-processing, accomplishing a task I once thought impossible.
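Huawei's actual AIS pipeline is proprietary, so as a hedged illustration of the underlying align-then-merge idea, here is a toy multi-frame stabilization sketch that uses OpenCV's phase correlation to estimate the global shift between frames.

```python
# Conceptual sketch of multi-frame electronic stabilization: align a burst of
# frames to the first one, then average them. This is NOT Huawei's AIS
# algorithm (which is proprietary and AI-assisted); it only shows the
# align-then-merge principle behind multi-frame stabilization.
import cv2
import numpy as np

def stabilize_and_merge(frames):
    """frames: list of same-sized grayscale images as float32 arrays."""
    ref = frames[0]
    h, w = ref.shape
    accum = ref.copy()
    for frame in frames[1:]:
        # Estimate the global translation between frames via phase correlation.
        (dx, dy), _response = cv2.phaseCorrelate(ref, frame)
        # Shift the frame back by the estimated motion to align it with ref.
        M = np.float32([[1, 0, -dx], [0, 1, -dy]])
        aligned = cv2.warpAffine(frame, M, (w, h))
        accum += aligned
    return accum / len(frames)   # averaging suppresses noise and shake

# usage: merged = stabilize_and_merge([f.astype(np.float32) for f in burst])
```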
Portrait scene
Foreign-brand phones have always lagged behind domestic brands in portrait and beautification shots. The reason is simply focus: the demand for portrait and beauty processing is more pressing in the domestic market, forcing Chinese manufacturers to invest more in this area. Huawei played to its technical strengths, using an AI deep-learning algorithm to identify the 3D features of a face and then optimize and enhance skin tone and color; in the samples, the P20 Pro's subject looks rosier and more appealing. Because the subject changes in real time, the image must also be analyzed in real time and matched against the neural network's previously learned results to build the 3D facial information before optimization, so the real-time processing load is still considerable.
HDR scene
This scene is a backlit portrait: the brightness outside the window is far too high. The iPhone X blows out the window badly; the Pixel 2 controls the overexposure better; and the P20 Pro's control is nearly perfect, while the indoor subject is still not too dark. An accumulated library of learned scenes makes it possible to segment such complex, high-contrast scenes and set different exposure strategies for each region, and the Kirin 970's powerful NPU makes this processing feasible in real time.
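The P20 Pro's scene-segmented exposure is proprietary, but classic multi-exposure fusion gives a feel for how such a backlit scene can be tamed. This sketch uses OpenCV's Mertens fusion and assumes three hypothetical bracketed input files.

```python
# A hedged illustration of multi-exposure fusion, one common way to handle a
# backlit window like the one described above. Mertens fusion merges
# bracketed exposures without the AI scene segmentation the P20 Pro adds.
import cv2

# Three assumed bracketed shots of the same scene (file names hypothetical).
exposures = [cv2.imread(f) for f in ("under.jpg", "normal.jpg", "over.jpg")]

merge = cv2.createMergeMertens()
fused = merge.process(exposures)            # float image in [0, 1]
cv2.imwrite("fused.jpg", (fused * 255).clip(0, 255).astype("uint8"))
```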
Depth of field effect and edge judgment
The point of a dual camera is to use the secondary camera as a depth sensor, obtaining depth-of-field information through multi-frame synthesis and post-processing. But before the algorithms of the AI era, separating foreground from background was a problem, especially with complex vegetation, whose edges were prone to artifacts.
With AI deep learning, the data accumulated from a huge number of scenes makes the camera far more accurate at separating foreground from background, and the edges are segmented much more finely, without the ragged complex-edge processing seen on dual-camera phones that lack AI support.
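To connect segmentation quality to bokeh edge quality, here is a minimal sketch of synthetic depth of field: given a subject mask (for instance from the segmentation example earlier), blur only the background. A ragged mask produces exactly the ragged edges described above.

```python
# Minimal sketch of synthetic depth of field from a subject mask. Edge
# quality depends entirely on how clean the mask is, which is where the
# article says AI-trained segmentation helps.
import cv2
import numpy as np

def fake_bokeh(image, subject_mask, blur_ksize=31):
    """image: HxWx3 uint8; subject_mask: HxW bool, True on the subject."""
    blurred = cv2.GaussianBlur(image, (blur_ksize, blur_ksize), 0)
    mask3 = np.repeat(subject_mask[:, :, None], 3, axis=2)
    return np.where(mask3, image, blurred)  # subject sharp, background soft
```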
Overall, the Kirin 970's NPU has undergone machine-learning training on more than 500 scenes across 19 categories, such as cats and dogs, food, crowds, macro, night scenes, text, flowers, blue sky, snow, and beaches. Delivering this requires massive data accumulation, and at shooting time a large amount of data must pass through the neural network in real time, placing high demands on both the algorithm and the computing power. The Huawei P20's ascent to the DxOMark summit therefore does not rest on its CMOS and optics advantages alone; excellent algorithms provide the icing on the cake, and those algorithms in turn need strong computing power behind them, which makes the Kirin 970's NPU indispensable.
Anticipating the demands of AI deep learning, Huawei proactively built the NPU in as an independent unit from the very start of the Kirin 970's development.
As mentioned earlier, the core of this image processing is scene analysis and recognition, and the work has two phases. The first is training: a convolutional neural network (CNN) must be fed a large number of samples. Training can be performed locally on the device, but it usually runs on large servers in the cloud. The second phase, analyzing and judging each new photo, can basically only be done locally. Although this is inference on just a single new sample, it must be matched against the accumulated training results, and the CNN's forward pass involves an enormous number of operations, which demands powerful FP16 floating-point processing capability.
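As a concrete, if toy, illustration of the two phases, the following PyTorch sketch trains a tiny stand-in CNN (the server-side phase) and then reloads it for single-image inference (the on-device phase). The network, data, and file name are all hypothetical; the 19-way output merely echoes the 19 scene categories mentioned earlier.

```python
# Sketch of the two phases described above, using a toy CNN in PyTorch.
# Phase 1 (training) would normally run on a server; phase 2 (inference on
# a single new photo) is what must run locally on the phone.
import torch
import torch.nn as nn

model = nn.Sequential(                      # tiny stand-in for a real CNN
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 19),                       # e.g., 19 scene categories
)

# --- Phase 1: training (cloud/server) ---
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(16, 3, 64, 64), torch.randint(0, 19, (16,))  # dummy data
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
torch.save(model.state_dict(), "scene_net.pt")

# --- Phase 2: inference (on device) ---
model.load_state_dict(torch.load("scene_net.pt"))
model.eval()
with torch.no_grad():
    scene_logits = model(torch.randn(1, 3, 64, 64))   # one new photo
```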
The so-called 'NPU' of the Qualcomm Snapdragon 845 is the Hexagon 685, which is only a minor revision of the previous Hexagon 682. Strictly speaking it is a DSP with a simple vector processing unit, not a true NPU like those in the Kirin 970 and the A11, and its compute capacity can also be occupied by other tasks. More complex AI machine-learning workloads on the Snapdragon 845 still fall back to the GPU or even the CPU, at enormous cost.
This is TechInsights' die shot of the Kirin 970; the NPU area can be picked out just below the A53 little-core cluster. Among current mobile SoCs, only the Apple A11 and the Kirin 970 have an NPU in the true sense.
AI and deep learning are not just a hardware problem. Mobile AI processing is a multi-layer system engineering effort combining software and hardware. At the top sits the application layer; below it, the API layer connects applications to the hardware. On Android there are currently two main AI acceleration APIs: Google's official Android NN API, which, like DirectCompute on Windows, offers broad software and hardware compatibility and serves as the industry standard; and Huawei's HiAI, an API specific to Kirin chips that, like NVIDIA's CUDA, is limited in the hardware it supports but more efficient on it. Below the API layer sits the HiAI heterogeneous resource management system, which assigns tasks to the underlying hardware layer; that hardware can be the NPU, GPU, CPU, DSP, or even the ISP.
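HiAI's scheduler is closed, so the following is only a conceptual sketch of the layering described above: an API-level dispatcher that routes work to the best execution unit the device exposes. The throughput numbers are placeholders loosely echoing the article's own ratios, not measurements.

```python
# Toy dispatcher illustrating the layering the article describes: an
# app-facing API that routes a workload to the best available execution
# unit. The real HiAI scheduler is proprietary; everything here is assumed.
from enum import Enum

class Backend(Enum):
    NPU = "npu"
    GPU = "gpu"
    DSP = "dsp"
    CPU = "cpu"

# Hypothetical relative throughputs, loosely echoing the article's figures
# (NPU ~25x CPU, GPU ~4x CPU); real numbers vary by workload.
THROUGHPUT = {Backend.NPU: 25, Backend.GPU: 4, Backend.DSP: 2, Backend.CPU: 1}

def pick_backend(available):
    """Choose the fastest backend the device actually exposes."""
    return max(available, key=lambda b: THROUGHPUT[b])

def run_inference(graph, available=(Backend.GPU, Backend.CPU)):
    backend = pick_backend(available)
    # ... hand the graph to the chosen backend's driver here ...
    return backend

# A Kirin 970 would offer Backend.NPU; a chip without one falls back to
# the GPU or CPU, at the performance and power cost the article describes.
```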
This shows that AI does not have to run on the NPU; the CPU, GPU, and DSP can all do the work, but there are fundamental differences in performance and efficiency. By Huawei's figures, the GPU offers 4 times the AI performance of the CPU, and the NPU 25 times. Beyond raw performance, the gap in energy efficiency is even starker: the NPU is roughly 8 times more efficient than the GPU and 50 times more than the CPU. For mobile devices already stretched to their power limits, that is an essential difference.
When the Kirin 970 uses its NPU for real-time object recognition, it reaches 16 GFLOPS of compute, a single recognition pass takes only 32 ns, and the operating current is only 300 mA. Compared with CPUs and GPUs burning several watts, that is very green.
Master Lu's AI benchmark shows the Kirin 970's clear performance advantage over a Snapdragon 845 development device. The test covers three items: InceptionV3, ResNet34, and VGG16. Each uses a different algorithm to classify 100 images and scores performance by elapsed time. These three networks are among the most representative in image recognition today and can fairly stand in for current AI deep-learning workloads. Of the three, InceptionV3 leans more on the CPU and GPU, so the Kirin 970 and Snapdragon 845 differ little there; ResNet34 and VGG16 can take fuller advantage of the NPU, so the NPU-equipped Kirin 970 shows an obvious advantage in the latter two tests.
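For readers who want to approximate what such a benchmark measures, this sketch times the same three networks (as shipped in torchvision) on a batch of random images. It runs on desktop PyTorch rather than a phone NPU, so treat the numbers as illustrative only.

```python
# Rough re-creation of what a benchmark like Master Lu's AI test does: time
# how long each of the three networks takes to classify a batch of images.
# Desktop PyTorch here, not the phone's NPU path, so results are indicative.
import time
import torch
from torchvision import models

nets = {
    "InceptionV3": models.inception_v3(pretrained=True),
    "ResNet34":    models.resnet34(pretrained=True),
    "VGG16":       models.vgg16(pretrained=True),
}

batch = torch.randn(8, 3, 299, 299)
for name, net in nets.items():
    net.eval()
    size = 299 if name == "InceptionV3" else 224  # InceptionV3 wants 299x299
    x = torch.nn.functional.interpolate(batch, size=(size, size))
    start = time.time()
    with torch.no_grad():
        net(x)
    print(f"{name}: {time.time() - start:.3f}s for {x.shape[0]} images")
```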
VGG16 in particular, a 16-layer structure combining 13 convolutional layers and 3 fully connected layers, suits the NPU very well. The algorithm computes in FP16, while the CPU's execution units issue FP32 operations: higher precision, but when handling FP16 data each issue slot still processes only a single FP16 value, wasting resources. For FP16-based workloads like VGG16, an FP16-optimized NPU is therefore more efficient and makes fuller use of its resources.
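A concrete way to see the FP16 point: casting VGG16 to half precision halves its parameter storage and memory traffic, but exploiting that at speed requires hardware with native FP16 execution, which is the article's argument for the NPU.

```python
# Casting VGG16 to FP16 halves its parameter storage (and thus memory
# bandwidth). Actually *running* it fast in FP16 needs hardware with native
# FP16 paths (an NPU or a modern GPU), per the article's argument.
import torch
from torchvision import models

net = models.vgg16(pretrained=True)
mib = lambda n: sum(p.numel() * p.element_size() for p in n.parameters()) / 2**20

print(f"FP32 weights: {mib(net):.0f} MiB")         # ~528 MiB
print(f"FP16 weights: {mib(net.half()):.0f} MiB")  # ~264 MiB, half the traffic
```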
Some companies even boast about the Snapdragon 660's AI performance, but the 660's Hexagon 680 DSP has no FP16 compute capability at all. So what does its 'AI' run on? The CPU or GPU? It can indeed run AI workloads, so we cannot call the marketing outright false, but the performance and power consumption are nothing to look at.
In the pre-AI era, there was no real barrier to improving camera quality for brands without low-level R&D capability: spend some money to buy a better CMOS from Sony, put up with lower yields and harder-to-make wide apertures, tune Qualcomm's or Samsung's stock ISP algorithms, and you could still ship a decent camera phone. After the arrival of AI, the traditional approach no longer works. Brands that merely compete on hardware and lack low-level R&D strength can at most produce good hardware; without intelligent algorithms built on AI deep learning, their entire product concept falls behind. The gap between the two echelons will widen into a gulf, and it will become difficult for them to keep up with the level of the top AI companies.
Companies whose R&D strength is somewhat better can at least participate in the AI deep-learning game, but they lack chip-level R&D capability. When upstream chip suppliers cannot provide sufficient NPU compute, all they can do is complain and reluctantly make do.
Therefore, companies with AI machine-learning development capability can be considered second tier, while those with chip R&D capability and a platform ecosystem of their own are the top tier. Huawei has the Kirin chips as its foundation, HiAI and development kits as platform support in the middle, and consumer electronics and software applications in direct contact with end users on top. Such an ecosystem is a healthy, sustainable AI ecosystem. For consumers, deep-learning AI is just a black box: users do not need to understand how it works, only to enjoy its fruits. And Huawei's Kirin is the behind-the-scenes hero of this upgrade, whose hard work will make our lives better in the future.