With the rapid development of technologies such as artificial intelligence (AI) and edge computing in recent years, the consumer electronics and home appliances that make up the smart home are about to undergo a revolutionary change. Ultimately, an AI network composed of home devices may become another family member that you cannot see. The concept of the local cloud, and the equipment built around it, will be an indispensable element in implementing such a home AI network.
Smart speakers and surveillance will become the two pillars of consumer AI
Ronan de Renesse, who tracks consumer technology development at the research firm Ovum (Figure 1), stated that although the use of AI in consumer electronics has frequently been a media focus over the past two years, the real convergence of consumer electronics and AI is only now beginning. Over the next three to five years, many consumer electronics products will carry AI functionality and will link with one another to form an artificial intelligence network in the home.
Figure 1 Ronan de Renesse, consumer technology researcher at Ovum, believes that the various electronic devices in the future home will collectively become an invisible family member.
For the hardware supply chain, this trend will certainly bring many new business opportunities. At a higher level, however, the artificial intelligence network that has quietly crept into the home will become another "family member" that you cannot see.
On the hardware side, the smart speakers everyone is now familiar with are essentially mature products. Sales will grow significantly over the next five years, but the growth will gradually slow; it is estimated that by 2022, global smart speaker sales will approach 9.5 billion US dollars. In fact, Renesse believes that Amazon and Google may not necessarily keep launching own-brand smart speakers, because this type of product itself has little profit margin. For the home-network giants, as long as hardware vendors use their platform services, they can collect the user data they need.
Over the same period, home intelligent monitoring systems will change more dramatically than smart speakers. Today's so-called smart home monitoring products do not actually contain artificial intelligence; they are cameras, alarms, door locks, sensors, and other hardware connected together into a security system that supports event triggering. However, as the related software and hardware technologies mature, the proportion of home surveillance cameras carrying artificial intelligence will increase, enabling more applications, such as voice assistants that provide each user with accurate, personalized service in a multi-user environment.
Privacy protection is key for consumer AI applications
For the hardware industry, however, what is most noteworthy is that the concept of the local cloud, and the products built around it, will gain traction as home devices generally come to support AI. Renesse pointed out that electronic products with AI functionality will produce a large amount of user data, much of it related to personal privacy. If these AI-equipped home electronics rely entirely on an external cloud to operate, privacy concerns will obviously arise.
On the other hand, many consumer IoT devices with relatively simple functions are constrained by power budgets, computing capability, and production cost, and may not be able to support sophisticated AI algorithms. Here, a local cloud device can play the role of the brain, coordinating these simpler devices.
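To make that division of labor concrete, the following is a minimal sketch, with all class and method names hypothetical, of a local hub that collects raw readings from simple sensors and runs the heavier inference on their behalf:

```python
# Minimal sketch of a local-cloud hub coordinating simple IoT devices.
# All names are hypothetical illustrations, not any vendor's API.

class SimpleSensor:
    """A constrained device: it only captures data, with no local AI."""
    def __init__(self, name):
        self.name = name

    def read(self):
        return {"source": self.name, "payload": [0.1, 0.2, 0.3]}

class LocalHub:
    """The 'brain': a more capable device that runs inference on
    behalf of the simple sensors and decides what to do."""
    def __init__(self):
        self.devices = []

    def register(self, device):
        self.devices.append(device)

    def infer(self, event):
        # Placeholder for a real neural-network inference call.
        return "alert" if max(event["payload"]) > 0.25 else "ignore"

    def poll(self):
        for device in self.devices:
            event = device.read()
            print(f"{event['source']}: {self.infer(event)}")

hub = LocalHub()
hub.register(SimpleSensor("door_sensor"))
hub.register(SimpleSensor("motion_sensor"))
hub.poll()
```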
However, Renesse also conceded that it is still hard to say which device will serve as the local cloud hub. It may be a higher-end smart speaker, a smart TV, or some other product.
Ian Smythe, Arm's senior marketing director (Figure 2), likewise believes that more and more computing and inference will move to terminal devices. The main driving force behind this shift is protecting user privacy. By processing and analyzing data on the terminal itself, the data can easily be anonymized, ensuring that sensitive information is not leaked over the network. Taking home applications as an example, consumers do not want anyone on the Internet to be able to learn when they are away from home and then burgle the house with ease.
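As a hedged illustration of on-device anonymization (a generic sketch, not Arm's or any vendor's actual pipeline), a terminal can replace direct identifiers with a salted one-way hash and drop sensitive fields before anything leaves the device:

```python
import hashlib

def anonymize_event(event: dict, salt: bytes) -> dict:
    """Strip direct identifiers and replace the user ID with a salted
    one-way hash before the event ever leaves the device."""
    pseudonym = hashlib.sha256(salt + event["user_id"].encode()).hexdigest()[:16]
    return {
        "user": pseudonym,             # pseudonymous, not reversible
        "activity": event["activity"]  # the analytic payload actually needed
        # deliberately dropped: raw audio/video, location, timestamps
    }

event = {"user_id": "alice@example.com", "activity": "entered_living_room"}
print(anonymize_event(event, salt=b"device-local-secret"))
```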
Figure 2 Ian Smythe, senior marketing director at Arm, says that for consumer AI applications, a trustworthy privacy protection mechanism will be the key to widespread adoption.
For visual applications, Smythe believes that cameras supporting visual recognition inherently raise important privacy issues. These devices must clearly be designed to protect private and sensitive information, whether it is stored locally or transmitted to the cloud. Since these devices usually connect wirelessly, special attention must be paid to the security of the wireless link; the engineers designing them must ensure that networked devices cannot be hacked and snooped on.
Battery life remains the main technical challenge
However, in pushing AI to edge nodes, the biggest technical challenge today is still system power consumption. Taking consumer surveillance cameras as an example, consumers may expect such products to be completely wireless, preferably without even a power cable. This means these products must be battery-powered while also supporting wireless networking. On top of that, they are expected to identify everything they see and to offer effectively unlimited storage.
These requirements pose a major design challenge: the system must run machine learning (ML) workloads for months on batteries without interruption, while continuously uploading footage to the cloud for storage. Such extreme scenarios place demanding requirements on chip design and system components; most important of all is orchestrating when each function is enabled, so as to prolong battery life.
In the case of a home surveillance camera, the camera does not need to stream video of the room 24 hours a day; it is only sensible to upload footage when an unrecognized person appears. Likewise, it makes no sense to run the ML algorithm when the scene is unchanged, such as an empty room. By carefully arranging where and when these features are enabled, a consumer device can operate as expected on just two AA batteries for a long period of time.
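A minimal sketch of that orchestration, with the motion trigger and recognition model stubbed out, might look like this:

```python
import random

def motion_detected() -> bool:
    """Cheap, always-on trigger (e.g., a PIR sensor), stubbed here."""
    return random.random() < 0.1

def run_person_recognition(frame) -> str:
    """Expensive ML step, stubbed; only invoked after a motion event."""
    return random.choice(["known_resident", "unknown_person"])

def camera_loop(frames):
    for frame in frames:
        if not motion_detected():
            continue                           # camera stays in low-power idle
        if run_person_recognition(frame) == "unknown_person":
            print("uploading clip to cloud")   # only this costs bandwidth
        # known residents and static scenes trigger no upload at all

camera_loop(frames=range(100))
```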
Because power consumption is one of the major obstacles to bringing AI into terminal devices, many startups now see an opportunity to offer low-power neural network (NN) accelerators as silicon intellectual property (IP), helping chip developers reduce power consumption while still meeting the performance that algorithm inference requires. Kneron, for example, has officially released its NPU series, dedicated AI processor IP designed for terminal devices. The series comprises three products: the ultra-low-power KDP 300, the standard KDP 500, and the high-performance KDP 700, covering the needs of smartphones, smart homes, smart security, and a variety of IoT devices. The full range combines low power consumption and small size with strong computing capability. Unlike typical AI processors on the market, the Kneron NPU IP operates in the 100-milliwatt (mW) class, and the KDP 300, dedicated to face recognition in smartphones, consumes less than 5 mW.
Shi Yalun, marketing and application manager at Kneron (Figure 3, left), pointed out that performing AI computation on the terminal device while meeting power and performance requirements is the top priority, so providing solutions optimized for individual applications is critical. AI applications today broadly fall into two categories, voice and vision, and they use different neural network structures. Voice applications center on natural language analysis, where the mainstream architecture is the recurrent neural network (RNN); image analysis mainly uses the convolutional neural network (CNN). To optimize for these different network structures, Kneron provides correspondingly different solutions.
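As a purely illustrative sketch of the two network families (using PyTorch for convenience; Kneron's implementations are proprietary), note how the recurrent network consumes a time sequence while the convolutional network consumes a 2-D image:

```python
import torch
import torch.nn as nn

# Speech: a recurrent network that consumes a sequence of audio features.
speech_net = nn.RNN(input_size=40, hidden_size=128, batch_first=True)
audio = torch.randn(1, 100, 40)           # (batch, time steps, audio features)
_, speech_state = speech_net(audio)       # final state summarizes the sequence

# Vision: a convolutional network that consumes a 2-D image.
vision_net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)
image = torch.randn(1, 3, 64, 64)         # (batch, channels, height, width)
logits = vision_net(image)

print(speech_state.shape, logits.shape)
```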
Shen Mingfeng, software design manager at Kneron (Figure 3, right), added that although natural language analysis makes lower demands on chip computing performance, differences in intonation and speaking habits are so large that the datasets needed for model training far exceed those for video recognition. On the other hand, because consumers are already accustomed to cloud-based voice assistants such as Apple's Siri and Google Assistant, offline semantic analysis will only win them over if it delivers a comparable experience with limited computing resources. This is the challenge facing chip vendors and system developers.
Figure 3 Shi Yalun (left), marketing and application manager at Kneron, believes that speech and image recognition are very different in nature and must be served by different solutions. At right is Shen Mingfeng, software design manager at Kneron.
In fact, the vast majority of smart speakers are still not edge computing products. Shi Yalun pointed out that whether it is Amazon's Echo, Apple's HomePod, or the smart speakers on Baidu's and Alibaba's platforms, data must still be sent back to the cloud for processing and semantic analysis before the device can respond to the user. The voice operations that can be performed directly on the end product are mostly rule-based rather than genuine machine understanding of natural language, as the sketch below illustrates.
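A hedged sketch of what rule-based on-device voice handling means in practice: fixed phrases are matched locally, and anything else must be deferred to the cloud.

```python
# Rule-based on-device voice handling: the device matches fixed
# phrases; anything else is sent to the cloud for semantic analysis.

RULES = {
    "turn on the light": "light_on",
    "turn off the light": "light_off",
    "stop": "halt_playback",
}

def handle_utterance(text: str) -> str:
    command = RULES.get(text.lower().strip())
    if command is not None:
        return f"local: {command}"   # handled entirely on the device
    return "cloud: send audio for full semantic analysis"

print(handle_utterance("Turn on the light"))
print(handle_utterance("make it cozy in here"))   # no rule matches
```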
Since introducing its NPU IP, the first AI processor dedicated to terminal devices, in 2016, Kneron has continuously improved its design and specifications and optimized it for different industrial applications. Among the IP currently available to customers, the KDP 500 has been adopted by a system-house customer and will enter mask tape-out in the second quarter. A voice recognition collaboration with Sogou has also achieved offline semantic analysis, so that terminal equipment can understand the user's voice instructions even without a network connection.
The Kneron NPU IP is a dedicated AI processor designed for terminal devices, allowing them to run deep learning networks such as ResNet and YOLO in an offline environment. The Kneron NPU is a complete terminal AI hardware solution, comprising hardware IP, a compiler, and model compression. It supports mainstream neural network models such as ResNet-18, ResNet-34, VGG16, GoogLeNet, and LeNet, as well as mainstream deep learning frameworks, including Caffe, Keras, and TensorFlow.
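As a generic illustration of the kind of workload such an NPU targets (this sketch uses torchvision's pretrained ResNet-18; the API details vary by torchvision version and are not Kneron-specific):

```python
import torch
from torchvision import models

# Load a pretrained ResNet-18 once (the first load needs a network
# connection); afterwards inference runs fully offline on the device.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

image = torch.randn(1, 3, 224, 224)   # stand-in for a camera frame
with torch.no_grad():
    logits = model(image)
print(logits.argmax(dim=1))           # predicted class index
```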
The Kneron NPU IP consumes power in the 100 mW class, with the ultra-low-power KDP 300 consuming less than 5 mW, and the full product range delivers an energy efficiency of 1.5 TOPS/W or better. Thanks to a number of proprietary technologies, it meets chip and system vendors' demands for low power consumption and high computing capability.
By targeting basic elements, hardware accelerators need not fear technology iteration
Using hardwired circuits to improve the efficiency of certain specific computing tasks and reduce power consumption has been common in chip design for many years, but it comes at the price of low application flexibility: if the market's functional requirements change significantly, or the software algorithms change drastically, chip designers have to develop new chips all over again.
Where the market's requirements for a chip are already largely settled, this design approach is not a problem. In emerging technology areas where iteration is fast, however, it carries considerable commercial risk. Artificial intelligence is a field of very rapid technology iteration, with new algorithms and models emerging almost every year. The research organization OpenAI has also pointed out that over the past six years, the computing performance demanded by AI model training has doubled roughly every 3.43 months.
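Taking the cited figure at face value, the implied growth over six years can be computed directly:

```python
# Implied growth if training compute doubles every 3.43 months for 6 years.
months = 6 * 12
doublings = months / 3.43                 # about 21 doublings
factor = 2 ** doublings
print(f"{doublings:.1f} doublings -> roughly {factor:,.0f}x more compute")
```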
In this regard, Shen Mingfeng pointed out that hardware accelerators are not necessarily inflexible. Take Kneron's products as an example: architecturally, the company uses a Filter Decomposition technique to split a large convolution kernel's computation into multiple small convolution blocks that are computed separately, then uses Reconfigurable Convolutional Acceleration technology to merge the results of those small blocks, accelerating the overall computation.
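The underlying identity is easy to verify. The following PyTorch snippet (an illustrative reconstruction, not Kneron's implementation) splits a 4x4 kernel into four 2x2 blocks, convolves each block at its offset, and confirms that the partial results sum to the original convolution:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 8, 8)       # input feature map
w = torch.randn(1, 1, 4, 4)       # large 4x4 convolution kernel

full = F.conv2d(x, w)             # reference: direct 4x4 convolution

out_h, out_w = full.shape[2], full.shape[3]
acc = torch.zeros_like(full)
for i in (0, 2):                  # split the 4x4 kernel into four 2x2 blocks
    for j in (0, 2):
        block = w[:, :, i:i+2, j:j+2]
        shifted = x[:, :, i:, j:]      # offset input to match block position
        acc += F.conv2d(shifted, block)[:, :, :out_h, :out_w]

print(torch.allclose(full, acc, atol=1e-5))   # True: results match
```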
To use a fairly intuitive metaphor, it is like LEGO bricks: they can be assembled into all kinds of objects, yet every object is still a stack of a few basic blocks. Kneron's solution accelerates the basic elements that every AI algorithm depends on, thereby improving the execution performance of the whole algorithm. So even if AI algorithms are updated at a very rapid pace, the solution can still deliver its acceleration effect.
Beyond an accelerator design that focuses on basic elements rather than accelerating a specific algorithm wholesale, Kneron also provides other techniques for accelerating or deploying AI applications. For example, its Model Compression technology can compress an unoptimized model by dozens of times, and Multi-level Caching reduces CPU usage and data transfer, further improving overall efficiency. In addition, the Kneron NPU IP can be combined with Kneron's image recognition software to provide real-time recognition analysis with a more stable response while meeting security and privacy requirements. Because hardware and software can be tightly integrated, the overall solution is smaller and consumes less power, helping customers develop products rapidly.
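Kneron's compression technology is proprietary, but as a generic illustration of the same idea, post-training quantization shrinks a model by storing weights as 8-bit integers instead of 32-bit floats:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10))

# Post-training dynamic quantization: weights become 8-bit integers,
# cutting model size roughly 4x, usually with little accuracy loss.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 256)
print(model(x).shape, quantized(x).shape)   # same interface, smaller model
```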
Image recognition AI is moving toward the edge more urgently
On the whole, the market need for image recognition at the edge is the more urgent one. Although offline semantic analysis for smart speakers is a potentially huge market, the industry is investing fewer resources in it. The key reason is that video transmission occupies a large amount of bandwidth, which raises the total cost of ownership of the whole system; voice does not have this problem.
Lin Zhiming, general manager of Jingxin Technology (Figure 4), explained that combining artificial intelligence with the Internet of Things will also drive the adoption of edge computing technology across a variety of emerging applications. Within these trends, flexibility and speed are the greatest advantages of Taiwanese manufacturers; for most Taiwanese companies and IC design houses, it is easier to enter the artificial intelligence market from the edge.
Figure 4 Lin Zhiming, general manager of Jingxin Technology, estimates that the IP Cam will be one of the main applications performing AI inference on edge devices.
At the same time, the introduction of edge computing raises hardware requirements such as memory and transmission, which significantly increases manufacturing cost. Since image-oriented systems-on-chip (SoCs) are already more complex than those for other applications, their cost tolerance is also larger, so edge computing is expected to be adopted first by image-related applications such as the IP Cam.
Artificial intelligence applications can be divided into training and inference. The massive computation of deep learning training will, in the short term, still be performed in the cloud. The edge's task is to pre-process the collected information, filter out what is unimportant, and only then upload the data to the cloud, saving transmission costs. Conversely, deep learning completed in the cloud can make the terminal's recognition smarter: the heavy lifting of image learning can first be done in the cloud, and once the model has learned to recognize pedestrians, the IP Cam at the edge only needs to perform the recognition work, as sketched below.
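A minimal sketch of this division of labor (the model, file name, and shapes are all illustrative): the cloud trains and exports the weights, and the edge device merely loads them and runs inference on each frame:

```python
import torch
import torch.nn as nn

def make_detector():
    """Tiny stand-in for a pedestrian detector (illustrative only)."""
    return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(8, 2))

# Cloud side: training happens here (omitted); export the learned weights.
cloud_model = make_detector()
torch.save(cloud_model.state_dict(), "pedestrian_detector.pt")

# Edge side (e.g., inside the IP Cam): load once, then only run inference.
edge_model = make_detector()
edge_model.load_state_dict(torch.load("pedestrian_detector.pt"))
edge_model.eval()

frame = torch.randn(1, 3, 64, 64)          # a camera frame
with torch.no_grad():
    is_pedestrian = edge_model(frame).argmax(dim=1)
print(is_pedestrian)                       # upload only on a detection
```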
On the other hand, because the IP Cam is widely used in security and community safety, governments and enterprises are comparatively willing to invest in it, which is another reason IP Cam applications are developing so quickly.
Lin Zhiming shared that many manufacturers are now exploring how to bring artificial intelligence into their own chips and systems. The situation resembles the early days of the Internet of Things, when everyone was still figuring out how to use the technology. Manufacturers are expected to launch more actual products around 2020.
Real-time applications must use edge computing architecture
Artificial intelligence is a hot topic today, and the gradual shift from a cloud computing architecture to an edge computing architecture will have a significant impact on supply chain vendors. Although the development of artificial intelligence will continue to be dominated by the cloud in the short term, many AI functions for vision applications are beginning to move to the edge.
Dale K. Hitt, director of market development for Xilinx's visual intelligence strategy (Figure 5), points out that for the foreseeable future, the training side of AI development may remain dominated by cloud computing, but the inference/deployment side has already begun to use edge computing to support applications that require low latency and network efficiency.
Figure 5 Dale K. Hitt, director of market development for Xilinx's visual intelligence strategy, believes that for applications requiring very low latency, edge computing will be the best solution.
Machine learning for vision-related applications will be one of the key, far-reaching trends in edge computing, with strong growth potential in industrial machine vision, smart cities, visual analytics, and the self-driving market. For industrial vision and consumer applications, because the edge must execute machine learning algorithms, performance requirements are much higher than for the previous generation of solutions. Moreover, machine learning algorithms and functions at the edge are evolving rapidly, which calls for adaptive hardware that can be optimized for future machine learning inference architectures.
Hitt takes self-driving cars as an example. Behind every sensor in the car sits a precise algorithm responsible for turning sensor data into a perceptual interpretation. The latest trend is to use deep learning algorithms to produce these interpretations; however, deep learning algorithms must be trained on a vast number of potential situations to learn how to read all possible sensor data.
After training, the deep learning algorithms need extremely high computational efficiency and ultra-low latency to control the vehicle safely, and in electric vehicles they must also run at low power to limit operating temperature and preserve battery range. The goal is to provide high-efficiency, low-power, adaptable solutions that meet the many demands of self-driving edge AI.
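A rough way to sanity-check such a latency requirement (the model and the 10 ms budget here are purely hypothetical) is to time repeated inference on a representative frame:

```python
import time
import torch
import torch.nn as nn

# Hypothetical perception model and a hypothetical 10 ms per-frame budget.
perception = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                           nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                           nn.Linear(16, 4))
perception.eval()
frame = torch.randn(1, 3, 128, 128)

with torch.no_grad():
    start = time.perf_counter()
    for _ in range(100):
        perception(frame)
    per_frame_ms = (time.perf_counter() - start) / 100 * 1000

print(f"{per_frame_ms:.2f} ms per frame (budget: 10 ms)")
```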
In the development of edge computing, the biggest challenge is that market demand changes too quickly. Therefore, technologies that can quickly adapt to various changes are extremely important to enable companies to maintain their competitiveness.
Hitt further explained that deep learning algorithms are advancing at a rapid rate, and many of 2017's leading solutions are already facing obsolescence. Even hardware that outperforms most rivals today will need further optimization as computing demands keep climbing; hardware must be refreshed at a faster pace to avoid elimination, and some may even need updating while in production. With competing technologies, updating the chip may even require recalling the product.
Hitt added that the unique advantage of FPGAs lies in deep hardware optimization spanning compute, memory architecture, and interconnect. After optimization, they can deliver higher performance at lower power than CPUs and GPUs, whose fixed hardware architectures cannot be quickly optimized for newly emerging requirements.
Edge computing is an unstoppable trend
AI applications that rely on cloud data centers enjoy enormous computing power, and their recognition accuracy is generally higher than inference on edge devices using simplified models. But once privacy concerns, real-time response, connectivity costs, and other factors are weighed, performing inference directly on edge devices remains an attractive option. Moreover, the market for terminal devices is far larger than that for cloud data centers, creating strong economic incentives. This is why the AIoT slogan has been trumpeted so loudly over the past year, and why the major semiconductor companies are actively positioning themselves.
Looking ahead, AI applications supported entirely by the cloud will still exist, but their share will shrink year by year, replaced by a new architecture that combines cloud and edge computing. For AI application developers, the cloud's irreplaceable value lies in model training, not inference. For the same reason, whether a solution provider can offer application developers seamless integration between "cloud" and "edge" will be their most important consideration when evaluating suppliers.