ETO Technologies co-founder: the AI landscape, its scenarios, and its future

I am Leo, co-founder of ETO Technologies. I hold a Ph.D. in statistics from UCLA and have been an artificial intelligence researcher for 15 years. I was a researcher in the laboratory of Prof. Yann LeCun, a founder of deep learning; a winner of the PASCAL image object detection competition in 2010; and a world champion in face recognition in evaluations run by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity (IARPA).

Recently there has been a flood of research reports on artificial intelligence, and investors, entrepreneurs and scholars are all discussing AI trends and their impact on every sector of society. Many of these discussions misunderstand AI technology and the development of the industry, and they can easily mislead. The misunderstandings generally fall into three areas: How big a deal is AI? Who are the real AI players? Where are the AI scenarios?

Speaking from the dual perspective of researcher and entrepreneur, my main points are these: the boundaries of AI can be accurately grasped and expanded only by leading figures; top companies build momentum through their vision; and the future of AI is unparalleled, with no history to learn from and no authority able to predict it.

The AI landscape seen through the "S" curve

I model the history and the future of AI development with the "S" curve shown above (the sigmoid function, which happens to be used also as the activation function of neurons in a neural network). The horizontal axis represents time and the vertical axis represents machine intelligence. Each point on the curve represents the highest level of machine intelligence in the world at that point in time. The curve starts in 2013, the beginning of the new AI era (deep learning); the level of machine intelligence reached before 2013 is negligible compared with the development of the past five years. The red line represents pessimism (AI ebb, AI bubble, and so on), under which development slows after 2017; the blue line represents optimism, under which rapid development continues after 2017. It is worth emphasizing that the blue and red curves share the same reading of AI history. Many market observers and research reports, however, see a different curve: most likely, the AI level their surveys observed was far below the highest level. Analyzing the AI landscape from different positions, the S curve can be read in three respects:

1, the past development of AI, and predictions of the extent and pace of its future development

2, the relationship between AI's level of development and business scenarios

3, the position of each player and the gaps between them
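The S curve described above can be sketched numerically. The snippet below is a minimal illustration, assuming a logistic (sigmoid) function of time; the midpoint year, steepness and ceiling are hypothetical parameters chosen for display, not values read from the figure.

```python
import math


def s_curve(year, midpoint=2017.0, steepness=1.0, ceiling=1.0):
    """Logistic (sigmoid) curve: slow start, rapid middle, saturation.

    midpoint, steepness and ceiling are illustrative assumptions.
    """
    return ceiling / (1.0 + math.exp(-steepness * (year - midpoint)))


def growth_rate(year, midpoint=2017.0, steepness=1.0, ceiling=1.0):
    """Slope of the logistic curve; it is largest at the midpoint."""
    level = s_curve(year, midpoint, steepness, ceiling)
    return steepness * level * (1.0 - level / ceiling)


# Print the level and the slope for each year of a hypothetical timeline.
for year in range(2013, 2022):
    print(year, round(s_curve(year), 3), round(growth_rate(year), 3))
```

In this sketch the pessimistic and optimistic readings would correspond to different choices of ceiling and steepness after the midpoint, while the early part of the curve stays nearly the same.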

Specifically, let's start with the development of AI over the past five years. Take face recognition as an example: fix the probability of correctly finding a face among N individuals at 95%; the vertical axis is then the recognizable scale (the size of N).
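To make the idea of a recognizable scale concrete, here is a minimal sketch under a simplifying assumption that is not from the text: each comparison against one of the N individuals produces a false match independently with probability f, so a search is clean with probability (1 - f)^N. The function name `max_gallery_size` and the sample rates are hypothetical.

```python
import math


def max_gallery_size(false_match_rate, target_accuracy=0.95):
    """Largest N with (1 - f)**N >= target_accuracy.

    Assumes independent per-comparison false matches (an illustrative
    model, not the evaluation protocol of any real benchmark).
    """
    if not 0.0 < false_match_rate < 1.0:
        raise ValueError("false_match_rate must be in (0, 1)")
    if not 0.0 < target_accuracy < 1.0:
        raise ValueError("target_accuracy must be in (0, 1)")
    # log1p(-f) computes log(1 - f) accurately for very small f.
    return math.floor(math.log(target_accuracy) / math.log1p(-false_match_rate))


for rate in (1e-6, 1e-8, 1e-10):
    print(f"false match rate {rate:g}: N up to about {max_gallery_size(rate):,}")
```

Under this model the recognizable scale grows roughly in inverse proportion to the per-comparison error, which is one way to read how a seemingly small accuracy gain can correspond to a much larger N.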

Technology is not converging: the gap is amplified and unlocks new scenarios

In 2017 the highest level of face recognition reaches a scale of 2 billion people, about two hundred times the 2016 level and tens of thousands of times the 2015 level. In the world's most authoritative face recognition test (NIST 2017), we were 2% ahead of the second-place team, Vocord (and Vocord was 10% ahead of Tencent Merit on another test set). A common view holds that technical levels are converging and that a lead of one or two percentage points means nothing (that it is hard to turn a competition lead into commercial value). This misunderstanding needs to be addressed from two angles:

On the one hand, the leader's margin can quickly widen to 5% and 20% at scales of billions and tens of billions, which is a general law of algorithm performance curves. Besides the differences that appear with scale, major differences also appear on hard data. In our experience with algorithms, Black people, women, children, large age spans, occlusion and so on are groups and categories that are harder to recognize, and on these sub-categories the performance differences between algorithms are even larger.

Evaluation at a very large scale is itself not a simple academic problem and requires a great deal of data. Few people can actually observe performance on 2 billion records; in the United States, for example, it is very difficult to assemble a test set of 2 billion. Interviewing some face recognition practitioners therefore does not reveal this, which is the first source of misunderstanding.

On the other hand, as the algorithm improves and the recognizable scale grows, more commercial application scenarios are unlocked. Identity databases of millions and tens of millions correspond to authentication scenarios, such as remote identity verification and unlocking mobile phones, and in those scenarios the argument may hold. Criminal investigation, however, involves comparison against databases of billions, and the demand there is rigid. In these scenarios the difference is not a matter of identifying a few more criminals but of a more than tenfold difference in the probability of finding them; it is almost a question of whether the task can be done at all. Calling this a "noncritical application" is highly misleading.

In the latest security use cases, face search and archiving across video from 10,000 or even 100,000 cameras place very high demands on the algorithm. Given the flow of people past each camera, searching across 10,000 video feeds requires recognition performance equivalent to a scale of 10 billion to 100 billion. This requirement is a thousand times higher than in other scenarios, and at this level the differences between algorithms are amplified in the product experience. In addition, recognition across all of the world's ethnic groups matters for counter-terrorism and for entry and exit control, which place very high demands on recognition coverage.

In summary, the difference between an algorithm with a 99% recognition rate and one with 99.99% lies in the application scenarios it can unlock. These new scenarios are unlocked by pioneering algorithm teams working together with pioneers in vertical fields (such as innovation teams in the public security system). The security practitioners who get interviewed are unlikely to be at this forefront of change, which is another source of misunderstanding.

The three levels of technology, VIE: Vision, Insight, Execution

The most common ways to judge a team are competition championships, real-world cases, head-to-head bidding results, papers, and so on. These may tell you whether an AI team is in the top 10, but they can hardly identify the best team. I deconstruct technology into three levels: Vision, meaning foresight or strategic outlook, the judgment of technology trends; Insight, meaning an understanding of the nature of algorithms and of the laws governing the distribution of the objective world; and Execution, meaning implementation across algorithms, data acquisition, and engineering and computing platforms. In detail:

The most basic level, Execution, is about what level of algorithm performance a team can reach quickly, especially once the general framework is known. It covers basic algorithms, scenario data, the computing and experiment platform, product applications, and so on. For example: how soon after AlphaGo came out can a team catch up, or how quickly can it match the world's best results in speech recognition? Top-level Execution is not something an open-source algorithm platform can make up for. Expertise in a specific area helps a team raise its Execution in that field quickly. Chinese teams should be world class at this level: if Google is No. 1 in the world, China should be no worse than Facebook, Microsoft, Apple or Amazon, and even ahead in some respects, whether in board games, face recognition, speech recognition or other tasks. Most comparisons of technology stay at this level, but the two levels above it matter more.

One level up, Insight is a deep understanding of the technology, including mathematical explanations of algorithmic models and an understanding of the distribution of the objective world. Insight guides how to use data and computing power (that is, how to use algorithms and even invent new ones). This level decides whether a team can do better than Google, or at least keep the same pace of development. Even with the same deep learning framework and the same amount of data, there is a huge gap in how well algorithm performance is tuned. Take face recognition: we trained on 200 million face images (a subset of billions of images) to reach a model with 1 billion effective parameters, using reasonable prior assumptions about the attributes of faces, including lighting, age, race, motion blur and imaging resolution. How the model is customized, how the data is assembled, and how computation is accelerated make significant differences to performance tuning and model learning efficiency (that is, to the Execution described above). This is why Internet giants that have the algorithms, the computing power and the data are not necessarily able to rank in the world's top three on a single AI task.

Vision means predicting development trends, defining future directions, and imagining the impact on life and production. It requires a deep understanding of technology, and also imagination and creativity about technical innovation and the commercial value of technology. It answers where AI is going and how fast.

A team with strong Execution surely has good Insight but may lack Vision; a team with the strongest Vision surely has first-class Insight, but its Execution may be poor. Teams strong in all of V, I and E are extremely scarce in the world. Let me share my impressions using two masters of deep learning, Hinton and LeCun. By 2010, many people in academia were already talking about the importance of big data to machine learning. In 2012 the Hinton team, training on millions of samples and building on the algorithm LeCun had invented, made a world-class breakthrough on ImageNet. In the same period the LeCun team was using fewer than 100,000 training samples. In the first two months after Hinton announced the ImageNet result, the LeCun team was unable to reproduce Hinton's experiment with its own algorithm. After Hinton released the algorithm implementation and tips, the LeCun team's results easily surpassed those of the Hinton team.

Both masters have superb Vision and have stayed with deep learning for thirty years. But the differences in their Vision, and the resulting differences in belief, may have produced an immense difference in Insight at the time (in whether to pursue deeper insights). They also differed significantly in their understanding of the conditions for the breakthrough, such as training data size, model regularization, choice of activation function, and GPU computation. These conditions were probably not entirely clear at the time, and the breakthrough probably rested entirely on Hinton (and that superb PhD student). Because of this difference in Insight, the LeCun team knew the algorithmic framework used and the target performance but did not know the key implementation details. The LeCun team did, however, have better Execution (large-scale system tuning), and within a short time it surpassed the algorithm's performance. Where such a subtle difference in belief between the very top experts ultimately comes from is worth pondering.

Why is Vision important? It is like radar: where others have a blind spot, Vision lets you see, see clearly, believe, and stay calm. You gain not only a strategic advantage but also the strength to resist temptations and distractions.

How can Vision be identified? It is very difficult, almost impossible; only people with the same Vision can recognize it, just as taste is hard to score and can be appreciated only by those who share it. Vision shows you what 99% of experts do not see and do not believe, so greatness and misunderstanding often go together. Even after LeCun's deep learning work had been verified on real test data, it was hard for it to be recognized by mainstream academia in the United States, and even publishing at top conferences was not easy. Nowadays almost all papers carry the deep learning label.

Screening out teams without Vision, however, does have clear signs to follow. Generally speaking, great breakthroughs in both academia and entrepreneurship require consistent investment and deep cultivation over many years. A team that changes its field every year, or that tries every kind of model (vertical, platform, and so on), can safely be classed as having no Vision.

Given this breakdown of VIE, I believe the new era of AI has only one barrier: top people. Leading figures are irreplaceable in laying out the future boundaries of AI technology and business, and they determine the basic elements of AI development (algorithms, computing power, data and scenarios). Teams with top-level Execution and Insight know best where the data that is effective for the algorithm lies and how to use it. Teams with the best Insight and Vision are the first to know where and when the scenarios with the greatest business value brought by technology breakthroughs will arrive.

The future of AI: no history to learn from, no authority to predict it

Having talked about the development of AI and how to deconstruct the technology, let me turn to the future of AI. The new age of AI, based on deep learning, differs greatly from the previous 30 years of AI history: it is technology that is widely used in applications and whose performance is verified in real scenarios and on large-scale data, not just theory or concept. Although the development of the past five years has met people's expectations, many people still worry that the new AI will ebb as it did in the past. I think the new age of AI is just beginning. I briefly outline three features of the new AI:

1, AI is a brand new dimension, which is the most important factor in determining how big AI will ultimately be

How AI technology will innovate and develop, and how it will change business, has no history to learn from and no authority who can judge it accurately. AI is not just one technology: breakthroughs in AI can drive breakthroughs in every technology, including human-computer interaction, search, robotics, chips and computing, medicine, pharmaceutical science, and almost all other disciplines.

2, AI develops rapidly and in strong leaps

From the S curve we can see that AI has developed extremely rapidly over the past 5 years, with ten-fold growth for the algorithm of a single task (face recognition). I am even more optimistic about the future, that is, about how steep the S curve will be after 2018. AI development across multiple technology dimensions, deeply combined with each scenario, will bring an even more striking experience. Across the technology dimensions, from vision to hearing, semantic understanding and motion control, the next few years will bring rapid breakthroughs. Combined with chips, intelligent end devices will cover the last 30 cm of interaction with the user, making the leap from the Internet of Things to the Internet of Intelligence, so that intelligence is everywhere.

3, being one step ahead in AI brings great momentum

On the S curve, the advantage of teams in different positions is not just the difference in time along the horizontal axis but the cumulative effect (the integral of the curve) of leading technology, together with the superposition of multiple AI technologies (multiple AI curves). This makes AI disruptive across industries: it not only determines the gap between first and second place within an industry, but also enables leaders in industries where AI is advanced to gain leverage over industries where AI lags behind.
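The "curve integral" argument can be illustrated with a small numerical sketch. It assumes, purely for illustration, that a leader and a follower move along the same logistic S curve with the follower delayed by `lag_years`; the function names, the lag values and the time window are hypothetical.

```python
import math


def s_curve(t, midpoint=0.0, steepness=1.0):
    """Unit-height logistic curve standing in for the S curve."""
    return 1.0 / (1.0 + math.exp(-steepness * (t - midpoint)))


def cumulative_lead(lag_years, start=-20.0, end=20.0, steps=4000):
    """Trapezoidal integral of (leader level - follower level) on [start, end]."""
    dt = (end - start) / steps
    total = 0.0
    for i in range(steps + 1):
        t = start + i * dt
        gap = s_curve(t) - s_curve(t - lag_years)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end weights
        total += weight * gap * dt
    return total


# The instantaneous gap shrinks as both curves saturate, while the
# accumulated lead keeps everything gained along the way.
for end in (-2.0, 0.0, 2.0, 5.0, 20.0):
    gap_now = s_curve(end) - s_curve(end - 1.0)
    print(f"t={end:5.1f}  gap now={gap_now:.3f}  "
          f"accumulated lead={cumulative_lead(1.0, end=end):.3f}")
```

In this toy model the accumulated lead over a long window approaches the time lag itself (in units of the curve's height), so doubling the head start doubles the accumulated advantage even though the instantaneous gap eventually closes.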

The future of AI is unparalleled; because we see it, we believe.
