Given the large number of research reports on artificial intelligence (AI), AI trends and their influence on every sector of society are hotly debated by investors, entrepreneurs and scholars. Yet there is much misunderstanding about AI technologies and industrial development. At the macro level, the topic breaks down into three questions: What is AI? Who are the real AI players? Which scenarios can AI be applied to? This article looks at different scenarios, using facial recognition and facial verification as examples.

The S-shaped Curve

Let’s use the S-shaped curve above to model the history of AI development and to forecast its future. On the curve, the horizontal axis represents time and the vertical axis represents the level of machine intelligence. Any point on the curve indicates the world’s highest level of machine intelligence at that time.

Before the new AI era began, progress in machine intelligence was too insignificant to be taken into account. The red curve represents the view of the pessimists (those who expect an AI recession, an AI bubble and so on), who believe development has slowed down or even paused since 2017. The blue curve represents the optimists, who think AI kept its momentum and continued to develop rapidly in 2017. It is worth noting that both curves reflect a similar understanding of the history of AI development.

However, the red curve appears more often in commentary and research reports, reflecting a conservative attitude toward AI development and a conservative prediction of it. In fact, the capability and pace of AI development may have far exceeded what any survey has found. For facial recognition over the past five years, take the task of identifying a face from a database of N samples at 95% accuracy. The vertical axis is then the scale of facial recognition ability (the magnitude of N).

To understand the current state of AI development, we can read the S curve from three perspectives:

1. The past and forecast level and rate of AI development

2. The relationship between AI development level and business scenarios

3. The gaps between different players in the game of AI development

Technology development will widen the gaps between AI companies rather than close them, and it will unlock more scenarios

In 2017, the best facial recognition technology could identify a face among 2 billion, 200 times the scale of 2016 and tens of thousands of times that of 2015. In the 2017 Face Recognition Vendor Test (FRVT) organized by the National Institute of Standards and Technology (NIST), the world’s most authoritative face recognition test, we stayed 2% ahead of competitors in accuracy on 10 million pairs of comparison samples. People tend to say that such a tiny gap means the competitors’ technologies will draw closer and eventually converge. Most of the public believes that an accuracy advantage of 1% or 2% means nothing (that is, the gap will not translate into business value). I argue that this is a serious misunderstanding, for the following two reasons:

First, a tiny advantage in algorithms is amplified to 5% to 20% in large-scale comparisons over more than 100 million or more than one billion samples; this is the general rule of the algorithm performance curve.
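
To illustrate how a small algorithmic advantage can grow with database size, here is a minimal sketch in Python. It is a simplified model, not the evaluation method used by NIST or by any vendor: it assumes every comparison against a non-matching face independently produces a false match with a fixed probability. The function name and the two false match rates are hypothetical values chosen for illustration.

```python
import math


def search_hit_rate(fmr: float, gallery_size: int, fnmr: float = 0.0) -> float:
    """Probability that a 1:N search finds the right face with no false match.

    Simplified model (an assumption of this sketch): each of the N - 1
    non-matching comparisons independently yields a false match with
    probability `fmr`; `fnmr` is the chance of missing the true match.
    """
    # log1p keeps precision when fmr is tiny (for example 1e-9).
    no_false_match = math.exp((gallery_size - 1) * math.log1p(-fmr))
    return (1.0 - fnmr) * no_false_match


# Hypothetical per-comparison false match rates for two algorithms.
FMR_A = 1e-9
FMR_B = 2e-9

for n in (1_000_000, 100_000_000, 1_000_000_000):
    a = search_hit_rate(FMR_A, n)
    b = search_hit_rate(FMR_B, n)
    print(f"N={n:>13,}  A={a:.4f}  B={b:.4f}  gap={a - b:.4f}")
```

Under these assumptions the two algorithms are almost indistinguishable at one million samples (a gap of about 0.1 percentage points), while at 100 million samples the gap grows to roughly 9 points.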

Beyond the distinct results at large scale, the difference also shows up in the recognition rate on hard data. Judging by past algorithm performance, the faces of African Americans, women, children, the same person across a large age span, and partially covered faces are categories that are difficult to recognize. Within those categories, different algorithms may show even bigger gaps in recognition capability.

Evaluating algorithms at super-large scale is itself no easy academic project, and it needs extensive data to support any result. Only a few people will ever have the chance to see how an algorithm performs on a database of 2 billion, because a test sample of this size is extremely hard to build.

Hence, to evaluate algorithms we need to test them under different circumstances and at large scale, which is a big and complex project. I suggest it is a mistake to believe that people can offer opinions and evaluate algorithms simply because their jobs are related to facial recognition.

Second, as algorithms and recognition capability keep improving, the technology will unlock more business application scenarios. Recognition ability based on millions or tens of millions of samples unlocks identification scenarios such as remote identity verification and unlocking a phone. The theory that “improving the technology makes no difference” does hold here, because these scenarios have relatively low accuracy requirements.

However, in scenarios such as public security or criminal investigation, it is non-negotiable that the technology can recognize faces among more than 100 million or more than a billion samples. A common misconception is that the difference between higher-accuracy and lower-accuracy technology is merely the number of correct matches, which is not the case. In fact, it sometimes means that the higher-accuracy technology can be applied to a given scenario while the lower-accuracy technology fails completely.
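
A rough calculation shows why a scenario can flip from workable to unworkable. The sketch below assumes, as a simplification, that each comparison has a fixed false match rate, so a search over N samples returns about N times that rate in false candidates. The function names, the rates and the one-candidate budget are hypothetical.

```python
def expected_false_candidates(fmr: float, gallery_size: int) -> float:
    """Average number of wrong identities returned by one search."""
    return fmr * gallery_size


def max_gallery_size(fmr: float, budget: float = 1.0) -> int:
    """Largest database that keeps false candidates per search within budget."""
    return round(budget / fmr)


# Hypothetical false match rates for a stronger and a weaker algorithm.
ALGORITHMS = {"stronger": 1e-9, "weaker": 1e-7}
GALLERY = 1_000_000_000  # one billion samples, the scale named in the text

for name, fmr in ALGORITHMS.items():
    false_hits = expected_false_candidates(fmr, GALLERY)
    limit = max_gallery_size(fmr)
    print(f"{name}: ~{false_hits:.0f} false candidates per search, "
          f"usable up to about {limit:,} samples")
```

With these numbers the stronger algorithm returns about one false candidate per search at one billion samples, while the weaker one returns about 100, so an operator would have to review a hundred wrong faces for every query.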

In the latest security cases, facial recognition and archiving based on video collected from 10,000 or even 100,000 cameras require algorithms with extremely high performance. Assuming 10,000 faces appear on each camera and there are 10,000 cameras in total, the algorithms need to reach a recognition ability of 10 billion or 100 billion samples. That is 1,000 times the performance required in other common scenarios. The gaps in product experience between different algorithms will be amplified in the same proportion. In addition, the algorithms will be challenged when trained to recognize faces from different racial groups.

In conclusion, I argue that algorithms with 99.99% recognition accuracy will unlock more scenarios than those with 99%. The newly unlocked scenarios are the product of joint efforts by leading algorithm teams and industry trailblazers. Such progress will be known only to a few industry pioneers rather than to general practitioners.

In my view, from the perspective of a scientific researcher and an entrepreneur: only the leaders of the technology industry can keep expanding the boundaries of artificial intelligence; top companies create potential energy through their foresight; and the future of AI will be incomparable, unprecedented and unpredictable.

Leo Zhu, Co-founder and CEO, YITU Technology

Leo Zhu is the co-founder and CEO of YITU Technology. He has been engaged in artificial intelligence research for 15 years and holds a PhD in statistics from UCLA. He also worked as a researcher in the lab of Professor Yann LeCun, who is considered a founder of deep learning. In 2010 he won the top prize in the PASCAL image object detection competition, and in 2017 he was global champion in the Face Recognition Prize Challenge jointly held by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity (IARPA).