Data sharing giving Chinese AI firms a head start
Xu Li's software scans more faces than maybe any on earth. He has the Chinese police to thank.
Xu runs SenseTime Group, which makes artificial intelligence software that recognizes objects and faces, and which counts China's biggest smartphone brands as customers.
In July, SenseTime raised €350m, a sum it said was the largest single round for an AI company to date. That feat may soon be topped, probably by another startup in China.
The nation is betting heavily on AI. Money is pouring in from China's investors, big internet companies and its government, driven by a belief that the technology can remake entire sectors of the economy, as well as national security.
A similar effort is underway in the US, but in this new global arms race, China has three advantages: a vast pool of engineers to write the software, a massive base of 751 million internet users to test it on, and, most importantly, staunch government support that includes handing over gobs of citizens' data - something that makes Western officials squirm.
Data is key because that's how AI engineers train and test algorithms to adapt and learn new skills without human programmers intervening. SenseTime built its video analysis software using footage from the police force in Guangzhou, a southern city of 14 million. Most Chinese mega-cities have set up institutes for AI that include some data-sharing arrangements, according to Xu. "In China, the population is huge, so it's much easier to collect the data for whatever use-scenarios you need," he said. "When we talk about data resources, really the largest data source is the government."
This flood of data will only rise. China just enshrined the pursuit of AI into a kind of national technology constitution. A state plan, issued in July, calls for the nation to become the leader in the industry by 2030. Five years from then, the government claims the AI industry will create 400 billion yuan ($59 billion) in economic activity.
China's tech titans, particularly Tencent Holdings and Baidu, are getting on board. And the science is showing up in unexpected places: Shanghai's courts are testing an AI system that scours criminal cases to judge the validity of evidence used by all sides, ostensibly to prevent wrongful prosecutions.
"Data access has always been easier in China, but now people in government, organisations and companies have recognized the value of data," said Jiebo Luo, a computer science professor at the University of Rochester who has researched China. "As long as they can find someone they trust, they are willing to share it."
Every major US tech company is investing deeply as well. Machine learning - a type of AI that lets driverless cars see, chatbots speak and machines parse reams of financial information - demands that computers learn from raw data instead of hand-cranked programming.
Getting access to that data is a permanent slog. China's command-and-control economy, and its thinner privacy concerns, mean the country can dispense video footage, medical records, banking information and other wells of data almost whenever it pleases.
Xu argued this is a global phenomenon. "There's a trend toward making data more public. For example, NHS and Google recently shared some medical image data," he said. But that example does more to illustrate China's edge.
DeepMind, the AI lab of Google parent Alphabet, has laboured for nearly two years to access medical records from the UK's National Health Service for a diagnostics app. The agency began a trial with the company using 1.6 million patient records. Last month, the top UK privacy watchdog declared that the trial violated British data-protection laws, throwing its future into question.
Contrast that with how officials handled a project in Fuzhou. Government leaders from that southeastern Chinese city of more than seven million people held an event on June 26.
Venture capital firm Sequoia Capital helped organise the event, which included representatives from Dell, IBM and Lenovo. A spokeswoman for Dell characterized the event as the nation's first "Healthcare and Medical Big Data Ecology Summit."
The summit involved a vast handover of data. At the press conference, city officials shared 80 exabytes' worth of heart ultrasound videos, according to one company that participated. With the massive data set, some of the companies were tasked with building an AI tool that could identify heart disease, ideally at rates above those of medical experts. They were asked to turn it around by the autumn.
"The Chinese AI market is moving fast because people are willing to take risks and adopt new technology more quickly in a fast-growing economy," said Chris Nicholson, co-founder of Skymind, one of the companies involved in the event. "AI needs big data, and Chinese regulators are now on the side of making data accessible to accelerate AI."
Representatives from IBM and Lenovo declined to comment. Last month, Lenovo CEO Yang Yuanqing said he will invest $1bn in AI research over the next three to four years.
Along with health, finance can be a lucrative business in China. In part, that's because the country has far less stringent privacy regulations and concerns than the West.
For decades the government has kept a secret file, called a dang'an, on nearly everyone in China. The records run the gamut from health reports and school marks to personality assessments and club records. This dossier can often decide a citizen's future - whether they can score a promotion or be allowed to reside in the city where they work.
US companies that partner in China stress that AI efforts, like those in Fuzhou, are for non-military purposes. Luo, the computer science professor, said most national security research efforts are relegated to select university partners.
However, one stated goal of the government's national plan is for a greater integration of civilian, academic and military development of AI. (Bloomberg)