Scale AI, the five-year-old visual data labeling company, has acquired a small startup that could help expand its reach in Europe and speed up development of its newest product.
Financial terms were not disclosed.
SiaSearch, which spun out of European venture studio Merantix, has built out a data management platform that acts as a search engine for petabyte-scale data captured by advanced driving assistance and automated driving systems. The startup, which is already working with automakers like Volkswagen and Porsche, is able to automatically index and structure the raw sensor data collected by fleets of vehicles.
That capability fits in nicely with Scale AI’s existing technology. Scale uses software and people to label image, text, voice and video data for companies building machine learning algorithms. It initially launched to provide autonomous vehicle companies with the labeled data needed to train machine learning models to develop and deploy robotaxis, self-driving trucks and automated bots used in warehouses and on-demand delivery. The company has long since expanded into other industries such as government, enterprise and real estate and is now working with companies like Airbnb, DoorDash and Pinterest.
Berlin-based SiaSearch could be particularly beneficial in the buildout of Nucleus, which co-founder and CEO Alexandr Wang has previously called “the first product of our future.” The plan is to fold the team into the Nucleus effort, according to Wang.
Nucleus is an AI development platform that Wang describes as the “Google Photos for machine learning data sets.” The product provides customers a way to organize, curate and manage massive data sets, giving companies a means to test their models and measure performance among other tasks. SiaSearch allows Scale AI to accelerate its efforts and even expand the functions to support the entire machine learning lifecycle, Wang said.
The aim is to weave SiaSearch’s tech into Nucleus to offer a full data engine that any AI developer can use, even outside of automotive or AV tech. That could prove enormously useful to any company, including robotics companies and automakers, that needs not only to capture, label and organize data, but also to have additional tools to continually redefine what new kinds of data are needed to improve the algorithms used in its products.
It’s akin to what Tesla has done, said Wang, who pointed out that the company spearheaded the data engine concept to help engineers improve the Autopilot advanced driver assistance system.
Wang said that automotive and robotics companies have struggled with how to make the most of the vast amounts of data they collect, especially as their fleets of vehicles, robots or other devices expand. Merely uploading all of this data back into the cloud would cost literally billions and billions of dollars, Wang said.
“Basically what every AI team is really looking for, is how do we supercharge our machine learning development and accelerate our data set efforts, as much as Tesla’s been able to,” he said. “We’re just going to give them the same superpowers that Tesla has in terms of being able to constantly supercharge their algorithms with the most relevant, most interesting data from their mobile fleets.”