“Feature stores,” with their dreary and opaque moniker, might not sound like the sexiest subject.
But they’re an essential part of the AI systems that enterprises, and consumers for that matter, use every day. That’s why they’re attracting a growing amount of attention and investment from venture firms, which expect the market opportunity to keep expanding for years to come.
AI systems are made up of many components, one of which is features. Features are the individual variables that act as inputs to the system. In thinking about features, it can be helpful to visualize a table, where the data used by AI systems is organized into rows of examples (data from which the system learns to make predictions) and columns of attributes (data describing those examples). Features are the attributes used to describe each example. An AI spam detector might use features like words in the email body, for example, or a sender’s contact information.
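To make the table picture concrete, here is a minimal Python sketch of how a spam detector’s features might be laid out, with one row per email and one column per attribute. The sample emails, the field names and the three features chosen (`word_count`, `contains_free`, `sender_domain`) are illustrative assumptions, not a description of any real product.

```python
# Hypothetical raw examples: each dict is one email (one row of the table).
emails = [
    {"sender": "alice@example.com", "body": "Lunch tomorrow?"},
    {"sender": "promo@deals.example", "body": "WIN a FREE prize now, click here"},
]


def extract_features(email):
    """Turn one raw email into a row of features (the table's columns)."""
    words = email["body"].lower().split()
    return {
        "word_count": len(words),                        # derived from the email body
        "contains_free": "free" in words,                # a word-based signal
        "sender_domain": email["sender"].split("@")[1],  # from the sender's contact info
    }


# One row per example, one key per feature.
feature_table = [extract_features(email) for email in emails]

for row in feature_table:
    print(row)
```

A model would learn from rows like these rather than from the raw emails themselves.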
Working with features tends to be an ad hoc process within a single AI system. But at the enterprise scale, where data science teams are responsible for maintaining dozens to thousands of systems, a place to manage and track features becomes a necessity.
Enter the feature store, a centralized repository for organizing, storing, and serving the features that AI systems rely on. Introduced as a concept by Uber in 2017, feature stores provide a unified place to build and share features across different teams in an organization.
“Feature stores sit at the intersection of data and machine learning,” Michael Del Balso, the CEO of Tecton.ai, a startup developing feature store software for businesses, told TechCrunch in an email. “[Feature stores are] an essential part of the ‘MLOps’ stack because they enable data teams to quickly, reliably build high-quality features using real-time data and serve those features in production for real-time inference. They serve as the interface between data and [AI] models.”
More than a simple database, feature stores let data engineers see statistics on features, including which features have been used, where they’ve been used, and the impact they’ve had on models. Feature stores also transform data, allowing users to aggregate, filter, and join features without necessarily needing to code. (Think aggregating orders at a restaurant to get the feature value “number of orders over the past 30 minutes.”)
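The restaurant example amounts to a count over a sliding time window. The sketch below shows that calculation in plain Python. The function name `orders_last_30_minutes` and the sample timestamps are hypothetical, and a real feature store would typically run this kind of aggregation over streaming or batch data rather than an in-memory list.

```python
from datetime import datetime, timedelta


def orders_last_30_minutes(order_times, now):
    """Count orders placed in the 30 minutes up to and including `now`."""
    window_start = now - timedelta(minutes=30)
    return sum(1 for placed_at in order_times if window_start < placed_at <= now)


# Hypothetical order timestamps for a single restaurant.
now = datetime(2024, 1, 1, 12, 0)
order_times = [
    datetime(2024, 1, 1, 11, 20),  # 40 minutes ago, outside the window
    datetime(2024, 1, 1, 11, 45),  # 15 minutes ago, inside the window
    datetime(2024, 1, 1, 11, 58),  # 2 minutes ago, inside the window
]

feature_value = orders_last_30_minutes(order_times, now)
print(feature_value)
```

With these sample timestamps, two of the three orders fall inside the window.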
Del Balso explained: “Advanced feature stores … automate production pipelines to collect data from batch data sources and real-time sources, transform the data in real time, and store the data in the offline and online store. [They often also] include built-in monitoring capabilities to monitor pipeline health, data drift, service levels and more.”
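The split between an offline and an online store can be illustrated with a toy in-memory version. In this sketch, the offline store keeps every feature value ever written, which is what training jobs typically need, while the online store keeps only the latest values per entity for fast lookup at prediction time. The class `TinyFeatureStore` and its methods are hypothetical and do not reflect Tecton’s or any other vendor’s API.

```python
class TinyFeatureStore:
    """Toy illustration of the offline/online split (hypothetical, in-memory only)."""

    def __init__(self):
        self.offline = []  # full history of feature values, e.g. for training
        self.online = {}   # latest values per entity, for low-latency serving

    def write(self, entity_id, timestamp, features):
        # Every write is appended to the offline history.
        self.offline.append((entity_id, timestamp, dict(features)))
        # The online store keeps only the most recent values for each entity.
        current = self.online.get(entity_id)
        if current is None or timestamp >= current[0]:
            self.online[entity_id] = (timestamp, dict(features))

    def get_online(self, entity_id):
        """Return the latest feature values for an entity, or None if unknown."""
        entry = self.online.get(entity_id)
        return dict(entry[1]) if entry else None

    def get_history(self, entity_id):
        """Return every (timestamp, features) pair recorded for an entity."""
        return [(ts, feats) for (eid, ts, feats) in self.offline if eid == entity_id]


store = TinyFeatureStore()
store.write("restaurant_42", 1, {"orders_30m": 5})
store.write("restaurant_42", 2, {"orders_30m": 8})

print(store.get_online("restaurant_42"))   # latest values only
print(store.get_history("restaurant_42"))  # full history
```

Production systems back the two stores with different technologies, but the division of labor is the same: history for training, latest values for serving.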
Feature stores promise to enhance collaboration between teams while streamlining the development of AI systems. As the demand for them grows, tech giants and startups like Tecton are developing products to meet the need — and investors are backing them enthusiastically.