Meet Soda, a data monitoring platform that is going to help you discover issues with your data processing setup. This way, you can react as quickly as possible and make sure that you keep the full data picture.
If you’re building a digital-first company, you and your customers are likely generating a ton of data. And you may even be leveraging that data to adjust your product itself — think about hotel pricing, finding the right restaurant on a food delivery website, applying for a loan with a fintech company, etc. Those are data-heavy products.
“Companies build a data platform — as they call it — in one of the big three clouds [Amazon Web Services, Google Cloud, Microsoft Azure]. They land their data in there and they make it available for analytics and more,” Soda co-founder and CEO Maarten Masschelein told me.
You can then tap into those data lakes or data warehouses to display analytics, visualize your data, monitor your services, etc. But what happens if there’s an issue in your data workflows?
It might take you a while to realize that there’s some missing data, or that you’re miscounting something. For instance, Facebook miscalculated average video view times for several years. By the time you spot that kind of issue, an important part of your business might already be affected.
Soda wants to catch data issues as quickly as possible by monitoring your data automatically and at scale. “We sit further upstream, closer to the source of data,” Masschelein said.
When you set up Soda with your data platform, you instantly get some alerts. Soda tells you if there’s something off. For example, if your application generated only 6,000 records today while you usually generate 24,000 records in 24 hours, chances are there’s something wrong. Or if you usually get a new entry every minute and there hasn’t been an entry in 15 minutes, your data might not be fresh.
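To make those two examples concrete, here is a minimal Python sketch of a volume check and a freshness check. It is not Soda's implementation or API: the function names, the 50% volume threshold and the 10-minute freshness window are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def volume_is_anomalous(todays_count, usual_count, min_ratio=0.5):
    """Return True when today's record count falls below a fraction of the usual count.

    min_ratio is a hypothetical threshold, not a value taken from Soda.
    """
    return todays_count < usual_count * min_ratio


def data_is_stale(last_entry_at, now, max_gap=timedelta(minutes=10)):
    """Return True when the newest entry is older than the allowed gap.

    max_gap is a hypothetical freshness window, not a value taken from Soda.
    """
    return now - last_entry_at > max_gap


# The article's two scenarios: 6,000 records instead of 24,000,
# and no new entry for 15 minutes.
now = datetime(2021, 1, 1, 12, 0)
volume_alert = volume_is_anomalous(6_000, 24_000)
freshness_alert = data_is_stale(now - timedelta(minutes=15), now)
print(volume_alert, freshness_alert)
```

A real monitoring tool would presumably learn what "usual" means from historical data rather than rely on fixed thresholds like these.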
“But that only covers a small part of what is considered data issues. There’s more logic that you want to test and validate,” Masschelein said.
Soda lets you create rules to test and validate your data. Basically, think of a test suite in software development. When you build a new version of your app, your code needs to pass several tests to make sure that nothing critical is going to break with the new version.
With Soda, you can check data immediately and get the result. If the test doesn’t pass, you can programmatically react — for instance, you can stop a process and quarantine data.
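As a rough illustration of that test-then-quarantine idea, the Python sketch below runs a set of named rules against each row and sets aside the rows that fail. Everything in it is hypothetical, including the `validate` helper, the rule names and the sample rows. It does not use Soda SQL and makes no claim about how Soda defines or runs its tests.

```python
def validate(rows, rules):
    """Split rows into those passing every rule and those to quarantine.

    Each quarantined entry keeps the row and the names of the rules it failed.
    """
    passed, quarantined = [], []
    for row in rows:
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            quarantined.append((row, failed))
        else:
            passed.append(row)
    return passed, quarantined


# Hypothetical rules for a pricing dataset.
rules = {
    "price_is_positive": lambda r: r["price"] > 0,
    "currency_is_known": lambda r: r["currency"] in {"EUR", "USD"},
}

# Hypothetical sample rows: one valid, two that each break one rule.
rows = [
    {"price": 120.0, "currency": "EUR"},
    {"price": -5.0, "currency": "USD"},
    {"price": 80.0, "currency": "???"},
]

passed, quarantined = validate(rows, rules)
for row, failed in quarantined:
    print("quarantined:", row, "failed:", failed)
```

In a pipeline, the calling code could halt the downstream job whenever `quarantined` is not empty, which mirrors the "stop a process" reaction described above.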
Today, the startup is also launching Soda Cloud. It’s a collaboration web application that gives you visibility into your data flows across the organization. This way, nontechnical people can easily browse metadata to see whether everything seems to be flowing correctly.
Basically, Soda customers use Soda SQL, a command-line tool for scanning data, along with Soda Cloud, a web application for viewing Soda SQL results.
Beyond those products, Soda’s vision is that data is becoming an entire category in software products. Development teams now have a ton of dev tools available to automate testing, integration, deployment, versioning, etc. But there’s a lot of potential for tools specifically designed for data teams.
Soda has recently raised a $13.5 million Series A round (€11.5 million) led by Singular, a new Paris-based VC fund that I covered earlier this week. Soda’s seed investors Point Nine Capital, Hummingbird Ventures, DCF and various business angels also participated.