How The Modern Data Stack Is Going Real-Time

By Nnamdi Iregbulem

Times have changed. Organizations are increasingly fed up with traditional data infrastructure, which is slow to yield answers to key business intelligence questions and often out of date and out of sync with current business realities, typically by a day or more.

The needs and demands of the modern enterprise are shifting in dramatic fashion. As a result, older “batch” paradigms (one big update, once per day, slow to query) are giving way to more granular, higher-frequency updates on a continual basis (multiple updates per second, fast to query), leading to much fresher data and quicker time to insight.
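To make the contrast concrete, here is a minimal Python sketch, not taken from any particular product. It assumes a hypothetical stream of order values and compares a batch-style recomputation over the full history with an incremental update that folds in each event as it arrives. The names `batch_average`, `RunningAverage` and `events` are illustrative assumptions.

```python
from dataclasses import dataclass


def batch_average(order_values: list[float]) -> float:
    """Batch style: recompute from the full history on each scheduled run."""
    return sum(order_values) / len(order_values)


@dataclass
class RunningAverage:
    """Incremental style: keep a small running state and update it per event."""
    count: int = 0
    total: float = 0.0

    def update(self, value: float) -> float:
        self.count += 1
        self.total += value
        return self.total / self.count


# Hypothetical order values arriving one at a time.
events = [20.0, 35.0, 50.0, 15.0]

running = RunningAverage()
# A fresh answer is available after every single event.
fresh_answers = [running.update(value) for value in events]

# The batch job reaches the same final number, but only after its next run.
end_of_day_answer = batch_average(events)
```

Both approaches agree on the final figure. The difference is when the answer becomes available: after every event in the incremental case, and only after the next scheduled run in the batch case.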

In addition to analytical insights, real-time data infrastructure is enabling a new category of applications that can react to changing data as it happens. This touches every part of the data stack, from data ingestion, to business analytics, to machine learning and artificial intelligence.

As use cases have evolved, the underlying infrastructure supporting them has evolved as well. Going real-time is not as simple as tuning your old data systems. In many cases, infrastructure has been rewritten from the ground up to enable real-time workloads.

Real-time infrastructure and tooling can take a number of forms within the modern data stack:[1]

  • Streaming small packets of data from A to B at high frequency and volume (Ex: Apache Kafka, Redpanda, Apache Pulsar)
  • Filtering and transforming streaming data in-flight via stream processing tools (Ex: Apache Flink, Apache Samza, Decodable)
  • Real-time analytics that lets analysts get fresh, up-to-date responses to their business queries at low latency (Ex: Materialize, ClickHouse, Tinybird)
  • Real-time or online machine-learning models that continuously adapt and learn from data and generate predictions on the fly (Ex: Tecton)
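As a rough illustration of how these pieces fit together, the sketch below uses plain Python generators in place of real infrastructure. It does not use the APIs of Kafka, Flink, Tecton or any other tool named above. Every name in it (`source`, `only_successful`, `to_dollars`, `OnlineEstimate`) is a hypothetical stand-in, and the "model" is just an exponentially weighted moving average chosen for brevity.

```python
from typing import Iterable, Iterator, Optional


def source() -> Iterator[dict]:
    """Stand-in for a streaming source; a real system would read from a broker."""
    yield {"user": "a", "amount_cents": 1250, "status": "ok"}
    yield {"user": "b", "amount_cents": 300, "status": "failed"}
    yield {"user": "a", "amount_cents": 4000, "status": "ok"}


def only_successful(events: Iterable[dict]) -> Iterator[dict]:
    """Filter in flight: drop failed events before they reach consumers."""
    return (event for event in events if event["status"] == "ok")


def to_dollars(events: Iterable[dict]) -> Iterator[dict]:
    """Transform in flight: reshape each event as it passes through."""
    for event in events:
        yield {"user": event["user"], "amount": event["amount_cents"] / 100}


class OnlineEstimate:
    """Toy online model: an exponentially weighted moving average."""

    def __init__(self, alpha: float = 0.5) -> None:
        self.alpha = alpha
        self.value: Optional[float] = None

    def learn_and_predict(self, x: float) -> float:
        # The first observation seeds the estimate; later ones nudge it.
        if self.value is None:
            self.value = x
        else:
            self.value = self.alpha * x + (1 - self.alpha) * self.value
        return self.value


model = OnlineEstimate(alpha=0.5)
pipeline = to_dollars(only_successful(source()))
# Each event is filtered, transformed and learned from as it arrives.
predictions = [model.learn_and_predict(event["amount"]) for event in pipeline]
```

Because each stage is a generator, events flow through one at a time rather than being collected into a batch first, which is the property the tools above provide at far greater scale and with delivery guarantees this sketch ignores.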

Assembling and stringing together these various systems is still tricky today. But organizations that make these investments will reap rich rewards, chief among them the fabled “real-time enterprise”: an organization capable of perceiving and reacting to events and changes in its business as they happen.

Interested in going real-time but looking for some inspiration? A handful of next-generation organizations adopted early and have blazed a trail for newcomers.

Some of my favorite case studies in real-time data infrastructure at scale include:

  • Netflix: Over the course of seven years, Netflix grew its streaming data use cases from 0 to over 2,000, along the way building real-time capabilities across data ingestion, movement, analytics and operational processing, and machine learning. Today, Netflix’s real-time infrastructure handles tens of trillions of events per day.
  • Uber: Uber’s real-time infrastructure generates multiple petabytes of data and trillions of messages per day, continuously collected from Uber drivers, riders and other users. Uber has real-time use cases for its mobile apps, internal dashboards, machine-learning models, and ad-hoc data exploration tools.

It’s time for real-time, and the revolution is happening faster than you think. Blink, and you might miss it.

 Nnamdi Iregbulem, partner at Lightspeed Venture Partners, is a self-taught programmer and lifelong technology nerd. His mission is to increase total software output by supporting the entrepreneurs building technical tools for technical people. Iregbulem focuses on investments in technical enterprise software such as developer tools, application infrastructure and machine learning.

Illustration: Li-Anne Dias

  1. Of the listed companies, Lightspeed Venture Partners is an investor in Redpanda, Materialize and ClickHouse.
