CDO Exchange discussion

Mar 22, 2018

Updated: Sep 24, 2020

I facilitated a roundtable discussion at the Chief Data Officer Exchange in London recently.

https://chiefdataofficerexchange.iqpc.co.uk/

Here's how I introduced it...

_________________________

I want to start this discussion by talking for 5 minutes, to set the scene on who we are and what Confluent does. We will then follow with a 25-minute roundtable discussion.

So first, to set context…

We all know that data is valuable. Indeed, 6 or 7 of the top 10 global businesses (by market cap) are now data-driven companies:

Amazon, Apple, Alphabet (parent of Google), Microsoft and Facebook, plus Tencent and Alibaba.
They are living proof that being data driven has value.

To date, the question around the value of data has been dominated by Big Data.

The greater the volume, the greater the value.

We want to alter this conversation and change how organizations work with - and extract value from - data.

We believe there is a time element to many uses of data: the most recent data is the most valuable, and as data ages it becomes less valuable.

This requires a fundamental change of mindset.

With big data, we’ve mostly focused on the passive storage of data.

Phrases like “data warehouse” or “data lake” or “data store” all evoke places data goes to sit.
Movement of data tends to work in batches. By its very nature it’s static or slow. Analysis is done on data ‘stores’.

At Confluent, we think streaming data can include both big data and FAST data.

We are turning the database on its side, or some say, inside out.
We have a data streaming platform based on Apache Kafka.

So, how does data streaming using Apache Kafka work, and why is it different to a traditional relational database?

This is the science bit: Apache Kafka is an open-source stream processing platform, originally developed at LinkedIn by the founders of Confluent and now an Apache Software Foundation project.

Whilst Kafka is often categorized as a messaging system (as it serves a similar role), it provides a fundamentally different abstraction. The key abstraction in Kafka is a structured commit log of updates:

Producers of data send a stream of records, which are appended to this log.
Any number of consumers can continually stream updates off the tail of the log with millisecond latency.
Importantly, Kafka is built as a modern distributed system: it is fault-tolerant, high-throughput and horizontally scalable, and it allows geographically distributed data streams and stream processing applications.
Kafka’s storage layer is essentially a "massively scalable pub/sub message queue architected as a distributed transaction log".
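
For the more technically minded, the commit log idea can be sketched in a few lines of Python. This is only a toy, in-memory illustration and not Kafka's actual API: the `CommitLog` class, its `append` and `poll` methods, and the consumer names are all hypothetical, chosen to show producers appending to the end of a log while each consumer reads forward from its own position.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class CommitLog:
    """Toy append-only log (illustrative only, not the Kafka API)."""
    records: List[Any] = field(default_factory=list)
    # Hypothetical bookkeeping: consumer name -> next offset to read.
    offsets: Dict[str, int] = field(default_factory=dict)

    def append(self, record: Any) -> int:
        """Producer side: add a record to the end of the log, return its offset."""
        self.records.append(record)
        return len(self.records) - 1

    def poll(self, consumer: str) -> List[Any]:
        """Consumer side: return every record this consumer has not yet seen."""
        start = self.offsets.get(consumer, 0)
        batch = self.records[start:]
        self.offsets[consumer] = len(self.records)
        return batch


log = CommitLog()
log.append({"event": "order_placed", "id": 1})
log.append({"event": "order_paid", "id": 1})

billing_first = log.poll("billing")       # both records so far
log.append({"event": "order_shipped", "id": 1})
billing_second = log.poll("billing")      # only the newly appended record
analytics_all = log.poll("analytics")     # a new consumer starts from offset 0
```

The point of the sketch is that reading does not remove anything: records stay in the log, so a consumer that arrives later can still replay the full history while existing consumers only pick up what is new.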

Examples

If all that sounds a bit technical, let me give you a few examples of streaming data in action, fueling the economy of the future.

Every message at Uber, Netflix, Yelp and PayPal goes through Kafka.

Indeed, streaming data architectures have become a central element of Silicon Valley’s technology companies. Many of the largest have built themselves around real-time streams as a kind of central nervous system that connects applications and data systems, and makes available in real time a stream of everything happening in the business.

At Confluent, our vision is for Kafka to act as the central nervous system of every modern company, across all verticals.

If you think about it, a lot of life is a stream of events. Conversations are a stream of information.

One view of a business is as a kind of data processing system that takes various input streams and produces corresponding output streams (and maybe some physical goods along the way). This view of data can seem a little foreign to people who are more accustomed to thinking of data as rows in databases rather than as events - but I start to see a future where traditional databases become redundant.
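
To make the "input streams in, output streams out" view concrete, here is a minimal sketch in plain Python. It assumes a hypothetical stream of order events with `customer` and `amount` fields, and the `running_totals` function is invented for illustration; it is not part of Kafka or the Confluent Platform.

```python
from typing import Dict, Iterable, Iterator


def running_totals(orders: Iterable[Dict]) -> Iterator[Dict]:
    """Turn a stream of order events into a stream of per-customer running totals."""
    totals: Dict[str, float] = {}
    for order in orders:
        customer = order["customer"]
        totals[customer] = totals.get(customer, 0.0) + order["amount"]
        # Emit an updated total as soon as each event arrives,
        # rather than waiting for a nightly batch.
        yield {"customer": customer, "total": totals[customer]}


# Hypothetical input stream of order events.
orders = [
    {"customer": "alice", "amount": 20.0},
    {"customer": "bob", "amount": 5.0},
    {"customer": "alice", "amount": 12.5},
]
updates = list(running_totals(orders))
```

Each incoming event produces one outgoing update, so the "table" of totals is really just the latest state derived from the stream.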

This is truly groundbreaking.

Overall, we think this technology is changing how data is put to use in companies. We are seeing that streaming data is redefining competition. Those that capitalize on it are creating a new, powerful customer experience, reducing costs, designing for regulatory uncertainty, and lowering risk in real time.

We are building the Confluent Platform, a distribution of Kafka aimed at helping companies adopt and use it as a streaming platform. We think the Confluent Platform represents the best place to get started if you are thinking about putting streaming data to use in your organization, whether for a single app or at company-wide scale.

Our mission is to build this streaming platform and put it at the heart of every modern company.

Discussion

I’d like to start the discussion by asking for a show of hands…

Who thinks their current systems, including all their data stores, are architected in a way that will satisfy the changing landscape, in terms of increased customer expectations, competition and the potential for disruption, and keeping things simple as systems become increasingly complex?
