CDO Exchange discussion
I facilitated a roundtable discussion at the Chief Data Officer Exchange in London recently.
Here's how I introduced it...
I want to start this discussion by talking for 5 minutes, to set the scene on who we are and what Confluent does. We will then follow with a 25-minute roundtable discussion.
So first, to set context…
We all know that data is valuable. Indeed, 6 or 7 of the top 10 global businesses (by market cap) are now data-driven companies:
Amazon, Apple, Alphabet (parent of Google), Microsoft and Facebook, plus Tencent and Alibaba.
They are living proof that being data driven has value.
To date, the question around the value of data has been dominated by Big Data.
The greater the volume, the greater the value.
We want to alter this conversation and change how organizations work with - and extract value from - data.
We believe there is a time element to many uses of data: the most recent data is the most valuable, and as data ages it becomes less valuable.
This requires a fundamental change of mindset.
With big data, we’ve mostly focused on the passive storage of data.
Phrases like “data warehouse” or “data lake” or “data store” all evoke places data goes to sit.
Movement of data tends to work in batches. By its very nature it’s static or slow. Analysis is done on data ‘stores’.
At Confluent, we think Streaming data can include both big data and FAST data.
We are turning the database on its side, or some say, inside out.
We have a data streaming platform based on Apache Kafka.
So, how does data streaming using Apache Kafka work - and why is it different from a traditional relational database?
This is the science bit: Apache Kafka is an open-source stream processing platform, originally developed at LinkedIn by the engineers who went on to found Confluent, and now an Apache Software Foundation project.
Whilst Kafka is often categorized as a messaging system (as it serves a similar role), it provides a fundamentally different abstraction. The key abstraction in Kafka is a structured commit log of updates:
Producers of data send a stream of records, which are appended to this log.
Any number of consumers can continually stream updates off the tail of the log with millisecond latency.
Importantly, Kafka is built as a modern distributed system: it is fault-tolerant, high-throughput and horizontally scalable, and it allows geographically distributed data streams and stream processing applications.
Kafka’s storage layer is essentially a "massively scalable pub/sub message queue architected as a distributed transaction log".
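To make the log idea a little more concrete, here is a minimal sketch in Python. It is a toy, in-memory illustration only and not Kafka's actual API: the `CommitLog` class, its `append` and `poll` methods, and the consumer names are all hypothetical, and it ignores distribution, persistence and fault tolerance entirely. It only shows the shape of the abstraction: producers append to the end of a log, and each consumer reads from its own position without removing anything.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class CommitLog:
    """Toy append-only log (illustrative only, not Kafka's API)."""
    records: List[Any] = field(default_factory=list)
    # Next offset to read, tracked separately for each consumer name.
    offsets: Dict[str, int] = field(default_factory=dict)

    def append(self, record: Any) -> int:
        """Producer side: add a record to the end of the log and return its offset."""
        self.records.append(record)
        return len(self.records) - 1

    def poll(self, consumer: str) -> List[Any]:
        """Consumer side: return the records this consumer has not yet seen."""
        start = self.offsets.get(consumer, 0)
        new_records = self.records[start:]
        self.offsets[consumer] = len(self.records)
        return new_records


log = CommitLog()
log.append({"event": "order_placed", "id": 1})
log.append({"event": "order_shipped", "id": 1})

# Two independent consumers each see the full history so far.
billing_first = log.poll("billing")
analytics_first = log.poll("analytics")

# A later record is delivered only as an update; nothing is removed from the log.
log.append({"event": "order_delivered", "id": 1})
billing_second = log.poll("billing")
```

The point of the sketch is that reading does not consume: unlike a traditional queue, the records stay in the log, so any number of consumers can read the same stream at their own pace.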
If all that sounds a bit technical, let me give you a few examples of Streaming data in action - fueling the economy of the future.
Every message at Uber, Netflix, Yelp and PayPal goes through Kafka.
Indeed, streaming data architectures have become a central element of Silicon Valley’s technology companies. Many of the largest of these have built themselves around real-time streams as a kind of central nervous system that connects applications and data systems, and makes available in real time a stream of everything happening in the business.
At Confluent, our vision is for Kafka to act as the central nervous system of every modern company, across all verticals.
If you think about it, a lot of life is a stream of events. Conversations are a stream of information.
One view of a business is as a kind of data processing system that takes various input streams and produces corresponding output streams (and maybe some physical goods along the way). This view of data can seem a little foreign to people who are more accustomed to thinking of data as rows in databases rather than as events - but I am starting to see a future where traditional databases become redundant.
This is truly groundbreaking.
Overall, we think this technology is changing how data is put to use in companies. We are seeing that streaming data is redefining competition. Those that capitalize on it are creating a new, powerful customer experience, reducing costs, designing for regulatory uncertainty, and lowering risk in real-time.
We are building the Confluent Platform, a distribution of Kafka aimed at helping companies adopt and use it as a streaming platform. We think the Confluent Platform represents the best place to get started if you are thinking about putting streaming data to use in your organization whether for a single app or at company-wide scale.
Our Mission is to build this streaming platform and put it at the heart of every modern company.
I’d like to start the discussion by asking for a show of hands…
Who thinks their current systems - including all their data stores - are architected in a way that will satisfy the changing landscape, in terms of increased customer expectations, competition and the potential for disruption, and keeping things simple as systems grow increasingly complex?