Kafka Summit London

May 8, 2019

Talk at Kafka Summit in London:  DRAFT NOTES 

Connecting Kafka to Cash (CKC):  May 13, 2019, 4:15 pm - 4:55 pm

 

https://kafka-summit.org/events/kafka-summit-london-2019/

https://kafka-summit.org/speakers/lyndon-hedderly-2/


Summary: 

 

Hello, my name is Lyndon Hedderly. I'm based here in London - Lyndon from London - and I'm a Director of Customer Solutions at Confluent. I'm going to talk about connecting Kafka to cash - or quantifying the value of Kafka.

 

So, why listen - what's in this talk for you?

 

First, I'm going to talk about why we measure the value of a technology. This talk is useful if you have ever had to create a business case - or justify the implementation of a technology - including Kafka.

Then I'll talk about some of the problems with measuring value - why it's considered 'hard' and therefore often skipped, or left to gut feel and subjective judgement. And I'll cover some solutions to these problems.

In the second half of the talk, I'll give three examples of measuring Kafka's business value - in terms of saving money, making more money and protecting money. I'll conclude with a framework (instructions) on how you might quantify the value of Kafka in your org, and some of the nuances of this exercise.

 

So, why measure value?  

 

We know Kafka is in production in tens of thousands of companies. Here are just a few examples (see pic of examples). Many of these implementations will have gone through some sort of benefits assessment or ROI. And this is the first reason to measure value: as a justification for doing something - completing an ROI before a project starts, in order to obtain, or justify, a budget.

 

But what about Kafka specifically? Event streaming represents a major shift in data infrastructure. Kafka is not just one of the hundreds of supporting technologies enterprises deploy to make some process better or faster. It’s instead one of the very, very few foundational technologies that sit at the heart of how businesses innovate. This is important, because it signifies a technological shift as powerful and important as that of cloud or mobile. And this is all the more reason we should aim to quantify the value of Kafka. But it is also the reason it's especially hard to quantify this value. 

 

Also - quantifying the value of Kafka is not just about creating a business case before we start a project. We know that Kafka is in at least 60% of the Fortune 100 today. We can also value Kafka DURING and POST project…

 

1. DURING a transformation: many projects fail through lack of governance and control. A good value assessment can maintain overall buy-in and focus on execution.

2. POST-transformation: understanding value helps in benefits realization - which is essential for continual service improvement (CSI). If you're not measuring something, how do you really know if you're maximizing the benefit of that thing?

 

The use cases of Kafka are enormous - and understanding value is also useful for expanding the use of Kafka. To illustrate this point, let's take an idea which seems obvious now, but was relatively revolutionary just over 10 years ago: using data to improve customer experience.

 

 

Let me tell you a quick story…

In 2006, a few guys - including Geoff Teehan, later a Product Design Director at Facebook - formed the design agency Teehan+Lax, and a hypothesis:

“Companies that use data to deliver great user experiences will see it reflected in their stock price.”

 

They created the UX fund and invested $50k across 10 companies which in their view offered a great User Experience.

(https://medium.com/habit-of-introspection/the-ux-fund-investing-50-000-in-10-companies-10-years-later-6fc65bd35e7a)

 

So, what happened? Overall, $50k in 2006 became $306k 10 years later.

That's a 503% gain (at a time when the S&P 500 returned 50% - so this fund did more than 10x better).

 

$5k in Netflix became >$157k.  That’s over a 3,000% gain.

And what is interesting here: in this study, you can spot that the top-performing companies are also significant users of Kafka. If we take Netflix, for example, which is clearly leading the pack...

  • Netflix is a massive user of Kafka at scale.  

  • Everything in the business is an event.   

  • They run 50+ Kafka clusters, with 4,000+ brokers, processing an astonishing 2+ trillion messages every single day.

Can we use this to draw a correlation between Kafka and Value?  

Well… Netflix has been going for a while (since '97) - a lot longer than Kafka (2011)... but let's take a closer look…

 

When I last looked at Netflix, the share price was about $380 a share.

Had you invested $10k in Netflix shortly after Kafka was created, it'd be worth almost half a million dollars now ($475k). That's a 47.5x return!

 

Clearly we can't attribute the value of Netflix to Kafka. But perhaps, just as Teehan & Lax identified CX as an indicator of superior performance, we could use being event-driven as an indicator for an investment fund? Maybe there's another reason to identify value - to generate more value - not only for business investment and ROI, but also for our own investment.

 

Which brings me to my next point…

 

The Problem with measuring value.

 

This is also partly why quantifying value is often skipped, or left to subjective gut feel.

 

And I’m going to give some suggestions to address these problems…

 

Quick show of hands - who’s had to create a business case? 

Who has implemented Kafka without a business case?

I predicted about a third would have created a business case, a third wouldn't have - and a third wouldn't raise their hands.

 

Firstly, we've often wildly miscalculated business cases and ROIs, which has made people skeptical.

There's no doubt business cases are hard. We are trying to predict the future.

In my experience, most business cases overestimate benefits, as they are compiled by people with vested interests.

 

The HS2 business case - as a rough example - included benefits of reduced travel time: for instance, reducing the journey from Birmingham to London from 1 hour 21 minutes to less than 50 minutes. The business case team assumed this saved time would result in business productivity (by multiplying hours saved by productivity in £).

 

But this completely ignored the fact that many people work on the train - so a slower journey doesn't necessarily mean less productivity. This is just one example of thousands. In my experience, most business cases over-egg the benefits - and very few are revisited after the event for benefits realisation.

 

So, how do we address this problem? 

 

We focus on the real benefits of Kafka over and above alternative solutions - and we ensure our assumptions are reasonable and ratified.  The benefits of Kafka include:

 

1. Moving from batch to real-time. In a nutshell, this means no longer living in a world where you need to wait until the end of the day for a report to run.  This is important --- as an example… when people tell us reports run at the end of the day, we ask, at the end of what day?  Today, there is no end to the business day. 

 

2. Breaking down silos and the maze of point-to-point connections, and instead having a single enterprise-wide source of truth for events. Without this, developers wind up spending way too much time finding data, and way too little time building apps.

 

3. Building for scale.  It’s no secret that data volumes are exponentially increasing.  Current solutions simply were not built for the volume of data required today, in particular when it comes to IoT.  

 

4. Storing events in a manner that enables real-time replay. Databases that store events have existed for a long time, but today those events also need to be accessible in real-time.

 

5. ...But most importantly - and perhaps most transformative - the true value of Kafka is the capability to leverage context to create rich real-time applications. These applications are contextual, they are event-driven, and they completely change how you run the back-end of your business as well as how you interact with customers. And this brings me to the next problem.
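Benefit 4 - storing events for replay - can be sketched as a toy append-only log: consumers track an offset into the log, so a consumer that joins late can replay history from offset zero. This is a deliberately minimal model of what a Kafka topic partition provides, not real Kafka client code.

```python
# Toy append-only event log illustrating offset-based replay -
# a much simplified model of a Kafka topic partition.

class EventLog:
    def __init__(self):
        self._events = []                # append-only storage

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1     # offset of the new event

    def read(self, offset):
        """Return all events from `offset` onwards (replay)."""
        return self._events[offset:]

log = EventLog()
log.append({"type": "atm_withdrawal", "amount": 50})
log.append({"type": "atm_deposit", "amount": 200})

# A consumer that joins late can still replay the full history:
replayed = log.read(0)
```

In real Kafka, the broker persists the log and each consumer group commits its own offsets, which is what makes both real-time consumption and historical replay possible from the same data.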

 

To illustrate this, I have a little game…

 

This is a 1954 Color Field painting by the Abstract expressionist artist Mark Rothko - called ‘No 1 (Royal Red and Blue)’.  What’s the value? Have a guess…

 

 

 

In 2012, the painting sold for US$75.1 million (£47.2m) at Sotheby's.

 

This is Andy Warhol's 'Liz', created in 1963.

 

 

In 2001, Hugh Grant bought this for £2M. He sold it in 2007 for £11.4M - and the painting sold again for $21M in 2018. So, it jumped from £2M to $21M (over 10x) in 17 years.

 

But Liz isn't as valuable as Warhol's Turquoise Marilyn. In 2007, this silkscreen ink on synthetic polymer paint on canvas sold for $80M.

 

 

Perhaps my favourite example of an interesting valuation is this: Jeff Koons' Bunny...

 

 

 

In '99 it sold for $940k.

In 2018 it sold for $65M - that's a 68x return in 19 years. How was this valued?

 

It turns out, it's not easy to value art - it's really quite subjective… and yet value is still measured in hard cash - and that's the purpose of this little game. Just because it's hard - and we don't always agree with the outcome - doesn't mean we should shy away from valuing something. It can be super useful to do so.

 

So, what’s this worth?  

 

 

 

KAFKA.

 

I can guess what you're thinking - art is different from software - and of course you're partly right.

-Art is subjective. It is about aesthetics more than utility.

-It is subject to supply-and-demand economics. There's a limited supply of desirable artists, driving up the price of works. And so art can be used as an investment.

 

Software, on the other hand, has a utilitarian value - and can be copied and distributed practically for free (especially open source). Software also constantly evolves, so previous versions become outdated. But I wanted to play that little game to show that whilst value can vary widely - and sometimes feel difficult to quantify - that doesn't remove the value. Those art pieces still sold for hard cash! There's a huge subjective element to valuing art. And when it comes to valuing contextual business applications, there's also a subjective element.

 

To further explain the subjective element of valuing a (contextual) piece of technology…

I have an example. Let's take an everyday item that uses data, or events. Let's say you have an Apple Watch - what is that worth?

1. We can talk about value in terms of price: it costs around $83 to build an Apple Watch, and the retail price is $350. Is this the value?

 

2. However, being situationally aware, the new Apple Watch is packed with a slew of sensors. Let's say the fitness information encourages you to run 5k three times a week. Now what's it worth? Some health insurers are offering a better deal based on your tracking data. Can value be tied to lower insurance premiums?

 

3. And the Apple Watch can administer a medically accurate electrocardiogram. Let's say the data gives you an early warning of a serious health issue. Now the value of the watch - and the data - is literally life and death.

 

The bottom line: the value of data - and how you work with data - is highly relative and situationally specific. So now the art examples start to feel a little more relevant.

 

So, how do we deal with this? 

If we take Kafka to be a data infrastructure layer, we can focus on the data, or events.

We know data = dollars. So, let's look at the value of DATA first. In the Teehan & Lax example, we looked at the value of data in driving customer experience - and saw that companies that were good at this drove better business performance.

 

Prior to the digital revolution, the most valuable companies were mostly oil or financial services companies.

Sir Clive Humby - who developed the Tesco Clubcard - coined the term 'data is the new oil' in 2006 (the same year as the Teehan & Lax study), having realised data was driving huge value potential. Now, of course, it's tech and data-driven companies, not oil companies, that are the most valuable.

 

Sir Clive Humby went on to say: "Oil is valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down and analyzed for it to have value."

 

This is an interesting statement because, whilst the analogy between data and oil isn't perfect, there are further extensions we can use.

 

With this analogy we can see that, like oil, it is not just the data that is important - it is how it is moved around, and how timely it is. If we look at an oil pipeline, for example, the pipeline starts to assume the value of the oil it is carrying. The ability to move the oil becomes almost as valuable as the oil itself.

 

In other words, if a pipeline is transferring $1m of oil a day and it goes down, you could say the pipeline is worth $1m a day. And to complicate things further, there are external factors which will impact the value, or price, of the oil. If a pipeline elsewhere goes down, the oil price may go up - especially if there's a dependence on the oil from other critical infrastructure, such as power plants. You need to be 'situationally aware'. Situational awareness can be defined simply as 'knowing what is going on around us'. And value changes - as with art - we have seen the price of oil vary from below $20 a barrel to nearly $150 in just a few years...

 

Now imagine a pipeline that could also refine your oil, in near real-time. What would that be worth?

 

In business, things are worth what people will pay for them. But they're also worth what people think they're worth. Look at Uber - it's never made a profit, but it's currently worth about $80bn…

 

So, to show how I’ve measured Kafka’s value I’m going to provide three examples.  

With Kafka - and software in general - we should focus on its utility value. 

And we need to take a semi-scientific approach - by isolating the variable we are testing - Kafka - and modelling the difference, with and without Kafka. Let me explain.

 

So, here are three examples of how I've approached this in business.

 

I worked with a retail bank (which will remain nameless). This bank - like all retail banks - had issues with ATM disputes, which can include the ATM dispensing the wrong amount, not dispensing at all, deposits not registering, or registering the wrong amount. Agents at the bank spend a lot of time on these disputes. In addition, the bank often pays the complainant off, as it's cheaper than going through complex resolution or escalation to the ombudsman.

This specific bank was getting 7-9,000 disputes a month, with agents spending on average 3-5 hours to resolve each case. So, the cost to resolve was significant. We modelled this and estimated the costs at around £3M a year, or £9M over 3 years (in agent time and write-offs).
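As a sketch, the baseline cost model is simple arithmetic: dispute volume times agent time times an hourly cost, plus write-offs. The dispute volumes and handling times below are from the talk; the investigated share, hourly rate and write-off figures are purely illustrative assumptions, chosen only to land in the same ballpark as the roughly £3M-a-year estimate above.

```python
# Back-of-envelope ATM-dispute cost model.
# Dispute volume and handling time are from the talk; the other
# rates are illustrative assumptions, not the bank's real figures.

def annual_dispute_cost(disputes_per_month, investigated_share,
                        hours_per_dispute, agent_hourly_cost,
                        writeoff_rate, avg_writeoff):
    disputes_per_year = disputes_per_month * 12
    investigated = disputes_per_year * investigated_share
    agent_cost = investigated * hours_per_dispute * agent_hourly_cost
    writeoff_cost = disputes_per_year * writeoff_rate * avg_writeoff
    return agent_cost + writeoff_cost

baseline = annual_dispute_cost(
    disputes_per_month=8_000,   # midpoint of 7-9,000 (from the talk)
    investigated_share=0.25,    # assumed share needing full investigation
    hours_per_dispute=4.0,      # midpoint of 3-5 hours (from the talk)
    agent_hourly_cost=20.0,     # assumed fully-loaded agent cost (£)
    writeoff_rate=0.25,         # assumed share of disputes paid off
    avg_writeoff=50.0,          # assumed average payout (£)
)
# With these placeholder rates, baseline comes to about £3.1M a year.
```

The point is not the exact inputs - which would come from the bank's own data - but that the baseline is an explicit, checkable model rather than a gut-feel number.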

 

In this case, we worked with the bank to prove out an event streaming platform (the Confluent Platform) to help address these costs. This is what the applications and data infrastructure look like with a single solution for managing ATM disputes. We call this a universal event pipeline --- high throughput, persistent, ordered, and low latency --- where all your events and systems are connected, including bank account info, logs from ATMs, analysis and other info feeds.

 

 


 

This ROI is based on the following assumptions:

1. ATM dispute cost savings: a 50% reduction in agent time.

2. Avoidable payments: a 75% reduction in avoidable payments (where the bank believed the claims to be fraudulent).

 

So, whereas before we modelled a cost of £3M a year, we reduced that significantly - to £4.2M over 4 years. We are saving a significant amount of agent time (50%) and more than 75% of the payments - and even including the costs of the project, we still get a reduced target cost over the period. When compared…

 

We can see the initial cost associated with implementing the project (£1.5M) - but the net savings were significant: £1.69M in year 1, £4.8M in year 3 and £8M by year 5.
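The shape of those numbers - an upfront cost, then net savings that grow each year - can be sketched as a simple cumulative model. The £1.5M upfront cost is from the talk; the annual gross saving and run cost below are illustrative assumptions, so the outputs will not exactly match the slide figures.

```python
# Cumulative net savings: upfront cost paid in year 1, then a
# constant annual gross saving net of run costs. All figures other
# than the £1.5M upfront cost are illustrative assumptions.

def cumulative_net_savings(annual_gross_saving, upfront_cost,
                           annual_run_cost, years):
    return [year * (annual_gross_saving - annual_run_cost) - upfront_cost
            for year in range(1, years + 1)]

net = cumulative_net_savings(
    annual_gross_saving=2_000_000,  # assumed gross saving (£ per year)
    upfront_cost=1_500_000,         # project cost from the talk (£)
    annual_run_cost=300_000,        # assumed run cost (£ per year)
    years=5,
)
# net[0] is the year-1 position, net[-1] the year-5 position.
```

With these placeholder inputs the project is barely positive in year 1 and compounds thereafter - the same pattern the slide shows.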

The 5-year data is shown on the next slide…

 

 

Here’s another example… Customer 360.  

Let's take a retail organisation - selling online and in-store - seeking to improve CX with better Customer 360.

Let's assume we are able to offer real-time personalised and contextualised offers - both in-store at the point of sale and online - enabled by a Kafka streaming platform.


There’s lots of data which suggests improved CX drives increased revenue.  

 

By implementing a Kafka event streaming platform, this retailer was able to better manage inventory - in real time - and understand the customer better - through online behaviour and past purchases - and then make personalised, contextualised, targeted offers, providing an all-round better customer experience.

 

We can model what a small percentage uplift to revenue might look like, together with increased project and operations costs. In this instance, we showed that $2M of upfront costs plus annual ops costs - totaling $6.3M over 5 years - provided a net gain of $13.5M.

 

This provides a 2.14x ROI.  
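The ROI figure here is simply the net gain divided by the total cost - a one-line sketch, using the talk's figures:

```python
# ROI as net gain over total cost (figures in $M, from the talk).

def roi(net_gain, total_cost):
    return net_gain / total_cost

customer_360_roi = roi(13.5, 6.3)   # $13.5M net gain on $6.3M of costs
# customer_360_roi ≈ 2.14
```

The same helper works for any of the examples in this talk - only the net gain and cost inputs change.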

 

 

Finally, let's look at a risk mitigation use case: implementing fraud detection within a payments provider.

Here’s a quick joke - but online Fraud is a very serious issue.  

We had a client from one of our banking customers talk about millions of dollars lost each month through online credit card fraud. This is a game of cat and mouse, with the criminals constantly trying to outsmart the FS institutions.

 

In this example we can see losses of about $1m a month, or $36m over 3 years.

 

Here the contextual application is a Fraud Detection app - which pulls in and scores all credit card transactions…

 

And we can model the costs of implementing a fraud prevention system based on Kafka (real-time insights) - and we can see reduced losses through fraud.

In this example, we see initial project costs of £3M plus ops costs - a total 5-year TCO of £6M. But we see significantly reduced fraud, so even after adding the project costs to the remaining fraud losses, we see a net reduction in fraud losses of £21M over 5 years - which gives us a 3.5x ROI on the spend.
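The arithmetic behind the 3.5x figure: the £6M five-year TCO is the £3M project cost plus ops costs (assumed here as £0.6M a year - the talk doesn't give the split), and the £21M net reduction divided by that TCO gives the ROI.

```python
# Fraud-example arithmetic (£M). Project cost, total TCO, net
# reduction and ROI are from the talk; the annual ops split is assumed.

project_cost = 3.0                       # £3M initial project cost
annual_ops = 0.6                         # assumed ops cost per year
years = 5
tco = project_cost + annual_ops * years  # £6M five-year TCO

net_fraud_reduction = 21.0               # £21M net reduction over 5 years
fraud_roi = net_fraud_reduction / tco    # 3.5x ROI
```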

 

 

 

 

Additional benefits include 1) customer retention and 2) lower insurance premiums. So, as with most ROIs, there are other benefits not included in this simple calculation.

 

 

So, the point here is: whilst the problems with identifying value are significant, we can stand up to them.

 

I have definitely seen a trend in the past 10 years to move away from hefty business cases.  With the advent of Digital and agile working, organizations typically start iteratively, playing around with tech. If it proves value in an intuitive ‘this is working’ kind of way, then that’s sufficient.

 

But Kafka isn't just a data infrastructure layer. To really understand its value requires a paradigm shift - to event-centric thinking… So it's worth digging in more and really trying to quantify its value…

If we are going to change the de-facto way of doing things (data in relational databases for request/response-type activity), then we need to get better at identifying the value....

 

 

An analogy is trying to value the Foundations of a house separately from the house.  It doesn't make sense.   

 

 

 

The house would be practically worthless without the foundations. They are intrinsically linked.  You can’t have one without the other.

 

...The true value of the Event Streaming Paradigm is the new generation of contextual event-driven applications that can be built.  Think of that as the house - the above the line value...

 

Below the line is all about IT savings - the traditional benefits of a data infrastructure layer. We can model how this simpler architecture saves costs and increases agility, etc. - but it's 'above the line' where the true value lies.

 

 

And when valuing, we should always link back to three things: making money, saving money, or protecting money.

This quote is from cio.com: "An exceptional IT team knows there are only three possible sources of business value for an emerging technology; it can help us grow our revenue, improve our profitability, and/or help mitigate some risk that is important to the company as a whole."

 

So, to conclude…

 

To help frame these value conversations and approaches to assessing value, I've created a 5-step process.

 

This is semi-scientific in that you take the variable we are trying to test - Kafka, and event-driven architectures - and aim to isolate it between a baseline (without) and a target (with) - a bit like a control group and an experimental group. We then assess the net benefits (including the costs to implement and run, etc.). We also look at soft benefits, and then at proof points to back up the value assessment.
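The baseline-vs-target comparison at the heart of the framework can be sketched in a few lines: model the same business process with and without Kafka, take the difference, and net off the implementation and run costs. All figures below are hypothetical placeholders, not from any of the examples above.

```python
# Framework sketch: baseline (without Kafka) vs target (with Kafka),
# netting off implementation and run costs. All figures hypothetical.

def net_benefit(baseline_annual_cost, target_annual_cost,
                upfront_cost, annual_run_cost, years):
    gross_saving = (baseline_annual_cost - target_annual_cost) * years
    total_cost = upfront_cost + annual_run_cost * years
    return gross_saving - total_cost

benefit = net_benefit(
    baseline_annual_cost=3_000_000,  # e.g. cost of the process today
    target_annual_cost=1_200_000,    # modelled cost with Kafka
    upfront_cost=1_500_000,          # implementation cost
    annual_run_cost=200_000,         # ongoing run cost
    years=3,
)
# With these placeholder inputs: £3.3M net benefit over 3 years.
```

Soft benefits and proof points then sit alongside this number, rather than being forced into it.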

 

 

This same process can be used to assess the value of the features and functions of Kafka and an event streaming platform. As an example, I've used this model to quantify the benefits of the Confluent Platform over open-source Kafka… You can build in the time to build features and functions, or look at the costs of self-managed vs. fully managed. It's really quite simple…

 

Overall Recap:

  • The definition of value is elusive → think cash. It's important to identify value to help a business change its ways of working - this is required for a paradigm shift.

  • There are problems in measuring value - it is subjective, it changes, it requires situational awareness → live with it.

  • When measuring Kafka, isolate the variable to be tested and compare in a semi-scientific way - think above the line: the whole business use case, rather than just Kafka as a universal event pipeline.

  • We have a framework to help quantify the value of Kafka.
