Explain Event Sourcing Like I'm Five: Intro

Event Sourcing is a method to persist data. It is not inherently more difficult than the traditional CRUD operations we've all become accustomed to, but it can sure seem that way.

Photo by Sharon McCutcheon from Pexels.
💡
This post is a primer for anyone looking to learn about Event Sourcing.

Posts in this series:

  • Introduction (you are here)
  • Modeling the Domain: Events (to be published)
  • Modeling the Domain: Aggregates (to be published)
  • Commands and Behavior (to be published)
  • Queries and Current State (to be published)
  • Projections and Interpretations (to be published)
  • Event Streams and Event Stores (to be published)

An Introduction's Intro

If you are reading this, you have likely heard about Event Sourcing, or were convinced to look into it. To cut to the chase, it's a great way to build distributed systems. Event Sourcing is a battle-tested software design pattern used by globally recognized companies, though that alone should by no means be the reason to use it. More importantly, it brings concepts and principles that other professions have been formalizing for hundreds of years into the world of software engineering.

While it can be incredibly powerful and yield tremendous value, there is a learning curve. One that, if the Internet is to be believed, is very difficult. That is to be expected when many of the free resources online leave something to be desired, and the quality ones do not receive the attention they deserve.

In this post we will cover why Event Sourcing can be considered challenging, and in this blog series we will take an approach I have found effective for newcomers and veteran developers alike. It will help you understand when and why Event Sourcing is the right tool for the job and how to effectively use it.

How Developers Historically Have Treated Data

Event Sourcing is a method to persist data. It is not inherently more difficult than the traditional Create Read Update Delete (CRUD) operations developers have become accustomed to, but it can sure feel that way. The reason? We have been trained with a mindset that primarily focuses on Current State.

An unpopulated Entity Relationship Diagram (ERD).

Whether you're experienced with relational or "NoSQL" databases, modern software development has largely emphasized a CRUD-like mental model when it comes to handling durable data. The rise of Object Relational Mappers (ORMs) has reinforced this way of solving problems. And to give credit where credit is due, many applications are just fine as a result! Simple problems should get simple solutions.

💡
ORMs and related software have come a long way. Calling these pieces of software simple is a disservice. The concept of CRUD however is relatively simple.

So, when you have been treating data effectively the same way your entire career, it is no wonder Event Sourcing can be challenging: it is different from what you are used to. Practice, iteration, and patience go a long way here, as with learning any other skill!

There's no need to feel uncomfortable learning a new skill. Photo by Craig Adderley from Pexels.

Current State

While the topic of Current State is something we will dive back into in the context of Event Sourcing, let's establish a definition specifically for the traditional way of handling data that was described above. Current State is an entity's representation derived from a record. Let's cover exactly what that means to be on the safe side.

A database table called Orders with five records.

When you query for an Order entity with the identity (ID) 9001, you are fetching its current state. Represented as JSON, it would look like this:

{
  "id": 9001,
  "itemId": "ABC-5687-007",
  "status": "COMPLETED"
}

Mission accomplished! Now let's imagine a scenario that you may be familiar with if you've worked on, or have been, Designing Data-Intensive Applications (which is a great book, by the way).

As soon as Order 9001 is fetched (resolved) but before it is displayed to the user who requested it, another process updates the entity. The status is changed from COMPLETED to CUSTOMER_RETURN_PENDING, meaning the customer is sending the item back to the company, likely to receive a refund or a working replacement.

Uh oh. How impactful is this change in the context of providing the user the correct information? Should the user, or system, who queried for this information be told of this? If so, how? And how soon? Will it change what they are doing? Do any other consumers need to be informed of this particular change, or for that matter any change to this entity? How could we have detected this state change when we're updating the same source (row) of data?

Pause. That's it. That's the problem.

This is a tradeoff many applications endure without knowing it. When you have a Current State-based solution, you lose all previous states.

While CRUD makes handling data simple, it does not lend itself well to domains that would benefit from a complete history of all state changes.
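To make the tradeoff concrete, here is a minimal sketch of the Order 9001 scenario. The in-memory `orders` dictionary and the `update_status` helper are hypothetical stand-ins for a real table and a real UPDATE statement; they are assumptions for illustration only.

```python
# Hypothetical in-memory stand-in for the Orders table, keyed by order ID.
orders = {
    9001: {"id": 9001, "itemId": "ABC-5687-007", "status": "COMPLETED"},
}


def update_status(order_id: int, new_status: str) -> None:
    """CRUD-style update: overwrite the row in place."""
    orders[order_id]["status"] = new_status


# A reader fetches the order and holds on to a copy...
fetched = dict(orders[9001])

# ...then another process updates the same row.
update_status(9001, "CUSTOMER_RETURN_PENDING")

print(fetched["status"])       # COMPLETED (the stale copy held by the reader)
print(orders[9001]["status"])  # CUSTOMER_RETURN_PENDING
# Nothing left in `orders` records that the status was ever COMPLETED.
```

The reader is left holding a stale copy, and the store itself has no memory of the previous state.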

The face I make when I see my mutable data mutate right in front of me.

Shopping Carts - The Go-To Example for Event Sourcing

Need a more visual representation? Here is a version of what's effectively become, for better or worse, the gold standard example when explaining Event Sourcing. There is a shopping cart on an ecommerce website. Here you can see the entity was created, which leads to version 1 of the cart. Products are added and removed over time until the end of that cart's lifecycle is reached, represented by the Cart Confirmed event.

A Shopping Cart's entire event stream lifecycle.
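For readers who prefer code to diagrams, the lifecycle above could be sketched roughly like this. Only Cart Confirmed is named in this post; the other event names, the `Event` and `Cart` classes, and the product names are assumptions made for the sake of the example, not a prescribed model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    kind: str                      # hypothetical names, e.g. "ProductAdded"
    product: Optional[str] = None  # only set for product-related events


@dataclass
class Cart:
    products: List[str]
    confirmed: bool = False
    version: int = 0


def apply(cart: Cart, event: Event) -> Cart:
    """Return the next cart state; the stored events are never modified."""
    products = list(cart.products)
    confirmed = cart.confirmed
    if event.kind == "ProductAdded":
        products.append(event.product)
    elif event.kind == "ProductRemoved":
        products.remove(event.product)
    elif event.kind == "CartConfirmed":
        confirmed = True
    # Every event, including CartCreated, moves the cart to its next version.
    return Cart(products, confirmed, cart.version + 1)


stream = [
    Event("CartCreated"),
    Event("ProductAdded", "socks"),
    Event("ProductAdded", "hat"),
    Event("ProductRemoved", "socks"),
    Event("CartConfirmed"),
]

cart = Cart(products=[])
for event in stream:
    cart = apply(cart, event)

print(cart)  # Cart(products=['hat'], confirmed=True, version=5)
```

The current state is never stored directly here. It is derived by applying each event in order, which is the core idea the rest of this series builds on.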

Time Travel

If you have ever experienced the joys (challenges) of regression testing, you will be happy to hear that Event Sourcing makes this much easier. As a matter of fact, once you have a good grasp of modeling in an Event Sourced ecosystem, testing in general can be quite a bit easier in domains with high complexity.

Because every state change is persisted forever, it's like you're traveling through time when testing and debugging. Simply put, every change can be examined in isolation.

Debugging the behavior of the shopping cart is simple when you can time travel and replay events. Using the example below, imagine you are investigating an apparent bug that users have been reporting surrounding removing products from the shopping cart. With Event Sourcing, you can easily have the Current State of a shopping cart be whatever version you need.

Version 4 of this shopping cart has one product in total.
Need to rewind? No problem. With version 3 we can see two products in the shopping cart.
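Rewinding can be sketched as replaying only the first N events of the stream. This is a simplified illustration: the event names, the product names, and the `cart_at` helper are assumptions, chosen so that versions 3 and 4 match the carts described above.

```python
from functools import reduce

# Hypothetical stream: (event type, product) pairs in the order they occurred.
EVENTS = [
    ("CartCreated", None),        # version 1
    ("ProductAdded", "socks"),    # version 2
    ("ProductAdded", "hat"),      # version 3
    ("ProductRemoved", "socks"),  # version 4
]


def apply(products, event):
    """Return the product list after one event, leaving the input untouched."""
    kind, product = event
    if kind == "ProductAdded":
        return products + [product]
    if kind == "ProductRemoved":
        return [p for p in products if p != product]
    return products


def cart_at(version):
    """Rebuild the cart by replaying only the first `version` events."""
    return reduce(apply, EVENTS[:version], [])


print(cart_at(4))  # ['hat']
print(cart_at(3))  # ['socks', 'hat']
```

When investigating the product removal bug, you could compare `cart_at(3)` and `cart_at(4)` to see exactly what the removal event changed.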

As a matter of fact, if a stakeholder requests a new report to be generated daily from X date forward, you can surprise them and say you can have that report be generated for prior points in time as well. How? Because you have been storing state changes with Event Sourcing. How cool is that? 😎
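As a rough sketch of that idea, a report can be a function over stored events, so nothing stops you from running it over older dates. The `HISTORY` data, the dates, and the `confirmed_carts_per_day` report are all made up for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical stored events: (day the event occurred, event type).
HISTORY = [
    (date(2023, 1, 9), "CartConfirmed"),
    (date(2023, 1, 9), "ProductAdded"),
    (date(2023, 1, 10), "CartConfirmed"),
    (date(2023, 1, 10), "CartConfirmed"),
    (date(2023, 1, 12), "CartConfirmed"),
]


def confirmed_carts_per_day(events, since):
    """Count CartConfirmed events per day, starting at `since`."""
    return Counter(
        day for day, kind in events
        if kind == "CartConfirmed" and day >= since
    )


# The report the stakeholder asked for, from "X date" forward...
from_x = confirmed_carts_per_day(HISTORY, since=date(2023, 1, 12))
# ...and the same report for dates before anyone asked for it.
backfilled = confirmed_carts_per_day(HISTORY, since=date(2023, 1, 1))

print(from_x[date(2023, 1, 12)])      # 1
print(backfilled[date(2023, 1, 10)])  # 2
```

With a Current State-only table, the earlier days would be unrecoverable unless someone had thought to record them ahead of time.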

Event Sourcing can help your applications travel through time! Photo by Tima Miroshnichenko from Pexels.

When to Use Event Sourcing

In general, Event Sourcing tends to suit domains with the following:

  1. High complexity
  2. Frequent or expected change
  3. Intense data usage and data contention

Event Sourcing can simplify many of the challenges CRUD faces in these domains and open up a wealth of business value.

⚠️
Meeting the above criteria does not necessarily mean Event Sourcing is a perfect fit for your domain. Further blog entries will help provide context so you understand the tradeoffs. You very likely do not want to use Event Sourcing everywhere in your system, but rather in particular areas. 

Let's wrap this introduction up.

Reasoning & Goals

The title Explain Like I'm Five (ELI5) is a phrase that means to break a topic down so a five-year-old can understand it. While that isn't verbatim the goal here, we'll be covering topics in a way I've seen succeed. Because there's a lot of rewinding to do and many ingrained concepts to break, it's to your benefit that we take this one step at a time.

This isn't meant to be a 5-minute Hello World demonstration, nor is it going to discuss the theory or specific edge cases Fortune 50 companies encounter. In fact, many examples out on the Internet tend to use overly naive models and use cases, which I've seen first-hand lead to bad practices. That in turn leads to developers being frustrated with this pattern when they were, frankly, taught poorly. That is why modeling will be the first entry in the series! It is also why I'll be linking the resources I do believe are high quality, because they deserve more attention than they are receiving.

Prerequisites

While Event Sourcing is a data persistence pattern itself, embracing concepts and principles from Domain-Driven Design (DDD) and Command and Query Responsibility Segregation (CQRS) has been beneficial in every quality, robust, production-grade Event Sourcing application I've seen. You are not expected to have mastered them before moving on to further entries in this series, but I cannot promote them enough. I will be simplifying many of those concepts and terms so we can focus on our primary subject: Event Sourcing.

Thank you for reading. Until next time!

Erik