Event-Driven Systems

An event is something that happened which our application needs to react to. Changing a customer's address, making a purchase, or calculating a customer's bill are all events. These events might come from the external world or be triggered internally, such as by a scheduled job that runs periodically.

The essence here is to capture these events and then process them to cause changes in the application, in addition to storing them as an audit log.

The figure above shows three components: the Customer, Order, and Payment services. Each of these services receives requests and converts the inputs into a stream of events, which are then stored in a persistent log. Each event is encapsulated in an event model, an event object.

The event processor reads the events from the log and processes them, causing our application to do whatever it's supposed to do.
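
As a minimal sketch in Python (the `EventLog` and `EventProcessor` names, the file-backed log, and the event fields are all assumptions for illustration, not from the original design), the flow might look like this:

```python
import json
from datetime import datetime, timezone


class EventLog:
    """Append-only log; a plain file stands in for the persistent store."""

    def __init__(self, path):
        self.path = path

    def append(self, event):
        event["captured_at"] = datetime.now(timezone.utc).isoformat()
        with open(self.path, "a") as f:
            f.write(json.dumps(event) + "\n")

    def read_all(self):
        with open(self.path) as f:
            return [json.loads(line) for line in f]


class EventProcessor:
    """Reads events from the log and applies them to the application state."""

    def __init__(self, log):
        self.log = log
        self.state = {}  # e.g. customer id -> customer data

    def run(self):
        for event in self.log.read_all():
            self.apply(event)

    def apply(self, event):
        if event["type"] == "CustomerAddressChanged":
            customer = self.state.setdefault(event["customer_id"], {})
            customer["address"] = event["address"]
        # ... handlers for other event types go here


log = EventLog("events.jsonl")
log.append({"type": "CustomerAddressChanged",
            "customer_id": "c-1", "address": "10 Main St"})
EventProcessor(log).run()
```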

While most of the interactions are asynchronous, that is not always the case. A request sent directly to the event processor, demanding a customer address change and waiting for a response, can be a synchronous task.

Events flow through the application carrying some data along. Since each event represents something that has happened in the past, it makes sense for it to be immutable. That is extremely important, particularly when using the event log to keep track of changes to our application.

However, we may also attach some mutable data that has no side effects, such as the result of the processing.

For instance, immutable data would include the items ordered by a customer and how much they cost at that time, while mutable data could include the confirmation statement generated as a result.

It is also worth recording the time the event was captured by our application and the time the event was processed. The two might, and often will, differ.

Lastly, storing the previous state of the application in the event object might be handy when reversing that event. We'll get into reversing events later.
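
Putting these pieces together, an event object might look like the following sketch; the class and field names are assumptions for illustration:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class OrderPlaced:
    """The immutable facts: what happened, with which data, and when."""
    order_id: str
    items: tuple                            # items and their prices at that time
    captured_at: datetime                   # when our application captured it
    previous_state: Optional[dict] = None   # handy when reversing the event


@dataclass
class ProcessingResult:
    """Mutable, side-effect-free data attached after processing."""
    event: OrderPlaced
    processed_at: Optional[datetime] = None  # usually differs from captured_at
    confirmation: Optional[str] = None       # the generated confirmation statement
```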

The log of events provides a full record of what happened and can also be used for debugging.

We can easily know what happened at any given point in time.

Now, because these components, these services, work together, that brings us to a question: how can they interact by sending and listening to events?

Most interactions are driven by requests. One component needs information from another, so it queries it, asking for that information. Or, perhaps, a component issues a command to another, demanding a change or an action to be taken.

In the event-driven world, that works differently. Instead of components talking to each other directly, they emit events announcing that something has happened. Other components can then listen to these events if needed and react accordingly.

Going back to our example.

There are two different kinds of interactions in this example: the query sent from the order service to the customer service, and the command from the order service to the payment service.

In this pattern, services are coupled, as each needs to know how to talk to the others. If one service is down, that might stop the whole application.

On the other hand, when working with events, we reverse the relationship: instead of having the order service talk to the payment service, the payment service listens to events emitted by the order service.

Similarly, the order service listens to events emitted by the customer service when customer data has changed, and keeps a note of it.

These events are saved in the event log.

This loose coupling allows the sender to broadcast the event without having to worry about who is interested and who should act. Any new component can hook up and listen to events, so the existing components don't need to know about that new component, nor its location.
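
A minimal in-process event bus illustrates this loose coupling; the `EventBus` class and the handlers below are made up for the sketch:

```python
from collections import defaultdict


class EventBus:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The sender broadcasts without knowing who (if anyone) is listening.
        for handler in self.subscribers[event_type]:
            handler(payload)


bus = EventBus()

# The payment service listens to events emitted by the order service.
bus.subscribe("OrderPlaced", lambda e: print("payment: charging for", e["order_id"]))
# A new component can hook up later without the order service changing at all.
bus.subscribe("OrderPlaced", lambda e: print("analytics: recording", e["order_id"]))

bus.publish("OrderPlaced", {"order_id": "o-42", "total": 99.0})
```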

Commands are imperative, expressed in the present tense, can be rejected, don't get recorded, and are triggered by an external request (a user).

Events are declarative, happened in the past, are immutable (can't be changed), and can be understood by all stakeholders (descriptive text).

If component A cares about having component B do something, that's a command. If, however, component A emits that a change happened and doesn't care about who's interested in that change, that's an event.
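
That distinction can be made concrete in code. Here is a hedged sketch, with made-up types, of a command that can be rejected versus an immutable event record:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ChangeCustomerAddress:
    """A command: imperative, present tense, and it can be rejected."""
    customer_id: str
    new_address: str


@dataclass(frozen=True)
class CustomerAddressChanged:
    """An event: declarative, past tense, immutable once recorded."""
    customer_id: str
    address: str
    occurred_at: datetime


def handle(command):
    if not command.new_address.strip():
        raise ValueError("rejected: address must not be empty")
    return CustomerAddressChanged(command.customer_id, command.new_address,
                                  datetime.now(timezone.utc))
```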

Queries are when a component explicitly asks another for a piece of data.

With events, one component doesn't ask another; instead, every component broadcasts events when something changes. Components that are interested in that data keep a note of what they need and ignore the rest.

Therefore, the order service doesn't need to query the customer service, because it kept a note of the last event and thus knows the customer data. Hence, we reduce the calls to the customer service.
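
Reusing the `EventBus` sketch from above, the order service might keep its note of customer data like this (names again illustrative):

```python
class OrderService:
    """Keeps a local note of the customer data it needs; ignores the rest."""

    def __init__(self, bus):
        self.customer_addresses = {}  # the local copy of customer data
        bus.subscribe("CustomerAddressChanged", self.on_address_changed)

    def on_address_changed(self, event):
        self.customer_addresses[event["customer_id"]] = event["address"]

    def place_order(self, customer_id, items):
        # No query to the customer service: use the locally noted address.
        address = self.customer_addresses.get(customer_id)
        return {"items": items, "ship_to": address}
```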

Consequently, data might be duplicated across different services. It is no longer the case that all customer information resides in one place; it is also held by other components.

It is up to every component to decide how to structure the data, what to store, and what to ignore.

If a centralized source of data is necessary, we'll need to (1) emit events to update the data and, when done, (2) broadcast the updates to the components to act on.

Others might just store the log of events attached to the domain object itself, and there is nothing wrong with that. For example, the order object would have a property called “history”, an array of all the changes.
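
A sketch of that style, with an assumed `Order` shape, might look like this:

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    """A domain object carrying its own event log in a 'history' property."""
    order_id: str
    status: str = "new"
    history: list = field(default_factory=list)  # all changes, in order

    def apply(self, event):
        self.history.append(event)
        if event["type"] == "OrderShipped":
            self.status = "shipped"


order = Order("o-42")
order.apply({"type": "OrderShipped"})
print(order.status, len(order.history))  # -> shipped 1
```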

As a result, every component can work on its own even when other components are down, whereas when issuing a query or command, we can't proceed without a connection to the recipient component.

If the customer service goes down, the order service doesn't care, because it has its own copy. An event queue will deliver the messages once the customer service is up again.

However, while we increased availability, consistency becomes a problem. The order service might not have the latest information from the customer service (outdated data) when the customer service is down, and so it may take actions based on out-of-date data. With requests, on the other hand, it would simply not work at all, which might be preferable in some scenarios.

The other downside is that, even though we have decoupling, it gets hard to explain what exactly happens when an event is emitted. Who is listening? What happens next? We have to look closely at the logs, and it gets harder when the application contains many services. With requests, it's easy to see which components a call from one component goes to.

Event collaboration opens the gate to event sourcing.

Event sourcing is not only about having a sequence of events that one can query and look at; it also allows us to …

A common example of an application that uses event sourcing is a version control system.

One obvious problem is that the event log grows and becomes slow to query. A common solution is to take snapshots by storing the current application state, say, every night.

We can then cache the last snapshot in memory and use it throughout the day to build the current application state. On failure, we can quickly rebuild the application from the event log in the persistent database.
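
A snapshot mechanism along those lines could be sketched as follows; the file-based storage and function names are assumptions:

```python
import json


def take_snapshot(state, last_event_index, path="snapshot.json"):
    """Run, say, every night: persist the state and its position in the log."""
    with open(path, "w") as f:
        json.dump({"state": state, "last_event_index": last_event_index}, f)


def rebuild(events, apply, path="snapshot.json"):
    """Start from the last snapshot and replay only the events after it."""
    try:
        with open(path) as f:
            snapshot = json.load(f)
        state, start = snapshot["state"], snapshot["last_event_index"] + 1
    except FileNotFoundError:
        state, start = {}, 0  # no snapshot yet: replay the full log
    for event in events[start:]:
        apply(state, event)
    return state
```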

Alternatively, we can have two systems running at the same time. If one fails, the second takes over.

Things get complicated when working with external systems. The payment service, for example, may rely on external systems to query the exchange rate and charge the customer's credit card.

Since these external systems don't know the difference between a real request and a replay, a common solution is to introduce a gateway. It acts as a middle layer between our application and the external systems.

During rebuilds and replays, the gateway must not forward requests to the external systems (disabled mode). It should differentiate between an actual request and a replay, and it should do so without the application having to worry about it.

Queries that are likely to return different values depending on the time, such as the exchange rate, can be stored in the gateway. This means some or all of the responses from external systems need to be stored in the gateway and returned upon replay.
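
A gateway with such a disabled/replay mode might be sketched like this; the `ExchangeRateGateway` name and its interface are assumptions for illustration:

```python
class ExchangeRateGateway:
    """Middle layer between the application and an external rate service."""

    def __init__(self, fetch_live_rate):
        self.fetch_live_rate = fetch_live_rate  # the real external call
        self.recorded = {}        # responses stored for later replays
        self.replay_mode = False  # "disabled mode": never call out on replay

    def get_rate(self, currency, at_time):
        key = (currency, at_time)
        if self.replay_mode:
            return self.recorded[key]   # return the stored response
        rate = self.fetch_live_rate(currency)
        self.recorded[key] = rate       # record it for future replays
        return rate
```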

Code changes can be lumped under new features, bug fixes, or logic changes.

New features usually add capabilities to the system but don't invalidate old events. They are executed as new events come in. To apply new features to old events, we can just reprocess all the old events.

Bug fixes are just a matter of fixing the bug and reprocessing all the buggy events. However, if an external system is involved, that might be tricky to solve.

For example, suppose the payment service relies on an external service to charge the customer's credit card, and a bug in the code caused customers to be charged more than the normal fees. One solution is to emit a new event to refund the extra money withdrawn.

Alternatively, we can reverse the buggy event and insert the correct one. We would revert all events back to the buggy event (the branch point), reverse it, mark it as deleted (but not actually delete it), and insert the corrected one.
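
A rough sketch of that correction procedure, under the assumption that each event can be reversed (for instance via its stored previous state), might be:

```python
def correct_log(log, buggy_index, corrected_event, reverse):
    """Revert to the branch point, mark the buggy event deleted, insert the fix."""
    # 1. Reverse all events from the newest back to the buggy one.
    for event in reversed(log[buggy_index:]):
        reverse(event)  # e.g. restore the stored previous state
    # 2. Mark the buggy event as deleted, but keep it in the log.
    log[buggy_index]["deleted"] = True
    # 3. Insert the corrected event right after the branch point.
    log.insert(buggy_index + 1, corrected_event)
    # 4. The caller then replays every non-deleted event from the branch point.
    return [e for e in log[buggy_index:] if not e.get("deleted")]
```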

Some of these fixes have to be done manually.

Changing the logic or business rules requires processing old events with the old rules and new events with the new rules. One way is to use a conditional statement: if before a certain date, do A; otherwise, do B. That solution can get very messy. Instead, one can encapsulate the rules inside objects and map them to dates.

Say, when placing an order, we charge the customer an extra 1.5% in taxes. Starting from February 2018, we charge customers 1.8%. A sketch of this approach follows.
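
Here is one possible sketch of the date-mapped rules; the rates and dates come from the example above, everything else is assumed:

```python
from datetime import date

# Rules encapsulated as objects (plain functions here), mapped to dates
# and ordered from newest to oldest.
TAX_RULES = [
    (date(2018, 2, 1), lambda amount: amount * 0.018),  # 1.8% from Feb 2018
    (date.min,         lambda amount: amount * 0.015),  # 1.5% before that
]


def tax_for(order_date, amount):
    """Pick the rule that was in force on the date the event occurred."""
    for effective_from, rule in TAX_RULES:
        if order_date >= effective_from:
            return rule(amount)


print(tax_for(date(2017, 12, 1), 100.0))  # old rule: 1.5%
print(tax_for(date(2018, 3, 1), 100.0))   # new rule: 1.8%
```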

What happens when the data schema changes? One option is not to mutate the existing schema fields (name, age, gender, etc.) but to extend the schema by adding new optional fields. This way, old events won't break the code.

Another solution is to migrate the old events to new ones. We could certainly use if-else conditional statements, but again, that can get really messy.
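
One common alternative, often called upcasting, registers one migration per schema version instead of scattering conditionals through the code; a sketch, with an assumed v1-to-v2 field rename, might be:

```python
def v1_to_v2(event):
    """Migrate a v1 event: 'name' was renamed to 'full_name' in v2."""
    event = dict(event)  # never mutate the stored event
    event["full_name"] = event.pop("name")
    event["version"] = 2
    return event


# One migration registered per schema version.
UPCASTERS = {1: v1_to_v2}


def upcast(event):
    """Walk an old event forward, one version at a time."""
    while event.get("version", 1) in UPCASTERS:
        event = UPCASTERS[event.get("version", 1)](event)
    return event


old_event = {"type": "CustomerCreated", "name": "Ada"}  # implicitly v1
print(upcast(old_event))  # now carries 'full_name' and 'version': 2
```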

Identification, as part of the schema, can also cause trouble when replaying events. If the events are associated with customer IDs, we must make sure that on every replay the same customer gets the same ID. If we replayed the events in a different order, or ignored some (for testing, say), customers might end up with different IDs, and events wouldn't be applied to the right customers.

A common solution is to use sticky IDs (immutable). Every time we create a customer, we assign an ID, and that ID is associated with the customer's (unique) email. Upon replay, every created customer gets the same ID, given the email.
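
One way to get sticky IDs is to derive the ID deterministically from the unique email, so a replay reproduces it without any stored mapping; the namespace below is an assumption for the sketch:

```python
import uuid

# Assumed namespace for this sketch; any fixed UUID works.
CUSTOMER_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "customers.example.com")


def sticky_id(email):
    """Derive the customer id deterministically from the (unique) email,
    so every replay assigns the same id to the same customer."""
    return str(uuid.uuid5(CUSTOMER_NAMESPACE, email))


assert sticky_id("ada@example.com") == sticky_id("ada@example.com")
```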

This problem is rather common and happens everywhere, but it is worth mentioning here as well.

Good consistency can be achieved in a few ways.

An example of this would be a cluster of servers with in-memory databases, kept up to date with each other through a stream of events. If updates are needed, they can be routed to a single master server which applies the updates to the persistent database and then broadcasts the resulting events.

Though, …

You don't always need event sourcing; you could also do this with a more regular logging tool and thereby avoid having to deal with these pitfalls.

Among the benefits above, we talked about being able to create a separate test environment and branches.

This gives us the ability to view the system along different paths. Maybe we want to replay all events but remove, add, or modify some. Or maybe we want to fix the system state without messing with the currently running one, so we fork, fix, and merge back.

It is the same idea as with branching in Git.

So, how can we implement it?

We can simply take a copy of the existing system and run another, separate, independent system in parallel. The benefit is that a change in one system doesn't affect the other.

The problem is that if we want to replace the existing system, we still need to integrate the forked one into the existing running one.

The other way is to have one running system that can point to two databases, with an easy way to switch between the production and testing databases. The drawback is that since both share the same code base, changing it will be reflected in both views: the currently running system and the forked one.

Event sourcing goes hand in hand with the Command Query Responsibility Segregation (CQRS) pattern.

It is about segregating the WRITE operations from the READ operations. If the read ratio is much higher than the write ratio, splitting them lets us scale each independently. Moreover, each can have a different schema, backed by different databases, services, etc.

How does event sourcing fit into the CQRS world?

The WRITE component handles events and stores them; these events are then used to create different views: different ways of representing the data (reports, analysis, etc.). These views can be queried later through the READ component.

An obvious pitfall is that these different views might be inconsistent for a while, until they are all updated with the latest changes. This is called eventual consistency: the data will become consistent at some point in the future. We've talked about that before.

One solution is to compare the last version of the view with the event log; if there's a gap, update the view, and return the result.
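
That check might be sketched as follows, assuming each view records the log position it has applied up to:

```python
def query_view(view, event_log, apply):
    """Before serving a read, catch the view up if it lags behind the log."""
    last_applied = view.get("version", 0)
    if last_applied < len(event_log):            # there's a gap
        for event in event_log[last_applied:]:   # apply only the missing events
            apply(view, event)
        view["version"] = len(event_log)
    return view
```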
