
Ad Hoc Data Changes in an Event Driven Architecture

Ad hoc data changes are often a necessary evil within modern enterprise applications. Right-thinking development professionals hate them. They hate them because they’re dangerous, often implemented by people with only a partial understanding of the system, and addictive. A business can become addicted to resolving data issues with “quick” scripts. More often than not, hastily constructed ad hoc data-fix scripts introduce completely new side effects.

In an Event Driven Architecture, data changes are distributed to interested parties via durable messaging; we like the service bus pattern for this. For any data entity there remains a single source of record, but there may be many consumers of that data, some of which will replicate the data locally in order to reduce coupling between components.
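To illustrate the shape of that mechanism, here is a minimal sketch using a toy in-memory bus. In practice the publishing and subscribing would go through durable messaging (a service bus product), so the class, method and topic names below are assumptions for illustration rather than any particular API.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryBus:
    """Toy stand-in for a durable service bus: topic -> list of handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # A real bus would persist the message before delivery; here we
        # simply fan it out to every subscriber of the topic.
        for handler in self._handlers[topic]:
            handler(message)


if __name__ == "__main__":
    bus = InMemoryBus()
    bus.subscribe("customer.updated", lambda msg: print("received", msg))

    # The customer subsystem (the source of record) publishes the change it made.
    bus.publish("customer.updated", {"customer_id": 42, "name": "Acme Ltd",
                                     "shipping_address": "1 High St"})
```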

I know many of you consider data replication bad; this is mainly because developers are still trained to be frugal with data storage. However, with the advent of cheaper, resilient data storage and a greater emphasis on decoupling, the pattern often becomes very attractive. Right now, the most common use of data replication is in reporting, with data warehouses and data marts. With an Event Driven Architecture we find many opportunities for using it within our applications as well. For example, an order processing subsystem can “listen” to customer update messages in order to maintain a local copy of customer names and shipping addresses. Decoupling like this allows the customer system to shut down at any time without affecting the processing of orders. The customer maintenance subsystem, however, remains the source of record for all customer-related data.
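A rough sketch of that downstream handler follows, assuming a small local read model owned by the order processing subsystem. The table name local_customer, the message fields and the topic are all hypothetical; such a handler could be registered with a bus like the one sketched above.

```python
import sqlite3

# Local read model owned by the order-processing subsystem: just the
# customer fields it needs to process orders, nothing more.
local_db = sqlite3.connect(":memory:")
local_db.execute(
    """CREATE TABLE local_customer (
           customer_id      INTEGER PRIMARY KEY,
           name             TEXT NOT NULL,
           shipping_address TEXT NOT NULL
       )"""
)


def on_customer_updated(message: dict) -> None:
    """Handler for 'customer.updated' messages: upsert the local copy."""
    local_db.execute(
        """INSERT INTO local_customer (customer_id, name, shipping_address)
           VALUES (:customer_id, :name, :shipping_address)
           ON CONFLICT(customer_id) DO UPDATE SET
               name = excluded.name,
               shipping_address = excluded.shipping_address""",
        message,
    )
    local_db.commit()


if __name__ == "__main__":
    on_customer_updated({"customer_id": 42, "name": "Acme Ltd",
                         "shipping_address": "1 High St"})
    print(local_db.execute("SELECT * FROM local_customer").fetchall())
```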

Ad hoc data changes are not very compatible with a distributed data model like this. Using a SQL script to change data in the source of record can introduce synchronization issues for the subsystems “downstream” that listen for data changes made through the application. Their locally stored data can become out of sync with the source of record.

Traditionally, when a replicated data store becomes out of sync with the source of record, the replicated store is deleted and rebuilt from scratch. This is common with many ETL patterns. However, with very large or slowly changing data this can be very inefficient and time consuming. A more efficient pattern is for an application to explicitly identify records that have been manually modified by SQL script, so that when synchronization becomes necessary it is easy to identify what data needs to be redistributed downstream. A very simple implementation might be to add a true/false “manual update” column to every source of record table in a data store. Adding metadata columns to source of record tables is already a common pattern used in auditing and change tracking; the most common example is a “last modified” column for tracking when a record was last updated.

This works for most data changes, except for deletes. Some applications never completely delete records, merely marking them as deleted or migrating them to archive storage, but tracking manually deleted entities for the purposes of downstream synchronization is challenging. Whatever solution is adopted to track ad hoc data changes, firm discipline is required from implementers to ensure manual modifications conform to the pattern. This implies very strong data governance is required for the mechanism to work; on the other hand, that’s largely true for any metadata columns in a table.
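As a minimal sketch of the flag-column idea, assuming hypothetical column names manually_updated and last_modified, an ad hoc fix script would be expected to set the metadata columns on every row it touches:

```python
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE customer (
           customer_id      INTEGER PRIMARY KEY,
           shipping_address TEXT NOT NULL,
           -- metadata columns used for auditing and synchronization
           last_modified    TEXT NOT NULL,
           manually_updated INTEGER NOT NULL DEFAULT 0   -- 0 = app, 1 = ad hoc script
       )"""
)
db.execute("INSERT INTO customer VALUES (42, '1 High St', ?, 0)",
           (datetime.now(timezone.utc).isoformat(),))

# The discipline: any ad hoc fix must also set the metadata columns,
# so the record can later be identified and redistributed downstream.
db.execute(
    """UPDATE customer
       SET shipping_address = '2 Low Rd',
           last_modified    = ?,
           manually_updated = 1
       WHERE customer_id = 42""",
    (datetime.now(timezone.utc).isoformat(),),
)
db.commit()

# Later, synchronization only has to look at the flagged rows.
print(db.execute("SELECT * FROM customer WHERE manually_updated = 1").fetchall())
```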

An alternative, and possibly safer, approach is to implement a change-logging mechanism. Every action taken against the data store is logged as a synchronization record. When synchronization is required, the synchronization records are replayed to produce the synchronization changes.
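A sketch of what such a change log might look like is below. The sync_log table and the ad_hoc_update helper are my own assumed names, and a real implementation would need to cover inserts and deletes as well as updates.

```python
import json
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, shipping_address TEXT)")
db.execute("INSERT INTO customer VALUES (42, '1 High St')")

# Synchronization log: one row per ad hoc change, capturing enough detail
# to replay the change as a message later on.
db.execute(
    """CREATE TABLE sync_log (
           logged_at  TEXT NOT NULL,
           entity     TEXT NOT NULL,
           entity_id  INTEGER NOT NULL,
           change     TEXT NOT NULL   -- JSON description of what changed
       )"""
)


def ad_hoc_update(entity: str, entity_id: int, changes: dict) -> None:
    """Apply an ad hoc change AND record it as a synchronization record.
    Table/column names are interpolated because this is a trusted internal
    script, not user-facing code."""
    assignments = ", ".join(f"{col} = ?" for col in changes)
    db.execute(f"UPDATE {entity} SET {assignments} WHERE {entity}_id = ?",
               (*changes.values(), entity_id))
    db.execute("INSERT INTO sync_log VALUES (?, ?, ?, ?)",
               (datetime.now(timezone.utc).isoformat(), entity, entity_id,
                json.dumps(changes)))
    db.commit()


def replay(publish) -> None:
    """Play the log back, handing each change to a publish callable."""
    for logged_at, entity, entity_id, change in db.execute(
            "SELECT * FROM sync_log ORDER BY logged_at"):
        publish({"entity": entity, "id": entity_id, "change": json.loads(change)})


if __name__ == "__main__":
    ad_hoc_update("customer", 42, {"shipping_address": "2 Low Rd"})
    replay(print)
```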

Once it’s possible to identify manually modified data, it’s relatively simple to redistribute it. My own preference is for the sync-request pattern. In this pattern, downstream data consumers periodically request from the source of record a digest of all manually changed data entities beginning from some point in time (typically the time of the last such request). The source of record queries and composes the data for all the manually changed entities since the requested time and then “publishes” them back via messages. This pattern can be implemented synchronously (request/response) or asynchronously; the synchronous mechanism is probably easier to implement, but the asynchronous one has more potential.
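Here is a rough sketch of the source-of-record side of a sync-request, reusing the assumed manually_updated and last_modified columns from earlier; the topic name and handler signature are hypothetical.

```python
import sqlite3

# Source-of-record store, reusing the assumed metadata columns from above.
db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE customer (
           customer_id      INTEGER PRIMARY KEY,
           shipping_address TEXT,
           last_modified    TEXT,
           manually_updated INTEGER DEFAULT 0
       )"""
)
db.execute("INSERT INTO customer VALUES (42, '2 Low Rd', '2024-01-05T10:00:00+00:00', 1)")
db.execute("INSERT INTO customer VALUES (43, '9 Side St', '2023-12-01T09:00:00+00:00', 0)")


def handle_sync_request(since: str, publish) -> None:
    """Compose the manually changed entities since `since` and publish them
    back to the requester (or to all interested subscribers)."""
    rows = db.execute(
        """SELECT customer_id, shipping_address, last_modified
           FROM customer
           WHERE manually_updated = 1 AND last_modified >= ?""",
        (since,),
    )
    for customer_id, shipping_address, last_modified in rows:
        publish("customer.updated", {"customer_id": customer_id,
                                     "shipping_address": shipping_address,
                                     "last_modified": last_modified})


if __name__ == "__main__":
    # A downstream consumer asks for everything manually changed since its
    # last request; here the "bus" is just print.
    handle_sync_request("2024-01-01T00:00:00+00:00",
                        lambda topic, msg: print(topic, msg))
```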

An alternative to the sync-request pattern is a sync-publish pattern. Periodically, the source of record queries for entities manually modified after the last such query, publishing the resulting entity data via messages which can be consumed by any interested subsystems downstream. This has the benefit of being less complex, but it relies on the source of record to publish changes on its own schedule, whereas with the prior pattern the source of record can be induced to synchronize its data on demand. For flexibility, it might be wise to construct a hybrid implementation supporting both patterns.
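The sync-publish variant might look something like the sketch below, again under the same assumed schema; the watermark handling is deliberately simplistic and would need to be persisted in a real system.

```python
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE customer (
                  customer_id      INTEGER PRIMARY KEY,
                  shipping_address TEXT,
                  last_modified    TEXT,
                  manually_updated INTEGER DEFAULT 0)""")
db.execute("INSERT INTO customer VALUES (42, '2 Low Rd', ?, 1)",
           (datetime.now(timezone.utc).isoformat(),))

# Watermark: publish only what changed after the previous run.
last_published = "1970-01-01T00:00:00+00:00"


def publish_manual_changes(publish) -> None:
    """Periodic job at the source of record: query manually modified rows
    since the watermark and publish them downstream."""
    global last_published
    rows = db.execute(
        """SELECT customer_id, shipping_address, last_modified
           FROM customer
           WHERE manually_updated = 1 AND last_modified > ?
           ORDER BY last_modified""",
        (last_published,),
    ).fetchall()
    for customer_id, shipping_address, last_modified in rows:
        publish("customer.updated", {"customer_id": customer_id,
                                     "shipping_address": shipping_address})
        last_published = last_modified


if __name__ == "__main__":
    # In production this would run on a schedule; one pass is shown here.
    publish_manual_changes(lambda topic, msg: print(topic, msg))
```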

The more I consider these problems, the more I believe a message-based synchronization correction mechanism becomes a necessity. There are additional uses for this pattern beyond resolving ad hoc data changes.

