Data Lineage: The Water Analogy
December 6, 2016
This blog post entitled Data Lineage: The Water Analogy was written by Xenomorph CEO Brian Sentance.
Big thanks to all involved at the A-Team for the DMS NYC event a few weeks back. It was a great day and I enjoyed taking part in the afternoon panel on data lineage. Adam Bryan did a great job of being MC for the discussion, and a big thank you to fellow participants Jerry, Sue and Jesse. Adam started off the discussion with a general definition of data lineage and then asked me how I thought it was best to frame the topic. Well, not really having much to add to the formal definition, I dug out an old analogy comparing financial markets data to water:
- It’s vital, everyone needs it.
- People take it for granted, like it should always be there.
- Most of us know where to get some.
- But not many know where it was sourced.
- Or how much it really costs.
- Certainly nobody likes to pay for it.
- And many don’t like sharing it with strangers, only with friends.
- Containers of it often leak and need maintenance.
- Easy contamination means it can need purifying.
- But you often only find out if it was bad after you have consumed it.
- And we all know less about where it goes than where it came from.
I hope some of the points resonate, and I would welcome further examples, both serious (“If you have a lot of it, it can be very valuable.”?) and slightly more frivolous (“The largest providers of it do not negotiate on price.”?). Anyway, the water analogy above might be a good way to explain the challenges of data management and data lineage more easily to some of your colleagues, without hitting them too heavily with technical jargon from the outset.
And with supposedly “dry” topics like data lineage, combined with (ironically?) opaque concepts like “semantics” and the typical abbreviation fest found in technology, we inside the data management sector really need to do more to help the non-specialist understand key data issues and how we can help them build and accelerate their business. Improved data transformation is supported and indeed justified by better terminology translation. So next time you feel tempted to roll out another TLA (“Three-Letter Abbreviation” for those of you not in the know…), don’t. Think again and keep it simple!