Rupert Brown of UBS gave the keynote at this Spring’s A-Team Data Management Summit (DMS). Rupert’s talk was about understanding what data exists within a financial institution, where it comes from and where it goes to. Rupert started by asking the question “Where are we?”, illustrating it with a map of systems and data flows for an institution – as I recall, he said it stretched to 7 metres in length, and it did not look accessible or easy to understand. He asked what dimensions a “map” of data should have, wondering which ones are analogous to latitude, longitude, altitude and orientation. He suggested function, product, process, accounting or legal entity as potential candidates.
Rupert then took a brief detour into his love of trains with a little history of the London Underground map. He started by mentioning the role of George Dow, who illustrated maps for train routes as a single line, showing just dependency and lineage (which stations are next etc.) and ignoring geography and distance. This was built upon by another gentleman, Harry Beck, who took these ideas a stage further with the early ancestors of the current Underground map, showing the routes while interweaving all the lines together into a map that additionally was topologically sufficient (indicating broad direction – NESW).
Continuing the analogy between the Underground and maps of data, Rupert then mentioned Frank Pick, who created the Underground brand. By creating such an identifiable brand, Frank effectively got people to believe in and refer to the map, and Rupert suggested that people in data governance could benefit from taking a similar approach to data management. I guess it is easy to take the maps we see every day for granted, and particularly some of the thought that went into them: ideas that initially were not intuitive (or at least not directly representative of physical reality) but that greatly improved understanding and comprehension. Put another way, representing reality one for one does not necessarily get you to something that is easy to understand (sounds like a “model” to me).
Rupert then described some of his efforts using OpenStreetMap to map data, making use of its concepts of nodes, ways and areas. Apparently he had implemented this using a NoSQL database (MarkLogic) for performance reasons (it doesn’t sound like a really “big data” sized problem, with several hundred apps and several thousand data transports, but nevertheless he said it was needed, maybe as a result of its graph-like nature?). He said that the data was crowdsourced in order to refine it, with a wiki for annotations. He said he was interested in the bitemporality of data, i.e. how the map changes over time. He advised that every application should also be thought of as its own “databus”, in addition to any de facto databuses that might be present in the architecture.
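To make the nodes, ways and areas idea a little more concrete, here is a minimal sketch of how such a data map might be modelled. It is only my own illustration and not Rupert’s implementation (it is plain Python, with no MarkLogic involved). All of the names are hypothetical, and I am assuming that “bitemporal” means each data transport carries both a validity period and the date on which it was recorded in the map.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Node:
    """An application or data store (hypothetical example names below)."""
    id: str
    tags: dict = field(default_factory=dict)


@dataclass
class Way:
    """A data transport: an ordered path through nodes, with two time axes."""
    id: str
    node_ids: list             # ordered, upstream to downstream
    valid_from: date           # when the flow started to exist
    valid_to: Optional[date]   # None means the flow is still in effect
    recorded_on: date          # when the flow was entered into the map


@dataclass
class Area:
    """A grouping of nodes, e.g. by function or legal entity."""
    id: str
    node_ids: set
    tags: dict = field(default_factory=dict)


def ways_as_of(ways, valid_on, known_on):
    """Flows in effect on valid_on, using only what was recorded by known_on."""
    return [
        w for w in ways
        if w.recorded_on <= known_on
        and w.valid_from <= valid_on
        and (w.valid_to is None or valid_on < w.valid_to)
    ]


def upstream(ways, target):
    """Ids of every node that feeds target, directly or indirectly."""
    feeders = {}
    for w in ways:
        for src, dst in zip(w.node_ids, w.node_ids[1:]):
            feeders.setdefault(dst, set()).add(src)
    seen, stack = set(), [target]
    while stack:
        for src in feeders.get(stack.pop(), ()):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen


# Hypothetical sample data, invented purely for illustration.
front_office = Area("a1", {"trade_capture"}, {"function": "front office"})
ways = [
    Way("w1", ["trade_capture", "risk_engine", "reporting"],
        date(2012, 1, 1), None, date(2012, 1, 15)),
    Way("w2", ["ref_data", "risk_engine"],
        date(2013, 6, 1), None, date(2013, 6, 10)),
]

map_2012 = ways_as_of(ways, date(2012, 12, 31), date(2012, 12, 31))
map_2014 = ways_as_of(ways, date(2014, 1, 1), date(2014, 1, 1))
lineage_2012 = upstream(map_2012, "reporting")
lineage_2014 = upstream(map_2014, "reporting")
```

Asking for the lineage of the reporting system against the two snapshots gives different answers, because the reference data feed only appears in the later one. That is roughly the “how does the map change over time” question, albeit in a very simplified form.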
In summary, the talk was interesting, but it was clear from what Rupert showed that we have a long way to go in representing clearly and easily where data came from, where it goes to and how it is used. I think Rupert acknowledges this and has some academic partnerships trying to develop better ways of representing and visualizing data. Certainly data lineage and an audit trail on everything is a hot topic for many of our clients currently, and something that deserves more attention. You can download Rupert’s presentation here and the A-Team’s take on his talk can be found here.