Data Scientists can now create Garbage from Truth itself!!!
I call it TIGO: Truth In Garbage Out
Are the worthy discipline of Data Management and its wiser sidekick Data Governance fighting a losing battle? Well, they might just be...because now you can dedicate much effort to getting such foundational stuff right and have it all twisted into something else that you didn't ask for and don't need.
I am, of course, talking about the rise of Artificial Intelligence, Machine Learning and, to some extent, the inference engines that support the perceived value case for graph-based modelling. All these much-hyped technologies use algorithms to discover new things in your data.
The mindset is quite compelling though...applications are rubbish at recording the full picture a lot of the time and can't talk to each other very well...so let's train machines to fix this and we'll all reach data heaven...all our wildest dreams will come true.
Now, the Data Management tribe would say "Let's fix the applications then!"...but that's not the fashion nowadays...far too difficult...instead we have an ever-increasing set of people dedicated to doing all this in a more "sexy" way...using a computer's ability to crunch massive amounts of data to get to a quick answer...
But I'm more than just a bit uncomfortable about where this is all heading, and about the fact that the foundational message is getting lost in the noise of quick wins and unwashed promises. I believe that data belongs to business and business is a people-centric endeavour. Yes, we can employ machines to help us...but not to think for us.
So, we now have the situation where we are training machines to think badly for us...to take sets of data that, at least, we can say contain a known truth within their own context...mash them all together, remodel them...and then get an inference or learning algorithm to generate revelations that are very often false, incomplete, misleading and/or biased.
To turn truth into supposition and guesswork...to create garbage from operational certainty and sell it as progress.
Now if we're all too lazy to do foundational stuff and want the quick fix all the time...governing and verifying all this detritus ain't going to happen either...business will gladly transfer accountability to the data scientists.
I say, be careful what we wish for, the TIGO effect will bite us hard in the long run...business will punish all data practitioners, regardless of discipline, for perceived insights, created by the well-meaning few, that prove to be damaging...they will tar us all with the same brush.
And worse, business won't believe us any more when we then ask them to invest in the fundamentals which, let's face it, would have solved more of their root cause problems in the first place.