Automation

Agile Data Engineering is the New Black

Posted by Todd Goldman

It’s no secret that the inherent value of a piece of data is moot if it isn’t used to deliver insight in time to make a difference within the organization. It isn’t only that opportunities are missed when your data’s journey to the front lines is delayed; the data itself degrades over time. What your customers wanted last month may not be what they want now or tomorrow.

But there’s something else: putting data to work for your organization sooner rather than later also means discovering, sooner rather than later, whether or not the data was used wisely or well. Did the decision it informed result in new business value? Was it a mistake? Can new insights be derived from the data’s performance to drive better decision making going forward?

In a much-covered letter to shareholders last year (and something I’ve written about before as well), Amazon founder and CEO Jeff Bezos wrote, “Most decisions should be made with only about 70% of the information you need. If you wait for 90% or more you’re moving too slowly.” Yes, that may mean being wrong sometimes, but Bezos reasons, “If you’re good at course correcting, being wrong may be less costly than you think, whereas being slow is going to be expensive for sure.”

You can infer from Bezos’s letter that data’s role in “course correcting” is where the real value is, and it’s something I wholeheartedly agree with. Organizations need to stop thinking about data as a pile of gold they’re sitting on and start seeing it as something that flows (and should always be flowing) like water or oxygen. If data isn’t constantly moving throughout the enterprise, the enterprise dies. Sure, you’re bound to make some mistakes. But as long as the data is flowing, those mistakes bring value too; they’re insights that give you the ability to course correct more frequently.

The implication is that it isn’t just the speed at which you can move data that is important. It is the agility with which you can formulate a hypothesis, identify the data you need to test that hypothesis, build the data pipelines to integrate the data, and generate the data visualizations you need to interpret the results and take action. The industry has been focusing on the real-time streaming of data to reports. But if it still takes three months to generate a new report or dashboard, and the result of viewing that dashboard is that you have to make a course correction, and that course correction takes another three months before you see the new dashboard, then all that streaming data isn’t very real-time after all.

Note that I am not arguing against real-time streaming of data. I am only making the point that in addition to delivery of data in real time, you need the ability to create new kinds of analytics use cases with greater agility as well. After all, if you want to “fail fast” you have to be able to act fast and react to what you learn even faster.

Real-time data may still be a hot topic; it has been for years. But if you believe in “failing fast” and course correcting like Amazon, it’s agile data engineering, and the speed with which you can deliver and manage new analytics use cases, that will ultimately determine whether you will be a winner or a loser when it comes to leveraging your data for strategic advantage.
