Big Data Market Evolution: Strata 2018 vs Strata 2014

Written by Todd Goldman | Category: Big Data

Last week my colleagues and I attended the most recent Strata Data Conference in NYC. Four years ago I attended my first Strata + Hadoop World conference, also in NYC, and I have to say that the event and the market have evolved considerably in just four short years.

First of all, the obvious point is that Strata dropped the “Hadoop World” moniker last year and repositioned itself as a data, data engineering and data science conference. But consider that only four years ago, vendors like Mellanox, a supplier of InfiniBand network interconnect technology, were attending Strata. That far fewer component suppliers are in attendance now is a sign that the market has matured significantly. This is reinforced by the fact that big data cloud suppliers are now very visible, with Google Cloud, AWS and Microsoft Azure out in full force.

Only four short years ago, the main deployment model for big data was to buy some hardware, connect it together with InfiniBand, build out a cluster in your own data center, implement a Hadoop distribution and start writing code. This year at Strata, there was much more talk about cloud deployments (even Cloudera has one now) and about tooling that hides the complexity of having to manually code and operate everything. In fact, the big data market is starting to sound a lot more like the relational data management market, in the sense that it pits the hand coders against those who want to automate their deployments using commercial “no-code” development and operational platforms.

Another big change, of course, is that it isn’t just about Hadoop anymore, hence the name change of the event. Everyone is now talking about Spark and serverless implementations, and most of the community has moved on, or is moving on, from MapReduce.

A very positive change this year was the number of conversations we had that were about real projects with real deadlines. Four short years ago, most of the conversations were educational in nature, and attendees were learning about all of the big data components. They were just starting to understand how Hadoop might fit into their data ecosystem, and their questions were very fundamental. They were talking about putting together 3- and 5-node clusters and putting architecture teams in place to experiment. The prevailing view of big data was that it was something a bunch of data scientists would use to look for new opportunities.

Fast forward four years, and there is a realization that after you have finished experimenting and a data scientist has found some new insight, you have to be able to put that insight into production. That means operationalizing your data pipelines in much the same way you used to for relational data sources. So at Strata in 2018, there were a lot more questions and discussions about data governance, data security and data lineage. There is a clear mainstreaming of big data analytics, and people have finally realized that the statement “in the big data world, you don’t need to worry about data quality” is just not true. As companies look to use so-called big data techniques to solve real-world problems, data quality still counts, and it has once again become a topic of discussion within the overall “governance” conversation.

And of course, streaming, machine learning and AI, which were at the bleeding edge four years ago, were also big topics of discussion. While still not fully mainstream, they are a much bigger part of the conversation than they were.

One last important point is that there were clearly a lot fewer attendees and a lot fewer companies. I didn’t do a full count of the vendors, but the 2014 list is about 30% longer (by my very quick, rough count) than the 2018 list. And while I don’t have an attendee count, those of us who were in attendance in 2014 consistently felt that it was much more crowded back then. I recall the trade show aisles being so packed that I could barely move through them. But while there was more activity and buzz, there was a lot less actual business. Most people back then were just “kicking tires.”

So what is my bottom-line takeaway from Strata 2018 as compared to 2014? Less traffic, more real business. The market is clearly maturing, and what was once a playground for big data nerds is turning into a trade show for a group of technologies and users that are beginning to have a real effect on business decision making.


About this Author
Todd Goldman
Todd is the VP of Marketing and a Silicon Valley veteran with more than 20 years of experience in marketing and general management. Prior to Infoworks, Todd was the CMO of Waterline Data and COO at Bina Technologies (acquired by Roche Sequencing). Before Bina, Todd was Vice President and General Manager for Enterprise Data Integration at Informatica, where he was responsible for the $200MM PowerCenter software product line. Todd has also held marketing and leadership roles at both start-ups and large organizations, including Nlyte, Exeros (acquired by IBM), ScaleMP, Netscape/AOL and HP.
