What is Enterprise Data Operations and Orchestration (EDO2)?

Written by Todd Goldman | Category: Data Operations

As the fields of data analytics and data integration continue to mature, it is increasingly clear that keeping pace with the accelerating demands of business requires a more holistic, enterprise-wide view, and that automation, end-to-end integration, and infrastructure abstraction are foundational to success. However, one challenge we have noticed over the past few years is the lack of a distinct vocabulary that reflects the evolution of the data analytics, data integration, and data management markets. Terms like ETL, Kafka, Hadoop, and Spark refer to specific technologies and don’t capture the breadth of the challenge. At the same time, newer terms like DataOps refer to a process. While there is a lot of vendor hype around DataOps in particular, many industry analysts have noted that since it is a process, there really is no such thing as DataOps software. As Nick Heudecker of Gartner wrote, “DataOps is a practice, not a technology or tool; you cannot buy it in an application.”

This leaves a gap in how best to describe the requirements for data management when distributed data execution and storage platforms are constantly evolving and companies are balancing on-premises and cloud data architectures. Enterprise Data Operations and Orchestration (EDO2) is a concept meant to directly reflect new ways of thinking about managing data and data pipelines as a critical business process, much as Enterprise Resource Planning (ERP) defined a market to address “deep operational end-to-end processes, such as those found in finance, HR, distribution, manufacturing, service and the supply chain” (Gartner).

Enterprise Data Operations and Orchestration (EDO2) refers to the systems and processes that enable businesses to organize and manage data from disparate sources and process the data for delivery to analytic applications.

EDO2 systems aim for shorter development cycles, increased deployment frequency, and more dependable releases of data pipelines, in close alignment with business objectives. EDO2 is an integrated software system designed to automate the main steps of data pipeline development and operationalization, from source to consumption by analytics applications. Historically, data integration platforms have provided independent modules for each step in the development and management of data pipelines and workflows. In contrast, EDO2 systems integrate these modules into a fully unified system that provides a more holistic and agile environment for delivering data at scale in support of a growing number of analytics use cases.

This includes modules and processes (further defined below) for:

  • Self-Service, End-to-End Data Pipeline Development for data analysis at any speed
  • Data Pipeline Operationalization for diversified data architectures
  • Data Pipeline Orchestration for multi-cloud and hybrid environments
  • Team Based Development to leverage and share reusable artifacts
  • Data and Process Governance for better management control and compliance reporting
  • Data Pipeline Portability for multi-cloud and hybrid environments

EDO2 is not dependent on a specific data processing or integration technology (e.g. ETL, Hadoop, Kafka, Spark etc.), but delivers a semantic layer of systems and processes that provide independence and portability across different data processing and integration technologies in support of shorter development, deployment and dependable release cycles.
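To make the semantic-layer idea concrete, here is a minimal sketch of how a pipeline definition can stay independent of the engine that executes it. All class and function names below are invented for illustration; they are not from any actual EDO2 product.

```python
# Illustrative sketch only: a hypothetical semantic layer that keeps a
# pipeline definition independent of the engine that executes it.
# Every name here is invented for illustration.

class Pipeline:
    """An engine-agnostic description of a data pipeline."""
    def __init__(self, name):
        self.name = name
        self.steps = []          # ordered transformation steps

    def step(self, fn):
        self.steps.append(fn)    # register a transformation, allow chaining
        return self

class LocalEngine:
    """One possible execution backend. A Spark or cloud engine could
    implement the same run() interface without changing the pipeline."""
    def run(self, pipeline, records):
        for fn in pipeline.steps:
            records = [fn(r) for r in records]
        return records

# Define the pipeline once...
p = Pipeline("orders").step(lambda r: {**r, "total": r["qty"] * r["price"]})

# ...then execute it on whichever engine the environment provides.
result = LocalEngine().run(p, [{"qty": 2, "price": 5.0}])
```

Because the pipeline holds only the transformation logic, swapping execution technologies means swapping the engine object, which is the portability property described above.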

Additional characteristics of more advanced EDO2 systems include:

  • High levels of automation – because the end-to-end data orchestration and engineering processes are quite complex, more sophisticated systems automate significant aspects of the development, management and orchestration processes to reduce the level of complexity and knowledge required to successfully deploy EDO2 into production.
  • Reusability – all artifacts created in an EDO2 system should be sharable and reusable across diverse execution environments. Data ingestion services, transformation logic, and data workflows are examples of development artifacts that should be available to be copied and reused, or copied and modified for new uses.
  • Extensibility – Enterprise Data Operations and Orchestration systems must cover a wide variety of technologies and operate in a constantly evolving environment of distributed execution and storage frameworks. As a result, they need to support programmatic interfaces that allow users to extend the EDO2 system to fit into their environment and extend its functionality as ecosystems rapidly evolve.
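One common way to provide the programmatic extensibility described in the last bullet is a plugin registry. The sketch below is purely illustrative (the registry, decorator, and connector names are invented, not an actual product API), but it shows the general pattern: users register their own connectors, and the platform instantiates them by name.

```python
# Illustrative sketch only: a hypothetical plugin registry showing how a
# system might let users add their own connectors after the product ships.
# All names here are invented for illustration.

CONNECTORS = {}

def register_connector(name):
    """Decorator that adds a user-defined connector class to the registry."""
    def wrap(cls):
        CONNECTORS[name] = cls
        return cls
    return wrap

@register_connector("csv")
class CsvConnector:
    """A user-supplied connector that parses simple comma-separated text."""
    def read(self, text):
        header, *rows = [line.split(",") for line in text.strip().splitlines()]
        return [dict(zip(header, row)) for row in rows]

# The platform can now look up connectors by name, including connectors
# registered long after the core system was built.
rows = CONNECTORS["csv"]().read("id,name\n1,alpha\n2,beta")
```

The design choice worth noting is that the core system never imports user code directly; it only consults the registry, which is what keeps the platform open to a rapidly evolving ecosystem.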

This post is a high-level description of Enterprise Data Operations and Orchestration (EDO2). Future blog posts will go into more detail on both the benefits and core capabilities of EDO2 systems, so check back here in the coming weeks to learn more.

 

————————

 

If you are interested in learning more about EDO2 implementations, check out the rest of the www.infoworks.io website.

About this Author
Todd Goldman
Todd is the VP of Marketing and a Silicon Valley veteran with more than 20 years of experience in marketing and general management. Prior to Infoworks, Todd was the CMO of Waterline Data and COO at Bina Technologies (acquired by Roche Sequencing). Before Bina, Todd was Vice President and General Manager for Enterprise Data Integration at Informatica, where he was responsible for their $200MM PowerCenter software product line. Todd has also held marketing and leadership roles at both start-ups and large organizations including Nlyte, Exeros (acquired by IBM), ScaleMP, Netscape/AOL and HP.
