Welcome to the Infoworks DataFoundry Overview. This course is designed to introduce system operators and architects, database administrators, data engineers, data scientists, and managers working with these teams, to the value and high-level capabilities of Infoworks DataFoundry. Let's get started!
It's no news to you that the quantities of data available to enterprises, and increasingly required for them to remain competitive, keep exploding, with no sign of slowing.
Engineering a useful response to these ever-widening rivers of data is ... a challenge. We'll start by taking a look at some specifics of this challenge, so you can begin mapping them to your own real-world concerns.
Then we'll get to work, understanding how DataFoundry offers a solution to these engineering challenges. To begin orienting you to the solution, we'll dive right into DataFoundry, configure a few data sources, and set up a governance domain.
Next, we'll explore Ingestion, the process of channeling rivers of data into a Unified Platform, your "data lake." Naturally, ingestion becomes an ongoing process over time.
Of course, it's not enough to merely ingest data into a common lake. The power of a Unified Data Platform emerges as you transform your data, breaking down silos by merging and melding data from disparate systems into the common collections you need. With Infoworks, this is done through automated and coordinated transformation pipelines. We'll demonstrate how easy it is to configure a transformation pipeline that might otherwise take days or even weeks to code.
As a data lake evolves, and external analytic, reporting, and machine learning tools are attached to take advantage of it, your need for speed emerges. One size does not fit all. In this lesson, we'll survey the options you have for optimizing access speeds.
Last, we'll see how large-scale efficiency emerges as you begin to orchestrate data access from multiple external systems through a single user experience, coordinating ingestion, transformation, optimization, and potentially re-export among all your disparate data silos. We'll demonstrate how managing these processes among multiple systems can be visually designed, configured, monitored, and controlled.
While anyone new to Infoworks will learn a lot from this short introductory course, you’ll gain the most if you have a basic understanding of Spark and Hadoop, and the problems they solve, along with basic familiarity with cloud technologies. We’ll also assume you’re familiar with relational database management systems, SQL, and similar technologies for data access, management, and transfer.
Now that you know where we’re going, let’s fire it up! We’ve got a lot to cover ahead. Come on back.