AWS Glue DataBrew

AWS DataBrew

The course is about AWS Glue DataBrew.

What you’ll learn

  • AWS Glue DataBrew.
  • Transformations using DataBrew.
  • Cleaning up and normalizing data.
  • Data Profiling.
  • Scheduling DataBrew Jobs.

Course Content

  • Introduction: 2 lectures • 3 min.
  • Creating Data Set: 2 lectures • 8 min.
  • Filter and Column: 2 lectures • 4 min.
  • Format, Clean and Extract: 3 lectures • 8 min.
  • Missing, Invalid, Duplicates and Outliers: 4 lectures • 7 min.
  • Split, Merge and Create: 3 lectures • 7 min.
  • Group, Join and Union: 3 lectures • 8 min.

Requirements

AWS Glue DataBrew is a new visual data preparation tool that makes it easy for data analysts and data scientists to clean and normalize data for analytics and machine learning. You can choose from over 250 pre-built transformations to automate data preparation tasks, all without writing any code. For example, you can automate filtering anomalies, converting data to standard formats, correcting invalid values, and other tasks. After your data is ready, you can immediately use it for analytics and machine learning projects. You pay only for what you use, with no upfront commitment.

Using DataBrew, business analysts, data scientists, and data engineers can more easily collaborate to get insights from raw data. Because DataBrew is serverless, you can explore and transform terabytes of raw data, whatever your technical level, without creating clusters or managing any infrastructure.

With the intuitive DataBrew interface, you can interactively discover, visualize, clean, and transform raw data. DataBrew makes smart suggestions to help you identify data quality issues that can be difficult to find and time-consuming to fix. With DataBrew preparing your data, you can use your time to act on the results and iterate more quickly. You can save transformations as steps in a recipe, which you can update or reuse later with other datasets and run on a recurring basis.
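
Although recipes are normally built in the visual interface, it can help to see that a recipe is just an ordered list of steps. The sketch below is a minimal illustration, not part of the course material: it assembles a recipe in the Action/Operation/Parameters shape that the DataBrew API uses. The recipe name, the column names, and the helper functions `make_step` and `build_recipe_request` are hypothetical, and the operation names `UPPER_CASE` and `RENAME` with their parameters should be checked against the DataBrew recipe actions reference before use.

```python
import json


def make_step(operation, **parameters):
    """Build one recipe step (hypothetical helper).

    Parameter values are converted to strings, since the recipe
    format stores parameters as a string-to-string map.
    """
    return {
        "Action": {
            "Operation": operation,
            "Parameters": {key: str(value) for key, value in parameters.items()},
        }
    }


def build_recipe_request(name, steps, description=""):
    """Assemble keyword arguments for a create-recipe call (hypothetical helper)."""
    request = {"Name": name, "Steps": list(steps)}
    if description:
        request["Description"] = description
    return request


# Column names below are made up for illustration.
steps = [
    # Assumed operation: convert the values of one column to upper case.
    make_step("UPPER_CASE", sourceColumn="country"),
    # Assumed operation: rename a column.
    make_step("RENAME", sourceColumn="cust_id", targetColumn="customer_id"),
]

request = build_recipe_request(
    "customer-cleanup", steps, description="Normalize customer records"
)

# Print the recipe so it can be inspected before anything is sent to AWS.
print(json.dumps(request, indent=2))
```

With AWS credentials configured and boto3 installed, the same dictionary could be passed on as `boto3.client("databrew").create_recipe(**request)`. Keeping the steps as plain data is what makes a recipe easy to update or reuse with another dataset.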
