Servian Cloud and Technology Services

Data Engineering on Google Cloud (4 days)

This four-day course covers both how to use Google Cloud tools and how to maximise cloud computing efficiency. Our Google Cloud certified Servian instructors will cover cloud design, building end-to-end data pipelines, analysing data, carrying out machine learning and more. The course dives into the main Google Cloud tools, providing an overview of Cloud Dataflow, large BigQuery datasets, Dataflow with Pub/Sub, TensorFlow and Spark on Dataproc. This class is for developers who have skills in ETL, SQL and statistics, and familiarity with machine learning concepts in Python and/or Java.


This course teaches participants the following skills:

  • Design and build data processing systems on Google Cloud Platform
  • Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
  • Derive business insights from extremely large datasets using Google BigQuery
  • Train, evaluate and make predictions with machine learning models using TensorFlow and Cloud ML
  • Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
  • Enable instant insights from streaming data


To get the most out of this course, participants should have:

  • Completed the Google Cloud Fundamentals: Big Data & Machine Learning course, or have equivalent experience
  • Basic proficiency with a common query language such as SQL
  • Experience with data modeling and extract, transform, load (ETL) activities
  • Experience developing applications using a common programming language such as Python
  • Familiarity with Machine Learning and/or statistics

Course Outline

Day 1

  • Module 1: Google Cloud Dataproc Overview
  • Module 2: Running Dataproc Jobs
  • Module 3: Integrating Dataproc with Google Cloud Platform
  • Module 4: Making Sense of Unstructured Data with Google’s Machine Learning APIs
    Serverless Data Analysis with Google BigQuery and Cloud Dataflow (also available on demand)
  • Module 5: Serverless data analysis with BigQuery

Day 2

  • Module 6: Serverless, autoscaling data pipelines with Dataflow
    Serverless Machine Learning with TensorFlow on Google Cloud Platform (also available on demand)
  • Module 7: Getting started with Machine Learning

Day 3

  • Module 8: Building ML models with TensorFlow
  • Module 9: Scaling ML models with Cloud ML
  • Module 10: Feature Engineering
    Building Resilient Streaming Systems on Google Cloud Platform (also available on demand)

Day 4

  • Module 11: Architecture of streaming analytics pipelines
  • Module 12: Ingesting Variable Volumes
  • Module 13: Implementing streaming pipelines
  • Module 14: Streaming analytics and dashboards
  • Module 15: High throughput and low latency with Bigtable
  • Summary

Want to learn how to enable instant insights from streaming data?
