An industry-recognized certification lets you add this credential to your resume upon completion of all courses.

Ellie Ordway, Instructor - Comprehensive Pig

Ellie Ordway-West is a physicist and professional data scientist. She holds a Master of Science degree in Physics from the University of Missouri-St. Louis. She has deep expertise in Apache Pig and has built several data science pipelines in which all data preprocessing and aggregation was done in Apache Pig. Currently, she uses machine learning algorithms to support data-driven decisions at one of the largest telecom companies in the world.

An In-depth Training for Apache Pig

  • Unlimited access to online self-paced videos
  • Taught by a data scientist at one of the largest telecom companies in the world
  • More than 40 quizzes and programming exercises

Duration: 2h 15m

Course Description

This course is a general overview of the Apache Pig framework. It introduces the structure and methodologies of Apache Pig and gives an overview of Pig Latin, the language of Apache Pig. No prior knowledge of Pig or Pig Latin is assumed, but familiarity with one other programming language, such as Python, may be helpful. The course includes interactive tutorials for processing and aggregating data with Apache Pig. It covers much of the functionality built into the language, as well as how to incorporate user-defined functions into Pig scripts to extend them further. By the end, you should be able to read and understand Pig code and write your own scripts that you can run in the interactive Grunt shell or directly from the command line.
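
For readers coming from Python, the kind of processing and aggregating described above can be sketched roughly as follows. This is only an analogy, not material from the course: the sample rows, the field names (region, minutes), the file name in the comments, and the function name total_minutes_by_region are all hypothetical. The comments show the Pig Latin statements that would play a similar role.

```python
from collections import defaultdict

# Hypothetical sample rows of (region, minutes); not data from the course.
# A comparable Pig Latin load (file name assumed) might read:
#   records = LOAD 'calls.csv' USING PigStorage(',')
#             AS (region:chararray, minutes:int);
ROWS = [("east", 10), ("west", 5), ("east", 7)]


def total_minutes_by_region(rows):
    """Group rows by region and sum the minutes in each group.

    Roughly the plain-Python counterpart of:
      grouped = GROUP records BY region;
      totals  = FOREACH grouped GENERATE group AS region,
                SUM(records.minutes) AS total_minutes;
    """
    totals = defaultdict(int)
    for region, minutes in rows:
        totals[region] += minutes
    return dict(totals)


if __name__ == "__main__":
    # Comparable to DUMP totals; in the Grunt shell.
    print(total_minutes_by_region(ROWS))
```

In Pig, the same steps are written declaratively and executed on a Hadoop cluster, which is what makes the approach practical for data sets too large for a single machine.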

What am I going to get from this course?

  • Process and aggregate data with Apache Pig

Prerequisites and Target Audience

What will students need to know or do before starting this course?

  • Some familiarity with Hadoop and MapReduce is helpful, but not necessary.

Who should take this course? Who should not?

  • Anyone who will be working with large data sets, including data engineers, data scientists, and developers


Module 1: Introduction

Lecture 1 Intro and Overview
Lecture 2 About the Instructor

Module 2: What is Pig?

Lecture 3 So what is Pig anyway?
Quiz 1
Lecture 4 Why It's Called Pig
Quiz 2

Module 3: Data Types

Lecture 5 Basic Types 1
Quiz 3 Basic Data Types Q1
Lecture 6 Basic Types 2
Quiz 4 Basic Data Types Q2
Lecture 7 Non Basic Types
Quiz 5 Non Basic Types Quiz
Lecture 8 Nulls vs Empty

Module 4: Getting Started with Pig

Lecture 9 Introduction to the Data
Lecture 10 Getting Hadoop
Quiz 6 Setting up Hadoop

If you don't have access to a Hadoop environment, download and set up a sandbox now.

Lecture 11 Starting Hadoop and Moving Data
Quiz 7 Start Hadoop and Move Data
Lecture 12 Three Ways to Run Pig Commands
Lecture 13 Utility Commands: Help and Quit
Quiz 8 Try it out: Help and Quit
Lecture 14 Common Development Environments

Module 5: Basic Elements of a Pig Script

Lecture 15 Pig Latin Statements
Lecture 16 Load Data
Quiz 9 Load Data Quiz
Lecture 17 Store/Dump Data
Quiz 10 Store/Dump Quiz
Lecture 18 Setting up Sublime Text
Quiz 11 Set up Sublime Text Exercise
Lecture 19 Load Data Example
Quiz 12 Load Data Exercise
Lecture 20 Store/Dump Example
Lecture 21 Quick Note about Pig Logs

Module 6: Relational Operators

Lecture 22 Describe
Quiz 13 Describe Exercise
Lecture 23 Limit and Sample
Lecture 24 Group
Lecture 25 Foreach
Quiz 14 Group Exercise
Lecture 26 Flatten
Lecture 27 Join
Quiz 15 Join Exercise
Lecture 28 Disambiguation
Quiz 16 Disambiguation Exercise
Lecture 29 Union
Lecture 30 Cogroup
Lecture 31 Distinct
Lecture 32 Cross
Lecture 33 Filter
Quiz 17 Filter Exercise
Lecture 34 Split
Quiz 18 Split Exercise
Lecture 35 Conditional Statements
Lecture 36 Order
Quiz 19 Order By Exercise
Lecture 37 Rank
Lecture 38 Nested Foreach
Quiz 20 Nested ForEach Exercise

Module 7: Built In Functions

Lecture 39 Intro
Lecture 40 Eval Functions
Lecture 41 Eval Functions 2
Quiz 21 Eval Functions Exercise
Lecture 42 Arithmetic Functions
Quiz 22 Arithmetic Functions Exercise
Lecture 43 Datetime Functions
Lecture 44 String Functions
Quiz 23 String Functions Exercise
Lecture 45 Tuple/map/bag
Lecture 46 User Defined Functions

Module 8: Configuring Pig

Lecture 47 Part 3 Intro
Lecture 48 Parametrization
Lecture 49 Utility Commands


8 Reviews

Abel C

December, 2016

What a kick-ass intro to Apache Pig and its language, Pig Latin! If you are working with large data sets, then this course will be very useful. Lots of quizzes and programming tasks make this course very approachable. I was able to learn a lot in a short time. If you are looking to get started with Pig, take this course. Great course overall. Good number of examples to help you master the subject matter. Four and a half stars!

Kevin L

May, 2017

As a big data scientist, I find this an excellent course to learn from, especially the unlimited access to online self-paced videos. It is also a collaborative learning experience, with coding exercises and many quizzes and programming exercises. It helped me understand how to write Pig scripts and extend their functionality. My knowledge was a decade old, and the course helped me catch up on current developments. It helped me work very well with large data sets.

Anna D

May, 2017

This is really an in-depth training program for learning Apache Pig. As the instructor promised in the tutorial, I was able to read and understand Pig code and write my own scripts.

Jake H

May, 2017

I find the interactive tutorials an excellent idea. They helped me grasp the material very well, including incorporating user-defined functions into Pig scripts. It was an invaluable experience.

Murali K

July, 2017

This is undoubtedly an intelligent instructional curriculum for learning Apache Pig. As a data research associate, I’m so glad I ran into this remarkable course. It encouraged me to work quite effectively with large data sets.

Errol T

July, 2017

Awesome course. Easy to figure out. Thank you for making this course! The subject was well organized, the demonstrations were great and there was a good balance between theory and real problems that you had to deal with yourself.

Bruno M

July, 2017

Indeed, as vouched for by the teacher, I was able to understand and interpret Pig code and write my own scripts. I picked up the tutorials on working with user-defined functions in Pig scripts very well. It’s valuable knowledge.

Naveen K

July, 2017

Tutorials were designed well. As a data research worker, I found this course helpful because my knowledge was out of date, and I learned about contemporary developments. It is also collaborative training, with coding practice and many questions and programming activities. It pushed me to learn to write Pig scripts and extend their functionality.