Course Description

The bootcamp covers different aspects of Data Science through hands-on exercises and industrial use cases. Major modules include Python for Data Science, Data Analysis, Practical Machine Learning, Large Scale Machine Learning, Deep Learning, and Business Perspectives of Data Science. The program is offered by the Data Science Initiative (DSI), a team of industry and academic experts who have previously run 10 hands-on workshops with a rating of 4.5/5.0.

What am I going to get from this course?

Perform effective data analysis, which accounts for 50% to 80% of Data Science work
Investigate and present your work in effective visual form
Understand the most popular machine learning algorithms and their uses
Implement end-to-end machine learning pipelines for given use cases
Implement large-scale machine learning algorithms on the cloud and through APIs
Employ deep learning for unstructured data (images and text)
Build a small blockchain
Differentiate real data science from the buzz around it
Understand the Data Science business ecosystem and identify the right use cases

Prerequisites and Target Audience

What will students need to know or do before starting this course?

Participants may want to brush up on basic concepts of probability, matrices, and programming (in any language).

Who should take this course? Who should not?

You should join this Bootcamp if you are one of the following:

Business or IT professionals who wish to move into a data scientist role
Early-stage data scientists and data analysts who wish to learn end-to-end data science from a team of industry experts
Students and fresh graduates who want to pursue a career in Data Science, Machine Learning, and Artificial Intelligence
Researchers from any field who work with data and would like to employ Machine Learning in their research
Non-Python data scientists who would like a quick, hands-on transition to Python

Curriculum

Module 1: Data Science Landscape

Lecture 1 Data Science Landscape - Overview

This sub-module enables participants to differentiate real data science from the buzz around it and to develop a solution-oriented mindset. It is scheduled at the beginning of the workshop.
• The four building blocks of Data Science
• Common mistakes and best practices for data scientists
• Relation of Data Science to AI, Machine Learning, Deep Learning, and other fields
• Technology stack and choosing the best technology
• An overview of industrial applications and their requirements

Lecture 2 Data Science from Business Perspective

Scheduled in the middle of the workshop, this module introduces participants to the practical requirements of business use cases and to best practices when following CRISP-DM or the garage approach.
• Identifying and prioritizing data science use cases within an organization
• Translating business problems into ML problems
• Analytics roadmap for organizations
• Mapping of Design Thinking to Data Science
• Introduction to and requirements for ML project P2, which follows this session

Lecture 3 Data Science Roles and Methods

Scheduled at the end of the workshop, once participants have gone through the end-to-end data science journey, this module helps them understand which role fits them best, the best practices for it, and how to enter and excel in that role.
• Data Science value perspective
• Data Science teams and organizations
• Data Science roles (e.g. Data Engineer, Analyst, Data Scientist, ML Engineer, etc.)
• Methods such as CRISP-DM, garage, and Scrum
• Communication with non-data scientists
• The non-technical skills needed for excellence in Data Science

Module 2: Exploratory Data Analysis with Python

Lecture 4 Python Programming for Data Science

Python is an easy-to-understand scripting language, yet its compact programming style and vast number of libraries make it hard for learners to focus on what matters most. With a carefully designed content trajectory, this hands-on module gives participants a solid basis in Python for the rest of the workshop, which then covers the most relevant libraries and methods.
• Why Python is the most popular language for Data Scientists
• Introduction to Python as a language
• Python native data structures including List, Set, Dictionary, and Tuple
• NumPy and SciPy
• Control structures
• Functions and classes
• Hands-on exercises in Python
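As a small illustration of the native data structures listed above, the following sketch groups a few readings by city. The sample data and the `mean` helper are hypothetical and only serve the example; this is not material taken from the course itself.

```python
# Hypothetical sample data: (city, temperature in degrees Celsius) readings.
readings = [("Lahore", 31.5), ("Berlin", 18.0), ("Lahore", 33.0), ("Bonn", 17.5)]

# List: ordered and mutable; a comprehension filters it in one line.
warm = [temp for _, temp in readings if temp > 20]

# Set: unordered collection of unique items.
cities = {city for city, _ in readings}

# Dictionary: maps each city to the list of its readings.
by_city = {}
for city, temp in readings:
    by_city.setdefault(city, []).append(temp)

# Tuple: immutable, fixed-size record; unpacking gives each field a name.
first_city, first_temp = readings[0]


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


averages = {city: mean(temps) for city, temps in by_city.items()}
print(averages)
```

The same grouping could later be written far more compactly with a DataFrame library, which is one reason the native structures are worth knowing first.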

Lecture 5 Exploratory Data Analysis with Python

EDA is what consumes most of a Data Scientist's time (50% to 80%). This module equips participants with the best techniques for efficient and effective EDA.
• Introduction to EDA, common mistakes, best practices
• Introduction to the Pandas library
• DataFrame and Series data structures in pandas
• Reading data from different sources (CSV, web, Excel, JSON, SQL, TXT, etc.)
• DataFrame operations, e.g. filtering, filling, merging, conditioning, aggregation
• Summarization, outlier detection, and bird's-eye-view reporting
• Map-reduce for efficient operations on DataFrames
• EDA hands-on exercises
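To make the summarization and outlier-detection topics above concrete, here is a minimal sketch of the common interquartile-range (IQR) rule. It deliberately uses only the Python standard library rather than pandas, and the order counts, the function name `iqr_outliers`, and the cut-off `k=1.5` are assumptions chosen for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical daily order counts with one suspicious spike.
orders = [12, 15, 14, 13, 16, 15, 14, 120, 13, 15]

summary = {
    "count": len(orders),
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "outliers": iqr_outliers(orders),
}
print(summary)
```

Note how the single spike pulls the mean well above the median, which is exactly the kind of signal a quick summary is meant to surface before any modelling starts.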

Lecture 6 Data Visualization

Data visualization not only helps communicate results and findings to others but is equally important for data scientists themselves in order to understand the data. This module enables participants to develop quick and pretty visualizations using Python libraries.
• Good and bad types of visualizations
• Practical work with Matplotlib: making any visualization
• Practical work with Seaborn: making interactive and pretty visualizations
• Practical work with Plotly: making and deploying interactive and pretty visualizations

Lecture 7 Data Analysis Project

This module allows participants to test and improve the skills developed so far, i.e. EDA with pandas and the visualization libraries in Python. A business use case serves as the reference case for the exercises.
• Use case and problem statements
• Data loading and merging
• Data analysis
• Data cleaning
• Data exploration
• Data visualization

Module 3: Practical Machine Learning

Lecture 8 Basics of Machine Learning

This module provides a theoretical understanding of machine learning algorithms, how they work, and their advantages and limitations, thereby demystifying them for participants. The eventual aim is that participants can design their own ML solutions if needed.
• Types of Machine Learning algorithms and application scenarios
• Classification algorithms (e.g. Naïve Bayes, Decision Trees, KNN, ANN, Support Vector Machines)
• Regression algorithms (e.g. Linear/Ridge/Lasso Regression and the regression cousins of the classification algorithms above)
• Ensemble methods (e.g. Random Forests, Gradient Boosted Trees)
• Outlier detection (e.g. One-Class SVM, autoencoders)
• Clustering algorithms (e.g. K-Means, DBSCAN, Hierarchical clustering)
• Feature selection and dimensionality reduction (e.g. PCA, LDA, RFE, and other techniques)

Lecture 9 Practical Machine Learning with Python

Using Scikit-learn as the main library, this module enables participants to apply the machine learning process through model selection, model building, parameter optimization, and evaluation.
• Introduction to Scikit-learn
• Exploration of built-in datasets
• Building an ML model using Scikit-learn
• Splitting data into training, validation, and testing sets
• Cross-validation techniques
• Hyperparameter search using GridSearchCV
• ML exercises and assignment
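The idea behind the train/test split and cross-validation topics above can be sketched without any library. The function names, the seed, and the interleaved way folds are formed below are illustrative assumptions, not scikit-learn's implementation; in practice scikit-learn's own `train_test_split` and `KFold` would normally be used.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists; each index is validated exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hold out a test set first, then cross-validate on the remaining data only.
train_idx, test_idx = train_test_split_indices(n_samples=10, test_fraction=0.2)
for fold_train, fold_val in k_fold_indices(train_idx, k=4):
    print("train:", sorted(fold_train), "validate:", sorted(fold_val))
```

Keeping the test indices out of every fold is the point of the design: hyperparameters are tuned on the validation folds, and the test set is touched only once at the end.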

Lecture 10 ML Project – Mobility Prediction in Metropolis

This is the first of two Machine Learning projects. Its objective is to enable participants to apply the knowledge gained so far of the end-to-end data science process (EDA + ML) to a real-world scenario.
• Quick introduction to the use case
• Loading data from different sources
• EDA and data cleaning
• Building machine learning models
• Validation and testing of models
• Debugging machine learning models w.r.t. overfitting and underfitting

Lecture 11 Advanced ML with Scikit-Learn

The primary objective of this module is to enable participants to use Pipelines as an end-to-end machine learning construct. The module also covers feature selection and dimensionality reduction in more detail, with exercises.
• Curse of dimensionality
• Feature selection methods (univariate and multivariate methods)
• Dimensionality reduction exercises (PCA, LDA, RFE, etc.)
• Concept of transformations and operations in Python
• Introduction to Machine Learning Pipelines in Scikit-learn
• Building 2-step, 3-step, and k-step ML Pipelines (e.g. Feature Selection + Feature Engineering + Classification)
• Pipelines and GridSearchCV
• Exercises on all of the above
• The untold truth of Machine Learning
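The pipeline concept described above, a chain of transformation steps followed by a final estimator, can be sketched in plain Python. The classes `MeanCenter`, `SignClassifier`, and `ToyPipeline` are hypothetical toys written for this illustration; they mimic the fit/transform/predict convention but are not scikit-learn's `Pipeline`.

```python
class MeanCenter:
    """Transformer: subtracts the mean learned during fit from every value."""

    def fit(self, xs, ys=None):
        self.mean_ = sum(xs) / len(xs)
        return self

    def transform(self, xs):
        return [x - self.mean_ for x in xs]


class SignClassifier:
    """Final estimator: predicts 1 for positive inputs, 0 otherwise."""

    def fit(self, xs, ys=None):
        return self  # nothing to learn in this toy model

    def predict(self, xs):
        return [1 if x > 0 else 0 for x in xs]


class ToyPipeline:
    """Chains transformers and a final estimator into one object."""

    def __init__(self, steps):
        self.transformers = list(steps[:-1])
        self.estimator = steps[-1]

    def fit(self, xs, ys=None):
        for step in self.transformers:
            xs = step.fit(xs, ys).transform(xs)  # fit on training data only
        self.estimator.fit(xs, ys)
        return self

    def predict(self, xs):
        for step in self.transformers:
            xs = step.transform(xs)  # reuse statistics learned during fit
        return self.estimator.predict(xs)


pipe = ToyPipeline([MeanCenter(), SignClassifier()]).fit([1.0, 2.0, 3.0, 6.0])
print(pipe.predict([2.0, 3.0, 5.0]))  # [0, 0, 1]
```

Because the training mean is learned once in `fit` and reused in `predict`, no information from new data leaks into preprocessing, which is the main practical argument for wrapping all steps in a single pipeline object.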

Lecture 12 ML Project – Customer Churn Prediction

This is the second ML project, with the objective of exercising a full CRISP-DM cycle, i.e. domain understanding + data understanding + data analysis + model building + model optimization + model deployment (as a pipeline).
• Domain understanding from a problem statement
• Exploratory Data Analysis
• Feature engineering and feature selection
• Building machine learning models
• Validation and testing of models
• Doing it all within a Machine Learning Pipeline
• Applying CRISP-DM

Module 4: Data Science at Scale

Lecture 13 Machine Learning for Large Scale Applications

This module covers how machine learning is taken to large-scale applications: serving models through REST APIs, working with Apache Spark, and using cloud-based AI.
• Building ML apps with REST APIs using Flask
• Introduction to Apache Spark
• Machine Learning on Spark: a hands-on session
• Introduction to cloud-based Artificial Intelligence
• Architecture of large-scale AI applications

Lecture 14 Building Decentralized Applications: Blockchain Network

This module enables participants to understand and experiment with the concepts of decentralized applications.
• Understanding Blockchain
• Distributed ledgers
• Cryptocurrencies
• Blockchain potential: use cases
• Building a Blockchain application with Python (hands-on)
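The core idea of a blockchain, each block storing the hash of its predecessor, fits in a few lines of Python. The sketch below is a simplified assumption of what such a hands-on exercise might look like: the block fields and function names are hypothetical, and it leaves out proof-of-work, consensus, and networking entirely.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 hash of a block's contents, serialized deterministically."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain, data):
    """Append a block that stores the hash of its predecessor."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "previous_hash": previous})
    return chain


def is_valid(chain):
    """A chain is valid when every block points at the actual hash of the one before it."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
add_block(chain, "genesis")
add_block(chain, "Alice pays Bob 5")
add_block(chain, "Bob pays Carol 2")
print(is_valid(chain))  # True

chain[1]["data"] = "Alice pays Bob 500"  # tamper with history
print(is_valid(chain))  # False
```

Changing any earlier block changes its hash, so the link stored in the next block no longer matches; this is what makes a distributed ledger tamper-evident.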

Lecture 15 Deep Learning

Deep Learning is probably the most talked-about area in Data Science. This module equips participants with the understanding and practical experience of building a deep neural network using Keras/TensorFlow.
• Types and applications of Neural Networks
• Multi-layer Backpropagation Networks
• Activation functions (Sigmoid, Tanh, ReLU, etc.)
• Introduction to Convolutional Neural Networks
• Introduction to Keras
• Development of an image classifier using a Deep Neural Network
• Development of an image classifier using a Convolutional Neural Network
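The activation functions named above are simple to write down. The following sketch uses only the standard library; the `neuron` helper and the example weights are hypothetical and illustrate a single unit, not a Keras model.

```python
import math


def sigmoid(x):
    """Squashes a real number into (0, 1); fine for moderate inputs."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes a real number into (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Passes positive values through and clips negatives to zero."""
    return max(0.0, x)


def neuron(inputs, weights, bias, activation):
    """One artificial neuron: weighted sum plus bias, then an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical inputs and weights whose weighted sum is exactly zero.
inputs, weights = [1.0, 2.0], [0.5, -0.25]
for activation in (sigmoid, tanh, relu):
    print(activation.__name__, neuron(inputs, weights, 0.0, activation))
```

A layer in a deep network applies many such neurons to the same inputs, and the choice of activation determines how the signal is bounded as it passes through.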

Data Science in Practice - An Online Bootcamp

Instructor Led Online Classes

Instructors

Dr. Muhammad Shahzad Cheema

Dr. Chan Naseeb

Dr. Zubair Nawaz

Dr. Mohammed Kamran Malik

A hands-on and end-to-end Data Science Program
