Post Graduate Program in Data Science and Machine Learning

Program Highlights:
Mentorship Support, Resume Preparation, Praxis Certification, Tech Update Sessions, Real-Time Internships, Industry Projects

4 Industry Endorsements for this program

This Post Graduate Program in Data Science and Machine Learning blends technology, data science, and business cases and insights, and it stands out as among the best in the world. The program is brought to you by Praxis, a top-ranked analytics B-school in India.

Sneak Peek

INR 59,600

Program Summary

  • 30 credits

    With this course, you are 5 credits short of an assured placement.

  • Duration: 1 year
  • 90 hours of projects/assignments
  • 212 hours of online sessions

Course Topics

  • 1

    Big Data 101

    • Big Data Characteristics

      • Volume
      • Variety
      • Velocity
      • Veracity
      • Valence
      • Value
    • Big Data and Business

    • Data Relationships and Data Model

      • One-to-one relationship
      • One-to-many relationship
      • Many-to-many relationship
      • Flat model
      • Hierarchical model
      • Network model
      • Relational model
      • Star schema model
      • Data vault model
    • Data Grouping

    • Clustering Algorithms

      • Partitioning
      • Hierarchical
      • Grid-based
      • Density-based
      • Model-based
    • Getting ready for Clustering Algorithms

    • Clustering Algorithms – UPGMA, Single-Link Clustering

    • KPIs, Businesses & Data Elements

    • Mapping for business outcomes

      • Define the pain point
      • Define the goal
      • Identify the actors
      • Identify the impacts
      • Identify the deliverables
      • Creating your impact map
    • Basic Query

    • Advanced Query – Embedding

    • Introduction to key mathematical concepts

      • Eigenvalues and eigenvectors
    • Application of eigenvalues and eigenvectors

      • Investigate prototypical problems of ranking big data
    • Application of the graph Laplacian

      • Investigate prototypical problems of clustering big data
    • Application of PCA and SVD

      • Investigate prototypical problems of big data compression
  • 2

    Statistics 101

    • Introduction to Statistics

    • Introduction to Statistics – II

    • Measures of Central Tendency, Spread and Shape – I

    • Measures of Central Tendency, Spread and Shape – II

    • Measures of Central Tendency, Spread and Shape – III

  • 3

    R Programming

    • R Programming

    • Introduction to R – I

    • Introduction to R – II

    • Common Data Structures in R

    • Conditional Operation and Loops

    • Looping in R using Apply Family Functions

    • Creating User Defined Functions in R

    • Graphics with R

    • Advanced Graphics with R

  • 4

    Hadoop

    • Introduction to Big Data and Hadoop

    • Introduction to DBMS systems using MySQL

    • Big Data and Hadoop Ecosystem

    • HDFS

    • Unix & HDFS Hands-on

    • MapReduce Basics

    • MapReduce Advanced Topics and Hands-on

    • Pig Introduction and Hands-on

    • Pig Scripting

    • Hive Introduction, Metastore, Limitations of Hive

    • Comparison with Traditional Databases and Hive Scripting

    • Hive Data Types, Partitioning and Bucketing

    • Hive Tables (Managed and External)

    • Hive Continued

    • Sqoop Introduction and Hands-on

    • Introduction to NoSQL and HBase

    • HBase Architecture and Hands-on

  • 5

    Access Methods

  • 6

    Big Data with Spark and Python

  • 7

    Python

    • Understanding Basics of Python

    • Control Structures and for loop

    • Playing with while loop | break and continue

    • Strings and files

    • List

    • Dictionary and Tuples

  • 8

    Data Mining 1 - Machine Learning with R & Python

    • Introduction to NumPy

    • Introduction to Pandas

    • Slicing Data

    • Exploratory Data Analysis

    • Exploratory Data Analysis (Continued)

    • Missing Value Imputation and Outlier Analysis

    • Linear Regression Motivation

    • Linear Regression optimization objective

    • Linear Regression in Python

    • Introduction to Regression Tree

    • Introduction to Classification Tree

    • Measures of Selecting the best Split

    • Cluster Analysis – Hierarchical Clustering & k-Means Clustering

    • Customer segmentation in Telecom Industry using Cluster Analysis

    • k-Means clustering

    • Association Rules mining

    • Market Basket Analysis

  • 9

    Data Mining 2 - Advanced Machine Learning with R & Python

    • Sources of Error (Irreducible error, bias and variance)

    • Formally defining the 3 Sources of Error

    • Linear Regression – Multicollinearity (VIF)

    • Qualitative Predictors – Use of Dummy Variables

    • Observing overfitting in Polynomial Regression

    • Regularized Regression (L2 – Regularization) – To avoid overfitting

    • Regularized Regression (L1 – Regularization) – Feature selection using regularization

    • Regularized Regression – How does regularized regression handle multicollinearity?

    • Decision Tree – Pruning

    • Bagging Models

    • Designing your own Bagged Model

    • Random Forest

    • Boosting (AdaBoost)

    • K Nearest Neighbour – Concept. kNN algorithm for k=1 and k>1

    • Writing a K Nearest Neighbour algorithm from scratch

    • Comparison of kNN with Linear Regression; Difference between kNN and k-Means

    • Revision of basics of Linear Algebra

    • The Theory of dimension reduction

    • Practical – Compressing an image file [Practical using R Software]

    • Practical – Compressing an image file [Practical using R Software] (Continued)

  • 10

    RDBMS with SQL and DWH

    • Introduction to DBMS / RDBMS

    • Data Modelling

    • Physical Data Model

    • Getting Started with SQLite

    • DDL

    • DML

    • Introduction to Data Warehousing

    • Dimensional Modelling

    • Advanced SQL

    • OLAP Cubes

    • OLAP Cubes Practicals

Industry Connect

  • G Infotech

    We are elated by the program methodology, content, people and the platform of 361 DM, which instills confidence in the quality of candidates emerging from this program. As a techpreneur, I look forward to candidates who could partner in our growth.

    - Praveen, Director, G Infotech

  • Aaum Analytics

    This product is endorsed by Aaum Analytics.

    A company specialised in analytics with a strong focus on research and technology.

  • Capgemini

    As an industry person with over 20 years of experience, I have witnessed multiple training programmes and training providers. This program of 361DM stands out from all of them for the expertise of the professionals delivering it, the quality of the content and the engaging model of the platform. Truly enriching! - Vijay, Project Manager, Capgemini

    - VijayKumar, Senior Manager, Capgemini

  • Infodrive Analytics

    Very good on trainings
    - G. Karpagavalli, Sr Manager - HR

Program Mentors

Learn from the best in the industry
