Big Data Hadoop Online Training, Ameerpet, Hyderabad

Introduction to Big Data and Hadoop

  • Big Data
    • What is Big Data?
    • Why are all industries talking about Big Data?
    • What are the issues in Big Data?
      • Storage
        • What are the challenges of storing big data?
      • Processing
        • What are the challenges of processing big data?
    • What technologies support big data?
      • Hadoop
      • Spark
      • Databases
        • Traditional
        • NoSQL
  • Hadoop
    • What is Hadoop?
    • Why Hadoop?
    • History of Hadoop
    • Hadoop use cases
    • Advantages and disadvantages of Hadoop
    • Hadoop ecosystem components
    • Big Data real-time use cases

 HDFS (Hadoop Distributed File System)

HDFS Features

  • HDFS architecture
  • NameNode
    • Importance of the NameNode
    • Roles of the NameNode
    • Drawbacks of the NameNode
  • Secondary NameNode
    • Importance of the Secondary NameNode
    • Roles of the Secondary NameNode
    • Drawbacks of the Secondary NameNode
  • DataNode
    • Importance of the DataNode
    • Roles of the DataNode
    • Drawbacks of the DataNode
  • JobTracker
  • TaskTracker


  • Data storage in HDFS
    • Storing blocks in DataNodes
    • Replication
    • Accessing (read/write) files in HDFS
  • HDFS block size
    • Importance of the HDFS block size
    • Why is the block size so large?
    • How it relates to the MapReduce split size
  • HDFS replication factor
    • Importance of the replication factor in a production environment
    • Can we change the replication for a particular file or folder?
    • Can we change the replication for all files and folders?
  • Accessing HDFS
    • CLI (Command Line Interface) using HDFS commands
    • Java-based approach
  • How to overcome the drawbacks in HDFS
    • NameNode failures
    • Secondary NameNode failures
    • DataNode failures
  • Where does HDFS fit, and where doesn't it?
  • How to configure a Hadoop cluster
  • How to add new nodes (commissioning)
  • How to remove existing nodes (decommissioning)
  • How to identify dead nodes
  • How to restart dead nodes
  • Hadoop 2.x features
    • Introduction to NameNode federation
    • Introduction to NameNode High Availability with NFS
    • Introduction to NameNode High Availability with QJM
  • Differences between Hadoop 1.x, 2.x, and 3.x
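The block-size and replication-factor topics above come down to simple arithmetic. A minimal sketch, assuming the usual HDFS defaults of a 128 MB block size and a replication factor of 3 (the `hdfs_storage` helper is written for illustration, not a Hadoop API):

```python
# How HDFS splits a file into blocks, and how much raw storage
# the replicas consume. 128 MB and 3 are the usual HDFS defaults.
import math

def hdfs_storage(file_size_mb, block_size_mb=128, replication=3):
    blocks = math.ceil(file_size_mb / block_size_mb)   # number of HDFS blocks
    raw_mb = file_size_mb * replication                # total raw storage used
    return blocks, raw_mb

# A 1 GB file with the defaults: 8 blocks, 3072 MB of raw storage.
print(hdfs_storage(1024))                   # (8, 3072)
# Lowering replication (e.g. `hdfs dfs -setrep 2 /path`) shrinks raw usage.
print(hdfs_storage(1024, replication=2))    # (8, 2048)
```

Note that a file smaller than one block still occupies only its actual size on disk; the block size is an upper bound per block, not a minimum allocation.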

 MAPREDUCE

  • MapReduce architecture
  • JobTracker
    • Importance of the JobTracker
    • Roles of the JobTracker
    • Drawbacks of the JobTracker
  • TaskTracker
    • Importance of the TaskTracker
    • Roles of the TaskTracker
    • Drawbacks of the TaskTracker
  • MapReduce job execution flow
  • Data types in Hadoop
    • Data types in MapReduce
    • Can we write custom data types in MapReduce?
  • Input formats in MapReduce
    • TextInputFormat
    • KeyValueTextInputFormat
    • SequenceFileInputFormat
    • NLineInputFormat
    • Importance of input formats in MapReduce
    • How to use input formats in MapReduce
    • How to write custom input formats and their RecordReaders
  • Output formats in MapReduce
    • TextOutputFormat
    • SequenceFileOutputFormat
    • Importance of output formats in MapReduce
    • How to use output formats in MapReduce
    • How to write custom output formats and their RecordWriters
  • Mapper
    • What is a mapper in a MapReduce job?
    • Why do we need a mapper?
    • Advantages and disadvantages of the mapper
    • Writing mapper programs
  • Reducer
    • What is a reducer in a MapReduce job?
    • Why do we need a reducer?
    • Advantages and disadvantages of the reducer
    • Writing reducer programs
  • Combiner
    • What is a combiner in a MapReduce job?
    • Why do we need a combiner?
    • Advantages and disadvantages of the combiner
    • Writing combiner programs
  • Partitioner
    • What is a partitioner in a MapReduce job?
    • Why do we need a partitioner?
    • Advantages and disadvantages of the partitioner
    • Writing partitioner programs
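The mapper, reducer, combiner, and partitioner topics above fit together in one pipeline, which can be sketched locally as the classic word count. `run_job` is a toy driver written for illustration only; in real Hadoop the framework performs the shuffle and sort between the map and reduce phases:

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Emit (word, 1) for every word, like a WordCount Mapper.
    for word in line.split():
        yield word.lower(), 1

def reducer(word, counts):
    # Sum the counts for one key. A combiner runs this same logic
    # on each mapper's local output before the shuffle, cutting traffic.
    yield word, sum(counts)

def partition(key, num_reducers):
    # Mirrors the idea of Hadoop's default HashPartitioner:
    # the same key always lands on the same reducer.
    return hash(key) % num_reducers

def run_job(lines):
    # Toy driver: map, then shuffle/sort by key, then reduce.
    pairs = [kv for line in lines for kv in mapper(line)]
    pairs.sort(key=itemgetter(0))
    out = {}
    for key, group in groupby(pairs, key=itemgetter(0)):
        for k, v in reducer(key, (c for _, c in group)):
            out[k] = v
    return out

print(run_job(["big data", "big hadoop"]))  # {'big': 2, 'data': 1, 'hadoop': 1}
```

The sort-then-group step stands in for Hadoop's shuffle phase: it guarantees the reducer sees all values for one key together, which is the contract a real Reducer relies on.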


  • Distributed cache
    • What is the distributed cache in a MapReduce job?
    • Importance of the distributed cache in a MapReduce job
    • Advantages and disadvantages of the distributed cache
    • Writing distributed cache programs
  • Counters
    • What is a counter in a MapReduce job?
    • Why do we need counters in a production environment?
    • How to write counters in MapReduce programs
  • Importance of the Writable and WritableComparable APIs
    • How to write custom MapReduce keys using WritableComparable
    • How to write custom MapReduce values using Writable
  • Joins
    • Map-side join
      • Importance of the map-side join
      • Where is it used?
    • Reduce-side join
      • Importance of the reduce-side join
      • Where is it used?
      • What is the difference between a map-side join and a reduce-side join?
  • Compression techniques
    • Importance of compression techniques in a production environment
    • Compression types
      • NONE, RECORD, and BLOCK
    • Compression codecs
      • Default, Gzip, Bzip2, Snappy, and LZO
    • Enabling and disabling compression for all jobs
    • Enabling and disabling compression for a particular job
  • Speculative execution
    • What is speculative execution?
    • Does Hadoop use speculative execution?
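The two join strategies listed above differ mainly in where the join happens. A map-side join loads the small table into memory on every mapper (in Hadoop, typically shipped there via the distributed cache) and joins while mapping, with no shuffle at all; a reduce-side join instead shuffles both datasets by the join key. A minimal sketch of the map-side case, with made-up sample data:

```python
def map_side_join(small_table, big_rows):
    # small_table: small enough to hold in memory on every mapper;
    # this is exactly what the distributed cache is used for.
    lookup = dict(small_table)
    for key, value in big_rows:
        if key in lookup:                  # inner join on the key
            yield key, value, lookup[key]

depts = [(10, "sales"), (20, "hr")]                    # small table
emps = [(10, "anu"), (20, "ravi"), (30, "kim")]        # large stream
print(list(map_side_join(depts, emps)))
# [(10, 'anu', 'sales'), (20, 'ravi', 'hr')]
```

Because nothing is shuffled, the map-side join is usually much faster, but it only works when one side fits in memory; otherwise the reduce-side join is the fallback.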

 

YARN (Next Generation Map Reduce)

  • What is YARN?
  • Importance of YARN
  • What is the difference between YARN and classic MapReduce?
  • YARN architecture
    • Importance of the ResourceManager
    • Importance of the NodeManager
    • Importance of the ApplicationMaster
  • YARN application execution flow
  • Installing YARN on both Windows and Linux
  • Exploring the YARN web UI

 Apache PIG

  • Introduction to Apache Pig
  • MapReduce vs. Apache Pig
  • SQL vs. Apache Pig
  • Data types in Pig
  • Modes of execution in Pig
    • Local mode
    • MapReduce mode
  • Execution mechanisms
    • Grunt shell
    • Script
    • Embedded
  • UDFs
    • How to write UDFs in Pig
    • How to use UDFs in Pig
    • Importance of UDFs in Pig
  • Filters
  • Load functions
  • Store functions
  • Transformations in Pig
  • How to write complex Pig scripts
  • How to integrate Pig and HBase
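The filter, group, and transformation topics above all describe steps in a Pig dataflow. The same flow sketched in Python for intuition; the Pig Latin in the comments and all field names are illustrative, not taken from any real dataset:

```python
from collections import Counter

def pig_style_pipeline(records):
    # Roughly the dataflow this Pig Latin would express:
    #   logs    = LOAD 'logs' AS (user, status);
    #   errors  = FILTER logs BY status >= 400;
    #   grouped = GROUP errors BY user;
    #   counts  = FOREACH grouped GENERATE group, COUNT(errors);
    errors = [(user, status) for user, status in records if status >= 400]
    return dict(Counter(user for user, _ in errors))

logs = [("anu", 200), ("ravi", 404), ("anu", 500), ("ravi", 503)]
print(pig_style_pipeline(logs))  # {'ravi': 2, 'anu': 1}
```

The point of Pig is that each named relation in the script becomes a stage in a MapReduce (or local) plan, so the author writes the dataflow and Pig handles the execution.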


Apache HIVE

  • Hive introduction
  • Hive architecture
    • Driver
    • Compiler
    • Optimizer
    • Semantic Analyzer (Executor)
  • Hive Query Language (HiveQL)
    • SQL vs. HiveQL
  • Hive installation and configuration
  • Hive DDL and DML operations
  • Hive services
    • CLI
    • HiveServer
    • HWI (Hive Web Interface)
  • Metastore
    • Embedded metastore configuration
    • External metastore configuration
  • UDFs
    • How to write UDFs in Hive
    • How to use UDFs in Hive
    • Importance of UDFs in Hive
  • UDAFs
    • How to use UDAFs in Hive
    • Importance of UDAFs in Hive
  • UDTFs
    • How to use UDTFs in Hive
    • Importance of UDTFs in Hive
  • How to write complex Hive queries
  • What is the Hive data model?
  • Partitions
    • Importance of Hive partitions in a production environment
    • Limitations of Hive partitions
    • How to create partitions
  • Buckets
    • Importance of Hive buckets in a production environment
    • How to create buckets
  • SerDes
    • Importance of Hive SerDes in a production environment
    • How to write SerDe programs
  • Semi-structured data processing with Hive
    • XML data processing
    • JSON (JavaScript Object Notation) data processing
  • Hive-HBase integration
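The partition topics above hinge on one fact: Hive stores each partition as its own directory, so a query that filters on the partition columns can skip whole directories (partition pruning). A minimal sketch of that layout; the table name, column names, and `/warehouse` prefix are made-up examples:

```python
def partition_path(table, row, partition_cols):
    # Hive lays a partitioned table out as nested directories, e.g.
    #   /warehouse/visits/dt=2024-01-01/country=IN/
    # one directory level per partition column, in declaration order.
    parts = "/".join(f"{col}={row[col]}" for col in partition_cols)
    return f"/warehouse/{table}/{parts}/"

row = {"user": "anu", "dt": "2024-01-01", "country": "IN"}
print(partition_path("visits", row, ["dt", "country"]))
# /warehouse/visits/dt=2024-01-01/country=IN/
```

This is also why over-partitioning is listed as a limitation: every distinct value combination creates another directory and more metastore entries, so high-cardinality columns belong in buckets, not partitions.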


Apache Zookeeper

  • Introduction to ZooKeeper
  • Pseudo-distributed mode installation
  • ZooKeeper cluster installation
  • Basic command execution

 

Apache HBase

  • HBase introduction
  • HBase use cases
  • HBase basics
  • Importance of column families
  • Basic CRUD operations
    • create
    • scan / get
    • put
    • delete / deleteall / drop
  • Bulk loading in HBase
  • HBase installation
    • Local mode
    • Pseudo-distributed mode
    • Cluster mode
  • HBase architecture
    • HMaster
    • HRegionServer
    • ZooKeeper
  • MapReduce integration
    • MapReduce over HBase
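The CRUD operations and column-family topics above rest on HBase's data model: a sorted map from row key to `family:qualifier` columns. A toy sketch of that model; real HBase also versions every cell with a timestamp, which is omitted here:

```python
class ToyHBaseTable:
    # Sketch of HBase's model: row key -> "family:qualifier" -> value.
    def __init__(self):
        self.rows = {}

    def put(self, row_key, column, value):
        # Upsert one cell, like `put 't', 'row1', 'info:name', 'anu'`.
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        # Fetch all cells of one row, like `get 't', 'row1'`.
        return self.rows.get(row_key, {})

    def scan(self):
        # HBase stores rows sorted by row key, so scans come back ordered.
        return sorted(self.rows.items())

    def deleteall(self, row_key):
        # Remove an entire row, like `deleteall 't', 'row2'`.
        self.rows.pop(row_key, None)

t = ToyHBaseTable()
t.put("row1", "info:name", "anu")
t.put("row1", "info:city", "hyd")
t.put("row2", "info:name", "ravi")
print(t.get("row1"))             # {'info:name': 'anu', 'info:city': 'hyd'}
print([k for k, _ in t.scan()])  # ['row1', 'row2']
t.deleteall("row2")
```

Because rows are sorted by key, row-key design drives performance: range scans over adjacent keys are cheap, while a poorly chosen key can hotspot a single region.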


Apache SQOOP

  • Introduction to Sqoop
  • MySQL client and server installation
  • Sqoop installation
  • How to connect to a relational database using Sqoop
  • Examples of Sqoop import and export commands

 

Apache FLUME

  • Introduction to Flume
  • Flume installation
  • Flume architecture
    • Agent
    • Sources
    • Channels
    • Sinks
  • Practice with Flume examples

 

Apache Kafka

  • Introduction to Kafka
  • Installing Kafka
  • Practice with Kafka examples

 

Apache OOZIE

  • Introduction to Oozie
  • Oozie installation
  • Oozie configuration files
  • Executing different Oozie workflow jobs
  • Monitoring Oozie workflow jobs