HADOOP ADMIN

DURATION: 45 hrs
CERTIFICATION: YES

Description

1) Linux Basics

2) Introduction to Big Data
• What is Big Data?
• Big Data Facts
• The 5 V’s of Big Data
• Understanding Hadoop
• What is Hadoop?
• Why learn Hadoop?
• Relational Databases vs. Hadoop
• Motivation for Hadoop
• 6 Key Hadoop Data Types
2) Hadoop Distributions
• Hortonworks
• Cloudera
• MapR

3) Ambari vs Cloudera Manager
• Ambari features
• Cloudera Manager features
• Real-time use cases
• List of Hadoop cluster management choices

4) Hadoop Distributed File System (HDFS)
• What is HDFS?
• HDFS components
• Understanding Block storage
• The NameNode
• NameNode High Availability
• The DataNodes
• DataNode Failures
• HDFS Commands
• HDFS File Permissions
• Enabling and Managing HDFS Quotas
• Writing and Reading Files
• NameNode Memory Considerations
• Web UI for HDFS
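
As a rough illustration of the block storage topic above, the sketch below estimates how many blocks and how much raw disk space one file occupies. It assumes a 128 MB block size and a replication factor of 3 (common defaults, controlled by the dfs.blocksize and dfs.replication properties); the function name block_usage is hypothetical and is not part of any Hadoop tool.

```python
import math

BLOCK_SIZE_MB = 128  # assumed value of dfs.blocksize
REPLICATION = 3      # assumed value of dfs.replication


def block_usage(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (blocks, block replicas, raw storage in MB) for one file."""
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb) if file_size_mb > 0 else 0
    # Every block is stored `replication` times across the DataNodes.
    replicas = blocks * replication
    # A partial last block only uses the space it needs, so raw usage
    # is the file size times the replication factor.
    raw_mb = file_size_mb * replication
    return blocks, replicas, raw_mb


print(block_usage(1000))  # (8, 24, 3000)
```

The block count, not the file size, is what matters for NameNode memory, which is why many small files are more expensive to track than a few large ones.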

5) The MapReduce Framework
• Overview of MapReduce
• Understanding MapReduce
• JobTracker
• TaskTracker
• The Map Phase
• The Reduce Phase
• WordCount in MapReduce
• Running a MapReduce Job
• YARN Architecture
• What YARN Is Used For
• ResourceManager
• NodeManager
• Configuring and Managing YARN Queues
• Basics of Running Simple YARN Applications
• YARN Application Logs
• Web UIs
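
To make the map and reduce phases listed above concrete, here is a minimal WordCount sketch. It is a plain Python simulation that runs on one machine and does not use the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, not names from the course material.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce: sum all counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


print(word_count(["big data big cluster", "data node"]))
# {'big': 2, 'data': 2, 'cluster': 1, 'node': 1}
```

On a real cluster the map and reduce functions run in parallel on different nodes and the shuffle moves data over the network; the logic of each phase stays the same.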

Managing and Scheduling Jobs
• Managing Jobs
• The FIFO Scheduler
• The Fair Scheduler
• How to stop and start jobs running on the cluster
6) Planning Your Hadoop Cluster
• General Planning Considerations
• Choosing the Right Hardware
• Virtualization Options
• Network Considerations
• Configuring Nodes

7) Ambari Installation
8) Single-Node Cluster Installation and Configuration
i) Local Installation
ii) Cloud Installation
9) Multi-Node Cluster Installation and Configuration
i) Cloud Installation

9) Installing and Managing Hadoop Ecosystem Projects
• Sqoop
• Flume
• Hive
• Pig
• HBase
• Oozie
• Spark
• Hue
• And more components

10) Cluster Monitoring and Troubleshooting
• Ambari Monitoring Features
• Monitoring Hadoop Clusters
• Troubleshooting Hadoop Clusters
• Common Misconfigurations
11) Hadoop Clients Including Hue
• What Are Hadoop Clients?
• Installing and Configuring Hadoop Clients
• Installing and Configuring Hue
• Hue Authentication and Authorization
12) Cluster Maintenance
• Checking HDFS Status
• Copying Data Between Clusters
• Adding and Removing Cluster Nodes
• Rebalancing the Cluster
• Directory Snapshots
• Cluster Upgrading

13) Hadoop Security
• Why Hadoop Security Is Important
• Hadoop’s Security System Concepts
• What Kerberos Is and How It Works
• Securing a Hadoop Cluster With Kerberos
• Ranger
• BDP
• Other Security Concepts

14) Advanced Cluster Configuration
• Advanced Configuration Parameters
• Configuring Hadoop Ports
• Configuring HDFS for Rack Awareness
• Configuring HDFS High Availability
• Configuring YARN High Availability
• Configuring Hive High Availability
• Configuring HBase High Availability
• Copying Data from One Cluster to Another
• Copying HBase Tables from One Cluster to Another
• Real-Time Ambari Upgrade
• Real-Time Hadoop Cluster Upgrade Process

15) Everyday Activities of a Hadoop Admin
16) Mock Interviews and Hortonworks Certification Support