Setting Up and Running Hadoop 0.20.2

Hadoop is an open source framework for distributed storage and processing of large datasets across clusters of computers, using simple programming models. It scales from a single server to thousands of machines and is designed to tolerate node failures. Its core consists of the Hadoop Distributed File System (HDFS) for storage and MapReduce for distributed computation.

Uploaded by

Suresh Singh

Hadoop

By Dinesh Amatya
Hadoop

• The exponential growth of data first presented challenges to cutting-edge businesses such as Google, Yahoo, Amazon, and Microsoft
• Google published papers describing its Google File System (GFS) and MapReduce
• Doug Cutting led the effort to develop an open source version of this MapReduce system, called Hadoop
• Yahoo supported the project

Hadoop

• Hadoop is an open source framework for writing and running distributed applications that process large amounts of data
– HDFS: distributed storage
– MapReduce: distributed computation
• Transfers code to the data instead of moving data to the code
• Replicates data across nodes for fault tolerance

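The map, shuffle, and reduce phases behind the "distributed computation" bullet can be illustrated on one machine with a plain Unix pipeline. This is only an analogy under stated assumptions, not Hadoop code: the function names map_words and reduce_counts are hypothetical, and sort stands in for the shuffle step that groups identical keys.

```shell
#!/bin/sh
# Single-machine analogy for a MapReduce word count (not Hadoop code).

# "Map": split the input into one lowercase word per line.
# Each output line acts as a key.
map_words() {
    tr -cs '[:alnum:]' '\n' | tr '[:upper:]' '[:lower:]'
}

# "Shuffle" + "reduce": sort brings identical keys together,
# uniq -c counts each group, awk prints "word<TAB>count".
reduce_counts() {
    sort | uniq -c | awk '{print $2 "\t" $1}'
}

printf 'Hadoop stores data\nHadoop processes data\n' | map_words | reduce_counts
```

For the two sample lines this prints four tab-separated pairs: data 2, hadoop 2, processes 1, stores 1. In Hadoop the same phases run in parallel on the nodes that already hold the data.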
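Building blocks of Hadoop

• NameNode: master of HDFS; tracks which blocks make up each file and where they are stored
• DataNode: stores and serves the HDFS blocks
• JobTracker: accepts MapReduce jobs and assigns their tasks to TaskTrackers
• TaskTracker: runs the individual map and reduce tasks on a node
• Secondary NameNode: takes periodic checkpoints of the NameNode's metadata (it is not a hot standby)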



Setting up SSH for a Hadoop cluster
Define a common account
Verify SSH installation
[hadoop-user@master]$ which ssh
/usr/bin/ssh
[hadoop-user@master]$ which sshd
/usr/bin/sshd
[hadoop-user@master]$ which ssh-keygen
/usr/bin/ssh-keygen
If SSH is not installed:
sudo apt-get install openssh-server
or
sudo dpkg -i openssh.deb
Setting up SSH for a Hadoop cluster
Generate SSH key pair
[hadoop-user@master]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop-user/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:

Your identification has been saved in /home/hadoop-user/.ssh/id_rsa.

Your public key has been saved in /home/hadoop-user/.ssh/id_rsa.pub.
Setting up SSH for a Hadoop cluster
Distribute public key and validate logins
[hadoop-user@master]$ scp ~/.ssh/id_rsa.pub hadoop-user@target:~/master_key
[hadoop-user@target]$ mkdir -p ~/.ssh
[hadoop-user@target]$ chmod 700 ~/.ssh
[hadoop-user@target]$ mv ~/master_key ~/.ssh/authorized_keys
[hadoop-user@target]$ chmod 600 ~/.ssh/authorized_keys
For a single-node setup, where master and target are the same machine, append the key locally instead:
[hadoop-user@master]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop-user@master]$ ssh target
Last login: Sun Jan 4 15:32:49 2009 from master
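The key installation steps on the target can be wrapped in a small function so that running them twice does not duplicate the key and an existing authorized_keys file is appended to rather than overwritten. This is a minimal sketch: install_key is a hypothetical helper, not part of OpenSSH or Hadoop, and it assumes the public key file holds a single key line.

```shell
#!/bin/sh
# Hypothetical helper: install_key <public-key-file> <ssh-dir>
# Appends the key to <ssh-dir>/authorized_keys unless it is already there,
# and sets the permissions sshd expects (700 on the directory, 600 on the file).
install_key() {
    pubkey_file=$1
    ssh_dir=$2
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # -x: match the whole line, -F: treat the key as a fixed string.
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Usage on the target machine, after the scp step:
#   install_key ~/master_key ~/.ssh
```

Appending instead of moving the file keeps any keys that were already authorized for the account.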
Running Hadoop

[hadoop-user@master]$ gedit ~/.bashrc

export JAVA_HOME=/opt/jdk1.7.0

export PATH=$PATH:$JAVA_HOME/bin
Running Hadoop

[hadoop-user@master]$ cd $HADOOP_HOME/conf

hadoop-env.sh
export JAVA_HOME=/usr/share/jdk
Running Hadoop
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop_tmp</value>
</property>
</configuration>
Running Hadoop

mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
Running Hadoop

hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
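<name>dfs.replication</name>
<!-- Single-node setup: only one DataNode (localhost) can hold replicas. 3 is the HDFS default for multi-node clusters. -->
<value>1</value>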
</property>
</configuration>
Running Hadoop

[hadoop-user@master]$ cat masters

localhost
[hadoop-user@master]$ cat slaves
localhost

[hadoop-user@master]$ bin/hadoop namenode -format

[hadoop-user@master]$ bin/start-all.sh
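
Once start-all.sh returns, the five daemons from the "Building blocks of Hadoop" slide should each be running as a Java process. One way to check is the JDK's jps tool, which prints one "<pid> <name>" line per running JVM. The sketch below assumes jps is on the PATH; missing_daemons is a hypothetical helper that reads jps output on standard input and prints every expected daemon it does not find.

```shell
#!/bin/sh
# Process names expected on a Hadoop 0.20.x pseudo-distributed node.
EXPECTED_DAEMONS="NameNode DataNode SecondaryNameNode JobTracker TaskTracker"

# Hypothetical helper: reads jps output ("<pid> <name>" per line) on stdin
# and prints each expected daemon whose name is absent.
missing_daemons() {
    running=$(awk '{print $2}')
    for daemon in $EXPECTED_DAEMONS; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$running" | grep -qx "$daemon" || printf '%s\n' "$daemon"
    done
}

# Usage on a live node:
#   jps | missing_daemons
```

Empty output means all five daemons were found. If a name is printed, the matching file under the Hadoop logs directory is the usual place to look next.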
Running Hadoop

In file .bashrc

export HADOOP_HOME=/opt/programs/hadoop-0.20.2-cdh3u6
export PATH=$PATH:$HADOOP_HOME/bin
Web-based cluster UI
NameNode (HDFS) status page: http://localhost:50070/
JobTracker (MapReduce) status page: http://localhost:50030/
Working with files in HDFS

Basic file commands

hadoop fs -cmd <args>

hadoop fs -ls /
hadoop fs -mkdir /user/chuck
hadoop fs -put example.txt .
hadoop fs -put example.txt /user/chuck
hadoop fs -get example.txt .
Working with files in HDFS

hadoop fs -cat example.txt | head

hadoop fs -rm example.txt

hadoop fs -rmr /user/hdfs/dir1

hadoop fs -chmod -R 777 example.txt

hadoop fs -chown hdfs:hadoop example.txt

Working with files in HDFS

hadoop fs -copyFromLocal example.txt .

hadoop fs -copyToLocal example.txt .

hadoop fs -getmerge files/ mergedFile.txt

hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2

hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2
hadoop fs -du /user/hadoop/file1
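
The file commands above can be chained into a simple upload, list, download round trip that stops at the first failing step. This is a sketch under stated assumptions: hdfs_roundtrip and the HADOOP_CMD variable are hypothetical names introduced here, the HDFS directory is assumed to exist already (for example, created with hadoop fs -mkdir), and only the -put, -ls, and -get subcommands shown on the slides are used.

```shell
#!/bin/sh
# HADOOP_CMD is a hypothetical override so the steps can be dry-run
# (HADOOP_CMD=echo) on a machine without Hadoop; it defaults to the real CLI.
HADOOP_CMD=${HADOOP_CMD:-hadoop}

# Hypothetical helper: hdfs_roundtrip <local-file> <hdfs-dir> <local-copy>
# Uploads the file, lists the HDFS directory, then downloads the file again.
# The && chain aborts as soon as one command exits with a non-zero status.
hdfs_roundtrip() {
    local_file=$1
    hdfs_dir=$2
    local_copy=$3
    name=$(basename "$local_file")

    "$HADOOP_CMD" fs -put "$local_file" "$hdfs_dir" &&
    "$HADOOP_CMD" fs -ls "$hdfs_dir" &&
    "$HADOOP_CMD" fs -get "$hdfs_dir/$name" "$local_copy"
}

# Usage on a running cluster:
#   hdfs_roundtrip example.txt /user/chuck example_copy.txt
```

Comparing the downloaded copy with the original (for instance with cmp) is a quick way to confirm that HDFS stored the file intact.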