Hadoop Implementation Steps on Ubuntu
16.04/18.04 Linux
(COMPUTER SCIENCE AND ENGINEERING)
BY
ADITYA BHARDWAJ
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
PEC
SECTOR – 12, CHANDIGARH, INDIA
2019
Step 1 – Prerequisites
Before beginning the installation, log in to a shell as a sudo-capable user and
update the currently installed packages. In this guide the Ubuntu host name is
server3.
sudo apt update
OpenJDK 8
Java 8 is a Long Term Support version and is still widely supported, though
Oracle's free public updates for it ended in January 2019. To install OpenJDK 8,
execute the following command:
root@server3: sudo apt install openjdk-8-jdk
Verify that this is installed with
root@server3: java -version
You'll see output like this:
Output
openjdk version "1.8.0_162"
OpenJDK Runtime Environment (build 1.8.0_162-8u162-b12-1-b12)
OpenJDK 64-Bit Server VM (build 25.162-b12, mixed mode)
You have successfully installed Java 8 on your Ubuntu system.
root@server3: readlink -f /usr/bin/java | sed "s:bin/java::"
root@server3: sudo gedit /etc/environment
Add the following lines to the environment file (use the path printed by the
readlink command above; for OpenJDK 8 it is typically the following):
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export PATH
Verify that the environment variable is set:
root@server3: echo $JAVA_HOME
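As a quick sanity sketch, the readlink-and-sed pipeline used above to derive JAVA_HOME can be reproduced with a plain shell parameter expansion. The function name `resolve_java_home` and the sample path are illustrative, not part of the guide:

```shell
# A sketch reproducing the readlink | sed pipeline with a shell
# parameter expansion (function name and sample path are illustrative).
resolve_java_home() {
  # Strip the trailing "bin/java" so only the JVM home directory remains.
  printf '%s\n' "${1%bin/java}"
}

# Example: feed it the resolved path of the java binary.
resolve_java_home /usr/lib/jvm/java-8-openjdk-amd64/bin/java
# prints /usr/lib/jvm/java-8-openjdk-amd64/
```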
Step 2 – Create a User for Hadoop
Press CTRL+ALT+T to open a terminal; we will install Hadoop from there. For new
Linux users, things can get confusing when installing and managing different
programs from the same login. If that applies to you, there is a simple solution:
create a dedicated Hadoop user, and use that separate login whenever you work
with Hadoop.
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
Note: Enter a Unix password for the new user; for the remaining prompts just press Enter, then type 'y' at the end to confirm.
Add Hadoop user to sudo group (Basically, grant it all permissions)
server1@server3: sudo adduser hduser sudo
Install SSH
root@server3: sudo apt-get install ssh
Passwordless entry for localhost using SSH
root@server3: su - hduser
hduser@server3: ssh-keygen -t rsa
Note: When asked for a file name or passphrase, leave it blank and just press
Enter. Run ssh-keygen as hduser itself, without sudo, so the key pair is created
under hduser's home directory.
hduser@server3: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hduser@server3: chmod 0600 ~/.ssh/authorized_keys
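sshd refuses key-based login if authorized_keys is group- or world-accessible (with StrictModes enabled, the default), which is why the chmod 0600 above matters. A small sketch to verify the mode; `check_key_perms` is a hypothetical helper, and GNU stat (as shipped on Ubuntu) is assumed:

```shell
# Sketch: verify a key file has mode 0600, which sshd requires for
# authorized_keys before it accepts key-based logins.
# check_key_perms is a hypothetical helper; GNU stat (Ubuntu) assumed.
check_key_perms() {
  local mode
  mode=$(stat -c '%a' "$1")
  if [ "$mode" = "600" ]; then
    echo OK
  else
    echo "BAD ($mode)"
  fi
}

# Demonstration on a throwaway file instead of the real key file:
tmp=$(mktemp)
chmod 0600 "$tmp"
check_key_perms "$tmp"   # prints OK
rm -f "$tmp"
```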
Figure: SSH Key generation
Check if ssh works,
$ ssh localhost
Figure: hduser permission
Once logged in to localhost, exit the session using the following command.
$ exit
Step 3 – Download Hadoop Source Archive
In this step, download the Hadoop 3.1.2 source archive using the command below.
You can also select an alternate download mirror to increase download speed.
cd ~
server1@server3: wget [Link]3.1.2/[Link]
server1@server3: tar xzf [Link]
3.2 Hadoop Configuration
Create a directory called hadoop under /usr/local as hduser and move the
contents of the extracted 'hadoop-3.1.2' folder into it:
server1@server3: sudo mkdir -p /usr/local/hadoop
server1@server3: cd hadoop-3.1.2/
server1@server3: sudo mv * /usr/local/hadoop
server1@server3: sudo chown -R hduser:hadoop /usr/local/hadoop
STEP 4 – Setting up Configuration files
We will change the content of the following files in order to complete the Hadoop installation.
1. ~/.bashrc
2. [Link]
3. [Link]
4. [Link]
5. [Link]
Details:
[Link] – This file contains environment variable settings used by Hadoop.
You can use these to affect some aspects of Hadoop daemon behavior, such as
where log files are stored, the maximum amount of heap used, etc. The only
variable you should need to change at this point is JAVA_HOME, which specifies
the path to the Java installation used by Hadoop.
[Link] – key property [Link], which configures the NameNode address,
e.g. hdfs://namenode/. The NameNode is the node that stores the filesystem
metadata, i.e. which file maps to which block locations and which blocks are
stored on which DataNode.
[Link] – key property [Link], the replication factor, which is 3 by default.
[Link] – key property [Link], which configures the JobTracker address, for
e.g. jobtracker:8021.
[Link] – resource management.
4.1 ~/.bashrc
If you don’t know the path where java is installed, first run the following command to locate it
root@server3: readlink -f /usr/bin/java | sed "s:bin/java::"
Now open the ~/.bashrc file
hduser@server3:~$ sudo gedit ~/.bashrc
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-[Link]=$HADOOP_HOME/lib"
#HADOOP VARIABLES END
Reload the .bashrc file to apply the changes:
$ source ~/.bashrc
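To confirm the exports took effect, a small sketch (the `on_path` helper is illustrative, not part of the guide) checks that the Hadoop bin and sbin directories are actually on PATH:

```shell
# Sketch: confirm the Hadoop bin and sbin directories ended up on PATH
# after sourcing ~/.bashrc. The on_path helper is illustrative.
on_path() {
  case ":$PATH:" in
    *":$1:"*) echo yes ;;
    *)        echo no ;;
  esac
}

HADOOP_HOME=/usr/local/hadoop
PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
on_path "$HADOOP_HOME/bin"    # prints yes
on_path "$HADOOP_HOME/sbin"   # prints yes
```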
4.2 [Link]
We need to tell Hadoop the path where Java is installed; that is what we do in
this file by setting the JAVA_HOME variable.
Open the file,
hduser@server3:~$ sudo gedit /usr/local/hadoop/etc/hadoop/[Link]
Now, the first variable in the file will be JAVA_HOME; change its value to
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
4.3 [Link]
Create temporary directory
hduser@server3 :~$ sudo mkdir -p /app/hadoop/tmp
hduser@server3 :~$ sudo chown hduser:hadoop /app/hadoop/tmp
open the file
hduser@server3 :~$ sudo gedit /usr/local/hadoop/etc/hadoop/[Link]
Append the following between the <configuration> tags, exactly as shown below.
<property>
<name>[Link]</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>[Link]</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose scheme and authority
determine the FileSystem implementation. The uri’s scheme determines the config property
([Link]) naming the FileSystem implementation class. The uri’s authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
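For reference, in a stock Hadoop 3.x distribution the two properties above are normally named `hadoop.tmp.dir` and `fs.defaultFS` (the latter supersedes the older `fs.default.name`); verify the names against your Hadoop version's core-default documentation. A minimal single-node fragment would then look like:

```xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
```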
4.4 [Link]
Mainly there are two directories,
1. Name Node
2. Data Node
Make directories
hduser@server3 sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
hduser@server3 sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
hduser@server3 sudo chown -R hduser:hadoop /usr/local/hadoop_store
Open the file,
hduser@server3 sudo gedit /usr/local/hadoop/etc/hadoop/[Link]
Change the content between configuration tags shown as below.
<property>
<name>[Link]</name>
<value>1</value>
<description>Default block [Link] actual number of replications can be specified when
the file is created. The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>[Link]</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>[Link]</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
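For reference, the standard Hadoop 3.x names for these three properties are `dfs.replication`, `dfs.namenode.name.dir`, and `dfs.datanode.data.dir` (verify against your version's hdfs-default documentation); a replication factor of 1 is appropriate only for a single-node setup. A minimal fragment, assuming the directories created above:

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>
```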
4.5 [Link]
Open the file,
hduser@server3 :~$ sudo gedit /usr/local/hadoop/etc/hadoop/[Link]
Just like the other two files, add the following between the <configuration> tags.
<property>
<name>[Link]-services</name>
<value>mapreduce_shuffle</value>
</property>
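For reference, the standard name of this property in Hadoop 3.x is `yarn.nodemanager.aux-services`; setting it to `mapreduce_shuffle` enables the shuffle service that MapReduce jobs need when running on YARN:

```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```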
STEP 5 – Format Hadoop file system
The Hadoop installation is now done. All that remains is to format the NameNode
before using it.
hduser@server3 :~$ hdfs namenode -format
Note: In Hadoop 3.x, hdfs namenode -format replaces the older, deprecated
hadoop namenode -format invocation.
STEP 6 – Start Hadoop daemons
Now that the Hadoop installation is complete and the NameNode is formatted, we
can start Hadoop from the following directory.
$ cd /usr/local/hadoop/sbin
$ [Link]
Check that all daemons have started properly using the following command:
$ jps
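On a single-node setup, `jps` should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager (plus Jps itself). As a sketch, a small shell function can check a captured `jps` listing; `check_daemons` and the sample PIDs are illustrative, and in practice you would pipe the real output in with `jps | check_daemons`:

```shell
# Sketch: check a captured `jps` listing for the daemons a single-node
# start-all.sh run should produce. check_daemons and the sample PIDs
# are illustrative; the function reads jps output on stdin.
check_daemons() {
  local expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"
  local input missing="" d
  input=$(cat)
  for d in $expected; do
    # -w matches whole words, so "NameNode" won't match "SecondaryNameNode".
    printf '%s\n' "$input" | grep -qw "$d" || missing="$missing $d"
  done
  if [ -z "$missing" ]; then
    echo "all daemons running"
  else
    echo "missing:$missing"
  fi
}

# Demonstration with a sample capture (PIDs made up):
printf '%s\n' '12001 NameNode' '12102 DataNode' '12203 SecondaryNameNode' \
  '12304 ResourceManager' '12405 NodeManager' '12506 Jps' | check_daemons
# prints: all daemons running
```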
STEP 7 – Stop Hadoop daemons
When you need to stop Hadoop and all its modules, run:
$ [Link]
Give yourself credit: you have completed all the Hadoop installation steps, and
Hadoop is now ready to run its first program.
Let's run a MapReduce job on our fresh Hadoop cluster setup
Go to the following directory
$ cd /usr/local/hadoop
Run the following command
hduser@server3 :/usr/local/hadoop$ hadoop jar ./share/hadoop/mapreduce/hadoop-
[Link] pi 10 100
To delete the dedicated Hadoop user created earlier, run: sudo userdel hduser
(note that hduser is the user name; hadoop is the group).