Installing a Standalone and Pseudo-Distributed Hadoop Cluster
1. Setting up the VMware virtual machine
a. Download VMware Player from the following URL:
https://my.vmware.com/en/web/vmware/free#desktop_end_user_computing/vmware_workstation_player/12_0
b. Download the CentOS 6.4 image file CentOS-6.4-x86_64-bin-DVD1.iso from the following URL:
http://vault.centos.org/6.4/isos/x86_64/
c. Start VMware Player and create a new virtual machine.
d. Browse to the ISO file location and select it.
e. Name the virtual machine and create a user with a password.
f. Select the location to store the virtual disk of the VM; this can be any location on the Windows file system.
g. Select the disk size and type as shown in the screenshot below.
h. Customize the hardware settings: set the RAM to 4 GB (adjust per your machine configuration) and the network adapter to Custom: VMnet8 (NAT), then click Finish.
i. Press Enter to start the installation.
2. Setting up passwordless SSH:
a. Set the hostname of each node in the cluster by editing the network file:
vi /etc/sysconfig/network
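On CentOS 6 this file holds the hostname; for the single-node setup used here the sketch below assumes the hostname master (an assumption carried through the rest of this guide):
NETWORKING=yes
HOSTNAME=master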
b. Find out the IP address of the VM as shown below.
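For example (eth0 is the usual interface name on CentOS 6, but yours may differ):
ifconfig eth0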
c. Update the /etc/hosts file on each machine in the cluster with the IP/hostname mapping, as in the example below.
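For example, assuming a hypothetical NAT address of 192.168.159.130 (substitute the address found in the previous step):
192.168.159.130 master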
d. Restart the network service and verify that the hostname is set correctly:
service network restart
hostname
e. Run ssh-keygen and press Enter at all the prompts.
f. Navigate to the ~/.ssh folder and check the RSA key pair (id_rsa and id_rsa.pub).
g. Copy the public key to the authorized_keys file:
cat id_rsa.pub >> authorized_keys
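SSH requires strict permissions on this file; tighten them and then confirm that you can log in without a password prompt:
chmod 600 ~/.ssh/authorized_keys
ssh master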
h. Disable the firewall:
/etc/init.d/iptables stop
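To keep the firewall disabled across reboots as well (acceptable for this sandbox VM, not for production):
chkconfig iptables off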
3. Installing Java:
a. Download the Java 8 JDK RPM from the following URL:
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
b. Using WinSCP, move the .rpm file to the /usr/local/src folder on the VM.
c. Install Java using the following yum command:
yum localinstall -y jdk-8u231-linux-x64.rpm
d. Check the installation by issuing the following command
java -version
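The output should report the installed version, roughly like the following (build details will vary):
java version "1.8.0_231"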
4. Setting up Hadoop:
a. Download Hadoop 2.7.1 into the /usr/local/src folder from the URL below.
https://archive.apache.org/dist/hadoop/core/hadoop-2.7.1
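If the VM has internet access, the tarball can also be fetched directly, for example:
cd /usr/local/src
wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.1/hadoop-2.7.1.tar.gz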
b. Create a folder called /apache:
mkdir /apache
c. Extract the tarball into this folder by running the following command:
tar -xvzf /usr/local/src/hadoop-2.7.1.tar.gz -C /apache
d. Create a soft link in the /apache folder as follows; this allows switching between Hadoop versions without changing the environment settings:
ln -s /apache/hadoop-2.7.1 /apache/hadoop
e. Set the Linux environment variables in the ~/.bashrc file as follows (JAVA_HOME must match the JDK version actually installed in step 3):
export HADOOP_HOME=/apache/hadoop
export HADOOP_CONF_DIR=/apache/hadoop/etc/hadoop
export JAVA_HOME=/usr/java/jdk1.8.0_231-amd64
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
f. Source the .bashrc file:
source ~/.bashrc
g. The standalone Hadoop installation is now complete; verify it with the command below. (In standalone mode the default filesystem is the local one, so this lists the local root directory.)
hdfs dfs -ls /
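You can also confirm that the binaries and JAVA_HOME resolve correctly:
hadoop version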
h. Update the configuration files in $HADOOP_CONF_DIR to switch to a pseudo-distributed setup:
1. Update core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
2. Update the hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/grid/hadoop/hdfs/nn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/grid/hadoop/hdfs/dn</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
3. Create the mapred-site.xml file from the mapred-site.xml.template file:
cp mapred-site.xml.template mapred-site.xml
4. Update the mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
5. Update the yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.address</name>
<value>0.0.0.0:45454</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/grid/hadoop/yarn/tmp/</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<description>NM Webapp address.</description>
<name>yarn.nodemanager.webapp.address</name>
<value>master:8042</value>
</property>
</configuration>
6. Update the masters file with ‘master’
7. Update the slaves file with ‘master’ (see the commands after this list)
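Both files live in $HADOOP_CONF_DIR; for this single-node setup they can be written in one step:
echo master > $HADOOP_CONF_DIR/masters
echo master > $HADOOP_CONF_DIR/slaves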
i. Create the required folders
mkdir -p /grid/hadoop/hdfs/nn
mkdir -p /grid/hadoop/hdfs/dn
mkdir -p /grid/hadoop/yarn/tmp
j. Format the NameNode
hdfs namenode -format
k. Check the NameNode metadata directory as shown below.
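After a successful format, the directory should contain a current/ subdirectory with fsimage and VERSION files:
ls /grid/hadoop/hdfs/nn/current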
l. Start the Hadoop daemons
$HADOOP_HOME/sbin/start-all.sh
m. Check the Hadoop daemons with the jps command; expected output is shown below.
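On a single node running all daemons, jps should list something like the following (process IDs omitted):
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps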
n. Check the NameNode web UI on port 50070 (http://master:50070)
o. Check the ResourceManager web UI on port 8088 (http://master:8088)
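As a final smoke test, run one of the bundled example jobs (the jar path below matches the 2.7.1 tarball layout):
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 2 5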