0% found this document useful (0 votes)

1 views

Big data analytics lab-JD

The Big Data Analytics Lab Manual outlines various experiments related to Hadoop, including installation, file management tasks, matrix multiplication, and word count using MapReduce. Each experiment includes aims, descriptions, algorithms, and commands necessary for execution. The manual serves as a comprehensive guide for students to understand and implement big data concepts using Hadoop.

Uploaded by

ramyakrishnan201

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

1 views

Big data analytics lab-JD

Uploaded by

ramyakrishnan201

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 49

RENGANAYAGI VARATHARAJ COLLEGE OF ENGINEERING

SALVARPATTI, SIVAKASI – 626 128

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING

BIG DATA ANALYTICS

LAB MANUAL
(CCS334)

SUBJECT HANDLED BY :
MS.JAIDHARNI AP/CSE
BIG DATA ANALYTICS LAB
(CCS334)

List of Experiments

1. Downloading and installing Hadoop; Understanding different Hadoop modes.

Startup scripts .Configuration files.

2. Hadoop implementation of the management tasks, such as Adding files and directories,
Retrieving files and deleting files.

3. Implemention of Matrix Multiplication with hadoop Map Reduce.

4. Run a basic word Count Map Reduce program to understand Map Reduce Paradigm.

5. Installation of Hive along with practice examples.

6. Installation of HBase, Installing thrift along with Practice examples.

7. Practice importing and exporting data from various databases.

EXPNO:1
1.Install Apache Hadoop
Date:

AIM:-
i) Perform setting up and Installing Hadoop in its three operating modes:

 Standalone

 Pseudo Distributed

 Fully Distributed

DESCRIPTION:
Hadoop is written in Java, so you will need to have Java installed on your machine,

version 6 or later. Sun's JDK is the one most widely used with Hadoop, although others have

been reported to work.

Hadoop runs on Unix and on Windows. Linux is the only supported production platform,

but other flavors of Unix (including Mac OS X) can be used to run Hadoop for development.

Windows is only supported as a development platform, and additionally requires Cygwin to run.

During the Cygwin installation process, you should include the openssh package if you plan to

run Hadoop in pseudo-distributed mode

ALGORITHM
STEPS INVOLVED IN INSTALLING HADOOP IN STANDALONE MODE:-

1. Command for installing ssh is “sudo apt-get install ssh”.

2. Command for key generation is ssh-keygen –t rsa –P “ ”.

3. Store the key into rsa.pub by using the command cat $HOME/.ssh/id_rsa.pub >>

$HOME/.ssh/authorized_keys

4. Extract the java by using the command tar xvfz jdk-8u60-linux-i586.tar.gz.

5. Extract the eclipse by using the command tar xvfz eclipse-jee-mars-R-linux-gtk.tar.gz

6. Extract the hadoop by using the command tar xvfz hadoop-2.7.1.tar.gz

HADOOP AND BIG DATA

7. Move the java to /usr/lib/jvm/ and eclipse to /opt/ paths. Configure the java path in the

eclipse.ini file
8. Export java path and hadoop path in ./bashrc

9. Check the installation successful or not by checking the java version and hadoop version

10. Check the hadoop instance in standalone mode working correctly or not by using an

implicit hadoop jar file named as word count.

11. If the word count is displayed correctly in part-r-00000 file it means that standalone mode

is installed successfully.

ALGORITHM
STEPS INVOLVED IN INSTALLING HADOOP IN PSEUDO DISTRIBUTED MODE:-

1. In order install pseudo distributed mode we need to configure the hadoop

configuration files resides in the directory /home/lendi/hadoop-2.7.1/etc/hadoop.

2. First configure the hadoop-env.sh file by changing the java path.

3. Configure the core-site.xml which contains a property tag, it contains name and

value. Name as fs.defaultFS and value as hdfs://localhost:9000

4. Configure hdfs-site.xml.

5. Configure yarn-site.xml.

6. Configure mapred-site.xml before configure the copy mapred-site.xml.template to

mapred-site.xml.

7. Now format the name node by using command hdfs namenode –format.

8. Type the command start-dfs.sh,start-yarn.sh means that starts the daemons like

NameNode,DataNode,SecondaryNameNode ,ResourceManager,NodeManager.

9. Run JPS which views all daemons. Create a directory in the hadoop by using

command hdfs dfs –mkdr /csedir and enter some data into lendi.txt using command

nano lendi.txt and copy from local directory to hadoop using command hdfs dfs –

copyFromLocal lendi.txt /csedir/and run sample jar file wordcount to check whether

pseudo distributed mode is working or not.

10. Display the contents of file by using command hdfs dfs –cat /newdir/part-r-00000.

FULLY DISTRIBUTED MODE INSTALLATION:

ALGORITHM

1. Stop all single node clusters

$stop-all.sh
2. Decide one as NameNode (Master) and remaining as DataNodes(Slaves).

3. Copy public key to all three hosts to get a password less SSH access

$ssh-copy-id –I $HOME/.ssh/id_rsa.pub lendi@l5sys24

4. Configure all Configuration files, to name Master and Slave Nodes.

$cd $HADOOP_HOME/etc/hadoop

$nano core-site.xml

$ nano hdfs-site.xml

5. Add hostnames to file slaves and save it.

$ nano slaves

6. Configure $ nano yarn-site.xml

7. Do in Master Node

$ hdfs namenode –format

$ start-dfs.sh

$start-yarn.sh

8. Format NameNode

9. Daemons Starting in Master and Slave Nodes

10. END

INPUT
ubuntu @localhost> jps

OUTPUT:
Data node, name nodem Secondary name node,

NodeManager, Resource Manager

Result:

We've installed Hadoop in standalone mode and verified it by running an example program it
provided
EXPNO:2
Hadoop Implementation of file management tasks
Date:

AIM:-
Implement the following file management tasks in Hadoop:
 Adding files and directories
 Retrieving files
 Deleting Files
DESCRIPTION:-
HDFS is a scalable distributed filesystem designed to scale to petabytes of data while
running on top of the underlying filesystem of the operating system. HDFS keeps track of where
the data resides in a network by associating the name of its rack (or network switch) with the
dataset. This allows Hadoop to efficiently schedule tasks to those nodes that contain data, or
which are nearest to it, optimizing bandwidth utilization. Hadoop provides a set of command line
utilities that work similarly to the Linux file commands, and serve as your primary interface with
HDFS. We‘re going to have a look into HDFS by interacting with it from the command line. We
will take a look at the most common file management tasks in Hadoop, which include:
 Adding files and directories to HDFS
 Retrieving files from HDFS to local filesystem
 Deleting files from HDFS
ALGORITHM:-
SYNTAX AND COMMANDS TO ADD, RETRIEVE AND DELETE DATA FROM HDFS
Step-1
Adding Files and Directories to HDFS
Before you can run Hadoop programs on data stored in HDFS, you‘ll need to put the data into
HDFS first. Let‘s create a directory and put a file in it. HDFS has a default working directory of
/user/$USER, where $USER is your login user name. This directory isn‘t automatically created
for you, though, so let‘s create it with the mkdir command. For the purpose of illustration, we
use chuck. You should substitute your user name in the example commands.

hadoop fs -mkdir /user/chuck

hadoop fs -put example.txt
hadoop fs -put example.txt /user/chuck

Step-2

Retrieving Files from HDFS

The Hadoop command get copies files from HDFS back to the local filesystem. To retrieve
example.txt, we can run the following command:
hadoop fs -cat example.txt

Step-3

Deleting Files from HDFS

hadoop fs -rm example.txt
 Command for creating a directory in hdfs is “hdfs dfs –mkdir /lendicse”.
 Adding directory is done through the command “hdfs dfs –put lendi_english /”.

Step-4

Copying Data from NFS to HDFS

Copying from directory command is “hdfs dfs –copyFromLocal
/home/lendi/Desktop/shakes/glossary /lendicse/”

 View the file by using the command “hdfs dfs –cat /lendi_english/glossary”
 Command for listing of items in Hadoop is “hdfs dfs –ls hdfs://localhost:9000/”.
 Command for Deleting files is “hdfs dfs –rm r /kartheek”

SAMPLE INPUT:
Input as any data format of type structured, Unstructured or Semi Structured

EXPECTED OUTPUT:

Result:
Thus the implementation for program is executed successfully.
EXPNO:3
Implementation of Matrix Multiplication with hadoop
Date:

AIM:-
Write a Map Reduce Program that implements Matrix Multiplication.

DESCRIPTION:
We can represent a matrix as a relation (table) in RDBMS where each cell in the matrix
can be represented as a record (i,j,value). As an example let us consider the following matrix
and its representation. It is important to understand that this relation is a very inefficient relation
if the matrix is dense. Let us say we have 5 Rows and 6 Columns , then we need to store only 30
values. But if you consider above relation we are storing 30 rowid, 30 col_id and 30 values in
other sense we are tripling the data. So a natural question arises why we need to store in this
format ? In practice most of the matrices are sparse matrices . In sparse matrices not all cells
used to have any values , so we don‘t have to store those cells in DB. So this turns out to be very
efficient in storing such matrices.

MapReduceLogic:
Logic is to send the calculation part of each output cell of the result matrix to a reducer.
So in matrix multiplication the first cell of output (0,0) has multiplication and summation of
elements from row 0 of the matrix A and elements from col 0 of matrix B. To do the
computation of value in the output cell (0,0) of resultant matrix in a seperate reducer we need to
use (0,0) as output key of mapphase and value should have array of values from row 0 of matrix
A and column 0 of matrix B. Hopefully this picture will explain the point. So in this algorithm
output from map phase should be having a <key,value> , where key represents the output cell
location (0,0) , (0,1) etc.. and value will be list of all values required for reducer to do
computation. Let us take an example for calculatiing value at output cell (00). Here we need to
collect values from row 0 of matrix A and col
0 of matrix B in the map phase and pass (0,0) as
key. So a single reducer can do the calculation

ALGORITHM
We assume that the input files for A and B are streams of (key,value) pairs in sparse
matrix format, where each key is a pair of indices (i,j) and each value is the corresponding matrix
element value. The output files for matrix C=A*B are in the same format.

We have the following input parameters:

The path of the input file or directory for matrix A.
The path of the input file or directory for matrix B.
The path of the directory for the output files for matrix C.
strategy = 1, 2, 3 or 4.
R = the number of reducers.
I = the number of rows in A and C.
K = the number of columns in A and rows in B.
J = the number of columns in B and C.
IB = the number of rows per A block and C block.
KB = the number of columns per A block and rows per B block.
JB = the number of columns per B block and C block.
In the pseudo-code for the individual strategies below, we have intentionally avoided
factoring common code for the purposes of clarity.
Note that in all the strategies the memory footprint of both the mappers and the reducers is flat at
scale.
Note that the strategies all work reasonably well with both dense and sparse matrices. For sparse
matrices we do not emit zero elements. That said, the simple pseudo-code for multiplying the
individual blocks shown here is certainly not optimal for sparse matrices. As a learning exercise,
our focus here is on mastering the MapReduce complexities, not on optimizing the sequential
matrix multipliation algorithm for the individual blocks.

Steps
1. setup ()
2. var NIB = (I-1)/IB+1
3. var NKB = (K-1)/KB+1
4. var NJB = (J-1)/JB+1
5. map (key, value)
6. if from matrix A with key=(i,k) and value=a(i,k)
7. for 0 <= jb < NJB
8. emit (i/IB, k/KB, jb, 0), (i mod IB, k mod KB, a(i,k))
9. if from matrix B with key=(k,j) and value=b(k,j)
10. for 0 <= ib < NIB
emit (ib, k/KB, j/JB, 1), (k mod KB, j mod JB, b(k,j))
Intermediate keys (ib, kb, jb, m) sort in increasing order first by ib, then by kb, then by jb,
then by m. Note that m = 0 for A data and m = 1 for B data.
The partitioner maps intermediate key (ib, kb, jb, m) to a reducer r as follows:
11. r = ((ib*JB + jb)*KB + kb) mod R
12. These definitions for the sorting order and partitioner guarantee that each reducer
R[ib,kb,jb] receives the data it needs for blocks A[ib,kb] and B[kb,jb], with the data for
the A block immediately preceding the data for the B block.
13. var A = new matrix of dimension IBxKB
14. var B = new matrix of dimension KBxJB
15. var sib = -1
16. var skb = -1

Reduce (key, valueList)

17. if key is (ib, kb, jb, 0)
18. // Save the A block.
19. sib = ib
20. skb = kb
21. Zero matrix A
22. for each value = (i, k, v) in valueList A(i,k) = v
23. if key is (ib, kb, jb, 1)
24. if ib != sib or kb != skb return // A[ib,kb] must be zero!
25. // Build the B block.
26. Zero matrix B
27. for each value = (k, j, v) in valueList B(k,j) = v
28. // Multiply the blocks and emit the result.
29. ibase = ib*IB
30. jbase = jb*JB
31. for 0 <= i < row dimension of A
32. for 0 <= j < column dimension of B
33. sum = 0
34. for 0 <= k < column dimension of A = row dimension of B
a. sum += A(i,k)*B(k,j)
35. if sum != 0 emit (ibase+i, jbase+j), sum
INPUT:-
Set of Data sets over different Clusters are taken as Rows and Columns

OUTPUT

Result:
Thus the implementation for program is executed successfully.
EXPNO:4
Word count Map Reduce program
Date:

AIM: To Develop a MapReduce program to calculate the frequency of a given word in a given
file. Map Function – It takes a set of data and converts it into another set of data, where
individual elements are broken down into tuples (Key-Value pair).

Example – (MapfunctioninWord Count)

Input

Setof data
Bus,Car,bus,car,train, car,bus,car,train,bus,TRAIN,BUS,buS, caR,CAR,car,BUS,TRAIN
Output

Convertintoanothersetofdata
(Key,Value)
(Bus,1),(Car,1), (bus,1),(car,1),(train,1),(car,1), (bus,1),(car,1), (train,1),(bus,1),
(TRAIN,1),(BUS,1),(buS,1),(caR,1),(CAR,1),(car,1), (BUS,1), (TRAIN,1)
ReduceFunction–TakestheoutputfromMapasaninputandcombinesthosedatatuples into a
smaller set of tuples.
Example– (Reducefunction in Word Count)
Input Set of Tuples
(output of Map function)
(Bus,1),(Car,1),(bus,1),(car,1),(train,1),(car,1),(bus,1),(car,1),(train,1),(bus,1), (TRAIN,1),
(BUS,1),
(buS,1),(caR,1),(CAR,1),(car,1),(BUS,1), (TRAIN,1)

Output Convertsintosmallersetoftuples

(BUS,7),(CAR,7),(TRAIN,4)
WorkFlowof Program

Workflow of MapReduce consists of 5steps

1. Splitting–Thesplittingparametercanbeanything,e.g.splittingbyspace,
comma, semicolon, or even by a new line (‘\n’).
2. Mapping–asexplainedabove
3. Intermediatesplitting –theentireprocessinparallelondifferentclusters.Inorder
togroupthemin“ReducePhase”thesimilarKEYdatashouldbeonsamecluster.
4. Reduce–itisnothingbutmostlygroupbyphase
5. Combining–Thelastphasewhereallthedata(individualresultsetfromeach
cluster) is combine together to form a Result

NowLet’sSeetheWordCountProgramin Java

MakesurethatHadoopisinstalledonyoursystemwithjavaidk Steps to

Step1.OpenEclipse>File>New>JavaProject >(Nameit–MRProgramsDemo)>Finish
Step2.RightClick>New>Package(Nameit -PackageDemo)>Finish Step 3. Right
Click on Package > New > Class (Name it - WordCount) Step 4. Add
Following Reference Libraries –

Right Click on Project>BuildPath>AddExternalArchivals

 /usr/lib/hadoop-0.20/hadoop-core.jar
 Usr/lib/hadoop-0.20/lib/Commons-cli-1.2.jar

Program : Step5 TypefollowingProgram:

package PackageDemo;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import
org.apache.hadoop.mapreduce.Mapper;
import
org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
publicclassWordCount{
publicstaticvoidmain(String[]args)throwsException
{
Configurationc=newConfiguration();
String[]files=newGenericOptionsParser(c,args).getRemainingArgs();
Path input=new Path(files[0]);
Path output=new Path(files[1]);
Job j=new Job(c,"wordcount");
j.setJarByClass(WordCount.class);
j.setMapperClass(MapForWordCount.class);
j.setReducerClass(ReduceForWordCount.class);j.setOutputKey
Class(Text.class); j.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(j, input);
FileOutputFormat.setOutputPath(j, output);
System.exit(j.waitForCompletion(true)?0:1);
}
publicstaticclassMapForWordCountextendsMapper<LongWritable,Text,Text, Int
Writable>{
publicvoidmap(LongWritablekey,Textvalue,Contextcon)throwsIOException,
InterruptedException
{
Stringline=value.toString();

String[]words=line.split(",");
for(String word: words )
{
TextoutputKey=newText(word.toUpperCase().trim());
IntWritable outputValue = new IntWritable(1);
con.write(outputKey, outputValue);
}
}
}
publicstaticclassReduceForWordCountextendsReducer<Text,IntWritable,Text,
IntWritable>
{
publicvoidreduce(Textword,Iterable<IntWritable>values,Contextcon)throws
IOException,
InterruptedException
{
intsum=0;
for(IntWritablevalue:values)
{
sum+=value.get();
}
con.write(word,newIntWritable(sum));
}
}
}

MakeJarFile
RightClickonProject>Export>SelectexportdestinationasJarFile>next>Finish
ToMovethisintoHadoopdirectly,opentheterminalandenterthefollowing commands:
[training@localhost~]$hadoopfs-putwordcountFilewordCountFile

RunJarfile
(Hadoopjarjarfilename.jarpackageName.ClassNamePathToInputTextFile
PathToOutputDirectry)

[training@localhost~]$HadoopjarMRProgramsDemo.jar PackageDemo.WordCount
wordCountFile MRDir1

Result : OpenResult

[training@localhost~]$hadoopfs-lsMRDir1
Found 3 items
-rw-r--r--1trainingsupergroup
02016-02-2303:36/user/training/MRDir1/_SUCCESS
drwxr-xr-x - training supergroup
02016-02-2303:36/user/training/MRDir1/_logs
-rw-r--r--1trainingsupergroup
20 2016-02-23 03:36 /user/training/MRDir1/part-r-00000
[training@localhost~]$hadoopfs-catMRDir1/part-r-00000
BUS 7
CAR 4
TRAIN6

Result:
Thus the implementation for program is executed successfully.

EXPNO:5
Date:
Installation of Hive

Downloading Hive
We use hive-0.14.0 in this tutorial. You can download it by visiting the following link
http://apache.petsads.us/hive/hive-0.14.0/. Let us assume it gets downloaded onto the /Downloads directory.
Here, we download Hive archive named “apache-hive-0.14.0-bin.tar.gz” for this tutorial. The following
command is used to verify the download:

$ cd Downloads
$ ls

On successful download, you get to see the following response:

apache-hive-0.14.0-bin.tar.gz

Installing Hive
The following steps are required for installing Hive on your system. Let us assume the Hive archive is
downloaded onto the /Downloads directory.

Extracting and verifying Hive Archive

The following command is used to verify the download and extract the hive archive:

$ tar zxvf apache-hive-0.14.0-bin.tar.gz

$ ls

On successful download, you get to see the following response:

apache-hive-0.14.0-bin apache-hive-0.14.0-bin.tar.gz

Copying files to /usr/local/hive directory

We need to copy the files from the super user “su -”. The following commands are used to copy the files from
the extracted directory to the /usr/local/hive” directory.

$ su -
passwd:

# cd /home/user/Download
# mv apache-hive-0.14.0-bin /usr/local/hive
# exit

Setting up environment for Hive

You can set up the Hive environment by appending the following lines to ~/.bashrc file:

export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:/usr/local/Hadoop/lib/*:.
export CLASSPATH=$CLASSPATH:/usr/local/hive/lib/*:.

The following command is used to execute ~/.bashrc file.

$ source ~/.bashrc

Configuring Hive
To configure Hive with Hadoop, you need to edit the hive-env.sh file, which is placed in the
$HIVE_HOME/conf directory. The following commands redirect to Hive config folder and copy the
template file:

$ cd $HIVE_HOME/conf
$ cp hive-env.sh.template hive-env.sh

Edit the hive-env.sh file by appending the following line:

export HADOOP_HOME=/usr/local/hadoop

Hive installation is completed successfully. Now you require an external database server to configure
Metastore. We use Apache Derby database.

Result:
Thus the implementation for program is executed successfully.
EXPNO:6
Installation of HBase
Date:

INSTALLATION OF HBASE, INSTALLING THIFT ALONG WITH PRACTICE EXAMPLES.

Installing HBase

We can install HBase in any of the three modes: Standalone mode, Pseudo Distributed mode, and Fully
Distributed mode.

Installing HBase in Standalone Mode

Download the latest stable version of HBase form http://www.interior-dsgn.com/apache/hbase/stable/ using

“wget” command, and extract it using the tar “zxvf” command. See the following command.

$cd usr/local/
$wget http://www.interior-dsgn.com/apache/hbase/stable/hbase-0.98.8-
hadoop2-bin.tar.gz
$tar -zxvf hbase-0.98.8-hadoop2-bin.tar.gz

Shift to super user mode and move the HBase folder to /usr/local as shown below.

$su
$password: enter your password here
mv hbase-0.99.1/* Hbase/

Configuring HBase in Standalone Mode

Before proceeding with HBase, you have to edit the following files and configure HBase.

hbase-env.sh

Set the java Home for HBase and open hbase-env.sh file from the conf folder. Edit JAVA_HOME
environment variable and change the existing path to your current JAVA_HOME variable as shown below.

cd /usr/local/Hbase/conf
gedit hbase-env.sh

This will open the env.sh file of HBase. Now replace the existing JAVA_HOME value with your current
value as shown below.

export JAVA_HOME=/usr/lib/jvm/java-1.7.0

hbase-site.xml

This is the main configuration file of HBase. Set the data directory to an appropriate location by opening the
HBase home folder in /usr/local/HBase. Inside the conf folder, you will find several files, open the hbase-
site.xml file as shown below.

#cd /usr/local/HBase/
#cd conf
# gedit hbase-site.xml

Inside the hbase-site.xml file, you will find the <configuration> and </configuration> tags. Within them, set
the HBase directory under the property key with the name “hbase.rootdir” as shown below.

<configuration>
//Here you have to set the path where you want HBase to store its files.
<property>
<name>hbase.rootdir</name>
<value>file:/home/hadoop/HBase/HFiles</value>
</property>

//Here you have to set the path where you want HBase to store its built in zookeeper
files.
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/zookeeper</value>
</property>
</configuration>

With this, the HBase installation and configuration part is successfully complete. We can start HBase by using
start-hbase.sh script provided in the bin folder of HBase. For that, open HBase Home Folder and run HBase
start script as shown below.

$cd /usr/local/HBase/bin
$./start-hbase.sh

If everything goes well, when you try to run HBase start script, it will prompt you a message saying that
HBase has started.

starting master, logging to /usr/local/HBase/bin/../logs/hbase-tpmaster-

localhost.localdomain.out

Installing HBase in Pseudo-Distributed Mode

Let us now check how HBase is installed in pseudo-distributed mode.

Configuring HBase

Before proceeding with HBase, configure Hadoop and HDFS on your local system or on a remote system and
make sure they are running. Stop HBase if it is running.

hbase-site.xml

Edit hbase-site.xml file to add the following properties.

<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
It will mention in which mode HBase should be run. In the same file from the local file system, change the
hbase.rootdir, your HDFS instance address, using the hdfs://// URI syntax. We are running HDFS on the
localhost at port 8030.

<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8030/hbase</value>
</property>

Starting HBase

After configuration is over, browse to HBase home folder and start HBase using the following command.

$cd /usr/local/HBase
$bin/start-hbase.sh

Note: Before starting HBase, make sure Hadoop is running.

Checking the HBase Directory in HDFS

HBase creates its directory in HDFS. To see the created directory, browse to Hadoop bin and type the
following command.

$ ./bin/hadoop fs -ls /hbase

If everything goes well, it will give you the following output.

Found 7 items
drwxr-xr-x - hbase users 0 2014-06-25 18:58 /hbase/.tmp
drwxr-xr-x - hbase users 0 2014-06-25 21:49 /hbase/WALs
drwxr-xr-x - hbase users 0 2014-06-25 18:48 /hbase/corrupt
drwxr-xr-x - hbase users 0 2014-06-25 18:58 /hbase/data
-rw-r--r-- 3 hbase users 42 2014-06-25 18:41 /hbase/hbase.id
-rw-r--r-- 3 hbase users 7 2014-06-25 18:41 /hbase/hbase.version
drwxr-xr-x - hbase users 0 2014-06-25 21:49 /hbase/oldWALs

Starting and Stopping a Master

Using the “local-master-backup.sh” you can start up to 10 servers. Open the home folder of HBase, master and
execute the following command to start it.

$ ./bin/local-master-backup.sh 2 4

To kill a backup master, you need its process id, which will be stored in a file named “/tmp/hbase-USER-X-
master.pid.” you can kill the backup master using the following command.

$ cat /tmp/hbase-user-1-master.pid |xargs kill -9

Starting and Stopping RegionServers

You can run multiple region servers from a single system using the following command.
$ .bin/local-regionservers.sh start 2 3

To stop a region server, use the following command.

$ .bin/local-regionservers.sh stop 3

Starting HBaseShell
After Installing HBase successfully, you can start HBase Shell. Below given are the sequence of steps that are
to be followed to start the HBase shell. Open the terminal, and login as super user.

Start Hadoop File System

Browse through Hadoop home sbin folder and start Hadoop file system as shown below.

$cd $HADOOP_HOME/sbin
$start-all.sh

Start HBase

Browse through the HBase root directory bin folder and start HBase.

$cd /usr/local/HBase
$./bin/start-hbase.sh

Start HBase Master Server

This will be the same directory. Start it as shown below.

$./bin/local-master-backup.sh start 2 (number signifies specific

server.)

Start Region

Start the region server as shown below.

$./bin/./local-regionservers.sh start 3

Start HBase Shell

You can start HBase shell using the following command.

$cd bin
$./hbase shell

This will give you the HBase Shell Prompt as shown below.

2014-12-09 14:24:27,526 INFO [main] Configuration.deprecation:

hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.98.8-hadoop2, r6cfc8d064754251365e070a10a82eb169956d5fe, Fri
Nov 14 18:26:29 PST 2014

hbase(main):001:0>

HBase Web Interface

To access the web interface of HBase, type the following url in the browser.

http://localhost:60010

This interface lists your currently running Region servers, backup masters and HBase tables.

HBase Region servers and Backup Masters

HBase Tables

Setting Java Environment

We can also communicate with HBase using Java libraries, but before accessing HBase using Java API you
need to set classpath for those libraries.

Setting the Classpath

Before proceeding with programming, set the classpath to HBase libraries in .bashrc file. Open .bashrc in any
of the editors as shown below.

$ gedit ~/.bashrc

Set classpath for HBase libraries (lib folder in HBase) in it as shown below.

export CLASSPATH = $CLASSPATH://home/hadoop/hbase/lib/*

This is to prevent the “class not found” exception while accessing the HBase using java API.

Result:
Thus the implementation for program is executed successfully.
EXPNO:7
Importing and exporting data from various database
Date:

SQL Server is very popular in Relational Database and it is used across many software industries. In MS SQL
Server, two sorts of databases are available.

 System databases
 User Databases

In this, we will learn to export and import SQL Server databases using Microsoft SQL Server. Exporting and
Importing stands as a backup plan for developers.

Step 1: Open the “Microsoft SQL Server” Click on “File”, “New” and select “Database Engine Query”.

Step 2: Create a Database

Query :

CREATE DATABASE college;

Output:
Step 3: Select the newly created table “college”.

Query :

USE college;

Output:

Step 4: Creating a Table

Query :

CREATE TABLE students( id INT NOT NULL PRIMARY KEY,

name VARCHAR(300) NOT NULL , reg _no INT NOT NULL ,
semester INT NOT NULL );

Output:
Step 5: Insert the Records

Query :

INSERT INTO students VALUES

(1,'priya',31,3),(2,'keerthi',12,1),
(3,'rahul',29,2),(4,'reyansh',38,3),
5,'lasya',47,2);

Output:
Exporting SQL Server Database:

After creating a database in “Microsoft SQL Server”, Let’s see how exporting takes place.

Step 1: Open the Object Explorer, Right-click on the Database that you want to export and click the “task”
option and select “Export Data-Tier Application”.
Step 2: Click Next and by browsing, select the destination folder in which you have to save the database file.
The filename should be as same as the database name ( here “college” ) and click “Next ” and “Finish”. You
will get a dialogue box showing the result of exporting.

Importing SQL Server Database :

Step 1: Right Click on the Database folder and select “Import Data-Tier Application” and click “Next.
Step 2: Select the file which you have exported and change the name of the database

here we changed the database name from “college” to “college_ info” and click “Next” and a dialogue box
appears showing the result of importing.
Whether you're preparing for your first job interview or aiming to upskill in this ever-evolving tech landscape,
GeeksforGeeks Courses are your key to success. We provide top-quality content at affordable prices, all geared
towards accelerating your growth in a time-bound manner. Join the millions we've already empowered, and we're
here to do the same for you

Result:
Thus the implementation for program is executed successfully.
.

Microsoft: Exam Questions DP-900
No ratings yet
Microsoft: Exam Questions DP-900
27 pages
Bda Manual
No ratings yet
Bda Manual
80 pages
Big Data & Analytics Lab Manual
No ratings yet
Big Data & Analytics Lab Manual
51 pages
Lab 1 - Hadoop HDFS and MapReduce
No ratings yet
Lab 1 - Hadoop HDFS and MapReduce
4 pages
Summer Training Report
50% (4)
Summer Training Report
58 pages
bigdatamanual(2)
No ratings yet
bigdatamanual(2)
45 pages
CCS334-BDA LAB MANUAL final (1)
No ratings yet
CCS334-BDA LAB MANUAL final (1)
46 pages
Bigdatamanualfinal 231019063224 d211cb48
No ratings yet
Bigdatamanualfinal 231019063224 d211cb48
45 pages
Bdafile
No ratings yet
Bdafile
9 pages
BDA LAB MANUAL
No ratings yet
BDA LAB MANUAL
45 pages
BigData_Lab_Manual
No ratings yet
BigData_Lab_Manual
44 pages
Big Data Record 2024-25
No ratings yet
Big Data Record 2024-25
46 pages
ccs 334 bigdata manual
No ratings yet
ccs 334 bigdata manual
45 pages
Big Data Manual Ai
No ratings yet
Big Data Manual Ai
33 pages
Big Data Analytics Lab Experiments
No ratings yet
Big Data Analytics Lab Experiments
16 pages
Big Data File
No ratings yet
Big Data File
16 pages
big datalab
No ratings yet
big datalab
4 pages
BDA-Lab Record
No ratings yet
BDA-Lab Record
43 pages
Week 1 in Terminal
No ratings yet
Week 1 in Terminal
10 pages
bda-manual
No ratings yet
bda-manual
33 pages
Prachi 20CS111 BDALab File
No ratings yet
Prachi 20CS111 BDALab File
20 pages
Dsa Practical File
No ratings yet
Dsa Practical File
16 pages
CCS334 Bda
No ratings yet
CCS334 Bda
23 pages
big data
No ratings yet
big data
28 pages
Bda Record
No ratings yet
Bda Record
83 pages
1.Mrplab Intro
No ratings yet
1.Mrplab Intro
18 pages
Pro 3
No ratings yet
Pro 3
45 pages
Cloud PDF
No ratings yet
Cloud PDF
47 pages
BDA Lab Manual_organized (2) (1) - Copy
No ratings yet
BDA Lab Manual_organized (2) (1) - Copy
69 pages
BIGDATA LAB MANUAL
No ratings yet
BIGDATA LAB MANUAL
27 pages
Big Data
No ratings yet
Big Data
23 pages
213nt1306- Big Data Analytics Lab Manual
No ratings yet
213nt1306- Big Data Analytics Lab Manual
80 pages
Big Data Manual
No ratings yet
Big Data Manual
19 pages
BIG data file
No ratings yet
BIG data file
28 pages
BDA Record (1)
No ratings yet
BDA Record (1)
34 pages
Computer Science & Engineering: Department of
No ratings yet
Computer Science & Engineering: Department of
6 pages
Data Storage Data Processing: Hadoop Distributed File System (HDFS) Mapreduce
No ratings yet
Data Storage Data Processing: Hadoop Distributed File System (HDFS) Mapreduce
35 pages
Bigdata Lab File
No ratings yet
Bigdata Lab File
20 pages
Big Data Analytics Lab
No ratings yet
Big Data Analytics Lab
18 pages
@bigdatalabfile 09
No ratings yet
@bigdatalabfile 09
35 pages
Hadoop Installation
No ratings yet
Hadoop Installation
11 pages
Hands On-Exercies
No ratings yet
Hands On-Exercies
17 pages
NEW BDA MANUAL
No ratings yet
NEW BDA MANUAL
80 pages
Ba Lab Record-It b2022-26
No ratings yet
Ba Lab Record-It b2022-26
43 pages
Exercise 03
No ratings yet
Exercise 03
2 pages
Final Copy - BDA LAB Record
No ratings yet
Final Copy - BDA LAB Record
44 pages
BDA lab Manual
No ratings yet
BDA lab Manual
62 pages
BDA Lab Manual-1
No ratings yet
BDA Lab Manual-1
60 pages
Bda Record
No ratings yet
Bda Record
46 pages
BD Lab File
No ratings yet
BD Lab File
39 pages
A Report On Distributed Computing
No ratings yet
A Report On Distributed Computing
25 pages
Experiment No - 1
No ratings yet
Experiment No - 1
13 pages
3 Hadoop
No ratings yet
3 Hadoop
40 pages
Extreme Computing Lab Exercises Session One: 1 Getting Started
No ratings yet
Extreme Computing Lab Exercises Session One: 1 Getting Started
6 pages
Amrita CC 3.1
No ratings yet
Amrita CC 3.1
7 pages
Hadoop Lab Manual
No ratings yet
Hadoop Lab Manual
54 pages
BDA LAB MANUEL
No ratings yet
BDA LAB MANUEL
9 pages
HadoopfilePP
No ratings yet
HadoopfilePP
83 pages
Hadoop Installation Manual 2.odt
No ratings yet
Hadoop Installation Manual 2.odt
20 pages
EX. NO Date Program NO Sign
No ratings yet
EX. NO Date Program NO Sign
80 pages
Data Science
No ratings yet
Data Science
82 pages
Quick Configuration of Openldap and Kerberos In Linux and Authenicating Linux to Active Directory
From Everand
Quick Configuration of Openldap and Kerberos In Linux and Authenicating Linux to Active Directory
Dr. Hidaia Mahmood Alassouli
No ratings yet
Shubham Srs Library Management System
No ratings yet
Shubham Srs Library Management System
11 pages
SQL interview questions
No ratings yet
SQL interview questions
50 pages
Hardware - Migration - Replication+Method+Preparation
No ratings yet
Hardware - Migration - Replication+Method+Preparation
11 pages
6014.optimizing SQL Server For Temenos T24 - FINAL
No ratings yet
6014.optimizing SQL Server For Temenos T24 - FINAL
60 pages
Tf13 Factorytalk System Design Considerations Rsteched
No ratings yet
Tf13 Factorytalk System Design Considerations Rsteched
101 pages
BioTime Installation Guide
No ratings yet
BioTime Installation Guide
12 pages
Power BI AI
No ratings yet
Power BI AI
20 pages
AZ 305 Questions Answers File 4
No ratings yet
AZ 305 Questions Answers File 4
94 pages
DBMS MCQ
No ratings yet
DBMS MCQ
74 pages
Crime Abstract Criminal Investigation Tracker
No ratings yet
Crime Abstract Criminal Investigation Tracker
25 pages
Skills Matrix
No ratings yet
Skills Matrix
6 pages
Internet Download Manager
No ratings yet
Internet Download Manager
4 pages
En ES 8.5.2 Depl Book
No ratings yet
En ES 8.5.2 Depl Book
160 pages
Changebase 6.2.2 Releasenotes
No ratings yet
Changebase 6.2.2 Releasenotes
13 pages
70-466: Implementing Data Models and Reports With Microsoft SQL Server
No ratings yet
70-466: Implementing Data Models and Reports With Microsoft SQL Server
11 pages
SQL Questions
100% (1)
SQL Questions
28 pages
H3-Airflow Elastic ETL
No ratings yet
H3-Airflow Elastic ETL
22 pages
SQL Server 2014 AlwaysOn AG Failover
No ratings yet
SQL Server 2014 AlwaysOn AG Failover
22 pages
Atm Database System
57% (14)
Atm Database System
30 pages
Student Grade
No ratings yet
Student Grade
54 pages
Built-In Functions - Aggregate Functions - SQLServerPedia
No ratings yet
Built-In Functions - Aggregate Functions - SQLServerPedia
49 pages
2091 Building XML-Enabled Applications Using Microsoft SQL S
No ratings yet
2091 Building XML-Enabled Applications Using Microsoft SQL S
358 pages
Qlikview Hide Expression in Pivot Table
No ratings yet
Qlikview Hide Expression in Pivot Table
60 pages
Aveva Pdms
No ratings yet
Aveva Pdms
34 pages
Config_MS_SQL_2022_SP_us
No ratings yet
Config_MS_SQL_2022_SP_us
23 pages
SQL Practice Problems 57 Beginning, Intermediate, and Advanced Challenges For You To Solve Using A Learn-By-Doing Approach - Nodrm
100% (1)
SQL Practice Problems 57 Beginning, Intermediate, and Advanced Challenges For You To Solve Using A Learn-By-Doing Approach - Nodrm
349 pages
2023 Release Platform Specifications
No ratings yet
2023 Release Platform Specifications
13 pages
NJ SQL Best Practices V1 0 QuickStartGuide en 201504 P77I-E-01
No ratings yet
NJ SQL Best Practices V1 0 QuickStartGuide en 201504 P77I-E-01
16 pages