Cloud Computing Lab Manual

The document serves as a comprehensive lab manual for cloud computing, focusing on the installation and configuration of Hadoop and Eucalyptus, and detailing their applications in data processing and analytics. It discusses the challenges of Hadoop adoption, integration with resource management software, and provides a guide for deploying services in the cloud, including management and security considerations. Additionally, it outlines the importance of cloud strategy and security management, emphasizing the need for effective controls to safeguard cloud systems.


CLOUD COMPUTING LAB MANUAL

1. Installation and configuration of Hadoop/Eucalyptus etc.

Hadoop is a highly scalable, fault-tolerant distributed system for data storage and processing. The
scalability is the result of a self-healing, high-bandwidth clustered storage layer, known by the
acronym HDFS (Hadoop Distributed File System), and a specific fault-tolerant distributed
processing model, known as MapReduce.

Why Hadoop as part of the IT infrastructure?


Hadoop processes and analyzes a variety of new and older data to extract meaningful business
insight. Traditionally, data moves to the computation node; in Hadoop, data is processed where the
data resides. The types of questions Hadoop helps answer are:
 Event analytics: what series of steps led to a purchase or registration
 Large-scale web clickstream analytics
 Revenue assurance and price optimization
 Financial risk management and affinity engines
 Many others
A Hadoop cluster or cloud is disruptive in the data center. Some grid resource managers can be
integrated with Hadoop. The main advantage is that Hadoop jobs can then be submitted in an
orderly way from within the data center. See below the integration with Oracle Grid Engine.
What types of data do we handle today?
Human-generated data fits well into relational tables or arrays. Examples are conventional
transactions (purchase/sale, inventory/manufacturing, employment status changes, etc.). This is the
core data managed by OLTP relational DBMSs everywhere. In the last decade, humans have also
generated other kinds of data, such as text, documents, pictures, videos, and slideware.
Traditional relational databases are a poor home for this kind of data because:
 It often deals with opinions or aesthetic judgments; there is little concept of perfect
accuracy.
 There is little concept of perfect completeness.
 There is little concept of perfectly, unarguably accurate query results.
 Different people will have different opinions as to what comprises good results for a
search.
 There are no clear-cut binary answers; documents can have differing degrees of relevancy.
Another type of data is machine-generated data: machines that humans created produce
unstoppable streams of data, for example:
1. Computer logs
2. Satellite telemetry (espionage or science)
3. GPS outputs
4. Temperature and environmental sensors
5. Industrial sensors
6. Video from security cameras
7. Outputs from medical devices
8. Seismic and geophysical sensors
According to Gartner, enterprise data will grow 650% by 2014. 85% of this data will be
“unstructured data”, and this segment has a CAGR of 62% per year, far larger than that of
transactional data.

Example of Hadoop usage


Netflix (NASDAQ: NFLX) is a service offering online flat rate DVD and Blu-ray disc rental-by-
mail and video streaming in the United States. It has over 100,000 titles and 10 million subscribers.
The company has 55 million discs and, on average, ships 1.9 million DVDs to customers each day.
Netflix offers Internet video streaming, enabling the viewing of films directly on a PC or TV at
home.
Netflix’s movie recommendation algorithm uses Hive (built on Hadoop, HDFS and
MapReduce) for query processing and business intelligence. Netflix collects all logs from its
website; these streaming logs are collected using Honu.
They parse 0.6 TB of data on a 50-node cluster backed by Amazon S3. All data is processed for
business intelligence using a software package called MicroStrategy.

Hadoop challenges
Traditionally, Hadoop was aimed at developers. But the wide adoption and success of Hadoop
depend on business users, not developers.
Commercial distributions will have to make it even easier for business analysts to use Hadoop.
Templates for business scripts are a start, but getting away from scripting altogether should be the
long-term goal for the business user segment. This has not happened yet. Nevertheless, Cloudera is
trying to win the business user segment, and if they succeed they will create an enterprise Hadoop
market.
To best illustrate, here is a quote from the Yahoo! Hadoop development team:
“The way Yahoo! uses Hadoop is changing. Previously, most Hadoop users at Yahoo! were
researchers. Researchers are usually hungry for scalability and features, but they are fairly tolerant
of failures. Few scientists even know what "SLA" means, and they are not in the habit of counting
the number of nines in your uptime. Today, more and more of Yahoo! production applications have
moved to Hadoop. These mission-critical applications control every aspect of Yahoo!'s operation,
from personalizing user experience to optimizing ad placement. They run 24/7, processing many
terabytes of data per day. They must not fail. So we are looking for software engineers who want to
help us make sure Hadoop works for Yahoo! and the numerous Hadoop users outside Yahoo!”

Hadoop Integration with resource management cloud software


One such example is Oracle Grid Engine 6.2 Update 5. Cycle Computing has also announced an
integration with Hadoop. It reduces the cost of running Apache Hadoop applications by enabling
them to share resources with other data center applications, rather than having to maintain a
dedicated cluster for running Hadoop applications. Here is a relevant customer quote:
“The Grid Engine software has dramatically lowered for us the cost of data-intensive, Hadoop-
centered computing. With its native understanding of HDFS data locality and direct support for
Hadoop job submission, Grid Engine allows us to run Hadoop jobs within exactly the same
scheduling and submission environment we use for traditional scalar and parallel loads. Before, we
were forced to either dedicate specialized clusters or make use of convoluted, ad hoc integration
schemes; solutions that were both expensive to maintain and inefficient to run. Now we have the
best of both worlds: high flexibility within a single, consistent and robust scheduling system.”

Getting Started with Hadoop


Hadoop is an open source implementation of the MapReduce algorithm and a distributed file
system. Hadoop is primarily developed in Java. Writing a Java application will give you much
more control and presumably better performance. However, Hadoop can also be used from other
environments, including scripting languages, via “streaming”. Streaming applications simply read
data from stdin and write their output to stdout.
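As a sketch of the streaming model, a minimal word-count mapper might look like the script below. The tab-separated (word, 1) output convention is Hadoop Streaming's default; the script name and any job wiring are illustrative.

```python
#!/usr/bin/env python3
# Minimal Hadoop Streaming mapper sketch: reads lines from stdin,
# emits tab-separated (word, 1) pairs on stdout.
import sys

def map_line(line):
    """Turn one input line into a list of (word, count) pairs."""
    return [(word, 1) for word in line.strip().split()]

if __name__ == "__main__":
    for line in sys.stdin:
        for word, count in map_line(line):
            print(f"{word}\t{count}")
```

Hadoop would invoke such a script through the streaming jar shipped with the distribution (the exact jar path depends on your installation); a companion reducer script would then sum the counts per word.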

Installing Hadoop
To install Hadoop, you will need to download Hadoop Common (also referred to as Hadoop Core)
from http://hadoop.apache.org/common/. The binaries are open source, available under the
Apache License. Once you have downloaded Hadoop Common, follow the installation and
configuration instructions.

Hadoop With Virtual Machine


If you have no experience playing with Hadoop, there is an easier way to install and experiment
with it. Rather than installing a local copy of Hadoop, install a virtual machine from Yahoo!. The
virtual machine comes with Hadoop pre-installed and pre-configured and is almost ready to use.
The virtual machine is available from their Hadoop tutorial, which includes well-documented
instructions for running the virtual machine and running Hadoop applications. The virtual machine,
in addition to Hadoop, includes the Eclipse IDE for writing Java-based Hadoop applications.

Hadoop Cluster
By default, Hadoop distributions are configured to run on a single machine, and the Yahoo! virtual
machine is a good way to get going. However, the power of Hadoop comes from its inherent
distributed nature, and deploying distributed computing on a single machine misses its very point.
For any serious processing with Hadoop, you’ll need many more machines. Amazon’s Elastic
Compute Cloud (EC2) is perfect for this. An alternative to running Hadoop on EC2 is to use
the Cloudera distribution. And of course, you can set up your own Hadoop cluster by following
the Apache instructions.

Resources
There is a large, active developer community that has created many higher-level languages and
tools, such as HBase, Hive, Pig, and others. Cloudera has a supported distribution.
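Following the Apache instructions to move beyond the single-machine default typically starts with pointing HDFS at a NameNode in conf/core-site.xml. The fragment below is an illustrative sketch of a pseudo-distributed setup; property names and the port vary between Hadoop versions.

```xml
<!-- conf/core-site.xml: illustrative pseudo-distributed setup -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```

For a real multi-node cluster, the same property would name the cluster's NameNode host instead of localhost.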
2. Service deployment & usage over the cloud.

In the Management Portal, click Cloud Services. Then click the name of the cloud service to open
the dashboard.
1. Click Quick Start (the icon to the left of Dashboard) to open the Quick Start page, shown
below. (You can also deploy your cloud service by using Upload on the dashboard.)

2. If you haven't installed the Windows Azure SDK, click Install Azure SDK to open the
Windows Azure Downloads page, and then download the SDK for the language in which
you prefer to develop your code.
On the downloads page, you can also install client libraries and source code for developing
web apps in Node.js, Java, PHP, and other languages, which you can deploy as scalable
Windows Azure cloud services.
Note: For cloud services created earlier (previously known as hosted services), you'll need to
make sure the guest operating systems on the virtual machines (role instances) are
compatible with the Windows Azure SDK version you install. For more information, see the
Windows Azure SDK release notes.
3. Click either New Production Deployment or New Staging Deployment.
If you'd like to test your cloud service in Windows Azure before deploying it to production,
you can deploy to staging. In the staging environment, the cloud service's globally unique
identifier (GUID) identifies the cloud service in URLs (GUID.cloudapp.net). In the
production environment, the friendlier DNS prefix that you assign is used (for example,
myservice.cloudapp.net). When you're ready to promote your staged cloud service to
production, use Swap to redirect client requests to that deployment.
When you select a deployment environment, Upload a Package opens.
4. In Deployment name, enter a name for the new deployment - for example,
MyCloudServicev1.
5. In Package, use Browse to select the service package file (.cspkg) to use.
6. In Configuration, use Browse to select the service configuration file (.cscfg) to use.
7. If the cloud service will include any roles with only one instance, select the Deploy even if
one or more roles contain a single instance check box to enable the deployment to
proceed.
Windows Azure can only guarantee 99.95 percent access to the cloud service during
maintenance and service updates if every role has at least two instances. If needed, you can
add additional role instances on the Scale page after you deploy the cloud service. For more
information, see Service Level Agreements.
8. Click OK (checkmark) to begin the cloud service deployment.
You can monitor the status of the deployment in the message area. Click the down arrow to
hide the message.
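As a sketch, the service configuration (.cscfg) file selected in step 6 declares each role and its instance count. The service name, role name, and OS settings below are illustrative; the two-instance count reflects the availability guidance in step 7.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative .cscfg: two instances of a single web role. -->
<ServiceConfiguration serviceName="MyCloudService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"
    osFamily="4" osVersion="*">
  <Role name="WebRole1">
    <Instances count="2" />
  </Role>
</ServiceConfiguration>
```

Raising the count on the Scale page after deployment edits this same setting without a redeploy.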

To verify that your deployment completed successfully


1. Click Dashboard.
2. Under quick glance, click the site URL to open your cloud service in a web browser.
3. Management of cloud resources.

In theory, resources based on cloud computing services should be no different from the resources in
your own environment, except that they live remotely. Ideally, you have a complete view of the
cloud computing resources you use today or may want to use in the future.
In most cloud environments, the customer is able to access only the services they’re entitled to use.
Entire applications may be used on a cloud services basis. Development tools are sometimes cloud
based. In fact, testing and monitoring environments can be based on the cloud.
Performance management is all about how your software services run effectively inside your own
environment and through the cloud.
If you start to connect software that runs in your own data center directly to software that runs in the
cloud, you create a potential bottleneck at the point of connection.
Services connected between the cloud and your computing environment can impact performance if
they aren’t well planned. This is especially likely to be the case if there are data translations or
specific protocols to adhere to at the cloud gateway.
As a customer, your ability to directly control the resources will be much lower in the cloud.
Therefore,
 The connection points between various services must be monitored in real time. A
breakdown may impact your ability to provide a business process to your customers.
 There must be expanded bandwidth at connection points.
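As an illustrative sketch of the real-time monitoring point above (the endpoint URLs, timeout, and health-check convention are all assumptions, not part of any particular provider's API), a connection-point monitor could simply poll each gateway and flag breakdowns:

```python
# Sketch of a connection-point monitor: poll each service endpoint
# and report which connection points are down.
from urllib.request import urlopen
from urllib.error import URLError

ENDPOINTS = [
    "https://cloud.example.com/gateway/health",  # hypothetical cloud gateway
    "https://dc.example.com/internal/health",    # hypothetical data center side
]

def check(url, fetch=urlopen):
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with fetch(url, timeout=5) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False

def broken_links(endpoints, fetch=urlopen):
    """List the connection points that are currently failing."""
    return [url for url in endpoints if not check(url, fetch)]
```

In practice such a loop would run on a schedule and raise an alert, since a broken connection point can take a whole business process down with it.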
With Software as a Service (SaaS), a customer expects provisioning (to request a resource for
immediate use) of extra services to be immediate, automatic, and effortless. The cloud service
provider is responsible for maintaining an agreed-on level of service and provisions resources
accordingly.
The normal situation in a data center is that software workloads vary throughout the day, week,
month, and year. So the data center has to be built for the maximum possible workload, with a little
bit of extra capacity thrown in to cover unexpectedly high peaks.
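The "maximum workload plus a little extra" rule above can be made concrete with a small sketch; the hourly workload figures and the 20% headroom factor are invented for illustration:

```python
# Sketch: size a data center for peak workload plus headroom.
# Workload figures and the 20% headroom are illustrative.

def required_capacity(workloads, headroom=0.20):
    """Capacity = peak observed workload plus a safety margin."""
    return max(workloads) * (1 + headroom)

hourly_load = [120, 180, 450, 900, 870, 300]  # e.g. requests/sec over a day
capacity = required_capacity(hourly_load)      # peak of 900 plus 20% headroom
```

The point of the cloud model is that this fixed peak-sized capacity can instead be provisioned and released as the workload varies.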
Service management in this context covers all the data center operations activities. This broad
discipline considers the necessary techniques and tools for managing services by both cloud
providers and the internal data center managers across these physical, IT and virtual environments.
Service management encompasses many different disciplines, including
 Configuration management
 Asset management
 Network management
 Capacity planning
 Service desk
 Root cause analysis
 Workload management
 Patch and update management
The cloud itself is a service management platform. Well-designed cloud service portfolios include a
tight integration of the core service management capabilities and well-defined interfaces.
4. Using existing cloud characteristics & Service models

Cloud Adoption Strategy Services

The Cloud Adoption Strategy Services recognize the importance of developing a cloud strategy
with expert guidance and preparing a business justification based on requirements, business and
financial needs, and other success factors. These services include:

Cloud Strategy Workshop for XaaS Adoption: The Cisco Cloud Adoption Strategy Workshop
utilizes a collaborative discussion process to examine and evaluate industry-leading practices
around cloud adoption, as well as identify the areas of interest and importance for you to
successfully adopt a cloud model. This two- to four-hour session, which can be virtual or face to
face with Cisco cloud subject matter experts, as well as partners if applicable, helps you to frame
and understand your current situation, challenges, implications, and benefits before migration to a
cloud model. The workshop also introduces a cloud migration approach recommended by Cisco for
your environment and your business needs and goals.

Cloud Strategy and Business Justification Service for XaaS Adoption: The Cloud Adoption
Strategy and Business Justification Service introduces an interview format and process as a forum
for assessing and evaluating your application, network, compute, and storage architectures. The
emphasis of the service is on the application portfolio as well as gathering success factors; cloud
use cases; and financial, business, and technology requirements from your business and IT teams.
In addition to collecting these requirements, the service provides the business case and justification
for a cloud migration and discovers any business and/or technical effects that would result from
implementing the cloud model. Finally, this service helps define the risk and dependency analysis
for your cloud model.

The services address:
 Your state of readiness to adopt a cloud model and the challenges, implications, and
benefits of adopting a cloud model
 How a cloud model aligns to or affects your business, IT goals, and operational objectives
 How a cloud model affects business partners, such as vendors, suppliers, resellers, and
customers
 How a cloud model can enable you to provide new user services with optimal service
delivery or consume new business services from others

5. Cloud Security Management.

Cloud Security Controls


Cloud security architecture is effective only if the correct defensive implementations are in place.
An efficient cloud security architecture should recognize the issues that will arise with security
management. Security management addresses these issues with security controls, which are put in
place to safeguard any weaknesses in the system and reduce the effect of an attack. While there are
many types of controls behind a cloud security architecture, they usually fall into one of the
following categories:

Deterrent Controls
These controls are set in place to prevent any purposeful attack on a cloud system. Much like a
warning sign on a fence or a property, these controls do not reduce the actual vulnerability of a
system.
Preventative Controls
These controls upgrade the strength of the system by managing the vulnerabilities. The preventative
control will safeguard vulnerabilities of the system. If an attack were to occur, the preventative
controls are in place to cover the attack and reduce the damage and violation to the system's
security.

Corrective Controls
Corrective controls are used to reduce the effect of an attack. Unlike the preventative controls, the
corrective controls take action as an attack is occurring.

Detective Controls
Detective controls are used to detect any attacks that may be occurring to the system. In the event of
an attack, the detective control will signal the preventative or corrective controls to address the
issue.
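As an illustrative sketch (the control logic, thresholds, and event format are assumptions, not a standard API), the detective-to-corrective signaling described above could be modeled like this:

```python
# Sketch: a detective control detects an event and signals a
# corrective control, mirroring the categories described above.

def detective_control(event):
    """Flag events that look like an attack in progress."""
    return event.get("failed_logins", 0) > 100  # illustrative threshold

def corrective_control(event, log):
    """Reduce the effect of the attack, e.g. block the source."""
    log.append(f"blocked {event['source_ip']}")

def handle(event, log):
    """On detection, the detective control triggers the corrective one."""
    if detective_control(event):
        corrective_control(event, log)
    return log
```

Deterrent and preventative controls would sit in front of this pipeline, discouraging attacks and closing vulnerabilities before any event is ever raised.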

Dimensions of cloud security


Correct security controls should be implemented according to asset, threat, and vulnerability risk
assessment matrices. While cloud security concerns can be grouped into any number of dimensions
(Gartner names seven, while the Cloud Security Alliance identifies fourteen areas of concern),
these dimensions have been aggregated into three general areas: Security and Privacy, Compliance,
and Legal or Contractual Issues.

Security and privacy


Identity management
Every enterprise will have its own identity management system to control access to
information and computing resources. Cloud providers either integrate the customer’s identity
management system into their own infrastructure, using federation or SSO technology, or
provide an identity management solution of their own.

Physical and personnel security


Providers ensure that physical machines are adequately secure and that access to these
machines, as well as to all relevant customer data, is not only restricted but also
documented.

Availability
Cloud providers assure customers that they will have regular and predictable access to their
data and applications.

Application security
Cloud providers ensure that applications available as a service via the cloud are secure by
implementing testing and acceptance procedures for outsourced or packaged application code.
Application security also requires that security measures be in place in the production environment.

Privacy
Finally, providers ensure that all critical data (credit card numbers, for example) are masked
and that only authorized users have access to data in its entirety. Moreover, digital identities
and credentials must be protected as should any data that the provider collects or produces
about customer activity in the cloud.
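As a sketch of the masking idea (the keep-last-four rule is illustrative, not any provider's actual scheme), a credit card number might be stored or displayed with only its trailing digits visible:

```python
# Sketch: mask all but the last four digits of a card number.
# The masking rule (keep the last 4 digits) is illustrative.

def mask_card(number, visible=4):
    """Replace all but the trailing `visible` digits with '*'."""
    digits = number.replace(" ", "").replace("-", "")
    return "*" * (len(digits) - visible) + digits[-visible:]

masked = mask_card("4111 1111 1111 1111")  # "************1111"
```

Authorized users would retrieve the full value through a separate, access-controlled path, keeping the masked form as the default everywhere else.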
Legal issues
In addition, providers and customers must consider legal issues, such as contracts and e-discovery,
and the related laws, which may vary by country.

Legal and contractual issues


Aside from the security and compliance issues enumerated above, cloud providers and their
customers will negotiate terms around liability (stipulating how incidents involving data loss or
compromise will be resolved, for example), intellectual property, and end-of-service (when data and
applications are ultimately returned to the customer).

Public records
Legal issues may also include records-keeping requirements in the public sector, where many
agencies are required by law to retain and make available electronic records in a specific fashion.
This may be determined by legislation, or law may require agencies to conform to the rules and
practices set by a records-keeping agency. Public agencies using cloud computing and storage must
take these concerns into account.
