UNIT-1: Overview of Grid Computing
CASE STUDIES:
➢ Case study-1: The Next-Generation Fabric
Intel® Omni-Path Architecture (Intel® OPA), an element of Intel® Scalable System Framework, delivers the
performance for tomorrow’s high performance computing (HPC) workloads and the ability to scale to tens of
thousands of nodes—and eventually more—at a price competitive with today’s fabrics. The Intel OPA 100 Series
product line is an end-to-end solution of PCIe* adapters, silicon, switches, cables, and management software. As the
successor to Intel® True Scale Fabric, this optimized HPC fabric is built upon a combination of enhanced IP and
Intel® technology.
For software applications, Intel OPA will maintain consistency and compatibility with existing Intel True Scale
Fabric and InfiniBand* APIs by working through the open source OpenFabrics Alliance (OFA) software stack on
leading Linux* distribution releases. Intel True Scale Fabric customers will be able to migrate to Intel OPA through
an upgrade program.
The Future of High Performance Fabrics
Current standards-based high performance fabrics, such as InfiniBand*, were not originally designed for HPC,
resulting in performance and scaling weaknesses that are currently impeding the path to Exascale computing. Intel®
Omni-Path Architecture is being designed specifically to address these issues and scale cost-effectively from entry
level HPC clusters to larger clusters with 10,000 nodes or more. To improve on the InfiniBand specification and
design, Intel is using the industry’s best technologies including those acquired from QLogic and Cray alongside
Intel® technologies.
While both Intel OPA and InfiniBand Enhanced Data Rate (EDR) will run at 100Gbps, there are many differences.
The enhancements of Intel OPA will help enable the progression towards Exascale while cost-effectively supporting
clusters of all sizes with optimization for HPC applications at both the host and fabric levels for benefits that are not
possible with the standard InfiniBand-based designs.
Intel OPA is designed to provide:
• Features and functionality at both the host and fabric levels to greatly raise levels of scaling
• CPU and fabric integration necessary for the increased computing density, improved reliability, reduced power, and
lower costs required by significantly larger HPC deployments
• Fabric tools to readily install, verify, and manage fabrics at this level of complexity
➢ Case study-2: Optimal Workload Performance Meets Intelligent Orchestration
The powerful new Intel® Xeon® processor E5-2600 v4 product family offers versatility across diverse workloads.
These processors are designed for architecting next-generation data centers running on software-defined
infrastructure, supercharged for efficiency, performance, and agile services delivery across cloud-native and
traditional applications. They support workloads for cloud, high-performance computing, networking, and storage.
The Intel® Xeon® processor E5-4600 v4 product family delivers the compute horsepower in a 4-socket-based dense
form factor. This processor product family provides high-density, energy-efficient compute resources to support
larger workloads and high virtual machine densities in your data center or cloud. These 4-socket server platforms
give you more options and greater flexibility for scaling your infrastructure and growing your business.
The Intel® Xeon® processor E5-1600 v4 product family provides a professional, high-
performance workstation platform ideal for efficient multitasking, advanced model generation, and complex
applications.
➢ Case study-3: IBM Elastic Storage Server (ESS)
IBM Elastic Storage Server is a modern implementation of software defined cluster storage, combining IBM
Spectrum Scale™ software with POWER8 servers and disk arrays. Deploy petascale class high-speed storage
quickly with pre-assembled and optimized servers, storage and software.
➢ Case study-4: IBM Power System S822LC for high performance computing
The IBM Power System S822LC is built on industry standards and incorporates innovation from the OpenPOWER
Foundation ecosystem, including up to 2 NVIDIA® Tesla® GPU Accelerators and Mellanox® InfiniBand. The
Power S822LC delivers faster time to insight by pairing the built-for-big-data architecture of POWER8 and
accelerator performance. Also available without GPU accelerators as IBM Power System S822LC for commercial
computing.
➢ Case study-5: In India, NETWEB Technology and HP/Wipro are working in HPC.
4. CLUSTER COMPUTING
A computer cluster consists of a set of loosely or tightly connected computers that work together so that, in many
respects, they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to
perform the same task, controlled and scheduled by software.
The components of a cluster are usually connected to each other through fast local area networks ("LAN"), with
each node (computer used as a server) running its own instance of an operating system. In most circumstances, all of
the nodes use the same hardware and the same operating system, although in some setups (e.g. using Open Source
Cluster Application Resources (OSCAR)), different operating systems can be used on each computer, and/or
different hardware.
They are usually deployed to improve performance and availability over that of a single computer, while typically
being much more cost-effective than single computers of comparable speed or availability.
Computer clusters emerged as a result of convergence of a number of computing trends including the availability of
low-cost microprocessors, high speed networks, and software for high-performance distributed computing. They
have a wide range of applicability and deployment, ranging from small business clusters with a
handful of nodes to some of the fastest supercomputers in the world such as IBM's Sequoia.
The desire to get more computing power and better reliability by orchestrating a number of low-cost commercial
off-the-shelf computers has given rise to a variety of architectures and configurations.
The computer clustering approach usually (but not always) connects a number of readily available computing nodes
(e.g. personal computers used as servers) via a fast local area network. The activities of the computing nodes are
orchestrated by "clustering middleware", a software layer that sits atop the nodes and allows the users to treat the
cluster as by and large one cohesive computing unit, e.g. via a single system image concept.
Computer clustering relies on a centralized management approach which makes the nodes available as orchestrated
shared servers. It is distinct from other approaches such as peer to peer or grid computing which also uses many
nodes, but with a far more distributed nature.
A computer cluster may be a simple two-node system which just connects two personal computers, or may be a very
fast supercomputer. A basic approach to building a cluster is that of a Beowulf cluster which may be built with a few
personal computers to produce a cost-effective alternative to traditional high performance computing. An early
project that showed the viability of the concept was the 133-node Stone Supercomputer. The developers used Linux,
the Parallel Virtual Machine toolkit and the Message Passing Interface library to achieve high performance at a
relatively low cost.
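To make the message-passing idea above concrete, here is a minimal sketch of an MPI job on a Beowulf-style cluster. It assumes the mpi4py package and an MPI runtime (e.g. Open MPI) are installed on the nodes; the partial-sum computation is purely illustrative and is not taken from the Stone Supercomputer project.

```python
# A minimal message-passing sketch for a Beowulf-style cluster, assuming the
# mpi4py package and an MPI runtime are available on every node.
from mpi4py import MPI

comm = MPI.COMM_WORLD          # all processes launched by mpirun
rank = comm.Get_rank()         # this process's id within the cluster job
size = comm.Get_size()         # total number of processes

# Each process computes a partial sum; rank 0 gathers and reduces the results.
partial = sum(range(rank * 1000, (rank + 1) * 1000))
total = comm.reduce(partial, op=MPI.SUM, root=0)

if rank == 0:
    print(f"{size} processes computed total = {total}")
```

Launched with something like `mpirun -np 4 python partial_sum.py`, each node works on its own slice of the problem and only the reduced result crosses the network, which is the pattern PVM and MPI made practical on commodity clusters.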
5. PEER-TO-PEER COMPUTING
Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or
workloads between peers. Peers are equally privileged, equipotent participants in the application. They are said to
form a peer-to-peer network of nodes.
Peers make a portion of their resources, such as processing power, disk storage or network bandwidth, directly
available to other network participants, without the need for central coordination by servers or stable hosts. [1] Peers
are both suppliers and consumers of resources, in contrast to the traditional client-server model in which the
consumption and supply of resources is divided. Emerging collaborative P2P systems are going beyond the era of
peers doing similar things while sharing resources, and are looking for diverse peers that can bring in unique
resources and capabilities to a virtual community thereby empowering it to engage in greater tasks beyond those that
can be accomplished by individual peers, yet that are beneficial to all the peers.
While P2P systems had previously been used in many application domains, the architecture was popularized by the
file sharing system Napster, originally released in 1999. The concept has inspired new structures and philosophies in
many areas of human interaction. In such social contexts, peer-to-peer as a meme refers to the egalitarian social
networking that has emerged throughout society, enabled by Internet technologies in general.
While P2P networks open a new channel for efficient downloading and sharing of files and data, users need to be
fully aware of the security threats associated with this technology. Security measures and adequate prevention
should be implemented to avoid any potential leakage of sensitive and/or personal information, and other security
breaches. Before deciding to open firewall ports to allow for peer-to-peer traffic, system administrators should
ensure that each request complies with the corporate security policy and should only open a minimal set of firewall
ports needed to fulfill P2P needs. For end users, including home users, care must be taken to avoid any possible
spread of viruses over the peer-to-peer network.
MANAGEMENT CONSIDERATIONS
TRENDS AND IMPACT
The first appearance of publicly available P2P systems such as Napster in 1999 radically changed file-sharing mechanisms.
The traditional client-server file sharing and distribution approach using protocols like FTP (File Transfer
Protocol) was supplemented with a new alternative — P2P networks. At the time, Napster was used extensively
for the sharing of music files. Napster was shut down in mid-2001 due to legal action by the major record labels.
The shutting of Napster did not stop the growth of P2P applications. A number of publicly available P2P systems
have appeared in the past few years, including Gnutella, KaZaA, WinMX and BitTorrent, to name but a few.
Analysis of P2P traffic in 2007 showed that BitTorrent was still the most popular file sharing protocol, accounting
for 50-75% of all P2P traffic and roughly 40% of all Internet traffic.
P2P technology is not just used for media file sharing. For example, in the bioinformatics research community, a
P2P service called Chinook has been developed to facilitate the exchange of analysis techniques. The technology is
also used in other areas including IP-based telephone networks, such as Skype, and television networks, such as
PPLive. Skype allows people to chat, make phone calls or make video calls. When launched, each Skype client
acts as a peer, building and refreshing a table of reachable nodes in order to communicate for chat, phone
calls or video calls. PPLive shares live television content. Each peer downloads and redistributes live television
content from and to other peers.
GOVERNANCE AND REGULATIONS
In the U.S., a number of politicians have raised concerns about possible threats to national security due to P2P
network technology. The possibility of accidental leaks of classified information by government officers to
foreign governments, terrorists or organized crime via P2P file sharing programs has prompted a view that “new
laws and rules should be enacted to protect personal information held by federal agencies and other
organizations”. The proposal does not restrict P2P networks as a whole, but attempts to strike “a balance that
protects sensitive government, personal and corporate information and copyright laws”.
A P2P network itself is only a form of technology, and is not related to disputes over content and intellectual
property rights. However, there have been court cases in Hong Kong against illegal P2P activities. In 2005, a
Hong Kong resident was convicted of breaching the Copyright Ordinance by uploading illegal copies of
copyrighted works to the Internet using the BitTorrent peer-to-peer file sharing program, and making files
available for download by other Internet users.
SECURITY CONSIDERATIONS
CLASSIFICATION OF P2P NETWORKS
P2P networks can be roughly classified into two types — “pure P2P networks” and “hybrid P2P networks”. In a
pure P2P network, all participating peers are equal, and each peer plays both the role of client and of server. The
system does not rely on a central server to help control, coordinate, or manage the exchanges among the peers.
Gnutella and Freenet are examples of pure P2P networks.
In a hybrid P2P network, a central server exists to perform certain “administrative” functions to facilitate P2P
services. For example, in Napster, a server helps peers to “search for particular files and initiate a direct transfer
between the clients”. Only a catalogue of available files is kept on the server, while the actual files are scattered
across the peers on the network. Another example is BitTorrent (BT), where a central server called a tracker helps
coordinate communication among BT peers in order to complete a download.
The central distinction between the two types of P2P network is that hybrid P2P networks have a central entity to
perform certain administrative functions while there is no such server in pure P2P networks. Compared to the
hybrid P2P architecture, the pure P2P architecture is simpler and has a higher level of fault tolerance. On the other
hand, the hybrid P2P architecture consumes less network resources and is more scalable than the pure P2P
approach.
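The hybrid case can be sketched in a few lines: a central index (playing the role of Napster's catalogue or a BitTorrent tracker) stores only metadata about which peers hold which files, while the files themselves stay on the peers. The class, file names and peer addresses below are hypothetical and are not taken from any real implementation.

```python
# A minimal sketch of a hybrid P2P central index: the server keeps only a
# catalogue of file locations, never the file contents. All names are made up.

class CentralIndex:
    """Catalogue mapping file names to the peers that hold them."""
    def __init__(self):
        self.catalogue = {}                      # filename -> set of peer addresses

    def announce(self, filename, peer_addr):
        """A peer registers a file it is willing to share."""
        self.catalogue.setdefault(filename, set()).add(peer_addr)

    def lookup(self, filename):
        """Return the peers that can serve the file; the transfer itself is peer-to-peer."""
        return sorted(self.catalogue.get(filename, set()))

index = CentralIndex()
index.announce("genome.dat", "10.0.0.5:6881")
index.announce("genome.dat", "10.0.0.9:6881")
print(index.lookup("genome.dat"))    # ['10.0.0.5:6881', '10.0.0.9:6881']
```

In a pure P2P network there is no such index; queries are resolved among the peers themselves (for example by flooding, as in early Gnutella), which removes the single point of failure at the cost of extra network traffic.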
SECURITY THREATS
A P2P network treats every user as a peer. In file sharing protocols such as BT, each peer contributes to service
performance by uploading files to other peers while downloading. This opens a channel for files stored in the user
machine to be uploaded to other foreign peers.
The potential security risks include:
1. TCP ports issues: Usually, P2P applications need the firewall to open a number of ports in order to function
properly. BitTorrent, for example, will use TCP ports 6881-6889 (prior to version 3.2). The range of TCP ports
has been extended to 6881-6999 as of 3.2 and later. Each open port in the firewall is a potential avenue that
attackers might use to exploit the network. It is not a good idea to open a large number of ports in order to allow
for P2P networks.
2. Propagation of malicious code such as viruses: As P2P networks facilitate file transfer and sharing, malicious
code can exploit this channel to propagate to other peers. For example, a worm called VBS.Gnutella was detected
in 2000 which propagated across the Gnutella file sharing network by making and sharing a copy of itself in the
Gnutella program directory. Trojan horses have also been found over P2P networks. An example is W32/Inject-H,
which contained an IRC backdoor Trojan that utilized P2P networks to propagate itself. The Trojan would open a
backdoor in a user’s Windows PC to allow a remote intruder access and control of the computer. Theoretically
speaking, sensitive and personal information stored in the infected computer could be copied to other machines on
the P2P network.
3. Risks of downloaded content: When a file is downloaded using the P2P software, it is not possible to know
who created the file or whether it is trustworthy. In addition to the risks of viruses or malicious code associated
with the file, the person downloading the file might also be exposed to criminal and/or civil litigation if any illegal
content is downloaded to a company machine. Also, when downloading via a P2P network, it is not possible to
know which peers are connected at any one time or whether these peers are trustworthy. Untrusted sources
pose an additional security threat.
4. Vulnerability in P2P software: Like any software, P2P software is vulnerable to bugs. As each peer is both a
client and a server, it constantly receives requests from other peers, and if the server component of the P2P
software is buggy, it could introduce certain vulnerabilities to a user’s machine. Intruders could exploit this to
spread viruses, hack into a machine, or even launch a denial of service attack. It was reported in 2003 that a bug in
the P2P software Kazaa Media Desktop could cause a denial of service attack, or allow a remote attacker to
execute arbitrary code.
In addition to general security risks, the use of P2P applications in a company network situation could generate an
unnecessarily large amount of network traffic, monopolizing network bandwidth that should be available for other
business applications. The time spent by employees in dealing with the effects of P2P download or upload will
affect employee productivity and the organization’s bottom line.
6. INTERNET COMPUTING
What exactly is Internet computing? Many people think they know, but they are surprised to learn that they don't.
Are you one of the select few who has a grasp of the subject?
Retailers today face a competitive marketplace with unprecedented challenges and opportunities: increasing labor
costs, blurring of market segmentation, reduced customer loyalty. You know the list. But now there is a whole
new set of challenges brought on by the Internet:
▪ Retailers are being "dot-commed" right out of their markets.
▪ Price visibility is allowing customers instant access to the lowest cost merchant.
▪ Manufacturers and new competitors are removing some retailers from the supply chain altogether.
▪ Online auctions have fundamentally changed the way merchandise is sold and purchased. This list continues to
grow as people think of more and more ways to leverage the Internet.
E-Business Or Out Of Business
It is a new world. For brick-and-mortar retailers in particular, the Internet is creating enormous disruption. But, it is also
presenting unprecedented opportunities for those who understand the use, implications, and terminology of Internet
technologies, and for those who move quickly and intelligently to become an e-business themselves. Increasingly, the
choice facing retailers is simple: it's e-business or out of business. Unfortunately, it's not as simple as deciding to
become an e-business. Terms like e-business, e-commerce, Web-deployed, Internet-enabled, customer relationship
management, and the like all seem to have different meanings to each retailer and software vendor. For retailers, one
fundamental term that must be clearly understood to succeed is the true meaning of the words "Internet computing."
Why? Because the differences between true Internet computing, and the faux offerings that mimic the look of true
Internet computing, are subtle to the untrained eye. However, they are dramatic in the capabilities and benefits they
provide.
Defining Internet Computing
Internet computing is the foundation on which e-business runs. It is the only architecture that can run all facets of
business, from supplier collaboration and merchandise purchasing, to distribution and store operations, to customer
sales and service. Internet computing is the only architecture that supports all information flows and processes over the
Internet — providing access to all applications. With Internet computing, all a user needs is a standard Web browser
and security clearance. The Internet computing model represents a fundamental shift from the traditional client/server
enterprise application model. The four-walled efficiency that was once the goal of monolithic enterprise resource
planning implementations — known as business process redesign (BPR) — has been replaced. The new environment is
one in which economic gains are a result of systems efficiencies and collaboration across the extended network of
customers, retailers, manufacturers, and suppliers.
Shift In Focus
There are three tiers in true Internet computing. These three tiers provide the benefit of centralized data that supports a
unified view of the retailer's financial, human resources, inventory, logistics, trading partner, and customer information.
The business logic at the next layer accesses and transacts the data. The user interface is a simple, non-proprietary Web
browser. No complexity resides on the users' device, which can be anything from a PC to a mobile phone, or even a
uniquely purposed mobile unit. (Note: A "tool set" that supports writing in multiple languages allows Web
deployment functionality to occur within the application server.)
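As a rough illustration of this three-tier split, the sketch below uses only the Python standard library: an in-memory dictionary stands in for the centralized data tier, one function plays the business-logic tier, and any standard Web browser pointed at the URL is the client tier. All names, data and the port number are invented for illustration and do not describe any particular retail system.

```python
# A hedged sketch of the three-tier model: data tier, business-logic tier, and
# a plain browser as the client tier. Names and data are hypothetical.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Tier 1: centralized data (stand-in for the retailer's database).
INVENTORY = {"sku-1001": {"item": "jacket", "stock": 42},
             "sku-1002": {"item": "boots", "stock": 7}}

# Tier 2: business logic that accesses and transacts the data.
def low_stock(threshold=10):
    return {sku: rec for sku, rec in INVENTORY.items() if rec["stock"] < threshold}

# Tier 3: the user interface is just a browser hitting this HTTP endpoint.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(low_stock()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), Handler).serve_forever()
```

The point of the sketch is that no application logic lives on the user's device; the browser only renders what the middle tier returns.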
1. Increasing Capabilities: Computing power has a doubling time of about 18 months, storage device capacity
doubles every 12 months and communication speed doubles every 9 months (a worked example of these rates
follows this list). Because these rates of increase differ, significant power is available in every area, but the
traditional implementation trade-offs keep changing. This increasing capability has resulted in work to define new
ways to interconnect and manage computing resources.
2. New Application Requirements: New applications are being developed in physics, biology, astronomy,
visualization, digital image processing, meteorology, etc. Many of the anticipated applications have
communication requirements among only a small number of sites. This is a very different requirement from that of
“typical” Internet use, where communication is spread among many sites. Applications can be divided into
three classes: 1) lightweight “classical” Internet applications (mail, browsing), 2) medium applications (business,
streaming, VPN) and 3) heavyweight applications (e-science, computing, data grids, and virtual presence). The
total bandwidth estimate for all users of each class of network application is 20 Gb/sec for the lightweight
Internet, 40 Gb/sec for all users of the intermediate class of applications and 100 Gb/sec for the heavyweight
applications. Note that the heavyweight applications use significantly more bandwidth than the total bandwidth of
all applications on the classical Internet. Different application types value different capabilities: lightweight
applications value interconnectivity, middleweight applications value throughput and QoS, and heavyweight
applications value throughput and raw performance.
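As the worked example promised in point 1 above, the short sketch below turns the quoted doubling times into growth factors over a 36-month period; the arithmetic follows directly from the stated doubling times and nothing else is assumed.

```python
# Growth factors over 36 months for the doubling times quoted above:
# compute 18 months, storage 12 months, communication speed 9 months.
months = 36
for resource, doubling_time in [("compute", 18), ("storage", 12), ("communication", 9)]:
    growth = 2 ** (months / doubling_time)
    print(f"{resource:14s} grows {growth:.0f}x in {months} months")

# compute        grows 4x in 36 months
# storage        grows 8x in 36 months
# communication  grows 16x in 36 months
```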
CONGESTION CONTROL
Moving bulk data quickly over high-speed data networks is a requirement for many applications. These
applications require high-bandwidth links between network nodes. To maintain the stability of the Internet, all
applications should be subject to congestion control. TCP is a well-developed, extensively used and widely
available Internet transport protocol. TCP is fast, efficient and responsive to network congestion conditions, but
one objection to using TCP congestion control is that TCP’s AIMD (additive increase, multiplicative decrease)
back-off algorithm is too abrupt in decreasing the window size, and thus hurts the data rate.
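To illustrate the AIMD behaviour objected to above, the following sketch simulates a congestion window that grows by one segment per round trip and is halved on each loss. The loss pattern is artificial and the model deliberately ignores slow start, timeouts and all other TCP details.

```python
# A simplified AIMD sketch: additive increase of one segment per RTT,
# multiplicative decrease (halving) on loss. The loss rounds are artificial.
def aimd(rounds, loss_rounds, cwnd=1.0):
    history = []
    for r in range(rounds):
        if r in loss_rounds:
            cwnd = max(1.0, cwnd / 2)   # the abrupt back-off that hurts the data rate
        else:
            cwnd += 1.0                  # one extra segment per round trip
        history.append(cwnd)
    return history

print(aimd(rounds=12, loss_rounds={6, 9}))
```

The sawtooth shape of the resulting window is why a single loss on a long, fast path can take many round trips to recover from.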
The performance of the congestion control system, the TCP algorithm and the link congestion signalling algorithm
has many facets, and these variables can impact the Quality of Service (QoS).
A variety of metrics are used to describe this performance:
Fairness: The Jain index is a popular fairness metric that measures how equally the sources share a single
bottleneck link. A value of 1 indicates perfectly equal sharing and smaller values indicate worse fairness.
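For reference, the Jain index of a set of rates x1..xn is (sum of the rates) squared divided by n times the sum of the squared rates. The sketch below is a direct implementation with made-up rate values.

```python
# Jain fairness index: J = (sum x_i)^2 / (n * sum x_i^2), for source rates
# sharing one bottleneck. The example rates are invented.
def jain_index(rates):
    n = len(rates)
    return sum(rates) ** 2 / (n * sum(r * r for r in rates))

print(jain_index([10, 10, 10, 10]))   # 1.0   -> perfectly equal sharing
print(jain_index([25, 5, 5, 5]))      # ~0.57 -> one source dominates
```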
Throughput: Throughput is simply the data rate, typically in Mbps, delivered to the application. For a single
source this should be close to the capacity of the link. When the BDP is high, that is, when the link capacity or
RTT or both are high, some protocols are unable to achieve good throughput.
Stability: The stability metric measures the variations of the source rate and/or the queue length in the router
around the mean values when everything else in the network is held fixed. Stability is typically measured as the
standard deviation of the rate around the mean rate, so a lower value indicates better performance. If a
protocol is unstable, the rate oscillates around the link capacity, at times exceeding it, resulting in poor delay
jitter and throughput performance.
Responsiveness: measures how fast a protocol reacts to a change in network operating conditions. If the source
rates take a long time to converge to a new level, say after the capacity of the link changes, either the link may
become underutilized or the buffer may overflow. The responsiveness metric measures the time or the number of
round trips to obtain the right rate.
Queuing delay: Once the congestion window is greater than the BDP, the link is well utilized; however, if the
congestion window is increased further, queuing delay builds up. Different TCP and AQM protocol combinations
differ in how they try to minimize queuing delay.
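As a worked example of the bandwidth-delay product used above (link capacity multiplied by round-trip time), the short sketch below uses illustrative numbers only.

```python
# Bandwidth-delay product: the amount of data that must be "in flight" to keep
# a link full. Numbers below are illustrative.
def bdp_bytes(bandwidth_bps, rtt_seconds):
    return bandwidth_bps * rtt_seconds / 8      # bits in flight -> bytes

# A 1 Gb/s path with 100 ms RTT needs about 12.5 MB of outstanding data;
# a smaller congestion window leaves the link under-utilized, while a much
# larger one only builds queuing delay.
print(bdp_bytes(1e9, 0.100) / 1e6, "MB")        # 12.5 MB
```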
Loss recovery: Packet loss can result from overflowing buffers, which indicates network congestion, or from
transmission errors, such as bit errors over a wireless channel. It is desirable that, when packet loss occurs due to
transmission error, the source continues to transmit uninterrupted; however, when the loss is due to congestion, the
source should slow down. Loss recovery is typically measured as the throughput that can be sustained under a
certain random packet loss rate caused by transmission error. Loss-based protocols typically cannot distinguish
between congestion losses and transmission-error losses.
In the past few years, a number of TCP variants have been developed that address the under-utilization problem,
most notably the slow growth of the TCP congestion window, which makes standard TCP unfavorable for
high-BDP networks.
UDP-based protocols provide much better portability and are easy to install. Although user-level protocol
implementations need less time to test and debug than kernel implementations, it is difficult to make them as
efficient: because user-level implementations cannot modify kernel code, there may be additional context switches
and memory copies. At high transfer speeds, these operations strongly affect CPU utilization and protocol
performance. In fact, one of the purposes of the standard UDP protocol is to allow new transport protocols to be
built on top of it; for example, the RTP protocol is built on top of UDP and supports streaming multimedia. In this
section we study some UDP-based transport protocols for data-intensive grid applications.
NETBLT
Bulk data transmission is needed by more and more applications in various fields and is a must for grid
applications. The major performance concern of a bulk data transfer protocol is high throughput. In reality,
achievable end-to-end throughput over high-bandwidth channels is often an order of magnitude lower than the
provisioned bandwidth, because it is limited by the transport protocol’s mechanisms; it is especially difficult to
achieve high throughput and reliable data transmission across long-delay, unreliable network paths.
NETBLT works by opening a connection between two clients (the sender and the receiver) transferring data in a
series of large numbered blocks (buffers), and then closing the connection. NETBLT transfer works as follows: the
sending client provides a buffer of data for the NETBLT layer to transfer. NETBLT breaks the buffer up into
packets and sends them as Internet datagrams. The receiving NETBLT layer loads these packets into a
matching buffer provided by the receiving client. When the last packet in that buffer has arrived, the receiving
NETBLT part will check to see if all packets in buffer have been correctly received or if some packets are missing.
If any packets are missing, the receiver requests that they be resent. When the buffer has been completely
transmitted, the receiving client is notified by its NETBLT layer. The receiving client disposes of the buffer and
provides a new buffer to receive more data. The receiving NETBLT notifies the sender that the new buffer is created
for receiving and this continues until all the data has been sent.
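The buffer/packet bookkeeping described above can be sketched as follows. This is a simplified model of the NETBLT flow only, not the actual protocol implementation: the UDP datagrams, timers and connection management are omitted, and all sizes and names are arbitrary.

```python
# A simplified sketch of NETBLT-style bookkeeping: a buffer is split into
# numbered packets, the receiver fills a matching buffer, and only the missing
# packet numbers are re-requested after the last packet of the buffer arrives.
PACKET_SIZE = 1024

def split_buffer(buffer):
    """Sender side: break one large client buffer into numbered packets."""
    return {i: buffer[i * PACKET_SIZE:(i + 1) * PACKET_SIZE]
            for i in range((len(buffer) + PACKET_SIZE - 1) // PACKET_SIZE)}

def receive_buffer(expected_packets, delivered):
    """Receiver side: load arriving packets, then report any missing numbers."""
    received = dict(delivered)
    missing = [num for num in expected_packets if num not in received]
    return received, missing

sender_packets = split_buffer(b"x" * 5000)                         # 5 packets
first_pass = {n: d for n, d in sender_packets.items() if n != 2}   # pretend packet 2 was lost
received, missing = receive_buffer(sender_packets, first_pass)
print("resend requested for packets:", missing)                    # [2]
```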
Reliable Blast UDP (RBUDP)
RBUDP is designed for extremely high-bandwidth, dedicated or quality-of-service-enabled networks, for which
high-speed bulk data transfer is an important requirement. RBUDP has two goals: i) keeping the network buffer full
during data transfer and ii) avoiding TCP’s per-packet interaction by sending acknowledgements only at the end of a
transmission.
There are 3 versions of RBUDP available:
i) Version 1: without scatter/gather optimization - this is a naive implementation of RBUDP in which each
incoming packet is examined and then moved.
ii) Version 2: with scatter/gather optimization - this implementation takes advantage of the fact that most
incoming packets are likely to arrive in order, and that if transmission rates are below the maximum throughput
of the network, packets are unlikely to be lost.
iii) Version 3: Fake RBUDP - this implementation is the same as the scheme without the scatter/gather
optimization, except that the incoming data is never moved to application memory.
Implementation results show that RBUDP performs very efficiently over high-speed, high-bandwidth,
Quality-of-Service-enabled networks such as optically switched networks. Mathematical modeling and experiments
have also shown that RBUDP effectively utilizes the available bandwidth for reliable data transfer.
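The control logic behind RBUDP's two goals can be sketched as below: data is blasted once without per-packet acknowledgements, and the receiver's end-of-transmission report of missing packets drives further rounds. In the real protocol the data travels over UDP and the missing-packet bitmap over TCP; the random loss model and function names here are illustrative assumptions, not the actual RBUDP code.

```python
# A sketch of the RBUDP control loop: blast everything once, collect the
# receiver's "missing" report, and resend only what is missing. Illustrative only.
import random

def blast(packet_ids, loss_rate=0.1):
    """One UDP 'blast': every packet is sent once; some are randomly lost."""
    return {p for p in packet_ids if random.random() > loss_rate}

def rbudp_transfer(total_packets):
    outstanding = set(range(total_packets))
    rounds = 0
    while outstanding:
        arrived = blast(outstanding)
        outstanding -= arrived        # the receiver's missing-packet bitmap drives resends
        rounds += 1
    return rounds

print("transfer completed in", rbudp_transfer(1000), "blast round(s)")
```

Because there are no per-packet acknowledgements, the sender can keep the link full for the whole blast, which is exactly the property that suits dedicated, high-bandwidth paths.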
TSUNAMI
A reliable transfer protocol, Tsunami, is designed for transferring large files fast over high-speed networks. Tsunami
uses inter-packet delay to adjust its sending rate instead of a sliding window mechanism.
UDP is used for sending data and TCP for sending control data. The goal of Tsunami is to increase the speed of file
transfer over high-speed networks compared with standard TCP.
During a file transfer, the client has two running threads. The network thread handles all network communication,
maintains the retransmission queue, and places blocks that are ready for disk storage into a ring buffer. The disk
thread simply moves blocks from the ring buffer to the destination file on disk. The server creates a single thread in
response to each client connection that handles all disk and network activity. The client initiates a Tsunami session
by connecting to the server’s TCP port. Upon connection, the server sends random data to the client. The client
XORs the random data with a shared secret key, calculates an MD5 checksum and transmits it to the server. The
server performs the same operation and compares the checksums; if they match, the connection is up. After the
authentication and connection steps, the client sends the name of the file to the server. The server checks whether
the file is available and, if so, sends a positive message to the client. After receiving the positive message from the
server, the client sends its block size, transfer rate and error threshold value. The server responds with the receiver
parameters and sends a time-stamp. After receiving the timestamp, the client opens a port for receiving the file from
the server, and the server sends the file to the receiver.
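The challenge-response step described above can be sketched as follows, assuming a shared secret known to both sides: the server issues random bytes, each side XORs them with the secret and compares MD5 digests. This mirrors the description only; it is not the actual Tsunami wire format, and the secret value is invented.

```python
# A sketch of the Tsunami-style challenge-response authentication described
# above. The shared secret and challenge size are illustrative assumptions.
import hashlib
import os

SECRET = b"shared-secret-key"

def response(challenge, secret):
    """XOR the challenge with the shared secret, then hash it with MD5."""
    xored = bytes(c ^ secret[i % len(secret)] for i, c in enumerate(challenge))
    return hashlib.md5(xored).hexdigest()

# Server side: generate and send a random challenge.
challenge = os.urandom(64)

# Client side: compute the digest and send it back.
client_digest = response(challenge, SECRET)

# Server side: recompute and accept the connection only if the digests match.
assert response(challenge, SECRET) == client_digest
print("connection is up")
```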
Tsunami improves on Reliable Blast UDP in two ways. First, the Tsunami receiver makes retransmission requests
periodically (every 50 packets) rather than waiting until all data transfer has finished; it calculates the current error
rate and sends it to the sender. Second, Tsunami uses rate-based congestion control. Tsunami performs best over
limited distances; on long-distance networks, bandwidth utilization drops, the absence of flow control affects its
performance, and issues such as fairness and TCP friendliness still have to be studied.
8. Types of grids:
Based on the different levels of complexity for the enterprise, grids can be categorized as follows:
• Infra-Grid – this type of grid architecture allows optimizing resource sharing within a single division or
department of the organization. An infra-grid forms a tightly controlled environment with well defined business
policies, integration and security.
• Intra-Grid – a more complex implementation than the previous type because it focuses on integrating various
resources of several departments and divisions of an enterprise. These grids require complex security policies for
sharing resources and data. However, because the resources belong to the same enterprise, the focus is on the
technical implementation of those policies.
• Extra-Grid – unlike an intra-grid, this type of grid refers to resource sharing to/from external partners with which
certain relationships are established. These grids extend beyond the administrative management of an enterprise’s
local resources, and therefore mutual conventions on managing access to resources are necessary.
• Inter-Grid – this kind of grid computing technology enables the sharing of compute and storage resources and data
over the Web, enabling collaboration between various companies and organizations. The complexity of the grid
comes from the special requirements of service levels, security and integration. This type of grid involves most of
the mechanisms found in the three previous types of grid.
Regardless of its type, every grid must provide a set of common management functions:
• Resource management: a grid must be aware of what resources are available for different tasks
• Security management: the grid needs to take care that only authorized users can access and use the
available resources
• Data management: data must be transported, cleansed, parceled and processed
• Services management: users and applications must be able to query the grid in an effective and efficient
manner
More specifically, grid computing environment can be viewed as a computing setup constituted by a number of
logical hierarchical layers. Figure 1 represents these layers. They include grid fabric resources, grid security
infrastructure, core grid middleware, user level middleware and resource aggregators, grid programming
environment and tools and grid applications.
The major constituents of a grid computing system can be grouped into various categories from different
perspectives as follows:
• functional view
• physical view
• service view
Basic constituents of a grid from a functional view are decided depending on the grid design and its expected use.
Some of the functional constituents of a grid are
1. Security (in the form of grid security infrastructure)
2. Resource Broker
3. Scheduler
4. Data Management
5. Job and resource management
6. Resources
A resource is an entity that is to be shared; this includes computers, storage, data and software. A resource need not
be a physical entity. Normally, a grid portal acts as a user interaction mechanism which is application specific and
can take many forms. A user-security functional block usually exists in the grid environment and is a key
requirement for grid computing. In a grid environment, there is a need for mechanisms to provide authentication,
authorization, data confidentiality, data integrity and availability, particularly from a user’s point of view. In the case
of inter-domain grids, there is also a requirement to support security across organizational boundaries. This makes a
centrally managed security system impractical. The grid security infrastructure (GSI) provides a “single sign-on”,
run anywhere authentication service with support for local control over access rights and mapping from global to
local identities.
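As an illustration of the global-to-local identity mapping mentioned above, Globus-style deployments have traditionally kept such mappings in a "grid-mapfile" whose lines pair a quoted certificate subject (the global identity) with a local account name. The parser below is a minimal sketch; the file contents, subjects and usernames shown are invented, and the exact format and location of the file in a given installation may differ.

```python
# A minimal sketch of parsing a grid-mapfile-style mapping from global
# certificate subjects to local usernames. The example entries are invented.
import shlex

def parse_grid_mapfile(text):
    """Return {certificate_subject: local_username} from grid-mapfile text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        subject, local_user = shlex.split(line)   # handles the quoted subject
        mapping[subject] = local_user
    return mapping

example = '''
"/O=Grid/OU=ExampleLab/CN=Alice Smith" alice
"/O=Grid/OU=ExampleLab/CN=Bob Jones" bjones
'''
print(parse_grid_mapfile(example))
```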
➢ HPC AND THE GRID
The latest fashion in some academic IT circles is “The Grid”, and many people have a quite incorrect view that
“The Grid” in some way is deeply connected to high performance computing, or even that it is high performance
computing in its latest guise.
“The Grid” is not related to high performance computing.
High performance computing and supercomputing have been around for tens of years without “The Grid” and they
will continue to be around for tens of years without it. The other view, which is also quite incorrect, is that there is
just one “Grid”, which is also “The Grid”, and that this grid is based on “Globus”, which is a collection of utilities
and libraries developed by various folks, but mostly by folks from the University of Chicago and the Argonne
National Laboratory. Many people who actually know something about distributed computing have pointed out that
what is called “The Grid” nowadays was called “Distributed Computing” only a decade ago. It is often the case in
Information Technology, especially in academia, that old washed-out ideas are given new names and flogged
off yet again by the same people who failed to sell them under the old names.
There are some successful examples of grids in place today. The most successful one, and probably the only one that
will truly flourish in years to come, is the Microsoft “.NET Passport” program. It works like this: when you start
up your PC running Windows, “MSN Messenger” logs you in with the “.NET Passport”. This way you acquire
credentials, which are then passed to all other WWW sites that participate in the “.NET Passport” program. For
example, once I have been authenticated to “.NET Passport”, I can connect to [Link], Nature, Science,
Monster, The New York Times, and various other well known sites, which recognize me instantaneously and
provide me with customized services.
Another example of a grid is AFS, the Andrew File System. AFS is a world-wide file system, which, when mounted
on a client machine, provides its user with transparent access to file systems at various institutions. It can be
compared to the World Wide Web, but unlike WWW, AFS provides access to files on the kernel and file system
level. You don't need to use a special tool such as a WWW browser. If you have AFS mounted on your computer,
you can use native OS methods in order to access, view, modify, and execute files that live at other institutions. User
authentication and verification is based on MIT Kerberos and user authorization is based on AFS Access Control
Lists (ACLs).