
Figure 1. Towards Grid Computing: A Conceptual View

Grid computing involves connecting distributed computer resources from multiple organizations to solve large computational problems. It allows for sharing and aggregation of various resources like supercomputers, storage systems, and data sources located around the world. Key advantages of grid computing include achieving higher computational power than possible with individual systems and enabling virtual organizations across institutions. Grids are well-suited for applications that can be broken into parallelizable pieces that do not require communication between processors. They provide on-demand access to distributed computing power and data through standard protocols and interfaces.

Uploaded by

Rockey Verma
Copyright
© Attribution Non-Commercial (BY-NC)

Institute Of Engg. & Technology, Lucknow

1. INTRODUCTION
The popularity of the Internet, together with the availability of powerful computers and high-speed network technologies as low-cost commodity components, is changing the way we use computers today. These technology opportunities have led to the possibility of using distributed computers as a single, unified computing resource, leading to what is popularly known as Grid computing. The term Grid is chosen as an analogy to a power Grid that provides consistent, pervasive, dependable, transparent access to electricity irrespective of its source. A detailed analysis of this analogy can be found in. This new approach to network computing is known by several names, such as metacomputing, scalable computing, global computing, Internet computing, and more recently peer-to-peer (P2P) computing.

Figure 1. Towards Grid computing: a conceptual view.


1.1 What is Grid?


The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files. Although a grid can be dedicated to a specialized application, it is more common for a single grid to be used for a variety of purposes. Grids are often constructed with the aid of general-purpose grid software libraries known as middleware, which divide and apportion pieces of a program among several computers, sometimes up to many thousands. Grid computing involves computation in a distributed fashion, which may also involve the aggregation of large-scale cluster computing-based systems. Grids enable the sharing, selection, and aggregation of a wide variety of resources, including supercomputers, storage systems, data sources, and specialized devices (see Figure 1), that are geographically distributed and owned by different organizations, for solving large-scale computational and data-intensive problems in science, engineering, and commerce. They thus create virtual organizations and enterprises: temporary alliances of enterprises or organizations that come together to share resources, skills, and core competencies in order to better respond to business opportunities or large-scale application processing requirements, and whose cooperation is supported by computer networks. Grid size can vary considerably, from small grids confined to a network of computer workstations within a corporation, for example, to large, public collaborations across many companies and networks. "The notion of a confined grid may also be known as an intra-nodes cooperation whilst the notion of a larger, wider grid may thus refer to an internodes cooperation".

Coordinating applications on Grids can be a complex task, especially when coordinating the flow of information across distributed computing resources. Grid workflow systems have been developed as a specialized form of a workflow management system designed specifically to compose and execute a series of computational or data manipulation steps, or a workflow, in the Grid context.

1.2 Comparison of grids and conventional supercomputers


Distributed or grid computing in general is a special type of parallel computing that relies on complete computers (with onboard CPUs, storage, power supplies, network interfaces, etc.) connected to a network (private, public or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local high-speed computer bus. The primary advantage of distributed computing is that each node can be purchased as commodity hardware, which, when combined, can produce a computing resource similar to a multiprocessor supercomputer, but at a lower cost. This is due to the economies of scale of producing commodity hardware, compared to the lower efficiency of designing and constructing a small number of custom supercomputers. The primary performance disadvantage is that the various processors and local storage areas do not have high-speed connections. This arrangement is thus well-suited to applications in which multiple parallel computations can take place independently, without the need to communicate intermediate results between processors. The high-end scalability of geographically dispersed grids is generally favorable, due to the low need for connectivity between nodes relative to the capacity of the public Internet.

There are also some differences in programming and deployment. It can be costly and difficult to write programs that can run in the environment of a supercomputer, which may have a custom operating system, or require the program to address concurrency issues. If a problem can be adequately parallelized, a thin layer of grid infrastructure can allow conventional, standalone programs, each given a different part of the same problem, to run on multiple machines. This makes it possible to write and debug on a single conventional machine, and eliminates complications due to multiple instances of the same program running in the same shared memory and storage space at the same time.
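The idea of giving standalone programs different parts of the same problem can be illustrated with a minimal sketch. It is not taken from any grid middleware: the prime-counting workload and the names count_primes, split_range, and run_on_grid are assumptions chosen for illustration, and a local thread pool stands in for separate machines so that the example runs on one computer.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(lo, hi):
    """Standalone worker: counts primes in [lo, hi) and needs no data from other workers."""
    def is_prime(n):
        if n < 2:
            return False
        i = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += 1
        return True

    return sum(1 for n in range(lo, hi) if is_prime(n))


def split_range(lo, hi, parts):
    """Cut [lo, hi) into contiguous, non-overlapping chunks, one per node."""
    step = max(1, (hi - lo + parts - 1) // parts)
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def run_on_grid(lo, hi, nodes=4):
    """Hypothetical driver: each chunk stands in for a job sent to a separate machine."""
    chunks = split_range(lo, hi, nodes)
    # A thread pool is used only so the sketch runs locally; the workers
    # never exchange intermediate results, which is the property that matters.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial_counts = list(pool.map(lambda chunk: count_primes(*chunk), chunks))
    return sum(partial_counts)
```

Because count_primes is an ordinary function, it can be written and debugged on a single machine first, for example by calling count_primes(0, 100), before the range is split across nodes.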

1.3 What is Grid computing?


Grid computing is a term referring to the federation of computer resources from multiple administrative domains to reach a common goal. What distinguishes grid computing from conventional high-performance computing systems such as cluster computing is that grids tend to be more loosely coupled, heterogeneous, and geographically dispersed. A grid may be assembled to solve a single task and may then disappear just as quickly. The concept of Grid computing started as a project to link geographically dispersed supercomputers, but it has now grown far beyond its original intent. The Grid infrastructure can benefit many applications, including collaborative engineering, data exploration, high-throughput computing, and distributed supercomputing.

A Grid can be viewed as a seamless, integrated computational and collaborative environment (see Figure 1). Users interact with the Grid resource broker to solve problems; the broker in turn performs resource discovery, scheduling, and the processing of application jobs on the distributed Grid resources. From the end-user point of view, Grids can be used to provide the following types of services:

Computational services: These are concerned with providing secure services for executing application jobs on distributed computational resources individually or collectively. Resource brokers provide the services for collective use of distributed resources. A Grid providing computational services is often called a computational Grid. Some examples of computational Grids are NASA IPG, the World Wide Grid, and the NSF TeraGrid.

Data services: These are concerned with providing secure access to distributed datasets and their management. To provide scalable storage of and access to the datasets, they may be replicated and catalogued, and different datasets may even be stored in different locations to create an illusion of mass storage. The processing of datasets is carried out using computational Grid services, and such a combination is commonly called a data Grid. Sample applications that need such services for management, sharing, and processing of large datasets are high-energy physics and accessing distributed chemical databases for drug design.

Application services: These are concerned with application management and providing access to remote software and libraries transparently. Emerging technologies such as Web services are expected to play a leading role in defining application services. They build on computational and data services provided by the Grid. An example system that can be used to develop such services is NetSolve.

Information services: These are concerned with the extraction and presentation of data with meaning by using the services of computational, data, and/or application services. The low-level details handled by this are the way that information is represented, stored, accessed, shared, and maintained. Given its key role in many scientific endeavors, the Web is the obvious point of departure for this level.

Knowledge services: These are concerned with the way that knowledge is acquired, used, retrieved, published, and maintained to assist users in achieving their particular goals and objectives. Knowledge is understood as information applied to achieve a goal, solve a problem, or execute a decision. An example of this is data mining for automatically building new knowledge.

To build a Grid, the development and deployment of a number of services is required. These include security, information, directory, resource allocation, and payment mechanisms in an open environment and high-level services for application development, execution management, resource aggregation, and scheduling.

Grid applications (typically multidisciplinary and large-scale processing applications) often couple resources that cannot be replicated at a single site, or which may be globally located for other practical reasons. These are some of the driving forces behind the foundation of global Grids. In this light, the Grid allows users to solve larger or new problems by pooling together resources that could not be easily coupled before. Hence, the Grid is not only a computing infrastructure for large applications; it is a technology that can bond and unify remote and diverse distributed resources ranging from meteorological sensors to data vaults and from parallel supercomputers to personal digital organizers. As such, it will provide pervasive services to all users that need them.
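The role of the resource broker described in this section (resource discovery, scheduling, and dispatch of jobs) can be sketched as follows. This is an illustrative toy under stated assumptions, not the design of any real broker: the classes Resource, Job, and ResourceBroker, and the "most free CPUs wins" scheduling rule, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    kind: str            # assumed categories: "compute" or "storage"
    free_cpus: int
    jobs: list = field(default_factory=list)


@dataclass
class Job:
    name: str
    cpus: int


class ResourceBroker:
    """Hypothetical broker: discovers, schedules, then dispatches a job."""

    def __init__(self, resources):
        self.resources = list(resources)

    def discover(self, job):
        # Resource discovery: keep compute resources with enough free CPUs.
        return [r for r in self.resources
                if r.kind == "compute" and r.free_cpus >= job.cpus]

    def schedule(self, job):
        # Scheduling policy (an assumption): pick the resource with most free CPUs.
        candidates = self.discover(job)
        if not candidates:
            return None
        return max(candidates, key=lambda r: r.free_cpus)

    def submit(self, job):
        # Dispatch: record the job on the chosen resource and claim its CPUs.
        target = self.schedule(job)
        if target is None:
            raise RuntimeError(f"no resource can run job {job.name!r}")
        target.free_cpus -= job.cpus
        target.jobs.append(job.name)
        return target.name
```

The user only calls submit; which machine runs the job is decided by the broker, which is the sense in which the Grid appears as one seamless environment.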


2. BENEFITS OF GRID COMPUTING


Grid computing can provide many benefits not available with traditional computing models:

Better utilization of resources: Grid computing uses distributed resources more efficiently and delivers more usable computing power. This can decrease time-to-market, allow for innovation, or enable additional testing and simulation for improved product quality. By employing existing resources, grid computing helps protect IT investments, containing costs while providing more capacity.

Increased user productivity: By providing transparent access to resources, work can be completed more quickly. Users gain additional productivity as they can focus on design and development rather than wasting valuable time hunting for resources and manually scheduling and managing large numbers of jobs.

Scalability: Grids can grow seamlessly over time, allowing many thousands of processors to be integrated into one cluster. Components can be updated independently and additional resources can be added as needed, reducing large one-time expenses.

Flexibility: Grid computing provides computing power where it is needed most, helping to better meet dynamically changing workloads. Grids can contain heterogeneous compute nodes, allowing resources to be added and removed as needs dictate.

3. LEVELS OF DEPLOYMENT

Grid computing can be divided into three logical levels of deployment: Cluster Grids, Enterprise Grids, and Global Grids.

Cluster Grids: The simplest form of a grid, a Cluster Grid consists of multiple systems interconnected through a network. Cluster Grids may contain distributed workstations and servers, as well as centralized resources in a datacenter environment. Typically owned and used by a single project or department, Cluster Grids support both high throughput and high performance jobs. Common examples of the Cluster Grid architecture include compute farms, groups of multiprocessor HPC systems, Beowulf clusters, and networks of workstations (NOW).

Enterprise Grids: As capacity needs increase, multiple Cluster Grids can be combined into an Enterprise Grid. Enterprise Grids enable multiple projects or departments to share computing resources in a cooperative way. Enterprise Grids typically contain resources from multiple administrative domains, but are located in the same geographic location.


Global Grids: Global Grids are a collection of Enterprise Grids, all of which have agreed upon global usage policies and protocols, but not necessarily the same implementation. Computing resources may be geographically dispersed, connecting sites around the globe. Designed to support and address the needs of multiple sites and organizations sharing resources, Global Grids provide the power of distributed resources to users anywhere in the world.

Figure 2. Three levels of grid computing: cluster, enterprise, and global grids.


4. GRID CONSTRUCTION: GENERAL PRINCIPLES


This section briefly highlights some of the general principles that underlie the construction of the Grid. In particular, the idealized design features that are required by a Grid to provide users with a seamless computing environment are discussed. Four main aspects characterize a Grid.

Multiple administrative domains and autonomy: Grid resources are geographically distributed across multiple administrative domains and owned by different organizations. The autonomy of resource owners needs to be honored along with their local resource management and usage policies.

Heterogeneity: A Grid involves a multiplicity of resources that are heterogeneous in nature and will encompass a vast range of technologies.

Scalability: A Grid might grow from a few integrated resources to millions. This raises the problem of potential performance degradation as the size of Grids increases. Consequently, applications that require a large number of geographically located resources must be designed to be latency and bandwidth tolerant.

Dynamicity or adaptability: In a Grid, resource failure is the rule rather than the exception. In fact, with so many resources in a Grid, the probability of some resource failing is high. Resource managers or applications must tailor their behavior dynamically and use the available resources and services efficiently and effectively.
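The claim that failure becomes the rule as a Grid grows can be made concrete with a small sketch. It assumes, purely for illustration, that each of n resources fails independently with the same probability p, so the chance that at least one fails is 1 - (1 - p)^n. The function names and the simple "try the next resource" failover policy are hypothetical, not part of any grid toolkit.

```python
def prob_any_failure(p, n):
    """Chance that at least one of n resources fails, assuming independent
    failures with identical probability p (an illustrative assumption)."""
    return 1.0 - (1.0 - p) ** n


def run_with_failover(task, resources):
    """Try the task on each resource in turn and return (resource, result)
    from the first one that succeeds. Raises if every resource fails."""
    errors = []
    for resource in resources:
        try:
            return resource, task(resource)
        except Exception as exc:  # a failed resource is expected, not exceptional
            errors.append((resource, exc))
    raise RuntimeError(f"all {len(errors)} resources failed")
```

With p = 0.001 per resource, a Grid of 1000 resources already has a better than 60% chance that something is down, which is why applications and resource managers need to adapt rather than assume a fixed set of healthy nodes.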

5. GRID ARCHITECTURE
Our goal in describing our Grid architecture is not to provide a complete enumeration of all required protocols (and services, APIs, and SDKs) but rather to identify requirements for general classes of component. The result is an extensible, open architectural structure within which solutions to key virtual organization (VO) requirements can be placed. Our architecture and the subsequent discussion organize components into layers, as shown in Figure 3. Components within each layer share common characteristics but can build on capabilities and behaviors provided by any lower layer. In specifying the various layers of the Grid architecture, we follow the principles of the hourglass model. The narrow neck of the hourglass defines a small set of core abstractions and protocols (e.g., TCP and HTTP in the Internet), onto which many different high-level behaviors can be mapped (the top of the hourglass), and which themselves can be mapped onto many different underlying technologies (the base of the hourglass). By definition, the number of protocols defined at the neck must be small. In our architecture, the neck of the hourglass consists of Resource and Connectivity protocols, which facilitate the sharing of individual resources. Protocols at these layers are designed so that they can be implemented on top of a diverse range of resource types, defined at the Fabric layer, and can in turn be used to construct a wide range of global services and application-specific behaviors at the Collective layer, so called because they involve the coordinated (collective) use of multiple resources.

Figure 3. The layered Grid architecture and its relationship to the Internet protocol architecture. Because the Internet protocol architecture extends from network to application, there is a mapping from Grid layers into Internet layers.
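The hourglass principle can be sketched in code as a deliberately small interface that many different resources implement and many different higher-level behaviors use. The sketch below is only an analogy under stated assumptions: ResourceProtocol, ClusterNode, DiskArray, and least_loaded are invented names, and a two-method Python interface stands in for the real Resource and Connectivity protocols.

```python
from abc import ABC, abstractmethod


class ResourceProtocol(ABC):
    """The narrow neck: the only operations upper layers may rely on."""

    @abstractmethod
    def query(self) -> dict:
        """Return a description of the resource (type, load, ...)."""

    @abstractmethod
    def run(self, request: str) -> str:
        """Perform one operation on the resource."""


class ClusterNode(ResourceProtocol):
    """Base of the hourglass: one of many possible underlying technologies."""

    def __init__(self, name, load):
        self.name, self.load = name, load

    def query(self):
        return {"name": self.name, "type": "compute", "load": self.load}

    def run(self, request):
        return f"{self.name} executed {request}"


class DiskArray(ResourceProtocol):
    """A different technology behind the same narrow interface."""

    def __init__(self, name, load):
        self.name, self.load = name, load

    def query(self):
        return {"name": self.name, "type": "storage", "load": self.load}

    def run(self, request):
        return f"{self.name} stored {request}"


def least_loaded(resources, kind):
    """Top of the hourglass: a collective behavior written only against the neck."""
    matches = [r for r in resources if r.query()["type"] == kind]
    return min(matches, key=lambda r: r.query()["load"])
```

Adding a new kind of resource means writing one more subclass, and adding a new collective behavior means writing one more function; neither change touches the other side of the neck.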

5.1 Fabric: Interfaces to Local Control

The Grid Fabric layer provides the resources to which shared access is mediated by Grid protocols: for example, computational resources, storage systems, catalogs, network resources, and sensors. A resource may be a logical entity, such as a distributed file system, computer cluster, or distributed computer pool; in such cases, a resource implementation may involve internal protocols (e.g., the NFS storage access protocol or a cluster resource management system's process management protocol), but these are not the concern of Grid architecture.

Fabric components implement the local, resource-specific operations that occur on specific resources (whether physical or logical) as a result of sharing operations at higher levels. There is thus a tight and subtle interdependence between the functions implemented at the Fabric level, on the one hand, and the sharing operations supported, on the other. Richer Fabric functionality enables more sophisticated sharing operations; at the same time, if we place few demands on Fabric elements, then deployment of Grid infrastructure is simplified. For example, resource-level support for advance reservations makes it possible for higher-level services to aggregate (coschedule) resources in interesting ways that would otherwise be impossible to achieve. However, as in practice few resources support advance reservation out of the box, a requirement for advance reservation increases the cost of incorporating new resources into a Grid.

Experience suggests that at a minimum, resources should implement enquiry mechanisms that permit discovery of their structure, state, and capabilities (e.g., whether they support advance reservation) on the one hand, and resource management mechanisms that provide some control of delivered quality of service, on the other. The following brief and partial list provides a resource-specific characterization of capabilities.

Computational resources: Mechanisms are required for starting programs and for monitoring and controlling the execution of the resulting processes. Management mechanisms that allow control over the resources allocated to processes are useful, as are advance reservation mechanisms. Enquiry functions are needed for determining hardware and software characteristics as well as relevant state information such as current load and queue state in the case of scheduler-managed resources.

Storage resources: Mechanisms are required for putting and getting files. Third-party and high-performance (e.g., striped) transfers are useful. So are mechanisms for reading and writing subsets of a file and/or executing remote data selection or reduction functions. Management mechanisms that allow control over the resources allocated to data transfers (space, disk bandwidth, network bandwidth, CPU) are useful, as are advance reservation mechanisms. Enquiry functions are needed for determining hardware and software characteristics as well as relevant load information such as available space and bandwidth utilization.

Network resources: Management mechanisms that provide control over the resources allocated to network transfers (e.g., prioritization, reservation) can be useful. Enquiry functions should be provided to determine network characteristics and load.

Code repositories: This specialized form of storage resource requires mechanisms for managing versioned source and object code: for example, a version control system such as CVS.

Catalogs: This specialized form of storage resource requires mechanisms for implementing catalog query and update operations: for example, a relational database.
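The minimum capabilities listed above for a computational resource (starting programs, monitoring and controlling them, and answering enquiries) can be sketched on a single machine using only the Python standard library. The class LocalComputeResource and its method names are assumptions made for this illustration; they do not correspond to any particular Fabric implementation.

```python
import os
import platform
import subprocess
import sys


class LocalComputeResource:
    """Hypothetical Fabric-level wrapper around one local machine."""

    def __init__(self):
        self._procs = {}
        self._next_id = 1

    def enquire(self):
        # Enquiry function: hardware/software characteristics and current state.
        running = sum(1 for p in self._procs.values() if p.poll() is None)
        return {"os": platform.system(),
                "cpus": os.cpu_count(),
                "running_jobs": running}

    def start(self, argv):
        # Mechanism for starting a program; returns a local job identifier.
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        job_id = self._next_id
        self._next_id += 1
        self._procs[job_id] = proc
        return job_id

    def status(self, job_id):
        # Monitoring: report whether the process is still running.
        code = self._procs[job_id].poll()
        return "running" if code is None else f"exited({code})"

    def cancel(self, job_id):
        # Control: terminate the process if it has not finished yet.
        proc = self._procs[job_id]
        if proc.poll() is None:
            proc.terminate()
        proc.wait()

    def wait(self, job_id):
        # Block until the job finishes and return its standard output.
        out, _ = self._procs[job_id].communicate()
        return out.decode().strip()
```

A typical use would be job = res.start([sys.executable, "-c", "print(6 * 7)"]) followed by res.wait(job). Advance reservation is deliberately absent, which mirrors the observation that few resources support it out of the box.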

5.2 Connectivity: Communicating Easily and Securely


The Connectivity layer defines core communication and authentication protocols required for Grid-specific network transactions. Communication protocols enable the exchange of data between Fabric layer resources. Authentication protocols build on communication services to provide cryptographically secure mechanisms for verifying the identity of users and resources. Communication requirements include transport, routing, and naming. While alternatives certainly exist, we assume here that these protocols are drawn from the TCP/IP protocol stack: specifically, the Internet (IP and ICMP), transport (TCP, UDP), and application (DNS, OSPF, RSVP, etc.) layers of the Internet layered protocol architecture. This is not to say that in the future, Grid communications will not demand new protocols that take into account particular types of network dynamics. With respect to security aspects of the Connectivity layer, we observe that the complexity of the security problem makes it important that any solutions be based on existing standards whenever possible. As with communication, many of the security standards developed within the context of the Internet protocol suite are applicable.

Authentication solutions for VO environments should have the following characteristics:

Single sign on: Users must be able to log on (authenticate) just once and then have access to multiple Grid resources defined in the Fabric layer, without further user intervention.

Delegation: A user must be able to endow a program with the ability to run on that user's behalf, so that the program is able to access the resources on which the user is authorized. The program should (optionally) also be able to conditionally delegate a subset of its rights to another program (sometimes referred to as restricted delegation).

Integration with various local security solutions: Each site or resource provider may employ any of a variety of local security solutions, including Kerberos and Unix security. Grid security solutions must be able to interoperate with these various local solutions. They cannot, realistically, require wholesale replacement of local security solutions but rather must allow mapping into the local environment.

User-based trust relationships: In order for a user to use resources from multiple providers together, the security system must not require each of the resource providers to cooperate or interact with each other in configuring the security environment. For example, if a user has the right to use sites A and B, the user should be able to use sites A and B together without requiring that A's and B's security administrators interact.
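Restricted delegation, as described above, amounts to a rule that a program may pass on only a subset of the rights it holds. The following sketch shows that rule alone. It is an assumption-laden toy: the Credential class, the string format of the rights, and the authorize helper are hypothetical, and there is no cryptographic signing or verification, which any real authentication protocol would require.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    subject: str                 # who holds this credential
    rights: frozenset            # e.g. {"run:siteA", "read:siteB"} (assumed format)
    issuer: str = "user-login"   # who granted it; the initial sign-on by default

    def delegate(self, to, rights):
        """Create a credential for another program holding a subset of our rights."""
        requested = frozenset(rights)
        if not requested <= self.rights:
            extra = sorted(requested - self.rights)
            raise PermissionError(f"cannot delegate rights not held: {extra}")
        return Credential(subject=to, rights=requested, issuer=self.subject)


def authorize(credential, right):
    """Resource-side check (illustrative only, no signature verification)."""
    return right in credential.rights
```

The user signs on once to obtain the first Credential; every later credential in the chain is derived from it, so no further user interaction is needed, and rights can only shrink along the chain.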

5.3 Resource: Sharing Single Resources

The Resource layer builds on Connectivity layer communication and authentication protocols to define protocols for the secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources. Resource layer implementations of these protocols call Fabric layer functions to access and control local resources. Resource layer protocols are concerned entirely with individual resources and hence ignore issues of global state and atomic actions across distributed collections; such issues are the concern of the Collective layer discussed next. Two primary classes of Resource layer protocols can be distinguished:

Information protocols: used to obtain information about the structure and state of a resource, for example, its configuration, current load, and usage policy (e.g., cost).

Management protocols: used to negotiate access to a shared resource, specifying, for example, resource requirements (including advance reservation and quality of service) and the operation(s) to be performed, such as process creation or data access. Since management protocols are responsible for instantiating sharing relationships, they must serve as a policy application point, ensuring that the requested protocol operations are consistent with the policy under which the resource is to be shared. Issues that must be considered include accounting and payment. A protocol may also support monitoring the status of an operation and controlling (for example, terminating) the operation.
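The two classes of Resource layer protocol can be illustrated with one object that answers information requests and acts as a policy application point for management requests. The sketch is hypothetical throughout: SharedResource, its fields, the allow-list policy, and the per-CPU charge are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass
class SharedResource:
    name: str
    total_cpus: int
    cost_per_cpu: float
    allowed_users: set                          # the sharing policy (assumed: an allow-list)
    in_use: int = 0
    ledger: list = field(default_factory=list)  # accounting records

    def get_info(self):
        """Information protocol: configuration, current load, usage policy (cost)."""
        return {"name": self.name,
                "free_cpus": self.total_cpus - self.in_use,
                "cost_per_cpu": self.cost_per_cpu}

    def request(self, user, cpus):
        """Management protocol: negotiate access, applying policy before granting."""
        if user not in self.allowed_users:
            return {"granted": False, "reason": "policy: user not allowed"}
        if cpus > self.total_cpus - self.in_use:
            return {"granted": False, "reason": "insufficient capacity"}
        self.in_use += cpus
        charge = cpus * self.cost_per_cpu
        self.ledger.append((user, cpus, charge))  # accounting and payment hook
        return {"granted": True, "charge": charge}
```

Note that the object knows nothing about other resources: questions of global state across several resources are left to the layer above.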

5.4 Collective: Coordinating Multiple Resources

While the Resource layer is focused on interactions with a single resource, the next layer in the architecture contains protocols and services (and APIs and SDKs) that are not associated with any one specific resource but rather are global in nature and capture interactions across collections of resources. For this reason, we refer to the next layer of the architecture as the Collective layer. Because Collective components build on the narrow Resource and Connectivity layer neck in the protocol hourglass, they can implement a wide variety of sharing behaviors without placing new requirements on the resources being shared. For example:

Directory services allow VO participants to discover the existence and/or properties of VO resources. A directory service may allow its users to query for resources by name and/or by attributes such as type, availability, or load. Resource-level GRRP and GRIP protocols are used to construct directories.

Co-allocation, scheduling, and brokering services allow VO participants to request the allocation of one or more resources for a specific purpose and the scheduling of tasks on the appropriate resources. Examples include AppLeS, Condor-G, Nimrod-G, and the DRM broker.

Monitoring and diagnostics services support the monitoring of VO resources for failure, adversarial attack (intrusion detection), overload, and so forth.

Data replication services support the management of VO storage (and perhaps also network and computing) resources to maximize data access performance with respect to metrics such as response time, reliability, and cost.

Grid-enabled programming systems enable familiar programming models to be used in Grid environments, using various Grid services to address resource discovery, security, resource allocation, and other concerns. Examples include Grid-enabled implementations of the Message Passing Interface and manager-worker frameworks.

Workload management systems and collaboration frameworks, also known as problem solving environments (PSEs), provide for the description, use, and management of multi-step, asynchronous, multicomponent workflows.

Software discovery services discover and select the best software implementation and execution platform based on the parameters of the problem being solved. Examples include NetSolve and Ninf.

Community authorization servers enforce community policies governing resource access, generating capabilities that community members can use to access community resources. These servers provide a global policy enforcement service by building on resource information and resource management protocols (in the Resource layer) and security protocols in the Connectivity layer. Akenti addresses some of these issues.

Community accounting and payment services gather resource usage information for the purpose of accounting, payment, and/or limiting of resource usage by community members.

Collaboratory services support the coordinated exchange of information within potentially large user communities, whether synchronously or asynchronously. Examples are CAVERNsoft, Access Grid, and commodity groupware systems.
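Of the Collective services listed above, a directory service is the simplest to sketch: resources are registered with attributes, and participants query by name or by attribute. The class below is a hypothetical in-memory illustration and does not implement GRRP, GRIP, or any other real directory protocol; the attribute names and the use of predicates in queries are assumptions.

```python
class DirectoryService:
    """Hypothetical in-memory directory of VO resources."""

    def __init__(self):
        self._entries = {}

    def register(self, name, **attributes):
        # A resource (or its owner) publishes its properties.
        self._entries[name] = dict(attributes)

    def lookup(self, name):
        # Query by name; returns None for unknown resources.
        return self._entries.get(name)

    def query(self, **criteria):
        """Query by attributes. Each criterion is either a constant to match
        exactly or a one-argument predicate, e.g. load=lambda x: x < 0.5."""
        hits = []
        for name, attrs in self._entries.items():
            matched = True
            for key, wanted in criteria.items():
                value = attrs.get(key)
                if callable(wanted):
                    matched = value is not None and wanted(value)
                else:
                    matched = value == wanted
                if not matched:
                    break
            if matched:
                hits.append(name)
        return sorted(hits)
```

For example, directory.query(type="compute", load=lambda x: x < 0.5) would return the names of lightly loaded compute resources, without the directory needing any new capability from the resources themselves.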

These examples illustrate the wide variety of Collective layer protocols and services that are encountered in practice. Notice that while Resource layer protocols must be general in nature and are widely deployed, Collective layer protocols span the spectrum from general purpose to highly application or domain specific, with the latter existing perhaps only within specific VOs. Collective functions can be implemented as persistent services, with associated protocols, or as SDKs (with associated APIs) designed to be linked with applications. In both cases, their implementation can build on Resource layer (or other Collective layer) protocols and APIs. For example, Figure 4 shows a Collective co-allocation API and SDK (the middle tier) that uses a Resource layer management protocol to manipulate underlying resources. Above this, we define a co-reservation service protocol and implement a co-reservation service that speaks this protocol, calling the co-allocation API to implement co-allocation operations and perhaps providing additional functionality, such as authorization, fault tolerance, and logging. An application might then use the co-reservation service protocol to request end-to-end network reservations.
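The layering in this example, a Collective co-allocation function built purely on per-resource management operations, can be sketched as follows. The names Reservable and co_allocate are hypothetical, and the all-or-nothing rollback shown is one plausible policy assumed for illustration, not a description of any specific co-allocation SDK.

```python
class Reservable:
    """Resource-layer stand-in: knows only about its own capacity."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.reserved = 0

    def reserve(self, amount):
        if amount > self.capacity - self.reserved:
            return False
        self.reserved += amount
        return True

    def release(self, amount):
        self.reserved -= amount


def co_allocate(requests):
    """Collective-layer sketch: reserve every (resource, amount) pair or none.

    If any single reservation is refused, the reservations already granted
    are released so that no resource is left partially held."""
    granted = []
    for resource, amount in requests:
        if resource.reserve(amount):
            granted.append((resource, amount))
        else:
            for held_resource, held_amount in reversed(granted):
                held_resource.release(held_amount)
            return False
    return True
```

A co-reservation service of the kind described above could call co_allocate for each link of an end-to-end network path and add its own authorization or logging around the call.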

Figure 4. Collective and Resource layer protocols, services, APIs, and SDKs can be combined in a variety of ways to deliver functionality to applications.

Collective components may be tailored to the requirements of a specific user community, VO, or application domain, for example, an SDK that implements an application-specific coherency protocol, or a co-reservation service for a specific set of network resources. Other Collective components can be more general-purpose, for example, a replication service that manages an international collection of storage systems for multiple communities, or a directory service designed to enable the discovery of VOs. In general, the larger the target user community, the more important it is that a Collective component's protocol(s) and API(s) be standards based.

5.5 Applications
The final layer in our Grid architecture comprises the user applications that operate within a VO environment. Figure 5 illustrates an application programmer's view of Grid architecture. Applications are constructed in terms of, and by calling upon, services defined at any layer. At each layer, we have well-defined protocols that provide access to some useful service: resource management, data access, resource discovery, and so forth. At each layer, APIs may also be defined whose implementations (ideally provided by third-party SDKs) exchange protocol messages with the appropriate service(s) to perform desired actions.

Figure 5. APIs are implemented by software development kits (SDKs), which in turn use Grid protocols to interact with network services that provide capabilities to the end user. Higher-level SDKs can provide functionality that is not directly mapped to a specific protocol, but may combine protocol operations with calls to additional APIs as well as implement local functionality. Solid lines represent a direct call; dashed lines represent protocol interactions.

We emphasize that what we label applications and show in a single layer in Figure 5 may in practice call upon sophisticated frameworks and libraries (e.g., the Common Component Architecture, SciRun, CORBA, Cactus, workflow systems) and feature much internal structure that would, if captured in our figure, expand it out to many times its current size. These frameworks may themselves define protocols, services, and/or APIs (e.g., the Simple Workflow Access Protocol). However, these issues are beyond the scope of this article, which addresses only the most fundamental protocols and services required in a Grid.
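The relationship between APIs, SDKs, protocols and services can be sketched as follows. Everything here is an assumption made for illustration: the JSON message format, the "query_load" operation, the host names and load values, and the class names ResourceSDK and SchedulingSDK do not come from any real Grid toolkit, and the network is replaced by a direct function call.

```python
import json


def resource_service(raw_request: str) -> str:
    """Toy network service that answers protocol messages (JSON is an assumption)."""
    request = json.loads(raw_request)
    if request.get("op") != "query_load":
        return json.dumps({"status": "error", "reason": "unsupported op"})
    loads = {"node-a": 0.25, "node-b": 0.75}  # made-up data for the sketch
    host = request.get("host")
    if host not in loads:
        return json.dumps({"status": "error", "reason": "unknown host"})
    return json.dumps({"status": "ok", "load": loads[host]})


class ResourceSDK:
    """SDK whose API call maps directly onto one protocol exchange."""

    def __init__(self, transport):
        # transport: any callable that carries a request string to the service
        self._transport = transport

    def get_load(self, host: str) -> float:
        message = json.dumps({"op": "query_load", "host": host})
        reply = json.loads(self._transport(message))
        if reply["status"] != "ok":
            raise LookupError(reply["reason"])
        return reply["load"]


class SchedulingSDK:
    """Higher-level SDK: combines protocol operations with local functionality."""

    def __init__(self, resource_sdk: ResourceSDK):
        self._resources = resource_sdk

    def least_loaded(self, hosts: list) -> str:
        # Several protocol exchanges plus a purely local comparison.
        return min(hosts, key=self._resources.get_load)
```

An application that calls SchedulingSDK.least_loaded sees only an API; the protocol messages stay hidden inside the lower-level SDK.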

6. GRID APPLICATIONS
What types of applications will grids be used for? Building on experiences in gigabit testbeds, the I-WAY network, and other experimental systems, we have identified five major application classes for computational grids, which are described briefly in this section. More details about applications and their technical requirements are provided in the referenced chapters.

6.1 Distributed Supercomputing


Distributed supercomputing applications use grids to aggregate substantial computational resources in order to tackle problems that cannot be solved on a single system. Depending on the grid on which we are working, these aggregated resources might comprise the majority of the supercomputers in the country or simply all of the workstations within a company. Here are some contemporary examples:

Distributed interactive simulation (DIS) is a technique used for training and planning in the military. Realistic scenarios may involve hundreds of thousands of entities, each with potentially complex behavior patterns. Yet even the largest current supercomputers can handle at most 20,000 entities. In recent work, researchers at the California Institute of Technology have shown how multiple supercomputers can be coupled to achieve record-breaking levels of performance.

The accurate simulation of complex physical processes can require high spatial and temporal resolution in order to resolve fine-scale detail. Coupled supercomputers can be used in such situations to overcome resolution barriers and hence to obtain qualitatively new scientific results.

Although high latencies can pose significant obstacles, coupled supercomputers have been used successfully in cosmology, high-resolution ab initio computational chemistry computations, and climate modeling.

Challenging issues from a grid architecture perspective include the need to co-schedule what are often scarce and expensive resources, the scalability of protocols and algorithms to tens or hundreds of thousands of nodes, latency-tolerant algorithms, and achieving and maintaining high levels of performance across heterogeneous systems.
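To make the co-scheduling issue concrete, the sketch below finds the earliest time at which every machine in a set is free for the whole length of a job. It is a minimal illustration, not a description of any real scheduler: the function name, the representation of availability as (start, end) intervals, and the time units are all assumptions.

```python
def earliest_common_slot(free_windows, duration):
    """Return the earliest start time at which all resources are free for
    `duration`, or None if no such time exists.

    free_windows: one list per resource, each a list of (start, end) intervals
    during which that resource is idle (hypothetical input format).
    """
    # If any common slot exists, the earliest one begins at the start of some
    # free window, so only those instants need to be checked.
    candidates = sorted({start for windows in free_windows for start, _ in windows})
    for t in candidates:
        fits_everywhere = all(
            any(start <= t and t + duration <= end for start, end in windows)
            for windows in free_windows
        )
        if fits_everywhere:
            return t
    return None


# Three machines with different idle periods (arbitrary time units).
windows = [
    [(0, 4), (6, 12)],
    [(2, 9)],
    [(5, 20)],
]
print(earliest_common_slot(windows, 3))  # prints 6
print(earliest_common_slot(windows, 4))  # prints None
```

The second call shows why co-scheduling scarce resources is hard: lengthening the job by a single time unit leaves no window that all three machines share.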

6.2 High-Throughput Computing


In high-throughput computing, the grid is used to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles (often from idle workstations) to work. The result may be, as in distributed supercomputing, the focusing of available resources on a single problem, but the quasi-independent nature of the tasks involved leads to very different types of problems and problem-solving methods. Here are some examples:

Platform Computing Corporation reports that the microprocessor manufacturer Advanced Micro Devices used high-throughput computing techniques to exploit over a thousand computers during the peak design phases of their K6 and K7 microprocessors. These computers are located on the desktops of AMD engineers at a number of AMD sites and were used for design verification only when not in use by engineers.

The Condor system from the University of Wisconsin is used to manage pools of hundreds of workstations at universities and laboratories around the world. These resources have been used for studies as diverse as molecular simulations of liquid crystals, studies of ground penetrating radar, and the design of diesel engines.

More loosely organized efforts have harnessed tens of thousands of computers distributed worldwide to tackle hard cryptographic problems.
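The core idea of high-throughput computing, farming independent tasks out to whichever idle machine is free next, can be sketched with a simple greedy scheduler. This is an assumption-laden toy, not how Condor or any other production system works: task durations are taken as known in advance, workstations never stop being idle, and the function and workstation names are hypothetical.

```python
import heapq


def schedule_independent_tasks(task_durations, idle_workstations):
    """Greedily assign each task to the workstation that becomes free first.

    Returns (assignment, makespan), where assignment maps a workstation name
    to the list of task indices it runs, in order.
    """
    if not idle_workstations:
        raise ValueError("at least one idle workstation is required")
    # Heap of (time at which the workstation is next free, workstation name).
    free_at = [(0, name) for name in sorted(idle_workstations)]
    heapq.heapify(free_at)
    assignment = {name: [] for name in idle_workstations}
    for task_id, duration in enumerate(task_durations):
        ready_time, name = heapq.heappop(free_at)
        assignment[name].append(task_id)
        heapq.heappush(free_at, (ready_time + duration, name))
    makespan = max(ready_time for ready_time, _ in free_at)
    return assignment, makespan


assignment, makespan = schedule_independent_tasks([4, 3, 2, 1], ["ws1", "ws2"])
print(assignment)  # {'ws1': [0, 3], 'ws2': [1, 2]}
print(makespan)    # 5
```

Because the tasks do not communicate, no coordination is needed beyond handing out work, which is what distinguishes this class from distributed supercomputing.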

6.3 On-Demand Computing


On-demand applications use grid capabilities to meet short-term requirements for resources that cannot be cost-effectively or conveniently located locally. These resources may be computation, software, data repositories, specialized sensors, and so on. In contrast to distributed supercomputing applications, these applications are often driven by cost-performance concerns rather than absolute performance. For example:

The NEOS and NetSolve network-enhanced numerical solver systems allow users to couple remote software and resources into desktop applications, dispatching to remote servers calculations that are computationally demanding or that require specialized software.

A computer-enhanced MRI machine and scanning tunneling microscope (STM) developed at the National Center for Supercomputing Applications use supercomputers to achieve real-time image processing. The result is a significant enhancement in the ability to understand what we are seeing and, in the case of the microscope, to steer the instrument.

A system developed at the Aerospace Corporation for processing of data from meteorological satellites uses dynamically acquired supercomputer resources to deliver the results of a cloud detection algorithm to remote meteorologists in quasi real time.

The challenging issues in on-demand applications derive primarily from the dynamic nature of resource requirements and the potentially large populations of users and resources. These issues include resource location, scheduling, code management, configuration, fault tolerance, security, and payment mechanisms.
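The cost-performance trade-off that drives on-demand applications can be illustrated with a small selection routine: among the resources that can finish a job before its deadline, pick the one with the lowest total cost. The Offer record, its fields, and the sample speeds and prices are all hypothetical; a real grid would obtain such information through resource discovery and payment mechanisms that are not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    """Hypothetical description of a resource that can be acquired on demand."""
    name: str
    speed: float  # work units processed per hour
    price: float  # cost per hour of use


def cheapest_offer(offers, work: float, deadline_hours: float) -> Optional[Offer]:
    """Return the lowest-cost offer that finishes `work` within the deadline."""
    feasible = [o for o in offers if work / o.speed <= deadline_hours]
    if not feasible:
        return None
    # Total cost = running time (hours) multiplied by hourly price.
    return min(feasible, key=lambda o: (work / o.speed) * o.price)


offers = [
    Offer("desktop", speed=1.0, price=0.0),
    Offer("cluster", speed=10.0, price=2.0),
    Offer("supercomputer", speed=100.0, price=50.0),
]
print(cheapest_offer(offers, work=100.0, deadline_hours=12.0).name)  # cluster
print(cheapest_offer(offers, work=100.0, deadline_hours=2.0).name)   # supercomputer
```

With a loose deadline the cheaper cluster wins; only a tight deadline justifies the supercomputer, which matches the observation that these applications are driven by cost-performance rather than absolute performance.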

6.4 Data-Intensive Computing


In data-intensive applications, the focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases. This synthesis process is often computationally and communication intensive as well.

Future high-energy physics experiments will generate terabytes of data per day, or around a petabyte per year. The complex queries used to detect "interesting" events may need to access large fractions of this data. The scientific collaborators who will access this data are widely distributed, and hence the data systems in which data is placed are likely to be distributed as well.

The Digital Sky Survey will, ultimately, make many terabytes of astronomical photographic data available in numerous network-accessible databases. This facility enables new approaches to astronomical research based on distributed analysis, assuming that appropriate computational grid facilities exist.

Modern meteorological forecasting systems make extensive use of data assimilation to incorporate remote satellite observations. The complete process involves the movement and processing of many gigabytes of data.

Challenging issues in data-intensive applications are the scheduling and configuration of complex, high-volume data flows through multiple levels of hierarchy.
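The data volumes quoted for high-energy physics in this section can be checked with simple arithmetic, and the same arithmetic shows why moving such data is a scheduling problem in its own right. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes); the rate of 3 TB per day and the 1 Gbit/s link are illustrative values chosen here, not figures from the text.

```python
TB = 10**12  # bytes in a terabyte (decimal units assumed)
PB = 10**15  # bytes in a petabyte


def yearly_volume_pb(tb_per_day: float, days: int = 365) -> float:
    """Petabytes accumulated in a year at a steady daily rate."""
    return tb_per_day * TB * days / PB


def transfer_hours(n_bytes: float, link_bits_per_second: float) -> float:
    """Hours needed to move n_bytes over a link, ignoring protocol overhead."""
    return n_bytes * 8 / link_bits_per_second / 3600


# About 3 TB per day adds up to roughly 1.1 PB per year.
print(round(yearly_volume_pb(3), 3))

# Moving a single terabyte over a dedicated 1 Gbit/s link takes over two hours.
print(round(transfer_hours(1 * TB, 1e9), 2))
```

So "terabytes per day" and "around a petabyte per year" are consistent with each other, and even one day's output occupies a fast link for several hours.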

6.5 Collaborative Computing


Collaborative applications are concerned primarily with enabling and enhancing human-to-human interactions. Such applications are often structured in terms of a virtual shared space. Many collaborative applications are concerned with enabling the shared use of computational resources such as data archives and simulations; in this case, they also have characteristics of the other application classes just described. For example:

The BoilerMaker system developed at Argonne National Laboratory allows multiple users to collaborate on the design of emission control systems in industrial incinerators. The different users interact with each other and with a simulation of the incinerator.

The CAVE5D system supports remote, collaborative exploration of large geophysical data sets and the models that generate them, for example, a coupled physical/biological model of the Chesapeake Bay.

The NICE system developed at the University of Illinois at Chicago allows children to participate in the creation and maintenance of realistic virtual worlds, for entertainment and education.

Challenging aspects of collaborative applications from a grid architecture perspective are the real-time requirements imposed by human perceptual capabilities and the rich variety of interactions that can take place.

We conclude this section with three general observations. First, we note that even in this brief survey we see a tremendous variety of already successful applications. This rich set has been developed despite the significant difficulties faced by programmers developing grid applications in the absence of a mature grid infrastructure. As grids evolve, we expect the range and sophistication of applications to increase dramatically. Second, we observe that almost all of the applications demonstrate a tremendous appetite for computational resources (CPU, memory, disk, etc.) that cannot be met in a timely fashion by expected growth in single-system performance. This emphasizes the importance of grid technologies as a means of sharing computation as well as a data access and communication medium. Third, we see that many of the applications are interactive, or depend on tight synchronization with computational components, and hence depend on the availability of a grid infrastructure able to provide robust performance guarantees.

7. CONCLUSIONS AND FUTURE TRENDS


There are currently a large number of projects and a diverse range of new and emerging Grid developmental approaches being pursued. These systems range from Grid frameworks to application testbeds, and from collaborative environments to batch submission mechanisms.

It is difficult to predict the future in a field such as information technology, where technological advances are moving very rapidly. Hence, it is not an easy task to forecast what will become the dominant Grid approach. Windows of opportunity for ideas and products seem to open and close in the blink of an eye. However, some trends are evident. One of those is growing interest in the use of Java and Web services for network computing.

The Java programming language successfully addresses several key issues that accelerate the development of Grid environments, such as heterogeneity and security. It also removes the need to install programs remotely; the minimum execution environment is a Java-enabled Web browser. Java, with its related technologies and growing repository of tools and utilities, is having a huge impact on the growth and development of Grid environments. From a relatively slow start, the developments in Grid computing are accelerating fast with the advent of these new and emerging technologies. It is very hard to ignore the presence of the Common Object Request Broker Architecture (CORBA) in the background. We believe that frameworks incorporating CORBA services will be very influential on the design of future Grid environments.

Two other emerging Java technologies for Grid and P2P computing are Jini and JXTA. The Jini architecture exemplifies a network-centric, service-based approach to computer systems. Jini replaces the notions of peripherals, devices, and applications with that of network-available services. Jini helps break down the conventional view of what a computer is, while including new classes of services that work together in a federated architecture. The ability to move code from the server to its client is the core difference between the Jini environment and other distributed systems, such as CORBA and the Distributed Component Object Model (DCOM).

Whatever the technology or computing infrastructure that becomes predominant or most popular, it can be guaranteed that at some stage in the future its star will wane. Historically, in the field of computer research and development, this fact can be repeatedly observed. The lesson to be drawn from this observation is that, in the long term, backing only one technology can be an expensive mistake. The framework that provides a Grid environment must be adaptable, malleable, and extensible. As technology and fashions change, it is crucial that Grid environments evolve with them.

Smarr observes that Grid computing has serious social consequences and is going to have as revolutionary an effect as railroads did in the American Midwest in the early 19th century. Instead of a 30-40 year lead time to see its effects, however, its impact is going to be much faster. Smarr concludes by noting that the effects of Grids are going to change the world so quickly that mankind will struggle to react and change in the face of the challenges and issues they present.

Therefore, at some stage in the future, our computing needs will be satisfied in the same pervasive and ubiquitous manner in which we use the electric power grid. The analogies with the generation and delivery of electricity are hard to ignore, and the implications are enormous. In fact, the Grid is analogous to the electricity (power) Grid, and the vision is to offer (almost) dependable, consistent, pervasive, and inexpensive access to resources irrespective of where they physically reside and from where they are accessed.

BIBLIOGRAPHY

[1] I. Foster, C. Kesselman, editors. The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, San Francisco, Calif. (1999).
[2] I. Foster, C. Kesselman, S. Tuecke. The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International Journal of High Performance Computing Applications.
[3] Rajkumar Buyya, Mark Baker. Grids and Grid Technologies for Wide-Area Distributed Computing. SP&E.
[4] www.globus.org
[5] Ian Foster. The Grid: A New Infrastructure for 21st Century Science. Physics Today.
