Grid Technology And Its Infrastructure


02 Nov 2017


Because grid equipment is supplied by many different vendors, interoperability is critical and strong standards are necessary; without them, it is difficult to interchange parts and procedures between vendors. The multilayer architecture of the grid imposes a modular system structure that eases this problem. To standardise grid specifications, protocols and interfaces, the Globus Alliance and the Open Grid Forum (OGF) were established, as explained below.

2.4.1 Globus Alliance

Globus Alliance [6] is an international collaboration of organisations and individuals conducting research for the development of fundamental grid technologies. Globus Alliance introduced open source software called Globus Toolkit for building grid systems and applications. The Globus Toolkit (GT) has been developed since the late 1990s to support the development of service-oriented distributed applications and infrastructures. Core GT components address basic issues relating to security, resource access and management, data movement and management, resource discovery, and so forth. Other projects have contributed to a broader Globus universe of tools and components that build on core GT functionality to provide many useful application-level functions. These tools have been used to develop a wide variety of Grid systems and applications.

Version 4 of the Globus Toolkit, GT4, released in early 2005, represents a significant advance relative to earlier releases in terms of the range of components provided, functionality, standards conformance, usability, and quality of documentation. The architecture of Globus Toolkit 4 is shown in Fig. 2.3.

Figure 2.3 : Globus Toolkit 4 Architecture

As shown in the figure, GT4 comprises both a set of service implementations ("server" code) and associated "client" code. GT4 provides both Web services (WS) components (on the left) and non-WS components (on the right). The white boxes in the "client" domain denote custom applications and/or third-party tools that access GT4 services or GT4-enabled services.

2.4.1.1 Predefined GT4 Services and Other Components

GT4 provides a set of predefined services, described in a little more detail in the next section. Nine GT4 services implement Web services (WS) interfaces: job management (GRAM); reliable file transfer (RFT); delegation; MDS-Index, MDS-Trigger, and archiver (collectively termed the Monitoring and Discovery System, or MDS); community authorization (CAS); OGSA-DAI data access and integration; and the GTCP Grid TeleControl Protocol for online control of instrumentation. Of these, archiver, GTCP, and OGSA-DAI are "tech previews," meaning that their interfaces and implementations are likely to change in the future.

For two of those services, GRAM and MDS-Index, pre-WS "legacy" implementations are provided. They will be deprecated at some future time as experience is gained with WS implementations.

For three additional GT4 services, WS interfaces are not yet provided (but will be in the future): GridFTP data transport, replica location service (RLS), and MyProxy online credential repository.

Other libraries implement various security functionality, while the eXtensible I/O (XIO) library provides convenient access to a variety of underlying transport protocols. SimpleCA is a lightweight certification authority.

2.4.1.2 Globus Universe

GT4 components do not, in general, address end-user needs directly: they are more akin to a TCP/IP library or Web server implementation than a Web browser. Instead, GT4 enables a range of end-user components and tools that provide higher-level capabilities attuned to the needs of specific user communities. These components and tools constitute, together with GT4 itself, the "Globus universe." We introduce here some of its principal elements.

For the purposes of this presentation, we assign each Globus universe component to one of the following classes.

Execution management tools are concerned with the initiation, monitoring, management, scheduling, and/or coordination of remote computations.

Data management tools are concerned with data location, transfer, and management.

Interface tools are concerned with providing or supporting the development of graphical user interfaces for end-user or system administration applications.

Security tools are concerned with such issues as mapping between Grid credentials and other forms of credential and managing authorization policies.

Monitoring and discovery tools are concerned with monitoring various aspects of system behavior, managing monitoring data, discovering services, etc.

2.4.1.2.1 Execution Management

Execution management tools are concerned with the initiation, monitoring, management, scheduling, and/or coordination of remote computations. GT4 supports the Grid Resource Allocation and Management (GRAM) interface as a basic mechanism for these purposes. Its GRAM server is typically deployed in conjunction with Delegation and GridFTP servers to address data staging, delegation of proxy credentials, and computation monitoring and management in an integrated manner.

Associated tools fall into three main classes. First, we have GRAM-enabled schedulers for clusters or other computers on a local area network (Condor, OpenPBS, Torque, PBSPro, SGE, LSF). Second, we have systems that provide different interfaces to remote computers (OpenSSH) or that implement various parallel programming models in Grid environments by using GRAM to dispatch tasks to remote computers (Condor-G, DAGman, MPICH-G2, GriPhyN VDS, Nimrod-G). Third, we have various "meta-schedulers" that map different tasks to different clusters (CSF, Maui).

Grid Resource Allocation & Management service: GRAM supports the submission, monitoring, and control of jobs on computers. Interfaces to the Unix shell ("fork"), Platform LSF, PBS, and Condor schedulers are provided; others may be developed. Includes support for MPICH-G2 jobs: multi-job submission, process coordination within a job, and sub-job coordination within a multi-job.

Java CoG Kit Workflow: Uses the Karajan workflow engine, which supports DAGs, conditions, and loops; directs tasks to GRAM servers for execution.

Community Scheduler Framework: CSF is an open source meta-scheduler based on the WS-Agreement specification.

GSI OpenSSH: A version of OpenSSH that supports GSI authentication. Provides remote terminal (SSH) and file copy (SCP) functions.

Condor-G: Manages the execution of jobs on remote GRAM-enabled computers, addressing job monitoring, logging, notification, policy enforcement, fault tolerance, and credential management.

DAGman: Manages the execution of directed acyclic graphs (DAGs) of tasks that communicate by writing/reading files; works with Condor-G.

MPICH-G2: Executes parallel Message Passing Interface (MPI) programs over one or more distributed computers.

Nimrod-G: Graphical specification of parameter studies, and management of their execution on distributed computers.

Ninf-G: An implementation of the GridRPC remote procedure call specification, for accessing remote services.

GriPhyN Virtual Data System: Tools for defining, scheduling, and managing complex data-intensive workflows. Workflows can be defined via a high-level virtual data language; a virtual data catalog is used to track current and past executions. Includes heuristics for job and data placement. Uses DAGman/Condor-G for execution management.

Condor, OpenPBS, Torque, PBSPro, Sun Grid Engine, Load Sharing Facility: Schedulers to which GSI-authenticated access is provided via a GRAM interface. The open source Condor is specialized for managing pools of desktop systems. OpenPBS and Torque are open source versions of the Portable Batch System (PBS) cluster scheduler; PBSPro is a commercial version produced by Altair. SGE is also available in both open source and commercial versions. LSF is a commercial system produced by Platform.

Maui Scheduler: An advanced job scheduler for use on clusters and supercomputers, with support for meta-scheduling.

2.4.1.2.2 Data Management

Data management tools are concerned with the location, transfer, and management of distributed data. GT4 provides a variety of basic tools, including GridFTP for high-performance and reliable data transport, RLS for maintaining location information for replicated files, and OGSA-DAI for accessing and integrating structured and semistructured data.

Associated tools enhance GT4 components by addressing storage reservation (NeST), providing a command-line client for GridFTP (UberFTP), providing a uniform interface to distributed data (SRB), and supporting distributed data processing pipelines (DataCutter, STORM).

GridFTP server: Enhanced FTP server supporting GSI authentication and high-performance throughput. Interfaces to Unix POSIX, HPSS, GPFS, and UniTree are provided; others can be developed.

globus-url-copy: Non-interactive command-line client for GridFTP.

Replica Location Service: RLS is a decentralized service for registering and discovering information about replicated files.

Reliable File Transfer service: RFT controls and monitors third-party, multi-file transfers using GridFTP. Features exponential back-off on failure, all-or-none transfers of multi-file sets, optional use of parallel streams and TCP buffer size tuning, and recursive directory transfer.

Lightweight Data Replicator: LDR is a tool for replicating data to a set of sites. It builds on GridFTP, RLS, and pyGlobus.

OGSA Data Access & Integration: OGSA-DAI is an extensible framework for accessing and integrating data resources, including relational and XML databases and semistructured files.

Network Storage: NeST allows GridFTP clients to negotiate reservations for disk space, which then apply to subsequent transfers.

UberFTP: Interactive command-line client for GridFTP.

Storage Resource Broker: Client-server middleware that provides a uniform interface for connecting to heterogeneous, distributed data resources, with GSI authentication and GridFTP transport.

DataCutter & STORM: DataCutter supports processing of large datasets via the execution of distributed pipelines of application-specific processing modules; STORM supports relational data.

2.4.1.2.3 Interface

Grid portal and user interface tools support the construction of graphical user interfaces for invoking, monitoring, and/or managing activities involving Grid resources.

Many (but not all) of these tools are concerned with enabling access to Grid systems from Web browsers. Many (but not all) of such Web browser-oriented systems are based on a three-tier architecture, in which a middle-tier portal server (e.g., uPortal with Tomcat, or GridSphere) hosts JSR 168-compliant portlets that both (a) generate the various elements of the first-tier Web interface with which users interact and (b) interact with third-tier Grid resources and services.

Java CoG Desktop: Java application that provides a "desktop" interface to a Grid, so that, for example, a job is run or a file copied by dragging and dropping its description onto a computer or storage system.

WebMDS: Uses XSLT to generate custom displays of monitoring data, whether from active services or archives.

Portal User Registration Service: PURSe provides Web-based registration of users and the subsequent generation and management of their GSI credentials, thus allowing easy access to Grid resources by large user communities.

Open Grid Computing Environment: OGCE packages a range of components, including JSR 168-compliant portlets for proxy management, remote command execution, remote file management, and GPIR-based information services.

GridSphere: An open source JSR 168-compliant portlet environment.

Sakai: JSR 168-compatible system for distributed learning and collaborative work, with tools for chat, shared documents, etc.

2.4.1.2.4 Security

Security tools are concerned with establishing the identity of users or services (authentication), protecting communications, and determining who is allowed to perform what actions (authorization), as well as with supporting functions such as managing user credentials and maintaining group membership information.

GT4 provides distinct WS and pre-WS authentication and authorization capabilities. Both build on the same base, namely standard X.509 end entity certificates and proxy certificates, which are used to identify persistent entities such as users and servers and to support the temporary delegation of privileges to other entities, respectively.

GT4’s WS security [1] comprises (a) Message-Level Security mechanisms, which implement the WS-Security standard and the WS-SecureConversation specification to provide message protection for GT4’s SOAP messages, and (b) an Authorization Framework that allows for a variety of authorization schemes, including a "grid-mapfile" access control list, an access control list defined by a service, a custom authorization handler, and access to an authorization service via the SAML protocol. For non-WS components, GT4 provides similar authentication, delegation, and authorization mechanisms, although with fewer authorization options.
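As an illustration of the simplest of these options, a grid-mapfile is a plain-text access control list that maps the distinguished name from a user's X.509 certificate to a local account; the DNs and usernames below are invented:

```
"/C=US/O=Example Org/OU=Physics/CN=Jane Doe" jdoe
"/C=UK/O=eScience/OU=Leeds/CN=John Smith" jsmith
```

A request authenticated with a certificate whose subject matches a line on the left is authorized to act as the local user named on the right.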

Message-Level Security: Implements the WS-Security standard and the WS-SecureConversation specification to provide message protection for SOAP messages.

Authorization Framework: Allows a variety of authorization schemes, including file- and service-based access control lists, custom handlers, and the SAML protocol.

Pre-WS A&A: Authentication, delegation, and authorization for non-WS components.

Delegation Service: Enables storage and subsequent (authorized) retrieval of proxy credentials, thus enabling delegation when using WS protocols.

Community Authorization Service: Issues assertions to users granting fine-grained access rights to resources; servers recognize and enforce the assertions. CAS is currently supported by the GridFTP server.

SimpleCA: A simplified certification authority for issuing X.509 credentials.

MyProxy service: Allows federation of X.509 and other authentication mechanisms (e.g., username/password, one-time passwords) via SASL/PAM.

VOMS: Database of user roles and capabilities, with a user client interface that supports retrieval of attribute certificates for presentation to VOMS-enabled services.

VOX & VOMRS: Extend VOMS to provide Web registration capabilities, rather like PURSe.

PERMIS: Authorization service accessible via the SAML protocol.

GUMS: The Grid User Management System, an alternative to grid-mapfiles.

KX509 & KCA: KX509 is a "Kerberized" client that generates and stores proxy credentials, so that users authenticated via Kerberos can access the Grid; KCA is a Kerberized certification authority used to support KX509.

PKINIT: A service that allows users with Grid credentials to authenticate to a Kerberos domain.

2.4.1.2.5 Monitoring and Discovery

Monitoring and discovery mechanisms are concerned with obtaining, distributing, indexing, archiving, and otherwise processing information about the configuration and state of services and resources. In some cases, the motivation for collecting this information is to enable discovery of services or resources; in other cases, it is to enable monitoring of system status.

GT4's support in its Java, C, and Python WS Core for the WSRF and WS-Notification interfaces provides useful building blocks for monitoring and discovery, enabling the definition of properties for which monitoring and discovery are to be provided, with subsequent pull- and push-mode access to them. GT4 services such as GRAM and RFT define appropriate resource properties, providing a basis for service discovery and monitoring. Other GT4 services are designed specifically to enable discovery and monitoring, providing for the indexing, archiving, and analysis of data about significant events.

Java, C, and Python WS Cores: Implement the WSRF and WS-Notification specifications, thus allowing Web services to define, and allow access to, resource properties. The container incorporates a local index, enabling discovery of services.

Index service: Collects live monitoring information from services and enables queries against that information.

Trigger service: Compares live monitoring information against rules to detect fault conditions, and notifies operators (for example, by email).

Archiver service: Stores historical monitoring data and enables queries against that data.

Aggregator framework: Facilitates the building of aggregating services (for example, the Index, Trigger, and Archiver services).

Hawkeye: Monitors individual clusters, using Condor as a base. GT4 includes a data provider that makes status information available in the GLUE schema.

Ganglia: Monitors individual clusters and sets of clusters. GT4 includes a data provider that makes status information available in the GLUE schema.

Inca: Monitors services in a distributed system by performing a set of specified tests at specified intervals, and publishes the results of these tests.

NetLogger: Generates, collects, and analyzes high-frequency data from distributed system components.

2.4.2 Open Grid Forum (OGF)

The Open Grid Forum (OGF) [31] is a community of users, developers, and vendors leading the global standardization effort for distributed computing (including clusters, grids and clouds). The OGF community consists of thousands of individuals in industry and research, representing over 400 organizations in more than 50 countries, working together to accelerate the adoption of grid computing worldwide in the belief that grids will lead to new discoveries, new opportunities, and better business practices. Research groups within OGF have created many standards, such as the Open Grid Service Architecture (OGSA), which presents a service-oriented view of shared physical resources and the services supported by these resources; the Open Grid Services Infrastructure (OGSI), which defines mechanisms for creating and managing grid services and acts as a technical specification for implementing them; GridFTP; and JSDL [7, 4]. Many other issues are currently being worked on.

2.4.2.1 Job Submission Description Language (JSDL) [7]

JSDL provides an XML-based language specifically for describing single job submission requirements. Since many different job management systems exist in distributed, heterogeneous computing systems, such as grids, a primary goal of JSDL is to provide a common language for describing job submission requirements. Hence, the JSDL vocabulary is informed by a number of existing job management systems such as Condor, Globus, Load Sharing Facility, Portable Batch System, Sun Grid Engine, and Unicore.

JSDL focuses on single job submission description and it must be combined with other specifications, from OGF or other standards bodies, to address broader requirements in job or workflow management. For example, JSDL is used with the OGSA Basic Execution Service, an OGF specification that provides a job submission and management interface. JSDL can also be used with BPEL as part of workflows. JSDL can also be combined with other scheduling, service agreement [WS-Agreement], or job policy languages. Attribute and element extensions are also allowed.

JSDL provides elements for:

Job identification. This includes a JobName, a description (any string for human consumption), a JobAnnotation (any string that may contain information for machine consumption) and a JobProject to which the job belongs.

Application information. This includes a name, description, and version number. This description can be extended with application-specific information. A normative extension for describing a POSIX application, including environment settings such as file size limit and core dump limit, is specified.

Resource requirements. As to be expected, the possible resource requirements are extensive, including 27 main elements, such as OS types, CPU types, file system types, physical memory, disk space, network bandwidth, and more.

Data requirements. The data requirement elements allow files to be identified that must be staged-in (to the remote host) prior to execution, and staged-out afterwards.
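A minimal JSDL document combining these elements might look like the following sketch. The namespaces follow the JSDL 1.0 schema; the job name, executable, memory bound, and storage URL are invented, and real documents typically carry far more detail:

```xml
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:JobIdentification>
      <jsdl:JobName>ExampleAnalysis</jsdl:JobName>
      <jsdl:JobProject>DemoProject</jsdl:JobProject>
    </jsdl:JobIdentification>
    <jsdl:Application>
      <posix:POSIXApplication>
        <posix:Executable>/bin/analyse</posix:Executable>
        <posix:Argument>input.dat</posix:Argument>
        <posix:Output>result.out</posix:Output>
      </posix:POSIXApplication>
    </jsdl:Application>
    <jsdl:Resources>
      <jsdl:TotalPhysicalMemory>
        <jsdl:LowerBoundedRange>1073741824</jsdl:LowerBoundedRange>
      </jsdl:TotalPhysicalMemory>
    </jsdl:Resources>
    <jsdl:DataStaging>
      <jsdl:FileName>input.dat</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Source>
        <jsdl:URI>gsiftp://storage.example.org/data/input.dat</jsdl:URI>
      </jsdl:Source>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
```

The DataStaging element above also shows how JSDL commonly refers to GridFTP (gsiftp) URLs for stage-in, tying the job description to the data transfer machinery discussed next.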

2.4.2.2 GridFTP

The GridFTP facility provides secure and reliable data transfer between grid hosts. Its protocol extends the well-known FTP standard to provide additional features, including support for authentication through GSI. One of the major features of GridFTP is that it enables third-party transfer. Third-party transfer is suitable for an environment where there is a large file in remote storage and the client wants to copy it to another remote server.

Standardised through the Open Grid Forum, GridFTP was designed to provide reliable, efficient and secure access to, and transfer of, huge amounts of data between distributed resources in the grid, using facilities such as multi-streamed transfer, auto-tuning and Globus-based security.

GridFTP is important to this work because it can serve as a basis for grid evolution through the migration mechanisms discussed later.

The FTP protocol was attractive for the following reasons:

It is one of the most common data transfer protocols, as well as being the most likely candidate to meet a grid's needs.

It offers a well-defined architecture and many features, and is already used extensively.

It is a widely implemented and well-understood Internet Engineering Task Force (IETF) [44] standard protocol.

It supports third-party transfers, that is, the ability to transfer data directly between two servers under the control of a separate client.

Numerous groups have added various extensions through the IETF. Some of these extensions would be particularly useful in grids.

GridFTP has the following features:

Grid Security Infrastructure (GSI) and Kerberos support.

Robust and flexible authentication, integrity, and confidentiality features are critical when transferring or accessing files. GridFTP must therefore support GSI and Kerberos authentication. It provides this capability by implementing the Generic Security Services Application Program Interface (GSSAPI) [54] authentication mechanisms.

Third-party control of data transfer.

In order to manage large datasets for large distributed communities, it is essential to provide third-party control of transfers between storage servers. GridFTP provides this capability by adding GSSAPI security to the third-party transfer capability defined in standard FTP. Third-party operation allows a user at one site to initiate, control and monitor a data transfer between two other parties (source and destination).
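The essence of third-party operation, namely that the controlling client never touches the data, can be sketched with a small local simulation in Python. Plain sockets stand in for the GridFTP data channel, and the threads stand in for two independent servers; this is an illustration of the pattern, not the protocol itself:

```python
import socket
import threading

def source_server(payload, ready):
    # "Server A": listens on a data port (like FTP PASV) and sends the file.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    ready["addr"] = srv.getsockname()
    ready["event"].set()          # tell the third party where to point B
    conn, _ = srv.accept()
    conn.sendall(payload)
    conn.close()
    srv.close()

def dest_server(addr, out):
    # "Server B": connects to A's data port (like FTP PORT) and stores the file.
    s = socket.create_connection(addr)
    chunks = []
    while True:
        data = s.recv(4096)
        if not data:
            break
        chunks.append(data)
    s.close()
    out["data"] = b"".join(chunks)

# The third party (the client) only coordinates the two servers:
# it learns A's data address and hands it to B. The payload itself
# flows directly from A to B.
payload = b"x" * 10000
ready = {"event": threading.Event()}
out = {}
a = threading.Thread(target=source_server, args=(payload, ready))
a.start()
ready["event"].wait()
b = threading.Thread(target=dest_server, args=(ready["addr"], out))
b.start()
a.join()
b.join()
```

After the run, `out["data"]` holds the payload at the destination even though the coordinating code never read or wrote a single data byte.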

Parallel data transfer.

Using multiple Transmission Control Protocol (TCP) [65] streams in parallel (even between the same source and destination) on wide-area links can improve aggregate bandwidth over using a single TCP stream. GridFTP supports parallel data transfer through FTP command extensions and data channel extensions.
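The idea behind parallel streams can be sketched in Python: split the byte range among several workers that each copy their slice independently. Here the "streams" copy between local files; in real GridFTP they are separate TCP connections:

```python
import os
import threading

def copy_range(src, dst, offset, length):
    # One "stream": copies up to `length` bytes starting at `offset`.
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))

def parallel_copy(src, dst, streams=4):
    size = os.path.getsize(src)
    # Pre-size the destination so each stream can seek and write independently.
    with open(dst, "wb") as f:
        f.truncate(size)
    chunk = (size + streams - 1) // streams
    threads = [threading.Thread(target=copy_range,
                                args=(src, dst, i * chunk, chunk))
               for i in range(streams)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

On a local disk this gains nothing, but over a long wide-area path each stream maintains its own TCP congestion window, which is where the aggregate bandwidth improvement comes from.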

Striped data transfer.

Data may be striped or interleaved across multiple servers. Striped transfers provide further bandwidth improvements over those achieved with parallel transfers. GridFTP includes extensions that initiate striped transfers, which use multiple TCP streams to transfer data that is partitioned among multiple servers.

Partial file transfer.

The transfer of partial files is required by many applications, such as high-energy physics analysis. However, standard FTP supports only the transfer of complete files, or the transfer of the remainder of a file starting at a particular offset. GridFTP introduces new FTP commands to support transfers of arbitrary subsets of a file.
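In terms of file semantics, a partial retrieval amounts to seeking to an offset and reading a bounded region. The sketch below shows the semantics only, not the GridFTP commands themselves:

```python
def partial_get(path, offset, length):
    # Fetch only the region [offset, offset + length) of the file,
    # as GridFTP's extended retrieve commands allow a client to request.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```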

Automatic negotiation of TCP buffer/window sizes.

Automatic negotiation of the TCP buffer/window size is important in achieving maximum bandwidth with TCP/IP, and thus in improving transfer performance, especially over wide-area links. GridFTP extends standard FTP to support both manual setting and automatic negotiation of TCP buffer sizes, for large files and for large groups of small files.
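The buffer size being negotiated is essentially the bandwidth-delay product of the path: the number of bytes that must be "in flight" to keep the link full. A sketch of the arithmetic:

```python
def tcp_buffer_size(bandwidth_bps, rtt_seconds):
    # Bandwidth-delay product: bytes in flight needed to keep a link of the
    # given bandwidth (bits/s) and round-trip time (s) fully utilised.
    return int(bandwidth_bps / 8 * rtt_seconds)

# A 1 Gbit/s path with a 50 ms round-trip time needs about 6.25 MB of buffer,
# far above common operating system defaults.
```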

Support for reliable and restartable data transfer.

Reliable transfer is important for many applications that manage data. Fault recovery methods are needed for handling such faults as transient network failures and server outages. The FTP standard includes basic features for restarting failed transfers that are not widely implemented. The GridFTP protocol exploits these features, and extends them to cover the new data channel protocol.
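A minimal sketch of restart-marker semantics in Python: the real protocol transmits markers over the control channel so the client can resume after a failure, whereas here the marker is simply a byte offset into local files:

```python
def resumable_copy(src, dst, marker=0, chunk=65536):
    # Resume a failed transfer from the last restart marker (a byte offset).
    # A fresh transfer starts with marker=0 and creates the destination.
    mode = "r+b" if marker else "wb"
    with open(src, "rb") as fin, open(dst, mode) as fout:
        fin.seek(marker)
        fout.seek(marker)
        while True:
            data = fin.read(chunk)
            if not data:
                break
            fout.write(data)
            marker += len(data)   # advance the marker as bytes are committed
    return marker                 # persist this to survive the next failure
```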

Integrated instrumentation.

The protocol calls for restart and performance markers to be sent back. Moreover, there are new functionalities and recent developments added to GridFTP to take advantage of the latest network technologies and transport protocols.

GridFTP Pipelining.

When the dataset is large, it will be broken up into many small files. This causes a problem known as "lots of small files" (LOSF). The solution to this problem is known as pipelining [12]. Pipelining approaches the LOSF problem by trying to minimise the amount of time between transfers. Pipelining allows the client to send transfer commands at any time. The server processes these requests in the order they are sent. Pipelining improves the performance of LOSF transfers.
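The benefit of pipelining can be made concrete with an idealised latency model (the numbers below are invented): serial operation pays one round trip per file before any data moves, while pipelining sends the commands back to back so the round-trip cost is paid roughly once for the whole set.

```python
def transfer_time(n_files, rtt, per_file, pipelined):
    # Idealised model of a lots-of-small-files transfer.
    # Serial: each file waits a full round trip before its data moves.
    # Pipelined: all commands are sent immediately, so round-trip latency
    # is overlapped with the data transfers and paid only once.
    if pipelined:
        return rtt + n_files * per_file
    return n_files * (rtt + per_file)

# 1000 small files over a 100 ms RTT link, 10 ms of data each:
# serial takes about 110 s, pipelined about 10.1 s.
```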

GridFTP over UDT.

User Datagram Protocol (UDT) [42] is an application-level data transport protocol using UDP to transfer bulk data. It achieves good performance on high-bandwidth, high-delay networks in which TCP has significant limitations. Because of this, it can provide significantly higher end-to-end performance than GridFTP over TCP, especially on wide area networks.

Split DSI for GridFTP.

GridFTP has a Data Storage Interface (DSI) that interacts with storage systems, accepting requests such as get, put and stat and performing the necessary functions on the underlying storage system. A DSI has been implemented to achieve split-TCP functionality. In the previous implementation, a GridFTP server could act either as an end server or as an intermediate server, which is very restrictive for production deployments. The split DSI generalises this by allowing the DSI to perform different functions depending on the input it receives: for example, a DSI "get" could either contact the underlying storage directly or forward the data to another GridFTP server. This can help both to overcome some of TCP's limitations and to avoid bottleneck links.

GridFTP Where there is FTP (GWFTP).

GWFTP is an intermediate program used to act as a proxy between existing FTP clients and GridFTP servers. Users can connect to GWFTP with their favourite standard FTP client, and GWFTP will then connect to a GridFTP server on the client's behalf. This process allows the use of ordinary FTP clients to invoke operations on GridFTP servers.

Network provisioning.

Network provisioning technologies must be integrated into a scalable architecture that can provide on-demand setup of channels at varying bandwidth resolutions. This allows GridFTP transfers to be bound to optical paths.

GridFTP over Infiniband.

InfiniBand (IB) defines a point-to-point interconnect architecture that leverages networking principles (switching and routing) to provide a scalable, high-performance server interconnect. InfiniBand provides transport services for upper-layer protocols and supports flow control and quality of service to provide ordered, guaranteed packet delivery. Running over InfiniBand enables GridFTP to take advantage of recent developments in the technology to achieve high performance on wide area networks.

GridFTP in reality

NaradaBrokering [52, 53] is a distributed messaging middleware designed to run on large networks of cooperating broker nodes. Because NaradaBrokering's communication is asynchronous, it can publish messages at a very high rate. It exploits GridFTP facilities and capabilities such as fragmentation and reliable delivery.

Fig. 2.5 shows the integration between GridFTP and NaradaBrokering. The architecture embodies two approaches, proxy and router. The key component of the proxy approach is the remote GridFTP server, which is simulated by NB Agent A. A GridFTP client contacts NB Agent A, which forwards all requests to NB Agent B, which in turn contacts the available GridFTP server. The router approach overlays GridFTP functionality on NaradaBrokering; GridFTP requires two socket connections, a control channel and a data channel. The control channel carries transfer commands (e.g. "put", "get"); the data channel carries the data itself.

Figure 2.5: GridFTP with NaradaBrokering [53]

2.5 Resource Broker (Scheduler)

The Resource Broker, sometimes called the Scheduler, is one of the main components of a grid. It plays an important role in building an effective grid environment by scheduling user applications onto grid resources to achieve performance goals such as minimising execution times and communication delays, maximising resource utilisation and reliability, and balancing load. The broker is responsible for the discovery and selection of resources and for application execution, transferring job input files to the resources and sending job output back to users.

The broker's main tasks are:

Discovery of the resources in the grid.

The role of discovery is to determine a list of authenticated resources available in the grid. This list is usually obtained by searching a database containing information about the resources. Resource discovery mechanisms use either a single database (the centralised approach) or a set of databases residing at various places (the distributed approach). These hold information about logical entities, such as application software, operating systems, policies and data, and about physical entities, such as CPU speed and architecture, current loads, and networks, from which a list of resources that can meet the application requirements is compiled. An example of such a database is the Monitoring and Discovery Service (MDS) in Globus [24, 21, 43, 55].

Selection of optimal resources.

Once the resource list has been assembled, the optimal resources that meet the user's requirements such as cost are selected from it.
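The selection step can be sketched as a simple filter-then-rank over the discovered resource list. The records and criteria below are invented for illustration; a real broker weighs many more factors (load, network proximity, policy):

```python
def select_resource(resources, max_cost):
    # From the discovered resource list, keep those the user can afford,
    # then pick the one with the shortest estimated completion time.
    affordable = [r for r in resources if r["cost"] <= max_cost]
    if not affordable:
        return None
    return min(affordable, key=lambda r: r["est_time"])
```

For example, given candidates with cost/estimated-time pairs, the broker discards those over budget and returns the fastest of the remainder.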

Execution of job with available resources.

Once the job and resources have been selected, job input files are transferred to and run on the resources. When the job is finished, the broker informs the user.

Unpredictable situations that need addressing may occur during job runtime, so the job monitoring role is to check a job's execution and detect failure or unexpected incidents. An example of such monitoring is the Network Weather Service (NWS) [85].

Much work has been done on resource brokering in grid computing and many scheduling algorithms have been introduced to deal with grid applications. The following subsections outline some of these.

2.5.1 Condor-G

Condor-G is a task scheduler designed to work with the Globus middleware platform. It was designed to act as a first-generation brokering system for grid computing, and as such works at a very low level, much as Globus does. Condor-G also extends the functionality of Globus by integrating DAG scheduling and a better grasp of which machines are running at a given time. One problem with Globus is that jobs are submitted and then run immediately; Condor-G gives greater flexibility by running programs when it can and keeping track of the state of execution on the remote machines.

The management operations of Condor-G are [38]:

submission of jobs to grid resources, including their input/output files and any arguments needed to initiate a job's execution.

querying a job's status or cancelling a job.

informing the user via email of job termination or errors that occur during execution.

storing a log that provides a history of a job's execution stages.

As shown in Fig. 2.6, once the user submits the job the scheduler responds by creating a GridManager daemon at the job submission machine to handle and manage this job. One GridManager daemon handles all jobs for a single user, and terminates once they are all completed.

Figure 2.6: Condor-G Mechanism for Executing a Job on Globus Managed Resources [38].

Each GridManager contacts the modified GRAM that has been configured to support Condor-G at the execution site. After passing the Gatekeeper's authentication process, a Globus JobManager daemon is created. The JobManager daemon at the execution site connects directly to the GridManager at the submission machine using Global Access to Secondary Storage (GASS) [9] in order to transfer the job's executable, standard input and output files, and any error files. The JobManager then submits the job to the local site's resource management system. The JobManager sends updates on the job's status back to the GridManager, which then updates the Scheduler. All the states for the job are stored in the Scheduler's Persistent Job Queue to keep a history log. However, Condor-G does not support checkpointing, migration of serial grid applications, application models, policy-based scheduling, co-allocation or advance reservation. In addition, file transfer is mainly based on the GASS service rather than on GridFTP.
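From the user's side, the workflow above is driven by an ordinary Condor submit description. The fragment below is a schematic example only: the gatekeeper hostname and file names are invented, and the exact keywords (`universe`, `grid_resource`) follow the later "grid universe" syntax of Condor, so details may differ between Condor-G releases.

```
# Hypothetical Condor-G submit description: run "analyse" on a
# Globus GT2-managed resource via its gatekeeper and jobmanager.
universe      = grid
grid_resource = gt2 gatekeeper.example.org/jobmanager-pbs
executable    = analyse
arguments     = input.dat
output        = job.out
error         = job.err
log           = job.log
queue
```

Submitting this with `condor_submit` triggers the GridManager/JobManager interaction described above; the `log` file corresponds to the history of execution stages kept in the Scheduler's job queue.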

2.5.2 Nimrod/G

Nimrod/G is a hierarchical system based on the concept of computational economy, designed to run parametric applications on grids. It was developed at Monash University, Australia [14]. The Nimrod/G architecture is shown in Fig. 2.7.

Figure 2.7: Nimrod/G Architecture

In this architecture, the user (i.e. the client) creates an experimental plan using a declarative parametric modelling language before the job is passed to the parametric engine. The parametric engine interacts with the scheduler to retrieve resource availability, while the scheduler discovers resource load and status through the grid's directory service (which uses the Monitoring and Discovery Service) and selects the resources that meet the user's requirements. Once the parametric engine has confirmed the availability of the user's resource selection, the information is forwarded to the dispatcher, which initiates the job on the selected resource through a job wrapper. The job wrapper is responsible for starting the execution of the job on the resource and returning the result to the parametric engine via the dispatcher.
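To give a flavour of the declarative parametric modelling language mentioned above, the fragment below sketches a plan in the general style of Nimrod's plan files. It is illustrative only: the parameter name, file names and exact keywords are assumptions, not verified Nimrod/G syntax.

```
# Hypothetical Nimrod-style plan: sweep a model over one parameter.
parameter temp float range from 280 to 320 step 10;

task main
    copy model.exe node:.
    node:execute ./model.exe $temp
    copy node:output.dat output.$jobname
endtask
```

The parametric engine would expand such a plan into one job per parameter value (here five jobs, one for each temperature) and hand them to the scheduler and dispatcher as described above.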

Nimrod/G supports resource reservation, co-allocation and load balancing through periodic rescheduling. However, it does not support migration, fault tolerance or policy-based scheduling.

2.6 Migration

The capability to move physical or virtual computational resources, such as software code, portable PCs, running objects, mobile agents and data, from one location to another across a local or global network is called migration. Migration is a very broad concept in distributed computing; it can be subdivided into personal, computer and computational migration. In the first of these, the user can work at locations remote from the physical hardware: users do not need to carry the hardware with them, but can start work in one location and continue it anywhere else in the world, regardless of machine type. Examples include web-based email accounts. Computer migration is concerned with moving an actual piece of computer hardware, such as a portable notebook PC or a personal digital assistant (PDA), from one location to another. Computational migration addresses the movement of software [83, 29]. This thesis is concerned with the latter.

Computational migration can be further classified into control migration, data migration, link migration and object migration [17]. Control migration supports moving a thread of control, as in Remote Procedure Call (RPC) and Remote Method Invocation (RMI), from one machine to another and back again. Data migration allows the data required by a process to be passed through the network; in a Java RMI call, for example, the method name, its arguments and its return value are packed and transmitted to and from the remote machine. Link migration refers to the transmission of the ability to move objects (code) between different servers; this is the fundamental concept behind distributed objects, and also a basic concept of distributed computing. Code migration (mobile computation) has come into popular use because it provides the capability to link software components at runtime, meaning that a software component can move around and execute on different servers across the network.
From the viewpoint of execution state, code migration can be divided into two groups, weak and strong migration [18].

2.6.1 Weak Migration

When code can be moved across nodes, this kind of migration is known as weak migration. The code sometimes has initialisation data attached, but no execution state. One illustration of a weak-migration system is a Java applet; others are Code-on-Demand (CoD) and Remote Execution/Evaluation [29, 77].

Remote evaluation is derived from the client-server and virtual machine styles [39]. A client component knows how a remote process should be carried out; it therefore sends the instructions to a server component at the remote site, which in turn executes the code using the resources available there. The client then receives the result from the server. The remote evaluation style assumes that the code provided will be executed in a protected environment, so that it will not affect other clients served by the same server beyond the capacity of the resources being used [28]. One of the advantages of remote evaluation is the ability to customise services as server components, which provides improved extensibility and customisability, and better efficiency when the code can adapt its actions to the environment inside the server. A good example of remote execution/evaluation is grid computing. Code-on-Demand style systems are used when a client component has access to a set of resources but does not know how to process them. It therefore sends a request to a remote server to acquire the "know-how" code. Once the code is received, the client executes it locally. The advantages of Code-on-Demand are the addition of features to a pre-deployed client, which provides enhanced extensibility and configurability, and better user-perceived performance and efficiency when the code can adapt its actions to the client's environment and interact with the user locally, without remote interactions [28]. A famous example of Code-on-Demand is Java applets.
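The contrast between the two weak-migration styles can be made concrete with a small sketch. In this toy illustration the "network" is replaced by direct function calls and the migrated code is a plain string; a real system would ship the code over a socket or HTTP and run it in a sandbox, as the protected-environment assumption above requires.

```python
def remote_evaluation(server_resources, code):
    """Remote evaluation: the client's code moves to the server
    and runs against the server's resources."""
    env = {"resources": server_resources}
    exec(code, env)            # executed in the server's environment
    return env["result"]

def code_on_demand(server_code, local_data):
    """Code-on-Demand: the 'know-how' code moves to the client
    and runs against the client's local data."""
    env = {"data": local_data}
    exec(server_code, env)     # executed in the client's environment
    return env["result"]

# Remote evaluation: the data stays on the server, the code travels to it.
total = remote_evaluation([3, 5, 7], "result = sum(resources)")

# Code-on-Demand: the code travels to the client, the data stays local.
doubled = code_on_demand("result = [x * 2 for x in data]", [1, 2, 3])

print(total, doubled)   # 15 [2, 4, 6]
```

In both cases only code (and perhaps initialisation data) moves, never a running execution state, which is what distinguishes weak from strong migration.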

2.6.2 Strong Migration

Strong migration refers to the capability of a computational environment to migrate both the code and its execution state (the context of execution) so that it can restart at a new resource. The execution state includes the running code, program counter, saved processor registers, return addresses and local variables. A group of coordinating threads runs in a process on the client and then accesses the remote resource by invoking a remote process. The advantage of strong migration is that an application can decide to move between locations while it is processing information, presumably in order to reduce the distance between it and the next set of data it wishes to process. In addition, the reliability problem of partial failure is reduced, because the application state is in only one location at a time.
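The idea of capturing and restoring an execution state can be illustrated at application level. In this hedged sketch a loop counter and partial result stand in for the program counter and local variables; real strong-migration systems capture this state at the runtime or operating-system level rather than asking the programmer to externalise it.

```python
import pickle

def run(state, budget):
    """Run until the work is done or the step budget is exhausted."""
    while state["i"] < state["n"] and budget > 0:
        state["total"] += state["i"]
        state["i"] += 1
        budget -= 1
    return state

state = {"i": 0, "n": 10, "total": 0}
state = run(state, budget=4)            # partial execution on "machine A"

snapshot = pickle.dumps(state)          # serialise the execution state...
resumed = pickle.loads(snapshot)        # ...transfer and restore on "machine B"

resumed = run(resumed, budget=100)      # resume exactly where it left off
print(resumed["total"])                 # 45 == sum(range(10))
```

Because the whole state travels in the snapshot, the computation exists in only one place at a time, which is the partial-failure advantage noted above.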

2.7 Summary

This chapter has surveyed the background and related work on the research issues covered in this thesis. It described the rapid growth in hardware, software and network bandwidth that motivated the evolution of the grid, and illustrated the need for increased CPU utilisation, which is one of the grid's aims.


