Grid Computing


by, Jisha P Abraham (Lecturer, Dept of Computer Science)

1. Overview

A grid computing system is built on the concept of load sharing: it spreads the load across multiple computers to complete tasks more efficiently and quickly. Normally, a computer can only operate within the limitations of its own resources. There is an upper limit to how fast it can complete an operation or how much information it can store. Most computers are upgradeable, which means it is possible to add more power or capacity to a single computer, but that’s just an incremental increase in performance.

Grid computing systems link computer resources together in a way that lets someone use one computer to access and leverage the collected power of all the computers in the system. To the individual user, it is as if their own computer has been transformed into a supercomputer.

The grid computing concept isn’t a new one. It is a special kind of distributed computing. In distributed computing, different computers within the same network share one or more resources. In the ideal grid computing system, every resource is shared, turning a computer network into a powerful supercomputer. With the right user interface, accessing a grid computing system would look no different from accessing a local machine’s resources. Every authorized computer would have access to enormous processing power and storage capacity. Grid computing can be defined simply as “a kind of high-performance computing, an emerging technique in which multiple computers link together to combine resources”.

2. Faces of grid computing

Grid computing is still a developing field and is related to several other innovative computing systems, some of which are subcategories of grid computing. Shared computing usually refers to a collection of computers that share processing power in order to complete a specific task. Then there’s a software-as-a-service (SaaS) system known as utility computing, in which a company offers specific services (such as data storage or increased processor power) for a metered cost. Cloud computing is a system in which applications and storage “live” on the Web rather than on a user’s computer.

Grid vs. supercomputers.

“Distributed” or “grid” computing in general is a special type of parallel computing that relies on complete computers (with on-board CPU, storage, power supply, network interface, etc.) connected to a network (private, public or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local high-speed computer bus.

The primary advantage of distributed computing is that each node can be purchased as commodity hardware; combined, these nodes can provide computing resources similar to those of a multiprocessor supercomputer, but at lower cost. This is due to the economies of scale of producing commodity hardware, compared to the lower efficiency of designing and constructing a small number of custom supercomputers. The primary performance disadvantage is that the various processors and local storage areas do not have high-speed connections. This arrangement is thus well suited to applications in which multiple parallel computations can take place independently, without the need to communicate intermediate results between processors. The high-end scalability of geographically dispersed grids is generally favourable, because the connectivity needed between nodes is low relative to the capacity of the public Internet.
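The kind of workload described above, in which parallel computations never exchange intermediate results, can be illustrated with a small sketch. It is only an illustration under stated assumptions: local threads stand in for grid nodes, counting primes is an arbitrary example task, and the names `count_primes`, `split` and `run_on_grid` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n):
    """Trial-division primality check (slow, but fine for a demo)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def count_primes(chunk):
    """One independent work unit: it needs no data from any other chunk."""
    return sum(1 for n in chunk if is_prime(n))


def split(start, stop, size):
    """Cut the interval [start, stop) into consecutive chunks of at most `size`."""
    return [range(lo, min(lo + size, stop)) for lo in range(start, stop, size)]


def run_on_grid(start, stop, size, nodes=4):
    """Hand each chunk to a worker and combine the partial results at the end.

    Assumption: threads on one machine play the role of grid nodes here.
    """
    chunks = split(start, stop, size)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial_counts = list(pool.map(count_primes, chunks))
    return sum(partial_counts)


if __name__ == "__main__":
    print(run_on_grid(0, 100, 10))
```

The only communication is at the start (sending a chunk) and at the end (returning one number), which is why slow links between nodes matter little for this class of problem.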

3. Design considerations and variations

One feature of distributed grids is that they can be formed from computing resources belonging to multiple individuals or organizations (known as multiple administrative domains). This can facilitate commercial transactions, as in utility computing, or make it easier to assemble volunteer computing networks.

Due to the lack of central control over hardware, there is no way to guarantee that nodes will not drop out of the network at random times. Some nodes (like laptops or dial-up Internet customers) may also be available for computation but not for network communication for unpredictable periods. These variations can be accommodated by assigning large work units (thus reducing the need for continuous network connectivity) and by reassigning work units when a given node fails to report its result as expected.
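The reassignment strategy can be sketched as a small simulation. This is not how any particular grid middleware implements it; the function `run_work_units`, the round-robin assignment and the `node_fails` callback are all assumptions made for illustration.

```python
from collections import deque


def run_work_units(units, nodes, compute, node_fails):
    """Assign work units to nodes in turn and requeue units that get no result.

    `node_fails(node, unit)` is a hypothetical callback that returns True when
    the node drops out before reporting back.
    """
    pending = deque(units)
    results = {}
    attempts = 0
    # Safety limit so the simulation stops if every node keeps failing.
    max_attempts = len(units) * len(nodes) * 10
    turn = 0
    while pending:
        if attempts >= max_attempts:
            raise RuntimeError("no node completed the remaining work units")
        unit = pending.popleft()
        node = nodes[turn % len(nodes)]
        turn += 1
        attempts += 1
        if node_fails(node, unit):
            # The node never reported a result: put the unit back in the queue
            # so that a later turn hands it to another node.
            pending.append(unit)
            continue
        results[unit] = compute(unit)
    return results


if __name__ == "__main__":
    # Node "b" is unreliable in this toy run; its units end up on other nodes.
    done = run_work_units(
        units=[1, 2, 3, 4, 5],
        nodes=["a", "b", "c"],
        compute=lambda u: u * u,
        node_fails=lambda node, unit: node == "b",
    )
    print(sorted(done.items()))
```

A real system would detect failure with a deadline rather than an immediate callback, but the bookkeeping is the same: a unit stays pending until some node reports its result.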

In many cases, the participating nodes must trust the central system not to abuse the access that is being granted, whether by interfering with the operation of other programs, mangling stored information, transmitting private data, or creating new security holes. Other systems employ measures to reduce the amount of trust “client” nodes must place in the central system, such as placing applications in virtual machines.

Public systems, or those crossing administrative domains (including different departments in the same organisation), often need to run on heterogeneous systems with different operating systems and hardware architectures. With many languages, there is a trade-off between investment in software development and the number of platforms that can be supported (and thus the size of the resulting network). Cross-platform languages can reduce the need to make this trade-off, though potentially at the expense of high performance on any given node (due to runtime interpretation or lack of optimization for the particular platform). Various middleware projects have created generic infrastructure that allows diverse scientific and commercial projects to harness a particular associated grid or to set up new grids. In fact, the middleware can be seen as a layer between the hardware and the software.

The emerging protocols for grid computing systems are designed to make it easier for developers to create applications and to facilitate communication between computers.

The most prevalent technique computer engineers use to protect data is encryption. To encrypt data is to encode it so that only someone possessing the appropriate key can decode the data and access it. Ironically, a hacker could conceivably create a grid computing system for the purpose of cracking encrypted information. Because encryption techniques use complicated algorithms to encode data, it would take a normal computer several years to crack a code (which usually involves finding the two large prime factors of an incredibly large number). With a powerful enough grid computing system, a hacker might find a way to reduce the time it takes to decipher encrypted data. It is hard to protect a system from hackers, particularly if the system relies on open standards. Every computer in a grid computing system has to have specific software to be able to connect and interact with the system as a whole; computers don’t know how to do it on their own. If the computer system’s software is proprietary, it might be harder (but not impossible) for a hacker to access the system.
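To see why a grid could shorten such a search, consider how the space of candidate divisors can be cut into ranges that are checked independently. The sketch below uses naive trial division on a toy number; real keys are far too large for this method, and the names `candidate_ranges`, `search_range` and `factor_semiprime` are hypothetical. The ranges are checked one after another here, but each one could be handed to a different node.

```python
from math import isqrt


def candidate_ranges(n, nodes):
    """Split the candidate divisors 2..isqrt(n) into about one range per node."""
    limit = isqrt(n)
    step = max(1, (limit - 1 + nodes - 1) // nodes)
    return [range(lo, min(lo + step, limit + 1)) for lo in range(2, limit + 1, step)]


def search_range(n, candidates):
    """Work unit for one node: return the first divisor of n in its range, if any."""
    for d in candidates:
        if n % d == 0:
            return d
    return None


def factor_semiprime(n, nodes=4):
    """Return a pair of factors of n, or None if no divisor is found.

    The ranges do not depend on each other, so a grid could search them all
    at the same time instead of in this sequential loop.
    """
    for chunk in candidate_ranges(n, nodes):
        d = search_range(n, chunk)
        if d is not None:
            return d, n // d
    return None


if __name__ == "__main__":
    print(factor_semiprime(10403))
```

With N nodes each range is roughly 1/N of the search space, so the best case is an N-fold speed-up; for numbers of cryptographic size even that leaves an impractically long search with this naive method.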

In most grid computing systems, only certain users are authorized to access the full capabilities of the network. Otherwise, the control node would be flooded with processing requests and nothing would happen (a situation called deadlock in the IT business). It is also important to limit access for security purposes. For that reason, most systems have authorization and authentication protocols. These protocols limit network access to a selected number of users. Other users are still able to access their own machines, but they can’t leverage the entire network.
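The difference between authentication (proving who a user is) and authorization (deciding what that user may do) can be shown with a minimal sketch. It does not follow any specific grid security protocol: the shared secret, the user list and the function names are made up for illustration, and only Python's standard `hmac` and `hashlib` modules are used.

```python
import hashlib
import hmac

SECRET = b"demo-secret"   # hypothetical key held by the control node
GRID_USERS = {"alice"}    # hypothetical users allowed to use the whole grid


def issue_token(user):
    """Token the control node would hand to a user after login."""
    return hmac.new(SECRET, user.encode(), hashlib.sha256).hexdigest()


def authenticate(user, token):
    """Authentication: does the token really belong to this user?"""
    return hmac.compare_digest(issue_token(user), token)


def can_submit_grid_job(user, token):
    """Authorization: only authenticated users on the list may use the grid."""
    return authenticate(user, token) and user in GRID_USERS


if __name__ == "__main__":
    print(can_submit_grid_job("alice", issue_token("alice")))
    print(can_submit_grid_job("bob", issue_token("bob")))
```

In this sketch a user such as "bob" authenticates successfully but is still refused grid-wide jobs, which matches the situation described above: such users keep access to their own machines but cannot leverage the entire network.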

The middleware and control node of a grid computing system are responsible for keeping the system running smoothly. Together, they control how much access each computer has to the network resources and vice versa. While it’s important not to let any one computer dominate the network, it’s just as important not to let network applications take up all the resources of any one computer. If the system robs users of computing resources, it is not an efficient system.

4. Grid computing applications

The Search for Extraterrestrial Intelligence (SETI) project is one of the earliest grid computing systems to gain popular attention. The mission of the SETI project is to analyze data gathered by radio telescopes in search of evidence of intelligent alien communications. There is far too much information for a single computer to analyze effectively. Instead, the SETI project created a program called SETI@home, which networks computers together to form a virtual supercomputer.

A similar program is the Folding@home project, administered by the Pande Group, a nonprofit institution in Stanford University’s chemistry department. The Pande Group is studying proteins. The research includes the way proteins take certain shapes, called folds, and how that relates to what proteins do. Scientists believe that protein “misfolding” could be the cause of diseases like Parkinson’s or Alzheimer’s. It is possible that by studying proteins, the Pande Group may discover new ways to treat or even cure these diseases.

As grid computing systems become more sophisticated, we will see more organisations and corporations create versatile networks. There may even come a day when corporations internetwork with other companies. In that environment, computational problems that seem impossible now may be reduced to a project that lasts a few hours. We will have to wait and see!


