Working on Grid Computing for the last 5 years, I've accumulated my share of strange looks on people's faces when trying to describe the grid computing idea to them. What is fascinating is that the actual concepts behind Grid Computing are very familiar, simple, and logical to anyone with a basic understanding of how computers work.
Grid computing is not a technical term - it is a marketing term. There is no technology called "Grid Computing", there are no products that implement "Grid Computing". The term "Grid Computing" collectively defines a set of distributed computing use cases that have certain technical aspects in common.
There are four main use cases in distributed computing that are most often referred to as Grid Computing. Not all of them are equal in their usage or usefulness for everyday business. Yet they probably represent more than 95% of "Grid Computing" today:
1. Computational Grids
Computational Grids account for the lion's share of Grid Computing usage now and will certainly retain this lead for the near future, at the least.
In a nutshell, the idea behind it is very simple: if you have a task that takes unacceptably long to execute, you can split this task into multiple sub-tasks, execute each sub-task in parallel on a separate computer, combine the results from the sub-tasks, and get the original task's result up to N-times faster (minus coordination overhead), where N is the number of sub-tasks in the split.
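The split/execute/combine pattern above can be sketched in a few lines of standard-library Python. The workload here (summing squares over a large range) is just a stand-in for any divisible task, and the thread pool stands in for separate grid nodes; a real Computational Grid would ship each sub-task to a different machine.

```python
# Illustrative sketch only: a thread pool plays the role of grid nodes.
from concurrent.futures import ThreadPoolExecutor

def sub_task(chunk):
    # Each sub-task processes one slice of the original problem.
    start, end = chunk
    return sum(i * i for i in range(start, end))

def run_split(n, parts=4):
    # 1. Split the original task into `parts` roughly equal chunks.
    step = n // parts
    chunks = [(i * step, n if i == parts - 1 else (i + 1) * step)
              for i in range(parts)]
    # 2. Execute each sub-task in parallel on a separate "node".
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = pool.map(sub_task, chunks)
    # 3. Combine the partial results into the original task's result.
    return sum(partials)
```

The same three steps (split, parallel execute, combine) underlie virtually every Computational Grid framework, whatever its transport and scheduling machinery.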
Computational Grids account for almost all real-life applications of Grid Computing. The reason for this is quite obvious: unlike the other use cases, it provides clear and unambiguous value and is applicable to a wide variety of business tasks. When properly explained, it is easy to understand and easy to see how it can solve real-life problems.
Despite ill-fated comparisons to academic HPC, Computational Grids are applicable to businesses of any size - from small companies to the Googles and Yahoos of this world. For example, even a relatively small company may need to speed up the generation of a business report from 1 minute to 5 seconds to make it suitable for online access on their website - and in most cases a Computational Grid is the only practical way to achieve a 10-times performance improvement.
2. Utility or On-Demand Grids
This scenario is probably the most over-generalized of all by technology "visionaries" (you've surely read by now comparisons of computing resources to utilities like electricity or water services, and how you will be able to tap into this new type of utility by simply plugging your computer into an imaginary "computing" socket in the wall).
In a nutshell, though, the idea is surprisingly simple: if you have a number of tasks that from time to time spike in execution and overload your computer(s), you can offload some of these tasks during peak time to other computers, often located remotely. This has the following interesting characteristics:
- In most cases you only pay for the actual usage time of the remote computers, as opposed to buying your own computers that will sit idle most of the time (off-peak). As an example, Sun is offering $1/CPU/hour server farms which you can rent in one-hour increments.
- You solve a scalability problem, not a performance problem, i.e. you can execute more tasks without slowing down, but you won't execute each individual task faster (in many cases each task will actually run slower, due to the extra work of off-loading it to a remote computer, the remote computer being less powerful, etc.).
The main point of Utility or On-Demand Grids is on-demand computing availability, which for certain types of businesses makes them an attractive economic choice. For example, all online retailers experience repetitive cycles of increased customer activity: major holidays, the back-to-school season, etc. During these cycles the activity spikes (sometimes by an order of magnitude), and on-demand computing can provide an economically sound way of handling the increased load.
You may ask: why not split the tasks as we do in Computational Grids and achieve the performance increase as well? The answer lies in the fact that in the majority of cases the usage pattern of Computational Grids requires constant availability of all computing resources, making the on-demand point irrelevant - there is constant demand in Computational Grids.
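The offloading pattern behind Utility Grids can be sketched as a simple dispatcher: tasks run locally until local capacity is saturated, and only the overflow is sent to (pay-per-use) remote capacity. The class and thresholds below are hypothetical, not a real vendor API; the point is that you pay for remote capacity only during spikes.

```python
# Hypothetical sketch of on-demand overflow routing. Names and the
# per-task billing granularity are illustrative assumptions.
class OnDemandDispatcher:
    def __init__(self, local_capacity):
        self.local_capacity = local_capacity
        self.local_load = 0          # tasks currently on our own hardware
        self.remote_tasks_billed = 0 # tasks we paid a utility provider for

    def submit(self, task):
        # Off-peak: our own (already paid for) computers handle the work.
        if self.local_load < self.local_capacity:
            self.local_load += 1
            return "local"
        # Peak load: rent remote capacity instead of buying hardware
        # that would sit idle the rest of the year.
        self.remote_tasks_billed += 1
        return "remote"

    def complete_local(self):
        self.local_load -= 1
```

Note that each task still runs at its normal (or slower) speed; the dispatcher only keeps the overall throughput from collapsing under peak load, which is exactly the scalability-not-performance trade-off described above.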
3. Data Grids
Data Grids allow for splitting data across multiple computers. Much like Computational Grids split computations, Data Grids place data on multiple computers or storage resources and treat them virtually as one.
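The core Data Grid idea - many physical stores presented as one logical store - can be illustrated with a key/value store hash-partitioned across several nodes. In this sketch plain dictionaries stand in for the nodes, and the node names are made up; a real Data Grid adds replication, fail-over, and network transport on top of the same routing principle.

```python
# Illustrative sketch: hash partitioning across stand-in "nodes".
import hashlib

class PartitionedStore:
    def __init__(self, node_names):
        # One plain dict plays the role of each node's storage.
        self.nodes = {name: {} for name in node_names}
        self.names = list(node_names)

    def _node_for(self, key):
        # A stable hash so the same key always routes to the same node.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return self.names[digest[0] % len(self.names)]

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        # The caller sees one logical store, never the individual nodes.
        return self.nodes[self._node_for(key)].get(key)
```

From the caller's point of view `put` and `get` behave like a single dictionary, which is precisely the "treat them virtually as one" property.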
However, the show here has been "stolen" by replicated distributed caching, which accomplishes almost the same thing and is readily available today from a number of vendors. Moreover, unlike the elusive Data Grids, caching technology is highly integrated with server-side Java development. For the best example of distributed replicated caching, look no further than Coherence from Tangosol.
Another caveat of Data Grids is that their value is often unclear compared to NAS or similar storage systems.
4. Management Grids
Management Grids are only mentioned here because of Oracle. Oracle was the first company to put up a TV advertising campaign promoting what they call the "Oracle Grid". In essence, it is a specific fault-tolerant database implementation plus some administrative tools that allow the Oracle database to operate in a distributed environment in a managed, fault-tolerant mode.
Oracle's grid is probably the best example of a technology that has almost nothing to do with what most of us think of as Grid Computing - yet it is routinely referred to as such.
As you can see, in most cases the term Grid Computing means Computational Grids. They represent the majority of use cases and provide clear and ready value to the user. But wait, there is more to it...
What about SOA?
There is a lot of talk going on about synergy between Grid Computing and SOA. At this point, however, it is driven primarily by implementation concerns rather than by any deeper considerations. Clearly, Grid Computing can deliver its value unchanged without SOA, yet a WS-*-based implementation (such as Globus) can be beneficial in some cases (highly distributed heterogeneous environments, which should only exist in unfortunate legacy-support situations).
Any other prognosis (like OGSA) is nothing more than fiction at this time.
What about cluster computing and traditional HPC?
Pundits from academia would like you to believe that there is an irreconcilable difference between "primitive" cluster computing and "sophisticated" Grid Computing. The truth is that in most cases you won't be able to tell one from the other, and the differences exist primarily in theory.
As far as Computational Grids are concerned the difference is minimal, if any.
Global Grids vs. Enterprise Grids?
Another categorization you can find in the media is Global Grids vs. Enterprise (non-global) Grids. In all cases you can safely rename this discussion Non-commercial Grids vs. Commercial Grids, as Global Grids (in their pure notion) have almost no applicability to commercial applications.
At the same time, universities find Global Grids much to their advantage, as they usually don't concern themselves with security, complex resource allocation, or the other concerns that are dealt with daily in the business world.
What about OGSA/OGSI proposed by GGF and WSRF from OASIS?
Grid Computing is plagued by over-generalization and by science-fiction "specifications" produced by various consortiums (whose primary purpose is to collect membership fees from companies that want to get marketing value out of joining such a consortium or organization). As harsh as it may sound, these efforts are primarily responsible for the confusion that exists in the understanding of Grid Computing.
OGSA, the most popular example of such "specifications" (along with the alphabet soup of related documents surrounding it), indeed reads like good science fiction. As visionary as it sounds (and some parts of it will likely materialize in the future), it is best left for late-evening reading.