Blog

GridGain just posted service point releases for In-Memory HPC and In-Memory Database products version 5.1.6. If you are currently running either of these two products, we recommend updating. This point release includes performance improvements, new features and a number of bug fixes:

  • New CLIENT_ONLY mode for partitioned cache.
  • New ATOMIC atomicity mode for better performance for non-transactional use.
  • New optional GridOptimizedMarshallable interface to improve optimized marshaller.
  • New one-phase commit in TRANSACTIONAL mode for basic put and putAll operations.
  • New automatic back-pressure control for async operations.
  • Multiple fixes/enhancements to Visor Management Console.

Release notes available.

What does the relatively new acronym MCI have to do with the accelerated adoption of in-memory computing? I’d say everything.

MCI stands for Memory Channel Interface storage (a.k.a. MCS – Memory Channel Storage). It essentially allows you to put NAND flash storage into a DIMM form factor and have it interface with a CPU via a standard memory controller. Put another way, MCI provides a drop-in replacement for DDR3 RDIMMs with 10x the memory capacity and a 10x reduction in price.

Historically, one of the major inhibitors behind in-memory computing adoption was the high cost of DRAM relative to disks and flash storage. While advantages such as 100x performance, lower power consumption and higher reliability were clearly known for years, the price delta was and is still relatively high:

Storage                      Performance (approx.)   Price (approx.)
1TB DDR3 RDIMM (32 DIMMs)    1000-10,000x            $20,000
1TB PCI-E                    10-100x                 $4,000
1TB SSD                      10-100x                 $1,000
1TB HDD                      1x                      $100
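
To make the price delta concrete, the figures in the table can be turned into a cost per gigabyte. This is only a rough sketch: the prices come from the table, while the 1TB = 1024GB conversion and all names in the code are assumptions made for illustration.

```python
# Prices per 1TB of storage, taken from the table above (USD).
PRICE_PER_TB = {
    "DDR3 RDIMM": 20_000,
    "PCI-E flash": 4_000,
    "SSD": 1_000,
    "HDD": 100,
}

GB_PER_TB = 1024  # assumption: binary terabyte


def price_per_gb(storage: str) -> float:
    """Approximate cost of one gigabyte for the given storage type."""
    return PRICE_PER_TB[storage] / GB_PER_TB


def ratio_to_hdd(storage: str) -> float:
    """How many times more expensive the storage is than HDD."""
    return PRICE_PER_TB[storage] / PRICE_PER_TB["HDD"]


for name in PRICE_PER_TB:
    print(f"{name:12s} ${price_per_gb(name):6.2f}/GB  {ratio_to_hdd(name):5.0f}x HDD")
```

By this measure DRAM costs about 200x as much as HDD and 20x as much as SSD per gigabyte.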

While spinning HDDs are essentially cost-free for enterprise consumption, and flash storage is enjoying mass adoption, DRAM storage still lags behind simply due to higher cost.

MCI-based storage is about to change this once and for all as it aims to bring the price of flash-based DRAM to the same level as today’s SSD and PCI-E flash storage.

MCI vs. PCI-E Flash

If prices are relatively similar between MCI and PCI-E storage, what makes MCI so much more important? The answer is direct memory access vs. block-based device.

All PCI-E flash storage today (FusionIO, Violin, basic SSDs, etc.) is recognized by the OS as a block device, i.e. essentially a fast hard drive. Applications access these devices via a typical file interface, with all the usual marshaling, buffering, OS context switching, networking and IO overhead.

MCI provides an option to view its flash storage simply as main system memory, eliminating all the OS/IO/network overhead, while working directly via a highly optimized memory controller – the same controller that handles massive CPU-DDR3 data exchange – and enabling software like GridGain’s to access the flash storage as normal memory. This is a game changer and potentially a final frontier in the storage placement technology. In fact, you can’t place application data any closer to the CPU than the main memory and that is precisely what MCI enables us to do on terabyte and petabyte scale.

Moreover, MCI provides direct improvements over PCI-E storage. Diablo Technologies, the pioneer behind MCI technology, claims that MCI is more performant (lower latencies and higher bandwidth) than typical PCI-E and SATA SSDs while providing the ever-elusive constant latency that is unachievable with standard PCI-E or SSD technologies.

Plug-n-Play

Another important characteristic of MCI storage is the plug-n-play fashion in which it can be used – no custom hardware, no custom software required. Imagine, for example, an array of 100 micro-servers (ARM-based servers in micro form factor), each with 256GB of MCI-based system memory, drawing less than 10 watts of power, costing less than $1000 each.

You now have a cluster with 25TB in-memory storage, 200 cores of processing power, running standard Linux, drawing around 1000 watts for about the same cost as a fully loaded Tesla Model S. Put GridGain’s In-Memory Computing Stack on it and you have an eco-friendly, cost effective, powerful real-time big data cluster ready for any task.
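
The cluster totals above follow from simple multiplication. The sketch below just reproduces that arithmetic; the per-server figures are the upper bounds quoted in the example, and the two cores per server is an assumption implied by 200 cores across 100 servers.

```python
# Figures from the example above; per-server values are the quoted upper
# bounds ("less than 10 watts", "less than $1000").
SERVERS = 100
MEMORY_GB_PER_SERVER = 256
WATTS_PER_SERVER = 10
COST_PER_SERVER_USD = 1_000
CORES_PER_SERVER = 2  # assumption: implied by 200 cores across 100 servers

total_memory_tb = SERVERS * MEMORY_GB_PER_SERVER / 1024  # binary terabytes
total_cores = SERVERS * CORES_PER_SERVER
total_watts = SERVERS * WATTS_PER_SERVER
total_cost_usd = SERVERS * COST_PER_SERVER_USD

print(f"{total_memory_tb:.0f} TB, {total_cores} cores, "
      f"{total_watts} W, ${total_cost_usd:,}")
```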

GridGain Expands Its Management Team

Posted on Tuesday, August 13, 2013
 Blog, News and Press

FOR IMMEDIATE RELEASE

FOSTER CITY, Calif., Aug. 13, 2013 /PRNewswire/ – GridGain™ Systems today announced it has expanded its management team, tapping the talents of Andy Sacks as Executive Vice President of Sales, Lisa Bergamo as Vice President of Marketing, and Jeff Stacey as Global Head of Business Development, to build on its current momentum in In-Memory Computing. The announcement follows its recent closing of $10 million in Series B venture financing in a round led by global venture capital firm Almaz Capital, with continued participation from previous investor RTP Ventures.

The team additions mark key elements of GridGain’s three-pronged promise to use its new capital to rapidly expand sales, marketing and new product development to meet the growing need for In-Memory Computing in big data environments. “The market need for new technology that can handle the 2.5 quintillion bytes of data that businesses generate has perfectly aligned with GridGain’s ability to offer unprecedented computing power,” said Nikita Ivanov, Founder and CEO, GridGain. “Andy Sacks, Lisa Bergamo and Jeff Stacey bring the skills and experience to capitalize this moment and offer In-Memory technology to every organization.”

As Executive Vice President of Sales, Andy Sacks brings more than 20 years of enterprise sales experience in developing direct and indirect routes to market. He comes to GridGain from Red Hat, Inc., where he spent over 8 years developing and leading sales teams, delivering substantial company revenue. Prior to Red Hat, he held sales leadership roles at Bluestone Software (acquired by HP), RightWorks (acquired by i2) and Inktomi (acquired by Yahoo! and Verity).

As Vice President of Marketing, Lisa Bergamo brings to GridGain more than 25 years of technology marketing, branding, and public relations experience. Bergamo, who has consistently delivered results for emerging technology companies over the past two decades, was founder and vice president of marketing for SOASTA, Inc., developer of an award-winning, on-demand cloud service for load and performance testing of websites and applications. Bergamo has also held executive marketing positions at Infochimps, Symplified, GlobalFluency, Sagent and CyberSource Corporation, and venture capital firm Canaan Partners. She currently serves as a mentor for start-up accelerator Founders Pad.

Jeff Stacey brings to his new position as Global Head of Business Development twenty years of experience at companies like Dell, IBM, SAP/Business Objects and Oracle, as well as smaller emerging technology firms, launching large scale analytic products into the marketplace. Partnering with system integrators, resellers and analytic solution providers, he repeatedly won, managed and grew technology ecosystems from zero to over $100M in global revenue.

“For enterprises, whether their data becomes a value prop or pain point depends largely on whether or not they embrace In-Memory technology,” said Ivanov. “From this moment, data will only become more unruly, problems more complicated, and time a more precious commodity. Companies need new technology to handle a new scope of data challenges, and GridGain provides the full stack.”

About GridGain™
GridGain’s complete In-Memory Computing platform enables organizations to conquer challenges that traditional technology can’t even fathom. While most organizations now ingest infinitely more data than they can possibly make sense of, GridGain’s customers leverage a new level of real-time computing power that allows them to easily innovate ahead of the accelerating pace of business.

Built from the ground up, GridGain’s product line delivers all the high performance benefits of In-Memory Computing in a simple, intuitive package. From high performance computing, streaming and database to Hadoop and MongoDB accelerators, GridGain provides a complete end-to-end stack for low-latency, high performance computing for each and every category of payloads and data processing requirements. Fortune 500 companies, top government agencies and innovative mobile and web companies use GridGain to achieve unprecedented performance and business insight. GridGain is headquartered in Foster City, California. Learn more at http://www.gridgain.com.

###

Contact:
Michael Burke
MSR Communications
415.989.9000
GridGain@msrcommunications.com

What are the performance differences between in-memory columnar databases like SAP HANA and GridGain’s In-Memory Database (IMDB) utilizing distributed key-value storage? This question comes up regularly in conversations with our customers, and the answer is not very obvious.

Storage Models

First off, let’s clearly state that we are talking about the storage model only and its implications for performance in various use cases. It’s important to note that:

  • The storage model doesn’t dictate or preclude particular transactionality or consistency guarantees; there are columnar databases that support ACID (HANA) and those that don’t (HBase); there are distributed key-value databases that support ACID (GridGain) and those that don’t (for example, Riak and memcached).
  • The storage model doesn’t dictate a specific query language; using the examples above, GridGain and HANA support SQL, while HBase doesn’t.

Performance considerations, unlike transactionality and query language, are not that straightforward.

Note also: SAP HANA has a pluggable storage model and an experimental row-based storage implementation. We’ll concentrate on columnar storage, which apparently accounts for all HANA usage at this point.

HANA’s Columnar Storage Model

Let’s recall what the columnar storage model entails in general and note its HANA specifics.

Some of its standout characteristics include:

  • Data in the columnar model is kept in columns (vs. rows as in row storage models).
  • Since data in a single column is almost always homogeneous, it’s frequently compressed for storage (especially in in-memory systems like HANA).
  • Aggregate functions (i.e. column functions) are very fast on the columnar data model since the entire column can be fetched very quickly and effectively indexed.
  • Inserts, updates and row functions, however, are significantly slower than their row-based counterparts as a trade-off of the columnar approach (inserting a row leads to multiple column inserts). Because of this characteristic, columnar databases are typically used in R/OLAP scenarios (where data doesn’t change) and very rarely in OLTP use cases (where data changes frequently).
  • Since columnar storage is fairly compact, it doesn’t generally require distribution (i.e. data partitioning) to store large datasets: the entire database can often be logically stored in the memory of a single server. HANA, however, provides comprehensive support for data partitioning.

It is important to emphasize that the columnar storage model is ideally suited for very compact memory utilization for two main reasons:

  • The columnar model is a natural fit for compression, which often provides a dramatic reduction in memory consumption.
  • Since column-based functions are very fast, there is no need to keep materialized views of aggregated values; the necessary values are simply computed on the fly, which significantly reduces the memory footprint as well.
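
The trade-offs listed above can be seen in a toy example. The following is a minimal sketch in plain Python, not HANA’s actual storage format: the sample rows, the run-length encoding helper and the function names are all assumptions chosen for illustration.

```python
from itertools import groupby

# A tiny table in row layout (hypothetical sample data).
rows = [
    {"id": 1, "country": "US", "amount": 10},
    {"id": 2, "country": "US", "amount": 20},
    {"id": 3, "country": "US", "amount": 30},
    {"id": 4, "country": "DE", "amount": 40},
]

# The same table in column layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Homogeneous column data compresses well.
country_rle = run_length_encode(columns["country"])

# An aggregate reads exactly one column and ignores the others.
total_amount = sum(columns["amount"])


def insert_row(columns, row):
    """Insert one logical row; returns how many columns had to be touched."""
    touched = 0
    for name, value in row.items():
        columns[name].append(value)
        touched += 1
    return touched


columns_touched = insert_row(columns, {"id": 5, "country": "DE", "amount": 50})
```

Here the aggregate scans a single list, while inserting one logical row writes to every column, which is the OLAP-versus-OLTP trade-off described above in miniature.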

GridGain’s IMDB Key-Value Storage Model

Key-value (KV) storage model is less defined than its columnar counterpart and usually involves a fair amount of vendor specifics.

Historically, there are two schools of KV storage models:

  • Traditional (examples include Riak, memcached, Redis). The common characteristic of these systems is a raw, language independent storage format for the keys and values.
  • Data Grid (examples include GridGain IMDB, GigaSpaces, Coherence). The common trait of these systems is the reliance on the JVM as the underlying runtime platform, and treating keys and values as user-defined JVM objects.

GridGain’s IMDB belongs to the Data Grid branch of KV storage models. Some of its key characteristics are:

  • Data is stored in a set of distributed maps (a.k.a. dictionaries or caches); in a simple approximation you can think of a value as a row in row-based model, and a key as that row’s primary key. Following this analogy a single KV map can be approximated as row-based table with automatic primary key index.
  • Keys and values are represented as user-defined JVM objects and therefore no automatic compression can be performed.
  • Data distribution is built in from the ground up. Data is partitioned across the cluster, which mitigates, in part, the lack of compression. Unlike in HANA, data partitioning is mandatory.
  • MapReduce is the main API for data processing (SQL is supported as well).
  • Strong affinity and co-location semantics are provided by default.
  • No bias towards aggregate or row-based processing performance and therefore no bias towards either OLAP or OLTP applicability.
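
To illustrate the partitioning and co-location ideas in the list above, here is a minimal sketch of a hash-partitioned map. It is written in Python for brevity and is not GridGain’s API: the node names, the `PartitionedMap` class and the `affinity_key` parameter are hypothetical, and a CRC32 hash stands in for whatever affinity function a real data grid uses.

```python
import zlib

NODES = ["node-0", "node-1", "node-2"]  # hypothetical cluster members


def node_for(affinity_key) -> str:
    """Map an affinity key to a node using a stable hash."""
    digest = zlib.crc32(repr(affinity_key).encode("utf-8"))
    return NODES[digest % len(NODES)]


class PartitionedMap:
    """A key-value map whose entries are spread across the nodes."""

    def __init__(self):
        self.stores = {node: {} for node in NODES}

    def put(self, key, value, affinity_key=None) -> str:
        # By default an entry is placed by its own key; an explicit
        # affinity key places it next to a related entry instead.
        node = node_for(key if affinity_key is None else affinity_key)
        self.stores[node][key] = value
        return node

    def get(self, key, affinity_key=None):
        node = node_for(key if affinity_key is None else affinity_key)
        return self.stores[node].get(key)


customers = PartitionedMap()
orders = PartitionedMap()

customer_node = customers.put("customer-7", {"name": "Acme"})
# The order uses the customer's key for placement, so both entries
# end up on the same node and can be processed together locally.
order_node = orders.put("order-1001", {"total": 250}, affinity_key="customer-7")
```

Each map behaves like a row-based table with a primary key index, and placing an order by its customer’s key keeps related data on one node so that a computation over both does not need to cross the network.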

Performance Considerations

It is somewhat expected that for heavy transactional processing GridGain will provide overall better performance in most cases:

  • The columnar model is rather inefficient at updating or inserting values in multiple columns.
  • Transactional locking is also less efficient in the columnar model.
  • The required de-compression and re-compression further degrade performance.
  • The KV storage model, on the other hand, provides an ideal model for individual updates, as individual objects can be accessed, locked and updated very effectively.
  • The lack of compression in GridGain IMDB makes updates go even faster than in the columnar model with compression.
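
The point about accessing, locking and updating individual objects can be sketched with a map that keeps one lock per key. This is a single-process Python illustration under stated assumptions, not GridGain’s transactional implementation; the `LockingMap` class and its methods are hypothetical names.

```python
import threading


class LockingMap:
    """Key-value map where each entry is locked and updated on its own."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the lock table only

    def _lock_for(self, key) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def update(self, key, fn, default=0):
        """Lock a single entry, apply fn to its value, store the result."""
        with self._lock_for(key):
            new_value = fn(self._data.get(key, default))
            self._data[key] = new_value
            return new_value

    def get(self, key):
        return self._data.get(key)


accounts = LockingMap()


def deposit_many(times: int) -> None:
    for _ in range(times):
        accounts.update("acct-1", lambda balance: balance + 1)


# Four writers update the same entry concurrently; only that entry is locked.
threads = [threading.Thread(target=deposit_many, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

An update touches exactly one entry and one lock, with no column to decompress or rewrite, which is why this model suits frequent individual updates.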

As an example, GridGain just won a public tender for one of the biggest financial institutions in the world, achieving 1 billion transactional updates per second on 10 commodity blades costing less than $25K altogether. That transactional performance and the associated TCO are clearly not territory any columnar database can approach.

For OLAP workloads the picture is less obvious. HANA is heavily biased towards OLAP processing, and GridGain IMDB is neutral towards it. Both GridGain IMDB and SAP HANA provide comprehensive data partitioning capabilities and allow for processing parallelization, the MPP traits necessary for scale-out OLAP processing. I believe the actual difference observed by customers will be driven primarily by three factors rooted deeply in the differences between the columnar and KV implementations in the respective products:

  • Optimizations around data affinity and co-location.
  • Optimizations around the distribution overhead.
  • Optimizations around indexing of partitioned data.

Unfortunately, there’s no way to provide any generalized guidance on the performance difference here… We always recommend trying both in your particular scenario, paying attention to the specific configuration and tuning around the three points mentioned above, and seeing what results you get. It does take time and resources, but you may be surprised by your findings!

It’s been somewhat quiet here on the GridGain front for a few months, and for good reason!

We just announced closing a $10M Series B investment and bringing an awesome new investor on board. In the last 6 months we not only closed the new round, we also rebuilt and tripled our sales and business development team, retooled our marketing, released new products, and have 3 other products in the development pipeline scheduled for announcement later this year.

But I think the most important thing we’ve accomplished so far is the crystallization and validation of our vision and strategy around our end-to-end stack for In-Memory Computing.

In-Memory Computing

Kirill Sheynkman, one of our board members and an investor, probably put it the best: “In-Memory Computing is characterized by using high-performance, integrated, distributed memory systems to manage and transact on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based technologies”.

In-Memory Computing is a new way to compute and store data, a type of revolution we haven’t witnessed since the early 1970s, when IBM released the “Winchester” disk, the IBM 3340, and the era of HDDs officially began. Today, we are in the same kind of transitional period, moving away from HDDs, SSDs and other block devices to a new era of DRAM-based storage and creating a tidal wave of innovation in software.

Just as the development of cheap HDDs pushed forward the database industry in the 1970s and SQL was born, today relentless data growth coupled with real time requirements for data processing necessitate a move to in-memory processing, massive parallelization and unstructured data.

Unlike the companies around us, we at GridGain strongly believe that In-Memory Computing is a paradigm shift. It’s not just a single product, enhancement or feature add-on; it’s a different way to think about how we deal with exponentially growing data sets and the unrelenting appetite for actionable, real-time data intelligence and analysis.

Here at GridGain we are leading this revolution and we have the vision and technology to do just that.

End-to-End In-Memory Computing Stack

Most of today’s business applications dealing with large data sets (outside of legacy batch processing) are built to process three different types of payload:

  • Database or system of record,
  • High performance, parallelized computations, and
  • Real time, high frequency streaming and CEP data processing

These three types of payload (or a combination of them) are at the core of practically every big data end user system built today. Providing in-memory products that directly address these three types of payload is what makes GridGain an end-to-end In-Memory Computing stack.

What’s also important is that we have built our product line from the ground up. We didn’t acquire some fledgling startup, gain pieces of technology from a merger, or revamp some dying open source project to quickly fill a gap in the product line. Every product we have is built by the same team, from the same base, and came about as a natural evolution of our product line over the last 7 years.

That’s why you have absolutely zero learning curve when moving from product to product. Our customers often note just how cohesive and unified our products “feel” to them: familiar APIs, principles and concepts, same configuration, same management, same installation, same documentation… and the same engineers providing top-notch support.

Platforms don’t get built by haphazardly stitching together random pieces of software. They grow organically over time in the hands of dedicated product and engineering teams.

Integrated Products

A few years ago we noticed a class of customers that wanted the increased speed and scalability benefits of in-memory computing but just didn’t have the appetite for the development work, and simply shied away from using any in-memory computing products at all.

Instead of losing these customers, we decided to pick some of the most common use cases and create highly integrated, plug-n-play products, i.e. accelerators, so that they can enjoy the benefits of in-memory computing without the need for development cycles or potential changes in their systems.

That’s how our In-Memory Hadoop and In-Memory NoSQL Accelerators came about. And we’ll add storage accelerators to the mix in a few months.

A unique characteristic of GridGain’s integrated products is the “no assembly required” nature in which they integrate. They deliver all the scalability and performance advantages of GridGain’s In-Memory Computing stack with zero code changes and minimal configuration changes to the host products.

Management and Monitoring

No stack can be truly considered end-to-end without incorporating a single, unified management and monitoring system. GridGain prides itself on providing the #1 DevOps support technology of any in-memory computing company with its Visor Administration Console. GridGain’s Visor is a GUI- and CLI-based system that provides deep runtime management, monitoring, and operational command and control for running any production GridGain product.

Time Is Now

Einstein got it right when he said imagination is more important than knowledge. At GridGain, we’ve re-imagined ultimate performance as In-Memory Computing so that you can re-imagine your company for today’s increasingly competitive business environment.

GridGain understands that In-Memory Computing is more than the latest tech trend. It’s the next major shift for an increasingly hyper business world in which organizations face problems that traditional technology can’t even fathom, much less solve. In-Memory Computing is a step all organizations must take to remain competitive, and we’re ready to take that step with you.

You’ll never need to analyze less data. The speed of business will never be slower. Your business challenges will never be simpler. Now is the time for In-Memory Computing – only GridGain gives you a complete solution without any compromises.

FOR IMMEDIATE RELEASE

GridGain Secures $10 Million in Series B Funding
In-Memory Computing Pioneer Secures Global Investment and Accelerates Growth

FOSTER CITY, California – July 29, 2013 – GridGain™ Systems today announced a closing of $10 million in Series B venture financing. The round was led by new investor Almaz Capital, a global venture capital firm, with continued participation from previous investor RTP Ventures, the U.S. arm of ru-Net Holdings and one of the largest internet and technology investors in Russia.

“GridGain is addressing a real need in a rapidly growing big data market. Due to this market growth, the company is making tremendous traction,” said Geoff Baehr, managing partner, Almaz Capital. “Almaz Capital’s goal has always been to seek and build long-term partnerships with passionate entrepreneurs who are eager to make a difference. GridGain’s management team has proven experience in In-Memory Computing as well as the vision and drive to transform the industry.”

This new capital will be used to rapidly expand sales, marketing and new product development to meet the growing need for In-Memory Computing (IMC) in big data environments. GridGain also announced that Geoffrey Baehr has been added to the company’s board, joining RTP Ventures.

“During the next two to three years, In-Memory Computing will become a key element in the strategy of organizations focused on improving effectiveness and business growth. Organizations looking for cost containment and efficiency will also increasingly embrace IMC,” said Massimo Pezzini, vice president and fellow, Gartner. “In-memory will have an industry impact comparable to web and cloud.”

GridGain was founded by Nikita Ivanov, a seasoned technologist and pioneer in using Java for high performance computing, and distributed computing expert Dmitriy Setrakyan. The company develops software for businesses that see real-time big data processing as a strategic asset. With a comprehensive, proprietary in-memory data platform, GridGain provides unique integration between in-memory data and compute-grid technologies and can scale up from a single server to thousands of machines to handle terabytes of data in real time.

“Investors such as RTP Ventures and Almaz Capital, which appreciate the new frontier of big data will help GridGain increase its market reach and penetration,” said CEO Nikita Ivanov. “This round of financing will enable us to aggressively invest in our products and go-to-market strategies, and is a testimony to the confidence that investors have in our company being able to capitalize on the market opportunity.”

About GridGain
GridGain’s complete In-Memory Computing platform enables organizations to conquer challenges that traditional technology can’t even fathom. While most organizations now ingest infinitely more data than they can possibly make sense of, GridGain’s customers leverage a new level of real-time computing power that allows them to easily innovate ahead of the accelerating pace of business. Built from the ground up, GridGain’s product line delivers all the high performance benefits of In-Memory Computing in a simple, intuitive package. From high performance computing, streaming and database to HDFS and MongoDB accelerators, GridGain provides a complete end-to-end stack for low-latency, high performance computing for each and every category of payloads and data processing requirements. Fortune 500 companies, top government agencies and innovative mobile and web companies use GridGain to achieve unprecedented performance and business insight. GridGain is headquartered in Foster City, California. Learn more at http://www.gridgain.com

Sources: Gartner press release, “Gartner Says In-Memory Computing Is Racing Towards Mainstream Adoption”, April 3, 2013. http://www.gartner.com/newsroom/id/2405315; SAP Innovation Forum, “The Next Generation Architecture: In-Memory Computing”, February 28, 2012.

###

Contact:
Michael Burke
MSR Communications
415.989.9000
gridgain@msrcommunications.com

Dmitriy Setrakyan will talk about Pearls of Distributed Programming with Scala and GridGain at Philly’s Scala Meetup, April 16th @ 7pm. Just a few slides but plenty of live coding with some pretty cool and advanced distributed concepts. If you are in or around Philly, stop by.

All information is here.

GridGain 4.5 Released!

Posted on Friday, February 22, 2013
 Blog, Product Releases

GridGain is pleased to announce the GA release of GridGain 4.5. This is the last major release in the 4.x product line. We’ve long been working on 5.x features, and 5.0 is just around the corner.

We highly recommend that all customers and users running 4.x upgrade to the latest 4.5 release.

New Features And Enhancements

  • HyperLocking to minimize locking and serialization overhead for cache transactions under load.
  • Risk Analytics benchmark.
  • Added support for Custom SQL Functions.
  • GridCacheQueryCustomFunctionExample to show how to use them.
  • Full off-heap indexing to GridH2IndexingSpi.
  • Topic-based user message exchange.
  • GridNoopCheckpointSpi to remove checkpoint overhead whenever checkpoints are not used.
  • GridNoopSwapSpaceSpi to remove swap space overhead whenever it is not used.

Visor New Features and Enhancements

  • Telemetry screen in Visor to show overall grid status based on various metrics.
  • Dedicated cache tab to show all cache-specific information.

Core Bug Fixes

  • Path space issues in ggstart.bat startup script.
  • Deadlock with concurrent evictAll() and unswapAll().
  • Query iterators are removed but not closed when originating node leaves or fails.
  • Restructured all examples to make them easier to use and understand.

Client Connectivity Bug Fixes

  • Removed ADD method from client API as it was identical to putIfAbsent method.

Visor Management Bug Fixes

  • Visor graph tooltip does not show complete information.
  • Visor spits errors (failed to fetch model update) when a new node joins and is busy with data pre-loading.

GridGain will present at the Seattle Scalability and Distributed Systems Meetup in Seattle, on the MSFT campus. We’ll do one of the coolest presentations we’ve done lately, namely “Pearls of Distributed Programming with GridGain & Scala”. Non-stop live Scala coding with pretty amazing examples of what modern distributed programming should be…

All information can be found here.

Wikibon produced an interesting paper (apparently paid for by Aerospike, a NoSQL database that recently emerged by resurrecting the failed CitrusLeaf and acquihiring AlchemyDB, and whose product was, of course, recommended in the end) that compares NoSQL databases based on storing data in flash-based SSD vs. storing data in DRAM.

There are a number of factual problems with that paper and I want to point them out.

Note that Wikibon doesn’t mention GridGain in this study (we are not a NoSQL datastore per se, after all), so I don’t have a dog in this fight other than annoyance with biased and factually incorrect writing.

“Minimal” Performance Advantage of DRAM vs SSD

The paper starts with a simple statement: “The minimal performance disadvantage of flash, relative to main memory…”. Minimal? I’ve seen a number of studies where the performance difference between SSDs and DRAM ranges from 100 to 10,000 times. For example, this University of California, Berkeley study claims that SSDs bring almost no advantage to the Facebook Hadoop cluster and that DRAM pre-caching is the way forward.

Let me provide an even shorter explanation. Assuming we are dealing with Java, SSD devices are visible to a Java application as typical block devices, and are therefore accessed as such. This means that a typical object read from such a device involves the same steps as reading the object from a file: hardware I/O subsystem, OS I/O subsystem, OS buffering, Java I/O subsystem and buffering, Java deserialization and induced GC. And if you read the same object from DRAM, it involves a few bytecode instructions, and that’s it.
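
The two read paths can be contrasted in a few lines. The sketch below uses Python and `pickle` as a stand-in for the Java I/O and deserialization path described above, so the mechanics are analogous rather than identical; the record, the temporary file and the function names are assumptions for illustration, and no timings are claimed.

```python
import os
import pickle
import tempfile

record = {"id": 42, "name": "example", "tags": ["a", "b"]}

# Persist the object to a file, as it would live on a block device.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    pickle.dump(record, f)


def read_from_block_device(file_path):
    """File open, OS and runtime buffering, then deserialization."""
    with open(file_path, "rb") as f:
        return pickle.load(f)  # allocates a brand-new copy of the object


# The same object kept in main memory.
in_memory = {"record-42": record}


def read_from_memory(key):
    """A single lookup that returns a reference to the existing object."""
    return in_memory[key]


from_disk = read_from_block_device(path)
from_ram = read_from_memory("record-42")
os.remove(path)
```

The file path ends with a freshly allocated copy that later has to be garbage collected, while the in-memory path hands back the object that is already there.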

Native C/C++ apps (like MongoDB) can take a slightly quicker route with memory-mapped files (or various other IPC methods), but the performance increase will not be significant, for the obvious reason that entire pages must be read or swapped, vs. the single-object access pattern in DRAM.

Yet another recent technical explanation of the disadvantages of SSD storage can be found here (talking about Oracle’s “in-memory” strategy).

MongoDB, Cassandra, CouchDB DRAM-based?

Amid all the confusion on this topic it’s no wonder the author got it wrong. None of MongoDB, Cassandra or CouchDB is an in-memory system. They are disk-based systems with support for memory caching. There’s nothing wrong with that and nothing new: every database developed in the last 25 years naturally provides in-memory caching to augment its main disk storage.

The fundamental difference here is that in-memory data systems like GridGain, SAP HANA, GigaSpaces, GemFire, SqlFire, MemSQL, VoltDB, etc. use DRAM (memory) as the main storage medium and use disk for optional durability and overflow. This focus on RAM-based storage allows all the main algorithms used in these systems to be completely re-optimized.

For example, the ACID implementation in GridGain, which provides support for full-featured distributed ACID transactions, beats every NoSQL database (EC-based) out there in read and even write performance: there are no single-key limitations, no consistency trade-offs to make, no application-side MVCC, no user-based conflict resolution or other crutches. It just works the same way as it works in Oracle or DB2, but faster.

2TB Cluster for $1.2M :)

If there was one piece in the original paper that was completely made up to fit the predefined narrative, it was the price comparison. If the author thinks that a 2TB RAM cluster costs $1.2M today, I have not one but two Golden Gate bridges to sell just for him…

Let’s see. A typical Dell/HP/IBM/Cisco blade with 256GB of DRAM will cost below $20K if you just buy at list prices (Cisco seems to offer the best prices, starting at around $15K for 256GB blades). That brings the total cost of a 2TB cluster well below $200K (with all network and power equipment included, plus hundreds of TBs of disk storage).
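
The estimate above can be checked with a few lines of arithmetic. This sketch covers the blades only; the binary-terabyte conversion is an assumption, and network and power equipment are not itemized in the text, so they are left out of the calculation.

```python
import math

CLUSTER_DRAM_GB = 2 * 1024       # 2TB cluster, binary terabytes assumed
BLADE_DRAM_GB = 256
BLADE_LIST_PRICE_USD = 20_000    # upper bound quoted above
BLADE_BEST_PRICE_USD = 15_000    # "starting at around $15K"
PAPER_ESTIMATE_USD = 1_200_000   # figure from the paper being criticized

blades = math.ceil(CLUSTER_DRAM_GB / BLADE_DRAM_GB)
cost_at_list = blades * BLADE_LIST_PRICE_USD
cost_at_best = blades * BLADE_BEST_PRICE_USD
overestimate = PAPER_ESTIMATE_USD / cost_at_list

print(f"{blades} blades: ${cost_at_best:,} to ${cost_at_list:,}; "
      f"paper's figure is {overestimate:.1f}x the list-price total")
```

Eight blades at list price come to $160K, which leaves room under the $200K ceiling for the remaining equipment.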

Is this more expensive than an SSD-only cluster? Yes, 2.5-3x more expensive. But with the right software you get a dramatic performance increase that more than justifies the price increase.

Conclusion

A 2-3x price difference is nonetheless important, and it gives our customers a very clear choice. If price is an issue and high performance is not, there is a wide variety of disk-based systems. If high performance and sub-second response when processing TBs of data are required, the hardware will be proportionally more expensive.

However, with 1GB of DRAM costing less than 10 USD and DRAM prices dropping 30% every 18 months, the era of disks (flash or spinning) is clearly coming to its logical end. It’s normal… it’s progress, and we all need to learn how to adapt.
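
To see what a 30% drop every 18 months implies, the trend can be written as a simple compounding formula. This is a back-of-the-envelope sketch that assumes the decline stays constant; the function names and the $10 starting price (the upper bound quoted above) are chosen for illustration.

```python
import math


def projected_price(price_now, months, drop=0.30, period_months=18):
    """Price after `months`, if it falls by `drop` every `period_months`."""
    return price_now * (1 - drop) ** (months / period_months)


def months_to_reach(fraction, drop=0.30, period_months=18):
    """Months until the price falls to `fraction` of today's price."""
    return period_months * math.log(fraction) / math.log(1 - drop)


price_per_gb_now = 10.0  # USD, upper bound quoted above

for months in (18, 36, 54):
    print(f"after {months} months: ${projected_price(price_per_gb_now, months):.2f}/GB")

print(f"price halves in about {months_to_reach(0.5):.0f} months")
```

Under that assumption a $10 gigabyte falls to about $7 after 18 months and halves in roughly 35 months.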

Has anyone seen tape drives lately?
