When the phrase in-memory is used, most people immediately think of databases such as SAP HANA, Apache Ignite or IBM DB2 BLU. They are, indeed, a form of in-memory computing (IMC), one that specifically looks to load as much data as possible from a database into memory to speed up performance. However, they are not the only option.
The wider view of in-memory computing is that everything an organisation does is run in-memory. This means every application and all the data an organisation owns. Nothing that is used day to day is stored on a storage medium, be that disk, SSD or tape. With the storage medium removed, performance improves and access to data is almost instantaneous. It delivers on the real-time enterprise that many IT vendors have talked about over the last three decades.
This approach is what GridGain talked to Enterprise Times about earlier this year, so what does it really mean?
Earlier this year Abe Kleinfeld told delegates at the In-Memory Computing Summit (IMCS) that we are just a few years away from IMC being mainstream. Kleinfeld believes that we are on the cusp of the biggest speed-up in technology that we have seen. His timeline is:
- 2018: Four key streams of data access – in-memory DBMS, data grids, stream analytics, other IMC technology.
- 2019: 75% of cloud native application development will use IMC or services giving access to IMC.
- 2021: In-memory computing as a single platform, not multiple competing technologies.
- 2022: 40% of large and global enterprises will be using in-memory DBMS.
Fast storage still isn’t fast enough
Why isn’t fast storage fast enough? It’s a good question. Every year storage systems get bigger and faster. The introduction of flash memory and processor enhancements means we are able to treat flash storage as an extension of main RAM, albeit at a slightly lower speed. This has been a major benefit for transaction systems. The faster they can process and write data, the more transactions they can deal with.
The problem comes when we look to use that data. Users don’t work against live transactions, especially when they are doing analysis. Instead, they work against copies of the data. There are exceptions to that such as some of the systems used in the financial markets. But even here, only a limited set of people work against real-time data.
When we write data away and then retrieve it for analytics or any other purpose, we have to deal with latency. Some vendors have done a lot to reduce that latency. IBM, for example, created CAPI, which allows memory subsystems to be plugged into the CPU, reducing bus overhead. Meanwhile NVidia has created its own technology that allows GPUs to access RAM and flash memory systems. The result is very high-speed access to data with very low latency, but it isn’t real-time.
All of these improvements offer a 2x, 5x, possibly even a 10x improvement on what organisations currently have. IMC promises a 10x-100x improvement.
Cheaper RAM and new approaches make in-memory faster and affordable
These approaches still rely on the need for a storage subsystem but can we go further? Yes, according to GridGain. The cost of RAM has tumbled over the last few years while the density has increased. Last year AWS announced plans for virtual servers with 16TB RAM. These are aimed at SAP HANA in-memory database users. It already offers 4TB solutions that can be used for multiple applications.
The falling price and growing availability of high-density RAM are now expected of the technology. But having large amounts of memory available does not mean it will be immediately used. While in-memory databases are now sufficiently developed to be mainstream technology, other uses of in-memory computing are not.
While terabytes of memory might seem more than most people need, it is small change compared to that required for complex in-memory computing. HPE announced a 160TB in-memory computing server in 2017. To get to that size HPE has had to link 128GB in-line memory modules together with its own fabric.
Building a large-scale in-memory computing solution also needs a new architecture. HPE, for example, is connecting up to 40 processors to deal with all that data. AWS is going further and building complex clusters that will deliver multiple nodes with multiple processors and RAM.
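To put the HPE figures in perspective, here is a back-of-envelope sketch of what a 160TB pool built from 128GB modules implies. The even split across 40 processors and the use of binary units are our assumptions for illustration, not details confirmed by HPE.

```python
# Back-of-envelope sizing for a 160TB pool built from 128GB modules.
# Assumptions (not HPE figures): binary units (1TB = 1024GB) and memory
# spread evenly across all 40 processors.

POOL_TB = 160
MODULE_GB = 128
PROCESSORS = 40


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


# Number of 128GB modules needed to reach 160TB.
total_modules = ceil_div(POOL_TB * 1024, MODULE_GB)

# Modules and memory attached to each processor under an even split.
modules_per_processor = ceil_div(total_modules, PROCESSORS)
tb_per_processor = POOL_TB / PROCESSORS

print(total_modules, modules_per_processor, tb_per_processor)
```

Under those assumptions the pool needs well over a thousand modules, with each processor fronting around 4TB, which is why a dedicated fabric is needed to make it behave as one memory space.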
A Russian bank shaking up the market
These numbers are still small beer compared to the plans of Russia’s Sberbank. It believes in the advantages that in-memory computing has to offer. So much so, that it is planning to move its entire banking operation to an in-memory solution. This is not just about databases or analytics. It will encompass every piece of data and every app that the bank possesses. It will support 130 million customers across 22 time zones.
There are significant challenges to achieving this. It started with buying a stake in GridGain. How better to get access to the technology that you need? Three years on, Sberbank has built its own banking solution. It consists of a 1.5PB in-memory computing solution spread over 2,000 nodes. The system has had to have backups built into it. Should the primary system crash and burn, it has to be back online in just five minutes. That means reloading more than a PB of backed-up data.
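A rough calculation shows how demanding that recovery target is. The sketch below assumes decimal units, an even spread of data across the 2,000 nodes, all nodes reloading in parallel, and exactly 1PB to restore (the lower bound implied by "more than a PB"). None of these are figures confirmed by Sberbank or GridGain.

```python
# Rough recovery arithmetic for a 1.5PB grid on 2,000 nodes.
# Assumptions for illustration only: decimal units, data spread evenly,
# every node reloads in parallel, and exactly 1PB must be restored.

GB = 10**9
TB = 10**12
PB = 10**15

TOTAL_MEMORY = 1500 * TB      # 1.5PB of in-memory data
NODES = 2000
RELOAD_BYTES = 1 * PB         # lower bound on data to reload
RELOAD_SECONDS = 5 * 60       # five-minute recovery target

# Memory each node holds if the grid is evenly balanced.
memory_per_node_gb = TOTAL_MEMORY / NODES / GB

# Sustained throughput needed across the grid, and per node.
aggregate_tb_per_s = RELOAD_BYTES / RELOAD_SECONDS / TB
per_node_gb_per_s = RELOAD_BYTES / RELOAD_SECONDS / NODES / GB

print(f"{memory_per_node_gb:.0f} GB per node")
print(f"{aggregate_tb_per_s:.2f} TB/s across the grid")
print(f"{per_node_gb_per_s:.2f} GB/s per node")
```

Even on these generous assumptions, the grid has to sustain several terabytes per second in aggregate for the full five minutes, which explains why backup has to be designed into the system rather than bolted on.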
The first phase of this project was scheduled to go live in July 2018. Despite a request to the bank to confirm it was on target, it has so far not gotten back to us. Assuming that it did meet the July 2018 go-live date, the next milestone is the end of 2018. At that point the bank is planning a unified memory store and waving goodbye to using traditional storage in its daily operations.
Is banking the key market for this technology?
No. It is simply one example of a customer who is about to deliver a major project. According to Nikita Ivanov, Founder & CTO, GridGain, there are many other apps that will immediately take advantage of this technology. These include:
- Machine Learning
- Artificial Intelligence
- Natural Language Processing
- Internet of Things
These are applications that are already in use inside the enterprise and Ivanov believes that their increased use will help drive IMC. However, he is cautious about the uptake. While companies such as Sberbank generate a lot of interest, there is a lot to overcome before we see wider use.
One of the big challenges that Ivanov talks about is operating systems. It will take many machines to reach the size of memory on which businesses can operate widespread IMC. New architectures for computing will need to evolve to deal with that large pool of memory and how it is accessed. Ivanov points out that we have 50 years of block-level access to storage. For IMC to be hugely effective we need to optimise for byte-level reads.
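The difference between block-level and byte-level access can be illustrated with a small sketch. It fetches the same byte twice: once by reading the whole enclosing block, as storage stacks traditionally do, and once by addressing it directly through a memory mapping. The 4,096-byte block size and the helper names are our assumptions. Note also that mapping an ordinary file still goes through the operating system's page cache, so this shows the programming model only, not true byte-addressable hardware.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def read_byte_block_style(path, offset):
    """Fetch one byte by reading the whole block that contains it."""
    block_start = (offset // BLOCK_SIZE) * BLOCK_SIZE
    with open(path, "rb") as f:
        f.seek(block_start)
        block = f.read(BLOCK_SIZE)
    # Return the byte and how many bytes the application had to request.
    return block[offset - block_start], len(block)


def read_byte_mapped(path, offset):
    """Fetch one byte by address through a read-only memory mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # The application asks for exactly one byte.
            return mapped[offset], 1


def make_demo_file(size=3 * BLOCK_SIZE):
    """Create a temporary file filled with a repeating byte pattern."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(i % 251 for i in range(size)))
    return path


path = make_demo_file()
try:
    value_a, requested_a = read_byte_block_style(path, 5000)
    value_b, requested_b = read_byte_mapped(path, 5000)
    print(value_a == value_b, requested_a, requested_b)
finally:
    os.remove(path)
```

Both routes return the same value, but the block-style read asks for thousands of bytes to use one of them. That overhead, multiplied across every access, is the kind of cost Ivanov is pointing to.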
Microsoft and the Linux community are both looking at this. However, these are projects that are far from ready for the enterprise. IMC is not a solution that can be purchased off-the-shelf today. Instead, it requires help to customise and build solutions for customers. When asked when we will begin to see those commodity interfaces, Ivanov suggested we could be more than five years out.
What does this mean?
Faster has been the mantra of IT for decades. It is not just users demanding more speed and performance but the industry itself. IMC offers a new world of computing but to get there we need new architectures, models and a rewrite of operating systems.
Changing how we access data at the hardware level will also have to be considered. While IBM and NVidia have been pushing their solutions, they are not getting mainstream adoption. Ivanov added that Intel has said that it doesn’t believe that this can be solved in hardware. It believes that the operating system vendors are the solution.
Despite the Intel statement, some in the memory and storage industry are looking at a solution. The Storage Networking Industry Association (SNIA) SSSI has published its own thoughts in several white papers. Hardware support is arguably one of the keys to getting to IMC faster. If the hardware is capable of supporting byte-level access then the OS vendors will write for it. Based on estimates from SNIA, this is something that can be started now.
For those of us who were around before the hard disk arrived for PCs, the idea that everything can now transition to IMC takes some thinking about. The changes in performance over the last few decades have been exceptional and yet, for many, they are not enough. IMC promises to deliver the 10x-100x performance improvement they want, but it will take time.
For now, the success of Sberbank may be the catalyst required to kick off a lot more investment and work in IMC.