In the words of the great Arthur C. Clarke, “Any sufficiently advanced technology is indistinguishable from magic.” It is a brilliant maxim, but it does not follow that a solution which appears to be magical must be backed by sufficiently advanced technology. Often the opposite is true: companies develop simple, appealing, even magical taglines to mask their technology’s underlying flaws and complexity.
Cloud tiering, if not the worst offender, is certainly high on the list. All the traditional storage vendors offer some variation of cloud tiering. Do not mistake them for cloud innovators, though. They are trying to remain relevant by bolting cloud features onto old-fashioned hardware. The approach is actually rather old. It can be traced back several decades to the equally unworkable promise of tiering in Information Lifecycle Management (ILM).
The truth about cloud tiering is that it is merely a clever marketing term, one that hides a fundamentally broken approach to managing capacity. Organisations are beginning to realise that a cloud solution that relies on tiering instead of syncing data will become less efficient at scale.
Syncing, Tiering & Cloud
Tiering is not a new concept in the storage world. This sort of planned, organised movement of data from the front end to the back end has been going on for decades. Today, that backend target is the cloud. But the cloud itself has little in common with old-fashioned storage media. The technology we use to leverage unlimited, on-demand, low-cost capacity should be designed and optimised for this revolutionary new storage medium. Cloud demands a new approach, not a modified variation of one developed decades ago. Data should be synced to the cloud, not tiered.
The difference between tiering and syncing data to the cloud is the difference between a finite design and an infinite one. It is the difference between something you have to babysit and a solution that just does the right thing for you, between an approach that is struggling to stay relevant in the era of cloud and one that was architected for the unlimited object store.
The goal of both approaches is the same. Tiering and syncing are both trying to overcome the capacity limitations of local hardware by leveraging cloud storage. The volumes on standard NAS devices can only grow to a certain size. Tiering to the cloud allows you to thin out that volume and utilise the unlimited, cost-effective capacity of the cloud. But it’s not an easy operation, nor is it an incremental or granular one.
The Operational Hazards of Cloud Tiering
There is too much overhead involved in building a tier and making sure everything actually gets to the back end, so tiering happens in large, bulky sets: a directory structure or a large volume of files. This is a major weakness from a data protection standpoint. If the front-end device fails before you tier, that data is probably gone.
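To make that concrete, here is a minimal, hypothetical sketch in Python. The names (TIER_BATCH_BYTES, maybe_run_tiering_job) are illustrative and not any vendor’s API; the point is simply that batched movement leaves a window in which recent data exists only on the front-end device.

```python
# Hypothetical sketch of batch tiering: nothing moves to the cloud until a
# large set has accumulated, so everything written since the last tiering
# run exists only on the front-end device.
TIER_BATCH_BYTES = 500 * 1024**3    # illustrative threshold: ~500 GB

pending = []        # files written since the last tiering job
pending_bytes = 0

def write_file(path, size_bytes):
    """A write lands on the local NAS volume only; the cloud knows nothing yet."""
    global pending_bytes
    pending.append(path)
    pending_bytes += size_bytes

def maybe_run_tiering_job(upload):
    """Bulk-move data to the back end, but only once the batch is large enough."""
    global pending, pending_bytes
    if pending_bytes < TIER_BATCH_BYTES:
        return                      # recent data stays exposed on the front end
    for path in pending:
        upload(path)                # copy the file to cloud object storage
    pending, pending_bytes = [], 0
```

Anything still sitting in `pending` when the front-end device fails never reached the cloud, which is exactly the data protection gap described above.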
What happens as you start to accumulate large datasets in the cloud? NetApp will tell you that you can stitch together all these different backend volumes behind one volume. Cloud tiering and namespace aggregation sound beautiful in a marketing solution brief. However, it is a huge burden on whoever is responsible for managing the system. You create volumes, fill them up, and then decide whether to tier or to create a new volume and aggregate on the back end.
Retrieving from the tier is very disruptive for applications, too. You might need to maintain separate volumes for snapshots, servers, VMs and more just to shield sensitive applications from that disruption. Ultimately, complexity increases with capacity, and when complexity scales with growth, that is the definition of bad technology.
The Simplicity of Syncing to the Cloud
Syncing to the cloud reverses this approach. Everything is always streamed to and maintained in the back end, the cloud. The source of truth is the cloud, and the system keeps the front end hot through automated caching.
Think of tiering as moving data from the front to the back with very large trucks: no delivery truck departs until it is full. A sync-based file storage system, on the other hand, is more like a high-speed conveyor belt. Data flows continuously to the cloud back end, to one unlimited cloud volume rather than an ever-growing number of loosely aggregated datasets.
In the sync model, if data is needed at the edge, it flows back in real time the moment the end user clicks on that file. It is a continuous, granular process that applies to everything from a large CAD model to a sub-file-level asset, and it all happens automatically. You don’t have to think about what to move to which tier.
Eviction or deletion of data from the edge device is painless and very fast, because the data has already been committed to the source of truth. As an operator, you can run an effectively infinite volume and expect uniform performance everywhere. You don’t have to decide what lives at the edge and what lives on the back end; the system does it for you. From an operational standpoint alone, this is a massive improvement.
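As a rough illustration of that flow, here is a minimal Python sketch, assuming a dict-like cloud object store and an LRU cache at the edge. The class and method names are hypothetical and deliberately simplified (a real system would stream writes asynchronously rather than block on each upload); this is not Nasuni’s implementation, only the shape of the idea: writes are committed to the cloud, reads pull missing data back on demand, and eviction only ever touches data that is already safe in the back end.

```python
from collections import OrderedDict

class EdgeCache:
    """Hypothetical edge cache in front of a cloud object store (the source of truth)."""

    def __init__(self, object_store, capacity_bytes):
        self.store = object_store        # dict-like: key -> bytes, lives in the cloud
        self.capacity = capacity_bytes   # local cache size limit
        self.cache = OrderedDict()       # key -> bytes, kept in LRU order
        self.committed = set()           # keys known to be safe in the cloud

    def write(self, key, data):
        """Writes stream through to the cloud before they count as safe."""
        self.cache[key] = data
        self.cache.move_to_end(key)
        self.store[key] = data           # commit to the source of truth
        self.committed.add(key)
        self._evict_if_needed()

    def read(self, key):
        """Hot data is served locally; cold data flows back on demand."""
        if key not in self.cache:
            self.cache[key] = self.store[key]   # fetch from the cloud
        self.cache.move_to_end(key)
        self._evict_if_needed()
        return self.cache[key]

    def _evict_if_needed(self):
        """Eviction is cheap and safe: only committed data is ever dropped."""
        while sum(len(v) for v in self.cache.values()) > self.capacity:
            for key in self.cache:               # oldest entries first
                if key in self.committed:
                    del self.cache[key]
                    break
            else:
                return                           # nothing evictable yet
```

A usage sketch might look like `cache = EdgeCache(object_store={}, capacity_bytes=64 * 1024**3)`. The operator never chooses what to evict or when to tier, because anything not in the local cache can always be pulled back from the one cloud volume.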
You Need to Architect for Scale
The traditional storage vendors might call their tiering technology cloud, but it’s not. NetApp has done a great job spinning up their controllers in the cloud, but their systems are not architected for unlimited scale. They were designed and optimised to run on local machines.
A system that is truly designed for the cloud has to be able to scale to infinity without increasing complexity or forcing you to build new volumes or orchestrate and babysit regular migrations of data from one tier to another. A true cloud system should automatically do all of this for you, so you don’t have to worry about capacity or data protection.
The problems of files are problems of scale. If scale causes an increase in complexity, that complexity will eventually take the system down. Cloud-native file systems designed to sync data are engineered to scale without complexity and operate as efficiently with a single site as they do across a global enterprise.
The effect on companies might seem magical, but I can assure you that it’s merely advanced technology.
About Nasuni
Nasuni Corporation is a leading file data services company that helps organizations create a secure file data cloud for digital transformation, global growth, and information insight. The Nasuni File Data Platform is a cloud-native suite of services that simplifies file data infrastructure, enhances file data protection, and ensures fast file access globally at the lowest cost. By consolidating file data in easily expandable cloud object storage from Azure, AWS, Google Cloud, and others, Nasuni becomes the cloud-native replacement for traditional network attached storage (NAS) and file server infrastructure, as well as for complex legacy file backup, disaster recovery, remote access, and file synchronization technologies. Organizations worldwide rely on Nasuni to easily access and share file data globally from the office, home, or on the road. Sectors served by Nasuni include manufacturing, construction, creative services, technology, pharmaceuticals, consumer goods, oil and gas, financial services, and public sector agencies. Nasuni is headquartered in Boston, Massachusetts, USA, and delivers services in over 70 countries around the globe. For more information, visit www.nasuni.com.