DigitalGlobe collects imagery of the Earth from its satellites, beginning with QuickBird (first launched in 2001). Since then, new satellites have added capacity and improved resolution, and the total data stored is approaching 100PB.
In the past 17 years, DigitalGlobe has collected more than 7 million square kilometres of imagery, with individual images as large as 30GB. The archive grows by 10PB per year, and it is now moving to AWS.
As Jay Littlepage, VP Infrastructure & Operations, DigitalGlobe puts it: “Our ‘daily take’ – of 80 to 100TB – of imagery will go straight to S3—no more tape, and a big step closer to our goal of closing our commercial data centers.”
100PB of main storage was on tape
In the past, DigitalGlobe relied on tape. Its main library had 12,000 tape bays feeding 60 LTO-5 tape drives, which were constantly in motion. Any image could be recalled and delivered to a customer within four hours; in 2016, this happened 4 million times.
But, just as a high-mileage car breaks down more often, the tape library began to show its age. The tape heads had logged more than 10,000 miles of movement, and keeping them calibrated became like maintaining an old sports car.
DigitalGlobe decided it was time to move its data to where customers work – in the cloud and across the Internet. The new destination: Amazon Web Services (AWS).
The decision was simpler than the practice
The AWS Snowmobile can transfer up to 100PB per 45-foot ruggedized shipping trailer. Data is encrypted with 256-bit keys managed through the AWS Key Management Service (KMS), ensuring both security and full chain-of-custody information. In addition, dedicated security personnel, GPS tracking, alarm monitoring, 24/7 video surveillance and an optional security escort are available while a loaded Snowmobile trailer is in transit.
Once the data was loaded, the Snowmobile was driven back to Amazon, where DigitalGlobe's data will be imported into S3.
This wasn’t simply one massive file transfer. The legacy tape system was heavily dependent on NFS file systems. The objective was to transition to a service-oriented, object-based system.
To move the archive onto AWS Snowmobile, DigitalGlobe had to bring each and every tape online, mount it, move the files (not just the images themselves, but all of the ancillary data that describes each image), convert them into Amazon S3 objects, and encrypt them.
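The per-file workflow described above (stage a file from tape, attach its ancillary metadata, and write it as an encrypted S3 object) might look roughly like the following sketch. The bucket name, key scheme and KMS key alias are illustrative assumptions, not DigitalGlobe's actual configuration:

```python
import json
from pathlib import Path

# Hypothetical names: the bucket and KMS key alias are illustrative only.
BUCKET = "example-imagery-archive"
KMS_KEY_ALIAS = "alias/example-archive-key"

def build_put_request(image_path: str, ancillary: dict) -> dict:
    """Build S3 PutObject parameters for one image plus its ancillary data."""
    return {
        "Bucket": BUCKET,
        "Key": f"imagery/{Path(image_path).name}",
        # Server-side encryption with a 256-bit key managed by AWS KMS
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": KMS_KEY_ALIAS,
        # Small ancillary descriptors can ride along as object metadata;
        # larger sidecar files would be uploaded as separate objects.
        "Metadata": {"ancillary": json.dumps(ancillary)},
    }

# Usage (requires boto3 and AWS credentials):
#   import boto3
#   params = build_put_request("scene_001.tif", {"sensor": "WorldView-3"})
#   with open("scene_001.tif", "rb") as body:
#       boto3.client("s3").put_object(Body=body, **params)
```

Keeping the metadata attached to the same object as the image preserves the pairing that the NFS file layout previously expressed through directory structure.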
This occurred during Q1 2017. The load-in is complete, and the trailer is with AWS for validation, load-out and ingestion into the cloud. In all, DigitalGlobe processed 54 million files during load-in, all while continuing to satisfy normal production demand from the ageing tape library.
When the data ingest is complete, every image taken by any satellite in DigitalGlobe's history will be available online via AWS. According to DigitalGlobe, the transition off tape would have been necessary eventually.
But how do you transfer 100PB of data? Even assuming you could find an affordable, reliable, secure and error-checked long-distance connection running at 1GB/sec, it would take almost four years.
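The back-of-the-envelope arithmetic is easy to check in a few lines of Python (assuming binary petabytes and a decimal gigabyte per second):

```python
# Rough transfer-time estimate for 100PB over a 1GB/sec link.
ARCHIVE_BYTES = 100 * 1024**5       # 100 PB (binary petabytes)
LINK_BYTES_PER_SEC = 10**9          # 1 GB/sec (decimal gigabyte)

seconds = ARCHIVE_BYTES / LINK_BYTES_PER_SEC
years = seconds / (365 * 24 * 3600)
print(f"{years:.1f} years")         # → 3.6 years
```

And that figure assumes the link runs flawlessly at full speed for the entire duration, with no retransmissions or contention from other traffic.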
Time and again, physical transport matters. The late Jim Gray of Microsoft Research, while working on the development of Virtual Earth (now part of Microsoft's Bing Maps Platform), used to observe that it was simpler, and cheaper, to carry terabytes of data on a hard disk than to ship it across networks. The AWS Snowmobile confirms his practical wisdom.