Google to digitise the NYT photo library

The New York Times (NYT) has detailed plans to undertake an ambitious project to digitise its historical photo library. The archive library, affectionately known inside the NYT as the morgue, began clipping and saving articles in the 1870s.

The library now holds millions of photographs, along with tens of millions of historical news clippings, microfilm records and other archival materials. The technology partner selected for this project is Google Cloud.

Monica Drake, Assistant Managing Editor at The New York Times

“We’ve always known that we were sitting on a trove of historical photos and now, cloud technology allows us to not only preserve this archival source, but easily search and pull photos to provide even more historical context,” said Monica Drake, assistant managing editor, The New York Times. “Ultimately, this digitalisation will equip Times journalists with useful tools to make it easier to tell even more visual stories.”

“Google Cloud technologies like Cloud Storage, Cloud Pub/Sub, and Cloud Vision API are helping to preserve this priceless history and give journalists a new way to search, access, and analyse millions of historic photos and give them new life,” said Brian Stevens, chief technology officer, Google Cloud. “Cloud technology is allowing The Times to protect one of their most unique assets migrating from steel filing cabinets to a cloud-based platform where journalists can bring visual storytelling to a whole new level.”

NYT plans to use the Google Cloud Vision API to apply machine learning to the archive. This will help staff to identify, classify and automatically organise its photos.
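For readers curious what such a request looks like, here is a minimal sketch of the JSON body that the Cloud Vision REST endpoint (images:annotate) accepts for a single scanned image. It only builds the request and sends nothing. The function name build_annotate_request and the choice of label and text detection are assumptions made for illustration; the NYT and Google have not published the details of their pipeline here.

```python
import base64
import json

# Public REST endpoint for Cloud Vision batch image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes: bytes, max_labels: int = 10) -> dict:
    """Build the JSON body for one images:annotate call (hypothetical helper).

    Asks for label detection (to classify what is pictured) and text
    detection (to read any text visible in the scan). The feature choice
    is an assumption, not the NYT's documented configuration.
    """
    return {
        "requests": [
            {
                # Image bytes are sent inline as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_labels},
                    {"type": "TEXT_DETECTION"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real scanned photograph.
    body = build_annotate_request(b"placeholder scan bytes")
    print(json.dumps(body, indent=2))
```

Actually sending the body would need an authenticated POST to the endpoint above, which is left out of this sketch.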

A significant challenge

This will be a significant challenge not just to capture the images but also to index them so that they can be accessed and used. One of the first problems will be the fragility of many of the older images. They will need to be handled so as to not create any more damage than they have already suffered.

There are also challenges when it comes to scanning different formats. This is not just sticking a negative on a lightbox or putting a photo in a scanner. It may require some interesting engineering to capture the images.

Once scanned, many of the older images will need to be digitally restored. This will include repairing the damage that age has wrought on the images and even adding colour. The latter, in particular, would give a new view of the past.

Indexing and attributing the images is also going to be a problem. Many of the older images have not been catalogued. Even where they have, the indexes are patchy and incomplete. This means that there will have to be time spent researching the images. Unfortunately, the NYT plans to keep these away from the public eye. That’s a mistake. In circumstances like this, taking advantage of crowdsourced help makes sense.

In conclusion

The archive has been called the “history of the world through the eyes of the New York Times”. You have to admire the monumental task ahead of the NYT and Google teams that must pull it off. The archive has over 100 years of history in photographs, many of which will never have been seen before being digitised.

It is easy to write this off as just another Google customer win. It is not. The major newspapers all have vast photo and image libraries waiting for digitisation. The success of this project will be closely watched as others think about the future.


Neil Fawcett
Cut him in half and the word technologist runs through Neil Fawcett’s core. Starting life as an engineer, specialising in the world of home computing, Neil made the move to writing in 1985 and as the expression goes… never looked back. He was key to moving Computer Weekly away from its bias as a mainframe/minicomputer news title and propelled it into the exciting world of personal computing, breaking many an exclusive story. Following his tenure at CW he went on to work for various other publications, including participating in the UK launch of Information Week. During this time, he played a pivotal role in establishing custom publishing units designed to work alongside vendors to help define end-user publications and campaigns. Neil’s ability to take complex technology subjects and deliver digestible content frequently saw him appear in the national newspapers and on the BBC and Sky, and he often found himself delivering speeches to audiences around the world. With numerous books under his belt, Neil took time out in the new millennium to pursue a passion for toys/gaming and military history as he set up a manufacturing company with a global reach. He is now thrilled to have come full circle and be back writing about his core passion: technology!

