Neo4j Graph Data Science 2.0 has launched and is also available as a fully managed service on Google Cloud, under the name AuraDS. The move should make life easier for customers who need highly accurate models. The accuracy of those models, according to Neo4j, is directly related to the completeness of the context. In other words, the more complete the graph behind a model, the more accurate that model should become.
Enterprise Times asked Neo4j what sort of scale customers were looking for. Alicia Frame, Director of Graph Data Science, Neo4j, replied, “The largest instances available on AuraDS are 256G RAM – which should comfortably handle running algorithms, graph embeddings, and ML pipelines on hundreds of millions of nodes and relationships. We’ll be introducing larger instances over the course of the year.
“We have self-managed customers running data science workloads of tens of billions of nodes and relationships, where they’ve found that Neo4j GDS is the only platform where they can successfully execute graph algorithms at enterprise scale. We have self-managed customers running on machines of up to 4TB, with workloads addressing tens of billions of nodes and relationships.”
Organisations often run multiple models at once. Because AuraDS lets them scale instances up and down as the size of those models changes, it should save some customers significant money.
Other clouds will become available this year
Not every customer will want to move their data to Google’s Cloud. ET asked if AuraDS would work with data in situ rather than moving it into Google’s Cloud.
Frame replied, “In addition to AuraDS (our data science as a service offering), we’ve offered our graph data science platform to on-prem and self-managed customers for nearly two years. With self-managed, users can deploy larger instances, hybrid architectures (a transactional cluster + a dedicated member for graph analytics), read replicas for disaster recovery or read scaling, and on whatever hardware or cloud that they choose.”
What about those customers who have significant investments in other vendors’ clouds?
Frame said, “We’re planning to bring AuraDS to AWS later this year and Azure soon after.”
The availability of other clouds is good news. It means that customers on those clouds won’t have to port data to Google Cloud to use AuraDS. What will be interesting is how this will work for customers with multi-cloud environments. Will they be able to create models over multiple clouds? Or will they have to do partial modelling on each cloud platform they use and then move the results on-premises to get an aggregated model?
What has Neo4j added to the product?
There are two parts to this announcement. The first is the additional features delivered with Neo4j Graph Data Science 2.0. The second is the availability of AuraDS on the Google Cloud. Accompanying the press release is a blog that details many of the new features for customers.
New in Graph Data Science
Neo4j says that this is a major release with a lot of new features and capabilities. The full list is on the Neo4j GitHub. Here are some of the highlights that Neo4j has called out:
- Machine Learning Pipelines just got a whole lot easier with the addition of a pipeline catalog, a new unified syntax for model configuration, training, and application, and support for random forest models.
- Best in class data science with product tier graduation for Breadth First Search, Depth First Search, K-Nearest Neighbors, Delta Stepping, and the similarity functions. What this means for you is that these algorithms are all fully supported, parallelized, and optimized. Delta Stepping is 92% faster than Neo4j’s previous shortest path implementation!
- New Enterprise Features, including cluster compatibility and graph backup/restore, make it simple to go from proof of concept to production. Customers can now run graph data science workloads seamlessly alongside transactional clusters without worrying about losing work.
- Graph Data Science is part of your data science ecosystem with our new Python client. Data scientists don’t have to spend time learning Cypher or understanding transaction functions – now, you can skip to the good part with our native Python API.
- Cypher projection syntax has had a complete overhaul.
- There are improved implementations for GraphSAGE and closeness centrality.
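For readers unfamiliar with Delta Stepping, named in the list above, it is a single-source shortest path algorithm that groups nodes into distance buckets of width delta so that the nodes in one bucket can be processed together. The following is a minimal, sequential Python sketch of the general technique, not Neo4j's implementation. The function name `delta_stepping`, the adjacency-list layout and the sample graph are all assumptions made for illustration. Edge weights are assumed to be non-negative, and every node is assumed to appear as a key in the graph.

```python
import math


def delta_stepping(graph, source, delta):
    """Single-source shortest paths using distance buckets of width `delta`.

    `graph` maps each node to a list of (neighbour, weight) pairs.
    Assumptions: weights are non-negative and every node is a key in `graph`.
    """
    dist = {node: math.inf for node in graph}
    buckets = {}  # bucket index -> set of nodes whose tentative distance falls in it

    def relax(node, candidate):
        # Keep the shorter distance and move the node to the matching bucket.
        if candidate < dist[node]:
            if dist[node] != math.inf:
                buckets.get(int(dist[node] // delta), set()).discard(node)
            dist[node] = candidate
            buckets.setdefault(int(candidate // delta), set()).add(node)

    relax(source, 0.0)
    while True:
        non_empty = [index for index, nodes in buckets.items() if nodes]
        if not non_empty:
            break
        i = min(non_empty)
        settled = set()
        # Phase 1: relax light edges (weight <= delta) until bucket i stays empty.
        # A node can re-enter bucket i if a shorter path to it is found.
        while buckets[i]:
            frontier = buckets[i]
            buckets[i] = set()
            settled |= frontier
            for u in frontier:
                for v, w in graph[u]:
                    if w <= delta:
                        relax(v, dist[u] + w)
        # Phase 2: relax heavy edges (weight > delta) once per settled node.
        for u in settled:
            for v, w in graph[u]:
                if w > delta:
                    relax(v, dist[u] + w)
    return dist


# Hypothetical sample graph, for illustration only.
sample_graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 3)],
    "d": [],
    "e": [],  # unreachable from "a"
}
distances = delta_stepping(sample_graph, "a", delta=2)
```

In a parallel implementation, the relaxations within one bucket are the part that can be spread across threads, which is where the technique gets its speed. The choice of delta trades off the number of buckets against the amount of repeated work inside each one.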
All these features and more in AuraDS
The second part of the announcement is all about AuraDS. One of the key benefits that Neo4j has called out is that this is a cloud service. Customers can pay per use or integrate it into their existing Google Cloud plans.
As a fully managed service, it also comes with 65 graph algorithms in a single workspace. The customer can expand these if they want.
The press release and the blog call out a range of new capabilities that AuraDS delivers:
- Scale on demand: As you’d expect from any cloud service, you can scale up and down on demand keeping costs under control.
- Enterprise Graph Data Service: AuraDS includes Graph Data Science Enterprise Edition (GDS EE) licensing plus early access to new GDS algorithms and features.
- ML Ops support: GDS EE access allows users to persist, publish, and restore models without interruptions from restarts.
- Automated operations: Workloads are monitored, patched, and backed up behind the scenes without any user action.
- One-click backup: Take a snapshot of instances, models, and in-memory graphs in one click.
- Managed Security: Neo4j will automatically apply security patches, monitor workloads and back them up to avoid interruptions.
- Simple, powerful workflow: A drag-and-drop UI to model and import data into a graph.
- Predictable cost: Manage costs with pay-as-you-go pricing and the option of pausing unused instances.
- Pause on demand: Data scientists can pause unused instances depending on the tier.
In addition, customers who have built models on-premises will be able to import those models into AuraDS.
Enterprise Times: What does this mean?
This is not just a major update to Neo4j Graph Data Science. It is the biggest feature update it has received. In addition to the features called out above in the different editions, there are a significant number of others listed in the GitHub document.
This release will allow existing customers to do more without a huge investment in hardware. AuraDS should also allow them to make more use of the product and scale to get the most complete context they can.
For Neo4j, this is an interesting move. It had to pick one cloud to go live with AuraDS first and chose Google over AWS and Azure. The reason is not clear. Perhaps it is responding to customer demand, or maybe it is about ease of porting to the platform. Either way, having a fully managed version of Graph Data Science will appeal to more customers.
What now needs to happen is for Neo4j not just to launch on those other cloud platforms but also to create a multi-cloud engine. Nobody wants to be moving data across multiple cloud platforms. It is time-consuming and costly.