TigerGraph has announced that OpenCorporates has migrated its back-end database to TigerGraph. OpenCorporates is the world's largest open database of companies, with more than 170 million entries, all freely available on its website. It is widely used by journalists and corporate investigators, as well as anti-money laundering and due diligence professionals, banks and credit-reference agencies.
Moving to Graph
Chris Taggart, CEO of OpenCorporates, explains the decision to move to TigerGraph from its existing graph database technology (Neo4j): “OpenCorporates has vast amounts of data on companies, and our corporate structure data in particular is growing. This, the particular queries we want to make, and the plans that we have for network data over next two years means that we needed a solution focused on speed and scale above all else.”
Those challenges led OpenCorporates to look for an alternative solution. Taggart continued: “We looked at multiple different graph database solutions, and in the end chose TigerGraph for its raw performance. Using a cutting-edge solution like TigerGraph meant that we had to write our own libraries and convert existing queries, but it was a tradeoff worth making given our particular use case. We’ve been able to substantially reduce query time, have better performance and the ability to do deeper analysis (multi-hops) – and the benefits of this are already being passed on to our hundreds of thousands of users.”
Neo4j was used successfully by OpenCorporates for several years, delivering value on several investigations by journalists. As time has passed, the volume of data that OpenCorporates processes has increased exponentially. It therefore needed to look for an alternative graph database.
This was not a quick decision. OpenCorporates wanted a new solution that would surpass Neo4j, which had already set a high bar. It ran a competition between vendors, using a sample set of 17 million nodes and 10 million edges on a single machine, and measured the different solutions against each other on a list of criteria. TigerGraph believes that it was better than its competition on five of those criteria:
- Degrees of separation: Support for queries of up to five degrees of separation between entities with real-time response times – a capability that was becoming increasingly difficult for OpenCorporates.
- Siblings: Support for sibling queries with real-time response times, to help answer questions like, “What else does the parent of a given company own?”
- Up the chain only: Enables users to see what entities exist up the chain only for any given company, with real-time response times.
- Temporal graph search: Users can ascertain if a relationship existed for a particular time frame. They can search what entities have been created from a particular date, and remove all old relationships from the results – not possible with Neo4j.
- Active vs. dead relationships: Supports queries on a given network to see what relationships are active vs. dead, so that each one can be filtered out of the query accordingly.
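To make the criteria above more concrete, here is a minimal Python sketch of a degrees-of-separation search that only follows relationships active on a chosen date. It uses an in-memory toy graph with made-up company names and dates (all hypothetical); it does not reflect how TigerGraph, Neo4j or OpenCorporates implement these queries, and is only meant to show the shape of the problem.

```python
from collections import deque
from datetime import date

# Hypothetical ownership edges: (owner, owned, start, end).
# end=None means the relationship is still active.
EDGES = [
    ("HoldCo", "AlphaLtd", date(2010, 1, 1), None),
    ("HoldCo", "BetaLtd", date(2012, 6, 1), date(2015, 3, 31)),
    ("AlphaLtd", "GammaLtd", date(2016, 1, 1), None),
    ("GammaLtd", "DeltaLtd", date(2018, 1, 1), None),
]


def active_on(edge, when):
    """True if the relationship existed on the given date."""
    _, _, start, end = edge
    return start <= when and (end is None or when <= end)


def within_degrees(start, max_hops, when):
    """Return {entity: hops} for entities within max_hops of start,
    following only relationships active on `when` (direction ignored)."""
    neighbours = {}
    for edge in EDGES:
        if active_on(edge, when):
            owner, owned = edge[0], edge[1]
            neighbours.setdefault(owner, set()).add(owned)
            neighbours.setdefault(owned, set()).add(owner)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    del hops[start]
    return hops
```

Calling `within_degrees("AlphaLtd", 2, date(2014, 1, 1))` and then the same query for `date(2020, 1, 1)` gives different networks, because the dead HoldCo to BetaLtd relationship is filtered out of the later one. On a toy graph this is trivial; the difficulty the article describes comes from doing it in real time over millions of nodes and several hops.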
Few other companies will have moved from one graph database to another. Enterprise Times asked what challenges OpenCorporates faced. Taggart commented: “Our original challenges during the changeover were four-fold:
- writing code libraries to interface with TigerGraph, as it’s too new to have libraries for most languages
- changing the model due to different internal functionality in TigerGraph
- building the queries that mirrored existing behaviour
- managing the move of data into TigerGraph.”
Ultimately, these proved a very small hurdle, as Taggart concluded: “All these actions were done in weeks rather than months. Later this year, we’ll be working on new functionality that builds on the speed and scalability of TigerGraph to do even more.”
What difference has it made?
For OpenCorporates, the move to TigerGraph has made a significant difference. Customers are now able to query its data in more complex ways. TigerGraph has proven that it can quickly deliver the information required by investigative journalists. Typical queries include:
- I would like to see all entities that are n degrees of separation away from any given entity that I am interested in, e.g. three degrees away so that I can build up a picture of a network and see connections that would not be obvious otherwise.
- I would like to show the ultimate parent of a given entity and everything else it controls so that I can connect different entities through the ultimate parent and make the relationship more transparent. This is typically very complex.
- I would like to see who the officers of a company are and what other entities they are either officers or beneficial owners (BOs) of, so that I can see what relationships may exist between two companies that don’t necessarily have an owner-subsidiary link or chain.
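The second and third queries above can be sketched in a few lines of Python over a toy data set. Everything here is an assumption made for illustration: the company and officer names are invented, each company is given at most one parent, and the function names are hypothetical rather than part of any OpenCorporates or TigerGraph API.

```python
# Hypothetical toy data: child company -> its single direct parent.
PARENT_OF = {
    "GammaLtd": "AlphaLtd",
    "AlphaLtd": "HoldCo",
    "BetaLtd": "HoldCo",
}

# Hypothetical toy data: company -> names of its officers.
OFFICERS = {
    "GammaLtd": {"J. Smith"},
    "BetaLtd": {"J. Smith", "A. Jones"},
    "OtherCo": {"A. Jones"},
}


def ultimate_parent(company):
    """Walk up the chain only, until an entity with no parent is reached."""
    seen = {company}
    while company in PARENT_OF:
        company = PARENT_OF[company]
        if company in seen:  # guard against circular ownership
            raise ValueError("circular ownership detected")
        seen.add(company)
    return company


def controlled_by(parent):
    """Every entity the parent controls, directly or indirectly."""
    children = {}
    for child, owner in PARENT_OF.items():
        children.setdefault(owner, []).append(child)
    found, stack = set(), [parent]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def linked_by_officers(company):
    """Other companies sharing at least one officer with `company`."""
    mine = OFFICERS.get(company, set())
    return {
        other: sorted(mine & theirs)
        for other, theirs in OFFICERS.items()
        if other != company and mine & theirs
    }
```

Here `controlled_by(ultimate_parent("GammaLtd"))` connects GammaLtd to BetaLtd through HoldCo, while `linked_by_officers("GammaLtd")` surfaces a link to BetaLtd through a shared officer even where no ownership chain is involved. Real corporate data allows multiple owners and partial stakes, which is part of why the article calls this kind of query very complex.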
In the past, OpenCorporates has helped to deliver key insights into several investigations. These include:
- The Panama Papers (ICIJ)
- Narco-a-Lago: Money Laundering at the Trump Ocean Club (Global Witness)
- London Property: A Top Destination for Money Launderers (Transparency International).
What may be interesting is whether, using this new technology, such investigations are revisited. Perhaps more insights will be drawn from the data. This is an important win for TigerGraph and demonstrates its leadership in the field. It also gains a customer that uses its solution heavily and can provide regular and interesting use cases.
Yu Xu, CEO and founder, TigerGraph commented: “The potential of public benefit from OpenCorporates’ database on hundreds of millions of global corporations and their associates is remarkable. OpenCorporates was challenged with achieving must-have requirements that were impossible previously. By switching to TigerGraph, OpenCorporates is able to meet these needs. The result is powering the future of investigative journalism by unveiling even more data connections and insights.”
Enterprise Times: What does this mean?
There are four different winners from this. TigerGraph is able to boast of a considerable success against a number of its competitors. However, competitors might argue that this is at the top end of most companies' requirements. They may offer better or cheaper solutions for other use cases.
The second winner is OpenCorporates, whose service offering has significantly improved with TigerGraph. This will help it attract further customers.
The third are the journalists who will use the service. They are now able to unravel complex corporate structures far more quickly than was previously possible. They will still need to know which questions to ask, though.
The fourth is the public and potentially law enforcement. Those groups will benefit from a deeper understanding of the complex web of companies that people, including criminals, have established.
This article has been substantially altered from the original piece because the information initially received, on which it was based, proved incorrect. This led to a misrepresentation of the actual reasons for the migration to TigerGraph.