Yandex, the Russian online giant, created ClickHouse to solve its own web analytics problem. Like many web companies, Yandex was struggling to understand user behaviour. For example, why did people put items in their baskets but not complete the transaction? Yandex open-sourced ClickHouse in 2016 under the Apache licence. Since then, it has found a new market in the analytics world among organisations with exceptionally large datasets.
To understand more about ClickHouse, Enterprise Times talked with Robert Hodges, CEO of Altinity. Alexander Zaitsev, CTO, formed Altinity in 2017 to provide support for ClickHouse. Altinity has since become the number two committer to ClickHouse, behind Yandex.
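What makes ClickHouse interesting is that it is a columnar DBMS. What does that mean? It means the values of each column are stored together, rather than each row being stored together. An analytical query therefore reads only the columns it needs, and values of the same type sitting side by side compress well. New data still arrives as rows, but it is written into per-column storage, which helps ClickHouse ingest and scan ultra-large data sets very quickly.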
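To make the row-versus-column distinction concrete, here is a minimal Python sketch. It is an illustration only and not how ClickHouse stores data internally; the field names (`user_id`, `page`, `duration_ms`) and the helper `to_columns` are hypothetical.

```python
# Row-oriented layout: each record keeps all of its fields together.
# Field names here are made up for illustration.
rows = [
    {"user_id": 1, "page": "/basket", "duration_ms": 1200},
    {"user_id": 2, "page": "/checkout", "duration_ms": 800},
    {"user_id": 1, "page": "/checkout", "duration_ms": 450},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented layout: all values of one field sit next to each other.
columns = to_columns(rows)

# An aggregate over one field has to touch every whole record in the row layout...
total_from_rows = sum(record["duration_ms"] for record in rows)

# ...but only a single list in the column layout.
total_from_columns = sum(columns["duration_ms"])
```

Both totals are the same. The difference is how much data has to be read to compute them, which matters when a table has hundreds of columns and billions of rows.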
Having large amounts of data is of little interest if it cannot be processed quickly. Hodges said: “We are MPP [Massively Parallel Processing] enabled. We’re able to spread queries over many nodes. We use vectorised query with really, really great compression and codecs for reducing uncompressed code size.”
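As a rough illustration of why codecs matter for column data, the Python sketch below delta-encodes a column of increasing timestamps before compressing it. It is a sketch under stated assumptions, not ClickHouse's implementation: the standard-library `zlib` stands in for whatever compressor a real system uses, and the helpers `delta_encode`, `delta_decode` and `pack` are hypothetical names.

```python
import struct
import zlib


def delta_encode(values):
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    deltas = [values[0]]
    for previous, current in zip(values, values[1:]):
        deltas.append(current - previous)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    values = []
    running = 0
    for delta in deltas:
        running += delta
        values.append(running)
    return values


def pack(values):
    """Serialise integers as little-endian signed 64-bit values."""
    return struct.pack(f"<{len(values)}q", *values)


# A made-up column of timestamps, one second apart.
timestamps = [1_600_000_000 + i for i in range(10_000)]

raw = pack(timestamps)
compressed_plain = zlib.compress(raw)
compressed_delta = zlib.compress(pack(delta_encode(timestamps)))

print(len(raw), len(compressed_plain), len(compressed_delta))
```

After delta encoding the column is almost entirely the value 1, so the general-purpose compressor has far more repetition to exploit than it does on the raw timestamps.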
Those codecs also make ClickHouse attractive. Two of the challenges with petabyte-scale data sets are resource usage and processing speed. Hodges talks about both. He also addresses the challenge of building a company around an open-source solution.
To hear more of what Hodges had to say, listen to the podcast.
Where can I get it?
obtain it for Android devices from play.google.com/music/podcasts
use the Enterprise Times page on Stitcher
listen to the Enterprise Times channel on Soundcloud
listen to the podcast (below) or download the podcast to your local device and then listen there.