Companies have sought to become data-centric in recent years in a bid to become more responsive and customer-focused. Now, technologically, the trend among enterprises is to decentralise big data to increase relevance and speed to insight.

Traditionally, data from each domain is processed using ETL (Extract, Transform and Load) pipelines and fed into a giant central data lake, usually managed by IT. A decentralised, distributed architecture instead keeps data within its originating domain, in its own lake, meaning each domain has control of its own data.
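The three ETL stages mentioned above can be sketched in a few lines. This is a minimal, illustrative example only: the function names, sample records and in-memory "warehouse" are invented for the sketch, and a real pipeline would use an orchestration framework rather than plain functions.

```python
# Minimal sketch of the classic Extract-Transform-Load pattern.
# All names and records here are illustrative, not a real pipeline.

def extract():
    # Pull raw records from a source system (hard-coded for the sketch).
    return [
        {"order_id": 1, "amount": "19.99", "country": "uk"},
        {"order_id": 2, "amount": "5.00", "country": "FR"},
    ]

def transform(rows):
    # Normalise types and values before loading.
    return [
        {
            "order_id": r["order_id"],
            "amount": float(r["amount"]),
            "country": r["country"].upper(),
        }
        for r in rows
    ]

def load(rows, warehouse):
    # Append the cleaned rows to the central store.
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse[0]["country"])  # "UK"
```

In the decentralised model the article describes, each domain would own and run its own version of a pipeline like this, rather than handing raw data to a central IT team.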

Moving away from centralised data warehousing has distinct advantages: teams can access and control their data through self-service platforms. In effect, it creates data infrastructure as a platform, allowing data to be consumed and managed as a product.

Data Mesh vs Data Fabric

An example of this is Data Mesh, the brainchild of Zhamak Dehghani, who has founded a stealth tech startup of the same name, dedicated to reimagining data platforms with Data-Mesh-native technologies.

Data Mesh enables each domain to handle its own data pipelines, while providing consistency of syntax and standards, which sets it apart from data fabric. As a result, the business can get value from data rapidly, sustainably and at scale, while governance is overseen locally.

However, decentralising your data does not automatically mean teams will use it effectively. Dehghani cautions that implementing Data Mesh also requires sociotechnical change: organisations must embrace new behaviours as well as new technology to reap the approach’s full benefits.

Other advances within the data sector include data analysis during the ETL process. Various providers, from Google, SAP and Select Star to dbt and Snowflake, have developed cloud-based platforms and analysis tools that enable analysis in flight in a bid to reduce latency. Moreover, the concept of Fast Data is seeing batch processing replaced by event-streaming systems, promising instant data analysis and real-time analytics.
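The batch-versus-streaming shift described above comes down to when the aggregate is computed: a batch job waits for the full dataset, while a streaming system updates its answer as each event arrives. The sketch below illustrates that difference in plain Python; the events and the `running_average` helper are invented for illustration, and a production system would use a streaming platform such as Kafka or Flink rather than a generator.

```python
# Illustrative contrast between batch and event-stream processing.
# The events and helper names are made up for this sketch.

def running_average(events):
    """Update the aggregate as each event arrives, instead of
    waiting for the whole batch to land before computing it."""
    total, count = 0.0, 0
    for event in events:
        total += event["value"]
        count += 1
        yield total / count  # an up-to-date insight per event

events = [{"value": 10.0}, {"value": 20.0}, {"value": 30.0}]

# Streaming: a result is available after every event...
print(list(running_average(events)))  # [10.0, 15.0, 20.0]

# ...whereas a batch job would only produce the final 20.0
# once the full dataset had been collected and processed.
```

The promise of Fast Data is essentially the middle values in that list: insight while the data is still arriving, not only after the batch completes.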

However, businesses seeking to harness these new technologies must be mindful of how they handle any transition. They’ll need to consider micro-partitioning, expect syntax issues or discrepancies, and monitor for data errors during the conversion process.

People power

Migrating data and planning the implementation of these technologies requires specialist skill sets, leading to increased demand for data engineers. These engineers are increasingly developing and building the Modern Data Stack (MDS) by integrating data products and services within systems and business processes. To ensure the MDS delivers, data engineers must work closely with development teams and data analysts. Together, they continually tweak the architecture to improve data processing efficiency, giving rise to the discipline of DataOps.

Despite this oversight, data transformation projects can quickly become derailed if too little attention is paid to the governance of data processing; the governance framework must be prioritised. However, some businesses continue to govern data centrally even though the data itself has been decentralised. This makes little sense, as ownership of the data and its governance should go hand in hand. It is worth noting that progress is being made here too, with automated solutions emerging for handling data governance.

Finally, any changes to the data architecture will need to be reflected in the company’s data strategy, which describes how the business manages its people, policies and culture. The data strategy also seeks to improve data maturity: the ability to improve and increase data usage. Keeping track of your data maturity can help the business to identify where investment is needed and to prioritise change. But it also pays to learn from the experience of others who have followed a similar journey.

Learn from the big data experts

The Big Data LDN Conference and Exhibition is held at London Olympia from 21-22 September. Over 200 data and analytics experts will be on hand to share how they have decentralised their data or built dynamic, data-driven enterprises. Attendees also have access to free on-site data consultancy and interactive community meet-ups.

Zhamak Dehghani will open the event as the keynote speaker, delivering her session Rewire for Data Mesh: atomic steps to rewire the sociotechnical backbone of your organisation at 10am on 21 September. Other experts will share unique stories, peerless expertise and real-world use cases.

Starburst, Snowflake, LNER, Deliveroo, Microsoft, Actian, Confluent, Dataiku and Deloitte will also be taking a deep dive into topics ranging from Modern Analytics and DataOps to Data Governance and AI & MLOPS.

The Big Data LDN event is free to attend but you’ll need to register. To secure your place, please sign up here.

Big Data LDN is the UK’s leading free-to-attend data and analytics conference and exhibition, hosting leading experts and providers from across the sector, from both the technical and business spheres. The event brings together the data and analytics community with a focus on how to build dynamic, data-driven enterprises. The superior end-user led conference programme features industry pioneers, experts and real-world case studies, equipping visitors with the tools and techniques to deliver maximum business value from successful data projects. Delegates also have the opportunity to discuss their business requirements and challenges with more than 130 technology vendors and consultants.
