Organizations are getting serious about extracting value from the data they produce and collect, even when that data is spread out across multiple clouds, data centers, and silos. Three terms you might hear when learning techniques for this are data mesh, data fabric, and data virtualization. The three concepts might even seem to overlap when you first encounter them. But what distinctions exist among them? 

Joe Warnimont, senior analyst at HostingAdvice, offers one way to think about the three. “I like to think of data fabric as the actual highway system for moving data, while data mesh focuses on the intangible: your organizational data approach,” he says. “And data virtualization acts as your translator so every part of your organization, and the systems in it, can understand one another while data moves about.”  

To help you better understand data mesh, data fabric, and data virtualization, we’ll dive into the details of each, then look at a case study that shows how all three concepts can coexist.

What is data mesh? 

“Data mesh is a decentralized model for data, where domain experts like product engineers or LLM specialists control and manage their own data,” says Ahsan Farooqi, global head of data and analytics, Orion Innovation. While data mesh is tied to certain underlying technologies, it’s really a shift in thinking more than anything else. In an organization that has embraced data mesh architecture, domain-specific data is treated as a product owned by the teams relevant to those domains. “Data mesh empowers teams and treats data as a strategic asset,” Farooqi says. 

Data mesh arises from the concept of domain-driven design, which in turn informed the idea of microservices-based architectures. You can think of data mesh as a microservices-style architecture for data: Data under a specific domain is owned by the appropriate teams, who use APIs or other techniques to make that data available to potential consumers.
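
To make the idea concrete, here is a minimal sketch of a domain-owned data product, assuming a hypothetical orders domain that publishes its records through a versioned API. The Flask app, the /data-products/orders/v1 route, and the orders_db store are all illustrative names, not a prescribed implementation.

```python
# A minimal sketch of a data-mesh "data product": the orders domain team
# owns its data and publishes it through a versioned API, along with
# metadata (owner, schema, freshness) so consumers can discover and
# trust it. All names here are hypothetical.
from flask import Flask, jsonify

app = Flask(__name__)

# In a real domain this would be the team's own store (a warehouse table,
# an OLTP replica, etc.); a list of dicts stands in for it here.
orders_db = [
    {"order_id": 1, "customer_id": 42, "total": 99.50},
    {"order_id": 2, "customer_id": 7, "total": 12.00},
]

@app.route("/data-products/orders/v1")
def orders_product():
    # Serve the records alongside product metadata, so the data ships
    # with everything a consumer needs to use it responsibly.
    return jsonify({
        "owner": "orders-domain-team",
        "schema": {"order_id": "int", "customer_id": "int", "total": "float"},
        "freshness": "refreshed hourly (illustrative)",
        "records": orders_db,
    })

if __name__ == "__main__":
    app.run(port=8080)
```

The point is the packaging: the domain team ships its data together with schema, ownership, and freshness information, so other teams can consume it like any other product.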

What is data fabric? 

Data fabric is a type of architecture designed to provide unified access to the data stored in various places across your organization. As Matt Williams, field CTO at Cornelis Networks, puts it, “Data fabric is an architecture and set of data services that provides intelligent, real-time access to data — regardless of where it lives — across on-prem, cloud, hybrid, and edge environments. This is the architecture of choice for large data centers across multiple applications.” 

The data fabric concept recognizes that most enterprises aren’t able or willing to consolidate every department’s valuable data, and so instead it serves as an abstraction layer that interacts with individual data silos, weaving together important information stored in everything from massive traditional RDBMSes to small departmental NoSQL databases. Data fabric uses AI/ML to understand how all these data sources relate to one another and to surface useful insights. (For more details on how data fabric works and who’s adopting it, read InfoWorld’s “What is data fabric? How it offers a unified view of your data.”)
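
As an illustration of that abstraction-layer idea (not any vendor’s actual product), here is a minimal Python sketch in which two hypothetical silos, a warehouse and a departmental document store, sit behind one catalog with a uniform query interface. Real fabrics layer metadata management and the ML-driven discovery described above on top of connectors like these.

```python
# A minimal sketch of a data fabric's abstraction layer, under the
# assumption that each silo is wrapped in a connector exposing the same
# interface. The connectors and catalog here are hypothetical stubs.
from abc import ABC, abstractmethod

class Connector(ABC):
    """Uniform interface the fabric expects from every silo."""
    @abstractmethod
    def fetch(self, entity: str) -> list[dict]: ...

class WarehouseConnector(Connector):
    def fetch(self, entity: str) -> list[dict]:
        # Would issue SQL against the warehouse; stubbed for the sketch.
        return [{"source": "warehouse", "entity": entity}]

class DocumentStoreConnector(Connector):
    def fetch(self, entity: str) -> list[dict]:
        # Would query a departmental NoSQL store; stubbed for the sketch.
        return [{"source": "docstore", "entity": entity}]

class DataFabric:
    """Catalog that weaves registered silos together behind one access point."""
    def __init__(self):
        self._catalog: dict[str, Connector] = {}

    def register(self, name: str, connector: Connector) -> None:
        self._catalog[name] = connector

    def query(self, entity: str) -> list[dict]:
        # One logical query fans out to every registered silo.
        results = []
        for connector in self._catalog.values():
            results.extend(connector.fetch(entity))
        return results

fabric = DataFabric()
fabric.register("sales_warehouse", WarehouseConnector())
fabric.register("support_tickets", DocumentStoreConnector())
print(fabric.query("customer"))
```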

Data fabric is a more concretely technical idea than data mesh, which is primarily conceptual. Multiple vendors sell data fabric offerings: large-scale architectures that often promise a single pane of glass into your company’s data.

What is data virtualization? 

Both data fabric and data mesh require working with data that might be saved in many places and formats. Data virtualization is the secret sauce that can make that happen. “Data virtualization is a technology layer that allows you to create a unified view of data across multiple systems and allows the user to access, query, and analyze data without physically moving or copying it,” says Williams. That means you don’t have to worry about reconciling different data stores or working with data that’s outdated.
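
Here is a minimal sketch of that idea, assuming two hypothetical sources: a SQLite sales table standing in for a relational store, and an in-memory list standing in for records another system might return over an API. One query federates both at read time, and nothing is copied into a central store.

```python
# A minimal sketch of data virtualization: answer one query by joining
# two live sources at read time, without materializing either into a
# central store. Both sources here are hypothetical stand-ins.
import sqlite3

# Source 1: a relational store (could be any RDBMS in practice).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(42, 99.50), (7, 12.00), (42, 5.25)])

# Source 2: records as another system might return them, e.g. via an API.
customers = [
    {"customer_id": 42, "name": "Acme Corp"},
    {"customer_id": 7, "name": "Globex"},
]

def total_sales_by_customer():
    """Join the two sources on the fly; no central copy is created."""
    totals = {cid: amt for cid, amt in conn.execute(
        "SELECT customer_id, SUM(amount) FROM sales GROUP BY customer_id")}
    return [{"name": c["name"], "total": totals.get(c["customer_id"], 0.0)}
            for c in customers]

print(total_sales_by_customer())
# [{'name': 'Acme Corp', 'total': 104.75}, {'name': 'Globex', 'total': 12.0}]
```

The consumer sees one unified answer, even though the underlying records never left their home systems.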

Data fabric uses data virtualization to produce that single pane of glass: It allows the user to see data as a unified set, even if that’s not the underlying physical reality. This is also important in companies that implement data mesh: After all, the domains that own the data will be dealing with heterogeneous data situations of their own, and will need data virtualization to create unified and useful data products.