When it comes to business intelligence and decision-making, there’s a common theme: business decisions are only as good as the data they’re based on. Yet in practice, data is rarely clean, unified, or easy to trust. According to the ISG report (2025), “Organisations recognise that combining data from multiple sources is essential for analytics and operational efficiency, but without trusted master data, insights risk being misleading or incomplete.” This is precisely where structure becomes critical. As Donna Burbank, Managing Director of Global Data Strategy, puts it: “Master Data Management (MDM) can help build this 360-degree view of key business information to allow you to take full advantage of your organization’s data for better business outcomes.”[1]
As a developer who has worked with Boomi for over seven years, with more than 15 years' experience working with data and integrations across the manufacturing, agriculture, retail, insurance, and education sectors, I’ve seen my share of untrustworthy data.
Boomi Data Hub is an invaluable tool for organisations to manage their data efficiently and strategically, providing a unified view of their business-critical records independently of the multiple systems they may have in place. It bridges business operations and technology to build trust in organisational data, keeping it integral, accurate and semantically consistent.[2]
Increasingly, organisations are choosing best-of-breed software to manage individual business domains. They might select one vendor’s ERP system for its ease of use, another’s CRM for its configurability and ability to contact customers quickly, and perhaps yet another vendor’s product lifecycle management system. These individual components will on some level need to share data. ERPs need to know about products for manufacture and customers for sales orders. CRMs need to know which customers order what, and how often. Product management needs to know about all of this, and what it takes to make something for sale.
This is where Data Hub is a great fit – a multi-system environment where information is disparate but inherently interlinked and relied on by business users in their day-to-day operations. The platform allows users to curate their data and provide a single source of truth for any domain of their choosing: customer data, products, regulatory compliance, supply chain management, or indeed any significant area within their business. But as we know, data is only as good as what is recorded in the user-facing systems the business has in place. Mistakes can happen, records can be entered twice (or more!), automation can fail. In fact, there’s a plethora of things that could go wrong, both within, and outside of, the control of the company and its users.
There are of course scenarios where Data Hub isn’t the right tool for your data. Highly volatile data with constantly changing values, such as real-time sensor readings from a manufacturing plant, is one such example. Another might be low-complexity data, or a construct that is not shared with any other system – for instance, shelving locations for raw materials in a warehouse. A good rule of thumb is to imagine two different departments of a business arguing over what is actually the correct view of a record; if you can imagine this, there’s a good chance Data Hub can help you.
This post will cover some issues I’ve encountered in my experience implementing Data Hub, why they happen, and what we can do to mitigate and resolve issues to allow Data Hub to be its best, and give a unified, consistent and accurate view right down to a record-by-record level.
Build Your Data Foundations
Modelling your data is key to a successful Data Hub implementation. It’s imperative to understand not only what is in your data, but how it flows and changes across your application landscape. For instance, a customer in a legacy system could be represented very differently in a new CRM – both in terms of how it is presented to users of those systems, as well as system-specific internal idiosyncrasies that don’t translate between the individual applications.
1. Name your data fields well
Field names should be specific enough within a model to indicate what the data contained within them is, but generic enough that they can be recognised regardless of the lens you’re viewing the data through. For example, any business user, regardless of their domain, could recognise what a “customer number” is – which is a perfect name for this data field.
But – what about a “customer status”? Active, inactive, suspended, ex-customer, loyal customer, or any number of descriptors could be reflected here. In a recent client’s project, our solution was to define a discrete set of values and map them to the values each downstream system expects. When we know what we expect a field to hold and can easily recognise that data by a human-readable field name, we end up with a unified view of that record that anyone reading it can understand.
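To make this concrete, here is a minimal sketch (not Boomi functionality – just plain Python) of what normalising source-specific status codes into one discrete, canonical set might look like. All the system names, status codes and canonical values below are illustrative assumptions:

```python
# Hypothetical mapping of each source system's raw status codes to one
# discrete, human-readable canonical set. Every value here is an
# illustrative assumption, not taken from any real system.
STATUS_MAP = {
    "erp": {"A": "active", "I": "inactive", "S": "suspended"},
    "crm": {"Current": "active", "Lapsed": "ex-customer"},
}

def normalise_status(source: str, raw_status: str) -> str:
    """Translate a source system's status code into the canonical value."""
    try:
        return STATUS_MAP[source][raw_status]
    except KeyError:
        # Unknown sources or codes are flagged rather than guessed at.
        return "unknown"

print(normalise_status("erp", "A"))       # active
print(normalise_status("crm", "Lapsed"))  # ex-customer
print(normalise_status("crm", "XYZ"))     # unknown
```

The key design choice is that unrecognised codes surface as “unknown” rather than being silently passed through – exactly the kind of value a quality gate can catch later.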
2. Understand what constitutes a complete record
Data that’s incomplete leads to confusion. More often than not, we need a minimum set of mandatory data fields to create a meaningful golden record in Data Hub. For instance, there’s not much point creating a record that only has an email address, and no other information about the customer from our CRM. We should define in our data model those fields we know we must include to form a useful record. Continuing our customer example, we might define our customer number and customer name as mandatory fields. Data typing also comes into play here – for example, we might have a constraint on our phone number field that it must be numeric only.
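The checks described above can be sketched in a few lines. This is an illustration of the *kind* of rule a model enforces, not Boomi’s implementation; the field names, the mandatory set, and the numeric-only phone constraint are all assumptions carried over from the example:

```python
import re

# Illustrative completeness and typing checks, in the spirit of a Data Hub
# model definition. Field names and rules are assumptions from the text.
MANDATORY_FIELDS = {"customer_number", "customer_name"}

def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is clean."""
    problems = [f"missing mandatory field: {f}"
                for f in MANDATORY_FIELDS if not record.get(f)]
    phone = record.get("phone")
    if phone and not re.fullmatch(r"\d+", phone):
        problems.append("phone must be numeric only")
    return problems

print(validate_record({"customer_number": "C001",
                       "customer_name": "J. Smith",
                       "phone": "0211234567"}))   # [] -> clean record
print(validate_record({"phone": "021-123"}))      # three problems reported
```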
Any record incoming to Data Hub will undergo checks against these rules and constraints, ensuring we only accept clean data that has all the information the business requires to create a meaningful golden record – one that can be relied upon as the truth when it’s published downstream to other systems and applications. In the case of our client’s CRM integrations, Data Hub made it possible to ensure that when a customer record was searched in the system, the user got a complete view of that customer’s contact details, and could follow up on a sales opportunity instead of wasting their time searching for a phone number or email address.
3. Know where your data comes from
Some data sources will contribute data to Data Hub, while others will only accept data from it. Some sources will both contribute and accept data. Understanding how your data flows – which systems are “read-only”, and which will contribute updates to the modelled data – is important so that we don’t end up overwriting or accidentally erasing information that we wanted to keep.
Sources enabled as contributors to the model can create and update fields – and this can be configured on a field-by-field basis. For instance, your brand-new CRM might assign customer numbers to newly created customers. In this case, its corresponding source can be set to contribute customer numbers to the model. If that customer number is then altered (by mistake or otherwise) in any system besides the CRM, those sources should be excluded from contributing to this field, leaving us with a golden record containing the correct customer number.
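The merge behaviour this produces can be sketched as follows. To be clear, this is not Boomi’s API – just a plain-Python illustration of per-field contribution rules, with assumed source and field names:

```python
# Hypothetical per-field contribution rules: only sources listed for a
# field may overwrite it on the golden record. Names are assumptions.
CONTRIBUTES = {
    "customer_number": {"crm"},          # only the CRM assigns numbers
    "postal_address":  {"crm", "erp"},   # either system may update this
}

def apply_update(golden: dict, source: str, update: dict) -> dict:
    """Merge an incoming update, honouring the contribution rules."""
    merged = dict(golden)
    for field, value in update.items():
        if source in CONTRIBUTES.get(field, set()):
            merged[field] = value
        # Otherwise the update to this field is ignored, preserving the
        # golden record's correct value.
    return merged

golden = {"customer_number": "C001", "postal_address": "1 Main St"}
# The ERP mistakenly tries to change the customer number:
result = apply_update(golden, "erp", {"customer_number": "C999"})
print(result["customer_number"])  # C001 -> the mistaken update is ignored
```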
A potential pitfall here is when two different sources can both contribute to the same field. A recent client had this exact scenario: two systems could be independently changed, meaning that each sent a conflicting update to the same field in Data Hub. Data Hub is strong when updates are infrequent, but it’s not designed to be a real-time event bus, and that distinction matters in this context. In practice, this is unlikely to happen, but it is worth noting that it is technically possible, so modelling these constraints is part of the Adaptiv pattern for success.
4. Validate your data
This is where Data Hub’s value really shines. When our accurate models inform us that our respective systems require a certain set of fields to be considered a complete record (e.g. a customer record must contain a customer number, name, postal address, and email address), we can create a tag named something like “Record contains customer number, name, postal & email address”, and have Data Hub examine each record in the model and tag each record appropriately. That might seem like a wordy tag name, but as we start to involve more and more systems, we might end up with a whole list of tags – so it’s worth making them as specific and readable as we can as we create them.
These tags can then be used to selectively publish records to the downstream systems and we’re able to withhold records in Data Hub from publishing until they meet their targeted tag definitions. This allows us to keep our systems clean and free of orphan records – and what’s even more convenient, is that if a record doesn’t meet our tag definition, the record is only withheld in Data Hub until it does.
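Conceptually, a tag is just a named predicate over a record, and publishing is gated on the tag a target system requires. The sketch below illustrates that idea in plain Python – the tag name, field names and gating logic are all assumptions for illustration, not Boomi’s implementation:

```python
# Hypothetical tag definitions: each tag is a named rule over a record.
TAGS = {
    "complete-contact-details": lambda r: all(
        r.get(f) for f in ("customer_number", "customer_name",
                           "postal_address", "email")),
}

def tags_for(record: dict) -> set[str]:
    """Evaluate every tag rule against the record."""
    return {name for name, rule in TAGS.items() if rule(record)}

def publishable(record: dict, required_tag: str) -> bool:
    """Withhold the record until it carries the tag the target requires."""
    return required_tag in tags_for(record)

partial = {"customer_number": "C001", "customer_name": "J. Smith"}
print(publishable(partial, "complete-contact-details"))  # False -> withheld
partial.update(postal_address="1 Main St", email="j@example.com")
print(publishable(partial, "complete-contact-details"))  # True -> published
```

Note the convenient property mentioned above falls out naturally: a withheld record isn’t rejected, it simply becomes publishable the moment its data catches up to the tag definition.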
Another Adaptiv client had defined in their data model a hierarchical relationship in their data – some of their customers belonged to larger organisations they termed parent companies. If they tried to load a child company into their CRM, Data Hub would check that the parent company existed beforehand. If not, it would load that parent record first, ensuring the child data wasn’t just pushed in blindly and orphaned.
Reinforcing The Foundation
Just because we’ve painstakingly defined our model and ensured we’ve got the most stringent quality gates in place in Data Hub, doesn’t mean we automatically get nice clean, usable data. Inevitably, there are quality issues, anomalies, bugs, glitches and just downright incorrect data in our sources. This is especially true of legacy systems that have been in place for years, where users come and go, business processes evolve over time, and things simply end up in an inconsistent state.
Data stewardship plays a huge part in keeping operational data clean, accurate and concise. Organisations need someone – or perhaps many someones – to take responsibility for their master data, proactively ensuring it meets defined quality standards, is secure (especially if dealing with personally identifying information), and most importantly, understanding the data in their area of responsibility. This is why a company should appoint several stewards, each with a deep understanding of their business area. Many hands make light work, after all. Data stewards don’t need to be technical wizards – there are just a few simple steps to take to keep master data uncontaminated. And when issues do crop up, Data Hub can help resolve them by identifying them before they affect your operational systems.
Here are some of the most common issues:
1. Duplicated data
On systems with lax (or no!) data entry validation, it’s common that users end up creating duplicated records. Continuing our customer example, we might have customer “J. Smith” recorded under two different customer numbers – but we can tell by looking at the records that we’re dealing with the same customer. They have the same postal address, same email address and same name, but they have two distinct customer numbers. Based on our model configuration described above, Data Hub will detect this and quarantine the incoming record rather than blindly adding it to the master set, giving data stewards the opportunity to review the identified duplicate – and either match it to an existing record or discard it entirely.
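A simple version of that matching logic might look like the sketch below. Real matching in an MDM platform is configurable and far more sophisticated; this plain-Python illustration just shows the J. Smith scenario, with assumed field names and a naive exact-match rule:

```python
# Naive illustrative match rule: identical contact details under different
# customer numbers suggest a duplicate. Fields and logic are assumptions.
MATCH_FIELDS = ("customer_name", "postal_address", "email")

def is_likely_duplicate(incoming: dict, existing: dict) -> bool:
    """Flag records whose details match but whose numbers differ."""
    same_details = all(
        incoming.get(f, "").strip().lower() == existing.get(f, "").strip().lower()
        for f in MATCH_FIELDS)
    different_number = (incoming.get("customer_number")
                        != existing.get("customer_number"))
    return same_details and different_number

a = {"customer_number": "C001", "customer_name": "J. Smith",
     "postal_address": "1 Main St", "email": "j@example.com"}
b = {"customer_number": "C047", "customer_name": "j. smith",
     "postal_address": "1 Main St", "email": "J@example.com"}
print(is_likely_duplicate(b, a))  # True -> quarantine, don't auto-merge
```

The important behaviour is the outcome: a likely duplicate is quarantined for a human decision, never silently merged or silently added.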
2. Invalid or inferior quality data
Data Hub detects any incoming data that doesn’t align with the defined model parameters. Our model outlines the mandatory fields, the field size limits (both minimum and maximum), the expected field types (e.g. numeric, text, timestamps, etc.), and whether the incoming data contains enough information to reliably check for a match – to tell us if we’re looking at updating an existing golden record or creating a new one.
If our model contains business rules (for instance, we will only master customers if they have transacted with the business in the last 12 months), Data Hub can check against the defined business rules and quarantine the record(s) that don’t meet the requirements, until they can be reviewed.
3. Data integration errors
It is entirely possible that if we’re dealing with a legacy or previously manually maintained system that a record we intend to master in Data Hub doesn’t have a way to uniquely identify it in relation to its source. Or perhaps something went wrong in the process supplying data to master in Data Hub, omitting its source-system identifying value. Data Hub will again identify these cases and quarantine the incoming data for review.
For a steward, the review process is streamlined by Data Hub to make it as easy as possible. A typical workflow starts with a daily review of the quarantine holding pen. Each record is listed with a timestamp and a reason – making it clear to the stewards what they need to do to resolve the issue. From here, a steward would choose to either:
- Approve the record, if recognised as true & correct
- Manually edit the record and resubmit it after correcting the data
- Identify an existing record to match against, if it’s identified as a duplicate
- Or discard the record entirely
A quick daily check is all it takes to keep your environments free of clutter and inaccuracies – but the proactive steward will subscribe to Data Hub’s built-in notifications and address issues as they arrive, instead of waiting on a daily routine. The notification pattern is our recommendation, so why wait? There’s no time like the present – fix your data as it flows in, record by record, rather than building up a big backlog, and suddenly the daunting task of data quality turns into an easy step-by-step workflow that can be done in just a few clicks of the mouse.
Mastering your organisation’s data isn’t just a technical exercise, it’s a strategic capability. Boomi Data Hub provides the framework to bring structure, consistency, and clarity to even the most complex data landscapes, but its true value emerges when it’s paired with thoughtful modelling, strong data stewardship, and a clear understanding of how information flows across systems. By defining a robust model, enforcing quality rules, and leveraging Data Hub’s powerful matching, tagging, and quarantine features, businesses can move beyond reactive cleanup and into proactive, sustainable data governance.
When implemented well, Data Hub becomes far more than a database of mastered records – it becomes a reliable, living source of truth that supports better decision-making, improves operational efficiency, and reduces the downstream impacts of poor-quality data. With the right foundations and ongoing care, organisations can ensure that their master data remains accurate, complete, and trustworthy, empowering both systems and people to operate with confidence.
If you’re starting out with Boomi and you’re unsure of how to get the most from the platform, make sure to check out our ebook ‘10 best integration practices for successful Boomi Enterprise Solutions’. Or if you’re unsure where to start, drop us a message.
References:
[1] https://globaldatastrategy.com/master-data-management-mdm/
[2] https://www.dataversity.net/data-concepts/what-is-master-data-management/