Why data quality and data governance are important to achieve compliance

Dr. Rupa Mahanti
Published in Analyst’s corner
5 min read · Nov 28, 2022

Regulations and Compliance

This article is a modified version of the article "Compliance, data, quality, and governance", published in EDPACS in March 2022. It draws significantly on the research presented in the books Data Quality: Dimensions, Measurement, Strategy, Management and Governance and Data Governance and Compliance: Evolving to Our Current High Stakes Environment.

Introduction

Takeovers, mergers, acquisitions, stock market crashes, accounting scandals, operational and corporate failures, security breaches, fraud, theft, and globalization have triggered the need for stricter governance and for the creation and enforcement of regulations that improve the accuracy, credibility, security, and privacy of corporate data, with the intent of preventing scandals, financial disasters, and data breaches. As new risks emerge, new regulations are introduced, existing regulations are amended, or both.

In this article, we present a brief overview of compliance and regulations, discuss the cost of non-compliance along with some related statistics, and explain the role that data quality and data governance play in achieving compliance.

Definitions of key terms

Regulations are rules or laws created by governments or other authorities to control behaviors or the way activities are conducted.

Compliance is about following rules and regulations.

Non-compliance costs are the costs that result when a company fails to comply with rules, regulations, policies, contracts, and other legal obligations.

Data compliance is a term used to describe the practices and processes that organizations adopt to ensure data associated with regulations are organized, stored and managed such that they are guarded against loss, manipulation, corruption, theft and misuse.

Data governance is a system of policies, rules, standards, processes, practices and structures, roles and responsibilities, controls, and decision rights to oversee the management of data.

Some key regulations

  • Sarbanes-Oxley (USA)
  • Dodd-Frank Wall Street Reform and Consumer Protection Act (USA)
  • Anti-Money Laundering Act (Australia)
  • EU-backed Basel II
  • Solvency II
  • Health Insurance Portability and Accountability Act (HIPAA)
  • General Data Protection Regulation (GDPR)

Cost of non-compliance — a few statistics

While the cost of compliance is high, the cost of non-compliance is significantly higher — nearly three times higher than the cost of compliance through implementation of data quality, governance, and compliance frameworks and solutions. Below are a few eye-watering dollar figures related to the cost of non-compliance with regulations:

According to a 2021 IBM report, lost business due to downtime or diminished reputation accounts for 38% of the overall cost of a breach.

In 2020 alone, banks were fined $14.2 billion for non-compliance, with the United States accounting for 78% of the fines issued.

The average cost of a data breach among the organizations surveyed reached $4.24 million per incident in 2021, the highest in the 17-year history of the report. (IBM)

In 2021, the average breach cost for healthcare organizations increased by 29.5% to $9.23 million.

Why are data governance and data quality needed for compliance?

Compliance with regulations requires good quality data, and good quality data in turn requires effective data governance. In fact, the early drivers of data governance originated as mandates in the compliance, legal, risk, and audit departments within organizations. Today, data governance is the biggest driver for compliance.

While data has always been important, regulations such as, but not limited to, the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR) have challenged organizations to improve their data quality and to create controls and formal accountability for data.

Compliance requires the ability to locate and understand the data associated with regulations. Organizations also need to be able to produce the right data at the right level of granularity at the right time, and the data needs to be of high quality (that is, it should be accurate, consistent, valid, and complete).
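To make these dimensions more concrete, here is a minimal Python sketch of how completeness, validity, and consistency might be measured on a handful of records. The field names, the sample records, and the rules are illustrative assumptions and do not come from any regulation or tool. Accuracy is left out because checking it requires comparison against a trusted real-world source.

```python
import re
from datetime import date

# Hypothetical account records; field names and values are illustrative only.
records = [
    {"id": 1, "email": "ann@example.com", "opened": date(2020, 1, 5), "closed": None},
    {"id": 2, "email": "not-an-email", "opened": date(2021, 3, 1), "closed": date(2020, 12, 31)},
    {"id": 3, "email": None, "opened": date(2019, 7, 9), "closed": date(2022, 2, 1)},
]

# A deliberately simple format rule, assumed for this example.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows in which the field is populated."""
    return sum(r[field] is not None for r in rows) / len(rows)


def validity(rows, field, pattern):
    """Share of populated values that match the expected format."""
    values = [r[field] for r in rows if r[field] is not None]
    if not values:
        return 0.0
    return sum(bool(pattern.match(v)) for v in values) / len(values)


def consistency(rows):
    """Share of rows whose closing date, if present, is not before the opening date."""
    ok = sum(r["closed"] is None or r["closed"] >= r["opened"] for r in rows)
    return ok / len(rows)


report = {
    "email_completeness": completeness(records, "email"),
    "email_validity": validity(records, "email", EMAIL_PATTERN),
    "date_consistency": consistency(records),
}
print(report)
```

In practice, scores like these would be tracked per data element and compared against thresholds agreed with the data owner.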

The ability to trace the data in a report back to its sources is becoming a regulatory requirement, especially for regulatory reporting. This is known as data lineage or traceability, and it is one of the dimensions of data quality.

Knowing where your compliance-related data resides is the first step towards achieving compliance, and is often the biggest challenge too. This is because most organizations are by nature data hoarders and compliance-related data is generally stored in a myriad of different locations, systems, data stores, and formats in organizations. In short, compliance-related data is often scattered all over the place.

However, unless a business knows where the related data are stored and can locate them, it cannot provision the data in regulatory reports, adequately protect the data, prevent unauthorized access, ensure data privacy, or guarantee that the data are fit for purpose. Reporting entities must be able to demonstrate how each value in a report is generated, including its calculation, transformation details, lineage, and source data. Each data item in a regulatory report must be attributed to a data owner and a data steward, who should be able to prove that it is of high quality. Organizations should also have the knowledge and capability to trace each data item back to an authoritative source.
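The traceability described above can be pictured as a small graph in which each reported value points to the items it was derived from. The Python sketch below is only an illustration under assumed names: the DataItem class, the dataset names, the owners and stewards, and the transformation text are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataItem:
    """A reported value or an upstream data set (all names are hypothetical)."""
    name: str
    owner: str
    steward: str
    transformation: str = "none (authoritative source)"
    sources: list[DataItem] = field(default_factory=list)


def trace(item, depth=0):
    """Return the lineage of an item as indented lines, one per upstream item."""
    indent = "  " * depth
    lines = [
        f"{indent}{item.name} [owner: {item.owner}, steward: {item.steward}] "
        f"via {item.transformation}"
    ]
    for src in item.sources:
        lines.extend(trace(src, depth + 1))
    return lines


def authoritative_sources(item):
    """Collect the names of items that have no upstream sources."""
    if not item.sources:
        return [item.name]
    found = []
    for src in item.sources:
        found.extend(authoritative_sources(src))
    return found


# Example: one reported figure derived from two assumed source systems.
ledger = DataItem("general_ledger.balances", "Finance", "Steward A")
fx = DataItem("treasury.fx_rates", "Treasury", "Steward B")
exposure = DataItem(
    "report.total_exposure_usd", "Risk", "Steward C",
    transformation="sum(balance * fx_rate)",
    sources=[ledger, fx],
)

print("\n".join(trace(exposure)))
print(authoritative_sources(exposure))
```

A record of this kind gives a reviewer the calculation, the accountable owner and steward, and the authoritative sources for a single reported value in one place.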

Data governance results in data stewardship roles being established. Data stewards, who are subject matter experts (SMEs), are responsible for specific data sets. They understand the business meaning of the data and the purposes for which the data are used, and they can locate the data needed for compliance purposes with the help of data discovery software.

Data governance also results in the establishment of standards for data elements and data entities. It ensures that metadata and data lineage information are kept up to date, which in turn assists the data discovery process. Finally, data governance results in the creation of processes for resolving data quality issues and ensuring that data is of high quality.
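As a rough illustration of how up-to-date metadata supports discovery, the following Python sketch queries a tiny, made-up metadata catalog for the data sets relevant to a given regulation and flags regulated data sets that have no steward assigned. The catalog structure, the dataset names, and the regulation tags are assumptions for the example, not the output of any particular data discovery product.

```python
# Hypothetical metadata catalog: where each data set lives, who stewards it,
# and which regulations it has been tagged with.
catalog = [
    {"dataset": "crm.customers", "system": "CRM",
     "steward": "Sales Ops", "regulations": {"GDPR"}},
    {"dataset": "claims.patient_records", "system": "Claims DB",
     "steward": "Clinical Data", "regulations": {"HIPAA", "GDPR"}},
    {"dataset": "finance.journal_entries", "system": "ERP",
     "steward": "Finance", "regulations": {"Sarbanes-Oxley"}},
    {"dataset": "marketing.email_list", "system": "Spreadsheet",
     "steward": None, "regulations": {"GDPR"}},
    {"dataset": "web.page_timings", "system": "Data lake",
     "steward": None, "regulations": set()},
]


def datasets_for(regulation, entries):
    """Names of the data sets tagged as relevant to the given regulation."""
    return sorted(e["dataset"] for e in entries if regulation in e["regulations"])


def regulated_without_steward(entries):
    """Data sets that fall under at least one regulation but have no steward."""
    return sorted(
        e["dataset"] for e in entries
        if e["regulations"] and e["steward"] is None
    )


print(datasets_for("GDPR", catalog))
print(regulated_without_steward(catalog))
```

The second query shows the kind of accountability gap that governance is meant to close: a regulated data set with nobody responsible for it.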

Concluding thoughts

Change is the only constant. The regulatory landscape varies by industry sector, market, country, and geography, and it is extremely dynamic and constantly evolving.

The number of regulations that require an organization to have control over its data has increased over the last few years. Attaining compliance requires collaboration among different stakeholders: business, IT, and data stakeholders from the business units and departments that have a stake in the compliance data.

Data governance and data quality are important elements of successful compliance. While they are not only about compliance, organizations with good data governance and good quality data should be able to achieve compliance with respect to data.

Thank you for reading! Take care!

Please do let me know whether this article was helpful, and what more you would like to read with respect to compliance, data quality, and governance. Leave a comment here or connect on LinkedIn.

Biography: Rupa Mahanti is a consultant, researcher, speaker, data enthusiast and author of several books on data (data quality, data governance and data analytics). She is also publisher of “The Data Pub” newsletter on Substack.

Author of 7 books, mostly on data; Ph.D. in Computer Sc. & Eng.; Digital art designer; Publisher- The Data Pub (https://thedatapub.substack.com/)