What has the GDPR taught us? We discuss the relationship between personal data protection and data quality, data governance and machine learning.
Two years after the implementation of the GDPR, many companies are still not complying with personal data protection rules. The regulatory framework has taught us a lesson in data quality and data governance that companies can no longer ignore.
In February 2018, we announced the imminent end of the period that the European Union had granted organizations to implement the compliance guidelines of the new General Data Protection Regulation (GDPR). In other words, the GDPR implementation period was ending on May 25, 2018. In that post, 'GDPR: do you have an ETL? Then you have a problem', we already discussed the problem facing companies that had contracted ETL systems, most of which were not prepared for the GDPR's new personal data protection requirements. In fact, Forbes magazine reported in 2017 that most companies (more than 50%) were not fully compliant with GDPR requirements.
Adrian Knapp, a member of the Forbes Technology Council, summarized three tips for complying with the GDPR in this article, published in 2021:
"1. Know what data you have. Conduct a data deep-dive. Learn what data your company is storing, who has access and whether it is secure.
2. Make a plan. Create internal workflows to check where data might be replicated outside the company's secure data storage infrastructure and create data policies to prevent future risk.
3. Reduce your data footprint, increase visibility and reduce data risk. Knowing exactly what data you have and where it's stored allows data managers to delete —yes, delete— redundant, obsolete and trivial data. With less overall data, companies have more control over access and limit the risk of breaches and leaks, reducing compliance headaches."
Two Years of GDPR: What Have We Learned?
Two years after its implementation, non-compliance with the GDPR remains a hot topic, fueled by numerous stories of failure starring companies that, despite the warnings, were not prepared for the new regulatory framework. Yet the lack of compliance seems to have little to do with surprise or with a lack of time to prepare. DLA Piper, a law firm specializing in personal data protection and privacy, announced in January 2021 that GDPR fines in 2020 had increased by 40% compared to the previous 20 months. This leads us to believe that either more companies failed to comply with the law in 2020, or the European Union plans to progressively toughen sanctions to ensure compliance.
But not everything has been bad. These two years of GDPR have offered us several lessons that in a children's story would be summarized in a very simple moral: Control the quality of what you work with. If you work with poor quality materials, you will get bad results.
The Role of Data Quality in GDPR
The GDPR has made it clear that data compliance has more to do with data quality than with bad intentions.
Nowadays, most companies work with personal data relating to their customers, sensitive information and even personal data of third parties. In this sense, it is now more essential than ever for companies to take responsibility for the information they work with, not only by ensuring the security of the data they handle, but also by ensuring its quality and governance.
One of the GDPR's measures goes in this direction, requiring companies to correct inaccurate or incomplete personal data. In other words, companies must validate the data, consolidate it and ensure that it is reliable; in short, they must guarantee its quality.
Data quality is therefore essential to comply with the GDPR. Without comprehensive data quality controls, it is impossible to locate and resolve inaccuracies in personal data. Data quality processes must also be complemented by data governance initiatives. In other words, companies must change their approach to data-related policies and activities, moving away from understanding them as isolated tasks to fully integrating their data governance and data quality efforts.
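To make the idea of a data quality control more concrete, here is a minimal Python sketch that flags incomplete or inaccurate personal records. It is only an illustration: the required fields, the sample customers and the deliberately simple email pattern are all assumptions, not rules taken from the GDPR or from any specific tool.

```python
import re

# Hypothetical required fields for a customer record (an assumption for illustration).
REQUIRED_FIELDS = ("name", "email", "country")

# Deliberately simple pattern; real-world email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def find_quality_issues(record: dict) -> list[str]:
    """Return a list of human-readable issues found in one record."""
    issues = []
    # Completeness: every required field must be present and non-empty.
    for field_name in REQUIRED_FIELDS:
        value = record.get(field_name)
        if value is None or str(value).strip() == "":
            issues.append(f"missing {field_name}")
    # Accuracy: if an email is present, it must at least look like one.
    email = record.get("email")
    if email and not EMAIL_PATTERN.match(str(email).strip()):
        issues.append("invalid email")
    return issues


def audit(records: list[dict]) -> dict[int, list[str]]:
    """Map the index of each problematic record to its list of issues."""
    report = {}
    for index, record in enumerate(records):
        issues = find_quality_issues(record)
        if issues:
            report[index] = issues
    return report


# Made-up sample data.
customers = [
    {"name": "Ana Ruiz", "email": "ana@example.com", "country": "ES"},
    {"name": "", "email": "not-an-email", "country": "FR"},
    {"name": "Lee Wong", "email": None, "country": "DE"},
]
report = audit(customers)
```

In a real project, a report like this would feed the correction process rather than replace it: someone still has to decide who owns each record and how it gets fixed, which is where data governance comes in.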
In this blog we have also previously reflected on the close relationship between data quality and data governance, one being necessary for the other to exist. Let's remember that data quality measures the accuracy, relevance, completeness and understanding of the data, as well as the satisfaction of the requirements of the use to which they are to be put. On the other hand, the objective of data governance is "to ensure the integrity of data assets through processes and procedures, standardization of systems, and creation of consistent data distribution policies."
Nor should we forget that the implementation of data governance policies falls under the umbrella of data management.
Data Governance Is Not Only the Responsibility of the IT Department
As mentioned above, all companies, especially the largest and international ones, move large amounts of personal data through numerous systems, software, applications, etc. Regardless of the number of processes the data goes through, it is crucial that companies identify the locations of all personal data and that control and accountability measures are part of the corporate DNA.
In the data-centric era we live in, all departments and employees must understand personal data protection laws. The security and quality of data no longer depend on the IT department alone, but are now the responsibility of all departments as a whole. In short, what is needed is a synergy between personal responsibility, a data-driven culture, the processes, systems and technologies employed, and business values. It is therefore essential for organizations to have a centralized data management framework that allows data governance policies to be implemented through a collaborative, team-based approach, so that data quality concerns the entire team.
Technologies Needed to Comply With Data Protection Regulations
Just as data collection, analysis and data science technologies are becoming increasingly popular, the technology world has also reacted to the data governance, data quality and data compliance needs of the business world.
One of the most basic and widely used resources is the data catalog, through which organizations can track the quality scores of their data, classify sensitive information, monitor the location of data, set access rights and enact usage restrictions.
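As a rough illustration of what a data catalog records, the following Python sketch models a catalog entry with a quality score, a sensitivity flag, a location and access rights. The class, field names, threshold and sample datasets are all hypothetical; commercial catalogs expose far richer models.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One dataset as a catalog might describe it (all fields are hypothetical)."""
    name: str
    location: str                  # where the data lives
    quality_score: float           # 0.0 (worst) to 1.0 (best)
    contains_personal_data: bool   # sensitivity classification
    allowed_roles: set[str] = field(default_factory=set)

    def can_access(self, role: str) -> bool:
        """Access right check: only listed roles may use the dataset."""
        return role in self.allowed_roles


def needs_review(catalog: list[CatalogEntry], min_score: float = 0.8) -> list[str]:
    """Names of datasets that hold personal data but score below the threshold."""
    return [
        entry.name
        for entry in catalog
        if entry.contains_personal_data and entry.quality_score < min_score
    ]


# Made-up sample catalog.
catalog = [
    CatalogEntry("crm_contacts", "warehouse.crm.contacts", 0.65, True, {"sales", "dpo"}),
    CatalogEntry("web_traffic", "lake/raw/web_traffic", 0.90, False, {"analytics"}),
    CatalogEntry("hr_payroll", "warehouse.hr.payroll", 0.95, True, {"hr", "dpo"}),
]
flagged = needs_review(catalog)
```

Here only crm_contacts is flagged: it contains personal data and its quality score falls below the assumed 0.8 threshold, so it is the first candidate for a data quality review.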
Beyond the basics, there are already technologies specially dedicated to data governance such as Azure Purview, Microsoft's new tool to help businesses govern and control their data assets.
On the other hand, machine learning technologies that can facilitate data compliance are gaining relevance as international regulations become more complex and data environments expand. Machine learning algorithms have the ability to identify personal data hidden in thousands of systems, source software and data processes.
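To give a sense of what identifying hidden personal data involves, here is a deliberately simplified Python sketch. Note that it uses rule-based pattern matching, not machine learning: it is a stand-in for the kind of detection that trained models perform at much larger scale and with far more nuance. The patterns, source names and sample texts are assumptions made for illustration.

```python
import re

# Simplified, illustrative patterns; real detectors need many more rules
# and usually a trained model on top of them.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def scan_text(text: str) -> dict[str, list[str]]:
    """Return every match per PII type found in a piece of text."""
    found = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


def scan_sources(sources: dict[str, str]) -> dict[str, dict[str, list[str]]]:
    """Scan several named sources and keep only those containing PII."""
    results = {}
    for name, text in sources.items():
        hits = scan_text(text)
        if hits:
            results[name] = hits
    return results


# Made-up sample sources.
sources = {
    "support_ticket_export.txt": "Customer wrote from maria@example.org, call back on +34 600 123 456.",
    "release_notes.txt": "Version 2.1 fixes the login timeout.",
}
findings = scan_sources(sources)
```

Rules like these miss names, addresses and anything written in free form, which is precisely the gap that machine learning approaches aim to close.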
Moreover, innovative technologies such as Master Data Management & Enterprise Information Integration (MDM/EII) provide a complete solution for solving all issues related to compliance, data quality and data governance.
MDM/EII is a system that enables the visualization of all company data in one place, regardless of the number of sources, systems and software involved, which solves the identification of personal data and the discovery of hidden or unknown data. In addition, MDM/EII is specifically designed to ensure compliance with all GDPR requirements, as well as guarantee data integrity and quality. With MDM/EII "data is accurate, complete, homogeneous, solid and coherent with the intention for which they are transferred."
In conclusion, it is clear that compliance with the GDPR and other personal data protection regulations involves taking responsibility for the quality of the material we work with: data. This should, in any case, be an intrinsic concern of companies in their mission to become data-driven and leverage the value of data for business intelligence strategies.