
Cure to Bad Data - Speech

  • jaquasianicole
  • May 22, 2023
  • 2 min read

Bad data is like a poison that plagues a dataset, and it is estimated to cost companies $3.1 trillion a year in the United States alone (Redman, 2016). But what is the cure? My name is Jaquasia Donald and I am an aspiring Data Analyst. Humankind is currently in the information age, in which the economy is centered on information technology. Technology is advancing rapidly, and information is a tool that aids decision making. Data can be leveraged to bring organizations success in a competitive market. However, data in a poor format cannot be extracted, analyzed, or interpreted, which turns it into a silo.


Establishing and maintaining high data quality is the key to understanding the target audience, improving operational capabilities, and increasing productivity. Data quality refers to how well a dataset meets the criteria for accuracy, completeness, validity, consistency, uniqueness, timeliness, and fitness for purpose (What is data quality?, n.d.). The first six are the six data quality dimensions, which are used to measure data quality levels, recognize data errors, and assess usability. High-quality data should be accurate, free of missing values, and reliable. It should also be consistently formatted, up to date, and relevant (Suer, 2021). Ensuring these standards are met allows organizations to improve predictions, increase profits, and reveal valuable insights.
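To make a few of these dimensions concrete, here is a minimal Python sketch that scores a small dataset on completeness, uniqueness, validity, and timeliness. The sample records, the field names (`id`, `email`, `updated`), the helper `looks_like_email`, and the 365-day freshness window are all assumptions made up for illustration; they do not come from the cited sources, and a real project would define its own rules for each dimension.

```python
from datetime import date

# Hypothetical sample records; the field names are assumptions for illustration.
RECORDS = [
    {"id": 1, "email": "ana@example.com", "updated": date(2023, 5, 1)},
    {"id": 2, "email": None, "updated": date(2023, 4, 15)},
    {"id": 2, "email": "bo@example.com", "updated": date(2021, 1, 10)},
    {"id": 3, "email": "not-an-email", "updated": date(2023, 5, 20)},
]


def completeness(records, field):
    """Share of records where `field` is present (not None or empty)."""
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def uniqueness(records, key):
    """Number of distinct `key` values divided by the number of records."""
    values = [r[key] for r in records]
    return len(set(values)) / len(values)


def validity(records, field, is_valid):
    """Share of non-missing `field` values that pass the `is_valid` check."""
    present = [r[field] for r in records if r.get(field) not in (None, "")]
    if not present:
        return 1.0  # nothing to validate
    return sum(1 for v in present if is_valid(v)) / len(present)


def timeliness(records, field, as_of, max_age_days):
    """Share of records whose `field` date is within `max_age_days` of `as_of`."""
    fresh = sum(1 for r in records if (as_of - r[field]).days <= max_age_days)
    return fresh / len(records)


def looks_like_email(value):
    """Deliberately simple format check; an assumption, not a full validator."""
    return "@" in value and "." in value.split("@")[-1]


scores = {
    "completeness(email)": completeness(RECORDS, "email"),
    "uniqueness(id)": uniqueness(RECORDS, "id"),
    "validity(email)": validity(RECORDS, "email", looks_like_email),
    "timeliness(updated)": timeliness(RECORDS, "updated", date(2023, 5, 22), 365),
}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Each function returns a value between 0 and 1, so the scores can be compared against whatever thresholds a team agrees on. Accuracy and consistency are left out because they usually need a trusted reference source or a second dataset to compare against.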


After specific metrics are set up, assessing data quality involves analyzing how the health of the dataset affects the business. Having protocols in place encourages every department, at every level of the business, to value data quality. Keeping data clean at the source follows the practice of upstream prevention and avoids downstream cleansing (Liliendahl, 2010). It is a group effort that can boost results and save time for all departments in a company. After establishing metrics and enterprise-wide buy-in, teams should implement data quality dashboards to monitor the health of data assets. These dashboards provide an overview of the Key Performance Indicators (KPIs) and show which metrics need improvement. Another effective way to ensure data quality is to schedule regular data quality audits. An audit examines the organization's data at a deeper level against the six data quality dimensions mentioned earlier. This gives the team a chance to fix mistakes and make improvements by filling gaps and removing duplicate records.
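As a rough illustration of what one audit pass might look like, the Python sketch below removes duplicate records and lists the gaps that still need to be filled. The `audit` function, the `CUSTOMERS` sample data, and its field names are hypothetical. The sketch assumes a record counts as a duplicate when its key has already been seen, and it only reports gaps rather than filling them, since the right value has to come from a trusted source.

```python
# Hypothetical customer records; the names and fields are made up for illustration.
CUSTOMERS = [
    {"id": 1, "name": "Ana", "city": "Fresno"},
    {"id": 2, "name": "Bo", "city": None},
    {"id": 1, "name": "Ana", "city": "Fresno"},
    {"id": 3, "name": "", "city": "Greenville"},
]


def audit(records, key, required_fields):
    """Split records into cleaned rows, dropped duplicates, and reported gaps.

    The first record seen for each `key` value is kept; later records with
    the same key are treated as duplicates. A gap is a required field that
    is None or an empty string on a kept record.
    """
    seen = set()
    cleaned, duplicates, gaps = [], [], []
    for row in records:
        if row[key] in seen:
            duplicates.append(row)
            continue
        seen.add(row[key])
        cleaned.append(row)
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            gaps.append((row[key], missing))
    return cleaned, duplicates, gaps


cleaned, duplicates, gaps = audit(CUSTOMERS, "id", ["name", "city"])
print(f"kept {len(cleaned)} records, dropped {len(duplicates)} duplicates")
for record_id, fields in gaps:
    print(f"record {record_id} is missing: {', '.join(fields)}")
```

Running a check like this on a schedule, and tracking the duplicate and gap counts over time, is one simple way to feed a data quality dashboard.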


Data cleaning is a crucial step in data analytics, and the health of the data directly affects the ability to reach relevant insights. Data cleaning is the first step in curing bad data. Thank you all for your attention!


References


Liliendahl, H. G. (2010, September 25). Top 5 reasons for downstream cleansing. Liliendahl on Data Quality. https://liliendahl.com/2010/09/25/top-5-reasons-for-downstream-cleansing/


Redman, T. C. (2016, September 22). Bad data costs the U.S. $3 trillion per year. Harvard Business Review. https://hbr.org/2016/09/bad-data-costs-the-u-s-3-trillion-per-year


Suer, M. (2021, August 5). What is data quality and why is it important? Alation. https://www.alation.com/blog/what-is-data-quality-why-is-it-important/


What is data quality? (n.d.). IBM. https://www.ibm.com/topics/data-quality


 
 
 


©2023 by Jaquasia Nicole Donald.
