Data Quality

Importance of Data Quality

If one believes information is the real wealth today, then data could well be regarded as the primary currency. Most of the critical decisions a firm makes today rest on understanding and insight derived from data. It is therefore nothing short of shocking that poor data quality can cost a company untold sums every year.

Data Quality is a primary factor in determining how successful a Human Capital Management (HCM) technology strategy will be. With business environments becoming increasingly complex, maintaining Data Quality is becoming more and more daunting. Data must now be regarded as a strategic asset of an organization – and ensuring Data Quality is key to reaping maximum benefit.

What is Data Quality?

It is a concept which has been defined in more than one way:

Data Quality refers to the degree of excellence the data exhibits in portraying the real-world situation it describes.

Data Quality also refers to the processes and technologies involved in ensuring that data values conform to business requirements and acceptance criteria.

There are more definitions, but the real Data Quality drivers depend on the requirements of the organization and the purposes for which the data will ultimately be used.

Dimensions of Data Quality

As with its definition, Data Quality has several dimensions, and experts differ on which set is correct. These are some of the important dimensions of Data Quality:

Integrity: The Data is preserved intact regardless of the operations performed on it or any distortions, accidental or malicious, inflicted on it

Accuracy: The degree of correctness of Data

Accessibility: To what extent Data is available or retrievable and how quickly it can be accessed

Validity: Refers to Data conforming to business rules

Appropriateness: Whether the volume of Data is sufficient for the task at hand

Credibility: To what extent the Data can be regarded as true

Brevity: The Data is present in a compact fashion, without compromising on correctness

Timeliness: Availability of Data when the need arises; Data that is not available when needed is useless

Completeness: All relevant details are present in the Data set; nothing has been sacrificed to save time or money

Consistency: The extent to which Data is presented in the same format

Flexibility: Refers to the ease of Data transitions when a process or the database supporting it is changed

Ease of manipulation: Refers to the ease with which Data can be manipulated and applied to different tasks

Objectivity: The extent to which the Data is free of human bias

Interpretability: The extent to which the Data uses appropriate symbols wherever required

Security: The extent to which access to the Data is restricted and the Data is safeguarded

Reputation: How highly the Data is regarded on the basis of its source(s)
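
Some of these dimensions can be expressed as simple ratios. The following is a minimal sketch in Python, not a standard method: the records, the field names, the email pattern used as a "business rule" for Validity, and the ISO date format used for Consistency are all hypothetical assumptions chosen for illustration.

```python
import re
from datetime import datetime

# Hypothetical HCM records; field names and values are illustrative only.
RECORDS = [
    {"name": "Ana Ruiz", "email": "ana@example.com", "hire_date": "2020-03-01"},
    {"name": "Ben Cho", "email": "ben.example.com", "hire_date": "01/04/2021"},
    {"name": "", "email": "cy@example.com", "hire_date": "2019-11-15"},
    {"name": "Di Wu", "email": None, "hire_date": "2022-07-30"},
]

# Assumed business rule: something@domain.tld with no whitespace.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_present(value):
    """A value counts as present if it is neither None nor an empty string."""
    return value not in (None, "")


def completeness(records, fields):
    """Completeness: share of field values that are present."""
    total = len(records) * len(fields)
    filled = sum(1 for r in records for f in fields if is_present(r.get(f)))
    return filled / total if total else 1.0


def share_passing(records, field, rule):
    """Share of present values in `field` for which `rule` returns True."""
    values = [r[field] for r in records if is_present(r.get(field))]
    return sum(1 for v in values if rule(v)) / len(values) if values else 1.0


def is_valid_email(value):
    """Validity check against the assumed email rule."""
    return bool(EMAIL_RE.match(value))


def is_iso_date(value):
    """Consistency check: is the date written as YYYY-MM-DD?"""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


completeness_score = completeness(RECORDS, ["name", "email", "hire_date"])
validity_score = share_passing(RECORDS, "email", is_valid_email)
consistency_score = share_passing(RECORDS, "hire_date", is_iso_date)
```

Which dimensions to measure, and what threshold counts as acceptable, depends on the requirements of the organization; dimensions such as Credibility or Reputation do not reduce to a ratio like this.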

Data Quality Management

Business organizations today hold data of all kinds, but merely relying on it because a CMS is processing it does not ensure Data Quality. Errors and discrepancies can creep in for multiple reasons. The new practice of Data Quality Management (DQM) is, however, catching on. DQM entails the establishment and deployment of roles, responsibilities, policies, and procedures concerning the acquisition, maintenance, dissemination, and disposition of data. Data Quality tools that help assure good-quality data are also increasingly available.

Most of these tools include processes such as:

  • Data profiling - to understand quality challenges
  • Data standardization - to ensure conformance to quality rules
  • Geo-coding - for name and address data
  • Matching or Linking - to weed out duplicate Data
  • Monitoring - to keep track of Data Quality over time
  • Batch and Real time - built-in processes that run regular maintenance after the initial Data cleansing
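
The first few processes in this list can be sketched in a few lines of Python. This is an illustrative sketch only, not how any particular Data Quality tool works: the `name` and `email` fields, the function names, and the sample records are hypothetical, and the matching step uses exact comparison on a standardized email address, whereas real tools typically apply fuzzier matching.

```python
def profile(records, field):
    """Data profiling: count rows, missing values and distinct values."""
    values = [r.get(field) for r in records]
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


def standardize(record):
    """Data standardization: collapse whitespace and normalize letter case."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }


def deduplicate(records, key="email"):
    """Matching: keep the first record seen for each exact key value."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


# Hypothetical raw input: the first two rows describe the same person.
raw = [
    {"name": "  ana   ruiz", "email": "Ana@Example.com "},
    {"name": "Ana Ruiz", "email": "ana@example.com"},
    {"name": "ben cho", "email": "ben@example.com"},
]

before = profile(raw, "email")
standardized = [standardize(r) for r in raw]
after = profile(standardized, "email")
clean = deduplicate(standardized)
```

Standardizing before matching is what lets the duplicate be detected here: the two raw email values differ in case and trailing whitespace, so an exact match on the raw data would miss it. Running the same profile on a schedule and comparing the results over time is one simple form of monitoring.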
