Data Integration - Informatica

Information Content Quality

Larry English

One of the root causes of poor-quality information is a defective data definition, specifically defects in the "information product specifications." Because information is a product of our business, manufacturing, and service processes, the analogy of an "information product" is real, and quality in the "information product specifications" is a critical requirement for Information Quality.

This blog is the second in a series of three on the critical quality characteristics (or measures) of information quality required in the TIQM Quality System.

  1. Information Product Specification Data Quality
  2. Information Content Quality
  3. Information Presentation Quality

Information Product Specification Quality Characteristics

  • Information standards
  • Data names
  • Data definitions
  • Attribute valid value set or range of values
  • Value format for structured attributes (VIN, SSN, Product Codes)
  • Business rule specifications of constraints on data
  • Information Steward accountable for data definition quality
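To make the list above more concrete, here is a minimal sketch of how a valid value set and a value format for one attribute might be recorded and checked. The `AttributeSpec` class, its field names, and the sample attributes are hypothetical illustrations and are not part of TIQM; the SSN pattern covers only the common textual layout.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttributeSpec:
    """Hypothetical specification record for a single attribute."""
    name: str
    definition: str
    steward: str                               # accountable Information Steward
    valid_values: Optional[frozenset] = None   # valid value set, if enumerated
    value_format: Optional[str] = None         # regex for structured attributes

    def conforms(self, value: str) -> bool:
        """Return True if the value satisfies the value set and the format."""
        if self.valid_values is not None and value not in self.valid_values:
            return False
        if self.value_format is not None and not re.fullmatch(self.value_format, value):
            return False
        return True


# Illustrative specifications (assumed names, definitions, and stewards).
SSN_SPEC = AttributeSpec(
    name="ssn",
    definition="Social Security Number of the customer",
    steward="Customer Data Steward",
    value_format=r"\d{3}-\d{2}-\d{4}",
)
STATUS_SPEC = AttributeSpec(
    name="order_status",
    definition="Current fulfilment state of the order",
    steward="Order Data Steward",
    valid_values=frozenset({"OPEN", "SHIPPED", "CLOSED"}),
)

print(SSN_SPEC.conforms("123-45-6789"))   # format matches
print(STATUS_SPEC.conforms("open"))       # wrong case, not in the value set
```

A check like this only tests a value against the specification; if the specification itself is defective, conforming values can still be poor-quality information.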

Information Content Quality Characteristics: The major information content (data values) quality characteristics include:

  • Definition conformance. Data values are consistent with the attribute (fact) definition
  • Completeness. Each process or decision has all the information it requires
    • Record completeness. A record exists for every real world object or event the enterprise needs to know about
    • Value completeness. A given data element (fact) has a value stored for all records that should have a value
  • Validity. Data values conform to the information product specifications
    • Value validity. A data value is a valid value or within a specified range of valid values for this data element
    • Business rule validity. Data values conform to the specified business rules
    • Derivation validity. A derived or calculated data value is produced correctly according to a specified calculation formula or set of derivation rules. If the base values are accurate and the calculation is performed correctly, then the result will be accurate
  • Accuracy. Data values are correct.
    • Accuracy to surrogate source. The data agrees with an original, corroborative source record of data, such as a notarized birth certificate, document, or unaltered electronic data received from a party outside the control of the organization that is demonstrated to be a reliable source
    • Accuracy to reality. The data correctly reflects the characteristics of a real-world object or event being described. Accuracy and precision represent the highest degree of inherent information quality possible
  • Precision. Data values are correct to the right level of detail, such as price to the penny or weight to the nearest tenth of a gram
  • Non-duplication. There is only one record in a database representing a given real-world object or event
  • Source quality warranties/certifications. The source of information: (1) guarantees the quality of information it provides with remedies for non-compliance; (2) documents its certification in its information quality management capabilities to capture, maintain, and deliver quality information; or (3) provides objective and verifiable measures of the quality of information it provides in agreed-upon quality characteristics
  • Equivalence of redundant or distributed data. Data in one database is semantically equivalent to data about the same objects or events in another database
  • Concurrency of redundant or distributed data. The information float or lag time is minimal between (a) when data is knowable (created or changed) in one database and (b) when it is also knowable in a redundant or distributed database, and concurrent queries to each database produce the same result
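Some of the characteristics above can be measured mechanically. The following is a rough sketch, not a TIQM-prescribed procedure, of value completeness, value validity, and non-duplication computed over a handful of records. The sample records, the `VALID_STATES` set, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "name": "Ann Lee", "state": "TX"},
    {"id": 2, "name": "Bob Ray", "state": None},
    {"id": 3, "name": "Ann Lee", "state": "TX"},   # same customer as id 1
    {"id": 4, "name": "Cy Dunn", "state": "ZZ"},   # not in the valid value set
]
VALID_STATES = {"TX", "CA", "NY"}  # assumed valid value set for this example


def value_completeness(rows, field):
    """Share of records that have a stored value for the field."""
    return sum(r[field] is not None for r in rows) / len(rows)


def value_validity(rows, field, valid_values):
    """Share of stored (non-missing) values that are in the valid value set."""
    stored = [r[field] for r in rows if r[field] is not None]
    return sum(v in valid_values for v in stored) / len(stored)


def duplicate_count(rows, key_fields):
    """Number of records beyond the first for each matching key."""
    keys = Counter(tuple(r[f] for f in key_fields) for r in rows)
    return sum(n - 1 for n in keys.values())


print(value_completeness(records, "state"))            # 3 of 4 records have a value
print(value_validity(records, "state", VALID_STATES))  # 2 of 3 stored values are valid
print(duplicate_count(records, ["name", "state"]))     # 1 extra record for Ann Lee
```

None of these checks measures accuracy: a valid state code can still be the wrong state for that customer. Accuracy has to be assessed against the real-world object or a reliable surrogate source, and real duplicate matching usually needs more than an exact comparison of key fields.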

For more about Information Content Quality, see Chapter 6, "Assessing
Information Quality," in Improving Data Warehouse and Information Quality.
This contains a more comprehensive list of quality characteristics with examples.
It also describes how to measure these quality characteristics. The next blog
will discuss information presentation quality characteristics required for the
finished Information Product presented to the knowledge workers.

What do you think? Share your experiences in measuring information content
quality, especially accuracy.
