Assessing the Validity and Relevance of Data

Kevin Hanegan, Chief Learning Officer, Qlik & Chair, Data Literacy Project Advisory Board

In a previous article, we discussed the lost art of questioning and its importance when working with data and information to find actionable insights. Here, we will expand on this topic and explain how questioning differs depending on where you are in the data analysis process.

Before we proceed, let’s revisit the difference between three terms: data, information and insight. Data is the raw, untouched facts as they are captured. Information is data that has been transformed in some way, for example, via aggregation or categorization. Typically, information is what is visualized and included in reports. An insight is derived from that information, usually through analytics. Insights should consider the context of the problem or question at hand and then draw conclusions, which lead to decisions and actions.

At each stage in this process, we must question both the validity and relevance of what is being shared. First, we need to critically appraise it to discern whether we can trust it to be a fact or insight. Then, we need to evaluate it to determine its relevance to the problem or question at hand. When we don’t do this – and when we don’t apply healthy skepticism and critical thinking – we can make poor decisions.

Stage one: assessing the data

During the initial stage, when you are looking at just raw data, there are often situations in which that data is incorrect. For example, in 1492, Christopher Columbus sailed from Europe across the Atlantic Ocean to find an alternative route to Asia. But Columbus relied on the erroneous calculations of several geographers from conflicting sources and eras to chart his route. In addition, Columbus did not convert Arabic miles used by one of the geographers to Roman miles, leading him to grossly underestimate the expanse separating the continents. This bad data led Columbus to land in the Americas and not Asia.
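
The effect of a unit mix-up like this can be illustrated with a small calculation. The sketch below assumes a Roman mile of roughly 1.48 km and an Arabic mile of roughly 1.9 km (historical estimates of the latter vary) and uses a hypothetical distance of 1,000 miles. None of these figures are taken from Columbus's actual sources.

```python
# Approximate unit lengths in kilometres. Both values are assumptions for
# illustration; estimates of the historical Arabic mile vary.
ROMAN_MILE_KM = 1.48
ARABIC_MILE_KM = 1.9


def to_km(distance, km_per_unit):
    """Convert a distance to kilometres given the length of one unit."""
    return distance * km_per_unit


def underestimate_fraction(distance):
    """Fraction by which a distance is underestimated when a figure
    recorded in Arabic miles is read as if it were in Roman miles."""
    true_km = to_km(distance, ARABIC_MILE_KM)
    misread_km = to_km(distance, ROMAN_MILE_KM)
    return (true_km - misread_km) / true_km


if __name__ == "__main__":
    reported = 1000  # hypothetical figure recorded in Arabic miles
    print(f"Intended distance: {to_km(reported, ARABIC_MILE_KM):.0f} km")
    print(f"Misread distance:  {to_km(reported, ROMAN_MILE_KM):.0f} km")
    print(f"Underestimate:     {underestimate_fraction(reported):.0%}")
```

Under these assumed values, the same number read in the wrong unit understates the distance by roughly a fifth, which is why checking units belongs in any first-pass review of raw data.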

Stage two: organizing data into information

Similarly, when organizing data into information, the information can also be incorrect. Perhaps the wrong transformation was used, or the wrong categorization was applied. Or the information may be accurate, but the definition of what it is meant to show is ambiguous or misleading.

For example, if the information shows an average monthly profit, does everyone interpret the term “profit” the same way? Is it gross profit, net profit or perhaps something else? When the person sharing inaccurate or misleading information does not know it is inaccurate or misleading, it is called misinformation. When the person shares it intentionally, knowing it is inaccurate or misleading, it is called disinformation. Both misinformation and disinformation can lead someone to draw incorrect conclusions.
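
To make the profit example concrete, the sketch below computes an "average monthly profit" from the same records under two definitions. The monthly figures, the field names and the simplified definition of net profit (which ignores taxes and interest) are all hypothetical, invented only for illustration.

```python
from statistics import mean

# Hypothetical monthly figures, invented purely for illustration.
monthly_records = [
    {"revenue": 120_000, "cost_of_goods": 70_000, "operating_expenses": 30_000},
    {"revenue": 100_000, "cost_of_goods": 60_000, "operating_expenses": 30_000},
    {"revenue": 140_000, "cost_of_goods": 80_000, "operating_expenses": 30_000},
]


def gross_profit(record):
    """Revenue minus the direct cost of the goods sold."""
    return record["revenue"] - record["cost_of_goods"]


def net_profit(record):
    """Gross profit minus operating expenses.

    Simplified assumption: taxes and interest are ignored.
    """
    return gross_profit(record) - record["operating_expenses"]


def average_monthly_profit(records, definition):
    """Average the chosen profit definition across all months."""
    return mean(definition(record) for record in records)


if __name__ == "__main__":
    print("Average gross profit:", average_monthly_profit(monthly_records, gross_profit))
    print("Average net profit:  ", average_monthly_profit(monthly_records, net_profit))
```

Both results are accurate for their own definition, yet a report that labels either one simply as "profit" leaves the reader to guess which was meant.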

If the data and information are validated as accurate and relevant to the problem/question you are trying to address, you can then move on to the next stage and try to come up with insights.

Stage three: driving insights

During this stage, it is equally important – and often more challenging – to apply healthy skepticism before believing the insights to be true and acting on them. There are many reasons why insights may be either inaccurate or irrelevant, but here are six of the most common ones:

Looking for trends where there may not be any
Looking for correlations where there are not any
Misunderstanding the results from an inferential statistic
Incorrect mental models
Looking at a symptom and not the root cause
Tunnel vision/lack of innovation
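
The first two items in this list can be illustrated with a short simulation. The sketch below generates unrelated random series and reports the strongest correlation any of them shows with a random target series. The function names, the number of series, the series length and the seed are all arbitrary choices made for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(n_series=200, n_points=10, seed=0):
    """Compare one random target series with many unrelated random
    series and return the correlation with the largest magnitude."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_series):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson(target, candidate)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    r = strongest_chance_correlation()
    print(f"Strongest correlation among unrelated series: {r:.2f}")
```

Because every series here is pure noise, any strong correlation the search turns up is a product of chance. Looking across many short series makes such apparent relationships easy to find, which is one reason to question a correlation before acting on it.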

Everyone is susceptible to misinformation, disinformation and false insights because we are all prone to cognitive self-bias, which causes us to think we have the right information and insights when, in fact, we do not. We all have emotions, and we are typically dealing with information overload combined with a lack of time and attention, which leaves us unable to complete everything needed. These pressures can lead us to avoid challenging our assumptions and mental models.

Just as scientists apply rigorous skepticism to their observations, those working with data should critically question and challenge all data, information and insights before treating them as truths.

If you would like to learn more about how to reach high-quality insights that drive impactful decisions, check out the resources via the Qlik Continuous Classroom.