Years ago I found a way to avoid getting bogged down in data, and to avoid ‘paralysis by analysis’ when digging into the numbers to find the cause of a problem.

It emerged over time as a product of possessing two seemingly unrelated skill sets.

Firstly, the skill associated with quantitative analysis. This simply means being able to use a variety of statistical tools and methods to study data, and understanding the technicalities of drawing sound conclusions about the relationships between variables.

Secondly, the ability to dig deeper to find answers through forming the right type of query and asking the right questions.

Bring these two together, and suddenly we have a framework for undertaking data analysis and doing it in a structured and effective way.

The Question

So let’s answer the question within the context of problem solving – ‘What is the best way to begin the analysis of data so that we don’t get bogged down and we solve the problem as efficiently as possible?’

This question is about data analysis, an activity in Six Sigma oriented project work that [in my observation] not a lot of people do well without a little guidance.

The Investigator Strategy

A business improvement practitioner will do a brilliant job if they approach this part of their work like a detective.

From experience, I know that a detective doesn’t jump into their analysis without some preparation.

They begin with their suspicions, then they investigate and interview to answer specific questions that will either negate or prove their suspicions. They would never interview a suspect without first knowing what it is they need to ask.

Effective data investigators do the same thing.

The following steps outline how I go about doing this work quickly, effectively, and without unnecessary stress or confusion.

STEP 1 – VERBALISE YOUR SUSPICIONS

The goal here is to bring all of your Xs and Ys into the mix.

So you begin by grabbing your data collection plan, identifying all of the variables you’ve included in the plan, and then listing everything you or the team suspect about those variables.

For example, let’s assume we are working on a problem with some of our coffees being too cold in a cafe we own.

Obviously the Y in this case is ‘temperature’.

At this point we’ve collected data that includes the variables listed below for each cup of coffee:

– Temperature of coffee (the Y)

– Who made the coffee (X1)

– When it was made (X2)

– Type of coffee ordered (X3)

– Cup size (X4)
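As a rough sketch, one observation in that data set could be represented like this in Python. The field names, the Celsius unit, and the sample values are assumptions made for illustration; they are not taken from an actual data collection plan.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CoffeeObservation:
    """One recorded cup of coffee (all field names are hypothetical)."""
    temperature_c: float   # Y: temperature of the coffee, assumed degrees Celsius
    barista: str           # X1: who made the coffee
    made_at: datetime      # X2: when it was made
    coffee_type: str       # X3: type of coffee ordered
    cup_size: str          # X4: cup size


# A single made-up observation, for illustration only.
example = CoffeeObservation(
    temperature_c=62.5,
    barista="Barista A",
    made_at=datetime(2024, 1, 15, 8, 30),
    coffee_type="latte",
    cup_size="small",
)
```

Recording one Y and all four Xs against every cup is what later allows each X to be compared against temperature.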

These input variables (the Xs) were included in the data collection because I know the process and I suspect something about how each of them affects coffee temperature (the Y).

That is the basis for now verbalising exactly what I suspect about each of them and how they relate to coffee temperature.

Those suspicions might look like this.

SUSPICION 1. I suspect that some of our baristas deliver more defective coffees, with respect to temperature, than others

SUSPICION 2. I suspect that most of our ‘too cold’ coffee defects are produced in the busy times in the morning

SUSPICION 3. I suspect that the coffees in smaller cups cool down more quickly than the larger ones

SUSPICION 4. I suspect that milk based coffee might not hold heat as long as the water based coffees

Remember, we included these input variables in the plan because we inherently knew or suspected something about their involvement in coffee temperature variation.

So by verbalising those ‘unconscious’ suspicions we now have the foundation for identifying the questions we need to answer in our analysis.

STEP 2 – TURN YOUR SUSPICIONS INTO QUESTIONS

Now we review our suspicions, and then think about each of them as a question. You’ll notice that the questions begin to bring parameters into play such as proportions and averages.

Barista (the X) and Coffee Temperature (the Y)

I suspected that some of our baristas deliver more defective coffees, with respect to temperature, than others, so my questions are ..

Question 1.1 – Do some baristas produce more defective coffees than others?

Question 1.2 – Does any particular barista make coffees that are on average a lower temperature than the others?

Time Period and Coffee Temperature

I suspected that most of our ‘too cold’ coffee defects are produced in the busy times in the morning, so my questions are ..

Question 2.1 – Do AM periods produce more defective coffees than other times of the day?

Question 2.2 – Does the AM period on average produce coffee that is a lower temperature than other times of the day?

Cup Size and Coffee Temperature

I suspected that the coffees in smaller cups cool down more quickly than the larger ones, so my questions are ..

Question 3.1 – Are the majority of our ‘defective’ coffees delivered in small cups?

Question 3.2 – Is the average temperature of coffee when delivered in smaller cups less than for larger cups?

Coffee Type and Coffee Temperature

I suspected that milk based coffee might not hold heat as long as the water based coffees, so my questions are ..

Question 4.1 – Are the majority of our ‘defective’ coffees milk based (i.e. latte, flat white etc)?

Question 4.2 – Is the average temperature of milk based coffees less than water based coffees?
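Each pair of questions above comes down to two parameters per group: a proportion of defective coffees and an average temperature. A minimal Python sketch of those two summaries follows. The 60 degree cut-off for ‘too cold’, the dictionary keys, and the sample rows are all assumptions made for illustration, since the article does not define them.

```python
from collections import defaultdict
from statistics import mean

# Assumed cut-off: the article does not say what counts as "too cold".
DEFECT_BELOW_C = 60.0


def summarise_by_group(rows, group_key):
    """Return {group: (count, defect_proportion, mean_temperature)}."""
    temps = defaultdict(list)
    for row in rows:
        temps[row[group_key]].append(row["temperature_c"])
    summary = {}
    for group, values in temps.items():
        defects = sum(1 for t in values if t < DEFECT_BELOW_C)
        summary[group] = (len(values), defects / len(values), mean(values))
    return summary


# Made-up observations, for illustration only.
rows = [
    {"barista": "A", "period": "AM", "temperature_c": 58.0},
    {"barista": "A", "period": "PM", "temperature_c": 66.0},
    {"barista": "B", "period": "AM", "temperature_c": 57.0},
    {"barista": "B", "period": "AM", "temperature_c": 59.0},
]

by_period = summarise_by_group(rows, "period")
```

Calling the same function with "barista" as the key gives the figures behind Questions 1.1 and 1.2. These are descriptive summaries only; whether a difference is statistically meaningful is a separate matter for the analysis tools.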

STEP 3 – PLAN HOW YOU WILL ANSWER EACH QUESTION SPECIFICALLY

At this point we think about how we would test the data so that we can get definitive answers to our specific questions.

Notice that I’ve indicated two things in my plan – (a) how I would display the data visually and (b) how I would analyse the variables statistically to obtain an answer.

Questions 1.1, 2.1, 3.1, 4.1

My plan is to compare defect rates for each individual factor (baristas, AM versus PM, small versus large cups, milk coffees versus water based coffees) – display in pie charts or a 100% column chart / analyse using a Chi-Square test
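To make the defect rate comparison concrete, here is a small Python sketch of the Pearson chi-square statistic for a table of counts. It is an illustration under stated assumptions rather than a replacement for a statistics package: the counts are made up, no continuity correction is applied, and the p-value helper is only valid for a 2 x 2 table (one degree of freedom).

```python
from math import erfc, sqrt


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    `table` is a list of rows, e.g. one row per time period, with columns
    [defective count, acceptable count].
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the factor had no effect.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail p-value of a chi-square statistic, valid only when dof == 1."""
    return erfc(sqrt(statistic / 2))


# Made-up counts: [too cold, acceptable] for AM and then PM.
counts = [[30, 70], [10, 90]]
statistic, dof = chi_square_statistic(counts)
p_value = p_value_one_dof(statistic)
```

With these made-up counts the statistic is 12.5 on 1 degree of freedom, well above the 3.841 critical value at the 5% level, so the AM and PM defect rates would be judged different. The test is usually considered reliable only when the expected count in every cell is reasonably large.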

Question 1.2

My plan is to compare average temperature values for individual baristas – display in stratified box plots / analyse using 1 Way ANOVA or ANOM for Means
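The One Way ANOVA comparison can be sketched in Python as the ratio of between-group to within-group variation. The function name and the temperatures below are made up for illustration, and the sketch stops at the F statistic; a p-value would come from an F table or a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up temperatures (degrees Celsius) for three baristas.
temps_by_barista = [
    [64.0, 66.0, 65.0],
    [60.0, 61.0, 59.0],
    [65.0, 67.0, 66.0],
]
f_stat, df_between, df_within = one_way_anova_f(temps_by_barista)
```

A large F value says the barista averages differ by more than the cup-to-cup variation within each barista would explain. It does not say which barista differs, which is where the stratified box plot helps.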

Questions 2.2, 3.2, 4.2

My plan is to compare average temperatures for each individual factor (AM versus PM, small versus large cups, milk versus water based coffees) – display in stratified box plots / analyse using 2 Sample t Tests or ANOM for Means
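For the two-group comparisons, a two-sample t test can be sketched in Python as below. This sketch assumes the unequal-variance (Welch) form of the test, which is a choice made here for illustration and not something the plan specifies. The sample temperatures are made up.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    # Squared standard error of each sample mean.
    var_a = variance(sample_a) / len(sample_a)
    var_b = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


# Made-up temperatures (degrees Celsius) for AM and PM coffees.
am_temps = [58.0, 60.0, 59.0, 61.0]
pm_temps = [64.0, 66.0, 65.0, 67.0]
t_stat, dof = welch_t(am_temps, pm_temps)
```

A negative t statistic here means the first sample (AM) has the lower average. The same function could be reused for small versus large cups and for milk versus water based coffees.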

STEP 4 – ANSWER YOUR QUESTIONS USING THE ANALYSIS TOOLS

Now you simply answer your Step 2 questions by doing what you planned in Step 3.

The golden rule is this:

Never begin any analysis without first knowing the question you are trying to answer!

That’s it: simple, effective, and extremely important if you are to avoid what most people experience – paralysis by analysis!

More Information

This article was written by George Lee Sye, author of PROCESS MASTERY WITH LEAN SIX SIGMA – the best Lean Six Sigma textbook in the world today.
