Predicting Delivery Throughput


We were asked to predict how likely our client was to meet its deliverable objectives for a specific customer. We pulled SKU-specific historical delivery data for the past 36 months. For each part in a given month, we could see how many deliveries were required and how many were actually delivered. Keep in mind that the monthly requirements change with the customer's needs, sometimes drastically, from month to month, and our client has to be responsive to those dynamic demands. From this data set, we started digging in.
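To make the rest of this concrete, here's a minimal sketch of what that data set looks like in code. The file name and column names (`sku`, `month`, `required`, `delivered`) are hypothetical stand-ins for the client's actual extract.

```python
import pandas as pd

# Hypothetical structure of the 36-month extract: one row per SKU per month.
# File name and column names are illustrative, not the client's real schema.
deliveries = pd.read_csv("deliveries.csv")  # columns: sku, month, required, delivered

# Monthly performance is simply deliveries made as a fraction of deliveries required.
deliveries["pct_of_goal"] = deliveries["delivered"] / deliveries["required"]

print(deliveries.head())
```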

The first thing to do was to check that delivery performance was normally distributed and consistent for a given part. If the data are not normally distributed, we have to treat them differently to make effective predictions. Likewise, if performance trends up or down over time, any forecast that assumes consistency will be wrong. The Quantile-Quantile (Q-Q) plots below all check out nicely: the data fall close to the line, and the p-values from a normality test are all greater than 0.05.
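As a sketch of that check, here's how the Q-Q plots and normality tests could be produced with scipy, using the hypothetical `deliveries` frame above. Shapiro-Wilk is shown as one reasonable choice of normality test; the original analysis may have used a different one.

```python
import matplotlib.pyplot as plt
from scipy import stats

# One Q-Q plot and one normality test per SKU.
for sku, grp in deliveries.groupby("sku"):
    perf = grp["pct_of_goal"]

    # Q-Q plot against the normal distribution: points hugging the line suggest normality.
    fig, ax = plt.subplots()
    stats.probplot(perf, dist="norm", plot=ax)
    ax.set_title(f"Q-Q plot, SKU {sku}")

    # Shapiro-Wilk test: p > 0.05 means we do not reject normality.
    stat, p_value = stats.shapiro(perf)
    print(f"SKU {sku}: W = {stat:.3f}, p = {p_value:.3f}")

plt.show()
```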

We also ran a quick Individuals Control Chart to check for stability. Here are the first 5 SKUs. For the most part, things look good: no consistent increases or decreases, and the data generally fall on both sides of the mean.

SKUs 4 & 5 have a couple of out of control points, where they performed significantly above the monthly plan…these points bear further investigation to see if there was a special cause that drove the unusual pattern.

Next, we plotted histograms of each part's monthly performance against the goal. The raw delivery counts are converted to percent of the monthly goal. The green vertical line in each plot represents 100% of the monthly goal. Here, we're trying to see how often a SKU meets its goals and how much spread there is around the objective. Here are the first 5 parts again.

Ideally, we’d like to see a tight distribution around 100%, indicating the client is capable of responsively meeting demands. Unfortunately, there’s quite a bit of spread (remember those outliers on SKUs 4 & 5?).
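A minimal version of those histograms, again using the hypothetical frame: each SKU's monthly performance is already a fraction of its goal, so we just histogram it and mark 100%.

```python
import matplotlib.pyplot as plt

# Histogram each SKU's monthly percent-of-goal with a reference line at 100% of goal.
for sku, grp in deliveries.groupby("sku"):
    fig, ax = plt.subplots()
    ax.hist(grp["pct_of_goal"], bins=12)
    ax.axvline(1.0, color="green", linewidth=2)  # 100% of the monthly goal
    ax.set_title(f"SKU {sku}: monthly performance vs. goal")
    ax.set_xlabel("Fraction of monthly goal delivered")
    ax.set_ylabel("Months")

plt.show()
```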

To make a very simple binomial estimate of forward-looking SKU performance, we counted the number of months above goal for each SKU. Months above goal divided by total months gives an idea of how likely the client is to achieve its goal. For the first SKU, 5 months above goal / 36 months = ~14%. See how much of the distribution is to the left of the 100% goal? Ugh. However, the distribution is pretty tight. Whatever is going on may be easier to solve than on SKU 2, where delivery performance is better on average (20/36 months achieving goal, or ~56%) but is all over the place.

That might be OK, depending on the particular situation…or it might be atrocious. Regardless, assuming ongoing stability, we could expect a similar probability of achieving a given goal.
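The binomial version of the estimate is just a count. A sketch against the hypothetical frame:

```python
# Fraction of months in which each SKU met or exceeded its monthly goal.
hit_rate = (
    deliveries.assign(met_goal=deliveries["pct_of_goal"] >= 1.0)
    .groupby("sku")["met_goal"]
    .mean()
)
print(hit_rate)  # e.g. 5/36 ≈ 0.14 for SKU 1, 20/36 ≈ 0.56 for SKU 2
```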

We can take those same distributions and move out of the binomial space and into continuous land by using the distribution itself. If you’re still with me (and I bet there aren’t many), this is the fun part!

Remember how we spent all that time verifying the data are normally distributed? Here's what that means: we can use the shape of the distribution to estimate where future values will land. Here's a view of the standard normal distribution that lets us do this:

Because each SKU's performance is expressed as a percent of its goal and these data are normally distributed, we can standardize them (subtract the mean, divide by the standard deviation) and use the standard normal to show how likely each part is to fall within a given range of its goal. Roughly 68% of the time, deliveries will land within 1 standard deviation of the average performance. That's useful for making predictions! We can estimate how likely each part is to perform at any given level based on the standard normal.
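If you want to see the numbers behind that 68% claim, here's the standard normal calculation with scipy; nothing in this snippet is specific to the client's data.

```python
from scipy.stats import norm

# Probability of landing within 1 standard deviation of the mean under a normal distribution.
within_one_sd = norm.cdf(1) - norm.cdf(-1)
print(f"P(-1 < Z < 1) = {within_one_sd:.4f}")  # ~0.6827, i.e. roughly 68%
```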

Look back up at the histogram for SKU 1. The body of the distribution is well below the green “100% of goal” line. If you were asked to bet your next paycheck on SKU 1 meeting/exceeding its goals next month, would you do it?

NO!

In simple binary terms, 5 months achieving goal out of 36 is only ~14%. That's useful.

But we know more than just the binary result. We can characterize the shape of the distribution. The mean for this SKU is ~88% of goal, and the spread is measured by a standard deviation of 0.07 (7% of goal). If we use the standard normal, we see that 100% of goal is about 1.7 standard deviations away (1 - 0.88 = 0.12; 0.12/0.07 ≈ 1.7).

Based on this SKU's distribution, the probability of achieving the goal is only about 7%. Or, there is a 93% chance of missing the goal. Here's what that looks like graphically, with the red shaded area representing 93% of the standard normal distribution.

For SKU 1, our best forecast is 88% of goal, with a standard deviation of 7% describing the spread of the distribution. In this way, we are able to predict throughput for each part.
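Putting the pieces together, the per-SKU forecast is just the fitted mean and standard deviation plus a tail probability for hitting 100% of goal. A sketch, once more against the hypothetical frame; the exact probabilities depend on each SKU's unrounded mean and standard deviation.

```python
from scipy.stats import norm

# For each SKU: estimate the mean and standard deviation of percent-of-goal, then use
# the normal distribution to estimate the chance of meeting or exceeding 100% of goal.
for sku, grp in deliveries.groupby("sku"):
    mean = grp["pct_of_goal"].mean()
    sd = grp["pct_of_goal"].std(ddof=1)
    z = (1.0 - mean) / sd      # how many standard deviations the goal sits above the mean
    p_meet = norm.sf(z)        # upper-tail probability of meeting/exceeding the goal
    print(f"SKU {sku}: mean = {mean:.0%}, sd = {sd:.0%}, z = {z:.2f}, "
          f"P(meet goal) = {p_meet:.0%}")
```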

Some fun questions we will try to answer in the future:

  • What if these SKUs are hierarchically related to each other?
  • Which SKU is the bottleneck (least capable)?
  • How would you allocate limited resources towards improving performance to maximize Return On Investment?