When it Comes to Learning Measurement—What is Good Enough?

This is a question that needs to be asked much more often, especially when we isolate the impact of learning (Phillips's Level 4) or talk about the accuracy of an ROI calculation. Context is key. Business economists think about this question all the time because the one thing they know for sure is that their forecast (for GDP, sales, housing starts, commodity prices, exchange rates, etc.) will be wrong. Only by chance will a forecast be exactly right. So the question is not whether the forecast will be exactly right but whether it will be close enough to support the right decision: should we raise production, should we hire more workers, should we invest in A rather than B?

We need to apply this same type of thinking to learning. We need to start with the context and the reason for measuring. What decision are we trying to make, or what will we do with the estimate once we have it? Given the answers to these questions, how close does our estimate of impact or ROI need to be? I cannot think of a single instance where the estimate needs to be perfect. It just needs to be good enough to help us make the right decision or take the right course of action.

So, let’s step back for a minute and ask why we might estimate impact or ROI. First, I think we would all agree that we want to identify opportunities for improvement. If this is the context, how accurate does our estimate of impact need to be? In this case, the estimate just needs to be roughly right, or “in the ballpark.” For example, if the true (but unknown) ROI is 20%, we would like the estimate to fall in the 10%-30% range; anywhere in that range, we would conclude that the program has opportunity for improvement. Similarly, if the true ROI were 100%, we would want our estimate to fall in the 70%-130% range, and anywhere in that range we would likely conclude that no improvement is necessary.

Will the standard methods to isolate impact (control group, trendline analysis, regression, or the participant estimation methodology) be good enough for this purpose? I believe so. We simply need to know whether improvement is required, and we want to avoid making an improvement when none is needed or failing to make one when it is needed. In other words, if the ROI is truly 10%, we don’t want an estimate of 100%, and vice versa. The standard methods are all good enough for us to make the right decision in this context.

Now, suppose the reason to measure impact and ROI is not improvement but demonstrating the value or effectiveness of the program. At a minimum, we want to be sure we are not investing in learning that has no impact and a negative ROI. This is a bit more demanding, but the same logic applies. We want the error margin around the estimate to be small enough that we can use the estimate with confidence. For example, if the estimated ROI is 10%, we want to be confident that the error margin is not plus or minus 10 percentage points or more. If it is, we might conclude that a program had a positive ROI when in fact it was negative.
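To make that logic concrete, here is a minimal sketch in Python with hypothetical numbers: an ROI estimate answers the "is it positive?" question with confidence only when its error margin does not let the implied range cross zero.

```python
# Hypothetical sketch: can we trust the SIGN of an ROI estimate,
# given its error margin (both in percentage points)?

def roi_interval(estimate, margin):
    """Return the (low, high) range implied by a symmetric error margin."""
    return (estimate - margin, estimate + margin)

def sign_is_certain(estimate, margin):
    """True only if the whole interval lies on one side of zero."""
    low, high = roi_interval(estimate, margin)
    return low > 0 or high < 0

# A 10% ROI estimate with a +/-5 point margin is clearly positive...
print(sign_is_certain(10, 5))    # True
# ...but with a +/-15 point margin the true ROI could be negative.
print(sign_is_certain(10, 15))   # False
```

The point of the sketch is simply that the acceptable margin depends on the size of the estimate: a 100% ROI tolerates a much wider margin than a 10% ROI before the conclusion changes.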

In this context, then, we want to be more confident in our estimates than in the first scenario. Stated differently, we want smaller error margins. We will use the same four methods, but we need to be more thoughtful and careful in their use. We would have the most confidence in results obtained using a control group, as long as the conditions for a valid control group are met, so extra care needs to be taken to make sure the control group is similarly situated to the trained group. Trendline analysis and regression can also produce very reliable estimates of the “without training” scenario, provided the data are not too messy and the fit of the line or model is good. All three of these methods are generally considered objective and, when the conditions noted above are met, should produce good estimates of the impact of learning with a suitably narrow error margin.
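As an illustration of the trendline idea, here is a minimal sketch using hypothetical monthly data: fit a line to pre-training performance, project it forward as the “without training” baseline, and treat the gap between the actual result and the projection as the estimated impact.

```python
# Trendline analysis sketch (all numbers hypothetical): the pre-training
# trend, projected forward, serves as the "without training" scenario.

def fit_line(xs, ys):
    """Ordinary least-squares fit: return (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

pre_months = [1, 2, 3, 4, 5, 6]
pre_sales  = [100, 102, 101, 104, 103, 105]  # hypothetical pre-training results

slope, intercept = fit_line(pre_months, pre_sales)

projected_month9 = slope * 9 + intercept  # "without training" forecast
actual_month9 = 115                       # hypothetical observed result
impact = actual_month9 - projected_month9 # gap attributed to the program

print(round(impact, 2))
```

The reliability of this estimate rests entirely on how well the fitted line describes the pre-training data, which is why a good fit is a precondition for using the method.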

The participant estimation method is the most widely used because no special statistical expertise is required and because often there are no naturally occurring control groups. However, it does rely on the subjective estimates of the participants. Accordingly, we will want to be sure to have 30 or more respondents and, ideally, we will obtain their estimates of impact about 90 days after the training. It is also critical to adjust each estimate of impact by the respondent's confidence in that estimate. When this methodology is used as described by the Phillipses, it, too, should produce estimates reliable enough to be close to the actual but unknown impact and ROI.
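A minimal sketch of the confidence adjustment, with made-up dollar amounts and 0-1 confidence weights: each participant's self-reported impact is discounted by his or her confidence, and the adjusted total feeds the standard ROI formula (net benefits over costs).

```python
# Participant estimation sketch (hypothetical figures): discount each
# self-reported impact by the respondent's confidence before computing ROI.

def adjusted_benefit(impact_estimate, confidence):
    """Discount a self-reported impact by the respondent's confidence (0-1)."""
    return impact_estimate * confidence

def roi_percent(total_benefits, total_costs):
    """ROI = (net benefits / costs) x 100."""
    return (total_benefits - total_costs) / total_costs * 100

# Hypothetical responses: (estimated annual impact in dollars, confidence).
# In practice we would want 30 or more respondents, not four.
responses = [(5000, 0.8), (3000, 0.5), (10000, 0.7), (2000, 0.9)]

benefits = sum(adjusted_benefit(impact, conf) for impact, conf in responses)
program_cost = 9000  # hypothetical fully loaded program cost

print(round(benefits))                          # adjusted total benefits
print(round(roi_percent(benefits, program_cost), 1))
```

Discounting by confidence deliberately biases the benefit total downward, which is part of why results produced this way are defensible as conservative estimates.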

The common theme in both scenarios is good enough. At Caterpillar we conducted about three impact and ROI analyses per year using the participant estimation method. We used the results both to show the value of the programs and to identify opportunities for improvement. We always presented the results with humility and acknowledged that they were estimates based on standard industry methodology with adjustments for self-reported data. We had confidence that the estimates were good enough for our purposes, and we never received any pushback from senior leadership.

So, remember that your results do not need to be perfect, just good enough.
