
Squeezing Apples and Oranges into the Performance Rating Bell Curve


Other than at work, when did you last encounter the bell curve?  For most of us, it was probably in school, where instructors used it to describe the distribution of test scores and to assign grades.  In that context, the bell curve measured student performance on a common test or course.  While it is possible to use the bell curve to evaluate students within a major, it is difficult to evaluate across majors.  For instance, it would be difficult to stack rank the combined populations of engineering majors and business majors.  (In my experience, business majors had much higher GPAs, often a half grade higher than engineering majors.)  You could adjust both populations for this half-grade difference, but it would still be unreasonable to compare individual students from each population, since there is no common standard.  You cannot conclude that one student is better than another, especially if they shared few courses.  After all, would you assume that a 4.0 history major would find equal success in physics, or vice versa?  Probably not.  So if we acknowledge that we cannot compare performance across majors, why does it make sense for organizations to compare employees performing different functions?  
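To make the point concrete, here is a minimal sketch (the GPA figures are invented for illustration, not real data) of why standardizing within each major does not license ranking across majors: a z-score only says where a student stands relative to peers who took roughly the same courses.

```python
import statistics

# Hypothetical GPA samples (illustrative numbers, not real data)
engineering = [3.1, 3.4, 2.9, 3.6, 3.2]
business = [3.6, 3.9, 3.4, 4.0, 3.7]

def z_scores(grades):
    """Standardize within one population: mean 0, stdev 1."""
    mu = statistics.mean(grades)
    sigma = statistics.stdev(grades)
    return [(g - mu) / sigma for g in grades]

# Within each major, a z-score ranks a student against peers
# graded on a common set of courses.
eng_z = z_scores(engineering)
bus_z = z_scores(business)

# The top engineering student (3.6) and the top business student (4.0)
# end up with similar z-scores -- but the scores come from different
# "tests," so similar z-scores do not imply comparable ability.
```

The adjustment removes the half-grade gap between the group averages, but it cannot manufacture the common standard that a comparison of individuals would require.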

Now, let’s take this a step further.  Can you compare one manager’s employees with another manager’s?  There may be instances when this can be done, but it is rarely wise: the two managers probably share too few employees and too little common experience with each one.  Moreover, one person’s tonic is another person’s poison.  Here I cite a study conducted by PDI Ninth House, a consulting firm, which reviewed the performance ratings of roughly six thousand employees who reported to two bosses.  In the study, the majority of employees earned inconsistent marks: of those rated “outstanding” by the first boss, 62 percent received a lower rating from the second boss.*  
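This kind of disagreement is exactly what you would expect if each boss observes the same underlying performance through independent noise.  The following is an illustrative simulation under assumed parameters (equal signal and noise variance, a top-decile cutoff for “outstanding”), not a reconstruction of the PDI Ninth House data:

```python
import random

random.seed(42)

N = 6000
# Each employee has a latent "true performance"; each boss observes it
# through independent noise (the variances here are assumptions).
true_perf = [random.gauss(0, 1) for _ in range(N)]
boss1 = [t + random.gauss(0, 1) for t in true_perf]
boss2 = [t + random.gauss(0, 1) for t in true_perf]

def top_decile(scores):
    """Indices of the employees in the top 10% of a score list."""
    cutoff = sorted(scores, reverse=True)[len(scores) // 10]
    return {i for i, s in enumerate(scores) if s > cutoff}

outstanding1 = top_decile(boss1)
outstanding2 = top_decile(boss2)

# Fraction of boss 1's "outstanding" employees whom boss 2 rates lower
dropped = len(outstanding1 - outstanding2) / len(outstanding1)
print(f"{dropped:.0%} of boss 1's top decile fall out under boss 2")
```

Even with both bosses rating honestly and observing the same people, a large share of one boss’s “outstanding” group falls out of the other’s, simply because each rating mixes signal with rater-specific noise.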

Without a common calibrating instrument, we cannot rely on performance ratings to measure an employee’s performance relative to all other employees.  There is no way to validate the rating, no matter how precise the final number.  From a statistical standpoint, the rating’s accuracy is suspect: the number does not tell you much and should not be treated as a valid measurement.  Unfortunately, the precision of the rating conveys a false sense of rigor and confidence in something that is ultimately faulty and unreliable.  Using such ratings for talent management and compensation decisions is a bad idea.  
 * Culbert, Samuel A., and Lawrence Rout.  Get Rid of the Performance Review!  New York, NY: Business Plus, 2010.  Print.  pp. 46–47.

In Performance Review, Performance Management, Forced ranking
By Dwight Ueda