To obtain print-quality JPEGs, contact the Office of Public Affairs at (815) 753-1681 or e-mail email@example.com.
Contact: Mark McGowan, NIU Office of Public Affairs
February 6, 2006
DeKalb — Reforms in the way figure skating is judged actually create more opportunities for error, says a Northern Illinois University professor who proposes the use of a Rasch model as a way to monitor judge performance.
The International Skating Union's new scoring process, which the world will experience in this month's Torino winter games, requires judges to rate each element (a jump, for example) of a skater's routine as it happens. They then score the entire performance on five program components (skating skills, linking footwork, execution, choreography and interpretation) rather than giving overall scores to an entire performance's technical and artistic merits.
Research has shown that the subjective realm of scoring can become more objective if the rating scale has minimal criteria to consider and is applied correctly, and consistently, by judges. Unfortunately, with the new system's two rating scales each having several criteria to consider, agreement among all judges will be difficult to achieve.
Meanwhile, the computer's random selection and arbitrary display of scores will keep secret the identities of the judges and remove them from public scrutiny and accountability. The public cannot tell whether a judge is showing favoritism to a countryman.
“It's disappointing,” says Marilyn Looney, a professor in the NIU College of Education's Department of Kinesiology and Physical Education.
“Cheating can occur no matter what scoring system is used. Randomly drawing nine scores from 12 judges, and dropping a high and low score, will not guarantee that biased scores are eliminated,” she adds. “The most important question that the ISU must address is: ‘Has the judge applied his or her interpretation of the scoring system consistently across all competitors and across each aspect of performance evaluated?’”
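Looney's point about the random draw can be checked with simple counting. The sketch below is illustrative only: it assumes a hypothetical panel in which two of 12 judges inflate their marks to 9.0 while the other 10 give 7.0, and it applies the procedure as described above (draw nine scores, drop one high and one low, average the rest). The function names and score values are invented for the example and are not ISU data.

```python
from itertools import combinations


def trimmed_panel_mean(scores):
    """Drop the single highest and single lowest score, then average the rest."""
    ordered = sorted(scores)
    kept = ordered[1:-1]
    return sum(kept) / len(kept)


def count_affected_panels(all_scores, honest_value, panel_size=9):
    """Count the possible draws whose result is pulled above the honest value."""
    panels = list(combinations(all_scores, panel_size))
    affected = sum(1 for p in panels if trimmed_panel_mean(p) > honest_value)
    return affected, len(panels)


# Hypothetical marks: two inflated scores of 9.0, ten honest scores of 7.0.
scores = [9.0, 9.0] + [7.0] * 10
affected, total = count_affected_panels(scores, honest_value=7.0)
print(affected, total)  # prints: 120 220
```

Under these assumptions, an inflated mark survives in 120 of the 220 possible draws, because dropping one high score removes only one of the two inflated marks whenever both judges are selected.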
Looney says the use of a many-facet Rasch model can help determine whether a judge has scored consistently.
By tracking judges' scores across a competition, the model can reveal when judges break from their own patterns, which calls their objectivity into question. Had the Rasch model been used for the pairs skating competition during the 2002 winter Olympics in Salt Lake City, it would have flagged an unusual scoring pattern by the French judge who later was suspended.
Developed in the 1960s by Danish statistician Georg Rasch, the Rasch model (as expanded upon in 1989 by Mike Linacre, currently an adjunct professor at the University of Sydney, Australia) defines the characteristics of a valid scoring system.
The model's parameter estimates are used to compute expected scores for each skater. Software developed to perform a Rasch analysis detects when a judge gives an unexpected score, such as a high score to a less-able skater or a low score to a more-able one.
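The idea of an expected score and an unexpected one can be sketched in a few lines. The example below is a simplified illustration, not the software Looney uses: it assumes a rating-scale form of the many-facet model with only two facets (skater ability and judge severity), and it assumes the parameters have already been estimated. All parameter values, judge and skater labels, and the cut-off of 2.0 for a standardized residual are hypothetical choices made for the example.

```python
import math


def category_probabilities(ability, severity, thresholds):
    """Probability of each rating 0..len(thresholds) under a rating-scale model."""
    # Each step up the scale adds (ability - severity - threshold) to the log-odds.
    log_num = [0.0]
    for tau in thresholds:
        log_num.append(log_num[-1] + (ability - severity - tau))
    top = max(log_num)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]


def standardized_residual(observed, ability, severity, thresholds):
    """(observed - expected) divided by the model's standard deviation."""
    probs = category_probabilities(ability, severity, thresholds)
    expected = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    return (observed - expected) / math.sqrt(variance)


def flag_unexpected(ratings, abilities, severities, thresholds, cutoff=2.0):
    """Return the (judge, skater, score, residual) entries beyond the cut-off."""
    flags = []
    for judge, skater, observed in ratings:
        z = standardized_residual(
            observed, abilities[skater], severities[judge], thresholds
        )
        if abs(z) >= cutoff:
            flags.append((judge, skater, observed, round(z, 2)))
    return flags


# Hypothetical estimates on the logit scale; ratings run from 0 to 4.
thresholds = [-1.5, -0.5, 0.5, 1.5]
abilities = {"A": 2.0, "B": -2.0}      # skater A is strong, skater B is weak
severities = {"J1": 0.0, "J2": 0.0}    # both judges equally severe
ratings = [("J1", "A", 4), ("J1", "B", 1), ("J2", "A", 3), ("J2", "B", 4)]

flags = flag_unexpected(ratings, abilities, severities, thresholds)
print(flags)
```

With these made-up values, only the top mark that judge J2 gives to the weaker skater B is flagged. A flag marks a score for review; it does not by itself show that the judge was biased.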
“It is now more difficult to apply a many-facet Rasch model to ISU judges' scores,” she says, citing the lack of information revealed about the judges. “Secrecy surrounding the judges' scores does not build the public's trust in how fairly the winners are selected.”
Fortunately, the governing body for U.S. skaters will continue to reveal scores of its own championships.
“U.S. Figure Skating should be commended for deviating from the ISU policy by linking the judges' identities to their scores,” Looney says. “This allows those who are interested to investigate better ways to monitor judge performance.”
One positive aspect of the new rules is the almost-immediate review of a judge's performance following a competition. “In the past, the ISU did not evaluate the judges' performances until the end of the skating season,” she says. “If I were a biased judge, I could have given biased marks throughout the season.”
Yet the new system also discards information that can help analyze judge performance.
ISU officials will use a “trimmed mean” to determine whether judges scored out of sync: They eliminate the bottom two scores and the top two scores, then calculate the average of the remaining eight. Judges whose number of deviations from the trimmed mean exceeds a predetermined cut-off are flagged, Looney says, “but this is based on the assumption that the high and low scores are biased.”
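The trimmed-mean check can be sketched as follows. This is an illustration under stated assumptions, not the ISU's actual procedure: the release does not say how large a gap counts as a deviation or what the cut-off is, so the `tolerance` and `max_deviations` values are hypothetical, as are the scores. Judge 0 is consistently strict, marking every skater 1.5 points below the rest of the panel, and judge 11 gives one inflated mark.

```python
def trimmed_mean(scores, trim=2):
    """Average what remains after dropping the `trim` lowest and highest scores."""
    ordered = sorted(scores)
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)


def flag_judges(score_table, tolerance=1.0, max_deviations=1):
    """Count each judge's deviations from the trimmed mean and flag heavy deviators.

    score_table holds one list per skater, with one score per judge in a fixed
    order. A deviation is a score more than `tolerance` away from that skater's
    trimmed mean (an assumed definition for this sketch).
    """
    n_judges = len(score_table[0])
    counts = [0] * n_judges
    for scores in score_table:
        center = trimmed_mean(scores)
        for judge, score in enumerate(scores):
            if abs(score - center) > tolerance:
                counts[judge] += 1
    flagged = [judge for judge, c in enumerate(counts) if c > max_deviations]
    return counts, flagged


# Hypothetical marks from 12 judges for three skaters.
score_table = [
    [5.5] + [7.0] * 11,          # judge 0 is 1.5 below the panel
    [4.5] + [6.0] * 10 + [8.5],  # judge 0 low again; judge 11 inflates once
    [6.5] + [8.0] * 11,          # judge 0 is 1.5 below the panel
]
counts, flagged = flag_judges(score_table)
print(counts, flagged)
```

In this made-up case the rule flags judge 0, who applied the same strict standard to every skater, while the single inflated mark from judge 11 stays under the cut-off.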
The use of a Rasch model would take all available data into account when scrutinizing judges, she says. “I may be a judge who gives low scores because I'm the tough judge,” she says, “but I'd better be a tough judge across all skaters.”
Looney, who has paid attention to figure skating since Nancy Kerrigan's loss to Oksana Baiul in 1994, says the new scoring is changing the sport: Because certain elements carry higher base values, observers expect skaters who have mastered those jumps and spins to pack their programs with them.
The new scoring also has Looney wondering what data have merit for scrutiny.
“I'm flagging anomalies in scoring, not judges. I need an ISU judge to sit down with me and provide feedback on the validity of what's being flagged,” she says. “This (new) approach shows promise in helping sport governing bodies improve judge training and monitor the performance of the judges for bias, but further research needs to be done to see if judges' scores consistently fit the many-facet Rasch model, and if international judges agree with the explanations for the flagged anomalies and their significance.”
# # #