Comprehensive judge bias review


5 hours ago, rockstaryuzu said:

Sheesh! The number of countries with all or most judges tested showing bias! 

 

It might be quicker to list the unbiased ones, but at this point I'm wondering if, over time, a larger data set would just show that all the judges are biased to some degree. 

 

Also, as a Canadian, I'm profoundly irritated by the Canadian results here. For a country that has suffered unfairly from biased judging in the past, and was a big contributor to the supposedly less biased IJS, to have 9 out of 23 judges fail the bias test is a disappointment. We are a country that prides itself on fair play, for f**K's sake! 

 

On the other hand, this evidence pretty clearly shows that almost all feds have bias, so it would be hard to resist the temptation to pad your own country's scores a little. 

 

Here's the thing though: just because a skater benefits from bias, doesn't necessarily mean that skater is a terrible skater who doesn't deserve international titles. It just means they should be judged by a neutral party. 

 

It would definitely be quicker to list the judges who, despite having a substantial judging record, look pretty unbiased (p > 0.3). But less satisfying. I don't think literally every single judge is biased though. 

 

Well at least you aren't Russia, that section is a wall of deep red. The Russian judges who weren't flagged (all two of them) almost certainly escaped because of low sample size. One of them is reallyyy borderline as well, like p=0.0504 or something. 

 

 


1 minute ago, shanshani said:

The Russian judges who weren't flagged (all two of them)

I know it's not really funny, but this still made me laugh. Russia's been like that since Soviet times. It's really not a surprise.


4 hours ago, Paskud said:

:worship::worship::worship::worship::worship::worship:

I couldn't agree more. This is a significant, amazing and potentially disruptive piece of work. Huge kudos are in order. 

 

As I mulled over your results, a few additional questions came to mind:

 

1. Would it be meaningful to calculate the percentage of judges from each fed that displayed bias? 

2. What about the average degree of bias each country's judges exhibited?

3. And what are the odds of a skater from country X having a judging panel that includes a judge from country X? 

 

Using those three factors, couldn't you calculate the degree to which, over the course of a season, skaters from each country are likely to benefit from biased judging? Forgive me if I'm off base -- my days of formal schooling are long behind me. But if it's possible to calculate the degree to which some skaters are advantaged and others are disadvantaged just by their nationalities, it surely seems relevant.

 

In any case, congratulations on your great work. I look forward to the upcoming segments!!!

 


 


Oops I forgot to upload the updated version of the database. The old one doesn't include Finlandia. Should be fixed in the next 5 minutes.

 


Thanks for the questions!

 

1. It's not entirely meaningless, but it's also subject to a lot of noise. There's clearly something to be said for the fact that 80%+ of the Russian judges are flagged, but overall the % of judges from a federation that are flagged is as much a function of how much data is available on those judges, specifically on how they score their own skaters, as it is of how biased they are. The more data there is, the easier it is to detect bias (with really small sample sizes, ZDifference has to be very large for the judge to get flagged). So it's relatively easy to catch Russian judges, who have many opportunities to score many skaters across many disciplines, but harder to catch small-fed judges, who have far fewer opportunities to score their own skaters. Russia in particular appears to have a relatively small stable of judges who each go to a lot of competitions, making it easier to catch them even in comparison to, say, US and Canadian judges, who tend to go to fewer competitions per judge. I do believe Rusfed is genuinely worse than most other federations, but the % of flagged judges exaggerates the difference for reasons that don't have much to do with the underlying bias (although I would say it's less that Russia's level of bias is overstated and more that other feds' level of bias is understated).

 

2. I could definitely take a look at that, although presently the sheet isn't built to make that easy. The easiest way to do it is just to average by judge (and maybe I'll add that to the sheet--shouldn't take too long), but it's probably better to aggregate the underlying data, which would take a bit of extra time to do. It's a good idea though, and I could add it as a supplemental at some point. It would also be helpful for looking at federations where the amount of data on each individual judge is small, but aggregating it produces a decent amount of data.

 

3. Hm, well there's no reason in principle this should be impossible, but I would have to think a while about how to do it in a way that doesn't take forever.
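
On point 2, here is a rough sketch of the difference between averaging by judge and aggregating the underlying data. The judge names and numbers are hypothetical, and each judge's record is reduced to a list of per-score deviations (how far a home-skater score sits above the rest of the panel). That layout is an assumption for illustration, not how the sheet is actually organized.

```python
from statistics import mean

# Hypothetical per-score deviations for home skaters, keyed by judge.
federation_scores = {
    "judge_a": [0.9, 1.1],                       # only two scores on record
    "judge_b": [0.1, 0.2, 0.0, 0.3, 0.1, 0.2],   # six scores on record
}

def average_by_judge(scores_by_judge: dict[str, list[float]]) -> float:
    """Mean of per-judge means: every judge counts equally, however little data they have."""
    return mean(mean(scores) for scores in scores_by_judge.values())

def pooled_average(scores_by_judge: dict[str, list[float]]) -> float:
    """Mean over all underlying scores: every individual score counts equally."""
    all_scores = [s for scores in scores_by_judge.values() for s in scores]
    return mean(all_scores)

print(round(average_by_judge(federation_scores), 3))
print(round(pooled_average(federation_scores), 3))
```

In this made-up case the by-judge average lets the judge with only two scores pull the federation figure up, while pooling weights each judge by how much data there is on them. That is the reason pooling suits federations where each individual judge has only a handful of scores.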

 

Calculating the degree to which a skater is likely to benefit from biased judging sounds like an interesting idea. It might be possible, but I would have to think quite hard about how to do it. One complication is that there might be other kinds of effects (e.g. are judges more biased in favor of the higher-ranking skaters from their country?), which would make it difficult to generalize. You would also ideally take into account the trimming procedure used to calculate the actual score, and how the score would interact with placements (but I think this would be too difficult to account for).
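
As a rough illustration of why the trimming matters: under IJS the highest and lowest marks on the panel are dropped and the rest are averaged, so an inflated mark from one judge is often the one that gets discarded. The sketch below uses a hypothetical nine-judge panel with made-up marks, and `trimmed_mean` is just an illustrative helper, not code from the database.

```python
def trimmed_mean(marks: list[float]) -> float:
    """Drop one highest and one lowest mark, then average the remainder."""
    if len(marks) < 3:
        raise ValueError("need at least three marks to trim both ends")
    kept = sorted(marks)[1:-1]
    return sum(kept) / len(kept)

# Hypothetical panel: eight neutral judges plus one home judge.
neutral = [8.25, 8.50, 8.50, 8.75, 8.75, 8.75, 9.00, 9.00]
fair_panel = neutral + [8.75]      # home judge scores in line with the panel
biased_panel = neutral + [9.50]    # home judge adds 0.75 to their own mark

print(round(trimmed_mean(fair_panel), 3))
print(round(trimmed_mean(biased_panel), 3))
```

The inflated mark itself is thrown out, but it still pushes one of the other high marks into the kept set, so in this example the effect of the bias is dampened rather than removed. Any estimate of how much a skater gains would have to go through this step instead of just adding the judge's raw bias to the score.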


Okay so one question: Can we see how often a judge is assigned to competitions by their federation, and see if there's a correlation between their p and the number of competitions, plus the importance of the competitions, they get sent to?

 

I'm thinking of that comment I saw on reddit ages ago that if judges don't toe the fed line they don't get more assignments. 


2 minutes ago, WinForPooh said:

Okay so one question: Can we see how often a judge is assigned to competitions by their federation, and see if there's a correlation between their p and the number of competitions, plus the importance of the competitions, they get sent to?

 

I'm thinking of that comment I saw on reddit ages ago that if judges don't toe the fed line they don't get more assignments. 

better to correlate with ZDifference, because p is necessarily correlated with the number of competitions, since it measures the strength of the evidence. more competitions = more evidence

 

and sure, I can take a look, at least for a few specific feds. 
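
To illustrate why p tracks the amount of data, here is a minimal sketch. It assumes ZDifference is the gap between the mean standardized score a judge gives home skaters and the mean they give everyone else, and it uses a plain one-sided two-sample z-test with unit variance. Both are assumptions made for illustration and not necessarily the test behind the database. The function names and sample counts are made up, and the 0.05 cutoff is inferred from the borderline p=0.0504 example mentioned earlier in the thread.

```python
from math import sqrt
from statistics import NormalDist

def p_value(z_difference: float, n_home: int, n_other: int, sd: float = 1.0) -> float:
    """One-sided p-value for a home-vs-other gap under a simple z-test (assumed model)."""
    standard_error = sd * sqrt(1 / n_home + 1 / n_other)
    return 1 - NormalDist().cdf(z_difference / standard_error)

def min_flagged_difference(n_home: int, n_other: int, alpha: float = 0.05, sd: float = 1.0) -> float:
    """Smallest gap that would reach p < alpha for the given sample sizes (assumed cutoff)."""
    standard_error = sd * sqrt(1 / n_home + 1 / n_other)
    return NormalDist().inv_cdf(1 - alpha) * standard_error

# Same hypothetical gap of 0.5, different numbers of home-skater scores.
for n_home in (5, 20, 80):
    print(n_home, round(p_value(0.5, n_home, 500), 4), round(min_flagged_difference(n_home, 500), 3))
```

With the gap held fixed, p falls as the number of home-skater scores rises, and the gap needed to get flagged shrinks. So a judge who gets sent to more competitions will tend to have a smaller p even if their ZDifference is no larger, which is why ZDifference is the cleaner thing to correlate with assignments.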


3 minutes ago, shanshani said:

better to correlate with ZDifference, because p is necessarily correlated with the number of competitions, since it measures the strength of the evidence. more competitions = more evidence

 

and sure, I can take a look, at least for a few specific feds. 

 

Great, this would extend the link from individual judges to at least an indication of fed-level encouragement of bias.


7 hours ago, yuzuangel said:

This is so cool! Do you want the PH account to tweet about it? Or if you have a tweet, we can retweet it. Just so you can get more interest in your very comprehensive analysis!

That would be awesome! Although let me do some edits first. I'm also going to put it up on wordpress/a similar platform so it might be easier to tweet that?

 

Also you should definitely tweet the wall of shame part at the ISU. lol


6 minutes ago, shanshani said:

Updated the OP to include some lovely graphs courtesy of @Veveco. @WinForPooh these graphs may be of special interest to you, as you seem to be interested in federation-level bias

Wow seeing it laid out like that in graphs - multiple forms - is really like getting slapped in the face.

 

