Topic: BiM Contest Judging

I'm a bit late to this because I was at a robotics demo until 1 AM. I'm a bit disappointed by the results of BRAWL 2014. I'm shocked that 40% of the score is based on interpretation of the theme. That seems outrageous to me given how broad the theme was this year. You might as well just roll the dice. I'm not sure if the judging criteria were released before the contest. If they were, I must have missed it and would love to be corrected. I think for future contests it should be standard to explain the judging criteria. That way the results seem less pulled out of thin air. This will also tell the participants what the contest is looking for in a film.

This is not to pick on BRAWL but rather contest judging on BiM as a whole. For the most part I am using BRAWL because it is the most recent. I know these contests are mainly for fun and making a film is its own reward. However, people still care about the rankings mainly for bragging rights. This is a contest that I didn't enter so I can finally talk about contest judging in general in an unbiased manner.

I feel like every BiM contest happens, the judging is released, and it always seems random to me. It's never "oh yeah, that was clearly the best film." I put the judging sheet into a spreadsheet and started playing around with it. What I discovered is that the four categories don't make much of a real difference. If a film had good presentation, it scored equally well on theme. On average each film had an 8.29 percent standard deviation between all the categories. This means the categories were rather meaningless.
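For anyone who wants to try the same kind of check, here is a rough Python sketch of the calculation. The category names, the maximum points per category (other than theme being 40 of 100) and the two films are all made up for illustration, and I'm assuming a population standard deviation, which may not be exactly what the spreadsheet uses.

```python
from statistics import mean, pstdev

# Hypothetical maximum points per category (only theme = 40 comes from the contest).
MAX_POINTS = {"theme": 40, "story": 20, "presentation": 20, "creativity": 20}

def category_spread(film_scores, max_points):
    """Standard deviation of one film's category scores, in percentage points."""
    percents = [100 * film_scores[c] / max_points[c] for c in max_points]
    return pstdev(percents)

def average_spread(all_films, max_points):
    """Mean of the per-film spreads across every film in the contest."""
    return mean(category_spread(s, max_points) for s in all_films.values())

# Made-up scores for two imaginary films.
films = {
    "Film X": {"theme": 32, "story": 15, "presentation": 17, "creativity": 16},
    "Film Y": {"theme": 20, "story": 10, "presentation": 10, "creativity": 10},
}

print(round(average_spread(films, MAX_POINTS), 2))
```

A small average spread means a film's categories all land at roughly the same percentage, which is the pattern described above.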

The spreadsheet is viewable here: https://docs.google.com/spreadsheets/d/ … vrX9EVRRJY

I understand people have different values on what makes a good film and that is why we have multiple judges. I would really like to see the breakdown by judges. I know in THAC X the BuilderBrothers' film was at the top of the list for two of the three judges but the third judge single-handedly brought it down to a ranking of 6. From a math standpoint most BiM contests give the win to the most mediocre film. If a film has a weak point or is disliked by one judge, it kills its chances of winning. I think this is why every time results are released there are always people unhappy.

So what can be done to make them better? As I pointed out before, the judging criteria should be outlined before the start of the contest. If you look at competitions that judge subjective things, most of the time there are at least 5 judges. The high and low scores for each event are removed. Let’s say for example the five judges give scores of: 90, 47, 85, 99, 79. The 47 and 99 are removed from the calculation. I think this would improve the results.
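As a quick sketch of that example in Python (the function name is just something made up for illustration):

```python
def trimmed_mean(scores):
    """Average the scores after dropping one highest and one lowest score."""
    if len(scores) < 3:
        raise ValueError("need at least three scores to drop the high and low")
    kept = sorted(scores)[1:-1]  # drop the single lowest and single highest
    return sum(kept) / len(kept)

scores = [90, 47, 85, 99, 79]
plain = sum(scores) / len(scores)  # 80.0, with every judge counted
trimmed = trimmed_mean(scores)     # about 84.67, from 79, 85 and 90 only

print(plain, round(trimmed, 2))
```

With all five scores the film averages 80; with the 47 and 99 removed it averages about 84.67, so one unusually harsh judge no longer drags it down as much.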

I really don't want this to be viewed as a diss to the people who created films, the judges or FunSucker. Instead I would like to see some discussion about contest judging and ranking.

Edit: It appears the scoring system was laid out from the beginning. I missed that and I'm glad that was the case for this contest.

Last edited by AquaMorph (August 1, 2014 (12:26pm))

Re: BiM Contest Judging

I absolutely agree that contest judging could use a major overhaul. I feel like long ago a certain way of ranking entries was established and hasn't really changed or been improved for a while; each contest just co-opts the previous contest's ranking system and changes out the nameplates without actually changing much.

This isn't a slight against contest organizers. Organizing any kind of contest or event is lots of work.

At minimum, I think more judges are required. I know finding good judges is difficult in this community (and we should NOT go looking for volunteers - those who seek power often shouldn't have it), but expanding a judging pool to a minimum of 5 would help, I think.

When contests are announced, the method of judging should be announced as well. That way feedback can be submitted if there are any concerns of one judging category outweighing another.

*mod hat on*

I want to re-emphasize this part that Aqua wrote:

"I really don't want this to be viewed as a diss to the people who created films, the judges or FunSucker. Instead I would like to see some discussion about contest judging and ranking."

Re: BiM Contest Judging

I agree with you, Christian!

I'm afraid I can't toss any worthwhile suggestions as to how we can improve judging, but I hope the community can find ways to improve the already established methods, and that we can all voice our opinions and suggestions in a civil manner.

And more than 3 judges sounds like a good idea. I'd definitely suggest future contests consider that idea. :)

https://farm9.staticflickr.com/8625/16037138950_5eeda635ce_o.png

Re: BiM Contest Judging

I know it might seem a bit sanctimonious to say this, given that I didn't do particularly well in this BRAWL, but I agree; I feel there should be some updating to the way contests are judged.

For a start, there should be more judges.  If one judge rates a film low while two others give it good ratings the film gets pulled down (as in BuilderBrothers' THAC X).  I'm not sure how removing extreme outliers would help, but I definitely feel there should be at least 5 judges, if possible.  That way, there's a better mean rating.

I think we should get more judges since the judges for these sorts of contests seem to always be the same people, i.e. BiM mods or officials.  To be clear, I'm not trying to attack BiM here at all; judging shouldn't be taken lightly and I have faith that these people were selected because they were considered to be the most mature, responsible and unbiased people available.  However, I do think it would be good to bring in new people who might bring more of a new balance to judging.  Of course, it would be difficult to find responsible people but it shouldn't be impossible.  Also maybe find some judges who are outsiders to this community, or only peripherally involved, as they would be less biased.

Maybe the judging criteria should be restructured as well.  40/100 for theme interpretation seems pretty excessive.  I'm guessing it might be to get people to adhere to the theme and deter cheating, but I found the allocation of points in this contest pretty random.  I think 10-20/100 should be enough.  Story should be given a greater proportion of the score, and I feel there should be separate categories for animation, cinematography and sound design--I found it odd that they were all lumped together in one category for BRAWL 2014.  Also, how is "creativity" defined?  It doesn't seem like a very objective criterion.

I might not be very qualified to offer such an opinion given that I've never organized/judged a contest, but these are just my personal thoughts and suggestions regarding this matter.

Retribution (3rd place in BRAWL 2015)

&Smeagol      make the most of being surrounded by single, educated women your own age on a regular basis in college
AquaMorph    I dunno women are expensive

Re: BiM Contest Judging

Having hosted the Christmas contest for two years now, one of the biggest things I learned for the judging stage was the need to view the work as a whole, and not just as the separate judging categories. It's Gestalt theory: the whole is greater than the sum of the parts. You have to break each entry down on a categorical level for scoring, but afterwards look at that score and see if it accurately reflects the entry as a complete piece.

Often I found that one entry might score higher on categorical basis, but another entry would just be better overall, so I'd have to adjust the scores accordingly.

Just some tips for those who might ever be judges or organizing a contest.

On a side note, I totally agree that theme interpretation should never be that much of a score.

Re: BiM Contest Judging

I agree that there is a problem with the judging and I suggest bringing back the Enjoyment category from the early BRAWLs. This was basically a category for what Spider is getting at; at the end of the day, what matters most is just how good of a watch it was. I believe that with this category, the results for this BRAWL would look vastly different and one of the entries in second or third would have won.

Re: BiM Contest Judging

I also support this idea.
It may sound a little self-serving since I could have placed a little better in this BRAWL, but even when I won BRAWL last year with Odoriferous, I felt a little weird about it because there were two entries I topped which I thought might beat me.  And, even if I were to top them, I was pretty sure that I would do so only slightly and they would follow directly after mine, and yet they were a few entries behind, which all felt weird.

If it can be managed I think more judges would be an excellent idea.  Also, as Flying Minifig mentioned, getting an outsider added to the judging panel, if possible, may be interesting as it could give some rather unbiased opinions.  I feel that, as it is, judging deviates somewhat from the community's general opinion of the films.  But more judges may help to bring the general opinion of the judges somewhat closer to it.
Of course, I doubt we can ever make everyone happy, but we might be able to make the system to some degree slightly better.

FlyingMinifig wrote:

Maybe the judging criteria should be restructured as well.  40/100 for theme interpretation seems pretty excessive.  I'm guessing it might be to get people to adhere to the theme and deter cheating, but I found the allocation of points in this contest pretty random.  I think 10-20/100 should be enough.  Story should be given a greater proportion of the score, and I feel there should be separate categories for animation, cinematography and sound design--I found it odd that they were all lumped together in one category for BRAWL 2014.  Also, how is "creativity" defined?  It doesn't seem like a very objective criterion.

I think one of the points Christian was trying to make in his spreadsheet thingy with his fancy math is that the criteria didn't actually matter that much, as regardless of having several different points of criteria, they all tended to be rated relatively similar percentages.
In THAC X the judges simply picked their favourite films, and I think that may in general do better, but especially so with more judges, and possibly even more so with Aq's other idea of cutting off the high and low ratings.  Of course, I'm not sure how that would turn out, but it sounds like an interesting idea.  A good film is good because it's really good, and choosing it as good from a few categories is a bit weird.

Though, theme is an especially weird category to have lead, as I've always thought of the theme as more of a way to prove that the film was made within the time constraints, as well as a sort of running gag throughout our films so we could obviously theme-drop.  I've never watched a BRAWL or THAC film and thought I liked it because the way the theme was interpreted was especially good; I've always liked it because it was just a good film.

I've judged several brickfilming contests before and been given some random criteria.  Sometimes it does seem like a sort of roll of the dice, since you're not quite sure exactly how much a film should get in a certain category, so I tended to just lean toward more points or fewer points depending upon how much I liked the film.
I also feel that judging the film as a whole takes somewhat less time than thinking about specific categories.

Re: BiM Contest Judging

Disclaimer: BRAWL is not a BiM contest in the official sense. So maybe the judging on BRAWL is not good, I don't really follow it closely enough to say, and I will make no attempt to defend it. The fact that I can't supervise contests closely in every instance is part of why BiM only has a few official contests that I put my stamp of approval on; in the past few years it has only been THAC.

I think THAC judging is generally pretty good, though I would say that after 10th place or so it becomes very difficult to judge degrees of goodness or badness because everything is on a mediocre-to-bad spectrum at that point.

That said, for the non-THAC contests I have overseen, such as the summer contests like TOY and Avant-Garde, a more thorough judging process was employed, and I think that it achieved the best results. The problem with the systematic approach used in THAC and (as far as I know) BRAWL is what Repelling Spider has touched on.

With the summer contests, I created a private forum where judges discussed the rankings of the winning entries until they had settled on an overall ranking they felt was accurate. We then (yes) fudged with the points to make the rankings fit the way we wanted. This was more about making the point averages something we could all agree on and reflect the results we wanted than, say, tampering to reflect our biases or something like that. This method isn't practical for THAC because THAC gets too many entries and the overall quality is not quite as high because of the 24-hour nature of the contest. Inevitably there are some compromises to make judging a contest like this efficient.

http://i.imgur.com/wcmcdmf.png

Re: BiM Contest Judging

I'm not complaining about BRAWL. I have been annoyed at the way judging has been treated for a while. This year's BRAWL was excellently run from what I saw. Questions were answered in a timely manner and the judging was prompt. This thread was generally designed to get discussion going for future contests while people still cared about contest stuff. I have a few ideas for improving judging.

First, I think giving a score and ranking to every film is a huge waste of time. My suggestion is to do the judging in stages. With a few judges, select the pool of films that will be judged. I would not set a hard-coded value or percentage because the number of good entries will vary by contest. After that, go through all the selected entries and rank them. Then announce a top 10 or top 5 or whatever is appropriate for the contest. That way being ranked would be more of an honor, and it would also eliminate a lot of the complaints about films being under-ranked.

Now as far as ranking systems I have two ideas.

The first, in political science terms, is called instant runoff voting. Each of the judges ranks the films from first to last. These rankings are tallied together. I think the easiest way to do this is to say a first place vote is one point, a second place vote is two, and so on. You tally up the points and eliminate the film with the most points. That film gets the last place ranking. From there the ballots are adjusted by removing the film that was just ranked. If the film was not the lowest-ranked film on a ballot, the films ranked below it move up. This process is repeated until all the films are ranked and you have a winner. This entire process is easily automated and only takes each judge filling out one ballot. Statistically it leads to the judges being happier with the results.
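To show how easily this can be automated, here is a minimal Python sketch of the elimination process exactly as described above. The function name, the three-film ballots and the alphabetical tie-break are assumptions for illustration only. One caveat: because it eliminates by point totals rather than by fewest first-place votes, this variant is closer to what voting literature calls a Borda-style elimination count than to textbook instant runoff.

```python
def elimination_ranking(ballots):
    """ballots: one list per judge, films ordered from best to worst.
    Returns the films ordered from winner down to last place."""
    remaining = set(ballots[0])
    finishing_order = []  # filled in from last place upward
    while remaining:
        points = {film: 0 for film in remaining}
        for ballot in ballots:
            # Dropping eliminated films makes the lower-ranked films move up.
            active = [film for film in ballot if film in remaining]
            for place, film in enumerate(active, start=1):
                points[film] += place  # 1 point for first, 2 for second, ...
        # Most points is worst; ties go to the alphabetically first film (assumption).
        worst = max(sorted(remaining), key=lambda film: points[film])
        finishing_order.append(worst)
        remaining.remove(worst)
    return finishing_order[::-1]

# Made-up ballots from three judges for three imaginary films.
judges = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
print(elimination_ranking(judges))
```

In this toy example film C collects the most points in the first round and takes last place, then B is eliminated, leaving A as the winner.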

The second is currently only a theory but it is based on the Elo ranking algorithm. This is used in a lot of things, for example chess and most video games with ranking systems. How it works is all films would start out at the same level, for example 1500. The judges would then compare one film against another and select the one they like better. From there the algorithm gives points to the winner and subtracts them from the loser. The number of points is determined by the expected outcome of the face-off. So for example, if an 1800 film and a 1200 film face each other and the 1800 film wins, it will not gain that many points because it was heavily favored. However, if the 1200 film wins it will gain many more points. After a few rounds a clear ranking is developed.
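Here is a small Python sketch of a single Elo update, to make the 1800 vs. 1200 example concrete. The K-factor of 32 and the 400-point scale are common chess conventions that I'm assuming here; a contest could pick different values.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update(winner, loser, k=32):
    """Return the new (winner, loser) ratings after one face-off."""
    gain = k * (1 - expected_score(winner, loser))
    return winner + gain, loser - gain

# The favorite wins: the 1800 film gains only about 1 point.
fav_new, underdog_new = update(1800, 1200)
favorite_gain = fav_new - 1800

# The upset: the 1200 film gains about 31 points.
upset_new, fallen_new = update(1200, 1800)
upset_gain = upset_new - 1200

print(round(favorite_gain, 2), round(upset_gain, 2))
```

The points the winner gains are exactly the points the loser gives up, so the total across all films stays constant as judges work through more comparisons.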

Re: BiM Contest Judging

I have some ideas.

1) Get 6 judges for popular contests. For each category for each film, the highest and lowest scoring judges get their scores removed (like they do in scored events at the Olympics), and an average is calculated.
2) Have clear categories. Theme, story, presentation, and sound design.
          a: To me, story makes or breaks any film. This should be 40% of the total score.
          b: For timed contests, theme is important, otherwise someone could write their story and perfect it long before the contest even starts. In this case, the theme interpretation should probably have either a large bearing, or contestants get penalised to an extent on story if it doesn't fit the theme. I think that 30% for theme is important for timed contests. For every other contest, it's not as important so it could probably be 20%.
          c: Presentation and sound design are both important, but technical, parts of the animation process. Some are good at one, but not the other. Some both, and some neither. This should be a point where the best can get extra points, but the worst don't lose so much. Technically a better story should beat a better presentation any day. For timed competitions, I think these should each be worth 15%, and for untimed competitions, 20%.
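To make the proposal concrete, here is a rough Python sketch combining both ideas: drop the highest and lowest judge per category, average the rest, then apply the weights above. The function names, the 0-100 score scale and the example scores are made up for illustration.

```python
# Weights from the proposal above, for timed and untimed contests.
WEIGHTS = {
    "timed":   {"story": 0.40, "theme": 0.30, "presentation": 0.15, "sound": 0.15},
    "untimed": {"story": 0.40, "theme": 0.20, "presentation": 0.20, "sound": 0.20},
}

def category_average(judge_scores):
    """Average one category after removing the highest and lowest judge."""
    kept = sorted(judge_scores)[1:-1]
    return sum(kept) / len(kept)

def film_score(scores_by_category, contest_type):
    """Weighted total for one film; each category is a list of judge scores."""
    weights = WEIGHTS[contest_type]
    return sum(
        weights[category] * category_average(scores)
        for category, scores in scores_by_category.items()
    )

# Made-up scores (0-100) from six judges for one imaginary film.
example = {
    "story":        [80, 70, 90, 60, 85, 75],
    "theme":        [50, 60, 55, 65, 70, 40],
    "presentation": [90, 95, 85, 80, 100, 70],
    "sound":        [60, 60, 60, 60, 60, 60],
}

print(film_score(example, "timed"), film_score(example, "untimed"))
```

The same film scores slightly differently depending on whether the contest is timed, because a weak theme score costs more when theme is worth 30% instead of 20%.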

http://www.cxpulp.com/attachment.php?attachmentid=874&d=1279678499
The new KB Videos coming soon.

Re: BiM Contest Judging

Just throwing this out there:
I have nothing against Nathan Wells, he is an incredible brickfilmer, but I don't see why Blinders won BRAWL this year. When I looked at the spreadsheet, I was shocked to see that Blinders had the highest score in theme interpretation. I've watched it over and over, but I can only find tiny references to development. Galan505's film, The Website, was obviously about development, it couldn't have been more blatant, and yet it got a 27 and Blinders got a 35? I feel like the judging was incredibly biased in theme interpretation this year. The score of theme interpretation was directly proportional to presentation. BRAWLING DURING BRAWL, while it was a joke film, got an 11.7 even though it was obviously about developing a cure for all diseases. THIS IS MESSED UP!!

There. Now look what you made me do. You made me rant.

Take it or leave it, I feel like my post is relevant to this thread.

The guy who got banned more times than DiCaprio said "f***" in The Wolf Of Wall Street.

Re: BiM Contest Judging

I think we need to be careful not to diss individual judges who are volunteering their time to make the community a better community.  We should do as the OP suggests and keep the conversation to the general topic of competition judging.

Aka Fox
Youtube: My channel   Twitter: @animationantics
Best brick films: My selection

Re: BiM Contest Judging

A lot of good ideas are being tossed around in this thread! It's nice to see a discussion on the topic of judging. There will certainly be changes to the judging system next year. I haven't made a decision on what system to use yet (I don't need to for quite some time), so please feel free to keep suggesting judging systems/techniques you know of or thought of.

brickelodeon wrote:

Just throwing this out there:
I have nothing against Nathan Wells, he is an incredible brickfilmer, but I don't see why Blinders won BRAWL this year. When I looked at the spreadsheet, I was shocked to see that Blinders had the highest score in theme interpretation. I've watched it over and over, but I can only find tiny references to development. Galan505's film, The Website, was obviously about development, it couldn't have been more blatant, and yet it got a 27 and Blinders got a 35? I feel like the judging was incredibly biased in theme interpretation this year. The score of theme interpretation was directly proportional to presentation. BRAWLING DURING BRAWL, while it was a joke film, got an 11.7 even though it was obviously about developing a cure for all diseases. THIS IS MESSED UP!!

There. Now look what you made me do. You made me rant.

Take it or leave it, I feel like my post is relevant to this thread.

The theme interpretation isn't rated on how obvious/apparent the theme was in a film, rather its interpretation, how clever/unique it was. BRAWLING DURING BRAWL was not a serious film, it was a joke, intended to get last place, so don't take the rating on that seriously.

Former BRAWL host, 2013-2017
             Youtube   Twitter

Re: BiM Contest Judging

The theme interpretation isn't rated on how obvious/apparent the theme was in a film, rather its interpretation, how clever/unique it was. BRAWLING DURING BRAWL was not a serious film, it was a joke, intended to get last place, so don't take the rating on that seriously.

This makes a bit more sense. Thanks for the clarification. Although I still hold that the score of theme interpretation was directly proportional to presentation.

The guy who got banned more times than DiCaprio said "f***" in The Wolf Of Wall Street.

Re: BiM Contest Judging

AquaMorph wrote:

The first, in political science terms, is called instant runoff voting. Each of the judges ranks the films from first to last. These rankings are tallied together. I think the easiest way to do this is to say a first place vote is one point, a second place vote is two, and so on. You tally up the points and eliminate the film with the most points. That film gets the last place ranking. From there the ballots are adjusted by removing the film that was just ranked. If the film was not the lowest-ranked film on a ballot, the films ranked below it move up. This process is repeated until all the films are ranked and you have a winner. This entire process is easily automated and only takes each judge filling out one ballot. Statistically it leads to the judges being happier with the results.

This seems to be the most familiar and straight-forward solution. I think it would be good to explore it further for future contests.

AquaMorph wrote:

The second is currently only a theory but it is based on the Elo ranking algorithm. This is used in a lot of things, for example chess and most video games with ranking systems. How it works is all films would start out at the same level, for example 1500. The judges would then compare one film against another and select the one they like better. From there the algorithm gives points to the winner and subtracts them from the loser. The number of points is determined by the expected outcome of the face-off. So for example, if an 1800 film and a 1200 film face each other and the 1800 film wins, it will not gain that many points because it was heavily favored. However, if the 1200 film wins it will gain many more points. After a few rounds a clear ranking is developed.

This is also an intriguing solution. It might be harder to convince the general public about it, since it seems a little more complex and uncommon (even though, as you point out, it is used in many ranking systems). I think if this method is used, it needs to be explained clearly so people aren't turned off by its unfamiliarity.

Re: BiM Contest Judging

AquaMorph wrote:

Each of the judges ranks the films from first to last. These rankings are tallied together.

This is how we did the past two THACs. First place got 78 points, last place got 1, whoever got the most when all the scores were added up won.
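For anyone curious, that kind of points tally is simple to sketch in Python. I'm assuming each judge ranks every film and that first place is worth as many points as there are films on the ballot (which would match 78 points for first if there were 78 entries); the function name, the ballots and the alphabetical tie-break are made up for illustration.

```python
def points_tally(ballots):
    """ballots: one list per judge, films ordered from best to worst.
    First place earns len(ballot) points, last place earns 1."""
    n = len(ballots[0])
    totals = {}
    for ballot in ballots:
        for place, film in enumerate(ballot):
            totals[film] = totals.get(film, 0) + (n - place)
    # Highest total wins; ties are broken alphabetically (assumption).
    ranking = sorted(totals, key=lambda film: (-totals[film], film))
    return ranking, totals

# Made-up ballots from three judges for three imaginary films.
judges = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
ranking, totals = points_tally(judges)
print(ranking, totals)
```

Unlike the elimination approach quoted above, this adds the points once and never re-ranks the ballots, so a single very low placement from one judge still counts in full.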

https://i.imgur.com/1JxY79v.png

Re: BiM Contest Judging

brickelodeon wrote:

The theme interpretation isn't rated on how obvious/apparent the theme was in a film, rather its interpretation, how clever/unique it was. BRAWLING DURING BRAWL was not a serious film, it was a joke, intended to get last place, so don't take the rating on that seriously.

This makes a bit more sense. Thanks for the clarification. Although I still hold that the score of theme interpretation was directly proportional to presentation.

I feel that if I had actually put effort into BRAWLING DURING BRAWL it would likely have gotten a much higher score in nearly every respect even if I left the story and theme basically untouched.  I could have lengthened it out and had it play itself more seriously with well-built sets, dramatic lighting, better minifigures, proper voice acting, nice music, and perhaps a few more jokes.  Maybe even a little bit more background showing how the doctor got his ambition, showing his wife and parents dying of disease.
Now, that would all likely be a much better film.  The theme and story would be basically as they were, yet I would guess it would get a higher score on all accounts, even theme and story.  They wouldn't be much different, yet they would be presented much better and seem much nicer (or maybe just not horrible).  And the film might've placed decently.

But the same would likely go the other way.  Imagine now We the Pumpkins Three, except instead it's called 3 Pumpkins.  The Pumpkins are just blank orange heads on unfitting red shirts, whilst just one of them has the correct head but it's turned the wrong way.  The elf looks kinda weird and it's just a normal yellow head turned round.  The set is just a few things of old grey.  The script doesn't rhyme, it's just me yelling into a mic making an intro and the elf's like, "yo can I have a face?" and the pumpkins would be like "yeah sure girl, we just need you to get us some stuff to develop it."  "K." and it says in comic sans "LATER" and then she comes back, she gets her face, she cries, and the pumpkins explain the moral less than eloquently without properly rhyming and then it ends.
All of it animated and shot just like BRAWLING DURING BRAWL.

Now this would all likely look disgusting, however, it would have exactly the theme interpretation and story as the actual film, though I have little doubt it would easily place last or very nearly there.

As Aquamorph pointed out, the criteria for the judging are a little meaningless.
I sometimes feel a little like every good brickfilm is just a bad brickfilm in disguise, that or every bad brickfilm is just a very good one in disguise.  A bad story can be made to look good, a good story can be made to look bad.  The whitest cleanest LEGO brick turns black in the dark, a black LEGO brick will shine brightly if you position it to reflect the light the right way.

Re: BiM Contest Judging

Squid wrote:

Imagine now We the Pumpkins Three, except instead it's called 3 Pumpkins.  The Pumpkins are just blank orange heads on unfitting red shirts, whilst just one of them has the correct head but it's turned the wrong way.  The elf looks kinda weird and it's just a normal yellow head turned round.  The set is just a few things of old grey.  The script doesn't rhyme, it's just me yelling into a mic making an intro and the elf's like, "yo can I have a face?" and the pumpkins would be like "yeah sure girl, we just need you to get us some stuff to develop it."  "K." and it says in comic sans "LATER" and then she comes back, she gets her face, she cries, and the pumpkins explain the moral less than eloquently without properly rhyming and then it ends.
All of it animated and shot just like BRAWLING DURING BRAWL.

Now this would all likely look disgusting

Are you kidding? This would be a thrilling finale to the successful trilogy. I want this film in front of my eyes immediately.

Have you seen a big-chinned boy?

Re: BiM Contest Judging

Remember, this thread is about how to improve contest judging for the future, and not about the previous BRAWLs. Let's stay on topic.

Re: BiM Contest Judging

This is a very interesting thread. I think it's important to discuss this topic because it's healthy to periodically question and revisit established procedures to make sure they are still 'up to snuff' and adequately meet the needs of an evolving community. I'm also glad to see that the discussion has, for the most part, stayed on point and has not turned into a forum to dissect specific judges and contest results.

There are only three things in life you can count on: death, taxes and someone questioning the judging results of a BiM contest. Often, the questioning seems like sour grapes, but sometimes valid questions are raised. However, due to the inherent nature of judging, you will never be able to eliminate some questioning of results. Because of that, there needs to be a level of transparency into the judging process. Two things help with transparency: the methodology must be clearly defined, and knowledgeable and respected judges should be used.

So many excellent points have been raised here. The problem is that every solution has a downside. For example, Repelling Spider makes an excellent point about the individual components vs. the sum of the parts. However, if you take that too far and minimize, or even eliminate, category judging, you open yourself up to questions about how fair your judging was, and you give entrants no guidance on the criteria.

The key is to find the right balance. I think a hybrid approach should be used. I like the idea of not ranking all entries but having the panel select the top X entries to judge. It's like the Academy Awards: hundreds of films are produced each year and 'submitted for consideration' for judging, but there are only a handful of nominations. The judges produce their lists of the top X films and then they get together and determine, through discussion, the group of top finalists that will be judged. (This could be a separate announcement in the contest, The Finalists.) The finalists are then judged in a two-step process. I think you need to establish traditional categories (theme interpretation, cinematography, animation, etc.) to make sure all the judges are on the same page. Each judge comes up with their ranking of the finalists based on these established categories. Then, the judges share their rankings with each other and discuss them to determine the final results. The category rankings are simply the starting point for the final judging discussion. You need both: standardized judging criteria (think: required elements of a figure skating routine) and discussion of overall presentation and impression.

Some notes:

- Judging methodology must be clearly communicated in the announcement of the contest.
- Ranges for the number of Finalists should also be established prior to the contest. As the number of entries will not be known, ranges should be used: 1 - 20 Entries = 3 - 6 finalists; 21 - 40 Entries = 8 - 15 finalists; etc. (admittedly, this one is tricky but I think it is a necessary step to avoid accusations of fudging results)
- A head judge should be named who would settle any stalemates in the final judging

I suspect this is unofficially how many of the contests are already being judged, but I don't think that is being disclosed. Full disclosure up front is key. Ideally, independent judges would be best to avoid any accusations of playing favorites but given the size of the community this would be very difficult to achieve.

TCOTY Entry: The Perks of Being TCOTY
ToY Entry: Secrets of the Lost Tomb
Please visit me on the YouTubes!
Care to follow me on the Twitter?