
Joe Roberts is right about bullsh*t unreliable wine judge studies


I held off commenting on those two Journal of Wine Economics studies that were written up so sensationally in the Wall Street Journal the other day because, after all, I’m the poster child for the “’expert’ wine sipper” (TIME’s snarky words and punctuation) whose “hype” and “illusion” (WSJ’s words) are taking “us all for suckers” (TIME again), and I needed time to think about it.
Well, it’s time for me to have my say.

It all started with a series of research papers by a guy named Robert Hodgson that were published in the (subscription) Journal of Wine Economics. One, in Fall 2008, was “An Examination of Judge Reliability at a Major U.S. Wine Competition.” The other, from Spring 2009, was “An Analysis of the Concordance Among 13 U.S. Wine Competitions.” (I couldn’t find direct links to the papers, but you can easily find the PDFs by Googling “Robert Hodgson, wine.” They were the first two hits on my search.)

In the first paper, Hodgson gave “expert judges…a flight of 30 wines imbedded [sic] with triplicate samples poured from the same bottle.” This is of course precisely the revenge fantasy envisioned so often by critics of wine critics, but in this case Hodgson actually pulled it off, with the cooperation of the California State Fair Wine Competition, apparently by promising that the results would not be made public. Never mind that they were; Hodgson found “judge inconsistency, lack of concordance — or both.” (I like the ominous tension of that double hyphen.)

In the second paper, which sought to reinforce the conclusions of the first, Hodgson statistically analyzed “over 4000 wines entered in 13 U.S. wine competitions” and again found “little concordance among the venues in awarding Gold medals.” His most trenchant discovery was that, although “47 percent received Gold medals…84 percent of those same wines also received no award in another competition.” This would be a little like holding the 2008 Presidential election twice in two days and having Obama win big one day and McCain the next. That would surely cast doubt on the sanity of the electorate, and Hodgson surely meant to cast doubt, if not on critics’ sanity, then on their methods and conclusions.
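
To see why the “coin toss” line has legs, here is a minimal sketch (with made-up award rates, not Hodgson’s data) of what wines entered in two competitions would do if medals were handed out by pure chance:

```python
import random

random.seed(1)
P_GOLD, P_MEDAL = 0.10, 0.45   # hypothetical award rates, for illustration only
golds = no_award_elsewhere = 0
for _ in range(100_000):                 # wines entered in two competitions
    if random.random() < P_GOLD:         # Gold at competition A
        golds += 1
        if random.random() >= P_MEDAL:   # no medal of any kind at competition B
            no_award_elsewhere += 1
print(f"{no_award_elsewhere / golds:.0%} of chance Golds won nothing elsewhere")
# With independent awards this converges to 1 - P_MEDAL (55% here);
# heavy discordance between venues is exactly what randomness predicts.
```

The fight, then, is not over whether the numbers show discordance; it is over whether real competitions look too much like this null model.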

The Wall Street Journal coverage (which was not written by Dorothy J. Gaiter and John Brecher) had the sub-headline, “They pour, sip and, with passion and snobbery, glorify or doom wines,” thereby alerting us, via the word “snobbery,” to the attitude of the writer (Leonard Mlodinow, a professor of randomness at Caltech) toward critics (although to be fair, Mlodinow may not have written either the header or the sub-head; those creative tasks often are done by editors). That sub-head, by the way, also compared ratings and reviews to “a coin toss,” the implication being that the next time you have to decide between a 100-point Heimoff wine and a 67-point Laube wine, you might as well toss a coin because either way it doesn’t matter.

Here’s my response. Yes, Hodgson’s papers (and the ensuing near-hysteria they prompted in the media) are a little “hard to swallow” for critics (as this article put it), not because we’re unaware of a certain inconsistency in wine reviewing (I’ve never denied it), but because these types of papers paint all critics and all wine criticism with the same brush and make it sound as though all wine criticism is sheer bunkum.

Joe Roberts, at 1WineDude, hit the nail on the head with his deconstruction of one of Hodgson’s papers by pointing out that Hodgson’s implication that all wine criticism is flawed is, itself, a flawed conclusion. (Joe called it “probably total bullsh*t.”) And even if Hodgson were to say that not all wine criticism is flawed, surely he must have known that his papers would be perceived as saying so and would empower the chatterazzi who love to bash wine critics.

Part of the problem is group wine judgings, which are subject to so many problems I don’t know where to begin. Let’s start with politics and personalities and then descend to contrasting and possibly suspect levels of qualification and the fact that committee scores tend to skew toward the median. Group tasting includes the phenomenon of peer pressure, which is the opposite of the kind of solipsism that characterizes individual tasting of the kind I do. I’ve seen tasters in groups get so weary of negotiating and being attacked by the majority that they simply crumple and agree to change their scores so everybody can get on with it. Besides, tasters in a group tend to consume more wine than a single taster will by himself, because as the group discusses and negotiates its findings, re-sipping becomes necessary, leading to the possibility of marred senses.

Hodgson’s paper on triplicate samples poured from the same bottle seems on the surface to provide more ammunition to the critic bashers, but on closer examination it can be explained. Hodgson writes that “When possible, triplicate samples of all four wines were served in the second flight of the day.” As I understand it, this means that the critics had tasted no fewer than 30 wines, and possibly as many as 60 wines, when they reviewed the doubles and triples; this raises questions about palate fatigue. Hodgson himself also recognized that judges may vary in competence. This is why I tend not to put much stock in wine competition findings, but lots of Americans do, and I have no problem with that.

It can’t be said enough: a wine review is a particular person’s impression of a wine at a particular moment in time. Readers are free to attach their trust in a wine review to their trust in the wine reviewer.

  1. First off, you deserve credit for responding to the WSJ article citing Hodgson’s studies. This is an emotionally charged issue, especially for professional critics such as yourself. Now, to directly confront the spirit of Hodgson’s work: after you taste (approx.) 10 wines at a time, given palate fatigue (my palate is shot after 7 or so wines), are you of the belief that your scores of a wine wouldn’t vary significantly (i.e., ≥5 points) if tasted at different times? You write, “a wine review is a particular person’s impression of a wine at a particular moment in time.” It almost seems as if this is an admission that wine ratings can vary significantly.

  2. Sean, it’s an admission (if that’s the right word; I’d use “acknowledgment”) that a wine rating can vary over 2 different events. I don’t think there’s a wine critic on Earth who would deny that. I would hope not, anyway.

  3. I really agree with the argument you and Joe make. I think the biggest factor in those wine festival judgings is the politics behind the events, and the whole group judging/haggling/negotiating. I’ve lived it for a few years at a local wine festival, and found that most of what my group scored low had medaled, and what we gave high scores got zilch.

  4. Steve Boyer says:

    Seems to be quite a waffle you have made here, Steve: claiming that the studies about wine judges being inconsistent are invalid, and then ending the post with qualified agreement with Hodgson’s conclusions.
    Not having read the studies word for word I could be wrong, but my understanding was that Hodgson’s studies were about contest judging not critics as a whole.
    How can palate fatigue affect contest judges and not critics who review 52 Napa Cabs in one sitting? How can group tasting negatively affect contest judges and not critics who group taste 9 aged wines?
    The truth is that Hodgson’s studies (while admittedly hit pieces, based on his personal motives for undertaking the studies in the first place) have merit and speak to a truth that all wine professionals familiar with wine contests know to exist. This only means that gold medals and such should be looked at as marketing, rather than as objective reviews. Of course there are good judges along with poor judges, just as there are good critics to go along with the bad, but to reject Hodgson’s criticism completely for this reason really puts you in the same boat as Hodgson.
    It seems as though Hodgson held up a garment tailored for another, and you claimed it was a poor fit for yourself. Just as blogging, wine writing and criticism is in need of some kind of ethical standards as well as full disclosure, perhaps contest judging needs more of these ideals as well.

  5. It’s interesting to see this debate take on new turns and new discussion – awesome!

    I’d add that I think Hodgson’s conclusions might be totally correct – I just don’t agree with *how* he came to those conclusions on the basis of his studies.

    Great stuff (again!), Steve.

  6. Morton Leslie says:

    Here is a link to the paper. http://www.wine-economics.org/journal/content/Volume3/number2/Full%20Texts/01_wine%20economics_Robert%20T.%20Hodgson%20(105-113).pdf

    I think you would have to be oversensitive, or want to create a straw man, or not really have read the paper, to think that Hodgson implies that all wine criticism is flawed. Read his conclusion. His point is that there is a way to measure the consistency of any wine judging. This is probably what strikes fear in the wine critic.

    If a competition were serious about wine judging, they would choose their judges on the basis of tested and proven wine tasting ability and consistency. They would use professionals in wine sensory analysis who understand where error and bias occur and take measures to prevent such mistakes. They would monitor judge performance during the competition and eliminate judges who fail to be consistent. They would provide quality and flavor standards and definitions for each wine category against which the judges would score.

    If you task a judge to taste 24 wines in one sitting or 80 wines in one day, you guarantee palate fatigue. If they taste in a panel situation in a noisy and distracting environment you guarantee error. If you allow them to talk you invite outside influence. If you have a judge who thinks Cabernet Sauvignon in the table wine category is acceptable at 16% alcohol with residual sugar and you have another judge who thinks this is unacceptable, a third judge who loves oak and brett, and a fourth judge who likes one thing now and another thing later and if you have no standards for them to judge a wine by…then you create a situation where the winner of the competition is really just a chance event.

    And this is what Hodgson so beautifully illustrates.

  7. Morton, you’re asking for wine competition judges to essentially be Mentats: [Wikipedia: “A Mentat is a profession or discipline in Frank Herbert’s fictional Dune universe. Mentats are humans trained to mimic computers: human minds developed to staggering heights of cognitive and analytical ability.”] The completely unbiased, un-influenced, error-free, flawless and perfectly consistent human being exists only in fiction. All the rest of us are prone to error, some more, some less, and the incidence of error in any one judge can be random. Therefore there can be no perfect way of judging wine; we pays our money and takes our choice. For me, the choice is not group judgings, it’s a single taster whose palate I trust and whose writing I like.

  8. Morton Leslie says:

    You are reading things in my comment that aren’t there, just like you are reading things in Hodgson’s paper that aren’t there. Both his paper and my comments are about improving competitions, making them relevant, not about making them perfect.

    Why not measure whether a judge is consistent or whether they have the sensory skills necessary to judge a particular wine? Why not prohibit table talk? Why not reduce the number of wines tasted to help prevent palate fatigue? Why not establish standards by which wines are judged? Why not ask that a judge have training in sensory analysis? Why not ask that the results be more than chance?

    This isn’t about making people into machines, it’s about making them more professional.

  9. Morton–

    I have yet to see a so-called competitive judging that made sense to me. That is why I have refused to do them for the last twenty years--except for two in Australia for which I traded in my soul and went anyway.

    It does not matter if they are being run by Dan Berger or by the corner grocery store, there is no way for a big, all-inclusive tasting, regardless of the competence of the judges, to make consistent decisions when judging 100+ wines per day.

    Training in sensory analysis? Forget it. Palate fatigue is palate fatigue. 120 Sauvignon Blancs are 120 Sauvignon Blancs. Acidity is acidity. Tannin is tannin. Pity the poor bastards who get the Petite Sirah at nine in the morning--or as happened to me in Australia, the sparkling Shiraz first thing out of the chute.

    It seems to me that we have long ago identified these massive tastings as exercises in marketing and not in wine criticism. How does a Gold Medal, earned in a competition in which the best candidates typically do not enter anyhow because they have nothing to prove, mean more than a description and score from a respected critic who has demonstrated over time that he understands the topic being covered? Give me Schildknecht on Riesling. Give me Gerry Dawes on Spanish wines. Give me the Decanter panel on claret. Despite his too-high points, give me Parker on Rhones.

    I don’t need Hodgson to know that these are good sources because over the years, their judgments have proven good by my palate. And, regardless of what Tom or Arthur may want by way of wine illumination, those guys “illuminate” the wine just fine for most of their readers. I like that standard a lot better than any other.

  10. It is my understanding that Robert Hodgson undertook his study because he himself had a winery and entered competitions. Sometimes he won medals and sometimes he didn’t, so he concluded that there must be something wrong with the competitions. I’m no scientist, but that doesn’t seem like a logical conclusion to me.

    Admittedly I am biased here, since I work for a wine critic and work on several wine competitions. I believe his reviews are extremely professional. I can’t speak for all wine competitions, but I believe our competitions are well-run and fair.

    But that isn’t the point I want to make. It is this — we live in a world where every item or service we might conceivably want to buy is advertised and promoted, most often using recommendations by everyone from actors playing housewives to superstar athletes. We are expected to take the word of Joe on the street or some famous actor that we will just love this soft drink, or that car. With that being the way most things are marketed, surely even “inconsistent” (IF they are) recommendations by people who are experienced and knowledgeable in their fields are more valuable than those from someone paid to say they like something.

    Our wine competition judges are selected for their experience and competence. For Critics Challenge, they are all experienced wine journalists (critics, reviewers, writers, call them what you will). For Sommelier Challenge they are professional sommeliers with extensive wine training. For the newest competition, Winemaker Challenge, they are winemakers including those from Nickel & Nickel, Patz & Hall, Sbragia Family, Merry Edwards and others (full list at http://www.WinemakerChallenge.com). Personally, I put more value in recommendations by professionals than I would in those of pitch-men or actors, yet that’s how most products are marketed.

    As much as I love wine and hate to reduce the discussion of it to the level of widgets, the fact is that results of wine competitions are used to market wine. They may not be perfect, but they do give the average wine buyer a tool for choosing wine. Surely picking a wine because it won a Gold is more reliable than picking one because it has a cute animal on the label, or a catchy name.

  11. When a Charles Shaw Chardonnay won a Double Gold a few years back, that should have alerted people to the fact these competitions are essentially random. Instead they went out to buy cases of it.

    The main problem with ‘regular’ reviews is there are no variances given on scores. A 90 is really 90 +/- 3. Additionally, wines are professionally tasted in a way most people don’t consume them. Most people don’t have 1 oz., then spit. They have a glass or two, and they drink it. This makes a big difference with high-extract wines, as they become more numbing over time. Aromatic wines, meanwhile, often reveal more nuances.
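
To put numbers on that “90 +/- 3” point, here is a minimal sketch, assuming (purely for illustration) that each tasting adds independent Gaussian noise with a standard deviation of 2 points to a wine’s true score:

```python
import random

random.seed(1)
TRUE_A, TRUE_B, SD = 90, 87, 2.0   # hypothetical true scores and per-tasting noise
N = 100_000
flips = sum(random.gauss(TRUE_B, SD) > random.gauss(TRUE_A, SD) for _ in range(N))
print(f"The 87-point wine outscores the 90-point wine in {flips / N:.0%} of tastings")
# The gap between two noisy scores has sd = SD * sqrt(2), about 2.8, so a
# 3-point published difference flips in roughly one tasting out of seven.
```

Under those assumptions, a small point spread between two wines is a tendency, not a fact.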

  12. I find this all totally amusing. In fact, it makes me laugh out loud. I wish more consumers had confidence in their own palates and experiences and relied less on reviews. No offence! I eagerly read most publications and websites purely for perverse kicks.

  13. Thanks for this post Steve, this is a very relevant topic as many wineries consider entering competitions. I work for a wine competition company and our company believes that wine competitions and wine judging are an excellent marketing tool to convince average consumers to purchase a bottle of wine!

    When I say average consumer, I mean someone who enjoys wine but probably drinks their wine purchase within 3 days and usually grabs something off the shelf at their local grocery store. We believe that average consumer will be more likely to purchase a bottle of wine if they know that it has won a gold medal! A medal literally says, “Drink me, I’m good!” I think some wine industry professionals are against wine competitions because they see inconsistencies in the medal or scoring process. I totally agree that there are some inconsistencies, but medals are supposed to entice average consumers, not wine professionals who clearly have a greater knowledge of wine and of what they prefer to drink.

    I agree that there may be some room for improvement in wine competitions, and we take great care to create consistent and unbiased judging. We choose our judges based on their wine experience and knowledge. For example, two of our wine judges from last year were Mary Ewing Mulligan and Ann Noble. We try our best to prevent palate fatigue, but each of our panels tastes at least 80 wines in a 6-hour day. Why not just get more judges and lower the number of wines tasted? Because we choose to wine and dine and put up each of our judges in a hotel.
    We do this because it gets us better qualified judges who provide more consistent scoring. Obviously, if we could, we would like to have more judges to reduce the number of wines they are tasting, but that is an issue of $ and other resources. Getting back to the judging process, each panel has its own tasting room, and we rent out a whole wing of a local hotel to ensure that there are no noise distractions. Finally, we require our judges to taste through each of the wines before they decide upon a medal.

    Is there a way to improve wine competitions? Do the wineries who enter competitions want these to be improved? Wineries enter competitions to win medals. So what if a wine gets a gold at one competition and nothing at another? Those are two different tastings with two different bottles of wine! If you walk into any winery and taste the same wine three months apart you might have two very different experiences! Not to mention that giving the same wine to two people with different palates will result in two very different opinions of that wine. It is impossible to ask wine professionals to judge a wine exactly the same way. I think that Hodgson’s findings were probably valid. However, it is unrealistic to ask wine judges to consistently award medals and scores to wines.

  14. Greg

    I would respectfully submit that the Two Buck Chuck Chard victory had more to do with the rules that allowed the winery to submit a sample from a fifty-case lot than with the randomness of results. It is a prime example of the deceit that happens when wineries are allowed to bottle wine in separate lots and put the same labels on them.

    That said, I do find the methodology at such events lacking.

    As regards high extract wines, I am happy with Chardonnay and I am happy with Riesling, and I do not find myself getting noticeably more numb drinking Chardonnay with the dishes it goes with than I do drinking Riesling with the dishes it goes with.

  15. Steve Boyer says:

    Wow… Where to begin!
    Felicia’s acknowledgment of her unscientific background doesn’t keep her from declaring a scientific study unscientific. Not sure if I am reading the Onion Wine Column or Mad Magazine’s top 100 excuses for poor wine judging.
    While admitting that medals and judged contests are potentially inconsistent and all about marketing, Felicia maintains that at least wine contests are judged by professionals whose opinions are of more value than those of “pitch-men or actors”. Oddly enough, one of the “professional” writers/judges/critics of the publication she defends has encountered an ethical quandary that potentially qualifies her as one of the “pitch-men or actors” Felicia discounts. Dr. Vino covered this well (http://www.drvino.com/2009/11/13/beringer-leslie-sbrocco-wine-cellars-7-11-chocolate-milk/); the comments section is most illuminating! Felicia’s conclusion that picking a wine because it won gold is more reliable than picking it because of the label or the name is as flawed as her idea of scientific integrity.
    Elisa, it is at least somewhat refreshing to know that while you “work for a wine competition company” you acknowledge that said competitions are simply schtick to “entice average consumers, not wine professionals” and that the height of your company’s ethical standards is that “we require our judges to taste through each of the wines before they decide upon a medal.” Imagine that… you actually require that the judges taste all of the wines before awarding the plethora of medals awaiting the marketable ones. So, can I infer from your statement that requiring each judge to actually taste all of the wines being judged is an unusually high standard for wine contests?
    I admit that Fred Franzia’s latest label can have as much bottle variation as any two cans of Campbell’s Chicken Noodle, but to give it a gold in one contest and no medal in another contest is somewhat disingenuous. I also admit that asking the average wine judge to “consistently award medals and scores” is over the top. We should all just relax and enjoy fooling the “average consumer”. Shame on them if they are silly enough to expect an honest and professional appraisal of the wines on which they are spending their hard earned money.
    The emperor’s tailor has nothing on wine contests!!

  16. Slow down there, Mr. Boyer. What Elisa said was that her competition requires all the judges to taste all the wines in a grouping before giving any of them a medal. At least this way, the judges are awarding medals to wines based on how they compare to each other.

    Your derision for the low level of scientific rigor, as you perceive it, is obvious in your posts, but don’t let it keep you from understanding the specific methodology that Elisa describes, or lead you to draw snide conclusions based upon a misreading.

  17. TJ, I agree. Reading the local, highly marketed Sonoma County Harvest Fair results is like reading a yellow tabloid of which of the same three wineries got the best of class for Zin, Sauv Blanc, etc. (implying many of the same judges are used every single year). These competitions are for beginner wineries who think they need to submit seeking “peer acceptance” or for some other vain-ish reason. The sales very rarely pencil out (see below). Moreover, once these shiny medals (made in China) reach the winery, what do you do with ’em? Hang these gaudy things in the TR? I think not. Besides, hitting the big score or medal doesn’t translate into phones ringing or orders flowing in from the internet.

    The fact is, the main segment of wine people supporting this ancient form of wine marketing are the very same people who financially benefit (writers, contest hosts, etc). Once the children of winery owners, and the new gen of growers/winemakers, realize this is an inconsistent, subjective, non-scientific approach to marketing one’s wines, there’ll be a change up.

    The only comp I found at all useful was the AVA-specific one where wines were tasted blind alongside other wines from basically neighboring vineyards.

    Average fee to submit: $100/wine (including time, UPS, packaging, wine, etc.); average number of wines submitted = 3; average number of comps entered annually = 4; that’s $1,200-1,500 each year for the opportunity for some self-proclaimed judge to taste your wine. Seems like a lot of dough.

  18. Jason Brumley says:

    I don’t want to bash critics or wine judges; their jobs are hard enough. I do find it interesting, however, that many of those experts will vehemently defend their findings and ratings, yet when probed will admit (and ofttimes glorify) the fact that wine taste is subjective and that this can be caused by many different factors. You brought up one, palate fatigue; but there are others, like environment (e.g., State Fairs…it’s tough to taste and accurately judge with the wafting barnyard smells and fried food grease lingering in the air). But those and other factors can be the things that make wine so cherished. Tell me that a wine does not taste different while sipping a glass at the winery, during a nice meal with your significant other, or looking out upon the Seine, compared to tasting in a convention center or with a print deadline looming over your head.

    These are the things that make wine so wonderful. The flavors are subjective and entirely linked to mood, time of day, and environment. I just wish that more reviewers would represent their findings, not as law, but more like interpretive dance or poetry.

  19. Jason, I haven’t compared my reviews to dance or poetry, but I take your point. I have said endlessly that there’s a subjective aspect of reviewing…that any given review is simply that reviewer’s experience of that wine at a particular moment in time…and that readers or consumers should see it that way. So I suppose the comparison between wine reviewing and dance or poetry is apt — although that will infuriate the purists who insist wine reviewing should be as rigorously consistent as mathematics!

  20. It doesn’t surprise me a bit that Two Buck Chuck could win an award with their Chardonnay. The grape plays a lesser role in Chardonnay wines. Two Buck Chuck Chardonnay is a finely controlled concoction of oak extract, etc. Probably 75% Chardonnay and 25% something cheaper. It would surprise me if Two Buck Chuck won for a red wine. I seriously doubt any of their red wines have any varietal correctness except color.

    Clearly, wine competitions are set up with too many wines in too little time. Why pick on competitions? The whole industry is based on hype. Just watch everyone flock to the “names” at any tasting. Even the experienced, serious tasters do it. (Of course, nobody reading this blog does it.)

  21. Wine assessment involves both dependent and independent variables. Competitions seek to isolate the sensory quality of the wine itself, apart from setting, mood, etc., all of which bear upon enjoyment. Sure, if you are sitting alongside the Seine, the wine and its memory are going to rank high. But if you live in Peoria and just want to select among a plethora of wines for sale in your city, you’d like some guidance from others who have rated the wine and not the experience.

    That said, as with other consumer items, people tend to trust the opinion of friends first, then peers generally. In the Experts vs. The People, our new social networking world is shifting the relative importance of each. Experts still carry weight, but not like they used to. Such sites as wine.woot.com, Yelp.com and TripAdvisor.com (the Zagat guide being the pioneer) play a much greater role in consumer decision making because of the ubiquity of the Internet. These sites, and all the others, tend to arrive at “winners” that emerge regardless of individual circumstances (= personal experience). These channels provide the best means of eliminating the subjective variables.
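
The statistical intuition behind that last claim is easy to sketch: if many raters’ errors are independent, averaging cancels them. A minimal sketch with invented numbers (a “true” score of 88 and a per-taster noise of 4 points):

```python
import random
import statistics

random.seed(1)
TRUE_QUALITY = 88   # hypothetical "true" score of a wine
NOISE_SD = 4        # spread of any single taster's rating

def crowd_average(n_tasters):
    """Mean of n independent, noisy ratings of the same wine."""
    return statistics.mean(
        random.gauss(TRUE_QUALITY, NOISE_SD) for _ in range(n_tasters)
    )

for n in (1, 10, 100, 1000):
    samples = [crowd_average(n) for _ in range(2000)]
    print(f"n={n:4d}: crowd score {statistics.mean(samples):.1f} "
          f"+/- {statistics.stdev(samples):.2f}")
# The spread of the crowd average shrinks like NOISE_SD / sqrt(n): one
# critic is +/-4 points, a hundred raters are +/-0.4.
```

The catch, of course, is the independence assumption: if raters copy each other, or are not rating the same bottle under the same conditions, the averaging does far less than the arithmetic promises.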

  22. Steve Boyer says:

    Mr. Mirassou, my conclusions (snide or simply straightforward?) are not based on a misreading. Restating the methodology doesn’t change the understanding that requiring all of the judges to taste all of the wines seems to me to be the bare minimum, not a reason for patting any judging format on the back for being thorough and unbiased. Your conclusion that “At least this way, the judges are awarding medals to wines based on how they compare to each other” again begs the question, “is this an unusually high standard for wine contests?” It is not a lack of understanding the specific methodology that causes my consternation; it is that said methodology should be standard, not unusual, and certainly not a reason for back patting.

    What concerns me even more (admittedly also what arouses my ire) is the cynicism with which the “average” wine drinker is viewed. I appreciate Elisa’s candor and acknowledgement that the process is inconsistent and at its heart nothing more than marketing schtick, and yet am disheartened that her response to this is “so what”. You call me to task for snideness, as you perceive it, but seem to be OK with an industry approach that views the “average consumer” with such apparent disdain.
    It is obvious to anybody who has tasted your wine that you are passionate, thorough, consistent in craft and knowledgeable of place as well as market. Doesn’t your wine deserve consistent, thorough, knowledgeable judging all of the time? Don’t the people (actual people, not “average consumers”) who pay for your wine (via a gold medal from a store shelf or the tasting room at your winery) and enjoy it deserve to trust that the experience is genuine, not just a “so what” because it is difficult to ensure consistent results from contests?

  23. “It was in this climate that in the 1970s a lawyer-turned-wine-critic named Robert M. Parker Jr. decided to aid consumers by assigning wines a grade on a 100-point scale. Today, critics like Mr. Parker exert enormous influence. The medals won at the 29 major U.S. wine competitions medals are considered so influential that wineries spend well over $1 million each year in entry fees. According to a 2001 study of Bordeaux wines, a one-point bump in Robert Parker’s wine ratings averages equates to a 7% increase in price, and the price difference can be much greater at the high end.”–LEONARD MLODINOW, Wall Street Journal

    Which sentence doesn’t belong, and why? Jeez, who edited this piece? Why is this sentence on wine competitions in a graph that is talking about Parker? What does one have to do with the other?

    And to Charlie Olken, thank you so much. You are the one who reads me!!! 😉

  24. Jason–

    Most of us know that our comments are our opinions. We are smart enough and humble enough to know that we are talking about snapshots, and some of us are capable of translating our opinions into words and descriptions which many people find accurate. Many of us also suggest context for the enjoyment of particular wines.

    But, of course, almost all critical work consists of snapshots of one form or another. One hopes that professional wine critics, following rigorous, bias-limiting methodologies, can consistently make useful comments based on those snapshots. Not all can, but I have listed above many who can, in my opinion of course.

    I cannot think of any of the prominent reviewers who think their opinions are law, but I may be too close to the action to see what you see. Your comments are grist for the thought mill, and I would welcome some elaboration on them.

  25. I’ve read the Robert Hodgson studies, I’ve read many of the articles, essays and blogs they’ve prompted, and I write about wines I find of interest and value, drawing from experiences that range from visiting tasting rooms to judging at wine competitions. Given that background, let me say:

    1) Every party to every one of these forums inevitably and rightly concludes that each person should learn to trust his or her palate first, find the wines they most enjoy, and use the results of wine competitions and the scores and commentary of individual critics only as guideposts.

    2) No approach to wine criticism is perfect. Competitions have their shortcomings; so do individual critics. (One value of competitions that few observers have noted is that in a group evaluation an open-minded judge will learn to recognize and acknowledge blind spots pointed out by fellow panelists.)

    3) The Hodgson studies are intended to help improve the credibility of wine competitions, and while I may have quibbles with their methodology and interpretations I share his hope that they will lead eventually to improvements in how the contests are run.

    4) A close reading of the studies shows that they don’t conclude that group wine criticism is “sheer bunkum”; it is the interpretations of the studies that reach that conclusion.

    5) As Charlie Olken does in this thread, someone invariably claims that “the best candidates typically do not enter” wine competitions. How does anyone know if they are the best wines if they don’t subject themselves to group evaluation? In my experience, it’s the most expensive wines that typically don’t enter wine competitions, wines which because of their cost and capitalization have ways of marketing themselves to convince buyers that they are the best. They have too much to lose by ending up with a bronze medal, or less.

    6) Much misperception attends these sorts of threads. Contrary to what is said here, for one, State Fair wine competitions generally aren’t held during the State Fair itself, in part to avoid the intrusion of “barnyard smells” and other distractions. I just returned from judging the Houston Livestock Show & Rodeo International Wine Competition; the livestock show and rodeo itself won’t be until next spring, and eight months have elapsed since the last staging of the roundup. Also, Steve suggests here that Robert Hodgson broke his word that results of his studies wouldn’t be made public. He didn’t. Though California State Fair officials initially were apprehensive about being tied to potentially embarrassing research, they ultimately agreed the results should be published. As a member of the California State Fair’s wine advisory committee, I was there.

  26. Mike, “an open-minded judge will learn to recognize and acknowledge blind spots pointed out by fellow panelists.” In theory, maybe. In practice it’s just as likely to be, ‘Those other guys are idiots.’ You did say ‘open-minded’ but most judges I’ve ever met are pretty set in their ways. Also, “it’s the most expensive wines that typically don’t enter wine competitions, which because of their cost and capitalization have ways of marketing themselves to convince buyers that they are the best. They have too much to lose by ending up with a bronze medal, or less.” This is certainly true. As for “breaking his word,” I relied on published reports, and I thank you for the clarification.

  27. Mike–

    I will repeat my assertion. The best wines are rarely entered in these competitions.

    You ask how anyone will know. Fair enough. Because those wines are compared, mostly in blind tastings, by reputable critics using rigorous tasting methodologies, and my tasting panel’s results are part of the body of knowledge suggesting that the best wines, according to my magazine’s ratings and those of lots of other people, are not those entered in public competitions.

  28. Thanks, Charlie. Sounds as if your tasting panel shares with several wine competitions the desire to evaluate wines with a fair, open and focused mind. Might be interesting if you and other similarly constructed panels were to retain Robert Hodgson to test the consistency and endurance of the judges. Maybe Marvin Shanken could be persuaded to underwrite such an expanded study.

  29. Given all the brouhaha over wine competitions and judging inconsistencies, here is a fun thing to take a look at: http://www.consumerwineawards.com

    We are getting ready to launch an amazing wine tasting and awards program using consumers as judges. Our intentions are:

    -Raise money for local charities (primarily a playground renovation project and wheelchair initiative) with the Lodi Tokay Rotary.

    -Continue our paradigm-shifting research on:

    o Why consumers like the products they do.
    o How to segment consumers to create vibrant communities based on shared sensory and aesthetic values.
    o See how open the consumers are (or not) to exploring new and different styles and flavors and develop new means for our industry to embrace and cultivate the entire market. With gusto.

    o Provide feedback to producers to gauge how the hardest to reach consumers respond to their products.

    o Develop the means to have everyday consumers themselves become confident that they can assess wines, share the information and connect other consumers to the products they will love the most in stores, restaurants and online.

    o Capture data with several formal studies with our team of mentors and researchers to determine the validity of our various hypotheses and tasting methodology.

    We anticipate over 1,000 wines and a judging body of over 100 consumers. What makes this even more fascinating is that we are seeking consumers who are NOT necessarily members of or affiliated with wine organizations, taking wine education classes, or reading Robert Parker or the Wine Spectator. Both Jancis Robinson MW, OBE, and Gary Vaynerchuk (yay for extremes) are on our Board of Directors, demonstrating the interest we are brewing in the wine community.

  30. Mike Dunne – would love to connect soon and take you through our next iteration! We are reversing the process now and selecting consumers by their declared wine preferences, having them assess wines consistent with their preferences, THEN having a team from Davis assess their taste sensitivity, taste buds, etc. I will give you a holler.

  31. In the FWIW category:
    I work for an East Coast winery. In 2009 we entered 7 wine competitions. Of the two wines that received the most medals, the results seem to have some consistency:
    2006 Syrah tallied 2 golds, 2 silvers and one bronze
    2006 Merlot tallied 4 silvers and one bronze
    Here are the competitions we entered, some local and some national.
    2009 Dallas Morning News Wine Competition
    2009 San Diego International Wine Competition
    2009 Tasters Guild International Wine Competition
    2009 Pacific Rim International Wine Competition
    2009 Riverside International Wine Competition
    2009 PWA Pennsylvania Wine Competition
    2009 Pennsylvania Farm Show
    You can draw your own conclusions.

    I think wineries enter these because they are there, and you do want to be compared to your peers and see how you come out, as well as the fact that we don’t get reviewed by the national media/critics. Did these medals help sell these wines? Probably not, but it can’t hurt.

    I don’t hear our customers standing in the retail area of our tasting room looking at wines making any comments whatsoever about a wine winning a medal before they make a purchase.

    We don’t stand at the tasting bar spouting off that this wine won this and that wine won that, but there are a number of wineries I have visited where this is a major part of their spiel. When I hear this I am usually turned off immediately, feeling the wines must stand on their own merits to my palate.

    BTW I read Gerry Dawes

  32. Steve- Thank you for your honest and frank opinion. I hope I have not misled you into thinking that we are fooling average consumers by tricking them into buying wines that may not deserve a medal. We hope to offer some guidance to consumers who may not read wine reviews or who are just starting to drink wine. We know that medals sell wine to all levels of consumers, and we feel that the judging methods within our competitions are legitimate and consistent.

    My “so what” comment was addressed at the fact that different competitions throughout the country may have inconsistencies in awarding wines medals. I stick by that comment based on the fact that we know wine will taste different throughout its life and wine judges all have different palates. This makes it virtually impossible to consistently award medals to wines that are entered into several different competitions.

    Finally, in stating that we require all judges to taste the wines before awarding medals, I was simply trying to shine a light on how tasting at a wine competition is conducted.

  33. My name is G.M. “Pooch” Pucilowski. It’s my fault. I’ll take the blame! I let the cat out of the bag — the dirty little secret is out.
    The awful truth has now emerged — wine judges (and wine writers, winemakers, winery owners, wine drinkers, wine retailers, wine critics and you and me) are human! And tasting wine is subjective. And each of us should taste wines, make our own decisions and not pay so much attention to wine reviews. And on one day we may enjoy a beautiful little wine with our significant other, and on the very next day, after a squabble, find the very same wine tasting terrible. And sometimes we may try a wine because someone recommended it or because of the shiny sticker on the label. But now it’s out! Everyone can breathe easier, relax, calm down and even sip a little wine.
    I have been the Chief Judge for the California State Fair for the past 25 years. I hired Bob Hodgson, at least 7-8 years ago, to help me find the best and most competent wine judges possible. I was looking to create a judging system that would test the consistency of judges. The first couple of years were used to establish the experimental design. The testing needed to run in the background of our competition and not interfere. The initial results were intriguing, but limited to a few panels and judges. Since 2005, we have used the same testing program for all our judges. (Judges rotate every year, so some judges may have been tested 4 times and some as little as once.)
    I have been reading many of the consumer/trade responses from the Wall Street Journal, Dr. Vino and Steve Heimoff’s blogs. There are many misconceptions about wine competitions in general and how they are conducted. I would like to share how the California State Fair Competition is run, how we test our judges, and answer some questions that have been asked in the discussion process. By the way, the California State Fair has been judging wines since 1854, which might make this the oldest wine competition in America.
    I would first like to state a few observations and food for thought:
    1) Wine Judges are not perfect (and dare I say, neither are any other humans I know).
    2) Wine Judges are honest, decent, and hard-working individuals who come from many walks of life (sort of like you and me), although the majority are connected to the wine industry in some way. I frankly don’t know why they judge wines. They certainly don’t do this for the money. At the California State Fair, we pay judges a whopping $75 a day for 3 days. Yes, we may fly a few out-of-towners in (but anyone within 300 miles or so pays their own expenses, no reimbursements) and we put them up in a hotel for 3-4 days. We feed them 3 lunches and 1 dinner. Tough way to make a living!
    3) I hired Bob Hodgson to help me find the best and most competent wine judges available. I’m sure it’s my fault; try as I might, I have not developed the technique to look at a person (or even talk and laugh with them) and decide that they are in fact a good judge or a bad judge. In fact, I also judge wines at other competitions (not the State Fair) and I’ve sat on many panels with other judges and still cannot state who is a good judge or not — even if they totally disagree with everything I taste (what makes my medals any better than theirs?).
    4) Most judges (yes, there are a few prima donnas) realize they don’t know everything about wine and understand that there are different styles of Chardonnay or Zinfandel. They are also willing to change their minds (sort of like you and me) if other judges politely suggest they may want to reconsider a particular style. And if the 4 judges choose not to change their minds, we have a scoring matrix that can convert a Gold, Silver, Bronze and No Medal to a Bronze score (or any other combination the judges may dream up). And that is the final score the wine will get unless one of the judges decides to change their own score.
    5) Judges look at tasting wine differently than consumers. They have different motivations. Judges judge wines to find faults and defects. They come to the table to evaluate and critique wines. Consumers drink wine to just drink, enjoy and/or sometimes share.
    In sharing with you how the State Fair Wine Competition is conducted, I hope I can also address some of the many comments in response to the article and blogs, including the remark about the smell of corn dogs and horse manure. We don’t judge wines during the actual State Fair, and we go to great lengths to ensure our tasting environment has plenty of light and fresh air and is free of as many distractions as possible.
    When preparing a blind tasting, it’s important to note that we go to extreme lengths to make sure it’s truly blind. Judges do not know the winery, the appellation, the region, or even see the bottle shape. I also believe that wine judges should not know the price of the wines. They may know the vintage date for the larger categories, but basically they only know the variety of the grape. All the glasses are the same size and have a 4-digit random number on the stem that matches the paperwork. We are also one of the few competitions that only judge California wines.
    Since we started testing our judges, I have consistently lowered the number of wines that judges taste in one day. There are competitions that have judges tasting 130 to 150 wines in one day. I don’t know if that is too much; that is the decision of each competition, but I’ve held our daily max at 100 and continually lower that number. Last year the total was between 80 and 85 wines per day per judge. This amount is usually finished by a majority of judges by 2 pm. We start around 9 am with a 45-minute lunch included.
    We use the Peterson method of wine tasting, which is unique in competition circles and has judges tasting wines like a winemaker might do with barrel samples. Dr. Richard Peterson developed it for us. We bring 20 to 40 wines to the judge’s table and have each judge smell the wines only, placing them into 1 of 3 groups. Then, starting with the group that the judge liked the most, he smells each wine again and puts them in rank order (this wine smells better than this one, etc.). At the finish, all 20 to 40 wines will be in 3 or 4 groups, ranked from his favorite to his least favorite, and then the judge begins to again smell and now taste the wines, rearranging the order again if he feels he needs to. This time he is looking to place wines in categories of Gold, Silver, Bronze, or no medal.
    For Professor Hodgson’s study, in each of these flights we may triplicate two or three different wines. Judges get a total of 4 wines per day that will be tasted 3 times. We attempt to avoid testing judges during the first or last flight of the day.
    After every judge on a panel has tasted and scored their wines, they form a group to discuss and record their final score. By the way, Bob Hodgson takes the judges’ first scores, before discussion, to evaluate. These are the basics of our competition.
    Dr. Hodgson’s study has been kept confidential according to our agreement. All of my wine judges (nearly 150), to this day, do not know their individual results.
    The study actually shows there is a bell-shaped curve with most of the judges falling in the middle. There are 10 to 20% of the judges in any one year that score exceedingly well, not perfect, but pretty darn good. On the other side of the bell, the right side, there are another 10 to 20% that score pretty badly — consistently. The majority of the judges are in the middle 60 to 80% of the bell (probably where you and I would fare). Here are a few points to consider:
    There are no “super judges.” No judges show up every year in that top 10 to 20%; they just slide back into that middle portion, and some other judges are in that “high” range the next year. On the other hand, the low 10 to 20% of judges almost always show up on the right side of the bell, and they frankly have proven to NOT be consistent — ever. These are the judges that I can now weed out, and this is one of the results that I hoped to achieve by doing this study. (A quick simulation of this pattern appears after this comment.)
    After reviewing the judges’ results for the past 7-8 years, I’ve recently begun to seriously realize how difficult wine judging can actually be. I wonder how good any of us can be. We know it’s not humanly possible to be perfect, but how close can we get? The challenge I offer any one of you readers (and/or critics) is to find and score 3 of the same wines out of the same bottle in a pile of 20 to 40 Pinot Noirs, Sauvignon Blancs or name your favorite. Can you pick out the wines and score them the same? Now, up until I hired Professor Hodgson to conduct this consistency exam, I might have felt like you — what’s the problem? Seems easy enough. Anyone that’s any good should be able to do this! But now, after testing these very sincere judges — I’m not so sure any more. I will continue with the study to test the judging system and find the right combination of wines, judges and scores to give consumers the greatest confidence possible.
    There were a number of comments on the Wall Street Journal article by a consumer, Morton Leslie, that suggested that wine competitions “would choose their judges on the basis of tested and proven wine tasting ability and consistency.” He went on, “They would monitor judge performance during the competition and eliminate judges who fail to be consistent. They would provide quality and flavor standards and definitions for each wine … Why not measure whether a judge is consistent or whether they have the sensory skills necessary to judge a particular wine? Why not prohibit table talk? Why not reduce the number of wines tasted to help prevent palate fatigue? Why not establish standards by which wines are judged? Why not ask that a judge have training in sensory analysis? Why not ask that the results be more than chance?”
    I totally agree. These are logical steps to take and quite similar to the practices I’ve put in place for the California State Fair Wine Competition. We are the only competition that requires judges to take an evaluation exam, currently conducted at UC Davis, before they are able to judge for us. We are one of the few competitions that ask judges to rank their favorite varieties so they won’t be stuck tasting a variety they dislike. Instead of testing our judges every year, which could be cumbersome, I bring in professors and/or experts to conduct seminars that educate the judges on one aspect of wine. I started a mentorship program that basically trains newer judges on how we judge wines and brings them up to speed.
    I’d like to point out that no other competition or wine critic that I’m aware of has tested their judges or themselves. It is only because of the courage and faith of the California State Fair and our Advisory Members that we continue to test our judges. What would we be talking about now if the study had never happened? At least I know now where we stand. I now have a base that allows me to make changes and evaluate whether each one helps or hinders. I stand by the fact that we did do the study, that we continue to do the study and that it was published. I am looking for ways to improve the process at the California State Fair that will give judges the best opportunity to be as consistent as humanly possible. This is an important discussion that has begun as a result of the testing and I look forward to seeing it continue. And we will continue to share the results.
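
Pucilowski’s “no super judges” observation is the classic signature of regression to the mean. Here is a minimal sketch (the pool size loosely echoes his 150 judges; everything else is invented) in which every judge has identical true skill and a year’s consistency score is pure luck:

```python
import random

random.seed(1)
N_JUDGES = 150   # roughly the size of the State Fair judging pool

def top_20_percent(scores):
    """Indices of the judges landing in the top 20% for one year's scores."""
    cutoff = sorted(scores, reverse=True)[N_JUDGES // 5 - 1]
    return {i for i, s in enumerate(scores) if s >= cutoff}

# Identical true skill for everyone: a year's score is pure Gaussian noise.
year1 = top_20_percent([random.gauss(0, 1) for _ in range(N_JUDGES)])
year2 = top_20_percent([random.gauss(0, 1) for _ in range(N_JUDGES)])
print(f"{len(year1 & year2) / len(year1):.0%} of year-1 stars repeat in year 2")
# Pure luck predicts only ~20% repeats, consistent with "no super judges."
# A cohort that lands at the bottom year after year, by contrast, repeats
# far more often than luck allows; that part of the data is real signal.
```

Under this null model, the vanishing of one year’s stars says little about the judges; the stubborn persistence of the weakest group is the finding that carries information.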
