Why Brainteasers Don’t Belong in Job Interviews

Imagine that you are the captain of a pirate ship. You’ve captured some booty, and you need to divide it among your crew. But first the crew will vote on your plan. If you have the support of fewer than half of them, you will die. How do you propose to divide the gold, so that you still have some for yourself—but live to tell the tale?

There is a correct answer: divide it among the top fifty-one per cent of the crew. If you knew that, you’ve passed what used to be one of Google’s infamous, mind-scrambling job-interview questions, which would have placed you one step closer to a career at the technology giant. (Google reportedly banned the practice a couple of years ago.) In a surprising June 19th interview with the New York Times, Laszlo Bock, Google’s senior V.P. of “people operations,” explained why: the company discovered these brainteasers are “a complete waste of time,” and “don’t predict anything” when it comes to job success. Google shouldn’t have been shocked. A psychologist would have known at the outset that tests of this nature hardly ever work, and that there are much better predictors of who will get hired and how they will perform.

Researchers have always tried to use psychology for predictive ends: Can what we already know about a person tell us how she will behave in a given situation? The results of these endeavors have been mixed. While there is some evidence for links between certain personality traits and later outcomes, the correlations tend to be limited, and the predictions that can be made are broad at best. For instance, we can tell whether a given person will generally succeed at academic pursuits, but not whether she’ll excel in a particular seminar on ancient hieroglyphics.

The major problem with most attempts to predict a specific outcome, interviews included, is decontextualization: the attempt takes place in a generalized environment, as opposed to the context in which a behavior or trait naturally occurs. Google’s brainteasers measure how good people are at quickly coming up with a clever, plausible-seeming solution to an abstract problem under pressure. But employees don’t experience this particular type of pressure on the job. What the interviewee faces, instead, is the objective of a stressful, artificial interview setting: to make an impression that speaks to her qualifications in a limited time, within the narrow parameters set by the interviewer. What’s more, the candidate is asked to handle an abstracted “gotcha” situation, where thinking quickly is often more important than thinking well. Instead of determining how someone will perform on relevant tasks, the interviewer measures how the candidate will handle a brainteaser during an interview, and not much more.

Interviews in general pose a particular challenge when it comes to predictive validity—that is, the ability to determine someone’s future performance based on limited data. Not only are they relatively brief but also, over the past twenty years, psychologists have repeatedly found that few of a candidate’s responses matter. What is significant is the personal impression that the interviewer forms within the first minute (and sometimes less) of meeting the prospective hire. In one study, students were recorded as they took part in mock on-campus recruiting interviews that lasted from eight to thirty minutes. The interviewers evaluated them based on eleven factors, such as over-all employability, professional competency, and interpersonal skills. The experimenters then showed the first twenty or so seconds of each interview to untrained observers—the initial meet-and-greet, starting with the interviewee’s knock on the door and ending ten seconds after he was seated, before any questions—and asked them to rate the candidates on the same dimensions. What the researchers found was a high correlation between judgments made by the untrained eye in a matter of seconds and those made by trained interviewers after going through the whole process. On nine of the eleven factors, there was a resounding agreement between the two groups.

This phenomenon is broadly known as “thin-slice” judgment. As early as 1937, Gordon Allport, a pioneer of personality psychology, argued that we constantly form sweeping opinions of others based on incredibly limited information and exposure. Since then, multiple studies have shown the truth of that observation: first impressions are paramount. Once formed, they reliably color every judgment that follows. The exact same interview response, given by two different candidates, would be rated differently if the interviewer already preferred one of them.

Given the failure of typical interviews to predict job performance consistently, what should companies do instead? Two things have been shown to make the interview process more successful. One is using a highly standardized interview process—for instance, asking each candidate the same questions in the same order. This produces a more objective measure of how each candidate fares, and it can reduce the influence of thin-slice judgment, which can alter the way each interview is conducted.

The other solution is to focus on relevant behavioral measures, both in the past and in the future. The ubiquitous interview question “Describe a situation where you did well on X or failed on Y” is an example of a past behavioral measure; asking a programmer to describe how she would solve a particular programming task would be a future measure. Google and many other tech companies may also ask some candidates to write code on the spot, a task that solves the problem of decontextualization by closely approximating what they would do on the job.

To Google’s credit, the company admitted its failure, and moved its interviews in a different direction. Finding the one right candidate in a group is hard, and companies don’t have much time to figure out exactly which questions can help them tell similar-seeming candidates apart. Or, to quote from another of the banned Google questions, “You have eight balls of the same size. Seven of them weigh the same, and one of them weighs slightly more. How can you find the ball that is heavier by using a balance and only two weighings?”
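For readers who want to check their answer, one well-known solution can be sketched in a few lines of Python. This is only an illustration, not anything Google used: the function name `find_heavy_ball` and the list-of-weights representation are assumptions made for the sketch. The idea is to weigh three balls against three, which narrows the search to a group of at most three, and then to weigh two balls from that group against each other.

```python
def find_heavy_ball(weights):
    """Locate the single heavier ball among eight using two weighings.

    `weights` is a hypothetical stand-in for the physical balls: a list of
    eight numbers, seven equal and one slightly larger. Returns a tuple of
    (index of the heavy ball, number of weighings used).
    """
    if len(weights) != 8:
        raise ValueError("expected exactly eight balls")

    weighings = 0

    def weigh(left, right):
        # Simulates the balance: 1 if the left pan is heavier,
        # -1 if the right pan is heavier, 0 if the pans balance.
        nonlocal weighings
        weighings += 1
        l = sum(weights[i] for i in left)
        r = sum(weights[i] for i in right)
        return (l > r) - (l < r)

    # First weighing: balls 0-2 against balls 3-5; balls 6 and 7 sit out.
    first = weigh([0, 1, 2], [3, 4, 5])

    if first == 0:
        # The pans balanced, so the heavy ball is 6 or 7.
        second = weigh([6], [7])
        heavy = 6 if second > 0 else 7
    else:
        # The heavy ball is in whichever group of three went down.
        group = [0, 1, 2] if first > 0 else [3, 4, 5]
        second = weigh([group[0]], [group[1]])
        if second == 0:
            heavy = group[2]  # the ball left off the balance
        elif second > 0:
            heavy = group[0]
        else:
            heavy = group[1]

    return heavy, weighings


if __name__ == "__main__":
    balls = [10, 10, 10, 10, 11, 10, 10, 10]  # made-up weights for the demo
    print(find_heavy_ball(balls))
```

Each weighing has three possible outcomes, so two weighings can tell apart as many as nine cases, which is why eight balls fit comfortably.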

Illustration by Richard McGuire.