
Noise, Daniel Kahneman;Olivier Sibony;Cass R. Sunstein – 5

The potentially high costs of noise reduction often come up in the context of algorithms, where there are growing objections to “algorithmic bias.” As we have seen, algorithms eliminate noise and often seem appealing for that reason. Indeed, much of this book might be taken as an argument for greater reliance on algorithms, simply because they are noiseless. But as we have also seen, noise reduction can come at an intolerable cost if greater reliance on algorithms increases discrimination on the basis of race and gender, or against members of disadvantaged groups. There are widespread fears that algorithms will in fact have that discriminatory consequence, which is undoubtedly a serious risk. In Weapons of Math Destruction, mathematician Cathy O’Neil urges that reliance on big data and decision by algorithm can embed prejudice, increase inequality, and threaten democracy itself. According to another skeptical account, “potentially biased mathematical models are remaking our lives—and neither the companies responsible for developing them nor the government is interested in addressing the problem.” According to ProPublica, an independent investigative journalism organization, COMPAS, an algorithm widely used in recidivism risk assessments, is strongly biased against members of racial minorities. No one should doubt that it is possible—even easy—to create an algorithm that is noise-free but also racist, sexist, or otherwise biased. An algorithm that explicitly uses the color of a defendant’s skin to determine whether that person should be granted bail would discriminate (and its use would be unlawful in many nations). An algorithm that takes account of whether job applicants might become pregnant would discriminate against women. In these and other cases, algorithms could eliminate unwanted variability in judgment but also embed unacceptable bias. In principle, we should be able to design an algorithm that does not take account of race or gender. Indeed, an algorithm could be designed that disregards race or gender entirely. The more challenging problem, now receiving a great deal of attention, is that an algorithm could discriminate and, in that sense, turn out to be biased, even when it does not overtly use race and gender as predictors.

Can algorithms provide a fairer way to make judgements and decisions? Humans are prone to bias, and we tend to be slaves to our emotions in the moment. But it all depends on how you define fair.

If you aim to be logically consistent all the time, using algorithms to make decisions can perpetuate widespread discrimination and take the individual element out of things. While we all already tend to stereotype and make assumptions based on certain visual characteristics, algorithms set that in stone.

If you do want to allow each situation its own judgement, where context is taken into account, inefficiencies and discrepancies are inevitable. What then matters is how we measure those discrepancies and the level of discrepancy we are willing to accept.

While it might seem that the book is saying that humans are guilty of noise, purely resorting to algorithms and rules will cause many people to be discriminated against even more heavily, without consideration for context. Now that seems like dystopian fiction come true.
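The excerpt's last point is worth making concrete. Below is a minimal synthetic sketch (all variable names and numbers are invented for illustration, not taken from the book or from COMPAS): a deterministic, "noise-free" scoring rule that never reads the protected attribute, yet still produces very different approval rates across groups because a proxy variable is correlated with group membership.

```python
import numpy as np

# Synthetic illustration only: `group` is the protected attribute and is
# never shown to the scoring rule; `neighborhood` is a made-up proxy
# that happens to be correlated with it.
rng = np.random.default_rng(0)
n = 100_000
group = rng.integers(0, 2, n)
# Group 1 is far more likely to live in neighborhood 1 (the proxy).
neighborhood = (rng.random(n) < np.where(group == 1, 0.8, 0.2)).astype(int)
income = rng.normal(55, 10, n)  # same income distribution for both groups

def approve(neighborhood, income):
    # Deterministic rule: identical inputs always yield identical outputs,
    # so there is zero noise. `group` is never consulted.
    return income - 15 * neighborhood > 45

decisions = approve(neighborhood, income)
for g in (0, 1):
    print(f"group {g}: approval rate = {decisions[group == g].mean():.1%}")
# Prints roughly 74% for group 0 vs 41% for group 1: no noise, plenty of bias.
```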

Noise, Daniel Kahneman;Olivier Sibony;Cass R. Sunstein – 4

The only measure of cognitive style or personality that they found to predict forecasting performance was another scale, developed by psychology professor Jonathan Baron to measure “actively open-minded thinking.” To be actively open-minded is to actively search for information that contradicts your preexisting hypotheses. Such information includes the dissenting opinions of others and the careful weighing of new evidence against old beliefs. Actively open-minded people agree with statements like this: “Allowing oneself to be convinced by an opposing argument is a sign of good character.” They disagree with the proposition that “changing your mind is a sign of weakness” or that “intuition is the best guide in making decisions.” In other words, while the cognitive reflection and need for cognition scores measure the propensity to engage in slow and careful thinking, actively open-minded thinking goes beyond that. It is the humility of those who are constantly aware that their judgment is a work in progress and who yearn to be corrected. We will see in chapter 21 that this thinking style characterizes the very best forecasters, who constantly change their minds and revise their beliefs in response to new information. Interestingly, there is some evidence that actively open-minded thinking is a teachable skill. We do not aim here to draw hard-and-fast conclusions about how to pick individuals who will make good judgments in a given domain. But two general principles emerge from this brief review. First, it is wise to recognize the difference between domains in which expertise can be confirmed by comparison with true values (such as weather forecasting) and domains that are the province of respect-experts. A political analyst may sound articulate and convincing, and a chess grandmaster may sound timid and unable to explain the reasoning behind some of his moves. Yet we probably should treat the professional judgment of the former with more skepticism than that of the latter. Second, some judges are going to be better than their equally qualified and experienced peers. If they are better, they are less likely to be biased or noisy. Among many things that explain these differences, intelligence and cognitive style matter. Although no single measure or scale unambiguously predicts judgment quality, you may want to look for the sort of people who actively search for new information that could contradict their prior beliefs, who are methodical in integrating that information into their current perspective, and who are willing, even eager, to change their minds as a result. The personality of people with excellent judgment may not fit the generally accepted stereotype of a decisive leader. People often tend to trust and like leaders who are firm and clear and who seem to know, immediately and deep in their bones, what is right. Such leaders inspire confidence. But the evidence suggests that if the goal is to reduce error, it is better for leaders (and others) to remain open to counterarguments and to know that they might be wrong. If they end up being decisive, it is at the end of a process, not at the start.

Making good judgements inherently requires someone to be indecisive and to find fault in their own initial judgements. However, we tend to be attracted to people who make bold, daring claims about the future or about situations. To us, that blind confidence equals leadership material.

While it might be uninspiring to follow someone who can’t make up their mind, it might also not be the wisest to follow someone who seemingly has it all figured out. This is especially true when it comes to predicting the future. The best way to ensure the highest accuracy in judgement calls is to constantly question yourself and be your own devil’s advocate.

Changing your mind and being actively open-minded are good things if not done in excess, and we should probably be more accepting of them if we truly want to follow leaders with good judgement.

Noise, Daniel Kahneman;Olivier Sibony;Cass R. Sunstein – 3

First, we have several estimates of the relative weights of level noise and pattern noise. Overall, it appears that pattern noise contributes more than level noise. In the insurance company of chapter 2, for instance, differences between underwriters in the average of the premiums they set accounted for only 20% of total system noise; the remaining 80% was pattern noise. Among the federal judges of chapter 6, level noise (differences in average severity) represented slightly less than half of total system noise; pattern noise was the larger component. In the punitive damages experiment, the total amount of system noise varied widely depending on the scale used (punitive intent, outrage, or damages in dollars), but the share of pattern noise in that total was roughly constant: it accounted for 63%, 62%, and 61% of total system noise for the three scales used in the study. Other studies we will review in part 5, notably on personnel decisions, are consistent with this tentative conclusion. The fact that in these studies level noise is generally not the larger component of system noise is already an important message, because level noise is the only form of noise that organizations can (sometimes) monitor without conducting noise audits. When cases are assigned more or less randomly to individual professionals, the differences in the average level of their decisions provide evidence of level noise. For example, studies of patent offices observed large differences in the average propensity of examiners to grant patents, with subsequent effects on the incidence of litigation about these patents. Similarly, case officers in child protection services vary in their propensity to place children in foster care, with long-term consequences for the children’s welfare. These observations are based solely on an estimation of level noise. If there is more pattern noise than level noise, then these already-shocking findings understate the magnitude of the noise problem by at least a factor of two. (There are exceptions to this tentative rule. The scandalous variability in the decisions of asylum judges is almost certainly due more to level noise than to pattern noise, which we suspect is large as well.) The next step is to analyze pattern noise by separating its two components. There are good reasons to assume that stable pattern noise, rather than occasion noise, is the dominant component. The audit of the sentences of federal judges illustrates our reasoning. Start with the extreme possibility that all pattern noise is transient. On that assumption, sentencing would be unstable and inconsistent over time, to an extent that we find implausible: we would have to expect that the average difference between judgments of the same case by the same judge on different occasions is about 2.8 years. The variability of average sentencing among judges is already shocking. The same variability in the sentences of an individual judge over occasions would be grotesque. It seems more reasonable to conclude that judges differ in their reactions to different defendants and different crimes and that these differences are highly personal but stable. To quantify more precisely how much of pattern noise is stable and how much is occasion noise, we need studies in which the same judges make two independent assessments of each case. 
As we have noted, obtaining two independent judgments is generally impossible in studies of judgment, because it is difficult to guarantee that the second judgment of a case is truly independent of the first. Especially when the judgment is complex, there is a high probability that the individual will recognize the problem and repeat the original judgment. A group of researchers at Princeton, led by Alexander Todorov, has designed clever experimental techniques to overcome this problem. They recruited participants from Amazon Mechanical Turk, a site where individuals provide short-term services, such as answering questionnaires, and are paid for their time. In one experiment, participants viewed pictures of faces (generated by a computer program, but perfectly indistinguishable from the faces of real people) and rated them on various attributes, such as likability and trustworthiness. The experiment was repeated, with the same faces and the same respondents, one week later. It is fair to expect less consensus in this experiment than in professional judgments such as those of sentencing judges. Everyone might agree that some people are extremely attractive and that others are extremely unattractive, but across a significant range, we expect reactions to faces to be largely idiosyncratic. Indeed, there was little agreement among observers: on the ratings of trustworthiness, for instance, differences among pictures accounted for only 18% of the variance of judgments. The remaining 82% of the variance was noise. It is also fair to expect less stability in these judgments, because the quality of judgments made by participants who are paid to answer questions online is often substantially lower than in professional settings. Nevertheless, the largest component of noise was stable pattern noise. The second largest component of noise was level noise—that is, differences among observers in their average ratings of trustworthiness. Occasion noise, though still substantial, was the smallest component. The researchers reached the same conclusions when they asked participants to make other judgments—about preferences among cars or foods, for example, or on questions that are closer to what we call professional judgments. For instance, in a replication of the study of punitive damages discussed in chapter 15, participants rated their punitive intent in ten cases of personal injury, on two separate occasions separated by a week. Here again, stable pattern noise was the largest component. In all these studies, individuals generally did not agree with one another, but they remained quite stable in their judgments. This “consistency without consensus,” in the researchers’ words, provides clear evidence of stable pattern noise. The strongest evidence for the role of stable patterns comes from the large study of bail judges we mentioned in chapter 10. In one part of this exceptional study, the authors created a statistical model that simulated how each judge used the available cues to decide whether to grant bail. They built custom-made models of 173 judges. Then they applied the simulated judges to make decisions about 141,833 cases, yielding 173 decisions for each case—a total of more than 24 million decisions. 
At our request, the authors generously carried out a special analysis in which they separated the variance of judgments into three components: the “true” variance of the average decisions for each of the cases, the level noise created by differences among judges in their propensity to grant bail, and the remaining pattern noise. This analysis is relevant to our argument because pattern noise, as measured in this study, is entirely stable. The random variability of occasion noise is not represented, because this is an analysis of models that predict a judge’s decision. Only the verifiably stable individual rules of prediction are included. The conclusion was unequivocal: this stable pattern noise was almost four times larger than level noise (stable pattern noise accounted for 26%, and level noise 7%, of total variance). The stable, idiosyncratic individual patterns of judgment that could be identified were much larger than the differences in across-the-board severity. All this evidence is consistent with the research on occasion noise that we reviewed in chapter 7: while the existence of occasion noise is surprising and even disturbing, there is no indication that within-person variability is larger than between-person differences. The most important component of system noise is the one we had initially neglected: stable pattern noise, the variability among judges in their judgments of particular cases. Given the relative scarcity of relevant research, our conclusions are tentative, but they do reflect a change in how we think about noise—and about how to tackle it. In principle at least, level noise—or simple, across-the-board differences between judges—should be a relatively easy problem to measure and address. If there are abnormally “tough” graders, “cautious” child custody officers, or “risk-averse” loan officers, the organizations that employ them could aim to equalize the average level of their judgments. Universities, for instance, address this problem when they require professors to abide by a predetermined distribution of grades within each class. Unfortunately, as we now realize, focusing on level noise misses a large part of what individual differences are about. Noise is mostly a product not of level differences but of interactions: how different judges deal with particular defendants, how different teachers deal with particular students, how different social workers deal with particular families, how different leaders deal with particular visions of the future. Noise is mostly a by-product of our uniqueness, of our “judgment personality.” Reducing level noise is still a worthwhile objective, but attaining only this objective would leave most of the problem of system noise without a solution.

Much longer excerpt today, but this is the quintessential part of the book that can’t be neglected, so do take some time to read through it. I’ll walk you through the key message, as I had to re-read it a few times to understand it myself.

Essentially, to generalize: the biggest contributor to noise is pattern noise. Let me reiterate. There are 2 main kinds of noise, as defined in the book, that contribute to overall system noise:

  1. Level Noise (difference in average judgements between different people)
  2. Pattern Noise (variability in individual judges’ responses to particular cases)

Every human is different, and while we can each have a different average level of judgement (harsh or lenient) in general, the biggest contributor to overall system noise is Pattern Noise. Yet this is the one that is hard to measure: we usually study the averages of different judges’ outcomes, not each case individually.

Most efforts to deal with bias come from studies that only look at Level Noise, as it is easier to measure and study, leaving out the component that contributes the most variability. We analyze the differences between judges, but we should also be studying the different judgements they make across individual cases and situations. To do so is to acknowledge and understand that each of us is unique in our characteristics. Differences in environment or situation can trigger drastically different responses even from 2 seemingly similar people.
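To make the decomposition concrete, here is a toy simulation (all parameters are made up; I picked them so the split lands near the 20/80 underwriter example in the excerpt). Each judgment is modeled as the case’s “true” value, plus the judge’s overall severity (the level component), plus a stable judge-by-case interaction (the pattern component). Note that separating stable pattern noise from occasion noise would additionally require repeated judgments of the same cases, which is exactly the difficulty the excerpt describes.

```python
import numpy as np

# Toy model: judgments[j, c] = case value + judge's overall level + judge-by-case pattern.
rng = np.random.default_rng(42)
n_judges, n_cases = 50, 200
case_true   = rng.normal(5.0, 2.0, n_cases)               # true differences between cases
judge_level = rng.normal(0.0, 0.5, n_judges)              # each judge's overall severity
pattern     = rng.normal(0.0, 1.0, (n_judges, n_cases))   # idiosyncratic judge-x-case reactions
judgments   = case_true[None, :] + judge_level[:, None] + pattern

grand_mean  = judgments.mean()
judge_means = judgments.mean(axis=1)    # each judge's average level
case_means  = judgments.mean(axis=0)    # each case's average judgment

level_var   = np.var(judge_means)       # level noise: spread of the judges' averages
residual    = judgments - judge_means[:, None] - case_means[None, :] + grand_mean
pattern_var = np.var(residual)          # pattern noise: what level + case effects can't explain
system_var  = level_var + pattern_var

print(f"level noise share:   {level_var / system_var:.0%}")    # ~20% with these numbers
print(f"pattern noise share: {pattern_var / system_var:.0%}")  # ~80%
```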

Noise, Daniel Kahneman;Olivier Sibony;Cass R. Sunstein – 2

Two researchers, Edward Vul and Harold Pashler, had the idea of asking people to answer this question (and many similar ones) not once but twice. The subjects were not told the first time that they would have to guess again. Vul and Pashler’s hypothesis was that the average of the two answers would be more accurate than either of the answers on its own. The data proved them right. In general, the first guess was closer to the truth than the second, but the best estimate came from averaging the two guesses. Vul and Pashler drew inspiration from the well-known phenomenon known as the wisdom-of-crowds effect: averaging the independent judgments of different people generally improves accuracy. In 1907, Francis Galton, a cousin of Darwin and a famous polymath, asked 787 villagers at a country fair to estimate the weight of a prize ox. None of the villagers guessed the actual weight of the ox, which was 1,198 pounds, but the mean of their guesses was 1,200, just 2 pounds off, and the median (1,207) was also very close. The villagers were a “wise crowd” in the sense that although their individual estimates were quite noisy, they were unbiased. Galton’s demonstration surprised him: he had little respect for the judgment of ordinary people, and despite himself, he urged that his results were “more creditable to the trustworthiness of a democratic judgment than might have been expected.” Similar results have been found in hundreds of situations. Of course, if questions are so difficult that only experts can come close to the answer, crowds will not necessarily be very accurate. But when, for instance, people are asked to guess the number of jelly beans in a transparent jar, to predict the temperature in their city one week out, or to estimate the distance between two cities in a state, the average answer of a large number of people is likely to be close to the truth. The reason is basic statistics: averaging several independent judgments (or measurements) yields a new judgment, which is less noisy, albeit not less biased, than the individual judgments. Vul and Pashler wanted to find out if the same effect extends to occasion noise: can you get closer to the truth by combining two guesses from the same person, just as you do when you combine the guesses of different people? As they discovered, the answer is yes. Vul and Pashler gave this finding an evocative name: the crowd within. Averaging two guesses by the same person does not improve judgments as much as does seeking out an independent second opinion. As Vul and Pashler put it, “You can gain about 1/10th as much from asking yourself the same question twice as you can from getting a second opinion from someone else.” This is not a large improvement. But you can make the effect much larger by waiting to make a second guess. When Vul and Pashler let three weeks pass before asking their subjects the same question again, the benefit rose to one-third the value of a second opinion. Not bad for a technique that does not require any additional information or outside help. And this result certainly provides a rationale for the age-old advice to decision makers: “Sleep on it, and think again in the morning.”

The wisdom of crowds. So long as we do not fall for the herd effect or allow opinions to cascade, and instead keep them independent, aggregated opinions usually work better.

In fact, what this excerpt says is that generating a “second opinion” on your own improves your probability of a right guess. That’s when advice such as “sleep on it” makes sense. When making important, non-time-critical decisions, give yourself some time to reset and make another guess.

The key thing is to know when to rely on the wisdom of crowds versus the opinions of experts. For niche topics that require advanced technical knowledge, it perhaps makes more sense to aggregate guesses from a crowd of experts rather than from everyday people.
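The statistics behind both effects are easy to simulate. Here is a sketch under assumed numbers (the ox weight is Galton’s; the bias and noise levels are mine): averaging many independent guesses shrinks noise by roughly the square root of the crowd size while leaving bias untouched, and averaging two guesses from the same person helps far less because they share a stable personal error.

```python
import numpy as np

# Illustrative model: each guess = truth + shared bias + random error.
rng = np.random.default_rng(1)
truth, bias, noise_sd = 1198.0, 20.0, 150.0   # ox weighs 1,198 lb; bias/noise assumed
n_trials, crowd_size = 10_000, 100

def rmse(estimates):
    return np.sqrt(np.mean((estimates - truth) ** 2))

# Wisdom of crowds: noise shrinks ~sqrt(N); the shared bias stays.
guesses = truth + bias + rng.normal(0, noise_sd, (n_trials, crowd_size))
print(f"one guess:            RMSE = {rmse(guesses[:, 0]):6.1f}")        # ~151
print(f"mean of 100 guesses:  RMSE = {rmse(guesses.mean(axis=1)):6.1f}") # ~25 (the bias floor)

# Crowd within: two guesses by the same person share a stable personal error,
# so averaging them cancels only the occasion-specific part.
personal = rng.normal(0, 120.0, n_trials)        # stable per-person error (assumed)
occasion = rng.normal(0, 90.0, (n_trials, 2))    # varies from occasion to occasion
own_two  = truth + bias + personal[:, None] + occasion
print(f"own first guess:      RMSE = {rmse(own_two[:, 0]):6.1f}")          # ~151
print(f"avg of own 2 guesses: RMSE = {rmse(own_two.mean(axis=1)):6.1f}")   # ~137: modest gain
```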

Noise, Daniel Kahneman;Olivier Sibony;Cass R. Sunstein – 1

Before reading on, you may want to think of your own answer to the following questions: In a well-run insurance company, if you randomly selected two qualified underwriters or claims adjusters, how different would you expect their estimates for the same case to be? Specifically, what would be the difference between the two estimates, as a percentage of their average? We asked numerous executives in the company for their answers, and in subsequent years, we have obtained estimates from a wide variety of people in different professions. Surprisingly, one answer is clearly more popular than all others. Most executives of the insurance company guessed 10% or less. When we asked 828 CEOs and senior executives from a variety of industries how much variation they expected to find in similar expert judgments, 10% was also the median answer and the most frequent one (the second most popular was 15%). A 10% difference would mean, for instance, that one of the two underwriters set a premium of $9,500 while the other quoted $10,500. Not a negligible difference, but one that an organization can be expected to tolerate. Our noise audit found much greater differences. By our measure, the median difference in underwriting was 55%, about five times as large as was expected by most people, including the company’s executives. This result means, for instance, that when one underwriter sets a premium at $9,500, the other does not set it at $10,500—but instead quotes $16,700. For claims adjusters, the median ratio was 43%. We stress that these results are medians: in half the pairs of cases, the difference between the two judgments was even larger. The executives to whom we reported the results of the noise audit were quick to realize that the sheer volume of noise presented an expensive problem. One senior executive estimated that the company’s annual cost of noise in underwriting—counting both the loss of business from excessive quotes and the losses incurred on underpriced contracts—was in the hundreds of millions of dollars. No one could say precisely how much error (or how much bias) there was, because no one could know for sure the Goldilocks value for each case. But no one needed to see the bull’s-eye to measure the scatter on the back of the target and to realize that the variability was a problem. The data showed that the price a customer is asked to pay depends to an uncomfortable extent on the lottery that picks the employee who will deal with that transaction. To say the least, customers would not be pleased to hear that they were signed up for such a lottery without their consent. More generally, people who deal with organizations expect a system that reliably delivers consistent judgments. They do not expect system noise.

A follow-up book from Daniel Kahneman (of Thinking, Fast and Slow fame), written alongside 2 co-authors. This book looks at the issue of noise and how it impacts governments, institutions, organizations, and all of us.

It aims to break down system noise (defined as undesirable variability in judgments of the same case by multiple individuals, as in the case of the law) into 2 main components:

  • Level noise is variability in the average level of judgments by different judges.
  • Pattern noise is variability in judges’ responses to particular cases.

Overall, I think there are some really interesting implications in this book that I’ll share more of in the upcoming excerpts. Ultimately, every system and institution is inherently “noisy”. What is important is finding the right balance between installing rigid, inflexible processes and leaving things totally up to the random, subjective judgement of each individual.
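As a quick check on the excerpt’s arithmetic, here is the noise-audit metric, the difference between two estimates expressed as a percentage of their average, in a few lines of Python (my reading of the formula, not code from the book):

```python
def relative_difference(a: float, b: float) -> float:
    """Difference between two estimates as a fraction of their average."""
    return abs(a - b) / ((a + b) / 2)

print(f"{relative_difference(9_500, 10_500):.0%}")  # 10%: what most executives expected
print(f"{relative_difference(9_500, 16_700):.0%}")  # 55%: the median the audit actually found
```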

No Rules Rules, Reed Hastings;Erin Meyer – 5

As with all the dimensions of culture, when it comes to giving feedback internationally everything is relative. The Japanese find the Singaporeans unnecessarily direct. The Americans find the Singaporeans opaque and lacking transparency. The Singaporeans who join Netflix are shocked at their American colleagues’ bluntness. To many a Dutch person, the Americans at Netflix don’t feel particularly direct at all. Netflix, despite its multinational desires, continues to have a largely American-centric culture. And when it comes to giving negative feedback, Americans are more direct than many cultures but considerably less direct than the Dutch culture. Dutch director of public policy Ise, who joined Netflix Amsterdam in 2014, explains the difference like this: The Netflix culture has succeeded in creating an environment where feedback is frequent and actionable. Yet when an American gives feedback, even at Netflix, they almost always start by telling you what’s good about your work before telling you what they really want to say. Americans learn things like, “Always give three positives with every negative” and “Catch employees doing things right.” This is confusing for a Dutch person, who will give you positive feedback or negative feedback but is unlikely to do both in the same conversation. At Netflix, Ise quickly learned that the manner of giving feedback that would be natural and comfortable in her own Dutch culture was too blunt for her American collaborators: Donald, my American colleague who had recently moved to the Netherlands, was hosting a meeting in Amsterdam. Seven non-Netflix partners had taken planes and trains from around Europe for the discussions. The meeting went very well. Donald was articulate, detailed, and persuasive. His preparation was evident. But several times I could tell other participants wanted to share their own perspective but didn’t have the opportunity, because Donald talked so much. After the meeting Donald said to me, “I thought that went great. What did you think?” This seemed to me like a perfect time to give that candid feedback Netflix leaders are always preaching about so I jumped in: “Stinne came all the way from Norway to attend the meeting but you spoke so much she couldn’t get a word in edgewise. We asked these people to take planes and trains, and then they didn’t get time to speak. We didn’t hear all of the opinions that could have helped us. You talked for 80 percent of the meeting, making it difficult for anyone else to say anything at all.” She was about to move on to the part of the feedback where she gives actionable suggestions for future improvement when Donald did something that Ise feels is typical of Americans: Before I’d even finished, he groaned and looked crestfallen. He took my feedback way too harshly, as Americans often do. He said, “Oh my gosh, I’m so sorry for having messed this all up.” But he hadn’t “messed it all up.” That’s not what I said. The meeting was a success and he showed he knew that by saying, “That went great.” There was just this one aspect that was not good, and I felt understanding that could help him improve. That’s what frustrates me about my American colleagues. As often as they give feedback and as eager as they are to hear it, if you don’t start by saying something positive they think the entire thing was a disaster. As soon as a Dutch person jumps in with the negative first, the American kills the critique by thinking the whole thing has gone to hell. 
In her past five years at Netflix, Ise has learned a lot about giving feedback to international colleagues, especially Americans: Now that I better understand these cultural tendencies, I give the feedback just as frequently, but I think carefully about the person receiving the message and how to adapt to get the results I’m hoping for. With more indirect cultures I start by sprinkling the ground with a few light positive comments and words of appreciation. If the work has been overall good I state that enthusiastically up front. Then I ease into the feedback with “a few suggestions.” Then I wrap up by stating, “This is just my opinion, for whatever it is worth,” and “You can take it or leave it.” The elaborate dance is quite humorous from a Dutch person’s point of view . . . but it certainly gets the desired results! Ise’s words sum up the strategies Netflix learned for promoting candor as they opened offices around the world. When you are leading a global team, as you Skype with your employees in different cultures, your words will be magnified or minimized based on your listener’s cultural context. So you have to be aware. You have to be strategic. You have to be flexible. With a little information and a little finesse, you can modify the feedback to the person you’re speaking with in order to get the results that you need.

The final excerpt I’ll be sharing from this book. Among all the other things mentioned about Netflix’s core company culture, flexibility is still required, especially when working in international environments. The key is finding the balance between holding on to the company’s core values and allowing some room to adjust to different cultures.

The authors speak quite a bit about understanding the nuances of each country and giving some leeway to accommodate them, while still trying to tie in the core values they have set out.

No Rules Rules, Reed Hastings;Erin Meyer – 4

Research backs up Reed’s claims about the positive ramifications of the leader speaking openly about mistakes. In her book, Daring Greatly: How the Courage to Be Vulnerable Transforms the Way We Live, Love, Parent, and Lead, Brené Brown explains, based on her own qualitative studies, that “we love seeing raw truth and openness in other people, but we are afraid to let them see it in us. . . . Vulnerability is courage in you and inadequacy in me.” Anna Bruk and her team at the University of Mannheim in Germany wondered if they could replicate Brown’s findings quantitatively. They asked subjects to imagine themselves in a variety of vulnerable situations—such as being the first to apologize after a big fight and admitting that you made a serious mistake to your team at work. When people imagined themselves in those situations, they tended to believe that showing vulnerability would make them appear “weak” and “inadequate.” But when people imagined someone else in the same situations, they were more likely to describe showing vulnerability as “desirable” and “good.” Bruk concluded that honesty about mistakes is good for relationships, health, and job performance. On the other hand, there is also research showing that if someone is already viewed as ineffective, they only deepen that opinion by highlighting their own mistakes. In 1966, psychologist Elliot Aronson ran an experiment. He asked students to listen to recordings of candidates interviewing to be part of a quiz-bowl team. Two of the candidates showed how smart they were by answering most of the questions correctly, while the other two answered only 30 percent right. Then, one group of students heard an explosion of clanging dishes, followed by one of the smart candidates saying, “Oh my goodness—I’ve spilled coffee all over my new suit.” Another group of students heard the same clamor, but then heard one of the mediocre candidates saying he spilled the coffee. Afterward, the students said they liked the smart candidate even more after he embarrassed himself. But the opposite was true of the mediocre candidate. The students said they liked him even less after seeing him in a vulnerable situation. This tendency has a name: the pratfall effect. The pratfall effect is the tendency for someone’s appeal to increase or decrease after making a mistake, depending on his or her perceived ability to perform well in general. In one study conducted by Professor Lisa Rosh from Lehman College, a woman introduced herself, not by mentioning her credentials and education, but by talking about how she’d been awake the previous night caring for her sick baby. It took her months to reestablish her credibility. If this same woman was first presented as a Nobel Prize winner, the exact same words about being up all night with the baby would provoke reactions of warmth and connection from the audience. When you combine the data with Reed’s advice, this is the takeaway: a leader who has demonstrated competence and is liked by her team will build trust and prompt risk-taking when she widely sunshines her own mistakes. Her company benefits. The one exception is for a leader considered unproven or untrusted. In these cases you’ll want to build trust in your competency before shouting your mistakes.

I first read about the pratfall effect in the autobiography (titled Not by Chance Alone) of Elliot Aronson, a psychologist known for his research on cognitive dissonance, among many other things. I strongly recommend it, as it’s one of the few books I managed to finish in 2 or fewer sittings, and it offers a fascinating insight into the life of a psychology researcher.

Back to the point at hand: the key takeaway is that if your colleagues already think you are competent, a few mishaps here and there actually make you more relatable and hence more likeable. That is why a good body of work, such as your CV, helps you establish credibility and buys you a lot more leeway at the start.

For those who are unproven, you’ll have to be extra careful and focus on building trust first. Ultimately, people try to justify the preconceptions they already have about you based on first impressions. For those who enter situations unproven, the deck is stacked against you. While we all wish we could let our body of work speak for itself, there is just no getting past basic human psychology and biases.

No Rules Rules, Reed Hastings;Erin Meyer – 3

We spent hours coming up with the right performance objectives and trying to link them to pay. Patty suggested we link the bonus of our chief marketing officer, Leslie Kilgore, to the number of new customers we signed on. Before Netflix, Leslie had worked for Booz Allen Hamilton, Amazon, and Procter & Gamble. Her compensation at all these places was metric oriented, with compensation tied to achieving predefined objectives. So she seemed a good person to start with. We wrote down Key Performance Indicators (KPIs) to calculate how much extra Leslie should make if she achieved her goals. At the meeting I congratulated Leslie on the thousands of new customers we’d recently signed on. I was about to announce how this would bring her a huge bonus if she continued like that, when she interrupted me. “Yes, Reed, it’s remarkable. My team has done an incredible job. But the number of customers we sign on is no longer what we should be measuring. In fact, it’s irrelevant.” She went on to show us numerically that, while new customers had been the most important goal last quarter, it was now the customer retention rate that really mattered. As I listened, I felt a wave of relief. Thankfully, I hadn’t already tied Leslie’s bonus to the wrong measure of success. I learned from that exchange with Leslie that the entire bonus system is based on the premise that you can reliably predict the future, and that you can set an objective at any given moment that will continue to be important down the road. But at Netflix, where we have to be able to adapt direction quickly in response to rapid changes, the last thing we want is our employees rewarded in December for attaining some goal fixed the previous January. The risk is that employees will focus on a target instead of spot what’s best for the company in the present moment. Many of our Hollywood-based employees come from studios like WarnerMedia or NBC, where a big part of executive compensation is based on specific financial performance metrics. If this year the target is to increase operating profit by 5 percent, the way to get your bonus—often a quarter of annual pay—is to focus doggedly on increasing operational profit. But what if, in order to be competitive five years down the line, a division needs to change course? Changing course involves investment and risk that may reduce this year’s profit margin. The stock price might go down with it. What executive would do that? That’s why a company like WarnerMedia or NBC may not be able to change dramatically with the times, the way we’ve often done at Netflix. Beyond that, I don’t buy the idea that if you dangle cash in front of your high-performing employees, they try harder. High performers naturally want to succeed and will devote all resources toward doing so whether they have a bonus hanging in front of their nose or not. I love this quote from former chief executive of Deutsche Bank John Cryan: “I have no idea why I was offered a contract with a bonus in it because I promise you I will not work any harder or any less hard in any year, in any day because someone is going to pay me more or less.” Any executive worth her paycheck would say the same.

Tying performance to bonuses sounds good on paper, but it doesn’t usually work out in the best interests of the company. In fast-growing, fluid environments, having incentives for certain targets can backfire when those targets become meaningless as the business focus or environment changes.

This doesn’t even account for the fact that the moment any indicator becomes a target, employees will begin to find ways to game the system and maximize their earnings.

Unless you can guarantee that your targets will always be 100% in the best interests of your company, today and for the foreseeable future, performance bonuses might end up causing more trouble and motivating employees toward the wrong goals. Nevertheless, this might not apply to sales roles, where targets and outcomes are clear.

That is why Netflix chooses to forgo performance bonuses and instead pay top-of-market monthly wages.

No Rules Rules, Reed Hastings;Erin Meyer – 2

But this is the most important message of this chapter: even if your employees spend a little more when you give them freedom, the cost is still less than having a workplace where they can’t fly. If you limit their choices by making them check boxes and ask for permission, you won’t just frustrate your people, you’ll lose out on the speed and flexibility that comes from a low-rule environment. One of my favorite examples is from 2014, when a junior engineer saw a problem that needed to be solved. Friday morning, April 8, Nigel Baptiste, director of partner engagement, arrived at the Netflix Silicon Valley office at 8:15 a.m. It was a warm, sunny day, and Nigel whistled as he grabbed a cup of coffee in the open kitchen on the fourth floor and strolled back to the area where he and his team test Netflix streaming on TVs made by official partners like Samsung and Sony. But when Nigel arrived at his work space, he stopped whistling and froze. What he saw, or, rather, what he didn’t see, sent him into a panic. He remembers it like this: Netflix had invested a big chunk of money so that our customers could watch House of Cards on new 4K ultra high definition TVs. The problem was that until this moment basically no TVs supported 4K. We had this fresh super-crisp look, but few could see it. Now, our partner Samsung had come out with the only 4K television so far on the market. These TVs were expensive, and it wasn’t clear if customers would buy them. My big goal that year was to work with Samsung to get lots of people watching House of Cards in 4K. We had a minor media coup when journalist Geoffrey Fowler, who reviews high-tech products for the Wall Street Journal and has about two million readers, agreed to test House of Cards on Samsung’s new TV. His review would need to be great for 4K to take off. On Thursday Samsung engineers had come to Netflix with the 4K TV and checked it with my engineers to make sure Mr. Fowler would have a terrific viewing experience. Thursday evening, the TV tested, we all went home. But Friday morning, when I arrived at the office, the TV was gone. After checking with facilities, I realized it had been disposed of with a bunch of old TVs we’d told them to get rid of. This was serious. That TV was due in Fowler’s living room in two hours. It was too late to call the Samsung people. We’d have to buy another TV before ten a.m. I started calling every electronic store in town. The first three calls resulted in: “I’m sorry sir, we don’t have that TV.” My heart was pounding in my throat. We were going to miss the deadline. I was almost in tears when Nick, the most junior engineer on our team, sprinted into the office. “Don’t worry, Nigel,” Nick said. “I solved that. I came in last night, and I saw the TV had been disposed of. You didn’t respond to my calls and texts. So I drove out to the Best Buy in Tracy, bought the same TV, and tested it this morning. It cost twenty-five hundred dollars, but I thought it was the right thing to do.” I was floored. Two and a half thousand dollars! Imagine, a junior engineer feeling so empowered that he spends that much without approval because he thinks it’s the right decision. I felt a wave of relief. Due to all the sign-off policies this could never have happened at Microsoft, HP, or any other company I have worked for. In the end, Fowler loved the high-definition streaming and wrote in his April 16 Wall Street Journal article: “Even the unflappable Francis Underwood perspires in ultra-high definition. 
I spotted sweat on the upper lip of Kevin Spacey’s fictitious vice president while streaming Netflix’s ‘House of Cards.’” I don’t want rules that prevent employees from making good decisions in a timely way. Fowler’s review was worth hundreds of times more to both Netflix and Samsung than that TV. Nick had just five words to guide his actions: “Act in Netflix’s best interest.” That freedom enabled him to use good judgment to do what was right for the company. But freedom isn’t the only benefit of removing your expense policy. The second benefit is that the lack of process speeds everything up.

Netflix operates without an approval policy for expenses and claims. In doing so, it aims to prioritize speed and empowerment over an insignificant amount of cost savings. The rationale: having hired the “best talent” in town, it would not make sense to hamper their decision-making and slow them down with all this red tape.

Also, as mentioned earlier in the book, context matters. There were employees who were audited and found to have abused the system; these employees were fired to set an example.

I’m sure many of us have been in situations where having to justify our expenses became a hassle and, in some cases, actually led us to spend more of the company’s money. This policy might work, but it definitely requires discipline and awareness from management when implementing it. What do you think?

No Rules Rules, Reed Hastings;Erin Meyer – 1

Every employee has some talent. When we’d been 120 people, we had some employees who were extremely talented and others who were mildly talented. Overall we had a fair amount of talent dispersed across the workforce. After the layoffs, with only the most talented eighty people, we had a smaller amount of talent overall, but the amount of talent per employee was greater. Our talent “density” had increased. We learned that a company with really dense talent is a company everyone wants to work for. High performers especially thrive in environments where the overall talent density is high. Our employees were learning more from one another and teams were accomplishing more—faster. This was increasing individual motivation and satisfaction and leading the entire company to get more done. We found that being surrounded by the best catapulted already good work to a whole new level. Most important, working with really talented colleagues was exciting, inspiring, and a lot of fun—something that remains as true today with the company at seven thousand employees as it was back then at eighty. In hindsight, I understood that a team with one or two merely adequate performers brings down the performance of everyone on the team. If you have a team of five stunning employees and two adequate ones, the adequate ones will:

  • sap managers’ energy, so they have less time for the top performers,
  • reduce the quality of group discussions, lowering the team’s overall IQ,
  • force others to develop ways to work around them, reducing efficiency,
  • drive staff who seek excellence to quit, and
  • show the team you accept mediocrity, thus multiplying the problem.

For top performers, a great workplace isn’t about a lavish office, a beautiful gym, or a free sushi lunch. It’s about the joy of being surrounded by people who are both talented and collaborative. People who can help you be better. When every member is excellent, performance spirals upward as employees learn from and motivate one another.

Introducing the concept of “talent density”. This is a slight detour from my previous books, as I sometimes like to take some time to read such autobiographies from business leaders. I usually take insights from these books with a pinch of salt though, especially because of the narrative fallacy.

Regardless, reading about “talent density” just makes so much sense, especially from my personal anecdotal experiences. I don’t think I’ve ever met anyone who left a company because there weren’t free meals, a well-stocked pantry, or great amenities. Instead, people start leaving when they observe “inequities”, especially when it comes to workload or talent.

Whenever people feel like they are carrying an undue burden at work, the high performers tend to leave, and this creates a negative feedback loop where the ones who “get carried” stay in the company, causing even more high performers to quit. Sometimes these high performers might not even be recognized as such by senior management, as they get the job done but lack the ability to sell themselves.

This excerpt talks about how Netflix tried to break this feedback loop and create a positive cycle instead.