class: center, middle, inverse, title-slide

# Balance in Bayesian Analysis
### Dr. Dogucu

---

layout: true

<div class="my-header"></div>

<div class="my-footer"> From Bayes Rules! book Copyright © Drs. Alicia Johnson, Miles Ott & Mine Dogucu. <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/">CC BY-NC-SA 4.0</a></div>

---

__Bechdel Test__

Alison Bechdel’s 1985 comic Dykes to Watch Out For has a strip called [The Rule](https://www.npr.org/templates/story/story.php?storyId=94202522?storyId=94202522) where a person states that they only go to a movie if it satisfies the following three rules:

- the movie has to have at least two women in it;
- these two women talk to each other; and
- they talk about something besides a man.

This test is used for assessing movies in terms of representation of women. Even though there are three criteria, a movie either fails or passes the Bechdel test.

---

### Different Priors, Same Data

Let `\(\pi\)` be the proportion of movies that pass the Bechdel test.

Below are three people with three different priors about `\(\pi\)`.

<table align = "center">
<tr>
  <td>optimist</td>
  <td>clueless </td>
  <td>feminist </td>
</tr>
<tr>
  <td>Beta(14,1)</td>
  <td>Beta(1,1)</td>
  <td>Beta(5,11)</td>
</tr>
</table>

Plot their priors.

---

## Priors

<img src="slide-2-balance_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" />

---

### Vocabulary

__Informative prior:__ An informative prior reflects specific information about the unknown variable with high certainty (i.e., low variability).

__Vague (diffuse) prior:__ A vague or diffuse prior reflects little specific information about the unknown variable. A flat prior, which assigns equal prior plausibility to all possible values of the variable, is a special case.

---

- `library(fivethirtyeight)` has the `bechdel` data frame. Randomly select 20 movies from this dataset (seed = 84735).
- Based on the observed data, update the posterior for all three people.
  Write the distribution of the posterior.
- Calculate the summary statistics for the prior and the posterior for all three.
- Plot the prior, likelihood, and the posterior for all three.
- Explain the effect of different priors on the posterior.

---

```r
library(tidyverse)
library(fivethirtyeight)
library(bayesrules)
set.seed(84735)
```

--

```r
bechdel_sample <- sample_n(bechdel, 20)
```

--

```r
count(bechdel_sample, binary)
```

```
## # A tibble: 2 x 2
##   binary     n
##   <chr>  <int>
## 1 FAIL      11
## 2 PASS       9
```

---

## The Optimist

```r
summarize_beta_binomial(14, 1, x = 9, n = 20)
```

```
##       model alpha beta      mean      mode         var
## 1     prior    14    1 0.9333333 1.0000000 0.003888889
## 2 posterior    23   12 0.6571429 0.6666667 0.006258503
```

---

## The Optimist

```r
plot_beta_binomial(14, 1, x = 9, n = 20)
```

<img src="slide-2-balance_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

---

## The Clueless

```r
summarize_beta_binomial(1, 1, x = 9, n = 20)
```

```
##       model alpha beta      mean mode        var
## 1     prior     1    1 0.5000000  NaN 0.08333333
## 2 posterior    10   12 0.4545455 0.45 0.01077973
```

---

## The Clueless

```r
plot_beta_binomial(1, 1, x = 9, n = 20)
```

<img src="slide-2-balance_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

---

## The Feminist

```r
summarize_beta_binomial(5, 11, x = 9, n = 20)
```

```
##       model alpha beta      mean      mode        var
## 1     prior     5   11 0.3125000 0.2857143 0.01263787
## 2 posterior    14   22 0.3888889 0.3823529 0.00642309
```

---

## The Feminist

```r
plot_beta_binomial(5, 11, x = 9, n = 20)
```

<img src="slide-2-balance_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />

---

## Comparison

<img src="slide-2-balance_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" />

---

### Same Prior, Different Data

Morteza, Nadide, and Ursula all share the optimistic Beta(14,1) prior for `\(\pi\)`, but each has access to different data.
Morteza reviews movies from 1991, Nadide reviews movies from 2000, and Ursula reviews movies from 2013. How will the posterior distribution for each differ?

---

## Morteza's analysis

```r
bechdel_1991 <- filter(bechdel, year == 1991)
count(bechdel_1991, binary)
```

```
## # A tibble: 2 x 2
##   binary     n
##   <chr>  <int>
## 1 FAIL       7
## 2 PASS       6
```

```r
6/13
```

```
## [1] 0.4615385
```

---

## Morteza's analysis

```r
plot_beta_binomial(14, 1, x = 6, n = 13)
```

<img src="slide-2-balance_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" />

---

## Nadide's analysis

```r
bechdel_2000 <- filter(bechdel, year == 2000)
count(bechdel_2000, binary)
```

```
## # A tibble: 2 x 2
##   binary     n
##   <chr>  <int>
## 1 FAIL      34
## 2 PASS      29
```

```r
29/(34+29)
```

```
## [1] 0.4603175
```

---

## Nadide's analysis

```r
plot_beta_binomial(14, 1, x = 29, n = 63)
```

<img src="slide-2-balance_files/figure-html/unnamed-chunk-16-1.png" style="display: block; margin: auto;" />

---

## Ursula's analysis

```r
bechdel_2013 <- filter(bechdel, year == 2013)
count(bechdel_2013, binary)
```

```
## # A tibble: 2 x 2
##   binary     n
##   <chr>  <int>
## 1 FAIL      53
## 2 PASS      46
```

```r
46/(53+46)
```

```
## [1] 0.4646465
```

---

## Ursula's analysis

```r
plot_beta_binomial(14, 1, x = 46, n = 99)
```

<img src="slide-2-balance_files/figure-html/unnamed-chunk-18-1.png" style="display: block; margin: auto;" />

---

## Summary

<img src="slide-2-balance_files/figure-html/unnamed-chunk-20-1.png" style="display: block; margin: auto;" />

## Sequential Bayes

Consider two new analysts, Paola and Mark.

Paola starts with a Beta(14,1) prior. She first reviews movies from 1971 and updates her belief. She then reviews movies from 1972 and updates her belief again, and finally reviews movies from 1973 and updates once more. Make sure to calculate the prior and posterior distribution at each point.

Mark also starts with a Beta(14,1) prior. However, he reviews movies from 1971, 1972, and 1973 all at once. Calculate the posterior.
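
A minimal sketch of the two workflows, assuming the usual conjugate Beta-Binomial update: a Beta(α, β) prior plus `x` passes in `n` movies gives a Beta(α + x, β + n − x) posterior. The helper `update_beta()` is hypothetical (it is not part of `bayesrules`). The counts are stand-ins taken from the 1991, 2000, and 2013 tallies on the earlier slides, because the 1971 to 1973 counts are not shown here. To work the actual exercise, replace them with the results of `count(filter(bechdel, year == 1971), binary)` and so on.

```r
# Hypothetical helper: conjugate Beta-Binomial update.
# prior is a named vector c(alpha = ..., beta = ...);
# x is the number of passes, n the number of movies reviewed.
update_beta <- function(prior, x, n) {
  c(alpha = prior[["alpha"]] + x, beta = prior[["beta"]] + (n - x))
}

prior <- c(alpha = 14, beta = 1)

# Stand-in data (assumption): the 1991, 2000, and 2013 counts from earlier slides,
# used here in place of the 1971, 1972, and 1973 counts.
passes <- c(6, 29, 46)
totals <- c(13, 63, 99)

# Paola's approach: each posterior becomes the prior for the next batch.
sequential <- prior
for (i in seq_along(passes)) {
  sequential <- update_beta(sequential, passes[i], totals[i])
  print(sequential)
}

# Mark's approach: pool all batches and update once.
all_at_once <- update_beta(prior, sum(passes), sum(totals))
print(all_at_once)

# Compare the two final posteriors.
identical(sequential, all_at_once)
```

Storing the Beta parameters as a named vector keeps each intermediate posterior easy to print and to pass to `plot_beta_binomial()` or `summarize_beta_binomial()` as the next prior.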
## Data order invariance