Draft Bargains 2020 (Sleeper)

(EDIT (01.09.2020): Updated Version, since today is my Fantasy Draft)

Every year I compare the rankings on the fantasy football site I play on with consensus expert rankings in order to find exploits.
If a player goes way earlier than experts value him, I will probably not draft him at all, as he is not a good value.
If a player goes way later than experts project, I might draft him, and I may even get him later than I should be able to, hence the name Draft Bargain.
In this post we will compare the consensus Expert Ranking on FantasyPros with the previously scraped in-draft rankings from Sleeper. If you want to know how to do this, read my latest blog post.

Data Gathering

First, we should load our scraped mock ranks. Since we are only interested in quarterbacks and skill positions, we remove everything else.

library(tidyverse)

sleeper_ranks <- 
  # Alternative source: read_csv("https://maxhuebner.github.io/post/data/sleeper-mock-ranks-2020-09-01.csv") %>%
  read_csv("https://raw.githubusercontent.com/maxhuebner/maxhuebner.github.io/master/post/data/sleeper-mock-ranks-2020-09-01.csv") %>% 
  # Keep only quarterbacks and skill positions
  filter(pos %in% c("QB","RB","WR","TE"))

Next, we need data to compare against. Let’s scrape the data from the FantasyPros website using the package rvest. The data is stored in a <table> tag, so we can use rvest’s function html_table().

library(rvest)

fantasypros_url <- "https://www.fantasypros.com/nfl/rankings/half-point-ppr-cheatsheets.php"

# Read the page, take the first <table> and normalise the column names
fp_html <- read_html(fantasypros_url) %>% 
  html_table(fill = TRUE) %>% 
  .[[1]] %>% 
  as_tibble() %>% 
  janitor::clean_names()

fp_html
## # A tibble: 510 x 12
##    rank  wsid  overall_team pos   bye   best  worst avg   std_dev adp   vs_adp
##    <chr> <chr> <chr>        <chr> <chr> <chr> <chr> <chr> <chr>   <chr> <chr> 
##  1 &nbsp "&nb~ "&nbsp"      "&nb~ "&nb~ "&nb~ "&nb~ "&nb~ "&nbsp" "&nb~ "&nbs~
##  2 Tier~ ""    ""           ""    ""    ""    ""    ""    ""      ""    ""    
##  3 1     ""    "Christian ~ "RB1" "13"  "1"   "4"   "1.1" "0.3"   "1.0" "0.0" 
##  4 2     ""    "Saquon Bar~ "RB2" "11"  "1"   "5"   "2.1" "0.5"   "2.0" "0.0" 
##  5 3     ""    "Ezekiel El~ "RB3" "10"  "2"   "17"  "3.5" "1.7"   "3.0" "0.0" 
##  6 4     ""    "Alvin Kama~ "RB4" "6"   "3"   "12"  "4.7" "1.3"   "5.0" "+1.0"
##  7 5     ""    "Michael Th~ "WR1" "6"   "3"   "14"  "6.2" "2.6"   "4.0" "-1.0"
##  8 6     ""    "Dalvin Coo~ "RB5" "7"   "2"   "21"  "6.8" "2.5"   "7.0" "+1.0"
##  9 Tier~ ""    ""           ""    ""    ""    ""    ""    ""      ""    ""    
## 10 7     ""    "Derrick He~ "RB6" "7"   "2"   "28"  "8.3" "3.2"   "6.0" "-1.0"
## # ... with 500 more rows, and 1 more variable: notes <lgl>
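
To see why every column above comes back as <chr>, here is a minimal sketch on a hand-written table. The HTML string and its cell contents are made up for illustration and are not the real FantasyPros markup; only read_html() and html_table() are the functions used above.

```r
library(rvest)

# Hypothetical stand-in for the scraped page: one tier row mixed into the ranks
html <- "<table>
  <tr><th>Rank</th><th>Overall (Team)</th><th>Pos</th></tr>
  <tr><td>Tier 1</td><td></td><td></td></tr>
  <tr><td>1</td><td>Christian McCaffrey CAR</td><td>RB1</td></tr>
  <tr><td>2</td><td>Saquon Barkley NYG</td><td>RB2</td></tr>
</table>"

# html_table() returns one table per <table> tag, so we pick the first
toy <- read_html(html) %>%
  html_table() %>%
  .[[1]]

# The tier row forces the whole Rank column to be character;
# as.numeric() then turns that row into NA (with a coercion warning)
ranks <- suppressWarnings(as.numeric(toy$Rank))
print(toy)
print(ranks)
```

A single non-numeric cell is enough to make the whole column text, which is why the cleaning step below converts rank and filters out the NA rows.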

Unfortunately the table is not perfect, so we need to do a bit of data cleaning:

  • Get rid of rows with a non-numeric rank (as.numeric(rank) turns non-numeric values into NA)
  • Extract the name by matching everything up to the last . and trimming the last 2 characters
  • Extract the position by removing the number
  • Extract the team by extracting at least two back-to-back capital letters (only team abbreviations match this description)
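
The patterns can be tried on a couple of sample strings first. This is only a sketch: the cell format used here (full name, immediately followed by an abbreviated name and the team) is my assumption about what the truncated overall_team column contains, so treat the two strings as hypothetical.

```r
library(stringr)

# Hypothetical cell contents; the real overall_team values are truncated in the output above
cells <- c("Christian McCaffreyC. McCaffrey CAR",
           "Mark Ingram IIM. Ingram II BAL")

# Greedy match up to the last ".", then drop the initial and the dot
name <- str_sub(str_extract(cells, ".*\\."), end = -3)

# First run of two or more capital letters
team <- str_extract(cells, "[:upper:]{2,}")

# Position: leading capital letters of e.g. "RB1"
pos <- str_extract(c("RB1", "RB19"), "[:upper:]*")

print(data.frame(name, team, pos))
```

Under this assumption the second string picks up "IIM" as its team, because the suffix "II" runs into the initial "M". That kind of outlier has to be patched by hand.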

After that, we change a few outliers so that our data is acceptable.

fp_ranks <- fp_html %>% 
  mutate(rank = as.numeric(rank),                    # tier rows etc. become NA
         name = str_extract(overall_team, ".*\\."),  # everything up to the last "."
         name = str_sub(name, end = -3),             # drop the trailing two characters
         pos = str_extract(pos, "[:upper:]*"),       # "RB1" -> "RB"
         team = str_extract(overall_team, "[:upper:]{2,}")) %>% 
  filter(!is.na(rank),
         pos %in% c("QB","RB","WR","TE")) %>% 
  select(rank, name, pos, team) %>%
  # Manual fixes for outliers
  mutate(team = if_else(team == "JAC", "JAX", team),
         team = if_else(name == "Mark Ingram II", "BAL", team)) %>% 
  # Keep only team codes that occur more than twice
  add_count(team) %>% 
  filter(n > 2) %>% 
  select(-n)

fp_ranks
## # A tibble: 412 x 4
##     rank name                pos   team 
##    <dbl> <chr>               <chr> <chr>
##  1     1 Christian McCaffrey RB    CAR  
##  2     2 Saquon Barkley      RB    NYG  
##  3     3 Ezekiel Elliott     RB    DAL  
##  4     4 Alvin Kamara        RB    NO   
##  5     5 Michael Thomas      WR    NO   
##  6     6 Dalvin Cook         RB    MIN  
##  7     7 Derrick Henry       RB    TEN  
##  8     8 Davante Adams       WR    GB   
##  9     9 Joe Mixon           RB    CIN  
## 10    10 Julio Jones         WR    ATL  
## # ... with 402 more rows
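
The add_count(team) / filter(n > 2) step at the end of the pipeline is easy to overlook. As far as I can tell it works as a sanity filter: a real team abbreviation shows up on many rows, while a badly extracted one shows up only once or twice. A minimal sketch on made-up rows (the names and the "IIM" code are purely illustrative):

```r
library(dplyr)

# Toy data: three rows with a plausible team code, one with a broken one
toy <- tibble(
  name = c("Player A", "Player B", "Player C", "Player D"),
  team = c("CAR", "CAR", "CAR", "IIM")
)

kept <- toy %>%
  add_count(team) %>%  # adds column n = number of rows sharing that team value
  filter(n > 2) %>%    # keep only team values that occur more than twice
  select(-n)

print(kept)
```

The cut-off of 2 is taken from the code above. Whether it is the right value for other data is a judgment call.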

Merging Data

In order to compare the data we should merge it into a single data frame (or, in this case, a tibble). We want to join by name, position and team so that every player has a sleeper_rank and a fp_rank.
The only problem: the names don’t match up perfectly. To fix that we use the awesome package fuzzyjoin by David Robinson, together with the package stringdist.

library(fuzzyjoin)
library(stringdist)

adp_tibble <- fp_ranks %>% 
  fuzzy_left_join(sleeper_ranks,
                  by = c("pos", "team", "name"),
                  # one match function per join column: exact, exact, fuzzy
                  match_fun = list(`==`, `==`,
                                   function(x, y) stringdist(tolower(x), tolower(y),
                                                             method = "osa") <= 6)) %>%
  select(name = name.x, pos = pos.x, team = team.x,
         fp = rank.x, sleeper = rank.y) %>%
  filter(!is.na(sleeper)) %>%
  mutate(diff = sleeper - fp,
         category = as.factor(ifelse(diff > 0, "Steal", "Overhyped"))) %>% 
  # if a player matched several rows, keep the one with the smallest rank difference
  arrange(abs(diff)) %>% 
  distinct(name, team, pos, .keep_all = TRUE)

adp_tibble
## # A tibble: 255 x 7
##    name                pos   team     fp sleeper  diff category 
##    <chr>               <chr> <chr> <dbl>   <dbl> <dbl> <fct>    
##  1 Christian McCaffrey RB    CAR       1       1     0 Overhyped
##  2 Saquon Barkley      RB    NYG       2       2     0 Overhyped
##  3 Joe Mixon           RB    CIN       9       9     0 Overhyped
##  4 Kenyan Drake        RB    ARI      17      17     0 Overhyped
##  5 Kareem Hunt         RB    CLE      62      62     0 Overhyped
##  6 Matt Breida         RB    MIA      86      86     0 Overhyped
##  7 Larry Fitzgerald    WR    ARI     183     183     0 Overhyped
##  8 Brian Hill          RB    ATL     281     281     0 Overhyped
##  9 Ezekiel Elliott     RB    DAL       3       4     1 Steal    
## 10 Alvin Kamara        RB    NO        4       3    -1 Overhyped
## # ... with 245 more rows

We have three join columns. This fuzzy_left_join() works the following way:
The first and second columns (pos and team) are joined by exact match ==. The third column (name) is joined by a function that is true if the stringdist of the two lowercased names is less than or equal to 6.
Example: stringdist("Patrick Mahomes", "Pat Mahomes") would be 4, so it would still match our superstar quarterback. A difference of 6 seems like a lot, but we can’t match everyone with a smaller value, and because position and team must also match exactly, the overlap is minimal.
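
A quick way to get a feeling for the threshold is to compute a few distances by hand. The sketch below reuses the same stringdist() call as the join. The helper name name_distance and the spelling "DJ Moore" are mine, chosen only for illustration.

```r
library(stringdist)

# Same distance as in the join: lowercase both names, then OSA distance
name_distance <- function(x, y) {
  stringdist(tolower(x), tolower(y), method = "osa")
}

d_mahomes <- name_distance("Patrick Mahomes", "Pat Mahomes")  # drop "rick"
d_moore   <- name_distance("D.J. Moore", "DJ Moore")          # drop two dots
d_allen   <- name_distance("Josh Allen", "Keenan Allen")      # two different players

print(c(mahomes = d_mahomes, moore = d_moore, allen = d_allen))
```

The last pair is the interesting one: two different players that share a surname still fall within the threshold of 6, so the name match alone would not be safe. Here it is harmless because the two play different positions on different teams and the join also requires pos and team to be equal.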

We will also calculate the difference in the rankings (diff = sleeper - fp). A negative value means the player goes too early on Sleeper; a positive value means that we might be able to snatch him a bit later. We label the players accordingly in a new column, category.
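
As a small worked example, here is the same calculation on three players with the ranks taken from the outputs in this post. It is a sketch with dplyr's if_else() in place of ifelse(); the labelling rule itself is the one used above.

```r
library(dplyr)

toy <- tibble(
  name    = c("Tarik Cohen", "Deebo Samuel", "Christian McCaffrey"),
  fp      = c(88, 123, 1),
  sleeper = c(120, 75, 1)
) %>%
  mutate(diff = sleeper - fp,
         # diff > 0: drafted later on Sleeper than experts rank him
         category = if_else(diff > 0, "Steal", "Overhyped"))

print(toy)
```

Note that a player with identical ranks (diff of 0) lands in "Overhyped", because the rule only tests diff > 0. That does not matter later on, since the tables only show players whose ranks differ by at least a minimum amount.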

Comparing Data

After joining the data, we can analyze it. Nobody cares about sleepers that go after round 13, so we will only look at top-100 players (in either ranking).

LIMIT <- 100

adp_compare <- adp_tibble %>% 
  arrange(desc(abs(diff))) %>% 
  filter(fp <= LIMIT | sleeper <= LIMIT)

steals <- adp_compare %>% 
  filter(category == "Steal") %>% 
  select(-category)

overhyped <- adp_compare %>% 
  filter(category == "Overhyped") %>% 
  select(-category)

steals
## # A tibble: 51 x 6
##    name              pos   team     fp sleeper  diff
##    <chr>             <chr> <chr> <dbl>   <dbl> <dbl>
##  1 Tarik Cohen       RB    CHI      88     120    32
##  2 Tyler Higbee      TE    LAR      77     106    29
##  3 Austin Hooper     TE    CLE      99     124    25
##  4 Josh Allen        QB    BUF      70      89    19
##  5 Matthew Stafford  QB    DET      90     109    19
##  6 D.J. Moore        WR    CAR      30      47    17
##  7 Odell Beckham Jr. WR    CLE      31      48    17
##  8 Courtland Sutton  WR    DEN      50      67    17
##  9 Allen Robinson    WR    CHI      24      40    16
## 10 DeVante Parker    WR    MIA      55      71    16
## # ... with 41 more rows
overhyped
## # A tibble: 54 x 6
##    name               pos   team     fp sleeper  diff
##    <chr>              <chr> <chr> <dbl>   <dbl> <dbl>
##  1 Deebo Samuel       WR    SF      123      75   -48
##  2 Mecole Hardman     WR    KC      147     100   -47
##  3 Rob Gronkowski     TE    TB      107      72   -35
##  4 Emmanuel Sanders   WR    NO      127      99   -28
##  5 Marlon Mack        RB    IND      91      69   -22
##  6 Alexander Mattison RB    MIN     113      96   -17
##  7 Aaron Rodgers      QB    GB       98      82   -16
##  8 Sony Michel        RB    NE      109      93   -16
##  9 David Montgomery   RB    CHI      52      37   -15
## 10 Devin Singletary   RB    BUF      57      42   -15
## # ... with 44 more rows

Creating Tables

We now have two datasets, steals and overhyped, that contain the data we were interested in. However, it is not pleasant to look at the players in this format. Therefore we will create beautiful tables using the gt package.

library(gt)
#Table Options Shared
table_init_with_options <- . %>% 
  gt(groupname_col = "pos", rownames_to_stub = T) %>% 
  tab_options(
    row_group.background.color = "#FFEFDB80",#EFFBFC
    heading.background.color = "#ebebeb",
    column_labels.background.color = "#ebebeb",
    stub.background.color = "#ebebeb",
    table.font.color = "#323232",
    table_body.hlines.color = "#989898",
    table_body.border.top.color = "#989898",
    heading.border.bottom.color = "#989898",
    row_group.border.top.color = "#989898",
    row_group.border.bottom.style = "none",
    stub.border.style = "dashed",
    stub.border.color = "#989898",
    stub.border.width = "1px",
    table.width = "60%"
  ) %>% 
  opt_all_caps() %>% 
  cols_align(align = "center", columns = c(1,3:7))

MINIMUM_DIFFERENCE <- 8

These are options that we want for both of our tables, so we wrap them in a function. We also set the minimum difference to 8 so the tables aren’t too crowded.
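
The `. %>%` at the start of table_init_with_options is what turns the pipeline into a reusable function. A minimal sketch of the same magrittr feature with plain numbers (the name root_of_sum is made up for this example):

```r
library(magrittr)

# A pipeline that starts with the placeholder `.` is not run immediately;
# it is stored as a one-argument function
root_of_sum <- . %>% sum() %>% sqrt()

result <- root_of_sum(c(9, 16))
print(result)
```

This is equivalent to writing function(x) sqrt(sum(x)), just in pipe notation, which keeps the shared table options readable.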

Our table of players we don’t want to draft looks like this:

overhyped %>% 
  filter(diff <= -MINIMUM_DIFFERENCE) %>% 
  table_init_with_options() %>% 
  tab_header(
    title = md("Overhyped Players on *Sleeper.App*"),
    subtitle = "(Players that tend to go before their general ADP)"
  ) %>% 
  data_color(
    columns = vars(diff),
    colors = scales::col_numeric(
      palette = paletteer::paletteer_d(
        palette = "ggsci::red_material"
      ) %>% as.character(),
      domain = NULL,
      reverse = T
    ),
    alpha = 0.8
  )

Overhyped Players on Sleeper.App
(Players that tend to go before their general ADP)

    name                 team  fp   sleeper  diff
WR
1   Deebo Samuel         SF    123  75       -48
2   Mecole Hardman       KC    147  100      -47
4   Emmanuel Sanders     NO    127  99       -28
12  Brandin Cooks        HOU   87   73       -14
17  T.Y. Hilton          IND   63   52       -11
20  D.K. Metcalf         SEA   53   44       -9
21  Marquise Brown       BAL   73   64       -9
TE
3   Rob Gronkowski       TB    107  72       -35
RB
5   Marlon Mack          IND   91   69       -22
6   Alexander Mattison   MIN   113  96       -17
8   Sony Michel          NE    109  93       -16
9   David Montgomery     CHI   52   37       -15
10  Devin Singletary     BUF   57   42       -15
11  James Conner         PIT   39   25       -14
13  Le'Veon Bell         NYJ   42   29       -13
14  J.K. Dobbins         BAL   89   76       -13
16  David Johnson        HOU   44   33       -11
18  Melvin Gordon        DEN   41   31       -10
19  Mark Ingram II       BAL   48   39       -9
22  Jonathan Taylor      IND   54   46       -8
QB
7   Aaron Rodgers        GB    98   82       -16
15  Patrick Mahomes      KC    25   14       -11
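
The colors argument of data_color() receives a function that maps numbers to colors, and scales::col_numeric() builds such a function. The sketch below shows what reverse = TRUE changes. To keep it self-contained I assume a simple two-color palette instead of the paletteer palette used above, so the exact colors differ from the real table.

```r
library(scales)

diffs <- c(-48, -22, -8)

# domain = NULL: the range is taken from the values passed in
pal     <- col_numeric(palette = c("white", "red"), domain = NULL)
pal_rev <- col_numeric(palette = c("white", "red"), domain = NULL, reverse = TRUE)

cols     <- pal(diffs)      # smallest value gets the first palette color
cols_rev <- pal_rev(diffs)  # smallest value gets the last palette color

print(data.frame(diffs, cols, cols_rev))
```

For the overhyped table all diff values are negative, so reversing the mapping sends the most extreme (most negative) difference to the far end of the palette. The steals table needs no reversal because there the largest positive value is the most extreme one.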

Here’s our table for potential steals:

steals %>% 
  filter(diff >= MINIMUM_DIFFERENCE) %>%
  table_init_with_options() %>% 
  tab_header(
    title = md("Potential Steals on *Sleeper.App*"),
    subtitle = "(Players that tend to go after their general ADP)"
  ) %>% 
  data_color(
    columns = vars(diff),
    colors = scales::col_numeric(
      palette = paletteer::paletteer_d(
        palette = "ggsci::green_material"
      ) %>% as.character(),
      domain = NULL
    ),
    alpha = 0.8
  )

Potential Steals on Sleeper.App
(Players that tend to go after their general ADP)

    name                 team  fp   sleeper  diff
RB
1   Tarik Cohen          CHI   88   120      32
18  James White          NE    80   90       10
TE
2   Tyler Higbee         LAR   77   106      29
3   Austin Hooper        CLE   99   124      25
19  Hayden Hurst         ATL   97   107      10
20  Hunter Henry         LAC   78   87       9
QB
4   Josh Allen           BUF   70   89       19
5   Matthew Stafford     DET   90   109      19
12  Carson Wentz         PHI   83   98       15
22  Matt Ryan            ATL   75   83       8
WR
6   D.J. Moore           CAR   30   47       17
7   Odell Beckham Jr.    CLE   31   48       17
8   Courtland Sutton     DEN   50   67       17
9   Allen Robinson       CHI   24   40       16
10  DeVante Parker       MIA   55   71       16
11  JuJu Smith-Schuster  PIT   28   43       15
13  Jarvis Landry        CLE   67   81       14
14  Robert Woods         LAR   36   49       13
15  Tyler Boyd           CIN   72   85       13
16  Marvin Jones         DET   81   94       13
17  Terry McLaurin       WAS   49   59       10
21  D.J. Chark           JAX   47   55       8

Conclusion

These tables might help when drafting on Sleeper this year. You should, however, never base your whole draft around them. If you like a player and find him in the Steals table, great! You might even get him a round later than usual. If you like a player but he is in the Overhyped table, you have to decide whether you really want him, because you might have to pay a hefty price.

Standalone Overhyped Table can be found here
Standalone Steal table can be found here

Max Hübner

Computer Science Student from Germany
