Draft Bargains 2020 (Sleeper)
(EDIT (01.09.2020): Updated Version, since today is my Fantasy Draft)
Every year I try to compare the rankings on the fantasy football site I play on with consensus expert rankings in order to find exploits.
If a player goes way earlier than experts value him, I will probably not draft him at all, as he is not a good value.
If a player goes way later than projected by experts, I might draft him, and I might even get him later than I should, hence the name Draft Bargain.
In this post we will compare the consensus Expert Ranking on FantasyPros with the previously scraped in-draft rankings from Sleeper. If you want to know how to do this, read my latest blog post.
Data Gathering
First, we should load our scraped mock ranks. Since we are only interested in skill positions and quarterbacks, we remove everything else.
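library(tidyverse)  # read_csv(), dplyr verbs, stringr helpers
library(rvest)      # read_html(), html_table()
# Load the scraped Sleeper mock-draft ranks and keep only QB, RB, WR and TE
sleeper_ranks <-
  #read_csv("https://maxhuebner.github.io/post/data/sleeper-mock-ranks-2020-09-01.csv") %>%
  read_csv("https://raw.githubusercontent.com/maxhuebner/maxhuebner.github.io/master/post/data/sleeper-mock-ranks-2020-09-01.csv") %>%
  filter(pos %in% c("QB", "RB", "WR", "TE"))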
Next, we need data to compare against. Let’s scrape the data from the FantasyPros website using the package rvest. The data is stored in a <table> tag, so we can use rvest’s function html_table().
fantasypros_url <- "https://www.fantasypros.com/nfl/rankings/half-point-ppr-cheatsheets.php"
# Read the page, parse every <table> and keep the first one
fp_html <- read_html(fantasypros_url) %>%
  html_table(fill = TRUE) %>%
  .[[1]] %>%
  as_tibble() %>%
  janitor::clean_names()
fp_html
## # A tibble: 510 x 12
## rank wsid overall_team pos bye best worst avg std_dev adp vs_adp
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1   "&nb~ " " "&nb~ "&nb~ "&nb~ "&nb~ "&nb~ " " "&nb~ "&nbs~
## 2 Tier~ "" "" "" "" "" "" "" "" "" ""
## 3 1 "" "Christian ~ "RB1" "13" "1" "4" "1.1" "0.3" "1.0" "0.0"
## 4 2 "" "Saquon Bar~ "RB2" "11" "1" "5" "2.1" "0.5" "2.0" "0.0"
## 5 3 "" "Ezekiel El~ "RB3" "10" "2" "17" "3.5" "1.7" "3.0" "0.0"
## 6 4 "" "Alvin Kama~ "RB4" "6" "3" "12" "4.7" "1.3" "5.0" "+1.0"
## 7 5 "" "Michael Th~ "WR1" "6" "3" "14" "6.2" "2.6" "4.0" "-1.0"
## 8 6 "" "Dalvin Coo~ "RB5" "7" "2" "21" "6.8" "2.5" "7.0" "+1.0"
## 9 Tier~ "" "" "" "" "" "" "" "" "" ""
## 10 7 "" "Derrick He~ "RB6" "7" "2" "28" "8.3" "3.2" "6.0" "-1.0"
## # ... with 500 more rows, and 1 more variable: notes <lgl>
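Every column in the output above is of type <chr>, including rank. A likely reason is that the tier rows and the placeholder row put non-numeric text into each column. The following is a minimal sketch with a made-up toy table (not the real FantasyPros markup) that reproduces the effect on a small scale:

```r
library(rvest)

# Made-up two-column table with a tier row, mimicking the structure seen above
toy_html <- "<table>
  <tr><th>Rank</th><th>Player</th></tr>
  <tr><td>Tier 1</td><td></td></tr>
  <tr><td>1</td><td>Player A</td></tr>
  <tr><td>2</td><td>Player B</td></tr>
</table>"

# html_table() on a whole document returns a list of tables, so take the first
toy <- read_html(toy_html) %>%
  html_table() %>%
  .[[1]]

# The tier row keeps Rank from being parsed as a number
class(toy$Rank)

# Converting by hand turns the tier row into NA, which can then be dropped
toy_rank <- suppressWarnings(as.numeric(toy$Rank))
toy[!is.na(toy_rank), ]
```

The same idea, converting and then dropping the NA rows, is what the cleaning step on the real data relies on.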
Unfortunately the table is not perfect, so we need to do a bit of data cleaning:
- Get rid of rows with a non-numeric rank (as.numeric(rank) turns non-numeric values into NA)
- Extract the name by matching everything up to the last . and trimming the last 2 characters
- Extract the position by removing the number
- Extract the team by matching at least two back-to-back capital letters (only team abbreviations match this description)
After that, we fix a few outliers so that our data is acceptable.
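To make the cleaning steps more concrete, here is a small sketch that applies the same stringr patterns to single strings. The raw cell content is an assumption about how the scraped text looks (full name, short name and team run together); it is not copied from the site.

```r
library(stringr)

# Hypothetical raw cell (assumed format, for illustration only)
raw <- "Christian McCaffreyC. McCaffrey CAR"

# Name: everything up to the last ".", then trim the last 2 characters
name <- str_sub(str_extract(raw, ".*\\."), end = -3)

# Position: keep the leading capital letters, which drops the number
pos <- str_extract("RB1", "[:upper:]*")

# Team: first run of at least two capital letters
team <- str_extract(raw, "[:upper:]{2,}")

# Rank: non-numeric entries such as tier rows become NA
rank <- suppressWarnings(as.numeric(c("Tier 1", "3")))

# A capitalised name suffix is matched before the team abbreviation,
# which is the kind of outlier that needs a manual fix
suffix_team <- str_extract("Mark Ingram II BAL", "[:upper:]{2,}")

name; pos; team; rank; suffix_team
```

Under these assumptions the sketch yields "Christian McCaffrey", "RB" and "CAR", while the last call returns "II" instead of "BAL".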
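fp_ranks <- fp_html %>%
  mutate(rank = as.numeric(rank),                    # tier rows become NA
         name = str_extract(overall_team, ".*\\."),  # everything up to the last "."
         name = str_sub(name, end = -3),             # trim the last 2 characters
         pos = str_extract(pos, "[:upper:]*"),       # "RB1" -> "RB"
         team = str_extract(overall_team, "[:upper:]{2,}")) %>%
  filter(!is.na(rank),
         pos %in% c("QB", "RB", "WR", "TE")) %>%
  select(rank, name, pos, team) %>%
  # Manual fixes: rename JAC to JAX, and set the team for Mark Ingram II,
  # whose "II" suffix is picked up by the team regex
  mutate(team = if_else(team == "JAC", "JAX", team),
         team = if_else(name == "Mark Ingram II", "BAL", team)) %>%
  # Drop rows whose team value occurs at most twice (likely bad extractions)
  add_count(team) %>%
  filter(n > 2) %>%
  select(-n)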
fp_ranks
## # A tibble: 412 x 4
## rank name pos team
## <dbl> <chr> <chr> <chr>
## 1 1 Christian McCaffrey RB CAR
## 2 2 Saquon Barkley RB NYG
## 3 3 Ezekiel Elliott RB DAL
## 4 4 Alvin Kamara RB NO
## 5 5 Michael Thomas WR NO
## 6 6 Dalvin Cook RB MIN
## 7 7 Derrick Henry RB TEN
## 8 8 Davante Adams WR GB
## 9 9 Joe Mixon RB CIN
## 10 10 Julio Jones WR ATL
## # ... with 402 more rows
Merging Data
In order to compare the data we should merge it into a single data frame (or in this case a tibble). We want to join by name, position and team so every player has both a Sleeper rank and a FantasyPros rank.
The only problem: the names don’t match up perfectly. To fix that we use the awesome package fuzzyjoin by David Robinson. We will also use the package stringdist.
library(fuzzyjoin)
library(stringdist)
adp_tibble <- fp_ranks %>%
  fuzzy_left_join(sleeper_ranks,
                  by = c("pos", "team", "name"),
                  # exact match on pos and team, fuzzy match on name
                  match_fun = list(`==`, `==`,
                                   function(x, y) stringdist(tolower(x), tolower(y),
                                                             method = "osa") <= 6)) %>%
  select(name = name.x, pos = pos.x, team = team.x,
         fp = rank.x, sleeper = rank.y) %>%
  filter(!is.na(sleeper)) %>%
  mutate(diff = sleeper - fp,
         category = as.factor(ifelse(diff > 0, "Steal", "Overhyped"))) %>%
  # if a player matched several candidates, keep the one with the smallest rank gap
  arrange(abs(diff)) %>%
  distinct(name, team, pos, .keep_all = TRUE)
adp_tibble
## # A tibble: 255 x 7
## name pos team fp sleeper diff category
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <fct>
## 1 Christian McCaffrey RB CAR 1 1 0 Overhyped
## 2 Saquon Barkley RB NYG 2 2 0 Overhyped
## 3 Joe Mixon RB CIN 9 9 0 Overhyped
## 4 Kenyan Drake RB ARI 17 17 0 Overhyped
## 5 Kareem Hunt RB CLE 62 62 0 Overhyped
## 6 Matt Breida RB MIA 86 86 0 Overhyped
## 7 Larry Fitzgerald WR ARI 183 183 0 Overhyped
## 8 Brian Hill RB ATL 281 281 0 Overhyped
## 9 Ezekiel Elliott RB DAL 3 4 1 Steal
## 10 Alvin Kamara RB NO 4 3 -1 Overhyped
## # ... with 245 more rows
We have three join columns. This fuzzy_left_join() works the following way:
Join the first and second column by exact match (==). Join the third column (name) by a function that is true if the string distance between two names is less than or equal to 6.
Example: stringdist("Patrick Mahomes", "Pat Mahomes") would be 4, so it would still match our superstar quarterback. A distance of 6 seems like a lot, but we can’t match everyone with a smaller value, and since position and team have to match exactly, wrong matches are rare.
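The name rule can be tried out in isolation. The sketch below wraps it in a helper called name_matches, which is a hypothetical name introduced here for illustration and not part of the original code; the second pair of names is only an illustrative example of two different players.

```r
library(stringdist)

# Same rule as in the join: lower-case both names and allow an
# OSA distance of up to max_dist (6 in this post)
name_matches <- function(x, y, max_dist = 6) {
  stringdist(tolower(x), tolower(y), method = "osa") <= max_dist
}

# Removing "rick" takes four deletions
stringdist("Patrick Mahomes", "Pat Mahomes", method = "osa")  # 4
name_matches("Patrick Mahomes", "Pat Mahomes")                # TRUE

# Two different names can also fall within the threshold
name_matches("Josh Allen", "Kyle Allen")                      # TRUE
```

The last call shows why the name rule alone would be too lenient: it is the exact match on position and team that keeps such pairs apart.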
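We also calculate the difference in the rankings. A negative value means the player goes too early on Sleeper; a positive value means that we might be able to snatch him a bit later. We label the players accordingly in a new column, category. Note that players with a difference of 0 are labeled Overhyped, because the condition for a Steal is diff > 0.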
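As a quick check of the sign convention, here is a base R sketch with three players whose ranks are taken from the tables in this post:

```r
# Ranks as shown in this post (fp = FantasyPros, sleeper = Sleeper)
players <- data.frame(
  name    = c("Tarik Cohen", "Deebo Samuel", "Joe Mixon"),
  fp      = c(88, 123, 9),
  sleeper = c(120, 75, 9)
)

# Positive diff: the player goes later on Sleeper than the experts rank him
players$diff <- players$sleeper - players$fp

# Same labeling rule as in the join above; a diff of 0 falls into "Overhyped"
players$category <- ifelse(players$diff > 0, "Steal", "Overhyped")

players
```

Tarik Cohen ends up with a diff of 32 (Steal), Deebo Samuel with -48 (Overhyped), and Joe Mixon with 0, which the rule also files under Overhyped.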
Comparing Data
After joining the data, we can analyze it. Nobody cares about sleepers that go after round 13, so we will only look at top-100 players (in either ranking).
LIMIT <- 100
adp_compare <- adp_tibble %>%
  arrange(desc(abs(diff))) %>%
  filter(fp <= LIMIT | sleeper <= LIMIT)
steals <- adp_compare %>%
  filter(category == "Steal") %>%
  select(-category)
overhyped <- adp_compare %>%
  filter(category == "Overhyped") %>%
  select(-category)
steals
## # A tibble: 51 x 6
## name pos team fp sleeper diff
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 Tarik Cohen RB CHI 88 120 32
## 2 Tyler Higbee TE LAR 77 106 29
## 3 Austin Hooper TE CLE 99 124 25
## 4 Josh Allen QB BUF 70 89 19
## 5 Matthew Stafford QB DET 90 109 19
## 6 D.J. Moore WR CAR 30 47 17
## 7 Odell Beckham Jr. WR CLE 31 48 17
## 8 Courtland Sutton WR DEN 50 67 17
## 9 Allen Robinson WR CHI 24 40 16
## 10 DeVante Parker WR MIA 55 71 16
## # ... with 41 more rows
overhyped
## # A tibble: 54 x 6
## name pos team fp sleeper diff
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 Deebo Samuel WR SF 123 75 -48
## 2 Mecole Hardman WR KC 147 100 -47
## 3 Rob Gronkowski TE TB 107 72 -35
## 4 Emmanuel Sanders WR NO 127 99 -28
## 5 Marlon Mack RB IND 91 69 -22
## 6 Alexander Mattison RB MIN 113 96 -17
## 7 Aaron Rodgers QB GB 98 82 -16
## 8 Sony Michel RB NE 109 93 -16
## 9 David Montgomery RB CHI 52 37 -15
## 10 Devin Singletary RB BUF 57 42 -15
## # ... with 44 more rows
Creating Tables
We now have two datasets, steals and overhyped, that contain the data we are interested in. However, it is not pleasant to look at the players in this format. Therefore we will create beautiful tables using the gt package.
library(gt)
#Table Options Shared
table_init_with_options <- . %>%
  gt(groupname_col = "pos", rownames_to_stub = TRUE) %>%
  tab_options(
    row_group.background.color = "#FFEFDB80",#EFFBFC
    heading.background.color = "#ebebeb",
    column_labels.background.color = "#ebebeb",
    stub.background.color = "#ebebeb",
    table.font.color = "#323232",
    table_body.hlines.color = "#989898",
    table_body.border.top.color = "#989898",
    heading.border.bottom.color = "#989898",
    row_group.border.top.color = "#989898",
    row_group.border.bottom.style = "none",
    stub.border.style = "dashed",
    stub.border.color = "#989898",
    stub.border.width = "1px",
    table.width = "60%"
  ) %>%
  opt_all_caps() %>%
  cols_align(align = "center", columns = c(1,3:7))
MINIMUM_DIFFERENCE <- 8
These are some options that we want for both our tables, so we create a function for them. We also set the minimum difference to 8 so the tables aren’t too crowded.
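The function is created by starting the pipeline with a dot: in magrittr, . %>% ... does not run anything but returns a reusable function (a so-called functional sequence). A minimal sketch with a hypothetical helper, tidy_team, that is not part of the original code:

```r
library(magrittr)

# Starting a pipeline with "." defines a function instead of running it
tidy_team <- . %>%
  trimws() %>%
  toupper()

# The result is called like any other function
tidy_team(" jax ")

# It can also be used as a step inside another pipeline
c(" buf", "kc ") %>% tidy_team()
```

This is why table_init_with_options() can later be dropped into both table pipelines as a single step.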
Our table with players we don’t want to draft looks like this:
overhyped %>%
  filter(diff <= -MINIMUM_DIFFERENCE) %>%
  table_init_with_options() %>%
  tab_header(
    title = md("Overhyped Players on *Sleeper.App*"),
    subtitle = "(Players that tend to go before their general ADP)"
  ) %>%
  data_color(
    columns = vars(diff),
    colors = scales::col_numeric(
      palette = paletteer::paletteer_d(
        palette = "ggsci::red_material"
      ) %>% as.character(),
      domain = NULL,
      reverse = TRUE
    ),
    alpha = 0.8
  )
Overhyped Players on Sleeper.App
(Players that tend to go before their general ADP)
| | name | team | fp | sleeper | diff |
|---|---|---|---|---|---|
| WR | | | | | |
| 1 | Deebo Samuel | SF | 123 | 75 | -48 |
| 2 | Mecole Hardman | KC | 147 | 100 | -47 |
| 4 | Emmanuel Sanders | NO | 127 | 99 | -28 |
| 12 | Brandin Cooks | HOU | 87 | 73 | -14 |
| 17 | T.Y. Hilton | IND | 63 | 52 | -11 |
| 20 | D.K. Metcalf | SEA | 53 | 44 | -9 |
| 21 | Marquise Brown | BAL | 73 | 64 | -9 |
| TE | | | | | |
| 3 | Rob Gronkowski | TB | 107 | 72 | -35 |
| RB | | | | | |
| 5 | Marlon Mack | IND | 91 | 69 | -22 |
| 6 | Alexander Mattison | MIN | 113 | 96 | -17 |
| 8 | Sony Michel | NE | 109 | 93 | -16 |
| 9 | David Montgomery | CHI | 52 | 37 | -15 |
| 10 | Devin Singletary | BUF | 57 | 42 | -15 |
| 11 | James Conner | PIT | 39 | 25 | -14 |
| 13 | Le'Veon Bell | NYJ | 42 | 29 | -13 |
| 14 | J.K. Dobbins | BAL | 89 | 76 | -13 |
| 16 | David Johnson | HOU | 44 | 33 | -11 |
| 18 | Melvin Gordon | DEN | 41 | 31 | -10 |
| 19 | Mark Ingram II | BAL | 48 | 39 | -9 |
| 22 | Jonathan Taylor | IND | 54 | 46 | -8 |
| QB | | | | | |
| 7 | Aaron Rodgers | GB | 98 | 82 | -16 |
| 15 | Patrick Mahomes | KC | 25 | 14 | -11 |
Here’s our table for potential steals:
steals %>%
  filter(diff >= MINIMUM_DIFFERENCE) %>%
  table_init_with_options() %>%
  tab_header(
    title = md("Potential Steals on *Sleeper.App*"),
    subtitle = "(Players that tend to go after their general ADP)"
  ) %>%
  data_color(
    columns = vars(diff),
    colors = scales::col_numeric(
      palette = paletteer::paletteer_d(
        palette = "ggsci::green_material"
      ) %>% as.character(),
      domain = NULL
    ),
    alpha = 0.8
  )
Potential Steals on Sleeper.App
(Players that tend to go after their general ADP)
| | name | team | fp | sleeper | diff |
|---|---|---|---|---|---|
| RB | | | | | |
| 1 | Tarik Cohen | CHI | 88 | 120 | 32 |
| 18 | James White | NE | 80 | 90 | 10 |
| TE | | | | | |
| 2 | Tyler Higbee | LAR | 77 | 106 | 29 |
| 3 | Austin Hooper | CLE | 99 | 124 | 25 |
| 19 | Hayden Hurst | ATL | 97 | 107 | 10 |
| 20 | Hunter Henry | LAC | 78 | 87 | 9 |
| QB | | | | | |
| 4 | Josh Allen | BUF | 70 | 89 | 19 |
| 5 | Matthew Stafford | DET | 90 | 109 | 19 |
| 12 | Carson Wentz | PHI | 83 | 98 | 15 |
| 22 | Matt Ryan | ATL | 75 | 83 | 8 |
| WR | | | | | |
| 6 | D.J. Moore | CAR | 30 | 47 | 17 |
| 7 | Odell Beckham Jr. | CLE | 31 | 48 | 17 |
| 8 | Courtland Sutton | DEN | 50 | 67 | 17 |
| 9 | Allen Robinson | CHI | 24 | 40 | 16 |
| 10 | DeVante Parker | MIA | 55 | 71 | 16 |
| 11 | JuJu Smith-Schuster | PIT | 28 | 43 | 15 |
| 13 | Jarvis Landry | CLE | 67 | 81 | 14 |
| 14 | Robert Woods | LAR | 36 | 49 | 13 |
| 15 | Tyler Boyd | CIN | 72 | 85 | 13 |
| 16 | Marvin Jones | DET | 81 | 94 | 13 |
| 17 | Terry McLaurin | WAS | 49 | 59 | 10 |
| 21 | D.J. Chark | JAX | 47 | 55 | 8 |
Conclusion
These tables might help when drafting on Sleeper this year. You should, however, never base your whole draft on them. If you like a player and find him in the Steals table, great! You might even get him a round later than usual. If you like a player but he is in the Overhyped table, you have to decide whether you really want him, because you might have to pay a hefty price.
Standalone Overhyped Table can be found here
Standalone Steal table can be found here