Hello. I have been playing around with German soccer (Bundesliga) data in R using the dplyr package.
There is a neat data package in R called bundesligR. It provides a dataset, also named bundesligR, which contains all final tables of Germany’s top-tier soccer league, the Bundesliga.
Notable teams from the Bundesliga include FC Bayern Munchen (Munich), Borussia Dortmund, Bayer 04 Leverkusen, and Borussia Monchengladbach.
If you have not installed the bundesligR or dplyr packages, you can install them both using:
install.packages("bundesligR")
install.packages("dplyr")
After installation, we convert the bundesligR dataset into a data frame and name it soccer. We also take a look at the data. The data spans from 1964 to 2016.
library(dplyr)
library(bundesligR)
soccer <- as.data.frame(bundesligR)
head(soccer)
## Season Position Team Played W D L GF GA GD Points
## 1 2015 1 FC Bayern Muenchen 34 28 4 2 80 17 63 88
## 2 2015 2 Borussia Dortmund 34 24 6 4 82 34 48 78
## 3 2015 3 Bayer 04 Leverkusen 34 18 6 10 56 40 16 60
## 4 2015 4 Borussia Moenchengladbach 34 17 4 13 67 50 17 55
## 5 2015 5 FC Schalke 04 34 15 7 12 51 49 2 52
## 6 2015 6 1. FSV Mainz 05 34 14 8 12 46 42 4 50
## Pts_pre_95
## 1 60
## 2 54
## 3 42
## 4 38
## 5 37
## 6 36
The team with the most points at the end of a season is the title winner for that season. The Season variable is the year in which the season starts. From head(soccer), the most recent data is from the 2015 season (late summer 2015 to spring 2016).
Position refers to the ranking in the table. Team is the football team. Played is the number of games played in the season. W, D and L refer to the team's wins, draws and losses. GF (goals for) is the number of goals the team scored in the season, GA (goals against) is the number of goals it conceded, and GD is the goal differential, GF - GA.
For points, a win gives the winning team 3 points, a draw gives each team 1 point and a loss gives zero points. Before 1995 a win was worth only 2 points; the variable Pts_pre_95 holds the points total under that older rule.
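As a quick check of the two scoring rules, the sketch below recomputes both point totals for FC Bayern Muenchen's 2015 row shown above (28 wins, 4 draws, 2 losses). The helper name league_points is hypothetical and is not part of bundesligR.

```r
# Hypothetical helper: league points from wins and draws.
# points_per_win is 3 under the current rule and 2 under the pre-1995 rule.
league_points <- function(wins, draws, points_per_win = 3) {
  points_per_win * wins + draws
}

# FC Bayern Muenchen, Season 2015: 28 wins, 4 draws, 2 losses
points_now    <- league_points(28, 4)                      # compare with Points
points_pre_95 <- league_points(28, 4, points_per_win = 2)  # compare with Pts_pre_95
points_now
points_pre_95
```

The two results should agree with the Points and Pts_pre_95 values in the first row of head(soccer).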
The full R documentation of the bundesligR dataset can be found with ??bundesligR.
Next, the last column, Pts_pre_95, is removed from the dataset and a few columns are renamed.
# Season is the year in which the season starts (2015 means 2015-2016).
# Rename a few columns: Played -> Games.Played, W -> Wins, D -> Draws, L -> Losses
# Remove the Pts_pre_95 column
soccer <- soccer %>%
  rename(Games.Played = Played, Wins = W, Draws = D, Losses = L)
soccer <- subset(soccer, select = -Pts_pre_95)
head(soccer)
## Season Position Team Games.Played Wins Draws Losses
## 1 2015 1 FC Bayern Muenchen 34 28 4 2
## 2 2015 2 Borussia Dortmund 34 24 6 4
## 3 2015 3 Bayer 04 Leverkusen 34 18 6 10
## 4 2015 4 Borussia Moenchengladbach 34 17 4 13
## 5 2015 5 FC Schalke 04 34 15 7 12
## 6 2015 6 1. FSV Mainz 05 34 14 8 12
## GF GA GD Points
## 1 80 17 63 88
## 2 82 34 48 78
## 3 56 40 16 60
## 4 67 50 17 55
## 5 51 49 2 52
## 6 46 42 4 50
The %>% pipe operator is used for easier reading: soccer %>% rename(...) passes soccer as the first argument to rename(). The Pts_pre_95 column is dropped here with base R's subset(); the dplyr equivalent is select(soccer, -Pts_pre_95), or with the pipe, soccer %>% select(-Pts_pre_95). The negative sign in front of Pts_pre_95 tells R to remove the specified column. Removing a column is easier than selecting everything else.
rename() renames existing columns, using the form new_name = old_name.
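To make the two verbs concrete, here is a minimal sketch on a small hand-made table (the object names toy and toy_clean are made up for illustration) that chains rename() and select() in one pipe:

```r
library(dplyr)

# Toy table with a few of the dataset's original column names
toy <- data.frame(
  Team = c("FC Bayern Muenchen", "Borussia Dortmund"),
  Played = c(34, 34),
  W = c(28, 24),
  Pts_pre_95 = c(60, 54)
)

# rename() takes new_name = old_name; select(-col) drops a column
toy_clean <- toy %>%
  rename(Games.Played = Played, Wins = W) %>%
  select(-Pts_pre_95)

names(toy_clean)
```

The remaining columns should be Team, Games.Played and Wins, in that order.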
Now we use dplyr to find some interesting facts about the Bundesliga's history.
2015-2016 Season
Here are the results from the previous Bundesliga season (2015-2016). The filter() function keeps only the rows for that season.
season_2015 <- soccer %>% filter(Season == 2015)
This season was interesting in that Borussia Dortmund had a very good season (78 points) and still finished 10 points behind FC Bayern Munchen. The gap between Borussia Dortmund in second place and Bayer 04 Leverkusen in third place was 18 points.
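The point gaps quoted here can be computed rather than read off by eye. A minimal sketch, using the top three rows of the 2015 table typed in by hand (the column name Gap.Above is made up for illustration):

```r
library(dplyr)

# Points of the top three teams in the 2015 season, copied from the table above
top3_2015 <- data.frame(
  Position = 1:3,
  Team = c("FC Bayern Muenchen", "Borussia Dortmund", "Bayer 04 Leverkusen"),
  Points = c(88, 78, 60)
)

# Gap.Above = points of the team one place higher minus this team's points;
# the first-placed team has nobody above it, so its gap is NA
gaps <- top3_2015 %>%
  arrange(Position) %>%
  mutate(Gap.Above = dplyr::lag(Points) - Points)

gaps
```

The same mutate() call should work on season_2015 directly, since it has the same Position and Points columns.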
Best Season Of All Time
The best season of all time in the Bundesliga belongs to the team which had the most points at the end of the season.
# Best season where the title winning team had the most
# points in history from 1964-2016.
best_season <- soccer %>% filter(Points == max(Points))
best_season
## Season Position Team Games.Played Wins Draws Losses GF GA
## 1 2012 1 FC Bayern Muenchen 34 29 4 1 98 18
## GD Points
## 1 80 91
For the 2012-2013 season, FC Bayern Munchen won the Bundesliga with a record 91 points. They also won the DFB-Pokal and the UEFA Champions League for that season, winning the treble. (Winning the treble is very difficult.)
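filter(Points == max(Points)) returns only the rows tied for first. If you want the top few seasons instead, one option is slice_max(), sketched here on three Bayern rows copied from this dataset. This assumes dplyr 1.0.0 or later, where slice_max() is available.

```r
library(dplyr)

# Three FC Bayern Muenchen seasons with their point totals from the dataset
seasons <- data.frame(
  Season = c(2015, 2013, 2012),
  Team = "FC Bayern Muenchen",
  Points = c(88, 90, 91)
)

# Two best seasons by points; with_ties = FALSE keeps exactly n rows
best_two <- seasons %>% slice_max(Points, n = 2, with_ties = FALSE)
best_two
```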
Worst Season Of All Time
The worst season of all time in the Bundesliga belongs to the last-placed team with the fewest points in history. (The last-placed team is also relegated to the 2. Bundesliga, the second-tier league.)
# Worst season where the last place (and relegated) team had the lowest points
# in history from 1964 to 2016.
worst_season <- soccer %>% filter(Points == min(Points))
worst_season
## Season Position Team Games.Played Wins Draws Losses
## 1 1965 18 SC Tasmania 1900 Berlin 34 2 4 28
## GF GA GD Points
## 1 15 10 5 10
SC Tasmania 1900 Berlin came in dead last in 1965 with 10 points from 2 wins, 4 draws, and 28 losses.
Top 5 Teams Per Season
We can find the top 5 teams per season in this data. As this subset is quite large, we look at the top 5 teams from the 2010-2011 season to the 2015-2016 season.
# Top 5 teams per season
top5 <- soccer %>% group_by(Season) %>%
  filter(Position <= 5)
top5 <- data.frame(top5)
# The first 30 rows cover the six most recent seasons (5 teams each)
head(top5, n = 30)
## Season Position Team Games.Played Wins Draws
## 1 2015 1 FC Bayern Muenchen 34 28 4
## 2 2015 2 Borussia Dortmund 34 24 6
## 3 2015 3 Bayer 04 Leverkusen 34 18 6
## 4 2015 4 Borussia Moenchengladbach 34 17 4
## 5 2015 5 FC Schalke 04 34 15 7
## 6 2014 1 FC Bayern Muenchen 34 25 4
## 7 2014 2 VfL Wolfsburg 34 20 9
## 8 2014 3 Borussia Moenchengladbach 34 19 9
## 9 2014 4 Bayer 04 Leverkusen 34 17 10
## 10 2014 5 FC Augsburg 34 15 4
## 11 2013 1 FC Bayern Muenchen 34 29 3
## 12 2013 2 Borussia Dortmund 34 22 5
## 13 2013 3 FC Schalke 04 34 19 7
## 14 2013 4 Bayer 04 Leverkusen 34 19 4
## 15 2013 5 VfL Wolfsburg 34 18 6
## 16 2012 1 FC Bayern Muenchen 34 29 4
## 17 2012 2 Borussia Dortmund 34 19 9
## 18 2012 3 Bayer 04 Leverkusen 34 19 8
## 19 2012 4 FC Schalke 04 34 16 7
## 20 2012 5 SC Freiburg 34 14 9
## 21 2011 1 Borussia Dortmund 34 25 6
## 22 2011 2 FC Bayern Muenchen 34 23 4
## 23 2011 3 FC Schalke 04 34 20 4
## 24 2011 4 Borussia Moenchengladbach 34 17 9
## 25 2011 5 Bayer 04 Leverkusen 34 15 9
## 26 2010 1 Borussia Dortmund 34 23 6
## 27 2010 2 Bayer 04 Leverkusen 34 20 8
## 28 2010 3 FC Bayern Muenchen 34 19 8
## 29 2010 4 Hannover 96 34 19 3
## 30 2010 5 1. FSV Mainz 05 34 18 4
## Losses GF GA GD Points
## 1 2 80 17 63 88
## 2 4 82 34 48 78
## 3 10 56 40 16 60
## 4 13 67 50 17 55
## 5 12 51 49 2 52
## 6 5 80 18 62 79
## 7 5 72 38 34 69
## 8 6 53 26 27 66
## 9 7 62 37 25 61
## 10 15 43 43 0 49
## 11 2 94 23 71 90
## 12 7 80 38 42 71
## 13 8 63 43 20 64
## 14 11 60 41 19 61
## 15 10 63 50 13 60
## 16 1 98 18 80 91
## 17 6 81 42 39 66
## 18 7 65 39 26 65
## 19 11 58 50 8 55
## 20 11 45 40 5 51
## 21 3 80 25 55 81
## 22 7 77 22 55 73
## 23 10 74 44 30 64
## 24 8 49 24 25 60
## 25 10 52 44 8 54
## 26 5 67 22 45 75
## 27 6 64 44 20 68
## 28 7 81 40 41 65
## 29 12 49 45 4 60
## 30 12 52 39 13 58
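Taking the first 30 rows relies on the data being ordered with the newest season first. A sketch of an alternative that does not depend on row order is to filter on Season as well. The toy table below reuses four rows that appear in the outputs of this post; the object names are made up.

```r
library(dplyr)

# Toy table: Season, Position and Team for four rows seen in this post
toy <- data.frame(
  Season = c(2015, 2015, 2010, 2009),
  Position = c(1, 6, 5, 1),
  Team = c("FC Bayern Muenchen", "1. FSV Mainz 05",
           "1. FSV Mainz 05", "FC Bayern Muenchen")
)

# Keep top-5 finishes from the 2010 season onwards, whatever the row order
top5_recent <- toy %>% filter(Position <= 5, Season >= 2010)
top5_recent
```

Only the 2015 first-place row and the 2010 fifth-place row pass both conditions.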
Number of Titles for FC Bayern Munchen
We can also find the number of times a certain team wins the Bundesliga title by placing first in a season. Here, we look at FC Bayern Munchen and their number of Bundesliga titles.
# How many times Bayern Muenchen won the Bundesliga title (1964-2016).
# (They also won a German championship in 1931-1932, before the Bundesliga existed.)
bayern_wins <- soccer %>% group_by(Season) %>%
  filter(Team == "FC Bayern Muenchen" & Points == max(Points))
bayern_wins <- as.data.frame(bayern_wins)
bayern_wins
## Season Position Team Games.Played Wins Draws Losses GF
## 1 2015 1 FC Bayern Muenchen 34 28 4 2 80
## 2 2014 1 FC Bayern Muenchen 34 25 4 5 80
## 3 2013 1 FC Bayern Muenchen 34 29 3 2 94
## 4 2012 1 FC Bayern Muenchen 34 29 4 1 98
## 5 2009 1 FC Bayern Muenchen 34 20 10 4 72
## 6 2007 1 FC Bayern Muenchen 34 22 10 2 68
## 7 2005 1 FC Bayern Muenchen 34 22 9 3 67
## 8 2004 1 FC Bayern Muenchen 34 24 5 5 75
## 9 2002 1 FC Bayern Muenchen 34 23 6 5 70
## 10 2000 1 FC Bayern Muenchen 34 19 6 9 62
## 11 1999 1 FC Bayern Muenchen 34 22 7 5 73
## 12 1998 1 FC Bayern Muenchen 34 24 6 4 76
## 13 1996 1 FC Bayern Muenchen 34 20 11 3 68
## 14 1993 1 FC Bayern Muenchen 34 17 10 7 68
## 15 1989 1 FC Bayern Muenchen 34 19 11 4 64
## 16 1988 1 FC Bayern Muenchen 34 19 12 3 67
## 17 1986 1 FC Bayern Muenchen 34 20 13 1 67
## 18 1985 1 FC Bayern Muenchen 34 21 7 6 82
## 19 1984 1 FC Bayern Muenchen 34 21 8 5 79
## 20 1980 1 FC Bayern Muenchen 34 22 9 3 89
## 21 1979 1 FC Bayern Muenchen 34 22 6 6 84
## 22 1973 1 FC Bayern Muenchen 34 20 9 5 95
## 23 1972 1 FC Bayern Muenchen 34 25 4 5 93
## 24 1971 1 FC Bayern Muenchen 34 24 7 3 101
## 25 1968 1 FC Bayern Muenchen 34 18 10 6 61
## GA GD Points
## 1 17 63 88
## 2 18 62 79
## 3 23 71 90
## 4 18 80 91
## 5 31 41 70
## 6 21 47 76
## 7 32 35 75
## 8 33 42 77
## 9 25 45 75
## 10 37 25 63
## 11 28 45 73
## 12 28 48 78
## 13 34 34 71
## 14 37 31 61
## 15 28 36 68
## 16 26 41 69
## 17 31 36 73
## 18 31 51 70
## 19 38 41 71
## 20 41 48 75
## 21 33 51 72
## 22 53 42 69
## 23 29 64 79
## 24 38 63 79
## 25 31 30 64
bayern_winCount <- nrow(bayern_wins); bayern_winCount
## [1] 25
From 1964 to now (2016), FC Bayern Munchen has won the Bundesliga title 25 times, an impressive feat.
Number Of Titles For Borussia Dortmund
Here is the number of titles for Borussia Dortmund.
# How Many Times Borussia Dortmund won the Bundesliga title in (1964-2016).
dortmund_wins <- soccer %>% group_by(Season) %>%
  filter(Team == "Borussia Dortmund" & Points == max(Points))
dortmund_wins <- as.data.frame(dortmund_wins)
dortmund_wins
## Season Position Team Games.Played Wins Draws Losses GF GA
## 1 2011 1 Borussia Dortmund 34 25 6 3 80 25
## 2 2010 1 Borussia Dortmund 34 23 6 5 67 22
## 3 2001 1 Borussia Dortmund 34 21 7 6 62 33
## 4 1995 1 Borussia Dortmund 34 19 11 4 76 38
## 5 1994 1 Borussia Dortmund 34 20 9 5 67 33
## GD Points
## 1 55 81
## 2 45 75
## 3 29 70
## 4 38 68
## 5 34 69
dortmund_winCount <- nrow(dortmund_wins); dortmund_winCount
## [1] 5
List of Title Winning Teams In The Bundesliga
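Here is the list of title-winning teams in the Bundesliga, taken as the team or teams with the most points in each season. Treat the counts with some care: the Title_Wins column sums to 57, more than the number of seasons covered, because teams level on points are each counted. The Points column also applies the 3-point rule to every season, which can rank pre-1995 seasons differently from the official tables (Bayer 04 Leverkusen, for example, had not won a Bundesliga title by 2016). Borussia Moenchengladbach appears under two spellings.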
winning_teams <- soccer %>% group_by(Season) %>%
  filter(Points == max(Points)) %>%
  group_by(Team) %>%
  count(Team) %>%
  rename(Title_Wins = n) %>%
  arrange(desc(Title_Wins))
winning_teams <- data.frame(winning_teams)
winning_teams
## Team Title_Wins
## 1 FC Bayern Muenchen 25
## 2 Borussia Moenchengladbach 6
## 3 Borussia Dortmund 5
## 4 Werder Bremen 5
## 5 1. FC Kaiserslautern 3
## 6 Hamburger SV 3
## 7 1. FC Koeln 2
## 8 VfB Stuttgart 2
## 9 1. FC Nuernberg 1
## 10 Bayer 04 Leverkusen 1
## 11 Bor. Moenchengladbach 1
## 12 Eintracht Braunschweig 1
## 13 TSV 1860 Muenchen 1
## 14 VfL Wolfsburg 1
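Because Points == max(Points) keeps every team level on the season maximum, a tie produces more than one "winner". A minimal sketch of counting by Position == 1 instead, on a made-up two-season table (the team names and numbers are hypothetical, not from bundesligR):

```r
library(dplyr)

# Hypothetical two-season table; team names and numbers are made up.
# In season 1 both teams finish level on points and Position breaks the tie.
toy <- data.frame(
  Season = c(1, 1, 2, 2),
  Position = c(1, 2, 1, 2),
  Team = c("Team A", "Team B", "Team B", "Team A"),
  Points = c(69, 69, 70, 65)
)

# Approach used above: every team level on the season maximum is counted
by_points <- toy %>%
  group_by(Season) %>%
  filter(Points == max(Points)) %>%
  ungroup() %>%
  count(Team, name = "Title_Wins")

# Alternative: exactly one champion per season
by_position <- toy %>%
  filter(Position == 1) %>%
  count(Team, name = "Title_Wins")

sum(by_points$Title_Wins)    # more "titles" than seasons
sum(by_position$Title_Wins)  # one title per season
```

On the real data, the Position-based version should give one champion per season, assuming Position reflects the official final table.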
Total Number of Games Played, Wins, Draws, Losses & Goals For FC Bayern Munchen
# Total Number Of Games Played, Wins, Draws, Losses & Goals For FC Bayern Munchen
bayern_totals <- soccer %>% group_by(Team) %>%
  filter(Team == "FC Bayern Muenchen") %>%
  summarise(Games_Played = sum(Games.Played), Wins = sum(Wins),
            Draws = sum(Draws), Losses = sum(Losses),
            GF = sum(GF), GA = sum(GA), GD = sum(GD))
bayern_totals <- data.frame(bayern_totals)
bayern_totals
## Team Games_Played Wins Draws Losses GF GA GD
## 1 FC Bayern Muenchen 1738 1018 389 331 3764 1919 1845
Total Number of Games Played, Wins, Draws and Losses, Goals for Bayern Muenchen, Borussia Dortmund & Borussia Moenchengladbach & Bayer 04 Leverkusen
# Total Number of Games Played, Wins, Draws and Losses, Goals for Bayern Muenchen, Borussia Dortmund & Borussia Moenchengladbach & Bayer 04 Leverkusen
bteams <- soccer %>% group_by(Team) %>%
  filter(Team %in% c("FC Bayern Muenchen", "Borussia Dortmund",
                     "Borussia Moenchengladbach", "Bayer 04 Leverkusen")) %>%
  summarise(Games_Played = sum(Games.Played), Wins = sum(Wins),
            Draws = sum(Draws), Losses = sum(Losses),
            GF = sum(GF), GA = sum(GA), GD = sum(GD))
bteams <- data.frame(bteams)
bteams
## Team Games_Played Wins Draws Losses GF GA GD
## 1 Bayer 04 Leverkusen 1020 457 278 285 1758 1299 459
## 2 Borussia Dortmund 1662 728 427 507 2915 2355 560
## 3 Borussia Moenchengladbach 1568 632 415 521 2704 2299 405
## 4 FC Bayern Muenchen 1738 1018 389 331 3764 1919 1845
We can add win rates as a new column, where the win rate is the number of wins divided by the number of games played. The dplyr function mutate() is used to add a new column to the data.
bteams_wrate <- bteams %>% mutate(Win.Rate = round(Wins / Games_Played, 2))
bteams_wrate
## Team Games_Played Wins Draws Losses GF GA GD
## 1 Bayer 04 Leverkusen 1020 457 278 285 1758 1299 459
## 2 Borussia Dortmund 1662 728 427 507 2915 2355 560
## 3 Borussia Moenchengladbach 1568 632 415 521 2704 2299 405
## 4 FC Bayern Muenchen 1738 1018 389 331 3764 1919 1845
## Win.Rate
## 1 0.45
## 2 0.44
## 3 0.40
## 4 0.59
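mutate() can create several columns in one call. The sketch below reuses the Leverkusen and Bayern totals from the output above and adds two further rates next to the win rate. The column names Points.Per.Game and Goals.Per.Game are made up for illustration, and points are computed under the 3-point rule.

```r
library(dplyr)

# Career totals copied from the output above
totals <- data.frame(
  Team = c("Bayer 04 Leverkusen", "FC Bayern Muenchen"),
  Games_Played = c(1020, 1738),
  Wins = c(457, 1018),
  Draws = c(278, 389),
  GF = c(1758, 3764)
)

# Each new column is computed row by row from the existing ones
rates <- totals %>%
  mutate(
    Win.Rate = round(Wins / Games_Played, 2),
    Points.Per.Game = round((3 * Wins + Draws) / Games_Played, 2),
    Goals.Per.Game = round(GF / Games_Played, 2)
  )
rates
```

The Win.Rate values should match the 0.45 and 0.59 shown above for these two teams.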
Total Number of Games Played, Wins, Draws, Losses & Goals For All Teams Who Played In The Bundesliga
# Overall Record
teams <- soccer %>% group_by(Team) %>%
  summarise(Games_Played = sum(Games.Played), Wins = sum(Wins),
            Draws = sum(Draws), Losses = sum(Losses),
            GF = sum(GF), GA = sum(GA), GD = sum(GD)) %>%
  mutate(Win.Rate = round(Wins / Games_Played, 2)) %>%
  arrange(desc(Wins), desc(Win.Rate))
teams <- data.frame(teams)
teams
## Team Games_Played Wins Draws Losses GF GA GD
## 1 FC Bayern Muenchen 1738 1018 389 331 3764 1919 1845
## 2 Werder Bremen 1726 737 424 565 2932 2511 421
## 3 Borussia Dortmund 1662 728 427 507 2915 2355 560
## 4 Hamburger SV 1798 728 480 590 2875 2548 327
## 5 VfB Stuttgart 1730 718 421 591 2901 2522 379
## 6 FC Schalke 04 1628 644 406 578 2409 2331 78
## 7 Borussia Moenchengladbach 1568 632 415 521 2704 2299 405
## 8 1. FC Koeln 1526 612 387 527 2531 2252 279
## 9 Eintracht Frankfurt 1594 581 405 608 2506 2484 22
## 10 1. FC Kaiserslautern 1492 575 372 545 2348 2344 4
## 11 Bayer 04 Leverkusen 1020 457 278 285 1758 1299 459
## 12 Hertha BSC 1114 406 280 428 1584 1701 -117
## 13 VfL Bochum 1160 356 306 498 1602 1887 -285
## 14 1. FC Nuernberg 1084 341 276 467 1402 1726 -324
## 15 Hannover 96 948 293 235 420 1310 1609 -299
## 16 MSV Duisburg 854 257 230 367 1115 1388 -273
## 17 VfL Wolfsburg 646 251 162 233 995 951 44
## 18 Fortuna Duesseldorf 786 245 215 326 1160 1386 -226
## 19 Eintracht Braunschweig 706 242 177 287 937 1086 -149
## 20 Karlsruher SC 812 241 230 341 1093 1408 -315
## 21 TSV 1860 Muenchen 672 238 170 264 1022 1059 -37
## 22 SC Freiburg 544 166 137 241 682 864 -182
## 23 Arminia Bielefeld 578 159 146 273 686 958 -272
## 24 1. FSV Mainz 05 340 117 93 130 450 478 -28
## 25 Hansa Rostock 374 114 96 164 449 566 -117
## 26 TSG 1899 Hoffenheim 272 87 76 109 400 434 -34
## 27 Kickers Offenbach 238 77 51 110 368 390 -22
## 28 SV Waldhof Mannheim 238 71 72 95 299 378 -79
## 29 FC Bayer 05 Uerdingen 204 70 46 88 310 385 -75
## 30 Bayer 05 Uerdingen 238 63 72 103 301 403 -102
## 31 Rot-Weiss Essen 238 61 79 98 346 390 -44
## 32 FC St. Pauli 272 58 80 134 296 485 -189
## 33 Energie Cottbus 204 56 43 105 211 338 -127
## 34 FC Augsburg 170 55 45 70 201 242 -41
## 35 TSV Bayer 04 Leverkusen 140 50 49 41 215 190 25
## 36 Alemannia Aachen 136 43 28 65 186 270 -84
## 37 Meidericher SV 94 39 29 26 176 132 44
## 38 SG Wattenscheid 09 140 34 48 58 186 248 -62
## 39 SV Bayer 04 Leverkusen 102 32 24 46 138 188 -50
## 40 1. FC Saarbruecken 166 32 48 86 202 336 -134
## 41 Rot-Weiss Oberhausen 102 29 20 53 149 215 -66
## 42 Bor. Moenchengladbach 68 28 18 22 95 79 16
## 43 Borussia Neunkirchen 98 25 18 55 109 223 -114
## 44 Wuppertaler SV 102 25 27 50 136 200 -64
## 45 Dynamo Dresden 102 21 35 46 98 161 -63
## 46 SV Darmstadt 98 102 21 29 52 124 210 -86
## 47 SpVgg Unterhaching 68 20 19 29 75 101 -26
## 48 FC 08 Homburg 68 15 18 35 70 121 -51
## 49 1. FC Dynamo Dresden 38 12 10 16 34 50 -16
## 50 SV Werder Bremen 38 11 16 11 44 45 -1
## 51 Tennis Borussia Berlin 68 11 16 41 85 174 -89
## 52 FC Ingolstadt 04 34 10 10 14 33 42 -9
## 53 Stuttgarter Kickers 34 10 6 18 41 68 -27
## 54 FC Hansa Rostock 38 10 11 17 43 55 -12
## 55 SV Stuttgarter Kickers 38 10 11 17 53 64 -11
## 56 SSV Ulm 1846 34 9 8 17 36 62 -26
## 57 SC Fortuna Koeln 34 8 9 17 46 79 -33
## 58 Preussen Muenster 30 7 9 14 34 52 -18
## 59 SC Paderborn 07 34 7 10 17 31 65 -34
## 60 SC Rot-Weiss Oberhausen 34 7 11 16 33 66 -33
## 61 FC Homburg 34 6 9 19 33 79 -46
## 62 KFC Uerdingen 05 34 5 11 18 33 56 -23
## 63 SpVgg Greuther Fuerth 34 4 9 21 26 60 -34
## 64 Blau-Weiss 90 Berlin 34 3 12 19 36 76 -40
## 65 VfB Leipzig 34 3 11 20 32 69 -37
## 66 SC Tasmania 1900 Berlin 34 2 4 28 15 10 5
## Win.Rate
## 1 0.59
## 2 0.43
## 3 0.44
## 4 0.40
## 5 0.42
## 6 0.40
## 7 0.40
## 8 0.40
## 9 0.36
## 10 0.39
## 11 0.45
## 12 0.36
## 13 0.31
## 14 0.31
## 15 0.31
## 16 0.30
## 17 0.39
## 18 0.31
## 19 0.34
## 20 0.30
## 21 0.35
## 22 0.31
## 23 0.28
## 24 0.34
## 25 0.30
## 26 0.32
## 27 0.32
## 28 0.30
## 29 0.34
## 30 0.26
## 31 0.26
## 32 0.21
## 33 0.27
## 34 0.32
## 35 0.36
## 36 0.32
## 37 0.41
## 38 0.24
## 39 0.31
## 40 0.19
## 41 0.28
## 42 0.41
## 43 0.26
## 44 0.25
## 45 0.21
## 46 0.21
## 47 0.29
## 48 0.22
## 49 0.32
## 50 0.29
## 51 0.16
## 52 0.29
## 53 0.29
## 54 0.26
## 55 0.26
## 56 0.26
## 57 0.24
## 58 0.23
## 59 0.21
## 60 0.21
## 61 0.18
## 62 0.15
## 63 0.12
## 64 0.09
## 65 0.09
## 66 0.06
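The table above lists some clubs under more than one spelling, for example Borussia Moenchengladbach and Bor. Moenchengladbach, so their totals are split across rows. One possible way to merge them is a small lookup of variant names applied before summarising. This is only a sketch: the aliases vector is hypothetical, and which spellings belong to the same club has to be checked by hand.

```r
library(dplyr)

# Two rows copied from the output above: one club under two spellings
toy <- data.frame(
  Team = c("Borussia Moenchengladbach", "Bor. Moenchengladbach"),
  Games_Played = c(1568, 68),
  Wins = c(632, 28),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: variant spelling -> name to keep
aliases <- c("Bor. Moenchengladbach" = "Borussia Moenchengladbach")

# Replace known variants, then add up the rows that now share a name
merged <- toy %>%
  mutate(Team = ifelse(Team %in% names(aliases), aliases[Team], Team)) %>%
  group_by(Team) %>%
  summarise(Games_Played = sum(Games_Played), Wins = sum(Wins))
merged
```

Applied to soccer before the group_by(Team) step, the same idea should give one row per club instead of one row per spelling.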
The featured image is from http://arysports.tv/wp-content/uploads/2015/11/bundesliga.jpg.