Hello. I have been playing around with German soccer (Bundesliga) data in R using the dplyr package.

 


The Bundesliga Data Using bundesligR

 

There is a neat data package in R called bundesligR. It ships a dataset of the same name, bundesligR, which contains all final tables of Germany’s top-tier soccer league, the Bundesliga.

Notable teams from the Bundesliga include FC Bayern Munchen (Munich), Borussia Dortmund, Bayer 04 Leverkusen, and Borussia Monchengladbach.

If you have not installed the bundesligR or the dplyr package, you can install them both using:

 

install.packages("bundesligR")

install.packages("dplyr")

 

The Dataset

 

After installation, we convert the bundesligR dataset into a data frame and name it soccer. We also take a look at the data. The data spans from 1964 to 2016.

 

library(dplyr)
library(bundesligR)

soccer <- as.data.frame(bundesligR)

head(soccer)
##   Season Position                      Team Played  W D  L GF GA GD Points
## 1   2015        1        FC Bayern Muenchen     34 28 4  2 80 17 63     88
## 2   2015        2         Borussia Dortmund     34 24 6  4 82 34 48     78
## 3   2015        3       Bayer 04 Leverkusen     34 18 6 10 56 40 16     60
## 4   2015        4 Borussia Moenchengladbach     34 17 4 13 67 50 17     55
## 5   2015        5             FC Schalke 04     34 15 7 12 51 49  2     52
## 6   2015        6           1. FSV Mainz 05     34 14 8 12 46 42  4     50
##   Pts_pre_95
## 1         60
## 2         54
## 3         42
## 4         38
## 5         37
## 6         36

 

The team with the most points at the end of a season is the title winner for that season. The Season variable is the year in which the season starts. From head(soccer), the most recent data is from the 2015 season (late summer 2015 to spring 2016).

Position refers to the ranking in the table. Team is the football team. Played is the number of games played in the season. W, D and L refer to wins, draws and losses for the team. GF is goals for (how many goals the team scored in the season), GA is goals against, and GD is goal difference, which is GF - GA.

For points, a win gives the winning team 3 points, a draw gives each team 1 point and a loss gives zero points. Before 1995 a win was worth only 2 points; the variable Pts_pre_95 holds the points total under that older system.
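
As a quick check of the two points rules, the sketch below recomputes both totals from wins and draws. The three rows are copied from the head(soccer) output above; the names tab, points_3 and points_2 are made up for this illustration.

```r
# First three rows of the 2015 table, copied from the head(soccer) output.
tab <- data.frame(
  Team = c("FC Bayern Muenchen", "Borussia Dortmund", "Bayer 04 Leverkusen"),
  W = c(28, 24, 18),
  D = c(4, 6, 6),
  Points = c(88, 78, 60),
  Pts_pre_95 = c(60, 54, 42)
)

# Current rule: 3 points per win, 1 point per draw.
points_3 <- 3 * tab$W + tab$D

# Older rule (before 1995): 2 points per win, 1 point per draw.
points_2 <- 2 * tab$W + tab$D

# Both comparisons should be TRUE for every row.
points_3 == tab$Points
points_2 == tab$Pts_pre_95
```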

The full R Documentation of the bundesligR dataset can be found with ??bundesligR

Next, we remove the last column, Pts_pre_95, from the dataset and rename a few columns.

 

# Season is the year in which the season started (2015 means 2015-2016).

# Rename a few columns: W = Wins, D = Draws, L = Losses

# Remove Pts_pre_95 variable/column
soccer <- soccer %>% 
              rename(Games.Played = Played, Wins = W, Draws = D, Losses = L) %>%
              select(-Pts_pre_95)


head(soccer)
##   Season Position                      Team Games.Played Wins Draws Losses
## 1   2015        1        FC Bayern Muenchen           34   28     4      2
## 2   2015        2         Borussia Dortmund           34   24     6      4
## 3   2015        3       Bayer 04 Leverkusen           34   18     6     10
## 4   2015        4 Borussia Moenchengladbach           34   17     4     13
## 5   2015        5             FC Schalke 04           34   15     7     12
## 6   2015        6           1. FSV Mainz 05           34   14     8     12
##   GF GA GD Points
## 1 80 17 63     88
## 2 82 34 48     78
## 3 56 40 16     60
## 4 67 50 17     55
## 5 51 49  2     52
## 6 46 42  4     50

 

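The %>% pipe operator passes the data frame on its left to the function on its right, which makes a chain of steps easier to read. Instead of the nested call select(soccer, -Pts_pre_95), a column can be dropped in a pipeline with soccer %>% select(-Pts_pre_95); base R's subset(soccer, select = -Pts_pre_95) does the same job. The negative sign in front of Pts_pre_95 tells R to remove the specified column. Removing one column is easier than selecting everything else.

The rename() function renames existing columns, using the form new_name = old_name.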
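
Here is a minimal sketch of rename() and select() chained in one pipeline. The toy data frame raw is made up for illustration; its column names match the raw dataset and its values are copied from the first head(soccer) output.

```r
library(dplyr)

# Toy table with the raw column names (two rows from the 2015 season).
raw <- data.frame(
  Team = c("FC Bayern Muenchen", "Borussia Dortmund"),
  Played = c(34, 34),
  W = c(28, 24),
  D = c(4, 6),
  L = c(2, 4),
  Pts_pre_95 = c(60, 54),
  stringsAsFactors = FALSE
)

clean <- raw %>%
  rename(Games.Played = Played, Wins = W, Draws = D, Losses = L) %>%  # new_name = old_name
  select(-Pts_pre_95)                                                  # drop one column

names(clean)
```

Chaining both steps means the intermediate data frame never needs its own name.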

 

Selecting Data Using R’s dplyr Package

 

Now we use dplyr to pull some interesting facts out of the Bundesliga's history.

 

2015-2016 Season

 

Here are the results from the 2015-2016 Bundesliga season, the most recent one in the data. The filter() function keeps only the rows where Season equals 2015.

 

season_2015 <- soccer %>% filter(Season == 2015)

 

This season was interesting: Borussia Dortmund had a very good season with 78 points and still finished 10 points behind FC Bayern Munchen. The gap between Dortmund in second place and Bayer 04 Leverkusen in third was 18 points.
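
The gaps mentioned here can also be computed rather than read off the table. The sketch below is one way to do it with mutate(); the data frame season_top6 and the column names Gap.To.Leader and Gap.To.Above are made up for this example, with positions and points copied from the head(soccer) output.

```r
library(dplyr)

# Top six of the 2015 season, copied from the head(soccer) output.
season_top6 <- data.frame(
  Position = 1:6,
  Team = c("FC Bayern Muenchen", "Borussia Dortmund", "Bayer 04 Leverkusen",
           "Borussia Moenchengladbach", "FC Schalke 04", "1. FSV Mainz 05"),
  Points = c(88, 78, 60, 55, 52, 50),
  stringsAsFactors = FALSE
)

gaps <- season_top6 %>%
  arrange(Position) %>%
  mutate(
    Gap.To.Leader = first(Points) - Points,  # distance to first place
    Gap.To.Above  = lag(Points) - Points     # distance to the team one place higher
  )

gaps
```

The first row has no team above it, so its Gap.To.Above is NA.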

 


 

Best Season Of All Time

 

The best season of all time in the Bundesliga belongs to the team which had the most points at the end of the season.

 

# Best season where the title winning team had the most 
# points in history from 1964-2016.

best_season <- soccer %>% filter(Points == max(Points))
best_season
##   Season Position               Team Games.Played Wins Draws Losses GF GA
## 1   2012        1 FC Bayern Muenchen           34   29     4      1 98 18
##   GD Points
## 1 80     91

 

For the 2012-2013 season, FC Bayern Munchen won the Bundesliga with a record 91 points. They also won the DFB-Pokal and the UEFA Champions League for that season, winning the treble. (Winning the treble is very difficult.)

 


 

Worst Season Of All Time

 

The worst season of all time in the Bundesliga belongs to the last-placed team with the fewest points. (Finishing last also means relegation to the 2. Bundesliga, the second-tier league.)

 

# Worst season where the last place (and relegated) team had the lowest points
# in history from 1964 to 2016.

worst_season <- soccer %>% filter(Points == min(Points))
worst_season 
##   Season Position                    Team Games.Played Wins Draws Losses
## 1   1965       18 SC Tasmania 1900 Berlin           34    2     4     28
##   GF GA GD Points
## 1 15 10  5     10

 

SC Tasmania 1900 Berlin came in dead last in 1965 with 10 points from 2 wins, 4 draws, and 28 losses.
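
As an alternative to filter(Points == max(Points)) and filter(Points == min(Points)), dplyr also offers slice_max() and slice_min(). The sketch below assumes dplyr 1.0.0 or later, where these functions are available; the small data frame seasons is made up for illustration, with rows copied from outputs shown in this post.

```r
library(dplyr)

# A few team-seasons copied from the outputs in this post.
seasons <- data.frame(
  Season = c(2015, 2012, 2010, 1965),
  Team = c("FC Bayern Muenchen", "FC Bayern Muenchen",
           "Borussia Dortmund", "SC Tasmania 1900 Berlin"),
  Points = c(88, 91, 75, 10),
  stringsAsFactors = FALSE
)

# Row(s) with the most points; ties are kept by default (with_ties = TRUE).
best <- seasons %>% slice_max(Points, n = 1)

# Row(s) with the fewest points.
worst <- seasons %>% slice_min(Points, n = 1)

best
worst
```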

 


 

Top 5 Teams Per Season

 

We can find the top 5 teams per season in this data. As this subset is quite large, we look at the top 5 teams from the 2010-2011 season to the 2015-2016 season.

 

# Top 5 teams per season

top5 <- soccer %>% group_by(Season) %>%
             filter(Position <= 5)

top5 <- data.frame(top5)

head(top5, n = 30)
##    Season Position                      Team Games.Played Wins Draws
## 1    2015        1        FC Bayern Muenchen           34   28     4
## 2    2015        2         Borussia Dortmund           34   24     6
## 3    2015        3       Bayer 04 Leverkusen           34   18     6
## 4    2015        4 Borussia Moenchengladbach           34   17     4
## 5    2015        5             FC Schalke 04           34   15     7
## 6    2014        1        FC Bayern Muenchen           34   25     4
## 7    2014        2             VfL Wolfsburg           34   20     9
## 8    2014        3 Borussia Moenchengladbach           34   19     9
## 9    2014        4       Bayer 04 Leverkusen           34   17    10
## 10   2014        5               FC Augsburg           34   15     4
## 11   2013        1        FC Bayern Muenchen           34   29     3
## 12   2013        2         Borussia Dortmund           34   22     5
## 13   2013        3             FC Schalke 04           34   19     7
## 14   2013        4       Bayer 04 Leverkusen           34   19     4
## 15   2013        5             VfL Wolfsburg           34   18     6
## 16   2012        1        FC Bayern Muenchen           34   29     4
## 17   2012        2         Borussia Dortmund           34   19     9
## 18   2012        3       Bayer 04 Leverkusen           34   19     8
## 19   2012        4             FC Schalke 04           34   16     7
## 20   2012        5               SC Freiburg           34   14     9
## 21   2011        1         Borussia Dortmund           34   25     6
## 22   2011        2        FC Bayern Muenchen           34   23     4
## 23   2011        3             FC Schalke 04           34   20     4
## 24   2011        4 Borussia Moenchengladbach           34   17     9
## 25   2011        5       Bayer 04 Leverkusen           34   15     9
## 26   2010        1         Borussia Dortmund           34   23     6
## 27   2010        2       Bayer 04 Leverkusen           34   20     8
## 28   2010        3        FC Bayern Muenchen           34   19     8
## 29   2010        4               Hannover 96           34   19     3
## 30   2010        5           1. FSV Mainz 05           34   18     4
##    Losses GF GA GD Points
## 1       2 80 17 63     88
## 2       4 82 34 48     78
## 3      10 56 40 16     60
## 4      13 67 50 17     55
## 5      12 51 49  2     52
## 6       5 80 18 62     79
## 7       5 72 38 34     69
## 8       6 53 26 27     66
## 9       7 62 37 25     61
## 10     15 43 43  0     49
## 11      2 94 23 71     90
## 12      7 80 38 42     71
## 13      8 63 43 20     64
## 14     11 60 41 19     61
## 15     10 63 50 13     60
## 16      1 98 18 80     91
## 17      6 81 42 39     66
## 18      7 65 39 26     65
## 19     11 58 50  8     55
## 20     11 45 40  5     51
## 21      3 80 25 55     81
## 22      7 77 22 55     73
## 23     10 74 44 30     64
## 24      8 49 24 25     60
## 25     10 52 44  8     54
## 26      5 67 22 45     75
## 27      6 64 44 20     68
## 28      7 81 40 41     65
## 29     12 49 45  4     60
## 30     12 52 39 13     58
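
Taking the first 30 rows works because the data is sorted from the newest season to the oldest. A sketch of a more explicit approach is to filter on Season as well as Position. The toy data frame below is made up for illustration, using a handful of rows that appear in the outputs of this post.

```r
library(dplyr)

# A handful of rows (Season, Position, Team) taken from outputs in this post.
toy <- data.frame(
  Season = c(2015, 2015, 2010, 2010, 2009),
  Position = c(1, 6, 1, 5, 1),
  Team = c("FC Bayern Muenchen", "1. FSV Mainz 05", "Borussia Dortmund",
           "1. FSV Mainz 05", "FC Bayern Muenchen"),
  stringsAsFactors = FALSE
)

# Keep top-5 finishes from the 2010-2011 season onwards.
top5_recent <- toy %>%
  filter(Position <= 5, Season >= 2010)

top5_recent
```

Several conditions inside one filter() call are combined with a logical AND, so the result does not depend on row order.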

 


 

Number of Titles for FC Bayern Munchen

 

We can also find the number of times a certain team wins the Bundesliga title by placing first in a season. Here, we look at FC Bayern Munchen and their number of Bundesliga titles.

 

# How many times Bayern Muenchen won the Bundesliga title (1964-2016).
# (They also won a German championship in 1931-1932, before the Bundesliga existed.)

bayern_wins <- soccer %>% group_by(Season) %>%
                 filter(Team == "FC Bayern Muenchen" & Points == max(Points))

bayern_wins <- as.data.frame(bayern_wins)
bayern_wins
##    Season Position               Team Games.Played Wins Draws Losses  GF
## 1    2015        1 FC Bayern Muenchen           34   28     4      2  80
## 2    2014        1 FC Bayern Muenchen           34   25     4      5  80
## 3    2013        1 FC Bayern Muenchen           34   29     3      2  94
## 4    2012        1 FC Bayern Muenchen           34   29     4      1  98
## 5    2009        1 FC Bayern Muenchen           34   20    10      4  72
## 6    2007        1 FC Bayern Muenchen           34   22    10      2  68
## 7    2005        1 FC Bayern Muenchen           34   22     9      3  67
## 8    2004        1 FC Bayern Muenchen           34   24     5      5  75
## 9    2002        1 FC Bayern Muenchen           34   23     6      5  70
## 10   2000        1 FC Bayern Muenchen           34   19     6      9  62
## 11   1999        1 FC Bayern Muenchen           34   22     7      5  73
## 12   1998        1 FC Bayern Muenchen           34   24     6      4  76
## 13   1996        1 FC Bayern Muenchen           34   20    11      3  68
## 14   1993        1 FC Bayern Muenchen           34   17    10      7  68
## 15   1989        1 FC Bayern Muenchen           34   19    11      4  64
## 16   1988        1 FC Bayern Muenchen           34   19    12      3  67
## 17   1986        1 FC Bayern Muenchen           34   20    13      1  67
## 18   1985        1 FC Bayern Muenchen           34   21     7      6  82
## 19   1984        1 FC Bayern Muenchen           34   21     8      5  79
## 20   1980        1 FC Bayern Muenchen           34   22     9      3  89
## 21   1979        1 FC Bayern Muenchen           34   22     6      6  84
## 22   1973        1 FC Bayern Muenchen           34   20     9      5  95
## 23   1972        1 FC Bayern Muenchen           34   25     4      5  93
## 24   1971        1 FC Bayern Muenchen           34   24     7      3 101
## 25   1968        1 FC Bayern Muenchen           34   18    10      6  61
##    GA GD Points
## 1  17 63     88
## 2  18 62     79
## 3  23 71     90
## 4  18 80     91
## 5  31 41     70
## 6  21 47     76
## 7  32 35     75
## 8  33 42     77
## 9  25 45     75
## 10 37 25     63
## 11 28 45     73
## 12 28 48     78
## 13 34 34     71
## 14 37 31     61
## 15 28 36     68
## 16 26 41     69
## 17 31 36     73
## 18 31 51     70
## 19 38 41     71
## 20 41 48     75
## 21 33 51     72
## 22 53 42     69
## 23 29 64     79
## 24 38 63     79
## 25 31 30     64
bayern_winCount <- nrow(bayern_wins); bayern_winCount
## [1] 25

 

From 1964 to now (2016), FC Bayern Munchen has won the Bundesliga title 25 times, an impressive feat.

 


 

Number Of Titles For Borussia Dortmund

Here is the number of titles for Borussia Dortmund.

 

# How Many Times Borussia Dortmund won the Bundesliga title in (1964-2016). 

dortmund_wins <- soccer %>% group_by(Season) %>%
                 filter(Team == "Borussia Dortmund" & Points == max(Points))

dortmund_wins <- as.data.frame(dortmund_wins)
dortmund_wins
##   Season Position              Team Games.Played Wins Draws Losses GF GA
## 1   2011        1 Borussia Dortmund           34   25     6      3 80 25
## 2   2010        1 Borussia Dortmund           34   23     6      5 67 22
## 3   2001        1 Borussia Dortmund           34   21     7      6 62 33
## 4   1995        1 Borussia Dortmund           34   19    11      4 76 38
## 5   1994        1 Borussia Dortmund           34   20     9      5 67 33
##   GD Points
## 1 55     81
## 2 45     75
## 3 29     70
## 4 38     68
## 5 34     69
dortmund_winCount <- nrow(dortmund_wins); dortmund_winCount
## [1] 5

 

 

List of Title Winning Teams In The Bundesliga

 

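Here is the full list of teams that finished a season with the most points, which we treat as title wins. Teams level on points at the top of a season are all kept by this filter, so the counts below sum to 57, which is more than the number of seasons in the data.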

 

winning_teams <- soccer %>% group_by(Season) %>%
                    filter(Points == max(Points)) %>%
                    group_by(Team) %>%
                    count(Team) %>%
                    rename(Title_Wins = n) %>%
                    arrange(desc(Title_Wins))

winning_teams <- data.frame(winning_teams)

winning_teams
##                         Team Title_Wins
## 1         FC Bayern Muenchen         25
## 2  Borussia Moenchengladbach          6
## 3          Borussia Dortmund          5
## 4              Werder Bremen          5
## 5       1. FC Kaiserslautern          3
## 6               Hamburger SV          3
## 7                1. FC Koeln          2
## 8              VfB Stuttgart          2
## 9            1. FC Nuernberg          1
## 10       Bayer 04 Leverkusen          1
## 11     Bor. Moenchengladbach          1
## 12    Eintracht Braunschweig          1
## 13         TSV 1860 Muenchen          1
## 14             VfL Wolfsburg          1
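
If exactly one champion per season is wanted, filtering on Position == 1 is one option. The sketch below compares the two filters on a made-up table in which two teams finish a season level on points; the team names, seasons and points are invented purely for illustration.

```r
library(dplyr)

# Made-up table: in season 1 the top two teams are level on points.
toy <- data.frame(
  Season = c(1, 1, 1, 2, 2, 2),
  Position = c(1, 2, 3, 1, 2, 3),
  Team = c("Team A", "Team B", "Team C", "Team B", "Team A", "Team C"),
  Points = c(73, 73, 60, 80, 70, 50),
  stringsAsFactors = FALSE
)

# Most points per season: both tied teams are kept for season 1.
by_points <- toy %>%
  group_by(Season) %>%
  filter(Points == max(Points)) %>%
  ungroup() %>%
  count(Team, name = "Title_Wins")

# First place per season: exactly one row per season.
by_position <- toy %>%
  filter(Position == 1) %>%
  count(Team, name = "Title_Wins")

sum(by_points$Title_Wins)    # 3 "titles" from 2 seasons
sum(by_position$Title_Wins)  # 2 titles from 2 seasons
```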

 


 

Total Number of Games Played, Wins, Draws, Losses & Goals For FC Bayern Munchen

 

# Total Number Of Games Played, Wins, Draws, Losses & Goals For FC Bayern Munchen

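# summarise() with explicit sums replaces the deprecated summarise_each(funs(sum), ...)
bayern_totals <- soccer %>%
                     filter(Team == "FC Bayern Muenchen") %>%
                     group_by(Team) %>%
                     summarise(Games_Played = sum(Games.Played), Wins = sum(Wins),
                               Draws = sum(Draws), Losses = sum(Losses),
                               GF = sum(GF), GA = sum(GA), GD = sum(GD))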

bayern_totals <- data.frame(bayern_totals)
bayern_totals
##                 Team Games_Played Wins Draws Losses   GF   GA   GD
## 1 FC Bayern Muenchen         1738 1018   389    331 3764 1919 1845

 


 

Total Number of Games Played, Wins, Draws and Losses, Goals for Bayern Muenchen, Borussia Dortmund & Borussia Moenchengladbach & Bayer 04 Leverkusen

 

# Total Number of Games Played, Wins, Draws and Losses, Goals for Bayern Muenchen, Borussia Dortmund & Borussia Moenchengladbach & Bayer 04 Leverkusen

# summarise() with explicit sums replaces the deprecated summarise_each(funs(sum), ...)
bteams <- soccer %>%
                     filter(Team %in% c("FC Bayern Muenchen", "Borussia Dortmund",
                                        "Borussia Moenchengladbach", "Bayer 04 Leverkusen")) %>%
                     group_by(Team) %>%
                     summarise(Games_Played = sum(Games.Played), Wins = sum(Wins),
                               Draws = sum(Draws), Losses = sum(Losses),
                               GF = sum(GF), GA = sum(GA), GD = sum(GD))

bteams <- data.frame(bteams)
bteams
##                        Team Games_Played Wins Draws Losses   GF   GA   GD
## 1       Bayer 04 Leverkusen         1020  457   278    285 1758 1299  459
## 2         Borussia Dortmund         1662  728   427    507 2915 2355  560
## 3 Borussia Moenchengladbach         1568  632   415    521 2704 2299  405
## 4        FC Bayern Muenchen         1738 1018   389    331 3764 1919 1845

 

We can add win rates as a new column, where the win rate is the number of wins divided by the number of games played. The dplyr function mutate() adds a new column to the data.

 

bteams_wrate <- bteams %>% mutate(Win.Rate = round(Wins / Games_Played, 2))

bteams_wrate
##                        Team Games_Played Wins Draws Losses   GF   GA   GD
## 1       Bayer 04 Leverkusen         1020  457   278    285 1758 1299  459
## 2         Borussia Dortmund         1662  728   427    507 2915 2355  560
## 3 Borussia Moenchengladbach         1568  632   415    521 2704 2299  405
## 4        FC Bayern Muenchen         1738 1018   389    331 3764 1919 1845
##   Win.Rate
## 1     0.45
## 2     0.44
## 3     0.40
## 4     0.59

 


 

Total Number of Games Played, Wins, Draws and Losses, goals for all Teams Who Played In The Bundesliga

 

# Overall Record

# summarise() with explicit sums replaces the deprecated summarise_each(funs(sum), ...)
teams <- soccer %>% group_by(Team) %>%
                     summarise(Games_Played = sum(Games.Played), Wins = sum(Wins),
                               Draws = sum(Draws), Losses = sum(Losses),
                               GF = sum(GF), GA = sum(GA), GD = sum(GD)) %>%
                     mutate(Win.Rate = round(Wins / Games_Played, 2)) %>%
                     arrange(desc(Wins), desc(Win.Rate))

teams <- data.frame(teams)

teams
##                         Team Games_Played Wins Draws Losses   GF   GA   GD
## 1         FC Bayern Muenchen         1738 1018   389    331 3764 1919 1845
## 2              Werder Bremen         1726  737   424    565 2932 2511  421
## 3          Borussia Dortmund         1662  728   427    507 2915 2355  560
## 4               Hamburger SV         1798  728   480    590 2875 2548  327
## 5              VfB Stuttgart         1730  718   421    591 2901 2522  379
## 6              FC Schalke 04         1628  644   406    578 2409 2331   78
## 7  Borussia Moenchengladbach         1568  632   415    521 2704 2299  405
## 8                1. FC Koeln         1526  612   387    527 2531 2252  279
## 9        Eintracht Frankfurt         1594  581   405    608 2506 2484   22
## 10      1. FC Kaiserslautern         1492  575   372    545 2348 2344    4
## 11       Bayer 04 Leverkusen         1020  457   278    285 1758 1299  459
## 12                Hertha BSC         1114  406   280    428 1584 1701 -117
## 13                VfL Bochum         1160  356   306    498 1602 1887 -285
## 14           1. FC Nuernberg         1084  341   276    467 1402 1726 -324
## 15               Hannover 96          948  293   235    420 1310 1609 -299
## 16              MSV Duisburg          854  257   230    367 1115 1388 -273
## 17             VfL Wolfsburg          646  251   162    233  995  951   44
## 18       Fortuna Duesseldorf          786  245   215    326 1160 1386 -226
## 19    Eintracht Braunschweig          706  242   177    287  937 1086 -149
## 20             Karlsruher SC          812  241   230    341 1093 1408 -315
## 21         TSV 1860 Muenchen          672  238   170    264 1022 1059  -37
## 22               SC Freiburg          544  166   137    241  682  864 -182
## 23         Arminia Bielefeld          578  159   146    273  686  958 -272
## 24           1. FSV Mainz 05          340  117    93    130  450  478  -28
## 25             Hansa Rostock          374  114    96    164  449  566 -117
## 26       TSG 1899 Hoffenheim          272   87    76    109  400  434  -34
## 27         Kickers Offenbach          238   77    51    110  368  390  -22
## 28       SV Waldhof Mannheim          238   71    72     95  299  378  -79
## 29     FC Bayer 05 Uerdingen          204   70    46     88  310  385  -75
## 30        Bayer 05 Uerdingen          238   63    72    103  301  403 -102
## 31           Rot-Weiss Essen          238   61    79     98  346  390  -44
## 32              FC St. Pauli          272   58    80    134  296  485 -189
## 33           Energie Cottbus          204   56    43    105  211  338 -127
## 34               FC Augsburg          170   55    45     70  201  242  -41
## 35   TSV Bayer 04 Leverkusen          140   50    49     41  215  190   25
## 36          Alemannia Aachen          136   43    28     65  186  270  -84
## 37            Meidericher SV           94   39    29     26  176  132   44
## 38        SG Wattenscheid 09          140   34    48     58  186  248  -62
## 39    SV Bayer 04 Leverkusen          102   32    24     46  138  188  -50
## 40        1. FC Saarbruecken          166   32    48     86  202  336 -134
## 41      Rot-Weiss Oberhausen          102   29    20     53  149  215  -66
## 42     Bor. Moenchengladbach           68   28    18     22   95   79   16
## 43      Borussia Neunkirchen           98   25    18     55  109  223 -114
## 44            Wuppertaler SV          102   25    27     50  136  200  -64
## 45            Dynamo Dresden          102   21    35     46   98  161  -63
## 46           SV Darmstadt 98          102   21    29     52  124  210  -86
## 47        SpVgg Unterhaching           68   20    19     29   75  101  -26
## 48             FC 08 Homburg           68   15    18     35   70  121  -51
## 49      1. FC Dynamo Dresden           38   12    10     16   34   50  -16
## 50          SV Werder Bremen           38   11    16     11   44   45   -1
## 51    Tennis Borussia Berlin           68   11    16     41   85  174  -89
## 52          FC Ingolstadt 04           34   10    10     14   33   42   -9
## 53       Stuttgarter Kickers           34   10     6     18   41   68  -27
## 54          FC Hansa Rostock           38   10    11     17   43   55  -12
## 55    SV Stuttgarter Kickers           38   10    11     17   53   64  -11
## 56              SSV Ulm 1846           34    9     8     17   36   62  -26
## 57          SC Fortuna Koeln           34    8     9     17   46   79  -33
## 58         Preussen Muenster           30    7     9     14   34   52  -18
## 59           SC Paderborn 07           34    7    10     17   31   65  -34
## 60   SC Rot-Weiss Oberhausen           34    7    11     16   33   66  -33
## 61                FC Homburg           34    6     9     19   33   79  -46
## 62          KFC Uerdingen 05           34    5    11     18   33   56  -23
## 63     SpVgg Greuther Fuerth           34    4     9     21   26   60  -34
## 64      Blau-Weiss 90 Berlin           34    3    12     19   36   76  -40
## 65               VfB Leipzig           34    3    11     20   32   69  -37
## 66   SC Tasmania 1900 Berlin           34    2     4     28   15   10    5
##    Win.Rate
## 1      0.59
## 2      0.43
## 3      0.44
## 4      0.40
## 5      0.42
## 6      0.40
## 7      0.40
## 8      0.40
## 9      0.36
## 10     0.39
## 11     0.45
## 12     0.36
## 13     0.31
## 14     0.31
## 15     0.31
## 16     0.30
## 17     0.39
## 18     0.31
## 19     0.34
## 20     0.30
## 21     0.35
## 22     0.31
## 23     0.28
## 24     0.34
## 25     0.30
## 26     0.32
## 27     0.32
## 28     0.30
## 29     0.34
## 30     0.26
## 31     0.26
## 32     0.21
## 33     0.27
## 34     0.32
## 35     0.36
## 36     0.32
## 37     0.41
## 38     0.24
## 39     0.31
## 40     0.19
## 41     0.28
## 42     0.41
## 43     0.26
## 44     0.25
## 45     0.21
## 46     0.21
## 47     0.29
## 48     0.22
## 49     0.32
## 50     0.29
## 51     0.16
## 52     0.29
## 53     0.29
## 54     0.26
## 55     0.26
## 56     0.26
## 57     0.24
## 58     0.23
## 59     0.21
## 60     0.21
## 61     0.18
## 62     0.15
## 63     0.12
## 64     0.09
## 65     0.09
## 66     0.06
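
Some clubs appear in this table under more than one spelling, for example "Borussia Moenchengladbach" and "Bor. Moenchengladbach", so their totals are split across rows. The sketch below shows one possible way to merge such rows before summing. It assumes that the two spellings refer to the same club; the data frame totals is made up for illustration, with values copied from the table above.

```r
library(dplyr)

# Two rows copied from the overall table above.
totals <- data.frame(
  Team = c("Borussia Moenchengladbach", "Bor. Moenchengladbach"),
  Games_Played = c(1568, 68),
  Wins = c(632, 28),
  stringsAsFactors = FALSE
)

merged <- totals %>%
  # Map the short spelling onto the long one (assumed to be the same club).
  mutate(Team = if_else(Team == "Bor. Moenchengladbach",
                        "Borussia Moenchengladbach", Team)) %>%
  group_by(Team) %>%
  summarise(Games_Played = sum(Games_Played), Wins = sum(Wins)) %>%
  mutate(Win.Rate = round(Wins / Games_Played, 2))

merged
```

The same idea extends to other name variants by adding more conditions, for instance with case_when().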

 


 
