Grouping Data by column in a DataFrame

The groupby function is primarily used to combine duplicate rows of a given column of a pandas DataFrame.

To explore the groupby function we will use a DataFrame of the St. Louis Cardinals starting lineups in a 4 game series against the Washington Nationals:

import pandas as pd\n&nbsp;\ndf =  pd.DataFrame([\n    #9/8/2022 WSH 11 - 6 STL\n    {'name' : 'Brendan Donovan', 'position' : '2B', 'AB' : 3, 'R' : 2, 'H' : 3, 'RBI' : 0, 'BB' : 2, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 5, 'R' : 0, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Albert Pujols', 'position' : 'DH', 'AB' : 5, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Alec Burleson', 'position' : 'RF', 'AB' : 4, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 2, 'H' : 2, 'RBI' : 3, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Ben DeLuzio', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 2 },\n&nbsp;\n    #9/7/2022 WSH 5 - 6 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 2, 'H' : 1, 'RBI' : 1, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 3, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 2, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n&nbsp;\n    #9/6/2022 WSH 1 - 4 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : '3B', 'AB' : 3, 'R' : 1, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 0, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Andrew Knizner', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n&nbsp;\n    #9/5/2022 WSH 6 - 0 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 4, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : 'DH', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Albert Pujols', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 3 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 }\n])\ndf

Reset Code Python Output:


  
     
      
      name
      position
      AB
      R
      H
      RBI
      BB
      SO
      PA
    
  
  
    
      0
      Brendan Donovan
      2B
      3
      2
      3
      0
      2
      0
      5
    
    
      1
      Tommy Edman
      SS
      4
      0
      2
      0
      2
      0
      5
    
    
      2
      Paul Goldschmidt
      1B
      4
      0
      1
      1
      0
      1
      4
    
    
      3
      Nolan Arenado
      3B
      4
      0
      0
      0
      0
      0
      4
    
    
      4
      Corey Dickerson
      LF
      5
      0
      2
      2
      0
      1
      5
    
    
      5
      Albert Pujols
      DH
      5
      0
      0
      0
      0
      0
      5
    
    
      6
      Alec Burleson
      RF
      4
      1
      0
      0
      1
      0
      5
    
    
      7
      Yadier Molina
      C
      4
      2
      2
      3
      0
      0
      4
    
    
      8
      Ben DeLuzio
      CF
      2
      0
      1
      0
      0
      0
      2
    
    
      9
      Lars Nootbaar
      RF
      4
      0
      0
      0
      0
      2
      4
    
    
      10
      Brendan Donovan
      DH
      4
      1
      2
      0
      0
      0
      4
    
    
      11
      Paul Goldschmidt
      1B
      3
      2
      1
      1
      1
      0
      4
    
    
      12
      Nolan Arenado
      3B
      4
      1
      2
      1
      0
      0
      4
    
    
      13
      Corey Dickerson
      LF
      4
      0
      1
      1
      0
      0
      4
    
    
      14
      Tyler O'Neill
      CF
      3
      1
      0
      0
      1
      3
      4
    
    
      15
      Nolan Gorman
      2B
      4
      0
      0
      0
      0
      3
      4
    
    
      16
      Yadier Molina
      C
      4
      0
      2
      1
      0
      0
      4
    
    
      17
      Tommy Edman
      SS
      4
      0
      2
      1
      0
      0
      4
    
    
      18
      Lars Nootbaar
      RF
      4
      0
      0
      0
      0
      2
      4
    
    
      19
      Brendan Donovan
      3B
      3
      1
      1
      1
      0
      0
      3
    
    
      20
      Paul Goldschmidt
      1B
      3
      0
      1
      0
      1
      1
      4
    
    
      21
      Nolan Arenado
      DH
      4
      1
      1
      0
      0
      0
      4
    
    
      22
      Corey Dickerson
      LF
      3
      1
      2
      0
      0
      0
      3
    
    
      23
      Tyler O'Neill
      CF
      2
      0
      0
      1
      0
      1
      3
    
    
      24
      Nolan Gorman
      2B
      3
      1
      2
      2
      0
      1
      3
    
    
      25
      Andrew Knizner
      C
      3
      0
      0
      0
      0
      1
      3
    
    
      26
      Tommy Edman
      SS
      3
      0
      1
      0
      0
      0
      3
    
    
      27
      Lars Nootbaar
      RF
      4
      0
      0
      0
      0
      4
      4
    
    
      28
      Tyler O'Neill
      CF
      4
      0
      1
      0
      0
      0
      4
    
    
      29
      Paul Goldschmidt
      DH
      3
      0
      0
      0
      1
      0
      4
    
    
      30
      Nolan Arenado
      3B
      3
      0
      0
      0
      1
      1
      4
    
    
      31
      Albert Pujols
      1B
      4
      0
      0
      0
      0
      1
      4
    
    
      32
      Corey Dickerson
      LF
      3
      0
      0
      0
      0
      0
      3
    
    
      33
      Nolan Gorman
      2B
      3
      0
      0
      0
      0
      2
      3
    
    
      34
      Yadier Molina
      C
      3
      0
      1
      0
      0
      0
      3
    
    
      35
      Tommy Edman
      SS
      3
      0
      1
      0
      0
      0
      3

	name	position	AB	R	H	RBI	BB	SO	PA
0	Brendan Donovan	2B	3	2	3	0	2	0	5
1	Tommy Edman	SS	4	0	2	0	2	0	5
2	Paul Goldschmidt	1B	4	0	1	1	0	1	4
3	Nolan Arenado	3B	4	0	0	0	0	0	4
4	Corey Dickerson	LF	5	0	2	2	0	1	5
5	Albert Pujols	DH	5	0	0	0	0	0	5
6	Alec Burleson	RF	4	1	0	0	1	0	5
7	Yadier Molina	C	4	2	2	3	0	0	4
8	Ben DeLuzio	CF	2	0	1	0	0	0	2
9	Lars Nootbaar	RF	4	0	0	0	0	2	4
10	Brendan Donovan	DH	4	1	2	0	0	0	4
11	Paul Goldschmidt	1B	3	2	1	1	1	0	4
12	Nolan Arenado	3B	4	1	2	1	0	0	4
13	Corey Dickerson	LF	4	0	1	1	0	0	4
14	Tyler O'Neill	CF	3	1	0	0	1	3	4
15	Nolan Gorman	2B	4	0	0	0	0	3	4
16	Yadier Molina	C	4	0	2	1	0	0	4
17	Tommy Edman	SS	4	0	2	1	0	0	4
18	Lars Nootbaar	RF	4	0	0	0	0	2	4
19	Brendan Donovan	3B	3	1	1	1	0	0	3
20	Paul Goldschmidt	1B	3	0	1	0	1	1	4
21	Nolan Arenado	DH	4	1	1	0	0	0	4
22	Corey Dickerson	LF	3	1	2	0	0	0	3
23	Tyler O'Neill	CF	2	0	0	1	0	1	3
24	Nolan Gorman	2B	3	1	2	2	0	1	3
25	Andrew Knizner	C	3	0	0	0	0	1	3
26	Tommy Edman	SS	3	0	1	0	0	0	3
27	Lars Nootbaar	RF	4	0	0	0	0	4	4
28	Tyler O'Neill	CF	4	0	1	0	0	0	4
29	Paul Goldschmidt	DH	3	0	0	0	1	0	4
30	Nolan Arenado	3B	3	0	0	0	1	1	4
31	Albert Pujols	1B	4	0	0	0	0	1	4
32	Corey Dickerson	LF	3	0	0	0	0	0	3
33	Nolan Gorman	2B	3	0	0	0	0	2	3
34	Yadier Molina	C	3	0	1	0	0	0	3
35	Tommy Edman	SS	3	0	1	0	0	0	3

Groupby Syntax

When using the groupby function to group data by column, you pass one parameter into the function. The parameter is the string version of the column name. So to group by the "name" column, we will pass the string "name" as a parameter to the function. The next thing you use is the agg function, that specifies how the grouped rows should be aggregated together. After aggregating the data, reset_index will bring it back to a simple, index-based DataFrame.

Example of the Groupby Syntax

import pandas as pd\n&nbsp;\ndf =  pd.DataFrame([\n    #9/8/2022 WSH 11 - 6 STL\n    {'name' : 'Brendan Donovan', 'position' : '2B', 'AB' : 3, 'R' : 2, 'H' : 3, 'RBI' : 0, 'BB' : 2, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 5, 'R' : 0, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Albert Pujols', 'position' : 'DH', 'AB' : 5, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Alec Burleson', 'position' : 'RF', 'AB' : 4, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 2, 'H' : 2, 'RBI' : 3, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Ben DeLuzio', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 2 },\n&nbsp;\n    #9/7/2022 WSH 5 - 6 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 2, 'H' : 1, 'RBI' : 1, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 3, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 2, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n&nbsp;\n    #9/6/2022 WSH 1 - 4 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : '3B', 'AB' : 3, 'R' : 1, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 0, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Andrew Knizner', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n&nbsp;\n    #9/5/2022 WSH 6 - 0 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 4, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : 'DH', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Albert Pujols', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 3 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 }\n])\ndf2 = df.groupby("name").agg("count").reset_index()\ndf2

Reset Code Python Output:


  
     
      
      name
      position
      AB
      R
      H
      RBI
      BB
      SO
      PA
    
  
  
    
      0
      Albert Pujols
      2
      2
      2
      2
      2
      2
      2
      2
    
    
      1
      Alec Burleson
      1
      1
      1
      1
      1
      1
      1
      1
    
    
      2
      Andrew Knizner
      1
      1
      1
      1
      1
      1
      1
      1
    
    
      3
      Ben DeLuzio
      1
      1
      1
      1
      1
      1
      1
      1
    
    
      4
      Brendan Donovan
      3
      3
      3
      3
      3
      3
      3
      3
    
    
      5
      Corey Dickerson
      4
      4
      4
      4
      4
      4
      4
      4
    
    
      6
      Lars Nootbaar
      3
      3
      3
      3
      3
      3
      3
      3
    
    
      7
      Nolan Arenado
      4
      4
      4
      4
      4
      4
      4
      4
    
    
      8
      Nolan Gorman
      3
      3
      3
      3
      3
      3
      3
      3
    
    
      9
      Paul Goldschmidt
      4
      4
      4
      4
      4
      4
      4
      4
    
    
      10
      Tommy Edman
      4
      4
      4
      4
      4
      4
      4
      4
    
    
      11
      Tyler O'Neill
      3
      3
      3
      3
      3
      3
      3
      3
    
    
      12
      Yadier Molina
      3
      3
      3
      3
      3
      3
      3
      3

	name	position	AB	R	H	RBI	BB	SO	PA
0	Albert Pujols	2	2	2	2	2	2	2	2
1	Alec Burleson	1	1	1	1	1	1	1	1
2	Andrew Knizner	1	1	1	1	1	1	1	1
3	Ben DeLuzio	1	1	1	1	1	1	1	1
4	Brendan Donovan	3	3	3	3	3	3	3	3
5	Corey Dickerson	4	4	4	4	4	4	4	4
6	Lars Nootbaar	3	3	3	3	3	3	3	3
7	Nolan Arenado	4	4	4	4	4	4	4	4
8	Nolan Gorman	3	3	3	3	3	3	3	3
9	Paul Goldschmidt	4	4	4	4	4	4	4	4
10	Tommy Edman	4	4	4	4	4	4	4	4
11	Tyler O'Neill	3	3	3	3	3	3	3	3
12	Yadier Molina	3	3	3	3	3	3	3	3

The line of code output a new DataFrame, but what does that DataFrame mean? In this case, we aggregated the "name" column by count. In this example, count found all of the duplicates in the "name" column, and summed them up. So, in the case of Yadier Molina, there were 3 duplicates found. This means that Yadier Molina started 3 out of the 4 possible games in the series versus Washington.

Different Types of Aggregation

count: counts the number of duplicates in the column passed into the groupby function. This type of aggregation would be useful when trying to find the total number of games played by a specific player.

sum: adds up every non-string column but the column passed into the groupby function. This type of aggregation would be useful if we tried to find the player with the most hits across the 4 game series.

import pandas as pd\n&nbsp;\ndf =  pd.DataFrame([\n    #9/8/2022 WSH 11 - 6 STL\n    {'name' : 'Brendan Donovan', 'position' : '2B', 'AB' : 3, 'R' : 2, 'H' : 3, 'RBI' : 0, 'BB' : 2, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 5, 'R' : 0, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Albert Pujols', 'position' : 'DH', 'AB' : 5, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Alec Burleson', 'position' : 'RF', 'AB' : 4, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 2, 'H' : 2, 'RBI' : 3, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Ben DeLuzio', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 2 },\n&nbsp;\n    #9/7/2022 WSH 5 - 6 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 2, 'H' : 1, 'RBI' : 1, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 3, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 2, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n&nbsp;\n    #9/6/2022 WSH 1 - 4 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : '3B', 'AB' : 3, 'R' : 1, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 0, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Andrew Knizner', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n&nbsp;\n    #9/5/2022 WSH 6 - 0 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 4, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : 'DH', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Albert Pujols', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 3 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 }\n])\ndf3 = df.groupby("name").agg("sum").reset_index()\ndf3.nlargest(1, "H")

Reset Code Python Output:


  
     
      
      name
      AB
      R
      H
      RBI
      BB
      SO
      PA
    
  
  
    
      4
      Brendan Donovan
      10
      4
      6
      1
      2
      0
      12

	name	AB	R	H	RBI	BB	SO	PA
4	Brendan Donovan	10	4	6	1	2	0	12

max: Finds the maximum value in all of the columns except for the column passed into the groupby function. This could be useful for finding the player with the most hits in one game.

min: Finds the maximum value in all of the columns except for the column passed into the groupby function. This could be useful for finding the player with the least hits in one game.

mean: Finds the mean value of all of the non-string columns, except for the column passed into the groupby function.

median: Finds the median value of all of the non-string columns, except for the column passed into the groupby function.

std: Finds the standard deviation of all of the non-string columns, except for the column passed into the groupby function.

var: Finds the variance of all of the non-string columns, except for the column passed into the groupby function.

mad: Finds the mean absolute deviation of all of the non-string columns, except for the column passed into the groupby function.

prod: finds the product of every non-string column, except for the column passed into the groupby function.

Passing Multiple Arguments to the Aggregation Function

If you wanted to aggregate by multiple things, you pass a list of the kinds of aggregation you want to use into the agg function.

import pandas as pd\n&nbsp;\ndf =  pd.DataFrame([\n    #9/8/2022 WSH 11 - 6 STL\n    {'name' : 'Brendan Donovan', 'position' : '2B', 'AB' : 3, 'R' : 2, 'H' : 3, 'RBI' : 0, 'BB' : 2, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 5, 'R' : 0, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Albert Pujols', 'position' : 'DH', 'AB' : 5, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Alec Burleson', 'position' : 'RF', 'AB' : 4, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 2, 'H' : 2, 'RBI' : 3, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Ben DeLuzio', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 2 },\n&nbsp;\n    #9/7/2022 WSH 5 - 6 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 2, 'H' : 1, 'RBI' : 1, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 3, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 2, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n&nbsp;\n    #9/6/2022 WSH 1 - 4 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : '3B', 'AB' : 3, 'R' : 1, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 0, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Andrew Knizner', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n&nbsp;\n    #9/5/2022 WSH 6 - 0 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 4, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : 'DH', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Albert Pujols', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 3 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 }\n])\n&nbsp;\nnumeric_cols = df.select_dtypes(include="number").columns\ndf4 = df.groupby("name")[numeric_cols].agg(["mean", "median", "max"]).reset_index()\ndf4

Reset Code Python Output:


  
    
      
      name
      AB
      R
      H
      RBI
      BB
      SO
      PA
    
    
      
      
      mean
      median
      max
      mean
      median
      max
      mean
      median
      max
      mean
      median
      max
      mean
      median
      max
      mean
      median
      max
      mean
      median
      max
    
  
  
    
      0
      Albert Pujols
      4.000000
      4.0
      4
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      1.000000
      1.0
      1
      4.000000
      4.0
      4
    
    
      1
      Albert Pujols
      5.000000
      5.0
      5
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      5.000000
      5.0
      5
    
    
      2
      Alec Burleson
      4.000000
      4.0
      4
      1.000000
      1.0
      1
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      1.000000
      1.0
      1
      0.000000
      0.0
      0
      5.000000
      5.0
      5
    
    
      3
      Andrew Knizner
      3.000000
      3.0
      3
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      1.000000
      1.0
      1
      3.000000
      3.0
      3
    
    
      4
      Ben DeLuzio
      2.000000
      2.0
      2
      0.000000
      0.0
      0
      1.000000
      1.0
      1
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      2.000000
      2.0
      2
    
    
      5
      Brendan Donovan
      3.333333
      3.0
      4
      1.333333
      1.0
      2
      2.000000
      2.0
      3
      0.333333
      0.0
      1
      0.666667
      0.0
      2
      0.000000
      0.0
      0
      4.000000
      4.0
      5
    
    
      6
      Corey Dickerson
      3.750000
      3.5
      5
      0.250000
      0.0
      1
      1.250000
      1.5
      2
      0.750000
      0.5
      2
      0.000000
      0.0
      0
      0.250000
      0.0
      1
      3.750000
      3.5
      5
    
    
      7
      Lars Nootbaar
      4.000000
      4.0
      4
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      2.666667
      2.0
      4
      4.000000
      4.0
      4
    
    
      8
      Nolan Arenado
      3.750000
      4.0
      4
      0.500000
      0.5
      1
      0.750000
      0.5
      2
      0.250000
      0.0
      1
      0.250000
      0.0
      1
      0.250000
      0.0
      1
      4.000000
      4.0
      4
    
    
      9
      Nolan Gorman
      3.333333
      3.0
      4
      0.333333
      0.0
      1
      0.666667
      0.0
      2
      0.666667
      0.0
      2
      0.000000
      0.0
      0
      2.000000
      2.0
      3
      3.333333
      3.0
      4
    
    
      10
      Paul Goldschmidt
      3.250000
      3.0
      4
      0.500000
      0.0
      2
      0.750000
      1.0
      1
      0.500000
      0.5
      1
      0.750000
      1.0
      1
      0.500000
      0.5
      1
      4.000000
      4.0
      4
    
    
      11
      Tommy Edman
      3.500000
      3.5
      4
      0.000000
      0.0
      0
      1.250000
      1.0
      2
      0.500000
      0.0
      2
      0.250000
      0.0
      1
      0.250000
      0.0
      1
      3.750000
      3.5
      5
    
    
      12
      Tyler O'Neill
      3.000000
      3.0
      4
      0.333333
      0.0
      1
      0.333333
      0.0
      1
      0.333333
      0.0
      1
      0.333333
      0.0
      1
      1.333333
      1.0
      3
      3.666667
      4.0
      4
    
    
      13
      Yadier Molina
      3.666667
      4.0
      4
      0.666667
      0.0
      2
      1.666667
      2.0
      2
      1.333333
      1.0
      3
      0.000000
      0.0
      0
      0.000000
      0.0
      0
      3.666667
      4.0
      4

	name	AB	R	H	RBI	BB	SO	PA
0	Albert Pujols	4.000000	4.0	4	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	1.000000	1.0	1	4.000000	4.0	4
1	Albert Pujols	5.000000	5.0	5	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	5.000000	5.0	5
2	Alec Burleson	4.000000	4.0	4	1.000000	1.0	1	0.000000	0.0	0	0.000000	0.0	0	1.000000	1.0	1	0.000000	0.0	0	5.000000	5.0	5
3	Andrew Knizner	3.000000	3.0	3	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	1.000000	1.0	1	3.000000	3.0	3
4	Ben DeLuzio	2.000000	2.0	2	0.000000	0.0	0	1.000000	1.0	1	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	2.000000	2.0	2
5	Brendan Donovan	3.333333	3.0	4	1.333333	1.0	2	2.000000	2.0	3	0.333333	0.0	1	0.666667	0.0	2	0.000000	0.0	0	4.000000	4.0	5
6	Corey Dickerson	3.750000	3.5	5	0.250000	0.0	1	1.250000	1.5	2	0.750000	0.5	2	0.000000	0.0	0	0.250000	0.0	1	3.750000	3.5	5
7	Lars Nootbaar	4.000000	4.0	4	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	2.666667	2.0	4	4.000000	4.0	4
8	Nolan Arenado	3.750000	4.0	4	0.500000	0.5	1	0.750000	0.5	2	0.250000	0.0	1	0.250000	0.0	1	0.250000	0.0	1	4.000000	4.0	4
9	Nolan Gorman	3.333333	3.0	4	0.333333	0.0	1	0.666667	0.0	2	0.666667	0.0	2	0.000000	0.0	0	2.000000	2.0	3	3.333333	3.0	4
10	Paul Goldschmidt	3.250000	3.0	4	0.500000	0.0	2	0.750000	1.0	1	0.500000	0.5	1	0.750000	1.0	1	0.500000	0.5	1	4.000000	4.0	4
11	Tommy Edman	3.500000	3.5	4	0.000000	0.0	0	1.250000	1.0	2	0.500000	0.0	2	0.250000	0.0	1	0.250000	0.0	1	3.750000	3.5	5
12	Tyler O'Neill	3.000000	3.0	4	0.333333	0.0	1	0.333333	0.0	1	0.333333	0.0	1	0.333333	0.0	1	1.333333	1.0	3	3.666667	4.0	4
13	Yadier Molina	3.666667	4.0	4	0.666667	0.0	2	1.666667	2.0	2	1.333333	1.0	3	0.000000	0.0	0	0.000000	0.0	0	3.666667	4.0	4

Selecting Certain Columns to Groupby

Sometimes when you use groupby, you only care about the result of one or two columns. As we saw in the sum example, it shows the results of the groupby for every column, even though we only really cared about the hits. So how would we only select the columns we want to see? Lets find out who had the most hits in one start for the Cardinals:

import pandas as pd\n&nbsp;\ndf =  pd.DataFrame([\n    #9/8/2022 WSH 11 - 6 STL\n    {'name' : 'Brendan Donovan', 'position' : '2B', 'AB' : 3, 'R' : 2, 'H' : 3, 'RBI' : 0, 'BB' : 2, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 5, 'R' : 0, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Albert Pujols', 'position' : 'DH', 'AB' : 5, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Alec Burleson', 'position' : 'RF', 'AB' : 4, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 2, 'H' : 2, 'RBI' : 3, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Ben DeLuzio', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 2 },\n&nbsp;\n    #9/7/2022 WSH 5 - 6 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 2, 'H' : 1, 'RBI' : 1, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 3, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 2, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n&nbsp;\n    #9/6/2022 WSH 1 - 4 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : '3B', 'AB' : 3, 'R' : 1, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 0, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Andrew Knizner', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n&nbsp;\n    #9/5/2022 WSH 6 - 0 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 4, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : 'DH', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Albert Pujols', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 3 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 }\n])\ndf5 = df[["name", "H"]].groupby("name").agg("count").reset_index()\ndf5.nlargest(1, "H")

Reset Code Python Output:


  
     
      
      name
      H
    
  
  
    
      5
      Corey Dickerson
      4

	name	H
5	Corey Dickerson	4

By passing a list of the columns I wanted to look at, and making sure to include the column passed into the groupby function, we guarantee that only the columns I want to see appear in the result. Lets take a look at what happens if I don't pass in a list of columns, and just take the max of every column:

import pandas as pd\n&nbsp;\ndf =  pd.DataFrame([\n    #9/8/2022 WSH 11 - 6 STL\n    {'name' : 'Brendan Donovan', 'position' : '2B', 'AB' : 3, 'R' : 2, 'H' : 3, 'RBI' : 0, 'BB' : 2, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 5, 'R' : 0, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Albert Pujols', 'position' : 'DH', 'AB' : 5, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Alec Burleson', 'position' : 'RF', 'AB' : 4, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 2, 'H' : 2, 'RBI' : 3, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Ben DeLuzio', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 2 },\n&nbsp;\n    #9/7/2022 WSH 5 - 6 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 2, 'H' : 1, 'RBI' : 1, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 3, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 2, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n&nbsp;\n    #9/6/2022 WSH 1 - 4 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : '3B', 'AB' : 3, 'R' : 1, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 0, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Andrew Knizner', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n&nbsp;\n    #9/5/2022 WSH 6 - 0 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 4, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : 'DH', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Albert Pujols', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 3 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 }\n])\ndf6 = df.groupby("name").agg("max").reset_index()\ndf6.nlargest(1, "H")

Reset Code Python Output:


  
     
      
      name
      position
      AB
      R
      H
      RBI
      BB
      SO
      PA
    
  
  
    
      4
      Brendan Donovan
      DH
      4
      2
      3
      1
      2
      0
      5

	name	position	AB	R	H	RBI	BB	SO	PA
4	Brendan Donovan	DH	4	2	3	1	2	0	5

As you can see the result is so much more cluttered and gives more information than necessary. That makes it a little harder to interpret. Sometimes it is nice to view every column's result, but it can be useful to only output the necessary columns.

Grouping by Multiple Columns

Sometimes it can be useful to sort the data by multiple columns. In this DataFrame, an example of when it would be useful is by looking at what position a player is more productive as. Someone like Brendan Donovan played 3 different positions, but Yadier Molina only played catcher. If you don't group by multiple columns, you won't be able to see how well a player hits at different positions. So in this example we will be grouping by the "name" and "position" columns:

import pandas as pd\n&nbsp;\ndf =  pd.DataFrame([\n    #9/8/2022 WSH 11 - 6 STL\n    {'name' : 'Brendan Donovan', 'position' : '2B', 'AB' : 3, 'R' : 2, 'H' : 3, 'RBI' : 0, 'BB' : 2, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 5, 'R' : 0, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 5 },\n    {'name' : 'Albert Pujols', 'position' : 'DH', 'AB' : 5, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Alec Burleson', 'position' : 'RF', 'AB' : 4, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 5 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 2, 'H' : 2, 'RBI' : 3, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Ben DeLuzio', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 2 },\n&nbsp;\n    #9/7/2022 WSH 5 - 6 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 2, 'H' : 1, 'RBI' : 1, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 4, 'R' : 1, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 3, 'R' : 1, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 3, 'PA' : 4 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 4, 'R' : 0, 'H' : 2, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 2, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n&nbsp;\n    #9/6/2022 WSH 1 - 4 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 4 },\n    {'name' : 'Brendan Donovan', 'position' : '3B', 'AB' : 3, 'R' : 1, 'H' : 1, 'RBI' : 1, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Paul Goldschmidt', 'position' : '1B', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : 'DH', 'AB' : 4, 'R' : 1, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 2, 'R' : 0, 'H' : 0, 'RBI' : 1, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 1, 'H' : 2, 'RBI' : 2, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Andrew Knizner', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n&nbsp;\n    #9/5/2022 WSH 6 - 0 STL\n    {'name' : 'Lars Nootbaar', 'position' : 'RF', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 4, 'PA' : 4 },\n    {'name' : 'Tyler O\'Neill', 'position' : 'CF', 'AB' : 4, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Paul Goldschmidt', 'position' : 'DH', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 0, 'PA' : 4 },\n    {'name' : 'Nolan Arenado', 'position' : '3B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 1, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Albert Pujols', 'position' : '1B', 'AB' : 4, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 1, 'PA' : 4 },\n    {'name' : 'Corey Dickerson', 'position' : 'LF', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Nolan Gorman', 'position' : '2B', 'AB' : 3, 'R' : 0, 'H' : 0, 'RBI' : 0, 'BB' : 0, 'SO' : 2, 'PA' : 3 },\n    {'name' : 'Yadier Molina', 'position' : 'C', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 },\n    {'name' : 'Tommy Edman', 'position' : 'SS', 'AB' : 3, 'R' : 0, 'H' : 1, 'RBI' : 0, 'BB' : 0, 'SO' : 0, 'PA' : 3 }\n])\ndf7 = df.groupby(["name", "position"]).agg("sum").reset_index()\ndf7

Reset Code Python Output:


  
     
      
      name
      position
      AB
      R
      H
      RBI
      BB
      SO
      PA
    
  
  
    
      0
      Albert Pujols
      1B
      4
      0
      0
      0
      0
      1
      4
    
    
      1
      Albert Pujols
      DH
      5
      0
      0
      0
      0
      0
      5
    
    
      2
      Alec Burleson
      RF
      4
      1
      0
      0
      1
      0
      5
    
    
      3
      Andrew Knizner
      C
      3
      0
      0
      0
      0
      1
      3
    
    
      4
      Ben DeLuzio
      CF
      2
      0
      1
      0
      0
      0
      2
    
    
      5
      Brendan Donovan
      2B
      3
      2
      3
      0
      2
      0
      5
    
    
      6
      Brendan Donovan
      3B
      3
      1
      1
      1
      0
      0
      3
    
    
      7
      Brendan Donovan
      DH
      4
      1
      2
      0
      0
      0
      4
    
    
      8
      Corey Dickerson
      LF
      15
      1
      5
      3
      0
      1
      15
    
    
      9
      Lars Nootbaar
      RF
      12
      0
      0
      0
      0
      8
      12
    
    
      10
      Nolan Arenado
      3B
      11
      1
      2
      1
      1
      1
      12
    
    
      11
      Nolan Arenado
      DH
      4
      1
      1
      0
      0
      0
      4
    
    
      12
      Nolan Gorman
      2B
      10
      1
      2
      2
      0
      6
      10
    
    
      13
      Paul Goldschmidt
      1B
      10
      2
      3
      2
      2
      2
      12
    
    
      14
      Paul Goldschmidt
      DH
      3
      0
      0
      0
      1
      0
      4
    
    
      15
      Tommy Edman
      SS
      14
      0
      5
      2
      1
      1
      15
    
    
      16
      Tyler O'Neill
      CF
      9
      1
      1
      1
      1
      4
      11
    
    
      17
      Yadier Molina
      C
      11
      2
      5
      4
      0
      0
      11

	name	position	AB	R	H	RBI	BB	SO	PA
0	Albert Pujols	1B	4	0	0	0	0	1	4
1	Albert Pujols	DH	5	0	0	0	0	0	5
2	Alec Burleson	RF	4	1	0	0	1	0	5
3	Andrew Knizner	C	3	0	0	0	0	1	3
4	Ben DeLuzio	CF	2	0	1	0	0	0	2
5	Brendan Donovan	2B	3	2	3	0	2	0	5
6	Brendan Donovan	3B	3	1	1	1	0	0	3
7	Brendan Donovan	DH	4	1	2	0	0	0	4
8	Corey Dickerson	LF	15	1	5	3	0	1	15
9	Lars Nootbaar	RF	12	0	0	0	0	8	12
10	Nolan Arenado	3B	11	1	2	1	1	1	12
11	Nolan Arenado	DH	4	1	1	0	0	0	4
12	Nolan Gorman	2B	10	1	2	2	0	6	10
13	Paul Goldschmidt	1B	10	2	3	2	2	2	12
14	Paul Goldschmidt	DH	3	0	0	0	1	0	4
15	Tommy Edman	SS	14	0	5	2	1	1	15
16	Tyler O'Neill	CF	9	1	1	1	1	4	11
17	Yadier Molina	C	11	2	5	4	0	0	11

While usually grouping by one column gives the intended result, sometimes grouping by multiple columns can give you a better sense of what the data is trying to tell you.

pandas Documentation

Click Here for the full pandas documentation for the groupby function

	name	AB			R			H			RBI			BB			SO			PA
		mean	median	max	mean	median	max	mean	median	max	mean	median	max	mean	median	max	mean	median	max	mean	median	max
0	Albert Pujols	4.000000	4.0	4	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	1.000000	1.0	1	4.000000	4.0	4
1	Albert Pujols	5.000000	5.0	5	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	5.000000	5.0	5
2	Alec Burleson	4.000000	4.0	4	1.000000	1.0	1	0.000000	0.0	0	0.000000	0.0	0	1.000000	1.0	1	0.000000	0.0	0	5.000000	5.0	5
3	Andrew Knizner	3.000000	3.0	3	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	1.000000	1.0	1	3.000000	3.0	3
4	Ben DeLuzio	2.000000	2.0	2	0.000000	0.0	0	1.000000	1.0	1	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	2.000000	2.0	2
5	Brendan Donovan	3.333333	3.0	4	1.333333	1.0	2	2.000000	2.0	3	0.333333	0.0	1	0.666667	0.0	2	0.000000	0.0	0	4.000000	4.0	5
6	Corey Dickerson	3.750000	3.5	5	0.250000	0.0	1	1.250000	1.5	2	0.750000	0.5	2	0.000000	0.0	0	0.250000	0.0	1	3.750000	3.5	5
7	Lars Nootbaar	4.000000	4.0	4	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	0.000000	0.0	0	2.666667	2.0	4	4.000000	4.0	4
8	Nolan Arenado	3.750000	4.0	4	0.500000	0.5	1	0.750000	0.5	2	0.250000	0.0	1	0.250000	0.0	1	0.250000	0.0	1	4.000000	4.0	4
9	Nolan Gorman	3.333333	3.0	4	0.333333	0.0	1	0.666667	0.0	2	0.666667	0.0	2	0.000000	0.0	0	2.000000	2.0	3	3.333333	3.0	4
10	Paul Goldschmidt	3.250000	3.0	4	0.500000	0.0	2	0.750000	1.0	1	0.500000	0.5	1	0.750000	1.0	1	0.500000	0.5	1	4.000000	4.0	4
11	Tommy Edman	3.500000	3.5	4	0.000000	0.0	0	1.250000	1.0	2	0.500000	0.0	2	0.250000	0.0	1	0.250000	0.0	1	3.750000	3.5	5
12	Tyler O'Neill	3.000000	3.0	4	0.333333	0.0	1	0.333333	0.0	1	0.333333	0.0	1	0.333333	0.0	1	1.333333	1.0	3	3.666667	4.0	4
13	Yadier Molina	3.666667	4.0	4	0.666667	0.0	2	1.666667	2.0	2	1.333333	1.0	3	0.000000	0.0	0	0.000000	0.0	0	3.666667	4.0	4