Saving a DataFrame to an Excel file using Pandas


Many times, you may want to make changes to your original dataset as you clean the data, filter by certain categories, and more. After this, you may want to share these new and improved datasets with the world. An easy way to save your dataset is to export it to an Excel file that can then be shared. This can be done with the pandas to_excel function.

Saving a DataFrame as an Excel File

For this example, we will use the GPA Dataset, a dataset that contains the GPA for courses at the University of Illinois over a 10 year period.

First we will read in the dataset using pd.read_csv. A more detailed guide on how to create a dataset from a csv can be found here. Let's say for our assignment, we only want to look at the GPAs of Statistics classes for the past 10 years and save that dataset as a CSV. We saved that dataset into the variable stat_gpas.

Then we can use the to_excel function to save that DataFrame as a file named STAT_GPA.xlsx.

import pandas as pd 
df = pd.read_csv("https://waf.cs.illinois.edu/discovery/gpa.csv")
stat_gpas = df[df["Subject"] == "STAT"]
stat_gpas.to_excel("STAT_GPA.xlsx")
Saving a dataframe to an Excel File

Saving multiple DataFrames as sheets in an Excel File

With an excel file, we can save multiple DataFrames in one file. Say we want to save both the original and modified dataset from the last example in one excel workbook on two different sheets.

For this we will need to specify an ExcelWriter. The input for the excel writer will be the name of the excel file we want to output.
If we name our output file GPA_breakdown.xlsx, our writer would look like this: pd.ExcelWriter("GPA_breakdown.xlsx").

Next, we would use our .to_excel function, but this time with two inputs: first, our writer, and second a sheet name for each DataFrame we want as a sheet in our workbook. Altogether, our code would look like this:

import pandas as pd 
df = pd.read_csv("https://waf.cs.illinois.edu/discovery/gpa.csv")
stat_gpas = df[df["Subject"] == "STAT"]

with pd.ExcelWriter("GPA_breakdown.xlsx") as writer: 
    df.to_excel(writer, sheet_name='GPA_Dataset')
    stat_gpas.to_excel(writer, sheet_name='STAT_GPAs')
Saving two dataframes as separate sheets

Pandas Documentation

The full documentation for to_excel is available in the pandas documentation.