Adding Rows and Columns to a DataFrame
In addition to using data we load from an external data source, we can also add rows and columns to an existing DataFrame in Python!
Adding a Column to a DataFrame
Adding a column to a DataFrame is very common, and allows us to store new data about each row (observation). This is particularly useful when we want to perform some calculation on existing data. For example, we may want to convert a numeric column given as a percentage to a decimal by dividing the percentage by 100. Adding a new column called decimal that holds this result gives the following DataFrame:
| | percentage | decimal |
|---|---|---|
| 0 | 50 | 0.5 |
| 1 | 80 | 0.8 |
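A minimal sketch of how the decimal column above could be computed, assuming the pandas library and a DataFrame named df (the variable name and the way the DataFrame is built here are illustrative assumptions, not taken from the original example):

```python
import pandas as pd

# Hypothetical starting DataFrame with a single "percentage" column.
df = pd.DataFrame({"percentage": [50, 80]})

# Assigning to a column name that does not exist yet creates the column.
# The division is applied to every row of "percentage" at once.
df["decimal"] = df["percentage"] / 100

print(df)
```

Assigning to a column name that already exists would overwrite that column instead of adding a new one.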
Adding a Row to a DataFrame
Adding a row to a DataFrame allows us to add a new observation. As with adding a column, we need to refer to the specific row we want to add (or edit). Instead of referring to a column name directly, we use the syntax df.loc[<row index>] to specify the index of the row we want to modify.
When modifying a row, we provide a dictionary of key/value pairs, where each key is a column name and each value is the entry for that column. In the example below, our DataFrame initially contains two courses at The University of Illinois, each with a course number (course) and a name (name). A third row is then added, with the row index value of "New Row":
| | course | name |
|---|---|---|
| 0 | CS 225 | Data Structures |
| 1 | STAT 107 | Data Science Discovery |
| New Row | CS 440 | Artificial Intelligence |
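The row addition shown in the table above can be sketched as follows, assuming pandas and a DataFrame named df. The variable names are illustrative, and wrapping the dictionary in pd.Series is a choice made in this sketch so that values are matched to columns by name; it is not necessarily how the original example was written:

```python
import pandas as pd

# Hypothetical starting DataFrame with the two original courses.
df = pd.DataFrame({
    "course": ["CS 225", "STAT 107"],
    "name": ["Data Structures", "Data Science Discovery"],
})

# Key/value pairs: each key is a column name, each value is that column's entry.
new_course = {"course": "CS 440", "name": "Artificial Intelligence"}

# "New Row" is not an existing index label, so this assignment adds a row.
# Wrapping the dictionary in a Series (an assumption of this sketch) aligns
# the values to the columns by label.
df.loc["New Row"] = pd.Series(new_course)

print(df)
```

If the label passed to df.loc already exists in the index, the same assignment edits that row rather than adding a new one.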