Using apply on a Series (using only one column of data)
The numpy library provides many mathematical functions that can be used with apply, including np.sqrt, which returns the square root of its input.
Calling apply on the "side 1" column (a Series) with np.sqrt, we can create a new column that contains the square root of the first side of each triangle:
```python
import numpy as np
import pandas as pd

# Creates a DataFrame with 'side 1', 'side 2', and 'side 3' columns
df = pd.DataFrame([
    {'side 1': 94, 'side 2': 71, 'side 3': 25},
    {'side 1': 79, 'side 2': 36, 'side 3': 100},
    {'side 1': 100, 'side 2': 64, 'side 3': 49},
    {'side 1': 36, 'side 2': 45, 'side 3': 64},
])

# Create a new column to store the square root of `side 1`:
df["sqrt1"] = df["side 1"].apply(np.sqrt)
df
```
Python Output:
```
   side 1  side 2  side 3      sqrt1
0      94      71      25   9.695360
1      79      36     100   8.888194
2     100      64      49  10.000000
3      36      45      64   6.000000
```
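Other numpy functions can be passed to apply in the same way. The following is a minimal sketch, assuming the "side 1" values shown in the output above; the `sides`, `squared`, and `log10` names are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical one-column DataFrame reusing the "side 1" values from the output above
sides = pd.DataFrame({"side 1": [94, 79, 100, 36]})

# np.square returns the square of each value
sides["squared"] = sides["side 1"].apply(np.square)

# np.log10 returns the base-10 logarithm of each value
sides["log10"] = sides["side 1"].apply(np.log10)

print(sides)
```

Any function that accepts a single value, or a whole column as numpy functions do, can be used here in place of np.sqrt.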
Creating Your Own Function for apply on a Series
When using one column of data, any function that takes a single parameter (one value from your data) can be used with apply. Here is a custom function that returns "large" if the input is 80 or larger and "small" otherwise, which we then pass to apply on the "side 1" column:
```python
import numpy as np
import pandas as pd

# Creates a DataFrame with 'side 1', 'side 2', and 'side 3' columns
df = pd.DataFrame([
    {'side 1': 94, 'side 2': 71, 'side 3': 25},
    {'side 1': 79, 'side 2': 36, 'side 3': 100},
    {'side 1': 100, 'side 2': 64, 'side 3': 49},
    {'side 1': 36, 'side 2': 45, 'side 3': 64},
])

# Square root column from the previous example:
df["sqrt1"] = df["side 1"].apply(np.sqrt)

def isLarge(value):
    if value >= 80:
        return "large"
    else:
        return "small"

# Using our custom `isLarge` function, we add a new "side 1 size" column to our DataFrame:
df["side 1 size"] = df["side 1"].apply(isLarge)
df
```
Python Output:
```
   side 1  side 2  side 3      sqrt1 side 1 size
0      94      71      25   9.695360       large
1      79      36     100   8.888194       small
2     100      64      49  10.000000       large
3      36      45      64   6.000000       small
```
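To see what apply does with a custom function, it can help to compare it with calling the function on each value by hand. This is a small sketch under assumptions: `is_large`, `sides`, `via_apply`, and `via_loop` are names chosen for this illustration, and the values are the "side 1" values from the output above.

```python
import pandas as pd

# Hypothetical Series holding the "side 1" values from the output above
sides = pd.Series([94, 79, 100, 36], name="side 1")

def is_large(value):
    # Same rule as the custom function above: 80 or larger counts as "large"
    return "large" if value >= 80 else "small"

# apply calls is_large once per value in the Series
via_apply = sides.apply(is_large)

# The same labels, built by calling is_large on each value ourselves
via_loop = [is_large(value) for value in sides]

print(via_apply.tolist() == via_loop)  # True
```

The difference is that apply returns a Series that keeps the original index, so the result can be assigned straight back to the DataFrame as a new column.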
Using apply on a DataFrame
Instead of using apply on a single column (a Series), we can also use apply on the whole DataFrame.
The default axis for applying the function is axis = 0 (applying the function to each column). To apply the function to each row, we specify axis = 1.
For example, let's find the perimeter of each triangle:
```python
import numpy as np
import pandas as pd

# Creates a DataFrame with 'side 1', 'side 2', and 'side 3' columns
df = pd.DataFrame([
    {'side 1': 94, 'side 2': 71, 'side 3': 25},
    {'side 1': 79, 'side 2': 36, 'side 3': 100},
    {'side 1': 100, 'side 2': 64, 'side 3': 49},
    {'side 1': 36, 'side 2': 45, 'side 3': 64},
])

# Summing the columns of each row to find the perimeter
df[["side 1", "side 2", "side 3"]].apply(np.sum, axis=1)
```
Python Output:
```
0 190
1 215
2 213
3 145
dtype: int64
```
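To make the difference between the two axis values concrete, here is a short sketch that runs the same function both ways. It assumes the side lengths shown in the outputs above; `column_totals` and `row_totals` are names made up for this illustration.

```python
import numpy as np
import pandas as pd

# Side lengths taken from the outputs above
df = pd.DataFrame({
    "side 1": [94, 79, 100, 36],
    "side 2": [71, 36, 64, 45],
    "side 3": [25, 100, 49, 64],
})

# axis=0 (the default): np.sum receives each column, giving one total per column
column_totals = df.apply(np.sum, axis=0)

# axis=1: np.sum receives each row, giving one total (the perimeter) per triangle
row_totals = df.apply(np.sum, axis=1)

print(column_totals)
print(row_totals)
```

With axis=0 the result is indexed by column name, and with axis=1 it is indexed by row label, which is why only the axis=1 result lines up with the triangles.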
Creating Your Own Function for use with apply on a DataFrame
Similar to using apply on a single column, we can create a custom function. With axis=1, our custom function receives a whole row of data every time it is called instead of a single value. For example, to find the area of each triangle from its three sides using Heron's formula:
```python
import numpy as np
import pandas as pd

# Creates a DataFrame with 'side 1', 'side 2', and 'side 3' columns
df = pd.DataFrame([
    {'side 1': 94, 'side 2': 71, 'side 3': 25},
    {'side 1': 79, 'side 2': 36, 'side 3': 100},
    {'side 1': 100, 'side 2': 64, 'side 3': 49},
    {'side 1': 36, 'side 2': 45, 'side 3': 64},
])

def findArea(row):
    # Heron's formula: s is the semi-perimeter (half the sum of the three sides)
    s = (row["side 1"] + row["side 2"] + row["side 3"]) / 2
    return np.sqrt(s * (s - row["side 1"]) * (s - row["side 2"]) * (s - row["side 3"]))

# axis=1 passes each row to findArea
df.apply(findArea, axis=1)
```
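One way to check a row-wise area function is to run it on triangles whose areas are already known. The sketch below is only an illustration: `heron_area`, `check`, and `areas` are hypothetical names, and the 3-4-5 and 5-12-13 right triangles are test data that is not part of the lesson's DataFrame. It uses Heron's formula, where s is half the perimeter and the area is the square root of s(s - a)(s - b)(s - c).

```python
import numpy as np
import pandas as pd

def heron_area(row):
    # Unpack the three side lengths from the row
    a, b, c = row["side 1"], row["side 2"], row["side 3"]
    # Semi-perimeter: half the sum of the sides
    s = (a + b + c) / 2
    return np.sqrt(s * (s - a) * (s - b) * (s - c))

# Hypothetical test triangles: right triangles with areas 6 and 30
check = pd.DataFrame([
    {"side 1": 3, "side 2": 4, "side 3": 5},
    {"side 1": 5, "side 2": 12, "side 3": 13},
])

# axis=1 passes each row to heron_area
areas = check.apply(heron_area, axis=1)
print(areas)
```

For a right triangle the area is half the product of the two shorter sides, so the expected values are 3 * 4 / 2 = 6 and 5 * 12 / 2 = 30.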