MicroProject #2: NWS Hourly Weather Forecast
In this MicroProject, you will do real data science in less than an hour, and you will add this MicroProject's card to your collection when you fully complete it!
Data API: National Weather Service (NWS) Weather Forecast
The United States' National Weather Service (NWS) allows, for free, "developers access to critical forecasts, alerts, and observations, along with other weather data." When any organization provides developers access to data, the most common way to provide data is through an Application Programming Interface or API. An API is simply a documented and structured way that data is provided, so that a developer is able to use the data in a reliable and predictable way. You can find the full documentation of the NWS API at: https://www.weather.gov/documentation/services-web-api.
In this MicroProject, you will use the NWS API to find the weather forecast for your location (anywhere in the United States!), and the code you write will fetch the latest forecast every time you run it. Let's nerd out with some weather data!
Background Knowledge
To finish this MicroProject, we assume you already know how to:
- Load a CSV file into a DataFrame using pd.read_csv (review loading a CSV file)
- Perform simple row selection of a DataFrame (review row selection)
With that knowledge, this MicroProject will guide you through nerding out with the NWS API, learning how to use a new pd.read_json command, and then creating a scatter plot visualization of every hour in the forecast. Let's get started! :)
Part 1: Retrieving Your Forecast
The weather is quite different in Central Illinois than in Southern California, so the first thing we need to do is find the latitude and longitude coordinates of your location so that we can request the weather forecast for your location.
A quick way to find your latitude and longitude is to use Google Maps:
- Open https://maps.google.com/ and right click on a location. (Since this is NWS data, make sure your location is in the United States.)
- The first option on your right-click menu will be the latitude, longitude of your chosen location on the map.
- Clicking your latitude, longitude will copy the data to your clipboard!
Once you have your location, separate your latitude (the first number, it will be a positive number) and longitude (the second number, it will be a negative number) into two separate Python variables named latitude and longitude:
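As a minimal sketch, the two variables might look like the following. The coordinates are only an illustrative assumption (a point in Central Illinois); replace them with the numbers you copied for your own location.

```python
# Example coordinates only (assumed for illustration): a point in Central Illinois.
# Replace both values with the numbers you copied from the map.
latitude = 40.1106    # the first number you copied (positive)
longitude = -88.2073  # the second number you copied (negative)

print(latitude, longitude)
```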
Part 1.1: Finding Your Forecast Area Endpoint
To find your weather forecast from the National Weather Service, you cannot directly use your latitude and longitude. Instead, the NWS organizes their forecasts into small geographical "forecast areas".
To find the "forecast area" from your latitude and longitude, the NWS API provides the /points/ endpoint, which uses a latitude,longitude value to return details about that location, including where to find its weather forecast. The full URL for this endpoint is in the following format:
https://api.weather.gov/points/LATITUDE,LONGITUDE
In the code cell below, create a string in the Python variable pointsUrl that contains the /points/ endpoint for your location.
- In your URL, you will need to replace the word LATITUDE with your actual latitude.
- You will also need to replace LONGITUDE with your actual longitude.
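One possible way to build this string is with an f-string, which substitutes the values of your variables into the URL format shown above. This is a sketch that reuses assumed example coordinates, so swap in your own.

```python
# Example coordinates only (assumed); use your own latitude and longitude.
latitude = 40.1106
longitude = -88.2073

# An f-string replaces {latitude} and {longitude} with the variable values.
pointsUrl = f"https://api.weather.gov/points/{latitude},{longitude}"
print(pointsUrl)
```

Typing the numbers directly into the string works just as well; the f-string only saves you from retyping them if you change locations later.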
Click the link you generated to see the raw data that you're retrieving from the NWS API! :)
Part 1.2: Retrieving the JSON Data
When you viewed the raw data by clicking the link you created, you saw data in a format called "JSON". This is an alternative data format to CSV that is useful for hierarchical data where each data point may have different categories.
Python provides a function similar to pd.read_csv for reading JSON formatted data: pd.read_json. The key difference is that data read by pd.read_json must be JSON data (instead of CSV or other types of data).
- Use pd.read_json(...) to use pandas's read_json function.
- The function requires two parameters, separated by a comma:
  - The first parameter is the URL to fetch the data from. You already stored this in the Python variable pointsUrl.
  - The second parameter tells pandas how to read the JSON. Here, the second parameter is typ="series".
This means you can read the JSON with the following command:
pd.read_json(pointsUrl, typ="series")
Store the result of read_json in a new Python variable named pointsJSON:
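If you would like to see what typ="series" does without contacting the API, the sketch below reads a small hand-written JSON string instead of a URL. The io.StringIO wrapper and every value in the sample are assumptions made for illustration; they are not real NWS data.

```python
import io
import pandas as pd

# Hand-written stand-in for an API response (placeholder values, not real NWS data).
sample = '{"id": "example-id", "type": "Feature", "properties": {"forecastHourly": "https://example.com/hourly"}}'

# typ="series" reads the top-level JSON object into a pandas Series,
# with one entry per top-level field. Nested objects stay as dictionaries.
pointsJSON = pd.read_json(io.StringIO(sample), typ="series")
print(pointsJSON.index.tolist())
```

In your own cell, the first argument is simply pointsUrl, exactly as in the command shown above.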
Part 1.3: Finding Your Forecast URL
Looking at the output of the pointsJSON above, you should see that there are fields of data including:
- geometry, which includes the geometric area for the forecast (you supplied a single point, but weather forecasts are given for areas that can be many square miles),
- properties, which includes all properties about the location you requested,
- ...and a few others (id, type, and @context).
You can access the details of any field within pointsJSON by using the following syntax:
pointsJSON["geometry"]    # Access the details of the "geometry" data within `pointsJSON`
pointsJSON["properties"]  # Access the details of the "properties" data within `pointsJSON`
pointsJSON["id"]          # Access the details of the "id" data within `pointsJSON`
# ...etc...
In the next Python cell, return all of the details of the "properties" data. We will then look through all of the various properties for the data available about our forecast area:
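The sketch below shows the same square-bracket access on a stand-in Series. The Series is built by hand here, and its field values are made-up placeholders, so your real output will contain many more properties.

```python
import pandas as pd

# Stand-in Series with a few top-level fields (placeholder values, assumed).
pointsJSON = pd.Series({
    "id": "example-id",
    "type": "Feature",
    "properties": {
        "forecast": "https://example.com/forecast",
        "forecastHourly": "https://example.com/forecast/hourly",
    },
})

# Square brackets select one field; "properties" holds a nested dictionary.
properties = pointsJSON["properties"]
print(sorted(properties.keys()))
```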
Part 1.4: Record Your Forecast URL
In the detailed output above, find the forecastHourly property.
The forecastHourly URL is the API endpoint that will contain the hourly forecast similar to what you see on any common weather app, with the forecasted temperature, sky condition (ex: sunny, cloudy, etc), chance of precipitation, and more.
To avoid typos, copy and paste just the URL for the hourly forecast and store it in the variable hourlyForecastURL.
- ⚠️ There are many things with forecast in this data, so make sure you're finding the field for forecastHourly.
- Make sure you're copying/pasting only the URL -- it should start with https://, end with /hourly, and have details about your specific location in the middle.
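A quick way to double-check what you pasted is to test the two conditions listed above. In this sketch the URL is a made-up placeholder (the XXX/0,0 portion is not a real location), so paste your own URL in its place.

```python
# Placeholder URL for illustration only; paste your own forecastHourly URL here.
hourlyForecastURL = "https://api.weather.gov/gridpoints/XXX/0,0/forecast/hourly"

# The pasted text should be only the URL: it starts with https:// and ends with /hourly.
looks_valid = hourlyForecastURL.startswith("https://") and hourlyForecastURL.endswith("/hourly")
print(looks_valid)
```

If the check prints False, the most likely cause is an extra quote, space, or comma copied along with the URL.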
Test Case: Part 1: Retrieving Your Forecast
Part 2: Loading Your Weather Forecast as a DataFrame
Just as you did in Part 1.2 of this MicroProject, use pd.read_json to load your hourly weather forecast.
- ⚠️ Your hourly weather forecast URL is the URL from Part 1.4, which is a different URL than the one you loaded the first time you read JSON data.
- Store the JSON in a new Python variable named forecastJSON:
Part 2.1: Finding the Forecast Data
Inside of the properties data, the periods data provides a well-defined, structured set of data that contains the forecast for every hour for your location.
Run the following cells as we dive deeper into the hourly forecast data:
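To illustrate the drill-down on its own, the sketch below reads a tiny hand-written forecast with the same properties and periods nesting. All values are made up, and io.StringIO stands in for the URL so the example runs offline.

```python
import io
import pandas as pd

# Hand-written stand-in for an hourly forecast response (made-up values).
sample = """{"properties": {"periods": [
  {"number": 1, "startTime": "2024-05-01T14:00:00-05:00", "temperature": 71, "windSpeed": "10 mph"},
  {"number": 2, "startTime": "2024-05-01T15:00:00-05:00", "temperature": 73, "windSpeed": "10 mph"}
]}}"""

forecastJSON = pd.read_json(io.StringIO(sample), typ="series")

# Drill down one level at a time: the Series field first, then the dictionary key.
periods = forecastJSON["properties"]["periods"]
print(len(periods), periods[0]["temperature"])
```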
Part 2.2: Converting Structured Data into a DataFrame
The data we find in forecastJSON["properties"]["periods"] above is consistently structured data. Specifically,
- Every entry has the exact same field names (ex: number, startTime, temperature, windSpeed, etc).
- The set of entries is organized as a list.
Because this data is consistently structured, we are able to create a DataFrame out of this data. Run the code below to create a DataFrame out of the forecastJSON that contains the current hourly forecast data for the location you selected in this MicroProject:
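One common way to do this conversion, shown here as a sketch rather than as the notebook's own cell, is to pass the list of entries to pd.DataFrame. The entries below are made-up stand-ins for the real periods data.

```python
import pandas as pd

# Stand-in for forecastJSON["properties"]["periods"] (made-up values).
periods = [
    {"number": 1, "startTime": "2024-05-01T14:00:00-05:00", "temperature": 71},
    {"number": 2, "startTime": "2024-05-01T15:00:00-05:00", "temperature": 73},
]

# A list of dictionaries with matching keys becomes one row per dictionary
# and one column per key.
df = pd.DataFrame(periods)
print(df.shape)
```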
Test Case: Part 2: Loading Your Weather Forecast as a DataFrame
Puzzle 3: Exploring Your Weather
Now that we have the current hourly weather forecast for your location loaded into the DataFrame stored in the Python variable df, it's time to nerd out with it!
- Looking at data stored in df in the output above the previous test case, you'll find the DataFrame has about 156 rows.
- Since each row contains a forecast for one hour, this means the entire DataFrame represents the forecast for the next 6.5 days.
- Let's find some interesting data about the upcoming few days. :)
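The 6.5 days figure comes from dividing the number of hourly rows by 24. The short check below uses the approximate row count quoted above; in your own notebook, len(df) gives the exact number of rows.

```python
rows = 156          # approximate row count mentioned above
hours_per_day = 24  # each row is a forecast for one hour

days = rows / hours_per_day
print(days)  # 6.5
```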
Puzzle 3.1: Your Warmest Upcoming Temperature
Using the DataFrame stored in the Python variable df, find the single row with the warmest temperature in the entire forecast and store that single row in a Python variable called df_warmest.
Puzzle 3.2: Your Coldest Upcoming Temperature
Using your data, find the coldest temperature in the entire forecast! Save that row as df_coldest:
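As a hedged sketch of the row selection involved in Puzzles 3.1 and 3.2, the example below works on a small made-up DataFrame rather than your forecast. It compares each temperature against the column's maximum or minimum, and also shows nlargest and nsmallest as an alternative that keeps exactly one row when several hours tie.

```python
import pandas as pd

# Made-up sample data standing in for the forecast DataFrame.
df = pd.DataFrame({
    "startTime": ["14:00", "15:00", "16:00", "17:00"],
    "temperature": [71, 75, 73, 64],
})

# Row selection: keep rows whose temperature equals the column maximum / minimum.
df_warmest = df[df["temperature"] == df["temperature"].max()]
df_coldest = df[df["temperature"] == df["temperature"].min()]

# If several hours tie, the selections above keep all of them;
# nlargest(1, ...) and nsmallest(1, ...) keep exactly one row instead.
df_warmest_one = df.nlargest(1, "temperature")
df_coldest_one = df.nsmallest(1, "temperature")

print(df_warmest)
print(df_coldest)
```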
Test Case: Puzzles 3.1 and 3.2: Warmest and Coldest Temperatures
Puzzle 3.3: Forecast Summary
In your data, the shortForecast column provides a brief summary of the forecasted condition of the sky for each hour. The same conditions appear multiple times throughout the forecast, so it'd be useful to get a count of how many times each value appears.
An extremely useful command to count the number of times a unique value appears in a column is:
df["column"].value_counts()
This will list all of the unique values that appear in the "column" specified, count how many times they appear, and even sort the results with the most commonly appearing unique value at the top!
Using the code above, find the counts for the unique values of shortForecast in your forecast. Store this result in a new Python variable named forecastSummary:
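To see the shape of the result before running it on your forecast, here is a small sketch on made-up sky conditions. The six values are invented for illustration only.

```python
import pandas as pd

# Made-up sample of hourly sky conditions.
df = pd.DataFrame({
    "shortForecast": ["Sunny", "Cloudy", "Sunny", "Sunny", "Cloudy", "Rain"],
})

# value_counts() returns a Series: unique values as the index, counts as the values,
# sorted with the most frequent value first.
forecastSummary = df["shortForecast"].value_counts()
print(forecastSummary)
```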
Test Case: Puzzle 3.3: Forecast Summary
Part 4: Create a Scatter Plot
Finally, like any good weather app, let's create a data visualization of the temperature each hour to understand the trends in the temperature!
To create a scatter plot with pandas, like almost all data visualizations, you need to identify the name of the column that you want to use for the x-axis data and the name of the column that you want to use for the y-axis data.
The Python code to create a scatter plot using a DataFrame stored in the variable df is then:
df.plot.scatter(x="x-column", y="y-column")
Use the above syntax, but make sure to:
- For the x-axis data, find and use the column name that contains the starting date/time for each hour,
- For the y-axis data, find and use the column name that contains the temperature for each hour,
- Then create a simple scatter plot! :)
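The sketch below shows the scatter syntax on a made-up DataFrame with hypothetical column names (hour and value), so that you can see the pattern before choosing the real columns from your forecast. The matplotlib.use("Agg") line is an assumption that lets the example run outside a notebook; you can leave it out in your own cell.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import pandas as pd

# Made-up data with hypothetical column names.
df = pd.DataFrame({
    "hour": [0, 1, 2, 3],
    "value": [10, 12, 11, 15],
})

# x and y are given as column names (strings), not as the columns themselves.
ax_basic = df.plot.scatter(x="hour", y="value")
```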
Refine Your Scatter Plot
Above, you have a scatter plot -- but the x-axis is likely unreadable and the plot is hard to read. Data visualizations in Python can take additional function parameters, by listing them inside of the df.plot.scatter(...) command similar to how the x and y parameters are already listed.
Additionally, we'll save this visualization in a variable called ax so we can verify it looks good!
In the cell below, add the following additional parameters to the df.plot.scatter function, making sure to separate each one by commas:
- Add your original x and y parameters that you used in the graph you just made to the code below, and then:
- Add title="Hourly Temperature Forecast for LOCATION", replacing LOCATION with the city/location you chose to find the forecast of in this MicroProject,
- Add xticks = df.startTime.values[::6] to show only every 6th tick to get fewer x-axis labels,
- Add rot = 90 to rotate the labels 90 degrees (vertical),
- Add grid = True to add gridlines,
- Add figsize = (10, 6) to make the figure 10 inches wide and 6 inches tall, making it larger (you can change these values to make it bigger/smaller as you want and they do not need to be exactly 10 and 6),
- Add all of these options, and any others you want, to get a really useful visualization! :)
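Here is how those parameters fit together in one call, again on made-up data with hypothetical column names (hour and value) and a placeholder title. In your own cell you would use your forecast columns and df.startTime.values[::6] for the ticks, as listed above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import pandas as pd

# Made-up data: 24 hourly values with hypothetical column names.
df = pd.DataFrame({
    "hour": list(range(24)),
    "value": [50 + (h % 12) for h in range(24)],
})

ax = df.plot.scatter(
    x="hour",
    y="value",
    title="Example Title",         # placeholder title
    xticks=df.hour.values[::6],    # [::6] keeps every 6th value: 0, 6, 12, 18
    rot=90,                        # rotate the x-axis labels to vertical
    grid=True,                     # draw gridlines
    figsize=(10, 6),               # width and height in inches
)
```

The slice [::6] is what thins out the labels: it starts at the first value and steps forward six rows at a time.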
Test Case: Part 4: Create a Scatter Plot
Share Your Weather Forecast Visualization!
The visualization you made is unique to your location and unique to the time you completed this MicroProject -- for every location, and for every hour, it will look a little different!
It would be an honor to me if you shared the image you created in the DISCOVERY Discord in the channel #02-hourly-weather-forecast, and checked out the forecasts for other times and locations that others have shared. :)
Hope to see your image in Discord!
Earn the MicroProject Collectable Card!
Congratulations on finishing the MicroProject!
To validate your entire project, your entire code will run from top-to-bottom on this page and each test case will be validated one final time. If everything looks good, you'll earn the card for completing this MicroProject: