In How to Import Weather Data in MySQL, we demonstrated how to load weather forecast data into MySQL using a Python script that ran at regular intervals. Historical weather data is often just as critical for data science applications, so in this article we demonstrate how to load weather history data using Python. We will use the Visual Crossing Weather API, which offers a free tier that includes past weather data.
We are using the Timeline Weather API, so the code in this article can retrieve either forecast or history data within Python. If you supply a date range in the future, you will retrieve the weather forecast. If you supply a date range in the past, you will retrieve historical data. The switch between history and forecast is seamless: there is no change in the data format.
Why shouldn’t we scrape weather data from our favorite web site?
Scraping weather data means we simply visit a web site and either manually or programmatically copy the data from that web page. Programmatic scraping of weather data can be difficult to implement and then difficult to maintain. When the web site changes (even for very small changes), the scraping code may need changing. Most importantly, scraping weather data is against the usage terms of almost all web sites.
If we want to retrieve weather data at regular intervals, we need a more reliable solution. Using a Weather API means we do not have to scrape data.
Before starting, you need access to the Weather API we are going to use. If you don’t have an account already, you can sign up for a free account at Visual Crossing Weather Services. Visual Crossing offers a perpetual free tier of up to 1000 results per day. If you need more, you can pay per result or sign up for a monthly plan.
Our Python script has been written in Python 3.8.2 but should work in most recent versions. We have kept the library requirements to a minimum. In this sample we are going to use the CSV format to download the data so we include a library to help process the data that is returned by the API.
Here’s the full list of import statements in our script:

import csv
import codecs
import urllib.request
import urllib.error
import sys
Setting up the weather data input parameters
The first part of our script sets up the parameters for the script to download the weather data and the second part retrieves the weather data. The weather data is retrieved using a RESTful weather API so we simply have to create a web query within the Python script and download the data.
The first part of the script sets up some parameters to customize the weather data that is retrieved:
BaseURL = 'https://weather.visualcrossing.com/VisualCrossingWebServices/rest/services/timeline/'

ApiKey = 'YOUR_API_KEY'
# UnitGroup sets the units of the output - us or metric
UnitGroup = 'us'

# Location for the weather data
Location = 'Washington,DC'

# Optional start and end dates
# If nothing is specified, the forecast is retrieved.
# If start date only is specified, a single historical or forecast day will be retrieved
# If both start and end date are specified, a date range will be retrieved
StartDate = ''
EndDate = ''

# JSON or CSV
# JSON format supports daily, hourly, current conditions, weather alerts and events in a single JSON package
# CSV format requires an 'include' parameter below to indicate which table section is required
ContentType = "csv"

# include sections
# values include days,hours,current,alerts
Include = "days"
In the above code, we set up the parameters for the script. To keep this script simple, we’ve hardcoded the various parameters as variables to help with readability.
The Location parameter is the location for which the weather data should be downloaded. This is in the form of an address, partial address or latitude and longitude value (for example 35.46,-75.12).
The ApiKey parameter holds the API key, which is provided when you sign up for the API. You can access it from the ‘Account’ location within the Weather Data Account Page.
Next, we can specify the date range of information we are interested in. The Weather API will automatically request historical or forecast data based on the date range requested. The code requests a start and end date in the form YYYY-MM-DD, for example 2020-03-26 is the 26th March, 2020. The format of the date is important for the weather API query.
If the start date isn’t specified, the query will request the 15-day weather forecast.
You can also use a dynamic date period as the start date, such as yesterday, tomorrow or last7days. See Weather Data Periods.
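The YYYY-MM-DD format described above can be produced with Python's standard datetime module rather than typed by hand. The following is a minimal sketch; the helper name last_n_days and its today argument are hypothetical and not part of the original script.

```python
import datetime


def last_n_days(n, today=None):
    """Return (start, end) as YYYY-MM-DD strings for the n days before `today`."""
    if today is None:
        today = datetime.date.today()
    end = today - datetime.timedelta(days=1)    # yesterday
    start = today - datetime.timedelta(days=n)  # n days back
    # date.isoformat() always yields the YYYY-MM-DD form
    return start.isoformat(), end.isoformat()


# A fixed "today" keeps the example reproducible
StartDate, EndDate = last_n_days(7, today=datetime.date(2020, 3, 26))
print(StartDate, EndDate)
```

The two returned strings can be assigned directly to the StartDate and EndDate variables used by the script.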
Many more API parameters are available. For more information on the full set of Weather API parameters, see the Weather API documentation.
Downloading the weather data
The next section of code creates the Weather API request from the parameters, submits the request to the server and then parses the result.
# basic query including location
ApiQuery = BaseURL + Location

# append the start and end date if present
if (len(StartDate)):
    ApiQuery += "/" + StartDate
    if (len(EndDate)):
        ApiQuery += "/" + EndDate

# Url is completed. Now add query parameters (could be passed as GET or POST)
ApiQuery += "?"

# append each parameter as necessary
if (len(UnitGroup)):
    ApiQuery += "&unitGroup=" + UnitGroup

if (len(ContentType)):
    ApiQuery += "&contentType=" + ContentType

if (len(Include)):
    ApiQuery += "&include=" + Include

ApiQuery += "&key=" + ApiKey
The first part of the code assembles the request into a single URL. In this example, we are sending the request as a GET request. Other request techniques are available, such as POST, which is useful if your list of locations is long, or even OData for specialized data import and data science applications. See the Weather API documentation section for more information.
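String concatenation works for simple locations, but a location containing spaces or other special characters should be percent-encoded before it goes into a URL. The sketch below shows one possible way to do this with the standard urllib.parse module. The function name build_query is hypothetical, and leaving commas unencoded in the path is an assumption based on the location examples in this article.

```python
import urllib.parse

BASE_URL = 'https://weather.visualcrossing.com/VisualCrossingWebServices/rest/services/timeline/'


def build_query(location, api_key, start_date='', end_date='', **params):
    """Build a Timeline API URL, percent-encoding the path and query parts."""
    segments = [location]
    if start_date:
        segments.append(start_date)
        # An end date is only added when a start date is present
        if end_date:
            segments.append(end_date)
    # Commas are left as-is (assumption); spaces and other characters are encoded
    path = '/'.join(urllib.parse.quote(s, safe=',') for s in segments)
    # Skip empty parameters, then add the API key last
    query = {name: value for name, value in params.items() if value}
    query['key'] = api_key
    return BASE_URL + path + '?' + urllib.parse.urlencode(query)


url = build_query('Herndon, VA', 'YOUR_API_KEY', '2019-01-01', '2019-01-02',
                  unitGroup='us', contentType='csv', include='days')
print(url)
```

Using urlencode also removes the need to track whether a parameter is the first one after the question mark.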
print(' - Running query URL: ', ApiQuery)
print()

try:
    CSVBytes = urllib.request.urlopen(ApiQuery)
except urllib.error.HTTPError as e:
    # The server responded with an error status; the response body holds the details
    ErrorInfo = e.read().decode()
    print('Error code: ', e.code, ErrorInfo)
    sys.exit()
except urllib.error.URLError as e:
    # No HTTP response was received (for example a network failure), so there is no code or body to read
    print('Error reason: ', e.reason)
    sys.exit()

This code downloads the requested weather data and provides some simple error handling. We have used the urllib.request library to provide the retrieval functionality here.
Error handling with urllib.request
It is important to provide error handling so that problems may be resolved quickly. When providing the error handling, ensure that you check the HTTP response code and also read the response body.
The response body will contain full details of the Weather API error and is the best way to resolve most problems. Another useful way to troubleshoot API requests that are returning an error is to copy the URL from the code into a browser window. This will provide a quick and easy way to see any errors that are being returned. This technique can also be used to see the structure of the weather data. Don’t forget you can also use the Weather Data Services query builder page to construct requests and see results.
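To illustrate the advice above, here is a small sketch of a helper that turns a urllib exception into a readable message. The name describe_error is hypothetical, and the failures are simulated (including the error text) so the example runs without network access. Note that HTTPError must be checked first because it is a subclass of URLError, and that a plain URLError has a reason but no status code or response body.

```python
import email.message
import io
import urllib.error


def describe_error(exc):
    """Turn a urllib exception into a printable message."""
    if isinstance(exc, urllib.error.HTTPError):
        # HTTPError carries a status code and a readable response body
        body = exc.read().decode('utf-8', errors='replace')
        return 'HTTP error {}: {}'.format(exc.code, body)
    if isinstance(exc, urllib.error.URLError):
        # URLError (for example a DNS failure) has only a reason
        return 'Connection error: {}'.format(exc.reason)
    return 'Unexpected error: {}'.format(exc)


# Simulated failures; the body text is made up for illustration
fake_http = urllib.error.HTTPError(
    'https://example.invalid', 401, 'Unauthorized',
    email.message.Message(), io.BytesIO(b'Simulated error body'))
print(describe_error(fake_http))
print(describe_error(urllib.error.URLError('name resolution failed')))
```

In a real script the helper would be called from an except clause around urllib.request.urlopen.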
Using the weather data
As mentioned above, we are using Comma Separated Values for the output in this example, but JSON format is available too. The CSV data is encoded in UTF-8, so we specify that encoding to ensure accurate decoding.

# Parse the results as CSV
CSVText = csv.reader(codecs.iterdecode(CSVBytes, 'utf-8'))
We now have the weather data as a csv.reader object named CSVText. From here we can use the data in many ways. For example, we can analyze the weather data in R, load it into a database or simply display it to the user. In this example, we simply display the data on the screen.
The raw data is simply a table of weather data rows, in this case the historical weather data for each day requested.
name,datetime,tempmax,tempmin,temp,feelslikemax,feelslikemin,feelslike,dew,humidity,precip,precipprob,precipcover,preciptype,snow,snowdepth,windgust,windspeed,winddir,sealevelpressure,cloudcover,visibility,solarradiation,solarenergy,uvindex,severerisk,sunrise,sunset,moonphase,conditions,description,icon,stations
"Washington, DC, United States",2021-12-14,58.9,34.8,44.3,58.9,34.8,44,28.4,58.5,0,0,,,0,0,9.2,4.7,66.6,1032.2,34.4,12.5,118,7.1,4,10,2021-12-14T07:19:08,2021-12-14T16:46:50,0.41,Partially cloudy,Partly cloudy throughout the day.,partly-cloudy-day,"KDCA,F0198,KADW,KDAA,PWDM2"
"Washington, DC, United States",2021-12-15,54.1,36,45.3,54.1,36,45.1,39,78.9,0,4,,,0,0,8.5,4.3,109.6,1031.5,63.5,15,81.4,7.1,4,10,2021-12-15T07:19:49,2021-12-15T16:47:07,0.44,Partially cloudy,Partly cloudy throughout the day.,partly-cloudy-day,
"Washington, DC, United States",2021-12-16,62,45,53.7,62,41.9,52.6,36.1,67.1,0,16,,,0,0,20.8,10.3,208.8,1019.9,55.7,15,73.3,6.4,3,10,2021-12-16T07:20:29,2021-12-16T16:47:26,0.47,Partially cloudy,Partly cloudy throughout the day.,partly-cloudy-day,

Address,Date time,Minimum Temperature,Maximum Temperature,Temperature,Dew Point,Relative Humidity,Heat Index,Wind Speed,Wind Gust,Wind Direction,Wind Chill,Precipitation,Precipitation Cover,Snow Depth,Visibility,Cloud Cover,Sea Level Pressure,Weather Type,Latitude,Longitude,Resolved Address,Name,Info,Conditions
"Herndon,VA",01/01/2019,44.6,60.8,53.1,45.1,75.58,,23.2,36.3,269.29,41.4,0,8.33,,9.1,83.5,1014.8,"Mist, Fog, Light Rain",38.96972,-77.38519,"Herndon,VA","","","Overcast"
"Herndon,VA",01/02/2019,37.3,44.9,42.4,35.9,78.17,,8.2,,143.95,38.7,0,0,,10,98.3,1023.6,"",38.96972,-77.38519,"Herndon,VA","","","Overcast"
The code simply steps through the rows of CSV data and prints them out.

RowIndex = 0

# The first row contains the headers and the additional rows each contain the weather metrics for a single day
# To simplify our code, we use the knowledge that column 0 contains the location and column 1 contains the date. The data starts at column 4
for Row in CSVText:
    if RowIndex == 0:
        FirstRow = Row
    else:
        print('Weather in ', Row[0], ' on ', Row[1])

        ColIndex = 0
        for Col in Row:
            if ColIndex >= 4:
                print('   ', FirstRow[ColIndex], ' = ', Row[ColIndex])
            ColIndex += 1
    RowIndex += 1
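As an alternative to tracking column positions by hand, the standard csv.DictReader class maps each row to the header names, which can make later analysis easier to read. The following is a minimal sketch rather than part of the original script: the helper name read_days is hypothetical, and the sample text is a trimmed version of the CSV shown above, keeping only a few columns.

```python
import csv
import io

# A trimmed sample in the same shape as the CSV shown earlier
# (only a few of the columns are kept, for brevity)
SAMPLE = (
    'name,datetime,tempmax,tempmin,temp\n'
    '"Washington, DC, United States",2021-12-14,58.9,34.8,44.3\n'
    '"Washington, DC, United States",2021-12-15,54.1,36,45.3\n'
    '"Washington, DC, United States",2021-12-16,62,45,53.7\n'
)


def read_days(csv_lines):
    """Return one dict per day, keyed by the CSV header names."""
    return list(csv.DictReader(csv_lines))


days = read_days(io.StringIO(SAMPLE))

# Values arrive as strings, so convert before comparing numerically
warmest = max(days, key=lambda day: float(day['tempmax']))
print(warmest['datetime'], warmest['tempmax'])
```

With a live response, the same function should accept codecs.iterdecode(CSVBytes, 'utf-8') in place of the io.StringIO object, since DictReader takes any iterable of text lines.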
Parsing JSON data
If you prefer to use JSON-formatted data (which can often be significantly faster to process within Python), you can parse the response as follows:

import json
....
weatherData = json.loads(data.decode('utf-8'))
The variable weatherData now contains the parsed weather data set and can be easily processed.
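As a rough illustration of working with the parsed result, the sketch below loops over the daily entries. The data value here is a hand-written stand-in for the raw response bytes, and the layout (a top-level "days" list whose field names match the CSV columns) is an assumption; check an actual response or the Weather API documentation for the exact structure.

```python
import json

# Hand-written stand-in for the raw response bytes.
# The "days" list and its field names are assumed, and trimmed for brevity.
data = b'''{
  "resolvedAddress": "Washington, DC, United States",
  "days": [
    {"datetime": "2021-12-14", "tempmax": 58.9, "tempmin": 34.8},
    {"datetime": "2021-12-15", "tempmax": 54.1, "tempmin": 36.0}
  ]
}'''

weatherData = json.loads(data.decode('utf-8'))

# .get() avoids a KeyError if the response has no daily section
for day in weatherData.get('days', []):
    print(day['datetime'], day['tempmin'], day['tempmax'])
```

Unlike the CSV rows, the JSON values are already numbers, so no string conversion is needed before doing arithmetic on them.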
Full source code for downloading historical weather data in Python
For the full code listing, head over to our GitHub, which hosts this and other Weather API examples in Python, Java and other languages.
This simple code demonstrates how easy it is to download historical weather data using Python without the headaches of scraping historical data. With the weather data in Python, you can now start analyzing and using the data in your next project.
Questions or need help?
If you have a question or need help, please post on our actively monitored forum for the fastest replies. You can also contact us via our support site or drop us an email at email@example.com.