Many business intelligence, data science, and academic research projects need high quality historical weather data to be available in a database. Weather data is typically quite large in volume, and its analysis often requires processing thousands or even millions of detailed records in order to find valuable patterns and apply them to the task at hand. For this reason, using a database as a storage and analysis platform is the best option for many projects.
In this article, we will walk through the steps of importing historical weather data into a database. While the flow is the same for any database, the exact command syntax and SQL varies somewhat for different database platforms. To make your project easier, we have noted specific commands for most popular database platforms below. Would you like us to add an entry for your favorite database? Add your voice to the comments below or email firstname.lastname@example.org, and we’ll do our best to help.
Creating a new database
This database creation step is required here for completeness so that readers starting from scratch can follow along easily. However, many readers may already have an existing database into which they want to add weather data. If you are one of those, you can safely skip this step and jump to the next header below.
If you have just installed a fresh database server or want to keep your weather data entirely separate from your existing data, then you will want to create a new database instance. In nearly all databases this can be done simply by logging in with administrator access and executing a “create database” command like this.
CREATE DATABASE Weather_DB;
(Note that this example names the database “Weather_DB”. Feel free to choose something else that better meets your own naming convention. Just make sure to use the same name consistently throughout all of the steps in the article.)
In most databases there are optional parameters that can be added to tune the database or the storage associated with it. For example, in MySQL you can specify a character set and collation type, in DB/2 you can specify security settings and storage options, and in Oracle and SQL Server you can set a host of options including languages, logfiles, and date formats. If you are working in an environment managed by others, then you will want to check with your local DBA to determine any appropriate, additional options. Otherwise, the universal create database command above should be fine to get you started.
Obtaining weather data
Now that we have a database in which to store weather data, we need to find a source for that data. A weather data service such as Visual Crossing Weather can provide bulk weather data in various convenient formats such as CSV that be directly imported into nearly any database. The web-based query interface lets you specify the detailed weather query that we want to run and then allows you to instantly download the results as a CSV file. You can then use this same query as the foundation for a recurring bulk data query, if want to retrieve updated data over time.
If you would like to follow a step-by-step guide for downloading a CSV result set based on your own weather query, please follow the CSV tutorial or watch the matching guide video. Following these steps will get you a weather data CSV file for your exact location and date range in just a couple of minutes.
Creating a weather table
Now that we have a CSV file containing weather data, we are ready to load it as a table into the database. If you are familiar with your database’s GUI tools, you can typically load a CSV file easily via their UI. However, in this article we’ll discuss how to load the file via a scriptable command. This is extremely useful since SQL and other scripts can be easily automated for future and periodic use. In many cases you will want to load periodic weather data updates as part of an ETL process.
If you are using a command line tool, before working with a database, you typically need to set that database as the current default target for your actions. On most databases, this is done with the “use” command.
(Note that the database name “Weather_DB” matches the name that we created in this article above. If the database in which you want to store the data has a different name, simply replace “Weather_DB” with that alternate name.)
However, on DB/2, you typically connect log into a database directly, so you would use a command such as “connect to Weather_DB” instead. This is also true on Microsoft Access and Postgres (also known as PostgreSQL) where you connect to a specific database when you log in.
We next need to create the table to hold the weather data. The table structure is the same for all major databases even though the column types differ slightly. We’ll start off with a standard SQL create statement and then note how to modify it as needed for various popular databases.
CREATE TABLE Weather_Data ( address varchar(256) DEFAULT NULL, latitude float DEFAULT NULL, longitude float DEFAULT NULL, date_time datetime DEFAULT NULL, maxt float DEFAULT NULL, mint float DEFAULT NULL, temperature float DEFAULT NULL, precip float DEFAULT NULL, wspd float DEFAULT NULL, wdir float DEFAULT NULL, wgust float DEFAULT NULL, pressure float DEFAULT NULL);
This table structure will work for Microsoft SQL Server, MySQL, MariaDB, and other databases that adhere to standard SQL. However, some databases require minor modifications to the types. Here are the changes required for other popular databases.
- IBM DB/2
- Change the “float” type to “real”. (Note that many modern versions of DB/2 have an internal mapping that allows you to create a column as type “float” and let DB/2 automatically store it as type “real”.
- Microsoft Access
- Change the “varchar” type to “text”.
- Change the “datetime” type to be “date”.
- Although it is optional, Oracle recommends that you create variable-length string columns using the “varchar2” type instead of “varchar”. Internally, they both represent the same type to the database.
- Change the “datetime” to be “timestamp”.
- Change “float” to be “real”.
Loading the weather data
Now we are ready to load the weather data into our new table. The syntax is somewhat different for nearly every database. However, the process is the same. We want to take the CSV file that we downloaded and use a SQL or script command to import it into the table the that we just created to store it.
Because the command is different for each database, we’ll discuss each popular database in turn. Also, for the sake of these examples, we’ll assume the the weather data file name is named “Weather_File.csv” and is stored in a directory named “c:\MyWeather”. If your location and name are different, in the example commands below, just change the file parameter to match your own local paths.
In DB/2 the command to import our historical weather CSV file is as follows:
IMPORT FROM "c:\MyWeather\Weather_File.csv" OF DEL SKIPCOUNT 1 INSERT INTO Weather_Data;
This will load the entire set of records from the weather CSV file into the table that we created above. Note that the “SKIPCOUNT 1” parameter tells DB/2 to skip the headers in the first row of data. For more details on optional parameters and usage of the DB/2 “Import” command, please see the article from IBM here: https://www.ibm.com/support/pages/how-import-data-db2-database-table-csv-file.
In Microsoft Access loading a CSV is typically done via the graphical wizard. However, you can also use VBA to script the process. (There are many tutorials on the internet showing the steps for using the Access GUI import functionality, if you need a helping hand. Here is one example: https://blog.ip2location.com/knowledge-base/how-to-import-csv-into-microsoft-access-database/) In VBA the action command to load our sample CSV is named “TransferText” and looks like this:
DoCmd.TransferText TransferType:= acImportdelim, TableName:="Weather_Data", FileName:="c:\MyWeather\Weather_File.csv", HasFieldNames:=true
You will need to run this command as VBA within Access. For this you often need to wrap this command into a simple VBA function. Note that the HasFieldNames tells Access that the first row of the CSV contains our column names, and these should match the table that we created above. You can read more about the details and parameters of the “TransferText” function here:
Microsoft SQL Server
SQL Server can directly import our weather CSV file with a SQL command such as this.
BULK INSERT Weather_Data FROM 'c:\MyWeather\Weather_File.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK )
These standard parameters tell SQL Server to skip the header row, configure the delimiters, and request a table lock (“TABLOCK”) during the insert process. To learn more about SQL Server’s “Bulk Insert” command parameters, see Microsoft’s documentation here: https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-ver15
Note also that it is also easy to accomplish this CSV import via the SQL Server GUI tools. If you are interested in this option, you can follow it step-by-step in Microsoft’s tutorial here: https://docs.microsoft.com/en-us/sql/relational-databases/import-export/import-flat-file-wizard?view=sql-server-ver15
MySQL (and MariaDB)
MySQL’s import syntax is similar to SQL Server. You can import our weather data file with this command:
LOAD DATA INFILE 'c:\MyWeather\Weather_File.csv' INTO TABLE Weather_Data FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '/n' IGNORE 1 ROWS;
The parameters simply tell MySQL the table name, the delimiters, and to ignore the header row in the data. If you want to learn more about the other parameters that are available, you can find the detailed “Load Data” documentation here: https://dev.mysql.com/doc/refman/8.0/en/load-data.html and very similar options for MariaDB here: https://mariadb.com/kb/en/load-data-infile/. Note that this import can also be accomplished via various GUI admin tools such as MySQL Workbench.
It is likely not a surprise to those of you who are Oracle users that Oracle is a bit more complex than other databases when it comes to importing CSV files. The standard way to accomplish this is via the SQL Loader tool. If you have any experience with Oracle, you have likely already used this tool. It is an external tool with the primary mission of importing data from external sources into Oracle database tables. Hence it is ideally suited to our CSV import task.
SQL Loader is powered by a Control File that feeds the appropriate load parameters to tell the tool where to find the data and how to load it. In our case, we can use a control file like this to load our weather data.
LOAD DATA infile 'c:\MyWeather\Weather_File.csv' REPLACE INTO TABLE Weather_Data fields terminated by ',' optionally enclosed by '"' ( address, latitude, longitude, date_time, maxt, mint, temperature, precip, wspd, wdir, wgust, pressure )
If you saved the above command file with the name WeatherDataLoader.ctl, you could then run SQL Loader with a command like this.
sqlldr userid/passwd@Weather_DB control=WeatherDataLoader.ctl
(Note that you need to replace the placeholders in the command above with your proper userid and password.)
As with many tasks in Oracle, there are several other ways to accomplish a CSV import. You can use the SQL Developer admin tool GUI or even another Oracle command line utility package cleverly named UTL.
Postgres makes it easy to copy data from a CSV file into a table using the “copy” command. Simply specify basic parameters using a command like this:
COPY Weather_Data FROM 'c:\MyWeather\Weather_File.csv' DELIMITER ',' CSV HEADER;
The “CSV HEADER” parameter tells Postgres to treat the first row of the CSV as headers instead of data. If you want more details on the Postgres “Copy” command and its many options, you can find the details here: https://www.postgresql.org/docs/9.2/sql-copy.html
Summary and What’s Next
If you followed along, by now you should have some historical weather data loaded into your database table. Of course, this is just the starting point. You can now begin the interesting and enlightening process of analyzing the data to find weather trends, patterns, and relationships with other data. If you have other data such as business transactions, customer details, employee productivity records, etc. you can now learn how your existing data correlates with weather and which specific weather metrics are interesting to your business. You likely want to use a business intelligence tool such as Power BI, Tableau, Qlik, Oracle Analytics Desktop, TIBCO Spotfire, or R Studio. Tools such as these will leverage the power of your database to make finding patterns and correlations easy. The results can then be displayed and analyzed graphically.
If you have questions, need more information about how to use weather data in general, or how to connect weather data to your favorite application, just reach out to our support team. We are always ready to help you find the best way to use weather data whatever your project may be.