This post serves as a follow-up on a previous post about scheduled collection of Weather.gov’s XML feed in R, which itself was a follow-up to retrieving real-time data from Weather.gov in Excel.
Reflecting on the best way to accomplish this automation, I noticed something back on Weather.gov’s update page: an option for a 2-day weather history! Duh! Why automate collection every hour when I could use this link to get the history once a day (or more often)?
Turns out this link brings you to an HTML page with a table that records weather updates on the hour with lots of information, more than the XML page, in fact!
For this I will use R’s htmltab package to read this table into an R data frame, then do some manipulation before getting it into our workbook.
Let’s get started. First time using R? Check out my free course, “5 Things Excel Users Should Know About R.”
1. Inspect the table
To figure out how to pull this table into R, we need to look under the hood of our website. To do that, right-click somewhere in the table in your browser and click “Inspect” (some browsers label it “Inspect Element”).
2. Copy the table’s XPath
An inspector panel now opens on your page, showing the page’s HTML. Notice that when you hover over different parts of this markup, different parts of the web page are highlighted. Keep hovering until the table we want to download is highlighted. We need some information about this table to write a script that collects it.
Once you have the table highlighted, right-click on this line of code and select Copy – XPath.
We will be using this in the R Script below.
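To make the role of the XPath concrete, here is a minimal sketch of how a copied XPath picks one table out of a page. It uses the rvest package rather than htmltab, and both the toy page and the XPath are made up for illustration; the real path is whatever your browser copied.

```r
library(rvest)

# Toy page standing in for the real one (an assumption for illustration).
# The second table plays the role of the weather history table.
page <- read_html("<html><body>
  <table><tr><td>navigation</td></tr></table>
  <table>
    <tr><th>Time</th><th>Temp</th></tr>
    <tr><td>13:53</td><td>71</td></tr>
    <tr><td>12:53</td><td>69</td></tr>
  </table>
</body></html>")

# Hypothetical XPath in the style a browser's Copy XPath option produces
table_xpath <- "/html/body/table[2]"

# Select the node at that path and convert it to a data frame
history <- html_table(html_element(page, xpath = table_xpath))
print(history)
```

The XPath is just an address: change the index and you get a different table, which is why it pays to confirm the highlight in the inspector before copying.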
3. Assemble R Script
This script will save the weather information for the past 24 hours as a .csv file based on today’s date. I read in the web page, point to the html table based on the XPath which we identified above, keep the first 24 rows for the first 24 hours (aka, today) of the weather data, and save the file as today’s date.
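The steps described above could be sketched roughly as follows. This is not the original script: it assumes the rvest package in place of htmltab, and `history_url`, `table_xpath`, and the function names are placeholders I made up. The real table may also carry extra header rows, so the 24-row cutoff might need adjusting.

```r
library(rvest)

# Placeholders (assumptions): swap in the 2-day history link for your
# station and the XPath copied in step 2.
history_url <- "https://example.com/your-station-history.html"
table_xpath <- "/html/body/table[4]"

# Pull the table at `xpath` out of a parsed page and keep the first n rows.
extract_history <- function(page, xpath, n = 24) {
  tbl <- html_table(html_element(page, xpath = xpath))
  head(tbl, n)
}

# Write the table to "<out_dir>/<YYYY-MM-DD>.csv" and return the file path.
save_history <- function(tbl, out_dir = ".", date = Sys.Date()) {
  path <- file.path(out_dir, paste0(format(date, "%Y-%m-%d"), ".csv"))
  write.csv(tbl, path, row.names = FALSE)
  path
}

# Read the page, extract the table, and save it under today's date.
download_history <- function(url, xpath, out_dir = ".") {
  page <- read_html(url)
  save_history(extract_history(page, xpath), out_dir)
}
```

To run it, add `download_history(history_url, table_xpath)` as the last line of the script. Keeping the extraction and saving steps in separate functions makes it easy to try them on a saved copy of the page before pointing the script at the live site.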
4. Schedule it to run
Again, you could use the Windows Task Scheduler for this, setting the script to run every day at midnight. Check out the previous post for more on the Task Scheduler.
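If you would rather set up the task from R than click through the Task Scheduler, one option is the taskscheduleR package. The sketch below assumes that package is installed and that the script lives at the hypothetical path shown; the task name is also made up.

```r
# Hypothetical location of the download script (an assumption); the file
# must already exist when the task is created.
script_path <- "C:/weather/download_history.R"

# taskscheduleR wraps the Windows Task Scheduler, so only attempt this on
# Windows and only when the package is available.
if (.Platform$OS.type == "windows" &&
    requireNamespace("taskscheduleR", quietly = TRUE)) {
  taskscheduleR::taskscheduler_create(
    taskname  = "weather_history_daily",  # made-up task name
    rscript   = script_path,
    schedule  = "DAILY",
    starttime = "00:00"                   # midnight, as suggested above
  )
}
```

The task then shows up in the Task Scheduler like any other, so you can still inspect or delete it there.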
So there you have it, a daily download of hourly weather readings from any recording site of the National Weather Service delivered directly to you via the power of R.
Olena
Package ‘htmltab’ was removed from the CRAN repository.