How to download data from Weather Underground – TopBullets.com

Weather is an important factor for fast-moving consumer goods (FMCG) products, and it makes for a very interesting case study. Milk, which is the major component of ice cream, is heavily produced during winter, as cows produce more in this season, but is consumed more in summer. So dairy product manufacturers have to adjust their storage of raw material and their production of final products in a way that minimizes operational cost. Forecasting of FMCG products is also very important for manufacturers, and temperature again plays an important role in understanding the demand. To understand the relation between temperature and demand we need historical data. So today I am going to share some R code and a manual trick to download the temperature data for different cities.

First, I will describe how to download the data manually, as the R code is not very efficient at extracting data for all variables. We will choose “Weather Underground” (WU) as our source. I have been using this source for a couple of years.
Step 1: We will first learn how to find the station code. Each city has one station installed by WU, and it has a unique identifier.

Visit the Weather Underground website and search for the city. In my case it is Cape Town. Now go to the “History” tab, where you can see the code (circled below); it is “FACT”.

Pic1

Step 2: Edit the link below accordingly. You only need to change a few parameters, which are explained after the link. Please note that when you try to download data for a couple of years, it won’t all come in one go. You have to adjust the dates manually and download 3-4 times, depending on the length of the timeframe. I think a single request returns about 370 rows, roughly 1 year.

https://www.wunderground.com/history/airport/FACT/2013/3/1/CustomHistory.html?dayend=31&monthend=12&yearend=2016&req_city=NA&req_state=NA&req_statename=NA&format=1

Here FACT is the station id, 2013 is the start year, 3 is the start month and 1 is the start day; dayend, monthend and yearend give your end date. So you only need to adjust the start date for each download; the end date stays the same for all iterations.
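The link editing described above can also be done in a few lines of R. This is only a minimal sketch: build_wu_url is a hypothetical helper written for this post (it is not part of any package), the one-year step between start dates simply follows my guess that one request returns about a year of rows, and the URL pattern is the one shown above, so it will only work as long as the site still serves it.

```r
# Hypothetical helper: fill the station id and dates into the
# CustomHistory link pattern shown above.
build_wu_url <- function(station_id, start_date, end_date) {
  start_date <- as.Date(start_date)
  end_date   <- as.Date(end_date)
  sprintf(
    paste0("https://www.wunderground.com/history/airport/%s/%d/%d/%d/",
           "CustomHistory.html?dayend=%d&monthend=%d&yearend=%d",
           "&req_city=NA&req_state=NA&req_statename=NA&format=1"),
    station_id,
    as.integer(format(start_date, "%Y")),
    as.integer(format(start_date, "%m")),
    as.integer(format(start_date, "%d")),
    as.integer(format(end_date, "%d")),
    as.integer(format(end_date, "%m")),
    as.integer(format(end_date, "%Y"))
  )
}

# One start date per year (assumption: about one year of rows per request);
# the end date is the same for every link.
starts <- seq(as.Date("2013-03-01"), as.Date("2016-12-31"), by = "year")
urls   <- build_wu_url("FACT", starts, "2016-12-31")
print(urls)
```

Each element of urls is one link to open in the browser, in order, before moving on to Step 3.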

Step 3:
Once you have edited the link, paste it into your browser and open it. You will see the data as in the image below. Copy the data and paste it into Excel. Don’t worry about the format. Keep appending each new batch below the previous one (don’t paste the header every time).

Pic3

Step 4: Now that you have the data, we can format our Excel file. Press Ctrl+A to select all the rows, go to Data>>Text to Columns (Alt+A+E), select the comma separator and press Enter. We are done! You have the data prepared for the given station id.
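If you would rather skip Excel, the same comma-separated text can be split into columns in R. A minimal sketch, assuming the copied text has a single header row; the column names and the two rows of values below are made up for illustration and are not the real Weather Underground headers or readings.

```r
# Illustrative stand-in for the text copied from the browser.
# Column names and values are invented for this example.
pasted <- "Date,MaxTempC,MeanTempC,MinTempC
2013-3-1,27,22,17
2013-3-2,25,21,16"

# read.csv() splits on commas, which is what Text to Columns does in Excel
weather <- read.csv(text = pasted, stringsAsFactors = FALSE)

# Convert the date column from text to a proper Date
weather$Date <- as.Date(weather$Date, format = "%Y-%m-%d")

str(weather)
```

To use it on real data, save the pasted text to a file and pass the file path to read.csv() instead of the text argument.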

Pic7

Here is another option, but it will only download 3 variables (max, mean and min temperature) for the given station id. Again, you can find the station id using Step 1. I think the code is well commented and self-explanatory, but if you run into any trouble please comment below.

#################################################################
############ DOWNLOAD WEATHER DATA FROM WEATHER UNDERGROUND #####
# AUTHOR: DEEPESH SINGH
# DATE: 12 APR 2017
# PURPOSE: TO DOWNLOAD WEATHER INFORMATION AUTOMATICALLY
# METHOD DETAILS: R CODE WITH BACKEND API AND MANUAL
################################################################

# First we install the package written by Ram Narasimhan <ramnarasimhan@gmail.com>

install.packages("weatherData")
library(weatherData)

# User input: the station ID, which you can find using Step 1 above
StationID <- "FACT"

# Now we make a variable 'years' and store all the years for which we need weather info

years <- 2010:2017

# Let's check if data is available for the given time period or not

data_flag <- checkDataAvailabilityForDateRange(StationID, "2010-01-01", "2017-03-31")
print(data_flag)

# Here data_flag = 0 means data is not available for the given interval and 1 means data is available

# Now that we know we have data available for Cape Town from 2010-2017 we would like to download

# Before applying this function (getWeatherForYear) read more about the package

?getWeatherForYear

# If we set opt_detailed = TRUE it will download all the columns and take a lot of time, hence the
# manual method is faster. opt_detailed = FALSE will give only the few columns which we need


# Let's create an empty data frame
getData <- data.frame()

for (year in years){
  tempData <- getWeatherForYear(StationID, year, opt_detailed = FALSE) # Keep opt_detailed = FALSE; TRUE is very slow
  getData <- rbind(getData, tempData)
}

View(getData)


write.csv(getData, paste0("WeatherData_", StationID, ".csv"), row.names = FALSE)

# Reference
# https://cran.r-project.org/web/packages/weatherData/weatherData.pdf
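The download loop above stops at the first year that fails. A more defensive version could be sketched as below. To keep the example runnable without a network connection, fetch_year is a hypothetical stand-in that returns a placeholder row and deliberately fails for one year; in real use you would replace its body with the getWeatherForYear(StationID, year, opt_detailed = FALSE) call from the script. The column names in the stand-in are also made up.

```r
# Hypothetical stand-in for the real download call; it fails for 2012
# on purpose to show what happens when one year cannot be fetched.
fetch_year <- function(year) {
  if (year == 2012) stop("simulated download failure")
  data.frame(Year = year, Max_TemperatureC = NA_real_)
}

years <- 2010:2013

# Fetch each year separately; a failed year is reported and skipped
pieces <- lapply(years, function(y) {
  tryCatch(fetch_year(y), error = function(e) {
    message("Skipping ", y, ": ", conditionMessage(e))
    NULL
  })
})

# Drop the failed years, then bind everything once at the end
pieces    <- Filter(Negate(is.null), pieces)
all_years <- do.call(rbind, pieces)
print(all_years)
```

Binding once at the end also avoids growing the data frame inside the loop, which gets slow when there are many pieces.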

Thank you for reading my article. Please like and share if this article helped you. Also, don’t forget to share your thoughts in the comments below.

Deepesh Singh