Often I find myself going to various websites whenever I’m doing a valuation to track down either historical or future WTI/Henry Hub prices. At least before I started coding. So let me show you how to automate it with R!
Follow My Blog
Packages
I’m going to use a few packages in this analysis.
And that’s pretty much it.
Historical Prices
Usually I just want to get monthly prices, but the EIA provides daily settlement so I’ll use that.
WTI
First, I’ll go to the EIA website for WTI (Oil). Basically, I need to locate the webpage and use xml2/rvest to read the website and grab the table that I want.
#Tidyverse packages library(rvest) library(xml2) library(httr) library(dplyr) #EIA website for Daily Oil Production crude <- 'https://www.eia.gov/dnav/pet/hist/RWTCD.htm' #Read the website structure/content with xml2 webpage <- xml2::read_html(crude) #Look for table nodes, locate the table we want (6) and put it in a table wti1 <- webpage %>% html_nodes('table') %>% .[6] %>% html_table(fill = TRUE) wti1 <- wti1[[1]] %>% filter(!is.na(Mon))
Our resulting dataframe looks like this.
Week of | Mon | Tue | Wed | Thu | Fri |
---|---|---|---|---|---|
1985 Dec-30 to Jan- 3 | NA | NA | NA | 25.56 | 26 |
1986 Jan- 6 to Jan-10 | 26.53 | 25.85 | 25.87 | 26.03 | 25.65 |
I don’t like the layout (really not usable), so I change the table structure and then aggregate at the monthly level.
library(stringr) library(lubridate) library(tidyr) #Create Date Column from first day of week wti1$Date <- paste0(substr(wti1$`Week Of`, 1, 4), '-', substr(wti1$`Week Of`, 6, 8), '-', substr(wti1$`Week Of`, 10, 11)) #Replace spaces with zeros (Day of Week) wti1$Date <- gsub(' ', '0', wti1$Date) #Convert to date format wti1$Date <- as.Date(wti1$Date, format = '%Y-%b-%d') #Drop fist column, pivot from wide format to long format, calculate day, and then average by month wti1 <- wti1 %>% subset(select = -c(`Week Of`)) %>% gather(Day, WTI, -c(Date)) %>% arrange(Date) %>% group_by(Date) %>% mutate(daysOn = seq(0, 4, 1)) %>% ungroup() %>% #Create a day of week from 0 to 4 mutate(Date = Date %m+% days(daysOn)) %>% #Add to first date by day in week mutate(DATE = as.Date(paste0(year(Date), '-', month(Date), '-01'))) %>% #Create new date for month/year filter(!is.na(WTI)) %>% group_by(DATE) %>% #Remove NA's summarise(WTI = mean(WTI, na.rm=TRUE)) %>% ungroup()
Figure 1: Historical WTI Pricing
Henry Hub
Next, I’ll get Henry Hub (Gas). The process is essentially the same for gas.
hh <- 'https://www.eia.gov/dnav/ng/hist/rngwhhdD.htm' webpage <- xml2::read_html(hh) hh1 <- webpage %>% rvest::html_nodes('table') %>% .[6] %>% rvest::html_table(fill = TRUE) hh1 <- hh1[[1]] %>% filter(!is.na(Mon)) hh1$Date <- paste0(substr(hh1$`Week Of`, 1, 4), '-', substr(hh1$`Week Of`, 6, 8), '-', substr(hh1$`Week Of`, 10, 11)) hh1$Date <- gsub(' ', '0', hh1$Date) hh1$Date <- as.Date(hh1$Date, format = '%Y-%b-%d') hh1 <- hh1 %>% subset(select = -c(`Week Of`)) %>% gather(Day, HH, -c(Date)) %>% arrange(Date) %>% group_by(Date) %>% mutate(daysOn = seq(0, 4, 1)) %>% ungroup() %>% mutate(Date = Date %m+% days(daysOn)) %>% mutate(DATE = as.Date(paste0(year(Date), '-', month(Date), '-01'))) %>% filter(!is.na(HH)) %>% group_by(DATE) %>% summarise(HH = mean(HH, na.rm=TRUE)) %>% ungroup()
Figure 2: Historical Henry Hub Pricing
God I remember when gas prices jumped above $12. We were running through the halls giving high fives to each other.
Strip Forecast
For this one, I will be borrowing from the Wall Street Journal.
Oil
This one is a little easier. It downloads in the format we need, but I will just use month and year to create Date, and then use the settlement price as WTI.
crude = 'https://quotes.wsj.com/futures/CRUDE%20OIL%20-%20ELECTRONIC/contracts' webpage <- read_html(crude) tbls <- rvest::html_nodes(webpage, 'table') tbls_ls <- webpage %>% rvest::html_nodes('table') %>% .[1] %>% rvest::html_table(fill = TRUE) wti <- tbls_ls[[1]] %>% select(MONTH, SETTLEMENT) wti <- wti %>% filter(MONTH != 'Front Month') wti$MONTH <- gsub('Crude Oil', '', wti$MONTH) wti$MONTH <- trimws(wti$MONTH, which = 'both') wti$Year <- substr(wti$MONTH, 5, 8) wti$Month <- substr(wti$MONTH, 1, 3) wti$Date <- as.Date(paste0(wti$Month, '/01/', wti$Year), format = '%b/%d/%Y') wti <- wti %>% select(DATE = Date, WTI = SETTLEMENT)
Gas
And same for Gas Price Forecast.
gas = 'https://quotes.wsj.com/futures/NATURAL%20GAS/contracts' webpage <- read_html(gas) tbls <- rvest::html_nodes(webpage, 'table') tbls_ls <- webpage %>% rvest::html_nodes('table') %>% .[1] %>% rvest::html_table(fill = TRUE) hh <- tbls_ls[[1]]%>% select(MONTH, SETTLEMENT) %>% filter(MONTH != 'Front Month') hh$MONTH <- gsub('Natural Gas', '', hh$MONTH) hh$MONTH <- trimws(hh$MONTH, which = 'both') hh$Year <- substr(hh$MONTH, 5, 8) hh$Month <- substr(hh$MONTH, 1, 3) hh$Date <- as.Date(paste0(hh$Month, '/01/', hh$Year), format = '%b/%d/%Y') hh <- hh %>% select(DATE = Date, HH = SETTLEMENT)
Combine it all together
Now we can just link up all the data and create a combined historical + forecast view.
wti <- rbind(wti1, wti) hh <- rbind(hh1, hh) library(zoo) prices <- merge(seq(min(year(hh$DATE)), max(year(hh$DATE)), 1), seq(1, 12, 1)) %>% mutate(DATE = as.Date(paste0(x, '-', y, '-01'))) %>% arrange(DATE) %>% select(-c(x, y)) %>% left_join(wti) %>% left_join(hh) %>% mutate(WTI = na.locf(WTI), HH = na.locf(HH))
I use na.locf (from zoo) to just fill in missing values with the last reported value. I also create a total sequence of dates because there are random missing dates within the dataset. When I’m all done, I have a price file. It’s simple enough just to turn this into a function that can be run whenever you like.
Figure 3: Combined Oil & Gas Price Forecast
Summary
Thanks for tuning in for this version. Eventually I will build all of this into a cohesive valuation methodology.
Pingback: EnergyNet Override Valuation (Oil & Gas Coding with R) - Shale Insights