EnergyNet Override Valuation (Oil & Gas Coding with R)

In this insight, we will be digging into an upcoming auction on EnergyNet using the R programming language. What is EnergyNet?

EnergyNet offers an easy-to-use oil, gas, and other commodities auction and sealed bid transaction service that facilitates the sale of producing working interests (operated and non-operated), overrides, royalties, mineral interests, leasehold, and other contracts. EnergyNet is a continuous online marketplace with due diligence and bidding available 24/7/365, where auctions and sealed bid packages close weekly. Most of the properties EnergyNet sells typically range in value from $1,000 to $250MM. EnergyNet's tech-enabled platform allows clients the flexibility and convenience of conducting their acquisition and divestment activities online.

I get the impression from reviewing some of the packages (and from my experience with auctions) that this is the kind of market where you could do quite well if you were very experienced at O&G valuations. If I had to guess, there are probably two types of investors on the site.

  • Indiscriminate bidders
  • Experienced Valuation folks

The indiscriminates may win quite a few by sheer volume, but will likely overbid the poor assets and underbid the better ones. So to make hay, proper due diligence is needed. If someone had the capacity and wherewithal to work through a lot of the smaller packages on the site, which are likely thinly evaluated, they could probably find some dislocated value.

I will largely show how to use various tools at my disposal to extract data from Excel and PDF files to perform these valuations. In the end, this tutorial is largely that: how to gather and manipulate data to generate useful insights for our valuations.

At the end, I will combine several of our previous tutorials to demo an example valuation module that can be built once you advance to more sophisticated R programming.

If you want to just skip to the demo, head on down to the bottom of the insight.

Special thanks to Chris Atherton for letting us work through this on the site.

If you are viewing this insight as an R coding tutorial, I would advise working slowly. Run each distinct section of code, open the data tables, and inspect them. Data manipulation is a key piece of the toolbox, but you need to work through the logic to ensure you understand what is happening. You will often find situations where the data is non-standard and have to work through it before moving on to automating it.

So buckle up, this is a long one.

Property Info

I will be focusing on a specific package within the Eagle Ford. 59 wells spread across ~25 leases operated by Penn Virginia in Lavaca and Gonzales counties in Texas.

Figure 1: Lot 71524 Package Teaser

Are these wells economic to drill and produce? No idea, and I don't really care, as I'm just receiving revenue less production taxes because it is an Overriding Royalty Interest (ORRI). For reference, this is a revenue share and is not burdened with opex/capex.
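As a toy illustration of the ORRI cash flow mechanics (all numbers here are made up, not from this package), the monthly check is just gross revenue times the override decimal, less severance taxes:

```r
#Hypothetical numbers for illustration only
gross_rev    <- 150000   #gross monthly oil revenue on the lease ($)
orri         <- 0.0025   #overriding royalty interest decimal
sev_tax_rate <- 0.046    #Texas oil severance tax rate

#No opex/capex burden on an ORRI, so this is the whole calculation
orri_cf <- gross_rev * orri * (1 - sev_tax_rate)
```

That simplicity is exactly why an ORRI valuation comes down to production, pricing, and taxes.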

I do face some risk that overall cash flow from the lease goes negative and the operator shuts in the wells, but I will just assume that doesn't happen here. As shut-in is likely several years away, the PV impact is not overly significant, unless, I guess, WTI goes negative again.


EnergyNet typically provides a zip file containing supporting documents. I will be downloading this data into a folder called data1 in my working directory for R. What I will be using for this workflow are the check stubs and the general well information Excel files. The files can be downloaded from GitHub into R.

Packages needed here are sf, downloader, readr, and dplyr.

If they are not installed, please use install.packages(c('sf', 'downloader', 'readr', 'dplyr')).

downloader::download('', destfile='', mode = 'wb')

I will also pull in lease-level production data from the Texas Railroad Commission. To save you the time of figuring out how to download it, I will just make the production data available as well.

prod <- readr::read_csv('')

To aid in visualization you can download the lateral shapefiles from the RRC, though it is not necessary for this valuation. I do show a location map in the demo.


#Gonzales County
downloader::download('', destfile = '', mode = 'wb') 
#Lavaca County
downloader::download('', destfile = '', mode = 'wb') 



sf1 <- sf::read_sf('well177l.shp')
sf2 <- sf::read_sf('well285l.shp')
sf1 <- rbind(sf1, sf2)

sf1$API <- paste0('42', sf1$API)

library(dplyr)  #attach dplyr for the pipe and verbs used below

sf1 <- sf1 %>%
  group_by(API) %>% 
  summarize(geometry = sf::st_union(geometry)) %>%
  ungroup() %>% filter(!duplicated(API))


Data Munging

What is munging? Basically, it is just manipulating data, and it is sometimes referred to as wrangling. When I work through things like non-repeatable Excel/PDF files, I have to do a lot of it.

Additional packages needed are:

  1. readxl – Read Excel files in R (tidyverse)
  2. stringr – String manipulation (tidyverse)

Well/Lease Data

In the end, I want to extract the well data and, at a minimum, find the lease number, API number, and ORRI value. There is other data I will keep just for reference, but let's step through the workflow.

Lease Data

Let’s do the lease data first.

#Download First Excel File, which contains individual well data

df1 <- readxl::read_xlsx('./data1/Well_Data_-_Excel.xlsx') %>% 
  select(lease = `Property/Well Name`, County, State,
                      reservoir = `Producing Formation`, 
                      compDate = `Completion Date`, 
                      API = `API Number`,
                      id = `State Well ID Number`,
                      orri = `ORRI - Decimal`)  %>%
  mutate(API = substr(API, 1, 12), 
         API = gsub('-', '', API, fixed = TRUE))

#Create a dummy column holding the last 3 characters of the orri column
#(this catches the scientific-notation suffix, e.g. 'E-3')
df1$interest <- substr(df1$orri, nchar(df1$orri)-2, nchar(df1$orri))

#Remove the dummy column text from the initial orri column
sciNote <- df1$interest %in% c('E-3', 'E-4')
df1$orri[sciNote] <- substr(df1$orri[sciNote], 1, nchar(df1$orri[sciNote]) - 3)

#Remove special characters from the orri column
df1$orri <- gsub('\n', ' ', df1$orri, fixed=TRUE)
df1$orri <- gsub('\r', '', df1$orri, fixed=TRUE)

#Create a second column to represent the split orri value from the one line with two values
df1$orri2 <- as.numeric(stringr::word(df1$orri, 2))
df1$orri <- as.numeric(stringr::word(df1$orri, 1))

#Convert the scientific notation columns
df1$orri[df1$interest == 'E-3'] <- df1$orri[df1$interest == 'E-3']*10^-3
df1$orri[df1$interest == 'E-4'] <- df1$orri[df1$interest == 'E-4']*10^-4
#Remove dummy column
df1 <- df1 %>% subset(select= -c(interest))
#Make lease name all caps
df1$lease <- toupper(df1$lease)

What steps am I taking?

  1. Read the data.
  2. Select and rename specific columns to make the file smaller and easier to work with.
  3. Convert the API number to numeric characters only.
  4. Manipulate the ORRI data to get the actual values correct. You will notice that this column got mangled in the download: one lease has two different values separated by a space, and other cells were read in as scientific notation but converted to text. Had to be creative here.
  5. Convert lease names to all caps so they will be easier to merge with other dataframes later (just in case).
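The step-4 repair is easier to see on toy strings than in the real file; a quick sketch (values made up) of how the scientific-notation text and the double-value cell both resolve:

```r
#Toy example of the step-4 problem: ORRI decimals read in as text,
#some in scientific notation, one cell holding two space-separated values
x <- c('2.778E-3', '5.6E-4', '0.00125 0.00088')

first  <- as.numeric(stringr::word(x, 1))  #as.numeric() parses an intact E-3/E-4 suffix
second <- as.numeric(stringr::word(x, 2))  #NA unless the cell held two values
```

In the actual download the suffix got separated from the number, which is why the workflow above has to reassemble it before converting.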

Well Data

That was the lease data, but I can also pull in the individual well data too.

#Read Well Excel File
df3 <- readxl::read_xlsx("./data1/PVAC_Well_List_Active_as_of_2-25_Check_Run_-_Excel.xlsx")

#File didn't read in headers, so move that row 2 to the header row
names(df3) <- as.character(df3[2,])

#Filter to all rows below header row
df3 <- df3[3:nrow(df3),]

#Rename columns (the last name is reconstructed; the source listing was truncated)
names(df3) <- c('API', 'operator', 'lease', 'wellNum', 'naCol', 
                'County', 'city', 'depth', 'status', 'spudDate',
                'interest')

#Remove null API rows
df3 <- df3 %>% filter(!

#Remove useless data columns
df3 <- df3 %>% subset(select = -c(naCol, city, interest))

#Convert API to all numeric characters
df3$API <- gsub('-', '', df3$API, fixed = TRUE)

#Capitalize leases
df3$lease <- toupper(df3$lease)

Steps taken are:

  1. Read in the file
  2. Move the second row to the header (visual inspection of data)
  3. Select only rows beneath this header row.
  4. Rename columns
  5. Remove null API rows and then remove needless columns
  6. Convert API to numeric values only
  7. Capitalize lease names

Combine Well and Lease Data

This part is a little tricky. The lease and well names are not an exact match, so I have to keep manipulating text strings to get the right combinations. It would have been a lot better had they included the associated lease number column, but I work with what I've got. Another day I might work through text similarity analysis to figure it out, but I just went manual for this.
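If you did want to try the text similarity route instead of the manual passes below, base R's adist() (generalized Levenshtein edit distance) is one way to sketch it; the names here are hypothetical, not from the package:

```r
#Hypothetical well and lease names to illustrate fuzzy matching
well_names  <- c('AMBER-LAGER A 1H', 'JEFFERY UNIT')
lease_names <- c('AMBER-LAGER A LEASE', 'JEFFERY', 'COYOTE B')

#Edit distance between every well/lease pair
d <- adist(well_names, lease_names, ignore.case = TRUE)

#Pick the closest lease name for each well
best <- lease_names[apply(d, 1, which.min)]
```

You would still want to eyeball the matches (or set a maximum distance) before joining on them.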

#Join on lease name, which gets me half of the data
dfx <- df3 %>% 
  left_join(df1 %>% select(lease, reservoir, id, compDate, orri, orri2)) %>%
  filter(!

#Combine Lease and Well Number and retry, gets another 15 values
dfy <- df3 %>% mutate(lease = paste(lease, wellNum)) %>%
  left_join(df1 %>% select(lease, reservoir, id, compDate, orri, orri2)) %>%
  filter(! %>%
  filter(!API %in% dfx$API)

#Append the two, still only 47 of 64
dfx <- rbind(dfx, dfy)

#Combine lease name and the word LEASE and retry, gets me another 14 wells
dfy <- df3 %>% mutate(lease = paste(lease, 'LEASE')) %>%
  left_join(df1 %>% select(lease, reservoir, id, compDate, orri, orri2)) %>%
  filter(! %>%
  filter(!API %in% dfx$API)

#Combine the two; still 3 shy
dfx <- rbind(dfx, dfy) %>% distinct()

#Just hope that joining on API number gets us the rest of the way. Luckily, it does.
dfy <- df3 %>% 
  left_join(df1 %>% select(API, reservoir, id, compDate, orri, orri2)) %>%
  filter(! %>%
  filter(!API %in% dfx$API)
dfx <- rbind(dfx, dfy)

#Change second ORRI column where it is a NA value to original value
dfx$orri2[is.na(dfx$orri2)] <- dfx$orri[is.na(dfx$orri2)]

#The only affected lease here is the Jeffery Unit;
#here I am converting orri2 back to the original value for the first well
dfx <- dfx %>% group_by(lease) %>% 
  mutate(orri2 = replace(orri2, spudDate == min(spudDate), 
                         orri[spudDate == min(spudDate)])) %>% ungroup()

#This converts the new wells in Jeffrey unit to be the sum of both orri's
dfx$orri[dfx$orri != dfx$orri2] <- dfx$orri[dfx$orri != dfx$orri2] + dfx$orri2[dfx$orri != dfx$orri2]

#Remove 2nd orri column and no longer needed files
dfx <- dfx %>% subset(select = -c(orri2))
rm(df1, df3, dfy)

Steps here are:

  1. Loop through various combinations of lease names and API numbers so that I can append all of my original lease data to the new well dataframe. I do this so I can auto-populate ORRI values for each well without having to do it manually. I don't join on API initially because not all of the well APIs are represented in the original lease dataframe.
  2. Convert my ORRI values for the Jeffery Unit to represent the change in value from well 1 to the next two wells. As you go through the PDF files, you will notice that the newer wells in the unit are represented twice each month, once for each ORRI.

So that does it for the well data. Now on to production.

Production Data

We read our production in earlier, so there’s not much we need to do.

#Select the columns we want
prod <- prod %>% select(leaseNum = LEASE_NO,lease = LEASE_NAME, operator = OPERATOR_NAME,
                        Year = CYCLE_YEAR,
                        Month = CYCLE_MONTH, oil = LEASE_OIL_PROD_VOL,
                        cond =  LEASE_COND_PROD_VOL, gas = LEASE_GAS_PROD_VOL,
                        csngGas = LEASE_CSGD_PROD_VOL)

#Gas column is empty, so use Casinghead Gas column
prod$gas <- prod$csngGas

#Remove useless columns (oil wells so no condensate reported)
prod <- prod %>% subset(select= -c(csngGas, cond))

prod$Date <- as.Date(paste0(prod$Year, '-', prod$Month, '-01'))

#Some empty production after June, so remove those rows
prod <- prod %>% filter(Date < as.Date('2020-07-01'))

#Append ORRI data to production dataframe
orriProd <- dfx %>% group_by(leaseNum = id) %>% summarise(orri = max(orri)) %>% ungroup()
prod <- prod %>% left_join(orriProd)
prod$leaseNum <- as.character(prod$leaseNum)

#Adjust orri for the Jeffery lease to reflect original. 
prod$orri[prod$leaseNum == '11086' & prod$Date <= as.Date('2019-02-01')] <- 0.00002778

The Jeffery lease's original well appears to have been knocked offline when the two new wells came on, and based on the checks it doesn't seem to have been brought back online afterwards, so I feel comfortable using the combined ORRI for go-forward economics. Thanks, parent-child effects, for making my analysis easier.

Figure 2: Net Production

Looks like we are down to a robust 5 net bbl of oil per day. Still, this is spread across several wells, so there is not a big risk of shut-ins.

Location Data

We downloaded the location data earlier, so I will go ahead and append it to my well file so we can visualize the well locations. I will use leaflet for the map itself.

Packages needed are leaflet, glue, and tigris.

dfx <- dfx %>% left_join(sf1)

#Create popup id column
dfx <- dfx %>% mutate(popup = glue::glue("<b>API: </b>{API}<br>
                                          <b>Operator: </b>{operator}<br>
                                          <b>Well: </b>{lease}-{wellNum}<br>
                                          <b>Reservoir: </b>{reservoir}<br>
                                          <b>ORRI: </b>{orri}"))

#Get County shapefiles from tigris
counties <- tigris::counties(state = 'TX')
counties <- st_as_sf(counties) %>% filter(NAME == 'Gonzales'|NAME == 'Lavaca') %>%
    mutate(county = NAME)
counties$pt1 <- st_centroid(counties$geometry)

#Create our map
leaflet(options = leafletOptions(zoomControl = FALSE)) %>% 
        addProviderTiles(providers$OpenStreetMap) %>%
        addPolygons(data = counties$geometry, fillColor = 'transparent', color = 'black', 
                    weight = 0.5) %>%
        addLabelOnlyMarkers(data = counties$pt1, label = as.character(counties$county),
                            labelOptions = labelOptions(noHide = T, direction = 'top', textOnly = T)) %>%
        leaflet::addScaleBar(position = 'topleft',scaleBarOptions(metric = FALSE)) %>%
        addPolylines(data = dfx$geometry, color = 'green', opacity = 0.75, weight = 2,
                     popup = dfx$popup)   %>%
      leaflet::setView(lng = -97.20932, lat = 29.45483, zoom = 10)  

Figure 3: Well Locations

If you click on a well location, the map will show various information for each well.

Revenue Data

Extracting the revenue data from PDF files is not exactly easy, so we have to get creative. To help us, we are going to use four additional packages: tabulizer, data.table, zoo, and tidyr.

First, I am going to locate the check stubs within the data1 folder, and then loop through and read them into a dataframe. This step can take a few minutes.

#Locate Files
checks <- list.files('./data1', pattern ='Ckstb', full.names = TRUE)

checks <- data.frame(files = checks, count = 1)

#Read them into a list of dataframes
econSummary <- lapply(split(checks, checks[,'files']), function (well) tryCatch({
  tableX <- extract_tables(well$files,
                           output = "data.frame",
                           guess = TRUE)
  data.table::rbindlist(tableX, fill = TRUE)
}, error = function(e) {
  NULL  #skip any file that fails to parse
}))

table1 <- data.table::rbindlist(econSummary)
rm(econSummary, checks)

Now we will need to investigate and clean up the mess we created. This is a looooonggg one. I am not going to do a summary at the end, but will comment the code as it moves along.

This is the most complicated part of this entire tutorial. I would encourage you to investigate the dataframe in R visually first, and then work through each line of code below to see what occurs.

#Rename some columns to make it easier to analyze
names(table1)[1:9] <- c('Date', 'codeBTU', 'lease', 'intType', 'priceQty','valueX', 'value', 'deductions', 'cf')

#Remove rows where Date = 'Prod'
table1 <- table1 %>% filter(Date != 'Prod')

#Fill lease name column with value from priceQty column if Date column contains phrase 'Property'
table1$lease[grepl('Property', table1$Date)] <- table1$priceQty[grepl('Property', table1$Date)]

#Filter rows again
table1 <- table1 %>% filter(Date != 'Date')
table1 <- table1 %>% filter(value != 'Value')

#Fill missing lease values with the last occurrence
table1$lease <- na.locf(table1$lease)

#Filter rows again
table1 <- table1 %>% filter(!grepl('Property', Date))

#Get gas btu values from codeBTU column by finding the last word
table1$btu <- as.numeric(stringr::word(table1$codeBTU, -1))

#Remove btu value from codeBTU column
table1$codeBTU <- stringr::word(table1$codeBTU, 1)

#Fill empty codeBTU values with NA
table1$codeBTU[table1$codeBTU == ''] <- NA

#Fill missing btu value with last known
table1$codeBTU <- na.locf(table1$codeBTU)

#Price received from first word in priceQty column
table1$Price <- stringr::word(table1$priceQty, 1)

#If Price is 'Deduct', change to NA
table1$Price[table1$Price == 'Deduct'] <- NA

#Remove dollar sign and convert to number
table1$Price <- as.numeric(gsub("$","", table1$Price, fixed = TRUE))

#Find volume by taking last word from priceQty column
table1$Volume <- stringr::word(table1$priceQty, -1)

#Remove volume from Tax lines
table1$Volume[table1$Volume == 'FE'] <- NA
table1$Volume[table1$Volume == 'PR'] <- NA

#Remove commas from Volume lines
table1$Volume <- (gsub(",","", table1$Volume, fixed = TRUE))
table1$Volume <- as.numeric(table1$Volume)

#Replace missing Volume with the valueX column, which had some weird delimiter behavior
table1$Volume[is.na(table1$Volume)] <- table1$valueX[is.na(table1$Volume)]

#Repeat previous volume workflow
table1$Volume <- (gsub(",","", table1$Volume, fixed = TRUE))
table1$Volume <- (gsub("$","", table1$Volume, fixed = TRUE))
table1$Volume <- as.numeric(table1$Volume)

#Convert Date column to Date format
table1$Date <- paste0(table1$Date, '/01')
table1$Date <- as.Date(table1$Date, format = '%m/%Y/%d')

#Remove rows we do not need
table1 <- table1 %>% filter(! | Check.Number == 'FE Tax' | Check.Number == 'PR Tax')

#Fill Date column down
table1$Date <- na.locf(table1$Date)

#Remove special characters from deductions column
table1$deductions <- gsub('$', '', table1$deductions, fixed = TRUE)
table1$deductions <- gsub('(', '', table1$deductions, fixed = TRUE)
table1$deductions <- gsub(')', '', table1$deductions, fixed = TRUE)
table1$deductions <- gsub(',', '', table1$deductions, fixed = TRUE)

#If Check.Number isn't tax, just call it revenue to aid in pivoting later
table1$Check.Number[table1$Check.Number != 'PR Tax' & table1$Check.Number != 'FE Tax'] <- 'Revenue'

#Remove special characters from value column
table1$value <- gsub('$', '', table1$value, fixed = TRUE)
table1$value <- gsub('(', '', table1$value, fixed = TRUE)
table1$value <- gsub(')', '', table1$value, fixed = TRUE)
table1$value <- gsub(',', '', table1$value, fixed = TRUE)

table1$deductions <- as.numeric(table1$deductions)
table1$value <- as.numeric(table1$value)

#If the line isn't a deduction line, set deductions to NA
table1$deductions[!grepl('Deduct', table1$priceQty)] <- NA

#Remove duplicate rows
table1 <- table1 %>% filter(!duplicated(paste0(Date,codeBTU, lease, priceQty, value, deductions, cf)))

#Remove special characters from cf column
table1$cf <- gsub('$', '', table1$cf, fixed = TRUE)
table1$cf <- gsub('(', '', table1$cf, fixed = TRUE)
table1$cf <- gsub(')', '', table1$cf, fixed = TRUE)
table1$cf <- gsub(',', '', table1$cf, fixed = TRUE)
table1$cf <- as.numeric(table1$cf)

#Fill values down
table1$cf <- na.locf(table1$cf)
table1$value <- na.locf(table1$value)
table1$Volume <- na.locf(table1$Volume)

#Zero btu implies it's not a gas line so I set it to NA value
table1$btu[table1$btu == 0] <- NA

#Set the btu value for each line to be the average for each lease and date
table1 <- table1 %>% group_by(lease, Date) %>% mutate(btu = mean(btu, na.rm=TRUE))

table1$btu[is.nan(table1$btu)] <- 1
table1$btu <- na.locf(table1$btu)

#Fill price column down
table1$Price <- na.locf(table1$Price)

#In this one I am renaming columns, removing the word Tax from the tax lines, removing duplicates,
#And then pivoting the table on the item/deductions lines

table1 <- table1 %>% ungroup() %>% select(Date, Product = codeBTU, lease, item = Check.Number,
                                          Volume, Price, btu, value, deductions, cf) %>%
  mutate(item = gsub(' Tax', '', item)) %>% distinct() %>%
  filter(!duplicated(paste0(Date, Product, lease, item, Volume, Price, btu, value))) %>%
  spread(item, deductions) %>% subset(select = -c(Revenue))
table1[is.na(table1)] <- 0

#Summarise each column by date/product/lease
table1 <- table1 %>% group_by(Date, Product, Lease = lease) %>% 
  summarise(Volume = sum(Volume), Price = mean(Price), btu = mean(btu),
            value = sum(value), cf = sum(cf), AdVal = sum(FE), ProdTax = sum(PR)) %>%
  ungroup()

econs <- table1 %>%
  filter(Product == 'OIL'|Product == 'GAS'|Product == 'PPRO') %>%
  filter(nchar(Lease) >= 5)

I'm not showing it here, but when you actually look at the AdVal totals, the column is effectively 0. So really they are only paying severance tax, which for Texas is 4.6% of oil revenue and 7.5% of gas/NGL revenue.
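Those severance rates can be sanity-checked against any month's check line items; a hedged sketch with made-up values:

```r
#Made-up check line values for one month, for illustration only
oil_value <- 1200   #oil revenue line ($)
gas_value <- 300    #gas + NGL revenue line ($)

#Texas severance tax: 4.6% on oil, 7.5% on gas/NGLs
sev_tax <- oil_value * 0.046 + gas_value * 0.075
net_cf  <- oil_value + gas_value - sev_tax
```

If the deduction lines on a stub are materially bigger than this, something else (like ad valorem) is being withheld.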

Shrink and NGL Yields

Now that we have sales data for basically the entire 2019 calendar year, we can actually loop through each well/lease and make an estimate of total shrink and NGL yield. I will then append all these data points to my production dataframe.

How do I estimate shrinkage? Well, I can compare the state reported Gas-Oil-Ratio by lease to the sales Gas-Oil-Ratio and the fraction (sales/state) should give me my shrinkage value. I can then convert sales gas to a raw sales gas number and calculate NGL Yield.
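The shrink and yield arithmetic described above can be sketched with hypothetical lease totals (none of these numbers come from the package):

```r
#Hypothetical 2019 lease totals for illustration
state_oil <- 50000    #bbl, RRC-reported
state_gas <- 150000   #mcf, RRC-reported raw wellhead gas
sales_oil <- 50000    #bbl sold, from check stubs
sales_gas <- 95000    #mcf residue gas sold
sales_ngl <- 12000    #bbl NGLs sold

#Shrink: sales GOR relative to state-reported GOR
shrink <- (sales_gas / sales_oil) / (state_gas / state_oil)

#Back out raw gas, then NGL yield in bbl per MMcf of raw gas
raw_gas  <- sales_gas / shrink
nglYield <- sales_ngl / raw_gas * 1000
```

With these toy inputs the shrink works out to about 0.63 and the yield to 80 bbl/MMcf, which is the same shape of answer the loop below produces lease by lease.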

I am also going to need to load lubridate (for the year() helper below).


#Regroup and sum the econs dataframe
econs <- econs %>% group_by(Date, Product, Lease) %>% 
  summarise(Volume = sum(Volume), Price = mean(Price), btu = mean(btu),
            value = sum(value), cf = sum(cf), AdVal = sum(AdVal), ProdTax = sum(ProdTax)) %>%
  ungroup()

#Filter to year 2019 and pivot the table wider so products are in columns
leaseProd <- econs %>% filter(year(Date) == 2019) %>% group_by(Date, Lease, Product) %>%
  summarise(Volume = sum(Volume, na.rm=TRUE)) %>% ungroup() %>%
  distinct() %>% 
  spread(Product, Volume) %>% group_by(Lease) %>%
  summarise(OIL = sum(OIL, na.rm=TRUE), GAS = sum(GAS, na.rm=TRUE), PPRO = sum(PPRO, na.rm=TRUE)) %>%
  ungroup() %>% filter(OIL + GAS + PPRO > 0)

#Convert gallons of ngl to bbls
leaseProd$PPRO <- leaseProd$PPRO/42

#Find production by lease for 2019 from RRC data
prod1 <- prod %>% filter(year(Date) == 2019) %>% group_by(Lease = lease) %>% 
  summarise(oil = sum(oil), gas = sum(gas)) %>% 
  ungroup()

#Remove extra spaces from lease names
leaseProd$Lease <- gsub("\\s+"," ",leaseProd$Lease)
prod1$Lease <- gsub("\\s+"," ",prod1$Lease)

#This is a common set of lease/well names between leaseProd and prod1
#Allows me to eventually build the link between well name and lease
lsnms <- data.frame(lsnm = c('BOCK', 'PORTER', 'PILSNER','AMBER-LAGER A', 'AMBER-LAGER B',
                             'AMBER ', 'ARMADILLO A', 'ARMADILLO B', 'ARMADILLO C','BERGER',
                             'COYOTE A', 'COYOTE B', 'HEATWAVE A', 'HEATWAVE B', 
                             'HEFE', 'JACKALOPE A', 'JACKALOPE B', 'JEFFERY', 'KOSMO',
                             'LEAL', 'MATIAS', 'SHARKTOOTH'), count = 1)

#Loop through my names, summarise volumes, and then calculate shrink and ngl yield as indicated above
econSummary <- lapply(split(lsnms, lsnms[,'lsnm']), function (well) tryCatch({
  calc1 <- leaseProd %>% filter(grepl(well$lsnm, Lease)) %>% 
    group_by(well = Lease) %>%
    summarise(OIL = sum(OIL), GAS = sum(GAS),
              NGL = sum(PPRO)) %>% ungroup() %>%
    merge(prod1 %>% filter(grepl(well$lsnm, Lease)))
  calc1$shrink <- (sum(calc1$GAS)/sum(calc1$OIL))/(sum(calc1$gas)/sum(calc1$oil))
  calc1$nglYield <- sum(calc1$NGL)/(sum(calc1$GAS)/calc1$shrink[1])*1000
  calc1
}, error = function(e) {
  NULL  #skip any name that fails to match
}))

#Bind list into a dataframe
calc1 <- data.table::rbindlist(econSummary)
calc1$nglYield[calc1$nglYield == 0] <- NA

#If NA's are present replace with average of the group
calc1$nglYield[$nglYield)] <- mean(calc1$nglYield, na.rm=TRUE)

#Join ngl yield and shrink data to production data
prod <- prod %>% left_join(calc1 %>% select(lease = Lease, nglYield, shrink) %>% distinct())

#Replace NA's with average
prod$shrink[$shrink)] <- mean(prod$shrink, na.rm=TRUE)
prod$nglYield[$nglYield)] <- mean(prod$nglYield, na.rm=TRUE)

#Same process with overall lease and econs data

leaseProd <- leaseProd %>% left_join(calc1 %>% select(Lease = well, nglYield, shrink) %>% distinct())

econs <- econs %>% left_join(calc1 %>% select(lease = Lease, Lease = well))
econs$lease[$lease)] <- 'MATIAS UNIT'

#Remove some weird rows that snuck their way in
econs <- econs %>% filter(nchar(Lease) >= 7)

Basically, on average the leases lose 37% of the gas to a combination of processing and flaring, while they get approximately 131 bbl/MMcf in NGLs.

Figure 4: NGL Yield by Lease

BTU Uplift

Not only do they get some value for NGLs, the gas also has a bit of a BTU uplift. So I'll estimate that value and append it to my various dataframes.

btu <- econs %>% filter(Product == 'GAS') %>%
  filter(year(Date) == 2019) %>% mutate(btu = replace(btu, btu == 0, NA)) %>%
  group_by(lease, Product) %>% summarise(btu = mean(btu, na.rm=TRUE), Price = mean(as.numeric(Price),na.rm=TRUE)) %>%
  ungroup() %>% group_by(lease) %>% mutate(btu = mean(btu, na.rm=TRUE)) %>% ungroup() %>%
  spread(Product, Price) %>% filter(! %>% select(lease, btu)

prod <- prod %>% left_join(btu)
prod$btu[$btu)] <- mean(prod$btu, na.rm=TRUE)

On average they get a 5% uplift on the gas for energy value.
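Since gas sells on energy content, the uplift applies as a multiplier on the benchmark; a hedged sketch with a hypothetical Henry Hub price:

```r
#Hedged sketch: $/mcf realized = $/MMBtu benchmark * BTU factor
hh  <- 2.50   #Henry Hub price, $/MMBtu (hypothetical)
btu <- 1.05   #average BTU factor from the check stubs

realized_gas <- hh * btu   #$/mcf before basis differentials
```

This is why I carry the btu column through to the production dataframe rather than folding it into the differential.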


I was also able to extract pricing by well and date for all of the wells. To keep it simple, I'm going to just use the average by date. This only runs through 2019, so if you wanted to go into 2020 it would probably be worth cross-checking PVA's quarterly financials.

If you followed my tutorial on scraping strip prices, I will be grabbing 2019 WTI and Henry Hub values as my baseline for differentials. If you want to repeat this, please follow that one. To get that data, I will need two more packages.

#Get price by product monthly in 2019
diffs <- econs %>% filter(Product == 'OIL'| Product == 'GAS'| Product == 'PPRO') %>%
  filter(year(Date) == 2019) %>% 
  group_by(Date, Product) %>% summarise(Price = mean(as.numeric(Price),na.rm=TRUE)) %>%
  ungroup() %>% 
  spread(Product, Price) 
diffs$PPRO <- diffs$PPRO*42


#A price function I built based on my strip-scraping tutorial
prices <- readRDS('./data/priceFunc.rds')()
diffs <- diffs %>% left_join(prices %>% select(Date = DATE, WTI, HH))

diffs$oilDiff <- diffs$WTI-diffs$OIL
diffs$gasDiff <- diffs$HH-diffs$GAS
diffs$nglDiff <- diffs$PPRO/diffs$WTI
Table 1: 2019 Differential Summary (columns: Date, Gas Price, Oil Price, NGL Price)

That should be the last step before the valuation. I would usually just try to tie to the last month's differential values if I have them.
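Applying the differentials to a forecast is then just arithmetic on the diffs dataframe built above; a sketch with hypothetical benchmark assumptions (dplyr assumed attached):

```r
#Average the 2019 monthly differentials computed above
avgDiff <- diffs %>% summarise(oilDiff = mean(oilDiff, na.rm = TRUE),
                               gasDiff = mean(gasDiff, na.rm = TRUE),
                               nglDiff = mean(nglDiff, na.rm = TRUE))

#Hypothetical forecast benchmarks
wti <- 45; hh <- 2.50

oilPrice <- wti - avgDiff$oilDiff   #realized oil, $/bbl
gasPrice <- hh  - avgDiff$gasDiff   #realized gas, $/MMBtu
nglPrice <- wti * avgDiff$nglDiff   #NGLs priced as a fraction of WTI, $/bbl
```

Note that the NGL differential is multiplicative (a percent of WTI) while oil and gas are additive, matching how the three columns were calculated.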


So what do I have now?

  • List of Leases/Wells with ORRI information for each
  • Monthly production data by lease
  • Differentials
  • NGL Yields
  • Shrink
  • BTU
  • Tax Information

The only thing I'd really need to do now is estimate decline curves for each lease, which you should be able to do if you have followed along with this tutorial.
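For completeness, a minimal Arps decline sketch; the parameters here are hypothetical placeholders, not fit to this package:

```r
#Minimal Arps decline function: exponential when b = 0, hyperbolic otherwise
arps <- function(t, qi, di, b) {
  if (b == 0) qi * exp(-di * t) else qi / (1 + b * di * t)^(1 / b)
}

t <- 0:59                                   #months ahead
q <- arps(t, qi = 5, di = 0.015, b = 1.1)   #net bbl/d, starting near the current rate
monthly_bbl <- q * 30.4                     #approximate net barrels per month
```

Multiply those monthly volumes by realized prices and the ORRI decimal, net off severance tax, and discount, and you have the bones of the valuation module demoed below.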

If you’re interested in participating in EnergyNet auctions, just head on over to the site (though fair warning you need to be an accredited investor).

Follow my blog for more data-driven insights like this.

I realize that this tutorial was quite intense, but once you get good at manipulating data it becomes second nature. And if you get really good, you can basically build your own ARIES/PHDWin to create a valuation calculation engine like the one below.

Disclaimer: I am not a financial advisor and the below is not meant to serve as a valuation for the asset package and is not meant to be construed as financial advice. This is an example, though you could potentially arrive at a pre-tax/G&A valuation for the asset by working through the decline curves and using your own pricing assumptions. However, you do so at your own risk.
