import glob
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Alberta In-situ Oilsands Production Analysis using Python - Part III
Data analysis and visualization
Disclaimer
This blog post is for educational purposes only. Any commercial use of the information provided in this blog post is prohibited. For more information about the AER copyright and permission to reproduce, please visit AER Copyright and Disclaimer. The author is not responsible for any damage or loss caused by the use of the information provided in this blog post.
Introduction
Alberta Energy Regulator (AER) lists the monthly injection and production data for all in-situ oilsands projects in Alberta on its website Alberta Energy Regulator. The data is available in several spreadsheets and can be downloaded from ST53: Alberta In Situ Oil Sands Production Summary. The annual data are reported in separate files in .xls
format. Each file has 6 different sheets with information about bitumen, water, steam, wells, SOR, and WSR.
To learn how to download and prepare the tidy dataset in .csv
format, please refered to the previous posts Downloading Alberta Oilsands Production Data using Python and Alberta In-situ Oilsands Production Analysis using Python - Part I. In this post we use the tidy bitumen.csv
file for data analysis and visualization.
Loading the data
All files are stored in the current local directory. We use the glob
module to list all .csv
files in the directory and then use pandas
to read the data into a DataFrame
. The matplotlib
and seaborn
modules are used to create the plots.
The glob
module is used to list all .csv
files in the directory, which we can then read into a DataFrame using pandas.
= glob.glob('*.csv')
csv_files csv_files
['bitumen.csv']
The bitumen.csv
file is loaded into a dataframe using the read_csv
function from the pandas module.
= pd.read_csv("bitumen.csv")
bitumen bitumen.head()
Operator | Scheme Name | Area | Approval Number | Recovery Method | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Year | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | CNRL | NaN | Cold Lake | 4746P | Commercial | 587.48 | 596.72 | 637.59 | 678.74 | 705.61 | 718.99 | 697.25 | 696.88 | 763.21 | 702.21 | 696.32 | 674.99 | 2010 |
1 | CNRL | NaN | Cold Lake | 6726I | Commercial | 171.54 | 193.66 | 187.71 | 177.54 | 176.90 | 174.50 | 160.53 | 155.97 | 168.22 | 168.27 | 139.73 | 143.48 | 2010 |
2 | Baytex | Cliffdale Pilot | Peace River | 11034E | Commercial-CSS | 0.00 | 2.55 | 6.15 | 2.25 | 4.16 | 10.59 | 5.83 | 3.27 | 9.18 | 20.09 | 15.42 | 13.15 | 2010 |
3 | North Peace Energy Corp. | Red Earth | Peace River | 11209A | Commercial-CSS | 10.30 | 4.70 | 3.11 | 2.21 | 1.63 | 0.51 | 0.00 | 0.00 | 0.40 | 0.00 | 0.00 | 0.00 | 2010 |
4 | Penn West | Seal | Peace River | 11377A | Commercial-CSS | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 2010 |
Data analysis and visualization
Alberta average bitumen production rate per year
The daily bitumen production rate is reported as a monthly average in m3/day. An approximation of the yearly average production rate in m3/day is calculated by taking the average of the monthly average production rates. The yearly average production rate is then converted to Mm3/day by dividing it by 1,000.
# Calculate the average daily production rate for each year
= (bitumen
bitumen_agg = lambda x: x.loc[:,"Jan":"Dec"].mean(axis = 1))
.assign(Production_Rate = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
.drop(columns 'Year'])
.groupby([sum()
.1000)
.div(
.reset_index()
) bitumen_agg
Year | Production_Rate | |
---|---|---|
0 | 2010 | 119.596470 |
1 | 2011 | 134.691030 |
2 | 2012 | 157.040421 |
3 | 2013 | 175.839908 |
4 | 2014 | 200.679804 |
5 | 2016 | 220.765067 |
6 | 2017 | 245.572865 |
7 | 2018 | 249.694999 |
8 | 2019 | 245.745934 |
9 | 2020 | 237.340961 |
10 | 2021 | 264.391272 |
11 | 2022 | 269.899715 |
Let’s see the yearly average production rate in Mm3/day.
# Plot the yearly average production rate
= (8, 6))
plt.figure(figsize = 'Year', y = 'Production_Rate', data = bitumen_agg)
sns.barplot(x 'Alberta average bitumen production rate')
plt.title('Year')
plt.xlabel('Production Rate (Mm3/day)')
plt.ylabel( plt.show()
Largest bitumen producers in Alberta since 2010
The top 10 largest bitumen producers in Alberta since 2010 are listed below. The data is sorted by the total bitumen production in descending order. The total bitumen production rate is converted to MMm3/day by dividing by 1,000,000.
# Calculate the total bitumen production rate for each operator
= (bitumen
bitumen_agg = lambda x: x.loc[:,"Jan":"Dec"].sum(axis = 1) * 365)
.assign(Cumulative_Production = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec', 'Year'])
.drop(columns 'Operator'])
.groupby([sum()
.1_000_000)
.div(= 'Cumulative_Production', ascending = False)
.sort_values(by
.reset_index()10)
.head(
) bitumen_agg
Operator | Cumulative_Production | |
---|---|---|
0 | Cenovus | 2561.630674 |
1 | CNRL | 2266.675729 |
2 | Suncor | 1550.949426 |
3 | Imperial | 1249.998787 |
4 | ConocoPhillips | 668.101066 |
5 | Devon | 533.488187 |
6 | MEG | 531.186018 |
7 | Husky | 359.571866 |
8 | Nexen | 157.297422 |
9 | CNOOC | 146.821801 |
The figure below illustrates the top 10 largest bitumen producers in Alberta since 2010.
# Plot the top 10 largest bitumen producers in Alberta since 2010
= (8, 6))
plt.figure(figsize = 'Cumulative_Production', y = 'Operator', data = bitumen_agg)
sns.barplot(x 'Top 10 largest bitumen producers in Alberta since 2010')
plt.title('Cumulative Production (MMm3)')
plt.xlabel('Operator')
plt.ylabel( plt.show()
Largest bitumen producers in Alberta in 2022
The top 10 largest bitumen producers in Alberta in 2022 are listed below. The data is sorted by the bitumen production rate in descending order. The yearly bitumen production rate is converted to Mm3/day by dividing by 1,000.
# Calculate the yearly bitumen production rate for each operator
= (bitumen
bitumen_agg "Year == 2022")
.query(= lambda x: x.loc[:,"Jan":"Dec"].mean(axis = 1))
.assign(Production_Rate = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec', 'Year'])
.drop(columns 'Operator'])
.groupby([sum()
.1000)
.div(= 'Production_Rate', ascending = False)
.sort_values(by
.reset_index()10)
.head(
) bitumen_agg
Operator | Production_Rate | |
---|---|---|
0 | Cenovus | 76.725747 |
1 | CNRL | 55.968093 |
2 | Suncor | 36.721742 |
3 | Imperial | 23.121582 |
4 | ConocoPhillips | 22.142553 |
5 | MEG | 14.758539 |
6 | Strathcona | 7.995144 |
7 | CNOOC | 7.562969 |
8 | Athabasca Oil | 4.610921 |
9 | Greenfire | 4.151495 |
The top 10 largest bitumen producers in Alberta in 2022 are shown in the plot below.
# Plot the top 10 largest bitumen producers in Alberta in 2022
= (8, 6))
plt.figure(figsize = 'Production_Rate', y = 'Operator', data = bitumen_agg)
sns.barplot(x 'Top 10 largest bitumen producers in Alberta in 2022')
plt.title('Production Rate (Mm3/day)')
plt.xlabel('Operator')
plt.ylabel( plt.show()
Largest bitumen producers in Alberta in 2022 by area
The top 5 largest 2022 bitumen producers by area in Alberta are listed below. The data is sorted by the bitumen production rate in descending order. The yearly bitumen production rate is converted to Mm3/day by dividing by 1,000.
# Calculate the yearly bitumen production rate for each operator by area
= (bitumen
bitumen_agg "Year == 2022")
.query(= lambda x: x.loc[:,"Jan":"Dec"].mean(axis = 1))
.assign(Production_Rate = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec', 'Year'])
.drop(columns 'Area', 'Operator'])
.groupby([sum()
.1000)
.div(= 'Production_Rate', ascending = False)
.sort_values(by 'Area')
.groupby(5)
.head(
.reset_index()
) bitumen_agg
Area | Operator | Production_Rate | |
---|---|---|---|
0 | Athabasca | Cenovus | 76.565583 |
1 | Athabasca | CNRL | 36.756985 |
2 | Athabasca | Suncor | 36.721742 |
3 | Cold Lake | Imperial | 23.121582 |
4 | Athabasca | ConocoPhillips | 22.142553 |
5 | Cold Lake | CNRL | 17.712652 |
6 | Athabasca | MEG | 14.758539 |
7 | Cold Lake | Strathcona | 7.995144 |
8 | Peace River | Baytex | 2.889908 |
9 | Peace River | CNRL | 1.498456 |
10 | Peace River | Obsidian | 0.889067 |
11 | Cold Lake | Baytex | 0.308280 |
12 | Peace River | Islander | 0.273017 |
13 | Cold Lake | Cenovus | 0.160163 |
14 | Peace River | Woodcote | 0.031730 |
The figure below shows the top 5 largest bitumen producers in Alberta in 2022 by area.
= plt.subplots(3, 1, figsize = (6, 15))
fig, ax for i, area in enumerate(bitumen_agg['Area'].unique()):
= 'Production_Rate', y = 'Operator', data = bitumen_agg.query("Area == @area"), ax = ax[i])
sns.barplot(x
ax[i].set_title(area)'Production Rate (Mm3/day)')
ax[i].set_xlabel('Operator')
ax[i].set_ylabel(= 0.95)
fig.subplots_adjust(top plt.show()