Application of Python and Data Analytics in Oil and Gas-1
Jaiyesh Chahar
Useful Links
1. Contact me at: https://www.linkedin.com/in/jaiyesh-chahar-9b3642107/
2. For more of my projects: https://github.com/jaiyesh
3. Playlist of Python for oil and gas by Petroleum from Scratch: https://www.youtube.com/watch?v=UjdPncyGkIs&list=PLLwtZopJNyqYGXEYmt0zezAEuS616rACw
4. Petroleum from Scratch: https://www.linkedin.com/company/petroleum-from-scratch/?viewAsMember=true
Welcome, everyone!
I am very excited to take you through this exciting journey of applying Python and data analytics in the
O&G industry.
My aim is to take you to a better, more confident, and more comfortable place as far as Python for oil and gas
is concerned.
We will start with the basics of Python and then move on to use cases in the industry.
Here goes.
Python from Scratch
In [1]: # The # character starts a comment (on its own line or inline after code)
# Starting with the print function
print('Hello Guys')
# The escape sequence \n starts a new line.
print('Hello \nPDEU')
Hello Guys
Hello
PDEU
Mathematical Operations
In [2]: # Addition
print(4+5)
print(4.0+5)
#Subtraction
print(5-1)
print(5.0-1)
#Multiplication
print(2*3)
#Division
print(625/10)
9
9.0
4
4.0
6
62.5
25
5
In [5]: ## To get the remainder, use %
print(26%5)
1
Strings
6 : integer
6.0 : Float
"6" : String
In [9]: print('Petroleum'+'Engineering')
PetroleumEngineering
In [10]: print('Spam'*3)
SpamSpamSpam
In [11]: print(4*3)
print(4*'3')
12
3333
In [12]: type(2)
Out[12]: int
In [13]: type(4*'3')
Out[13]: str
-9
Input
In [18]: porosity
Out[18]: '0.5'
In [19]: type('porosity')
Out[19]: str
In [21]: type(porosity)
Out[21]: float
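The outputs above suggest that the porosity value first arrives as a string and is then converted to a float. Here is a minimal sketch of that conversion. A hard-coded string stands in for the actual input() call so the snippet runs without a keyboard, and the name raw_porosity is my own, not from the original cell.

```python
# input() always returns a string, so numeric text must be converted explicitly.
# The hard-coded string stands in for: raw_porosity = input('Enter porosity: ')
raw_porosity = '0.5'
print(type(raw_porosity))        # a str, even though it looks like a number

porosity = float(raw_porosity)   # convert the text to a floating-point number
print(type(porosity))            # now a float

# Arithmetic only behaves numerically after the conversion.
print(porosity * 2)              # numeric multiplication
print(raw_porosity * 2)          # string repetition: '0.50.5'
```

Note that type('porosity') returns str because the quotes make it a string literal, not a reference to the variable.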
In [27]: print(2!= 3)
True
In [28]: print(2 == 3 )
False
In [29]: # A single = is assignment, not comparison, so this line raises a SyntaxError
print(2 = 3 )
True
False
Safe Zone
While Loops : To repeat a block of code again and again, as long as the condition
is true
The code in the body of the while loop is executed repeatedly. Each repetition is called an iteration.
In [35]: i = 1
while i<=5:
    print(i)
    i = i+1
print('Finished')
1
2
3
4
5
Finished
0
1
2
3
4
5
Breaking
Finished
Continue : jumps back to the top of the while loop, rather than stopping it.
It ends the current iteration and continues with the next one.
In [40]: i = 0
while i <= 5:
    i = i+1
    if i ==3:
        print('Skipping 3')
        continue
    print(i)
1
2
Skipping 3
4
5
6
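To contrast the two loop controls, here is a small sketch that uses both break and continue in one while loop. The loop bounds and the printed list are my own choices for illustration.

```python
# break leaves the loop entirely; continue only skips the rest of the current iteration.
printed = []
i = 0
while True:
    if i > 5:
        print('Breaking')
        break            # exit the loop for good
    if i == 3:
        i += 1
        continue         # skip 3 and jump back to the top of the loop
    printed.append(i)
    print(i)
    i += 1
print('Finished')
```

When using continue in a while loop, update the counter before the continue statement, otherwise the loop never moves past the skipped value.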
Lists
Used to store items
Out[42]: 0.1
In [43]: # Empty lists are often created first and populated later in the program
empty = []
i = 5
while i < 10:
    empty.append(i)
    i = i+1
empty
Out[43]: [5, 6, 7, 8, 9]
In [44]: # Strings behave like a list of characters, so indexing operators also work on strings.
String = 'Petroleum'
String[1]
Out[44]: 'e'
Out[46]: [1, 2, 3, 4, 5, 6]
In [47]: #Multiply
a*3
Out[47]: [1, 2, 3, 1, 2, 3, 1, 2, 3]
In [48]: # in operator
1 in a
Out[48]: True
Out[49]: True
In [50]: #List Functions
#append: adding an item to the end of an existing list
a = [1,2,3]
a.append(4)
a
Out[50]: [1, 2, 3, 4]
In [51]: #insert: Like append, but we can insert a new item at any position in the list.
a = [1,2,3,4,5,6,7]
a.insert(4,'PETROLEUM')
a
Out[51]: [1, 2, 3, 4, 'PETROLEUM', 5, 6, 7]
print(subset1)
subset2 = superset[:]
print(subset2) # Omitting the first or last index of a slice also works.
[1, 3, 5]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
print(reverse_set)
[9, 7, 5, 3, 1]
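The slicing cell itself is incomplete above, so here is a sketch of list slicing with the general form list[start:stop:step]. The particular start, stop, and step values are my assumptions, chosen so the results match the outputs shown.

```python
superset = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# start:stop:step, where stop is excluded
subset1 = superset[0:5:2]      # every second item among the first five
print(subset1)

# Omitting start and stop takes the whole list (a shallow copy)
subset2 = superset[:]
print(subset2)

# A negative stop counts from the end: everything except the last item
subset3 = superset[:-1]
print(subset3)

# A negative step walks backwards through the list
reverse_set = superset[::-2]
print(reverse_set)
```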
In [54]: #Same can be applied to strings
name = 'Petroleum_Engineering'
print(name[0])
print(name[-1])
print(name[0:5])
print(name[::2])
print(name[-1::-1])
P
g
Petro
PtoemEgneig
gnireenignE_muelorteP
Tuples
A tuple is a collection which is ordered and unchangeable
Immutable
In [55]: a = (1,2,3,4,'Hello')
Out[56]: 'Hello'
In [57]: a[4] = 7
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-57-cb3ce5dc8467> in <module>
----> 1 a[4] = 7

TypeError: 'tuple' object does not support item assignment
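A short sketch of what immutability means in practice: the assignment fails with a TypeError, and the usual workaround is to build a new tuple. The names message and b are illustrative.

```python
a = (1, 2, 3, 4, 'Hello')

# Tuples can be indexed like lists...
print(a[4])

# ...but their items cannot be reassigned.
try:
    a[4] = 7
except TypeError as err:
    message = str(err)
    print('TypeError:', message)

# To "change" a tuple, build a new one from slices of the old one.
b = a[:4] + (7,)
print(b)
```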
Dictionaries
Helps store data with labels (keys)
Values are accessed by key rather than by position (insertion order is preserved from Python 3.7 onward)
print(rock_properties)
In [59]: rock_properties['poro']
Out[59]: 0.25
In [61]: rock_properties
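The cell that creates rock_properties is not shown above, so here is a minimal sketch of how such a dictionary might be built and used. Only the 'poro' value of 0.25 comes from the output above; the other keys and values are hypothetical.

```python
# Keys are labels, values are the data stored under them.
rock_properties = {
    'poro': 0.25,         # from the output above
    'perm': 150,          # hypothetical permeability, mD
    'lith': 'sandstone',  # hypothetical lithology
}

print(rock_properties['poro'])      # look up a value by its key

rock_properties['sw'] = 0.3         # add a new key-value pair
rock_properties['perm'] = 200       # overwrite an existing value

# .get() returns a default instead of raising KeyError for a missing key
print(rock_properties.get('depth', 'not recorded'))

# Loop over keys and values together
for key, value in rock_properties.items():
    print(key, ':', value)
```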
Sets
Curly braces are used, just like dictionaries, but a set stores only unique items
In [62]: a = {1,2,3,4,5,6,7,1,2,2}
In [63]: a
Out[63]: {1, 2, 3, 4, 5, 6, 7}
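As a small sketch of why sets are handy: duplicates vanish automatically, and the usual set operations are available. The second set b and the well names are made up for illustration.

```python
a = {1, 2, 3, 4, 5, 6, 7, 1, 2, 2}   # duplicates are dropped on creation
b = {5, 6, 7, 8}                      # hypothetical second set

common = a & b        # intersection: items in both sets
combined = a | b      # union: items in either set
only_a = a - b        # difference: items in a but not in b
print(common, combined, only_a)

# A common use: find the unique entries in a list
wells = ['F-4', 'F-5', 'F-4', 'F-12', 'F-5']   # hypothetical well names
unique_wells = set(wells)
print(len(unique_wells))
```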
for loops
The tool with which we can utilize the power of computers
Hello
people
of
PDEU
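The loop that produced the output above is missing, so here is a sketch of a for loop that would print the same four words, followed by a loop over range(). The variable names are my own.

```python
# A for loop visits each item of a sequence in turn.
words = ['Hello', 'people', 'of', 'PDEU']
for word in words:
    print(word)

# range(start, stop) generates integers from start up to, but not including, stop.
total = 0
for n in range(1, 6):
    total = total + n
print(total)
```

Unlike a while loop, a for loop needs no manual counter update, which removes a common source of infinite loops.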
Functions
Instead of writing the same code again and again for different values, we can write a function once
and call it whenever we want to do the calculation.
Use of Function
In [70]: add(2,3)
Out[70]: 5
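The call above presumes that add has already been defined. A minimal definition consistent with that call might look like this; the docstring and parameter names are my own.

```python
def add(a, b):
    """Return the sum of a and b."""
    return a + b

print(add(2, 3))
print(add(0.25, 0.05))   # works for floats as well
```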
In [71]: # Once a function returns, it stops executing; any code written after the return is never executed
def f(x,y,z):
    return x/y +z
    print('Hello')
In [72]: f(4,2,4)
Out[72]: 6.0
In [74]: api(0.9)
Lambda Function
Single line function
In [76]: api_lambda(0.9)
Out[76]: 25.72222222222223
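The definitions of api and api_lambda are not shown above. The result for a specific gravity of 0.9 is consistent with the standard API gravity relation, API = 141.5/SG - 131.5, so the sketch below assumes that formula. The parameter name sg is my own.

```python
def api(sg):
    """Return API gravity for a liquid of specific gravity sg (water = 1)."""
    return 141.5 / sg - 131.5

# The same calculation written as a single-line lambda function
api_lambda = lambda sg: 141.5 / sg - 131.5

print(api(0.9))
print(api_lambda(0.9))
print(api(1.0))   # water has an API gravity of 10
```

A lambda is convenient for short, throwaway calculations; for anything that needs a docstring or more than one expression, a regular def is easier to read.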
List Comprehensions
Quickly create lists whose contents obey a simple rule
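Here is a short sketch of list comprehensions. The depth values are hypothetical, and 0.433 psi/ft is the usual fresh-water hydrostatic gradient.

```python
# General form: [expression for item in iterable if condition]
squares = [n**2 for n in range(5)]
print(squares)

# Apply one rule to every item: hydrostatic pressure at each depth
depths = [1000, 1500, 2000, 2500]           # hypothetical depths, ft
pressures = [0.433 * d for d in depths]      # psi, fresh-water gradient
print(pressures)

# Keep only the items that pass a test
deep = [d for d in depths if d >= 2000]
print(deep)
```

Each comprehension replaces the pattern of creating an empty list and appending to it inside a loop.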
Numpy Arrays
In [283]: type(arr)
Out[283]: numpy.ndarray
In [284]: type(a)
Out[284]: list
In [214]: a=[1,2,3,4,5]
b=[4,5,6,7,8]
arra = np.array(a)
arrb = np.array(b)
print(a+b) # Concatenation of lists, not addition
print(arra+arrb) # Element-wise addition of arrays
[1, 2, 3, 4, 5, 4, 5, 6, 7, 8]
[ 5 7 9 11 13]
Out[286]: (3, 3)
Out[217]: array([ 0, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000])
# Both start and stop values are included as the first and last values of the array.
# Creates a saturation array with 100 values, ranging from 0 to 100
saturations
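The cells that created these arrays are incomplete above. As a sketch, np.arange and np.linspace are the two usual ways to build evenly spaced arrays; the exact arguments below are my assumptions, chosen to match the depth output and the comment about 100 saturation values.

```python
import numpy as np

# np.arange(start, stop, step): stop is excluded, so use 5001 to include 5000
depths = np.arange(0, 5001, 500)
print(depths)

# np.linspace(start, stop, num): both start and stop are included, result is float
depths_lin = np.linspace(0, 5000, 11)
print(depths_lin)

# 100 evenly spaced saturation values from 0 to 100
saturations = np.linspace(0, 100, 100)
print(saturations[0], saturations[-1], len(saturations))
```

Use arange when you know the step size and linspace when you know how many points you want.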
In [287]: #np.zeros
z = np.zeros(3)
z
In [288]: zm = np.zeros((3,3))
zm
In [290]: mo = np.ones((3,2))
mo
In [297]: #Random: Numpy also has lots of ways to create random number arrays:
#Rand : Create an array of the given shape and populate it with random
#samples from a uniform distribution over [0, 1)
rand = np.random.rand(2) #shape input
rand
In [298]: np.random.rand(5,5)
In [299]: #randn : Return a sample (or samples) from the "standard normal" distribution
#(mean = 0, standard deviation = 1), unlike rand, which is uniform.
np.random.randn(2)
In [300]: np.random.randn(5,5)
Out[306]: array([ 4, 28, 29, 8, 64, 62, 38, 46, 51, 85])
In [307]: np.random.randint(1,100,10)
Out[307]: array([67, 97, 43, 19, 78, 94, 44, 67, 46, 8])
# linewidth=2, color='r')
#plt.show()
In [310]: a*b
In [311]: a/b
In [312]: a**b
Out[313]: 60
In [314]: #len function
len(a)
Out[314]: 4
In [315]: z = np.array([a,a**b])
z
Out[316]: 2
In [318]: len(z.T)
Out[318]: 4
Out[319]: dtype('int32')
In [326]: poro.max()
Out[326]: 0.2943737318970742
In [327]: poro.min()
Out[327]: 0.21559888678906225
In [328]: poro.mean()
Out[328]: 0.24987956068441375
In [329]: poro.std()
Out[329]: 0.010131068219927063
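The cell that created poro is not shown. The statistics above (mean near 0.25, standard deviation near 0.01) would be consistent with normally distributed random porosities, so the sketch below assumes that. The seed, sample size, and use of default_rng are my own choices, and the numbers it prints will differ from those above.

```python
import numpy as np

# Seeded generator so the results are reproducible from run to run
rng = np.random.default_rng(seed=42)

# Hypothetical porosity samples: normal distribution, mean 0.25, std 0.01
poro = rng.normal(loc=0.25, scale=0.01, size=1000)

# Summary statistics are available as array methods
print('max :', poro.max())
print('min :', poro.min())
print('mean:', poro.mean())
print('std :', poro.std())
```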
In [330]: #Numpy Universal Array Functions - Numpy comes with many universal array functions,
#which are essentially just mathematical operations you can use to perform the operation across the array
arr = np.random.randint(1,100,10)
arr
Out[330]: array([ 4, 43, 70, 49, 64, 26, 21, 12, 72, 93])
In [333]: np.sin(arr)
In [334]: np.log(arr)
Pandas
The MS Excel of Python, but more powerful. This library helps us import, create, and work with data in the form of tables.
Out[335]:
phi perm lith
1 0.40 20 shale
Out[336]:
phi perm lith Saturation
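The cells that built this small table are missing, so here is a sketch of creating a DataFrame from a dictionary and then adding a column. Only the row with phi 0.40, perm 20, and lith shale comes from the output above; every other value is hypothetical.

```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values.
rocks = pd.DataFrame({
    'phi':  [0.25, 0.40, 0.18],
    'perm': [150, 20, 80],
    'lith': ['sandstone', 'shale', 'limestone'],
})
print(rocks)

# Adding a column is just an assignment with a new column name.
rocks['Saturation'] = [0.30, 0.90, 0.50]   # hypothetical values
print(rocks)

# Rows are looked up by index label with .loc
print(rocks.loc[1])
```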
Out[343]:
DATEPRD NPD_WELL_BORE_CODE NPD_WELL_BORE_NAME ON_STREAM_HRS AVG_DOWNHOLE_PRE
In [354]: volve.describe()
Out[354]:
NPD_WELL_BORE_CODE ON_STREAM_HRS AVG_DOWNHOLE_PRESSURE AVG_DOWNHOLE_TEMPE
In [346]: #shape
volve.shape
15/9-F-4 3327
15/9-F-5 3306
15/9-F-14 3056
15/9-F-12 3056
15/9-F-11 1165
15/9-F-15 D 978
15/9-F-1 C 746
Name: NPD_WELL_BORE_NAME, dtype: int64
In [350]: volve.groupby(['NPD_WELL_BORE_NAME']).agg({'NPD_WELL_BORE_NAME':'count'})
Out[350]:
NPD_WELL_BORE_NAME
NPD_WELL_BORE_NAME
15/9-F-1 C 746
15/9-F-11 1165
15/9-F-12 3056
15/9-F-14 3056
15/9-F-15 D 978
15/9-F-4 3327
15/9-F-5 3306
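Counting rows per well can be done with either value_counts or groupby, and a boolean mask then selects a single well. Here is a sketch on a tiny made-up table. The column names follow the Volve data above, but the rows and the name f12 are hypothetical.

```python
import pandas as pd

# Tiny stand-in for the production table (values are made up)
wells = pd.DataFrame({
    'NPD_WELL_BORE_NAME': ['15/9-F-4', '15/9-F-4', '15/9-F-5',
                           '15/9-F-12', '15/9-F-4', '15/9-F-5'],
    'BORE_OIL_VOL': [0.0, 0.0, 0.0, 285.0, 0.0, 0.0],
})

# value_counts sorts by count, largest first
counts_vc = wells['NPD_WELL_BORE_NAME'].value_counts()
print(counts_vc)

# groupby(...).size() gives the same counts, sorted by well name
counts_gb = wells.groupby('NPD_WELL_BORE_NAME').size()
print(counts_gb)

# A boolean mask keeps only the rows of one well
f12 = wells[wells['NPD_WELL_BORE_NAME'] == '15/9-F-12']
print(f12)
```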
In [353]: volve_pf12.head()
Out[353]:
DATEPRD NPD_WELL_BORE_CODE NPD_WELL_BORE_NAME ON_STREAM_HRS AVG_DOWNHOLE_
Out[265]:
NPD_WELL_BORE_CODE ON_STREAM_HRS AVG_DOWNHOLE_PRESSURE AVG_DOWNHOLE_TEMPE
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3056 entries, 1911 to 4966
Data columns (total 19 columns):
DATEPRD 3056 non-null object
NPD_WELL_BORE_CODE 3056 non-null int64
NPD_WELL_BORE_NAME 3056 non-null object
ON_STREAM_HRS 3056 non-null float64
AVG_DOWNHOLE_PRESSURE 3050 non-null float64
AVG_DOWNHOLE_TEMPERATURE 3050 non-null float64
AVG_DP_TUBING 3050 non-null float64
AVG_ANNULUS_PRESS 3043 non-null float64
AVG_CHOKE_SIZE_P 3012 non-null float64
AVG_CHOKE_UOM 3056 non-null object
AVG_WHP_P 3056 non-null float64
AVG_WHT_P 3056 non-null float64
DP_CHOKE_SIZE 3056 non-null float64
BORE_OIL_VOL 3056 non-null float64
BORE_GAS_VOL 3056 non-null float64
BORE_WAT_VOL 3056 non-null float64
BORE_WI_VOL 0 non-null float64
FLOW_KIND 3056 non-null object
WELL_TYPE 3056 non-null object
dtypes: float64(13), int64(1), object(5)
memory usage: 477.5+ KB
In [267]: import seaborn as sns
sns.heatmap(volve_pf12.isnull())
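Alongside the heatmap, missing values can also be counted directly, and columns that are entirely empty (like BORE_WI_VOL in the info summary above) can be dropped. Here is a sketch on a small made-up frame; which columns the original notebook actually dropped is not shown, so this is only one plausible approach.

```python
import numpy as np
import pandas as pd

# Small stand-in table with some missing values (numbers are made up)
df = pd.DataFrame({
    'AVG_DOWNHOLE_PRESSURE': [308.056, np.nan, 295.586],
    'BORE_OIL_VOL': [285.0, 1870.0, 3124.0],
    'BORE_WI_VOL': [np.nan, np.nan, np.nan],
})

# Number of missing values in each column
missing = df.isnull().sum()
print(missing)

# Drop columns in which every value is missing
cleaned = df.dropna(axis=1, how='all')
print(cleaned.columns.tolist())
```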
Out[356]:
DATEPRD ON_STREAM_HRS AVG_DOWNHOLE_PRESSURE AVG_DOWNHOLE_TEMPERATURE
In [358]: volve_pf12.head()
Out[358]:
ON_STREAM_HRS AVG_DOWNHOLE_PRESSURE AVG_DOWNHOLE_TEMPERATURE
DATEPRD
In [360]: volve_pf12['AVG_DOWNHOLE_PRESSURE']
Out[360]: DATEPRD
12-Feb-08 308.056
13-Feb-08 303.034
14-Feb-08 295.586
15-Feb-08 297.663
16-Feb-08 295.936
...
13-Sep-16 0.000
14-Sep-16 0.000
15-Sep-16 0.000
16-Sep-16 0.000
17-Sep-16 0.000
Name: AVG_DOWNHOLE_PRESSURE, Length: 3056, dtype: float64
In [361]: volve_pf12[['AVG_DOWNHOLE_PRESSURE']]
Out[361]:
AVG_DOWNHOLE_PRESSURE
DATEPRD
12-Feb-08 308.056
13-Feb-08 303.034
14-Feb-08 295.586
15-Feb-08 297.663
16-Feb-08 295.936
... ...
13-Sep-16 0.000
14-Sep-16 0.000
15-Sep-16 0.000
16-Sep-16 0.000
17-Sep-16 0.000
In [362]: a =volve_pf12[['AVG_DOWNHOLE_PRESSURE','BORE_OIL_VOL']]
In [363]: a
Out[363]:
AVG_DOWNHOLE_PRESSURE BORE_OIL_VOL
DATEPRD
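The difference between single and double square brackets in the cells above is worth spelling out: a single column name returns a Series, while a list of names returns a DataFrame. Here is a sketch on a small stand-in table. The pressure values come from the output above, and the oil volumes are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {'AVG_DOWNHOLE_PRESSURE': [308.056, 303.034, 295.586],
     'BORE_OIL_VOL': [285.0, 1870.0, 3124.0]},       # hypothetical volumes
    index=pd.Index(['12-Feb-08', '13-Feb-08', '14-Feb-08'], name='DATEPRD'),
)

# Single brackets with one name -> Series (one-dimensional)
series = df['AVG_DOWNHOLE_PRESSURE']
print(type(series))

# Double brackets (a list of names) -> DataFrame, even for one column
frame = df[['AVG_DOWNHOLE_PRESSURE']]
print(type(frame))

# The same list syntax selects several columns at once
two = df[['AVG_DOWNHOLE_PRESSURE', 'BORE_OIL_VOL']]
print(two.shape)
```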
In [279]: print(plt.style.available)