Advanced Python Sheet
The goal of this part is to collect common snippets covering both built-in and third-party module usage.
Table of Contents
• Regular Expression
– Compare HTML tags
– re.findall() match string
– Group Comparison
– Non capturing group
– Back Reference
– Named Grouping (?P<name>)
– Substitute String
– Look around
– Match common username or password
– Match hex color value
– Match email
– Match URL
– Match IP address
– Match Mac address
– Lexer
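Most of the items above boil down to a single compiled pattern. As an example, matching a hex color value can be sketched as follows (the pattern and test strings are illustrative, not the original snippet):
>>> import re
>>> hex_color = re.compile(r'^#?([0-9a-fA-F]{6}|[0-9a-fA-F]{3})$')
>>> bool(hex_color.match('#ffffff'))
True
>>> bool(hex_color.match('#f2f'))
True
>>> bool(hex_color.match('#beef'))
False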
# open tag
>>> re.search('<[^/>][^>]*>', '<table>') != None
True
>>> re.search('<[^/>][^>]*>', '<a href="#label">') != None
True
>>> re.search('<[^/>][^>]*>', '<img src="/img">') != None
True
>>> re.search('<[^/>][^>]*>', '</table>') != None
False
# close tag
>>> re.search('</[^>]+>', '</table>') != None
True
# self close
>>> re.search('<[^/>]+/>', '<br />') != None
True
# Nesting groups
>>> m = re.search(r'(((\d{4})-\d{2})-\d{2})', '2016-01-01')
>>> m.groups()
('2016-01-01', '2016-01', '2016')
>>> m.group()
'2016-01-01'
>>> m.group(1)
'2016-01-01'
>>> m.group(2)
'2016-01'
>>> m.group(3)
'2016'
# capturing group
>>> url = 'http://stackoverflow.com/'
>>> m = re.search('(http|ftp)://([^/\r\n]+)(/[^\r\n]*)?', url)
>>> m.groups()
('http', 'stackoverflow.com', '/')
# basic substitute
>>> res = "1a2b3c"
>>> re.sub(r'[a-z]',' ', res)
'1 2 3 '
# camelcase to underscore
>>> def convert(s):
... res = re.sub(r'(.)([A-Z][a-z]+)',r'\1_\2', s)
...     return re.sub(r'([a-z0-9])([A-Z])', r'\1_\2', res).lower()
...
>>> convert('CamelCase')
'camel_case'
# basic
>>> re.sub('(?=\d{3})', ' ', '12345')
' 1 2 345'
>>> re.sub('(?!\d{3})', ' ', '12345')
'123 4 5 '
>>> re.sub('(?<=\d{3})', ' ', '12345')
'123 4 5 '
>>> re.sub('(?<!\d{3})', ' ', '12345')
' 1 2 345'
>>> re.match('^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$',
... '[email protected]')
<_sre.SRE_Match object at 0x1087a4d40>
notation        description
(?:...)         Don't capture group
25[0-5]         Match 250-255 pattern
2[0-4][0-9]     Match 200-249 pattern
[1]?[0-9][0-9]  Match 0-199 pattern
3.1.15 Lexer
>>> import re
>>> from collections import namedtuple
>>> tokens = [r'(?P<NUMBER>\d+)',
... r'(?P<PLUS>\+)',
... r'(?P<MINUS>-)',
... r'(?P<TIMES>\*)',
... r'(?P<DIVIDE>/)',
... r'(?P<WS>\s+)']
>>> lex = re.compile('|'.join(tokens))
>>> Token = namedtuple('Token', ['type', 'value'])
>>> def tokenize(text):
... scan = lex.scanner(text)
... return (Token(m.lastgroup, m.group())
... for m in iter(scan.match, None) if m.lastgroup != 'WS')
...
>>> for _t in tokenize('9 + 5 * 2 - 7'):
... print(_t)
...
Token(type='NUMBER', value='9')
Token(type='PLUS', value='+')
Token(type='NUMBER', value='5')
Token(type='TIMES', value='*')
Token(type='NUMBER', value='2')
Token(type='MINUS', value='-')
Token(type='NUMBER', value='7')
3.2 Socket
Even though Python provides high-level networking interfaces such as httplib, urllib, imaplib, and telnetlib, socket programming is still unavoidable for most programmers: some Unix-like system interfaces, e.g. Netlink and the kernel cryptography API, are only reachable through sockets. To spare the pain of reading long-winded documents or source code, this cheat sheet collects common and uncommon snippets related to low-level socket programming.
Table of Contents
• Socket
– Get Hostname
– Get address family and socket address from string
– Transform Host & Network Endian
– IP dotted-quad string & byte format convert
– Mac address & byte format convert
– Simple TCP Echo Server
– Simple TCP Echo Server through IPv6
– Disable IPv6 Only
– Simple TCP Echo Server Via SocketServer
– Simple TLS/SSL TCP Echo Server
– Set ciphers on TLS/SSL TCP Echo Server
– Simple UDP Echo Server
– Simple UDP Echo Server Via SocketServer
– Simple UDP client - Sender
– Broadcast UDP Packets
– Simple UNIX Domain Socket
– Simple duplex processes communication
– Simple Asynchronous TCP Server - Thread
– Simple Asynchronous TCP Server - select
– Simple Asynchronous TCP Server - poll
– Simple Asynchronous TCP Server - epoll
– Simple Asynchronous TCP Server - kqueue
– High-Level API - selectors
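Getting the local hostname, the first item above, is a one-liner; a minimal sketch:
import socket

# print the machine's hostname and one address it resolves to
print(socket.gethostname())
print(socket.gethostbyname(socket.gethostname()))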
import socket
import sys
try:
for res in socket.getaddrinfo(sys.argv[1], None,
proto=socket.IPPROTO_TCP):
family = res[0]
sockaddr = res[4]
print(family, sockaddr)
except socket.gaierror:
print("Invalid")
Output:
$ gai.py 192.0.2.244
AddressFamily.AF_INET ('192.0.2.244', 0)
$ gai.py 2001:db8:f00d::1:d
AddressFamily.AF_INET6 ('2001:db8:f00d::1:d', 0, 0, 0)
# little-endian machine
>>> import socket
>>> a = 1 # host endian
>>> socket.htons(a) # network endian
256
>>> socket.htonl(a) # network endian
16777216
>>> socket.ntohs(256) # host endian
1
>>> socket.ntohl(16777216) # host endian
1
# big-endian machine
>>> import socket
>>> a = 1 # host endian
>>> socket.htons(a) # network endian
1
>>> socket.htonl(a) # network endian
1L
>>> socket.ntohs(1) # host endian
1
>>> socket.ntohl(1) # host endian
1L
import socket
class Server(object):
def __init__(self, host, port):
self._host = host
self._port = port
def __enter__(self):
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind((self._host, self._port))
sock.listen(10)
self._sock = sock
return self._sock
def __exit__(self, *exc_info):
if exc_info[0]:
import traceback
traceback.print_exception(*exc_info)
self._sock.close()
if __name__ == '__main__':
host = 'localhost'
port = 5566
with Server(host, 5566) as s:
while True:
conn, addr = s.accept()
msg = conn.recv(1024)
conn.send(msg)
conn.close()
output:
$ nc localhost 5566
Hello World
Hello World
import contextlib
import socket
host = "::1"
port = 5566
@contextlib.contextmanager
def server(host, port):
s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
try:
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind((host, port))
s.listen(10)
yield s
finally:
s.close()
if msg:
conn.send(msg)
conn.close()
except KeyboardInterrupt:
pass
output:
#!/usr/bin/env python3
import contextlib
import socket
host = "::"
port = 5566
@contextlib.contextmanager
def server(host: str, port: int):
s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0)
try:
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
s.bind((host, port))
s.listen(10)
yield s
finally:
s.close()
if msg:
conn.send(msg)
conn.close()
except KeyboardInterrupt:
pass
output:
output:
$ nc localhost 5566
Hello World
Hello World
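The same echo behavior can be obtained from the standard-library socketserver module (the "Via SocketServer" item above); a minimal sketch:
import socketserver

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # echo one message back to the client
        msg = self.request.recv(1024)
        self.request.send(msg)

with socketserver.TCPServer(('localhost', 5566), EchoHandler) as server:
    server.serve_forever()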
import socket
import ssl
sslctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
sslctx.load_cert_chain(certfile='./root-ca.crt',
keyfile='./root-ca.key')
try:
while True:
conn, addr = sock.accept()
sslconn = sslctx.wrap_socket(conn, server_side=True)
msg = sslconn.recv(1024)
if msg:
sslconn.send(msg)
sslconn.close()
finally:
sock.close()
output:
# console 1
$ openssl genrsa -out root-ca.key 2048
$ openssl req -x509 -new -nodes -key root-ca.key -days 365 -out root-ca.crt
$ python3 ssl_tcp_server.py
# console 2
$ openssl s_client -connect localhost:5566
...
Hello SSL
Hello SSL
read:errno=0
import socket
import json
import ssl
sslctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
sslctx.load_cert_chain(certfile='cert.pem',
keyfile='key.pem')
# set ssl ciphers
sslctx.set_ciphers('ECDH-ECDSA-AES128-GCM-SHA256')
print(json.dumps(sslctx.get_ciphers(), indent=2))
try:
while True:
conn, addr = sock.accept()
sslconn = sslctx.wrap_socket(conn, server_side=True)
msg = sslconn.recv(1024)
if msg:
sslconn.send(msg)
sslconn.close()
finally:
sock.close()
output:
"strength_bits": 128,
"alg_bits": 128
}
]
$ openssl s_client -connect localhost:5566 -cipher "ECDH-ECDSA-AES128-GCM-SHA256"
...
---
Hello ECDH-ECDSA-AES128-GCM-SHA256
Hello ECDH-ECDSA-AES128-GCM-SHA256
read:errno=0
import socket
class UDPServer(object):
def __init__(self, host, port):
self._host = host
self._port = port
def __enter__(self):
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind((self._host, self._port))
self._sock = sock
return sock
def __exit__(self, *exc_info):
if exc_info[0]:
import traceback
traceback.print_exception(*exc_info)
self._sock.close()
if __name__ == '__main__':
host = 'localhost'
port = 5566
with UDPServer(host, port) as s:
while True:
msg, addr = s.recvfrom(1024)
s.sendto(msg, addr)
output:
$ nc -u localhost 5566
Hello World
Hello World
output:
$ nc -u localhost 5566
Hello World
Hello World
output:
output:
$ nc -k -w 1 -ul 5566
1431473025.72
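A broadcast sender consistent with the output above can be sketched as follows; the timestamp payload and one-second interval are assumptions:
import socket
import time

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
while True:
    # broadcast the current timestamp once per second
    sock.sendto(str(time.time()).encode('utf-8'), ('<broadcast>', 5566))
    time.sleep(1)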
import socket
import contextlib
import os
@contextlib.contextmanager
def DomainServer(addr):
try:
if os.path.exists(addr):
os.unlink(addr)
sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.bind(addr)
sock.listen(10)
yield sock
finally:
sock.close()
if os.path.exists(addr):
os.unlink(addr)
addr = "./domain.sock"
with DomainServer(addr) as sock:
while True:
conn, _ = sock.accept()
msg = conn.recv(1024)
conn.send(msg)
conn.close()
output:
$ nc -U ./domain.sock
Hello
Hello
import os
import socket
if pid == 0:
print('child pid: {}'.format(os.getpid()))
child.send(b'Hello Parent')
msg = child.recv(1024)
print('p[{}] ---> c[{}]: {}'.format(
os.getppid(), os.getpid(), msg))
else:
print('parent pid: {}'.format(os.getpid()))
except KeyboardInterrupt:
pass
finally:
child.close()
parent.close()
output:
$ python3 socketpair_demo.py
parent pid: 9497
child pid: 9498
c[9498] ---> p[9497]: b'Hello Parent'
p[9497] ---> c[9498]: b'Hello Parent'
output: (bash 1)
$ nc localhost 5566
Hello
Hello
output: (bash 2)
$ nc localhost 5566
Ker Ker
Ker Ker
output: (bash 1)
$ nc localhost 5566
Hello
Hello
output: (bash 2)
$ nc localhost 5566
Ker Ker
Ker Ker
import socket
import select
import contextlib
host = 'localhost'
port = 5566
con = {}
req = {}
resp = {}
@contextlib.contextmanager
def Server(host,port):
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setblocking(False)
s.bind((host,port))
s.listen(10)
yield s
except socket.error:
print("Get socket error")
raise
finally:
if s: s.close()
@contextlib.contextmanager
def Poll():
try:
e = select.poll()
yield e
conn = req[fd]
msg = conn.recv(1024)
if msg:
resp[fd] = msg
poll.modify(fd, select.POLLOUT)
else:
conn.close()
del con[fd]
del req[fd]
conn = con[fd]
msg = resp[fd]
b = 0
total = len(msg)
while total > b:
l = conn.send(msg)
msg = msg[l:]
b += l
del resp[fd]
req[fd] = conn
poll.modify(fd, select.POLLIN)
try:
with Server(host, port) as server, Poll() as poll:
poll.register(server.fileno())
while True:
events = poll.poll(1)
for fd, e in events:
if fd == server.fileno():
accept(server, poll)
elif e & (select.POLLIN | select.POLLPRI):
recv(fd, poll)
elif e & select.POLLOUT:
send(fd, poll)
except KeyboardInterrupt:
pass
output: (bash 1)
$ python3 poll.py &
[1] 3036
$ nc localhost 5566
Hello poll
Hello poll
Hello Python Socket Programming
Hello Python Socket Programming
output: (bash 2)
$ nc localhost 5566
Hello Python
Hello Python
Hello Awesome Python
Hello Awesome Python
import socket
import select
import contextlib
host = 'localhost'
port = 5566
con = {}
req = {}
resp = {}
@contextlib.contextmanager
def Server(host,port):
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
@contextlib.contextmanager
def Epoll():
try:
e = select.epoll()
yield e
finally:
for fd in con: e.unregister(fd)
e.close()
conn = req[fd]
msg = conn.recv(1024)
if msg:
resp[fd] = msg
epoll.modify(fd, select.EPOLLOUT)
else:
conn.close()
del con[fd]
del req[fd]
conn = con[fd]
del resp[fd]
req[fd] = conn
epoll.modify(fd, select.EPOLLIN)
try:
with Server(host, port) as server, Epoll() as epoll:
epoll.register(server.fileno())
while True:
events = epoll.poll(1)
for fd, e in events:
if fd == server.fileno():
accept(server, epoll)
elif e & select.EPOLLIN:
recv(fd, epoll)
elif e & select.EPOLLOUT:
send(fd, epoll)
except KeyboardInterrupt:
pass
output: (bash 1)
output: (bash 2)
$ nc localhost 5566
Hello Python
Hello Python
Hello Awesome Python
Hello Awesome Python
import socket
import select
import contextlib
host = 'localhost'
port = 5566
con = {}
req = {}
resp = {}
@contextlib.contextmanager
def Server(host, port):
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setblocking(False)
s.bind((host, port))
s.listen(10)
yield s
except socket.error:
print("Get socket error")
raise
finally:
if s: s.close()
@contextlib.contextmanager
def Kqueue():
try:
kq = select.kqueue()
yield kq
finally:
kq.close()
for fd, c in con.items(): c.close()
conn = req[fd]
msg = conn.recv(1024)
if msg:
resp[fd] = msg
# remove read event
ke = select.kevent(fd,
select.KQ_FILTER_READ,
select.KQ_EV_DELETE)
kq.control([ke], 0)
# add write event
ke = select.kevent(fd,
select.KQ_FILTER_WRITE,
select.KQ_EV_ADD)
kq.control([ke], 0)
req[fd] = conn
con[fd] = conn
else:
conn.close()
del con[fd]
del req[fd]
conn = con[fd]
msg = resp[fd]
b = 0
total = len(msg)
while total > b:
l = conn.send(msg)
msg = msg[l:]
b += l
del resp[fd]
req[fd] = conn
# remove write event
ke = select.kevent(fd,
select.KQ_FILTER_WRITE,
select.KQ_EV_DELETE)
kq.control([ke], 0)
try:
with Server(host, port) as server, Kqueue() as kq:
max_events = 1024
timeout = 1
ke = select.kevent(server.fileno(),
select.KQ_FILTER_READ,
select.KQ_EV_ADD)
kq.control([ke], 0)
while True:
events = kq.control(None, max_events, timeout)
for e in events:
fd = e.ident
if fd == server.fileno():
accept(server, kq)
elif e.filter == select.KQ_FILTER_READ:
recv(fd, kq)
elif e.filter == select.KQ_FILTER_WRITE:
send(fd, kq)
except KeyboardInterrupt:
pass
output: (bash 1)
output: (bash 2)
$ nc localhost 5566
Hello Python
Hello Python
Hello Awesome Python
Hello Awesome Python
# Python 3.4+ only
# Reference: selectors
import selectors
import socket
import contextlib
@contextlib.contextmanager
def Server(host, port):
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind((host, port))
s.listen(10)
sel = selectors.DefaultSelector()
yield s, sel
except socket.error:
print("Get socket error")
raise
finally:
if s:
s.close()
host = 'localhost'
port = 5566
with Server(host, port) as (s,sel):
sel.register(s, selectors.EVENT_READ, accept_handler)
while True:
events = sel.select()
for sel_key, m in events:
handler = sel_key.data
handler(sel_key.fileobj, sel)
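The accept_handler registered above, and the read handler it installs, are not defined in the listing; a minimal sketch of what they could look like (they would sit before the registration):
def read_handler(conn, sel):
    msg = conn.recv(1024)
    if msg:
        conn.send(msg)
    else:
        sel.unregister(conn)
        conn.close()

def accept_handler(s, sel):
    conn, addr = s.accept()
    sel.register(conn, selectors.EVENT_READ, read_handler)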
output: (bash 1)
$ nc localhost 5566
Hello
Hello
output: (bash 2)
$ nc localhost 5566
Hi
Hi
import socket
import selectors
import contextlib
import ssl
sslctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
sslctx.load_cert_chain(certfile="cert.pem", keyfile="key.pem")
@contextlib.contextmanager
def Server(host, port):
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind((host, port))
s.listen(10)
sel = selectors.DefaultSelector()
yield s, sel
except socket.error:
print("Get socket error")
raise
finally:
if s: s.close()
if sel: sel.close()
host = 'localhost'
port = 5566
try:
with Server(host, port) as (s,sel):
sel.register(s, selectors.EVENT_READ, accept)
while True:
events = sel.select()
for sel_key, m in events:
handler = sel_key.data
handler(sel_key.fileobj, sel)
except KeyboardInterrupt:
pass
output:
# console 1
$ openssl genrsa -out key.pem 2048
$ openssl req -x509 -new -nodes -key key.pem -days 365 -out cert.pem
$ python3 ssl_tcp_server.py &
$ openssl s_client -connect localhost:5566
...
---
Hello TLS
Hello TLS
# console 2
$ openssl s_client -connect localhost:5566
...
---
Hello SSL
Hello SSL
import socket
import os
import time
if pid:
# parent process
c_s.close()
while True:
p_s.sendall("Hi! Child!")
msg = p_s.recv(1024)
print(msg)
time.sleep(3)
os.wait()
else:
# child process
p_s.close()
while True:
msg = c_s.recv(1024)
print(msg)
c_s.sendall("Hi! Parent!")
output:
$ python ex.py
Hi! Child!
Hi! Parent!
Hi! Child!
Hi! Parent!
...
import os
import sys
if len(sys.argv) != 3:
print("Usage: cmd src dst")
exit(1)
offset = 0
count = 4096
s_len = st.st_size
sfd = s.fileno()
dfd = d.fileno()
output:
import os
import sys
import time
import socket
import contextlib
@contextlib.contextmanager
def server(host, port):
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind((host, port))
s.listen(10)
yield s
finally:
        s.close()
@contextlib.contextmanager
def client(host, port):
try:
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
c.connect((host, port))
yield c
finally:
c.close()
fout.write(data)
host = 'localhost'
port = 5566
if len(sys.argv) != 3:
print("usage: cmd src dst")
exit(1)
src = sys.argv[1]
dst = sys.argv[2]
offset = 0
pid = os.fork()
if pid == 0:
# client
time.sleep(3)
with client(host, port) as c, open(src, 'rb') as f:
fd = f.fileno()
st = os.fstat(fd)
count = 4096
else:
# server
with server(host, port) as s, open(dst, 'wb') as f:
conn, addr = s.accept()
do_recv(f, conn)
output:
@contextlib.contextmanager
def create_alg(typ, name):
s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
try:
s.bind((typ, name))
yield s
finally:
s.close()
# check data
h = hashlib.sha256(msg).digest()
if h != data:
raise Exception(f"sha256({h}) != af_alg({data})")
output:
$ python3 af_alg.py
9d50bcac2d5e33f936ec2db7dc7b6579cba8e1b099d77c31d8564df46f66bdf5
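A self-contained version of the AF_ALG SHA-256 computation (Linux only, Python 3.6+; the message below is an assumption):
import socket
import hashlib

msg = b'Hello AF_ALG'
with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as algo:
    algo.bind(('hash', 'sha256'))
    op, _ = algo.accept()
    with op:
        op.sendall(msg)
        data = op.recv(32)

# the kernel digest must match hashlib's result
assert data == hashlib.sha256(msg).digest()
print(data.hex())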
BS = 16 # Bytes
pad = lambda s: s + (BS - len(s) % BS) * \
chr(BS - len(s) % BS).encode('utf-8')
@contextlib.contextmanager
def create_alg(typ, name):
s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
try:
s.bind((typ, name))
yield s
finally:
s.close()
return ciphertext
return upad(plaintext)
key = os.urandom(32)
iv = os.urandom(16)
print(ciphertext.hex())
print(plaintext)
output:
$ python3 aes_cbc.py
01910e4bd6932674dba9bebd4fdf6cf2
b'Demo AF_ALG'
@contextlib.contextmanager
def create_alg(typ, name):
s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
try:
s.bind((typ, name))
yield s
finally:
s.close()
assoclen = len(assoc)
op, _ = algo.accept()
with op:
msg = assoc + plaintext
op.sendmsg_afalg([msg],
op=socket.ALG_OP_ENCRYPT,
iv=iv,
assoclen=assoclen)
return plaintext
key = os.urandom(16)
iv = os.urandom(12)
assoc = os.urandom(16)
print(ciphertext.hex())
print(plaintext)
output:
$ python3 aes_gcm.py
2e27b67234e01bcb0ab6b451f4f870ce
b'Hello AES-GCM'
@contextlib.contextmanager
def create_alg(typ, name):
s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
try:
s.bind((typ, name))
yield s
finally:
s.close()
pfd = pfile.fileno()
offset = 0
st = os.fstat(pfd)
totalbytes = st.st_size
op, _ = algo.accept()
with op:
op.sendmsg_afalg(op=socket.ALG_OP_ENCRYPT,
iv=iv,
assoclen=assoclen,
flags=socket.MSG_MORE)
op.sendall(assoc, socket.MSG_MORE)
taglen = len(tag)
res = op.recv(len(msg) - taglen)
plaintext = res[assoclen:]
return plaintext
if len(sys.argv) != 2:
print("usage: cmd plain")
exit(1)
plain = sys.argv[1]
print(ciphertext.hex())
print(plaintext)
output:
@contextlib.contextmanager
def create_alg(typ, name):
s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
try:
s.bind((typ, name))
yield s
finally:
s.close()
op.sendmsg_afalg(op=socket.ALG_OP_ENCRYPT,
iv=iv,
assoclen=assoclen,
flags=socket.MSG_MORE)
op.sendall(assoc, socket.MSG_MORE)
taglen = len(tag)
res = op.recv(len(msg) - taglen)
plaintext = res[assoclen:]
return plaintext
key = os.urandom(16)
iv = os.urandom(12)
assoc = os.urandom(16)
assoclen = len(assoc)
count = 1000000
plain = "tmp.rand"
enc_algo.setsockopt(socket.SOL_ALG,
socket.ALG_SET_KEY, key)
enc_algo.setsockopt(socket.SOL_ALG,
socket.ALG_SET_AEAD_AUTHSIZE,
None,
assoclen)
dec_algo.setsockopt(socket.SOL_ALG,
socket.ALG_SET_KEY, key)
dec_algo.setsockopt(socket.SOL_ALG,
socket.ALG_SET_AEAD_AUTHSIZE,
None,
assoclen)
enc_op, _ = enc_algo.accept()
dec_op, _ = dec_algo.accept()
st = os.fstat(pf.fileno())
psize = st.st_size
s = time.time()
for _ in range(count):
ciphertext, tag = encrypt(key, iv, assoc, 16, enc_op, pf, psize)
plaintext = decrypt(key, iv, assoc, tag, dec_op, ciphertext)
cost = time.time() - s
aesgcm = AESGCM(key)
s = time.time()
for _ in range(count):
pf.seek(0, 0)
plaintext = pf.read()
ciphertext = aesgcm.encrypt(iv, plaintext, assoc)
plaintext = aesgcm.decrypt(iv, ciphertext, assoc)
cost = time.time() - s
# clean up
os.remove(plain)
output:
$ python3 aes-gcm.py
total cost time: 15.317010641098022. [AF_ALG]
total cost time: 50.256704807281494. [cryptography]
class IP(Structure):
''' IP header Structure
struct ip {
u_char ip_hl:4; /* header_len */
u_char ip_v:4; /* version */
u_char ip_tos; /* type of service */
short ip_len; /* total len */
u_short ip_id; /* identification */
short ip_off; /* offset field */
u_char ip_ttl; /* time to live */
u_char ip_p; /* protocol */
u_short ip_sum; /* checksum */
struct in_addr ip_src; /* source */
struct in_addr ip_dst; /* destination */
};
'''
_fields_ = [("ip_hl" , c_ubyte, 4), # 4 bit
("ip_v" , c_ubyte, 4), # 1 byte
("ip_tos", c_uint8), # 2 byte
("ip_len", c_uint16), # 4 byte
("ip_id" , c_uint16), # 6 byte
("ip_off", c_uint16), # 8 byte
("ip_ttl", c_uint8), # 9 byte
host = '0.0.0.0'
s = socket.socket(socket.AF_INET,
socket.SOCK_RAW,
socket.IPPROTO_ICMP)
s.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)
s.bind((host, 0))
print("Sniffer start...")
try:
while True:
buf = s.recvfrom(65535)[0]
ip_header = IP(buf[:20])
print('{0}: {1} -> {2}'.format(ip_header.proto,
ip_header.src,
ip_header.dst))
except KeyboardInterrupt:
s.close()
output: (bash 1)
python sniffer.py
Sniffer start...
ICMP: 127.0.0.1 -> 127.0.0.1
ICMP: 127.0.0.1 -> 127.0.0.1
ICMP: 127.0.0.1 -> 127.0.0.1
output: (bash 2)
$ ping -c 3 localhost
PING localhost (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.063 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.087 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.159 ms
#!/usr/bin/env python3.6
"""
Based on RFC-793, the following figure shows the TCP header format:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Port | Destination Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data | |U|A|P|R|S|F| |
| Offset| Reserved |R|C|S|S|Y|I| Window |
| | |G|K|H|T|N|N| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Checksum | Urgent Pointer |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
struct tcphdr {
__be16 source;
__be16 dest;
__be32 seq;
__be32 ack_seq;
#if defined(__LITTLE_ENDIAN_BITFIELD)
__u16 res1:4,
doff:4,
fin:1,
syn:1,
rst:1,
psh:1,
ack:1,
urg:1,
ece:1,
cwr:1;
#elif defined(__BIG_ENDIAN_BITFIELD)
__u16 doff:4,
un = platform.system()
if un != "Linux":
print(f"{un} is not supported!")
sys.exit(1)
@contextmanager
def create_socket():
''' Create a TCP raw socket '''
s = socket.socket(socket.AF_INET,
socket.SOCK_RAW,
socket.IPPROTO_TCP)
try:
yield s
finally:
s.close()
try:
with create_socket() as s:
while True:
pkt, addr = s.recvfrom(65535)
doff = dr >> 4
fin = flags & 0x01
syn = flags & 0x02
rst = flags & 0x04
psh = flags & 0x08
ack = flags & 0x10
urg = flags & 0x20
ece = flags & 0x40
cwr = flags & 0x80
tcplen = (doff) * 4
h_size = iplen + tcplen
if not data:
continue
except KeyboardInterrupt:
pass
output:
$ python3.6 tcp.py
------------ TCP_HEADER --------------
Source Port: 38352
Destination Port: 8000
Sequence Number: 2907801591
Acknowledgment Number: 398995857
Data offset: 8
FIN: 0
SYN: 0
RST: 0
PSH: 8
ACK: 16
URG: 0
ECE: 0
CWR: 0
Window: 342
Checksum: 65142
Urgent Point: 0
--------------- DATA -----------------
b'GET / HTTP/1.1\r\nHost: localhost:8000\r\nUser-Agent: curl/7.47.0\r\nAccept: */*\r\n\r\n'
"""
Ethernet Packet Header
struct ethhdr {
unsigned char h_dest[ETH_ALEN]; /* destination eth addr */
unsigned char h_source[ETH_ALEN]; /* source ether addr */
__be16 h_proto; /* packet type ID field */
} __attribute__((packed));
struct arphdr {
uint16_t htype; /* Hardware Type */
uint16_t ptype; /* Protocol Type */
u_char hlen; /* Hardware Address Length */
u_char plen; /* Protocol Address Length */
uint16_t opcode; /* Operation Code */
u_char sha[6]; /* Sender hardware address */
u_char spa[4]; /* Sender IP address */
u_char tha[6]; /* Target hardware address */
u_char tpa[4]; /* Target IP address */
};
"""
import socket
import struct
import binascii
rawSocket = socket.socket(socket.AF_PACKET,
socket.SOCK_RAW,
socket.htons(0x0003))
while True:
packet = rawSocket.recvfrom(2048)
ethhdr = packet[0][0:14]
eth = struct.unpack("!6s6s2s", ethhdr)
arphdr = packet[0][14:42]
arp = struct.unpack("2s2s1s1s2s6s4s6s4s", arphdr)
# skip non-ARP packets
ethtype = eth[2]
if ethtype != '\x08\x06': continue
output:
$ python arp.py
-------------- ETHERNET_FRAME -------------
Dest MAC: ffffffffffff
Source MAC: f0257252f5ca
Type: 0806
--------------- ARP_HEADER ----------------
Hardware type: 0001
Protocol type: 0800
Hardware size: 06
Protocol size: 04
Opcode: 0001
Source MAC: f0257252f5ca
Source IP: 140.112.91.254
Dest MAC: 000000000000
Dest IP: 140.112.91.20
-------------------------------------------
3.3 Asyncio
Table of Contents
• Asyncio
– asyncio.run
– Future like object
– Future like object __await__ other task
– Patch loop runner _run_once
– Put blocking task into Executor
– Socket with asyncio
– Event Loop with polling
– Transport and Protocol
– Transport and Protocol with SSL
– Asynchronous Iterator
– What is asynchronous iterator
– Asynchronous context manager
– What is asynchronous context manager
– decorator @asynccontextmanager
– Simple asyncio connection pool
– Get domain name
– Gather Results
– Simple asyncio UDP echo server
– Simple asyncio Web server
– Simple HTTPS Web Server
– Simple HTTPS Web server (low-level api)
– TLS Upgrade
– Using sendfile
– Simple asyncio WSGI web server
3.3.1 asyncio.run
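asyncio.run (Python 3.7+) creates a fresh event loop, runs a coroutine to completion, and closes the loop; a minimal sketch:
import asyncio

async def main():
    await asyncio.sleep(0.1)
    return 'Hello asyncio.run'

# run the coroutine and print its return value
print(asyncio.run(main()))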
import asyncio
import socket
host = 'localhost'
port = 9527
loop = asyncio.get_event_loop()
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setblocking(False)
s.bind((host, port))
s.listen(10)
loop.create_task(server())
loop.run_forever()
loop.close()
output: (bash 1)
$ nc localhost 9527
Hello
Hello
output: (bash 2)
$ nc localhost 9527
World
World
# using selectors
# ref: PyCon 2015 - David Beazley
import asyncio
import socket
import selectors
from collections import deque
@asyncio.coroutine
def read_wait(s):
yield 'read_wait', s
@asyncio.coroutine
def write_wait(s):
yield 'write_wait', s
class Loop:
"""Simple loop prototype"""
def __init__(self):
self.ready = deque()
self.selector = selectors.DefaultSelector()
@asyncio.coroutine
def sock_accept(self, s):
yield from read_wait(s)
return s.accept()
@asyncio.coroutine
def sock_recv(self, c, mb):
yield from read_wait(c)
return c.recv(mb)
@asyncio.coroutine
def sock_sendall(self, c, m):
while m:
yield from write_wait(c)
nsent = c.send(m)
m = m[nsent:]
def run_forever(self):
while True:
            self._run_once()
def _run_once(self):
while not self.ready:
events = self.selector.select()
for k, _ in events:
self.ready.append(k.data)
self.selector.unregister(k.fileobj)
while self.ready:
self.cur_t = self.ready.popleft()
try:
op, *a = self.cur_t.send(None)
getattr(self, op)(*a)
except StopIteration:
pass
loop = Loop()
host = 'localhost'
port = 9527
s = socket.socket(
socket.AF_INET,
socket.SOCK_STREAM, 0)
s.setsockopt(
socket.SOL_SOCKET,
socket.SO_REUSEADDR, 1)
s.setblocking(False)
s.bind((host, port))
s.listen(10)
@asyncio.coroutine
def handler(c):
while True:
msg = yield from loop.sock_recv(c, 1024)
if not msg:
break
yield from loop.sock_sendall(c, msg)
c.close()
@asyncio.coroutine
def server():
while True:
c, addr = yield from loop.sock_accept(s)
loop.create_task(handler(c))
import asyncio
class EchoProtocol(asyncio.Protocol):
loop = asyncio.get_event_loop()
coro = loop.create_server(EchoProtocol, 'localhost', 5566)
server = loop.run_until_complete(coro)
try:
loop.run_forever()
except:
loop.run_until_complete(server.wait_closed())
finally:
loop.close()
output:
# console 1
$ nc localhost 5566
Hello
Hello
# console 2
$ nc localhost 5566
World
World
import asyncio
import ssl
def make_header():
head = b"HTTP/1.1 200 OK\r\n"
head += b"Content-Type: text/html\r\n"
head += b"\r\n"
return head
def make_body():
resp = b"<html>"
resp += b"<h1>Hello SSL</h1>"
resp += b"</html>"
return resp
sslctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
sslctx.load_cert_chain(
certfile="./root-ca.crt", keyfile="./root-ca.key"
)
class Service(asyncio.Protocol):
def connection_made(self, tr):
self.tr = tr
self.total = 0
try:
loop = asyncio.get_event_loop()
loop.run_until_complete(start())
finally:
loop.close()
output:
# ref: PEP-0492
# need Python >= 3.5
# ref: PEP-0492
# need Python >= 3.5
import asyncio
import socket
import uuid
class Transport:
self._loop = loop
self._host = host
self._port = port
self._sock = socket.socket(
socket.AF_INET, socket.SOCK_STREAM)
self._sock.setblocking(False)
self._uuid = uuid.uuid1()
def close(self):
if self._sock: self._sock.close()
@property
def alive(self):
ret = True if self._sock else False
return ret
@property
def uuid(self):
return self._uuid
class ConnectionPool:
def __await__(self):
for _c in self._conns:
yield from _c.connect().__await__()
return self
for _c in self._conns:
if _c.alive and not _c.used:
_c.used = True
fut.set_result(_c)
break
else:
loop.call_soon(self.getconn, fut)
return fut
def close(self):
for _c in self._conns:
_c.close()
# generate messages
msgs = ['coro_{}'.format(_).encode('utf-8') for _ in range(5)]
loop = asyncio.get_event_loop()
host = '127.0.0.1'
port = 9527
try:
loop.run_until_complete(main(loop, host, port))
except KeyboardInterrupt:
pass
finally:
loop.close()
output:
import asyncio
import ssl
path = ssl.get_default_verify_paths()
sslctx = ssl.SSLContext()
sslctx.verify_mode = ssl.CERT_REQUIRED
sslctx.check_hostname = True
sslctx.load_verify_locations(path.cafile)
# send request
w.write(req.encode())
# recv response
resp = ""
while True:
line = await r.readline()
if not line:
break
line = line.decode("utf-8")
resp += line
# close writer
w.close()
await w.wait_closed()
return resp
asyncio.run(main())
output:
$ python fetch.py
HTTP/1.1 301 Moved Permanently
import asyncio
import socket
loop = asyncio.get_event_loop()
host = 'localhost'
port = 3553
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setblocking(False)
sock.bind((host, port))
try:
data, addr = sock.recvfrom(n_bytes)
except (BlockingIOError, InterruptedError):
loop.add_reader(fd, recvfrom, loop, sock, n_bytes, fut, True)
else:
fut.set_result((data, addr))
return fut
try:
n = sock.sendto(data, addr)
except (BlockingIOError, InterruptedError):
loop.add_writer(fd, sendto, loop, sock, data, addr, fut, True)
else:
fut.set_result(n)
return fut
try:
loop.run_until_complete(udp_server(loop, sock))
finally:
loop.close()
output:
$ python3 udp_server.py
$ nc -u localhost 3553
Hello UDP
Hello UDP
import asyncio
import socket
host = 'localhost'
port = 9527
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setblocking(False)
s.bind((host, port))
s.listen(10)
loop = asyncio.get_event_loop()
def make_header():
header = b"HTTP/1.1 200 OK\r\n"
header += b"Content-Type: text/html\r\n"
header += b"\r\n"
return header
def make_body():
resp = b'<html>'
resp += b'<body><h3>Hello World</h3></body>'
resp += b'</html>'
return resp
try:
loop.run_until_complete(server(s, loop))
except KeyboardInterrupt:
pass
finally:
loop.close()
s.close()
# Then open browser with url: localhost:9527
import asyncio
import ssl
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain('crt.pem', 'key.pem')
writer.write(head + body)
writer.close()
asyncio.run(main('0.0.0.0', 8000))
import asyncio
import socket
import ssl
def make_header():
head = b'HTTP/1.1 200 OK\r\n'
head += b'Content-type: text/html\r\n'
head += b'\r\n'
return head
def make_body():
resp = b'<html>'
resp += b'<h1>Hello SSL</h1>'
resp += b'</html>'
return resp
sslctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
sslctx.load_cert_chain(certfile='./root-ca.crt',
keyfile='./root-ca.key')
loop.remove_reader(sock_fd)
loop.remove_writer(sock_fd)
waiter.set_result(None)
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(server(loop))
finally:
loop.close()
output:
# console 1
# console 2
$ curl https://localhost:4433 -v \
> --resolve localhost:4433:127.0.0.1 \
> --cacert ~/test/root-ca.crt
import asyncio
import ssl
class HttpClient(asyncio.Protocol):
def __init__(self, on_con_lost):
self.on_con_lost = on_con_lost
self.resp = b""
loop = asyncio.get_running_loop()
on_con_lost = loop.create_future()
await on_con_lost
new_tr.close()
asyncio.run(main())
output:
$ python3 --version
Python 3.7.0
$ python3 https.py
HTTP/1.1 200 OK
import asyncio
path = "index.html"
loop = asyncio.get_event_loop()
_ = await reader.read(1024)
tr.write(head)
await loop.sendfile(tr, f)
writer.close()
asyncio.run(main("0.0.0.0", 8000))
output:
# ref: PEP333
import asyncio
import socket
import io
import sys
host = 'localhost'
port = 9527
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setblocking(False)
s.bind((host, port))
s.listen(10)
loop = asyncio.get_event_loop()
class WSGIServer(object):
# make header
resp = 'HTTP/1.1 {0}\r\n'.format(status)
for header in resp_header:
resp += '{0}: {1}\r\n'.format(*header)
resp += '\r\n'
# make body
resp += '{0}'.format(data)
try:
await loop.sock_sendall(conn, str.encode(resp))
finally:
conn.close()
app = Flask(__name__)
@app.route('/hello')
def hello():
return Response("Hello WSGI",mimetype="text/plain")
3.4 Concurrency
Table of Contents
• Concurrency
– Execute a shell command
– Create a thread via “threading”
– Performance Problem - GIL
– Consumer and Producer
– Thread Pool Template
– Using multiprocessing ThreadPool
– Mutex lock
– Deadlock
– Implement “Monitor”
– Control primitive resources
– Ensure tasks has done
– Thread-safe priority queue
– Multiprocessing
– Custom multiprocessing map
– Graceful way to kill all child processes
– Simple round-robin scheduler
– Scheduler with blocking function
– PoolExecutor
– How to use ThreadPoolExecutor?
– What does “with ThreadPoolExecutor” work?
– Future Object
– Future error handling
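Creating threads with the threading module underlies most snippets in this section; a minimal sketch:
from threading import Thread

def task(n):
    # a trivial placeholder task
    print('thread {} running'.format(n))

threads = [Thread(target=task, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()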
class Worker(Thread):
def __init__(self,queue):
super(Worker, self).__init__()
self._q = queue
self.daemon = True
self.start()
def run(self):
while True:
f,args,kwargs = self._q.get()
try:
print(f(*args, **kwargs))
except Exception as e:
print(e)
self._q.task_done()
class ThreadPool(object):
def __init__(self, num_t=5):
self._q = Queue(num_t)
# Create Worker Thread
for _ in range(num_t):
Worker(self._q)
def add_task(self,f,*args,**kwargs):
self._q.put((f, args, kwargs))
def wait_complete(self):
self._q.join()
def fib(n):
if n <= 2:
return 1
return fib(n-1)+fib(n-2)
if __name__ == '__main__':
pool = ThreadPool()
pool = ThreadPool(10)
def profile(func):
def wrapper(*args, **kwargs):
print(func.__name__)
s = time.time()
func(*args, **kwargs)
e = time.time()
print("cost: {0}".format(e-s))
return wrapper
@profile
def pool_map():
res = pool.map(lambda x:x**2,
range(999999))
@profile
def ordinary_map():
res = map(lambda x:x**2,
range(999999))
pool_map()
ordinary_map()
output:
$ python test_threadpool.py
pool_map
cost: 0.562669038773
ordinary_map
cost: 0.38525390625
3.4.8 Deadlock
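A classic way to hit a deadlock is two threads taking the same pair of locks in opposite order; a minimal sketch (a hypothetical deadlock.py, run with python -i so the prompt below is available):
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def task1():
    with lock_a:
        time.sleep(1)        # let task2 grab lock_b first
        with lock_b:         # blocks forever: task2 holds lock_b
            pass

def task2():
    with lock_b:
        time.sleep(1)
        with lock_a:         # blocks forever: task1 holds lock_a
            pass

t1 = threading.Thread(target=task1)
t2 = threading.Thread(target=task2)
t1.start()
t2.start()

Both threads then stay stuck: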
>>> t1.isAlive()
True
>>> t2.isAlive()
True
Using RLock
class monitor(object):
lock = RLock()
def foo(self,tid):
with monitor.lock:
print("%d in foo" % tid)
time.sleep(5)
self.ker(tid)
def ker(self,tid):
with monitor.lock:
print("%d in ker" % tid)
m = monitor()
def task1(id):
m.foo(id)
def task2(id):
m.ker(id)
t1 = Thread(target=task1,args=(1,))
t2 = Thread(target=task2,args=(2,))
t1.start()
t2.start()
t1.join()
t2.join()
output:
$ python monitor.py
1 in foo
1 in ker
2 in ker
Using Semaphore
# limit resource to 3
sema = Semaphore(3)
def foo(tid):
with sema:
print("%d acquire sema" % tid)
wt = random()*5
time.sleep(wt)
print("%d release sema" % tid)
threads = []
for _t in range(5):
t = Thread(target=foo,args=(_t,))
threads.append(t)
t.start()
for _t in threads:
_t.join()
output:
python semaphore.py
0 acquire sema
1 acquire sema
2 acquire sema
0 release sema
3 acquire sema
2 release sema
4 acquire sema
1 release sema
4 release sema
3 release sema
Using ‘event’
e = Event()
def worker(id):
print("%d wait event" % id)
    e.wait()
    print("%d get event set" % id)
t1=Thread(target=worker,args=(1,))
t2=Thread(target=worker,args=(2,))
t3=Thread(target=worker,args=(3,))
t1.start()
t2.start()
t3.start()
e.set()
output:
python event.py
1 wait event
2 wait event
3 wait event
2 get event set
3 get event set
1 get event set
Using ‘condition’
import threading
import heapq
import time
import random
class PriorityQueue(object):
def __init__(self):
self._q = []
self._count = 0
self._cv = threading.Condition()
def __str__(self):
return str(self._q)
def __repr__(self):
return self._q
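    # put/pop are not shown in the listing; a sketch consistent with the
    # "wait..." lines in the output below, guarded by the condition variable:
    def put(self, item, priority):
        with self._cv:
            heapq.heappush(self._q, (-priority, self._count, item))
            self._count += 1
            self._cv.notify()

    def pop(self):
        with self._cv:
            while len(self._q) == 0:
                print("wait...")
                self._cv.wait()
            ret = heapq.heappop(self._q)[-1]
        return ret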
priq = PriorityQueue()
def producer():
while True:
print(priq.pop())
def consumer():
while True:
time.sleep(3)
print("consumer put value")
priority = random.random()
priq.put(priority,priority*10)
for _ in range(3):
priority = random.random()
priq.put(priority,priority*10)
t1=threading.Thread(target=producer)
t2=threading.Thread(target=consumer)
t1.start();t2.start()
t1.join();t2.join()
output:
python3 thread_safe.py
0.6657491871045683
0.5278797439991247
0.20990624606296315
wait...
consumer put value
0.09123101305407577
wait...
3.4.13 Multiprocessing
from multiprocessing import Process, Pipe
from itertools import izip

def spawn(f):
def fun(pipe,x):
pipe.send(f(x))
pipe.close()
return fun
def parmap(f,X):
pipe=[Pipe() for x in X]
proc=[Process(target=spawn(f),
args=(c,x))
for x,(p,c) in izip(X,pipe)]
[p.start() for p in proc]
[p.join() for p in proc]
return [p.recv() for (p,c) in pipe]
print(parmap(lambda x:x**x,range(1,5)))
import signal
import os
import time
NUM_PROCESS = 10
def aurora(n):
while True:
time.sleep(n)
if __name__ == "__main__":
procs = [Process(target=aurora, args=(x,))
for x in range(NUM_PROCESS)]
try:
for p in procs:
p.daemon = True
p.start()
[p.join() for p in procs]
finally:
for p in procs:
if not p.is_alive(): continue
os.kill(p.pid, signal.SIGKILL)
tasks = deque()
r_wait = {}
s_wait = {}
def fib(n):
if n <= 2:
return 1
return fib(n-1)+fib(n-2)
def run():
while any([tasks,r_wait,s_wait]):
while not tasks:
# polling
rr, sr, _ = select(r_wait, s_wait, {})
for _ in rr:
tasks.append(r_wait.pop(_))
for _ in sr:
tasks.append(s_wait.pop(_))
try:
task = tasks.popleft()
why, what = task.next()
if why == 'recv':
r_wait[what] = task
elif why == 'send':
s_wait[what] = task
else:
raise RuntimeError
except StopIteration:
pass
def fib_server():
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('localhost',5566))
sock.listen(5)
while True:
yield 'recv', sock
c, a = sock.accept()
tasks.append(fib_handler(c))
def fib_handler(client):
while True:
yield 'recv', client
req = client.recv(1024)
if not req:
break
resp = fib(int(req))
yield 'send', client
client.send(str(resp)+'\n')
client.close()
tasks.append(fib_server())
run()
output: (bash 1)
$ nc localhost 5566
20
6765
output: (bash 2)
$ nc localhost 5566
10
55
3.4.18 PoolExecutor
# demo GIL
from concurrent import futures
import time
def fib(n):
if n <= 2:
return 1
return fib(n-1) + fib(n-2)
def thread():
s = time.time()
with futures.ThreadPoolExecutor(2) as e:
res = e.map(fib, [35]*2)
for _ in res:
print(_)
e = time.time()
print("thread cost: {}".format(e-s))
def process():
s = time.time()
with futures.ProcessPoolExecutor(2) as e:
res = e.map(fib, [35]*2)
for _ in res:
print(_)
e = time.time()
print("pocess cost: {}".format(e-s))
def fib(n):
if n <= 2:
return 1
return fib(n - 1) + fib(n - 2)
print(res)
output:
$ python3 thread_pool_ex.py
[832040, 1346269, 2178309]
def fib(n):
if n <= 2:
return 1
return fib(n-1) + fib(n-2)
with futures.ThreadPoolExecutor(3) as e:
fut = e.submit(fib, 30)
res = fut.result()
print(res)
# equal to
e = futures.ThreadPoolExecutor(3)
fut = e.submit(fib, 30)
fut.result()
e.shutdown(wait=True)
print(res)
output:
$ python3 thread_pool_exec.py
832040
832040
def fib(n):
if n <= 2:
return 1
return fib(n-1) + fib(n-2)
def handler(future):
res = future.result()
print("res: {}".format(res))
def thread_v1():
with futures.ThreadPoolExecutor(3) as e:
for _ in range(3):
f = e.submit(fib, 30+_)
f.add_done_callback(handler)
print("end")
def thread_v2():
to_do = []
with futures.ThreadPoolExecutor(3) as e:
for _ in range(3):
fut = e.submit(fib, 30+_)
to_do.append(fut)
for _f in futures.as_completed(to_do):
res = _f.result()
print("res: {}".format(res))
print("end")
output:
$ python3 -i fut.py
>>> thread_v1()
res: 832040
res: 1346269
res: 2178309
end
>>> thread_v2()
res: 832040
res: 1346269
res: 2178309
end
def spam():
raise RuntimeError
def handler(future):
print("callback handler")
try:
res = future.result()
except RuntimeError:
print("get RuntimeError")
def thread_spam():
with futures.ThreadPoolExecutor(2) as e:
f = e.submit(spam)
f.add_done_callback(handler)
output:
$ python -i fut_err.py
>>> thread_spam()
callback handler
get RuntimeError
3.5 SQLAlchemy
Table of Contents
• SQLAlchemy
– Set a database URL
– Sqlalchemy Support DBAPI - PEP249
– Transaction and Connect Object
– Metadata - Generating Database Schema
– Inspect - Get Database Information
– Reflection - Loading Table from Existing Database
– Print Create Table Statement with Indexes (SQL DDL)
– Get Table from MetaData
– Create all Tables Store in “MetaData”
– Create Specific Table
– Create table with same columns
– Drop a Table
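Building the URL with sqlalchemy.engine.url.URL, as the snippets later in this chapter do, can be sketched as follows; the credentials are placeholders matching the output below:
from sqlalchemy.engine.url import URL

db = {'drivername': 'postgres',
      'username': 'postgres',
      'password': 'postgres',
      'host': '192.168.99.100',
      'port': 5432}

print(URL(**db))              # driver://user:password@host:port
print('sqlite:///db.sqlite')  # a plain string URL works as well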
output:
$ python sqlalchemy_url.py
postgres://postgres:[email protected]:5432
sqlite:///db.sqlite
db_uri = "sqlite:///db.sqlite"
engine = create_engine(db_uri)
# DBAPI - PEP249
# create table
engine.execute('CREATE TABLE "EX1" ('
'id INTEGER NOT NULL,'
'name VARCHAR, '
'PRIMARY KEY (id));')
# insert a row
engine.execute('INSERT INTO "EX1" '
'(id, name) '
'VALUES (1,"raw1")')
# select *
result = engine.execute('SELECT * FROM '
'"EX1"')
for _r in result:
print(_r)
# delete *
engine.execute('DELETE from "EX1" where id=1;')
result = engine.execute('SELECT * FROM "EX1"')
print(result.fetchall())
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
# Create connection
conn = engine.connect()
# Begin transaction
trans = conn.begin()
conn.execute('INSERT INTO "EX1" (name) '
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
inspector = inspect(engine)
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
meta = MetaData()
example_table = Table('Example',meta,
Column('id', Integer, primary_key=True),
Column('name', String(10), index=True))
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri, strategy='mock', executor=metadata_dump)
meta.create_all(bind=engine, tables=[example_table])
output:
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
# Get Table
ex_table = metadata.tables['Example']
print(ex_table)
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
meta = MetaData(engine)
t2 = Table('EX2', meta,
Column('id',Integer, primary_key=True),
Column('val',Integer))
# Create all tables in meta
meta.create_all()
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
meta = MetaData(engine)
t1 = Table('Table_1', meta,
Column('id', Integer, primary_key=True),
Column('name',String))
t2 = Table('Table_2', meta,
Column('id', Integer, primary_key=True),
Column('val',Integer))
t1.create()
db_url = "sqlite://"
engine = create_engine(db_url)
Base = declarative_base()
class TemplateTable(object):
id = Column(Integer, primary_key=True)
name = Column(String)
age = Column(Integer)
Base.metadata.create_all(bind=engine)
table.create(engine)
inspector = inspect(engine)
print('Test' in inspector.get_table_names())
table.drop(engine)
inspector = inspect(engine)
print('Test' in inspector.get_table_names())
output:
$ python sqlalchemy_drop.py
True
False
meta = MetaData()
t = Table('ex_table', meta,
Column('id', Integer, primary_key=True),
Column('key', String),
Column('val', Integer))
# Get Table Name
print(t.name)
# Get Columns
# Get Column
c = t.c.key
print(c.name)
# Or
c = t.columns.key
print(c.name)
meta = MetaData()
table = Table('example', meta,
Column('id', Integer, primary_key=True),
Column('l_name', String),
Column('f_name', String))
# sql expression binary object
print(repr(table.c.l_name == 'ed'))
# exhbit sql expression
print(str(table.c.l_name == 'ed'))
print(repr(table.c.f_name != 'ed'))
# comparison operator
print(repr(table.c.id > 3))
# or expression
print((table.c.id > 5) | (table.c.id < 2))
# Equal to
print(or_(table.c.id > 5, table.c.id < 2))
# + means "addition"
print(table.c.id + 5)
# or means "string concatenation"
# in expression
print(table.c.l_name.in_(['a','b']))
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
# create table
meta = MetaData(engine)
table = Table('user', meta,
Column('id', Integer, primary_key=True),
Column('l_name', String),
Column('f_name', String))
meta.create_all()
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
conn = engine.connect()
# or equal to
select_st = table.select().where(
table.c.l_name == 'Hello')
res = conn.execute(select_st)
for _row in res:
print(_row)
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
meta = MetaData(engine)
meta.reflect()
email_t = Table('email_addr', meta,
# insert
conn = engine.connect()
conn.execute(email_t.insert(),[
{'email':'ker@test','name':'Hi'},
{'email':'yo@test','name':'Hello'}])
# join statement
join_obj = user_t.join(email_t,
email_t.c.name == user_t.c.l_name)
# using select_from
sel_st = select(
[user_t.c.l_name, email_t.c.email]).select_from(join_obj)
res = conn.execute(sel_st)
for _row in res:
print(_row)
import io
from datetime import date
# create table
meta = MetaData(engine)
table = Table('userinfo', meta,
# generate rows
for i in range(100):
line = '\t'.join(
[
f'Name {i}', # first_name
str(18 + i), # age
str(date.today()), # birth_day
]
)
datafile.write(line + '\n')
# create table
meta = MetaData(engine)
table = Table('userinfo', meta,
Column('id', Integer, primary_key=True),
Column('first_name', String),
Column('age', Integer),
)
meta.create_all()
# generate rows
data = [{'first_name': f'Name {i}', 'age': 18+i} for i in range(10)]
stmt = table.insert().values(data).returning(table.c.id)
# converted into SQL:
# INSERT INTO userinfo (first_name, age) VALUES
# (%(first_name_m0)s, %(age_m0)s), (%(first_name_m1)s, %(age_m1)s),
# (%(first_name_m2)s, %(age_m2)s), (%(first_name_m3)s, %(age_m3)s),
# (%(first_name_m4)s, %(age_m4)s), (%(first_name_m5)s, %(age_m5)s),
# (%(first_name_m6)s, %(age_m6)s), (%(first_name_m7)s, %(age_m7)s),
# (%(first_name_m8)s, %(age_m8)s), (%(first_name_m9)s, %(age_m9)s)
# RETURNING userinfo.id
for rowid in engine.execute(stmt).fetchall():
print(rowid['id'])
output:
$ python sqlalchemy_bulk.py
1
2
3
4
5
6
7
8
9
10
# create table
meta = MetaData(engine)
table = Table('userinfo', meta,
Column('id', Integer, primary_key=True),
Column('first_name', String),
Column('birth_year', Integer),
)
meta.create_all()
# update data
data = [
{'_id': 1, 'first_name': 'Johnny', 'birth_year': 1975},
{'_id': 2, 'first_name': 'Jim', 'birth_year': 1973},
{'_id': 3, 'first_name': 'Kaley', 'birth_year': 1985},
{'_id': 4, 'first_name': 'Simon', 'birth_year': 1980},
{'_id': 5, 'first_name': 'Kunal', 'birth_year': 1981},
{'_id': 6, 'first_name': 'Mayim', 'birth_year': 1975},
{'_id': 7, 'first_name': 'Melissa', 'birth_year': 1980},
]
engine.execute(stmt, data)
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
conn = engine.connect()
meta = MetaData(engine)
meta.reflect()
Modal = declarative_base()
class Example(Modal):
__tablename__ = "ex_t"
id = Column(Integer, primary_key=True)
name = Column(String(20))
db_uri = 'sqlite:///db.sqlite'
engine = create_engine(db_uri)
Modal.metadata.create_all(engine)
db = {'drivername': 'postgres',
'username': 'postgres',
'password': 'postgres',
'host': '192.168.99.100',
'port': 5432}
url = URL(**db)
engine = create_engine(url)
metadata = MetaData()
metadata.reflect(bind=engine)
inspector = inspect(engine)
print(inspector.get_table_names())
output:
$ python sqlalchemy_create.py
[u'table1', u'table2', u'table3']
engine = create_engine(URL(**db_url))
create_table('Table1',
Column('id', Integer, primary_key=True),
Column('name', String))
create_table('Table2',
Column('id', Integer, primary_key=True),
Column('key', String),
Column('val', String))
inspector = inspect(engine)
for _t in inspector.get_table_names():
print(_t)
output:
$ python sqlalchemy_dynamic.py
Table1
Table2
Base = declarative_base()
class TestTable(Base):
__tablename__ = 'Test Table'
id = Column(Integer, primary_key=True)
key = Column(String, nullable=False)
val = Column(String)
date = Column(DateTime, default=datetime.utcnow)
# create tables
Base.metadata.create_all(bind=engine)
# create session
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
class TestTable(Base):
__tablename__ = 'Test Table'
id = Column(Integer, primary_key=True)
key = Column(String, nullable=False)
val = Column(String)
date = Column(DateTime, default=datetime.utcnow)
# create tables
Base.metadata.create_all(bind=engine)
# create session
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
try:
# add row to database
row = TestTable(key="hello", val="world")
session.add(row)
session.commit()
output:
$ python sqlalchemy_update.py
original: hello world
update: Hello World
class TestTable(Base):
__tablename__ = 'Test Table'
id = Column(Integer, primary_key=True)
key = Column(String, nullable=False)
val = Column(String)
date = Column(DateTime, default=datetime.utcnow)
# create tables
Base.metadata.create_all(bind=engine)
# create session
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
output:
$ python sqlalchemy_delete.py
<__main__.TestTable object at 0x104eb8f50>
[]
Base = declarative_base()
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String)
addresses = relationship("Address", backref="user")
class Address(Base):
__tablename__ = 'address'
id = Column(Integer, primary_key=True)
email = Column(String)
user_id = Column(Integer, ForeignKey('user.id'))
u1 = User()
a1 = Address()
print(u1.addresses)
print(a1.user)
u1.addresses.append(a1)
print(u1.addresses)
print(a1.user)
output:
$ python sqlalchemy_relationship.py
[]
None
[<__main__.Address object at 0x10c4edb50>]
<__main__.User object at 0x10c4ed810>
import json
base = declarative_base()
class Node(base):
__tablename__ = 'node'
id = Column(Integer, primary_key=True)
label = Column(String)
friends = relationship('Node',
secondary=association,
primaryjoin=id==association.c.left,
secondaryjoin=id==association.c.right,
backref='left')
def to_json(self):
return dict(id=self.id,
friends=[_.label for _ in self.friends])
print('----> right')
print(json.dumps([_.to_json() for _ in nodes], indent=2))
print('----> left')
print(json.dumps([_n.to_json() for _n in nodes[1].left], indent=2))
output:
----> right
[
{
"friends": [
"node_1",
Base = declarative_base()
class User(Base):
__tablename__ = 'User'
id = Column(Integer, primary_key=True)
    name = Column(String)
    fullname = Column(String)
    birth = Column(DateTime)
# create tables
engine = create_engine(URL(**db_url))
Base.metadata.create_all(bind=engine)
users = [
User(name='ed',
fullname='Ed Jones',
birth=datetime(1989,7,1)),
User(name='wendy',
fullname='Wendy Williams',
birth=datetime(1983,4,1)),
User(name='mary',
fullname='Mary Contrary',
birth=datetime(1990,1,30)),
User(name='fred',
fullname='Fred Flinstone',
birth=datetime(1977,3,12)),
User(name='justin',
fullname="Justin Bieber")]
# create session
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
# add_all
session.add_all(users)
session.commit()
print("----> order_by(id):")
query = session.query(User).order_by(User.id)
for _row in query.all():
print(_row.name, _row.fullname, _row.birth)
print("\n----> order_by(desc(id)):")
query = session.query(User).order_by(desc(User.id))
for _row in query.all():
print(_row.name, _row.fullname, _row.birth)
print("\n----> order_by(date):")
query = session.query(User).order_by(User.birth)
for _row in query.all():
print(_row.name, _row.fullname, _row.birth)
print("\n----> EQUAL:")
query = session.query(User).filter(User.id == 2)
_row = query.first()
print(_row.name, _row.fullname, _row.birth)
print("\n----> IN:")
query = session.query(User).filter(User.name.in_(['ed', 'wendy']))
for _row in query.all():
print(_row.name, _row.fullname, _row.birth)
print("\n----> AND:")
query = session.query(User).filter(
User.name=='ed', User.fullname=='Ed Jones')
_row = query.first()
print(_row.name, _row.fullname, _row.birth)
print("\n----> OR:")
query = session.query(User).filter(
or_(User.name=='ed', User.name=='wendy'))
for _row in query.all():
print(_row.name, _row.fullname, _row.birth)
print("\n----> NULL:")
query = session.query(User).filter(User.birth == None)
for _row in query.all():
print(_row.name, _row.fullname)
print("\n----> LIKE")
query = session.query(User).filter(User.name.like('%ed%'))
for _row in query.all():
print(_row.name, _row.fullname)
output:
----> order_by(id):
ed Ed Jones 1989-07-01 00:00:00
wendy Wendy Williams 1983-04-01 00:00:00
mary Mary Contrary 1990-01-30 00:00:00
fred Fred Flinstone 1977-03-12 00:00:00
justin Justin Bieber None
----> order_by(date):
fred Fred Flinstone 1977-03-12 00:00:00
wendy Wendy Williams 1983-04-01 00:00:00
ed Ed Jones 1989-07-01 00:00:00
mary Mary Contrary 1990-01-30 00:00:00
justin Justin Bieber None
----> EQUAL:
wendy Wendy Williams 1983-04-01 00:00:00
----> IN:
ed Ed Jones 1989-07-01 00:00:00
wendy Wendy Williams 1983-04-01 00:00:00
----> AND:
ed Ed Jones 1989-07-01 00:00:00
----> OR:
ed Ed Jones 1989-07-01 00:00:00
wendy Wendy Williams 1983-04-01 00:00:00
----> NULL:
justin Justin Bieber
----> LIKE
ed Ed Jones
fred Fred Flinstone
meta = MetaData(bind=engine)
class Address(object):
def __init__(self, email):
self.email = email
# create table
meta.create_all()
# create session
Session = sessionmaker()
# query result
u = session.query(User).filter(User.name == 'Hello').first()
print(u.name, u.fullname, u.password)
finally:
session.close()
output:
$ python map_table_class.py
Hello HelloWorld ker
db_url = "sqlite://"
engine = create_engine(db_url)
metadata = MetaData(engine)
class TableTemp(object):
def __init__(self, name):
self.name = name
def get_table(name):
if name in metadata.tables:
table = metadata.tables[name]
else:
Session = scoped_session(sessionmaker(bind=engine))
try:
Session.add(t(name='foo'))
Session.add(t(name='bar'))
for _ in Session.query(t).all():
print(_.name)
except Exception as e:
Session.rollback()
finally:
Session.close()
output:
$ python get_table.py
foo
bar
Base = declarative_base()
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String)
addresses = relationship("Address", backref="user")
class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    email = Column(String)
    user_id = Column(Integer, ForeignKey('user.id'))
# create engine
engine = create_engine(URL(**db_url))
# create tables
Base.metadata.create_all(bind=engine)
# create session
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
user = User(name='user1')
mail1 = Address(email='[email protected]')
mail2 = Address(email='[email protected]')
user.addresses.extend([mail1, mail2])
session.add(user)
session.add_all([mail1, mail2])
session.commit()
output:
$ python sqlalchemy_join.py
user1 [email protected]
user1 [email protected]
db_url = 'sqlite://'
engine = create_engine(db_url)
Base = declarative_base()
class Parent(Base):
__tablename__ = 'parent'
id = Column(Integer, primary_key=True)
name = Column(String)
children = relationship('Child', back_populates='parent')
class Child(Base):
__tablename__ = 'child'
id = Column(Integer, primary_key=True)
name = Column(String)
parent_id = Column(Integer, ForeignKey('parent.id'))
parent = relationship('Parent', back_populates='children')
Base.metadata.create_all(bind=engine)
Session = scoped_session(sessionmaker(bind=engine))
p1 = Parent(name="Alice")
p2 = Parent(name="Bob")
c1 = Child(name="foo")
c2 = Child(name="bar")
c3 = Child(name="ker")
c4 = Child(name="cat")
try:
Session.add(p1)
Session.add(p2)
Session.commit()
# print result
for _p, _c in q.all():
output:
$ python join_group_by.py
parent: Alice, num_child: 3
parent: Bob, num_child: 1
engine = create_engine(URL(**db_url))
Base = declarative_base()
create_table('Table1', {
'__tablename__': 'Table1',
'id': Column(Integer, primary_key=True),
'name': Column(String)})
create_table('Table2', {
'__tablename__': 'Table2',
'id': Column(Integer, primary_key=True),
'key': Column(String),
'val': Column(String)})
inspector = inspect(engine)
for _t in inspector.get_table_names():
print(_t)
output:
$ python sqlalchemy_dynamic_orm.py
Table1
Table2
engine = create_engine('sqlite://')
base = declarative_base()
@event.listens_for(engine, 'engine_disposed')
def receive_engine_disposed(engine):
print("engine dispose")
class Table(base):
__tablename__ = 'example table'
id = Column(Integer, primary_key=True)
base.metadata.create_all(bind=engine)
session = sessionmaker(bind=engine)()
try:
try:
row = Table()
session.add(row)
except Exception as e:
session.rollback()
raise
finally:
session.close()
finally:
engine.dispose()
output:
$ python db_dispose.py
engine dispose
Warning: Be careful. Closing a session does not mean closing the database connection. A SQLAlchemy session generally represents a transaction, not a connection.
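A minimal sketch of the difference (assuming an in-memory SQLite engine; the names are illustrative): session.close() only ends the transaction and returns the underlying DBAPI connection to the engine's pool, while engine.dispose() is what actually closes the pooled connections.
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite://")
Session = sessionmaker(bind=engine)

session = Session()
print(session.execute(text("SELECT 1")).scalar())  # 1
session.close()   # transaction ends; the connection is checked back into the pool
engine.dispose()  # the connections held by the pool are actually closed here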
url = 'sqlite://'
engine = create_engine(url)
base = declarative_base()
class Table(base):
__tablename__ = 'table'
id = Column(Integer, primary_key=True)
key = Column(String)
val = Column(String)
base.metadata.create_all(bind=engine)
session = sessionmaker(bind=engine)()
try:
t = Table(key="key", val="val")
try:
print(t.key, t.val)
session.add(t)
session.commit()
except Exception as e:
print(e)
session.rollback()
finally:
session.close()
output:
$ python sql.py
key val
Cannot use the object after closing the session
3.5.38 Hooks
Base = declarative_base()
class User(Base):
__tablename__ = "user"
id = Column(Integer, primary_key=True)
name = Column(String)
age = Column(Integer)
url = "sqlite:///:memory:"
engine = create_engine(url)
Base.metadata.create_all(bind=engine)
Session = sessionmaker(bind=engine)
@event.listens_for(User, "before_insert")
def before_insert(mapper, connection, user):
print(f"before insert: {user.name}")
@event.listens_for(User, "after_insert")
def after_insert(mapper, connection, user):
print(f"after insert: {user.name}")
try:
session = scoped_session(Session)
user = User(name="bob", age=18)
session.add(user)
session.commit()
except SQLAlchemyError as e:
session.rollback()
finally:
session.close()
3.6 Security
Table of Contents
• Security
– Simple https server
– Generate an SSH key pair
– Get certificate information
– Generate a self-signed certificate
– Prepare a Certificate Signing Request (csr)
– Generate RSA keyfile without passphrase
– Sign a file by a given private key
– Verify a file from a signed digest
– Simple RSA encrypt via pem file
– Simple RSA encrypt via RSA module
– Simple RSA decrypt via pem file
– Simple RSA encrypt with OAEP
– Simple RSA decrypt with OAEP
– Using DSA for proof of identity
– Using AES CBC mode encrypt a file
– Using AES CBC mode decrypt a file
– AES CBC mode encrypt via password (using cryptography)
– AES CBC mode decrypt via password (using cryptography)
– AES CBC mode encrypt via password (using pycrypto)
– AES CBC mode decrypt via password (using pycrypto)
– Ephemeral Diffie Hellman Key Exchange via cryptography
– Calculate DH shared key manually via cryptography
– Calculate DH shared key from (p, g, pubkey)
# python2
# python3
key = rsa.generate_private_key(
backend=default_backend(),
public_exponent=65537,
key_size=2048
)
private_key = key.private_bytes(
serialization.Encoding.PEM,
serialization.PrivateFormat.PKCS8,
serialization.NoEncryption(),
)
public_key = key.public_key().public_bytes(
serialization.Encoding.OpenSSH,
serialization.PublicFormat.OpenSSH
)
backend = default_backend()
with open('./cert.crt', 'rb') as f:
crt_data = f.read()
cert = x509.load_pem_x509_certificate(crt_data, backend)
class Certificate:
_fields = ['country_name',
'state_or_province_name',
'locality_name',
'organization_name',
'organizational_unit_name',
'common_name',
'email_address']
cert = Certificate(cert)
for attr in cert._fields:
for info in getattr(cert, attr):
print("{}: {}".format(info._oid._name, info._value))
output:
now = datetime.now()
expire = now + timedelta(days=365)
# country (countryName, C)
# state or province name (stateOrProvinceName, ST)
# locality (locality, L)
# organization (organizationName, O)
# organizational unit (organizationalUnitName, OU)
# common name (commonName, CN)
cert = crypto.X509()
cert.get_subject().C = "TW"
cert.get_subject().ST = "Taiwan"
cert.get_subject().L = "Taipei"
cert.get_subject().O = "pysheeet"
cert.get_subject().OU = "cheat sheet"
cert.get_subject().CN = "pythonsheets.com"
cert.set_serial_number(1000)
cert.set_notBefore(now.strftime("%Y%m%d%H%M%SZ").encode())
cert.set_notAfter(expire.strftime("%Y%m%d%H%M%SZ").encode())
cert.set_issuer(cert.get_subject())
cert.set_pubkey(k)
cert.sign(k, 'sha1')
output:
alt_name = [ b"DNS:www.pythonsheeets.com",
b"DNS:doc.pythonsheeets.com" ]
key_usage = [ b"Digital Signature",
b"Non Repudiation",
b"Key Encipherment" ]
# country (countryName, C)
# state or province name (stateOrProvinceName, ST)
# locality (locality, L)
# organization (organizationName, O)
# organizational unit (organizationalUnitName, OU)
# common name (commonName, CN)
req.get_subject().C = "TW"
req.get_subject().ST = "Taiwan"
req.get_subject().L = "Taipei"
req.get_subject().O = "pysheeet"
req.get_subject().OU = "cheat sheet"
req.get_subject().CN = "pythonsheets.com"
req.add_extensions([
crypto.X509Extension( b"basicConstraints",
False,
req.set_pubkey(key)
req.sign(key, "sha256")
output:
# create a root ca
$ openssl genrsa -out ca-key.pem 2048
Generating RSA private key, 2048 bit long modulus
.....+++
.......................................+++
e is 65537 (0x10001)
$ openssl req -x509 -new -nodes -key ca-key.pem \
> -days 10000 -out ca.pem -subj "/CN=root-ca"
# prepare a csr
$ openssl genrsa -out key.pem 2048
Generating RSA private key, 2048 bit long modulus
....+++
......................................+++
e is 65537 (0x10001)
$ python3 x509.py
# prepare openssl.cnf
cat <<EOF > openssl.cnf
> [req]
> req_extensions = v3_req
> distinguished_name = req_distinguished_name
> [req_distinguished_name]
> [ v3_req ]
> basicConstraints = CA:FALSE
> keyUsage = nonRepudiation, digitalSignature, keyEncipherment
> subjectAltName = @alt_names
> [alt_names]
> DNS.1 = www.pythonsheets.com
> DNS.2 = doc.pythonsheets.com
> EOF
# sign a csr
$ openssl x509 -req -in cert.csr -CA ca.pem \
> -CAkey ca-key.pem -CAcreateserial -out cert.pem \
# check
$ openssl x509 -in cert.pem -text -noout
output:
$ python3 sign.py
$ openssl dgst -sha256 -verify public.key -signature foo.tgz.sha256 foo.tgz
Verified OK
import sys
digest.update(data)
return signer.verify(digest, sig)
output:
# do verification
$ cat /dev/urandom | head -c 512 | base64 > foo.txt
$ tar -zcf foo.tgz foo.txt
$ openssl dgst -sha256 -sign private.key -out foo.tgz.sha256 foo.tgz
$ python3 verify.py
Verified OK
import base64
import sys
key_text = sys.stdin.read()
# encrypt
cipher_text = cipher.encrypt(b"Hello RSA!")
# do base64 encode
cipher_text = base64.b64encode(cipher_text)
print(cipher_text.decode('utf-8'))
output:
import base64
import sys
# encrypt
cipher_text = cipher.encrypt(b"Hello RSA!")
# do base64 encode
cipher_text = base64.b64encode(cipher_text)
print(cipher_text.decode('utf-8'))
output:
import base64
import sys
# decode base64
cipher_text = base64.b64decode(sys.stdin.read())
# decrypt
plain_text = cipher.decrypt(cipher_text, None)
print(plain_text.decode('utf-8').strip())
output:
import base64
import sys
output:
$ openssl genrsa -out private.key 2048
$ openssl rsa -in private.key -pubout -out public.key
$ cat public.key |\
> python3 rsa.py |\
> openssl base64 -d -A |\
> openssl rsautl -decrypt -oaep -inkey private.key
Hello RSA OAEP!
import base64
import sys
# decode base64
cipher_text = base64.b64decode(sys.stdin.read())
# decrypt
plain_text = cipher.decrypt(cipher_text)
print(plain_text.decode('utf-8').strip())
output:
$ openssl genrsa -out private.key 2048
$ openssl rsa -in private.key -pubout -out public.key
import socket
def gen_dsa_key():
private_key = dsa.generate_private_key(
key_size=2048, backend=default_backend())
return private_key, private_key.public_key()
# an attacker modifying the msg will make the msg check fail
verify_data(b"I'm attacker!", bob_recv_signature, alice_public_key)
output:
$ python3 test_dsa.py
check msg: b'Hello Bob' success!
recv msg: b"I'm attacker!" not trust!
import struct
import sys
import os
backend = default_backend()
key = os.urandom(32)
iv = os.urandom(16)
def encrypt(ptext):
pad = padding.PKCS7(128).padder()
ptext = pad.update(ptext) + pad.finalize()
alg = algorithms.AES(key)
mode = modes.CBC(iv)
cipher = Cipher(alg, mode, backend=backend)
encryptor = cipher.encryptor()
ctext = encryptor.update(ptext) + encryptor.finalize()
return ctext
print("key: {}".format(key.hex()))
print("iv: {}".format(iv.hex()))
if len(sys.argv) != 3:
raise Exception("usage: cmd [file] [enc file]")
# encrypt file
ciphertext = encrypt(plaintext)
with open(sys.argv[2], 'wb') as f:
f.write(ciphertext)
output:
import struct
import sys
import os
backend = default_backend()
return ptext
if len(sys.argv) != 4:
raise Exception("usage: cmd [key] [iv] [file]")
# decrypt file
key, iv = unhexlify(sys.argv[1]), unhexlify(sys.argv[2])
plaintext = decrypt(key, iv, ciphertext)
print(plaintext)
output:
import base64
import struct
import sys
import os
backend = default_backend()
# generate salt
salt = os.urandom(8)
# pad plaintext
pad = padding.PKCS7(128).padder()
ptext = pad.update(ptext) + pad.finalize()
# create an encryptor
alg = algorithms.AES(key)
mode = modes.CBC(iv)
cipher = Cipher(alg, mode, backend=backend)
encryptor = cipher.encryptor()
# encode base64
ctext = base64.b64encode(ctext)
return ctext
md = globals()[sys.argv[1]]
plaintext = sys.stdin.read().encode('utf-8')
pwd = b"password"
output:
import base64
import struct
import sys
import os
backend = default_backend()
# check magic
if ctext[:8] != b'Salted__':
raise Exception("bad magic number")
# get salt
salt = ctext[8:16]
# decrypt
alg = algorithms.AES(key)
mode = modes.CBC(iv)
cipher = Cipher(alg, mode, backend=backend)
decryptor = cipher.decryptor()
ptext = decryptor.update(ctext[16:]) + decryptor.finalize()
# unpad plaintext
unpadder = padding.PKCS7(128).unpadder() # 128 bit
ptext = unpadder.update(ptext) + unpadder.finalize()
return ptext.strip()
md = globals()[sys.argv[1]]
ciphertext = sys.stdin.read().encode('utf-8')
pwd = b"password"
output:
import struct
import base64
import sys
# generate salt
salt = struct.pack('=Q', getrandbits(64))
# pad plaintext
plaintext = pad(plaintext)
# ref: openssl/apps/enc.c
ciphertext = b'Salted__' + salt + cipher.encrypt(plaintext)
# encode base64
ciphertext = base64.b64encode(ciphertext)
return ciphertext
md = globals()[sys.argv[1]]
plaintext = sys.stdin.read().encode('utf-8')
pwd = b"password"
output:
import struct
import base64
import sys
# check magic
if ciphertext[:8] != b'Salted__':
raise Exception("bad magic number")
# get salt
salt = ciphertext[8:16]
# get key, iv
key, iv = EVP_ByteToKey(pwd, md, salt, 32, 16)
# decrypt
cipher = AES.new(key, AES.MODE_CBC, iv)
return unpad(cipher.decrypt(ciphertext[16:])).strip()
md = globals()[sys.argv[1]]
ciphertext = sys.stdin.read().encode('utf-8')
pwd = b"password"
output:
backend = default_backend()
p = int("11859949538425015739337467917303613431031019140213666"
"12902540730065402658508634532306628480096346320424639"
"0256567934582260424238844463330887962689642467123")
g = 2
y = int("32155788395534640648739966373159697798396966919821525"
"72238852825117261342483718574508213761865276905503199"
"969908098203345481366464874759377454476688391248")
x = int("409364065449673443397833358558926598469347813468816037"
"268451847116982490733450463194921405069999008617231539"
"7147035896687401350877308899732826446337707128")
params = dh.DHParameterNumbers(p, g)
public = dh.DHPublicNumbers(y, params)
private = dh.DHPrivateNumbers(x, public)
key = private.private_key(backend)
shared_key = key.exchange(public.public_key(backend))
3.7 Secure Shell
Table of Contents
• Secure Shell
– Login ssh
import paramiko
from paramiko.client import SSHClient
with SSHClient() as ssh:
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect("localhost", username="me", password="pwd")
stdin, stdout, stderr = ssh.exec_command("uname")
print(stdout.read())
3.8 Boto3
3.9 Test
Table of Contents
• Test
– A simple Python unittest
– Python unittest setup & teardown hierarchy
– Different module of setUp & tearDown hierarchy
– Run tests via unittest.TextTestRunner
– Test raise exception
– Pass arguments into a TestCase
– Group multiple testcases into a suite
– Group multiple tests from different TestCase
– Skip some tests in the TestCase
– Monolithic Test
– Cross-module variables to Test files
– skip setup & teardown when the test is skipped
– Re-using old test code
– Testing your document is right
– Re-using doctest to unittest
– Customize test report
– Mock - using @patch substitute original method
– What does unittest.mock.patch do?
– Mock - substitute open
OK
>>> import unittest
>>> class TestFail(unittest.TestCase):
... def test_false(self):
... self.assertTrue(False)
...
>>> unittest.main()
F
======================================================================
FAIL: test_false (__main__.TestFail)
----------------------------------------------------------------------
Traceback (most recent call last):
File "<stdin>", line 3, in test_false
AssertionError: False is not true
----------------------------------------------------------------------
Ran 1 test in 0.000s
FAILED (failures=1)
import unittest
def fib(n):
return 1 if n<=2 else fib(n-1)+fib(n-2)
def setUpModule():
print("setup module")
def tearDownModule():
print("teardown module")
class TestFib(unittest.TestCase):
def setUp(self):
print("setUp")
self.n = 10
def tearDown(self):
print("tearDown")
del self.n
@classmethod
def setUpClass(cls):
print("setUpClass")
if __name__ == "__main__":
unittest.main()
output:
$ python test.py
setup module
setUpClass
setUp
tearDown
.setUp
tearDown
.tearDownClass
teardown module
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
# test_module.py
from __future__ import print_function
import unittest
class TestFoo(unittest.TestCase):
@classmethod
def setUpClass(self):
print("foo setUpClass")
@classmethod
def tearDownClass(self):
print("foo tearDownClass")
def setUp(self):
print("foo setUp")
def tearDown(self):
print("foo tearDown")
def test_foo(self):
self.assertTrue(True)
class TestBar(unittest.TestCase):
# test.py
from __future__ import print_function
def setUpModule():
print("setUpModule")
def tearDownModule():
print("tearDownModule")
if __name__ == "__main__":
test_module.setUpModule = setUpModule
test_module.tearDownModule = tearDownModule
suite1 = unittest.TestLoader().loadTestsFromTestCase(TestFoo)
suite2 = unittest.TestLoader().loadTestsFromTestCase(TestBar)
suite = unittest.TestSuite([suite1,suite2])
unittest.TextTestRunner().run(suite)
output:
$ python test.py
setUpModule
foo setUpClass
foo setUp
foo tearDown
.foo tearDownClass
bar setUp
bar tearDown
.tearDownModule
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
OK
>>> class TestRaiseFail(unittest.TestCase):
... def test_raise_fail(self):
... with self.assertRaises(SystemError):
... pass
>>> suite = unittest.TestLoader().loadTestsFromTestCase(TestRaiseFail)
>>> unittest.TextTestRunner(verbosity=2).run(suite)
test_raise_fail (__main__.TestRaiseFail) ... FAIL
======================================================================
FAIL: test_raise_fail (__main__.TestRaiseFail)
----------------------------------------------------------------------
Traceback (most recent call last):
File "<stdin>", line 4, in test_raise_fail
AssertionError: SystemError not raised
----------------------------------------------------------------------
FAILED (failures=1)
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
----------------------------------------------------------------------
Ran 4 tests in 0.000s
OK
----------------------------------------------------------------------
Ran 2 tests in 0.001s
OK
----------------------------------------------------------------------
Ran 4 tests in 0.000s
OK (skipped=3)
OK
<unittest.runner.TextTestResult run=1 errors=0 failures=0>
test_foo.py
import unittest
print(conf)
class TestFoo(unittest.TestCase):
def test_foo(self):
print(conf)
test_bar.py
import unittest
import __builtin__
if __name__ == "__main__":
conf = type('TestConf', (object,), {})
conf.isskip = True
output:
$ python test_bar.py
<class '__main__.TestConf'>
test_foo (test_foo.TestFoo) ... <class '__main__.TestConf'>
ok
test_skip (test_foo.TestFoo) ... skipped 'skip test'
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK (skipped=1)
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK (skipped=1)
"""
This is an example of doctest
>>> fib(10)
55
"""
def fib(n):
""" This function calculate fib number.
Example:
>>> fib(10)
55
>>> fib(-1)
Traceback (most recent call last):
...
ValueError
"""
if n < 0:
raise ValueError('')
return 1 if n<=2 else fib(n-1) + fib(n-2)
if __name__ == "__main__":
import doctest
doctest.testmod()
output:
$ python demo_doctest.py -v
Trying:
fib(10)
Expecting:
55
ok
Trying:
fib(10)
Expecting:
55
ok
Trying:
fib(-1)
Expecting:
Traceback (most recent call last):
...
import unittest
import doctest
"""
This is an example of doctest
>>> fib(10)
55
"""
def fib(n):
""" This function calculate fib number.
Example:
>>> fib(10)
55
>>> fib(-1)
Traceback (most recent call last):
...
ValueError
"""
if n < 0:
raise ValueError('')
return 1 if n<=2 else fib(n-1) + fib(n-2)
if __name__ == "__main__":
finder = doctest.DocTestFinder()
suite = doctest.DocTestSuite(test_finder=finder)
unittest.TextTestRunner(verbosity=2).run(suite)
output:
fib (__main__)
Doctest: __main__.fib ... ok
----------------------------------------------------------------------
Ran 1 test in 0.023s
OK
OK = 'ok'
FAIL = 'fail'
ERROR = 'error'
SKIP = 'skip'
class JsonTestResult(TextTestResult):
def jsonify(self):
json_out = dict()
for t in self.successes:
json_out = self.json_append(t, OK, json_out)
for t, _ in self.failures:
for t, _ in self.errors:
json_out = self.json_append(t, ERROR, json_out)
for t, _ in self.skipped:
json_out = self.json_append(t, SKIP, json_out)
return json_out
class TestSimple(TestCase):
def test_ok_1(self):
foo = True
self.assertTrue(foo)
def test_ok_2(self):
bar = True
self.assertTrue(bar)
def test_fail(self):
baz = False
self.assertTrue(baz)
def test_raise(self):
raise RuntimeError
@unittest.skip("Test skip")
def test_skip(self):
raise NotImplementedError
if __name__ == '__main__':
# redirect the default output of unittest to /dev/null
with open(os.devnull, 'w') as null_stream:
# create a runner and overwrite its resultclass
runner = TextTestRunner(stream=null_stream)
runner.resultclass = JsonTestResult
# create a testsuite
suite = TestLoader().loadTestsFromTestCase(TestSimple)
output:
$ python test.py
{'TestSimple': {'error': ['test_raise'],
'fail': ['test_fail'],
# python-3.3 or above
>>> import os
>>> def test():
... try:
... os.remove('%$!?&*')
... except OSError as e:
... print(e)
... else:
... print('test success')
...
>>> test()
[Errno 2] No such file or directory: '%$!?&*'
PATH = '$@!%?&'
def fake_remove(path):
print("Fake remove")
class SimplePatch:
def __enter__(self):
orig, attr = self.get_target(self._target)
self.orig, self.attr = orig, attr
self.orig_attr = getattr(orig, attr)
setattr(orig, attr, self._new)
return self._new
output:
$ python3 simple_patch.py
---> inside unittest.mock.patch scope
Fake remove
3.10 C Extensions
Occasionally, it is unavoidable for Pythoneers to write a C extension. For example, porting C libraries or new system calls to Python requires implementing new object types through a C extension. In order to provide a brief glance at how a C extension works, this cheat sheet mainly focuses on writing a Python C extension.
Note that the C extension interface is specific to official CPython. It is likely that extension modules do not work on other Python implementations such as PyPy. Even with official CPython, the Python C API may not be compatible across versions, e.g., Python 2 and Python 3. Therefore, if extension modules need to run on other Python interpreters, it is better to use the ctypes module or cffi, as in the sketch below.
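A minimal ctypes sketch (assuming a standard C library is available on the system; the call is only an illustration):
import ctypes
import ctypes.util

# load the platform's C library and call a plain C function without any build step
libc = ctypes.CDLL(ctypes.util.find_library("c"))
print(libc.abs(-5))  # 5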
Table of Contents
• C Extensions
– Simple setup.py
– Customize CFLAGS
– Doc String
– Simple C Extension
– Release the GIL
– Acquire the GIL
– Get Reference Count
– Parse Arguments
– Calling Python Functions
– Raise Exception
– Customize Exception
– Iterate a List
– Iterate a Dictionary
– Simple Class
– Simple Class with Members and Methods
– Simple Class with Getter and Setter
– Inherit from Other Class
– Run a Python Command
– Run a Python File
– Import a Python Module
– Import everything of a Module
– Access Attributes
– Performance of C Extension
– Performance of ctypes
– ctypes Error handling
import sysconfig
from distutils.core import setup, Extension
cflags = sysconfig.get_config_var("CFLAGS")
extra_compile_args = cflags.split()
extra_compile_args += ["-Wextra"]
ext = Extension(
"foo", ["foo.c"],
extra_compile_args=extra_compile_args
)
foo.c
#include <Python.h>
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
output:
#include <Python.h>
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
output:
In a C extension, blocking I/O should be placed in a block wrapped by Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS to release the GIL temporarily; otherwise, a blocking I/O operation forces other
threads to wait until the previous operation finishes. For example
#include <Python.h>
PyMODINIT_FUNC PyInit_foo(void)
output:
$ python -c "
> import threading
> import foo
> from datetime import datetime
> def f(n):
> now = datetime.now()
> print(f'{now}: thread {n}')
> foo.foo()
> ts = [threading.Thread(target=f, args=(n,)) for n in range(3)]
> [t.start() for t in ts]
> [t.join() for t in ts]"
2018-11-04 20:16:44.055932: thread 0
2018-11-04 20:16:47.059718: thread 1
2018-11-04 20:16:50.063579: thread 2
Warning: The GIL can only be safely released when there is NO Python C API functions between
Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS.
#include <pthread.h>
#include <Python.h>
typedef struct {
PyObject *sec;
PyObject *py_callback;
} foo_args;
void *
foo_thread(void *args)
{
long n = -1;
PyObject *rv = NULL, *sec = NULL,* py_callback = NULL;
foo_args *a = NULL;
if (!args)
return NULL;
a = (foo_args *)args;
sec = a->sec;
py_callback = a->py_callback;
n = PyLong_AsLong(sec);
static PyObject *
foo(PyObject *self, PyObject *args)
{
long i = 0, n = 0;
pthread_t *arr = NULL;
PyObject *py_callback = NULL;
PyObject *sec = NULL, *num = NULL;
PyObject *rv = NULL;
foo_args a = {};
if (!PyLong_Check(sec) || !PyLong_Check(num)) {
PyErr_SetString(PyExc_TypeError, "should be int");
goto error;
}
if (!PyCallable_Check(py_callback)) {
PyErr_SetString(PyExc_TypeError, "should be callable");
goto error;
}
n = PyLong_AsLong(num);
if (n == -1 && PyErr_Occurred())
goto error;
a.sec = sec;
a.py_callback = py_callback;
for (i = 0; i < n; i++) {
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
output:
If threads are created from C/C++, those threads do not hold the GIL. Without acquiring the GIL, the interpreter cannot
access Python functions safely. For example
void *
foo_thread(void *args)
{
...
// without acquiring the GIL
rv = PyObject_CallFunction(py_callback, "s", "Awesome Python!");
Py_XDECREF(rv);
return NULL;
}
output:
Warning: In order to call Python functions safely, we can simply wrap the calls between
PyGILState_Ensure and PyGILState_Release in the C extension code.
PyGILState_STATE state = PyGILState_Ensure();
// Perform Python actions
result = PyObject_CallFunction(callback)
// Error handling
PyGILState_Release(state);
#include <Python.h>
static PyObject *
getrefcount(PyObject *self, PyObject *a)
{
return PyLong_FromSsize_t(Py_REFCNT(a));
}
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
output:
#include <Python.h>
static PyObject *
foo(PyObject *self)
{
Py_RETURN_NONE;
}
static PyObject *
bar(PyObject *self, PyObject *arg)
{
return Py_BuildValue("O", arg);
}
static PyObject *
baz(PyObject *self, PyObject *args)
{
PyObject *x = NULL, *y = NULL;
if (!PyArg_ParseTuple(args, "OO", &x, &y)) {
return NULL;
static PyObject *
qux(PyObject *self, PyObject *args, PyObject *kwargs)
{
static char *keywords[] = {"x", "y", NULL};
PyObject *x = NULL, *y = NULL;
if (!PyArg_ParseTupleAndKeywords(args, kwargs,
"O|O", keywords,
&x, &y))
{
return NULL;
}
if (!y) {
y = Py_None;
}
return Py_BuildValue("OO", x, y);
}
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
output:
#include <Python.h>
static PyObject *
foo(PyObject *self, PyObject *args)
{
PyObject *py_callback = NULL;
PyObject *rv = NULL;
if (!PyCallable_Check(py_callback)) {
PyErr_SetString(PyExc_TypeError, "should be callable");
return NULL;
}
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
output:
#include <Python.h>
static PyObject*
foo(PyObject* self)
{
// raise NotImplementedError
PyErr_SetString(PyExc_NotImplementedError, "Not implemented");
return NULL;
}
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
output:
#include <stdio.h>
#include <Python.h>
static PyObject *
foo(PyObject *self __attribute__((unused)))
{
PyErr_SetString(FooError, "Raise exception in C");
return NULL;
}
PyMODINIT_FUNC PyInit_foo(void)
{
PyObject *m = NULL;
m = PyModule_Create(&module);
if (!m) return NULL;
output:
#include <Python.h>
#define PY_PRINTF(o) \
PyObject_Print(o, stdout, 0); printf("\n");
static PyObject *
iter_list(PyObject *self, PyObject *args)
{
PyObject *list = NULL, *item = NULL, *iter = NULL;
PyObject *result = NULL;
if (!PyList_Check(list))
goto error;
// Get iterator
iter = PyObject_GetIter(list);
if (!iter)
goto error;
Py_XINCREF(Py_None);
result = Py_None;
error:
Py_XDECREF(iter);
return result;
}
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
output:
#include <Python.h>
#define PY_PRINTF(o) \
PyObject_Print(o, stdout, 0); printf("\n");
static PyObject *
iter_dict(PyObject *self, PyObject *args)
{
PyObject *dict = NULL;
PyObject *key = NULL, *val = NULL;
PyObject *o = NULL, *result = NULL;
Py_ssize_t pos = 0;
Py_INCREF(Py_None);
result = Py_None;
error:
return result;
}
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
output:
#include <Python.h>
typedef struct {
PyObject_HEAD
} FooObject;
PyMODINIT_FUNC
PyInit_foo(void)
{
PyObject *m = NULL;
if (PyType_Ready(&FooType) < 0)
return NULL;
if ((m = PyModule_Create(&module)) == NULL)
return NULL;
Py_XINCREF(&FooType);
PyModule_AddObject(m, "Foo", (PyObject *) &FooType);
return m;
}
output:
#include <Python.h>
#include <structmember.h>
/*
* class Foo:
* def __new__(cls, *a, **kw):
* foo_obj = object.__new__(cls)
* foo_obj.foo = ""
* foo_obj.bar = ""
* return foo_obj
*
* def __init__(self, foo, bar):
* self.foo = foo
* self.bar = bar
*
* def fib(self, n):
* if n < 2:
* return n
* return self.fib(n - 1) + self.fib(n - 2)
*/
typedef struct {
PyObject_HEAD
PyObject *foo;
PyObject *bar;
} FooObject;
static void
Foo_dealloc(FooObject *self)
{
Py_XDECREF(self->foo);
Py_XDECREF(self->bar);
Py_TYPE(self)->tp_free((PyObject *) self);
}
static PyObject *
Foo_new(PyTypeObject *type, PyObject *args, PyObject *kw)
{
int rc = -1;
FooObject *self = NULL;
self = (FooObject *) type->tp_alloc(type, 0);
/* allocate attributes */
self->foo = PyUnicode_FromString("");
if (self->foo == NULL) goto error;
self->bar = PyUnicode_FromString("");
if (self->bar == NULL) goto error;
rc = 0;
error:
if (rc < 0) {
Py_XDECREF(self->foo);
Py_XDECREF(self->bar);
Py_XDECREF(self);
}
return (PyObject *) self;
}
static int
Foo_init(FooObject *self, PyObject *args, PyObject *kw)
{
int rc = -1;
static char *keywords[] = {"foo", "bar", NULL};
PyObject *foo = NULL, *bar = NULL, *ptr = NULL;
if (!PyArg_ParseTupleAndKeywords(args, kw,
"|OO", keywords,
&foo, &bar))
{
goto error;
}
if (foo) {
ptr = self->foo;
Py_INCREF(foo);
self->foo = foo;
Py_XDECREF(ptr);
}
if (bar) {
ptr = self->bar;
Py_INCREF(bar);
self->bar = bar;
Py_XDECREF(ptr);
}
rc = 0;
error:
return rc;
}
static PyObject *
Foo_fib(FooObject *self, PyObject *args)
{
unsigned long n = 0;
if (!PyArg_ParseTuple(args, "k", &n)) return NULL;
return PyLong_FromUnsignedLong(fib(n));
}
PyMODINIT_FUNC
PyInit_foo(void)
{
PyObject *m = NULL;
if (PyType_Ready(&FooType) < 0)
return NULL;
if ((m = PyModule_Create(&module)) == NULL)
return NULL;
Py_XINCREF(&FooType);
PyModule_AddObject(m, "Foo", (PyObject *) &FooType);
output:
#include <Python.h>
/*
* class Foo:
* def __new__(cls, *a, **kw):
* foo_obj = object.__new__(cls)
* foo_obj._foo = ""
* return foo_obj
*
* def __init__(self, foo=None):
* if foo and isinstance(foo, 'str'):
* self._foo = foo
*
* @property
* def foo(self):
* return self._foo
*
* @foo.setter
* def foo(self, value):
* if not value or not isinstance(value, str):
* raise TypeError("value should be unicode")
* self._foo = value
*/
typedef struct {
PyObject_HEAD
PyObject *foo;
} FooObject;
static void
Foo_dealloc(FooObject *self)
static PyObject *
Foo_new(PyTypeObject *type, PyObject *args, PyObject *kw)
{
int rc = -1;
FooObject *self = NULL;
self = (FooObject *) type->tp_alloc(type, 0);
/* allocate attributes */
self->foo = PyUnicode_FromString("");
if (self->foo == NULL) goto error;
rc = 0;
error:
if (rc < 0) {
Py_XDECREF(self->foo);
Py_XDECREF(self);
}
return (PyObject *) self;
}
static int
Foo_init(FooObject *self, PyObject *args, PyObject *kw)
{
int rc = -1;
static char *keywords[] = {"foo", NULL};
PyObject *foo = NULL, *ptr = NULL;
if (!PyArg_ParseTupleAndKeywords(args, kw,
"|O", keywords,
&foo))
{
goto error;
}
rc = 0;
error:
return rc;
}
static PyObject *
Foo_getfoo(FooObject *self, void *closure)
{
Py_INCREF(self->foo);
return self->foo;
}
static int
Foo_setfoo(FooObject *self, PyObject *value, void *closure)
{
int rc = -1;
if (!value || !PyUnicode_Check(value)) {
PyErr_SetString(PyExc_TypeError, "value should be unicode");
goto error;
}
Py_INCREF(value);
Py_XDECREF(self->foo);
self->foo = value;
rc = 0;
error:
return rc;
}
PyMODINIT_FUNC
PyInit_foo(void)
{
PyObject *m = NULL;
if (PyType_Ready(&FooType) < 0)
return NULL;
output:
#include <Python.h>
#include <structmember.h>
/*
* class Foo:
* def __new__(cls, *a, **kw):
* foo_obj = object.__new__(cls)
* foo_obj.foo = ""
* return foo_obj
*
* def __init__(self, foo):
* self.foo = foo
*
* def fib(self, n):
* if n < 2:
* return n
* return self.fib(n - 1) + self.fib(n - 2)
*/
/* FooObject */
typedef struct {
PyObject_HEAD
PyObject *foo;
static void
Foo_dealloc(FooObject *self)
{
Py_XDECREF(self->foo);
Py_TYPE(self)->tp_free((PyObject *) self);
}
static PyObject *
Foo_new(PyTypeObject *type, PyObject *args, PyObject *kw)
{
int rc = -1;
FooObject *self = NULL;
self = (FooObject *) type->tp_alloc(type, 0);
/* allocate attributes */
self->foo = PyUnicode_FromString("");
if (self->foo == NULL) goto error;
rc = 0;
error:
if (rc < 0) {
Py_XDECREF(self->foo);
Py_XDECREF(self);
}
return (PyObject *) self;
}
static int
Foo_init(FooObject *self, PyObject *args, PyObject *kw)
{
int rc = -1;
static char *keywords[] = {"foo", NULL};
PyObject *foo = NULL, *ptr = NULL;
if (foo) {
ptr = self->foo;
Py_INCREF(foo);
self->foo = foo;
Py_XDECREF(ptr);
}
rc = 0;
error:
return rc;
}
static PyObject *
Foo_fib(FooObject *self, PyObject *args)
{
unsigned long n = 0;
if (!PyArg_ParseTuple(args, "k", &n)) return NULL;
return PyLong_FromUnsignedLong(fib(n));
}
/*
* class Bar(Foo):
* def __init__(self, bar):
* super().__init__(bar)
*
* def gcd(self, a, b):
* while b:
* a, b = b, a % b
* return a
*/
/* BarObject */
static int
Bar_init(FooObject *self, PyObject *args, PyObject *kw)
{
return FooType.tp_init((PyObject *) self, args, kw);
}
static PyObject *
Bar_gcd(BarObject *self, PyObject *args)
{
unsigned long a = 0, b = 0;
if (!PyArg_ParseTuple(args, "kk", &a, &b)) return NULL;
return PyLong_FromUnsignedLong(gcd(a, b));
}
/* Module */
PyMODINIT_FUNC
PyInit_foo(void)
{
PyObject *m = NULL;
if (PyType_Ready(&FooType) < 0)
return NULL;
if (PyType_Ready(&BarType) < 0)
return NULL;
if ((m = PyModule_Create(&module)) == NULL)
return NULL;
Py_XINCREF(&FooType);
Py_XINCREF(&BarType);
PyModule_AddObject(m, "Foo", (PyObject *) &FooType);
PyModule_AddObject(m, "Bar", (PyObject *) &BarType);
return m;
}
output:
#include <stdio.h>
#include <Python.h>
int
main(int argc, char *argv[])
{
int rc = -1;
Py_Initialize();
rc = PyRun_SimpleString(argv[1]);
Py_Finalize();
return rc;
}
output:
#include <stdio.h>
#include <Python.h>
int
main(int argc, char *argv[])
{
int rc = -1, i = 0;
wchar_t **argv_copy = NULL;
const char *filename = NULL;
FILE *fp = NULL;
PyCompilerFlags cf = {.cf_flags = 0};
filename = argv[1];
fp = fopen(filename, "r");
if (!fp)
goto error;
// copy argv
argv_copy = PyMem_RawMalloc(sizeof(wchar_t*) * argc);
if (!argv_copy)
goto error;
Py_Initialize();
Py_SetProgramName(argv_copy[0]);
PySys_SetArgv(argc, argv_copy);
rc = PyRun_AnyFileExFlags(fp, filename, 0, &cf);
error:
if (argv_copy) {
for (i = 0; i < argc; i++)
PyMem_RawFree(argv_copy[i]);
PyMem_RawFree(argv_copy);
}
if (fp) fclose(fp);
Py_Finalize();
return rc;
}
output:
$ clang `python3-config --cflags` -c foo.c -o foo.o
$ clang `python3-config --ldflags` foo.o -o foo
$ echo "import sys; print(sys.argv)" > foo.py
$ ./foo foo.py arg1 arg2 arg3
['./foo', 'foo.py', 'arg1', 'arg2', 'arg3']
#include <stdio.h>
#include <Python.h>
int
main(int argc, char *argv[])
{
int rc = -1;
wchar_t *program = NULL;
PyObject *json_module = NULL, *json_dict = NULL;
PyObject *json_dumps = NULL;
PyObject *dict = NULL;
PyObject *result = NULL;
Py_SetProgramName(program);
Py_Initialize();
// import json
json_module = PyImport_ImportModule("json");
PYOBJECT_CHECK(json_module, error);
// json_dict = json.__dict__
json_dict = PyModule_GetDict(json_module);
PYOBJECT_CHECK(json_dict, error);
// json_dumps = json.__dict__['dumps']
json_dumps = PyDict_GetItemString(json_dict, "dumps");
PYOBJECT_CHECK(json_dumps, error);
// result = json.dumps(dict)
result = PyObject_CallObject(json_dumps, dict);
PYOBJECT_CHECK(result, error);
PyObject_Print(result, stdout, 0);
printf("\n");
rc = 0;
error:
Py_XDECREF(result);
Py_XDECREF(dict);
Py_XDECREF(json_dumps);
Py_XDECREF(json_dict);
Py_XDECREF(json_module);
PyMem_RawFree(program);
Py_Finalize();
return rc;
}
output:
#include <stdio.h>
#include <Python.h>
int
main(int argc, char *argv[])
{
int rc = -1;
wchar_t *program = NULL;
PyObject *main_module = NULL, *main_dict = NULL;
PyObject *uname = NULL;
PyObject *sysname = NULL;
PyObject *result = NULL;
Py_SetProgramName(program);
Py_Initialize();
// import __main__
main_module = PyImport_ImportModule("__main__");
PYOBJECT_CHECK(main_module, error);
// main_dict = __main__.__dict__
main_dict = PyModule_GetDict(main_module);
PYOBJECT_CHECK(main_dict, error);
// from os import *
result = PyRun_String("from os import *",
Py_file_input,
main_dict,
main_dict);
PYOBJECT_CHECK(result, error);
Py_XDECREF(result);
Py_XDECREF(main_dict);
// uname = __main__.__dict__['uname']
main_dict = PyModule_GetDict(main_module);
PYOBJECT_CHECK(main_dict, error);
// result = uname()
uname = PyDict_GetItemString(main_dict, "uname");
PYOBJECT_CHECK(uname, error);
result = PyObject_CallObject(uname, NULL);
PYOBJECT_CHECK(result, error);
// sysname = result.sysname
sysname = PyObject_GetAttrString(result, "sysname");
PYOBJECT_CHECK(sysname, error);
PyObject_Print(sysname, stdout, 0);
printf("\n");
rc = 0;
error:
Py_XDECREF(sysname);
Py_XDECREF(result);
Py_XDECREF(uname);
Py_XDECREF(main_dict);
Py_XDECREF(main_module);
PyMem_RawFree(program);
Py_Finalize();
output:
#include <stdio.h>
#include <Python.h>
int
main(int argc, char *argv[])
{
int rc = -1;
wchar_t *program = NULL;
PyObject *json_module = NULL;
PyObject *json_dumps = NULL;
PyObject *dict = NULL;
PyObject *result = NULL;
Py_SetProgramName(program);
Py_Initialize();
// import json
json_module = PyImport_ImportModule("json");
PYOBJECT_CHECK(json_module, error);
// json_dumps = json.dumps
json_dumps = PyObject_GetAttrString(json_module, "dumps");
PYOBJECT_CHECK(json_dumps, error);
// result = json.dumps(dict)
result = PyObject_CallObject(json_dumps, dict);
PYOBJECT_CHECK(result, error);
PyObject_Print(result, stdout, 0);
printf("\n");
rc = 0;
error:
Py_XDECREF(result);
Py_XDECREF(dict);
Py_XDECREF(json_dumps);
Py_XDECREF(json_module);
PyMem_RawFree(program);
Py_Finalize();
return rc;
}
output:
#include <Python.h>
static PyObject *
fibonacci(PyObject *self, PyObject *args)
{
unsigned long n = 0;
if (!PyArg_ParseTuple(args, "k", &n)) return NULL;
return PyLong_FromUnsignedLong(fib(n));
}
PyMODINIT_FUNC PyInit_foo(void)
{
return PyModule_Create(&module);
}
// Compile (Mac)
// -------------
//
// $ clang -Wall -Werror -shared -fPIC -o libfib.dylib fib.c
//
unsigned int fib(unsigned int n)
{
if ( n < 2) {
return n;
}
return fib(n-1) + fib(n-2);
}
import os
stat = libc.stat
class Stat(Structure):
"""
From /usr/include/sys/stat.h
struct stat {
dev_t st_dev;
ino_t st_ino;
mode_t st_mode;
nlink_t st_nlink;
uid_t st_uid;
gid_t st_gid;
dev_t st_rdev;
#ifndef _POSIX_SOURCE
struct timespec st_atimespec;
struct timespec st_mtimespec;
struct timespec st_ctimespec;
#else
time_t st_atime;
long st_atimensec;
time_t st_mtime;
long st_mtimensec;
time_t st_ctime;
long st_ctimensec;
#endif
off_t st_size;
int64_t st_blocks;
u_int32_t st_blksize;
u_int32_t st_flags;
u_int32_t st_gen;
int32_t st_lspare;
int64_t st_qspare[2];
};
"""
_fields_ = [
("st_dev", c_ulong),
# stat success
path = create_string_buffer(b"/etc/passwd")
st = Stat()
ret = stat(path, byref(st))
assert ret == 0
output:
FOUR
APPENDIX
The appendix mainly focuses on some critical concepts missing in cheat sheets.
@wraps preserves the attributes of the original function; otherwise, the attributes of the decorated function are replaced by those of the wrapper function. For example
Without @wraps
With @wraps
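A minimal sketch of both cases above (the decorator and function names are illustrative):
from functools import wraps

def deco_no_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def deco_with_wraps(func):
    @wraps(func)  # copy __name__, __doc__, etc. from func onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@deco_no_wraps
def foo():
    """foo's docstring"""

@deco_with_wraps
def bar():
    """bar's docstring"""

print(foo.__name__, foo.__doc__)  # wrapper None
print(bar.__name__, bar.__doc__)  # bar bar's docstring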
Table of Contents
4.2.1 Abstract
The C10k problem is still a puzzle that programmers keep looking for ways to solve. Generally, developers deal with extensive I/O operations via threads, epoll, or kqueue to avoid their software waiting for an expensive task. However, developing readable and bug-free concurrent code is challenging due to data sharing and job dependencies. Even though some powerful tools, such as Valgrind, help developers to detect deadlocks or other asynchronous issues, solving these problems may be time-consuming when the scale of the software grows large. Therefore, many programming languages such as Python, JavaScript, or C++ have put effort into developing better libraries, frameworks, or syntaxes to assist programmers in managing concurrent jobs properly. Instead of focusing on how to use modern parallel APIs, this article mainly concentrates on the design philosophy behind asynchronous programming patterns.
Using threads is a natural way for developers to dispatch tasks without blocking the main thread. However, threads may lead to performance issues, such as locking critical sections to perform atomic operations. Although using an event loop can enhance performance in some cases, writing readable code is challenging due to callback problems (e.g., callback hell). Fortunately, programming languages like Python introduced the async/await concept to help developers write understandable code with high performance. The following figure shows the main goal of using async/await to handle socket connections in the same way as utilizing threads.
4.2.2 Introduction
Handling I/O operations such as network connections is one of the most expensive tasks in a program. Take a simple blocking TCP echo server as an example (the following snippet). If a client connects to the server successfully without sending any request, it blocks other connections. Even if clients send data as soon as possible, the server cannot handle other requests while it is serving the current one. Also, handling multiple requests this way is inefficient because a lot of time is wasted waiting for I/O responses from hardware such as network interfaces. Thus, socket programming with concurrency becomes inevitable to manage extensive requests.
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 5566))
s.listen(10)
while True:
conn, addr = s.accept()
msg = conn.recv(1024)
conn.send(msg)
One possible solution to prevent the server from waiting for I/O operations is to dispatch tasks to other threads. The following example shows how to create a thread to handle each connection. However, creating numerous threads may consume all computing power without delivering high throughput. Even worse, an application may waste time waiting for locks to process tasks in critical sections. Although using threads can solve the blocking issue for a socket server, other factors, such as CPU utilization, are essential for a programmer to overcome the C10k problem. Therefore, without creating unlimited threads, the event loop is another solution to manage connections.
import threading
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 5566))
s.listen(10240)
def handler(conn):
while True:
msg = conn.recv(65535)
conn.send(msg)
while True:
conn, addr = s.accept()
t = threading.Thread(target=handler, args=(conn,))
t.start()
A simple event-driven socket server includes three main components: an I/O multiplexing module (e.g., select), a scheduler (loop), and callback functions (events). For example, the following server utilizes the high-level I/O multiplexing module, selectors, within a loop to check whether an I/O operation is ready or not. If data is available to read or write, the loop acquires the I/O events and executes the callback functions, accept, read, or write, to finish the tasks (a minimal sketch of these callbacks follows the snippet).
import socket
from selectors import DefaultSelector
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 5566))
s.listen(10240)
s.setblocking(False)
sel = DefaultSelector()
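A minimal, self-contained sketch of the callbacks and the loop described above (the read callback also echoes the data back, so a separate write callback is omitted):
import socket
from selectors import DefaultSelector, EVENT_READ

sel = DefaultSelector()

def accept(sock, mask):
    # callback for the listening socket: register the new connection
    conn, addr = sock.accept()
    conn.setblocking(False)
    sel.register(conn, EVENT_READ, read)

def read(conn, mask):
    # callback for a connected socket: echo data, or clean up on EOF
    msg = conn.recv(1024)
    if not msg:
        sel.unregister(conn)
        conn.close()
        return
    conn.send(msg)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 5566))
s.listen(10240)
s.setblocking(False)
sel.register(s, EVENT_READ, accept)

while True:
    for key, mask in sel.select():
        callback = key.data
        callback(key.fileobj, mask)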
Although managing connections via threads may not be efficient, a program that utilizes an event loop to schedule tasks isn't easy to read. To enhance code readability, many programming languages, including Python, introduce abstract concepts such as coroutines, futures, or async/await to handle I/O multiplexing. To better understand this programming jargon and use it correctly, the following sections discuss what these concepts are and what kinds of problems they try to solve.
A callback function is used to control data flow at runtime when an event is invoked. However, preserving the current callback function's status is challenging. For example, if a programmer wants to implement a handshake over a TCP server, they may need to store the previous status somewhere.
import socket
from selectors import DefaultSelector
sel = DefaultSelector()
is_hello = {}
# do a handshake
if msg.decode("utf-8").strip() != "hello":
sel.unregister(conn)
return conn.close()
is_hello[conn] = True
Although the variable is_hello assists in storing the status to check whether a handshake succeeded, the code becomes harder for a programmer to understand. In fact, the concept of the previous implementation is simple; it is equivalent to the following snippet (a blocking version).
def accept(s):
conn, addr = s.accept()
success = handshake(conn)
def handshake(conn):
data = conn.recv(65535)
if not data:
return False
if data.decode('utf-8').strip() != "hello":
return False
conn.send(b"hello")
return True
To migrate a similar structure from blocking to non-blocking, a function (or a task) needs to snapshot its current status, including arguments, variables, and breakpoints, when it has to wait for I/O operations. Also, the scheduler should be able to re-enter the function and execute the remaining code after the I/O operations finish. Unlike other programming languages such as C++, Python can achieve the concepts discussed above easily because its generators can preserve all of this status and be re-entered by calling the built-in function next(). By utilizing generators, handling I/O operations as in the previous snippet, but in a non-blocking form, which is called an inline callback, is achievable inside an event loop.
An event loop is a scheduler that manages tasks within a program instead of depending on the operating system. The following snippet shows how a simple event loop handles socket connections asynchronously. The implementation concept is to append tasks to a FIFO job queue and register a selector when I/O operations are not ready. Also, a generator preserves the status of a task, which allows the task to execute its remaining jobs without callback functions when I/O results are available. Observing how an event loop works therefore helps in understanding that a Python generator is indeed a form of coroutine.
# loop.py
from selectors import DefaultSelector, EVENT_READ, EVENT_WRITE
class Loop(object):
def __init__(self):
self.sel = DefaultSelector()
self.queue = []
def polling(self):
for e, m in self.sel.select(0):
self.queue.append((e.data, None))
self.sel.unregister(e.fileobj)
def register(self, t, data):
    if data[0] == EVENT_READ:
if self.is_registered(data[1]):
self.sel.modify(data[1], EVENT_READ, t)
else:
self.sel.register(data[1], EVENT_READ, t)
elif data[0] == EVENT_WRITE:
if self.is_registered(data[1]):
self.sel.modify(data[1], EVENT_WRITE, t)
else:
self.sel.register(data[1], EVENT_WRITE, t)
else:
return False
return True
def once(self):
self.polling()
unfinished = []
for t, data in self.queue:
try:
data = t.send(data)
except StopIteration:
continue
if self.register(t, data):
unfinished.append((t, None))
self.queue = unfinished
def run(self):
while self.queue or self.sel.get_map():
self.once()
By assigning jobs to an event loop to handle connections, the programming pattern is similar to using threads to manage I/O operations, but with a user-level scheduler. Also, PEP 380 enables generator delegation, which allows a generator to wait for other generators to finish their jobs. Obviously, the following snippet is more intuitive and readable than using callback functions to handle I/O operations.
# foo.py
# $ python3 foo.py &
# $ nc localhost 5566
import socket
# import loop.py
from loop import Loop
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 5566))
s.listen(10240)
s.setblocking(False)
loop = Loop()
def handler(conn):
while True:
msg = yield from loop.recv(conn, 1024)
if not msg:
conn.close()
break
yield from loop.send(conn, msg)
def main():
    ...
loop.create_task((main(), None))
loop.run()
Using an event loop with the yield from syntax can manage connections without blocking the main thread; this is how the asyncio module was used before Python 3.5. However, the yield from syntax is ambiguous because it may tie programmers in knots: why does adding @asyncio.coroutine make a generator become a coroutine? Instead of using yield from to handle asynchronous operations, PEP 492 proposed that the coroutine should become a standalone concept in Python, and that is how the new syntax, async/await, was introduced to enhance the readability of asynchronous programming, as the sketch below shows.
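A hedged sketch of the earlier echo handler expressed with async/await instead of yield from (asyncio's low-level socket helpers are used here only for illustration):
import asyncio

async def handler(loop, conn):
    # the awaits replace the yield from statements of the generator version
    while True:
        msg = await loop.sock_recv(conn, 1024)
        if not msg:
            conn.close()
            break
        await loop.sock_sendall(conn, msg)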
The Python documentation defines coroutines as a more generalized form of subroutines. However, this definition is ambiguous and impedes developers from understanding what coroutines are. Based on the previous discussion, an event loop is responsible for scheduling generators to perform specific tasks, which is similar to dispatching jobs to threads. In this case, generators serve like threads that take charge of "routine jobs." In other words, a coroutine is a term for a task that is scheduled by an event loop within a program instead of by the operating system. The following snippet shows what @coroutine is. This decorator mainly transforms a function into a generator function and uses a wrapper, types.coroutine, to preserve backward compatibility.
import asyncio
import inspect
import types
from functools import wraps
Future = asyncio.futures.Future
def coroutine(func):
"""Simple prototype of coroutine"""
if inspect.isgeneratorfunction(func):
return types.coroutine(func)
@wraps(func)
def coro(*a, **k):
res = func(*a, **k)
if isinstance(res, Future) or inspect.isgenerator(res):
res = yield from res
return res
return types.coroutine(coro)
@coroutine
def foo():
yield from asyncio.sleep(1)
print("Hello Foo")
loop = asyncio.get_event_loop()
4.2.6 Conclusion
Asynchronous programming via an event loop has become more straightforward and readable nowadays thanks to modern syntaxes and library support. Most programming languages, including Python, implement libraries to manage task scheduling through interaction with the new syntaxes. While the new syntaxes look enigmatic at the beginning, they provide a way for programmers to develop a logical structure in their code, as with threads. Also, without calling a callback function after a task finishes, programmers do not need to worry about how to pass the current task status, such as local variables and arguments, into other callbacks. Thus, programmers are able to focus on developing their programs without wasting a lot of time troubleshooting concurrency issues.
4.2.7 Reference
Table of Contents
import asyncio
Future = asyncio.futures.Future
Task = asyncio.tasks.Task
@asyncio.coroutine
def foo():
yield from asyncio.sleep(3)
print("Hello Foo")
@asyncio.coroutine
def bar():
yield from asyncio.sleep(1)
print("Hello Bar")
loop = asyncio.get_event_loop()
tasks = [Task(foo(), loop=loop),
loop.create_task(bar())]
loop.run_until_complete(
asyncio.wait(tasks))
loop.close()
output:
$ python test.py
Hello Bar
Hello Foo
import asyncio
from collections import deque
def done_callback(fut):
fut._loop.stop()
class Loop:
"""Simple event loop prototype"""
def __init__(self):
self._ready = deque()
self._stopping = False
def run_forever(self):
"""Run tasks until stop"""
try:
while True:
self._run_once()
if self._stopping:
break
finally:
self._stopping = False
def _run_once(self):
"""Run task at once"""
ntodo = len(self._ready)
for i in range(ntodo):
t, a = self._ready.popleft()
def stop(self):
self._stopping = True
def close(self):
self._ready.clear()
def get_debug(self):
return False
@asyncio.coroutine
def foo():
print("Foo")
@asyncio.coroutine
def bar():
print("Bar")
loop = Loop()
tasks = [loop.create_task(foo()),
loop.create_task(bar())]
loop.run_until_complete(
asyncio.wait(tasks))
loop.close()
output:
$ python test.py
Foo
Bar
import asyncio
waiter = loop.create_future()
counter = len(fs)
def _on_complete(f):
nonlocal counter
counter -= 1
if counter <= 0 and not waiter.done():
waiter.set_result(None)
loop = asyncio.get_event_loop()
try:
print("---> wait")
loop.run_until_complete(
wait([slow_task(_) for _ in range(1, 3)]))
print("---> asyncio.wait")
loop.run_until_complete(
asyncio.wait([slow_task(_) for _ in range(1, 3)]))
finally:
loop.close()
output:
---> wait
sleep "1" sec
sleep "2" sec
---> asyncio.wait
sleep "1" sec
sleep "2" sec
import asyncio
import socket
while True:
conn, addr = await loop.sock_accept(sock)
loop.create_task(handler(loop, conn))
EventLoop = asyncio.SelectorEventLoop
EventLoop.sock_accept = sock_accept
EventLoop.sock_recv = sock_recv
EventLoop.sock_sendall = sock_sendall
loop = EventLoop()
try:
loop.run_until_complete(server(loop))
except KeyboardInterrupt:
pass
finally:
loop.close()
output:
# console 1
$ python3 async_sock.py &
$ nc localhost 9527
Hello
Hello
# console 2
$ nc localhost 9527
asyncio
asyncio
import asyncio
import socket
loop = asyncio.get_event_loop()
return server
class EchoProtocol(asyncio.Protocol):
def connection_made(self, transport):
peername = transport.get_extra_info('peername')
print('Connection from {}'.format(peername))
self.transport = transport
try:
loop.run_forever()
finally:
server.close()
loop.run_until_complete(server.wait_closed())
loop.close()
output:
# console1
$ nc localhost 5566
Hello
Hello
# console2
$ nc localhost 5566
asyncio
asyncio
Table of Contents
4.4.1 Abstract
PEP 572 is one of the most contentious proposals in Python 3 history because assigning a value within an expression seems unnecessary. Also, it is ambiguous for developers to distinguish the difference between the walrus operator (:=) and the equal operator (=). Even though sophisticated developers can use ":=" smoothly, they may be concerned about the readability of their code. To better understand the usage of ":=", this article discusses its design philosophy and what kinds of problems it tries to solve.
4.4.2 Introduction
For C/C++ developers, assigning a function's return value to a variable is common due to error-code style handling. Managing function errors includes two steps: one is to check the return value; the other is to check errno. For example,
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
In this case, access assigns its return value to the variable rc first. Then, the program compares rc with -1 to check whether the execution of access succeeded. However, Python did not allow assigning values to variables within an expression before 3.8. To fix this problem, PEP 572 introduced the walrus operator. The following Python snippet is equivalent to the previous C example.
>>> import os
>>> import sys
>>> from ctypes import *
>>> libc = CDLL("libc.dylib", use_errno=True)
>>> access = libc.access
>>> path = create_string_buffer(b"hello_walrus")
>>> if (rc := access(path, os.R_OK)) == -1:
... errno = get_errno()
... print(os.strerror(errno), file=sys.stderr)
...
No such file or directory
4.4.3 Why := ?
Developers may confuse the difference between ":=" and "=". In fact, they serve the same purpose: assigning something to a variable. Why did Python introduce ":=" instead of reusing "="? What is the benefit of using ":="? One reason is to reinforce visual recognition, given a common mistake made by C/C++ developers. For instance,
// rc is unintentionally assigned to -1
if (rc = -1) {
fprintf(stderr, "%s", strerror(errno));
goto end;
}
Rather than being compared, the variable rc is mistakenly assigned -1. To prevent this error, some people advocate using Yoda conditions within an expression.
However, the Yoda style is not very readable, just as Yoda speaks non-standard English. Also, unlike C/C++, which can detect this assignment error at compile time via compiler options (e.g., -Wparentheses), it is difficult for the Python interpreter to catch such mistakes at runtime. Thus, the final decision of PEP 572 was to use a new syntax as the solution for implementing assignment expressions.
The walrus operator was not the first solution proposed for PEP 572. The original proposal used EXPR as NAME to assign values to variables. Unfortunately, that solution, like several other candidates, was rejected for various reasons. After intense debates, the final decision was :=.
4.4.4 Scopes
Unlike other expressions, whose variables are bound to their own scope, an assignment expression binds its target in the current containing scope. The purpose of this design is to allow a compact way to write code.
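A minimal sketch of this scope behavior (the variable names are illustrative); compare it with the Go example below, where env is not visible outside the if block:
import os

# the walrus target "home" binds in the enclosing scope,
# so it is still visible after the if statement
if (home := os.getenv("HOME", "")) == "":
    raise SystemExit("HOME is not set")
print(home)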
In PEP 572, another benefit is to conveniently capture a "witness" for an any() or an all() expression. Although capturing function inputs can assist an interactive debugger, the advantage is not so obvious and the examples lack readability, so this benefit is not discussed here. Note that other languages (e.g., C/C++ or Go) may bind an assignment to the scope of the statement. Take Golang as an example.
package main

import (
    "fmt"
    "os"
)

func main() {
    if env := os.Getenv("HOME"); env == "" {
        panic("HOME is not set")
    }
    fmt.Print(env) // <--- compile error: undefined: env
}
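In Python, by contrast, the target of an assignment expression used inside a comprehension binds in the containing
scope and stays visible afterwards. A minimal example, adapted from the cumulative-sum pattern shown in PEP 572:

>>> total = 0
>>> values = [1, 2, 3]
>>> cumulative = [total := total + v for v in values]
>>> cumulative
[1, 3, 6]
>>> total    # the walrus target "leaks" into the enclosing scope
6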
4.4.5 Pitfalls
Although an assignment expression allows writing compact code, there are many pitfalls when a developer uses it in a
list comprehension. A common SyntaxError is rebinding an iteration variable, as shown below.
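For example, rebinding the loop variable inside the comprehension is rejected at compile time; the interpreter reports
an error along these lines:

>>> [i := 0 for i in range(5)]
SyntaxError: assignment expression cannot rebind comprehension iteration variable 'i'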
However, updating an iteration variable would reduce readability and introduce bugs anyway. Even without the walrus
operator, a programmer should avoid reusing iteration variables within a scope.
Another pitfall is that Python prohibits using assignment expressions within a comprehension at class scope.
This limitation comes from bpo-3692: the interpreter's behavior is unpredictable when a class declaration contains a list
comprehension. To avoid this corner case, assignment expressions are invalid inside comprehensions in a class body,
as the following example shows.
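A sketch of the rejected pattern (names are illustrative); the interpreter reports an error similar to:

>>> class Example:
...     values = [y := x for x in range(5)]
...
SyntaxError: assignment expression within a comprehension cannot be used in a class body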
4.4.6 Conclusion
The reason why the walrus operator (:=) is so controversial is that it may decrease code readability. In fact, in the
discussion mail thread, Christoph Groth had considered using "=" to implement inline assignment as in C/C++.
Setting aside whether ":=" is ugly, many developers argue that distinguishing ":=" from "=" is difficult because they
serve the same purpose, yet their behaviors are not consistent. Also, writing compact code is not persuasive enough,
because smaller is not always better. However, in some cases, the walrus operator can enhance
readability (if you understand how to use :=). For example,
buf = b""
while True:
data = read(1024)
if not data:
break
buf += data
buf = b""
while (data := read(1024)):
buf += data
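The snippets above use a bare read(1024) as a stand-in for any data source. As a runnable variant of the same loop,
the sketch below uses an in-memory stream, where io.BytesIO simply replaces a real file or socket:

>>> import io
>>> f = io.BytesIO(b"hello walrus")
>>> buf = b""
>>> while (data := f.read(5)):
...     buf += data
...
>>> buf
b'hello walrus'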
The Python documentation and GitHub issue-8122 provide many great examples of improving code readability with ":=".
However, the walrus operator should be used carefully. Some cases, such as foo(x := 3, cat='vector'), may
introduce new bugs if developers are not aware of scopes. Although PEP 572 may make it easier to write
buggy code, an in-depth understanding of its design philosophy and useful examples will help us use it to write readable
code at the right time.
4.4.7 References
Table of Contents
4.5.1 Abstract
The GNU Debugger (GDB) is one of the most powerful debugging tools for developers to troubleshoot errors in their code.
However, it is hard for beginners to learn, and that is why many programmers prefer to insert print statements to examine
runtime status. Fortunately, the GDB Text User Interface (TUI) provides a way for developers to review their source code
and debug simultaneously. More excitingly, since GDB 7, a Python interpreter has been built into GDB. This feature offers
more straightforward ways to customize GDB printers and commands through the Python library. By discussing examples,
this article tries to explore advanced debugging techniques via Python to develop toolkits for GDB.
4.5.2 Introduction
Troubleshooting software bugs is a big challenge for developers. While GDB provides many "debug commands" to
inspect a program's runtime status, its non-intuitive usage impedes programmers from using it to solve problems. Indeed,
mastering GDB is a long-term process. However, a quick start is not complicated; like Yoda says, you must unlearn what
you have learned. To better understand how to use Python in GDB, this article focuses on the Python interpreter in
GDB.
GDB supports customizing commands via define. This is useful for running a batch of commands to troubleshoot at the
same time. For example, a developer can display the current frame information by defining an sf command.
# define in .gdbinit
define sf
    where          # find out where the program is
    info args      # show arguments
    info locals    # show local variables
end
However, writing a user-defined command this way may be inconvenient due to limited APIs. Fortunately, by interacting
with the Python interpreter in GDB, developers can readily utilize Python libraries to establish their debugging toolkits. The
following sections show how to use Python to simplify debugging processes.
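For comparison, the sf command above can also be written as a small Python command class. The sketch below (class
and file names are illustrative) behaves the same way as the define version:

# sf.py - a Python version of the "define sf" command above
import gdb

class ShowFrame(gdb.Command):
    """Show current frame info (where, args, locals)."""
    def __init__(self):
        super().__init__("sf", gdb.COMMAND_USER)

    def invoke(self, args, tty):
        gdb.execute("where")        # find out where the program is
        gdb.execute("info args")    # show arguments
        gdb.execute("info locals")  # show local variables

ShowFrame()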
Inspecting a process's memory is an effective way to troubleshoot memory issues. Developers can acquire
memory contents via info proc mappings and dump memory. To simplify these steps, defining a customized command
is useful. However, the implementation is not straightforward in pure GDB syntax. Even though GDB
supports conditions, processing output is not intuitive. To solve this problem, the Python API in GDB is
helpful because Python offers many useful operations for handling strings.
# mem.py
import gdb
import time
import re

class DumpMemory(gdb.Command):
    """Dump memory info into a file."""

    def __init__(self):
        super().__init__("dm", gdb.COMMAND_USER)

    def get_addrs(self, pattern, tty):
        # collect (start, end) address pairs from "info proc mappings"
        out = gdb.execute("info proc mappings", tty, True)
        return [re.findall(r"0x[0-9a-f]+", line)[:2]
                for line in out.splitlines() if pattern in line]

    def invoke(self, args, tty):
        # e.g., "dm stack" dumps every mapped region whose entry contains "stack"
        for s, e in self.get_addrs(args, tty):
            f = int(time.time() * 1000)
            gdb.execute(f"dump memory {f}.bin {s} {e}")

DumpMemory()
Running the dm command will invoke DumpMemory.invoke. By sourcing or implementing Python scripts in .gdbinit,
developers can utilize user-defined commands to trace bugs when a program is running. For example, the following
steps show how to invoke DumpMemory in GDB.
(gdb) start
...
(gdb) source mem.py # source commands
(gdb) dm stack # dump stack to ${timestamp}.bin
(gdb) shell ls # ls current dir
1577283091687.bin a.cpp a.out mem.py
Parsing JSON is helpful when a developer is inspecting a JSON string in a running program. GDB can parse a
std::string via gdb.parse_and_eval and return it as a gdb.Value. By processing the gdb.Value, developers can
pass the JSON string to the Python json module and print it in a pretty format.
# dj.py
import gdb
import re
import json

class DumpJson(gdb.Command):
    """Dump std::string as a styled JSON."""

    def __init__(self):
        super().__init__("dj", gdb.COMMAND_USER)

    def invoke(self, args, tty):
        # evaluate the expression, strip the surrounding quotes and escapes,
        # then pretty-print it through the json module
        raw = str(gdb.parse_and_eval(args))
        s = re.sub(r'\\(")', r'\1', raw.strip('"'))
        print(json.dumps(json.loads(s), indent=2))

DumpJson()
The command dj displays a more readable JSON format in GDB. This command helps improve visual recognition
when a JSON string is large. Also, this command can be used to detect or monitor whether a std::string contains JSON or
not.
(gdb) start
(gdb) list
1 #include <string>
2
3 int main(int argc, char *argv[])
4 {
5 std::string json = R"({"foo": "FOO","bar": "BAR"})";
6 return 0;
7 }
...
(gdb) ptype json
type = std::string
(gdb) p json
$1 = "{\"foo\": \"FOO\",\"bar\": \"BAR\"}"
(gdb) source dj.py
(gdb) dj json
{
"foo": "FOO",
"bar": "BAR"
}
Syntax highlighting is useful for developers tracing source code or troubleshooting issues. By using Pygments, applying
color to the source is easy without defining ANSI escape codes manually. The following example shows how to apply
color to the output of the list command.
import gdb
from pygments import highlight
from pygments.lexers import CLexer
from pygments.formatters import TerminalFormatter

class PrettyList(gdb.Command):
    """Print source code with color."""
    def __init__(self):
        super().__init__("pl", gdb.COMMAND_USER)
        self.lex = CLexer()
        self.fmt = TerminalFormatter()

    def invoke(self, args, tty):
        # run "list" (optionally with a location) and colorize its output
        src = gdb.execute(f"l {args}", tty, True)
        print(highlight(src, self.lex, self.fmt))

PrettyList()
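Assuming the script is saved as pl.py (the file name is illustrative), it is loaded and used like the other commands; the
highlighted source is printed directly to the GDB console:

(gdb) source pl.py
(gdb) pl            # colorized output of the "list" command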
4.5.7 Tracepoints
Although a developer can insert printf, std::cout, or syslog to inspect functions, printing messages is not an
effective way to debug when a project is enormous. Developers may waste their time building source code and
acquire little information. Even worse, the output may become too noisy to detect problems. In fact, inspecting
functions or variables does not require embedding print functions in the code. By writing a Python script with the GDB API,
developers can customize watchpoints to trace issues dynamically at runtime. For example, by implementing a gdb.
Breakpoint and a gdb.Command, developers can acquire essential information, such as parameters, call
stacks, or memory usage.
# tp.py
import gdb

tp = {}

class Tracepoint(gdb.Breakpoint):
    def __init__(self, *args):
        super().__init__(*args)
        self.silent = True
        self.count = 0

    def stop(self):
        self.count += 1
        # do not halt the program; only count how many times this point is hit
        return False

class SetTracepoint(gdb.Command):
    def __init__(self):
        super().__init__("tp", gdb.COMMAND_USER)

    def invoke(self, args, tty):
        # usage: "tp <location>", e.g., "tp fib"
        try:
            tp[args] = Tracepoint(args)
        except RuntimeError as e:
            print(e)

def finish(event):
    for t, p in tp.items():
        c = p.count
        print(f"Tracepoint '{t}' Count: {c}")

gdb.events.exited.connect(finish)
SetTracepoint()
Instead of inserting std::cout at the beginning of functions, a tracepoint at a function's entry point provides
useful information for inspecting arguments, variables, and stacks. For instance, setting a tracepoint at fib makes it
easy to examine memory usage, the stack, and the number of calls.
int fib(int n)
{
    if (n < 2) {
        return 1;
    }
    return fib(n-1) + fib(n-2);
}
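Assuming the tracepoint script is saved as tp.py (the name is illustrative), registering a tracepoint at fib looks like this:

(gdb) source tp.py
(gdb) tp fib
(gdb) r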
The following output shows the result of an inspection of the function fib. In this case, tracepoints display all the
information a developer needs, including argument values, the recursive flow, and variable sizes. By using tracepoints,
developers can acquire more useful information compared with std::cout.
4.5.8 Profiling
Without inserting timestamps, profiling is still feasible through tracepoints. By using a gdb.FinishBreakpoint after
a gdb.Breakpoint, GDB sets a temporary breakpoint at the return address of a frame, which allows developers to get the
current timestamp and calculate the time difference. Note that profiling via GDB is not precise. Other tools, such as Linux
perf or Valgrind, provide more useful and accurate information for tracing performance issues.
import gdb
import time

class EndPoint(gdb.FinishBreakpoint):
    def __init__(self, breakpoint, *a, **kw):
        super().__init__(*a, **kw)
        self.silent = True
        self.breakpoint = breakpoint

    def stop(self):
        # normal finish: pop the matching start record and report the elapsed time
        end = time.time()
        start, out = self.breakpoint.stack.pop()
        print(f"{out}\tElapsed: {end - start:.6f}s")
        return False

class StartPoint(gdb.Breakpoint):
    def __init__(self, *a, **kw):
        super().__init__(*a, **kw)
        self.silent = True
        self.stack = []

    def stop(self):
        # record the start time; the end time and diff are computed in EndPoint.stop
        start = time.time()
        frame = gdb.newest_frame()
        sym_and_line = frame.find_sal()
        func = frame.function().name
        filename = sym_and_line.symtab.filename
        line = sym_and_line.line
        block = frame.block()
        args = []
        for s in block:
            if not s.is_argument:
                continue
            name = s.name
            typ = s.type
            val = s.value(frame)
            args.append(f"{name}: {val} [{typ}]")
        # format
        out = ""
        out += f"{func} @ {filename}:{line}\n"
        for a in args:
            out += f"\t{a}\n"
        # remember the start record and set a finish breakpoint at the return address
        self.stack.append((start, out))
        EndPoint(self, internal=True)
        return False

class Profile(gdb.Command):
    def __init__(self):
        super().__init__("prof", gdb.COMMAND_USER)

    def invoke(self, args, tty):
        # usage: "prof <function>", e.g., "prof fib"
        StartPoint(args)

Profile()
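Assuming the profiling script is saved as prof.py (the name is illustrative), it is used the same way as the tracepoint
command:

(gdb) source prof.py
(gdb) prof fib
(gdb) r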
The following output shows the profiling result of setting a tracepoint at the function fib. It is convenient to inspect
the function's performance and stack at the same time.
Although set print pretty on in GDB offers a better format for inspecting variables, developers may still need to parse
variables' values for readability. Take the system call stat as an example. While it provides useful information for
examining file attributes, output values such as the permission bits may not be readable during debugging. By implementing
a user-defined pretty-printer, developers can parse struct stat and output the information in a readable format.
import gdb
import pwd
import grp
import stat
import time

class StatPrint:
    def __init__(self, val):
        self.val = val

    def get_filetype(self, st_mode):
        # translate the file-type bits into a readable name
        if stat.S_ISDIR(st_mode):
            return "directory"
        if stat.S_ISLNK(st_mode):
            return "symbolic link"
        if stat.S_ISREG(st_mode):
            return "regular file"
        return "other"

    def get_access(self, st_mode):
        # e.g., 0o644 -> "-rw-r--r--"
        return stat.filemode(st_mode)

    def get_time(self, st_tim):
        # struct timespec -> readable local time
        return time.ctime(int(st_tim["tv_sec"]))

    def to_string(self):
        st = self.val
        st_ino = int(st["st_ino"])
        st_mode = int(st["st_mode"])
        st_uid = int(st["st_uid"])
        st_gid = int(st["st_gid"])
        st_size = int(st["st_size"])
        st_blksize = int(st["st_blksize"])
        st_blocks = int(st["st_blocks"])
        st_atim = st["st_atim"]
        st_mtim = st["st_mtim"]
        st_ctim = st["st_ctim"]
        out = "{\n"
        out += f"Size: {st_size}\n"
        out += f"Blocks: {st_blocks}\n"
        out += f"IO Block: {st_blksize}\n"
        out += f"Inode: {st_ino}\n"
        out += f"Access: {self.get_access(st_mode)}\n"
        out += f"File Type: {self.get_filetype(st_mode)}\n"
        out += f"Uid: ({st_uid}/{pwd.getpwuid(st_uid).pw_name})\n"
        out += f"Gid: ({st_gid}/{grp.getgrgid(st_gid).gr_name})\n"
        out += f"Access Time: {self.get_time(st_atim)}\n"
        out += f"Modify Time: {self.get_time(st_mtim)}\n"
        out += f"Change Time: {self.get_time(st_ctim)}\n"
        out += "}"
        return out
p = gdb.printing.RegexpCollectionPrettyPrinter("sp")
p.add_printer("stat", "^stat$", StatPrint)
o = gdb.current_objfile()
gdb.printing.register_pretty_printer(o, p)
By sourcing the previous Python script, the pretty-printer can recognize struct stat and output a readable format
for developers to inspect file attributes. Compared with inserting functions into the program to parse and print struct
stat, the Python API is a more convenient way to acquire better output.
(gdb) list 15
10 struct stat st;
11
12 if ((rc = stat("./a.cpp", &st)) < 0) {
13 perror("stat failed.");
14 goto end;
15 }
16
17 rc = 0;
18 end:
19 return rc;
(gdb) source st.py
(gdb) b 17
Breakpoint 1 at 0x762: file a.cpp, line 17.
(gdb) r
Starting program: /root/a.out
Note that developers can disable a user-defined pretty-printer via the command disable pretty-printer. For example,
the previous Python script registers a pretty-printer under the global pretty-printers. By calling disable pretty-printer,
the printer sp will be disabled.
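For example (the regexp arguments shown are illustrative; see "help disable pretty-printer" for details):

(gdb) disable pretty-printer global sp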
Additionally, developers can exclude a printer from the current GDB debugging session if it is no longer required. The
following snippet shows how to delete the sp printer via gdb.pretty_printers.remove.
(gdb) python
>import gdb
>for p in gdb.pretty_printers:
>    if p.name == "sp":
>        gdb.pretty_printers.remove(p)
>end
(gdb) i pretty-print
global pretty-printers:
builtin
mpx_bound128
4.5.10 Conclusion
Integrating the Python interpreter into GDB offers many flexible ways to troubleshoot issues. While many integrated
development environments (IDEs) may embed GDB to debug visually, GDB itself allows developers to implement their
own commands and parse variables' output at runtime. By using debugging scripts, developers can monitor and record
necessary information without modifying their code. Honestly, inserting or enabling debugging code blocks may
change a program's behavior, and developers should get rid of this bad habit. Also, when a problem is reproduced,
GDB can attach to the process and examine its status without stopping it. Obviously, debugging via GDB is inevitable
when a challenging issue emerges. Thanks to the integration of Python into GDB, developing a script to troubleshoot
becomes more accessible, which helps developers establish their own diverse debugging methods.
4.5.11 Reference