Await 2.0: Stackless Resumable Functions
Most scalable, most efficient, most open coroutines of any programming language in existence
CppCon 2014 • Gor Nishanov ([email protected]) • Microsoft
What this talk is about
• Evolution of N3858 and N3977
• Stackless Resumable Functions (D4134)
• Lightweight, customizable coroutines
• Proposed for C++17
• Experimental implementation “to be” released in Visual Studio “14”
• What are they?
• How do they work?
• How to use them?
• How to customize them?
CppCon 2014 • Stackless Resumable Functions 2
Coroutines (56 years ago)
• Introduced in 1958 by Melvin Conway
• Donald Knuth, 1968: “generalization of subroutine”

          subroutines                        coroutines
call      Allocate frame, pass parameters    Allocate frame, pass parameters
return    Free frame, return result          Free frame, return eventual result
suspend   no                                 yes
resume    no                                 yes
Coroutine classification
[Slide labels: User Mode Threads / Fibers; Stackless Resumable Functions]
• Symmetric / Asymmetric
• Modula-2 / Win32 Fibers / Boost::context are symmetric (SwitchToFiber)
• C# asymmetric (distinct suspend and resume operations)
• First-class / Constrained
• Can a coroutine be passed as a parameter, returned from a function, stored in a data structure?
• Stackful / Stackless
• How much state does a coroutine have: just the locals of the coroutine, or the entire stack?
• Can a coroutine be suspended from nested stack frames?
Stackful vs. Stackless
[Figure: two pictures of coroutine state. Stackful: Coroutine State is 1 meg of stack, or a chained stack of 4k stacklets. Stackless: Captured Coroutine State is Parameters, Locals & Temporaries.]
Design Goals
• Highly scalable (to hundreds of millions of concurrent coroutines)
• Highly efficient (resume and suspend operations comparable in cost to function call overhead)
• Seamless interaction with existing facilities with no overhead
• Open-ended coroutine machinery allowing library designers to develop coroutine libraries exposing various high-level semantics, such as generators, goroutines, tasks and more
• Usable in environments where exceptions are forbidden or not available
Anatomy of a Function
std::future<ptrdiff_t> tcp_reader(int total)
{
char buf[64 * 1024];
ptrdiff_t result = 0;
auto conn =
Anatomy of a Resumable Function
std::future<ptrdiff_t> tcp_reader(int total)
{
char buf[64 * 1024];
ptrdiff_t result = 0;
auto conn = await Tcp::Connect("127.0.0.1", 1337);
do
{
auto bytesRead = await conn.Read(buf, sizeof(buf));
total -= bytesRead;
result += std::count(buf, buf + bytesRead, 'c');
}
while (total > 0);
return result;
}
Anatomy of a Stackless Resumable Function
std::future<ptrdiff_t> tcp_reader(int total)
{
    char buf[64 * 1024];
    ptrdiff_t result = 0;
    auto conn = await Tcp::Connect("127.0.0.1", 1337);
    do
    {
        auto bytesRead = await conn.Read(buf, sizeof(buf));
        total -= bytesRead;
        result += std::count(buf, buf + bytesRead, 'c');
    }
    while (total > 0);
    return result;
}
Annotations on the slide:
• Coroutine Return Object: std::future<ptrdiff_t>
• Coroutine Frame: Coroutine Promise (satisfies Coroutine Promise Requirements), Platform Context*, Formals (Copy), Locals / Temporaries
• Suspend Points: the two await expressions
• The operand of each await satisfies Awaitable Requirements
• Coroutine Eventual Result: return result
• await <initial-suspend> and await <final-suspend> mark the implicit suspend points of the body
Compiler vs Coroutine Promise

Compiler                 Coroutine Promise
return <expr>            <Promise>.set_value(<expr>); goto <end>
<unhandled-exception>    <Promise>.set_exception(std::current_exception())
<get-return-object>      <Promise>.get_return_object()
yield <expr>             await <Promise>.yield_value(<expr>)
await()                  await <Promise>.read_value()
<after-first-curly>      await <Promise>.initial_suspend()
<before-last-curly>      await <Promise>.final_suspend()
<cancel-check>           if (<Promise>.cancellation_requested()) goto <end>
2x2x2
• Two new keywords
• await
• yield
• Two new concepts
• Awaitable
• Coroutine Promise
• Two new types
• resumable_handle
• resumable_traits
Examples
Generator coroutines
generator<int> fib(int n)
{
    int a = 0;
    int b = 1;
    while (n-- > 0)
    {
        yield a;
        auto next = a + b;
        a = b;
        b = next;
    }
}

int main() {
    for (auto v : fib(35))
    {
        if (v > 10)
            break;
        cout << v << ' ';
    }
}

The range-based for in main() expands into:
{
    auto && __range = fib(35);
    for (auto __begin = __range.begin(),
              __end = __range.end()
         ;
         __begin != __end
         ;
         ++__begin)
    {
        auto v = *__begin;
        {
            if (v > 10) break;
            cout << v << ' ';
        }
    }
}

Annotations on the slide:
• Coroutine Promise: current_value, state (Active / Cancelling / Closed), exception
• generator<int> is the type of __range
• generator<int>::iterator is the type of __begin and __end
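To make the generator slide more concrete, here is a minimal sketch of what a hand-lowered version of fib could look like in plain C++, without the proposed yield keyword. It is not the code the compiler generates and not the generator<T> from the proposal; fib_frame, resume_point and take_until_greater_than_10 are names invented for this illustration. The point is only that formals and locals move into a frame object and each yield becomes "store the value, remember where we were, return".

```cpp
#include <cassert>
#include <vector>

// Hypothetical hand-lowered form of `generator<int> fib(int n)`.
// All names are illustrative; this is a sketch, not compiler output.
struct fib_frame {
    // Formals and locals live in the frame, not on the caller's stack.
    int n;
    int a;
    int b;
    int current_value;  // what `yield a` publishes to the consumer
    int resume_point;   // 0 = not started, 1 = suspended at `yield a`
    bool done;

    explicit fib_frame(int n_)
        : n(n_), a(0), b(1), current_value(0), resume_point(0), done(false) {}

    // Runs the body until the next yield or the end of the function.
    void resume() {
        if (resume_point == 1) {
            // Statements that follow `yield a;` in the loop body.
            int next = a + b;
            a = b;
            b = next;
        }
        if (n-- > 0) {          // condition of `while (n-- > 0)`
            current_value = a;  // yield a;
            resume_point = 1;
            return;             // suspend: control goes back to the caller
        }
        done = true;            // fell off the end of the body
    }
};

// Mirrors main() on the slide, collecting values instead of printing them.
inline std::vector<int> take_until_greater_than_10(int n) {
    std::vector<int> out;
    fib_frame f(n);
    for (f.resume(); !f.done; f.resume()) {
        int v = f.current_value;
        if (v > 10) break;
        out.push_back(v);
    }
    return out;
}
```

Calling take_until_greater_than_10(35) yields 0 1 1 2 3 5 8, the same sequence the slide's main() prints. Breaking out of the loop early simply abandons the frame, which is the situation the promise's Cancelling state is there to handle in the real design.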
Recursive Generators
recursive_generator<int> range(int a, int b)
{
    auto n = b - a;
    if (n <= 0)
        return;
    if (n == 1)
    {
        yield a;
        return;
    }
    auto mid = a + n / 2;
    yield range(a, mid);
    yield range(mid, b);
}

int main()
{
    auto r = range(0, 100);
    copy(begin(r), end(r),
         ostream_iterator<int>(cout, " "));
}
Parent-stealing scheduling
spawnable<int> fib(int n) {
if (n < 2) return n;
return await(fib(n - 1) + fib(n - 2));
}
int main() { std::cout << fib(5).get() << std::endl; }
1.4 billion recursive invocations to compute fib(43), uses less than 16k of space
Not using parent-stealing, runs out of memory at fib(35)
$$\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{n} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}$$
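The invocation count quoted for fib(43) can be checked with a little arithmetic. The naive recursion makes calls(n) = 1 + calls(n - 1) + calls(n - 2) invocations, with calls(0) = calls(1) = 1. The sketch below evaluates that recurrence iteratively; the function names are made up for this example and are not part of the talk.

```cpp
#include <cassert>
#include <cstdint>

// Value of fib(n) for the recursion on the slide: fib(n) = n for n < 2.
inline std::uint64_t fib_value(int n) {
    std::uint64_t a = 0, b = 1;
    for (int i = 0; i < n; ++i) {
        std::uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}

// Number of invocations made by the naive recursion
//   fib(n) = n < 2 ? n : fib(n - 1) + fib(n - 2)
// using calls(n) = 1 + calls(n - 1) + calls(n - 2), calls(0) = calls(1) = 1.
inline std::uint64_t naive_fib_calls(int n) {
    std::uint64_t prev = 1, cur = 1;  // calls(0), calls(1)
    for (int i = 2; i <= n; ++i) {
        std::uint64_t next = 1 + cur + prev;
        prev = cur;
        cur = next;
    }
    return cur;
}
```

For n = 43 this gives 1,402,817,465 invocations, which matches the roughly 1.4 billion on the slide, and equals 2 * fib(44) - 1.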
Goroutines?
goroutine pusher(channel<int>& left, channel<int>& right) {
for (;;) {
auto val = await left.pull();
await right.push(val + 1);
}
}
Goroutines? Sure. 100,000,000 of them
goroutine pusher(channel<int>& left, channel<int>& right) {
for (;;) {
auto val = await left.pull();
await right.push(val + 1);
}
}
int main() {
    const int N = 100 * 1000 * 1000;
    vector<channel<int>> c(N + 1);
    for (int i = 0; i < N; ++i)
        goroutine::go(pusher(c[i], c[i + 1]));
    c.front().sync_push(0);
    cout << c.back().sync_pull() << endl;
}
[Diagram: a chain of channels and goroutines: c0-g0-c1, c1-g1-c2, …, cn-gn-cn+1]
Reminder: Just Core Language Evolution
BE-DEVs
FE-DEVs
Library Designer Paradise
• Library developers can design new coroutine types
• generator<T>
• goroutine
• spawnable<T>
• task<T>
• …
• Or adapt to existing async facilities
• std::future<T>
• concurrency::task<T>
• IAsyncAction, IAsyncOperation<T>
• …
Awaitable
Reminder: Range-Based For
int main() {
    for (auto v : fib(35))
        cout << v << endl;
}

expands into:
{
    auto && __range = fib(35);
    for (auto __begin = __range.begin(),
              __end = __range.end()
         ;
         __begin != __end
         ;
         ++__begin)
    {
        auto v = *__begin;
        cout << v << endl;
    }
}
await <expr>
If <expr> is a class type and unqualified ids await_ready, await_suspend or await_resume are found in the scope of a class, await <expr> expands into an expression equivalent of
{
    auto && __tmp = <expr>;
    if (!__tmp.await_ready()) {
        __tmp.await_suspend(<resumption-function-object>);
        suspend
        resume
    }
    <cancel-check>
    return __tmp.await_resume();
}
await <expr>
Otherwise (see rules for range-based-for lookup), await <expr> expands into an expression equivalent of
{
    auto && __tmp = <expr>;
    if (! await_ready(__tmp)) {
        await_suspend(__tmp, <resumption-function-object>);
        suspend
        resume
    }
    <cancel-check>
    return await_resume(__tmp);
}
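The control flow of the expansion can be imitated in ordinary C++ by passing "the rest of the function" as an explicit continuation. The sketch below is a library-only stand-in, not the language feature: await_then, ready_value and manual_event are hypothetical names, std::function<void()> plays the role of the resumption function object, and the <cancel-check> step is left out.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Awaitable that is always ready: await_suspend is never reached.
struct ready_value {
    int value;
    bool await_ready() const { return true; }
    void await_suspend(std::function<void()>) {}
    int await_resume() const { return value; }
};

// Awaitable completed by hand: await_suspend stores the resumption callback.
struct manual_event {
    int value;
    bool is_set;
    std::function<void()> waiter;

    manual_event() : value(0), is_set(false) {}

    bool await_ready() const { return is_set; }
    void await_suspend(std::function<void()> cb) { waiter = std::move(cb); }
    int await_resume() const { return value; }

    // Completes the event and resumes whoever is waiting on it.
    void set(int v) {
        value = v;
        is_set = true;
        if (waiter) {
            std::function<void()> cb = std::move(waiter);
            waiter = nullptr;
            cb();
        }
    }
};

// Stand-in for `await <expr>`: `rest` is the code that follows the await.
template <typename Awaitable, typename Continuation>
void await_then(Awaitable& awaitable, Continuation rest) {
    if (!awaitable.await_ready()) {
        // "suspend": hand out a resumption callback and return to the caller.
        awaitable.await_suspend(
            [&awaitable, rest] { rest(awaitable.await_resume()); });
        return;
    }
    // Ready immediately: no suspension, continue straight away.
    rest(awaitable.await_resume());
}
```

With a ready_value the continuation runs before await_then returns. With a manual_event it runs only when set() is called, which corresponds to the suspend and resume steps in the expansion. The awaitable must outlive the suspension, which the real feature guarantees by keeping __tmp in the coroutine frame.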
Trivial Awaitable #1
struct _____blank____ {
bool await_ready(){ return false; }
template <typename F>
void await_suspend(F const&){}
void await_resume(){}
};
Trivial Awaitable #1
struct suspend_always {
bool await_ready(){ return false; }
template <typename F>
void await_suspend(F const&){}
void await_resume(){}
};
await suspend_always {};
Trivial Awaitable #2
struct suspend_never {
bool await_ready(){ return true; }
template <typename F>
void await_suspend(F const&){}
void await_resume(){}
};
Simple Awaitable #1
std::future<void> DoSomething(mutex& m) {
unique_lock<mutex> lock = await lock_or_suspend{m};
// ...
}
struct lock_or_suspend {
std::unique_lock<std::mutex> lock;
lock_or_suspend(std::mutex & mut) : lock(mut, std::try_to_lock) {}
bool await_ready() { return lock.owns_lock(); }
template <typename F>
void await_suspend(F cb)
{
std::thread t([this, cb]{ lock.lock(); cb(); });
t.detach();
}
auto await_resume() { return std::move(lock);}
};
Simple Awaiter #2: Making Boost.Future awaitable
#include <boost/thread/future.hpp>
namespace boost {
template <class T>
bool await_ready(unique_future<T> & t) {
return t.is_ready();
}
template <class T, class F>
void await_suspend(unique_future<T> & t,
F resume_callback)
{
t.then([=](auto&){resume_callback();});
}
template <class T>
auto await_resume(unique_future<T> & t) {
return t.get();
}
}
Awaitable
Interacting with C APIs
resumable_handle
template <typename Promise = void> struct resumable_handle;
// Comparison operators provided: == != < > <= >=
template <> struct resumable_handle<void> {
void operator() ();
void * to_address();
static resumable_handle<void> from_address(void*);
…
};
template <typename Promise>
struct resumable_handle: public resumable_handle<> {
Promise & promise();
static resumable_handle<Promise> from_promise(Promise*);
…
};
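The to_address / from_address pair is what lets a handle travel through a C API that only offers a void* context parameter. The following sketch shows the round trip with a toy handle; toy_frame, toy_handle, c_api_fire and on_complete are hypothetical names for this illustration and do not reflect how resumable_handle is implemented.

```cpp
#include <cassert>

// Hypothetical stand-in for a coroutine frame: a resume function plus state.
struct toy_frame {
    void (*resume_fn)(toy_frame*);
    int resumed;
    explicit toy_frame(void (*fn)(toy_frame*)) : resume_fn(fn), resumed(0) {}
};

// Minimal type-erased handle in the spirit of resumable_handle<>.
class toy_handle {
    toy_frame* frame_;
public:
    toy_handle() : frame_(nullptr) {}
    explicit toy_handle(toy_frame* f) : frame_(f) {}

    // Resume the frame this handle refers to.
    void operator()() const { frame_->resume_fn(frame_); }

    // Convert to and from an opaque pointer for C-style callbacks.
    void* to_address() const { return frame_; }
    static toy_handle from_address(void* p) {
        return toy_handle(static_cast<toy_frame*>(p));
    }

    friend bool operator==(toy_handle a, toy_handle b) {
        return a.frame_ == b.frame_;
    }
};

// A C-style API that only understands `void* context` callbacks.
typedef void (*c_callback)(void* context);
inline void c_api_fire(c_callback cb, void* context) { cb(context); }

// Callback registered with the C API: rebuild the handle and resume.
inline void on_complete(void* context) {
    toy_handle::from_address(context)();
}

// What "resuming" means for this toy frame.
inline void count_resume(toy_frame* f) { ++f->resumed; }
```

Because the handle is a single pointer, it fits wherever an API accepts a context pointer, and no extra allocation is needed to carry it.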
Simple Awaitable #2: Raw OS APIs
await sleep_for(10ms);
class sleep_for {
static void TimerCallback(PTP_CALLBACK_INSTANCE, void* Context, PTP_TIMER) {
std::resumable_handle<>::from_address(Context)();
}
PTP_TIMER timer = nullptr;
std::chrono::system_clock::duration duration;
public:
sleep_for(std::chrono::system_clock::duration d) : duration(d){}
bool await_ready() const { return duration.count() <= 0; }
void await_suspend(std::resumable_handle<> resume_cb) {
int64_t relative_count = -duration.count();
timer = CreateThreadpoolTimer(TimerCallback, resume_cb.to_address(), 0);
SetThreadpoolTimer(timer, (PFILETIME)&relative_count, 0, 0);
}
void await_resume() {}
~sleep_for() { if (timer) CloseThreadpoolTimer(timer); }
};
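The sleep_for awaitable passes -duration.count() straight to SetThreadpoolTimer, which interprets a negative due time as a relative wait in 100-nanosecond units. That works when the clock's tick happens to be 100 ns (an assumption about the standard library in use, not something the C++ standard guarantees). A more portable conversion might look like the sketch below; filetime_ticks and to_relative_due_time are names invented for this example.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <ratio>

// One FILETIME-style tick is 100 nanoseconds, i.e. 1/10,000,000 of a second.
using filetime_ticks =
    std::chrono::duration<std::int64_t, std::ratio<1, 10000000>>;

// Converts any chrono duration to a relative due time:
// a negative count of 100 ns ticks.
template <typename Rep, typename Period>
std::int64_t to_relative_due_time(std::chrono::duration<Rep, Period> d) {
    return -std::chrono::duration_cast<filetime_ticks>(d).count();
}
```

For example, 10 ms is 100,000 ticks of 100 ns, so to_relative_due_time(std::chrono::milliseconds(10)) returns -100000.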
resumable_traits
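For generator<int> fib(int n), the compiler consults
std::resumable_traits<generator<int>, int>
(the return type followed by the parameter types). The primary template is: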
template <typename R, typename... Ts>
struct resumable_traits {
using allocator_type = std::allocator<char>;
using promise_type = typename R::promise_type;
};
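The traits lookup is an ordinary template mechanism, so it can be tried out in today's C++ with toy types. In the sketch below, toy_resumable_traits mirrors the primary template above (without allocator_type), toy_generator opts in by nesting a promise_type, and legacy_future stands for a third-party type that is adapted from the outside by a partial specialization. All three names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Mirrors the primary template: by default, ask the return type R.
template <typename R, typename... Ts>
struct toy_resumable_traits {
    using promise_type = typename R::promise_type;
};

// A coroutine type written with the proposal in mind: it nests promise_type.
template <typename T>
struct toy_generator {
    struct promise_type {
        const T* current_value;
    };
};

// A third-party type that knows nothing about coroutines.
template <typename T>
struct legacy_future {};

// Adapt it non-intrusively: specialize the traits for any parameter list.
template <typename T, typename... Ts>
struct toy_resumable_traits<legacy_future<T>, Ts...> {
    struct promise_type {
        T value;
    };
};

// The primary template forwards to the nested type.
static_assert(
    std::is_same<toy_resumable_traits<toy_generator<int>, int>::promise_type,
                 toy_generator<int>::promise_type>::value,
    "primary template should forward to R::promise_type");
```

The same pattern is what allows a promise to be defined for an existing future type without modifying that type.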
Defining Coroutine Promise for boost::future
namespace std {
template <typename T, typename... anything>
struct resumable_traits<boost::unique_future<T>, anything...> {
struct promise_type {
boost::promise<T> promise;
auto get_return_object() { return promise.get_future(); }
template <class U> void set_value(U && value) {
promise.set_value(std::forward<U>(value));
}
void set_exception(std::exception_ptr e) {
promise.set_exception(std::move(e));
}
suspend_never initial_suspend() { return{}; }
suspend_never final_suspend() { return{}; }
bool cancellation_requested() { return false; }
};
};
}
Awaitable
and Exceptions
Exceptionless Error Propagation (Await Part)
#include <boost/thread/future.hpp>
namespace boost {
template <class T>
bool await_ready(unique_future<T> & t) { return t.is_ready();}
template <class T, class F>
void await_suspend(
unique_future<T> & t, F rh)
{
t.then([=](auto& result){
rh();
});
}
template <class T>
auto await_resume(unique_future<T> & t) { return t.get(); }
}
Exceptionless Error Propagation (Await Part)
#include <boost/thread/future.hpp>
namespace boost {
template <class T>
bool await_ready(unique_future<T> & t) { return t.is_ready();}
template <class T, class Promise>
void await_suspend(
unique_future<T> & t, std::resumable_handle<Promise> rh)
{
t.then([=](auto& result){
if(result.has_exception())
rh.promise().set_exception(result.get_exception_ptr());
rh();
});
}
template <class T>
auto await_resume(unique_future<T> & t) { return t.get(); }
}
Exceptionless Error Propagation (Promise Part)
namespace std {
template <typename T, typename... anything>
struct resumable_traits<boost::unique_future<T>, anything...> {
struct promise_type {
boost::promise<T> promise;
auto get_return_object() { return promise.get_future(); }
suspend_never initial_suspend() { return{}; }
suspend_never final_suspend() { return{}; }
template <class U> void set_value(U && value) {
promise.set_value(std::forward<U>(value));
}
void set_exception(std::exception_ptr e) {
promise.set_exception(std::move(e));
}
bool cancellation_requested() { return false; }
};
};
}
Exceptionless Error Propagation (Promise Part)
namespace std {
template <typename T, typename... anything>
struct resumable_traits<boost::unique_future<T>, anything...> {
struct promise_type {
boost::promise<T> promise;
bool cancelling = false;
auto get_return_object() { return promise.get_future(); }
suspend_never initial_suspend() { return{}; }
suspend_never final_suspend() { return{}; }
template <class U> void set_value(U && value) {
promise.set_value(std::forward<U>(value));
}
void set_exception(std::exception_ptr e) {
promise.set_exception(std::move(e)); cancelling = true;
}
bool cancellation_requested() { return cancelling; }
};
};
}
Simple happy path and reasonable error propagation
std::future<ptrdiff_t> tcp_reader(int total)
{
char buf[64 * 1024];
ptrdiff_t result = 0;
auto conn = await Tcp::Connect("127.0.0.1", 1337);
do
{
auto bytesRead = await conn.Read(buf, sizeof(buf));
total -= bytesRead;
result += std::count(buf, buf + bytesRead, 'c');
}
while (total > 0);
return result;
}
Reminder: await <expr>
Expands into an expression equivalent of
{
auto && __tmp = <expr>;
if (! await_ready(__tmp)) {
await_suspend(__tmp, <resumption-function-object>);
suspend
resume
}
if (<promise>.cancellation_requested()) goto <end-label>;
return await_resume(__tmp);
}
Done!
What this talk was about
• Stackless Resumable Functions (D4134)
• Lightweight, customizable coroutines
• Proposed for C++17
• Experimental implementation “to be” released in Visual Studio “14”
• What are they?
• How do they work?
• How to use them?
• How to customize them?
To learn more:
• https://github.com/GorNishanov/await/
• Draft snapshot: D4134 Resumable Functions v2.pdf
• In October 2014 look for
• N4134 at http://isocpp.org
• http://open-std.org/JTC1/SC22/WG21/
Backup
Introduction
How does it work?
Generator coroutines
generator<int> fib(int n)
{
    int a = 0;
    int b = 1;
    while (n-- > 0)
    {
        yield a;
        auto next = a + b;
        a = b;
        b = next;
    }
}

int main() {
    for (auto v : fib(35))
        cout << v << endl;
}

The range-based for in main() expands into:
{
    auto && __range = fib(35);
    for (auto __begin = __range.begin(),
              __end = __range.end()
         ;
         __begin != __end
         ;
         ++__begin)
    {
        auto v = *__begin;
        cout << v << endl;
    }
}
x86_x64 Windows ABI: Execution
generator<int> fib(int n), called as auto && __range = fib(35)
[Figure: stack and heap layout up to the first suspend ("Suspend!!!!").
Call: RCX = &__range, RDX = 35.
Stack: ret-addr, slot1 to slot4, savedRDI, savedRSI, savedRBX, savedRBP, ret-main, __range.
Registers in the body: RDI = n, RSI = a, RBX = b, RBP = $fp.
Heap: Coroutine Frame containing the Coroutine Promise and save slots for RDI, RSI, RBX and RIP.
Return: RAX = &__range.]
x86_x64 Windows ABI: Resume
generator<int>::iterator::operator ++(), reached from for(…;…; ++__begin)
struct iterator {
    iterator& operator ++() {
        resume_cb(); return *this; }
    …
    resumable_handle<Promise> resume_cb;
};
[Figure: stack and heap layout on resume.
Call: RCX = $fp.
Stack: ret-addr, slot1 to slot4, savedRDI, savedRSI, savedRBX, savedRBP, ret-main, __range.
Registers in the body: RDI = n, RSI = a, RBX = b, RBP = $fp.
Heap: Coroutine Frame containing the Coroutine Promise and save slots for RDI, RSI, RBX and RIP.]
await <expr>
If await_suspend returns bool, await <expr> expands into an expression equivalent of
{
    auto && __tmp = <expr>;
    if (! await_ready(__tmp) &&
        await_suspend(__tmp, <resumption-function-object>)) {
        suspend
        resume
    }
    if (<promise>.cancellation_requested()) goto <end-label>;
    return await_resume(__tmp);
}
Yield implementation
compiler: yield <expr> becomes await <Promise>.yield_value(<expr>)
library:
suspend_now generator<T>::promise_type::yield_value(T const& expr) {
this->current_value = &expr;
return{};
}
awaitable_overlapped_base
struct awaitable_overlapped_base : public OVERLAPPED
{
ULONG IoResult;
ULONG_PTR NumberOfBytesTransferred;
std::resumable_handle<> resume;
static void __stdcall io_complete_callback( PTP_CALLBACK_INSTANCE,
PVOID, PVOID Overlapped, ULONG IoResult,
ULONG_PTR NumberOfBytesTransferred,
PTP_IO)
{
auto o = reinterpret_cast<OVERLAPPED*>(Overlapped);
auto me = static_cast<awaitable_overlapped_base*>(o);
me->IoResult = IoResult;
me->NumberOfBytesTransferred = NumberOfBytesTransferred;
me->resume();
}
};
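The trick in awaitable_overlapped_base is that the per-operation state derives from the struct the OS hands back to the completion callback, so the callback can recover the whole awaitable with a downcast. The sketch below reproduces that pattern portably; fake_overlapped stands in for OVERLAPPED, std::function<void()> stands in for std::resumable_handle<>, and the callback signature is simplified. None of these names come from the Windows API.

```cpp
#include <cassert>
#include <functional>

// Portable stand-in for the OS-defined OVERLAPPED struct (hypothetical).
struct fake_overlapped {
    unsigned long internal;
    fake_overlapped() : internal(0) {}
};

// Per-operation state rides along by deriving from the struct the OS returns.
struct fake_awaitable_base : fake_overlapped {
    unsigned long io_result;
    unsigned long bytes_transferred;
    std::function<void()> resume;  // stands in for std::resumable_handle<>

    fake_awaitable_base() : io_result(0), bytes_transferred(0) {}

    // Simplified shape of a completion callback: the OS only gives back an
    // untyped pointer to the overlapped struct it was handed earlier.
    static void io_complete(void* overlapped, unsigned long result,
                            unsigned long transferred) {
        fake_overlapped* o = static_cast<fake_overlapped*>(overlapped);
        // Base-to-derived downcast recovers the full awaitable.
        fake_awaitable_base* me = static_cast<fake_awaitable_base*>(o);
        me->io_result = result;
        me->bytes_transferred = transferred;
        me->resume();  // continue the suspended coroutine
    }
};
```

The downcast is only valid because every pointer given to the OS really does point at the base subobject of a fake_awaitable_base, which is the same invariant the original code relies on.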
Dial awaitable
class Dial : public awaitable_overlapped_base {
ports::endpoint remote;
Connection conn;
public:
Dial(string_view str, unsigned short port) : remote(str, port) {}
bool await_ready() const { return false; }
void await_suspend(std::resumable_handle<> cb) {
resume = cb;
conn.handle = detail::TcpSocket::Create();
detail::TcpSocket::Bind(conn.handle, ports::endpoint("0.0.0.0"));
conn.io = CreateThreadpoolIo(conn.handle, &io_complete_callback, 0,0);
if (conn.io == nullptr) throw_error(GetLastError());
StartThreadpoolIo(conn.io);
auto error = detail::TcpSocket::Connect(conn.handle, remote, this);
if (error) { CancelThreadpoolIo(conn.io); throw_error(GetLastError()); }
}
Connection await_resume() {
if (conn.error) throw_error(conn.error);
return std::move(conn);
}
};
Connection::Read
auto Connection::Read(void* buf, size_t bytes) {
class awaiter : public awaitable_overlapped_base {
void* buf; size_t size;
Connection * conn;
public:
awaiter(void* b, size_t n, Connection * c): buf(b), size(n), conn(c) {}
bool await_ready() const { return false; }
void await_suspend(std::resumable_handle<> cb) {
resume = cb;
StartThreadpoolIo(conn->io);
auto error = TcpSocket::Read(conn->handle, buf, (uint32_t)size, this);
if (error)
{ CancelThreadpoolIo(conn->io); throw_error(error); }
}
int await_resume() {
if (IoResult)
{ throw_error(IoResult); }
return (int)this->NumberOfBytesTransferred;
}
};
return awaiter{ buf, bytes, this };
}
asynchronous iterator helper: await for
goroutine foo(channel<int> & input) {
await for(auto && i : input) {
cout << "got: " << i << endl;
}
}
await for expands into:
{
auto && __range = range-init;
for ( auto __begin = await (begin-expr),
__end = end-expr;
__begin != __end;
await ++__begin )
{
for-range-declaration = *__begin;
statement
}
}
Recursive Tree Walk (Stackful) from N3985
void traverse(node_t* n, std::push_coroutine<std::string>& yield) {
    if (n->left) traverse(n->left, yield);
    yield(n->value);
    if (n->right) traverse(n->right, yield);
}
node_t* root1 = create_tree();
node_t* root2 = create_tree();
std::pull_coroutine<std::string> reader1([&](auto& yield) { traverse(root1, yield); });
std::pull_coroutine<std::string> reader2([&](auto& yield) { traverse(root2, yield); });
std::cout << "equal = " << std::equal(begin(reader1), end(reader1), begin(reader2))
          << std::endl;
Recursive Tree Walk (Stackless)
generator<std::string> traverse(node_t* n)
{
    if (n->left) yield traverse(n->left);
    yield n->value;
    if (n->right) yield traverse(n->right);
}
node_t* root1 = create_tree();
node_t* root2 = create_tree();
auto reader1 = traverse(root1);
auto reader2 = traverse(root2);
std::cout << "equal = " << std::equal(begin(reader1), end(reader1),
                                      begin(reader2))
          << std::endl;