From: aselder@...
Date: 2018-11-15T20:15:24+00:00
Subject: [ruby-core:89821] [Ruby trunk Feature#14038] Use rb_execution_context_t instead of rb_thread_t to represent execution context

Issue #14038 has been updated by aselder (Andrew Selder).

FWIW, this change seems to cause segfaults on a regular basis on OS X:

https://bugs.ruby-lang.org/issues/14714
https://bugs.ruby-lang.org/issues/14561
https://bugs.ruby-lang.org/issues/15308
https://bugs.ruby-lang.org/issues/14334

----------------------------------------
Feature #14038: Use rb_execution_context_t instead of rb_thread_t to represent execution context
https://bugs.ruby-lang.org/issues/14038#change-74881

* Author: ko1 (Koichi Sasada)
* Status: Closed
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* Target version: 2.5
----------------------------------------
# Summary

This ticket proposes the following three changes.

* (1) Separate `rb_execution_context_t` from `rb_thread_t`.
* (2) Allocate `rb_fiber_t` for each Thread even if Threads don't make Fiber.
* (3) Use `rb_execution_context_t *ec` to represent VM states instead of `rb_thread_t *th`

The current patch is here:

https://github.com/ruby/ruby/compare/trunk...ko1:sep_con
https://github.com/ruby/ruby/compare/trunk...ko1:sep_con.patch

# Background

Recently Eric Wong introduced `rb_execution_context_t` to represent the "context", that is, the state of the VM (program counter, stack pointer and so on). You can find the discussion in this mailing list thread: `[ ruby-core:81054] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]`

Before the introduction of `rb_execution_context_t`, VM state was held by threads (`rb_thread_t`), and we needed to copy that state out of `rb_thread_t` to switch Fiber context.
I gathered all of the state into `rb_execution_context_t`, and now we only need to copy one contiguous `rb_execution_context_t` to switch Fiber context.

# Proposal

## Proposal (1) Separate `rb_execution_context_t` from `rb_thread_t`.

In this ticket I propose to separate `rb_execution_context_t` from `rb_thread_t`, so that `rb_thread_t` only contains a pointer to the separated `rb_execution_context_t` data. The `rb_execution_context_t` memory is managed by `rb_fiber_t`.

Before:

```
struct rb_thread_t {
    ...;
    rb_execution_context_t ec;
    ...;
}
```

After:

```
struct rb_thread_t {
    ...;
    rb_execution_context_t *ec; /* points to rb_fiber_t::ec_body */
    ...;
}
```

With this change, we only need to swap `ec` pointers to switch Fiber context.

## Proposal (2) Allocate `rb_fiber_t` for each Thread even if Threads don't make Fiber.

Currently the root Fiber (and its `rb_fiber_t`) is not allocated until another Fiber is created. For Proposal (1), we need to allocate an `rb_fiber_t` up front to provide the ec body. We could optimize this allocation with several techniques, but they introduce complexity, so I allocate an `rb_fiber_t` for each thread.

## Proposal (3) Use `rb_execution_context_t *ec` to represent VM states instead of `rb_thread_t *th`

Currently many functions accept `rb_thread_t *th` as the first argument to represent VM state. Most of that state now lives in `rb_execution_context_t`, so Proposal (1) introduces an indirect access `th->ec->...` to reach it. To avoid this performance overhead, we need to pass `ec` instead of `th` as the first argument.

# Evaluation

Comparing trunk with the modified version, the `vm2_fiber_switch` benchmark shows about a 10% speedup (not a big impact).

```
name                 trunk   modified
loop_whileloop2      0.113   0.116
vm2_fiber_switch*    2.500   2.242

Speedup ratio: compare with the result of `trunk' (greater is better)
name                 modified
loop_whileloop2      0.974
vm2_fiber_switch*    1.115
```

However, other micro-benchmarks show a slowdown because of the indirect access I explained in Proposal (3).
```
name                 trunk   modified
loop_whileloop       0.436   0.444
vm1_simplereturn*    0.487   0.672

Speedup ratio: compare with the result of `trunk' (greater is better)
name                 modified
loop_whileloop       0.981
vm1_simplereturn*    0.726
```

I believe we can improve this with "Proposal (3)".

# Foresight

For the Guild proposal, passing the context by `ec` is important. I want to introduce many other APIs that accept `ec` as the first argument, so that we don't need to call `GET_THREAD()` or `GET_EC()`, which access the system's native thread-local storage.

I will introduce this patch soon because it only changes internals. Ruby users can't observe any incompatible changes (hopefully).

Thanks,
Koichi
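
For readers who want to see Proposals (1) and (3) in miniature, here is a rough sketch in plain C. It is not the CRuby code: every `sketch_` name is hypothetical, and the structs are reduced to two integer fields standing in for the program counter and stack pointer. It only illustrates the idea that the fiber owns the context body, the thread holds a pointer to it, and a fiber switch becomes one pointer assignment.

```c
#include <assert.h>

/* Hypothetical, heavily simplified stand-in for rb_execution_context_t. */
typedef struct {
    int pc; /* stand-in for the program counter */
    int sp; /* stand-in for the stack pointer */
} sketch_ec_t;

/* Stand-in for rb_fiber_t: the fiber owns the context storage. */
typedef struct {
    sketch_ec_t ec_body;
} sketch_fiber_t;

/* Stand-in for rb_thread_t after Proposal (1): only a pointer remains. */
typedef struct {
    sketch_ec_t *ec; /* points to the running fiber's ec_body */
} sketch_thread_t;

/* Proposal (1): switching fibers swaps one pointer, no struct copy. */
void sketch_fiber_switch(sketch_thread_t *th, sketch_fiber_t *next)
{
    th->ec = &next->ec_body;
}

/* Taking th costs an extra load on every access: th->ec->pc. */
int sketch_pc_via_thread(const sketch_thread_t *th)
{
    return th->ec->pc;
}

/* Proposal (3): take ec directly, so the access is just ec->pc. */
int sketch_pc_via_ec(const sketch_ec_t *ec)
{
    return ec->pc;
}

/* Small demonstration; returns 0 if the switch behaves as described. */
int sketch_demo(void)
{
    sketch_fiber_t root = { { 1, 10 } };
    sketch_fiber_t other = { { 2, 20 } };
    sketch_thread_t th = { &root.ec_body };

    assert(sketch_pc_via_thread(&th) == 1);
    sketch_fiber_switch(&th, &other);
    assert(sketch_pc_via_ec(th.ec) == 2);
    assert(root.ec_body.pc == 1); /* the old context is left untouched */
    return 0;
}
```

Calling `sketch_demo()` from a `main` exercises the switch. The two accessor functions differ only in the extra `th->ec` load, which is the indirection that the `vm1_simplereturn` numbers above are attributed to.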