From: Eric Wong
Date: 2017-06-01T09:18:10+00:00
Subject: [ruby-core:81500] Re: [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

ko1@atdot.net wrote:
> Issue #13618 has been updated by ko1 (Koichi Sasada).
>
> Thank you for your great work.

You're welcome :)

> # summary of this comment
>
> Recent days I'm thinking about this feature's "safety" or "dependability".
> Because of this issue, I think it is difficult to employ this feature right now.

I disagree. I do not recall Ruby 1.8 Threads being a big problem
for Rubyists. Modern Rubyists seem OK using native Threads
("OK", not "great" :) We can improve APIs (maybe more
Queue/SizedQueue, less Mutex).

What auto-Fiber provides is an option to reduce memory usage and
improve scalability without rewriting existing synchronous
codebases (e.g. Rack + middlewares).

In my experience, Ruby gained more users during the 1.8 era, when
its memory usage was low thanks to green threads, and lost users
as 1.9/2.x memory usage increased (and I guess 3rd-party libs
grew, too).

The safety difference between auto-Fiber and Thread is a minor
point. Lowering memory usage while retaining compatibility with
existing synchronous code is my reason for working on this.

> * It is difficult to know which operations should be run in atomic (users write code without checking atomicity).
> * It is difficult to find out which method can switch.
> * Not only user writing code, but also all library code can switch fibers.
> * This means that we need to check all of library code to know that they don't violate atomic assumptions.
> * It introduced non-deterministic behavior (with `Fiber.yield` it will be deterministic behavior and it is easy to reproduce the problem).

Yes; we will document all switch points in RDoc and NEWS, of
course (maybe write a separate doc/auto-fiber.rdoc)

> This kind of difficulties are same as threading.
> The impact can be smaller than threading (because threading can switch
> anywhere and it is very hard to predict the behavior.
> Auto-fibers switch only at blocking operations especially on
> IO operations).

Right, I think auto-fiber will have some of the same (probably
minor) difficulties as threading. However, I do not believe it
is a big problem since Rubyists should already be used to
threading.

> # Consideration
>
> To solve this behavior, we have several choice.
>
> (1) Introduce synchronization mechanisms for auto-fibers
>
> Like Mutex, Queue and so on.

Yes, I think Queue/SizedQueue should be able to respect Fiber
scheduling boundaries. Queue/SizedQueue are especially useful
and I plan to implement auto-fiber support for them.

I am not sure about Mutex... (can we defer to Matz for decisions?)

> On Ruby 1.8 era, we have `Thread.exclusive` to prohibit thread-switching.
>
> I don't want to choice this option because it is what I want to avoid from Ruby.

Right. Maybe Mutex#synchronize can prohibit auto-switch
(or it can show a warning or raise at auto-switch points).

> (2) Introduce limitations
>
> The problem "It is difficult to find out which method can switch" is because we need to check whole of code. If we can restrict the auto-fiber switching, this problem can be smaller.

Right now for IO, it is double opt-in:
it requires _both_ Fiber#start and IO#nonblock=true.

Sidenote: As a Rubyist who studies the Linux kernel, I consider
it imperative to give Rubyists the choice to make real blocking
syscalls (not the "fake blocking" with auto-fiber/green threads).
This is because Linux can optimize "wake-one" situations to:

a) give round-robin load distribution across independent processes
b) avoid thundering herd with multiple threads/processes
c) (I forget...)
(sorry I forgot to note this in my original ticket, but it will
be in the final docs)

> (2-1) Introduce Fiber switching methods
>
> Instead of implicit blocking (IO) operations, introduce explicit blocking operations can switch. We can check all of source code by grep.

I am against this. Instead, I want it to be easy to port
existing Thread-aware codebases over. Notice my example test
script used net/http from stdlib. I would like to use existing
stdlib (net/*, webrick, drb, ...) as much as possible without
modifications. That means many existing Ruby libraries can
work transparently.

> (2-2) Check context
>
> Permit fiber switching only at permitted places, by block, pragma, and so on.
>
> ```
> # auto-fiber: true # <- this file can switch fibers automatically
> Fiber.new(auto: true){
>   ...
>   io.read # can switch
>   ...
>   something_defined_in_gem # can't switch
>   ...
> }
> ```
>
> I think other languages like Python, JavaScript employs this idea. I need to survey more on such languages.

I do not like this, either. I admit I am not familiar with
those languages. I think we should strive to make existing
Thread-aware Ruby code work well, and as transparently as
possible...

> (3) Something else cleaver
>
> Introducing debugger is one choice (maybe it is easy than threading issues).
> But we can't avoid troubles (and maybe the troubles should be not frequent, non-reproducible).

Adding TracePoint support to help track auto-switch should be
done (honestly I have never used this feature in ruby :x).

And yes, I think native threading bugs are trickier to track
down than auto-Fiber switching. Just remember, today we have
native threading and things are OK. And I think there were more
happy Rubyists in 1.8 days.

> Other option is to introduce hooks to implement auto-fibers and provide auto-fibers by gems and advanced users know the above risk use this feature. But not good idea because we can't provide good way to write for many people.
>
> thought?

Again, no.
I am really in favor of making it easy to port existing
Thread-aware code to auto-Fiber.

Again, from my experience, I do not believe many Ruby
programmers had safety problems with 1.8 green threads. Today
we have Rubyists who are already used to 1.9/2.x native Threads.
The safety improvement is a minor point.
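
For reference, the deterministic switching that ko1 contrasts with auto-switching can be sketched with the existing core Fiber API (Fiber.new, Fiber#resume, Fiber.yield). This is only a sketch: it does not use the auto-fiber patch, and `make_worker` and `run_round_robin` are hypothetical names made up for illustration. Each fiber gives up control only at its explicit `Fiber.yield`, so the interleaving is the same on every run.

```ruby
# Explicit fiber switching with the existing core Fiber API.
# make_worker and run_round_robin are illustrative helpers,
# not part of the auto-fiber patch.
def make_worker(name, log)
  Fiber.new do
    2.times do |i|
      log << "#{name} step #{i}"
      Fiber.yield # the only point where this fiber gives up control
    end
    log << "#{name} done"
  end
end

def run_round_robin
  log = []
  fibers = [make_worker("a", log), make_worker("b", log)]
  # Each fiber yields twice, so three resumes run it to completion.
  3.times { fibers.each(&:resume) }
  log
end

puts run_round_robin
```

With auto-fiber, the switch points would instead be the blocking IO calls inside existing code, which is why documenting them in RDoc and NEWS matters.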