From: Matthew Kerwin
Date: 2018-08-08T21:14:25+10:00
Subject: [ruby-core:88352] Re: [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

[snipping some bits, because I can only speak to what I know]

On Wed, 8 Aug 2018 at 18:50, Eric Wong wrote:
>
> samuel@oriontransfer.net wrote:
>
> > In particular, when handling HTTP/2 with multiple streams,
> > it's tricky to get good performance because utilising multiple
> > threads is basically impossible (and this applies to Ruby in
> > general). With HTTP/1, multiple "streams" could easily be
> > multiplexed across multiple processes.
>
> I'm no expert on HTTP/2, but I don't believe HTTP/2 was built
> with high-throughput in mind. By "high-throughput", I mean
> capable of maxing out the physical network or storage.
>

It was originally invented to reduce perceived latency, both in
terms of time-to-first-paint and time-to-last-byte, in solitary
servers as well as data centres and CDNs. As such, throughput was
definitely a goal, but not the only one.

There is some synchronisation: the server has to read a few bytes
of each frame it receives before it can demux them to independent
handlers; and when transmitting you have to block for CONTINUATION
frames if any are in progress, and for flow control if you're
sending DATA. But aside from those bottlenecks, each
request/response can be handled completely in parallel. Does that
really have that big of an impact on throughput?
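To put a number on "a few bytes": every frame starts with a fixed
9-octet header (RFC 7540, section 4.1), and reading that header is
the only part of the receive side that has to happen serially. Very
roughly (untested sketch; `handlers` is just a stand-in for however
you hand payloads off to per-stream handlers):

    # Read one frame off the connection and hand it to the handler
    # for its stream.  Only the 9-octet header read is serial; after
    # that, streams don't need to care about each other.
    def read_and_dispatch(sock, handlers)
      header = sock.read(9) or return              # connection closed
      length    = ("\0".b + header.byteslice(0, 3)).unpack1('N')
      type      = header.getbyte(3)
      flags     = header.getbyte(4)
      stream_id = header.byteslice(5, 4).unpack1('N') & 0x7fffffff
      payload   = length.zero? ? ''.b : sock.read(length)
      (handlers[stream_id] ||= []) << [type, flags, payload]
    end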
> At least, multiplexing multiple streams over a single TCP
> connection doesn't make any sense as a way to improve
> throughput. Rather, HTTP/2 was meant to reduce latency by
> avoiding TCP connection setup overhead, and maybe avoiding
> slow-start-after-idle (by having less idle time). In other
> words, HTTP/2 aims to make better use of a
> heavy-in-memory-but-often-idle resource.
>

It shouldn't be that hard to saturate your network card, if you've
got enough data to write, and the other end can consume it fast
enough. The single TCP connection and application-layer flow
control are meant to avoid problems like congestion and
bufferbloat, on top of reducing slow-start, TIME_WAIT, etc., so
throughput should in theory be pretty high.

I guess ramming it all into a single TLS stream doesn't help, as
there is some fairly hefty overhead that necessarily runs in a
single thread. I'd like to say that's why I argued so hard for
to be included in the spec, but it's actually just coincidental.

> > What this means is that a single HTTP/2 connection, even with
> > multiple streams, is limited to a single thread with the
> > fiber-based/green-thread design.
>
> > I actually see two sides to this: It limits bad connections to
> > a single thread, which is actually a feature in some ways. On
> > the other hand, you can't completely depend on multiplexing
> > HTTP/2 streams to improve performance.
>
> Right.
>
> > On the other hand, any green-thread based design is probably
> > going to suffer from this problem, unless a work pool is used
> > for actually generating responses. In the case of
> > `async-http`, it exposes streaming requests and responses, so
> > this isn't very easy to achieve.
>

Hmm, I think that's what I just said. But then, horses for courses
-- if a protocol is designed one way, and an application is
designed another, there won't be a great mesh.

> Exactly. As I've been saying all along: use different concurrency
> primitives for different things: fork (or Guilds) for
> CPU/memory-bound processing; green threads and/or nonblocking
> I/O for low-throughput transfers (virtually all public Internet
> stuff), native Threads for high-throughput transfers
> (local/LAN/LFN).
>
> So you could use a green thread to coordinate work to the work
> pool (forked processes), and still use a green thread to
> serialize the low-throughput response back to the client.
>
> This is also why it's desirable (but not a priority) to be able
> to migrate green-threads to different Threads/Guilds for load
> balancing. Different stages of an application response will
> shift from being CPU/memory-bound to low-throughput trickles.
>

Yeah, all of this.

[snipped the rest]

Cheers
--
Matthew Kerwin
https://matthew.kerwin.net.au/
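PS: a rough, untested sketch of that split, just to make it
concrete: a forked child does the CPU-bound "rendering", a Fiber
hands the job over and then trickles the result back to the slow
client, and a dumb resume loop stands in for the auto-scheduler this
ticket proposes. All the names (and the fake upcase "rendering") are
made up:

    require 'socket'
    require 'fiber'

    # Work pool: a forked child does the CPU/memory-bound rendering.
    def spawn_worker
      parent_end, child_end = UNIXSocket.pair
      pid = fork do
        parent_end.close
        while (job = child_end.gets)
          child_end.puts(job.chomp.upcase * 3)  # stand-in for real work
        end
      end
      child_end.close
      [pid, parent_end]
    end

    pid, worker = spawn_worker

    # A pipe standing in for a slow, public-Internet client socket.
    slow_reader, slow_writer = IO.pipe

    # Green thread: hand the job to the work pool, then trickle the
    # rendered response back out in small writes, yielding in between.
    responder = Fiber.new do
      worker.puts "hello"              # coordinate work to the pool
      response = worker.gets           # wait for the rendered result
      response.chomp.chars.each_slice(4) do |chunk|
        slow_writer.write(chunk.join)  # low-throughput trickle
        Fiber.yield                    # let other green threads run
      end
      slow_writer.close
    end

    # Dumb resume loop; a real auto-scheduler (what this ticket is
    # about) would resume whenever the relevant fd is ready.
    responder.resume while responder.alive?

    puts slow_reader.read              # => "HELLOHELLOHELLO"
    worker.close
    Process.wait(pid)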