From: Matthew Kerwin
Date: 2018-08-08T21:14:25+10:00
Subject: [ruby-core:88352] Re: [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

[snipping some bits, because I can only speak to what I know]

On Wed, 8 Aug 2018 at 18:50, Eric Wong wrote:
>
> samuel@oriontransfer.net wrote:
>
> > In particular, when handling HTTP/2 with multiple streams,
> > it's tricky to get good performance because utilising multiple
> > threads is basically impossible (and this applies to Ruby in
> > general). With HTTP/1, multiple "streams" could easily be
> > multiplexed across multiple processes.
>
> I'm no expert on HTTP/2, but I don't believe HTTP/2 was built
> with high-throughput in mind. By "high-throughput", I mean
> capable of maxing out the physical network or storage.
>

It was originally invented to reduce perceived latency, both in
terms of time-to-first-paint and time-to-last-byte, in solitary
servers as well as data centres and CDNs. As such, throughput was
definitely a goal, but not the only one.

There is some synchronisation: the server has to read a few bytes
of each frame it receives before it can demux them to independent
handlers; and when transmitting you have to block for CONTINUATION
frames if any are in progress, and for flow control if you're
sending DATA. But aside from those bottlenecks, each
request/response can be handled completely in parallel. Does that
really have that big of an impact on throughput?
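To put a number on "a few bytes": every frame starts with a fixed
9-octet header (RFC 7540, section 4.1), and reading that header is
the only part of the receive side that has to happen serially. Very
roughly (untested sketch; `handlers` is just a stand-in for however
you hand payloads off to per-stream handlers):

    # Read one frame off the connection and hand it to the handler
    # for its stream.  Only the 9-octet header read is serial; after
    # that, streams don't need to care about each other.
    def read_and_dispatch(sock, handlers)
      header = sock.read(9) or return              # connection closed
      length    = ("\0".b + header.byteslice(0, 3)).unpack1('N')
      type      = header.getbyte(3)
      flags     = header.getbyte(4)
      stream_id = header.byteslice(5, 4).unpack1('N') & 0x7fffffff
      payload   = length.zero? ? ''.b : sock.read(length)
      (handlers[stream_id] ||= []) << [type, flags, payload]
    end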
> At least, multiplexing multiple streams over a single TCP
> connection doesn't make any sense as a way to improve
> throughput. Rather, HTTP/2 was meant to reduce latency by
> avoiding TCP connection setup overhead, and maybe avoiding
> slow-start-after-idle (by having less idle time). In other
> words, HTTP/2 aims to make better use of a
> heavy-in-memory-but-often-idle resource.
>

It shouldn't be that hard to saturate your network card, if you've
got enough data to write, and the other end can consume it fast
enough. The single TCP connection and application-layer flow
control are meant to avoid problems like congestion and
bufferbloat, on top of reducing slow-start, TIME_WAIT, etc., so
throughput should in theory be pretty high.

I guess ramming it all into a single TLS stream doesn't help, as
there is some fairly hefty overhead that necessarily runs in a
single thread. I'd like to say that's why I argued so hard for
to be included in the spec, but it's actually just coincidental.

> > What this means is that a single HTTP/2 connection, even with
> > multiple streams, is limited to a single thread with the
> > fiber-based/green-thread design.
>
> > I actually see two sides to this: It limits bad connections to
> > a single thread, which is actually a feature in some ways. On
> > the other hand, you can't completely depend on multiplexing
> > HTTP/2 streams to improve performance.
>
> Right.
>
> > On the other hand, any green-thread based design is probably
> > going to suffer from this problem, unless a work pool is used
> > for actually generating responses. In the case of
> > `async-http`, it exposes streaming requests and responses, so
> > this isn't very easy to achieve.
>

Hmm, I think that's what I just said. But then, horses for courses
-- if a protocol is designed one way, and an application is
designed another, there won't be a great mesh.

> Exactly. As I've been saying all along: use different concurrency
> primitives for different things: fork (or Guilds) for
> CPU/memory-bound processing; green threads and/or nonblocking
> I/O for low-throughput transfers (virtually all public Internet
> stuff), native Threads for high-throughput transfers
> (local/LAN/LFN).
>
> So you could use a green thread to coordinate work to the work
> pool (forked processes), and still use a green thread to
> serialize the low-throughput response back to the client.
>
> This is also why it's desirable (but not a priority) to be able
> to migrate green-threads to different Threads/Guilds for load
> balancing. Different stages of an application response will
> shift from being CPU/memory-bound to low-throughput trickles.
>

Yeah, all of this.

[snipped the rest]

Cheers
--
Matthew Kerwin
https://matthew.kerwin.net.au/
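PS: a rough, untested sketch of that split, just to make it
concrete: a forked child does the CPU-bound "rendering", a Fiber
hands the job over and then trickles the result back to the slow
client, and a dumb resume loop stands in for the auto-scheduler this
ticket proposes. All the names (and the fake upcase "rendering") are
made up:

    require 'socket'
    require 'fiber'

    # Work pool: a forked child does the CPU/memory-bound rendering.
    def spawn_worker
      parent_end, child_end = UNIXSocket.pair
      pid = fork do
        parent_end.close
        while (job = child_end.gets)
          child_end.puts(job.chomp.upcase * 3)  # stand-in for real work
        end
      end
      child_end.close
      [pid, parent_end]
    end

    pid, worker = spawn_worker

    # A pipe standing in for a slow, public-Internet client socket.
    slow_reader, slow_writer = IO.pipe

    # Green thread: hand the job to the work pool, then trickle the
    # rendered response back out in small writes, yielding in between.
    responder = Fiber.new do
      worker.puts "hello"              # coordinate work to the pool
      response = worker.gets           # wait for the rendered result
      response.chomp.chars.each_slice(4) do |chunk|
        slow_writer.write(chunk.join)  # low-throughput trickle
        Fiber.yield                    # let other green threads run
      end
      slow_writer.close
    end

    # Dumb resume loop; a real auto-scheduler (what this ticket is
    # about) would resume whenever the relevant fd is ready.
    responder.resume while responder.alive?

    puts slow_reader.read              # => "HELLOHELLOHELLO"
    worker.close
    Process.wait(pid)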