Feature #6047
Closed
read_all: Grow buffer exponentially in generic case
Description
In the general case, read_all grows its buffer linearly by just the amount that is currently read from the underlying source. This results in a linear number of reallocs. It might turn out beneficial if the buffer were grown exponentially by multiplying with a constant factor (e.g. 1.5 or 2), thus resulting in only a logarithmic number of reallocs.
I will provide a patch and benchmarks, but I'm already opening this issue so I won't forget.
See also https://bugs.ruby-lang.org/issues/5353 for more details.
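To make the linear-versus-logarithmic claim concrete, here is a minimal sketch that only counts resize steps; it does not touch io.c. The helper `count_resizes`, the 1024-byte chunk size and the 1 MiB total are assumptions chosen for illustration.

```ruby
# Hypothetical helper: counts how many times a buffer has to be resized
# before it can hold `total` bytes. The block maps the current capacity
# to the next one, so different growth policies can be compared.
def count_resizes(total, initial)
  capacity = initial
  resizes = 0
  while capacity < total
    capacity = yield(capacity)
    resizes += 1
  end
  resizes
end

CHUNK = 1024         # assumed amount read per iteration
TOTAL = 1024 * 1024  # assumed input size: 1 MiB

linear   = count_resizes(TOTAL, CHUNK) { |cap| cap + CHUNK } # grow by one chunk
doubling = count_resizes(TOTAL, CHUNK) { |cap| cap * 2 }     # grow by factor 2

puts "linear: #{linear}, doubling: #{doubling}"
# prints "linear: 1023, doubling: 10"
```

With these assumed numbers, the linear policy needs 1023 resizes and doubling needs 10, which is the gap the description refers to.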
Updated by ko1 (Koichi Sasada) over 12 years ago
ping. status?
Do you need help or comments?
Updated by MartinBosslet (Martin Bosslet) over 12 years ago
ko1 (Koichi Sasada) wrote:
ping. status?
Do you need help or comments?
Thanks for your help, to be honest, I haven't tried so far. Can we leave it at 2.0.0 target for now? If I run into problems, I'll ask here!
Updated by normalperson (Eric Wong) over 12 years ago
Martin Bosslet wrote:
In the general case, read_all grows its buffer linearly by just the
amount that is currently read from the underlying source. This results
in a linear number of reallocs. It might turn out beneficial if the
buffer were grown exponentially by multiplying with a constant factor
(e.g. 1.5 or 2), thus resulting in only a logarithmic number of
reallocs.
I think growing the buffer exponentially makes sense.
I would enforce a hard limit (probably <= 8 MB) for each growth, to:
- discourage read_all() for large files, it's very wasteful and usually hurts performance
- prevent memory exhaustion for edge cases (especially on 32-bit)
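A capped growth step along these lines could look roughly like the sketch below. The name `next_capacity` and the constant `MAX_GROWTH` are hypothetical, and the 8 MiB value is simply the upper bound floated above, not something taken from io.c.

```ruby
# Assumed cap on a single growth step, taken from the "<= 8 MB" suggestion.
MAX_GROWTH = 8 * 1024 * 1024

# Hypothetical helper: doubles the capacity while the buffer is small,
# but never adds more than MAX_GROWTH bytes in one step.
def next_capacity(capacity)
  capacity + [capacity, MAX_GROWTH].min
end

small = next_capacity(1024)              # below the cap: plain doubling
large = next_capacity(64 * 1024 * 1024)  # above the cap: grows by 8 MiB only

puts small                   # 2048
puts large / (1024 * 1024)   # 72 (MiB)
```

Growth is exponential up to 8 MiB and linear in 8 MiB steps afterwards, so a single step can never request more than the cap beyond what is already held.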
Updated by mame (Yusuke Endoh) over 12 years ago
- Target version changed from 2.0.0 to 2.6
My experience also shows that it is useless to open a ticket for a reminder to myself :-)
I'm setting to next minor tentatively, but if it is really just a performance improvement (i.e., it affects no external modules), you can commit it to 2.0.0 before code freeze.
--
Yusuke Endoh
Updated by zzak (zzak _) over 9 years ago
- Assignee changed from MartinBosslet (Martin Bosslet) to 7150
Updated by hsbt (Hiroshi SHIBATA) over 2 years ago
- Status changed from Assigned to Open
Updated by byroot (Jean Boussier) over 2 years ago
I just tried my hand at this one: https://github.com/ruby/ruby/pull/6829
I think such a change would make sense. Not that IO#read without a size is common, but might as well do something sensible.
Updated by Anonymous over 2 years ago
- Status changed from Open to Closed
Applied in changeset git|7390eb43fe1bfb069af80ba8f73f7dc4999df0fd.
io.c (read_all): grow the buffer exponentially when size is unknown
[Feature #6047]
Currently it's grown by BUFSIZ (1024) on every iteration, which is a bit wasteful.
Instead we can double the capacity whenever there is less than BUFSIZ capacity left.
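The policy described in the commit message can be modelled in Ruby roughly as follows. This is a sketch, not the C code in io.c: `read_all_sketch` is a hypothetical name, `capacity` is a tracked number rather than the real allocation (Ruby's String manages its own memory), and BUFSIZ is fixed at the 1024 quoted above although the real value is platform-dependent.

```ruby
require "stringio"

BUFSIZ = 1024 # value quoted in the commit message; assumed here

# Hypothetical sketch: reads everything from `io` and returns
# [data, growths], where growths counts how often the modelled
# capacity was doubled.
def read_all_sketch(io)
  data = String.new(capacity: BUFSIZ) # binary string, initial capacity hint
  capacity = BUFSIZ
  growths = 0
  loop do
    # Double only when less than BUFSIZ bytes of room remain.
    if capacity - data.bytesize < BUFSIZ
      capacity *= 2
      growths += 1
    end
    # Try to fill the remaining room; nil signals end of input.
    chunk = io.read(capacity - data.bytesize)
    break if chunk.nil? || chunk.empty?
    data << chunk
  end
  [data, growths]
end

data, growths = read_all_sketch(StringIO.new("x" * 10_000))
puts "read #{data.bytesize} bytes with #{growths} capacity doublings"
```

For an input of this size, growing by BUFSIZ per iteration would take roughly one growth per kilobyte, whereas the doubling policy needs only a handful.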