From: akr@...
Date: 2018-02-21T00:26:08+00:00
Subject: [ruby-core:85702] [Ruby trunk Bug#14400] IO#ungetc and IO#ungetbyte documentation is inconsistent with the behavior

Issue #14400 has been updated by akr (Akira Tanaka).

I think it is possible to grow the IO read buffer (rbuf) if it is properly locked.

Since the IO buffer is modified without the GVL (to avoid blocking the whole process), growing (realloc) the buffer without a lock is too dangerous.

It seems rb_io_t currently has no lock for rbuf.
(I guess write_lock is not a lock for rbuf.)

----------------------------------------
Bug #14400: IO#ungetc and IO#ungetbyte documentation is inconsistent with the behavior
https://bugs.ruby-lang.org/issues/14400#change-70505

* Author: Eregon (Benoit Daloze)
* Status: Feedback
* Priority: Normal
* Assignee: akr (Akira Tanaka)
* Target version:
* ruby -v: ruby 2.6.0dev (2018-01-25 trunk 62035) [x86_64-linux]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
The documentation of IO#ungetc states:

> Pushes back one character (passed as a parameter) onto ios, such that a
> subsequent buffered character read will return it. Only one character may be
> pushed back before a subsequent read operation (that is, you will be able to
> read only the last of several characters that have been pushed back). Has no
> effect with unbuffered reads (such as IO#sysread).

And similar for IO#ungetbyte:

> Pushes back bytes (passed as a parameter) onto ios, such that a
> subsequent buffered read will return it. Only one byte may be pushed back
> before a subsequent read operation (that is, you will be able to read only the
> last of several bytes that have been pushed back). Has no effect with
> unbuffered reads (such as IO#sysread).

The part about only one byte/character is inconsistent with the actual behavior,
most notably because both of these methods accept a String with multiple characters as argument.
~~~ ruby
STDIN.ungetc "Hello World!"
STDIN.read 12 #=> "Hello World!"

STDIN.ungetbyte "Foo Bar"
STDIN.read 7 #=> "Foo Bar"
~~~

(There are even specs for it:
https://github.com/ruby/spec/blob/7fa22023d69620ea3ff4d0ed2eb71fd7b02dd950/core/io/ungetc_spec.rb#L98
https://github.com/ruby/spec/blob/7fa22023d69620ea3ff4d0ed2eb71fd7b02dd950/core/io/ungetbyte_spec.rb#L21)

> that is, you will be able to read only the last of several characters that have been pushed back

contradicts what happens.

The behavior with large Strings is confusing.
It seems to allow arbitrarily large strings (but only if there was no earlier ungetbyte/the buffer was empty?).

~~~
$ pry
[1] pry(main)> STDIN.ungetbyte "a"*10_000
=> nil
[2] pry(main)> STDIN.ungetbyte "a"*10_000
IOError: ungetbyte failed

$ pry
[1] pry(main)> STDIN.ungetbyte "a"*100_000
=> nil
[2] pry(main)> STDIN.ungetbyte "a"*100_000
IOError: ungetbyte failed
from (pry):2:in `ungetbyte'

$ pry
[1] pry(main)> STDIN.ungetbyte "a"*100_000
=> nil
[2] pry(main)> STDIN.read(100_000).size
=> 100000
[3] pry(main)> STDIN.ungetbyte "a"*100_000
=> nil
[4] pry(main)> STDIN.read(100_000).size
=> 100000
~~~

And it is not as simple as two consecutive ungetbyte calls being forbidden:

~~~
$ pry
[1] pry(main)> STDIN.ungetbyte "a"*10_000_000
=> nil
[2] pry(main)> STDIN.ungetbyte "a"
IOError: ungetbyte failed
from (pry):2:in `ungetbyte'

$ pry
[1] pry(main)> STDIN.ungetbyte "a"
=> nil
[2] pry(main)> STDIN.ungetbyte "a"
=> nil
~~~

So how are those methods supposed to behave?
Can the documentation be updated to match the behavior and/or the behavior be fixed to be simpler?

I also wonder when those methods are useful.
There seem to be very few usages in the stdlib. Maybe they should just be removed?
It seems easy to make a custom IO wrapper/buffer supporting pushing characters/bytes back.
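
To make the last point concrete, here is a minimal sketch of such a wrapper. The class name `PushbackReader` and its two methods are hypothetical and not part of Ruby or its stdlib. It assumes a byte-oriented reader is enough, keeps pushed-back bytes in its own String, and never touches the wrapped IO's read buffer, so repeated or large pushes cannot fail the way the sessions above do.

```ruby
require "stringio"

# Hypothetical wrapper (not a stdlib class): pushed-back bytes live in a
# separate binary String instead of the wrapped IO's internal read buffer.
class PushbackReader
  def initialize(io)
    @io = io
    @pushback = String.new(encoding: Encoding::BINARY)
  end

  # Push a String (or a single byte given as an Integer) back.
  # The most recently pushed bytes are returned first by #read.
  def ungetbyte(bytes)
    bytes = (bytes & 0xff).chr if bytes.is_a?(Integer)
    @pushback.prepend(bytes.b)
    nil
  end

  # Read up to +length+ bytes, draining pushed-back bytes before
  # falling through to the wrapped IO. Returns nil at end of input.
  def read(length)
    out = @pushback.slice!(0, length)
    if out.bytesize < length
      rest = @io.read(length - out.bytesize)
      out << rest if rest
    end
    out.empty? && length > 0 ? nil : out
  end
end

reader = PushbackReader.new(StringIO.new("world"))
reader.ungetbyte("hello ")
reader.read(11) #=> "hello world"
```

This only illustrates the idea for bytes. A character-aware `ungetc`, `gets`, and thread safety would need more work, and whether such a wrapper is an acceptable replacement for the built-in methods is the open question raised above.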