Diffstat (limited to 'doc/io_streams.rdoc')
-rw-r--r-- | doc/io_streams.rdoc | 350 |
1 files changed, 350 insertions, 0 deletions
diff --git a/doc/io_streams.rdoc b/doc/io_streams.rdoc
new file mode 100644
index 0000000000..b686d67eb5
--- /dev/null
+++ b/doc/io_streams.rdoc
@@ -0,0 +1,350 @@
+== \IO Streams
+
+Ruby supports processing data as \IO streams;
+that is, as data that may be read, re-read, written, re-written,
+and traversed via iteration.
+
+Core classes with such support include:
+
+- IO, and its derived class File.
+- {StringIO}[rdoc-ref:StringIO]: for processing a string.
+- {ARGF}[rdoc-ref:ARGF]: for processing files cited on the command line.
+
+Pre-existing stream objects that are referenced by global variables or constants include:
+
+- $stdin: read-only instance of \IO.
+- $stdout: write-only instance of \IO.
+- $stderr: write-only instance of \IO.
+- \ARGF: read-only instance of \ARGF.
+
+You can create stream objects:
+
+- \File:
+
+  - File.new: returns a new \File object.
+  - File.open: passes a new \File object to the given block.
+
+- \IO:
+
+  - IO.new: returns a new \IO object for the given integer file descriptor.
+  - IO.open: passes a new \IO object to the given block.
+  - IO.popen: returns a new \IO object that is connected to the $stdin
+    and $stdout of a newly-launched subprocess.
+  - Kernel#open: returns a new \IO object connected to a given source:
+    stream, file, or subprocess.
+
+- \StringIO:
+
+  - StringIO.new: returns a new \StringIO object.
+  - StringIO.open: passes a new \StringIO object to the given block.
+
+(You cannot create an \ARGF object, but one already exists.)
+
+=== About the Examples
+
+Many examples here use these variables:
+
+  # English text with newlines.
+  text = <<~EOT
+    First line
+    Second line
+
+    Fourth line
+    Fifth line
+  EOT
+
+  # Russian text.
+  russian = "\u{442 435 441 442}" # => "тест"
+
+  # Binary data.
+  data = "\u9990\u9991\u9992\u9993\u9994"
+
+  # Text file.
+  File.write('t.txt', text)
+
+  # File with Russian text.
+  File.write('t.rus', russian)
+
+  # File with binary data.
+  f = File.new('t.dat', 'wb:UTF-16')
+  f.write(data)
+  f.close
+
+=== Position
+
+An \IO stream has a nonnegative integer _position_,
+which is the byte offset at which the next read or write is to occur;
+the relevant methods:
+
+- +#tell+ (aliased as #pos): Returns the current position (in bytes) in the stream:
+
+    f = File.new('t.txt')
+    f.tell     # => 0
+    f.gets     # => "First line\n"
+    f.tell     # => 11
+    f.close
+
+- +#pos=+: Sets the position of the stream (in bytes):
+
+    f = File.new('t.txt')
+    f.tell     # => 0
+    f.pos = 20 # => 20
+    f.tell     # => 20
+    f.close
+
+- +#seek+: Sets the position of the stream to a given integer +offset+
+  (in bytes), with respect to a given constant +whence+, which is one of:
+
+  - +:CUR+ or <tt>IO::SEEK_CUR</tt>:
+    Repositions the stream to its current position plus the given +offset+:
+
+      f = File.new('t.txt')
+      f.tell            # => 0
+      f.seek(20, :CUR)  # => 0
+      f.tell            # => 20
+      f.seek(-10, :CUR) # => 0
+      f.tell            # => 10
+      f.close
+
+  - +:END+ or <tt>IO::SEEK_END</tt>:
+    Repositions the stream to its end plus the given +offset+:
+
+      f = File.new('t.txt')
+      f.tell            # => 0
+      f.seek(0, :END)   # => 0  # Repositions to stream end.
+      f.tell            # => 47
+      f.seek(-20, :END) # => 0
+      f.tell            # => 27
+      f.seek(-40, :END) # => 0
+      f.tell            # => 7
+      f.close
+
+  - +:SET+ or <tt>IO::SEEK_SET</tt>:
+    Repositions the stream to the given +offset+:
+
+      f = File.new('t.txt')
+      f.tell            # => 0
+      f.seek(20, :SET)  # => 0
+      f.tell            # => 20
+      f.seek(40, :SET)  # => 0
+      f.tell            # => 40
+      f.close
+
+- +#rewind+: Positions the stream to the beginning:
+
+    f = File.new('t.txt')
+    f.tell     # => 0
+    f.gets     # => "First line\n"
+    f.tell     # => 11
+    f.rewind   # => 0
+    f.tell     # => 0
+    f.close
+
+=== Lines
+
+Some reader methods in \IO streams are line-oriented;
+such a method reads one or more lines,
+which are separated by an implicit or explicit line separator.
+
+These methods are included (except as noted) in classes Kernel, IO, File,
+and {ARGF}[rdoc-ref:ARGF]:
+
+- +#each_line+ - passes each line to the block; not in Kernel:
+
+    f = File.new('t.txt')
+    f.each_line {|line| p line }
+
+  Output:
+
+    "First line\n"
+    "Second line\n"
+    "\n"
+    "Fourth line\n"
+    "Fifth line\n"
+
+  The reading may begin mid-line:
+
+    f = File.new('t.txt')
+    f.pos = 27
+    f.each_line {|line| p line }
+
+  Output:
+
+    "rth line\n"
+    "Fifth line\n"
+
+- +#gets+ - returns the next line (which may begin mid-line):
+
+    f = File.new('t.txt')
+    f.gets      # => "First line\n"
+    f.gets      # => "Second line\n"
+    f.pos = 27
+    f.gets      # => "rth line\n"
+    f.readlines # => ["Fifth line\n"]
+    f.gets      # => nil
+
+- +#readline+ - like #gets, but raises an exception at end-of-file;
+  not in StringIO.
+
+- +#readlines+ - returns all remaining lines in an array;
+  may begin mid-line:
+
+    f = File.new('t.txt')
+    f.pos = 19
+    f.readlines # => ["ine\n", "\n", "Fourth line\n", "Fifth line\n"]
+    f.readlines # => []
+
+Each of these methods may be called with:
+
+- An optional line separator, +sep+.
+- An optional line-size limit, +limit+.
+- Both +sep+ and +limit+.
+
+==== Line Separator
+
+The default line separator is given by the global variable <tt>$/</tt>,
+whose value is by default <tt>"\n"</tt>.
+The line to be read next is all data from the current position
+to the next line separator:
+
+  f = File.new('t.txt')
+  f.gets # => "First line\n"
+  f.gets # => "Second line\n"
+  f.gets # => "\n"
+  f.gets # => "Fourth line\n"
+  f.gets # => "Fifth line\n"
+  f.close
+
+You can specify a different line separator:
+
+  f = File.new('t.txt')
+  f.gets('l')   # => "First l"
+  f.gets('li')  # => "ine\nSecond li"
+  f.gets('lin') # => "ne\n\nFourth lin"
+  f.gets        # => "e\n"
+  f.close
+
+There are two special line separators:
+
+- +nil+: The entire stream is read into a single string:
+
+    f = File.new('t.txt')
+    f.gets(nil) # => "First line\nSecond line\n\nFourth line\nFifth line\n"
+    f.close
+
+- <tt>''</tt> (the empty string): The next "paragraph" is read
+  (paragraphs being separated by two consecutive line separators):
+
+    f = File.new('t.txt')
+    f.gets('') # => "First line\nSecond line\n\n"
+    f.gets('') # => "Fourth line\nFifth line\n"
+    f.close
+
+==== Line Limit
+
+The line to be read may be further defined by an optional integer argument +limit+,
+which specifies that the number of bytes returned may not be (much) longer
+than the given +limit+;
+a multi-byte character will not be split, and so a line may be slightly longer
+than the given limit.
+
+If +limit+ is not given, the line is determined only by +sep+.
+
+  # Text with 1-byte characters.
+  File.open('t.txt') {|f| f.gets(1) } # => "F"
+  File.open('t.txt') {|f| f.gets(2) } # => "Fi"
+  File.open('t.txt') {|f| f.gets(3) } # => "Fir"
+  File.open('t.txt') {|f| f.gets(4) } # => "Firs"
+  # No more than one line.
+  File.open('t.txt') {|f| f.gets(10) } # => "First line"
+  File.open('t.txt') {|f| f.gets(11) } # => "First line\n"
+  File.open('t.txt') {|f| f.gets(12) } # => "First line\n"
+
+  # Text with 2-byte characters, which will not be split.
+  File.open('t.rus') {|f| f.gets(1).size } # => 1
+  File.open('t.rus') {|f| f.gets(2).size } # => 1
+  File.open('t.rus') {|f| f.gets(3).size } # => 2
+  File.open('t.rus') {|f| f.gets(4).size } # => 2
+
+==== Line Separator and Line Limit
+
+With arguments +sep+ and +limit+ both given,
+the method combines the two behaviors:
+
+- Returns the next line as determined by line separator +sep+.
+- But returns no more bytes than are allowed by the limit.
+
+Example:
+
+  File.open('t.txt') {|f| f.gets('li', 20) } # => "First li"
+  File.open('t.txt') {|f| f.gets('li', 2) } # => "Fi"
+
+==== Line Number
+
+A readable \IO stream has a _line_ _number_,
+which is the non-negative integer line number
+in the stream where the next read will occur.
+
+A new stream initially has line number +0+.
+
+\Method IO#lineno returns the line number.
+
+Reading lines from a stream usually changes its line number:
+
+  f = File.new('t.txt', 'r')
+  f.lineno   # => 0
+  f.readline # => "First line\n"
+  f.lineno   # => 1
+  f.readline # => "Second line\n"
+  f.lineno   # => 2
+  f.readline # => "\n"
+  f.lineno   # => 3
+  f.eof?     # => false
+  f.close
+
+Iterating over lines in a stream usually changes its line number:
+
+  f = File.new('t.txt')
+  f.each_line do |line|
+    p "position=#{f.pos} eof?=#{f.eof?} lineno=#{f.lineno}"
+  end
+  f.close
+
+Output:
+
+  "position=11 eof?=false lineno=1"
+  "position=23 eof?=false lineno=2"
+  "position=24 eof?=false lineno=3"
+  "position=36 eof?=false lineno=4"
+  "position=47 eof?=true lineno=5"
+
+==== Line Options
+
+A number of \IO methods accept optional keyword arguments
+that determine how lines in a stream are to be treated:
+
+- +:chomp+: If +true+, line separators are omitted; default is +false+.
+
+=== Open and Closed \IO Streams
+
+A new \IO stream may be open for reading, open for writing, or both.
+
+You can close a stream using these methods:
+
+- +#close+ - closes the stream for both reading and writing.
+
+- +#close_read+ (not available in \ARGF) - closes the stream for reading.
+
+- +#close_write+ (not available in \ARGF) - closes the stream for writing.
+
+You can query whether a stream is closed using these methods:
+
+- +#closed?+ - returns whether the stream is closed.
+
+=== Stream End-of-File
+
+You can query whether a stream is at end-of-file using this method:
+
+- +#eof?+ (also aliased as +#eof+) -
+  returns whether the stream is at end-of-file.
+
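The closing and end-of-file queries described in the last two sections can be tried out without touching the file system. The following is a minimal sketch, not part of the patch itself; it assumes a read-write StringIO built from a mutable string stands in for a file, and the variable names +io+ and +states+ are illustrative only.

```ruby
require 'stringio'

# Assumption: a StringIO over a mutable string is opened for both
# reading and writing, so each direction can be closed separately.
io = StringIO.new(String.new("First line\nSecond line\n"))

states = []
states << io.closed?   # open in both directions
states << io.eof?      # nothing has been read yet

io.gets                # reads "First line\n"
io.gets                # reads "Second line\n"
states << io.eof?      # position is now at the end of the data

io.close_read
states << io.closed?   # still open for writing
io.close_write
states << io.closed?   # both directions are closed

p states
```

Note that +#closed?+ reports +true+ only once both directions are closed, which is why the query after +#close_read+ alone still reports an open stream.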