From: knu@...
Date: 2018-10-19T03:30:07+00:00
Subject: [ruby-core:89466] [Ruby trunk Feature#14781] Enumerator#generate

Issue #14781 has been updated by knu (Akinori MUSHA).

zverok (Victor Shepelev) wrote:
> @knu
> The _ultimate_ goal for my proposal is, in fact, promoting Enumerator as a "Ruby way" for doing all-the-things with loops; not just "new useful feature".
>
> That's why I feel really uneasy about your changes to the proposal.

Thanks for your quick feedback, and for bringing up this issue.

> **drop**
> ```ruby
> # from: `drop: 2` is part of Enumerator.from API
> Enumerator.from([node], drop: 2, &:parent).map(&:name)
> # generate: `drop(2)` is part of standard Enumerator API
> Enumerator.generate(node, &:parent).take(6).map(&:name).drop(2)
> ```

I presume `.take(6)` was inserted by mistake, but with or without it, the `map` and `drop` calls that follow belong to Enumerable and are Array-based operations that create an intermediate array per call. So I consider them Array/Enumerable API rather than Enumerator API. Creating intermediate arrays is not only a waste of memory but also goes against the key concept of Enumerator: to deal with an object as a stream, which may be infinite.

Adding `.lazy` before `.drop(2)` can be a cure, but then the value you get is a lazy enumerator that is incompatible with a non-lazy enumerator. For instance, Lazy#map, Lazy#select etc. return Lazy objects, so you can't always pass one to methods that expect a normal Enumerable object. I've always thought that a Lazy#eager that turns a lazy enumerator back into a non-lazy enumerator would be nice, but `.lazy.map{}.eager` would look messy anyway.

> ```ruby
> # implicit "stop on nil" is part of Enumerator.from convention that code reader should be aware of
> ```

I think it's good and reasonable default behavior to treat nil as an end. Taking your Octokit example, the block could be `{ |response| response.rels[:next]&.get }` to make it go through all pages and automatically stop if nil were treated as an end.
You omitted a `.take_while` in the example, but you'd get an error if there were fewer than 3 pages. You'd almost always need to either explicitly raise StopIteration in the initial block or chain `.take_while`/`.take` if there were no default end, and the choice between them is not obvious.

> **start with array** (I believe 1 and 0 initial values are the MOST used cases)
> ```ruby
> # from: we should start from empty array, expression nothing but Enumerator.from API limitation
> Enumerator.from([]) { 0 }.take(10)
> # generate: no start value
> Enumerator.generate { 0 }.take(10)
> ```

The limitation only came from what the word `from` sounds like. I picked the name `from`, and `Enumerator.from {}` just didn't sound right to me, so I made the argument mandatory. You can just default the first argument to `[]` if it reads and writes better, possibly with a different name than `from`, which I won't insist on.

> ```ruby
> # from: work with one value requires not forgetting to arrayify it
> Enumerator.from([1], &:succ).take(10)
> # generate: just use the value
> Enumerator.generate(1, &:succ).take(10)
> ```

Yeah, due to our keyword arguments being pseudo ones, you can't use variable-length arguments for a list of objects that might end with a hash. We'll hopefully be getting it right by Ruby 3.0. There's much room for consideration of the name and method signature. Perhaps multiple factory methods could work better.

> ```ruby
> # from: "we pass as much of previous values as initial array had" convention
> Enumerator.from([0, 1]) { |i, j| i + j }.take(10)
> # generate: regular value enumeration, next block receives exactly what previous returns
> Enumerator.generate([0, 1]) { |i, j| [j, i + j] }.take(10).map(&:last)
> # ^ yes, it will require additional trick to include 0 in final result, but I believe this is worthy sacrifice
> ```

The former directly generates an infinite Fibonacci sequence and that's a major difference.
Taking the first few elements with `.take` is just for testing (assertion) purposes and not part of the use case. When solving a problem like "Find the least n such that \sum_{k=1}^{n} fib(k) >= 1000", `take` wouldn't work optimally.

> The problem with "API complication" is inconsistency. Like, a newcomer may ask: Why `Enumerator.from` has "this handy `drop: 2` initial arg", and `each` don't? Use cases could exist, too!

I understand that sentiment, but it's no surprise that a factory/constructor method of a dedicated class often takes many tunables while individual instance methods do not. If people all said they need it as a generic feature, it wouldn't be a bad idea to me to consider adding something like Enumerable#skip(n) that would return an offset enumerator.

----------------------------------------
Feature #14781: Enumerator#generate
https://bugs.ruby-lang.org/issues/14781#change-74506

* Author: zverok (Victor Shepelev)
* Status: Feedback
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
This is an alternative proposal to `Object#enumerate` (#14423), which was considered by many to be a good idea, but with unsure naming and too radical (`Object` extension). This one is _less_ radical and, at the same time, more powerful.

**Synopsis**:

* `Enumerator.generate(initial, &block)`: produces an infinite sequence where each next element is calculated by applying the block to the previous one; `initial` is the first element of the sequence;
* `Enumerator.generate(&block)`: the same; the first element of the sequence is the result of calling the block with no args.

This method makes it possible to produce enumerators that replace many common `while` and `loop` cycles in the same way `#each` replaces `for`.
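The semantics in the synopsis can be approximated with plain `Enumerator.new`. The following is only a minimal sketch under stated assumptions, not the reference implementation: the module name `GenerateSketch` and the `NO_INITIAL` sentinel are hypothetical names introduced here for illustration.

```ruby
# Hypothetical stand-in for the proposed Enumerator.generate.
# GenerateSketch and NO_INITIAL are illustrative names, not part of the proposal.
module GenerateSketch
  # Sentinel that distinguishes "no initial value" from an explicit nil.
  NO_INITIAL = Object.new.freeze

  def self.generate(initial = NO_INITIAL, &block)
    raise ArgumentError, 'no block given' unless block

    Enumerator.new do |yielder|
      # First element: `initial` if given, otherwise the block called with no args.
      current = initial.equal?(NO_INITIAL) ? block.call : initial
      # Kernel#loop rescues StopIteration, so a block that raises it
      # ends the sequence cleanly instead of propagating the error.
      loop do
        yielder << current
        current = block.call(current)
      end
    end
  end
end

p GenerateSketch.generate(1, &:succ).take(5)
# => [1, 2, 3, 4, 5]
p GenerateSketch.generate([0, 1]) { |f0, f1| [f1, f0 + f1] }.take(10).map(&:first)
# => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

In this sketch the sequence is infinite unless the block raises StopIteration, so callers would normally chain `.take`, `.take_while`, or `.lazy` to bound it.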
**Examples:**

With initial value

```ruby
# Infinite sequence
p Enumerator.generate(1, &:succ).take(5)
# => [1, 2, 3, 4, 5]

# Easy Fibonacci
p Enumerator.generate([0, 1]) { |f0, f1| [f1, f0 + f1] }.take(10).map(&:first)
# => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

require 'date'

# Find next Tuesday
p Enumerator.generate(Date.today, &:succ).detect { |d| d.wday == 2 }
# => #<Date: ...>

# Tree navigation
# ---------------
require 'nokogiri'
require 'open-uri'

# Find some element on page, then make list of all parents
p Nokogiri::HTML(open('https://www.ruby-lang.org/en/'))
  .at('a:contains("Ruby 2.2.10 Released")')
  .yield_self { |a| Enumerator.generate(a, &:parent) }
  .take_while { |node| node.respond_to?(:parent) }
  .map(&:name)
# => ["a", "h3", "div", "div", "div", "div", "div", "div", "body", "html"]

# Pagination
# ----------
require 'octokit'

Octokit.stargazers('rails/rails')
# ^ this method returned just an array, but has set `.last_response` to the full response, with data
# and pagination. So now we can do this:
p Enumerator.generate(Octokit.last_response) { |response|
    response.rels[:next].get # pagination: `get` fetches next Response
  }
  .first(3)                  # take just 3 pages of stargazers
  .flat_map(&:data)          # `data` is parsed response content (stargazers themselves)
  .map { |h| h[:login] }
# => ["wycats", "brynary", "macournoyer", "topfunky", "tomtt", "jamesgolick", ...
```

Without initial value

```ruby
# Random search
target = 7
p Enumerator.generate { rand(10) }.take_while { |i| i != target }.to_a
# => [0, 6, 3, 5,....]

# External while condition
require 'strscan'
scanner = StringScanner.new('7+38/6')
p Enumerator.generate { scanner.scan(%r{\d+|[-+*/]}) }.slice_after { scanner.eos? }.first
# => ["7", "+", "38", "/", "6"]

# Potential message loop system:
Enumerator.generate { Message.receive }.take_while { |msg| msg != :exit }
```

**Reference implementation**: https://github.com/zverok/enumerator_generate

I want to **thank** all peers that participated in the discussion here, on Twitter and Reddit.

---Files--------------------------------
enumerator_from.rb (3.16 KB)

-- 
https://bugs.ruby-lang.org/