From: eregontp@...
Date: 2019-05-17T12:56:40+00:00
Subject: [ruby-core:92696] [Ruby trunk Feature#14844] Future of RubyVM::AST?

Issue #14844 has been updated by Eregon (Benoit Daloze).


@mame Thank you for the reply.

Could you or @yui-knk propose a description to include in the documentation, summarizing what was said?

Could you also give your opinion on accessing Node members by name (https://bugs.ruby-lang.org/issues/14844#note-13)?

> Ripper does not reproduce the details including parser-level optimization.

What kind of details? Could you give an example?
Things like OPCALL instead of CALL? Is that useful for any tool?

I tried a simple expression to compare Ripper and RubyVM::AbstractSyntaxTree:

```ruby
pry(main)> Ripper.sexp("def m(a) a * 2 end")
=> [:program,
 [[:def,
   [:@ident, "m", [1, 4]],
   [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil, nil, nil, nil]],
   [:bodystmt, [[:binary, [:var_ref, [:@ident, "a", [1, 9]]], :*, [:@int, "2", [1, 13]]]], nil, nil, nil]]]]

pry(main)> RubyVM::AbstractSyntaxTree.parse("def m(a) a * 2 end")
=> (SCOPE@1:0-1:18
 tbl: []
 args: nil
 body:
   (DEFN@1:0-1:18
    mid: :m
    body:
      (SCOPE@1:0-1:18
       tbl: [:a]
       args:
         (ARGS@1:6-1:7
          pre_num: 1
          pre_init: nil
          opt: nil
          first_post: nil
          post_num: 0
          post_init: nil
          rest: nil
          kw: nil
          kwrest: nil
          block: nil)
       body: (OPCALL@1:9-1:14 (LVAR@1:9-1:10 :a) :* (ARRAY@1:13-1:14 (LIT@1:13-1:14 2) nil)))))
```

Indeed, the RubyVM::AbstractSyntaxTree version seems easier to read (and access once we have `RubyVM::AST::Node#[:field_name]`).
I think one of the main gains is that node fields are named, while they are just a flat Array in `Ripper.sexp`.
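
To illustrate what access by name could look like, here is a minimal sketch. It does not use the real API: `SketchNode`, `FIELD_NAMES` and `field` are hypothetical names, and the Struct only stands in for `RubyVM::AbstractSyntaxTree::Node`, which likewise exposes `type` and `children`. The field names are copied from the inspect output above and are an assumption, not a documented contract.

```ruby
# Hypothetical stand-in for RubyVM::AbstractSyntaxTree::Node (#type, #children).
SketchNode = Struct.new(:type, :children)

# Assumed field names per node type, copied from the inspect output above.
FIELD_NAMES = {
  SCOPE: %i[tbl args body],
  DEFN:  %i[mid body]
}.freeze

# Look up a child by field name instead of by position in #children.
def field(node, name)
  names = FIELD_NAMES.fetch(node.type)
  index = names.index(name)
  raise KeyError, "no field #{name} on #{node.type}" if index.nil?
  node.children[index]
end

# Simplified mirror of "def m(a) a * 2 end" (args and body left out).
inner = SketchNode.new(:SCOPE, [[:a], nil, nil])
defn  = SketchNode.new(:DEFN, [:m, inner])

field(defn, :mid)                # => :m
field(field(defn, :body), :tbl)  # => [:a]
```

With positional access the same lookups would be `defn.children[0]` and `defn.children[1].children[0]`, which only make sense if one already knows the layout of each node type.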

OTOH, things are far from perfectly clear (so I think "experimental/not for serious use" seems appropriate currently).
For instance, one has to manually associate arguments, which are given only as a count such as `pre_num`, with their names in `tbl`.
Optional arguments seem exposed more clearly, by having a node under `ARGSnode[:opt]`; however, the OPT_ARG nodes look nested like a cons-list instead of being an Array, which would be more intuitive.
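
As a rough sketch of the manual work this implies, consider a method such as `def m(b,a,c=3,d=4)`. The code below is not the real API: `StubNode`, `flatten_opt_args` and `required_names` are hypothetical names, and the Struct only stands in for a real node (which also has `type` and `children`). It assumes that the required parameters are the first `pre_num` entries of `tbl`, and that each OPT_ARG holds its LASGN first and the next OPT_ARG (or nil) second.

```ruby
# Hypothetical stand-in for RubyVM::AbstractSyntaxTree::Node (#type, #children).
StubNode = Struct.new(:type, :children)

# Walk an OPT_ARG cons-list (head, rest) and collect the heads in an Array.
def flatten_opt_args(opt)
  list = []
  while opt
    head, rest = opt.children
    list << head
    opt = rest
  end
  list
end

# Assumption: required positional parameters are the first pre_num names in tbl.
def required_names(tbl, pre_num)
  tbl.first(pre_num)
end

# Hand-built mirror of the `opt:` field for "def m(b,a,c=3,d=4) a * 2 end".
opt_d = StubNode.new(:OPT_ARG, [StubNode.new(:LASGN, [:d, StubNode.new(:LIT, [4])]), nil])
opt_c = StubNode.new(:OPT_ARG, [StubNode.new(:LASGN, [:c, StubNode.new(:LIT, [3])]), opt_d])

optional = flatten_opt_args(opt_c).map { |lasgn| lasgn.children.first }
required = required_names([:b, :a, :c, :d], 2)

optional  # => [:c, :d]
required  # => [:b, :a]
```

If the optional arguments were exposed as an Array in the first place, the `flatten_opt_args` step would not be needed.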

So if we compare a slightly more complex example with the `parser` gem, we see there are lots of opportunities to make RubyVM::AST easier to access/process/read/understand:

```ruby
pry(main)> require 'parser/current'

pry(main)> RubyVM::AbstractSyntaxTree.parse("def m(b,a,c=3,d=4) a * 2 end")
=> (SCOPE@1:0-1:28
 tbl: []
 args: nil
 body:
   (DEFN@1:0-1:28
    mid: :m
    body:
      (SCOPE@1:0-1:28
       tbl: [:b, :a, :c, :d]
       args:
         (ARGS@1:6-1:17
          pre_num: 2
          pre_init: nil
          opt: (OPT_ARG@1:10-1:17 (LASGN@1:10-1:13 :c (LIT@1:12-1:13 3)) (OPT_ARG@1:14-1:17 (LASGN@1:14-1:17 :d (LIT@1:16-1:17 4)) nil))
          first_post: nil
          post_num: 0
          post_init: nil
          rest: nil
          kw: nil
          kwrest: nil
          block: nil)
       body: (OPCALL@1:19-1:24 (LVAR@1:19-1:20 :a) :* (ARRAY@1:23-1:24 (LIT@1:23-1:24 2) nil)))))

pry(main)> Parser::CurrentRuby.parse("def m(b,a,c=3,d=4) a * 2 end")
=> s(:def, :m,
  s(:args,
    s(:arg, :b),
    s(:arg, :a),
    s(:optarg, :c,
      s(:int, 3)),
    s(:optarg, :d,
      s(:int, 4))),
  s(:send,
    s(:lvar, :a), :*,
    s(:int, 2)))
```

I think it would be good to take inspiration from `parser` here, which makes it really convenient to access the AST and still seems to not lose any important information.

In fact, in which cases would the additional things in RubyVM::AST, such as the SCOPE nodes, be useful beyond debugging the MRI parser?
Would any tool be able to do anything with those that it could not without?

I understand exposing the internal AST directly is the simplest implementation-wise.
But I think the result is quite sub-optimal to access, process and understand.
Would it be better to expose an AST more similar to, or even exactly the same as, the one from the `parser` gem?

----------------------------------------
Feature #14844: Future of RubyVM::AST? 
https://bugs.ruby-lang.org/issues/14844#change-78054

* Author: rmosolgo (Robert Mosolgo)
* Status: Open
* Priority: Normal
* Assignee: yui-knk (Kaneko Yuichiro)
* Target version: 
----------------------------------------
Hi! Thanks for all your great work on the Ruby language. 

I saw the new RubyVM::AST module in 2.6.0-preview2 and I quickly went to try it out. 

I'd love to have a well-documented, user-friendly way to parse and manipulate Ruby code using the Ruby standard library, so I'm pretty excited to try it out. (I've been trying to learn Ripper recently, too: https://ripper-preview.herokuapp.com/, https://rmosolgo.github.io/ripper_events/ .)

Based on my exploration, I opened a small PR on GitHub with some documentation: https://github.com/ruby/ruby/pull/1888

I'm curious though, are there future plans for this module? For example, we might: 

- Add more details about each node (for example, we could expose the names of identifiers and operators through the node classes)
- Document each node type 

I see there is a lot more information in the C structures that we could expose, and I'm interested to help out if it's valuable. What do you think? 


-- 
https://bugs.ruby-lang.org/
