Add Ripper::Tree subclass #5679

kddnewton · 2022-03-17T21:31:12Z

This is a new subclass of the Ripper parser. It is based on/extracted from the work done in https://github.com/ruby-syntax-tree/syntax_tree. This subclass is similar to SexpBuilderPP in that it provides individual shapes per production rule from ripper. However, there are a couple of differences:

Tree uses class instances instead of arrays to represent nodes.
Comments are automatically attached to the various nodes when parsing is finished.
A couple of additional nodes are added for clarity (i.e., ArgStar, Not, PinnedBegin, etc.).
Every node has location information attached to it (as opposed to just the scanner event nodes).
There's a standard interface for descending the tree (child_nodes).

Additionally, each node has pattern matching (both array and hash patterns) as well as pretty_print defined to make it easier to work with.

I think we should ship this with Ruby to make it easier to build tools (like formatters and linters) on top of this structure. It also will make it easier to change the parser in the future (if we ever do) because any tools built on top of these objects will not have to worry about the specific order of the nodes (unlike the SexpBuilderPP version) since everything has a named field. Additionally, since everything is written in pure Ruby, it makes it easy for other implementations of Ruby to benefit.

This is a new subclass of the Ripper parser. It is based on/extracted from the work done in https://github.com/ruby-syntax-tree/syntax_tree. This subclass is similar to SexpBuilderPP in that it provides individual shapes per production rule from ripper. However, there are a couple of differences: * Tree uses class instances instead of arrays to represent nodes. * Comments are automatically attached to the various nodes when parsing is finished. * A couple of additional nodes are added for clarity (i.e., ArgStar, Not, PinnedBegin, etc.). * Every node has location information attached to it (as opposed to just the scanner event nodes). * There's a standard interface for descending the tree (child_nodes). Additionally, each node has pattern matching (both array and hash patterns) as well as pretty_print defined to make it easier to work with. I think we should ship this with Ruby to make it easier to build tools (like formatters and linters) on top of this structure. It also will make it easier to change the parser in the future (if we ever do) because any tools built on top of these objects will not have to worry about the specific order of the nodes (unlike the SexpBuilderPP version) since everything has a named field.

Morriar · 2022-03-24T17:51:17Z

Considering these points:

A couple of additional nodes are added for clarity (i.e., ArgStar, Not, PinnedBegin, etc.).
Every node has location information attached to it (as opposed to just the scanner event nodes).
There's a standard interface for descending the tree (child_nodes).

If we're going to add new nodes for clarity, I think we could also take the liberty to add a common ancestor to all nodes. This common ancestor could hold the location and comments. Maybe even define an empty child_nodes method that would be overridden by the sub classes.

Having a common ancestor to all nodes has at least two advantages when using the resulting tree to write tools:

If you work with typing (be it RBS or Sorbet), this makes signatures easier to write. Consider this:

class Visitor
  def visit: (node: Node) -> void
end

instead of:

class Visitor
  def visit: (node: Alias | ARef | ARefField | ArgParen | ...) -> void
end

and it lets you properly/easily type the accept method:

class Node
  def accept: (v: Visitor) -> void
end

instead of:

class Alias
  def accept: (v: Visitor) -> void
end

class ARef
  def accept: (v: Visitor) -> void
end

...

When writing tools it allows you to attach behavior to all nodes at once without requiring to monkey patch them or rely on meta-programming:

class Node
  def to_some_format_i_need
    {
      kind: class.name,
      loc: location.to_s,
      children: child_nodes.map(&:to_some_format_i_need)
    }
  end
end

instead of defining this method on each node separately.

dleavitt · 2022-03-24T18:45:11Z

Awesome work on this + would be great to have something like this as part of ripper. Couple of thoughts:

Any reason the node classes aren't in their own files? It's sort of nice to have them live near their handlers, but having an 11k+ line file really reduces the accessibility of this code.
The lack of test coverage for ripper has been a pain-point for me. Would be great if some of the tests from syntax_tree could come with this, maybe with the goal to add more. I think with a good test suite it's possible this feature could reduce the total maintenance burden of ripper, since breaking changes to Ruby's AST would be caught earlier.

delonnewman mentioned this pull request Apr 14, 2022

Parsing delonnewman/el-toolkit#55

Open

kddnewton closed this May 12, 2022

kddnewton deleted the ripper-tree branch May 12, 2022 13:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add Ripper::Tree subclass #5679

Add Ripper::Tree subclass #5679

Uh oh!

kddnewton commented Mar 17, 2022 •

edited

Loading

Uh oh!

Morriar commented Mar 24, 2022

Uh oh!

dleavitt commented Mar 24, 2022 •

edited

Loading

Uh oh!

Uh oh!

Add Ripper::Tree subclass #5679

Add Ripper::Tree subclass #5679

Uh oh!

Conversation

kddnewton commented Mar 17, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Morriar commented Mar 24, 2022

Uh oh!

dleavitt commented Mar 24, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

kddnewton commented Mar 17, 2022 •

edited

Loading

dleavitt commented Mar 24, 2022 •

edited

Loading