Skip to content

Add Ripper::Tree subclass #5679

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

kddnewton
Copy link
Contributor

@kddnewton kddnewton commented Mar 17, 2022

This is a new subclass of the Ripper parser. It is based on/extracted from the work done in https://github.com/ruby-syntax-tree/syntax_tree. This subclass is similar to SexpBuilderPP in that it provides individual shapes per production rule from ripper. However, there are a couple of differences:

  • Tree uses class instances instead of arrays to represent nodes.
  • Comments are automatically attached to the various nodes when parsing is finished.
  • A couple of additional nodes are added for clarity (i.e., ArgStar, Not, PinnedBegin, etc.).
  • Every node has location information attached to it (as opposed to just the scanner event nodes).
  • There's a standard interface for descending the tree (child_nodes).

Additionally, each node has pattern matching (both array and hash patterns) as well as pretty_print defined to make it easier to work with.

I think we should ship this with Ruby to make it easier to build tools (like formatters and linters) on top of this structure. It also will make it easier to change the parser in the future (if we ever do) because any tools built on top of these objects will not have to worry about the specific order of the nodes (unlike the SexpBuilderPP version) since everything has a named field. Additionally, since everything is written in pure Ruby, it makes it easy for other implementations of Ruby to benefit.

This is a new subclass of the Ripper parser. It is based on/extracted from the work done in https://github.com/ruby-syntax-tree/syntax_tree. This subclass is similar to SexpBuilderPP in that it provides individual shapes per production rule from ripper. However, there are a couple of differences:

* Tree uses class instances instead of arrays to represent nodes.
* Comments are automatically attached to the various nodes when parsing is finished.
* A couple of additional nodes are added for clarity (i.e., ArgStar, Not, PinnedBegin, etc.).
* Every node has location information attached to it (as opposed to just the scanner event nodes).
* There's a standard interface for descending the tree (child_nodes).

Additionally, each node has pattern matching (both array and hash patterns) as well as pretty_print defined to make it easier to work with.

I think we should ship this with Ruby to make it easier to build tools (like formatters and linters) on top of this structure. It also will make it easier to change the parser in the future (if we ever do) because any tools built on top of these objects will not have to worry about the specific order of the nodes (unlike the SexpBuilderPP version) since everything has a named field.
@Morriar
Copy link
Contributor

Morriar commented Mar 24, 2022

Considering these points:

  • A couple of additional nodes are added for clarity (i.e., ArgStar, Not, PinnedBegin, etc.).
  • Every node has location information attached to it (as opposed to just the scanner event nodes).
  • There's a standard interface for descending the tree (child_nodes).

If we're going to add new nodes for clarity, I think we could also take the liberty to add a common ancestor to all nodes. This common ancestor could hold the location and comments. Maybe even define an empty child_nodes method that would be overridden by the sub classes.

Having a common ancestor to all nodes has at least two advantages when using the resulting tree to write tools:

  1. If you work with typing (be it RBS or Sorbet), this makes signatures easier to write. Consider this:

    class Visitor
      def visit: (node: Node) -> void
    end

    instead of:

    class Visitor
      def visit: (node: Alias | ARef | ARefField | ArgParen | ...) -> void
    end

    and it lets you properly/easily type the accept method:

    class Node
      def accept: (v: Visitor) -> void
    end

    instead of:

    class Alias
      def accept: (v: Visitor) -> void
    end
    
    class ARef
      def accept: (v: Visitor) -> void
    end
    
    ...
  2. When writing tools it allows you to attach behavior to all nodes at once without requiring to monkey patch them or rely on meta-programming:

    class Node
      def to_some_format_i_need
        {
          kind: class.name,
          loc: location.to_s,
          children: child_nodes.map(&:to_some_format_i_need)
        }
      end
    end

    instead of defining this method on each node separately.

@dleavitt
Copy link

dleavitt commented Mar 24, 2022

Awesome work on this + would be great to have something like this as part of ripper. Couple of thoughts:

  • Any reason the node classes aren't in their own files? It's sort of nice to have them live near their handlers, but having an 11k+ line file really reduces the accessibility of this code.
  • The lack of test coverage for ripper has been a pain-point for me. Would be great if some of the tests from syntax_tree could come with this, maybe with the goal to add more. I think with a good test suite it's possible this feature could reduce the total maintenance burden of ripper, since breaking changes to Ruby's AST would be caught earlier.

@kddnewton kddnewton closed this May 12, 2022
@kddnewton kddnewton deleted the ripper-tree branch May 12, 2022 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants