Bug #20761
closed
[DOC] `RubyVM::AbstractSyntaxTree.of` examples raise because parser is prism by default
Description
RubyVM::AbstractSyntaxTree.of(proc {1 + 2})
# => <internal:ast>:97:in 'RubyVM::AbstractSyntaxTree.of': cannot get AST for ISEQ compiled by prism (RuntimeError)
Same for the method example. Is this method even functional when prism is used, or is the prism gem able to do this somehow?
Updated by kddnewton (Kevin Newton) 8 months ago
- Status changed from Open to Closed
This is expected behavior. The instruction sequences will have different node ids, so it's not possible to retrieve the RubyVM::AbstractSyntaxTree representation of the AST. If you want to retrieve the Prism AST, you can do so using Prism.parse in the same way that error highlight does here: https://github.com/ruby/error_highlight/blob/452f78640c08ab277683416668a52d9fcfb6a26a/lib/error_highlight/base.rb#L57-L66.
Updated by Earlopain (Earlopain _) 8 months ago
kddnewton (Kevin Newton) wrote in #note-1:
This is expected behavior. The instruction sequences will have different node ids, so it's not possible to retrieve the RubyVM::AbstractSyntaxTree representation of the AST. If you want to retrieve the Prism AST, you can do so using Prism.parse in the same way that error highlight does here: https://github.com/ruby/error_highlight/blob/452f78640c08ab277683416668a52d9fcfb6a26a/lib/error_highlight/base.rb#L57-L66.
Thanks, I have already looked at that code since you linked to it in the PR that introduced this error. It seems to rely on a backtrace to work out which node to look for, but is that possible if I just have a proc that doesn't raise by itself? I'd guess the proc does have a node id, but I'm not sure how to retrieve it.
Updated by kddnewton (Kevin Newton) 8 months ago
For a proc you can mirror what RubyVM::AbstractSyntaxTree.of is doing under the hood using RubyVM::InstructionSequence, as in:
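require "prism"

my_proc = -> { 1 }
iseq = RubyVM::InstructionSequence.of(my_proc)
# Element 4 of the array form is a hash of misc metadata, including the node id.
node_id = iseq.to_a[4][:node_id]
# script_lines can be nil, in which case the source has to be read some other way.
source = iseq.script_lines.join
# Find the Prism node whose id matches the one recorded on the iseq.
Prism.parse(source).value.breadth_first_search { |node| node.node_id == node_id }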
Updated by kddnewton (Kevin Newton) 8 months ago
Note that all the APIs under RubyVM are subject to change, so you're effectively calling internal APIs here. Just so you know going forward.
Updated by Earlopain (Earlopain _) 8 months ago
This is great info, thank you! I'm aware of the non-guarantees for RubyVM, just something to live with. AbstractSyntaxTree.of already had the same "problem".
Regardless, should the AbstractSyntaxTree.of documentation be updated in some way, or be documented at all? It seems to work only if I explicitly opt out of prism.
Updated by kddnewton (Kevin Newton) 8 months ago
Yeah, I think it would make sense to update those docs to indicate that it only works when the instruction sequences were compiled with the old compiler.
Updated by Eregon (Benoit Daloze) 8 months ago
· Edited
@Earlopain Is there a reason you need this and cannot just use e.g. Prism.parse("proc {1 + 2}")?
A cleaner and more portable way to achieve this functionality from a Proc/Method would be a way to retrieve their precise source location, i.e. start & end byte offsets or equivalent (#source_location currently only gives the start line and column).
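To illustrate what "start & end byte offsets" would buy, here is a minimal sketch that parses the same one-liner with Prism and reads the offsets of the block node. The variable names are illustrative only, and it assumes the prism gem is available (it ships with recent Ruby releases).

```ruby
require "prism"

source = "proc {1 + 2}"
result = Prism.parse(source)

# The program has a single statement: the call to `proc` with a block.
call = result.value.statements.body.first
block = call.block

# Every Prism node carries a location with byte offsets into the source.
location = block.location
block_start = location.start_offset
block_end = location.end_offset
block_text = location.slice

puts "#{block_start}...#{block_end}: #{block_text}"
```

With offsets like these, recovering the exact source text of a block is a plain byte slice of the original string, with no need to search the tree by node id.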
Updated by Eregon (Benoit Daloze) 8 months ago
In general the entire RubyVM::AbstractSyntaxTree module should be considered deprecated (and maybe even removed), given it relies on parse.y internals and that matz said that the Prism API is the official Ruby API for parsing Ruby code (i.e. no matter which parser is used internally, the exposed API must be the one of Prism).
RubyVM::AbstractSyntaxTree was always experimental, unstable, only working on CRuby, etc. So now is a good time to no longer rely on it.
Updated by Earlopain (Earlopain _) 8 months ago
· Edited
Eregon (Benoit Daloze) wrote in #note-7:
@Earlopain Is there a reason you need this and cannot just use e.g. Prism.parse("proc {1 + 2}")?
A cleaner and portable way to achieve this functionality from a Proc/Method would be if there is a way to retrieve their precise source location, i.e. start & end byte offsets or equivalent (#source_location currently only gives the start line and column).
This question was specifically about usage that Rails recently adopted for printing the source code of a proc inside the assertion error message when doing assert_no_difference(-> { foo }). https://github.com/rails/rails/pull/52036. kddnewton has since raised a PR switching it over to RubyVM::InstructionSequence, which of course has the same interoperability problems. https://github.com/rails/rails/pull/53055
If source_location provided richer information, it would be an easy switch. The PR is basically already doing that through :code_location from the iseq. It also checks for script_lines, but I believe in practice that will mostly be nil, so the file gets read anyway. I remember an issue about it, maybe this one: https://bugs.ruby-lang.org/issues/8751, though I thought it had been discussed more recently.
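For reference, the slicing step could look roughly like the sketch below. This is not the code from the Rails PR: slice_source is a hypothetical helper, and the location layout ([start_line, start_column, end_line, end_column], with 1-based lines and 0-based byte columns) is an assumption about what :code_location holds. The lines and locations are hard-coded here, whereas real code would take them from script_lines or from reading the file.

```ruby
# Hypothetical helper: cut the text covered by a location out of source lines.
# `location` is assumed to be [start_line, start_column, end_line, end_column],
# with 1-based line numbers and 0-based byte columns.
def slice_source(lines, location)
  start_line, start_col, end_line, end_col = location
  selected = lines[(start_line - 1)..(end_line - 1)].map(&:dup)
  # Trim the end first so a single-line span keeps its start offset valid.
  selected[-1] = selected[-1].byteslice(0, end_col)
  selected[0] = selected[0].byteslice(start_col..-1)
  selected.join
end

# Stand-in for script_lines or File.readlines(path).
lines = ["assert_no_difference(-> { foo }) do\n", "  bar\n", "end\n"]
single_line = slice_source(lines, [1, 21, 1, 31])
puts single_line

multi_lines = ["x = proc do\n", "  1\n", "end\n"]
multi_line = slice_source(multi_lines, [1, 4, 3, 3])
puts multi_line
```

If #source_location exposed the same four numbers, the helper would work unchanged without touching RubyVM at all.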
Updated by Eregon (Benoit Daloze) 8 months ago
Thanks for the links. Yes, that seems a perfect use case for more information in #source_location; going through RubyVM APIs for that feels overly complicated for something so simple.
I found https://bugs.ruby-lang.org/issues/6012 which seems closest to what's needed here, and linked the 2 other related issues to it.