summaryrefslogtreecommitdiff
path: root/node.c
AgeCommit message (Collapse)Author
2024-11-29Remove a useless checkYusuke Endoh
Here `nb` should never be NULL. If it were, the following `nb->buffer_list` would be strange. A follow-up to ddd8da4b6ba3dfcca21ca710e7cef2fa3b9632d7 Notes: Merged: https://github.com/ruby/ruby/pull/12208
2024-07-20Change UNDEF Node structureyui-knk
Change UNDEF Node to hold their items to keep the original grammar structure. For example: ``` undef a, b ``` Before: ``` @ NODE_BLOCK (id: 4, line: 1, location: (1,6)-(1,10))* +- nd_head (1): | @ NODE_UNDEF (id: 1, line: 1, location: (1,6)-(1,7)) | +- nd_undef: | @ NODE_SYM (id: 0, line: 1, location: (1,6)-(1,7)) | +- string: :a +- nd_head (2): @ NODE_UNDEF (id: 3, line: 1, location: (1,9)-(1,10)) +- nd_undef: @ NODE_SYM (id: 2, line: 1, location: (1,9)-(1,10)) +- string: :b ``` After: ``` @ NODE_UNDEF (id: 1, line: 1, location: (1,6)-(1,10))* +- nd_undefs: +- length: 2 +- element (0): | @ NODE_SYM (id: 0, line: 1, location: (1,6)-(1,7)) | +- string: :a +- element (1): @ NODE_SYM (id: 2, line: 1, location: (1,9)-(1,10)) +- string: :b ``` Notes: Merged: https://github.com/ruby/ruby/pull/11213
2024-04-29Remove needless header file includeyui-knk
2024-04-28Remove `ast_new` field from `struct rb_parser_config_struct`yui-knk
`ast_new` can be embedded into `rb_ast_new`.
2024-04-28[Universal parser] Improve AST structureHASUMI Hitoshi
This patch moves `ast->node_buffer->config` to `ast->config` aiming to improve readability and maintainability of the source. ## Background We could not add the `config` field to the `rb_ast_t *` due to the five-word restriction of the IMEMO object. But it is now doable by merging https://github.com/ruby/ruby/pull/10618 ## About assigning `&rb_global_parser_config` to `ast->config` in `ast_alloc()` The approach of not setting `ast->config` in `ast_alloc()` means that the client, CRuby in this scenario, that directly calls `ast_alloc()` will be responsible for releasing it if a resource that is passed to AST needs to be released. However, we have put on hold whether we can guarantee the above so far, thus, this patch looks like that. ``` // ruby_parser.c static VALUE ast_alloc(void) { rb_ast_t *ast; VALUE vast = TypedData_Make_Struct(0, rb_ast_t, &ast_data_type, ast); #ifdef UNIVERSAL_PARSER ast = (rb_ast_t *)DATA_PTR(vast); ast->config = &rb_global_parser_config; #endif return vast; } ```
2024-04-27Add line_count field to rb_ast_body_tHASUMI Hitoshi
This patch adds `int line_count` field to `rb_ast_body_t` structure. Instead, we no longer cast `script_lines` to Fixnum. ## Background Ref https://github.com/ruby/ruby/pull/10618 In the PR above, we have decoupled IMEMO from `rb_ast_t`. This means we could lift the five-words-restriction of the structure that forced us to unionize `rb_ast_t *` and `FIXNUM` in one field. ## Relating refactor - Remove the second parameter of `rb_ruby_ast_new()` function ## Attention I will remove a code that assigns -1 to line_count, in `rb_binding_add_dynavars()` of vm.c, because I don't think it is necessary. But I will make another PR for this so that we can atomically revert in case I was wrong (See the comment on the code)
2024-04-26[Universal parser] Decouple IMEMO from rb_ast_tHASUMI Hitoshi
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object. ## Background We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby. To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE. ## Summary (file by file) - `rubyparser.h` - Remove the `VALUE flags` member from `rb_ast_t` - `ruby_parser.c` and `internal/ruby_parser.h` - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()` - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE` - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c - `iseq.c` and `vm_core.h` - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE` - This keeps the VALUE of AST on the machine stack to prevent being removed by GC - `ast.c` - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff) - Fix `node_memsize()` - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines - `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c` - Follow-up due to the above changes - `imemo.{c|h}` - If an object with `imemo_ast` appears, considers it a bug Co-authored-by: Nobuyoshi Nakada <[email protected]>
2024-04-21Remove needless header file includeyui-knk
2024-04-16Remove unused functions from `struct rb_parser_config_struct`yui-knk
2024-04-15[Universal parser] DeVALUE of p->debug_lines and ast->body.script_linesHASUMI Hitoshi
This patch is part of universal parser work. ## Summary - Decouple VALUE from members below: - `(struct parser_params *)->debug_lines` - `(rb_ast_t *)->body.script_lines` - Instead, they are now `rb_parser_ary_t *` - They can also be a `(VALUE)FIXNUM` as before to hold line count - `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE - In order to do this, - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()` - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE` ## Other details - Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too - Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()` - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]` - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines` - Remove the second parameter of `rb_parser_set_script_lines()` to make it simple - Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines - Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called - With regard to this, please see *Future tasks* below ## Future tasks - Decouple IMEMO from `rb_ast_t *` - This lifts the five-members-restriction of Ruby object, - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-08Don't set T_TYPES of NODEyui-knk
T_TYPES was needed once Ripper jumbled NODEs and other type objects. However such hack was already removed. Therefore don't need to set T_TYPES of NODE.
2024-04-05Remove needless checkyui-knk
`nodetype_markable_p` always returns `false` then `rb_ast_node_type_change` never calls `rb_bug`.
2024-04-05Merge two `node_buffer_list_t` fields into oneyui-knk
All types of Node are managed by `node_buffer_list_t unmarkable` therefore merge them into `node_buffer_list_t buffer_list`.
2024-04-05Remove unused macros from node.cyui-knk
2024-04-04NODE_LIT is not used anymoreyui-knk
2024-03-12[Universal Parser] Reduce dependence on RArray in parse.yHASUMI Hitoshi
- Introduce `rb_parser_ary_t` structure to partly eliminate RArray from parse.y - In this patch, `parser_params->tokens` and `parser_params->ast->node_buffer->tokens` are now `rb_parser_ary_t *` - Instead, `ast_node_all_tokens()` internally creates a Ruby Array object from the `rb_parser_ary_t` - Also, delete `rb_ast_tokens()` and `rb_ast_set_tokens()` in node.c - Implement `rb_parser_str_escape()` - This is a port of the `rb_str_escape()` function in string.c - `rb_parser_str_escape()` does not depend on `VALUE` (RString) - Instead, it uses `rb_parser_stirng_t *` - This function works when --dump=y option passed - Because WIP of the universal parser, similar functions like `rb_parser_tokens_free()` exist in both node.c and parse.y. Refactoring them may be needed in some way in the future - Although we considered redesigning the structure: `ast->node_buffer->tokens` into `ast->tokens`, we leave it as it is because `rb_ast_t` is an imemo. (We will address it in the future)
2024-02-21Add IMEMO_NEWPeter Zhu
Rather than exposing that an imemo has a flag and four fields, this changes the implementation to only expose one field (the klass) and fills the rest with 0. The type will have to fill in the values themselves.
2024-02-21Introduce NODE_REGX to manage regexp literalyui-knk
2024-02-20Use rb_gc_mark_and_move for imemoPeter Zhu
2024-02-20[Feature #20257] Rearchitect Ripperyui-knk
Introduce another semantic value stack for Ripper so that Ripper can manage both Node and Ruby Object separately. This rearchitectutre of Ripper solves these issues. Therefore adding test cases for them. * [Bug 10436] https://bugs.ruby-lang.org/issues/10436 * [Bug 18988] https://bugs.ruby-lang.org/issues/18988 * [Bug 20055] https://bugs.ruby-lang.org/issues/20055 Checked the differences of `Ripper.sexp` for files under `/test/ruby` are only on test_pattern_matching.rb. The differences comes from the differences between `new_hash_pattern_tail` functions between parser and Ripper. Ripper `new_hash_pattern_tail` didn’t call `assignable` then `kw_rest_arg` wasn’t marked as local variable. This is also fixed by this commit. ``` --- a/./tmp/before/test_pattern_matching.rb +++ b/./tmp/after/test_pattern_matching.rb @@ -3607,7 +3607,7 @@ [:in, [:hshptn, nil, [], [:var_field, [:@ident, “a”, [984, 13]]]], [[:binary, - [:vcall, [:@ident, “a”, [985, 10]]], + [:var_ref, [:@ident, “a”, [985, 10]]], :==, [:hash, nil]]], nil]]], @@ -3662,7 +3662,7 @@ [:in, [:hshptn, nil, [], [:var_field, [:@ident, “a”, [993, 13]]]], [[:binary, - [:vcall, [:@ident, “a”, [994, 10]]], + [:var_ref, [:@ident, “a”, [994, 10]]], :==, [:hash, [:assoclist_from_args, @@ -3813,7 +3813,7 @@ [:command, [:@ident, “raise”, [1022, 10]], [:args_add_block, - [[:vcall, [:@ident, “b”, [1022, 16]]]], + [[:var_ref, [:@ident, “b”, [1022, 16]]]], false]]], [:else, [[:var_ref, [:@kw, “true”, [1024, 10]]]]]]]], nil, @@ -3876,7 +3876,7 @@ [:@int, “0”, [1033, 15]]], :“&&“, [:binary, - [:vcall, [:@ident, “b”, [1033, 20]]], + [:var_ref, [:@ident, “b”, [1033, 20]]], :==, [:hash, nil]]]], nil]]], @@ -3946,7 +3946,7 @@ [:@int, “0”, [1042, 15]]], :“&&“, [:binary, - [:vcall, [:@ident, “b”, [1042, 20]]], + [:var_ref, [:@ident, “b”, [1042, 20]]], :==, [:hash, [:assoclist_from_args, @@ -5206,7 +5206,7 @@ [[:assoc_new, [:@label, “c:“, [1352, 22]], [:@int, “0”, [1352, 25]]]]]], - [:vcall, [:@ident, “r”, [1352, 29]]]], + [:var_ref, [:@ident, “r”, [1352, 29]]]], false]]], [:binary, [:call, @@ -5299,7 +5299,7 @@ [:assoc_new, [:@label, “c:“, [1367, 34]], [:@int, “0”, [1367, 37]]]]]], - [:vcall, [:@ident, “r”, [1367, 41]]]], + [:var_ref, [:@ident, “r”, [1367, 41]]]], false]]], [:binary, [:call, @@ -5931,7 +5931,7 @@ [:in, [:hshptn, nil, [], [:var_field, [:@ident, “r”, [1533, 11]]]], [[:binary, - [:vcall, [:@ident, “r”, [1534, 8]]], + [:var_ref, [:@ident, “r”, [1534, 8]]], :==, [:hash, [:assoclist_from_args, ```
2024-02-16Make all fields in AST movablePeter Zhu
2024-02-09Remove ruby object from string nodesyui-knk
String nodes holds ruby string object on `VALUE nd_lit`. This commit changes it to `struct rb_parser_string *string` to reduce dependency on ruby object. Sometimes these strings are concatenated with other string therefore string concatenate functions are needed.
2024-01-14Constify `rb_global_parser_config`Nobuyoshi Nakada
2024-01-12Remove reference counter from rb_parser_configyui-knk
It's allocated outside of parser then no need to track reference count in rb_parser_config.
2024-01-12Statically allocate parser configyui-knk
2024-01-09Introduce NODE_SYM to manage symbol literalyui-knk
`:sym` was managed by `NODE_LIT` with `Symbol` object. This commit introduces `NODE_SYM` so that 1. Symbol literal is detectable from AST Node 2. Reduce dependency on ruby object
2024-01-07Introduce Numeric Node'sS-H-GAMELINKS
2024-01-02Introduce NODE_FILEyui-knk
`__FILE__` was managed by `NODE_STR` with `String` object. This commit introduces `NODE_FILE` and `struct rb_parser_string` so that 1. `__FILE__` is detectable from AST Node 2. Reduce dependency ruby object
2023-10-30Embed `rb_args_info` in `rb_node_args_t`Nobuyoshi Nakada
2023-10-14Delete heredoc line mark referencesNobuyoshi Nakada
2023-09-30Expand pattern_info struct into ARYPTN Node and FNDPTN Nodeyui-knk
2023-09-29Fix memory leak in the parserPeter Zhu
Reproduction script: ``` require "ripper" 10.times do 20_000.times do Ripper.parse("") end puts `ps -o rss= -p #{$$}` end ``` Before: ``` 28032 34432 40704 47232 53632 60032 66432 72832 79232 85632 ``` After: ``` 21760 21760 21760 21760 21760 21760 21760 21760 21760 21760 ```
2023-09-28Change RNode structure from union to structyui-knk
All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members for holding different kind of data. This has two problems. 1. Low flexibility of data structure Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand, NODE_OP_ASGN2 needs more than three union members. However they use same structure definition, need to allocate three union members for NODE_TRUE and need to separate NODE_OP_ASGN2 into another node. This change removes the restriction so make it possible to change data structure by each node type. 2. No compile time check for union member access It’s developer’s responsibility for using correct member for each node type when it’s union. This change clarifies which node has which type of fields and enables compile time check. This commit also changes node_buffer_elem_struct buf management to handle different size data with alignment.
2023-09-22Directly free structure managed by imemo tmpbufyui-knk
NODE_ARGS, NODE_ARYPTN, NODE_FNDPTN manage memory of their structure by imemo tmpbuf Object. However rb_ast_struct has reference to NODE. Then these memory can be freed directly when rb_ast_struct is freed. This commit reduces parser's dependency on CRuby functions.
2023-06-17Replace parser & node compile_option from Hash to bit fieldyui-knk
This commit reduces dependency to CRuby object. Notes: Merged: https://github.com/ruby/ruby/pull/7950
2023-06-12[Feature #19719] Universal Parseryui-knk
Introduce Universal Parser mode for the parser. This commit includes these changes: * Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions are passed via `struct rb_parser_config_struct` when this macro is enabled. * Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu. Notes: Merged: https://github.com/ruby/ruby/pull/7927
2023-05-24Rename `rb_node_name` to the original nameyui-knk
98637d421dbe8bcf86cc2effae5e26bb96a6a4da changes the name of the function. However this function is exported as global, then change the name to origin one for keeping compatibility. Notes: Merged: https://github.com/ruby/ruby/pull/7852
2023-05-23Move `ruby_node_name` to node.c and rename prefix of the functionyui-knk
Notes: Merged: https://github.com/ruby/ruby/pull/7844
2022-11-21Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methodsyui-knk
Implementation for Language Server Protocol (LSP) sometimes needs token information. For example both `m(1)` and `m(1, )` has same AST structure other than node locations then it's impossible to check the existence of `,` from AST. However in later case, it might be better to suggest variables list for the second argument. Token information is important for such case. This commit adds these methods. * Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of` * Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes. * Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node. [Feature #19070] Impacts on memory usage and performance are below: Memory usage: ``` $ cat test.rb root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true) $ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] 11408kb # keep_tokens :false $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 17508kb # keep_tokens :true $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 30960kb ``` Performance: ``` $ cat ../ast_keep_tokens.yml prelude: | src = <<~SRC module M class C def m1(a, b) 1 + a + b end end end SRC benchmark: without_keep_tokens: | RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false) with_keep_tokens: | RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true) $ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml /home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \ --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \ --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common ../tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \ --output=markdown --output-compare -v ../ast_keep_tokens.yml compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] warming up.. | |compare-ruby|built-ruby| |:--------------------|-----------:|---------:| |without_keep_tokens | 21.659k| 21.303k| | | 1.02x| -| |with_keep_tokens | 6.220k| 5.691k| | | 1.09x| -| ``` Notes: Merged: https://github.com/ruby/ruby/pull/6770
2022-10-08Move `error` from top_stmts and top_stmt to stmtyui-knk
By this change, syntax error is recovered smaller units. In the case below, "DEFN :bar" is same level with "CLASS :Foo" now. ``` module Z class Foo foo. end def bar end end ``` [Feature #19013] Notes: Merged: https://github.com/ruby/ruby/pull/6512
2022-08-01Initialize node_idWolf
In some causes node_id might have been left uninitialized leading to undefined behavior on access. So always set it to -1, so we have *some* valid value in there. Notes: Merged: https://github.com/ruby/ruby/pull/6202
2022-07-21Expand tabs [ci skip]Takashi Kokubun
[Misc #18891] Notes: Merged: https://github.com/ruby/ruby/pull/6094
2021-12-13Remove `NODE_DASGN_CURR` [Feature #18406]Nobuyoshi Nakada
This `NODE` type was used in pre-YARV implementation, to improve the performance of assignment to dynamic local variable defined at the innermost scope. It has no longer any actual difference with `NODE_DASGN`, except for the node dump. Notes: Merged: https://github.com/ruby/ruby/pull/5251
2021-12-04Add `nd_type_p` macroS.H
Notes: Merged: https://github.com/ruby/ruby/pull/5091 Merged-By: nobu <[email protected]>
2021-11-21Refactor hacky ID tables to struct rb_ast_id_table_tYusuke Endoh
The implementation of a local variable tables was represented as `ID*`, but it was very hacky: the first element is not an ID but the size of the table, and, the last element is (sometimes) a link to the next local table only when the id tables are a linked list. This change converts the hacky implementation to a normal struct. Notes: Merged: https://github.com/ruby/ruby/pull/5136
2021-11-17node.c (dump_node): update format explanation for NODE_ARGSYusuke Endoh
2021-11-17node.c (dump_node): trivial refactoringYusuke Endoh
2021-07-12Show node IDs in dumpNobuyoshi Nakada
2021-06-18ast.rb: RubyVM::AST.parse and .of accepts `save_script_lines: true`Yusuke Endoh
This option makes the parser keep the original source as an array of the original code lines. This feature exploits the mechanism of `SCRIPT_LINES__` but records only the specified code that is passed to RubyVM::AST.of or .parse, instead of recording all parsed program texts. Notes: Merged: https://github.com/ruby/ruby/pull/4581
2021-04-27Partially revert 2c7d3b3a722c4636ab1e9d289cbca47ddd168d3eYusuke Endoh
to make imemo_ast WB-protected again. Only the test is kept. Notes: Merged: https://github.com/ruby/ruby/pull/4419