ruby.git - The Ruby Programming Language

Age	Commit message	Author
2024-11-29	Remove a useless check	Yusuke Endoh
	Here `nb` should never be NULL. If it were, the following `nb->buffer_list` would be strange. A follow-up to ddd8da4b6ba3dfcca21ca710e7cef2fa3b9632d7 Notes: Merged: https://github.com/ruby/ruby/pull/12208
2024-07-20	Change UNDEF Node structure	yui-knk
	Change UNDEF Node to hold its items, keeping the original grammar structure. For example:
	```
	undef a, b
	```
	Before:
	```
	@ NODE_BLOCK (id: 4, line: 1, location: (1,6)-(1,10))*
	+- nd_head (1):
	|   @ NODE_UNDEF (id: 1, line: 1, location: (1,6)-(1,7))
	|   +- nd_undef:
	|       @ NODE_SYM (id: 0, line: 1, location: (1,6)-(1,7))
	|       +- string: :a
	+- nd_head (2):
	    @ NODE_UNDEF (id: 3, line: 1, location: (1,9)-(1,10))
	    +- nd_undef:
	        @ NODE_SYM (id: 2, line: 1, location: (1,9)-(1,10))
	        +- string: :b
	```
	After:
	```
	@ NODE_UNDEF (id: 1, line: 1, location: (1,6)-(1,10))*
	+- nd_undefs:
	    +- length: 2
	    +- element (0):
	    |   @ NODE_SYM (id: 0, line: 1, location: (1,6)-(1,7))
	    |   +- string: :a
	    +- element (1):
	        @ NODE_SYM (id: 2, line: 1, location: (1,9)-(1,10))
	        +- string: :b
	```
	Notes: Merged: https://github.com/ruby/ruby/pull/11213
2024-04-29	Remove needless header file include	yui-knk

2024-04-28	Remove `ast_new` field from `struct rb_parser_config_struct`	yui-knk
	`ast_new` can be embedded into `rb_ast_new`.
2024-04-28	[Universal parser] Improve AST structure	HASUMI Hitoshi
	This patch moves `ast->node_buffer->config` to `ast->config`, aiming to improve the readability and maintainability of the source.
	## Background
	We could not add the `config` field to `rb_ast_t` because of the five-word restriction on IMEMO objects. It is now doable thanks to the merge of https://github.com/ruby/ruby/pull/10618
	## About assigning `&rb_global_parser_config` to `ast->config` in `ast_alloc()`
	If `ast->config` were not set in `ast_alloc()`, the client that calls `ast_alloc()` directly (CRuby in this scenario) would be responsible for releasing any resource passed to the AST that needs to be released. Whether we can guarantee that is still on hold, so for now this patch looks like this:
	```
	// ruby_parser.c
	static VALUE
	ast_alloc(void)
	{
	    rb_ast_t *ast;
	    VALUE vast = TypedData_Make_Struct(0, rb_ast_t, &ast_data_type, ast);
	#ifdef UNIVERSAL_PARSER
	    ast = (rb_ast_t *)DATA_PTR(vast);
	    ast->config = &rb_global_parser_config;
	#endif
	    return vast;
	}
	```
2024-04-27	Add line_count field to rb_ast_body_t	HASUMI Hitoshi
	This patch adds an `int line_count` field to the `rb_ast_body_t` structure. In exchange, we no longer cast `script_lines` to Fixnum.
	## Background
	Ref https://github.com/ruby/ruby/pull/10618
	In the PR above, we decoupled IMEMO from `rb_ast_t`. This means we could lift the five-words-restriction of the structure that forced us to unionize `rb_ast_t *` and `FIXNUM` in one field.
	## Related refactoring
	- Remove the second parameter of the `rb_ruby_ast_new()` function
	## Attention
	I will remove the code that assigns -1 to line_count in `rb_binding_add_dynavars()` of vm.c, because I don't think it is necessary. But I will make another PR for this so that we can atomically revert in case I was wrong (see the comment on the code).
2024-04-26	[Universal parser] Decouple IMEMO from rb_ast_t	HASUMI Hitoshi
	This patch removes the `VALUE flags` member from the `rb_ast_t` structure, making `rb_ast_t` no longer an IMEMO object.
	## Background
	We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby. To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.
	## Summary (file by file)
	- `rubyparser.h`
	  - Remove the `VALUE flags` member from `rb_ast_t`
	- `ruby_parser.c` and `internal/ruby_parser.h`
	  - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` in `ast_alloc()` so that GC can manage it
	    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
	  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
	  - `rb_ruby_ast_new()`, which internally calls `ast_alloc()`, is to create VALUE vast outside ruby_parser.c
	- `iseq.c` and `vm_core.h`
	  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
	  - This keeps the VALUE of AST on the machine stack to prevent it from being removed by GC
	- `ast.c`
	  - Almost all changes replace `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
	  - Fix `node_memsize()`
	    - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines
	- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
	  - Follow-up due to the above changes
	- `imemo.{c|h}`
	  - If an object with `imemo_ast` appears, consider it a bug
	Co-authored-by: Nobuyoshi Nakada <[email protected]>
2024-04-21	Remove needless header file include	yui-knk

2024-04-16	Remove unused functions from `struct rb_parser_config_struct`	yui-knk

2024-04-15	[Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines	HASUMI Hitoshi
	This patch is part of universal parser work.
	## Summary
	- Decouple VALUE from members below:
	  - `(struct parser_params *)->debug_lines`
	  - `(rb_ast_t *)->body.script_lines`
	- Instead, they are now `rb_parser_ary_t *`
	  - They can also be a `(VALUE)FIXNUM` as before to hold line count
	- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
	  - In order to do this,
	    - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
	    - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`
	## Other details
	- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
	- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
	  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
	  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
	- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
	- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
	- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` is called
	  - With regard to this, please see Future tasks below
	## Future tasks
	- Decouple IMEMO from `rb_ast_t *`
	  - This lifts the five-members-restriction of Ruby object,
	  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
	  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-08	Don't set T_TYPES of NODE	yui-knk
	T_TYPES was needed when Ripper mixed NODEs with objects of other types. That hack has already been removed, so there is no need to set T_TYPES of NODE.
2024-04-05	Remove needless check	yui-knk
	`nodetype_markable_p` always returns `false`, so `rb_ast_node_type_change` never calls `rb_bug`.
2024-04-05	Merge two `node_buffer_list_t` fields into one	yui-knk
	All types of Node are managed by `node_buffer_list_t unmarkable`, so merge the two fields into a single `node_buffer_list_t buffer_list`.
2024-04-05	Remove unused macros from node.c	yui-knk

2024-04-04	NODE_LIT is not used anymore	yui-knk

2024-03-12	[Universal Parser] Reduce dependence on RArray in parse.y	HASUMI Hitoshi
	- Introduce `rb_parser_ary_t` structure to partly eliminate RArray from parse.y
	  - In this patch, `parser_params->tokens` and `parser_params->ast->node_buffer->tokens` are now `rb_parser_ary_t *`
	  - Instead, `ast_node_all_tokens()` internally creates a Ruby Array object from the `rb_parser_ary_t`
	  - Also, delete `rb_ast_tokens()` and `rb_ast_set_tokens()` in node.c
	- Implement `rb_parser_str_escape()`
	  - This is a port of the `rb_str_escape()` function in string.c
	  - `rb_parser_str_escape()` does not depend on `VALUE` (RString)
	  - Instead, it uses `rb_parser_string_t *`
	  - This function works when the --dump=y option is passed
	- Because the universal parser is a work in progress, similar functions like `rb_parser_tokens_free()` exist in both node.c and parse.y. Refactoring them may be needed in some way in the future
	- Although we considered redesigning the structure `ast->node_buffer->tokens` into `ast->tokens`, we leave it as it is because `rb_ast_t` is an imemo. (We will address it in the future)
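	To illustrate the idea of a token array that does not depend on RArray, here is a minimal sketch of a growable pointer array in plain C. It is not the code from this commit: `parser_ary_t`, `parser_ary_new`, `parser_ary_push` and `parser_ary_free` are hypothetical names, and the doubling growth policy is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical VALUE-free array: a buffer of opaque element pointers. */
typedef struct parser_ary {
    void **data;
    long len;   /* number of elements in use */
    long capa;  /* number of allocated slots */
} parser_ary_t;

static parser_ary_t *
parser_ary_new(long capa)
{
    assert(capa > 0);
    parser_ary_t *ary = malloc(sizeof(*ary));
    if (!ary) return NULL;
    ary->data = malloc(sizeof(void *) * (size_t)capa);
    if (!ary->data) { free(ary); return NULL; }
    ary->len = 0;
    ary->capa = capa;
    return ary;
}

/* Append one element, doubling the buffer when it is full.
 * Returns 0 on success, -1 if the buffer could not grow. */
static int
parser_ary_push(parser_ary_t *ary, void *elem)
{
    if (ary->len == ary->capa) {
        long new_capa = ary->capa * 2;
        void **grown = realloc(ary->data, sizeof(void *) * (size_t)new_capa);
        if (!grown) return -1;
        ary->data = grown;
        ary->capa = new_capa;
    }
    ary->data[ary->len++] = elem;
    return 0;
}

/* Frees the array itself; the elements stay owned by the caller. */
static void
parser_ary_free(parser_ary_t *ary)
{
    if (!ary) return;
    free(ary->data);
    free(ary);
}
```

	Because nothing here is a Ruby object, the GC never sees the array, so whoever owns it has to free it explicitly.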
2024-02-21	Add IMEMO_NEW	Peter Zhu
	Rather than exposing that an imemo has a flag and four fields, this changes the implementation to only expose one field (the klass) and fills the rest with 0. The type will have to fill in the values themselves.
2024-02-21	Introduce NODE_REGX to manage regexp literal	yui-knk

2024-02-20	Use rb_gc_mark_and_move for imemo	Peter Zhu

2024-02-20	[Feature #20257] Rearchitect Ripper	yui-knk
	Introduce another semantic value stack for Ripper so that Ripper can manage both Node and Ruby Object separately. This rearchitecture of Ripper solves the issues below, so test cases are added for them.
	* [Bug 10436] https://bugs.ruby-lang.org/issues/10436
	* [Bug 18988] https://bugs.ruby-lang.org/issues/18988
	* [Bug 20055] https://bugs.ruby-lang.org/issues/20055
	Checked that the differences of `Ripper.sexp` for files under `/test/ruby` are only in test_pattern_matching.rb. The differences come from the differences between the `new_hash_pattern_tail` functions of the parser and Ripper. Ripper's `new_hash_pattern_tail` didn't call `assignable`, so `kw_rest_arg` wasn't marked as a local variable. This is also fixed by this commit.
	```
	--- a/./tmp/before/test_pattern_matching.rb
	+++ b/./tmp/after/test_pattern_matching.rb
	@@ -3607,7 +3607,7 @@
	 [:in, [:hshptn, nil, [], [:var_field, [:@ident, "a", [984, 13]]]], [[:binary,
	-[:vcall, [:@ident, "a", [985, 10]]],
	+[:var_ref, [:@ident, "a", [985, 10]]],
	 :==, [:hash, nil]]], nil]]],
	@@ -3662,7 +3662,7 @@
	 [:in, [:hshptn, nil, [], [:var_field, [:@ident, "a", [993, 13]]]], [[:binary,
	-[:vcall, [:@ident, "a", [994, 10]]],
	+[:var_ref, [:@ident, "a", [994, 10]]],
	 :==, [:hash, [:assoclist_from_args,
	@@ -3813,7 +3813,7 @@
	 [:command, [:@ident, "raise", [1022, 10]], [:args_add_block,
	-[[:vcall, [:@ident, "b", [1022, 16]]]],
	+[[:var_ref, [:@ident, "b", [1022, 16]]]],
	 false]]], [:else, [[:var_ref, [:@kw, "true", [1024, 10]]]]]]]], nil,
	@@ -3876,7 +3876,7 @@
	 [:@int, "0", [1033, 15]]], :"&&", [:binary,
	-[:vcall, [:@ident, "b", [1033, 20]]],
	+[:var_ref, [:@ident, "b", [1033, 20]]],
	 :==, [:hash, nil]]]], nil]]],
	@@ -3946,7 +3946,7 @@
	 [:@int, "0", [1042, 15]]], :"&&", [:binary,
	-[:vcall, [:@ident, "b", [1042, 20]]],
	+[:var_ref, [:@ident, "b", [1042, 20]]],
	 :==, [:hash, [:assoclist_from_args,
	@@ -5206,7 +5206,7 @@
	 [[:assoc_new, [:@label, "c:", [1352, 22]], [:@int, "0", [1352, 25]]]]]],
	-[:vcall, [:@ident, "r", [1352, 29]]]],
	+[:var_ref, [:@ident, "r", [1352, 29]]]],
	 false]]], [:binary, [:call,
	@@ -5299,7 +5299,7 @@
	 [:assoc_new, [:@label, "c:", [1367, 34]], [:@int, "0", [1367, 37]]]]]],
	-[:vcall, [:@ident, "r", [1367, 41]]]],
	+[:var_ref, [:@ident, "r", [1367, 41]]]],
	 false]]], [:binary, [:call,
	@@ -5931,7 +5931,7 @@
	 [:in, [:hshptn, nil, [], [:var_field, [:@ident, "r", [1533, 11]]]], [[:binary,
	-[:vcall, [:@ident, "r", [1534, 8]]],
	+[:var_ref, [:@ident, "r", [1534, 8]]],
	 :==, [:hash, [:assoclist_from_args,
	```
2024-02-16	Make all fields in AST movable	Peter Zhu

2024-02-09	Remove ruby object from string nodes	yui-knk
	String nodes hold a Ruby string object in `VALUE nd_lit`. This commit changes it to `struct rb_parser_string *string` to reduce the dependency on Ruby objects. These strings are sometimes concatenated with other strings, so string concatenation functions are needed.
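	A minimal sketch of why concatenation helpers become necessary once a string node no longer holds a Ruby String. This is not the code from this commit: `parser_string_t`, `parser_string_new`, `parser_string_concat` and `parser_string_free` are hypothetical names, and the layout (length plus NUL-terminated heap buffer) is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical VALUE-free string: a length and a heap buffer. */
typedef struct parser_string {
    size_t len;
    char *ptr;  /* kept NUL-terminated for convenience */
} parser_string_t;

static parser_string_t *
parser_string_new(const char *src, size_t len)
{
    parser_string_t *s = malloc(sizeof(*s));
    if (!s) return NULL;
    s->ptr = malloc(len + 1);
    if (!s->ptr) { free(s); return NULL; }
    memcpy(s->ptr, src, len);
    s->ptr[len] = '\0';
    s->len = len;
    return s;
}

/* Append `tail` to `head` in place. The two must be distinct objects.
 * Returns 0 on success, -1 if the buffer could not grow. */
static int
parser_string_concat(parser_string_t *head, const parser_string_t *tail)
{
    assert(head && tail && head != tail);
    char *grown = realloc(head->ptr, head->len + tail->len + 1);
    if (!grown) return -1;
    memcpy(grown + head->len, tail->ptr, tail->len + 1); /* copies the NUL too */
    head->ptr = grown;
    head->len += tail->len;
    return 0;
}

static void
parser_string_free(parser_string_t *s)
{
    if (!s) return;
    free(s->ptr);
    free(s);
}
```

	With a Ruby String, the equivalent work would be delegated to the String API; a parser-owned string needs its own helpers like these.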
2024-01-14	Constify `rb_global_parser_config`	Nobuyoshi Nakada

2024-01-12	Remove reference counter from rb_parser_config	yui-knk
	It's allocated outside of the parser, so there is no need to track a reference count in rb_parser_config.
2024-01-12	Statically allocate parser config	yui-knk

2024-01-09	Introduce NODE_SYM to manage symbol literal	yui-knk
	`:sym` was managed by `NODE_LIT` with a `Symbol` object. This commit introduces `NODE_SYM` so that
	1. Symbol literals are detectable from the AST Node
	2. The dependency on Ruby objects is reduced
2024-01-07	Introduce Numeric Node's	S-H-GAMELINKS

2024-01-02	Introduce NODE_FILE	yui-knk
	`__FILE__` was managed by `NODE_STR` with a `String` object. This commit introduces `NODE_FILE` and `struct rb_parser_string` so that
	1. `__FILE__` is detectable from the AST Node
	2. The dependency on Ruby objects is reduced
2023-10-30	Embed `rb_args_info` in `rb_node_args_t`	Nobuyoshi Nakada

2023-10-14	Delete heredoc line mark references	Nobuyoshi Nakada

2023-09-30	Expand pattern_info struct into ARYPTN Node and FNDPTN Node	yui-knk

2023-09-29	Fix memory leak in the parser	Peter Zhu
	Reproduction script:
	```
	require "ripper"

	10.times do
	  20_000.times do
	    Ripper.parse("")
	  end

	  puts `ps -o rss= -p #{$$}`
	end
	```
	Before:
	```
	28032
	34432
	40704
	47232
	53632
	60032
	66432
	72832
	79232
	85632
	```
	After:
	```
	21760
	21760
	21760
	21760
	21760
	21760
	21760
	21760
	21760
	21760
	```
2023-09-28	Change RNode structure from union to struct	yui-knk
	All kinds of AST nodes use the same struct RNode, which has u1, u2 and u3 union members for holding different kinds of data. This has two problems.
	1. Low flexibility of data structure
	   Some nodes, for example NODE_TRUE, don't use u1, u2, u3. On the other hand, NODE_OP_ASGN2 needs more than three union members. Because they share the same structure definition, three union members must be allocated for NODE_TRUE, and NODE_OP_ASGN2 must be split into another node. This change removes the restriction and makes it possible to choose the data structure for each node type.
	2. No compile-time check for union member access
	   With a union, it is the developer's responsibility to use the correct member for each node type. This change clarifies which node has which type of fields and enables compile-time checks.
	This commit also changes node_buffer_elem_struct buf management to handle data of different sizes with alignment.
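	The two points above can be sketched in a few lines of C. This is a simplified illustration rather than the definitions from this commit: `node_t`, `node_true_t`, `node_if_t` and `node_if_cast` are hypothetical names. Each node type gets its own struct that starts with a common header, so a payload-free node stays small and every field has a name the compiler can check.

```c
#include <assert.h>
#include <stddef.h>

enum node_type { NODE_TRUE, NODE_IF };

/* Common header shared by every node (hypothetical, simplified). */
typedef struct node {
    enum node_type type;
    int node_id;
} node_t;

/* NODE_TRUE carries no payload, so it is just the header. */
typedef struct node_true {
    node_t node;
} node_true_t;

/* NODE_IF names its three children instead of reusing u1/u2/u3. */
typedef struct node_if {
    node_t node;
    node_t *cond;
    node_t *body;
    node_t *else_body;
} node_if_t;

/* Downcast after checking the tag. This is valid because `node` is the
 * first member of node_if_t. */
static node_if_t *
node_if_cast(node_t *n)
{
    assert(n->type == NODE_IF);
    return (node_if_t *)n;
}
```

	Accessing `cond` through a `node_true_t *` is now a compile error, whereas with a shared union the same mistake would compile silently.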
2023-09-22	Directly free structure managed by imemo tmpbuf	yui-knk
	NODE_ARGS, NODE_ARYPTN and NODE_FNDPTN manage the memory of their structures through an imemo tmpbuf object. However, rb_ast_struct holds references to the NODEs, so this memory can be freed directly when rb_ast_struct is freed. This commit reduces the parser's dependency on CRuby functions.
2023-06-17	Replace parser & node compile_option from Hash to bit field	yui-knk
	This commit reduces the dependency on CRuby objects. Notes: Merged: https://github.com/ruby/ruby/pull/7950
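	As a rough sketch of what replacing an option Hash with a bit field can look like: the names `compile_option_t`, `compile_option_default` and `frozen_string_literal_p`, the chosen options, and the tri-state encoding are all assumptions made for illustration, not taken from this commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical option set. Each option is a 2-bit signed field:
 * -1 = not specified, 0 = false, 1 = true. */
typedef struct compile_option {
    signed int frozen_string_literal : 2;
    signed int coverage_enabled : 2;
} compile_option_t;

/* Start with every option unspecified. */
static compile_option_t
compile_option_default(void)
{
    compile_option_t opt = { -1, -1 };
    return opt;
}

/* Resolve the option, using `fallback` when it was never specified. */
static bool
frozen_string_literal_p(compile_option_t opt, bool fallback)
{
    return opt.frozen_string_literal < 0 ? fallback
                                         : opt.frozen_string_literal == 1;
}
```

	Unlike a Hash, such a struct needs no Ruby object allocation and no GC marking, which is the kind of dependency the commit message refers to.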
2023-06-12	[Feature #19719] Universal Parser	yui-knk
	Introduce Universal Parser mode for the parser. This commit includes these changes:
	* Introduce `UNIVERSAL_PARSER` macro. All CRuby-related functions are passed via `struct rb_parser_config_struct` when this macro is enabled.
	* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
	Notes: Merged: https://github.com/ruby/ruby/pull/7927
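	The idea of passing host functions through a config struct can be sketched as follows. This is a toy illustration, not the real `struct rb_parser_config_struct`: `parser_config_t`, `parser_t`, `parser_strdup` and `libc_config` are hypothetical names, and the two callbacks are an arbitrary choice. The parser code calls only what the host handed over, so a host other than CRuby could supply its own table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical host-callback table with just two entries. */
typedef struct parser_config {
    void *(*malloc_fn)(size_t size);
    void (*free_fn)(void *ptr);
} parser_config_t;

typedef struct parser {
    const parser_config_t *config;
} parser_t;

/* Duplicate a string using only the allocator the host supplied. */
static char *
parser_strdup(parser_t *p, const char *s)
{
    assert(p && p->config && s);
    size_t len = strlen(s) + 1;
    char *copy = p->config->malloc_fn(len);
    if (copy) memcpy(copy, s, len);
    return copy;
}

/* One statically allocated, read-only config for a libc-based host. */
static const parser_config_t libc_config = { malloc, free };
```

	A single static, `const` table like `libc_config` needs neither allocation nor reference counting, which fits the later commits in this log that statically allocate and constify the parser config.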
2023-05-24	Rename `rb_node_name` to the original name	yui-knk
	98637d421dbe8bcf86cc2effae5e26bb96a6a4da changed the name of the function. However, this function is exported as a global symbol, so change the name back to the original one to keep compatibility. Notes: Merged: https://github.com/ruby/ruby/pull/7852
2023-05-23	Move `ruby_node_name` to node.c and rename prefix of the function	yui-knk
	Notes: Merged: https://github.com/ruby/ruby/pull/7844
2022-11-21	Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods	yui-knk
	Implementations of the Language Server Protocol (LSP) sometimes need token information. For example, `m(1)` and `m(1, )` have the same AST structure apart from node locations, so it is impossible to detect the `,` from the AST. In the latter case, however, it might be better to suggest a list of variables for the second argument. Token information is important in such cases.
	This commit adds these methods.
	* Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of`
	* Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendant nodes.
	* Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless of the receiver node.
	[Feature #19070]
	Impacts on memory usage and performance are below:
	Memory usage:
	```
	$ cat test.rb
	root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)

	$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
	ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
	11408kb

	# keep_tokens :false
	$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
	17508kb

	# keep_tokens :true
	$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
	30960kb
	```
	Performance:
	```
	$ cat ../ast_keep_tokens.yml
	prelude: |
	  src = <<~SRC
	    module M
	      class C
	        def m1(a, b)
	          1 + a + b
	        end
	      end
	    end
	  SRC
	benchmark:
	  without_keep_tokens: |
	    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
	  with_keep_tokens: |
	    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)

	$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
	/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
	            --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
	            --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common ../tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \
	            --output=markdown --output-compare -v ../ast_keep_tokens.yml
	compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
	built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
	warming up..

	|                     |compare-ruby|built-ruby|
	|:--------------------|-----------:|---------:|
	|without_keep_tokens  |     21.659k|   21.303k|
	|                     |       1.02x|         -|
	|with_keep_tokens     |      6.220k|    5.691k|
	|                     |       1.09x|         -|
	```
	Notes: Merged: https://github.com/ruby/ruby/pull/6770
2022-10-08	Move `error` from top_stmts and top_stmt to stmt	yui-knk
	With this change, syntax errors are recovered in smaller units. In the case below, "DEFN :bar" is now at the same level as "CLASS :Foo".
	```
	module Z
	  class Foo
	    foo.
	  end

	  def bar
	  end
	end
	```
	[Feature #19013]
	Notes: Merged: https://github.com/ruby/ruby/pull/6512
2022-08-01	Initialize node_id	Wolf
	In some cases node_id might have been left uninitialized, leading to undefined behavior on access. Always set it to -1 so that it holds a valid value. Notes: Merged: https://github.com/ruby/ruby/pull/6202
2022-07-21	Expand tabs [ci skip]	Takashi Kokubun
	[Misc #18891] Notes: Merged: https://github.com/ruby/ruby/pull/6094
2021-12-13	Remove `NODE_DASGN_CURR` [Feature #18406]	Nobuyoshi Nakada
	This `NODE` type was used in the pre-YARV implementation to improve the performance of assignment to a dynamic local variable defined at the innermost scope. It no longer has any actual difference from `NODE_DASGN`, except for the node dump. Notes: Merged: https://github.com/ruby/ruby/pull/5251
2021-12-04	Add `nd_type_p` macro	S.H
	Notes: Merged: https://github.com/ruby/ruby/pull/5091 Merged-By: nobu <[email protected]>
2021-11-21	Refactor hacky ID tables to struct rb_ast_id_table_t	Yusuke Endoh
	The implementation of local variable tables was represented as `ID*`, but it was very hacky: the first element is not an ID but the size of the table, and the last element is (sometimes) a link to the next local table, only when the id tables are a linked list. This change converts the hacky implementation to a normal struct. Notes: Merged: https://github.com/ruby/ruby/pull/5136
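	A minimal sketch of the difference between the two layouts, not the code from this commit: `legacy_table_new`, `id_table_t` and `id_table_new` are hypothetical names, and `ID` is a stand-in typedef. The old layout hides the element count in slot 0 of an `ID` array, while the struct gives the count its own typed field.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned long ID;  /* stand-in for CRuby's ID type (assumption) */

/* Old style: a bare ID array whose slot 0 secretly holds the count,
 * so the real IDs live in tbl[1..size]. */
static ID *
legacy_table_new(int size)
{
    ID *tbl = calloc((size_t)size + 1, sizeof(ID));
    if (tbl) tbl[0] = (ID)size;  /* the hack: a size stored as an ID */
    return tbl;
}

/* New style: the count is an explicit field and the IDs follow it. */
typedef struct id_table {
    int size;
    ID ids[];  /* C99 flexible array member */
} id_table_t;

static id_table_t *
id_table_new(int size)
{
    assert(size >= 0);
    id_table_t *tbl = calloc(1, sizeof(id_table_t) + (size_t)size * sizeof(ID));
    if (tbl) tbl->size = size;
    return tbl;
}
```

	With the struct, `tbl->ids[0]` is the first real ID and `tbl->size` cannot be mistaken for one, so the off-by-one bookkeeping of the array version disappears.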
2021-11-17	node.c (dump_node): update format explanation for NODE_ARGS	Yusuke Endoh

2021-11-17	node.c (dump_node): trivial refactoring	Yusuke Endoh

2021-07-12	Show node IDs in dump	Nobuyoshi Nakada

2021-06-18	ast.rb: RubyVM::AST.parse and .of accepts `save_script_lines: true`	Yusuke Endoh
	This option makes the parser keep the original source as an array of the original code lines. This feature exploits the mechanism of `SCRIPT_LINES__` but records only the specified code that is passed to RubyVM::AST.of or .parse, instead of recording all parsed program texts. Notes: Merged: https://github.com/ruby/ruby/pull/4581
2021-04-27	Partially revert 2c7d3b3a722c4636ab1e9d289cbca47ddd168d3e	Yusuke Endoh
	to make imemo_ast WB-protected again. Only the test is kept. Notes: Merged: https://github.com/ruby/ruby/pull/4419