ruby.git - The Ruby Programming Language

Age	Commit message (Collapse)	Author
5 days	Inline Class#new.	Aaron Patterson
	This commit inlines instructions for Class#new. To make this work, we added a new YARV instructions, `opt_new`. `opt_new` checks whether or not the `new` method is the default allocator method. If it is, it allocates the object, and pushes the instance on the stack. If not, the instruction jumps to the "slow path" method call instructions. Old instructions: ``` > ruby --dump=insns -e'Object.new' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)> 0000 opt_getconstant_path <ic:0 Object> ( 1)[Li] 0002 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE> 0004 leave ``` New instructions: ``` > ./miniruby --dump=insns -e'Object.new' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)> 0000 opt_getconstant_path <ic:0 Object> ( 1)[Li] 0002 putnil 0003 swap 0004 opt_new <calldata!mid:new, argc:0, ARGS_SIMPLE>, 11 0007 opt_send_without_block <calldata!mid:initialize, argc:0, FCALL\|ARGS_SIMPLE> 0009 jump 14 0011 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE> 0013 swap 0014 pop 0015 leave ``` This commit speeds up basic object allocation (`Foo.new`) by 60%, but classes that take keyword parameters see an even bigger benefit because no hash is allocated when instantiating the object (3x to 6x faster). Here is an example that uses `Hash.new(capacity: 0)`: ``` > hyperfine "ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" "./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" Benchmark 1: ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' Time (mean ± σ): 1.082 s ± 0.004 s [User: 1.074 s, System: 0.008 s] Range (min … max): 1.076 s … 1.088 s 10 runs Benchmark 2: ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' Time (mean ± σ): 627.9 ms ± 3.5 ms [User: 622.7 ms, System: 4.8 ms] Range (min … max): 622.7 ms … 633.2 ms 10 runs Summary ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ran 1.72 ± 0.01 times faster than ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ``` This commit changes the backtrace for `initialize`: ``` aaron@tc ~/g/ruby (inline-new)> cat test.rb class Foo def initialize puts caller end end def hello Foo.new end hello aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24] test.rb:8:in 'Class#new' test.rb:8:in 'Object#hello' test.rb:11:in '<main>' aaron@tc ~/g/ruby (inline-new)> ./miniruby -v test.rb ruby 3.5.0dev (2025-03-28T23:59:40Z inline-new c4157884e4) +PRISM [arm64-darwin24] test.rb:8:in 'Object#hello' test.rb:11:in '<main>' ``` It also increases memory usage for calls to `new` by 122 bytes: ``` aaron@tc ~/g/ruby (inline-new)> cat test.rb require "objspace" class Foo def initialize puts caller end end def hello Foo.new end puts ObjectSpace.memsize_of(RubyVM::InstructionSequence.of(method(:hello))) aaron@tc ~/g/ruby (inline-new)> make runruby RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test.rb 656 aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24] 544 ``` Thanks to @ko1 for coming up with this idea! Co-Authored-By: John Hawthorn <[email protected]>
12 days	Fix yjit-bindgen	Takashi Kokubun
	Notes: Merged: https://github.com/ruby/ruby/pull/13131
12 days	Move a couple of bindgen targets to ZJIT bindgen	Takashi Kokubun
	We filed https://github.com/Shopify/zjit/pull/65 and https://github.com/Shopify/zjit/pull/64 concurrently. Notes: Merged: https://github.com/ruby/ruby/pull/13131
12 days	Rust tests: Load builtins (core library written in ruby)	Alan Wu
	Key here is calling rb_call_builtin_inits(), which sticking to public API for robustness is done by calling ruby_options(). Fixes: https://github.com/Shopify/zjit/issues/61 Notes: Merged: https://github.com/ruby/ruby/pull/13131
12 days	Print Ruby exception in test utils	Max Bernstein
	Notes: Merged: https://github.com/ruby/ruby/pull/13131
12 days	Add compact Type lattice	Max Bernstein
	This will be used for local type inference and potentially SCCP. Notes: Merged: https://github.com/ruby/ruby/pull/13131
12 days	Add zjit_* instructions to profile the interpreter ↵	Takashi Kokubun
	(https://github.com/Shopify/zjit/pull/16) * Add zjit_* instructions to profile the interpreter * Rename FixnumPlus to FixnumAdd * Update a comment about Invalidate * Rename Guard to GuardType * Rename Invalidate to PatchPoint * Drop unneeded debug!() * Plan on profiling the types * Use the output of GuardType as type refined outputs Notes: Merged: https://github.com/ruby/ruby/pull/13131
12 days	boot_vm boots and runs	Alan Wu
	Notes: Merged: https://github.com/ruby/ruby/pull/13131
12 days	bindgen works in --enable-zjit=dev mode.	Alan Wu
	Notes: Merged: https://github.com/ruby/ruby/pull/13131
12 days	make zjit-bindgen runs, but doesn't graft the right things yet	Alan Wu
	Notes: Merged: https://github.com/ruby/ruby/pull/13131
2025-02-16	Remove undefined function from bindgen	Aaron Patterson
	`rb_get_iseq_body_total_calls` was removed in cd8d20cd1fbcf9bf9d438b306beb65b2417fcc04, but it's still in the YJIT bindgen file. This commit just removes it from bindgen Notes: Merged: https://github.com/ruby/ruby/pull/12760
2025-02-14	Only count VM instructions in YJIT stats builds	Aaron Patterson
	The instruction counter is slowing multi-Ractor applications. I had changed it to use a thread local, but using a thread local is slowing single threaded applications. This commit only enables the instruction counter in YJIT stats builds until we can figure out a way to gather the information with lower overhead. Co-authored-by: Randy Stauner <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/12670
2025-01-10	Make rb_vm_insns_count a thread local variable	Aaron Patterson
	`rb_vm_insns_count` is a global variable used for reporting YJIT statistics. It is a counter that tallies the number of interpreter instructions that have been executed, this way we can approximate how much time we're spending in YJIT compared to the interpreter. Unfortunately keeping this statistic means that every instruction executed in the interpreter loop must increment the counter. Normally this isn't a problem, but in multi-threaded situations (when Ractors are used), incrementing this counter can become quite costly due to page caching issues. Additionally, since there is no locking when incrementing this global, the count can't really make sense in a multi-threaded environment. This commit changes `rb_vm_insns_count` to a thread local. That way each Ractor has it's own copy of the counter and incrementing the counter becomes quite cheap. Of course this means that in multi-threaded situations, the value doesn't really make sense (but it didn't make sense before because of the lack of locking). The counter is used for YJIT statistics, and since YJIT is basically disabled when Ractors are in use, I don't think we care about inaccuracies (for the time being). We can revisit this counter when we give YJIT multi-threading support, but for the time being this commit restores multi-threaded performance. To test this, I used the benchmark in [Bug #20489]. Here is the performance on Ruby 3.2: ``` $ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8 ruby 3.2.0 (2022-12-25 revision a528908271) [x86_64-linux] [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8] ../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. ________________________________________________________ Executed in 2.53 secs fish external usr time 19.86 secs 370.00 micros 19.86 secs sys time 0.02 secs 320.00 micros 0.02 secs ``` We can see the regression in performance on the master branch: ``` $ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8 ruby 3.5.0dev (2025-01-10T16:22:26Z master 4a2702dafb) +PRISM [x86_64-linux] [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8] ../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. ________________________________________________________ Executed in 24.87 secs fish external usr time 195.55 secs 0.00 micros 195.55 secs sys time 0.00 secs 716.00 micros 0.00 secs ``` Here are the stats after this commit: ``` $ time RUBY_MAX_CPU=12 ./miniruby -v ../test.rb 8 8 ruby 3.5.0dev (2025-01-10T20:37:06Z tl 3ef0432779) +PRISM [x86_64-linux] [0...1, 1...2, 2...3, 3...4, 4...5, 5...6, 6...7, 7...8] ../test.rb:43: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. ________________________________________________________ Executed in 2.46 secs fish external usr time 19.34 secs 381.00 micros 19.34 secs sys time 0.01 secs 321.00 micros 0.01 secs ``` [Bug #20489] Notes: Merged: https://github.com/ruby/ruby/pull/12549
2024-11-13	YJIT: Specialize `String#[]` (`String#slice`) with fixnum arguments (#12069)	Randy Stauner
	* YJIT: Specialize `String#[]` (`String#slice`) with fixnum arguments String#[] is in the top few C calls of several YJIT benchmarks: liquid-compile rubocop mail sudoku This speeds up these benchmarks by 1-2%. * YJIT: Try harder to get type info for `String#[]` In the large generated code of the mail gem the context doesn't have the type info. In that case if we peek at the stack and add a guard we can still apply the specialization and it speeds up the mail benchmark by 5%. Co-authored-by: Maxime Chevalier-Boisvert <[email protected]> Co-authored-by: Takashi Kokubun (k0kubun) <[email protected]> --------- Co-authored-by: Maxime Chevalier-Boisvert <[email protected]> Co-authored-by: Takashi Kokubun (k0kubun) <[email protected]> Notes: Merged-By: maximecb <[email protected]>
2024-10-22	YJIT: Implement specialization for no-op `{Kernel,Numeric}#dup`	Alan Wu
	Type information in the context for no additional work! This is the `if (special_object_p(obj)) return obj;` path in rb_obj_dup() and for Numeric#dup, it's always the identity function. Notes: Merged: https://github.com/ruby/ruby/pull/11926
2024-10-18	YJIT: Allow shareable consts in multi-ractor mode (#11917)	John Hawthorn
	* Update yjit-bindgen deps * YJIT: Allow shareable consts in multi-ractor mode * Update yjit/src/codegen.rs Co-authored-by: Alan Wu <[email protected]> --------- Co-authored-by: Alan Wu <[email protected]> Notes: Merged-By: maximecb <[email protected]>
2024-10-08	YJIT: Fastpath for Module#name (#11819)	Alan Wu
	Module#name shows up as a top C method callee in lobsters so probably common enough. It's also easy to substitute thanks to rb_mod_name() already having no GC yield points. klass = BasicObject 50_000_000.times { klass.name } Benchmark 1: /.rubies/post/bin/ruby --yjit mod_name.rb Time (mean ± σ): 1.433 s ± 0.010 s [User: 1.410 s, System: 0.010 s] Range (min … max): 1.421 s … 1.449 s 10 runs Benchmark 2: /.rubies/mstr/bin/ruby --yjit mod_name.rb Time (mean ± σ): 1.491 s ± 0.012 s [User: 1.468 s, System: 0.010 s] Range (min … max): 1.470 s … 1.511 s 10 runs Summary /.rubies/post/bin/ruby --yjit mod_name.rb ran 1.04 ± 0.01 times faster than /.rubies/mstr/bin/ruby --yjit mod_name.rb
2024-08-27	YJIT: Encode doubles to VALUE objects and move stat generation to rust (#11388)	Randy Stauner
	* YJIT: Encode doubles to VALUE objects and move stat generation to rust Stats that can now be generated from rust have been moved there. * Move object_shape_count call for runtime_stats to rust This reduces the ruby method to a single primitive. * Change hash_aset_usize from macro to function Notes: Merged-By: maximecb <[email protected]>
2024-08-02	YJIT: Enhance the `String#<<` method substitution to handle integer ↵	Kevin Menard
	codepoint values. (#11032) * Document why we need to explicitly spill registers. * Simplify passing a byte value to `str_buf_cat`. * YJIT: Enhance the `String#<<` method substitution to handle integer codepoint values. * YJIT: Move runtime type check into YJIT. Performing the check in YJIT means we can make assumptions about the type. It also improves correctness of stack traces in cases where the codepoint argument is not a String or a Fixnum. Notes: Merged-By: maximecb <[email protected]>
2024-07-29	Expand opt_newarray_send to support Array#pack with buffer keyword arg	Randy Stauner
	Use an enum for the method arg instead of needing to add an id that doesn't map to an actual method name. $ ruby --dump=insns -e 'b = "x"; [v].pack("E", buffer: b)' before: ``` == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] b@0 0000 putchilledstring "x" ( 1)[Li] 0002 setlocal_WC_0 b@0 0004 putself 0005 opt_send_without_block <calldata!mid:v, argc:0, FCALL\|VCALL\|ARGS_SIMPLE> 0007 newarray 1 0009 putchilledstring "E" 0011 getlocal_WC_0 b@0 0013 opt_send_without_block <calldata!mid:pack, argc:2, kw:[#<Symbol:0x000000000023110c>], KWARG> 0015 leave ``` after: ``` == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] b@0 0000 putchilledstring "x" ( 1)[Li] 0002 setlocal_WC_0 b@0 0004 putself 0005 opt_send_without_block <calldata!mid:v, argc:0, FCALL\|VCALL\|ARGS_SIMPLE> 0007 putchilledstring "E*" 0009 getlocal b@0, 0 0012 opt_newarray_send 3, 5 0015 leave ``` Notes: Merged: https://github.com/ruby/ruby/pull/11249
2024-06-18	Optimized forwarding callers and callees	Aaron Patterson
	This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls. Calls it optimizes look like this: ```ruby def bar(a) = a def foo(...) = bar(...) # optimized foo(123) ``` ```ruby def bar(a) = a def foo(...) = bar(1, 2, ...) # optimized foo(123) ``` ```ruby def bar(a) = a def foo(...) list = [1, 2] bar(list, ...) # optimized end foo(123) ``` All variants of the above but using `super` are also optimized, including a bare super like this: ```ruby def foo(...) super end ``` This patch eliminates intermediate allocations made when calling methods that accept `...`. We can observe allocation elimination like this: ```ruby def m x = GC.stat(:total_allocated_objects) yield GC.stat(:total_allocated_objects) - x end def bar(a) = a def foo(...) = bar(...) def test m { foo(123) } end test p test # allocates 1 object on master, but 0 objects with this patch ``` ```ruby def bar(a, b:) = a + b def foo(...) = bar(...) def test m { foo(1, b: 2) } end test p test # allocates 2 objects on master, but 0 objects with this patch ``` How does it work? ----------------- This patch works by using a dynamic stack size when passing forwarded parameters to callees. The caller's info object (known as the "CI") contains the stack size of the parameters, so we pass the CI object itself as a parameter to the callee. When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee. The CI at the forwarded call site is adjusted using information from the caller's CI. I think this description is kind of confusing, so let's walk through an example with code. ```ruby def delegatee(a, b) = a + b def delegator(...) delegatee(...) # CI2 (FORWARDING) end def caller delegator(1, 2) # CI1 (argc: 2) end ``` Before we call the delegator method, the stack looks like this: ``` Executing Line \| Code \| Stack ---------------+---------------------------------------+-------- 1\| def delegatee(a, b) = a + b \| self 2\| \| 1 3\| def delegator(...) \| 2 4\| # \| 5\| delegatee(...) # CI2 (FORWARDING) \| 6\| end \| 7\| \| 8\| def caller \| -> 9\| delegator(1, 2) # CI1 (argc: 2) \| 10\| end \| ``` The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in to `delegator`, it writes `CI1` on to the stack as a local variable for the `delegator` method. The `delegator` method has a special local called `...` that holds the caller's CI object. Here is the ISeq disasm fo `delegator`: ``` == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] "..."@0 0000 putself ( 1)[LiCa] 0001 getlocal_WC_0 "..."@0 0003 send <calldata!mid:delegatee, argc:0, FCALL\|FORWARDING>, nil 0006 leave [Re] ``` The local called `...` will contain the caller's CI: CI1. Here is the stack when we enter `delegator`: ``` Executing Line \| Code \| Stack ---------------+---------------------------------------+-------- 1\| def delegatee(a, b) = a + b \| self 2\| \| 1 3\| def delegator(...) \| 2 -> 4\| # \| CI1 (argc: 2) 5\| delegatee(...) # CI2 (FORWARDING) \| cref_or_me 6\| end \| specval 7\| \| type 8\| def caller \| 9\| delegator(1, 2) # CI1 (argc: 2) \| 10\| end \| ``` The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to memcopy the caller's stack before calling `delegatee`. In this case, it will memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much memory to copy from the caller because `CI1` contains stack size information (argc: 2). Before executing the `send` instruction, we push `...` on the stack. The `send` instruction pops `...`, and because it is tagged with `FORWARDING`, it knows to memcopy (using the information in the CI it just popped): ``` == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] "..."@0 0000 putself ( 1)[LiCa] 0001 getlocal_WC_0 "..."@0 0003 send <calldata!mid:delegatee, argc:0, FCALL\|FORWARDING>, nil 0006 leave [Re] ``` Instruction 001 puts the caller's CI on the stack. `send` is tagged with FORWARDING, so it reads the CI and _copies_ the callers stack to this stack: ``` Executing Line \| Code \| Stack ---------------+---------------------------------------+-------- 1\| def delegatee(a, b) = a + b \| self 2\| \| 1 3\| def delegator(...) \| 2 4\| # \| CI1 (argc: 2) -> 5\| delegatee(...) # CI2 (FORWARDING) \| cref_or_me 6\| end \| specval 7\| \| type 8\| def caller \| self 9\| delegator(1, 2) # CI1 (argc: 2) \| 1 10\| end \| 2 ``` The "FORWARDING" call site combines information from CI1 with CI2 in order to support passing other values in addition to the `...` value, as well as perfectly forward splat args, kwargs, etc. Since we're able to copy the stack from `caller` in to `delegator`'s stack, we can avoid allocating objects. I want to do this to eliminate object allocations for delegate methods. My long term goal is to implement `Class#new` in Ruby and it uses `...`. I was able to implement `Class#new` in Ruby [here](https://github.com/ruby/ruby/pull/9289). If we adopt the technique in this patch, then we can optimize allocating objects that take keyword parameters for `initialize`. For example, this code will allocate 2 objects: one for `SomeObject`, and one for the kwargs: ```ruby SomeObject.new(foo: 1) ``` If we combine this technique, plus implement `Class#new` in Ruby, then we can reduce allocations for this common operation. Co-Authored-By: John Hawthorn <[email protected]> Co-Authored-By: Alan Wu <[email protected]>
2024-06-04	Do not emit shape transition warnings when YJIT is compiling	Jean Boussier
	[Bug #20522] If `Warning.warn` is redefined in Ruby, emitting a warning would invoke Ruby code, which can't safely be done when YJIT is compiling.
2024-05-28	Stop marking chilled strings as frozen	Étienne Barrié
	They were initially made frozen to avoid false positives for cases such as: str = str.dup if str.frozen? But this may cause bugs and is generally confusing for users. [Feature #20205] Co-authored-by: Jean Boussier <[email protected]>
2024-04-29	YJIT: Add specialized codegen function for `TrueClass#===` (#10640)	Randy Stauner
	* YJIT: Add specialized codegen function for `TrueClass#===` TrueClass#=== is currently number 10 in the most frequent C calls list of the lobsters benchmark. ``` require "benchmark/ips" def wrap true === true true === false true === :x end Benchmark.ips do \|x\| x.report(:wrap) do wrap end end ``` ``` before Warming up -------------------------------------- wrap 1.791M i/100ms Calculating ------------------------------------- wrap 17.806M (± 1.0%) i/s - 89.544M in 5.029363s after Warming up -------------------------------------- wrap 4.024M i/100ms Calculating ------------------------------------- wrap 40.149M (± 1.1%) i/s - 201.223M in 5.012527s ``` Co-authored-by: Maxime Chevalier-Boisvert <[email protected]> Co-authored-by: Takashi Kokubun (k0kubun) <[email protected]> Co-authored-by: Kevin Menard <[email protected]> Co-authored-by: Alan Wu <[email protected]> * Fix the new test for RJIT --------- Co-authored-by: Maxime Chevalier-Boisvert <[email protected]> Co-authored-by: Takashi Kokubun (k0kubun) <[email protected]> Co-authored-by: Kevin Menard <[email protected]> Co-authored-by: Alan Wu <[email protected]>
2024-04-25	YJIT: Optimize local variables when EP == BP (take 2) (#10607)	Takashi Kokubun
	* Revert "Revert "YJIT: Optimize local variables when EP == BP" (#10584)" This reverts commit c8783441952217c18e523749c821f82cd7e5d222. * YJIT: Take care of GC references in ISEQ invariants Co-authored-by: Alan Wu <[email protected]> --------- Co-authored-by: Alan Wu <[email protected]>
2024-04-24	YJIT: Add a specialized codegen function for `Class#superclass`. (#10613)	Kevin Menard
	Add a specialized codegen function for `Class#superclass`. Co-authored-by: Maxime Chevalier-Boisvert <[email protected]> Co-authored-by: Takashi Kokubun (k0kubun) <[email protected]> Co-authored-by: Randy Stauner <[email protected]> Co-authored-by: Alan Wu <[email protected]>
2024-04-19	Revert "YJIT: Optimize local variables when EP == BP" (#10584)	Alan Wu
	This reverts commit 4cc58ea0b865f2fd20f1e881ddbd4c4fab0b072c. Since the change landed call-threshold=1 CI runs have been timing out. There has also been `verify-ctx` violations. Revert for now while we debug.
2024-04-17	YJIT: Optimize local variables when EP == BP (#10487)	Takashi Kokubun

2024-02-27	YJIT: Support splat with C methods with -1 arity	Alan Wu
	Usually we deal with splats by speculating that they're of a specific size. In this case, the C method takes a pointer and a length, so we can support changing sizes just fine.
2024-02-15	YJIT: Pass nil to anonymous kwrest when empty (#9972)	Alan Wu
	This is the same optimization as e4272fd29 ("Avoid allocation when passing no keywords to anonymous kwrest methods") but for YJIT. For anonymous kwrest parameters, nil is just as good as an empty hash. On the usage side, update `splatkw` to handle `nil` with a leaner path.
2024-02-14	Move rb_class_allocate_instance from gc.c to object.c	Peter Zhu

2024-02-13	Specialize String#byteslice(a, b) (#9939)	Aaron Patterson
	* Specialize String#byteslice(a, b) This adds a specialization for String#byteslice when there are two parameters. This makes our protobuf parser go from 5.84x slower to 5.33x slower ``` Comparison: decode upstream (53738 bytes): 7228.5 i/s decode protobuff (53738 bytes): 1236.8 i/s - 5.84x slower Comparison: decode upstream (53738 bytes): 7024.8 i/s decode protobuff (53738 bytes): 1318.5 i/s - 5.33x slower ``` * Update yjit/src/codegen.rs --------- Co-authored-by: Maxime Chevalier-Boisvert <[email protected]>
2024-02-12	YJIT: Add support for `**kwrest` parameters	Alan Wu
	Now that `...` uses `**kwrest` instead of regular splat and ruby2keywords, we need to support these type of methods to support `...` well.
2024-02-09	YJIT: Add top ISEQ call counts to --yjit-stats (#9906)	Takashi Kokubun

2024-01-31	YJIT: Add codegen for Float arithmetics (#9774)	Takashi Kokubun
	* YJIT: Add codegen for Float arithmetics * Add Flonum and Fixnum tests
2024-01-26	YJIT: Fix exits on splatkw instruction (#9711)	Takashi Kokubun

2024-01-25	YJIT: Support concattoarray and pushtoarray (#9708)	Takashi Kokubun

2024-01-24	YJIT: Avoid leaks by skipping objects with a singleton class	Alan Wu
	For receiver with a singleton class, there are multiple vectors YJIT can end up retaining the object. There is a path in jit_guard_known_klass() that bakes the receiver into the code, and the object could also be kept alive indirectly through a path starting at the CME object baked into the code. To avoid these leaks, avoid compiling calls on objects with a singleton class. See: https://github.com/Shopify/ruby/issues/552 [Bug #20209]
2024-01-23	YJIT: Fix ruby2_keywords splat+rest and drop bogus checks	Alan Wu
	YJIT didn't guard for ruby2_keywords hash in case of splat calls that land in methods with a rest parameter, creating incorrect results. The compile-time checks didn't correspond to any actual effects of ruby2_keywords, so it was masking this bug and YJIT was needlessly refusing to compile some code. About 16% of fallback reasons in `lobsters` was due to the ISeq check. We already handle the tagging part with exit_if_supplying_kw_and_has_no_kw() and should now have a dynamic guard for all splat cases. Note for backporting: You also need 7f51959ff1. [Bug #20195]
2024-01-19	YJIT: Optimize defined?(yield) (#9599)	Takashi Kokubun
	* YJIT: Optimize defined?(yield) * Remove an irrelevant comment * s/get/gen/
2024-01-10	YJIT: Fix unused warnings	Alan Wu
	``` warning: unused import: `condition::Condition` --> src/asm/arm64/arg/mod.rs:13:9 \| 13 \| pub use condition::Condition; \| ^^^^^^^^^^^^^^^^^^^^ \| = note: `#[warn(unused_imports)]` on by default warning: unused import: `rb_yjit_fix_mul_fix as rb_fix_mul_fix` --> src/cruby.rs:188:9 \| 188 \| pub use rb_yjit_fix_mul_fix as rb_fix_mul_fix; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ warning: unused import: `rb_insn_len as raw_insn_len` --> src/cruby.rs:142:9 \| 142 \| pub use rb_insn_len as raw_insn_len; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: `#[warn(unused_imports)]` on by default ``` Make asm public so it stops warning about unused public stuff in there.
2023-11-08	Refactor rb_shape_transition_shape_capa out	Jean Boussier
	Right now the `rb_shape_get_next` shape caller need to first check if there is capacity left, and if not call `rb_shape_transition_shape_capa` before it can call `rb_shape_get_next`. And on each of these it needs to checks if we got a TOO_COMPLEX back. All this logic is duplicated in the interpreter, YJIT and RJIT. Instead we can have `rb_shape_get_next` do the capacity transition when needed. The caller can compare the old and new shapes capacity to know if resizing is needed. It also can check for TOO_COMPLEX only once.
2023-10-17	YJIT: Lookup IDs on boot instead of binding to them	Alan Wu
	Previously, the version-controlled `cruby_bindings.inc.rs` file contained the build-time artifact `id.h`, which nobu mentioned hinders the goal of having fewer magic numbers in the repository. Lookup the IDs YJIT needs on boot. It costs cycles, but it's fine since YJIT only uses a handful of IDs at the moment. No perceptible degradation to boot time found in my testing.
2023-10-05	YJIT: Remove duplicate cfp->iseq accessor	Alan Wu

2023-09-14	YJIT: Plug native stack overflow	Alan Wu
	Previously, TestStack#test_machine_stack_size failed pretty consistently on ARM64 macOS, with Rust code and part of the interpreter used for per-instruction fallback (rb_vm_invokeblock() and friends) touching the stack guard page and crashing with SEGV. I've also seen the same test fail on x64 Linux, though with a different symptom. Notes: Merged: https://github.com/ruby/ruby/pull/8443 Merged-By: XrXr
2023-08-17	YJIT: implement side chain fallback for setlocal to avoid exiting (#8227)	Maxime Chevalier-Boisvert
	* YJIT: implement side chain fallback for setlocal to avoid exiting * Update yjit/src/codegen.rs Co-authored-by: Takashi Kokubun <[email protected]> --------- Co-authored-by: Takashi Kokubun <[email protected]> Notes: Merged-By: maximecb <[email protected]>
2023-08-10	YJIT: Implement checkmatch instruction (#8203)	Takashi Kokubun
	Notes: Merged-By: maximecb <[email protected]>
2023-08-09	YJIT: Count throw instructions for each tag (#8188)	Takashi Kokubun
	* YJIT: Count throw instructions for each tag * Show % of each throw type Notes: Merged-By: k0kubun <[email protected]>
2023-08-08	YJIT: Compile exception handlers (#8171)	Takashi Kokubun
	Co-authored-by: Maxime Chevalier-Boisvert <[email protected]> Notes: Merged-By: k0kubun <[email protected]>
2023-08-02	YJIT: Move ROBJECT_OFFSET_* to yjit.c (#8157)	Takashi Kokubun
	Notes: Merged-By: maximecb <[email protected]>