ruby.git - The Ruby Programming Language

Age	Commit message	Author
2024-06-18	Optimized forwarding callers and callees	Aaron Patterson
	This patch optimizes forwarding callers and callees. It only optimizes methods that take only `...` as their parameter and then pass `...` to other calls.

Calls it optimizes look like this:

```ruby
def bar(a) = a
def foo(...) = bar(...) # optimized
foo(123)
```

```ruby
def bar(*a) = a
def foo(...) = bar(1, 2, ...) # optimized
foo(123)
```

```ruby
def bar(*a) = a
def foo(...)
  list = [1, 2]
  bar(list, ...) # optimized
end
foo(123)
```

All variants of the above but using `super` are also optimized, including a bare super like this:

```ruby
def foo(...)
  super
end
```

This patch eliminates intermediate allocations made when calling methods that accept `...`. We can observe allocation elimination like this:

```ruby
def m
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

def bar(a) = a
def foo(...) = bar(...)

def test
  m { foo(123) }
end

test
p test # allocates 1 object on master, but 0 objects with this patch
```

```ruby
# Uses `m` from the previous example.
def bar(a, b:) = a + b
def foo(...) = bar(...)

def test
  m { foo(1, b: 2) }
end

test
p test # allocates 2 objects on master, but 0 objects with this patch
```

How does it work?
-----------------

This patch works by using a dynamic stack size when passing forwarded parameters to callees. The caller's info object (known as the "CI") contains the stack size of the parameters, so we pass the CI object itself as a parameter to the callee. When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee. The CI at the forwarded call site is adjusted using information from the caller's CI.

I think this description is kind of confusing, so let's walk through an example with code.

```ruby
def delegatee(a, b) = a + b

def delegator(...)
  delegatee(...)  # CI2 (FORWARDING)
end

def caller
  delegator(1, 2) # CI1 (argc: 2)
end
```

Before we call the delegator method, the stack looks like this:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   |
              5|   delegatee(...)  # CI2 (FORWARDING)  |
              6| end                                   |
              7|                                       |
              8| def caller                            |
           -> 9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls into `delegator`, it writes `CI1` onto the stack as a local variable for the `delegator` method. The `delegator` method has a special local called `...` that holds the caller's CI object.

Here is the ISeq disasm for `delegator`:

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

The local called `...` will contain the caller's CI: CI1.

Here is the stack when we enter `delegator`:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
           -> 4|   #                                   | CI1 (argc: 2)
              5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            |
              9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to memcopy the caller's stack before calling `delegatee`. In this case, it will memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much memory to copy from the caller because `CI1` contains stack size information (argc: 2).

Before executing the `send` instruction, we push `...` on the stack. The `send` instruction pops `...`, and because it is tagged with `FORWARDING`, it knows to memcopy (using the information in the CI it just popped):

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

Instruction 0001 puts the caller's CI on the stack. `send` is tagged with FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   | CI1 (argc: 2)
           -> 5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            | self
              9|   delegator(1, 2) # CI1 (argc: 2)     | 1
             10| end                                   | 2
```

The "FORWARDING" call site combines information from CI1 with CI2 in order to support passing other values in addition to the `...` value, as well as perfectly forward splat args, kwargs, etc.

Since we're able to copy the stack from `caller` into `delegator`'s stack, we can avoid allocating objects.

I want to do this to eliminate object allocations for delegate methods. My long term goal is to implement `Class#new` in Ruby and it uses `...`.

I was able to implement `Class#new` in Ruby [here](https://github.com/ruby/ruby/pull/9289). If we adopt the technique in this patch, then we can optimize allocating objects that take keyword parameters for `initialize`.

For example, this code will allocate 2 objects: one for `SomeObject`, and one for the kwargs:

```ruby
SomeObject.new(foo: 1)
```

If we combine this technique, plus implement `Class#new` in Ruby, then we can reduce allocations for this common operation.
Co-Authored-By: John Hawthorn <[email protected]> Co-Authored-By: Alan Wu <[email protected]>
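The stack copy walked through in the commit message above can be imitated with a small toy model. This is only a sketch under loud assumptions: the VM stack is modelled as a plain Ruby Array, and `ToyCallInfo` and `forwarding_send` are hypothetical names that do not exist in CRuby. It is meant to show the order of operations (pop the caller's CI, copy `argc` slots, merge the two CIs), not the real C implementation.

```ruby
# Toy model of the FORWARDING stack copy. NOT CRuby's implementation:
# ToyCallInfo and forwarding_send are hypothetical, illustrative names.
ToyCallInfo = Struct.new(:mid, :argc, :forwarding)

# Models `send` at a FORWARDING call site: pop the caller's CI (the `...`
# local), then copy `argc` argument slots from the caller's part of the stack.
def forwarding_send(stack, caller_args_start, forwarding_ci)
  caller_ci = stack.pop
  copied = stack[caller_args_start, caller_ci.argc]
  stack.concat(copied) # a copy, the caller's slots stay in place
  # The effective call info combines both CIs: the method name comes from the
  # forwarding site, the argument count from the caller.
  ToyCallInfo.new(forwarding_ci.mid, forwarding_ci.argc + caller_ci.argc, false)
end

ci1 = ToyCallInfo.new(:delegator, 2, false) # delegator(1, 2)
ci2 = ToyCallInfo.new(:delegatee, 0, true)  # delegatee(...)

stack = [:self, 1, 2]                         # pushed by `caller`
stack.push(ci1, :cref_or_me, :specval, :type) # frame for `delegator`; `...` holds CI1
stack.push(:self)                             # 0000 putself
stack.push(ci1)                               # 0001 getlocal "..."
effective_ci = forwarding_send(stack, 1, ci2) # 0003 send ... FORWARDING

p stack.last(3)     # => [:self, 1, 2]
p effective_ci.argc # => 2
```

The final stack has the same shape as the last diagram in the commit message: the caller's three slots, the `delegator` frame data, and then a fresh copy of the receiver and arguments for `delegatee`.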
2024-03-03	Drop support for old ERB	Nobuyoshi Nakada

2023-08-24	Escape non-ascii characters in prelude C comments	Nobuyoshi Nakada
	Localized compilers often warn about non-ASCII characters. Notes: Merged: https://github.com/ruby/ruby/pull/8276
2023-08-16	Move the PC regardless of the leaf flag (#8232)	Takashi Kokubun
	Co-authored-by: Alan Wu <[email protected]> Notes: Merged-By: k0kubun <[email protected]>
2023-07-27	Clean up OPT_STACK_CACHING (#8132)	Takashi Kokubun
	Notes: Merged-By: k0kubun <[email protected]>
2023-03-16	Rename opes to operands on RubyVM::BaseInstruction	John Hawthorn
	Notes: Merged: https://github.com/ruby/ruby/pull/7523
2023-03-16	Rename opes to operands	John Hawthorn
	Co-authored-by: Aaron Patterson <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/7523
2023-03-16	Re-add RJIT::Instruction#opes	John Hawthorn
	Notes: Merged: https://github.com/ruby/ruby/pull/7523
2023-03-10	RJIT: Simplify RubyVM::RJIT::Instruction	Takashi Kokubun

2023-03-06	s/MJIT/RJIT/	Takashi Kokubun
	Notes: Merged: https://github.com/ruby/ruby/pull/7462
2023-03-06	Rename MJIT filenames to RJIT	Takashi Kokubun
	Notes: Merged: https://github.com/ruby/ruby/pull/7462
2023-03-06	Remove obsoleted mjit_sp_inc.inc.erb	Takashi Kokubun

2023-03-05	Decode trace insns properly	Takashi Kokubun

2023-03-05	Move modules around	Takashi Kokubun

2023-02-24	Fix incorrect line numbers in GC hook	Peter Zhu
	If the previous instruction is not a leaf instruction, then the PC was incremented before the instruction was run (meaning the currently executing instruction is actually the previous instruction), so we should not increment the PC; otherwise we will calculate the source line for the next instruction. This bug can be reproduced in the following script: ``` require "objspace" ObjectSpace.trace_object_allocations_start a = 1.0 / 0.0 p [ObjectSpace.allocation_sourceline(a), ObjectSpace.allocation_sourcefile(a)] ``` Which outputs: [4, "test.rb"] This is incorrect because the object was allocated on line 10 and not line 4. The behaviour is correct when we use a leaf instruction (e.g. if we replaced `1.0 / 0.0` with `"hello"`), then the output is: [10, "test.rb"]. [Bug #19456] Notes: Merged: https://github.com/ruby/ruby/pull/7357
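As a rough illustration of the API involved in the bug report above, the following sketch records where an object was allocated and compares it with the line the allocation is written on. It assumes CRuby with the `objspace` extension; the variable names are illustrative, and it uses a single-line allocation, so it does not reproduce the multi-line case from the commit message.

```ruby
require "objspace"

file = line = expected_line = nil

# Allocation metadata is only recorded while tracing is enabled.
ObjectSpace.trace_object_allocations do
  expected_line = __LINE__ + 1
  obj = Object.new
  file = ObjectSpace.allocation_sourcefile(obj)
  line = ObjectSpace.allocation_sourceline(obj)
end

p [line, file]
puts "reported line matches the allocation site" if line == expected_line
```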
2023-02-24	Fix RubyVM::CExpr#inspect	Peter Zhu
	@__LINE__ can be nil which causes the inspect method to fail. Notes: Merged: https://github.com/ruby/ruby/pull/7357
2022-12-22	Polish the public docs for MJIT [ci skip]	Takashi Kokubun
	Now every private interface is cleaned up, and the public interface is documented.
2022-12-21	Put RubyVM::MJIT::Compiler under ruby_vm directory (#6989)	Takashi Kokubun
	[Misc #19250] Notes: Merged-By: k0kubun <[email protected]>
2022-11-29	MJIT: Rename mjit_compile_attr to mjit_sp_inc	Takashi Kokubun
	There's no mjit_compile.inc, so no need to use this prefix anymore.
2022-10-13	Revert "FreeBSD make uses the target under srcdir [ci skip]"	Nobuyoshi Nakada
	This reverts commit 751ffb276f658518c6fe06461a9d3d1c136c7d5d, which caused build failures on other platforms.
2022-10-13	FreeBSD make uses the target under srcdir [ci skip]	Nobuyoshi Nakada

2022-09-23	mjit_c.rb doesn't need to be an erb	Takashi Kokubun
	Notes: Merged: https://github.com/ruby/ruby/pull/6418
2022-09-23	Mix manual and auto-generated C APIs	Takashi Kokubun
	Notes: Merged: https://github.com/ruby/ruby/pull/6418
2022-09-23	Bindgen macro with builtin	Takashi Kokubun
	Notes: Merged: https://github.com/ruby/ruby/pull/6418
2022-09-23	Builtin RubyVM::MJIT::C	Takashi Kokubun
	Notes: Merged: https://github.com/ruby/ruby/pull/6418
2022-09-22	Expand paths used for dumper.rb	Takashi Kokubun
	This seems to be needed on Samuel's environment
2022-09-18	Introduce --basedir to insns2vm.rb	Takashi Kokubun
	and leverage that to preserve the directory structure under tool/ruby_vm/views
2022-09-18	Revert "Preserve the directory structure under tool/ruby_vm/views"	Takashi Kokubun
	This reverts commit 62ec621f8c7457374d1f08aec97138ac1b7bdf2a. will revisit this once fixing non-MJIT targets
2022-09-18	Preserve the directory structure under tool/ruby_vm/views	Takashi Kokubun
	for nested target directories
2022-09-18	Demote mjit_instruction.rb from builtin to stdlib	Takashi Kokubun

2022-09-05	Fix warnings from private_constant	Takashi Kokubun
	`private_constant *constants` seems to trigger a warning for some reason
2022-09-04	Ruby MJIT (#6028)	Takashi Kokubun
	Notes: Merged-By: k0kubun <[email protected]>
2022-09-01	New constant caching insn: opt_getconstant_path	John Hawthorn
	Previously YARV bytecode implemented constant caching by having a pair of instructions, opt_getinlinecache and opt_setinlinecache, wrapping a series of getconstant calls (with putobject providing supporting arguments).

This commit replaces that pattern with a new instruction, opt_getconstant_path, handling both getting/setting the inline cache and fetching the constant on a cache miss.

This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. idNULL is used to signal an absolute constant reference.

    $ ./miniruby --dump=insns -e '::Foo::Bar::Baz'
    == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
    0000 opt_getconstant_path                   <ic:0 ::Foo::Bar::Baz>    (   1)[Li]
    0002 leave

The motivation for this is that we had increasingly found the need to disassemble the instructions between the opt_getinlinecache and opt_setinlinecache in order to determine the constant we are fetching, or otherwise store metadata.

This disassembly was done:
* In opt_setinlinecache, to register the IC against the constant names it is using for granular invalidation.
* In rb_iseq_free, to unregister the IC from the invalidation table.
* In YJIT, to find the position of an opt_getinlinecache instruction to invalidate it when the cache is populated.
* In YJIT, to register the constant names being used for invalidation.

With this change we no longer need disassembly for these (in fact rb_iseq_each is now unused), as the list of constant names being referenced is held in the IC. This should also make it possible to make more optimizations in the future.

This may also reduce the size of iseqs, as previously each constant segment required 32 bytes (on 64-bit platforms). This implementation only stores one ID per segment.

There should be no significant performance change between this and the previous implementation. Previously opt_getinlinecache was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now opt_getconstant_path is a non-leaf (it may raise/autoload/call const_missing) but it does not jump. These seem to even out.

Notes: Merged: https://github.com/ruby/ruby/pull/6187
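The instruction dump in the commit message above can be approximated from within Ruby itself. The sketch below assumes CRuby, since `RubyVM::InstructionSequence` is a CRuby-specific API, and the exact disassembly text differs between versions: builds that include this commit should show a single `opt_getconstant_path`, while older builds show the `opt_getinlinecache` / `getconstant` / `opt_setinlinecache` sequence.

```ruby
# CRuby-only sketch. Compiling does not evaluate the code, so the constants
# Foo, Bar and Baz do not need to exist.
iseq = RubyVM::InstructionSequence.compile("::Foo::Bar::Baz")
disasm = iseq.disasm

# Keep only the instructions that deal with constant lookup and caching.
constant_lines = disasm.lines.grep(/getconstant|inlinecache/)
puts constant_lines
```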
2022-08-21	Rename mjit_compile.c to mjit_compiler.c	Takashi Kokubun
	I'm planning to introduce mjit_compiler.rb, and I want to make this consistent with it. Consistency with compile.c doesn't seem important for MJIT anyway.
2022-08-19	Rename mjit_exec to jit_exec (#6262)	Takashi Kokubun
	* Rename mjit_exec to jit_exec
* Rename mjit_exec_slowpath to mjit_check_iseq
* Remove mjit_exec references from comments
Notes: Merged-By: k0kubun <[email protected]>
2022-07-21	Expand tabs [ci skip]	Takashi Kokubun
	[Misc #18891] Notes: Merged: https://github.com/ruby/ruby/pull/6094
2022-07-18	Separate TS_IVC and TS_ICVARC in is_entries buffers	Jemma Issroff
	This allows us to treat cvar caches differently than ivar caches. Notes: Merged: https://github.com/ruby/ruby/pull/6148
2022-07-15	Implement Objects on VWA	Peter Zhu
	This commit implements Objects on Variable Width Allocation. This allows Objects with more ivars to be embedded (i.e. contents directly follow the object header) which improves performance through better cache locality. Notes: Merged: https://github.com/ruby/ruby/pull/6117
2022-06-30	Adjust indent [ci skip]	Nobuyoshi Nakada

2022-06-29	Move function to `static inline` so we don't have leaked globals	Aaron Patterson
	This function shouldn't leak and is only needed during instruction assembly Notes: Merged: https://github.com/ruby/ruby/pull/6069
2022-04-01	Finer-grained constant cache invalidation (take 2)	Kevin Newton
	This commit reintroduces finer-grained constant cache invalidation. After 8008fb7 got merged, it was causing issues on token-threaded builds (such as on Windows).

The issue was that when you're iterating through instruction sequences and using the translator functions to get back the instruction structs, you're either using `rb_vm_insn_null_translator` or `rb_vm_insn_addr2insn2` depending on whether it's a direct-threading build. `rb_vm_insn_addr2insn2` does some normalization to always return to you the non-trace version of whatever instruction you're looking at. `rb_vm_insn_null_translator` does not do that normalization.

This means that when you're looping through the instructions, if you're trying to do an opcode comparison, it can change depending on the type of threading that you're using. This can be very confusing. So, this commit creates a new translator function `rb_vm_insn_normalizing_translator` to always return the non-trace version so that opcode comparisons don't have to worry about different configurations.

[Feature #18589]

Notes: Merged: https://github.com/ruby/ruby/pull/5716
2022-03-25	Revert "Finer-grained inline constant cache invalidation"	Nobuyoshi Nakada
	This reverts commits for [Feature #18589]:
* 8008fb7352abc6fba433b99bf20763cf0d4adb38 "Update formatting per feedback"
* 8f6eaca2e19828e92ecdb28b0fe693d606a03f96 "Delete ID from constant cache table if it becomes empty on ISEQ free"
* 629908586b4bead1103267652f8b96b1083573a8 "Finer-grained inline constant cache invalidation"

MSWin builds on AppVeyor have been crashing since the merge.

Notes: Merged: https://github.com/ruby/ruby/pull/5715 Merged-By: nobu <[email protected]>
2022-03-24	Finer-grained inline constant cache invalidation	Kevin Newton
	Current behavior - caches depend on a global counter. All constant mutations cause caches to be invalidated.

```ruby
class A
  B = 1
end

def foo
  A::B # inline cache depends on global counter
end

foo # populate inline cache
foo # hit inline cache

C = 1 # global counter increments, all caches are invalidated

foo # misses inline cache due to `C = 1`
```

Proposed behavior - caches depend on name components. Only constant mutations with corresponding names will invalidate the cache.

```ruby
class A
  B = 1
end

def foo
  A::B # inline cache depends on constants named "A" and "B"
end

foo # populate inline cache
foo # hit inline cache

C = 1 # caches that depend on the name "C" are invalidated

foo # hits inline cache because IC only depends on "A" and "B"
```

Examples of breaking the new cache:

```ruby
module C
  # Breaks `foo` cache because "A" constant is set and the cache in foo depends
  # on "A" and "B"
  class A; end
end

B = 1
```

We expect the new cache scheme to be invalidated less often because names aren't frequently reused. With the cache being invalidated less, we can rely on its stability more to keep our constant references fast and reduce the need to throw away generated code in YJIT.

Notes: Merged: https://github.com/ruby/ruby/pull/5433
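The name-keyed invalidation scheme described above can be sketched as a small bookkeeping table. This is a toy model under stated assumptions, not the VM's data structure: `ToyConstantCaches`, `new_cache`, and `constant_assigned` are hypothetical names, and a cache is reduced to a struct with a `filled` flag.

```ruby
# Toy model: each inline cache registers the constant names it depends on,
# and assigning a constant only clears caches registered under that name.
class ToyConstantCaches
  Cache = Struct.new(:names, :filled)

  def initialize
    @caches_by_name = Hash.new { |hash, name| hash[name] = [] }
  end

  # Register a cache for a constant path such as A::B (names :A and :B).
  def new_cache(*names)
    cache = Cache.new(names, false)
    names.each { |name| @caches_by_name[name] << cache }
    cache
  end

  # Called whenever a constant with this name is set in any namespace.
  def constant_assigned(name)
    return unless @caches_by_name.key?(name)
    @caches_by_name[name].each { |cache| cache.filled = false }
  end
end

caches = ToyConstantCaches.new
foo_cache = caches.new_cache(:A, :B) # the cache for `A::B` inside `foo`
foo_cache.filled = true              # first call to `foo` populates it

caches.constant_assigned(:C)         # C = 1
after_c = foo_cache.filled           # still filled, "C" is unrelated

caches.constant_assigned(:A)         # module C; class A; end; end
after_a = foo_cache.filled           # cleared, because the name "A" was set

p [after_c, after_a] # => [true, false]
```

Keying on names rather than on fully resolved constants keeps the bookkeeping cheap, at the cost of occasional spurious invalidations such as the `C::A` case in the commit message.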
2022-03-24	Add ISEQ_BODY macro	Peter Zhu
	Use ISEQ_BODY macro to get the rb_iseq_constant_body of the ISeq. Using this macro will make it easier for us to change the allocation strategy of rb_iseq_constant_body when using Variable Width Allocation. Notes: Merged: https://github.com/ruby/ruby/pull/5698
2022-02-02	Treat TS_ICVARC cache as separate from TS_IVC cache	Jemma Issroff
	Notes: Merged: https://github.com/ruby/ruby/pull/5519
2021-12-05	Make `leaf` const in VM generator	Alan Wu
	Assigning to `leaf` in insns.def would give undesirable results.
2021-10-29	vm_core.h: Avoid unaligned access to ic_serial on 32-bit machine	Yusuke Endoh
	This caused a bus error on 32-bit Solaris. Notes: Merged: https://github.com/ruby/ruby/pull/5049
2021-10-20	Cleanup diff against upstream. Add comments	Alan Wu
	I did a `git diff --stat` against upstream and looked at all the files that are outside of YJIT to come up with these minor changes.
2021-10-20	Remove the scraper	Aaron Patterson
	Now that we're using the jit function entry point, we don't need the scraper. Thank you for your service, scraper. ❤️
2021-10-20	Remove some MicroJIT vestiges	Aaron Patterson
	Just happened to run across these, so let's fix them