Age | Commit message (Collapse) | Author |
|
|
|
* Add debug counters to RubyVM.stat
* Use SIZET2NUM
Co-author: Nobuyoshi Nakada <[email protected]>
* Prefix debug_counter_names
Notes:
Merged-By: k0kubun <[email protected]>
|
|
All values should have a MJIT_ prefix. We could address the warning for
the end mark if we just define the macro for the check next to the enum.
It even simplifies some code for checking the enum.
|
|
Prior to this commit the `OPTIMIZED_CMP` macro relied on a method lookup
to determine whether `<=>` was overridden. The result of the lookup was
cached, but only for the duration of the specific method that
initialized the cmp_opt_data cache structure.
With this method lookup, `[x,y].max` is slower than doing `x > y ?
x : y` even though there's an optimized instruction for "new array max".
(John noticed somebody a proposed micro-optimization based on this fact
in https://github.com/mastodon/mastodon/pull/19903.)
```rb
a, b = 1, 2
Benchmark.ips do |bm|
bm.report('conditional') { a > b ? a : b }
bm.report('method') { [a, b].max }
bm.compare!
end
```
Before:
```
Comparison:
conditional: 22603733.2 i/s
method: 19820412.7 i/s - 1.14x (± 0.00) slower
```
This commit replaces the method lookup with a new CMP basic op, which
gives the examples above equivalent performance.
After:
```
Comparison:
method: 24022466.5 i/s
conditional: 23851094.2 i/s - same-ish: difference falls within
error
```
Relevant benchmarks show an improvement to Array#max and Array#min when
not using the optimized newarray_max instruction as well. They are
noticeably faster for small arrays with the relevant types, and the same
or maybe a touch faster on larger arrays.
```
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_min
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_max
```
The benchmarks added in this commit also look generally improved.
Co-authored-by: John Hawthorn <[email protected]>
|
|
This commit moves ruby_basic_operators and the unredefined macros out of
vm_core.h and into basic_operators.h so that we can use them more
broadly in places where we currently use a method look up via
`rb_method_basic_definition_p` (e.g. object.c, numeric.c, complex.c,
enum.c, but also in internal/compar.h after introducing BOP_CMP and
elsewhere if we introduce more BOPs)
The most controversial part of this change is probably moving
redefined_flag out of rb_vm_t. [vm_opt_method_def_table and
vm_opt_mid_table](https://github.com/ruby/ruby/blob/9da2a5204f32a4f2ce135fddde2abb6e07d647e9/vm.c)
are not part of rb_vm_t either, and I think this fits well with those.
But more significantly it seems to result in one fewer instruction. For
example:
Before:
```
(lldb) disassemble -n vm_opt_str_freeze
miniruby`vm_exec_core:
miniruby[0x10028233e] <+14558>: movq 0x11a86b(%rip), %rax ; ruby_current_vm_ptr
miniruby[0x100282345] <+14565>: testb $0x4, 0x242c(%rax)
```
After:
```
(lldb) disassemble -n vm_opt_str_freeze
ruby`vm_exec_core:
ruby[0x100280ebe] <+14510>: testb $0x4, 0x120147(%rip) ; ruby_vm_redefined_flag + 43
```
Co-authored-by: John Hawthorn <[email protected]>
|
|
Notes:
Merged-By: ioquatix <[email protected]>
|
|
Previously, for statically-linked extensions, we used
`vm->loading_table` to delay calling the init function until the
extensions are required. This caused the extensions to look like they
are in the middle of being loaded even before they're required.
(`rb_feature_p()` returned true with a loading path output.) Combined
with autoload, queries like `defined?(CONST)` and `Module#autoload?`
were confused by this and returned nil incorrectly. RubyGems uses
`defined?` to detect if OpenSSL is available and failed when OpenSSL was
available in builds using `--with-static-linked-ext`.
Use a dedicated table for the init functions instead of adding them to
the loading table. This lets us remove some logic from non-EXTSTATIC
builds.
[Bug #19115]
Notes:
Merged: https://github.com/ruby/ruby/pull/6756
|
|
We need to track this number in CI. It's important to know how changes
to the codebase impact the number of shapes in the system, so lets add
the number to the VM stat hash
Notes:
Merged: https://github.com/ruby/ruby/pull/6798
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6721
|
|
for consistency with YJIT
Notes:
Merged-By: k0kubun <[email protected]>
|
|
* Reduce the number of branches in jit_exec
* Address build failure in some configurations
* Refactor yjit.h
Notes:
Merged-By: k0kubun <[email protected]>
|
|
|
|
The structure and readability of jit_exec is messed up right now. I'd
like to help the current situation by this for now. I'll resurrect
them when I need it again.
|
|
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.
This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.
Co-Authored-By: Aaron Patterson <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/6699
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6634
|
|
Before object shapes, we were using class serial to invalidate
inline caches. Now that we use shape_id for inline cache keys,
the class serial is unnecessary.
Co-Authored-By: Aaron Patterson <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/6605
|
|
This patch pushes dummy frames when loading code for the
profiling purpose.
The following methods push a dummy frame:
* `Kernel#require`
* `Kernel#load`
* `RubyVM::InstructionSequence.compile_file`
* `RubyVM::InstructionSequence.load_from_binary`
https://bugs.ruby-lang.org/issues/18559
Notes:
Merged: https://github.com/ruby/ruby/pull/6572
|
|
Prior to this commit, we were reading and writing ivar index and
shape ID in inline caches in two separate instructions when
getting and setting ivars. This meant there was a race condition
with ractors and these caches where one ractor could change
a value in the cache while another was still reading from it.
This commit instead reads and writes shape ID and ivar index to
inline caches atomically so there is no longer a race condition.
Co-Authored-By: Aaron Patterson <[email protected]>
Co-Authored-By: John Hawthorn <[email protected]>
|
|
This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
|
|
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
|
|
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects. Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness"). Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree. Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
class Bar
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects. See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <[email protected]>
Co-Authored-By: Eileen M. Uchitelle <[email protected]>
Co-Authored-By: John Hawthorn <[email protected]>
|
|
Revert "* expand tabs. [ci skip]"
This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275.
Revert "This commit implements the Object Shapes technique in CRuby."
This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
|
|
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects. Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness"). Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree. Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
class Bar
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects. See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <[email protected]>
Co-Authored-By: Eileen M. Uchitelle <[email protected]>
Co-Authored-By: John Hawthorn <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/6386
|
|
fork is for parallel compilation, but --mjit-wait cancels it.
It's more useful to not fork it for binding.irb, debugging, etc.
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6274
Merged-By: nobu <[email protected]>
|
|
* Rename mjit_exec to jit_exec
* Rename mjit_exec_slowpath to mjit_check_iseq
* Remove mjit_exec references from comments
Notes:
Merged-By: k0kubun <[email protected]>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6239
|
|
* Simplify around `USE_YJIT` macro
- Use `USE_YJIT` macro only instead of `YJIT_BUILD`.
- An intermediate macro `YJIT_SUPPORTED_P` is no longer used.
* Bail out if YJIT is enabled on unsupported platforms
Notes:
Merged-By: maximecb <[email protected]>
|
|
rb_ary_tmp_new suggests that the array is temporary in some way, but
that's not true, it just creates an array that's hidden and not on the
transient heap. This commit renames it to rb_ary_hidden_new.
Notes:
Merged: https://github.com/ruby/ruby/pull/6180
|
|
... as per ko1's request.
Notes:
Merged: https://github.com/ruby/ruby/pull/6169
|
|
[Misc #18891]
Notes:
Merged: https://github.com/ruby/ruby/pull/6094
|
|
This commit prevents the stack from being marked twice: once via the
Fiber, and once via the Thread. It introduces an assertion to assert
that the ec on the thread is the same as the ec on the Fiber being
marked via the thread.
Notes:
Merged: https://github.com/ruby/ruby/pull/6123
|
|
Previously, we didn't pop the frame that runs the TracePoint hook for
b_return events for blocks running as methods (bmethods). In case the
hook raises, that formed an infinite loop during stack unwinding in
hook_before_rewind().
[Bug #18060]
Notes:
Merged: https://github.com/ruby/ruby/pull/4638
|
|
just less C code to maintain
|
|
`NON_SCALAR_THREAD_ID` shows `pthread_t` is non-scalar (non-pointer)
and only s390x is known platform. However, the supporting code is
very complex and it is only used for deubg print information.
So this patch removes the support of `NON_SCALAR_THREAD_ID`
and make the code simple.
Notes:
Merged: https://github.com/ruby/ruby/pull/5933
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5922
|
|
`rb_thread_t::serial` is auto-incremented serial number for
threads and it can overflow, it means the serial is not a ID
for each thread, it is only for debug print.
`RUBY_DEBUG_LOG` shows this information.
Also skip EC related information if EC is NULL. This patch
enable to use `RUBY_DEBUG_LOG` without setup EC.
Notes:
Merged: https://github.com/ruby/ruby/pull/5921
|
|
* Update naming of critical section assertions macros.
* Improved locking for autoload.
Notes:
Merged-By: ioquatix <[email protected]>
|
|
* Add RUBY_VM_CRITICAL_SECTION for detecting unexpected context switch.
* Prevent race between GC mark and autoload setup.
* Protect race on autoload state.
* Avoid potential race condition when allocating `autoload_featuremap`.
* Add NEWS entry for autoload fixes.
Notes:
Merged-By: ioquatix <[email protected]>
|
|
In December 2021, we opened an [issue] to solicit feedback regarding the
porting of the YJIT codebase from C99 to Rust. There were some
reservations, but this project was given the go ahead by Ruby core
developers and Matz. Since then, we have successfully completed the port
of YJIT to Rust.
The new Rust version of YJIT has reached parity with the C version, in
that it passes all the CRuby tests, is able to run all of the YJIT
benchmarks, and performs similarly to the C version (because it works
the same way and largely generates the same machine code). We've even
incorporated some design improvements, such as a more fine-grained
constant invalidation mechanism which we expect will make a big
difference in Ruby on Rails applications.
Because we want to be careful, YJIT is guarded behind a configure
option:
```shell
./configure --enable-yjit # Build YJIT in release mode
./configure --enable-yjit=dev # Build YJIT in dev/debug mode
```
By default, YJIT does not get compiled and cargo/rustc is not required.
If YJIT is built in dev mode, then `cargo` is used to fetch development
dependencies, but when building in release, `cargo` is not required,
only `rustc`. At the moment YJIT requires Rust 1.60.0 or newer.
The YJIT command-line options remain mostly unchanged, and more details
about the build process are documented in `doc/yjit/yjit.md`.
The CI tests have been updated and do not take any more resources than
before.
The development history of the Rust port is available at the following
commit for interested parties:
https://github.com/Shopify/ruby/commit/1fd9573d8b4b65219f1c2407f30a0a60e537f8be
Our hope is that Rust YJIT will be compiled and included as a part of
system packages and compiled binaries of the Ruby 3.2 release. We do not
anticipate any major problems as Rust is well supported on every
platform which YJIT supports, but to make sure that this process works
smoothly, we would like to reach out to those who take care of building
systems packages before the 3.2 release is shipped and resolve any
issues that may come up.
[issue]: https://bugs.ruby-lang.org/issues/18481
Co-authored-by: Maxime Chevalier-Boisvert <[email protected]>
Co-authored-by: Noah Gibbs <[email protected]>
Co-authored-by: Kevin Newton <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/5826
|
|
`rb_thread_t` contained `native_thread_data_t` to represent
thread implementation dependent data. This patch separates
them and rename it `rb_native_thread` and point it from
`rb_thraed_t`.
Now, 1 Ruby thread (`rb_thread_t`) has 1 native thread (`rb_native_thread`).
Notes:
Merged: https://github.com/ruby/ruby/pull/5836
|
|
* `th_init` accepts vm and ractor.
* remove `ruby_thread_init` because it is duplicated with `th_init`.
* add some comments.
Notes:
Merged: https://github.com/ruby/ruby/pull/5834
|
|
Not sure if this is the correct fix. It does raise LocalJumpError in
the yielding thread as you would expect, but the value yielded to the calling
thread is still yielded without an exception.
Fixes [Bug #18649]
Notes:
Merged: https://github.com/ruby/ruby/pull/5692
|
|
Check whether the current or previous frame is a Ruby frame in
call_trace_func and rb_tracearg_binding before attempting to
create a binding for the frame.
Fixes [Bug #18487]
Co-authored-by: Alan Wu <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/5767
Merged-By: jeremyevans <[email protected]>
|
|
Before the new constant cache behavior, caches were invalidated by a
single global variable. You could inspect the value of this variable
with RubyVM.stat(:global_constant_state). This was mostly useful to
verify the behavior of the VM or to test constant loading like in Rails.
With the new constant cache behavior, we introduced
RubyVM.stat(:constant_cache) which returned a hash with symbol keys and
integer values that represented the number of live constant caches
associated with the given symbol. Additionally, we removed the old
RubyVM.stat(:global_constant_state).
This was proven to be not very useful, so it doesn't help you diagnose
constant loading issues. So, instead we added the global constant state
back into the RubyVM output. However, that number can be misleading as
now when you invalidate something like `Foo::Bar::Baz` you're actually
invalidating 3 different lists of inline caches.
This commit attempts to get the best of both worlds. We remove
RubyVM.stat(:global_constant_state) like we did originally, as it
doesn't have the same semantic meaning and it could be confusing going
forward. Instead we add RubyVM.stat(:constant_cache_invalidations) and
RubyVM.stat(:constant_cache_misses). These two metrics should provide
enough information to diagnose any constant loading issues, as well as
provide a replacement for the old global constant state.
Notes:
Merged-By: maximecb <[email protected]>
|
|
This was removed as part of [Feature #18589]. But some applications were relying on this behavior. So bringing this back to make it better for backward compatibility going forward.
Notes:
Merged: https://github.com/ruby/ruby/pull/5758
|
|
This commit reintroduces finer-grained constant cache invalidation.
After 8008fb7 got merged, it was causing issues on token-threaded
builds (such as on Windows).
The issue was that when you're iterating through instruction sequences
and using the translator functions to get back the instruction structs,
you're either using `rb_vm_insn_null_translator` or
`rb_vm_insn_addr2insn2` depending if it's a direct-threading build.
`rb_vm_insn_addr2insn2` does some normalization to always return to
you the non-trace version of whatever instruction you're looking at.
`rb_vm_insn_null_translator` does not do that normalization.
This means that when you're looping through the instructions if you're
trying to do an opcode comparison, it can change depending on the type
of threading that you're using. This can be very confusing. So, this
commit creates a new translator function
`rb_vm_insn_normalizing_translator` to always return the non-trace
version so that opcode comparisons don't have to worry about different
configurations.
[Feature #18589]
Notes:
Merged: https://github.com/ruby/ruby/pull/5716
|
|
This reverts commit 343ea9967e4a6b279eed6bd8e81ad0bdc747f254.
This causes an assertion failure with -DRUBY_DEBUG=1 -DRGENGC_CHECK_MODE=2
|
|
* Prefixed ccan headers
* Remove unprefixed names in ccan/build_assert
* Remove unprefixed names in ccan/check_type
* Remove unprefixed names in ccan/container_of
* Remove unprefixed names in ccan/list
Co-authored-by: Samuel Williams <[email protected]>
Notes:
Merged-By: ioquatix <[email protected]>
|
|
This reverts commits for [Feature #18589]:
* 8008fb7352abc6fba433b99bf20763cf0d4adb38
"Update formatting per feedback"
* 8f6eaca2e19828e92ecdb28b0fe693d606a03f96
"Delete ID from constant cache table if it becomes empty on ISEQ free"
* 629908586b4bead1103267652f8b96b1083573a8
"Finer-grained inline constant cache invalidation"
MSWin builds on AppVeyor have been crashing since the merger.
Notes:
Merged: https://github.com/ruby/ruby/pull/5715
Merged-By: nobu <[email protected]>
|