Age | Commit message (Collapse) | Author |
|
Prior to this commit the `OPTIMIZED_CMP` macro relied on a method lookup
to determine whether `<=>` was overridden. The result of the lookup was
cached, but only for the duration of the specific method that
initialized the cmp_opt_data cache structure.
With this method lookup, `[x,y].max` is slower than doing `x > y ?
x : y` even though there's an optimized instruction for "new array max".
(John noticed somebody a proposed micro-optimization based on this fact
in https://github.com/mastodon/mastodon/pull/19903.)
```rb
a, b = 1, 2
Benchmark.ips do |bm|
bm.report('conditional') { a > b ? a : b }
bm.report('method') { [a, b].max }
bm.compare!
end
```
Before:
```
Comparison:
conditional: 22603733.2 i/s
method: 19820412.7 i/s - 1.14x (± 0.00) slower
```
This commit replaces the method lookup with a new CMP basic op, which
gives the examples above equivalent performance.
After:
```
Comparison:
method: 24022466.5 i/s
conditional: 23851094.2 i/s - same-ish: difference falls within
error
```
Relevant benchmarks show an improvement to Array#max and Array#min when
not using the optimized newarray_max instruction as well. They are
noticeably faster for small arrays with the relevant types, and the same
or maybe a touch faster on larger arrays.
```
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_min
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_max
```
The benchmarks added in this commit also look generally improved.
Co-authored-by: John Hawthorn <[email protected]>
|
|
This commit moves ruby_basic_operators and the unredefined macros out of
vm_core.h and into basic_operators.h so that we can use them more
broadly in places where we currently use a method look up via
`rb_method_basic_definition_p` (e.g. object.c, numeric.c, complex.c,
enum.c, but also in internal/compar.h after introducing BOP_CMP and
elsewhere if we introduce more BOPs)
The most controversial part of this change is probably moving
redefined_flag out of rb_vm_t. [vm_opt_method_def_table and
vm_opt_mid_table](https://github.com/ruby/ruby/blob/9da2a5204f32a4f2ce135fddde2abb6e07d647e9/vm.c)
are not part of rb_vm_t either, and I think this fits well with those.
But more significantly it seems to result in one fewer instruction. For
example:
Before:
```
(lldb) disassemble -n vm_opt_str_freeze
miniruby`vm_exec_core:
miniruby[0x10028233e] <+14558>: movq 0x11a86b(%rip), %rax ; ruby_current_vm_ptr
miniruby[0x100282345] <+14565>: testb $0x4, 0x242c(%rax)
```
After:
```
(lldb) disassemble -n vm_opt_str_freeze
ruby`vm_exec_core:
ruby[0x100280ebe] <+14510>: testb $0x4, 0x120147(%rip) ; ruby_vm_redefined_flag + 43
```
Co-authored-by: John Hawthorn <[email protected]>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6700
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6700
|
|
Notes:
Merged-By: ioquatix <[email protected]>
|
|
obj_ivar_set and vm_setivar_slowpath is essentially doing the same thing,
but the code is duplicated and not quite implemented in the same way,
which could cause bugs. This commit refactors vm_setivar_slowpath to use
obj_ivar_set.
Notes:
Merged: https://github.com/ruby/ruby/pull/6732
|
|
Implementation for Language Server Protocol (LSP) sometimes needs token information.
For example both `m(1)` and `m(1, )` has same AST structure other than node locations
then it's impossible to check the existence of `,` from AST. However in later case,
it might be better to suggest variables list for the second argument.
Token information is important for such case.
This commit adds these methods.
* Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of`
* Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes.
* Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node.
[Feature #19070]
Impacts on memory usage and performance are below:
Memory usage:
```
$ cat test.rb
root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
11408kb
# keep_tokens :false
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
17508kb
# keep_tokens :true
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
30960kb
```
Performance:
```
$ cat ../ast_keep_tokens.yml
prelude: |
src = <<~SRC
module M
class C
def m1(a, b)
1 + a + b
end
end
end
SRC
benchmark:
without_keep_tokens: |
RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
with_keep_tokens: |
RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)
$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
--executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
--executables="built-ruby::./miniruby -I../lib -I. -I.ext/common ../tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \
--output=markdown --output-compare -v ../ast_keep_tokens.yml
compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
warming up..
| |compare-ruby|built-ruby|
|:--------------------|-----------:|---------:|
|without_keep_tokens | 21.659k| 21.303k|
| | 1.02x| -|
|with_keep_tokens | 6.220k| 5.691k|
| | 1.09x| -|
```
Notes:
Merged: https://github.com/ruby/ruby/pull/6770
|
|
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.
This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.
Co-Authored-By: Aaron Patterson <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/6699
|
|
The ELTS_SHARED flag is generic, so we should prefer to use the flags
specific of the type (STR_SHARED for strings and RARRAY_SHARED_FLAG
for arrays).
|
|
* Avoid RCLASS_IV_TBL in marshal.c
* Avoid RCLASS_IV_TBL for class names
* Avoid RCLASS_IV_TBL for autoload
* Avoid RCLASS_IV_TBL for class variables
* Avoid copying RCLASS_IV_TBL onto ICLASSes
* Use object shapes for Class and Module IVs
Notes:
Merged-By: jhawthorn <[email protected]>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6610
|
|
* YJIT: Lazily enable YJIT after prelude
* Update dependencies
* Use a bit field for opt->yjit
Notes:
Merged-By: maximecb <[email protected]>
|
|
Before object shapes, we were using class serial to invalidate
inline caches. Now that we use shape_id for inline cache keys,
the class serial is unnecessary.
Co-Authored-By: Aaron Patterson <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/6605
|
|
This patch pushes dummy frames when loading code for the
profiling purpose.
The following methods push a dummy frame:
* `Kernel#require`
* `Kernel#load`
* `RubyVM::InstructionSequence.compile_file`
* `RubyVM::InstructionSequence.load_from_binary`
https://bugs.ruby-lang.org/issues/18559
Notes:
Merged: https://github.com/ruby/ruby/pull/6572
|
|
This solves multiple problems.
First, RB_VM_LOCK_ENTER/LEAVE is a barrier. We could at least use the
_NO_BARRIER variant.
Second, this doesn't need to interfere with GC or other GVL users when
multiple Ractors are used. This needs to be used in very few places, so
the benefit of fine-grained locking would outweigh its small maintenance
cost.
Third, it fixes a crash for YJIT. Because YJIT is never disabled until a
process exits unlike MJIT that finishes earlier, we could call jit_cont_free
when EC no longer exists, which crashes RB_VM_LOCK_ENTER.
|
|
Notes:
Merged-By: k0kubun <[email protected]>
|
|
* Make mjit_cont sharable with YJIT
* Update dependencies
* Update YJIT binding
Notes:
Merged-By: k0kubun <[email protected]>
|
|
We should make this function static and remove it from YJIT bindings.
Notes:
Merged: https://github.com/ruby/ruby/pull/6553
|
|
|
|
This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
|
|
If this option is enabled, SyntaxError is not raised and Node is
returned even if passed script is broken.
[Feature #19013]
Notes:
Merged: https://github.com/ruby/ruby/pull/6512
|
|
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6353
Merged-By: nobu <[email protected]>
|
|
|
|
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects. Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness"). Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree. Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
class Bar
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects. See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <[email protected]>
Co-Authored-By: Eileen M. Uchitelle <[email protected]>
Co-Authored-By: John Hawthorn <[email protected]>
|
|
Revert "* expand tabs. [ci skip]"
This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275.
Revert "This commit implements the Object Shapes technique in CRuby."
This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
|
|
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects. Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness"). Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree. Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
class Bar
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects. See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <[email protected]>
Co-Authored-By: Eileen M. Uchitelle <[email protected]>
Co-Authored-By: John Hawthorn <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/6386
|
|
This reverts commit b9f030954a8a1572032f3548b39c5b8ac35792ce.
[Bug #18170]
|
|
UTF-16/UTF-32
* And simplify callers of get_actual_encoding().
* See [Feature #18949].
* See https://github.com/ruby/ruby/pull/6322#issuecomment-1242758474
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6306
|
|
rb_ary_tmp_new suggests that the array is temporary in some way, but
that's not true, it just creates an array that's hidden and not on the
transient heap. This commit renames it to rb_ary_hidden_new.
Notes:
Merged: https://github.com/ruby/ruby/pull/6180
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5956
|
|
The RARRAY_LITERAL_FLAG was added in commit
5871ecf956711fcacad7c03f2aef95115ed25bc4 to improve CoW performance for
array literals by not keeping track of reference counts.
This commit reverts that commit and has an alternate implementation that
is more generic for all frozen arrays. Since frozen arrays cannot be
modified, we don't need to set the RARRAY_SHARED_ROOT_FLAG and we don't
need to do reference counting.
Notes:
Merged: https://github.com/ruby/ruby/pull/6171
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6157
|
|
Move some macros in array.c to internal/array.h so that other files
can also access these macros.
Notes:
Merged: https://github.com/ruby/ruby/pull/6157
|
|
This commit prevents the stack from being marked twice: once via the
Fiber, and once via the Thread. It introduces an assertion to assert
that the ec on the thread is the same as the ec on the Fiber being
marked via the thread.
Notes:
Merged: https://github.com/ruby/ruby/pull/6123
|
|
Prior to this commit it was possible to call `ObjectSpace._id2ref` with
an offset static symbol object_id and get back a new, incorrectly tagged
symbol:
```
> sensible_sym = ObjectSpace._id2ref(:a.object_id)
=> :a
> nonsense_sym = ObjectSpace._id2ref(:a.object_id + 40)
=> :a
> sensible_sym == nonsense_sym
=> false
```
`nonsense_sym` ends up tagged with `RUBY_ID_INSTANCE` instead of
`RB_ID_LOCAL`. That means we can do silly things like:
```
> foo = Object.new
> foo.instance_variable_set(:a, 123)
(irb):2:in `instance_variable_set': `a' is not allowed as an instance variable name (NameError)
> foo.instance_variable_set(ObjectSpace._id2ref(:a.object_id + 40), 123)
=> 123
> foo.instance_variables
=> [:a]
```
This was happening because `get_id_entry` ignores the tag bits when
looking up the symbol. So `rb_id2str(symid)` would return a value and
then we'd continue on with the nonsense `symid`.
This commit prevents the situation by checking that the `symid` actually
matches what we get back from `get_id_entry`. Now we get a `RangeError`
for the nonsense id:
```
> ObjectSpace._id2ref(:a.object_id)
=> :a
> ObjectSpace._id2ref(:a.object_id + 40)
(irb):1:in `_id2ref': 0x000000000013f408 is not symbol id value (RangeError)
```
Co-authored-by: John Hawthorn <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/6147
|
|
Array created as literals during iseq compilation don't need a
reference count since they can never be modified. The previous
implementation would mutate the hidden array's reference count,
causing copy-on-write invalidation.
This commit adds a RARRAY_LITERAL_FLAG for arrays created through
rb_ary_literal_new. Arrays created with this flag do not have reference
count stored and just assume they have infinite number of references.
Co-authored-by: Jean Boussier <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/6151
|
|
This commit enables Arrays to move between size pools during compaction.
This can occur if the array is mutated such that it would fit in a
different size pool when embedded.
The move is carried out in two stages:
1. The RVALUE is moved to a destination heap during object movement
phase of compaction
2. The array data is re-embedded and the original buffer free'd if
required. This happens during the update references step
Notes:
Merged: https://github.com/ruby/ruby/pull/6099
|
|
This fixes invalid and inconsistent results for the Fixnum*Fixnum case
where the result of the multiplication does not fit in 64-bit
on OpenBSD/mips64. For example:
$ for x in 1 23; do ruby31 -e 'p(54306000000000*86400)'; done
14409380628474329524
11410664325873689790
Cases where an argument was Bignum, as well as cases where the result
of the multiplication fits in 64-bit are fine:
$ for x in 1 23; do ruby31 -e 'p(54306000*86400)'; done
4692038400000
4692038400000
$ for x in 1 23; do ruby31 -e 'p(5430600000000000000000*86400)'; done
469203840000000000000000000
469203840000000000000000000
This was originally discovered by running the tests for the openssl gem
on OpenBSD/mips64 and having one test fail for a date far in the future.
I eventually traced this to the generic multiplication issue.
The underlying cause is using the int128_t type. This avoids use of the
int128_t type in this case, falling back to the slower conversion code,
which in the overflow case, turns the Fixnums into Bignums, then
performs the multiplication.
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/6039
|
|
This was a public method, so we should probably keep it.
Notes:
Merged: https://github.com/ruby/ruby/pull/6027
|
|
And re-embed any strings that can now fit inside the slot they've been
moved to
Notes:
Merged: https://github.com/ruby/ruby/pull/5986
|
|
rb_imemo_new is defined again later in the file.
|
|
This reverts commit 9d927204e7b86eb00bfd07a060a6383139edf741.
Notes:
Merged: https://github.com/ruby/ruby/pull/5981
|
|
... only when the message string has a newline.
`p StandardError.new("foo\nbar")` now prints `#<StandardError: "foo\nbar">'
instead of:
#<StandardError:
bar>
[Bug #18170]
Notes:
Merged: https://github.com/ruby/ruby/pull/4857
|
|
Implements [Feature #12655]
Co-authored-by: Nobuyoshi Nakada <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/5733
Merged-By: jeremyevans <[email protected]>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5955
|
|
Having more size pools will allow us to allocate larger objects
through Variable Width Allocation.
I have attached some benchmark results below.
Discourse:
On Discourse, we don't see much change in response times. We do see
a small reduction in RSS.
Branch RSS: 377.8 MB
Master RSS: 396.3 MB
railsbench:
On railsbench, we don't see a big change in RPS or p99 performance.
We see a small increase in RSS.
Branch RPS: 815.38
Master RPS: 811.73
Branch p99: 1.69 ms
Master p99: 1.68 ms
Branch RSS: 90.6 MB
Master RSS: 89.4 MB
liquid:
We don't see a significant change in liquid performance.
Branch parse & render: 29.041 I/s
Master parse & render: 29.211 I/s
Notes:
Merged: https://github.com/ruby/ruby/pull/5885
|
|
In December 2021, we opened an [issue] to solicit feedback regarding the
porting of the YJIT codebase from C99 to Rust. There were some
reservations, but this project was given the go ahead by Ruby core
developers and Matz. Since then, we have successfully completed the port
of YJIT to Rust.
The new Rust version of YJIT has reached parity with the C version, in
that it passes all the CRuby tests, is able to run all of the YJIT
benchmarks, and performs similarly to the C version (because it works
the same way and largely generates the same machine code). We've even
incorporated some design improvements, such as a more fine-grained
constant invalidation mechanism which we expect will make a big
difference in Ruby on Rails applications.
Because we want to be careful, YJIT is guarded behind a configure
option:
```shell
./configure --enable-yjit # Build YJIT in release mode
./configure --enable-yjit=dev # Build YJIT in dev/debug mode
```
By default, YJIT does not get compiled and cargo/rustc is not required.
If YJIT is built in dev mode, then `cargo` is used to fetch development
dependencies, but when building in release, `cargo` is not required,
only `rustc`. At the moment YJIT requires Rust 1.60.0 or newer.
The YJIT command-line options remain mostly unchanged, and more details
about the build process are documented in `doc/yjit/yjit.md`.
The CI tests have been updated and do not take any more resources than
before.
The development history of the Rust port is available at the following
commit for interested parties:
https://github.com/Shopify/ruby/commit/1fd9573d8b4b65219f1c2407f30a0a60e537f8be
Our hope is that Rust YJIT will be compiled and included as a part of
system packages and compiled binaries of the Ruby 3.2 release. We do not
anticipate any major problems as Rust is well supported on every
platform which YJIT supports, but to make sure that this process works
smoothly, we would like to reach out to those who take care of building
systems packages before the 3.2 release is shipped and resolve any
issues that may come up.
[issue]: https://bugs.ruby-lang.org/issues/18481
Co-authored-by: Maxime Chevalier-Boisvert <[email protected]>
Co-authored-by: Noah Gibbs <[email protected]>
Co-authored-by: Kevin Newton <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/5826
|