I was not aware of this because I use clang these days.
Notes:
Merged: https://github.com/ruby/ruby/pull/4815
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4791
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4795
|
|
|
|
And constify `node` argument of `iseq_compile_each0`.
|
|
|
|
This updates the trace instructions to directly dispatch to
opt_send_without_block. So this should cause no slowdown in
non-trace mode.
To enable the tracing of the optimized methods, RUBY_EVENT_C_CALL
and RUBY_EVENT_C_RETURN are added as events to the specialized
instructions.
Fixes [Bug #14870]
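A hedged illustration of the user-visible effect: with these events in place, a `TracePoint` on `:c_call` should also fire for calls the VM handles through specialized instructions such as `opt_plus`:
```ruby
# Sketch: C-call events now fire even for the optimized + dispatch ([Bug #14870]).
events = []
tp = TracePoint.new(:c_call) { |t| events << t.method_id }
tp.enable { 1 + 2 }        # Integer#+ normally runs via the opt_plus instruction
p events.include?(:+)      # expected to be true with this change
```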
Co-authored-by: Takashi Kokubun <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/4739
Merged-By: jeremyevans <[email protected]>
|
|
[0] => [0, *, a]
#=> [0] length mismatch (given 1, expected 2+) (NoMatchingPatternError)
Ignore test failures of typeprof caused by this change for now.
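For context, a minimal sketch that triggers the message quoted above, using the rightward pattern-match syntax from the example:
```ruby
# Sketch: the array pattern needs at least 2 elements (leading 0, splat, trailing a).
begin
  [0] => [0, *, a]
rescue NoMatchingPatternError => e
  puts e.message   # expected: [0] length mismatch (given 1, expected 2+)
end
```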
|
|
On -DUSE_EMBED_CI=0, there are more GC allocations and the old code
didn't keep old_operands[0] reachable while allocating. On a Debian-based
system, I get a crash when requiring erb under GC stress mode. On
macOS, tool/transcode-tblgen.rb runs incorrectly if I put GC.stress=true
as the first line.
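A rough repro sketch distilled from the description above; it is only expected to matter on a -DUSE_EMBED_CI=0 build:
```ruby
# Sketch: force a GC at every allocation opportunity, then compile enough code
# (via require) that a CI allocation happens while old_operands[0] is still needed.
GC.stress = true
require "erb"
```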
Notes:
Merged: https://github.com/ruby/ruby/pull/4662
Merged-By: XrXr
|
|
Pin matching for local variables and constants is already supported,
and it is fairly simple to add support for these variable types.
Note that pin matching for method calls is still not supported
without wrapping in parentheses (pin expressions). I think that's
for the best as method calls are far more complex (arguments/blocks).
Implements [Feature #17724]
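A brief sketch of one of the newly supported pin targets (an instance variable; class and global variables work the same way):
```ruby
# Sketch: pinning an instance variable in a pattern ([Feature #17724]).
@expected = 42

case 42
in ^@expected
  puts "matched the pinned instance variable"
end
```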
Notes:
Merged: https://github.com/ruby/ruby/pull/4502
|
|
Since b2fc592c304 nothing was holding a reference to the dup'd CDHASH
during IBF loading. If a GC happened to run during IBF load then the
copied hash wouldn't have anything to keep it alive. We don't really
want to keep the originally loaded CDHASH hash, so this patch just
overwrites the original hash with the copied / modified hash.
[Bug #17984] [ruby-core:104259]
Notes:
Merged: https://github.com/ruby/ruby/pull/4630
|
|
If the type is ADJUST we don't want to treat it like an INSN so we have
to check the type before reading from `insn_info.events`.
[Bug #18001] [ruby-core:104371]
Co-authored-by: Aaron Patterson <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/4601
|
|
Redo of 34a2acdac788602c14bf05fb616215187badd504 and
931138b00696419945dc03e10f033b1f53cd50f3 which were reverted.
GitHub PR #4340.
This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up the ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.
The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows that, with the cache, this branch
is 6.5x faster when accessing class variables.
```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19]
| |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar | 5.681M| 36.980M|
| | -| 6.51x|
```
Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the clearer the speed increase; i.e., if Base had
only one ancestor, we'd see no improvement. This benchmark is run on a
vanilla Rails application.
Benchmark code:
```ruby
require "benchmark/ips"
require_relative "config/environment"
Benchmark.ips do |x|
x.report "logger" do
ActiveRecord::Base.logger
end
end
```
Ruby 3.0 master / Rails 6.1:
```
Warming up --------------------------------------
logger 155.251k i/100ms
Calculating -------------------------------------
```
Ruby 3.0 with cvar cache / Rails 6.1:
```
Warming up --------------------------------------
logger 1.546M i/100ms
Calculating -------------------------------------
logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s
```
Lastly, we ran a benchmark to demonstrate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.
Ruby 3.0 master:
```
Warming up --------------------------------------
1 module 1.231M i/100ms
30 modules 432.020k i/100ms
100 modules 145.399k i/100ms
Calculating -------------------------------------
1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s
30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s
100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s
Comparison:
1 module: 12209958.3 i/s
30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower
100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower
```
Ruby 3.0 with cvar cache:
```
Warming up --------------------------------------
1 module 1.641M i/100ms
30 modules 1.655M i/100ms
100 modules 1.620M i/100ms
Calculating -------------------------------------
1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s
30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s
100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s
Comparison:
1 module: 16279458.0 i/s
100 modules: 16087484.6 i/s - same-ish: difference falls within error
30 modules: 15891406.2 i/s - same-ish: difference falls within error
```
Co-authored-by: Aaron Patterson <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/4544
|
|
... which is formally called EXPERIMENTAL_ISEQ_NODE_ID.
See also ff69ef27b06eed1ba750e7d9cab8322f351ed245.
https://bugs.ruby-lang.org/issues/17930
Notes:
Merged: https://github.com/ruby/ruby/pull/4558
|
|
RubyVM::AST.of(Thread::Backtrace::Location) returns a node that
corresponds to the location. Typically, the node is a method call, but
not always.
This change also includes iseq dump/load support of node_ids for each
instruction.
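A sketch of the intended usage; whether a node can be recovered may depend on how the script was loaded and on node-ID availability in the build:
```ruby
# Sketch: map a backtrace location back to the AST node at that call site.
def node_for_caller
  RubyVM::AbstractSyntaxTree.of(caller_locations(1, 1).first)
end

node = node_for_caller   # typically the method-call node for this line
p node&.type
```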
Notes:
Merged: https://github.com/ruby/ruby/pull/4558
|
|
by merging `rb_ast_body_t#line_count` and `#script_lines`.
Fortunately `line_count == RARRAY_LEN(script_lines)` was always
satisfied. When script_lines is saved, it has an array of lines, and
when not saved, it has a Fixnum that represents the old line_count.
Notes:
Merged: https://github.com/ruby/ruby/pull/4581
|
|
This option makes the parser keep the original source as an array of
the original code lines. This feature exploits the mechanism of
`SCRIPT_LINES__` but records only the specified code that is passed to
RubyVM::AST.of or .parse, instead of recording all parsed program texts.
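A sketch of how the option is used; the keyword name `keep_script_lines` below matches the API as released in Ruby 3.1 and is assumed to be the option this commit introduces:
```ruby
# Sketch: keep only the source lines of this particular parse.
node = RubyVM::AbstractSyntaxTree.parse("x = 1\ny = 2\n", keep_script_lines: true)
p node.script_lines   # the two lines passed to .parse, and nothing else
```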
Notes:
Merged: https://github.com/ruby/ruby/pull/4581
|
|
* --braces-after-func-def-line
* --dont-cuddle-else
* --procnames-start-lines
* --space-after-for
* --space-after-if
* --space-after-while
|
|
Following non-special_const literals:
* T_REGEXP
Notes:
Merged: https://github.com/ruby/ruby/pull/4548
|
|
Following non-special_const literals:
* T_BIGNUM
* T_FLOAT (non-flonum)
* T_RATIONAL
* T_COMPLEX
Notes:
Merged: https://github.com/ruby/ruby/pull/4548
|
|
It's been way too many ifdefs.
|
|
The checkmatch instruction with VM_CHECKMATCH_TYPE_CASE calls
=== without a call cache. Emit a send instruction to make the call
instead. It includes a call cache.
The call cache improves throughput of using when statements to check the
class of a given object. This is useful for say, JSON serialization.
Use of a regular send instead of checkmatch also avoids taking the VM
lock every time, which is good for multi-ractor workloads.
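As a hedged sketch of how to observe this, the compiled instruction listing for class-based `when` dispatch can be inspected; after this change it should show cached `send` calls to `===` where `checkmatch` used to appear:
```ruby
# Sketch: class-based `when` branches now dispatch via a cached `send :===`.
def classify(obj)
  case obj
  when Integer then :int
  when String  then :str
  else              :other
  end
end

puts RubyVM::InstructionSequence.of(method(:classify)).disasm
# Expect `send` entries calling :=== in place of the former `checkmatch` insns.
```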
Calculating -------------------------------------
master post
vm_case_classes 11.013M 16.172M i/s - 6.000M times in 0.544795s 0.371009s
vm_case_lit 2.296 2.263 i/s - 1.000 times in 0.435606s 0.441826s
vm_case 74.098M 64.338M i/s - 6.000M times in 0.080974s 0.093257s
Comparison:
vm_case_classes
post: 16172114.4 i/s
master: 11013316.9 i/s - 1.47x slower
vm_case_lit
master: 2.3 i/s
post: 2.3 i/s - 1.01x slower
vm_case
master: 74097858.6 i/s
post: 64338333.9 i/s - 1.15x slower
The vm_case benchmark is a bit slower post patch, possibly due to the
larger instruction sequence. That benchmark dispatches using
opt_case_dispatch, so it was not running checkmatch before and does not
make the === call after this patch.
Notes:
Merged: https://github.com/ruby/ruby/pull/4468
|
|
It looks for "checkmatch", but it could be applied to anything that has
"newrange".
Making the optimization target more ranges might only be fair play when
all ranges are frozen. So I'm putting a reference to the ticket that
froze all ranges.
[Feature #15504]
Notes:
Merged: https://github.com/ruby/ruby/pull/4468
|
|
Before this change, CDHASH operands were built as plain hashes when
loaded from binary. Without setting up the hash with the correct
st_table type, the hash can sometimes be an ar_table. When the hash is
an ar_table, lookups can call the `eql?` method on keys of the hash,
which makes the `opt_case_dispatch` instruction not "leaf" as it
implicitly declares.
The following script trips the stack canary for checking the leaf
attribute for `opt_case_dispatch` on VM_CHECK_MODE > 0 (enabled by
default with RUBY_DEBUG).
rb_vm_iseq = RubyVM::InstructionSequence
iseq = rb_vm_iseq.compile(<<-EOF)
  case Class.new(String).new("foo")
  when "foo"
    42
  end
EOF
puts rb_vm_iseq.load_from_binary(iseq.to_binary).eval
This commit changes the binary loading logic to build CDHASH with the
right st_table type. The dumping logic and the dump format stay the
same.
Notes:
Merged: https://github.com/ruby/ruby/pull/4511
Merged-By: XrXr
|
|
609de71f043e8ba34f22b9993e444e2e5bb05709 fixes the issue by using the
`throw` insn if `ensure` is used. However, that patch introduced an
additional `throw` even when it is not needed. This patch solves
that issue.
This issue was pointed out by @mame.
Notes:
Merged: https://github.com/ruby/ruby/pull/4507
|
|
Fixes [Bug #17868]
|
|
Fixes [Bug #17857]
Notes:
Merged: https://github.com/ruby/ruby/pull/4496
|
|
cf: https://github.com/ruby/ruby/pull/4469#discussion_r628386707
|
|
For instance, a rational's numerator can be a bignum, so comparison
using C's == can be insufficient.
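A quick illustration of the point (figures assume a 64-bit build; the literal below is invented for the example):
```ruby
# Sketch: a rational literal whose numerator does not fit in a fixnum.
r = 10_000_000_000_000_000_000r
p r.numerator                   #=> 10000000000000000000, a bignum on 64-bit builds
p r.numerator.bit_length > 62   #=> true: beyond the fixnum range
```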
Notes:
Merged: https://github.com/ruby/ruby/pull/4469
|
|
There are complex literals such as `123i`, which can also appear as a case condition.
Notes:
Merged: https://github.com/ruby/ruby/pull/4469
|
|
Nobu kindly pointed out that rational literals can have fractions.
Notes:
Merged: https://github.com/ruby/ruby/pull/4469
|
|
Rational literals are those integers suffixed with `r`. They tend to
be a part of more complex expressions like `123/456r`, but in theory
they can live alone. When such "bare" rational literals are passed to a
case/when branch, we have to take care of them. Fixes [Bug #17854]
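A minimal sketch of such a bare literal used directly as a case condition:
```ruby
# Sketch: a standalone r-suffixed literal in case/when ([Bug #17854]).
case 1r
when 1r then puts "bare rational matched"
else         puts "unexpected"
end
```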
Notes:
Merged: https://github.com/ruby/ruby/pull/4469
|
|
This reverts commit 08de37f9fa3469365e6b5c964689ae2bae0eb9f3.
This reverts commit e8ae922b62adb00a80d3d4c49f7d7b0e6026eaba.
|
|
Instead of on read. Once it's in the inline cache we never have to make
one again. We want to eventually put the value into the cache, and the
best opportunity to do that is when you write the value.
Notes:
Merged: https://github.com/ruby/ruby/pull/4340
|
|
This change implements a cache for class variables. Previously there was
no cache for cvars. Cvar access is slow due to needing to travel all the
way up the ancestor tree before returning the cvar value. The deeper the
ancestor tree the slower cvar access will be.
The benefits of the cache are more visible with a higher number of
included modules due to the way Ruby looks up class variables. The
benchmark here includes 26 modules and shows that, with the cache, this branch
is 6.5x faster when accessing class variables.
```
compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19]
built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19]
| |compare-ruby|built-ruby|
|:--------|-----------:|---------:|
|vm_cvar | 5.681M| 36.980M|
| | -| 6.51x|
```
Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails
application. ActiveRecord::Base.logger has 71 ancestors. The more
ancestors a tree has, the clearer the speed increase; i.e., if Base had
only one ancestor, we'd see no improvement. This benchmark is run on a
vanilla Rails application.
Benchmark code:
```ruby
require "benchmark/ips"
require_relative "config/environment"
Benchmark.ips do |x|
x.report "logger" do
ActiveRecord::Base.logger
end
end
```
Ruby 3.0 master / Rails 6.1:
```
Warming up --------------------------------------
logger 155.251k i/100ms
Calculating -------------------------------------
```
Ruby 3.0 with cvar cache / Rails 6.1:
```
Warming up --------------------------------------
logger 1.546M i/100ms
Calculating -------------------------------------
logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s
```
Lastly, we ran a benchmark to demonstrate the difference between master
and our cache when the number of modules increases. This benchmark
measures 1 ancestor, 30 ancestors, and 100 ancestors.
Ruby 3.0 master:
```
Warming up --------------------------------------
1 module 1.231M i/100ms
30 modules 432.020k i/100ms
100 modules 145.399k i/100ms
Calculating -------------------------------------
1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s
30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s
100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s
Comparison:
1 module: 12209958.3 i/s
30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower
100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower
```
Ruby 3.0 with cvar cache:
```
Warming up --------------------------------------
1 module 1.641M i/100ms
30 modules 1.655M i/100ms
100 modules 1.620M i/100ms
Calculating -------------------------------------
1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s
30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s
100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s
Comparison:
1 module: 16279458.0 i/s
100 modules: 16087484.6 i/s - same-ish: difference falls within error
30 modules: 15891406.2 i/s - same-ish: difference falls within error
```
Co-authored-by: Aaron Patterson <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/4340
|
|
... then, new_insn_core extracts nd_line(node).
Also, if a macro "EXPERIMENTAL_ISEQ_NODE_ID" is defined, this changeset
keeps nd_node_id(node) for each instruction. This is intended for
TypeProf to identify what AST::Node corresponds to each instruction.
This patch was originally authored by @yui-knk to show the column at which a
NoMethodError occurred.
https://github.com/ruby/ruby/compare/master...yui-knk:feature/node_id
Co-Authored-By: Yuichiro Kaneko <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/4470
|
|
add_ensure_iseq() adds the ensure block to the end of jumps such as
next/redo/return. However, if a rescue clause is in the body, that
rescue catches the exception raised in the ensure clause.
iter do
  next
rescue
  R
ensure
  raise
end
In this case, R should not be executed, but it was executed without this patch.
Fixes [Bug #13930]
Fixes [Bug #16618]
Part of the tests were written by @jeremyevans: https://github.com/ruby/ruby/pull/4291
Notes:
Merged: https://github.com/ruby/ruby/pull/4399
|
|
In regular assignment, Ruby evaluates the left hand side before
the right hand side. For example:
```ruby
foo[0] = bar
```
Calls `foo`, then `bar`, then `[]=` on the result of `foo`.
Previously, multiple assignment didn't work this way. If you did:
```ruby
abc.def, foo[0] = bar, baz
```
Ruby would previously call `bar`, then `baz`, then `abc`, then
`def=` on the result of `abc`, then `foo`, then `[]=` on the
result of `foo`.
This change makes multiple assignment similar to single assignment,
changing the evaluation order of the above multiple assignment code
to calling `abc`, then `foo`, then `bar`, then `baz`, then `def=` on
the result of `abc`, then `[]=` on the result of `foo`.
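A hypothetical sketch (all names below are invented for illustration) that records the corrected order using singleton setters:
```ruby
# Sketch: record the evaluation order of a multiple assignment on Ruby 3.1+.
order = []

target = Object.new
target.define_singleton_method(:val=) { |v| order << :val= }

ary = Object.new
ary.define_singleton_method(:[]=) { |i, v| order << :aset }

get_target = -> { order << :target; target }
get_ary    = -> { order << :ary; ary }
rhs1       = -> { order << :rhs1; 1 }
rhs2       = -> { order << :rhs2; 2 }

get_target.call.val, get_ary.call[0] = rhs1.call, rhs2.call
p order  #=> [:target, :ary, :rhs1, :rhs2, :val=, :aset]
# Before this change the order was [:rhs1, :rhs2, :target, :val=, :ary, :aset].
```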
Implementing this is challenging with the stack-based virtual machine.
We need to keep track of all of the left hand side attribute setter
receivers and setter arguments, and then keep track of the stack level
while handling the assignment processing, so we can issue the
appropriate topn instructions to get the receiver. Here's an example
of how the multiple assignment is executed, showing the stack and
instructions:
```
self # putself
abc # send
abc, self # putself
abc, foo # send
abc, foo, 0 # putobject 0
abc, foo, 0, [bar, baz] # evaluate RHS
abc, foo, 0, [bar, baz], baz, bar # expandarray
abc, foo, 0, [bar, baz], baz, bar, abc # topn 5
abc, foo, 0, [bar, baz], baz, abc, bar # swap
abc, foo, 0, [bar, baz], baz, def= # send
abc, foo, 0, [bar, baz], baz # pop
abc, foo, 0, [bar, baz], baz, foo # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0 # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0, baz # topn 2
abc, foo, 0, [bar, baz], baz, []= # send
abc, foo, 0, [bar, baz], baz # pop
abc, foo, 0, [bar, baz] # pop
[bar, baz], foo, 0, [bar, baz] # setn 3
[bar, baz], foo, 0 # pop
[bar, baz], foo # pop
[bar, baz] # pop
```
As multiple assignment must deal with splats, post args, and any level
of nesting, it gets quite a bit more complex than this in non-trivial
cases. To handle this, struct masgn_state is added to keep
track of the overall state of the mass assignment, which stores a linked
list of struct masgn_attrasgn, one for each assigned attribute.
This adds a new optimization that replaces a topn 1/pop instruction
combination with a single swap instruction for multiple assignment
to non-aref attributes.
This new approach isn't compatible with one of the optimizations
previously used: the case where the multiple assignment return value
was not needed, there was no lhs splat, and one of the left hand side
targets used an attribute setter. This removes that optimization.
Removing the optimization allowed for removing the POP_ELEMENT and
adjust_stack functions.
This adds a benchmark to measure how much slower multiple
assignment is with the correct evaluation order.
This benchmark shows:
* 4-9% decrease for attribute sets
* 14-23% decrease for array member sets
* Basically same speed for local variable sets
Importantly, it shows no significant difference between the popped
(where return value of the multiple assignment is not needed) and
!popped (where return value of the multiple assignment is needed)
cases for attribute and array member sets. This indicates the
previous optimization, which was dropped in the evaluation
order fix and only affected the popped case, is not important to
performance.
Fixes [Bug #4443]
Notes:
Merged: https://github.com/ruby/ruby/pull/4390
Merged-By: jeremyevans <[email protected]>
|