Age | Commit message (Collapse) | Author |
|
Set has been an autoloaded standard library since Ruby 3.2.
The standard library Set is less efficient than it could be, as it
uses Hash for storage, which stores unnecessary values for each key.
Implementation details:
* Core Set uses a modified version of `st_table`, named `set_table`.
than `s/st_/set_/`, the main difference is that the stored records
do not have values, making them 1/3 smaller. `st_table_entry` stores
`hash`, `key`, and `record` (value), while `set_table_entry` only
stores `hash` and `key`. This results in large sets using ~33% less
memory compared to stdlib Set. For small sets, core Set uses 12% more
memory (160 byte object slot and 64 malloc bytes, while stdlib set
uses 40 for Set and 160 for Hash). More memory is used because
the set_table is embedded and 72 bytes in the object slot are
currently wasted. Hopefully we can make this more efficient and have
it stored in an 80 byte object slot in the future.
* All methods are implemented as cfuncs, except the pretty_print
methods, which were moved to `lib/pp.rb` (which is where the
pretty_print methods for other core classes are defined). As is
typical for core classes, internal calls call C functions and
not Ruby methods. For example, to check if something is a Set,
`rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the
related object.
* Almost all methods use the same algorithm that the pure-Ruby
implementation used. The exception is when calling `Set#divide` with a
block with 2-arity. The pure-Ruby method used tsort to implement this.
I developed an algorithm that only allocates a single intermediate
hash and does not need tsort.
* The `flatten_merge` protected method is no longer necessary, so it
is not implemented (it could be).
* Similar to Hash/Array, subclasses of Set are no longer reflected in
`inspect` output.
* RDoc from stdlib Set was moved to core Set, with minor updates.
This includes a comprehensive benchmark suite for all public Set
methods. As you would expect, the native version is faster in the
vast majority of cases, and multiple times faster in many cases.
There are a few cases where it is significantly slower:
* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)
These are slower as Set does not currently use the AR table
optimization that Hash does, so a new set_table is initialized for
each call. I'm not sure it's worth the complexity to have an AR
table-like optimization for small sets (for hashes it makes sense,
as small hashes are used everywhere in Ruby).
The rbs and repl_type_completor bundled gems will need updates to
support core Set. The pull request marks them as allowed failures.
This passes all set tests with no changes. The following specs
needed modification:
* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same
object as both the first and second argument (this seems like an issue
with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` return
`true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C
function).
* `Set#flatten_merge` protected method is not implemented.
Previously, `set.rb` added a `SortedSet` autoload, which loads
`set/sorted_set.rb`. This replaces the `Set` autoload in `prelude.rb`
with a `SortedSet` autoload, but I recommend removing it and
`set/sorted_set.rb`.
This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`,
reflecting that switch to a core class. This does not move the spec
files, as I'm not sure how they should be handled.
Internally, this uses the st_* types and functions as much as
possible, and only adds set_* types and functions as needed.
The underlying set_table implementation is stored in st.c, but
there is no public C-API for it, nor is there one planned, in
order to keep the ability to change the internals going forward.
For internal uses of st_table with Qtrue values, those can
probably be replaced with set_table. To do that, include
internal/set_table.h. To handle symbol visibility (rb_ prefix),
internal/set_table.h uses the same macro approach that
include/ruby/st.h uses.
The Set class (rb_cSet) and all methods are defined in set.c.
There isn't currently a C-API for the Set class, though C-API
functions can be added as needed going forward.
Implements [Feature #21216]
Co-authored-by: Jean Boussier <[email protected]>
Co-authored-by: Oliver Nutter <[email protected]>
|
|
0363..036F ; Alphabetic # Mn [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X
Notes:
Merged: https://github.com/ruby/ruby/pull/13117
|
|
The following method call:
```ruby
a(*nil)
```
A method call such as `a(*nil)` previously allocated an array, because
it calls `nil.to_a`, but I have determined this array allocation is
unnecessary. The instructions in this case are:
```
0000 putself ( 1)[Li]
0001 putnil
0002 splatarray false
0004 opt_send_without_block <calldata!mid:a, argc:1, ARGS_SPLAT|FCALL>
0006 leave
```
The method call uses `ARGS_SPLAT` without `ARGS_SPLAT_MUT`, so the
returned array doesn't need to be mutable. I believe all cases where
`splatarray false` are used allow the returned object to be frozen,
since the `false` means to not duplicate the array. The optimization
in this case is to have `splatarray false` push a shared empty frozen
array, instead of calling `nil.to_a` to return a newly allocated array.
There is a slightly backwards incompatibility with this optimization,
in that `nil.to_a` is not called. However, I believe the new behavior
of `*nil` not calling `nil.to_a` is more consistent with how `**nil`
does not call `nil.to_hash`. Also, so much Ruby code would break if
`nil.to_a` returned something different from the empty hash, that it's
difficult to imagine anyone actually doing that in real code, though
we have a few tests/specs for that.
I think it would be bad for consistency if `*nil` called `nil.to_a`
in some cases and not others, so this changes other cases to not
call `nil.to_a`:
For `[*nil]`, this uses `splatarray true`, which now allocates a
new array for a `nil` argument without calling `nil.to_a`.
For `[1, *nil]`, this uses `concattoarray`, which now returns
the first array if the second array is `nil`.
This updates the allocation tests to check that the array allocations
are avoided where possible.
Implements [Feature #21047]
Notes:
Merged: https://github.com/ruby/ruby/pull/12597
|
|
[Feature #21109]
By always freezing when setting the global rb_rs variable, we can ensure
it is not modified and can be accessed from a ractor.
We're also making sure it's an instance of String and does not have any
instance variables.
Of course, if $/ is changed at runtime, it may cause surprising behavior
but doing so is deprecated already anyway.
Co-authored-by: Jean Boussier <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/12975
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12984
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12790
|
|
Also, Binding#local_variable_get and #local_variable_set rejects an
access to numbered parameters.
[Bug #20965] [Bug #21049]
Notes:
Merged: https://github.com/ruby/ruby/pull/12746
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12455
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12297
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/10924
|
|
|
|
|
|
https://bugs.ruby-lang.org/issues/20478
|
|
|
|
|
|
Signed-off-by: careworry <[email protected]>
|
|
The backtick method recieves a frozen string unless it is interpolated.
Otherwise the string held in the ISeq could be mutated by a custom
backtick method.
|
|
This is a global variable and may happen to be set to 4 elsewhere.
http://ci.rvm.jp/logfiles/brlog.trunk.20240403-054356#L1707
```
The if expression with a boolean range ('flip-flop' operator) warns when Integer literals are used instead of predicates FAILED
Expected [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] == []
to be truthy but was false
```
|
|
|
|
|
|
[Feature #20205]
As a path toward enabling frozen string literals by default in the future,
this commit introduce "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than to raise a
`FrozenError`.
Implementation wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.
When code is compiled with frozen string literals neither explictly enabled
or disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.
Chilled strings have the `FL_FREEZE` flag as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.
Notes:
- `String#freeze`: clears the chilled flag.
- `String#-@`: acts as if the string was mutable.
- `String#+@`: acts as if the string was mutable.
- `String#clone`: copies the chilled flag.
Co-authored-by: Jean Boussier <[email protected]>
|
|
|
|
Signed-off-by: cui fliter <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
Since Ruby 3.0, Ruby has passed a keyword splat as a regular
argument in the case of a call to a Ruby method where the
method does not accept keyword arguments, if the method
call does not contain an argument splat:
```ruby
def self.f(obj) obj end
def self.fs(*obj) obj[0] end
h = {a: 1}
f(**h).equal?(h) # Before: true; After: false
fs(**h).equal?(h) # Before: true; After: false
a = []
f(*a, **h).equal?(h) # Before and After: false
fs(*a, **h).equal?(h) # Before and After: false
```
The fact that the behavior differs when passing an empty
argument splat makes it obvious that something is not
working the way it is intended. Ruby 2 always copied
the keyword splat hash, and that is the expected behavior
in Ruby 3.
This bug is because of a missed check in setup_parameters_complex.
If the keyword splat passed is not mutable, then it points to
an existing object and not a new object, and therefore it must
be copied.
Now, there are 3 specs for the broken behavior of directly
using the keyword splatted hash. Fix two specs and add a
new version guard. Do not keep the specs for the broken
behavior for earlier Ruby versions, in case this fix is
backported. For the ruby2_keywords spec, just remove the
related line, since that line is unrelated to what the
spec is testing.
Co-authored-by: Nobuyoshi Nakada <[email protected]>
|
|
|
|
|
|
|
|
|
|
[Feature #19755]
Before (in /tmp/test.rb):
```ruby
Object.class_eval("p __FILE__") # => "(eval)"
```
After:
```ruby
Object.class_eval("p __FILE__") # => "(eval at /tmp/test.rb:1)"
```
This makes it much easier to track down generated code in case
the author forgot to provide a filename argument.
Notes:
Merged: https://github.com/ruby/ruby/pull/8070
|
|
|
|
|
|
|
|
|
|
Since these files are written in a wide character encoding, stop at
first NUL byte and are actually empty. ASCII-incompatible encodings
have never been supported as source encoding.
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7136
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Fixes case where Object includes a module that defines a constant,
then using class/module keyword to define the same constant on
Object itself.
Implements [Feature #18832]
Notes:
Merged: https://github.com/ruby/ruby/pull/6048
|
|
|