Age | Commit message (Collapse) | Author |
|
Set has been an autoloaded standard library since Ruby 3.2.
The standard library Set is less efficient than it could be, as it
uses Hash for storage, which stores unnecessary values for each key.
Implementation details:
* Core Set uses a modified version of `st_table`, named `set_table`.
Other than `s/st_/set_/`, the main difference is that the stored records
do not have values, making them 1/3 smaller. `st_table_entry` stores
`hash`, `key`, and `record` (value), while `set_table_entry` only
stores `hash` and `key`. This results in large sets using ~33% less
memory compared to stdlib Set. For small sets, core Set uses 12% more
memory (160 byte object slot and 64 malloc bytes, while stdlib set
uses 40 for Set and 160 for Hash). More memory is used because
the set_table is embedded and 72 bytes in the object slot are
currently wasted. Hopefully we can make this more efficient and have
it stored in an 80 byte object slot in the future.
* All methods are implemented as cfuncs, except the pretty_print
methods, which were moved to `lib/pp.rb` (which is where the
pretty_print methods for other core classes are defined). As is
typical for core classes, internal calls call C functions and
not Ruby methods. For example, to check if something is a Set,
`rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the
related object.
* Almost all methods use the same algorithm that the pure-Ruby
implementation used. The exception is when calling `Set#divide` with a
block with 2-arity. The pure-Ruby method used tsort to implement this.
I developed an algorithm that only allocates a single intermediate
hash and does not need tsort.
* The `flatten_merge` protected method is no longer necessary, so it
is not implemented (it could be).
* Similar to Hash/Array, subclasses of Set are no longer reflected in
`inspect` output.
* RDoc from stdlib Set was moved to core Set, with minor updates.
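The memory figures quoted in the first bullet can be checked with a little arithmetic. This is only a sketch: it assumes a 64-bit build in which each entry field occupies one 8-byte word, and the variable names are illustrative rather than taken from the patch.

```ruby
# Assumption: 64-bit build, so hash, key and record are one 8-byte word each.
WORD_BYTES = 8

st_entry_bytes  = 3 * WORD_BYTES # hash + key + record
set_entry_bytes = 2 * WORD_BYTES # hash + key

# Fraction of per-entry memory saved by dropping the record field.
entry_saving = 1 - set_entry_bytes.fdiv(st_entry_bytes)

# Small-set figures as quoted in the commit message.
core_small_bytes   = 160 + 64 # object slot + malloc'd bytes
stdlib_small_bytes = 40 + 160 # Set object + Hash object

# Relative extra memory used by a small core Set.
small_overhead = core_small_bytes.fdiv(stdlib_small_bytes) - 1

puts format("per-entry saving: %.0f%%", entry_saving * 100)
puts format("small-set overhead: %.0f%%", small_overhead * 100)
```

The per-entry saving comes out at one third and the small-set overhead at 12%, which matches the numbers given in the message.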
This includes a comprehensive benchmark suite for all public Set
methods. As you would expect, the native version is faster in the
vast majority of cases, and multiple times faster in many cases.
There are a few cases where it is significantly slower:
* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)
These are slower as Set does not currently use the AR table
optimization that Hash does, so a new set_table is initialized for
each call. I'm not sure it's worth the complexity to have an AR
table-like optimization for small sets (for hashes it makes sense,
as small hashes are used everywhere in Ruby).
The rbs and repl_type_completor bundled gems will need updates to
support core Set. The pull request marks them as allowed failures.
This passes all set tests with no changes. The following specs
needed modification:
* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same
object as both the first and second argument (this seems like an issue
with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` returns
`true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C
function).
* `Set#flatten_merge` protected method is not implemented.
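For readers unfamiliar with the 2-arity form of `Set#divide` mentioned in the list above, here is a minimal usage sketch. It shows only the documented, observable behaviour and says nothing about how either implementation computes the result; the example values are arbitrary.

```ruby
require "set" # harmless where Set is already a core class

numbers = Set[1, 3, 4, 6, 9, 10, 11]

# With a 2-arity block, elements end up in the same subset when they are
# connected through pairs for which the block returns true.
groups = numbers.divide { |a, b| (a - b).abs == 1 }

expected = Set[Set[1], Set[3, 4], Set[6], Set[9, 10, 11]]
puts groups == expected
```

Code that relies on the block being yielded the same object as both arguments would be affected by the spec change described above; code like this sketch, which only compares distinct neighbours, should not be.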
Previously, `set.rb` added a `SortedSet` autoload, which loads
`set/sorted_set.rb`. This replaces the `Set` autoload in `prelude.rb`
with a `SortedSet` autoload, but I recommend removing it and
`set/sorted_set.rb`.
This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`,
reflecting the switch to a core class. This does not move the spec
files, as I'm not sure how they should be handled.
Internally, this uses the st_* types and functions as much as
possible, and only adds set_* types and functions as needed.
The underlying set_table implementation is stored in st.c, but
there is no public C-API for it, nor is there one planned, in
order to keep the ability to change the internals going forward.
For internal uses of st_table with Qtrue values, those can
probably be replaced with set_table. To do that, include
internal/set_table.h. To handle symbol visibility (rb_ prefix),
internal/set_table.h uses the same macro approach that
include/ruby/st.h uses.
The Set class (rb_cSet) and all methods are defined in set.c.
There isn't currently a C-API for the Set class, though C-API
functions can be added as needed going forward.
Implements [Feature #21216]
Co-authored-by: Jean Boussier <[email protected]>
Co-authored-by: Oliver Nutter <[email protected]>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13141
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13141
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12759
|
|
Originally, if a class was defined with the class keyword, the cref had a
const_added callback, and the superclass an inherited callback, const_added was
called first, and inherited second.
This was discussed in
https://bugs.ruby-lang.org/issues/21143
and an attempt at changing this order was made.
While both constant assignment and inheritance have happened before these
callbacks are invoked, it was deemed nice to have the same order as in
C = Class.new
This was mostly for alignment: In that last use case things happen at different
times and therefore the order of execution is kind of obvious, whereas when the
class keyword is involved, the order is opaque to the user and it is up to the
interpreter.
However, soon in
https://bugs.ruby-lang.org/issues/21193
Matz decided to play safe and keep the existing order.
This reverts commits:
de097fbe5f3df105bd2a26e72db06b0f5139bc1a
de48e47ddf78aba02fd9623bc7ce685540a10743
Notes:
Merged: https://github.com/ruby/ruby/pull/13085
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13037
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12984
|
|
[Misc #21143]
[Bug #21193]
The previous change caused a backward compatibility issue with code
that called `Object.const_source_location` from the `inherited` callback.
To fix this, the order is now:
- Define the constant
- Invoke `inherited`
- Invoke `const_added`
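One way to observe the callback order on a given Ruby is to log both hooks. The following is a sketch only; `EVENTS`, `Base`, `Namespace` and `Child` are hypothetical names, and the order printed depends on the Ruby version in use.

```ruby
EVENTS = []

class Base
  def self.inherited(subclass)
    EVENTS << [:inherited, subclass.name]
    super
  end
end

module Namespace
  # const_added is called on the module in which the constant is defined
  # (available on Ruby 3.2 and later; never invoked on older versions).
  def self.const_added(name)
    EVENTS << [:const_added, name]
    super
  end

  class Child < Base
  end
end

EVENTS.each { |event, detail| puts "#{event}: #{detail.inspect}" }
```

In every ordering discussed here the constant is defined before either callback runs, so `subclass.name` is already set inside `inherited`.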
Notes:
Merged: https://github.com/ruby/ruby/pull/12956
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12622
|
|
Co-authored-by: Nobuyoshi Nakada <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/12622
|
|
(Bug #21083)
https://bugs.ruby-lang.org/issues/21083
Notes:
Merged: https://github.com/ruby/ruby/pull/12622
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12947
|
|
|
|
[Misc #21143]
Conceptually this makes sense and is more consistent with using
the `Name = Class.new(Superclass)` alternative method.
However the new class is still named before `inherited` is called.
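The difference from the `Name = Class.new(Superclass)` form can be seen in a small sketch. `Parent`, `Named` and `SEEN_NAMES` are hypothetical names used only for illustration.

```ruby
SEEN_NAMES = []

class Parent
  def self.inherited(subclass)
    SEEN_NAMES << subclass.name
    super
  end
end

# inherited runs inside Class.new, before the result is assigned to a
# constant, so the subclass is still anonymous when the callback fires.
Named = Class.new(Parent)

p SEEN_NAMES
p Named.name
```

Here the callback records `nil`, whereas with the `class` keyword the class already has its name when `inherited` is called.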
Notes:
Merged: https://github.com/ruby/ruby/pull/12927
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12879
|
|
Fix: https://github.com/ruby/spec/issues/1249
JRuby and TruffleRuby can't implement this behavior.
While quite a lot of code out there relies on it, if it's
not implemented it will simply result in slightly less efficient
code, so not the end of the world.
Notes:
Merged: https://github.com/ruby/ruby/pull/12850
|
|
The message from dlerror is not our concern.
|
|
|
|
* Move out from quarantine a Marshal.dump spec for Float
Co-authored-by: Benoit Daloze <[email protected]>
Notes:
Merged-By: eregon <[email protected]>
|
|
When a positive integer limit is given, `rand` method of a RNG object
is expected to return a value between 0 and the limit (exclusive).
Fix shuffle_spec.rb in the same way as the similar code in sample_spec.rb,
and add tests for greater values.
TODO:
- Return a value that is equal to or greater than the limit given to
the RNG object.
- Extract common code about RNG objects to a shared file.
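A minimal sketch of the contract described above, using a hypothetical `RecordingRng` class that is not part of the specs: it records every limit it receives and always answers 0, which lies within `0...limit` for any positive limit.

```ruby
# Hypothetical RNG-like object, for illustration only.
class RecordingRng
  attr_reader :limits

  def initialize
    @limits = []
  end

  # Called with a positive integer limit; must return a value in 0...limit.
  def rand(limit)
    @limits << limit
    0
  end
end

rng = RecordingRng.new
shuffled = [1, 2, 3, 4].shuffle(random: rng)

p shuffled
p rng.limits
```

The exact sequence of limits is an implementation detail, so the sketch only relies on each one being a positive integer.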
Notes:
Merged: https://github.com/ruby/ruby/pull/12690
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12679
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12679
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12624
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12539
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12517
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12517
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12472
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12452
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12452
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12438
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12438
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12297
|
|
[Feature #20912]
Notes:
Merged: https://github.com/ruby/ruby/pull/12177
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12114
|
|
* Use FL_USER0 for ELTS_SHARED
This makes space in RString for two bits for chilled strings.
* Mark strings returned by `Symbol#to_s` as chilled
[Feature #20350]
`STR_CHILLED` now spans two user flags. If one bit is set it
marks a chilled string literal, if it's the other it marks a
`Symbol#to_s` chilled string.
Since it's not possible, and doesn't make much sense to include
debug info when `--debug-frozen-string-literal` is set, we can't
include allocation source, but we can safely include the symbol
name in the warning message, making it much easier to find the source
of the issue.
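From the Ruby side, the effect can be sketched as follows. The symbol name is arbitrary, and whether a warning is printed depends on the Ruby version and on deprecation warnings being enabled.

```ruby
name = :status.to_s

# Mutating the result still works. On Rubies that chill Symbol#to_s results,
# this may also emit a deprecation warning that names the symbol.
name << "_code"

puts name
```

A chilled string behaves like an ordinary mutable string apart from the warning, so existing code keeps working while its authors are told where to add a `dup`.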
---------
Co-authored-by: Étienne Barrié <[email protected]>
Co-authored-by: Jean Boussier <[email protected]>
|
|
... instead, just calculate the value unless it is too big.
Also, this change raises an ArgumentError if it is expected to exceed
16 GB in a 64-bit environment.
(It is possible to calculate it straightforwardly, but it would likely be
out-of-memory, so I didn't think it would make sense.)
[Feature #20811]
Notes:
Merged: https://github.com/ruby/ruby/pull/12033
|
|
* See discussion on https://github.com/ruby/spec/pull/1210
|
|
The test was too flaky
|
|
|
|
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/11892
|
|
The absence of either the integer or fractional part should be
allowed.
Notes:
Merged: https://github.com/ruby/ruby/pull/11807
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/10924
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7968
|
|
This is a static function only called in two places (rb_to_id and
rb_to_symbol), and in both places, both symbols and strings are
allowed. This makes the error message consistent with rb_check_id
and rb_check_symbol.
Fixes [Bug #20607]
Notes:
Merged: https://github.com/ruby/ruby/pull/11097
|
|
[Feature #20594]
A handy method to construct a string out of multiple chunks.
Contrary to `String#concat`, it doesn't do any encoding negotiation,
and simply appends the content as bytes regardless of whether this
results in a broken string or not.
It's the caller's responsibility to check for `String#valid_encoding?`
in cases where it's needed.
When passed integers, only the lower byte is considered, like in
`String#setbyte`.
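The contrast with `String#concat` can be sketched as below. The method added for [Feature #20594] is `String#append_as_bytes`; the call is guarded with `respond_to?` so the sketch also runs on Rubies that predate it, and the sample strings are arbitrary.

```ruby
utf8   = "caf\u00E9".dup # UTF-8 text containing a multibyte character
binary = "\xFF".b        # a single high byte, tagged as binary

# String#concat negotiates encodings and refuses this combination.
concat_error =
  begin
    utf8.dup.concat(binary)
    nil
  rescue Encoding::CompatibilityError => e
    e.class
  end

# Guarded so the sketch also runs where the method is not available.
if utf8.respond_to?(:append_as_bytes)
  # 0x141 contributes only its low byte, 0x41.
  utf8.append_as_bytes(binary, 0x141)

  puts utf8.encoding        # the receiver keeps its encoding
  puts utf8.valid_encoding? # left for the caller to check
  p utf8.bytes.last(2)
end
```

Because no validation happens on append, a buffer assembled this way should be checked once at the end when validity matters.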
Notes:
Merged: https://github.com/ruby/ruby/pull/11552
|
|
[Feature #20702]
Works the same way as `Hash#fetch_values`, but for arrays.
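A short sketch of the parallel between the two methods. The array method from [Feature #20702] is `Array#fetch_values`; it is guarded with `respond_to?` so the example also runs on older Rubies, and the sample data is arbitrary.

```ruby
settings = { host: "localhost", port: 5432 }
hash_values = settings.fetch_values(:host, :port)

letters = %w[a b c]
array_values = nil
with_fallback = nil
array_error = nil

# Guarded so the sketch also runs where Array#fetch_values is not available.
if letters.respond_to?(:fetch_values)
  array_values = letters.fetch_values(0, 2)

  # As with Array#fetch, a block supplies values for out-of-range indexes.
  with_fallback = letters.fetch_values(0, 10) { |index| "missing #{index}" }

  # Without a block, an out-of-range index raises IndexError.
  array_error =
    begin
      letters.fetch_values(10)
      nil
    rescue IndexError => e
      e.class
    end
end

p hash_values, array_values, with_fallback, array_error
```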
Notes:
Merged: https://github.com/ruby/ruby/pull/11557
|
|
[Feature #20707]
Converting Time into RFC3339 / ISO8601 representation is a significant
hotspot for applications that serialize data in JSON, XML or other formats.
By moving it into core we can optimize it much further than what `strftime` will
allow.
```
compare-ruby: ruby 3.4.0dev (2024-08-29T13:11:40Z master 6b08a50a62) +YJIT [arm64-darwin23]
built-ruby: ruby 3.4.0dev (2024-08-30T13:17:32Z native-xmlschema 34041ff71f) +YJIT [arm64-darwin23]
warming up......
| |compare-ruby|built-ruby|
|:-----------------------|-----------:|---------:|
|time.xmlschema | 1.087M| 5.190M|
| | -| 4.78x|
|utc_time.xmlschema | 1.464M| 6.848M|
| | -| 4.68x|
|time.xmlschema(6) | 859.960k| 4.646M|
| | -| 5.40x|
|utc_time.xmlschema(6) | 1.080M| 5.917M|
| | -| 5.48x|
|time.xmlschema(9) | 893.909k| 4.668M|
| | -| 5.22x|
|utc_time.xmlschema(9) | 1.056M| 5.707M|
| | -| 5.40x|
```
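For reference, the calls being benchmarked look like this from Ruby. The timestamp is arbitrary, and `require "time"` is kept so the sketch also runs on Rubies where `Time#xmlschema` still lives in the standard library.

```ruby
require "time" # provides Time#xmlschema where it is not yet a core method

time = Time.utc(2024, 8, 30, 13, 17, 32)

plain        = time.xmlschema
with_usec    = time.xmlschema(6)
via_strftime = time.strftime("%Y-%m-%dT%H:%M:%SZ")

puts plain
puts with_usec
puts plain == via_strftime
```

For a UTC time, the plain form matches the `strftime` pattern shown, which is the kind of formatting work the native implementation avoids routing through `strftime`.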
Notes:
Merged: https://github.com/ruby/ruby/pull/11510
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/11454
|