When compiling with cppflags=-DRGENGC_CHECK_MODE, the following crashes:
```
$ RUBY_FREE_ON_EXIT=1 ./miniruby -e 0
-e: [BUG] obj_free: RVALUE_MARKED(0x0000000103570020 [3LM ] T_CLASS (anon)) != FALSE
```
This commit clears the mark bits when rb_free_on_exit is enabled.
|
|
for future extensions.
|
|
Our current implementation of rb_postponed_job_register suffers from
some safety issues that can lead to interpreter crashes (see bug #1991).
Essentially, the issue is that jobs can be called with the wrong
arguments.
We made two attempts to fix this whilst keeping the promised semantics,
but:
* The first one involved masking/unmasking when flushing jobs, which
was believed to be too expensive
* The second one involved a lock-free, multi-producer, single-consumer
ringbuffer, which was too complex
The critical insight behind this third solution is that essentially the
only users of these APIs are a) internal, or b) profiling gems.
For a), none of the usages actually require variable data; they will
work just fine with the preregistration interface.
For b), generally profiling gems only call a single callback with a
single piece of data (which is usually just zero) for the life of the
program. The ringbuffer is complex because it needs to support
multi-word inserts of job & data (which can't be atomic); but nobody
actually needs that functionality.
So, this commit:
* Introduces a pre-registration API for jobs, with a GVL-requiring
  rb_postponed_job_preregister, which returns a handle that can be
  used with an async-signal-safe rb_postponed_job_trigger.
* Deprecates rb_postponed_job_register (and re-implements it on top of
  the preregister function for compatibility)
* Moves all the internal usages of rb_postponed_job_register to the
  pre-registration API (see the sketch below)
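A minimal sketch of the resulting usage from a native extension's point
of view; the extension, job, and handler names are illustrative, but the
two API functions are the ones introduced by this commit:
```c
#include <ruby/ruby.h>
#include <ruby/debug.h>

static rb_postponed_job_handle_t sample_job;

/* Runs later on a Ruby thread that holds the GVL; the data pointer is
 * the one fixed at preregistration time. */
static void
sample_job_func(void *data)
{
    /* safe to call Ruby APIs here */
}

/* Called once at extension load, while holding the GVL. */
void
Init_sample(void)
{
    sample_job = rb_postponed_job_preregister(0, sample_job_func, NULL);
}

/* Async-signal-safe: may be called from a signal handler. */
static void
sample_signal_handler(int sig)
{
    rb_postponed_job_trigger(sample_job);
}
```
Because the job and its data are fixed up front, triggering only has to
publish a single handle, which is what makes the trigger side
async-signal-safe.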
|
|
When the RUBY_FREE_ON_SHUTDOWN environment variable is set, manually free memory at shutdown.
Co-authored-by: Nobuyoshi Nakada <[email protected]>
Co-authored-by: Peter Zhu <[email protected]>
|
|
need_major_gc is set when a major GC is required. However, if
gc_stress_no_major is also set, then it will not actually run a major
GC.
For example, the following script will sometimes crash:
```
GC.stress = 1
50000.times.map { [] }
```
With the following message:
```
[BUG] cannot create a new page after major GC
```
|
|
The intention of GC.verify_compaction_references is, I believe, to force
every single movable object to be moved, so that it's possible to debug
native extensions which do not correctly update their references to
objects they mark as movable.
To do this, it doubles the number of allocated pages for each size pool,
and sorts the heap pages so that the free ones are swept first; thus,
every object in an old page should be moved into a free slot in one of
the new pages.
This worked fine until movement of objects _between_ size pools during
compaction was implemented. That causes some problems for
verify_compaction_references:
* We were doubling the number of pages in each size pool, but actually
if some objects need to move into a _different_ pool, there's no
guarantee that there will be enough room in that one.
* It's possible for the sweep & compact cursors to meet in one size pool
before all the objects that want to move into that size pool from
another are processed by the compaction.
You can see these problems by changing some of the movement tests in
test_gc_compact.rb to try and move e.g. 50,000 objects instead of
500; the test is not able to actually move all of the objects in a
single compaction run.
To fix this, we do two things in verify_compaction_references:
* Firstly, we add enough pages to every size pool to make them the same
size. This ensures that their compact cursors will all have space to
move during compaction (even if that means empty pages are
pointlessly compacted)
* Then, we examine every object and determine where it _wants_ to be
compacted into. We use this information to add additional pages to
each size pool to handle all objects which should live there.
With these two changes, we can move arbitrary amounts of objects into
the correct size pool in a single call to verify_compaction_references.
My _motivation_ for performing this work was to try and fix some test
stability issues in test_gc_compact.rb. I now no longer think that we
actually see this particular bug in rubyci.org, but I also think
verify_compaction_references should do what it says on the tin, so it's
worth keeping.
[Bug #20022]
|
|
This works like objspace_each_obj, except instead of being called with
the start & end address of each page, it's called with the page
structure itself.
[Bug #20022]
|
|
|
|
Objects with the same shape must always have the same "embeddedness"
(either embedded or heap allocated) because YJIT assumes so. However,
using remove_instance_variable, it's possible that some objects are
embedded and some are heap allocated, because it does not re-embed heap
allocated objects.
This commit changes remove_instance_variable to re-embed an Object's
instance variables when they become small enough.
|
|
|
|
|
|
Embedded shared strings cannot be moved because strings point into the
slot of the shared string. There may be code using RSTRING_PTR with only
the string on the stack, which would pin the string but not the shared
string, leaving the shared string free to move and invalidating the
pointer.
|
|
This also removes aliasing rule violations; the anonymous structs were
distinct types from `rb_method_refined_t`.
|
|
Incremental marking prevents the GC from fully executing, so it may fail
to catch certain bugs.
|
|
do_full_mark can change in gc_start, so we want to set auto-compaction
only after do_full_mark has been properly set.
|
|
When a generic instance variable has a shape, it is marked movable. If
it transitions to too complex, it needs to update references; otherwise
it may have incorrect references.
|
|
This is required for the same reason that super CC needs it.
See 36023d5cb751d62fca0c27901c07527b20170f4d.
Reproducer:
```ruby
def cached_foo_callsite(obj) = obj.foo

class Foo
  def foo = :v1

  module R
    refine Foo do
      def foo = :unused
    end
  end
end

obj = Foo.new
cached_foo_callsite(obj) # set up cc with cme for foo=:v1

class Foo
  def foo = :v2
end

GC.start # cme for foo=:v1 collected, if not reachable by cached_foo_callsite
cached_foo_callsite(obj)
```
[Bug #19994]
|
|
On large Ruby applications, shutdown may be slow if a major GC has just
started, because rb_objspace_call_finalizer completes the GC.
This commit adds gc_abort, which discards the mark stack if the GC is in
incremental marking and stops sweeping if it is in lazy sweeping.
|
|
Previously, because gc_update_object_references() did not update the
VALUEs in the too_complex ivar st_table for T_CLASS and T_MODULE
objects, GC compaction could finish with corrupted objects. Example
sequence:
- start with `klass` not yet too_complex
- GC incremental step marks `klass` and its ivars
- Ruby code makes `klass` too_complex
- GC compaction runs and moves `klass`'s ivars, but because `klass` is
too_complex, its ivars are not updated by gc_update_object_references(),
leaving T_NONE or T_MOVED objects in the ivar table.
Co-authored-by: Peter Zhu <[email protected]>
|
|
Marking both keys and values versus marking just values is an important
distinction, but previously gc_update_tbl_refs() and gc_update_table_refs()
had names that were too similar.
The st_table storing ivars for too_complex T_OBJECTs has IDs as keys,
but we were previously marking the IDs unnecessarily, maybe due to the
confusing naming.
|
|
Previously, it tripped the assert about too_complex in
ROBJECT_IV_CAPACITY(). This fixes double faults for some crashes and
helps with use during development.
|
|
Too complex classes use a hash table to store their ivars and should
always pin them. We shouldn't touch those classes in compaction.
|
|
|
|
We ran into that case on our CI; including some sizes would make it
much easier to debug.
|
|
This reverts commit 76dc327eeffefe02577999fe5f8215f762a581b6.
|
|
This reverts commit 9a62fd3cbae2ebb60e2f9cad782af1ad18db4319.
|
|
Since the callback defined in the objspace module might give up the GVL,
we need to make sure the right cr->mfd value is set back after the GVL
is re-obtained.
|
|
rb_objspace_reachable_objects_from has it too, so I figure it's most
likely required for _from_root as well.
|
|
|
|
The previous implementation was using the pointer given
by `DATA_PTR` in all cases. But in the case of an embedded
TypedData, that pointer is garbage; we need to use RTYPEDDATA_GET_DATA
to get the proper data pointer.
Co-Authored-By: Étienne Barrié <[email protected]>
|
|
This commit adds a new flag RUBY_TYPED_EMBEDDABLE that allows the data
of a TypedData object to be embedded after the object itself. This will
improve cache locality and allow us to save the 8 byte data pointer.
Co-Authored-By: Jean Boussier <[email protected]>
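A minimal sketch of how a native extension might opt into the new flag;
the `point` struct and allocator are illustrative:
```c
#include <ruby/ruby.h>

struct point { double x, y; };

/* RUBY_TYPED_EMBEDDABLE lets the struct live inside the object slot
 * itself when it fits, instead of behind a separately allocated
 * pointer. */
static const rb_data_type_t point_type = {
    .wrap_struct_name = "point",
    .function = {
        .dmark = NULL,
        .dfree = RUBY_TYPED_DEFAULT_FREE,
        .dsize = NULL,
    },
    .flags = RUBY_TYPED_FREE_IMMEDIATELY | RUBY_TYPED_EMBEDDABLE,
};

static VALUE
point_alloc(VALUE klass)
{
    struct point *p;
    VALUE obj = TypedData_Make_Struct(klass, struct point, &point_type, p);
    p->x = p->y = 0.0;
    /* With an embeddable type, always use RTYPEDDATA_GET_DATA(obj) to
     * reach the struct; DATA_PTR is garbage when the data is embedded
     * (see the commit above). */
    return obj;
}
```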
|
|
This commit makes every initial size pool shape a root shape and assigns
it a capacity of 0.
|
|
rb_darray_foreach gives a pointer to the entry, so we need to
dereference it to read the value.
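A stand-in sketch of the pattern; the FOREACH macro below is
hypothetical and only models the pointer-to-entry behavior described
above (the real rb_darray_foreach lives in Ruby's internal darray.h):
```c
#include <stdio.h>

/* Hypothetical stand-in for rb_darray_foreach: yields an index and a
 * POINTER to each entry, not the entry value itself. */
#define FOREACH(ary, len, i, elem_ptr) \
    for ((i) = 0; (i) < (len) && ((elem_ptr) = &(ary)[(i)], 1); (i)++)

int
main(void)
{
    int values[] = {1, 2, 3};
    int i;
    int *e;
    FOREACH(values, 3, i, e) {
        printf("%d\n", *e); /* must dereference to read the value */
    }
    return 0;
}
```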
|
|
|
|
Previously the growth was 3(embed), 6, 12, 24, ...
With this change it's now 3(embed), 8, 16, 32, 64, ... by default.
However, since a power of two isn't the best size for all allocators,
if `malloc_usable_size` is available, we use it to discover the best
offset.
On Linux/glibc 2.35 for instance, the growth will be 3(embed), 7, 15, 31
to avoid wasting 8B per object.
Test program:
```c
size_t test(size_t slots) {
size_t allocated = slots * VALUE_SIZE;
void *test_ptr = malloc(allocated);
size_t wasted = malloc_usable_size(test_ptr) - allocated;
free(test_ptr);
fprintf(stderr, "slots = %lu, wasted_bytes = %lu\n", slots, wasted);
return wasted;
}
int main(int argc, char *argv[]) {
size_t best_padding = 0;
size_t padding = 0;
for (padding = 0; padding <= 2; padding++) {
size_t wasted = test(8 - padding);
if (wasted == 0) {
best_padding = padding;
break;
}
}
size_t index = 0;
fprintf(stderr, "=============== naive ================\n");
size_t list_size = 4;
for (index = 0; index < 10; index++) {
test(list_size);
list_size *= 2;
}
fprintf(stderr, "=============== auto-padded (-%lu) ================\n", best_padding);
list_size = 4;
for (index = 0; index < 10; index ++) {
test(list_size - best_padding);
list_size *= 2;
}
fprintf(stderr, "\n\n");
return 0;
}
```
```
===== glibc ======
slots = 8, wasted_bytes = 8
slots = 7, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 8
slots = 8, wasted_bytes = 8
slots = 16, wasted_bytes = 8
slots = 32, wasted_bytes = 8
slots = 64, wasted_bytes = 8
slots = 128, wasted_bytes = 8
slots = 256, wasted_bytes = 8
slots = 512, wasted_bytes = 8
slots = 1024, wasted_bytes = 8
slots = 2048, wasted_bytes = 8
=============== auto-padded (-1) ================
slots = 3, wasted_bytes = 0
slots = 7, wasted_bytes = 0
slots = 15, wasted_bytes = 0
slots = 31, wasted_bytes = 0
slots = 63, wasted_bytes = 0
slots = 127, wasted_bytes = 0
slots = 255, wasted_bytes = 0
slots = 511, wasted_bytes = 0
slots = 1023, wasted_bytes = 0
slots = 2047, wasted_bytes = 0
```
```
========== jemalloc =======
slots = 8, wasted_bytes = 0
=============== naive ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
=============== auto-padded (-0) ================
slots = 4, wasted_bytes = 0
slots = 8, wasted_bytes = 0
slots = 16, wasted_bytes = 0
slots = 32, wasted_bytes = 0
slots = 64, wasted_bytes = 0
slots = 128, wasted_bytes = 0
slots = 256, wasted_bytes = 0
slots = 512, wasted_bytes = 0
slots = 1024, wasted_bytes = 0
slots = 2048, wasted_bytes = 0
```
|
|
Follow-up of 591336a0f278bf963d01b6e9810cfc86a5b50620
|
|
|
|
Track the other callinfos that reference the same kwargs and free the
kwargs when all references are cleared.
[Bug #19906]
Co-authored-by: Peter Zhu <[email protected]>
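A minimal sketch of the reference-counting idea described above; the
struct and helper names are hypothetical, not the actual callinfo code:
```c
#include <stdlib.h>

/* Hypothetical shared kwargs record, counted once per callinfo that
 * references it. */
struct kwargs {
    int refcount;
    /* ... keyword table ... */
};

static struct kwargs *
kwargs_ref(struct kwargs *kw)
{
    kw->refcount++;
    return kw;
}

/* The last callinfo to drop its reference frees the shared kwargs. */
static void
kwargs_unref(struct kwargs *kw)
{
    if (--kw->refcount == 0) free(kw);
}
```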
|
|
|
|
|
|
Fix memory leak in vm_method.
This introduces a unified reference_count to clarify who is referencing
a method.
It also allows us to treat the refinement method as the def owner, since
it counts itself as a reference.
Co-authored-by: Peter Zhu <[email protected]>
|
|
|
|
By compacting into slots on pages that already contain pinned objects
first, we improve the efficiency of compaction: it is less likely that
pages containing only pinned objects will exist after compaction. This
increases the number of free pages left after compaction and enables us
to free them.
This used to be the default compaction method before it was removed
(inadvertently?) during the introduction of auto_compaction.
This commit will sort the pages by the pinned slot count at the start of
a major GC that has been triggered by explicitly calling GC.compact (and
thus setting objspace->flags.during_compaction).
It works using the same method by which we sort the heap by empty slot
count during GC.verify_compaction_references.
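A minimal sketch of the sorting step; the page struct is an illustrative
stand-in for gc.c's heap page, and the exact sort direction used by the
real comparator is an assumption here:
```c
#include <stdlib.h>

/* Illustrative stand-in for the real heap page struct. */
struct page {
    size_t pinned_slots;
    size_t free_slots;
};

/* Order pages so those with the most pinned slots come first; their
 * free slots are then consumed as move destinations, leaving pages of
 * purely movable objects empty and releasable. */
static int
compare_pinned(const void *left, const void *right)
{
    const struct page *l = *(const struct page *const *)left;
    const struct page *r = *(const struct page *const *)right;
    if (l->pinned_slots < r->pinned_slots) return 1;
    if (l->pinned_slots > r->pinned_slots) return -1;
    return 0;
}

/* usage: qsort(pages, npages, sizeof(struct page *), compare_pinned); */
```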
|
|
Previously it was only being sorted during the verify compaction
references stage, so it would only happen during testing.
This commit allows us to sort the heap prior to each explicit GC.compact
run.
|
|
Pass the sorting function in as a function pointer so we don't always
sort by how empty a page is.
|
|
malloc_trim is defined in emscripten/emmalloc.h on emscripten.
|
|
```
gc.c:9746:5: error: implicit declaration of function 'malloc_trim' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
malloc_trim(0);
^
```
http://rubyci.s3.amazonaws.com/crossruby/crossruby-master-wasm32_emscripten/log/20230916T104311Z.fail.html.gz
|
|
```
compiling gc.c
gc.c:9746:5: error: implicit declaration of function 'malloc_trim' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
malloc_trim(0);
^
1 error generated.
```
|
|
Previously, heap_allocated_pages was decremented in heap_page_free,
causing only half the heap pages to be freed at shutdown.
|
|
Similar to releasing free GC pages, releasing free malloc pages reduces
the number of page faults after fork.
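A sketch of the idea, assuming glibc's malloc_trim is available (the
HAVE_MALLOC_TRIM guard is an assumed autoconf-style check):
```c
#include <malloc.h> /* glibc-specific */

/* Release free malloc pages back to the kernel so a forked child does
 * not fault them back in via copy-on-write. malloc_trim(0) trims as
 * much free memory as possible from the heap. */
static void
release_free_malloc_pages(void)
{
#ifdef HAVE_MALLOC_TRIM
    malloc_trim(0);
#endif
}
```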
|