From: cristian@...
Date: 2019-10-25T22:53:14+00:00
Subject: [ruby-core:95555] [Ruby master Bug#16278] Potential memory leak when an hash is used as a key for another hash

Issue #16278 has been updated by cristiangreco (Cristian Greco).


jeremyevans0 (Jeremy Evans) wrote:

> `object_id` is only unique for the life of the object.  After the object is garbage collected, the same `object_id` could be used for a different object.  So measuring using `object_id` is not a good idea.  Maybe use `ObjectSpace.define_finalizer` in your testing?  I'm not sure this affects your particular test program, it is just a good general principle.
> 
> Can you remove the usage of `Prometheus` in your test script, and design a test script using only core Ruby classes, so we can be sure that `Prometheus` is not holding any references to the hashes?

Hi Jeremy!

I posted example code that uses the prometheus gem because Koichi asked for it.

Using `ObjectSpace.define_finalizer` seems to always force the objects to be retained in my tests. In any case, it seems very unlikely that an `object_id` gets reused in such a short script.
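For reference, here is a minimal sketch of how finalizer-based tracking could be set up. One known way a finalizer keeps its object alive is when the finalizer proc closes over the object (through `self` or a local variable); I cannot say whether that is what happened in my tests. The `FinalizerLog` class and `create_tracked` method are hypothetical names made up for this example. Whether anything gets finalized after the GC runs is not guaranteed, so the script only prints what it observed:

```ruby
# Hypothetical helper for illustration only; not part of the original report.
class FinalizerLog
  attr_reader :finalized

  def initialize
    @finalized = []
  end

  # Build the proc in a method that never sees the tracked object,
  # so the proc's binding cannot hold a reference to it.
  def callback_for(label)
    log = @finalized
    proc { |_object_id| log << label }
  end

  def track(object, label)
    ObjectSpace.define_finalizer(object, callback_for(label))
    nil
  end
end

def create_tracked(log)
  inner = { :a => 1 }
  outer = { inner => 2 }
  log.track(inner, :inner)
  log.track(outer, :outer)
  nil
end

log = FinalizerLog.new
create_tracked(log)

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

# Labels whose finalizer has run so far; may be empty if the objects are retained.
puts "finalized so far: #{log.finalized.inspect}"
```

If `:inner` never shows up in the output, the inner hash was still reachable (or still considered reachable) at that point.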

I've put together another snippet that again uses `object_id` for comparison. In my tests, `h3` and `h4` are always present on the heap after GC:

```ruby
# frozen_string_literal: true

require 'objspace'

class Klass; end

def create
  h1 = { :a => 1 }
  h2 = { Klass.new => 2 }
  h3 = { [Klass.new] => 3 }
  h4 = { { :a => Klass.new } => 4 }

  $id_h1 = h1.object_id
  $id_h2 = h2.object_id
  $id_h3 = h3.object_id
  $id_h4 = h4.object_id

  nil
end

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

create

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

ObjectSpace.each_object(Hash) do |h|
  puts "found h1" if h.object_id == $id_h1
  puts "found h2" if h.object_id == $id_h2
  puts "found h3" if h.object_id == $id_h3
  puts "found h4" if h.object_id == $id_h4
end
```
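To rule out `object_id` reuse as an explanation, the same check could be done with `WeakRef` from the standard library, which tracks the object itself rather than a number. This is only a sketch: `create_refs` is a hypothetical name, and the result for `h3` and `h4` depends on what the GC actually collects, so the script prints the state without assuming it. The `held` hash is kept on purpose as a baseline that must stay alive:

```ruby
require 'weakref'

class Klass; end

# Hypothetical helper: builds the hashes and returns only weak references to them.
def create_refs
  h3 = { [Klass.new] => 3 }
  h4 = { { :a => Klass.new } => 4 }
  { :h3 => WeakRef.new(h3), :h4 => WeakRef.new(h4) }
end

# Baseline: this hash is still strongly referenced, so its WeakRef stays alive.
held = { :b => 2 }
held_ref = WeakRef.new(held)

refs = create_refs

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

puts "held alive: #{held_ref.weakref_alive?}"
refs.each do |name, ref|
  puts "#{name} alive: #{ref.weakref_alive?}"
end
```

A `true` for `h3` or `h4` here would mean the very same object is still on the heap, not a new object that happens to have the same id.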

For the sake of completeness, this bug report originates from a strange behaviour I've observed using the MemoryProfiler gem: https://github.com/SamSaffron/memory_profiler/issues/81


----------------------------------------
Bug #16278: Potential memory leak when an hash is used as a key for another hash
https://bugs.ruby-lang.org/issues/16278#change-82333

* Author: cristiangreco (Cristian Greco)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin18]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
Hi,

I've been hitting what seems to be a memory leak.

When a hash is used as a key for another hash, the key hash is retained even after multiple GC runs.

The following code snippet demonstrates how the hash `{:a => 1}` (which is never used outside the scope of `create`) is retained even after 10 GC runs (`find` looks for an object with a given `object_id` on the heap).


```ruby
# frozen_string_literal: true

def create
  h = {{:a => 1} => 2}
  h.keys.first.object_id
end

def find(object_id)
  ObjectSpace.each_object(Hash).any?{|h| h.object_id == object_id} ? 1 : 0
end


leaked = create

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

exit find(leaked)
```

This code snippet is expected to exit with `0`, but it exits with `1` in my tests. I've tested this on multiple recent Ruby versions and OSes, both locally (macOS with Homebrew) and in different CIs (e.g. [here](https://github.com/cristiangreco/ruby-hash-leak/commit/285e586b7193104989f59b92579fe8f25770141e/checks?check_suite_id=278711566)).

Can you please help me understand what's going on here? Thanks!


