Bug #16278
closedPotential memory leak when an hash is used as a key for another hash
Description
Hi,
I've been hitting what seems to be a memory leak.
When an hash is used as key for another hash, the former object will be retained even after multiple GC runs.
The following code snippet demonstrates how the hash {:a => 1}
(which is never used outside the scope of create
) is retained even after 10 GC runs (find
will look for an object with a given object_id
on heap).
# frozen_string_literal: true
def create
h = {{:a => 1} => 2}
h.keys.first.object_id
end
def find(object_id)
ObjectSpace.each_object(Hash).any?{|h| h.object_id == object_id} ? 1 : 0
end
leaked = create
10.times do
GC.start(full_mark: true, immediate_sweep: true)
end
exit find(leaked)
This code snippet is expected to exit with 0
while it exits with 1
in my tests. I've tested this on multiple recent ruby versions and OSs, either locally (OSX with homebrew) or in different CIs (e.g. here).
Can you please help understand what's going on here? Thanks!
Updated by mame (Yusuke Endoh) over 5 years ago
- Status changed from Open to Rejected
Ruby uses so-called conservative GC, which is inexact. It is not guaranteed that an object is collected when it becomes unreachable.
You can confirm that it does not cause leak by the following code:
def create
h = {{:a => 1} => 2}
h.keys.first.object_id
end
loop { create }
This code uses constant memory. If it caused memory leak, the memory usage would continue to increase.
Updated by cristiangreco (Cristian Greco) over 5 years ago
- Status changed from Rejected to Open
mame (Yusuke Endoh) wrote:
This code uses constant memory. If it caused memory leak, the memory usage would continue to increase.
Thank for your answer!
If an application exercises this pattern very frequently during lifetime and across multiple processes then it’s definitely going to bloat memory, at the very least. As a real-world example, this is causing high memory usage for the Prometheus client gem, where such pattern is heavily used when passing around metric labels.
As Ruby allows any object to be potentially used as hash key, this behaviour is easily going to cause troubles in long running application and to confuse developers.
I understand it might not be easy or cheap to change the way it works now, but hope you agree with me that application code should not be designed to work around these implementation-specific details. Do you think there’s any viable approach to alleviate this type of issues in future releases?
Updated by ko1 (Koichi Sasada) over 5 years ago
If an application exercises this pattern very frequently during lifetime and across multiple processes then it’s definitely going to bloat memory, at the very least. As a real-world example, this is causing high memory usage for the Prometheus client gem, where such pattern is heavily used when passing around metric labels.
Could you give us good example with that gem? As mame-san said, your example is not a problem because "This code uses constant memory".
Updated by cristiangreco (Cristian Greco) over 5 years ago
ko1 (Koichi Sasada) wrote:
If an application exercises this pattern very frequently during lifetime and across multiple processes then it’s definitely going to bloat memory, at the very least. As a real-world example, this is causing high memory usage for the Prometheus client gem, where such pattern is heavily used when passing around metric labels.
Could you give us good example with that gem? As mame-san said, your example is not a problem because "This code uses constant memory".
Hi Koichi, thanks for taking the time to look into this!
This example creates a metric of type counter
, updates it 3 times using different labels and prints out an export of the metric registry:
# frozen_string_literal: true
require "prometheus/client"
require "prometheus/client/formats/text"
require "prometheus/client/data_stores/synchronized"
# require "prometheus/client/data_stores/single_threaded"
# require "prometheus/client/data_stores/direct_file_store"
Prometheus::Client.config.data_store = Prometheus::Client::DataStores::Synchronized.new
# Prometheus::Client.config.data_store = Prometheus::Client::DataStores::SingleThreaded.new
# Prometheus::Client.config.data_store = Prometheus::Client::DataStores::DirectFileStore.new(dir: ".")
def test
registry = Prometheus::Client::Registry.new
counter = registry.counter(
:counter_metric,
docstring: "a counter",
labels: %i[label1 label2],
)
labels1 = { label1: "foo", label2: "bar" }
labels2 = { label1: 1, label2: 2 }
labels3 = { label1: :a, label2: :b }
counter.increment(by: 1, labels: labels1)
counter.increment(by: 1, labels: labels2)
counter.increment(by: 1, labels: labels3)
puts Prometheus::Client::Formats::Text.marshal(registry)
[labels1.object_id, labels2.object_id, labels3.object_id]
end
def find(id)
ObjectSpace.each_object(Hash).each { |h| id == h.object_id }
end
retained_ids = test
10.times do
GC.start(full_mark: true, immediate_sweep: true)
end
retained_ids.each { |id| puts "found #{id}" if find(id) }
Using each store I found that the 3 labels hashes are retained after garbage collection (all the stores use a similar pattern of storing labels hashes within another hash).
What do you think is going on here?
Updated by jeremyevans0 (Jeremy Evans) over 5 years ago
cristiangreco (Cristian Greco) wrote:
Using each store I found that the 3 labels hashes are retained after garbage collection (all the stores use a similar pattern of storing labels hashes within another hash).
What do you think is going on here?
object_id
is only unique for the life of the object. After the object is garbage collected, the same object_id
could be used for a different object. So measuring using object_id
is not a good idea. Maybe use ObjectSpace.define_finalizer
in your testing? I'm not sure this affects your particular test program, it is just a good general principle.
Can you remove the usage of Prometheus
in your test script, and design a test script using only core Ruby classes, so we can be sure that Prometheus
is not holding any references to the hashes?
Updated by cristiangreco (Cristian Greco) over 5 years ago
jeremyevans0 (Jeremy Evans) wrote:
object_id
is only unique for the life of the object. After the object is garbage collected, the sameobject_id
could be used for a different object. So measuring usingobject_id
is not a good idea. Maybe useObjectSpace.define_finalizer
in your testing? I'm not sure this affects your particular test program, it is just a good general principle.Can you remove the usage of
Prometheus
in your test script, and design a test script using only core Ruby classes, so we can be sure thatPrometheus
is not holding any references to the hashes?
Hi Jeremy!
I reported an example code using the prometheus gem as requested by Koichi.
The usage of ObjectSpace.define_finalizer
seems to always force objects to be retained in my tests. However, it seems very unlikely that object_ids are re-used in such short code.
I've put together another snippet that uses again object_id
for comparison. In my tests [EDIT: only h3
andh4
] h4
is always present on heap after GC:
# frozen_string_literal: true
require 'objspace'
class Klass; end
def create
h1 = { :a => 1 }
h2 = { Klass.new => 2 }
h3 = { [Klass.new] => 3 }
h4 = { { :a => Klass.new } => 4 }
$id_h1 = h1.object_id
$id_h2 = h2.object_id
$id_h3 = h3.object_id
$id_h4 = h4.object_id
nil
end
10.times do
GC.start(full_mark: true, immediate_sweep: true)
end
create
10.times do
GC.start(full_mark: true, immediate_sweep: true)
end
ObjectSpace.each_object(Hash) do |h|
puts "found h1" if h.object_id == $id_h1
puts "found h2" if h.object_id == $id_h2
puts "found h3" if h.object_id == $id_h3
puts "found h4" if h.object_id == $id_h4
end
For the sake of completeness, this bug report originates from a strange behaviour I've observed using the MemoryProfiler gem: https://github.com/SamSaffron/memory_profiler/issues/81
Updated by jeremyevans0 (Jeremy Evans) over 5 years ago
Here's a modified version of your script, fixing the issue where $id_h4 = h3.object_id
, and showing the actual contents of the objects found with those ids:
require 'objspace'
class Klass; end
def create
h1 = { :a => 1 }
h2 = { Klass.new => 2 }
h3 = { [Klass.new] => 3 }
h4 = { { :a => Klass.new } => 4 }
$id_h1 = h1.object_id
$id_h2 = h2.object_id
$id_h3 = h3.object_id
$id_h4 = h4.object_id
nil
end
10.times do
GC.start(full_mark: true, immediate_sweep: true)
end
create
10.times do
GC.start(full_mark: true, immediate_sweep: true)
end
ObjectSpace.each_object(Hash) do |h|
puts "found h1: #{ObjectSpace._id2ref($id_h1)}" if h.object_id == $id_h1
puts "found h2: #{ObjectSpace._id2ref($id_h2)}" if h.object_id == $id_h2
puts "found h3: #{ObjectSpace._id2ref($id_h3)}" if h.object_id == $id_h3
puts "found h4: #{ObjectSpace._id2ref($id_h4)}" if h.object_id == $id_h4
end
Here's the output from running it a few times
# first run
found h4: {{:a=>#<Klass:0x00000615fa257108>}=>4}
found h1: {:a=>1}
# second run
found h1: {:a=>1}
found h4: {{:a=>#<Klass:0x0000032a2e2b2c20>}=>4}
found h3: {}
found h2: {:full_mark=>true, :immediate_sweep=>true}
# third run
found h4: {{:a=>#<Klass:0x0000067d0e3a2578>}=>4}
found h1: {:a=>1}
# fourth run (this when compiled with optimizations)
found h4: {{:a=>#<Klass:0x0000042b0e833c30>}=>4}
It does seem to always retain h4
, but that could be due to the conservative GC. When compiled without optimizations (-O0
), h1
seems to always be retained as well. On my system, h4
seems to always be retained in ruby 2.1 and 2.3-2.7, not always retained in 2.2 and the branch I have to fully separate keyword arguments from positional arguments (https://github.com/jeremyevans/ruby/tree/r3). As those changes seem unrelated, I'm going to assume those versions just change the memory layout such that the conservative GC no longer retains h4
. The fact that h1
is always retained when compiled without optimizations lends weight to this theory.
I agree with @mame (Yusuke Endoh) that unless you are seeing unbounded memory growth, this is not a bug, just a result of the conservative GC.
Updated by nobu (Nobuyoshi Nakada) over 5 years ago
Adding this method and calling it after create
clears the output.
def garbage
h1 = h2 = h3 = h4 = h5 = h6 = h7 = h8 = h9 = h10 = nil
end
So a “shadow” seems staying on the VM stack.
Updated by cristiangreco (Cristian Greco) over 5 years ago
Hi Jeremy, thanks for these details!
I don’t know the details of ruby’s GC, seems to me it might behave unpredictably sometimes. I guess what confuses me now is that although that object is retained we don’t observe unbounded memory growth when calling the method in a loop: is it eventually collected? Is it somehow re-used, or what?
Another thing that intrigues me is that this behaviour is not always consistent: it might vary across different ruby builds (you mentioned that memory layout and compiler optimisations might affect the output) and is also subtly influenced by code changes. For example, if I add this line puts ObjectSpace.count_objects[:T_HASH]
just after the call to create
then h4
is almost never found after GC (or its reference is found associated to another object). Not sure why this happens though!
nobu (Nobuyoshi Nakada) wrote:
Adding this method and calling it after
create
clears the output.def garbage h1 = h2 = h3 = h4 = h5 = h6 = h7 = h8 = h9 = h10 = nil end
So a “shadow” seems staying on the VM stack.
Hey Nobu, what is going on here? What means that a shadow is retained on the stack?
Updated by jeremyevans0 (Jeremy Evans) over 5 years ago
- Status changed from Open to Rejected
cristiangreco (Cristian Greco) wrote:
Hi Jeremy, thanks for these details!
I don’t know the details of ruby’s GC, seems to me it might behave unpredictably sometimes. I guess what confuses me now is that although that object is retained we don’t observe unbounded memory growth when calling the method in a loop: is it eventually collected? Is it somehow re-used, or what?
It is eventually collected when Ruby's GC can no longer find a reference to it. The reason it may be retained even though there is no direct reference to it in Ruby is that pointers to the object may still exist somewhere on the the C/Ruby stack. Ruby's GC is conservative, part of the scan for objects is just looking at at stack memory, and if any memory could be considered a pointer to a Ruby object, that Ruby object is retained during that GC pass.
This conservative GC is needed to make C extensions work. C local variables may hold pointers to Ruby objects, and collecting the Ruby objects while there is a C local variable pointing to them can cause undefined behavior (often a segfault).
Another thing that intrigues me is that this behaviour is not always consistent: it might vary across different ruby builds (you mentioned that memory layout and compiler optimisations might affect the output) and is also subtly influenced by code changes. For example, if I add this line
puts ObjectSpace.count_objects[:T_HASH]
just after the call tocreate
thenh4
is almost never found after GC (or its reference is found associated to another object). Not sure why this happens though!
It happens because the memory layout changed, and there are no longer locations in memory on the stack that point to the objects.
nobu (Nobuyoshi Nakada) wrote:
Adding this method and calling it after
create
clears the output.def garbage h1 = h2 = h3 = h4 = h5 = h6 = h7 = h8 = h9 = h10 = nil end
So a “shadow” seems staying on the VM stack.
Hey Nobu, what is going on here? What means that a shadow is retained on the stack?
It just means there are locations in memory on the stack that point to the Ruby objects. By calling another method that sets a bunch of local variables, the locations in memory that previously pointed to the hashes now are modified to contain nil
(Qnil
in C, 0x4 or 0x8 depending on processor type). Since no locations in memory point to the hashes, the hashes can then be GCed.
Hopefully that provides an adequate explanation for Ruby's GC. For more background, please review some conference presentations by @ko1 (Koichi Sasada) : http://www.atdot.net/~ko1/activities/. I'm going to close this now. Please only reopen if you can show a case with unbounded memory growth.
Updated by alanwu (Alan Wu) over 5 years ago
The GC scans the native stack for Ruby's C code to find values to retain.
Pointers to Ruby's heap objects can end up on the native stack for a variety
of reasons, and this is mostly up to the C compiler. Whether a pointer ends up
on the native stack depends on how Ruby is built, inlining decisions, register
alloation and a bunch of other things -- this is not something Ruby's code has
direct control over.
The generated machine code might interact in a way such that certain sequence
of Ruby code overwrite heap pointers on the native stack, letting the GC
collect those objects.
To illustrate, here is a modified version of your prometheus script:
# frozen_string_literal: true
require "prometheus/client"
require "prometheus/client/formats/text"
require "prometheus/client/data_stores/synchronized"
Prometheus::Client.config.data_store = Prometheus::Client::DataStores::Synchronized.new
def test
registry = Prometheus::Client::Registry.new
counter = registry.counter(
:counter_metric,
docstring: "a counter",
labels: %i[label1 label2],
)
labels1 = { label1: "foo", label2: "bar" }
labels2 = { label1: 1, label2: 2 }
labels3 = { label1: :a, label2: :b }
counter.increment(by: 1, labels: labels1)
counter.increment(by: 1, labels: labels2)
counter.increment(by: 1, labels: labels3)
puts Prometheus::Client::Formats::Text.marshal(registry)
ret = [labels1.object_id, labels2.object_id, labels3.object_id]
return ret
3.itself
ret
end
def find(id)
ObjectSpace.each_object(Hash).find { |h| id == h.object_id }
end
retained_ids = test
10.times do
GC.start(full_mark: true, immediate_sweep: true)
end
retained_ids.each { |id|
obj = find(id)
puts "found #{id} #{obj}" if obj
}
Tracing with GDB, you can find that labels3
is retained because the compiler
chose to spill the receiver onto the native stack whenever a call without block
happens:
insns.def:
763 calling.block_handler = VM_BLOCK_HANDLER_NONE;
0x00005555557284ee <+174>: mov %r15,%rbx
0x00005555557284f1 <+177>: movq $0x0,0x70(%rsp)
764 vm_search_method(ci, cc, calling.recv = TOPN(calling.argc = ci->orig_argc));
0x00005555557284fa <+186>: movslq 0xc(%rcx),%rax
0x00005555557284fe <+190>: mov %eax,0x80(%rsp)
0x0000555555728505 <+197>: shl $0x3,%rax
0x0000555555728509 <+201>: sub %rax,%rdx
0x000055555572850c <+204>: mov -0x8(%rdx),%rax
0x0000555555728510 <+208>: mov %rax,0x78(%rsp) <<<< Stores pointer to Ruby object on the native stack <<<<<<<
./include/ruby/ruby.h:
2058 if (RB_IMMEDIATE_P(obj)) {
=> 0x0000555555728515 <+213>: test $0x7,%al
0x0000555555728517 <+215>: je 0x55555572b24b <vm_exec_core+11787>
2059 if (RB_FIXNUM_P(obj)) return rb_cInteger;
0x000055555572851d <+221>: mov 0x1b56a4(%rip),%r12 # 0x5555558ddbc8 <rb_cInteger>
0x0000555555728524 <+228>: test $0x1,%al
0x0000555555728526 <+230>: jne 0x555555728555 <vm_exec_core+277>
labels3
happens to be the last receiver in the method so it is left on the
stack and retained. We can allow the GC to collect labels3
by performing
another method call on some other object before the end of the test
method.
Commenting out return ret
, the VM spills 3
onto the same location that
labels3
occupied, allowing the GC to collect labels3.
Again, all this is up to the exact machine code output of your compiler, so
what I'm seeing here on my setup might be completely different from yours.
This kind of object retainment can happen anywhere in the Ruby's C code and
can easily be introduced or removed by changes to the VM or the Ruby code in
question.
This kind of extraneous retainment is an inherent property of Ruby's GC.
There is no memory growth since the fixed size native stack puts a cap on
the number of objects that could be retained in this manner.
This could be the reason to the memory bloat problems in your app, but I think
other explanations are more plausible. I would suggest taking another look at
your heap profile. There should be other, more actionable instances of
object retainment in your app.
Updated by cristiangreco (Cristian Greco) over 5 years ago
Thanks both @jeremyevans0 (Jeremy Evans) @alanwu (Alan Wu) for the rich and clear explanations!