From: Motohiro KOSAKI
Date: 2011-07-09T13:34:05+09:00
Subject: [ruby-core:37911] [Ruby 1.9 - Bug #4962] come back gem_prelude!

Issue #4962 has been updated by Motohiro KOSAKI.

And here is the latest benchmark comparison on Linux Fedora 15 x86-64.

You have to ignore vm_thread_xx and vm3_clearmethodcache. They are
influenced by other changes. Only the following three show a
significant difference.

name                    192r31932-nogems  192r31932  trunk-nogems   trunk
app_mandelbrot                     3.240      3.274         3.398   4.125
io_file_read                       6.756      7.214         7.685   8.123
vm3_gc                             2.084      2.167         2.259   3.296

Does anyone have more ideas for improvement?

Cheers.

% /usr/bin/ruby ../benchmark/driver.rb -v --executables="192r31932-nogems::~/ruby/bin/ruby-192-r31932 --disable-gems; 192r31932::~/ruby/bin/ruby-192-r31932; trunk-nogems::~/ruby/bin/ruby-trunk --disable-gems; trunk::~/ruby/bin/ruby-trunk -I../lib -I. -I.ext/common ../tool/runruby.rb --extout=.ext --" --pattern='bm_' --directory=../benchmark -r 5

Elapesed time: 39930.060974 (sec)
-----------------------------------------------------------
benchmark results:
minimum results in each 5 measurements.
name                    192r31932-nogems  192r31932  trunk-nogems   trunk
app_answer                         0.157      0.154         0.156   0.195
app_erb                            2.475      2.484         2.517   2.652
app_factorial                      2.558      2.591         2.482   2.737
app_fib                            1.833      1.845         1.777   1.906
app_mandelbrot                     3.240      3.274         3.398   4.125
app_pentomino                     32.655     32.803        33.764  34.966
app_raise                          1.146      1.148         1.081   1.189
app_strconcat                      2.734      2.808         2.821   2.832
app_tak                            2.791      2.787         2.714   2.770
app_tarai                          2.153      2.145         2.170   2.225
app_uri                            1.633      1.628         1.654   1.767
io_file_create                     2.514      2.482         2.035   2.214
io_file_read                       6.756      7.214         7.685   8.123
io_file_write                      2.206      2.152         2.161   2.254
io_select                          2.253      2.283         2.550   2.619
io_select2                         6.379      6.478         7.192   6.084
io_select3                         0.871      0.847         0.887   0.937
loop_for                           3.003      3.036         2.863   2.905
loop_generator                     1.031      1.073         0.889   0.945
loop_times                         2.606      2.593         2.636   2.753
loop_whileloop                     1.608      1.606         1.534   1.265
loop_whileloop2                    0.347      0.334         0.335   0.321
so_ackermann                       2.034      2.008         2.023   2.062
so_array                           2.369      2.375         2.884   2.935
so_binary_trees                    0.889      0.913         0.932   0.995
so_concatenate                     7.358      7.322         7.271   7.310
so_count_words                     0.534      0.535         0.515   0.585
so_exception                       2.150      2.172         2.022   2.245
so_fannkuch                        2.821      2.809         2.793   3.147
so_fasta                           4.160      4.225         4.446   4.784
so_k_nucleotide                    3.341      3.400         3.232   3.313
so_lists                           1.651      1.675         1.661   1.717
so_mandelbrot                     11.325     11.690        11.764  11.530
so_matrix                          1.530      1.526         1.542   1.592
so_meteor_contest                  9.493      8.700         8.517   9.820
so_nbody                           8.184      8.339         8.191   7.797
so_nested_loop                     2.527      2.447         2.443   2.491
so_nsieve                          7.351      7.252         8.503   8.516
so_nsieve_bits                     4.920      4.945         4.896   4.973
so_object                          1.779      1.781         1.752   1.750
so_partial_sums                   10.433     10.625        10.520  10.269
so_pidigits                        1.710      1.750         1.811   1.986
so_random                          1.834      1.899         1.958   1.863
so_reverse_complement              3.198      3.351         3.542   3.615
so_sieve                           2.141      2.161         2.439   2.489
so_spectralnorm                    7.887      8.088         7.805   7.412
vm1_block*                         3.863      3.462         3.850   3.953
vm1_const*                         0.821      0.860         1.107   0.668
vm1_ensure*                        0.951      1.012         1.197   0.973
vm1_ivar*                          1.238      1.232         0.879   0.878
vm1_ivar_set*                      1.824      1.286         1.126   1.201
vm1_length*                        1.389      1.329         1.403   1.260
vm1_neq*                           0.921      0.902         1.114   0.831
vm1_not*                           0.297      0.315         0.407   0.514
vm1_rescue*                        0.140      0.124         0.361   0.134
vm1_simplereturn*                  2.389      2.409         2.992   2.523
vm1_swap*                          2.000      2.011         2.081   1.033
vm2_array*                         1.421      1.463         1.443   1.910
vm2_case*                          0.314      0.331         0.322   0.297
vm2_defined_method*                6.192      6.221         6.884   6.565
vm2_eval*                         29.320     29.618        30.998  38.323
vm2_method*                        3.458      3.479         3.610   3.790
vm2_mutex*                         1.748      1.776         2.010   1.905
vm2_poly_method*                   5.353      5.345         5.992   5.731
vm2_poly_method_ov*                0.546      0.554         0.530   0.472
vm2_proc*                          0.913      0.930         0.984   0.994
vm2_regexp*                        2.351      2.379         2.412   2.398
vm2_send*                          0.554      0.595         0.500   0.552
vm2_super*                         1.119      1.145         1.328   1.244
vm2_unif1*                         0.566      0.587         0.599   0.580
vm2_zsuper*                        1.242      1.290         1.476   1.385
vm3_clearmethodcache               4.762      4.992         0.822   1.008
vm3_gc                             2.084      2.167         2.259   3.296
vm_thread_alive_check1             0.346      0.356         0.436   0.516
vm_thread_create_join              5.791      5.737         5.932   5.915
vm_thread_mutex1                   1.568      1.587         1.642   1.700
vm_thread_mutex2                   1.596      1.616         5.558   2.638
vm_thread_mutex3                3131.381   1335.436         4.009   4.089
vm_thread_pass                     0.130      0.137         1.687   1.862
vm_thread_pass_flood               0.356      0.311         0.705   0.847
vm_thread_pipe                     1.154      1.133         2.515   2.459

----------------------------------------
Bug #4962: come back gem_prelude!
http://redmine.ruby-lang.org/issues/4962

Author: Yusuke Endoh
Status: Assigned
Priority: Normal
Assignee: Nobuyoshi Nakada
Category: lib
Target version: 1.9.3
ruby -v: -

Hello, rubygems developers

Kosaki-san noticed that 1.9.3 is slower than 1.9.2 on many benchmarks.

  http://www.atdot.net/sp/view/5qunnl

I investigated and found that the cause is the lack of gem_prelude.rb.
Loading rubygems seems to create many objects and keep the references
to them.
See below:

  $ ruby -ve 'GC.start; p ObjectSpace.count_objects[:TOTAL]'
  ruby 1.9.2p180 (2011-02-18 revision 30909) [i686-linux]
  9821

  $ ./ruby -ve 'GC.start; p ObjectSpace.count_objects[:TOTAL]'
  ruby 1.9.3dev (2011-07-01 trunk 32356) [i686-linux]
  19638

  $ ./ruby --disable-gems -ve 'GC.start; p ObjectSpace.count_objects[:TOTAL]'
  ruby 1.9.3dev (2011-07-01 trunk 32356) [i686-linux]
  9821

The cost of the GC mark phase is proportional to the number of live
objects. You can actually confirm the performance degradation with the
following benchmark script:

  require 'tempfile'

  max = 200_000
  str = "Hello world! " * 1000
  f = Tempfile.new('yarv-benchmark')
  f.write str

  GC::Profiler.enable
  max.times{
    f.seek 0
    f.read
  }
  p GC::Profiler.total_time

  $ time ruby -v bm_io_file_read.rb
  ruby 1.9.2p180 (2011-02-18 revision 30909) [i686-linux]
  0.7280460000000308

  real    0m3.965s
  user    0m2.940s
  sys     0m1.024s

  $ time ./ruby -v bm_io_file_read.rb
  ruby 1.9.3dev (2011-07-01 trunk 32356) [i686-linux]
  1.396088000000029

  real    0m4.786s
  user    0m3.716s
  sys     0m1.060s

  $ time ./ruby --disable-gems -v bm_io_file_read.rb
  ruby 1.9.3dev (2011-07-01 trunk 32356) [i686-linux]
  0.7640390000000309

  real    0m4.079s
  user    0m2.872s
  sys     0m1.192s

The performance degradation can be seen not only in such micro
benchmarks, but also in my puzzle solvers :-(

There are some approaches to address the problem:

  1. to introduce a generational GC; this is impossible until 2.0
     because it requires modifications to all extension libraries.

  2. to put rubygems on a diet; create as few strings, arrays, hashes,
     and other objects as possible, and do not keep references to
     them.

  3. to restore gem_prelude.rb to delay loading rubygems.

I guess that 3 is a reasonable choice for 1.9.3. But I'm fine with
any solution that fixes rubygems, as long as 1.9.3 becomes as fast as
1.9.2 on the benchmarks.

-- 
Yusuke Endoh

-- 
http://redmine.ruby-lang.org
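
The object-count comparison in the report above can be repeated on whatever interpreter is at hand. The following is only a rough sketch and not part of the original report. The helper name live_slots is hypothetical, and the script assumes the running interpreter accepts --disable-gems. It also subtracts :FREE from :TOTAL to approximate live objects, whereas the report prints :TOTAL alone. Absolute numbers on a current Ruby will differ from the 1.9.x figures quoted.

```ruby
require 'rbconfig'

# Expression run in each child interpreter. The report prints :TOTAL only;
# subtracting :FREE here is an assumption made to approximate live objects.
SNIPPET = 'GC.start; h = ObjectSpace.count_objects; p h[:TOTAL] - h[:FREE]'

# Hypothetical helper: start the current interpreter with the given flags
# and return the integer it prints. RUBYOPT is unset for the child so that
# options inherited from the environment do not skew the comparison.
def live_slots(*flags)
  out = IO.popen({ 'RUBYOPT' => nil }, [RbConfig.ruby, *flags, '-e', SNIPPET], &:read)
  Integer(out.strip)
end

with_gems    = live_slots
without_gems = live_slots('--disable-gems')

puts "with rubygems:    #{with_gems}"
puts "without rubygems: #{without_gems}"
puts "difference:       #{with_gems - without_gems}"
```

A large positive difference would point to the same effect described in the report: objects created while loading rubygems stay reachable, so every GC mark phase has to visit them.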