HIGH PERFORMANCE RUBY
Hiya

• Charles Oliver Nutter
• headius@headius.com
• @headius
• JVM language guy at Red Hat (JBoss)
Performance?
• Writing code
 • Man hours more expensive than CPU hours
 • Developer contentedness
• Running code
 • Straight line
High Performance?

• Faster than...
 • ...other Ruby impls?
 • ...other language runtimes?
 • ...unmanaged languages, like C?
 • ...you need it to be?
“Fast Enough”

• 1.8.7 was fast enough
• 1.9.3 is fast enough
• Unless it’s not fast enough
 • Does it matter?
Performance Wall

• Move to a different runtime
• Move to a different language
 • ...in whole or part
If you’re not writing perf-sensitive code in Ruby, you’re giving up too easily.
Native Extensions

• Not universally bad
• Just bad in MRI
 • Invasive
 • Pointers
 • Few guarantees
What We Want

• Faster execution
• Better GC
• Parallel execution
• Big data
What We Can’t Have

• Faster execution
• Better GC
• Parallel execution
• Big data
Different Approach

• Build our own runtime?
 • YARV, Rubinius, MacRuby
• Use an existing runtime?
 • JRuby, MagLev, MacRuby, IronRuby
Build or Buy

• Making a new VM is “easy”
• Making it competitive is really hard
• I mean really, really, really hard
JVM
• 15+ years of engineering by whole teams
• FOSS
• Fastest VM available
• Best GCs available
• Full parallel threading with guarantees
• Broad platform support
But Java is Slow!
• Java is very, very fast
 • Literally, C fast in many cases
• Java applications can be slow
 • Oh hey, just like Ruby?
• The way you write code is more important
  than the language you use.
JRuby

• Java (and Ruby) impl of Ruby on JVM
• Same memory, threading model
• JRuby JITs to JVM bytecode
• End of story, right?
Long, Hard Road

• Interpreter optimization
• JVM bytecode compiler
• Optimizing core class methods
• Lather, rinse, and repeat
Align with JVM

• Individual arguments on call stack
• JVM local variables
• Avoid artificial framing
• Avoid inter-call goo
• Eliminate unnecessary work
Unnecessary Work
• Modules are maps
 • Name to method
 • Name to constant
 • Name to class var
• Instance variables as maps
• Wasted cycles without caching
Method Lookup
• Inside a class/module
 • Current class’s methods (a map)
 • Methods retrieved from class + ancestors
 • Serial or switch indicates staleness
 • Weak list of child classes
• Class mutation cascades down hierarchy
Method Lookup (diagram, built up over several slides)
• Hierarchy: Thing at the top; Person and Place below it; Rubyist and Other at the bottom
• obj.to_s is the call being resolved
• Method lookups go up-hierarchy
• Lookup target caches result
• Modification cascades down
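The lookup, cache, and cascade steps can be sketched in plain Ruby. This is a toy model, not JRuby's implementation: ToyClass, define, lookup, and cached? are hypothetical names, and placing Rubyist under Person is an assumption about the diagram. Real runtimes typically use serial numbers or switches and hold child classes weakly.

```ruby
# Toy model of a class with a method table, a lookup cache, and child links.
# All names here are hypothetical; this is not JRuby's actual data structure.
class ToyClass
  attr_reader :name, :parent

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @methods = {}    # this class's own methods (name => callable)
    @cache = {}      # results of earlier lookups, including inherited ones
    @children = []   # a real runtime would hold these weakly
    parent.add_child(self) if parent
  end

  def add_child(child)
    @children << child
  end

  # Defining a method changes what descendants may see, so flush downward.
  def define(name, &body)
    @methods[name] = body
    invalidate
  end

  def invalidate
    @cache.clear
    @children.each(&:invalidate)
  end

  # Walk up the hierarchy on a miss; remember the answer where the search started.
  def lookup(name)
    @cache.fetch(name) do
      found = @methods[name] || (@parent && @parent.lookup(name))
      @cache[name] = found if found
      found
    end
  end

  def cached?(name)
    @cache.key?(name)
  end
end

thing   = ToyClass.new("Thing")
person  = ToyClass.new("Person", thing)
rubyist = ToyClass.new("Rubyist", person)   # assumed parent, for illustration

thing.define(:to_s) { "a thing" }
rubyist.lookup(:to_s).call            # found on Thing, now cached on Rubyist
person.define(:to_s) { "a person" }   # flushes the caches of Person and Rubyist
rubyist.lookup(:to_s).call            # looked up again, now found on Person
```

Flushing every descendant is the simplest correct policy; it trades some repeated lookups after a class change for cheap hits the rest of the time.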
Constant Lookup

• Cache at lookup site
• Global serial/switch indicates staleness
 • Complexities of lookup, etc
 • Joy of Ruby interfering with Joy of Opto
• Modifying constants triggers invalidation
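A rough Ruby sketch of a lookup-site cache guarded by one global serial. ConstantTable and ConstantSite are hypothetical names invented for this example, and the real lookup rules (lexical scope, ancestors, autoload) are deliberately left out.

```ruby
# Toy constant table with a single global serial number.
# Hypothetical names; real constant lookup is far more involved.
class ConstantTable
  attr_reader :serial, :lookups

  def initialize
    @values = {}
    @serial = 0
    @lookups = 0   # counts slow-path lookups, for demonstration only
  end

  # Any constant definition bumps the global serial, invalidating every site.
  def set(name, value)
    @values[name] = value
    @serial += 1
  end

  def slow_lookup(name)
    @lookups += 1
    @values.fetch(name)
  end
end

# One of these exists per place in the code that reads a constant.
class ConstantSite
  def initialize(table, name)
    @table = table
    @name = name
    @cached_serial = nil
    @value = nil
  end

  def get
    if @cached_serial != @table.serial
      @value = @table.slow_lookup(@name)
      @cached_serial = @table.serial
    end
    @value
  end
end

table = ConstantTable.new
table.set(:MY_CONST, 42)
site = ConstantSite.new(table, :MY_CONST)
3.times { site.get }     # one slow lookup, then two cache hits
table.set(:OTHER, 1)     # an unrelated change still invalidates this site
site.get                 # slow lookup again
```

A single global serial is coarse: any constant change anywhere forces every site to look up again, which is tolerable only because constants rarely change after boot.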
Instance Vars

• Class holds a table of offsets
• Object holds array of values
• Call site caches offset plus class ID
• Same class, no lookup cost
 • Can be polymorphically chained
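The offset-table scheme can be sketched as below. ToyKlass, ToyObject, and IvarSite are hypothetical names, and object identity stands in for the class ID; polymorphic chaining is not modeled.

```ruby
# Toy object model: the class maps ivar names to slots, objects hold a flat array.
# Hypothetical names; a sketch of the idea, not JRuby's layout.
class ToyKlass
  attr_reader :offsets

  def initialize
    @offsets = {}   # e.g. "@foo" => 0, "@bar" => 1
  end

  # Assign the next free slot the first time a name is seen.
  def offset_for(name)
    @offsets[name] ||= @offsets.size
  end
end

class ToyObject
  attr_reader :klass, :slots

  def initialize(klass)
    @klass = klass
    @slots = []
  end
end

# One access site per place in the code that touches a given ivar.
class IvarSite
  attr_reader :misses

  def initialize(name)
    @name = name
    @cached_klass = nil
    @cached_offset = nil
    @misses = 0
  end

  def read(obj)
    obj.slots[offset_in(obj.klass)]
  end

  def write(obj, value)
    obj.slots[offset_in(obj.klass)] = value
  end

  private

  # Same class as last time: reuse the offset. Different class: look it up once.
  def offset_in(klass)
    unless klass.equal?(@cached_klass)
      @misses += 1
      @cached_offset = klass.offset_for(@name)
      @cached_klass = klass
    end
    @cached_offset
  end
end

person   = ToyKlass.new
foo_site = IvarSite.new("@foo")
bar_site = IvarSite.new("@bar")
a = ToyObject.new(person)
b = ToyObject.new(person)
foo_site.write(a, 1)
bar_site.write(a, 2)
bar_site.write(b, 3)   # same class as before, so no table lookup
bar_site.read(a)       # reads slot 1 of a
```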
Optimizing Ruby

• Make calls fast
• Make constants free
• Make instance variables cheap
• Make closures lightweight
 • TODO
What is invokedynamic?
Invoke?
That’s one use, but there are many others
Dynamic?
Dynamic typing is a common reason, but there are many others
JVM 101
200 opcodes
Ten (or 16) “data endpoints”
   Invocation       Field Access      Array Access
 invokevirtual       getfield          *aload
invokeinterface      putfield          *astore
 invokestatic        getstatic     b,s,c,i,l,d,f,a
 invokespecial       putstatic

All Java code revolves around these endpoints
Remaining ops are stack, local vars, flow control,
allocation, and math/boolean/bit operations
JVM Opcodes
   Invocation       Field Access      Array Access
 invokevirtual       getfield          *aload
invokeinterface      putfield          *astore
 invokestatic        getstatic     b,s,c,i,l,d,f,a
 invokespecial       putstatic

Other groups: Stack, Local Vars, Flow Control, Allocation, Boolean and Numeric
In Detail

• JRuby generates code with indy calls
• JVM at first call asks JRuby what to do
• JRuby provides function pointers to code
• Pointers include guards, invalidation logic
• JRuby and JVM cooperate on optimizing
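The handshake above can be imitated in plain Ruby. This is only an analogy: ToyCallSite and the bootstrap lambda are hypothetical, lambdas stand in for method handles, and nothing here is the java.lang.invoke API. Invalidation on class changes is not modeled.

```ruby
# Toy call site in the spirit of invokedynamic (an analogy, not the JVM API).
class ToyCallSite
  attr_reader :link_count

  def initialize(name, &bootstrap)
    @name = name
    @bootstrap = bootstrap   # the "what should I do?" hook supplied by the language
    @guard = nil
    @target = nil
    @link_count = 0
  end

  # Fast path: the guard passes and we jump straight to the cached target.
  # Slow path: the first call, or a failed guard, asks the bootstrap again.
  def call(receiver, *args)
    link(receiver) unless @guard && @guard.call(receiver)
    @target.call(receiver, *args)
  end

  private

  def link(receiver)
    @link_count += 1
    @guard, @target = @bootstrap.call(@name, receiver)
  end
end

# The language runtime's answer: a guard plus a "pointer" to the code to run.
bootstrap = lambda do |name, receiver|
  klass = receiver.class
  unbound = klass.instance_method(name)
  guard = ->(obj) { obj.class.equal?(klass) }
  target = ->(obj, *args) { unbound.bind(obj).call(*args) }
  [guard, target]
end

site = ToyCallSite.new(:to_s, &bootstrap)
site.call(1)      # first call: links against Integer#to_s
site.call(2)      # guard passes, no relink
site.call("x")    # guard fails, relinks against String#to_s
site.link_count   # two links so far
```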
[Diagram, built up over several slides: invokedynamic bytecode, bootstrap method, method handles, target method]
Dynamic Invocation (diagram, built up over several slides)
• Call Site: obj.foo()
• JVM
• Target Object, associated with a Method Table (def foo ..., def bar ...)
• VM Operations: Method Lookup, Branch, Method Cache
Constants (diagram, built up over several slides)
• Call Site: MY_CONST
• JVM
• Constant Lookup
• VM Operations: Lookup Value, Bind Permanently
Instance Variables (diagram, built up over several slides)
• Access Site: @bar
• JVM
• Target Object, associated with an Offset Table (“@foo” => 0, “@bar” => 1)
• VM Operations: Instance Var Lookup, Offset Cache (1), Access Object
InvokeDynamic lets JRuby teach the JVM how Ruby works
How Do We Know We’ve Succeeded?

• Benchmarking
• Monitoring
• User reports
Benchmarking is Hard

• Runtimes may improve over time
• Optimizer may eliminate useless code
• Small systems are completely different
• Know how your runtime optimizes!
bench_empty_method
def foo; self; end

i = 0
while i < 10_000_000
  foo; foo; foo; foo; foo
  i += 1
end
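One way to make warm-up visible is to time the same loop several times and report each pass separately. A minimal sketch using the standard Benchmark module; timed_passes and the pass and iteration counts are choices made for this example, not part of the original benchmark.

```ruby
require 'benchmark'

def foo; self; end

# Run the same workload several times and keep each pass's time,
# so warm-up effects show up instead of being averaged away.
def timed_passes(passes, iterations)
  Array.new(passes) do
    Benchmark.realtime do
      i = 0
      while i < iterations
        foo; foo; foo; foo; foo
        i += 1
      end
    end
  end
end

times = timed_passes(5, 200_000)
times.each_with_index { |t, n| puts format("pass %d: %.3fs", n + 1, t) }
```

On a JIT-compiling runtime the later passes are the ones worth comparing; the first pass mostly measures startup and compilation.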
[Chart: bench_empty_method times for Ruby 1.9.3, JRuby, and JRuby + indy; vertical axis 0s to 4s. Callout: ZOMG 40X FASTER!]
Observations
• One slow runtime screws up the table
 • ...do comparisons as ratios against a norm
• JRuby calls empty methods really fast!!!
• InvokeDynamic does not do much for us?
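Comparing as ratios against a norm is a one-liner. In this sketch the normalize helper and the runtime names are hypothetical, and the numbers are placeholders for illustration only, not measurements from the talk.

```ruby
# Express each runtime's time as a ratio against a chosen baseline.
def normalize(times, baseline)
  base = times.fetch(baseline)
  times.map { |name, seconds| [name, (seconds / base).round(2)] }.to_h
end

# Placeholder numbers, NOT results from these slides.
sample = { "runtime_a" => 4.0, "runtime_b" => 0.2, "runtime_c" => 0.1 }
ratios = normalize(sample, "runtime_b")
ratios.each { |name, ratio| puts "#{name}: #{ratio}x" }
```

With runtime_b as the norm, the slow outlier reads as a multiple of the baseline instead of flattening every other bar on a shared time axis.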
[Chart: Ruby 1.9.3, JRuby, and JRuby + indy; vertical axis 0s to 4s]
JVM Opto 101
• JITs code bodies after 10k calls
 • No 10k calls, no JIT (generally)
• Inlines up to two targets
• Optimistic
 • Early decisions may be wrong
 • Small code looks drastically different
SMALL CODE IS DIFFERENT THAN LARGE CODE
Inlining

• Call site in method A reliably targets method B
• JVM treats them as though B lived in A
 • No call overhead
 • Variables visible across call boundary
 • More complete view for optimization
Optimistic

• Say we have a system...
• The only method dynamically called is “foo”
• All logic for dyncall revolves around “foo”
• HotSpot thinks all dyncalls will be “foo”
bench_empty_method2
def foo; self; end
def bar1; self; end
def bar2; self; end

i = 0
while i < 10_000_000
  bar1; bar1; bar1; bar1; bar1
  bar2; bar2; bar2; bar2; bar2
  i += 1
end
...
[Chart: bench1, bench2, bench1 + indy, bench2 + indy; vertical axis 0s to 0.7s]
[Chart: bench1 + rbx, bench2 + rbx, bench1 + indy, bench2 + indy; vertical axis 0s to 0.4s]
What Happened?

• An unrelated change slowed our bench?
• Not really unrelated
 • HotSpot optimizes early loop first
 • Later loop is different...calls “foo”
 • Assumptions change, perf looks different
Benchmarking is Not Enough
• Need to monitor runtime optimization
 • JIT compilation
 • Inlining
 • Eventual native code (x86 ASM)
• Fun?
1711   4 %    bench_empty_method::block_0$RUBY$__file__ @ 56 (171 bytes)
             @ 59 java.lang.invoke.MethodHandle::invokeExact (33 bytes) inline (hot)
              @ 5 java.lang.invoke.MethodHandle::invokeExact (20 bytes) inline (hot)
                @ 2 java.lang.invoke.MethodHandle::invokeExact (9 bytes) inline (hot)
                 @ 2 java.lang.invoke.MutableCallSite::getTarget (5 bytes) inline (hot)
                @ 16 java.lang.invoke.MethodHandle::invokeExact (5 bytes) inline (hot)
                 @ 1 sun.invoke.util.ValueConversions::identity (2 bytes) inline (hot)
              @ 12 java.lang.invoke.MethodHandle::invokeExact (10 bytes) inline (hot)
              @ 29 java.lang.invoke.MethodHandle::invokeExact (35 bytes) inline (hot)
                @ 5 java.lang.invoke.MethodHandle::invokeExact (7 bytes) inline (hot)
                 @ 3 org.jruby.runtime.invokedynamic.InvocationLinker::testMetaclass (17 bytes)   inline (hot)
                  @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) inline (hot)
                @ 14 java.lang.invoke.MethodHandle::invokeExact (10 bytes) inline (hot)
                @ 31 java.lang.invoke.MethodHandle::invokeExact (10 bytes) inline (hot)
                 @ 6 bench_empty_method::method__0$RUBY$foo (2 bytes) inline (hot)
             @ 68 java.lang.invoke.MethodHandle::invokeExact (33 bytes) inline (hot)
              @ 5 java.lang.invoke.MethodHandle::invokeExact (20 bytes) inline (hot)
                @ 2 java.lang.invoke.MethodHandle::invokeExact (9 bytes) inline (hot)
                 @ 2 java.lang.invoke.MutableCallSite::getTarget (5 bytes) inline (hot)
Decoding compiled method 0x000000010549d7d0:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
  # {method} 'method__0$RUBY$foo' '(Lbench_empty_method;Lorg/jruby/runtime/ThreadContext;Lorg/jruby/
runtime/builtin/IRubyObject;Lorg/jruby/runtime/Block;)Lorg/jruby/runtime/builtin/IRubyObject;' in
'bench_empty_method'
  # parm0:    rsi:rsi = 'bench_empty_method'
  # parm1:    rdx:rdx = 'org/jruby/runtime/ThreadContext'
  # parm2:    rcx:rcx = 'org/jruby/runtime/builtin/IRubyObject'
  # parm3:    r8:r8    = 'org/jruby/runtime/Block'
  #         [sp+0x20] (sp of caller)
  0x000000010549d900: sub    $0x18,%rsp
  0x000000010549d907: mov    %rbp,0x10(%rsp)    ;*synchronization entry
                                  ; - bench_empty_method::method__0$RUBY$foo@-1 (line 3)
  0x000000010549d90c: mov    %rcx,%rax
  0x000000010549d90f: add    $0x10,%rsp
  0x000000010549d913: pop    %rbp
  0x000000010549d914: test %eax,-0xe9f91a(%rip)         # 0x00000001045fe000
                                  ; {poll_return}
  0x000000010549d91a: retq
bench_empty_method3
def invoker1
  i = 0
  while i < 1000
    foo; foo; foo; foo; foo
    i+=1
  end
end
...
  i = 0
  while i < 10000
    invoker1
    i+=1
  end
[Chart: bench1 + indy, bench2 + indy, bench3 + indy; vertical axis 0s to 0.15s]
Moral

• Benchmarks are synthetic
• Every system is different
• Do your own testing
bench_red_black
• Pure-Ruby red/black tree impl
• Build a 100k tree of rand(999_999)
• Delete all nodes
• Build it again
• Search for elements
• In-order walks, min, max
[Chart: bench_red_black times for Ruby 1.9.3, JRuby - indy, and JRuby + indy; vertical axis 0s to 5s]
bench_fractal
bench_flipflop_fractal
• Mandelbrot generator
 • Integer loops
 • Floating-point math
• Julia generator using flip-flops
 • I don’t really understand it.
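For readers who have not met the flip-flop operator: a range written as a condition switches on when its left test is true and stays on until its right test is true. A minimal example, unrelated to the fractal code itself:

```ruby
# The condition is off until x == 3, then stays on through x == 5.
picked = []
(1..10).each { |x| picked << x if (x == 3)..(x == 5) }
# picked is now [3, 4, 5]
```

The three-dot form (...) differs only in that it does not test the right-hand condition on the same evaluation that switched it on.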
def fractal_flipflop
  w, h = 44, 54
  c = 7 + 42 * w
  a = [0] * w * h
  g = d = 0
  f = proc do |n|
    a[c] += 1
    o = a.map {|z| " :#"[z, 1] * 2 }.join.scan(/.{#{w * 2}}/)
    puts "\f" + o.map {|l| l.rstrip }.join("\n")
    d += 1 - 2 * ((g ^= 1 << n) >> n)
    c += [1, w, -1, -w][d %= 4]
  end
  1024.times do
    !!(!!(!!(!!(!!(!!(!!(!!(!!(true...
     f[0])...f[1])...f[2])...
     f[3])...f[4])...f[5])...
     f[6])...f[7])...f[8])
  end
end
[Chart: bench_fractal times for Ruby 1.9.3, JRuby - indy, and JRuby + indy; vertical axis 0s to 1.5s]
[Chart: bench_flipflop_fractal times for Ruby 1.9.3, JRuby - indy, and JRuby + indy; vertical axis 0s to 1.5s]
Rails?
Rails Perf

• Mixed bag right now...some fast some slow
• JVM JIT limits need to be bumped up
 • Significant gains for some folks
• Long warmup times for so much code
• Work continues!
What Next?
Expand Opto

• Mixed-arity (ADD SLIDES ABOUT WHAT
  WE OPTIMIZE TODAY)
• Super calls
• Much, much lighter-weight closures
• Then what?
Wacky Stuff

• define_method methods?
• method_missing call-throughs?
• respond_to???
• proc tables?
• All possible...but worth it?
The Future
• JRuby will continue to get faster
 • Indy improvements at VM-level
 • Compiler improvements at Ruby level
• If you can’t compete with JVM...
• Still FOSS from top to bottom
 • Don’t be afraid!
Q/A

High Performance Ruby - Golden Gate RubyConf 2012

  • 1.
  • 2.
    Hiya • Charles OliverNutter • [email protected] • @headius • JVM language guy at Red Hat (JBoss)
  • 3.
    Performance? • Writing code • Man hours more expensive than CPU hours • Developer contentedness • Running code • Straight line
  • 4.
    High Performance? • Fasterthan... • ...other Ruby impls? • ...other language runtimes? • ...unmanaged languages, like C? • ...you need it to be?
  • 5.
    “Fast Enough” • 1.8.7was fast enough • 1.9.3 is fast enough • Unless it’s not fast enough • Does it matter?
  • 6.
    Performance Wall • Moveto a different runtime • Move to a different language • ...in whole or part
  • 7.
    If you’re notwriting perf- sensitive code in Ruby, you’re giving up too easily.
  • 8.
    Native Extensions • Notuniversally bad • Just bad in MRI • Invasive • Pointers • Few guarantees
  • 9.
    What We Want •Faster execution • Better GC • Parallel execution • Big data
  • 10.
    What We Can’tHave • Faster execution • Better GC • Parallel execution • Big data
  • 11.
    Different Approach • Buildour own runtime? • YARV, Rubinius, MacRuby • Use an existing runtime? • JRuby, MagLev, MacRuby, IronRuby
  • 12.
    Build or Buy •Making a new VM is “easy” • Making it competitive is really hard • I mean really, really, really hard
  • 13.
    JVM • 15+ yearsof engineering by whole teams • FOSS • Fastest VM available • Best GCs available • Full parallel threading with guarantees • Broad platform support
  • 14.
    But Java isSlow! • Java is very, very fast • Literally, C fast in many cases • Java applications can be slow • Oh hey, just like Ruby? • The way you write code is more important than the language you use.
  • 15.
    JRuby • Java (andRuby) impl of Ruby on JVM • Same memory, threading model • JRuby JITs to JVM bytecode • End of story, right?
  • 16.
    Long, Hard Road •Interpreter optimization • JVM bytecode compiler • Optimizing core class methods • Lather, rinse, and repeat
  • 18.
    Align with JVM •Individual arguments on call stack • JVM local variables • Avoid artificial framing • Avoid inter-call goo • Eliminate unnecessary work
  • 19.
    Unnecessary Work • Modulesare maps • Name to method • Name to constant • Name to class var • Instance variables as maps • Wasted cycles without caching
  • 20.
    Method Lookup • Insidea class/module • Current class’s methods (a map) • Methods retrieved from class + ancestors • Serial or switch indicates staleness • Weak list of child classes • Class mutation cascades down hierarchy
  • 21.
    Thing Person Place obj.to_s Rubyist Other
  • 22.
    Method lookups goup-hierarchy Thing Person Place obj.to_s Rubyist Other
  • 23.
    to_s Method lookups goup-hierarchy Thing Person Place obj.to_s Rubyist Other
  • 24.
    to_s Method lookups goup-hierarchy Thing Lookup target caches result Person Place obj.to_s Rubyist Other
  • 25.
    Method lookups goup-hierarchy Thing Lookup target caches result Person Place to_s obj.to_s Rubyist Other
  • 26.
    Method lookups goup-hierarchy Thing Lookup target caches result Modification cascades down Person Place to_s obj.to_s Rubyist Other
  • 27.
    Method lookups goup-hierarchy Thing to_s Lookup target caches result Modification cascades down Person Place to_s obj.to_s Rubyist Other
  • 28.
    Constant Lookup • Cacheat lookup site • Global serial/switch indicates staleness • Complexities of lookup, etc • Joy of Ruby interfering with Joy of Opto • Modifying constants triggers invalidation
  • 29.
    Instance Vars • Classholds a table of offsets • Object holds array of values • Call site caches offset plus class ID • Same class, no lookup cost • Can be polymorphically chained
  • 30.
    Optimizing Ruby • Makecalls fast • Make constants free • Make instance variables cheap • Make closures lightweight • TODO
  • 31.
  • 32.
  • 33.
    Invoke? That’s one use,but there are many others
  • 34.
  • 35.
    Dynamic? Dynamic typing isa common reason, but there are many others
  • 36.
  • 37.
  • 38.
    JVM 101 200 opcodes Ten (or 16) “data endpoints”
  • 39.
    JVM 101 200 opcodes Ten (or 16) “data endpoints” Invocation invokevirtual invokeinterface invokestatic invokespecial
  • 40.
    JVM 101 200 opcodes Ten (or 16) “data endpoints” Invocation Field Access invokevirtual getfield invokeinterface setfield invokestatic getstatic invokespecial setstatic
  • 41.
    JVM 101 200 opcodes Ten (or 16) “data endpoints” Invocation Field Access Array Access invokevirtual getfield *aload invokeinterface setfield *astore invokestatic getstatic b,s,c,i,l,d,f,a invokespecial setstatic
  • 42.
    JVM 101 200 opcodes Ten (or 16) “data endpoints” Invocation Field Access Array Access invokevirtual getfield *aload invokeinterface setfield *astore invokestatic getstatic b,s,c,i,l,d,f,a invokespecial setstatic All Java code revolves around these endpoints Remaining ops are stack, local vars, flow control allocation, and math/boolean/bit operations
  • 43.
  • 44.
    JVM Opcodes Invocation Field Access Array Access invokevirtual getfield *aload invokeinterface setfield *astore invokestatic getstatic b,s,c,i,l,d,f,a invokespecial setstatic
  • 45.
    JVM Opcodes Invocation Field Access Array Access invokevirtual getfield *aload invokeinterface setfield *astore invokestatic getstatic b,s,c,i,l,d,f,a invokespecial setstatic Stack Local Vars Flow Control Allocation Boolean and Numeric
  • 46.
    JVM Opcodes Invocation Field Access Array Access invokevirtual getfield *aload invokeinterface setfield *astore invokestatic getstatic b,s,c,i,l,d,f,a invokespecial setstatic Stack Local Vars Flow Control Allocation Boolean and Numeric
  • 47.
    JVM Opcodes Invocation Field Access Array Access invokevirtual getfield *aload invokeinterface setfield *astore invokestatic getstatic b,s,c,i,l,d,f,a invokespecial setstatic Stack Local Vars Flow Control Allocation Boolean and Numeric
  • 49.
    In Detail • JRubygenerates code with indy calls • JVM at first call asks JRuby what to do • JRuby provides function pointers to code • Pointers include guards, invalidation logic • JRuby and JVM cooperate on optimizing
  • 52.
    [Diagram, built up over several slides: invokedynamic bytecode, bootstrap method, method handles, target method]
  • 57.
    Dynamic Invocation
    [Diagram, built up over several slides: a Target Object with its Method Table (def foo ..., def bar ...); inside the JVM, the Call Site associated with obj.foo(), which ends up holding def foo ...]
    VM Operations
    • Method Lookup
    • Branch
    • Method Cache
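The lookup, branch, and cache steps on these slides can be modeled in plain Ruby. What follows is only a rough sketch under stated assumptions: `CallSite` and its `lookups` counter are hypothetical names invented for illustration, not JRuby classes, and the guard here checks only the receiver's class (no singleton methods, no invalidation when a method is redefined).

```ruby
# Hypothetical model of a caching call site; NOT JRuby's implementation.
class CallSite
  attr_reader :lookups

  def initialize(name)
    @name = name          # method name this site calls, e.g. :foo
    @cached_class = nil   # guard: the class seen when the target was cached
    @target = nil         # cached UnboundMethod (the "method cache")
    @lookups = 0          # number of slow-path method lookups performed
  end

  def call(receiver, *args)
    klass = receiver.class
    unless klass.equal?(@cached_class)        # guard failed, or first call
      @target = klass.instance_method(@name)  # slow path: method lookup
      @cached_class = klass                   # remember what we looked up
      @lookups += 1
    end
    @target.bind(receiver).call(*args)        # fast path: branch to target
  end
end

class Foo
  def foo; self; end
end

site = CallSite.new(:foo)
obj = Foo.new
1000.times { site.call(obj) }
puts site.lookups   # prints 1: only the first call needed a lookup
```

As long as the same class keeps arriving at the site, every call after the first skips the lookup. A receiver of a different class fails the guard and forces a relink.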
  • 63.
    Constants
    [Diagram, built up over several slides: inside the JVM, a Call Site performs a Constant Lookup for MY_CONST and ends up holding its value]
    VM Operations
    • Lookup Value
    • Bind Permanently
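A constant site that looks the value up once and then keeps it bound might be sketched like this. Everything named here (`ConstantEpoch`, `ConstantSite`, `redefine_constant`) is hypothetical and exists only for illustration. The epoch counter is an assumption about how a "permanent" binding could stay correct if a constant is ever reassigned; it is not a description of JRuby's actual mechanism.

```ruby
# Hypothetical model of a constant access site; NOT JRuby's implementation.
class ConstantEpoch
  @value = 0
  class << self
    attr_reader :value
    def invalidate!; @value += 1; end   # called whenever a constant changes
  end
end

class ConstantSite
  attr_reader :lookups

  def initialize(scope, name)
    @scope = scope
    @name = name
    @cached = nil
    @bound_epoch = nil   # epoch at which the value was bound
    @lookups = 0
  end

  def get
    unless @bound_epoch == ConstantEpoch.value
      @cached = @scope.const_get(@name)   # slow path: look up the value
      @bound_epoch = ConstantEpoch.value  # bind until invalidated
      @lookups += 1
    end
    @cached                               # fast path: already bound
  end
end

# Helper (hypothetical) that reassigns a constant and invalidates all sites.
def redefine_constant(scope, name, value)
  scope.send(:remove_const, name)
  scope.const_set(name, value)
  ConstantEpoch.invalidate!
end

module Config
  MY_CONST = 42
end

site = ConstantSite.new(Config, :MY_CONST)
3.times { site.get }
puts site.lookups                          # prints 1
redefine_constant(Config, :MY_CONST, 43)
puts site.get                              # prints 43
puts site.lookups                          # prints 2
```

The point of the design is that the common case, a constant that never changes, pays for the lookup exactly once.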
  • 68.
    Instance Variables
    [Diagram, built up over several slides: a Target Object with its Offset Table (“@foo” => 0, “@bar” => 1); inside the JVM, the Access Site associated with @bar, which ends up holding offset 1]
    VM Operations
    • InstanceVar Lookup
    • Offset Cache
    • Access Object
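The offset-table idea can also be modeled in a few lines of Ruby. This is a sketch only: `OffsetTable`, `ModelObject`, and `IvarSite` are hypothetical names, and storing variables in a plain array of slots is an assumption made to keep the example small.

```ruby
# Hypothetical model of offset-cached instance variable access.
class OffsetTable
  def initialize
    @offsets = {}
  end

  # Returns the slot index for a variable name, assigning the next free
  # index the first time a name is seen.
  def offset_for(name)
    @offsets[name] ||= @offsets.size
  end
end

class ModelObject
  attr_reader :table, :slots

  def initialize(table)
    @table = table   # offset table shared by all objects of one "class"
    @slots = []      # variable values, indexed by offset
  end
end

class IvarSite
  attr_reader :lookups

  def initialize(name)
    @name = name
    @cached_table = nil    # guard: table seen when the offset was cached
    @cached_offset = nil
    @lookups = 0
  end

  def get(obj)
    unless obj.table.equal?(@cached_table)
      @cached_offset = obj.table.offset_for(@name)  # slow path: name lookup
      @cached_table = obj.table
      @lookups += 1
    end
    obj.slots[@cached_offset]                       # fast path: indexed read
  end
end

table = OffsetTable.new
table.offset_for("@foo")   # assigned offset 0
table.offset_for("@bar")   # assigned offset 1

obj = ModelObject.new(table)
obj.slots[1] = :bar_value

site = IvarSite.new("@bar")
100.times { site.get(obj) }
puts site.get(obj).inspect   # prints :bar_value
puts site.lookups            # prints 1
```

After the first access the site never consults the name-to-offset hash again for objects that share the same table.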
  • 74.
    InvokeDynamic lets JRuby teach the JVM how Ruby works
  • 75.
    How Do We Know We’ve Succeeded?
    • Benchmarking
    • Monitoring
    • User reports
  • 76.
    Benchmarking is Hard
    • Runtimes may improve over time
    • Optimizer may eliminate useless code
    • Small systems are completely different
    • Know how your runtime optimizes!
  • 77.
    bench_empty_method
      def foo; self; end

      i = 0
      while i < 10_000_000
        foo; foo; foo; foo; foo
        i += 1
      end
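Because runtimes may improve over time, a single timing of this loop says little. One way to make warmup visible is to time the same loop several times in one process. The harness below is a sketch, not the one used for the talk: `ITERATIONS` and `RUNS` are assumed values (the iteration count is smaller than the slide's 10_000_000 so it finishes quickly), and `Benchmark.realtime` comes from Ruby's standard library.

```ruby
require 'benchmark'

def foo; self; end

ITERATIONS = 1_000_000   # assumption: reduced from the slide's 10_000_000
RUNS = 5                 # assumption: enough repeats to see warmup effects

# Time the same loop several times; on a JIT-compiling runtime the later
# runs are often faster than the first.
timings = Array.new(RUNS) do
  Benchmark.realtime do
    i = 0
    while i < ITERATIONS
      foo; foo; foo; foo; foo
      i += 1
    end
  end
end

timings.each_with_index do |seconds, n|
  puts format("run %d: %.3fs", n + 1, seconds)
end
```

Comparing the first and last runs, rather than reporting only one number, gives a rough idea of how much of the measured time was warmup.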
  • 78.
    [Bar chart: Ruby 1.9.3, JRuby, JRuby + indy; axis 0s to 4s. Caption: “ZOMG 40X FASTER!”]
  • 86.
    [Bar chart, shown again without the caption: Ruby 1.9.3, JRuby, JRuby + indy; axis 0s to 4s]
  • 87.
    JVM Opto 101
    • JITs code bodies after 10k calls
    • No 10k calls, no JIT (generally)
    • Inlines up to two targets
    • Optimistic
    • Early decisions may be wrong
    • Small code looks drastically different
  • 88.
    SMALL CODE IS DIFFERENT THAN LARGE CODE
  • 89.
    Inlining
    • Call site in method A and method B match
    • JVM treats them as though B lived in A
    • No call overhead
    • Variables visible across call boundary
    • More complete view for optimization
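To make “as though B lived in A” concrete, here is a hand-inlined pair of methods. The names (`square`, `sum_of_squares`, `sum_of_squares_inlined`) are hypothetical, and the inlining is done by hand purely for illustration; a JIT does this on compiled code, not on Ruby source.

```ruby
# Method B: small, and called from a hot loop.
def square(x)
  x * x
end

# Method A: contains a call site for B.
def sum_of_squares(n)
  total = 0
  i = 1
  while i <= n
    total += square(i)   # call overhead on every iteration
    i += 1
  end
  total
end

# Roughly what the optimizer sees after inlining B into A: no call, and
# x, i, and total are all visible in one body.
def sum_of_squares_inlined(n)
  total = 0
  i = 1
  while i <= n
    total += i * i
    i += 1
  end
  total
end

puts sum_of_squares(10)           # prints 385
puts sum_of_squares_inlined(10)   # prints 385
```

Both versions compute the same result. The second simply has no call boundary left for the optimizer to reason across.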
  • 90.
    Optimistic
    • Say we have a system...
    • The only method dynamically called is “foo”
    • All logic for dyncall revolves around “foo”
    • Hotspot thinks all dyncalls will be “foo”
  • 91.
    bench_empty_method2
      def foo; self; end
      def bar1; self; end
      def bar2; self; end

      i = 0
      while i < 10_000_000
        bar1; bar1; bar1; bar1; bar1
        bar2; bar2; bar2; bar2; bar2
        i += 1
      end
      ...
  • 92.
    [Bar chart: bench1, bench2, bench1 + indy, bench2 + indy; axis 0s to 0.7s]
  • 93.
    [Bar chart: bench1 + rbx, bench2 + rbx, bench1 + indy, bench2 + indy; axis 0s to 0.4s]
  • 94.
    What Happened?
    • An unrelated change slowed our bench?
    • Not really unrelated
    • Hotspot optimizes early loop first
    • Later loop is different...calls “foo”
    • Assumptions change, perf looks different
  • 95.
    Benchmarking is Not Enough
    • Need to monitor runtime optimization
    • JIT compilation
    • Inlining
    • Eventual native code (x86 ASM)
    • Fun?
  • 96.
    1711 4 % bench_empty_method::block_0$RUBY$__file__ @ 56 (171 bytes)
      @ 59 java.lang.invoke.MethodHandle::invokeExact (33 bytes) inline (hot)
      @ 5 java.lang.invoke.MethodHandle::invokeExact (20 bytes) inline (hot)
      @ 2 java.lang.invoke.MethodHandle::invokeExact (9 bytes) inline (hot)
      @ 2 java.lang.invoke.MutableCallSite::getTarget (5 bytes) inline (hot)
      @ 16 java.lang.invoke.MethodHandle::invokeExact (5 bytes) inline (hot)
      @ 1 sun.invoke.util.ValueConversions::identity (2 bytes) inline (hot)
      @ 12 java.lang.invoke.MethodHandle::invokeExact (10 bytes) inline (hot)
      @ 29 java.lang.invoke.MethodHandle::invokeExact (35 bytes) inline (hot)
      @ 5 java.lang.invoke.MethodHandle::invokeExact (7 bytes) inline (hot)
      @ 3 org.jruby.runtime.invokedynamic.InvocationLinker::testMetaclass (17 bytes) inline (hot)
      @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes) inline (hot)
      @ 14 java.lang.invoke.MethodHandle::invokeExact (10 bytes) inline (hot)
      @ 31 java.lang.invoke.MethodHandle::invokeExact (10 bytes) inline (hot)
      @ 6 bench_empty_method::method__0$RUBY$foo (2 bytes) inline (hot)
      @ 68 java.lang.invoke.MethodHandle::invokeExact (33 bytes) inline (hot)
      @ 5 java.lang.invoke.MethodHandle::invokeExact (20 bytes) inline (hot)
      @ 2 java.lang.invoke.MethodHandle::invokeExact (9 bytes) inline (hot)
      @ 2 java.lang.invoke.MutableCallSite::getTarget (5 bytes) inline (hot)
  • 99.
    Decoding compiled method 0x000000010549d7d0:
    Code:
    [Entry Point]
    [Verified Entry Point]
    [Constants]
      # {method} 'method__0$RUBY$foo' '(Lbench_empty_method;Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/Block;)Lorg/jruby/runtime/builtin/IRubyObject;' in 'bench_empty_method'
      # parm0: rsi:rsi = 'bench_empty_method'
      # parm1: rdx:rdx = 'org/jruby/runtime/ThreadContext'
      # parm2: rcx:rcx = 'org/jruby/runtime/builtin/IRubyObject'
      # parm3: r8:r8 = 'org/jruby/runtime/Block'
      # [sp+0x20] (sp of caller)
      0x000000010549d900: sub $0x18,%rsp
      0x000000010549d907: mov %rbp,0x10(%rsp) ;*synchronization entry
                                              ; - bench_empty_method::method__0$RUBY$foo@-1 (line 3)
      0x000000010549d90c: mov %rcx,%rax
      0x000000010549d90f: add $0x10,%rsp
      0x000000010549d913: pop %rbp
      0x000000010549d914: test %eax,-0xe9f91a(%rip) # 0x00000001045fe000
                                              ; {poll_return}
      0x000000010549d91a: retq
  • 102.
    bench_empty_method3
      def invoker1
        i = 0
        while i < 1000
          foo; foo; foo; foo; foo
          i += 1
        end
      end
      ...
      i = 0
      while i < 10000
        invoker1
        i += 1
      end
  • 103.
    [Bar chart: bench1 + indy, bench2 + indy, bench3 + indy; axis 0s to 0.15s]
  • 104.
    Moral
    • Benchmarks are synthetic
    • Every system is different
    • Do your own testing
  • 105.
    bench_red_black
    • Pure-Ruby red/black tree impl
    • Build a 100k tree of rand(999_999)
    • Delete all nodes
    • Build it again
    • Search for elements
    • In-order walks, min, max
  • 106.
    [Bar chart: bench_red_black; Ruby 1.9.3, JRuby - indy, JRuby + indy; axis 0s to 5s]
  • 107.
    bench_fractal
    • Mandelbrot generator
    • Integer loops
    • Floating-point math
    bench_flipflop_fractal
    • Julia generator using flip-flops
    • I don’t really understand it.
  • 109.
    def fractal_flipflop
      w, h = 44, 54
      c = 7 + 42 * w
      a = [0] * w * h
      g = d = 0
      f = proc do |n|
        a[c] += 1
        o = a.map {|z| " :#"[z, 1] * 2 }.join.scan(/.{#{w * 2}}/)
        puts "\f" + o.map {|l| l.rstrip }.join("\n")
        d += 1 - 2 * ((g ^= 1 << n) >> n)
        c += [1, w, -1, -w][d %= 4]
      end
      1024.times do
        !!(!!(!!(!!(!!(!!(!!(!!(!!(true...
          f[0])...f[1])...f[2])...
          f[3])...f[4])...f[5])...
          f[6])...f[7])...f[8])
      end
    end
  • 112.
    [Bar chart: bench_fractal; Ruby 1.9.3, JRuby - indy, JRuby + indy; axis 0s to 1.5s]
  • 113.
    [Bar chart: bench_flipflop_fractal; Ruby 1.9.3, JRuby - indy, JRuby + indy; axis 0s to 1.5s]
  • 115.
    Rails Perf
    • Mixed bag right now...some fast, some slow
    • JVM JIT limits need to be bumped up
    • Significant gains for some folks
    • Long warmup times for so much code
    • Work continues!
  • 117.
    Expand Opto
    • Mixed-arity (ADD SLIDES ABOUT WHAT WE OPTIMIZE TODAY)
    • Super calls
    • Much, much lighter-weight closures
    • Then what?
  • 118.
    Wacky Stuff
    • define_method methods?
    • method_missing call-throughs?
    • respond_to???
    • proc tables?
    • All possible...but worth it?
  • 119.
    The Future
    • JRuby will continue to get faster
    • Indy improvements at VM-level
    • Compiler improvements at Ruby level
    • If you can’t compete with JVM...
    • Still FOSS from top to bottom
    • Don’t be afraid!
  • 120.

Editor's Notes