diff options
author | schneems <[email protected]> | 2021-11-07 13:57:24 -0600 |
---|---|---|
committer | Yusuke Endoh <[email protected]> | 2021-12-02 15:55:42 +0900 |
commit | 3f74eaa7a83e42b31c219a534ec5330e511d2921 (patch) | |
tree | 1d1617acb0e7483fa5bc5d2a68a33dcb634cfe28 | |
parent | 6721ce1cc4035fe5508c13d0151f748e8b8e8cf1 (diff) |
~1.10x faster Change Ripper.lex structs to classes
## Concept
I am proposing we replace the Struct implementation of data structures inside of ripper with real classes.
This will improve performance and the implementation is not meaningfully more complicated.
## Example
Struct versus class comparison:
```ruby
Elem = Struct.new(:pos, :event, :tok, :state, :message) do
def initialize(pos, event, tok, state, message = nil)
super(pos, event, tok, State.new(state), message)
end
# ...
def to_a
a = super
a.pop unless a.empty?
a
end
end
class ElemClass
attr_accessor :pos, :event, :tok, :state, :message
def initialize(pos, event, tok, state, message = nil)
@pos = pos
@event = event
@tok = tok
@state = State.new(state)
@message = message
end
def to_a
if @message
[@pos, @event, @tok, @state, @message]
else
[@pos, @event, @tok, @state]
end
end
end
# stub state class creation for now
class State; def initialize(val); end; end
```
## MicroBenchmark creation
```ruby
require 'benchmark/ips'
require 'ripper'
pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG
Benchmark.ips do |x|
x.report("struct") { Elem.new(pos, event, tok, state) }
x.report("class ") { ElemClass.new(pos, event, tok, state) }
x.compare!
end; nil
```
Gives ~1.2x faster creation:
```
Warming up --------------------------------------
struct 263.983k i/100ms
class 303.367k i/100ms
Calculating -------------------------------------
struct 2.638M (± 5.9%) i/s - 13.199M in 5.023460s
class 3.171M (± 4.6%) i/s - 16.078M in 5.082369s
Comparison:
class : 3170690.2 i/s
struct: 2638493.5 i/s - 1.20x (± 0.00) slower
```
## MicroBenchmark `to_a` (Called by Ripper.lex for every element)
```ruby
require 'benchmark/ips'
require 'ripper'
pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG
struct = Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)
Benchmark.ips do |x|
x.report("struct") { struct.to_a }
x.report("class ") { from_class.to_a }
x.compare!
end; nil
```
Gives 1.46x faster `to_a`:
```
Warming up --------------------------------------
struct 612.094k i/100ms
class 893.233k i/100ms
Calculating -------------------------------------
struct 6.121M (± 5.4%) i/s - 30.605M in 5.015851s
class 8.931M (± 7.9%) i/s - 44.662M in 5.039733s
Comparison:
class : 8930619.0 i/s
struct: 6121358.9 i/s - 1.46x (± 0.00) slower
```
## MicroBenchmark data access
```ruby
require 'benchmark/ips'
require 'ripper'
pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG
struct = Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)
Benchmark.ips do |x|
x.report("struct") { struct.pos[1] }
x.report("class ") { from_class.pos[1] }
x.compare!
end; nil
```
Gives ~1.17x faster data access:
```
Warming up --------------------------------------
struct 1.694M i/100ms
class 1.868M i/100ms
Calculating -------------------------------------
struct 16.149M (± 6.8%) i/s - 81.318M in 5.060633s
class 18.886M (± 2.9%) i/s - 95.262M in 5.048359s
Comparison:
class : 18885669.6 i/s
struct: 16149255.8 i/s - 1.17x (± 0.00) slower
```
## Full benchmark integration of this inside of Ripper.lex
Inside of this repo with this commit
```
$ cd ext/ripper
$ make
$ cat test.rb
file = File.join(__dir__, "../../array.rb")
source = File.read(file)
bench = Benchmark.measure do
10_000.times.each do
Ripper.lex(source)
end
end
puts bench
```
Then execute with and without this change 50 times:
```
rm new.txt
rm old.txt
for i in {0..50}
do
`ruby -Ilib -rripper -rbenchmark ./test.rb >> new.txt`
`ruby -rripper -rbenchmark ./test.rb >> old.txt`
done
```
I used derailed benchmarks internals to compare the results:
```
dir = Pathname(".")
branch_info = {}
branch_info["old"] = { desc: "Struct lex", time: Time.now, file: dir.join("old.txt"), name: "old" }
branch_info["new"] = { desc: "Class lex", time: Time.now, file: dir.join("new.txt"), name: "new" }
stats = DerailedBenchmarks::StatsFromDir.new(branch_info)
stats.call.banner
```
Which gave us:
```
❤️ ❤️ ❤️ (Statistically Significant) ❤️ ❤️ ❤️
[new] (3.3139 seconds) "Class lex" ref: "new"
FASTER 🚀🚀🚀 by:
1.1046x [older/newer]
9.4700% [(older - newer) / older * 100]
[old] (3.6606 seconds) "Struct lex" ref: "old"
Iterations per sample:
Samples: 51
Test type: Kolmogorov Smirnov
Confidence level: 99.0 %
Is significant? (max > critical): true
D critical: 0.30049534876137013
D max: 0.9607843137254902
Histograms (time ranges are in seconds):
[new] description: [old] description:
"Class lex" "Struct lex"
┌ ┐ ┌ ┐
[3.0, 3.3) ┤▇ 1 [3.0, 3.3) ┤ 0
[3.3, 3.6) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 47 [3.3, 3.6) ┤ 0
[3.5, 3.8) ┤▇▇ 2 [3.5, 3.8) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 46
[3.8, 4.1) ┤▇ 1 [3.8, 4.1) ┤▇▇▇ 4
[4.0, 4.3) ┤ 0 [4.0, 4.3) ┤ 0
[4.3, 4.6) ┤ 0 [4.3, 4.6) ┤▇ 1
└ ┘ └ ┘
# of runs in range # of runs in range
```
To sum this up, the "new" version of this code (using real classes instead of structs) is 10% faster across 50 runs with a statistical significance confidence level of 99%. Histograms are for visual checksum.
Notes
Notes:
Merged: https://github.com/ruby/ruby/pull/5093
-rw-r--r-- | ext/ripper/lib/ripper/lexer.rb | 30 |
1 files changed, 22 insertions, 8 deletions
diff --git a/ext/ripper/lib/ripper/lexer.rb b/ext/ripper/lib/ripper/lexer.rb index f6051c6341..92ee4ef7df 100644 --- a/ext/ripper/lib/ripper/lexer.rb +++ b/ext/ripper/lib/ripper/lexer.rb @@ -53,10 +53,16 @@ class Ripper end class Lexer < ::Ripper #:nodoc: internal use only - State = Struct.new(:to_int, :to_s) do + class State + attr_reader :to_int, :to_s + + def initialize(i) + @to_int = i + @to_s = Ripper.lex_state_name(i) + freeze + end + alias to_i to_int - def initialize(i) super(i, Ripper.lex_state_name(i)).freeze end - # def inspect; "#<#{self.class}: #{self}>" end alias inspect to_s def pretty_print(q) q.text(to_s) end def ==(i) super or to_int == i end @@ -67,9 +73,15 @@ class Ripper def nobits?(i) to_int.nobits?(i) end end - Elem = Struct.new(:pos, :event, :tok, :state, :message) do + class Elem + attr_accessor :pos, :event, :tok, :state, :message + def initialize(pos, event, tok, state, message = nil) - super(pos, event, tok, State.new(state), message) + @pos = pos + @event = event + @tok = tok + @state = State.new(state) + @message = message end def inspect @@ -94,9 +106,11 @@ class Ripper end def to_a - a = super - a.pop unless a.last - a + if @message + [@pos, @event, @tok, @state, @message] + else + [@pos, @event, @tok, @state] + end end end |