Improve Pathname performance #1836

Closed

Watson1978
Contributor

@Watson1978 Watson1978 commented Mar 13, 2018

If the special variables (such as $1, $&, $`) are not used after a match,
replacing Regexp#=~ with Regexp#match? improves performance,
because Regexp#=~ creates objects to populate those special variables whenever it matches.

This patch replaces every Regexp#=~ call that does not rely on the special variables with Regexp#match?.
(It excludes https://github.com/ruby/ruby/blob/trunk/ext/pathname/lib/pathname.rb#L144-L153)
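
The difference described above can be seen in a minimal sketch. The method names `probe_operator` and `probe_predicate` are made up for this illustration and do not appear in pathname.rb; the sketch assumes Ruby 2.4 or later, where `Regexp#match?` is available. Each probe runs in its own method because `$~` is local to the method frame.

```ruby
# Hypothetical probe: runs Regexp#=~ and returns the result together with
# the last-match special variable $~ as seen inside this method frame.
def probe_operator(str)
  result = %r{\A/} =~ str
  [result, $~]
end

# Hypothetical probe: runs Regexp#match? and returns the result together
# with $~, which match? leaves untouched.
def probe_predicate(str)
  result = %r{\A/}.match?(str)
  [result, $~]
end

index, last_match = probe_operator('/path/to/some/file1.rb')
found, untouched  = probe_predicate('/path/to/some/file1.rb')

puts "=~     returned #{index.inspect}, $~ is a #{last_match.class}"
puts "match? returned #{found.inspect}, $~ is #{untouched.inspect}"
```

When the caller only needs a yes/no answer, as in the probes above, the predicate form gives the same decision without filling in `$~`.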

Environment

  • OS : Ubuntu 17.10
  • Compiler : gcc version 7.2.0
  • CPU : Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
  • Memory : 16 GB

TL;DR

Methods                     | before (i/s) | after (i/s) | Speed up
--------------------------- | ------------ | ----------- | --------
Pathname#absolute?          |       142836 |      198487 | 39.0%
Pathname#cleanpath          |        60706 |       79415 | 30.8%
Pathname#root?              |       603806 |      759157 | 25.7%
Pathname#absolute?          |       142592 |      197859 | 38.8%
Pathname#each_filename      |       115600 |      152982 | 32.3%
Pathname#ascend             |        50494 |       63606 | 26.0%
Pathname#+                  |       100550 |      130372 | 29.7%
Pathname#join               |        46673 |       60994 | 30.7%
Pathname#relative_path_from |        28362 |       37494 | 32.2%
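
The "Speed up" column appears to be the relative increase in iterations per second, rounded to one decimal place. A small sketch of that arithmetic follows; `speed_up_percent` is a hypothetical helper written for this note, not something from the patch.

```ruby
# Hypothetical helper: relative throughput gain in percent, one decimal place.
# before and after are iterations per second.
def speed_up_percent(before, after)
  ((after.to_f / before - 1) * 100).round(1)
end

absolute_gain = speed_up_percent(142_836, 198_487) # Pathname#absolute? row
relative_gain = speed_up_percent(28_362, 37_494)   # Pathname#relative_path_from row

puts "Pathname#absolute?          #{absolute_gain}%"
puts "Pathname#relative_path_from #{relative_gain}%"
```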

Before

Calculating -------------------------------------
  Pathname#absolute?    142.836k (± 0.1%) i/s -    722.304k in   5.056884s
  Pathname#cleanpath     60.706k (± 0.1%) i/s -    306.764k in   5.053305s
      Pathname#root?    603.806k (± 0.3%) i/s -      3.062M in   5.071696s
  Pathname#absolute?    142.592k (± 0.1%) i/s -    720.846k in   5.055301s
Pathname#each_filename
                        115.600k (± 0.1%) i/s -    586.818k in   5.076292s
     Pathname#ascend     50.494k (± 0.1%) i/s -    255.301k in   5.056049s
          Pathname#+    100.550k (± 0.1%) i/s -    509.630k in   5.068433s
       Pathname#join     46.673k (± 0.1%) i/s -    236.433k in   5.065696s
Pathname#relative_path_from
                         28.362k (± 0.0%) i/s -    143.728k in   5.067640s

After

Calculating -------------------------------------
  Pathname#absolute?    198.487k (± 0.1%) i/s -    995.665k in   5.016272s
  Pathname#cleanpath     79.415k (± 0.1%) i/s -    404.406k in   5.092344s
      Pathname#root?    759.157k (± 0.0%) i/s -      3.800M in   5.005072s
  Pathname#absolute?    197.859k (± 0.1%) i/s -    995.720k in   5.032494s
Pathname#each_filename
                        152.982k (± 0.1%) i/s -    775.555k in   5.069607s
     Pathname#ascend     63.606k (± 0.0%) i/s -    320.862k in   5.044560s
          Pathname#+    130.372k (± 0.1%) i/s -    660.856k in   5.068991s
       Pathname#join     60.994k (± 0.1%) i/s -    305.068k in   5.001626s
Pathname#relative_path_from
                         37.494k (± 0.4%) i/s -    189.124k in   5.044146s

Benchmark code

require 'pathname'
require 'benchmark/ips'

Benchmark.ips do |x|
  root  = Pathname.new('/')
  path1 = Pathname.new('/path/to/some/file1.rb')
  path2 = Pathname.new('/path/to/some/file2.rb')

  x.report("Pathname#absolute?") do
    path1.absolute?
  end

  x.report("Pathname#cleanpath") do
    Pathname.new('/path/to/some/file.rb').cleanpath
  end

  x.report("Pathname#root?") do
    path1.root?
  end

  x.report("Pathname#absolute?") do
    path1.absolute?
  end

  x.report("Pathname#each_filename") do
    path1.each_filename { |file| }
  end

  x.report("Pathname#ascend") do
    path1.ascend { |path| }
  end

  x.report("Pathname#+") do
    path1 + path2
  end

  x.report("Pathname#join") do
    path1.join("../file3.rb")
  end

  x.report("Pathname#relative_path_from") do
    path1.relative_path_from(root)
  end
end
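
For a quick check without the benchmark-ips gem, a rough timing sketch along these lines could be used to compare the two calls directly. `time_iterations` is a hypothetical helper and the iteration count is arbitrary; the numbers it prints depend on the machine and Ruby version and are not comparable with the i/s figures above.

```ruby
PATTERN = %r{\A/}

# Hypothetical helper: runs the given block n times and returns the
# elapsed wall-clock time in seconds, measured with a monotonic clock.
def time_iterations(n)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  n.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

path = '/path/to/some/file1.rb'
iterations = 200_000 # arbitrary; raise it for steadier numbers

operator_seconds  = time_iterations(iterations) { PATTERN =~ path }
predicate_seconds = time_iterations(iterations) { PATTERN.match?(path) }

puts format('Regexp#=~      %.4fs', operator_seconds)
puts format('Regexp#match?  %.4fs', predicate_seconds)
```

A single run like this is noisy; benchmark-ips, as used in the benchmark above, adds warm-up and reports variance, so it remains the better tool for real measurements.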

https://bugs.ruby-lang.org/issues/14599

@nurse
Member

nurse commented Mar 13, 2018

You can commit a patch like this yourself when it improves performance without increasing complexity.

@Watson1978
Contributor Author

Thanks :)

@matzbot matzbot closed this in ccc2576 Mar 13, 2018
Watson1978 added a commit to Watson1978/ruby that referenced this pull request Mar 17, 2018
If the special variables (such as $1, $&, $`) are not used after a match,
replacing Regexp#=~ or String#=~ with Regexp#match? or String#match? improves performance.

This patch applies the same idea as ruby#1836.
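
As a rough illustration of where the replacement applies and where it does not, consider the sketch below. Both helpers are hypothetical and are not taken from the csv library; they only show the `String` receiver forms, assuming Ruby 2.4 or later.

```ruby
# Hypothetical helper: only asks whether the field is wrapped in double
# quotes, so String#match? is enough and no special variable is needed.
def quoted?(field)
  field.match?(/\A".*"\z/)
end

# Hypothetical helper: needs the captured text, so it keeps String#=~
# and reads the capture from $1.
def unquote(field)
  field =~ /\A"(.*)"\z/ ? $1 : field
end

puts quoted?('"Alice"')  # a quoted field
puts quoted?('84.0')     # an unquoted field
puts unquote('"Alice"')  # quotes stripped via the capture group
```

Calls like the one in `unquote`, which read `$1` afterwards, are the kind that cannot be switched to `match?`.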

## Environment
* OS : Ubuntu 17.10
* Compiler : gcc version 7.2.0
* CPU : Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
* Memory : 16 GB

## TL;DR
Methods     | Before | After  | Speed up
----------- | ------ | ------ | --------
CSV.foreach | 44.825 | 48.201 | 7.5%
CSV#shift   | 45.200 | 49.584 | 9.7%
CSV.read    | 42.968 | 46.853 | 9.0%
CSV.table   | 10.933 | 11.277 | 3.1%

## Before
```
Calculating -------------------------------------
         CSV.foreach     44.825  (± 0.0%) i/s -    228.000  in   5.086576s
           CSV#shift     45.200  (± 0.0%) i/s -    228.000  in   5.044297s
            CSV.read     42.968  (± 0.0%) i/s -    216.000  in   5.027504s
           CSV.table     10.933  (± 0.0%) i/s -     55.000  in   5.031098s
```

## After
```
Calculating -------------------------------------
         CSV.foreach     48.201  (± 0.0%) i/s -    244.000  in   5.062256s
           CSV#shift     49.584  (± 0.0%) i/s -    248.000  in   5.001652s
            CSV.read     46.853  (± 0.0%) i/s -    236.000  in   5.037044s
           CSV.table     11.277  (± 0.0%) i/s -     57.000  in   5.054694s
```

## Benchmark code
```ruby
require 'csv'
require 'benchmark/ips'

CSV.open("/tmp/file.csv", "w") do |csv|
  csv << ["player", "gameA", "gameB"]
  1000.times do
    csv << ['"Alice"', "84.0", "79.5"]
    csv << ['"Bob"', "20.0", "56.5"]
  end
end

Benchmark.ips do |x|
  x.report "CSV.foreach" do
    CSV.foreach("/tmp/file.csv") do |row|
    end
  end

  x.report "CSV#shift" do
    CSV.open("/tmp/file.csv") do |csv|
      while line = csv.shift
      end
    end
  end

  x.report "CSV.read" do
    CSV.read("/tmp/file.csv")
  end

  x.report "CSV.table" do
    CSV.table("/tmp/file.csv")
  end
end
```
matzbot pushed a commit that referenced this pull request Mar 18, 2018
If the special variables (such as $1, $&, $`) are not used after a match,
replacing Regexp#=~ or String#=~ with Regexp#match? or String#match? improves performance.

This patch applies the same idea as #1836.

[Fix GH-1842]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62806 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
@hsbt hsbt added the Backport label Sep 6, 2024