[#61822] Plan Developers Meeting Japan April 2014 — Zachary Scott <e@...>

I would like to request developers meeting around April 17 or 18 in this mo=

14 messages 2014/04/03
[#61825] Re: Plan Developers Meeting Japan April 2014 — Urabe Shyouhei <shyouhei@...> 2014/04/03

It's good if we have a meeting then.

[#61826] Re: Plan Developers Meeting Japan April 2014 — Zachary Scott <e@...> 2014/04/03

Regarding openssl issues, I=E2=80=99ve discussed possible meeting time with=

[#61833] Re: Plan Developers Meeting Japan April 2014 — Martin Bo煬et <martin.bosslet@...> 2014/04/03

Hi,

[ruby-core:61816] [ruby-trunk - Bug #9694] Bad regexp hangs ruby

From: shyouhei@...
Date: 2014-04-03 02:42:17 UTC
List: ruby-core #61816
Issue #9694 has been updated by Shyouhei Urabe.


Ruby's regexp engine is NP-complete.  It's ultimately impossible to guarantee
regexp matches to run fast (if you don't think so please send us a proof).  It
might be possible to warn your specific bad regexp, but in general it's also
impossible to tell which regexp is bad and which isn't.  That's the Halintg
problem http://en.wikipedia.org/wiki/Halting_problem .

So, in short, there is (at least believed to be) no ultimate solution.  All what
might be possible is to find a reasonable compromise.  For instance python does
not detect that shorter one:

~~~python
zsh % python
Python 2.7.5+ (default, Feb 27 2014, 19:37:08) 
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.compile(r'(\w*)*$')
<_sre.SRE_Pattern object at 0x1dee030>
>>> 
~~~

----------------------------------------
Bug #9694: Bad regexp hangs ruby
https://bugs.ruby-lang.org/issues/9694#change-46048

* Author: Nikolay Markov
* Status: Open
* Priority: Normal
* Assignee: 
* Category: regexp
* Target version: 
* ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-darwin13.0]
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
Here is an extracted problem i ran into recently:
~~~
$ cat test.rb

str = ('a' * ARGV[0].to_i) + '?'
re = /(\w*)*$/
re.match(str)
~~~

On few chars match returns quite fast, but here's what happens on 14 'a'-s and up:
~~~
$ time RBENV_VERSION=2.1.1 ruby test.rb 14

real 1.392	user 1.364	sys 0.026	pcpu 99.83

$ time RBENV_VERSION=2.1.1 ruby test.rb 15

real 3.979	user 3.949	sys 0.026	pcpu 99.89

$ time RBENV_VERSION=2.1.1 ruby test.rb 16

real 11.995	user 11.954	sys 0.031	pcpu 99.92
~~~

Ruby versions 1.9.3 and 2.0 behave similarly. 

I ran into the problem, because one of my colleagues copy-pasted this regexp to test url's somewhere from stackoverflow:
/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-:]*)*\/?$/

I know the regexp is useless, however i think it's still a problem if a bad regexp can hang ruby.
Python (2.7) says that this regexp is bad and does not compile it.
Perl matches without any performance issues





-- 
https://bugs.ruby-lang.org/

In This Thread

Prev Next