path: root/ext/json/parser
Age | Commit message | Author
2025-03-28[ruby/json] Move `create_additions` logic in Ruby.Jean Boussier
By leveraging the `on_load` callback we can move all this logic out of the parser, which means we no longer have to duplicate that logic in both parsers and that we'll later be able to extract it entirely from the gem. https://github.com/ruby/json/commit/f411ddf1ce Notes: Merged: https://github.com/ruby/ruby/pull/13004
2025-03-28[ruby/json] JSON.load invoke the proc callback directly from the parser.Jean Boussier
And substitute the return value like `Marshal.load` does, which I can only assume was the intent. This also opens the door to re-implementing all the `create_additions` logic in `json/common.rb`. https://github.com/ruby/json/commit/73d2137fd3 Notes: Merged: https://github.com/ruby/ruby/pull/13004
2025-03-28[ruby/json] Remove `Class#json_creatable?` monkey patch.Jean Boussier
https://github.com/ruby/json/commit/1ca7efed1f Notes: Merged: https://github.com/ruby/ruby/pull/13004
2025-03-13[ruby/json] Fix potential out of bound read in `json_string_unescape`.Jean Boussier
https://github.com/ruby/json/commit/cf242d89a0
2025-03-12[ruby/json] Raise a ParserError on all incomplete unicode escape sequences.Jean Boussier
This was the behavior until `2.10.0` inadvertently changed it. `"\u1"` would raise, but `"\u1zzz"` wouldn't. https://github.com/ruby/json/commit/7d0637b9e6
2025-02-27[ruby/json] Ensure parser error snippets are valid UTF-8Jean Boussier
Fix: https://github.com/ruby/json/issues/755 Error messages now include a snippet of the document that doesn't parse to help locate the issue. However, the way it was done wasn't UTF-8 aware, and it could result in exception messages with truncated characters. It would be nice to go a bit further and actually support codepoints, but that's a lot of complexity to do in C; perhaps we could move that logic to Ruby, given it's not a performance-sensitive codepath. https://github.com/ruby/json/commit/e144793b72
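To illustrate the kind of UTF-8 aware truncation described above, here is a minimal sketch. It is not taken from the commit: the helper name `utf8_safe_prefix_len` is hypothetical, and it assumes the input is valid UTF-8. It shortens a byte limit so that the kept prefix never ends in the middle of a multi-byte character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the json sources: returns how many bytes of
 * `s` (at most `max_bytes`) can be kept without cutting a UTF-8 character
 * in half. Assumes `s` holds valid UTF-8. */
static size_t utf8_safe_prefix_len(const char *s, size_t len, size_t max_bytes)
{
    assert(s != NULL);
    if (len <= max_bytes) return len;

    size_t n = max_bytes;
    /* s[n] is the first byte that would be dropped. If it is a continuation
     * byte (10xxxxxx), the cut falls inside a character, so back up until
     * the cut sits right before that character's lead byte. */
    while (n > 0 && ((unsigned char)s[n] & 0xC0) == 0x80) {
        n--;
    }
    return n;
}
```

With the three-byte input `a` followed by a two-byte `é`, a limit of 2 bytes is reduced to 1 so that the `é` is dropped whole rather than split.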
2025-01-30[ruby/json] Avoid plain char for ctype macrosNobuyoshi Nakada
On some platforms ctype functions are defined as macros accessing tables. A plain char may be `signed` or `unsigned` depending on the implementation, which makes the extension's result implementation dependent. gcc warns about such cases:
```
parser.c: In function 'rstring_cache_fetch':
parser.c:138:33: warning: array subscript has type 'char' [-Wchar-subscripts]
  138 |     if (RB_UNLIKELY(!isalpha(str[0]))) {
      |                              ~~~^~~
parser.c: In function 'rsymbol_cache_fetch':
parser.c:190:33: warning: array subscript has type 'char' [-Wchar-subscripts]
  190 |     if (RB_UNLIKELY(!isalpha(str[0]))) {
      |                              ~~~^~~
```
https://github.com/ruby/json/commit/4431b362f6
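A common way to avoid this class of warning is to convert the byte to `unsigned char` before handing it to a ctype function. The sketch below shows that general idiom only; whether the commit uses a cast or a different mechanism is not stated here, and `starts_with_alpha` is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when the first byte of `str` is a letter
 * according to the current C locale, 0 otherwise. */
static int starts_with_alpha(const char *str, size_t len)
{
    assert(str != NULL);
    if (len == 0) return 0;
    /* The cast guarantees a value in 0..UCHAR_MAX, which is what the ctype
     * functions require. A negative plain char would otherwise be used as a
     * table index on platforms where isalpha is a macro. */
    return isalpha((unsigned char)str[0]) != 0;
}
```

The cast matters for bytes at or above 0x80, which are negative on platforms where plain `char` is signed.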
2025-01-20[ruby/json] Reject invalid number: `-` `-.1` `-e0`tompng
https://github.com/ruby/json/commit/b9bfeecfa9 Notes: Merged: https://github.com/ruby/ruby/pull/12602
2025-01-20[ruby/json] Raise parse error on invalid commentstompng
https://github.com/ruby/json/commit/2f57f40467 Notes: Merged: https://github.com/ruby/ruby/pull/12602
2025-01-20[ruby/json] Fix parsing incomplete unicode escape "\uaaa"tompng
https://github.com/ruby/json/commit/86c0d4eb7e Notes: Merged: https://github.com/ruby/ruby/pull/12602
2025-01-20ext/json no longer uses ragelNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/12599
2025-01-20[ruby/json] Fix a regression in the parser with leading /Jean Boussier
Ref: https://github.com/ruby/ruby/pull/12598 This could lead to an infinite loop. https://github.com/ruby/json/commit/f8cfa2696a Notes: Merged: https://github.com/ruby/ruby/pull/12600
2025-01-20Removed parser.rl from ext/json/parser/dependHiroshi SHIBATA
Notes: Merged: https://github.com/ruby/ruby/pull/12598
2025-01-20[ruby/json] json_string_unescape: use memchr to search for backslashesJean Boussier
https://github.com/ruby/json/commit/5e6cfcf724 Notes: Merged: https://github.com/ruby/ruby/pull/12598
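As a rough illustration of the `memchr` approach, the sketch below copies whole runs of ordinary bytes between backslashes instead of inspecting one byte at a time. It is an assumption-laden simplification, not the code from `json_string_unescape`: the function name `unescape_simple` is hypothetical, and `\uXXXX` escapes are deliberately rejected to keep the example short.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified unescaper. Writes the result to `dst`, which must
 * have room for at least `len` bytes. Returns the number of bytes written,
 * or -1 on a dangling or unsupported escape (including \uXXXX). */
static long unescape_simple(const char *src, size_t len, char *dst)
{
    assert(src != NULL && dst != NULL);
    const char *p = src;
    const char *end = src + len;
    char *out = dst;
    const char *bs;

    /* Jump from backslash to backslash, copying the plain bytes in between
     * with a single memcpy per run. */
    while (p < end && (bs = memchr(p, '\\', (size_t)(end - p))) != NULL) {
        size_t chunk = (size_t)(bs - p);
        memcpy(out, p, chunk);
        out += chunk;

        /* A backslash as the very last byte has nothing to escape; checking
         * here avoids reading past the end of the input. */
        if (bs + 1 >= end) return -1;

        switch (bs[1]) {
          case '"':  *out++ = '"';  break;
          case '\\': *out++ = '\\'; break;
          case '/':  *out++ = '/';  break;
          case 'b':  *out++ = '\b'; break;
          case 'f':  *out++ = '\f'; break;
          case 'n':  *out++ = '\n'; break;
          case 'r':  *out++ = '\r'; break;
          case 't':  *out++ = '\t'; break;
          default:   return -1;
        }
        p = bs + 2;
    }

    /* Copy whatever follows the last escape (or the whole input if there
     * was no backslash at all). */
    memcpy(out, p, (size_t)(end - p));
    out += end - p;
    return (long)(out - dst);
}
```

When the input contains no backslash, the loop body never runs and the function degenerates to one `memchr` plus one `memcpy`.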
2025-01-20[ruby/json] Cleanup json_decode_floatJean Boussier
Move all the decimal_class option parsing into the constructor. https://github.com/ruby/json/commit/e9adefdc38 Notes: Merged: https://github.com/ruby/ruby/pull/12598
2025-01-20[ruby/json] parser.c: Pass the JSON_ParserConfig pointerJean Boussier
Doesn't make a measurable performance difference but is a bit clearer. https://github.com/ruby/json/commit/314d117c61 Notes: Merged: https://github.com/ruby/ruby/pull/12598
2025-01-20[ruby/json] Use RSTRING_ENDJean Boussier
https://github.com/ruby/json/commit/dd9c46c805 Notes: Merged: https://github.com/ruby/ruby/pull/12598
2025-01-20[ruby/json] Replace fbuffer by stack buffers or RB_ALLOCV in parser.cJean Boussier
We only use that buffer for parsing integers and floats. These are unlikely to be very big, and if they are we can just use RB_ALLOCV, as it will almost always end up as a small `alloca`. This removes the need for `rb_protect` around the parser. https://github.com/ruby/json/commit/994859916a Notes: Merged: https://github.com/ruby/ruby/pull/12598
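The general pattern can be sketched in plain C without the Ruby API. This is only an analogy: `malloc`/`free` stand in for what `RB_ALLOCV` does inside the extension, and both `parse_double_slice` and `STACK_BUF_SIZE` are hypothetical names with an arbitrary size. The idea shown is copying a number that is not NUL-terminated into a small stack buffer, and only touching the heap for unusually long input.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STACK_BUF_SIZE 64  /* arbitrary size chosen for this sketch */

/* Hypothetical helper: converts the `len` bytes at `start` to a double.
 * The slice does not need to be NUL-terminated. Returns 0.0 if a heap
 * allocation was needed and failed. */
static double parse_double_slice(const char *start, size_t len)
{
    assert(start != NULL);
    char stack_buf[STACK_BUF_SIZE];
    char *buf = stack_buf;

    /* Typical numbers fit on the stack; only very long ones hit malloc. */
    if (len >= STACK_BUF_SIZE) {
        buf = malloc(len + 1);
        if (buf == NULL) return 0.0;
    }

    memcpy(buf, start, len);
    buf[len] = '\0';  /* strtod needs a terminated string */
    double value = strtod(buf, NULL);

    if (buf != stack_buf) free(buf);
    return value;
}
```

Because the common case never allocates, there is nothing to release if an error interrupts parsing, which is the property the commit message relies on.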
2025-01-20[ruby/json] Implement write barriers for ParserConfig objectsJean Boussier
https://github.com/ruby/json/commit/591056a526 Notes: Merged: https://github.com/ruby/ruby/pull/12598
2025-01-20Finalize Kevin's handrolled parser.Jean Boussier
And get rid of the Ragel parser. This is 7% faster on activitypub, 15% faster on twitter and 11% faster on citm_catalog. There might be some more optimization opportunities; I did a quick optimization pass to fix a regression in string parsing, but other than that I haven't dug much into performance. Notes: Merged: https://github.com/ruby/ruby/pull/12598
2025-01-14[ruby/json] Refactor JSON::Ext::Parser to split configuration and parsing stateJean Boussier
Ref: https://github.com/ruby/json/pull/718 The existing `Parser` interface is pretty bad, as it forces callers to instantiate a new instance for each document. Instead it's preferable to only take the config and do all the initialization needed, and then keep the parsing state on the stack or in ephemeral memory. This refactor makes the `JSON::Coder` pull request much easier to implement in a performant way. https://github.com/ruby/json/commit/c8d5236a92 Co-Authored-By: Étienne Barrié <[email protected]>
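The split between long-lived configuration and per-document state can be pictured with a toy example. Everything below is hypothetical (the struct layouts, field names and the `check_nesting` function are invented for illustration and do not mirror the real `JSON_ParserConfig`), and the scanner ignores brackets inside strings to stay short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical config: built once, read-only while parsing, reusable for
 * any number of documents. */
typedef struct {
    int max_nesting;
} ParserConfig;

/* Hypothetical per-document state: lives on the stack of a single call. */
typedef struct {
    const char *cursor;
    const char *end;
    int depth;
} ParserState;

/* Toy check: returns 0 if brackets are balanced and never nested deeper
 * than config->max_nesting, -1 otherwise. Brackets inside string literals
 * are not handled in this sketch. */
static int check_nesting(const ParserConfig *config, const char *doc, size_t len)
{
    assert(config != NULL && doc != NULL);
    ParserState state = { doc, doc + len, 0 };

    for (; state.cursor < state.end; state.cursor++) {
        char c = *state.cursor;
        if (c == '[' || c == '{') {
            if (++state.depth > config->max_nesting) return -1;
        } else if (c == ']' || c == '}') {
            if (--state.depth < 0) return -1;
        }
    }
    return state.depth == 0 ? 0 : -1;
}
```

Since the config is only read, the same instance can serve many calls, while each call gets a fresh state that disappears when the function returns.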
2024-12-19[ruby/json] Add support for Solaris 10 which lacks strnlen()Naohisa Goto
Check for existence of strnlen() and use alternative code if it is missing. https://github.com/ruby/json/commit/48d4bbc3a0 Notes: Merged: https://github.com/ruby/ruby/pull/12394
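One portable way to write such an alternative is with `memchr`, which is part of standard C. The sketch below is an assumption about what a fallback could look like, not the code from the commit, and `fallback_strnlen` is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical replacement for strnlen(): returns the length of `s`, but
 * never looks at more than `maxlen` bytes. */
static size_t fallback_strnlen(const char *s, size_t maxlen)
{
    assert(s != NULL);
    /* memchr stops at the first NUL it finds within the first maxlen bytes. */
    const char *nul = memchr(s, '\0', maxlen);
    return nul != NULL ? (size_t)(nul - s) : maxlen;
}
```

In a build that detects `strnlen()` at configure time, a function like this would only be compiled when the platform lacks the real one.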
2024-11-26[ruby/json] Stop using `rb_gc_mark_locations`Jean Boussier
It's using `rb_gc_mark_maybe` under the hood, which isn't what we need. https://github.com/ruby/json/commit/e10d0bffcd
2024-11-14[ruby/json] Only use the key cache if the Hash is in an ArrayJean Boussier
Otherwise the likeliness of seeing that key again is really low, and looking up the cache is just a waste. Before: ``` == Parsing small hash (65 bytes) ruby 3.4.0dev (2024-11-13T12:32:57Z fstr-update-callba.. https://github.com/ruby/json/commit/9b44b455b3) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 343.049k i/100ms oj 213.943k i/100ms Oj::Parser 31.583k i/100ms rapidjson 303.433k i/100ms Calculating ------------------------------------- json 3.704M (± 1.5%) i/s (270.01 ns/i) - 18.525M in 5.003078s oj 2.200M (± 1.1%) i/s (454.46 ns/i) - 11.125M in 5.056526s Oj::Parser 285.369k (± 4.8%) i/s (3.50 μs/i) - 1.453M in 5.103866s rapidjson 3.216M (± 1.6%) i/s (310.95 ns/i) - 16.082M in 5.001973s Comparison: json: 3703517.4 i/s rapidjson: 3215983.0 i/s - 1.15x slower oj: 2200417.1 i/s - 1.68x slower Oj::Parser: 285369.1 i/s - 12.98x slower == Parsing test from oj (258 bytes) ruby 3.4.0dev (2024-11-13T12:32:57Z fstr-update-callba.. https://github.com/ruby/json/commit/9b44b455b3) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 54.539k i/100ms oj 41.473k i/100ms Oj::Parser 24.064k i/100ms rapidjson 51.466k i/100ms Calculating ------------------------------------- json 549.386k (± 1.6%) i/s (1.82 μs/i) - 2.781M in 5.064316s oj 417.003k (± 1.3%) i/s (2.40 μs/i) - 2.115M in 5.073047s Oj::Parser 226.500k (± 4.7%) i/s (4.42 μs/i) - 1.131M in 5.005466s rapidjson 526.124k (± 1.0%) i/s (1.90 μs/i) - 2.676M in 5.087176s Comparison: json: 549385.6 i/s rapidjson: 526124.3 i/s - 1.04x slower oj: 417003.4 i/s - 1.32x slower Oj::Parser: 226500.4 i/s - 2.43x slower ``` After: ``` == Parsing small hash (65 bytes) ruby 3.4.0dev (2024-11-13T12:32:57Z fstr-update-callba.. 
https://github.com/ruby/json/commit/9b44b455b3) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 361.394k i/100ms oj 217.203k i/100ms Oj::Parser 28.855k i/100ms rapidjson 303.404k i/100ms Calculating ------------------------------------- json 3.859M (± 2.9%) i/s (259.13 ns/i) - 19.515M in 5.061302s oj 2.191M (± 1.6%) i/s (456.49 ns/i) - 11.077M in 5.058043s Oj::Parser 315.132k (± 7.1%) i/s (3.17 μs/i) - 1.587M in 5.065707s rapidjson 3.156M (± 4.0%) i/s (316.88 ns/i) - 15.777M in 5.008949s Comparison: json: 3859046.5 i/s rapidjson: 3155778.5 i/s - 1.22x slower oj: 2190616.0 i/s - 1.76x slower Oj::Parser: 315132.4 i/s - 12.25x slower == Parsing test from oj (258 bytes) ruby 3.4.0dev (2024-11-13T12:32:57Z fstr-update-callba.. https://github.com/ruby/json/commit/9b44b455b3) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 55.682k i/100ms oj 40.343k i/100ms Oj::Parser 25.119k i/100ms rapidjson 51.500k i/100ms Calculating ------------------------------------- json 555.808k (± 1.4%) i/s (1.80 μs/i) - 2.784M in 5.010092s oj 412.283k (± 1.7%) i/s (2.43 μs/i) - 2.098M in 5.089900s Oj::Parser 279.306k (±13.3%) i/s (3.58 μs/i) - 1.356M in 5.022079s rapidjson 517.177k (± 2.7%) i/s (1.93 μs/i) - 2.626M in 5.082352s Comparison: json: 555808.3 i/s rapidjson: 517177.1 i/s - 1.07x slower oj: 412283.2 i/s - 1.35x slower Oj::Parser: 279306.5 i/s - 1.99x slower ``` https://github.com/ruby/json/commit/00c45ddc9f
2024-11-11[ruby/json] Rename parse_float into parse_numberJean Boussier
https://github.com/ruby/json/commit/e51e796697
2024-11-11[ruby/json] Reduce comparisons when parsing numbersAaron Patterson
Before this commit, we would try to scan for a float, then if that failed, scan for an integer. But floats and integers have many bytes in common, so we would end up scanning the same bytes multiple times. This patch combines integer and float scanning machines so that we only have to scan bytes once. If the machine finds "float parts", then it executes the "isFloat" transition in the machine, which sets a boolean letting us know that the parser found a float. If we didn't find a float, but we did match, then we know it's an int. https://github.com/ruby/json/commit/0c0e0930cd
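The single-pass idea can be sketched by hand, outside of Ragel. The code below is not the generated state machine from the commit: `scan_number` and `scan_number_cstr` are hypothetical names, and the grammar followed is the JSON number grammar (optional minus, integer part, optional fraction, optional exponent). Each byte is visited once, and a flag records whether any "float part" was seen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool is_digit(char c) { return c >= '0' && c <= '9'; }

/* Hypothetical scanner: returns the length of the JSON number starting at
 * `p`, or 0 if there is none. On success, *is_float tells whether a
 * fraction or exponent was seen; on failure it is left untouched. */
static size_t scan_number(const char *p, const char *end, bool *is_float)
{
    assert(p != NULL && end != NULL && is_float != NULL);
    const char *start = p;
    bool saw_float_part = false;

    if (p < end && *p == '-') p++;
    if (p >= end) return 0;

    /* Integer part: a lone 0, or a non-zero digit followed by digits. */
    if (*p == '0') {
        p++;
    } else if (*p >= '1' && *p <= '9') {
        while (p < end && is_digit(*p)) p++;
    } else {
        return 0;
    }

    /* Optional fraction: '.' must be followed by at least one digit. */
    if (p < end && *p == '.') {
        p++;
        if (p >= end || !is_digit(*p)) return 0;
        while (p < end && is_digit(*p)) p++;
        saw_float_part = true;
    }

    /* Optional exponent: e/E, optional sign, at least one digit. */
    if (p < end && (*p == 'e' || *p == 'E')) {
        p++;
        if (p < end && (*p == '+' || *p == '-')) p++;
        if (p >= end || !is_digit(*p)) return 0;
        while (p < end && is_digit(*p)) p++;
        saw_float_part = true;
    }

    *is_float = saw_float_part;
    return (size_t)(p - start);
}

/* Convenience wrapper for NUL-terminated input. */
static size_t scan_number_cstr(const char *s, bool *is_float)
{
    return scan_number(s, s + strlen(s), is_float);
}
```

A match with the flag unset is an integer, and a match with the flag set is a float, so no byte needs to be scanned twice.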
2024-11-06[ruby/json] Implement a fast path for integer parsingJean Boussier
`rb_cstr2inum` isn't very fast because it handles tons of different scenarios, and it also requires a NUL-terminated string, which forces us to copy the number into a secondary buffer. But since the parser already computed the length, we can do this much more cheaply with a very simple function, as long as the number is small enough to fit into a native type (`long long`). If the number is too long, we can fall back to the `rb_cstr2inum` slow path.

Before:
```
== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. https://github.com/ruby/json/commit/7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
json 40.000 i/100ms
oj 35.000 i/100ms
Oj::Parser 45.000 i/100ms
rapidjson 38.000 i/100ms
Calculating -------------------------------------
json 425.941 (± 1.9%) i/s (2.35 ms/i) - 2.160k in 5.072833s
oj 349.617 (± 1.7%) i/s (2.86 ms/i) - 1.750k in 5.006953s
Oj::Parser 464.767 (± 1.7%) i/s (2.15 ms/i) - 2.340k in 5.036381s
rapidjson 382.413 (± 2.4%) i/s (2.61 ms/i) - 1.938k in 5.070757s
Comparison:
json: 425.9 i/s
Oj::Parser: 464.8 i/s - 1.09x faster
rapidjson: 382.4 i/s - 1.11x slower
oj: 349.6 i/s - 1.22x slower
```
After:
```
== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. https://github.com/ruby/json/commit/7943f98a8a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
json 46.000 i/100ms
oj 33.000 i/100ms
Oj::Parser 45.000 i/100ms
rapidjson 39.000 i/100ms
Calculating -------------------------------------
json 462.332 (± 3.2%) i/s (2.16 ms/i) - 2.346k in 5.080504s
oj 351.140 (± 1.1%) i/s (2.85 ms/i) - 1.782k in 5.075616s
Oj::Parser 473.500 (± 1.3%) i/s (2.11 ms/i) - 2.385k in 5.037695s
rapidjson 395.052 (± 3.5%) i/s (2.53 ms/i) - 1.989k in 5.042275s
Comparison:
json: 462.3 i/s
Oj::Parser: 473.5 i/s - same-ish: difference falls within error
rapidjson: 395.1 i/s - 1.17x slower
oj: 351.1 i/s - 1.32x slower
```
https://github.com/ruby/json/commit/3a4dc9e1b4
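A fast path of this shape might look like the following sketch. It is written under assumptions and is not the function from the commit: `fast_parse_integer` and `MAX_FAST_DIGITS` are hypothetical names, and the 18-digit limit is simply a conservative bound under which a `long long` cannot overflow. The caller is expected to use a general-purpose routine whenever this returns `false`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Any number with at most 18 decimal digits fits in a 64-bit long long. */
#define MAX_FAST_DIGITS 18

/* Hypothetical fast path: parses the `len` bytes at `p` as a decimal
 * integer with an optional leading '-'. Returns false when the input is
 * empty, contains a non-digit, or is too long for the fast path, in which
 * case the caller should fall back to a slower, general routine. */
static bool fast_parse_integer(const char *p, size_t len, long long *out)
{
    assert(out != NULL && (p != NULL || len == 0));
    bool negative = false;
    size_t i = 0;

    if (len > 0 && p[0] == '-') {
        negative = true;
        i = 1;
    }

    size_t digits = len - i;
    if (digits == 0 || digits > MAX_FAST_DIGITS) return false;

    long long acc = 0;
    for (; i < len; i++) {
        if (p[i] < '0' || p[i] > '9') return false;
        acc = acc * 10 + (p[i] - '0');
    }

    *out = negative ? -acc : acc;
    return true;
}
```

Because the length is passed explicitly, no NUL terminator and therefore no intermediate copy is needed.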
2024-11-06[ruby/json] parser.rl: parse_string implement a fast pathJean Boussier
If we assume most string don't contain any escape sequence we can avoid a lot of costly operations when it holds true. Before: ``` == Parsing activitypub.json (58160 bytes) ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. https://github.com/ruby/json/commit/7943f98a8a) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 884.000 i/100ms oj 789.000 i/100ms Oj::Parser 943.000 i/100ms rapidjson 584.000 i/100ms Calculating ------------------------------------- json 8.897k (± 1.3%) i/s (112.40 μs/i) - 45.084k in 5.068520s oj 7.967k (± 1.5%) i/s (125.52 μs/i) - 40.239k in 5.051985s Oj::Parser 9.564k (± 1.4%) i/s (104.56 μs/i) - 48.093k in 5.029626s rapidjson 5.947k (± 1.4%) i/s (168.16 μs/i) - 29.784k in 5.009437s Comparison: json: 8896.5 i/s Oj::Parser: 9563.8 i/s - 1.08x faster oj: 7966.8 i/s - 1.12x slower rapidjson: 5946.7 i/s - 1.50x slower == Parsing twitter.json (567916 bytes) ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. https://github.com/ruby/json/commit/7943f98a8a) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 83.000 i/100ms oj 64.000 i/100ms Oj::Parser 77.000 i/100ms rapidjson 54.000 i/100ms Calculating ------------------------------------- json 823.083 (± 1.8%) i/s (1.21 ms/i) - 4.150k in 5.043805s oj 632.538 (± 1.4%) i/s (1.58 ms/i) - 3.200k in 5.060073s Oj::Parser 769.122 (± 1.8%) i/s (1.30 ms/i) - 3.850k in 5.007501s rapidjson 548.494 (± 1.5%) i/s (1.82 ms/i) - 2.754k in 5.022153s Comparison: json: 823.1 i/s Oj::Parser: 769.1 i/s - 1.07x slower oj: 632.5 i/s - 1.30x slower rapidjson: 548.5 i/s - 1.50x slower == Parsing citm_catalog.json (1727030 bytes) ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 
https://github.com/ruby/json/commit/7943f98a8a) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 41.000 i/100ms oj 34.000 i/100ms Oj::Parser 45.000 i/100ms rapidjson 39.000 i/100ms Calculating ------------------------------------- json 427.162 (± 1.2%) i/s (2.34 ms/i) - 2.173k in 5.087666s oj 351.463 (± 2.8%) i/s (2.85 ms/i) - 1.768k in 5.035149s Oj::Parser 461.849 (± 3.7%) i/s (2.17 ms/i) - 2.340k in 5.074461s rapidjson 395.155 (± 1.8%) i/s (2.53 ms/i) - 1.989k in 5.034927s Comparison: json: 427.2 i/s Oj::Parser: 461.8 i/s - 1.08x faster rapidjson: 395.2 i/s - 1.08x slower oj: 351.5 i/s - 1.22x slower ``` After: ``` == Parsing activitypub.json (58160 bytes) ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. https://github.com/ruby/json/commit/7943f98a8a) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 953.000 i/100ms oj 813.000 i/100ms Oj::Parser 956.000 i/100ms rapidjson 563.000 i/100ms Calculating ------------------------------------- json 9.525k (± 1.2%) i/s (104.98 μs/i) - 47.650k in 5.003252s oj 8.117k (± 0.5%) i/s (123.20 μs/i) - 40.650k in 5.008283s Oj::Parser 9.590k (± 3.2%) i/s (104.27 μs/i) - 48.756k in 5.089794s rapidjson 6.020k (± 0.9%) i/s (166.10 μs/i) - 30.402k in 5.050155s Comparison: json: 9525.3 i/s Oj::Parser: 9590.1 i/s - same-ish: difference falls within error oj: 8116.7 i/s - 1.17x slower rapidjson: 6020.5 i/s - 1.58x slower == Parsing twitter.json (567916 bytes) ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. 
https://github.com/ruby/json/commit/7943f98a8a) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 87.000 i/100ms oj 64.000 i/100ms Oj::Parser 75.000 i/100ms rapidjson 55.000 i/100ms Calculating ------------------------------------- json 866.563 (± 0.8%) i/s (1.15 ms/i) - 4.350k in 5.020138s oj 643.567 (± 0.8%) i/s (1.55 ms/i) - 3.264k in 5.072101s Oj::Parser 777.346 (± 3.5%) i/s (1.29 ms/i) - 3.900k in 5.023933s rapidjson 557.158 (± 0.7%) i/s (1.79 ms/i) - 2.805k in 5.034731s Comparison: json: 866.6 i/s Oj::Parser: 777.3 i/s - 1.11x slower oj: 643.6 i/s - 1.35x slower rapidjson: 557.2 i/s - 1.56x slower == Parsing citm_catalog.json (1727030 bytes) ruby 3.4.0dev (2024-11-06T07:59:09Z precompute-hash-wh.. https://github.com/ruby/json/commit/7943f98a8a) +YJIT +PRISM [arm64-darwin24] Warming up -------------------------------------- json 41.000 i/100ms oj 35.000 i/100ms Oj::Parser 40.000 i/100ms rapidjson 39.000 i/100ms Calculating ------------------------------------- json 429.216 (± 1.2%) i/s (2.33 ms/i) - 2.173k in 5.063351s oj 354.755 (± 1.1%) i/s (2.82 ms/i) - 1.785k in 5.032374s Oj::Parser 465.114 (± 3.7%) i/s (2.15 ms/i) - 2.360k in 5.081634s rapidjson 387.135 (± 1.3%) i/s (2.58 ms/i) - 1.950k in 5.037787s Comparison: json: 429.2 i/s Oj::Parser: 465.1 i/s - 1.08x faster rapidjson: 387.1 i/s - 1.11x slower oj: 354.8 i/s - 1.21x slower ``` https://github.com/ruby/json/commit/96bd97c61e
2024-11-06[ruby/json] Categorize deprecated warningNobuyoshi Nakada
https://github.com/ruby/json/commit/1acce7aceb
2024-11-05Update depend filesJean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/12003
2024-11-05[ruby/json] ResyncJean Boussier
Notes: Merged: https://github.com/ruby/ruby/pull/12003
2024-11-05[ruby/json] JSON::Ext::Parser mark the name cache entries when not on the heapJean Boussier
This is somewhat dead code, as unless you are using `JSON::Parser.new` directly we never allocate `JSON::Ext::Parser` anymore. But still, we should mark all its references in case some code out there uses that. Followup: #675 https://github.com/ruby/json/commit/8bf74a977b Notes: Merged: https://github.com/ruby/ruby/pull/12003
2024-11-01[ruby/json] json_string_unescape: Use the returned RString as bufferJean Boussier
Rather than to copy into a buffer to unescape and then copy that buffer into the final string, we can directly copy into the final string. The downside is that if the string contains a lot of escaping, we end up returning a string that's larger than strictly necessary, but it's probably fine. Before: ``` == Parsing twitter.json (567916 bytes) ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 56.000 i/100ms oj 58.000 i/100ms oj strict 74.000 i/100ms Oj::Parser 76.000 i/100ms rapidjson 52.000 i/100ms Calculating ------------------------------------- json 556.659 (± 2.9%) i/s (1.80 ms/i) - 2.800k in 5.034719s oj 604.077 (± 3.8%) i/s (1.66 ms/i) - 3.016k in 5.001546s oj strict 706.942 (± 3.5%) i/s (1.41 ms/i) - 3.552k in 5.030954s Oj::Parser 752.917 (± 3.2%) i/s (1.33 ms/i) - 3.800k in 5.052707s rapidjson 546.470 (± 3.5%) i/s (1.83 ms/i) - 2.756k in 5.049855s Comparison: json: 556.7 i/s Oj::Parser: 752.9 i/s - 1.35x faster oj strict: 706.9 i/s - 1.27x faster oj: 604.1 i/s - 1.09x faster rapidjson: 546.5 i/s - same-ish: difference falls within error == Parsing citm_catalog.json (1727030 bytes) ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 29.000 i/100ms oj 32.000 i/100ms oj strict 38.000 i/100ms Oj::Parser 42.000 i/100ms rapidjson 38.000 i/100ms Calculating ------------------------------------- json 317.858 (± 3.1%) i/s (3.15 ms/i) - 1.595k in 5.023245s oj 348.168 (± 2.6%) i/s (2.87 ms/i) - 1.760k in 5.058431s oj strict 394.599 (± 2.8%) i/s (2.53 ms/i) - 1.976k in 5.012073s Oj::Parser 403.771 (± 3.0%) i/s (2.48 ms/i) - 2.058k in 5.101578s rapidjson 383.441 (± 3.7%) i/s (2.61 ms/i) - 1.938k in 5.061355s Comparison: json: 317.9 i/s Oj::Parser: 403.8 i/s - 1.27x faster oj strict: 394.6 i/s - 
1.24x faster rapidjson: 383.4 i/s - 1.21x faster oj: 348.2 i/s - 1.10x faster ``` After: ``` == Parsing twitter.json (567916 bytes) ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 56.000 i/100ms oj 62.000 i/100ms oj strict 73.000 i/100ms Oj::Parser 76.000 i/100ms rapidjson 54.000 i/100ms Calculating ------------------------------------- json 561.009 (± 7.5%) i/s (1.78 ms/i) - 2.800k in 5.039548s oj 601.124 (± 4.3%) i/s (1.66 ms/i) - 3.038k in 5.064686s oj strict 707.455 (± 3.4%) i/s (1.41 ms/i) - 3.577k in 5.062540s Oj::Parser 751.799 (± 3.1%) i/s (1.33 ms/i) - 3.800k in 5.059509s rapidjson 535.641 (± 3.2%) i/s (1.87 ms/i) - 2.700k in 5.045816s Comparison: json: 561.0 i/s Oj::Parser: 751.8 i/s - 1.34x faster oj strict: 707.5 i/s - 1.26x faster oj: 601.1 i/s - same-ish: difference falls within error rapidjson: 535.6 i/s - same-ish: difference falls within error == Parsing citm_catalog.json (1727030 bytes) ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 30.000 i/100ms oj 32.000 i/100ms oj strict 36.000 i/100ms Oj::Parser 42.000 i/100ms rapidjson 39.000 i/100ms Calculating ------------------------------------- json 313.248 (± 7.3%) i/s (3.19 ms/i) - 1.560k in 5.014118s oj 341.977 (± 4.1%) i/s (2.92 ms/i) - 1.728k in 5.063332s oj strict 387.062 (± 6.2%) i/s (2.58 ms/i) - 1.944k in 5.045961s Oj::Parser 400.423 (± 4.0%) i/s (2.50 ms/i) - 2.016k in 5.044513s rapidjson 379.046 (± 6.1%) i/s (2.64 ms/i) - 1.911k in 5.064461s Comparison: json: 313.2 i/s Oj::Parser: 400.4 i/s - 1.28x faster oj strict: 387.1 i/s - 1.24x faster rapidjson: 379.0 i/s - 1.21x faster oj: 342.0 i/s - same-ish: difference falls within error ``` https://github.com/ruby/json/commit/5e1ec4a268
2024-11-01[ruby/json] Remove String#-@ check in extconf.rbJean Boussier
Now that older rubies have been dropped, we no longer need to check for all that. https://github.com/ruby/json/commit/35cf2b84e0
2024-11-01[ruby/json] json_string_unescape: assume the string doesn't need escapingJean Boussier
If that assumption holds true, then we don't need to copy the string into a buffer to unescape it. For small string is just saves copying, but for large ones it also saves a malloc/free combo. Before: ``` == Parsing twitter.json (567916 bytes) ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 52.000 i/100ms oj 61.000 i/100ms oj strict 70.000 i/100ms Oj::Parser 71.000 i/100ms rapidjson 55.000 i/100ms Calculating ------------------------------------- json 510.111 (± 2.9%) i/s (1.96 ms/i) - 2.548k in 5.000029s oj 610.232 (± 3.1%) i/s (1.64 ms/i) - 3.050k in 5.003725s oj strict 713.231 (± 3.2%) i/s (1.40 ms/i) - 3.570k in 5.010902s Oj::Parser 762.598 (± 3.0%) i/s (1.31 ms/i) - 3.834k in 5.033130s rapidjson 553.029 (± 7.4%) i/s (1.81 ms/i) - 2.750k in 5.022630s Comparison: json: 510.1 i/s Oj::Parser: 762.6 i/s - 1.49x faster oj strict: 713.2 i/s - 1.40x faster oj: 610.2 i/s - 1.20x faster rapidjson: 553.0 i/s - same-ish: difference falls within error == Parsing citm_catalog.json (1727030 bytes) ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 28.000 i/100ms oj 33.000 i/100ms oj strict 37.000 i/100ms Oj::Parser 43.000 i/100ms rapidjson 38.000 i/100ms Calculating ------------------------------------- json 303.853 (± 3.6%) i/s (3.29 ms/i) - 1.540k in 5.076079s oj 348.009 (± 2.0%) i/s (2.87 ms/i) - 1.749k in 5.027738s oj strict 396.679 (± 3.3%) i/s (2.52 ms/i) - 1.998k in 5.042271s Oj::Parser 406.699 (± 2.2%) i/s (2.46 ms/i) - 2.064k in 5.077587s rapidjson 393.463 (± 3.3%) i/s (2.54 ms/i) - 1.976k in 5.028501s Comparison: json: 303.9 i/s Oj::Parser: 406.7 i/s - 1.34x faster oj strict: 396.7 i/s - 1.31x faster rapidjson: 393.5 i/s - 1.29x faster oj: 348.0 i/s - 1.15x faster ``` After: ``` == Parsing 
twitter.json (567916 bytes) ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 56.000 i/100ms oj 62.000 i/100ms oj strict 72.000 i/100ms Oj::Parser 77.000 i/100ms rapidjson 55.000 i/100ms Calculating ------------------------------------- json 568.025 (± 2.1%) i/s (1.76 ms/i) - 2.856k in 5.030272s oj 630.936 (± 1.4%) i/s (1.58 ms/i) - 3.162k in 5.012630s oj strict 705.784 (±11.2%) i/s (1.42 ms/i) - 3.456k in 5.006706s Oj::Parser 783.989 (± 1.7%) i/s (1.28 ms/i) - 3.927k in 5.010343s rapidjson 557.630 (± 2.0%) i/s (1.79 ms/i) - 2.805k in 5.032388s Comparison: json: 568.0 i/s Oj::Parser: 784.0 i/s - 1.38x faster oj strict: 705.8 i/s - 1.24x faster oj: 630.9 i/s - 1.11x faster rapidjson: 557.6 i/s - same-ish: difference falls within error == Parsing citm_catalog.json (1727030 bytes) ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23] Warming up -------------------------------------- json 29.000 i/100ms oj 33.000 i/100ms oj strict 38.000 i/100ms Oj::Parser 43.000 i/100ms rapidjson 37.000 i/100ms Calculating ------------------------------------- json 319.271 (± 3.1%) i/s (3.13 ms/i) - 1.595k in 5.001128s oj 347.946 (± 1.7%) i/s (2.87 ms/i) - 1.749k in 5.028395s oj strict 396.914 (± 3.0%) i/s (2.52 ms/i) - 2.014k in 5.079645s Oj::Parser 409.311 (± 2.7%) i/s (2.44 ms/i) - 2.064k in 5.046626s rapidjson 394.752 (± 1.5%) i/s (2.53 ms/i) - 1.998k in 5.062776s Comparison: json: 319.3 i/s Oj::Parser: 409.3 i/s - 1.28x faster oj strict: 396.9 i/s - 1.24x faster rapidjson: 394.8 i/s - 1.24x faster oj: 347.9 i/s - 1.09x faster ``` https://github.com/ruby/json/commit/7e0f66546a
2024-11-01[ruby/json] parser.rl: extract `build_string`Jean Boussier
https://github.com/ruby/json/commit/7e557ee291
2024-11-01[ruby/json] Use String#encode instead of rb_str_conv_enc()Benoit Daloze
* rb_str_conv_enc() returns the source string unmodified if the conversion did not work. But we should be consistent with the generator here and only accept BINARY or convertible to UTF-8. https://github.com/ruby/json/commit/1344ad6f66
2024-11-01[ruby/json] Emit warnings when dumping binary stringsJean Boussier
Because of its Ruby 1.8 heritage, the C extension doesn't care much about string encodings. We should get stricter over time. https://github.com/ruby/json/commit/42402fc13f
2024-11-01Deprecate unsafe default options of `JSON.load`Jean Boussier
[Feature #19528] Ref: https://bugs.ruby-lang.org/issues/19528 `load` is understood as the default method for serializer-style libraries, and the default options of `JSON.load` have caused many security vulnerabilities over the years. The plan is to do like YAML/Psych: deprecate these default options and direct users toward `JSON.unsafe_load`, so at least it's obvious it shouldn't be used against untrusted data.
2024-11-01[ruby/json] Allocate the initial generator buffer on the stackJean Boussier
Ref: https://github.com/ruby/json/issues/655 Followup: https://github.com/ruby/json/issues/657 Assuming the generator might be used for fairly small documents, we can start with a reasonably sized buffer on the stack, and if we outgrow it, we can spill to the heap. In a way this is optimizing for micro-benchmarks, but there are valid use cases for fairly small JSON documents in real-world scenarios, so thrashing the GC less in such cases makes sense.

Before:
```
ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 518.700k i/100ms
JSON reuse 483.370k i/100ms
Calculating -------------------------------------
Oj 5.722M (± 1.8%) i/s (174.76 ns/i) - 29.047M in 5.077823s
JSON reuse 5.278M (± 1.5%) i/s (189.46 ns/i) - 26.585M in 5.038172s
Comparison:
Oj: 5722283.8 i/s
JSON reuse: 5278061.7 i/s - 1.08x slower
```
After:
```
ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 517.837k i/100ms
JSON reuse 548.871k i/100ms
Calculating -------------------------------------
Oj 5.693M (± 1.6%) i/s (175.65 ns/i) - 28.481M in 5.004056s
JSON reuse 5.855M (± 1.2%) i/s (170.80 ns/i) - 29.639M in 5.063004s
Comparison:
Oj: 5692985.6 i/s
JSON reuse: 5854857.9 i/s - 1.03x faster
```
https://github.com/ruby/json/commit/fe607f4806
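The "start on the stack, spill to the heap" strategy can be sketched as a small growable buffer. This is a hypothetical illustration and not the gem's FBuffer: the `Buffer` struct and the `buffer_init_stack`, `buffer_append` and `buffer_free` functions are invented names, and the doubling growth policy is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer whose initial storage is borrowed from the
 * caller (typically a stack array). */
typedef struct {
    char *ptr;
    size_t len;
    size_t capa;
    bool on_heap;
} Buffer;

static void buffer_init_stack(Buffer *b, char *stack_mem, size_t capa)
{
    assert(b != NULL && stack_mem != NULL && capa > 0);
    b->ptr = stack_mem;
    b->len = 0;
    b->capa = capa;
    b->on_heap = false;
}

/* Appends n bytes, moving the contents to the heap the first time the
 * borrowed storage is outgrown. Returns false if allocation fails. */
static bool buffer_append(Buffer *b, const char *data, size_t n)
{
    if (b->len + n > b->capa) {
        size_t new_capa = b->capa * 2;
        while (new_capa < b->len + n) new_capa *= 2;

        char *mem;
        if (b->on_heap) {
            mem = realloc(b->ptr, new_capa);
            if (mem == NULL) return false;
        } else {
            /* First spill: the stack memory cannot be realloc'ed, so copy. */
            mem = malloc(new_capa);
            if (mem == NULL) return false;
            memcpy(mem, b->ptr, b->len);
            b->on_heap = true;
        }
        b->ptr = mem;
        b->capa = new_capa;
    }
    memcpy(b->ptr + b->len, data, n);
    b->len += n;
    return true;
}

/* Only heap storage needs to be released. */
static void buffer_free(Buffer *b)
{
    if (b->on_heap) free(b->ptr);
}
```

For small documents the buffer never leaves the caller's stack array, so the whole operation completes without a single `malloc`.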
2024-10-30[ruby/json] Remove double semicolon at end of line in parserPeter Zhu
https://github.com/ruby/json/commit/f6d6ca3c17
2024-10-30[ruby/json] Allocate the FBuffer struct on the stackJean Boussier
Ref: https://github.com/ruby/json/issues/655 The actual buffer is still on the heap, but this saves a pair of malloc/free. This helps a lot on micro-benchmarks.

Before:
```
ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 531.598k i/100ms
JSON reuse 417.666k i/100ms
Calculating -------------------------------------
Oj 5.735M (± 1.3%) i/s (174.35 ns/i) - 28.706M in 5.005900s
JSON reuse 4.604M (± 1.4%) i/s (217.18 ns/i) - 23.389M in 5.080779s
Comparison:
Oj: 5735475.6 i/s
JSON reuse: 4604380.3 i/s - 1.25x slower
```
After:
```
ruby 3.3.4 (2024-07-09 revision https://github.com/ruby/json/commit/be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Oj 518.700k i/100ms
JSON reuse 483.370k i/100ms
Calculating -------------------------------------
Oj 5.722M (± 1.8%) i/s (174.76 ns/i) - 29.047M in 5.077823s
JSON reuse 5.278M (± 1.5%) i/s (189.46 ns/i) - 26.585M in 5.038172s
Comparison:
Oj: 5722283.8 i/s
JSON reuse: 5278061.7 i/s - 1.08x slower
```
Bench:
```ruby
require 'benchmark/ips'
require 'oj'
require 'json'

json_encoder = JSON::State.new(JSON.dump_default_options)
test_data = [1, "string", { a: 1, b: 2 }, [3, 4, 5]]

Oj.default_options = Oj.default_options.merge(mode: :compat)

Benchmark.ips do |x|
  x.config(time: 5, warmup: 2)

  x.report("Oj") do
    Oj.dump(test_data)
  end

  x.report("JSON reuse") do
    json_encoder.generate(test_data)
  end

  x.compare!(order: :baseline)
end
```
https://github.com/ruby/json/commit/72110f7992
2024-10-26[ruby/json] Use smaller types for JSON_Parser boolean fieldsJean Boussier
https://github.com/ruby/json/commit/7f079b25be
2024-10-26[ruby/json] JSON.dump / String#to_json: raise on invalid encodingJean Boussier
This regressed since 2.7.2. https://github.com/ruby/json/commit/35407d6635
2024-10-26[ruby/json] raise_parse_error: avoid UBJean Boussier
Fix: https://github.com/ruby/json/pull/625 Declaring the buffer in a sub block causes bugs on some compilers. https://github.com/ruby/json/commit/90967c9eb0
2024-10-26Use frozen string literalsÉtienne Barrié
Co-authored-by: Jean Boussier <[email protected]>
2024-10-26[ruby/json] Compile with std=c99Jean Boussier
https://github.com/ruby/json/commit/d4968d2e48
2024-10-26[ruby/json] Ext::Parser avoid costly check on decimal_class when it is nilJean Boussier
Closes: https://github.com/ruby/json/pull/512 https://github.com/ruby/json/commit/d882a45d82 Co-Authored-By: lukeg <[email protected]>
2024-10-26[ruby/json] Limit the size of ParserError exception messagesJean Boussier
Fix: https://github.com/ruby/json/issues/534 Only include up to 32 bytes of the unparseable source. https://github.com/ruby/json/commit/f44995cfb6
2024-10-26[ruby/json] parser.c: refactor raise_parse_errorJean Boussier
https://github.com/ruby/json/commit/09e1df2643