Age | Commit message | Author |
|
on interruption.
The cancellation code was originally written for leave insn, but re-entering
opt_invokebuiltin_delegate_leave insn on a cancellation is not safe, because
a builtin function is executed twice.
|
|
e7fc353f04 reverted vm_ic_hit_p's signature change made in 53babf35ef,
which broke JIT compilation of getinlinecache.
To make sure it doesn't happen again, I separated vm_inlined_ic_hit_p to
make the intention clear.
|
|
Other `_mjit_compile_*.erb` files don't use goto. These files should
be consistent for readability.
|
|
The constant cache `IC` is accessed in a non-atomic manner, which causes
thread-safety issues, so Ruby 3.0 disables the const cache on
non-main ractors.
This patch enables it by introducing `imemo_constcache` and allocating
one on every refill of the const cache, like `imemo_callcache`.
[Bug #17510]
Now `IC` has only one entry, `IC::entry`, which points to an
`iseq_inline_constant_cache_entry` managed as a T_IMEMO object.
`IC` is now an atomic data structure, so `rb_mjit_before_vm_ic_update()` and
`rb_mjit_after_vm_ic_update()` are no longer needed.
Notes:
Merged: https://github.com/ruby/ruby/pull/4022
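The single-pointer design described in this message can be sketched roughly as below. This is only an illustration under assumptions: `cc_entry`, `inline_cache`, `ic_refill` and `ic_lookup` are hypothetical names, not the actual CRuby definitions, and the sketch uses `malloc` where Ruby would allocate a GC-managed imemo object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for iseq_inline_constant_cache_entry. */
struct cc_entry {
    uintptr_t value;  /* cached constant value */
    uint64_t serial;  /* constant serial observed when the cache was filled */
};

/* Hypothetical stand-in for IC: a single atomically updated pointer. */
struct inline_cache {
    _Atomic(struct cc_entry *) entry;
};

/* Refill: build a complete new entry, then publish it with one store.
 * The old entry is left alone here; in Ruby the GC would reclaim it. */
static void ic_refill(struct inline_cache *ic, uintptr_t value, uint64_t serial)
{
    struct cc_entry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->value = value;
    e->serial = serial;
    atomic_store_explicit(&ic->entry, e, memory_order_release);
}

/* Hit check: one atomic load, so value and serial come from the same entry.
 * Returns 1 and writes *out on a hit, 0 on a miss. */
static int ic_lookup(struct inline_cache *ic, uint64_t current_serial, uintptr_t *out)
{
    struct cc_entry *e = atomic_load_explicit(&ic->entry, memory_order_acquire);
    if (e == NULL || e->serial != current_serial) return 0;
    *out = e->value;
    return 1;
}
```

Because readers only ever see a fully built entry, no separate "before update" and "after update" notifications are required in this sketch.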
|
|
when we already check ROBJECT_NUMIV(self) is larger than
ROBJECT_EMBED_LEN_MAX at the beginning of the method, because the number
of instance variables for the same object doesn't decrease.
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=4 --alternate --output=all benchmark_3000.yml
before --jit: ruby 3.0.0dev (2020-12-23T06:32:19Z master dbb4f19969) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-12-23T07:45:42Z master 95e866c098) +JIT [x86_64-linux]
last_commit=Skip checking ROBJECT_EMBED
Calculating -------------------------------------
before --jit after --jit
Optcarrot 3000 frames 102.34091772397872 102.77738408379015 fps
103.37784821624231 105.46530219076179
104.39567016876369 106.43712452152215
105.31782092252713 106.54986150067481
```
|
|
Make the code a bit modern and consistent with some other places.
|
|
for leaf_without_check_ints insns.
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-12-20T05:02:18Z master 02b3555874) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-12-20T05:36:00Z master 3f58de4eab) +JIT [x86_64-linux]
last_commit=Check mjit_call_p only when interrupted
Calculating -------------------------------------
before --jit after --jit
Optcarrot Lan_Master.nes 84.50647332260259 85.85057800433144 fps
91.17796644338372 92.09930605656054
91.29346683444497 93.01336611323687
91.50322318568884 93.07234029037433
91.66560903214686 93.22773241529644
91.82315142636172 93.37032901061119
92.15066379608260 93.83701526141679
92.37897097456643 93.86032792681507
92.53049815524908 93.91211970920320
92.78414507914283 94.09109196967890
92.90299756525958 94.40107239595325
93.70279428858790 95.01326369371263
```
|
|
following the original implementation's change.
RB_TYPE_P(obj, T_OBJECT) is already checked in these places.
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-12-19T08:27:44Z master 52b1716c78) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-12-19T08:27:44Z master 52b1716c78) +JIT [x86_64-linux]
Calculating -------------------------------------
before --jit after --jit
Optcarrot Lan_Master.nes 88.04551460097873 84.38303800957766 fps
88.25194345156318 85.31098251408059
88.34143982084871 86.60491582339496
88.63486879856976 88.23675694701865
88.85392212902701 88.23696283371444
89.05739427483194 88.97185459567562
89.08141031147311 90.16373192658857
89.11359420883423 90.61655686444394
89.80323392966130 90.77044959019291
90.58912189625207 90.88534596330966
90.59847996970350 91.34314801302897
90.61180456415137 93.11599164249547
```
|
|
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-12-17T06:17:46Z master 3b4d698e0b) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-12-17T07:01:48Z master 843abb96f0) +JIT [x86_64-linux]
last_commit=Lazily move PC with RUBY_VM_CHECK_INTS
Calculating -------------------------------------
before --jit after --jit
Optcarrot Lan_Master.nes 80.29343646660429 83.15779723251525 fps
82.26755637885149 85.50197941326810
83.50682959728820 88.14657804306270
85.01236533133049 88.78201988978667
87.81799334561326 88.94841008936447
87.88228562393064 89.37925215601926
88.06695585889995 89.86143277214475
88.84730834922165 90.00773346420887
90.46317871213088 90.82603371104014
90.96308347148916 91.29797694822179
90.97945938504556 91.31086331868738
91.57127890154500 91.49949184318844
```
|
|
We probably don't need to move it when an insn is leaf...
|
|
* Inline getconstant on JIT
* Support USE_MJIT=0
Notes:
Merged-By: k0kubun
|
|
and fix inconsistent indentation in mjit_compile.inc.erb
|
|
`cd` is passed from method call functions to method invocation
functions, but `cd` can be manipulated by other ractors simultaneously,
so this has a thread-safety issue.
To solve this issue, this patch stores `ci` and the found `cc` in `calling`
and stops passing `cd`.
Notes:
Merged: https://github.com/ruby/ruby/pull/3903
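A minimal single-threaded sketch of the idea, assuming simplified stand-in types: the shared call data is read once, and everything downstream works from the private snapshot. The struct layouts and the names `setup_calling` and `invoke` are hypothetical, and a real implementation would also need the initial read of `cd->cc` to be safe against concurrent writers.

```c
#include <assert.h>
#include <stddef.h>

struct callinfo  { int argc; };       /* hypothetical stand-in for `ci` */
struct callcache { int method_id; };  /* hypothetical stand-in for `cc` */

/* Shared per-call-site data: another ractor may replace `cc` at any time. */
struct call_data {
    const struct callinfo *ci;
    const struct callcache *cc;
};

/* Per-invocation data: private to the calling thread. */
struct calling_info {
    const struct callinfo *ci;
    const struct callcache *cc;
};

/* Read `cd` once and keep the snapshot in `calling`. */
static struct calling_info setup_calling(const struct call_data *cd)
{
    struct calling_info calling = { cd->ci, cd->cc };
    return calling;
}

/* Invocation functions take `calling` only, never `cd`. */
static int invoke(const struct calling_info *calling)
{
    assert(calling->ci != NULL && calling->cc != NULL);
    return calling->cc->method_id;
}
```

If `cd->cc` is swapped after `setup_calling` returns, `invoke` still dispatches on the `cc` that was found at lookup time.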
|
|
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-11-27T06:41:15Z master 8ce1711c25) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-11-27T08:36:02Z master 2c592126b9) +JIT [x86_64-linux]
last_commit=Cache access to reg_cfp->self on JIT
Calculating -------------------------------------
before --jit after --jit
Optcarrot Lan_Master.nes 82.40522392468650 82.66023870551237 fps
82.67998539899482 83.08660305312587
85.51280693947453 87.09311989553235
86.32925337181406 87.16115255191410
87.35617494926235 87.30699391518075
87.91865339426212 88.47590342996875
88.11573661006648 88.64778616696353
88.16060826662158 88.67015079203991
88.21639244865058 89.19630739497482
88.47241577897603 89.23443637947730
89.37087287229809 89.57052723997015
89.46969964699964 89.97803363889025
```
|
|
This reverts commit 4d2c8edca69884a41d2f843d36023e3decdb9872.
Unfortunately this seems to cause several issues:
https://github.com/ruby/ruby/runs/1462188376?check_suite_focus=true
http://ci.rvm.jp/results/trunk-mjit-wait@phosphorus-docker/3272802
|
|
Performance is probably improved?
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' --repeat-count=12 --alternate --output=all benchmark.yml
before --jit: ruby 3.0.0dev (2020-11-27T04:37:47Z master 69e77e81dc) +JIT [x86_64-linux]
after --jit: ruby 3.0.0dev (2020-11-27T05:28:19Z master df6b05c6dd) +JIT [x86_64-linux]
last_commit=Set VM_FRAME_FLAG_FINISH at once
Calculating -------------------------------------
before --jit after --jit
Optcarrot Lan_Master.nes 80.89292998533379 82.19497327502751 fps
80.93130641142331 85.13943315260148
81.06214830270119 87.43757879797808
82.29172808453910 87.89942441487113
84.61206450455929 87.91309779491075
85.44545883567997 87.98026086648694
86.02923132404449 88.03081060383973
86.07411817365879 88.14650206137341
86.34348799602836 88.32791633649961
87.90257338977324 88.57599644892220
88.58006509876580 88.67426384743277
89.26611118140011 88.81669430874207
```
This should have no bad impact on VM because this function is ALWAYS_INLINE.
|
|
|
|
|
|
|
|
iv_index_tbl manages instance variable indexes (ID -> index).
This data structure must be synchronized with other ractors,
so this patch introduces some VM locks.
This patch also introduces an atomic ivar cache used by
set/getinlinecache instructions. To make updating the ivar cache (IVC)
atomic, we changed the iv_index_tbl data structure to manage (ID -> entry),
where an entry holds a serial and an index. IVC points to this entry so
that a cache update becomes atomic.
Notes:
Merged: https://github.com/ruby/ruby/pull/3662
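The (ID -> entry) layout can be sketched as follows. The point is that the cache holds one pointer to an entry that bundles serial and index, so a reader never pairs a new serial with an old index. All names here (`iv_entry`, `iv_cache`, `ivc_update`, `ivc_index`) are hypothetical and the field types are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: what iv_index_tbl maps an ID to. */
struct iv_entry {
    uint64_t class_serial;
    uint32_t index;
};

/* Hypothetical IVC: one pointer instead of separate serial and index fields. */
struct iv_cache {
    _Atomic(const struct iv_entry *) entry;
};

/* Cache update: a single pointer store publishes serial and index together. */
static void ivc_update(struct iv_cache *ivc, const struct iv_entry *e)
{
    assert(e != NULL);
    atomic_store_explicit(&ivc->entry, e, memory_order_release);
}

/* Returns the ivar index on a hit, or -1 on a miss. */
static int64_t ivc_index(struct iv_cache *ivc, uint64_t class_serial)
{
    const struct iv_entry *e = atomic_load_explicit(&ivc->entry, memory_order_acquire);
    if (e == NULL || e->class_serial != class_serial) return -1;
    return (int64_t)e->index;
}
```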
|
|
generic_ivtbl is a process-global table that maintains instance variables
for non T_OBJECT/T_CLASS/... objects, so we need to protect it
for multi-Ractor execution.
Hint: we can make it Ractor-local for unshareable objects, but
that would be a premature optimization for now.
Notes:
Merged: https://github.com/ruby/ruby/pull/3655
|
|
This commit introduces the Ractor mechanism to run Ruby programs in
parallel. See doc/ractor.md for more details about Ractor.
See ticket [Feature #17100] for the implementation details
and discussions.
[Feature #17100]
This commit does not complete the implementation. You may find
many bugs when using Ractor. The specification is also subject to change,
so this feature is experimental. You will see a warning when
you make the first Ractor with `Ractor.new`.
I hope this feature can help programmers avoid thread-safety issues.
Notes:
Merged: https://github.com/ruby/ruby/pull/3365
|
|
This makes the binary 272 bytes smaller on -O3 GCC 10.2.0.
Notes:
Merged: https://github.com/ruby/ruby/pull/3494
|
|
Do not repeat yourself.
Notes:
Merged: https://github.com/ruby/ruby/pull/3405
|
|
A single quote "is representable either by itself or by the escape
sequence", according to ISO/IEC 9899 (checked all versions). So this is
not a bug fix, but the generated output is a bit more readable without
backslashes.
Notes:
Merged: https://github.com/ruby/ruby/pull/3405
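The equivalence is easy to check in isolation. The snippet below is a small illustration, not part of the patch; the names `plain`, `escaped` and `same_literal` are made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Inside a string literal, ' and \' denote the same character. */
static const char plain[]   = "it's";
static const char escaped[] = "it\'s";

/* Both literals have the same length, including the terminating NUL. */
static_assert(sizeof(plain) == sizeof(escaped), "literals differ in length");

/* Returns 1 when both spellings produce identical bytes. */
static int same_literal(void)
{
    return strcmp(plain, escaped) == 0;
}
```

Only the character constant form (`'\''`) requires the backslash; inside double quotes it is optional.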
|
|
Requested by ko1.
Notes:
Merged: https://github.com/ruby/ruby/pull/3314
|
|
It was my mistake to assume sp_inc was positive. The real criterion is that
the calculated sp is non-negative. We have to assert that.
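As a rough sketch of the corrected check, assuming a hypothetical helper named `next_stack_size` that tracks the emulated stack depth as a plain `int`:

```c
#include <assert.h>

/* Hypothetical helper: apply an instruction's stack-pointer delta. */
static int next_stack_size(int stack_size, int sp_inc)
{
    int next = stack_size + sp_inc;
    /* sp_inc itself may be negative (an insn that pops more than it pushes);
     * only the resulting stack size must not go below zero. */
    assert(next >= 0);
    return next;
}
```

An assertion on `sp_inc > 0` would wrongly reject any instruction that shrinks the stack, which is why the check belongs on the sum.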
|
|
Stacks are emulated in MJIT, so we must not touch the original VM stack.
See also http://ci.rvm.jp/results/trunk-mjit-wait@silicon-docker/3061353
|
|
Instead of doubling the invokebuiltin logic here and there, use the same
insns.def definition for both MJIT/non-JIT situations.
Notes:
Merged: https://github.com/ruby/ruby/pull/3305
|
|
We can obtain the verbatim source code of Primitive.cexpr!. Why not
paste that content into the JITed program?
Notes:
Merged: https://github.com/ruby/ruby/pull/3305
|
|
Noticed that struct rb_builtin_function is a purely compile-time
constant. MJIT can eliminate some runtime calculations by statically
generating a dedicated C code generator for each builtin function.
Notes:
Merged: https://github.com/ruby/ruby/pull/3305
|
|
which is checked by the first guard. When JIT-inlined cc and operand
cd->cc are different, the JIT-ed code might wrongly dispatch cd->cc even
while class check is done with another cc inlined by JIT.
This fixes SEGV on railsbench.
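The invariant behind this fix can be sketched as below: the guard and the dispatch must use the same call cache. This is an assumption-laden illustration with hypothetical types and names (`klass`, `callcache`, `call_data`, `jit_send`), not the generated JIT code itself.

```c
#include <assert.h>
#include <stddef.h>

struct klass { int serial; };
struct callcache { const struct klass *klass; int (*call)(int); };
struct call_data { const struct callcache *cc; };

static int call_a(int x) { return x + 1; }
static int call_b(int x) { return x * 100; }

/* Guard and dispatch both use the cc captured at JIT compile time.
 * Returns -1 when the guard fails, standing in for JIT cancellation. */
static int jit_send(const struct callcache *inlined_cc, const struct call_data *cd,
                    const struct klass *recv_klass, int arg)
{
    assert(inlined_cc != NULL);
    (void)cd; /* cd->cc may have been replaced since compilation; not consulted */
    if (inlined_cc->klass != recv_klass) return -1;
    return inlined_cc->call(arg);
}
```

Dispatching through `cd->cc` after guarding on `inlined_cc` would let a call cache for a different class slip past the class check.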
|
|
Fix CI failure like
http://ci.rvm.jp/results/trunk-mjit-wait@silicon-docker/3043247
introduced by a69dd699ee630dd1086627dbca15a218a8538b6f
|
|
when an ISeq has multiple ivar accesses.
Notes:
Merged-By: k0kubun
|
|
Use ID instead of GENTRY for gvars.
Global variables are compiled into GENTRY (a pointer to struct
rb_global_entry). This patch replaces GENTRY with ID and
makes the code simpler.
We need to look up GENTRY from ID every time (st_lookup), so
additional overhead will be introduced.
However, the performance of accessing global variables is not
important nowadays, and this simplicity helps Ractor development.
Notes:
Merged-By: ko1
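The trade-off can be illustrated with a toy table: the instruction operand is only an ID, and each access resolves it to an entry. The linear search below is a hypothetical stand-in for `st_lookup`; the `ID` typedef, `global_entry` layout and function names are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long ID;                    /* hypothetical interned-name id */
struct global_entry { ID id; long value; };  /* stand-in for rb_global_entry */

static struct global_entry table[] = { {1, 10}, {2, 20}, {3, 30} };

/* Stand-in for st_lookup: resolve an ID to its entry on every access. */
static struct global_entry *lookup_global(ID id)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (table[i].id == id) return &table[i];
    }
    return NULL;
}

/* The operand is now just the ID, not a pointer into the table. */
static long getglobal(ID id)
{
    struct global_entry *e = lookup_global(id);
    assert(e != NULL);
    return e->value;
}
```

Each read pays for a lookup, but compiled code no longer embeds a raw pointer to a shared entry.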
|
|
and add a debug log
|
|
for opt_* insns.
opt_eq handles rb_obj_equal inside opt_eq, and all other cfuncs are
handled by opt_send_without_block. Therefore we can't decide which insn
should be generated by checking whether it's a cfunc cc or not.
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-26T05:21:43Z master 9dbc2294a6) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-26T06:30:18Z master 75cece1b0b) +JIT [x86_64-linux]
last_commit=Decide JIT-ed insn based on cached cfunc
Calculating -------------------------------------
before --jit after --jit
mjit_nil?(1) 73.878M 74.021M i/s - 40.000M times in 0.541432s 0.540391s
mjit_not(1) 72.635M 74.601M i/s - 40.000M times in 0.550702s 0.536187s
mjit_eq(1, nil) 7.331M 7.445M i/s - 8.000M times in 1.091211s 1.074596s
mjit_eq(nil, 1) 49.450M 64.711M i/s - 8.000M times in 0.161781s 0.123627s
Comparison:
mjit_nil?(1)
after --jit: 74020528.4 i/s
before --jit: 73878185.9 i/s - 1.00x slower
mjit_not(1)
after --jit: 74600882.0 i/s
before --jit: 72634507.6 i/s - 1.03x slower
mjit_eq(1, nil)
after --jit: 7444657.4 i/s
before --jit: 7331304.3 i/s - 1.02x slower
mjit_eq(nil, 1)
after --jit: 64710790.6 i/s
before --jit: 49449507.4 i/s - 1.31x slower
```
|
|
only for opt_nil_p and opt_not.
While vm_method_cfunc_is is used for opt_eq too, many fast paths of it
don't call it. So if it's populated, it should generate opt_send,
regardless of cfunc or not. And again, opt_neq isn't relevant due to the
difference in operands.
So opt_nil_p and opt_not are the only variants using vm_method_cfunc_is
like they use.
```
$ benchmark-driver -v --rbenv 'before2 --jit::ruby --jit;before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before2 --jit: ruby 2.8.0dev (2020-06-22T08:37:37Z master 3238641750) +JIT [x86_64-linux]
before --jit: ruby 2.8.0dev (2020-06-23T01:01:24Z master 9ce2066209) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-23T06:58:37Z master 17e9df3157) +JIT [x86_64-linux]
last_commit=Avoid generating opt_send with cfunc cc with JIT
Calculating -------------------------------------
before2 --jit before --jit after --jit
mjit_nil?(1) 54.204M 75.536M 75.031M i/s - 40.000M times in 0.737947s 0.529548s 0.533110s
mjit_not(1) 53.822M 70.921M 71.920M i/s - 40.000M times in 0.743195s 0.564007s 0.556171s
mjit_eq(1, nil) 7.367M 6.496M 7.331M i/s - 8.000M times in 1.085882s 1.231470s 1.091327s
Comparison:
mjit_nil?(1)
before --jit: 75536059.3 i/s
after --jit: 75031409.4 i/s - 1.01x slower
before2 --jit: 54204431.6 i/s - 1.39x slower
mjit_not(1)
after --jit: 71920324.1 i/s
before --jit: 70921063.1 i/s - 1.01x slower
before2 --jit: 53821697.6 i/s - 1.34x slower
mjit_eq(1, nil)
before2 --jit: 7367280.0 i/s
after --jit: 7330527.4 i/s - 1.01x slower
before --jit: 6496302.8 i/s - 1.13x slower
```
|
|
because opt_nil/opt_not/opt_eq populate cc even when they don't
fall back to opt_send_without_block because of vm_method_cfunc_is.
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-06-22T08:11:24Z master d231b8f95b) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-06-22T08:53:27Z master e1125879ed) +JIT [x86_64-linux]
last_commit=Compile opt_send for opt_* only when cc has ISeq
Calculating -------------------------------------
before --jit after --jit
mjit_nil?(1) 54.106M 73.693M i/s - 40.000M times in 0.739288s 0.542795s
mjit_not(1) 53.398M 74.477M i/s - 40.000M times in 0.749090s 0.537075s
mjit_eq(1, nil) 7.427M 6.497M i/s - 8.000M times in 1.077136s 1.231326s
Comparison:
mjit_nil?(1)
after --jit: 73692594.3 i/s
before --jit: 54106108.4 i/s - 1.36x slower
mjit_not(1)
after --jit: 74477487.9 i/s
before --jit: 53398125.0 i/s - 1.39x slower
mjit_eq(1, nil)
before --jit: 7427105.9 i/s
after --jit: 6497063.0 i/s - 1.14x slower
```
Actually opt_eq becomes slower by this. Maybe it's indeed using
opt_send_without_block, but I'll approach that one in another commit.
|
|
* Verify builtin inline annotation with VM_CHECK_MODE
* Remove static to fix the link issue on MJIT
Notes:
Merged-By: k0kubun
|
|
[Feature #15589]
Notes:
Merged-By: k0kubun
|
|
* Remove obsoleted opt_call_c_function insn
* Keep opt_call_c_function with DEFINE_INSN_IF
Notes:
Merged-By: k0kubun
|
|
|
|
by inlining only hot path.
=== mame/optcarrot ===
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark.yml --repeat-count=24 --output=all
before --jit: ruby 2.8.0dev (2020-05-18T05:21:31Z master 0e5a58b6bf) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-05-18T06:12:04Z master 0e3d71a8d1) +JIT [x86_64-linux]
last_commit=Reduce code size for rb_class_of
Calculating -------------------------------------
before --jit after --jit
Optcarrot Lan_Master.nes 71.62880463568773 70.95730063273503 fps
71.73973684273152 71.98447841929851
75.03923801841310 75.54262519509039
75.16300287174957 77.64029272984344
75.16834828625935 78.67861469580785
75.17670723726911 78.81879353707393
75.67637908020630 79.18188850392886
76.19843953215396 79.66484891814478
77.28166716118808 79.80278072861037
77.38509903325165 80.05859292679696
78.12693418455953 80.34624804808006
78.73654441746730 80.66326571254345
79.25387513454415 80.69760605740196
79.44137881689524 81.32053489212245
79.50497657368358 81.50250852553751
79.62401328582868 82.27544931834611
79.79178811723664 82.67455264522741
81.20275352937418 82.93857260493297
81.57027048640776 83.15019118788184
81.63373188649095 83.20728816044721
81.93420437766426 83.25027576772972
82.05716136357167 83.27072145898173
82.21070805525066 83.36008265822194
82.56924063784872 83.36112268888493
=== benchmark-driver/sinatra ===
[rps]
before: 13143.49 rps
after: 13505.70 rps
[inlined rb_class_of size]
before: 11.5K
after: 3.8K
(calculated by `dwarftree --die inlined_subroutine --flat --merge --show-size`)
|
|
Even if local stack optimization is not used and values are written to
the VM stack, the stack pointer itself may not be moved properly. So it
should always be moved on JIT cancellation.
By the way it's hard to write a test for this because if we try to
generate an interrupt, it will be a method call and it consumes the
interrupt by itself on popping a frame.
|
|
I'm trying to make it possible to include all JIT-ed code in a single C
file. This is needed to guarantee uniqueness of all function names.
|
|
on stack when local_stack_p is enabled.
This fixes `RB_FL_TEST_RAW:"RB_FL_ABLE(obj)"` assertion failure
on power_assert's test with JIT enabled.
|
|
when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT
(i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat).
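The fast-path condition can be sketched as a predicate like the one below. The flag bits, struct layouts and the name `can_skip_arg_setup` are hypothetical; only the shape of the condition follows the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_ARGS_SPLAT 0x1u  /* hypothetical flag bits */
#define FLAG_KWARG      0x2u

struct callinfo { unsigned flags; };
struct calling_info { bool kw_splat; };

/* Stand-in for rb_splat_or_kwargs_p(ci). */
static bool splat_or_kwargs_p(const struct callinfo *ci)
{
    return (ci->flags & (FLAG_ARGS_SPLAT | FLAG_KWARG)) != 0;
}

/* True when argument setup can be skipped entirely. */
static bool can_skip_arg_setup(const struct callinfo *ci, const struct calling_info *calling)
{
    assert(ci != NULL && calling != NULL);
    return !splat_or_kwargs_p(ci) && !calling->kw_splat;
}
```

A caller would branch on this predicate once and take the direct cfunc call when it holds.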
Micro benchmark:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
before after
vm_send_cfunc 69.585M 88.724M i/s - 100.000M times in 1.437097s 1.127096s
Comparison:
vm_send_cfunc
after: 88723605.2 i/s
before: 69584737.1 i/s - 1.28x slower
```
Optcarrot:
```
$ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all
before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
Calculating -------------------------------------
before after
Optcarrot Lan_Master.nes 50.76119601545175 42.73858236484051 fps
50.76388649761503 51.04211379912850
50.80930672252514 51.39455790755538
50.90236000778749 51.75656936556145
51.01744746340430 51.86875277356489
51.06495279015112 51.88692482485558
51.07785337168974 51.93429603190578
51.20163525187862 51.95768145071314
51.34671771913112 52.45577266040274
51.35918340835583 52.53163888762858
51.46641337418146 52.62172484121034
51.50835463462257 52.85064021113239
```
Notes:
Merged-By: k0kubun
|
|
for VM_METHOD_TYPE_CFUNC.
This has been known to decrease optcarrot fps:
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark.yml --repeat-count=24 --output=all
before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master fb40495cd9) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux]
Calculating -------------------------------------
before --jit after --jit
Optcarrot Lan_Master.nes 66.38132676191719 67.41369177299630 fps
69.42728743772243 68.90327567263054
72.16028300263211 69.62605130880686
72.46631319102777 70.48818243767207
73.37078877002490 70.79522887347566
73.69422431217367 70.99021920193194
74.01471487018695 74.69931965402584
75.48685183295630 74.86714575949016
75.54445264507932 75.97864419721677
77.28089738169756 76.48908637569581
78.04183397891302 76.54320932488021
78.36807984096562 76.59407262898067
78.92898762543574 77.31316743361343
78.93576483233765 77.97153484180480
79.13754917503078 77.98478782102325
79.62648945850653 78.02263322726446
79.86334213878064 78.26333724045934
80.05100635898518 78.60056756355614
80.26186843769584 78.91082645644468
80.34205717020330 79.01226659142263
80.62286066044338 79.32733939423721
80.95883033058557 79.63793060542024
80.97376819251613 79.73108936622778
81.23050939202896 80.18280109433088
```
and I deleted this capability in an early stage of YARV-MJIT development:
https://github.com/k0kubun/yarv-mjit/commit/0ab130feeefc2b9078a1077e4fec93b3f5e45d07
I suspect either of the following things could be the cause:
* Directly calling vm_call_cfunc requires more optimization effort in GCC,
resulting in 30ms-ish compilation time increase for such methods and
decreasing the number of methods compiled in a benchmarked period.
* Code size increase => more icache misses
These hypotheses could be verified by some methodologies. However, I'd
like to introduce this regardless of the result because this blocks
inlining C method's definition.
I may revert this commit if I give up on implementing inlining of C method
definitions, which requires this change.
Microbenchmark-wise, this gives slight performance improvement:
```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_send_cfunc.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master fb40495cd9) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux]
Calculating -------------------------------------
before --jit after --jit
mjit_send_cfunc 41.961M 56.489M i/s - 100.000M times in 2.383143s 1.770244s
Comparison:
mjit_send_cfunc
after --jit: 56489372.5 i/s
before --jit: 41961388.1 i/s - 1.35x slower
```
|
|
_mjit_compile_send.erb doesn't use _mjit_compile_insn_body.erb
|