From: "hanazuki (Kasumi Hanazuki) via ruby-core"
Date: 2024-09-23T18:29:19+00:00
Subject: [ruby-core:119283] [Ruby master Bug#20745] IO::Buffer#copy triggers UB when src/dest buffers overlap

Issue #20745 has been updated by hanazuki (Kasumi Hanazuki).

After reviewing `memcpy` and `memmove` in open-source libc implementations, I found that some of them optimize for small copies that fit within registers. These optimizations handle overlapping source and destination memory without explicitly checking for overlap. Therefore, checking for buffer overlap on the Ruby side to choose between `memcpy` and `memmove` would negate these optimizations, and I think using `memmove` alone (as in my proposal) will suffice.

---

- glibc:
  - [x86_64 for various vector extensions](https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S;h=838f8f8bff9b42c45f41670ed20e614151470148;hb=3d1aed874918c466a4477af1da35983ab036690e):
    - `memcpy` is aliased to `memmove`.
    - Has an optimization for small copies (<= 8 * vector\_size).
  - [aarch64 generic](https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/aarch64/memcpy.S;h=f21c21d3f2a21d890a4f68de32874299d7fd219d;hb=3d1aed874918c466a4477af1da35983ab036690e):
    - Has an optimization to skip the overlap check for copies of <= 128B.
- FreeBSD:
  - x86_64:
    - [memcpy](https://cgit.freebsd.org/src/tree/lib/libc/amd64/string/memcpy.S?id=1e52ba8c621d8c9e40815fa6e1f28207e35e3877) and [memmove](https://cgit.freebsd.org/src/tree/lib/libc/amd64/string/memmove.S?id=1e52ba8c621d8c9e40815fa6e1f28207e35e3877) look identical.
  - aarch64:
    - [memmove](https://cgit.freebsd.org/src/tree/contrib/cortex-strings/src/aarch64/memmove.S?id=1e52ba8c621d8c9e40815fa6e1f28207e35e3877) calls [memcpy](https://cgit.freebsd.org/src/tree/contrib/cortex-strings/src/aarch64/memcpy.S?id=1e52ba8c621d8c9e40815fa6e1f28207e35e3877) for <= 96B, as it copies overlapping memory efficiently.
- musl:
  - x86_64:
    - [memmove](https://git.musl-libc.org/cgit/musl/tree/src/string/x86_64/memmove.s?h=v1.2.5&id=0784374d561435f7c787a555aeab8ede699ed298) calls [memcpy](https://git.musl-libc.org/cgit/musl/tree/src/string/x86_64/memcpy.s?h=v1.2.5&id=0784374d561435f7c787a555aeab8ede699ed298) if possible. No special optimization in either.
  - aarch64:
    - [memmove](https://git.musl-libc.org/cgit/musl/tree/src/string/memmove.c?h=v1.2.5&id=0784374d561435f7c787a555aeab8ede699ed298) is the generic one, calling the optimized [memcpy](https://git.musl-libc.org/cgit/musl/tree/src/string/aarch64/memcpy.S?h=v1.2.5&id=0784374d561435f7c787a555aeab8ede699ed298) if the regions do not overlap.

----------------------------------------
Bug #20745: IO::Buffer#copy triggers UB when src/dest buffers overlap
https://bugs.ruby-lang.org/issues/20745#change-109892

* Author: hanazuki (Kasumi Hanazuki)
* Status: Open
* ruby -v: ruby 3.4.0dev (2024-09-15T01:06:11Z master 532af89e3b) +PRISM [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
The current implementation of `IO::Buffer#copy` uses `memcpy` to copy data between the two memory regions. `memcpy` requires that the source and destination do not overlap; otherwise the behavior is undefined.
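
To make the overlap requirement concrete, here is a small pure-Ruby model. It is only a sketch: `forward_copy!` and `overlap_safe_copy!` are hypothetical helpers invented for illustration, and a front-to-back byte loop is merely one way a routine that assumes non-overlapping regions might behave. Real `memcpy` behavior on overlap is undefined and differs between implementations.

```ruby
# Illustrative model only: these helpers are hypothetical and do not
# reflect how any particular libc implements memcpy or memmove.

# Copies `length` bytes front to back, one byte at a time. When the
# destination overlaps the tail of the source, later reads see bytes
# that earlier writes have already overwritten.
def forward_copy!(bytes, dest, src, length)
  length.times { |i| bytes[dest + i] = bytes[src + i] }
  bytes
end

# memmove-like semantics: behaves as if the source were first copied
# to a temporary area, so overlap cannot corrupt the result.
def overlap_safe_copy!(bytes, dest, src, length)
  bytes[dest, length] = bytes[src, length] # the slice is a fresh Array
  bytes
end

naive = forward_copy!("0123456789".bytes, 3, 0, 7).pack("C*")
safe  = overlap_safe_copy!("0123456789".bytes, 3, 0, 7).pack("C*")

puts naive # 0120120120
puts safe  # 0120123456
```

Only the second helper yields the result a caller would expect from copying the first seven bytes to offset 3, which is the guarantee `memmove` provides and `memcpy` does not.
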
When copying between the same instance of `IO::Buffer` (or slices sharing the same underlying memory), this rule can be violated, and the data is corrupted with some combinations of libc implementation and architecture (note that Alpine uses musl libc).

```shell-session
% docker run --platform=linux/amd64 --rm ruby:3.3.5-alpine3.20 ruby -e 'b=IO::Buffer.new(10); b.set_string("0123456789"); b.copy(b, 3, 7, 0); p b'
-e:1: warning: IO::Buffer is experimental and both the Ruby and C interface may change in the future!
# 0x00000000 30 31 32 30 31 32 30 31 32 30 0120120120

% docker run --platform=linux/arm64 --rm ruby:3.3.5-alpine3.20 ruby -e 'b=IO::Buffer.new(10); b.set_string("0123456789"); b.copy(b, 3, 7, 0); p b'
-e:1: warning: IO::Buffer is experimental and both the Ruby and C interface may change in the future!
# 0x00000000 30 31 32 30 31 32 33 34 35 36 0120123456
```

Copying the first seven bytes of "0123456789" onto itself at offset 3 should yield "0120123456", so only the arm64 result is correct.

---

This error can also be detected with ASAN.

```shell-session
% CXX=clang++ CC=clang ./configure cppflags="-fsanitize=address -fno-omit-frame-pointer" optflags=-O0 LDFLAGS="-fsanitize=address -fno-omit-frame-pointer"
% ./ruby -e 'b=IO::Buffer.new(10); b.copy(b, 0, 9, 1)'
`RubyGems' were not loaded.
`error_highlight' was not loaded.
`did_you_mean' was not loaded.
`syntax_suggest' was not loaded.
-e:1: warning: IO::Buffer is experimental and both the Ruby and C interface may change in the future!
=================================================================
==1655425==ERROR: AddressSanitizer: memcpy-param-overlap: memory ranges [0x5020000107b0,0x5020000107b9) and [0x5020000107b1, 0x5020000107ba) overlap
    #0 0x55dcfbb72d90 in __asan_memcpy (/home/kasumi/.local/src/github.com/ruby/ruby/ruby+0x1fdd90) (BuildId: 2591ca8e9e713537a8f388383df19d1f4284b722)
    #1 0x55dcfbcb31e3 in ruby_nonempty_memcpy /home/kasumi/.local/src/github.com/ruby/ruby/./include/ruby/internal/memory.h:662:16
    #2 0x55dcfbcb3867 in io_buffer_memcpy /home/kasumi/.local/src/github.com/ruby/ruby/io_buffer.c:2347:5
    #3 0x55dcfbcb354a in io_buffer_copy_from /home/kasumi/.local/src/github.com/ruby/ruby/io_buffer.c:2384:5
    #4 0x55dcfbcafd2f in io_buffer_copy /home/kasumi/.local/src/github.com/ruby/ruby/io_buffer.c:2490:12
    #5 0x55dcfc0b955f in ractor_safe_call_cfunc_m1 /home/kasumi/.local/src/github.com/ruby/ruby/./vm_insnhelper.c:3597:12
    #6 0x55dcfc09ca28 in vm_call_cfunc_with_frame_ /home/kasumi/.local/src/github.com/ruby/ruby/./vm_insnhelper.c:3788:11
    #7 0x55dcfc09cf2a in vm_call_cfunc_with_frame /home/kasumi/.local/src/github.com/ruby/ruby/./vm_insnhelper.c:3834:12
    #8 0x55dcfc09c0e9 in vm_call_cfunc_other /home/kasumi/.local/src/github.com/ruby/ruby/./vm_insnhelper.c:3860:16
    #9 0x55dcfc081811 in vm_call_cfunc /home/kasumi/.local/src/github.com/ruby/ruby/./vm_insnhelper.c:3942:12
    #10 0x55dcfc07f347 in vm_call_method_each_type /home/kasumi/.local/src/github.com/ruby/ruby/./vm_insnhelper.c:4766:16
    #11 0x55dcfc07ebc2 in vm_call_method /home/kasumi/.local/src/github.com/ruby/ruby/./vm_insnhelper.c:4892:20
    #12 0x55dcfc02d9a4 in vm_call_general /home/kasumi/.local/src/github.com/ruby/ruby/./vm_insnhelper.c:4936:12
    #13 0x55dcfc02fb24 in vm_sendish /home/kasumi/.local/src/github.com/ruby/ruby/./vm_insnhelper.c:5955:15
    #14 0x55dcfc03eb00 in vm_exec_core /home/kasumi/.local/src/github.com/ruby/ruby/insns.def:898:11
    #15 0x55dcfc0306a2 in rb_vm_exec /home/kasumi/.local/src/github.com/ruby/ruby/vm.c:2564:22
    #16 0x55dcfc06ee9f in rb_iseq_eval_main /home/kasumi/.local/src/github.com/ruby/ruby/vm.c:2830:11
    #17 0x55dcfbbb7a84 in rb_ec_exec_node /home/kasumi/.local/src/github.com/ruby/ruby/eval.c:281:9
    #18 0x55dcfbbb74b2 in ruby_run_node /home/kasumi/.local/src/github.com/ruby/ruby/eval.c:319:30
    #19 0x55dcfbbaf81e in rb_main /home/kasumi/.local/src/github.com/ruby/ruby/./main.c:43:12
    #20 0x55dcfbbaf699 in main /home/kasumi/.local/src/github.com/ruby/ruby/./main.c:62:12
    #21 0x7fb604914db9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #22 0x7fb604914e74 in __libc_start_main csu/../csu/libc-start.c:360:3
    #23 0x55dcfbad8fe0 in _start (/home/kasumi/.local/src/github.com/ruby/ruby/ruby+0x163fe0) (BuildId: 2591ca8e9e713537a8f388383df19d1f4284b722)
[snip]
```
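
The AddressSanitizer report above names two half-open address ranges as overlapping. As a rough illustration of that condition, the sketch below applies the usual interval-intersection test to the addresses from the report. `ranges_overlap?` is a hypothetical helper written for this note; it is not AddressSanitizer's implementation, which the report does not show.

```ruby
# Hypothetical helper, not AddressSanitizer's actual implementation.
# Half-open ranges [a_start, a_end) and [b_start, b_end) intersect
# when each one starts before the other ends.
def ranges_overlap?(a_start, a_end, b_start, b_end)
  a_start < b_end && b_start < a_end
end

# Addresses taken from the report: b.copy(b, 0, 9, 1) writes nine bytes
# at offset 0 and reads nine bytes starting at offset 1.
dest_start = 0x5020000107b0
src_start  = 0x5020000107b1
length     = 9

overlapping = ranges_overlap?(dest_start, dest_start + length,
                              src_start, src_start + length)
adjacent    = ranges_overlap?(0, 9, 9, 18) # regions that only touch

puts overlapping # true
puts adjacent    # false
```
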