From: davidegrayson@...
Date: 2014-12-15T04:56:59+00:00
Subject: [ruby-core:66835] [ruby-trunk - Bug #10598] [Open] Cannot make two symbols with same bytes and different encodings

Issue #10598 has been reported by David Grayson.

----------------------------------------
Bug #10598: Cannot make two symbols with same bytes and different encodings
https://bugs.ruby-lang.org/issues/10598

* Author: David Grayson
* Status: Open
* Priority: Normal
* Assignee:
* Category: core
* Target version:
* ruby -v: ruby 2.2.0preview2 (2014-11-28 trunk 48628) [x86_64-linux]
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
----------------------------------------
It looks like Ruby 2.1.1 introduced a bug where it is impossible to create two different symbols with the same bytes but different encodings. Here is a simple script that reproduces the bug:

```{ruby}
# Intern the UTF-16 symbol first, then the plain one.
sym1 = "ab".force_encoding("UTF-16").to_sym
sym2 = "ab".to_sym
puts sym2.encoding

# Intern the plain symbol first, then the UTF-16 one.
sym3 = "cd".to_sym
sym4 = "cd".force_encoding("UTF-16").to_sym
puts sym4.encoding
```

I would expect the output of this script to be:

```
US-ASCII
UTF-16
```

The script behaves as expected in Ruby 2.1.0, but in Ruby 2.1.1 and every later version that I tested, it gives incorrect results. Here is a shell session showing the output of the script when I run it in Ruby 2.1.0, 2.1.1, and 2.2.0-preview2:

```
$ chruby 2.1.0 && ruby -v && ruby symbol_encoding_bug.rb
ruby 2.1.0p0 (2013-12-25 revision 44422) [x86_64-linux]
US-ASCII
UTF-16
$ chruby 2.1.1 && ruby -v && ruby symbol_encoding_bug.rb
ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-linux]
UTF-16
US-ASCII
$ chruby 2.2.0-preview2 && ruby -v && ruby symbol_encoding_bug.rb
ruby 2.2.0preview2 (2014-11-28 trunk 48628) [x86_64-linux]
UTF-16
US-ASCII
```

It looks like `String#to_sym` is not properly accounting for the encoding of the string when it searches the symbol table.
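One way to check this without reading the output by eye is to intern the same bytes in both orders and compare the resulting encodings. The following is only a minimal sketch and not part of the original script: `symbol_encodings` is a hypothetical helper, and the byte strings are arbitrary names chosen on the assumption that nothing else has interned them yet.

```ruby
# Hypothetical helper (not part of Ruby): interns the same bytes once as a
# plain string and once with a forced encoding, in the requested order.
# Returns the encoding names as [plain, forced] regardless of the order.
def symbol_encodings(bytes, forced_encoding, forced_first:)
  make_plain  = -> { bytes.dup.to_sym }
  make_forced = -> { bytes.dup.force_encoding(forced_encoding).to_sym }

  if forced_first
    forced_sym = make_forced.call
    plain_sym  = make_plain.call
  else
    plain_sym  = make_plain.call
    forced_sym = make_forced.call
  end

  [plain_sym.encoding.name, forced_sym.encoding.name]
end

# Symbols persist for the life of the process, so each ordering needs its
# own bytes. These names are assumed not to be interned already.
forced_first = symbol_encodings("zq10598a", "UTF-16", forced_first: true)
plain_first  = symbol_encodings("zq10598b", "UTF-16", forced_first: false)

puts "forced first: #{forced_first.inspect}"
puts "plain first:  #{plain_first.inspect}"

if forced_first == plain_first
  puts "encoding is independent of creation order"
else
  puts "encoding depends on creation order"
end
```

Going by the session output above, an affected Ruby should print two different arrays, while an unaffected one should print `["US-ASCII", "UTF-16"]` for both orderings.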
This is definitely a bug: the value of `"ab".to_sym.encoding` should be predictable and should not depend on the state of the symbol table.

By the way, JRuby has a similar bug: https://github.com/jruby/jruby/issues/1348

--
https://bugs.ruby-lang.org/