From: "naruse (Yui NARUSE) via ruby-core"
Date: 2024-01-18T03:14:14+00:00
Subject: [ruby-core:116283] [Ruby master Misc#20191] Deprecate magic encoding comment

Issue #20191 has been updated by naruse (Yui NARUSE).

Status changed from Open to Rejected

You also need to consider applications, in addition to gems publicly available on GitHub. Breaking compatibility forces the users and developers of those applications to do unproductive work. You must carefully weigh your own development and maintenance cost against the cost of that unproductive work for Ruby users.

I don't understand why you propose such a big incompatible change without concrete evidence of "a lot of value/simplifications/performance opportunities".

Even though Ruby is known as a language that introduces incompatibilities aggressively, we always discuss the trade-off of each incompatibility carefully and make sure its benefit is actually larger than its downside.

Also note that in Ruby, changing the encoding of a source file also changes the encoding of the literals defined in it. Affected applications will need to change their logic or convert those strings back into the original encoding. Those changes will be larger than you expect.

----------------------------------------
Misc #20191: Deprecate magic encoding comment
https://bugs.ruby-lang.org/issues/20191#change-106305

* Author: kddnewton (Kevin Newton)
* Status: Rejected
* Priority: Normal
----------------------------------------
I would like to ask that we deprecate the magic encoding comment, and instead require all source files to be encoded in UTF-8. There would be many benefits to the performance of both the parser and the compiler. It would also help to simplify both. For example, right now a string literal in a file encoded in US-ASCII can result in three different encodings, depending on its internal bytes. If the file is encoded in UTF-8, it can only be a UTF-8 string.
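
As a rough illustration of the string literal claim above (a sketch added for clarity, not part of the original proposal), the snippet below evaluates the same three literals under a US-ASCII magic comment and under a UTF-8 one. It relies on `eval` honoring a magic comment on the first line of the evaluated string. The variable and constant names are made up for this example.

```ruby
# Source text for a tiny "file". The single-quoted heredoc keeps the escape
# sequences as written, so they are only interpreted when eval parses them.
us_ascii_source = <<~'RUBY'
  # encoding: us-ascii
  {
    plain:   "abc".encoding,
    hex:     "\xFF".encoding,
    unicode: "\u00E9".encoding
  }
RUBY

# Same literals, but with the magic comment switched to UTF-8.
utf8_source = us_ascii_source.sub("us-ascii", "utf-8")

# Hypothetical constant names, used only so the results are easy to inspect.
US_ASCII_RESULT = eval(us_ascii_source)
UTF8_RESULT     = eval(utf8_source)

puts "US-ASCII source:"
US_ASCII_RESULT.each { |kind, enc| puts "  #{kind}: #{enc.inspect}" }

puts "UTF-8 source:"
UTF8_RESULT.each { |kind, enc| puts "  #{kind}: #{enc.inspect}" }
```

Under the US-ASCII comment, the plain literal should stay US-ASCII, the literal with a `\xFF` escape should become ASCII-8BIT, and the literal with a `\u` escape should become UTF-8. Under the UTF-8 comment, all three should report UTF-8.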

The encoding comment itself is not very commonly used in gems. If you take the top 100 most downloaded gem versions from rubygems.org and look at the resolved encoding of all of the files, you get:

- UTF-8: 11554
- ASCII-8BIT: 35
- US-ASCII: 10

For all of the most recent versions of gems on rubygems.org, you get:

- UTF-8: 2967421
- US-ASCII: 20130
- ASCII-8BIT: 9237
- ISO-8859-1: 87
- Windows-1252: 45
- Shift_JIS: 32
- Windows-31J: 22
- Windows-1251: 15
- EUC-JP: 11
- GBK: 4
- KOI8-R: 3
- ISO-8859-15: 2
- UTF8-MAC: 1
- invalid: 33

Note that "invalid" here could have worked on some rubies < 3.2 if they used Encoding#replicate.

If we were to change this, the main breaking change concern would be the encoding of strings and symbols that would leave the context of the file by virtue of a constant read/method call. That's why I think it should first be deprecated in a minor release, then removed in the next major release. At the moment this would mean for the top 100 gems we would be worried about 0.39% of files, and on rubygems.org as a whole we would be worried about 0.99% of files.

If deprecating the entire encoding comment is unacceptable from a compatibility point of view, I would suggest we try only allowing UTF-8, US-ASCII, and ASCII-8BIT. This would still have a lot of value/simplifications/performance opportunities, at the expense of still needing to be parsed and checked. On the top 100 gems this would mean no files would have to change, and on rubygems.org as a whole it would mean we would be worried about 0.009% of files. That being said, if we're going to deprecate this at all, we should probably just do it all the way to get the full benefit.

(In case you want to check the math, the script used to calculate these is attached.)
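
For readers who want to check the percentages without the attachment, here is a minimal sketch that recomputes them from the counts quoted above. It is not the attached gems.rb. The constant and method names are invented for this example, and it assumes the "invalid" files count as affected in both scenarios.

```ruby
# File counts per resolved encoding, copied from the figures quoted above.
TOP_100 = { "UTF-8" => 11_554, "ASCII-8BIT" => 35, "US-ASCII" => 10 }.freeze

ALL_GEMS = {
  "UTF-8" => 2_967_421, "US-ASCII" => 20_130, "ASCII-8BIT" => 9_237,
  "ISO-8859-1" => 87, "Windows-1252" => 45, "Shift_JIS" => 32,
  "Windows-31J" => 22, "Windows-1251" => 15, "EUC-JP" => 11, "GBK" => 4,
  "KOI8-R" => 3, "ISO-8859-15" => 2, "UTF8-MAC" => 1, "invalid" => 33
}.freeze

# Encodings that would remain allowed under each scenario in the proposal.
UTF8_ONLY  = %w[UTF-8].freeze
RESTRICTED = %w[UTF-8 US-ASCII ASCII-8BIT].freeze

# Percentage of files whose resolved encoding is not in the allowed list.
def percent_affected(counts, allowed)
  total    = counts.values.sum
  affected = counts.reject { |encoding, _| allowed.include?(encoding) }.values.sum
  100.0 * affected / total
end

puts format("top 100, UTF-8 only:   %.2f%%", percent_affected(TOP_100, UTF8_ONLY))
puts format("all gems, UTF-8 only:  %.2f%%", percent_affected(ALL_GEMS, UTF8_ONLY))
puts format("top 100, restricted:   %.3f%%", percent_affected(TOP_100, RESTRICTED))
puts format("all gems, restricted:  %.3f%%", percent_affected(ALL_GEMS, RESTRICTED))
```

With these inputs, the four values round to 0.39%, 0.99%, 0.000%, and 0.009%, matching the figures in the proposal.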

---Files--------------------------------
gems.rb (4.33 KB)

--
https://bugs.ruby-lang.org/