[#48745] [ruby-trunk - Bug #7267][Open] Dir.glob on Mac OS X returns unexpected string encodings for unicode file names — "kennygrant (Kenny Grant)" <kennygrant@...>

17 messages 2012/11/02

[#48773] [ruby-trunk - Bug #7269][Open] Refinement doesn't work if using locate after method — "ko1 (Koichi Sasada)" <redmine@...>

12 messages 2012/11/03

[#48847] [ruby-trunk - Bug #7274][Open] UnboundMethods should be bindable to any object that is_a?(owner of the UnboundMethod) — "rits (First Last)" <redmine@...>

21 messages 2012/11/04

[#48854] [ruby-trunk - Bug #7276][Open] TestFile#test_utime failure — "jonforums (Jon Forums)" <redmine@...>

14 messages 2012/11/04

[#48988] [ruby-trunk - Feature #7292][Open] Enumerable#to_h — "marcandre (Marc-Andre Lafortune)" <ruby-core@...>

40 messages 2012/11/06

[#48997] [ruby-trunk - Feature #7297][Open] map_to alias for each_with_object — "nathan.f77 (Nathan Broadbent)" <nathan.f77@...>

19 messages 2012/11/06

[#49001] [ruby-trunk - Bug #7298][Open] Behavior of Enumerator.new different between 1.9.3 and 2.0.0 — "ayumin (Ayumu AIZAWA)" <ayumu.aizawa@...>

12 messages 2012/11/06

[#49018] [ruby-trunk - Feature #7299][Open] Ruby should not completely ignore blocks. — "marcandre (Marc-Andre Lafortune)" <ruby-core@...>

13 messages 2012/11/07

[#49044] [ruby-trunk - Bug #7304][Open] Random test failures around test_autoclose_true_closed_by_finalizer — "luislavena (Luis Lavena)" <luislavena@...>

11 messages 2012/11/07

[#49196] [ruby-trunk - Feature #7322][Open] Add a new operator name #>< for bit-wise "exclusive or" — "alexeymuranov (Alexey Muranov)" <redmine@...>

18 messages 2012/11/10

[#49211] [ruby-trunk - Feature #7328][Open] Move ** operator precedence under unary + and - — "boris_stitnicky (Boris Stitnicky)" <boris@...>

20 messages 2012/11/11

[#49229] [ruby-trunk - Bug #7331][Open] Set the precedence of unary `-` equal to the precedence `-`, same for `+` — "alexeymuranov (Alexey Muranov)" <redmine@...>

17 messages 2012/11/11

[#49256] [ruby-trunk - Feature #7336][Open] Flexiable OPerator Precedence — "trans (Thomas Sawyer)" <transfire@...>

18 messages 2012/11/12

[#49354] review open pull requests on github — Zachary Scott <zachary@...>

Could we get a review on any open pull requests on github before the

12 messages 2012/11/15
[#49355] Re: review open pull requests on github — "NARUSE, Yui" <naruse@...> 2012/11/15

2012/11/15 Zachary Scott <[email protected]>:

[#49356] Re: review open pull requests on github — Zachary Scott <zachary@...> 2012/11/15

Ok, I was hoping one of the maintainers might want to.

[#49451] [ruby-trunk - Bug #7374][Open] File.expand_path resolving to first file/dir instead of absolute path — mdube@... (Martin Dubé) <mdube@...>

12 messages 2012/11/16

[#49463] [ruby-trunk - Feature #7375][Open] embedding libyaml in psych for Ruby 2.0 — "tenderlovemaking (Aaron Patterson)" <aaron@...>

21 messages 2012/11/16
[#49494] [ruby-trunk - Feature #7375] embedding libyaml in psych for Ruby 2.0 — "vo.x (Vit Ondruch)" <v.ondruch@...> 2012/11/17

[#49467] [ruby-trunk - Feature #7377][Open] #indetical? as an alias for #equal? — "aef (Alexander E. Fischer)" <aef@...>

13 messages 2012/11/17

[#49558] [ruby-trunk - Bug #7395][Open] Negative numbers can't be primes by definition — "zzak (Zachary Scott)" <zachary@...>

10 messages 2012/11/19

[#49566] [ruby-trunk - Feature #7400][Open] Incorporate OpenSSL tests from JRuby. — "zzak (Zachary Scott)" <zachary@...>

11 messages 2012/11/19

[#49770] [ruby-trunk - Feature #7414][Open] Now that const_get supports "Foo::Bar" syntax, so should const_defined?. — "robertgleeson (Robert Gleeson)" <rob@...>

9 messages 2012/11/20

[#49950] [ruby-trunk - Feature #7427][Assigned] Update Rubygems — "mame (Yusuke Endoh)" <mame@...>

17 messages 2012/11/24

[#50043] [ruby-trunk - Bug #7429][Open] Provide options for core collections to customize behavior — "headius (Charles Nutter)" <headius@...>

10 messages 2012/11/24

[#50092] [ruby-trunk - Feature #7434][Open] Allow caller_locations and backtrace_locations to receive negative params — "sam.saffron (Sam Saffron)" <sam.saffron@...>

21 messages 2012/11/25

[#50094] [ruby-trunk - Bug #7436][Open] Allow for a "granularity" flag for backtrace_locations — "sam.saffron (Sam Saffron)" <sam.saffron@...>

11 messages 2012/11/25

[#50207] [ruby-trunk - Bug #7445][Open] strptime('%s %z') doesn't work — "felipec (Felipe Contreras)" <felipe.contreras@...>

19 messages 2012/11/27

[#50424] [ruby-trunk - Bug #7485][Open] ruby cannot build on mingw32 due to missing __sync_val_compare_and_swap — "drbrain (Eric Hodel)" <[email protected]>

15 messages 2012/11/30

[#50429] [ruby-trunk - Feature #7487][Open] Cutting through the issues with Refinements — "trans (Thomas Sawyer)" <transfire@...>

13 messages 2012/11/30

[ruby-core:49166] [ruby-trunk - Bug #7267] Dir.glob on Mac OS X returns unexpected string encodings for unicode file names

From: "kennygrant (Kenny Grant)" <kennygrant@...>
Date: 2012-11-09 13:56:02 UTC
List: ruby-core #49166
Issue #7267 has been updated by kennygrant (Kenny Grant).


> see several lines at the end of enc/utf_8.c.

Thanks for the links, I'll read them to get a better understanding of this, I'm sorry I can't read the other two tickets associated with this as that would probably be enlightening too. I suspect the problem comes because of my somewhat parochial focus on using Ruby in English and other roman languages, where the text editors and other tools tend to generate NFC UTF-8, so I expected to get NFC UTF-8 back. When I type 

s = "détente"

the string is composed, which is what I'd expect, it isn't accidental. If there was some way in future to ask Ruby to use only composed forms when interacting with file systems, it would be really useful for use with western languages on Mac OS. 

> writer.rb's two puts output the same result.

Here is the result I get from writer.rb (with a few more tests)
(ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin11.4.0], Mac OS X 10.7.5 HFS+ disk, I saw this on 2.0 too)- one result is decomposed, one is composed, depending on the string passed in. 
After writing out a file with an NFC name to disk (resulting in an NFD filename on disk presumably). 
# GLOB on "./*.txt" pattern - result NFD
# [".", "/", "t", "e", "s", "t", "e", "́", ".", "t", "x", "t"]
# GLOB on "./testé.txt" - result NFC
# [".", "/", "t", "e", "s", "t", "é", ".", "t", "x", "t"]
# GLOB on "./testé.*" - fails, no match
# GLOB on "./teste*" - match result NFD
[".", "/", "t", "e", "s", "t", "e", "́", ".", "t", "x", "t"]


> No way. And again, Ruby string is not NFC.

OK thanks, I'll just have to accept that converting to composed to match my internal ruby strings (which are NFC) is necessary when dealing with an HFS file system. 
----------------------------------------
Bug #7267: Dir.glob on Mac OS X returns unexpected string encodings for unicode file names
https://bugs.ruby-lang.org/issues/7267#change-32713

Author: kennygrant (Kenny Grant)
Status: Assigned
Priority: Normal
Assignee: duerst (Martin Dürst)
Category: 
Target version: 2.0.0
ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin11.4.0]


Tested on Ruby 1.9.3-p194 and ruby-2.0.0-preview1 on Mac OS X 10. 7.5

When calling file system methods with Ruby on Mac OS X, it is not possible to manipulate the resulting file name as a normal UTF-8 string, even though it reports the encoding as UTF-8. It seems to be a UTF-8-MAC string, even when the default encoding is set to UTF-8. This leads to confusion as the string can be manipulated normally except for any unicode characters, which seem to be decomposed. So a regexp using utf-8 characters won't work on the string, unless it is first converted from UTF-8-MAC. I'd expect the string encoding to be UTF-8, or at least to report that it is not a normal UTF-8 string if it has to be UTF-8-MAC for some reason. 

Example, run with a file called Testé.txt in the same folder:

def transform_string s
   puts "Testing string #{s}"
   puts s.gsub(/é/,'TEST')
end

Dir.glob("./*.txt").each do |f|  
  puts "Inline string works as expected" 
   s = "./Testé.txt" 
   puts transform_string s

   puts "File name from Dir.glob does not" 
   puts transform_string f
   
   puts "Encoded file name works as expected, though it is reported as UTF-8, not UTF-8-MAC" 
   f.encode!('UTF-8','UTF-8-MAC')
   puts transform_string f
end


-- 
http://bugs.ruby-lang.org/

In This Thread