From: Alex Young
Date: 2011-11-23T21:00:05+09:00
Subject: [ruby-core:41249] [ruby-trunk - Feature #2567] Net::HTTP does not handle encoding correctly

Issue #2567 has been updated by Alex Young.

Eric Hodel wrote:
> So giving the user undetectably garbled text is acceptable to both of you? I wish to clarify.

Yes. If the user is getting garbled text, *they'll see it* and can fix it. It's only undetectable at the data level - once it gets rendered, it's obvious.

> If the Content-Type header is used as you propose and the user sets the default_internal encoding what should happen?

Use the Content-Type header. Only use default_internal if the Content-Type doesn't specify a charset.

> If the server lies and the response body is transcoded data may be lost or an exception may be raised. Should this exception be rescued by Net::HTTP?

No, it should be left uncaught, or re-raised, to tell the user that there's breakage in the encoding pipeline. As long as there's a mechanism the user can use to *opt* to ignore the server's charset to help diagnose this sort of breakage, I don't see that this is problematic.

----------------------------------------
Feature #2567: Net::HTTP does not handle encoding correctly
http://redmine.ruby-lang.org/issues/2567

Author: Ryan Sims
Status: Assigned
Priority: Low
Assignee: Yui NARUSE
Category: lib
Target version: 2.0.0
ruby -v: ruby 1.9.1p376 (2009-12-07 revision 26041) [i686-linux]

=begin
A string returned by an HTTP get does not have its encoding set appropriately with the charset field, nor does the content_type report the charset. Example code demonstrating incorrect behavior is below.
#!/usr/bin/ruby -w
# encoding: UTF-8

require 'net/http'

uri = URI.parse('http://www.hearya.com/feed/')
result = Net::HTTP.start(uri.host, uri.port) {|http| http.get(uri.request_uri) }
p result['content-type']   # "text/xml; charset=UTF-8" <- correct
p result.content_type      # "text/xml" <- incorrect; truncates the charset field
puts result.body.encoding  # ASCII-8BIT <- incorrect encoding, should be UTF-8
=end

-- 
http://redmine.ruby-lang.org
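
For illustration, the policy argued for in the reply above could be sketched roughly as follows. This is only one reading of it, not anything Net::HTTP does: the helper names charset_from_content_type and tag_body are hypothetical, the regexp is a simplification of real Content-Type parsing, and treating default_internal as the fallback label for the body is an assumption taken from the wording of the reply. The sketch takes the charset from Content-Type when there is one, falls back to default_internal otherwise, lets the caller opt out entirely, and raises rather than hiding a body that does not match the declared charset.

```ruby
# Hypothetical helpers; neither is part of Net::HTTP.

# Return the charset parameter of a Content-Type header value,
# or nil when the header is missing or names no charset.
def charset_from_content_type(header)
  return nil if header.nil?
  match = header.match(/;\s*charset\s*=\s*"?([^\s;"]+)"?/i)
  match && match[1]
end

# Label a binary response body with the encoding the server declared.
# Falls back to Encoding.default_internal when no charset is given, and
# returns the body untouched when neither is available or when the
# caller passes ignore_charset = true.
def tag_body(body, header, ignore_charset = false)
  return body if ignore_charset

  name = charset_from_content_type(header)
  # Encoding.find raises ArgumentError for an unknown name; left uncaught.
  encoding = name ? Encoding.find(name) : Encoding.default_internal
  return body if encoding.nil?

  tagged = body.dup.force_encoding(encoding)
  # Not rescued here on purpose: a mismatch means the server lied.
  raise EncodingError, "body is not valid #{encoding}" unless tagged.valid_encoding?
  tagged
end

# Stand-in for what the report says Net::HTTP returns: raw bytes tagged ASCII-8BIT.
raw = "caf\xC3\xA9".dup.force_encoding(Encoding::BINARY)
text = tag_body(raw, "text/xml; charset=UTF-8")
puts text.encoding
```

Passing true as the third argument returns the body still tagged ASCII-8BIT, which is the kind of opt-out the reply asks for when diagnosing a server that declares the wrong charset.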