[Monotone-devel] Re: Cygwin-1.7 tests [Was: Time for a release]


From: Lapo Luchini
Subject: [Monotone-devel] Re: Cygwin-1.7 tests [Was: Time for a release]
Date: Thu, 10 Sep 2009 21:44:31 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.23) Gecko/20090812 Thunderbird/2.0.0.23 Mnenhy/0.7.5.0

Lapo Luchini wrote:
> tester_dir/tests/importing_files_with_non-english_names/8859-1/???
> tester_dir/tests/importing_files_with_non-english_names/euc/?Ƥ??
> tester_dir/tests/importing_files_with_non-english_names/utf8/öäüß
> tester_dir/tests/importing_files_with_non-english_names/utf8/てすと
> 
> I guess because 8859-1 is not valid UTF-8?

OK, the problem is this: Cygwin-1.5 used the normal file system calls
(which use CP1252 or another code page, depending on the locale), while
Cygwin-1.7 always uses the wchar_t equivalents, which always use UTF-16.

From a few tests I did, I think that Cygwin-1.7 uses $LANG to decide
how to interpret the input, converts it to UTF-16, and then feeds it to
the wide-char file functions. The filesystem only accepts UTF-16, so it
is really wrong to send "raw filenames" that mean nothing in the charset
specified in $LANG.
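
To illustrate why such "raw" names are a problem under a UTF-8 locale,
here is a minimal Lua sketch. It is not part of the monotone test suite:
the helper name is_valid_utf8 is hypothetical, the byte values are only
assumed to match what the test stores for "öäüß" in each encoding, and
the check is simplified (it does not reject overlong forms or
surrogates).

```lua
-- Hypothetical helper, not part of monotone: reports whether a byte string
-- is structurally well-formed UTF-8 (lead byte followed by the right number
-- of continuation bytes). Overlong forms and surrogates are NOT rejected.
local function is_valid_utf8(s)
  local i, n = 1, #s
  while i <= n do
    local b = string.byte(s, i)
    local extra
    if b < 0x80 then
      extra = 0                      -- plain ASCII
    elseif b >= 0xC2 and b <= 0xDF then
      extra = 1                      -- lead byte of a 2-byte sequence
    elseif b >= 0xE0 and b <= 0xEF then
      extra = 2                      -- lead byte of a 3-byte sequence
    elseif b >= 0xF0 and b <= 0xF4 then
      extra = 3                      -- lead byte of a 4-byte sequence
    else
      return false                   -- stray continuation or invalid lead
    end
    for j = 1, extra do
      local c = string.byte(s, i + j)
      if not c or c < 0x80 or c > 0xBF then
        return false                 -- missing or malformed continuation
      end
    end
    i = i + extra + 1
  end
  return true
end

-- Assumed byte values for "öäüß"; written as decimal escapes so the
-- snippet does not depend on the encoding of this file.
local european_8859_1 = "\246\228\252\223"
local european_utf8   = "\195\182\195\164\195\188\195\159"

print(is_valid_utf8(european_utf8))    -- well-formed UTF-8
print(is_valid_utf8(european_8859_1))  -- not well-formed UTF-8
```

The ISO-8859-1 name fails on its very first byte (0xF6 can never start a
UTF-8 sequence), so there is no sensible way to decode it under a UTF-8
$LANG before converting to UTF-16.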

So it's not really like on OS X, where only UTF-8 is OK: on Cygwin-1.7
anything that conforms to LANG is OK, and anything else produces
"strange" files which can't be used from the command line...

I started a thread upstream to ask about that:
http://www.cygwin.com/ml/cygwin/2009-09/msg00234.html
but in the meantime I propose to:
a. always launch the test programs using LANG=C
   (it uses UTF-8 on Cygwin, and not depending on the system locale
   for test output seems like a nice change on other systems too)
b. change that test to avoid non-UTF-8 filenames, much like on OS X


--- tests/importing_files_with_non-english_names/__driver__.lua
6ef11dc2d3f1743eb29baa10b0fe9fd84fa8aa78
+++ tests/importing_files_with_non-english_names/__driver__.lua
3786e1be7ec2f6e02ebf3724339eacd174707c91
@@ -25,7 +25,14 @@ check(writefile("utf8/" .. japanese_utf8
 check(writefile("utf8/" .. european_utf8, ""))
 check(writefile("utf8/" .. japanese_utf8, ""))

-if ostype ~= "Darwin" then
+if ostype ~= "Darwin" and string.sub(ostype, 1, 6) ~= "CYGWIN" then
+       notUnicode = true
+else
+       notUnicode = false
+end
+
+
+if notUnicode then
        check(writefile("8859-1/" .. european_8859_1, ""))
        check(writefile("euc/" .. japanese_euc_jp, ""))
 end
@@ -65,7 +72,7 @@ commit()

 -- OS X expects data passed to the OS to be utf8, so these tests don't make
 -- sense.
-if ostype ~= "Darwin" then
+if notUnicode then
        -- now try iso-8859-1

        set_env("LANG", "de_DE.iso-8859-1")
@@ -81,20 +88,20 @@ check(qgrep("spaces", "manifest"))
 rename("stdout", "manifest")
 check(qgrep("funny", "manifest"))
 check(qgrep("spaces", "manifest"))
-if ostype ~= "Darwin" then
+if notUnicode then
   check(qgrep("8859-1/" .. european_utf8, "manifest"))
 end

 -- okay, clean up again

-if ostype ~= "Darwin" then
+if notUnicode then
        check(mtn("drop", "--bookkeep-only", "8859-1/" .. european_8859_1), 0, false, false)
        commit()
 end

 -- now try euc

-if ostype ~= "Darwin" then
+if notUnicode then
        set_env("LANG", "ja_JP.euc-jp")
        set_env("CHARSET", "euc-jp")
        check(mtn("add", "euc/" .. japanese_euc_jp), 0, false, false)
@@ -108,6 +115,6 @@ check(qgrep("spaces", "manifest"))
 rename("stdout", "manifest")
 check(qgrep("funny", "manifest"))
 check(qgrep("spaces", "manifest"))
-if ostype ~= "Darwin" then
+if notUnicode then
        check(qgrep("euc/" .. japanese_utf8, "manifest"))
 end
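
The platform check in the patch above could also be factored into a
small predicate. This is only a sketch: the function name
needs_unicode_names is hypothetical, and the sample ostype strings are
illustrative values, not taken from the test harness.

```lua
-- Hypothetical helper: true on platforms where filenames handed to the OS
-- must be valid in the active charset, so the raw ISO-8859-1 and EUC-JP
-- names used by the test make no sense there.
local function needs_unicode_names(ostype)
  return ostype == "Darwin" or string.sub(ostype, 1, 6) == "CYGWIN"
end

-- Illustrative ostype values (assumptions, e.g. what `uname -s` might give).
for _, ostype in ipairs({ "Darwin", "CYGWIN_NT-5.2", "Linux" }) do
  -- Same truth value as the notUnicode flag set in the patch.
  local notUnicode = not needs_unicode_names(ostype)
  print(ostype, notUnicode)
end
```

Note that the prefix match treats every Cygwin version alike, so with
this check the non-UTF-8 parts of the test are skipped on Cygwin-1.5 as
well.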


-- 
Lapo Luchini - http://lapo.it/

“If knowledge can create problems, it is not through ignorance that we
can solve them.” (Isaac Asimov)
