Re: [Openexr-devel] RE: UNICODE support in openexr file I/O


From: Dennis Crowley
Subject: Re: [Openexr-devel] RE: UNICODE support in openexr file I/O
Date: Thu, 26 Jan 2006 11:51:47 -0800

Indeed. UTF-16 LE with no BOM seems to do the trick (with _wfopen).

I get a filename that displays in Windows as three boxes, but when I cut it and paste it into a text file (in Visual Studio) I get the appropriate Kanji characters.

(Do I need to have the right language pack installed for XP to see them in the Explorer window?)


my experiments...

#include <stdio.h>

// UTF-8
//unsigned char name[] = {0xe4, 0xba, 0xac, 0xe9, 0x83, 0xbd, 0xe5, 0xb8, 0x82, 0x23, 0};

// UTF-8 with UTF-8 encoded BOM (obtained from Visual Studio "Save As" with encoding)
//unsigned char name[] = {0xef, 0xbb, 0xbf, 0xe4, 0xba, 0xac, 0xe9, 0x83, 0xbd, 0xe5, 0xb8, 0x82, 0x23, 0};

// UTF-16 LE with BOM
//unsigned char name[] = {0xff, 0xfe, 0xac, 0x4e, 0xfd, 0x90, 0x02, 0x5e, 0x00, 0x00};

// UTF-16 BE with BOM
//unsigned char name[] = {0xfe, 0xff, 0x4e, 0xac, 0x90, 0xfd, 0x5e, 0x02, 0x00, 0x00};

// UTF-16 LE, no BOM; the trailing 0x00, 0x00 is the 16-bit terminator _wfopen needs
unsigned char name[] = {0xac, 0x4e, 0xfd, 0x90, 0x02, 0x5e, 0x00, 0x00};

int main(int argc, char **argv)
{
//   FILE *f = fopen((char*)name, "wb");
   FILE *f = _wfopen((wchar_t*)name, L"wb");
   if (f == NULL)      // do not call fclose on a failed open
      return 1;
   fclose(f);
   return 0;
}
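
The byte arrays above differ only in the BOM and the byte order, so a rough sketch of how they map to 16-bit code units may help. The helper name utf16_bytes_to_units is made up for illustration, and treating BOM-less input as little-endian is an assumption that matches the experiment, not something the file APIs do for you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack raw UTF-16 bytes into 16-bit code units.
 * A leading BOM (FF FE or FE FF) is consumed and selects the byte order;
 * input without a BOM is ASSUMED to be little-endian.
 * Writes at most cap-1 units plus a terminating 0 and returns the number
 * of units written, not counting the terminator. */
static size_t utf16_bytes_to_units(const unsigned char *bytes, size_t nbytes,
                                   uint16_t *out, size_t cap)
{
    int big_endian = 0;
    size_t i = 0, n = 0;

    assert(out != NULL && cap > 0);

    if (nbytes >= 2) {
        if (bytes[0] == 0xff && bytes[1] == 0xfe) {
            i = 2;                  /* UTF-16 LE BOM: skip it */
        } else if (bytes[0] == 0xfe && bytes[1] == 0xff) {
            big_endian = 1;         /* UTF-16 BE BOM: skip it, swap each pair */
            i = 2;
        }
    }

    for (; i + 1 < nbytes && n + 1 < cap; i += 2) {
        uint16_t lo = big_endian ? bytes[i + 1] : bytes[i];
        uint16_t hi = big_endian ? bytes[i] : bytes[i + 1];
        out[n++] = (uint16_t)((hi << 8) | lo);
    }
    out[n] = 0;                     /* 16-bit terminator for wide-char APIs */
    return n;
}
```

Both the LE and the BE arrays come out as the code units 0x4eac, 0x90fd, 0x5e02. On Windows, where wchar_t is 16 bits, those units could be copied into a wchar_t array and handed to _wfopen; the BOM itself should not end up in the filename.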



On 1/26/06, Paul Miller <address@hidden> wrote:
Nick Porcino wrote:
> Hello list, I am stumped.

Nick - if that is a UTF-8 string, you probably need to convert it to
UTF-16 (16-bit Unicode) and use the _wfopen() or CreateFile() functions
with _UNICODE set in the preprocessor.

I don't think Windows file functions take UTF-8 strings directly.
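
To make the conversion step concrete, here is a minimal, portable sketch of a UTF-8 to UTF-16 decoder. The function name utf8_to_utf16 is hypothetical, and the sketch does not reject overlong encodings or encoded surrogates. On Windows, MultiByteToWideChar with the CP_UTF8 code page should do the same job, so treat this as an illustration of the transformation, not a replacement for the system call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a NUL-terminated UTF-8 string into UTF-16
 * code units. Writes a terminating 0 and returns the number of units
 * written (not counting the terminator), or (size_t)-1 if the input is
 * malformed or the output buffer is too small. */
static size_t utf8_to_utf16(const char *src, uint16_t *out, size_t cap)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(out != NULL && cap > 0);

    while (*s) {
        uint32_t cp;
        int extra;                  /* number of continuation bytes expected */

        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xe0) == 0xc0) { cp = *s & 0x1f; extra = 1; }
        else if ((*s & 0xf0) == 0xe0) { cp = *s & 0x0f; extra = 2; }
        else if ((*s & 0xf8) == 0xf0) { cp = *s & 0x07; extra = 3; }
        else return (size_t)-1;     /* stray continuation or invalid lead byte */
        s++;

        while (extra-- > 0) {
            if ((*s & 0xc0) != 0x80)
                return (size_t)-1;  /* truncated or malformed sequence */
            cp = (cp << 6) | (uint32_t)(*s++ & 0x3f);
        }

        if (cp >= 0x10000) {
            /* Outside the BMP: emit a surrogate pair (2 units + terminator). */
            if (cp > 0x10ffff || n + 2 >= cap)
                return (size_t)-1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xd800 | (cp >> 10));
            out[n++] = (uint16_t)(0xdc00 | (cp & 0x3ff));
        } else {
            if (n + 1 >= cap)
                return (size_t)-1;
            out[n++] = (uint16_t)cp;
        }
    }
    out[n] = 0;                     /* 16-bit terminator for wide-char APIs */
    return n;
}
```

Fed the UTF-8 bytes e4 ba ac e9 83 bd e5 b8 82 23, this yields the four code units 0x4eac, 0x90fd, 0x5e02, 0x0023, which is the form _wfopen() expects once stored as wchar_t on Windows.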


>
> I am trying to create a file named Kyoto-to in Kanji. name below is the UTF-8 encoding of Kyoto-to. I've tried a number of experiments, but all yield a gobbledygook filename. I have appended my tests below.
>
> If someone can provide the correct answer that would be swell!
>
>
>
> const char name[] =
> {0xe4, 0xba, 0xac, 0xe9, 0x83, 0xbd, 0xe5, 0xb8, 0x82, 0x23, 0};
>
>
>
>
>
> TEST ONE - fopen
>
>
> fopen(name, "wb") yields a file called:
>
> 京都市#
>
> (random junk)
>
>
>
>
>
>
> TEST TWO - CreateFile (windows)
>
>     HANDLE hFile = CreateFile(TEXT(name),    // file to open
>         GENERIC_WRITE,          // open for writing
>         FILE_SHARE_WRITE,       // share for writing
>         NULL,                  // default security
>         CREATE_ALWAYS,         // create file only
>         FILE_ATTRIBUTE_NORMAL, // normal file
>         NULL);
>
> äºéƒ½å¸‚#
>
> (random junk)
>
>
>
> TEST THREE - _wfopen
>
>
>     _wfopen((const wchar_t*)name, (const wchar_t*)"wb");
>
> äºéƒ½å¸‚#
>
> (random junk)
>
>
>
>
>
>
> TEST FOUR - _wfopen + UTF-8 -> 16 conversion
>
> Finally:
>
>
> using std::wstring;
>
> wstring
> toWideString( const char* pStr , int len )
> {
>     // figure out how many wide characters we are going to get
>     int nChars = MultiByteToWideChar( CP_ACP , 0 , pStr , len , NULL , 0 ) ;
>     if ( len == -1 )
>         -- nChars ;
>     if ( nChars == 0 )
>         return L"" ;
>
>     wstring buf ;
>     buf.resize( nChars ) ;
>     MultiByteToWideChar( CP_ACP , 0 , pStr , len ,
>         const_cast<wchar_t*>(buf.c_str()) , nChars ) ;
>
>     return buf ;
> }
>
> int _tmain(int argc, _TCHAR* argv[])
> {
>     wstring wname = toWideString(name, strlen(name));
>     FILE* x = _wfopen(wname.c_str(), (const wchar_t*)"wb");
>       return 0;
> }
>
> äºéƒ½å¸‚#
>
> (random junk)
>
>
>
> -----Original Message-----
> From: Florian Kainz [mailto: address@hidden]
> Sent: Wednesday, January 25, 2006 3:17 PM
> To: Nick Porcino
> Cc: Drew Hess
> Subject: Re: [Openexr-devel] RE: UNICODE support in openexr file I/O
>
> According to a hex dump of a UTF-8-encoded e-mail message that I sent
> to myself, the byte sequence for 京都市 is:
>
>      const char name[] =
>          {0xe4, 0xba, 0xac, 0xe9, 0x83, 0xbd, 0xe5, 0xb8, 0x82, 0x23};
>
>
> Nick Porcino wrote:
>> I'm kind of pressed for time... I can try something tho'
>>
>> Florian, could you give me something like this -
>>
>> char name[] = { 0xa, 0x21, 0x23, ...., 0 };
>>
>> that UTF-8 encodes one of those strings like the name of Bangkok or
>> whatever and I'll write a little test app for you
>>
>> -----Original Message-----
>> From: Drew Hess [mailto: address@hidden]
>> Sent: Wednesday, January 25, 2006 10:25 AM
>> To: Nick Porcino
>> Cc: Florian Kainz
>> Subject: Re: [Openexr-devel] RE: UNICODE support in openexr file I/O
>>
>>
>> "Nick Porcino" < address@hidden> writes:
>>
>>> The code I posted earlier converts between UTF-8 and 16. So if you do
>>> the conversion of the strings 8<>16 outside of OpenEXR you are covered
>>> for OpenEXR's internal strings.
>>>
>>> Santiago, are you saying that if you convert UTF-16 to UTF-8, then use
>>> the converted string as a filename, that the OS itself displays the
>>> filename as garbage?
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Openexr-devel mailing list
>> address@hidden
>> http://lists.nongnu.org/mailman/listinfo/openexr-devel


--
Paul Miller | address@hidden | www.fxtech.com | Got Tivo?


