pdf-devel

Re: [pdf-devel] [PATCH] Implement pdf_fsys_disk_item_p in base/pdf-fsys-


From: Zac Brown
Subject: Re: [pdf-devel] [PATCH] Implement pdf_fsys_disk_item_p in base/pdf-fsys-disk.c [try 2]
Date: Mon, 07 Jul 2008 14:19:35 -0700
User-agent: Thunderbird 1.5.0.14ubu (X11/20080502)

Hi,

Responses are below, inline.

address@hidden wrote:
Hi Zac.

Many thanks for fixing the patch.

   +  if (path_name == NULL)
   +    {
   +      return PDF_FALSE;
   +    }

Maybe it would be a good idea to write system-dependent path syntax
checking functions, something like 'pdf_fsys_posix_path_name_p' and
'pdf_fsys_w32_path_name_p'. What do you think?

I think it would be a good idea. I will look into it.
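
A minimal sketch of what such a pair of checks might look like. Everything here is an assumption for illustration: the `sketch_` names are hypothetical, the functions take plain C strings rather than the library's text type, and the Win32 rules are simplified (for example, the `\\?\` prefix and alternate data streams are not handled).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: a POSIX path is accepted if it is non-NULL and
   non-empty. Any octet except NUL is legal, and a C string cannot
   contain NUL, so there is nothing else to reject here. */
bool
sketch_posix_path_name_p (const char *path)
{
  return path != NULL && path[0] != '\0';
}

/* Hypothetical sketch: a Win32 path is rejected if it contains a
   control character or a character that Windows reserves in file
   names. A ':' is tolerated only as the second character, as in the
   drive prefix "C:". */
bool
sketch_w32_path_name_p (const char *path)
{
  size_t i;

  if (path == NULL || path[0] == '\0')
    {
      return false;
    }

  for (i = 0; path[i] != '\0'; i++)
    {
      unsigned char c = (unsigned char) path[i];

      if (c < 0x20 || strchr ("<>\"|?*", c) != NULL)
        {
          return false;
        }
      if (c == ':' && i != 1)
        {
          return false;
        }
    }

  return true;
}
```

With these rules `C:\docs\a.pdf` passes the Win32 check while `a?.pdf` and `foo:bar` do not; the POSIX check accepts all three.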

   +  ret_code = pdf_text_get_host (&ascii_path, &ascii_path_len, path_name, pdf_text_get_host_encoding());

Note that there is a static function named
'pdf_fsys_disk_get_host_path'. It would be better to use it instead of
calling 'pdf_text_get_host' directly.

There may be a problem with our approach of getting the host-encoded
path name and using it with the OS functions. File names in GNU and
POSIX systems are encoded using ASCII, ISO-8859-X or a
filesystem-safe Unicode encoding such as UTF-8. File names using those
encodings never contain null octets and can be safely used with libc
functions expecting null-terminated strings.

My concern is: are we in the same case on Windoze machines? Could
'pdf_text_get_host' return a string encoded in some filesystem-unsafe
encoding such as UCS-2? We should investigate how Windows manages the
encoding of file names. Any ideas?


It is unlikely, but I'm not entirely sure it can't happen. NTFS stores file names as UTF-16 and doesn't check the strings given to it to ensure that they're valid Unicode.

Windows is Unicode-safe and handles encodings from UTF-7 up to UTF-32. Pre-Win2K NT platforms (most notably NT 4) use UCS-2, so those can be incompatible.

As a solution (discussed between jemarch and me), a workaround is to always ask Windows to convert the string to UTF-16, since this is Windows' preferred internal representation of Unicode data.
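
To illustrate why the encoding matters for functions that expect null-terminated strings, here is a small sketch. The helper name and the two sample buffers are made up for this example. It shows that an ASCII file name encoded with 16-bit code units contains zero octets, while the same name in UTF-8 does not.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if the first LEN octets of BUF contain a zero octet.
   Such a buffer would be cut short by strlen(), fopen() and any other
   function that expects a null-terminated string. */
bool
sketch_has_embedded_nul (const unsigned char *buf, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    {
      if (buf[i] == 0)
        {
          return true;
        }
    }

  return false;
}

/* "a.pdf" encoded as UTF-8: one octet per ASCII character. */
const unsigned char sketch_utf8_name[] = { 'a', '.', 'p', 'd', 'f' };

/* "a.pdf" encoded as UTF-16LE: every second octet is zero. */
const unsigned char sketch_utf16le_name[] =
  { 'a', 0, '.', 0, 'p', 0, 'd', 0, 'f', 0 };
```

Passed to a char-based libc function, the second buffer would be read as the one-character name "a". This is why a 16-bit encoded path has to go through the wide-character Windows entry points rather than the char-based ones.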

   +#ifdef PDF_HOST_WIN32
   +  hFile = FindFirstFile (ascii_path, &data);
   +  if (hFile == INVALID_HANDLE_VALUE)
   +    goto error_cleanup;
   +  else
   +    FindClose (hFile);
   +#else

How does 'FindFirstFile' manage relative paths? What is the "default"
volume used if the path does not contain a volume specification? Does it
support "device" filenames such as PRN?

Relative paths are handled by the FindFirstFile facilities. I'm almost positive the default volume used is that of the CWD. Regarding device filenames, I'm unsure.

Regarding the device filenames, note that there is a function in
'pdf-fsys-disk.c' called 'pdf_fsys_disk_win32_device_p' that will tell
you if a given path names a device. My suggestion is to directly
return PDF_TRUE if the path is a device (beware: that function was
never tested).

I'll look into it.
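
For reference, a rough sketch of how a device-name test could work. This is not the existing 'pdf_fsys_disk_win32_device_p', whose implementation is not shown in this thread. The function name is hypothetical, it only looks at a bare file name with no directory part, and it uses the commonly cited list of reserved names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: return true if NAME (a bare file name with no
   directory part) is one of the reserved DOS device names. The
   comparison ignores case and anything after the first '.'. */
bool
sketch_w32_device_name_p (const char *name)
{
  static const char *const fixed[] = { "CON", "PRN", "AUX", "NUL" };
  char base[5];
  size_t len = 0;
  size_t i;

  if (name == NULL)
    {
      return false;
    }

  /* Copy and upper-case the part before the first '.'. Device names
     are at most four characters long, so anything longer is rejected. */
  while (name[len] != '\0' && name[len] != '.')
    {
      if (len == 4)
        {
          return false;
        }
      base[len] = (char) toupper ((unsigned char) name[len]);
      len++;
    }
  base[len] = '\0';

  for (i = 0; i < sizeof fixed / sizeof fixed[0]; i++)
    {
      if (strcmp (base, fixed[i]) == 0)
        {
          return true;
        }
    }

  /* COM1..COM9 and LPT1..LPT9. */
  if (len == 4
      && (strncmp (base, "COM", 3) == 0 || strncmp (base, "LPT", 3) == 0)
      && base[3] >= '1' && base[3] <= '9')
    {
      return true;
    }

  return false;
}
```

A caller could run this on the last path component and return PDF_TRUE early, before calling FindFirstFile at all.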

   +  if (file == NULL)
   +    goto error_cleanup;
   +  else
   +    fclose (file);

Please use braces {} in any if or else body even if it only contains a
single statement, such as in:

   if (file == NULL)
   {
     goto error_cleanup;
   }
   else
   {
     fclose (file);
   }
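
As a side note on the non-Windows branch: the hunk above tests for existence by opening the item as a stream and closing it again. A possible alternative, which is not what the patch does, is to ask the filesystem with POSIX stat(). The sketch below assumes a plain C string path and a hypothetical function name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical sketch: return true if PATH names an existing item of
   any type (regular file, directory, device...). stat() follows
   symbolic links and does not need read permission on the item
   itself, only search permission on the directories leading to it. */
bool
sketch_posix_item_p (const char *path)
{
  struct stat st;

  if (path == NULL)
    {
      return false;
    }

  return stat (path, &st) == 0;
}
```

Note that stat() can also fail for reasons other than a missing item (for example EACCES on a parent directory), so a stricter version would inspect errno before reporting that the item does not exist.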


-Zac



