Re: [bug #15820] Can not read sav file
From: Ben Pfaff
Subject: Re: [bug #15820] Can not read sav file
Date: Thu, 23 Feb 2006 13:39:38 -0800
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux)
John Darrington <address@hidden> writes:
> Since we have absolutely no idea of the locale in which a system file
> was created, I think we should simply take it on trust that the
> variable names and strings within a file are valid ones.
Do you think we can assume that variable names are encoded in
UTF-8? Then it is fairly easy to convert variable names to/from
the current locale on system file input/output.
I have not experimented with non-ASCII variable names in SPSS. A
few experiments might turn up the encoding.
> Thus lines such as
[...]
> need to be excised from sfm-read.c --- we don't know the locale in
> which the file was written, so we don't know how isalpha/islower etc ought
> to behave when reading.
I think it'd still be a good idea to sanity-check variable names,
assuming that we can figure out the variable name encoding used
in system files.
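To make the sanity check concrete, here is a minimal sketch that assumes, purely for illustration, that the names turn out to be UTF-8. The function name_is_valid_utf8 is hypothetical and is not taken from sfm-read.c. It only checks that the bytes are well-formed UTF-8, and it does so without consulting the locale or any ctype function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of sfm-read.c.  Returns true if the
   LEN bytes at NAME form well-formed UTF-8.  That system files store
   names in UTF-8 is an unconfirmed assumption. */
bool
name_is_valid_utf8 (const unsigned char *name, size_t len)
{
  size_t i = 0;

  assert (name != NULL);
  while (i < len)
    {
      unsigned char c = name[i];
      size_t n;           /* Continuation bytes expected after C. */
      unsigned long min;  /* Smallest code point allowed at this length. */
      unsigned long cp;   /* Code point decoded so far. */

      if (c < 0x80)
        {
          i++;            /* Plain ASCII byte. */
          continue;
        }
      else if ((c & 0xe0) == 0xc0)
        n = 1, min = 0x80, cp = c & 0x1f;
      else if ((c & 0xf0) == 0xe0)
        n = 2, min = 0x800, cp = c & 0x0f;
      else if ((c & 0xf8) == 0xf0)
        n = 3, min = 0x10000, cp = c & 0x07;
      else
        return false;     /* Stray continuation or invalid lead byte. */

      if (len - i - 1 < n)
        return false;     /* Sequence is cut off by the end of the name. */

      for (size_t k = 1; k <= n; k++)
        {
          unsigned char cc = name[i + k];
          if ((cc & 0xc0) != 0x80)
            return false; /* Not a continuation byte. */
          cp = (cp << 6) | (cc & 0x3f);
        }

      /* Reject overlong forms, surrogates, and out-of-range values. */
      if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
        return false;

      i += n + 1;
    }
  return true;
}
```

With this, the UTF-8 bytes for "Äpfel" (C3 84 70 66 65 6C) pass, while the ISO-8859-1 bytes for the same name (C4 70 66 65 6C) are rejected, so a reader could at least tell the two apart.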
> Similarly, I think that sfm-write should also not use any
> ctype functions. Let's just assume that the dictionary and
> casefiles are valid ones.
I don't think sfm-write validates anything in the dictionary
currently.
> Instead, let's do all that sort of checking in the lexer, and the
> output routines. Thus,
>
> DATA LIST LIST /Äpfel *.
>
> Will give an error (or perhaps just a warning) in the default "C"
> locale, but continue happily if the LC_CTYPE locale has been set to
> say "de_DE". Similarly, if I generate output from a system file which
> was created in the "de_DE" locale, but my current locale is "en_US",
> then the output routine will generate a warning when it encounters a
> variable name for which isalpha returns false.
Is that the way that other languages with support for
internationalization parse variable names? e.g. how does Java
work? I must admit that I have a pretty weak grasp of how this
sort of thing is supposed to work.
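For reference, the locale dependence described above can be seen in a small sketch. The helper is_letter_in_locale is a hypothetical name of mine, not anything in PSPP; it uses only the standard setlocale and isalpha calls, and it treats the name as single bytes, which is itself an assumption about the encoding.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper, not part of PSPP.  Reports whether byte C is a
   letter when LC_CTYPE is set to LOCALE_NAME.  Returns 1 or 0, or -1
   if the locale cannot be selected.  The caller's LC_CTYPE setting is
   restored before returning. */
int
is_letter_in_locale (const char *locale_name, unsigned char c)
{
  char saved[256];
  const char *old;
  int result;

  assert (locale_name != NULL);

  /* Copy the current setting, because the string returned by
     setlocale may be overwritten by the next call. */
  old = setlocale (LC_CTYPE, NULL);
  snprintf (saved, sizeof saved, "%s", old != NULL ? old : "C");

  if (setlocale (LC_CTYPE, locale_name) == NULL)
    return -1;

  result = isalpha (c) != 0;

  setlocale (LC_CTYPE, saved);
  return result;
}
```

In the "C" locale, is_letter_in_locale ("C", 0xC4) returns 0, since byte 0xC4 (Ä in ISO-8859-1) is not a letter there. Under a German ISO-8859-1 locale, where one is installed, I would expect 1 for the same byte, which is exactly the difference in behaviour at issue.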
> So you're probably right, we'd need to audit the code for files which
> currently use ctype (I had a look, it's about 12 files), and decide
> whether they really should honour LC_CTYPE. [...]
--
"To the engineer, the world is a toy box full of sub-optimized and
feature-poor toys."
--Scott Adams