phpgroupware-developers

[Phpgroupware-developers] SV: utf-8 vs iso-8859-1


From: Sigurd Nes
Subject: [Phpgroupware-developers] SV: utf-8 vs iso-8859-1
Date: Thu, 23 Feb 2006 14:43:55 +0100 (CET)

> On Thu, 2006-02-23 at 12:50 +0100, Sigurd Nes wrote:
> > > On Thu, 2006-02-23 at 09:08 +0100, Sigurd Nes wrote:
> > > > > 
> > > > > The conversion to utf-8 is giving me problems.
> > > > > I have a database with more than 5000 dwellings, 35000 workorders ...
> > > > > The language is Norwegian - and I really would like to keep the
> > > > > character set (at least for Norwegian) - this way I can use whatever
> > > > > query tool (such as M$ Access) I like to do analysis without the need
> > > > > for postprocessing.
> > > > > Please enlighten me if I am missing something.
> > > > > 
> > > 
> > > > There are several reasons for the switch to utf-8.  The main one is
> > > > that from db to the user interface we can know that we are always
> > > > dealing with utf-8.  We can then remove things like lang('charset').
> > > > 
> > > > Unicode also means we can have multi lingual installs.  For example,
> > > > if a company has operations across Europe they can not use a single
> > > > phpgw install, as we currently use at least 3 different charsets for
> > > > translations.  I would also like to hardcode utf-8 into stuff instead
> > > > of having to keep track of charsets, which potentially causes problems.
> > > > It is also easier if everyone knows to use utf-8 compliant tools.
> > > 
> > > > I haven't used M$ Access since O2k days, but I know that OO.o2 Base
> > > > allows you to specify the charset for the database connection.
> > > > Maybe M$ Access has the same option tucked away somewhere.
> > > > 
> > > > What are the problems you have?  I am happy to see if we can find a
> > > > way of fixing the problems instead of switching back to encoding soup :)
> > > 
> > > Cheers
> > > 
> > > Dave
> > > 
> > 
> > I'm not sure I grasp all the consequences - this is from some testing:
> > 
> > It seems that postgres has a unicode odbc-driver so that "should" be
> > ok - but it doesn't seem to work (if there are any converted characters
> > I get 'ODBC -- called failed').
> > 
> 
> I am not sure what the issue is here.  Is it when the db contains
> unicode chars or iso-8859-1 ?
> 
> > I will need to convert all the characters in the database to unicode -
> > I figure I can dump the database, convert the characters (is there a
> > tool?) and reload the data into an empty database. At this point I
> > will most certainly run into problems - 'cause the fields will be too
> > short in many cases.
> > 
> 
> check out iconv.  That is what I used to convert the lang files.  It is
> pretty simple. You should be able to convert a full db dump on the
> command line, then reimport it.  On average, how many non ascii
> characters do you have in a field?  How much slack do your fields have?
> 
> > Writing lang-files will be somewhat more difficult?
> > When saving a file with gedit as unicode it is ok when reopened in
> > gedit and TextPad (my favorite) but not in emacs.
> > 
> 
> What? you don't use vi? ;)  Someone suggested trying "C-x RET f utf-8
> RET" in emacs, but I have no idea when it comes to emacs.
> 
> > When inserting new values into the database - do I need to filter the
> > values through a converter?
> > I certainly cannot edit records with webmin.
> > 
> 
> You mean manual inserts?  For that I use phpmyadmin or mysql query
> browser as I use mysql not pgsql.  Does webmin set the charset based on
> a language?
> 
> > I thought that the lang-table combined with the users preferences took
> > care of multilanguage issue.
> > 
> 
> Not completely.  AFAIK unless we use unicode we can't use, say, different
> charsets in 1 install.  With unicode we can, for example, list languages
> in each language's local language and charset.
> 
> > If there are special functions in the api that require unicode - I'm
> > more than willing to convert the input to that function to unicode on
> > demand.
> > 
> > All in all - as I see it - there are a number of limitations compared
> > to allowing iso-8859-1 for the xsl:stylesheet
> > 
> 
> What limitations are there for the stylesheets?  From what I understand
> it is best to use utf-8 for xml.
> 
> Cheers
> 
> Dave
> 

OK - conversion and import of the database went well (saved the dump as
unicode from gedit).
The unicode ODBC-driver is working - which means all is well in M$ Access.
phpPgAdmin seems to have no problem with the converted database.
The (test) phpgroupware seems to behave (and look) as usual.

Regards

Sigurd
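
For reference, a minimal Python sketch of the dump conversion and the field-length question raised earlier in the thread. It only illustrates what re-encoding from iso-8859-1 to utf-8 does to the bytes; the function names and the sample value are hypothetical and are not taken from phpgroupware or from the actual database.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8, as a charset converter would."""
    return data.decode("iso-8859-1").encode("utf-8")


def utf8_growth(value: str) -> int:
    """Extra bytes a value needs in UTF-8 compared with ISO-8859-1."""
    return len(value.encode("utf-8")) - len(value.encode("iso-8859-1"))


# Hypothetical field value with three Norwegian letters outside ASCII.
sample = "Blåbærsyltetøy"

# Round trip: the converted bytes decode back to the same text.
converted = latin1_to_utf8(sample.encode("iso-8859-1"))
assert converted.decode("utf-8") == sample

# Each of the letters å, æ and ø takes two bytes in UTF-8 instead of one.
print(utf8_growth(sample))  # prints 3
```

Whether that growth matters for the "fields too short" worry depends on whether the database measures column length in bytes or in characters; the character count of a value does not change in the conversion.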
