help-gnu-emacs

Re: how to calculate the size of string in bytes?


From: Eli Zaretskii
Subject: Re: how to calculate the size of string in bytes?
Date: Tue, 18 Aug 2015 22:49:58 +0300

> Date: Tue, 18 Aug 2015 21:30:49 +0200
> Cc: help-gnu-emacs@gnu.org
> From:  <tomas@tuxteam.de>
> 
> I was having difficulties in understanding you

Sorry about that.  It's a complex issue to explain in a few words.

> Now I understand: Emacs's internal (raw) coding system can represent
> "characters not expressible in utf-8".

More accurately, it can represent characters outside the Unicode code
space.

And please don't call that "raw"; the internal representation of
characters used by Emacs is known as 'utf-8-emacs'.
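
As a rough illustration of "outside the Unicode code space": Emacs
character codes run past U+10FFFF, and a raw byte that could not be
decoded is stored as one of those extra characters.  The helper name
my/outside-unicode-p below is made up for this sketch and is not part
of Emacs.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/outside-unicode-p (char)
  "Return non-nil if CHAR is an Emacs character beyond U+10FFFF."
  (and (characterp char) (> char #x10FFFF)))

;; An ordinary character lies inside the Unicode range.
(my/outside-unicode-p ?a)                                     ; nil

;; "\377" is a unibyte string holding the single byte #xFF.
;; `string-to-multibyte' turns it into a raw-byte character,
;; which lies above the Unicode range.
(my/outside-unicode-p (aref (string-to-multibyte "\377") 0))  ; t
```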

> The function encode-coding-string passes those bytes silently
> through, outputting an invalid utf-8 sequence.

Yes.  Although in interactive functions Emacs will normally complain
and ask for a better encoding.
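
A minimal sketch of what this means for the byte count asked about in
this thread, assuming the size wanted is that of the encoded text.  The
function name my/encoded-size is invented here; string-bytes and
encode-coding-string are the standard primitives.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/encoded-size (string coding-system)
  "Return the number of bytes STRING occupies when encoded in CODING-SYSTEM."
  (string-bytes (encode-coding-string string coding-system)))

;; U+00E9 (e with acute accent) takes two bytes in UTF-8.
(my/encoded-size "\u00e9" 'utf-8)                      ; 2

;; A raw byte is passed through unchanged: one byte comes out,
;; and on its own it is not a valid UTF-8 sequence.
(my/encoded-size (string-to-multibyte "\377") 'utf-8)  ; 1
```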

> So I venture the guess that when the Emacs buffer contains something
> expressible as valid utf-8, 'utf-8 and 'raw are equivalent

Yes.
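
One way to check this for a given string, as a sketch: encode it both
ways and compare.  The predicate name my/utf-8-same-as-internal-p is
hypothetical.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/utf-8-same-as-internal-p (string)
  "Return non-nil if STRING encodes to the same bytes in utf-8 and utf-8-emacs."
  (equal (encode-coding-string string 'utf-8)
         (encode-coding-string string 'utf-8-emacs)))

;; U+00E9 and U+20AC are ordinary Unicode characters.
(let ((s "\u00e9\u20ac"))
  (list (my/utf-8-same-as-internal-p s)                  ; t
        (string-bytes s)                                 ; 5, internal size
        (string-bytes (encode-coding-string s 'utf-8)))) ; 5, encoded size
```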

> (what about combining characters?)

Emacs doesn't normalize, compose, or decompose characters when it
encodes text (with the notable exception of the utf-8-hfs encoding).
Applications that want this should do it themselves, e.g. using the
facilities in ucs-normalize.el.
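
For example, the same visible letter can have two different byte sizes
depending on whether it is composed.  This sketch uses
ucs-normalize-NFC-string from ucs-normalize.el; choosing NFC here is
only an assumption about what an application might want.

```elisp
(require 'ucs-normalize)

(let* ((decomposed "e\u0301")   ; "e" followed by COMBINING ACUTE ACCENT
       (composed (ucs-normalize-NFC-string decomposed)))
  (list
   ;; Encoding leaves the two characters as they are: 1 + 2 bytes.
   (string-bytes (encode-coding-string decomposed 'utf-8))   ; 3
   ;; After explicit NFC normalization there is one character, U+00E9.
   (string-bytes (encode-coding-string composed 'utf-8))))   ; 2
```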


