From: Amadeusz Sławiński
Subject: Re: [screen-devel] [PATCH] :hardcopy: handle encoding (Was: Hardcopy uses ISO-8859-1)
Date: Thu, 11 Feb 2016 08:04:53 +0100

Hey,

Sorry for the delay in replying.


I didn't apply that patch. The reason is that I'm rewriting how screen
handles encodings. After some earlier changes the mchar struct is
unnecessarily big; it can be compressed to something like this:
    struct mchar {
            uint32_t image;
            uint32_t attr;
            uint32_t font;
            uint32_t colorbg;
            uint32_t colorfg;
    };

image should be able to hold UCS-4/UTF-32 characters (32-bit
character encoding); the code points allowed by official UTF-8 need at
most 21 bits (RFC 3629)
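To illustrate the size argument, here is a minimal sketch of encoding one code point held in a uint32_t as UTF-8. It is not screen's code: utf8_encode() is a hypothetical helper made up for this example (it is not EncodeChar()), and it assumes the value in image really is a Unicode code point.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of screen: encode the code point cp as
 * UTF-8 into out, which must have room for 4 bytes.
 * Returns the number of bytes written, or 0 if cp is not encodable
 * (a UTF-16 surrogate or a value above U+10FFFF, see RFC 3629).
 */
static int utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != 0);

    if (cp < 0x80) {                       /* 7 bits, 1 byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 11 bits, 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 16 bits, 3 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 21 bits, 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* outside the RFC 3629 range */
}
```

The largest valid code point, U+10FFFF, takes the 4-byte branch, so a 32-bit image has room to spare. Writing such a value with a single putc() would keep only its low 8 bits, which is why a hardcopy writer would need an encoding step of this kind.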

attr holds attributes (BOLD, UNDERSCORE, BLINK etc.)

font is used by non-UTF-8 encodings (we can compress fontx into it)

colorbg, colorfg support 16/256/truecolor color encoding

Removed fields:
fontx - it can be merged with font
mbcs - it's unnecessary for UTF-8; the current implementation uses it
to hold the part of the character which should be in image (previous
versions of screen had "char image; char mbcs;", so image did not have
enough bytes to hold it).
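As a rough sketch of what the compressed layout buys, the proposed struct can be written out and its size checked. The field names come from the proposal above; mchar_from_codepoint() and mchar_cell_size() are hypothetical helpers added only for this illustration and do not exist in screen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Proposed compact cell: five 32-bit fields, no fontx and no mbcs. */
struct mchar {
    uint32_t image;    /* full code point, up to UCS-4/UTF-32 */
    uint32_t attr;     /* attributes such as BOLD, UNDERSCORE, BLINK */
    uint32_t font;     /* used by non-UTF-8 encodings (absorbs fontx) */
    uint32_t colorbg;  /* background color */
    uint32_t colorfg;  /* foreground color */
};

/* Hypothetical helper: build a cell holding cp, all other fields zero. */
static struct mchar mchar_from_codepoint(uint32_t cp)
{
    struct mchar c = { 0, 0, 0, 0, 0 };

    c.image = cp;
    return c;
}

/* Hypothetical helper: bytes used by one cell. */
static size_t mchar_cell_size(void)
{
    return sizeof(struct mchar);
}
```

All members share the same type, so there is no padding between them and one cell takes exactly five times the size of uint32_t (20 bytes with 8-bit chars). A character outside the Basic Multilingual Plane, for example U+1F600, fits in image on its own, without a separate mbcs field.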



As for why there is no release: all your patches are against master,
which is the development branch for v5. I would like to finish
rewriting the encoding handling before even considering putting out a
pre-release version.

Amadeusz

On Sat, 6 Feb 2016 16:39:46 +0100
Simon Ruderich <address@hidden> wrote:

> On Tue, Jan 26, 2016 at 02:46:29PM +0100, Simon Ruderich wrote:
> > Hi,
> >
> > I'm using GNU screen's hardcopy function to dump the current
> > screen content to a file. However the resulting file is encoded
> > in ISO-8859-1 although my current locale is UTF-8. This causes
> > corruption for characters which are not representable in
> > ISO-8859-1.  
> 
> Hello again,
> 
> It's not actually using ISO-8859-1, but instead printing the
> first byte of ->image which seems to be the unicode code point.
> 
> image.h:
>     /* structure representing single cell of terminal */
>     struct mchar {
>             uint32_t image;         /* actual letter like a, b, c ... */
>             [...]
>     };
> 
> fileio.c WriteFile():
>     for (i = 0; i < fore->w_height; i++) {
>             p = fore->w_mlines[i].image;
>             for (k = fore->w_width - 1; k >= 0 && p[k] == ' '; k--) ;
>             for (j = 0; j <= k; j++)
>                     putc(p[j], f);
>             putc('\n', f);
>     }
> 
> This obviously doesn't work for characters > 255 which caused the
> garbled display for me.
> 
> 
> The attached patch should fix the issue. However somebody should
> verify my assumptions:
> 
> I'm not 100% sure that ->image is actually the unicode code
> point.
> 
> Double-width characters are followed by a character with ->image
> = 0xff and ->font = 0xff. I assumed that this means the character
> is a filler character to handle the fixed screen width correctly,
> but I'm not entirely sure. Is there a function/constant to check
> for fillers like this? Hard-coding 0xff doesn't sound like a good
> idea.
> 
> I don't know how the fontp parameter of EncodeChar() is used:
> 
>     int EncodeChar(char *bp, int c, int encoding, int *fontp)
> 
> Passing NULL seems to work though.
> 
> Regards
> Simon
> 
> PS: The Git repository contains a lot of commits since the last
> release. A new release of GNU Screen sounds like a good idea to
> get those fixes/improvements distributed.



