Re: [screen-devel] [PATCH] :hardcopy: handle encoding (Was: Hardcopy use
From: Amadeusz Sławiński
Subject: Re: [screen-devel] [PATCH] :hardcopy: handle encoding (Was: Hardcopy uses ISO-8859-1)
Date: Thu, 11 Feb 2016 08:04:53 +0100
Hey,
sorry for the delay in replying.
I didn't apply that one. The reason: I'm working on rewriting
how screen handles encodings. After some changes the mchar struct is
unnecessarily big; it can be compressed to something like this:
struct mchar {
	uint32_t image;
	uint32_t attr;
	uint32_t font;
	uint32_t colorbg;
	uint32_t colorfg;
};
image should be able to hold UCS-4/UTF-32 characters (32-bit
character encoding); official UTF-8 covers code points of up to 21 bits
(RFC 3629)
attr holds attributes (BOLD, UNDERSCORE, BLINK etc.)
font is used by non-UTF-8 encodings (we can compress fontx into it)
colorbg, colorfg support 16/256/truecolor color encoding
removed ones:
fontx - it can be merged with font
mbcs - it's unnecessary with UTF-8; the current implementation
uses it to hold a part which should be in image (previous versions of
screen had "char image; char mbcs;", so there were not enough bytes to
hold it there).
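
The layout described above can be sketched as a small compilable unit. This is only an illustration of the proposal, not code from the screen tree: the helper mchar_from_codepoint() is hypothetical, and the size check assumes a platform where five uint32_t members are laid out without padding (true on common ABIs).

```c
#include <assert.h>
#include <stdint.h>

/* Compact cell layout as proposed in this mail. */
struct mchar {
	uint32_t image;   /* code point, wide enough for UCS-4/UTF-32 */
	uint32_t attr;    /* BOLD, UNDERSCORE, BLINK etc. */
	uint32_t font;    /* used by non-UTF-8 encodings (absorbs fontx) */
	uint32_t colorbg; /* background color */
	uint32_t colorfg; /* foreground color */
};

/* Assumption: no padding between or after the five 32-bit members. */
static_assert(sizeof(struct mchar) == 5 * sizeof(uint32_t),
	      "struct mchar has unexpected padding");

/* Hypothetical helper: a cell holding one code point, all else zeroed. */
struct mchar mchar_from_codepoint(uint32_t cp)
{
	struct mchar c = { .image = cp };
	return c;
}
```

With image being a full uint32_t, a code point above 0xFF fits in one member, so no separate mbcs member is needed to carry the rest of it.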
As for why there is no release: all your patches are against master,
which is the development branch for v5. I would like to finish rewriting
the encoding handling before even considering putting up a pre-release
version.
Amadeusz
On Sat, 6 Feb 2016 16:39:46 +0100
Simon Ruderich <address@hidden> wrote:
> On Tue, Jan 26, 2016 at 02:46:29PM +0100, Simon Ruderich wrote:
> > Hi,
> >
> > I'm using GNU screen's hardcopy function to dump the current
> > screen content to a file. However the resulting file is encoded
> > in ISO-8859-1 although my current locale is UTF-8. This causes
> > corruption for characters which are not representable in
> > ISO-8859-1.
>
> Hello again,
>
> It's not actually using ISO-8859-1, but instead printing the
> first byte of ->image which seems to be the unicode code point.
>
> image.h:
> /* structure representing single cell of terminal */
> struct mchar {
> uint32_t image; /* actual letter like a, b, c ... */
> [...]
> };
>
> fileio.c WriteFile():
> for (i = 0; i < fore->w_height; i++) {
> p = fore->w_mlines[i].image;
> for (k = fore->w_width - 1; k >= 0 && p[k] == ' '; k--) ;
> for (j = 0; j <= k; j++)
> putc(p[j], f);
> putc('\n', f);
> }
>
> This obviously doesn't work for characters > 255 which caused the
> garbled display for me.
>
>
> The attached patch should fix the issue. However somebody should
> verify my assumptions:
>
> I'm not 100% sure that ->image is actually the unicode code
> point.
>
> Double-width characters are followed by a character with ->image
> = 0xff and ->font = 0xff. I assumed that this means the character
> is a filler character to handle the fixed screen width correctly,
> but I'm not entirely sure. Is there a function/constant to check
> for fillers like this? Hard-coding 0xff doesn't sound like a good
> idea.
>
> I don't know how the fontp parameter of EncodeChar() is used:
>
> int EncodeChar(char *bp, int c, int encoding, int *fontp)
>
> Passing NULL seems to work though.
>
> Regards
> Simon
>
> PS: The Git repository contains a lot of commits since the last
> release. A new release of GNU Screen sounds like a good idea to
> get those fixes/improvements distributed.
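
For illustration, the problem with the quoted WriteFile() loop (putc() keeps only the low byte of each cell) can be avoided by encoding each cell as UTF-8 before writing. The following is a minimal sketch and not the attached patch: utf8_encode() and write_row_utf8() are hypothetical names, it does not use screen's EncodeChar(), it assumes image is an array of uint32_t Unicode code points, and it ignores the filler cells of double-width characters.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: encode code point cp as UTF-8 into out.
 * Returns the number of bytes written (1 to 4), or 0 if cp is a
 * surrogate or lies above U+10FFFF (the RFC 3629 limit).
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;
	if (cp < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}

/*
 * Write one row of cells as UTF-8, trimming trailing spaces the same
 * way the quoted loop does. Cells that cannot be encoded are skipped.
 */
void write_row_utf8(FILE *f, const uint32_t *image, int width)
{
	unsigned char buf[4];
	int j, k;

	for (k = width - 1; k >= 0 && image[k] == ' '; k--)
		;
	for (j = 0; j <= k; j++) {
		size_t n = utf8_encode(image[j], buf);
		if (n > 0)
			fwrite(buf, 1, n, f);
	}
	putc('\n', f);
}
```

A real fix would also have to honor the window's configured encoding rather than always emitting UTF-8, which is what the encoding argument of EncodeChar() suggests.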