bug-readline

Re: [Bug-readline] rl_point, multibyte strings, and the cursor position


From: Ulf Magnusson
Subject: Re: [Bug-readline] rl_point, multibyte strings, and the cursor position
Date: Mon, 16 Feb 2015 22:52:12 +0100

On Mon, Feb 16, 2015 at 4:43 PM, Ulf Magnusson <address@hidden> wrote:
> I'll try it. Thanks for the suggestion!
>
> /Ulf
>

Here's what I came up with in case someone else runs into the same
problem. I'm sure there's more stuff to handle (not sure what to do
for non-printable characters for example), but it seems to handle
multibyte (tested using åäö's and Chinese) and combining characters
correctly for UTF-8 at least:

#define _XOPEN_SOURCE 700  // For wcwidth() on glibc; must precede all #includes
#include <stdlib.h>  // MB_CUR_MAX
#include <string.h>  // memset(), strlen()
#include <wchar.h>   // mbrtowc(), wcwidth(), mbstate_t

// Returns the total width (in columns) of the characters in the 'n'-byte
// prefix of the null-terminated multibyte string 's'. If 'n' is larger than
// the length of 's', returns the total width of the string. Suitable for
// calculating a cursor position.
//
// Makes a guess for malformed strings.
static size_t strnwidth(const char *s, size_t n) {
    mbstate_t shift_state;
    wchar_t wc;
    size_t wc_len;
    size_t width = 0;

    // Start in the initial shift state.
    memset(&shift_state, '\0', sizeof shift_state);

    for (size_t i = 0; i < n; i += wc_len) {
        // Extract the next multibyte character.
        wc_len = mbrtowc(&wc, s + i, MB_CUR_MAX, &shift_state);
        if (wc_len == 0)
            // Reached the end of the string.
            break;
        if (wc_len == (size_t)-1 || wc_len == (size_t)-2) {
            // Invalid or incomplete sequence. Guess that the remaining
            // characters are one byte/column wide each.
            size_t rest = strlen(s + i);
            return width + (n - i < rest ? n - i : rest);
        }

        // wcwidth() returns -1 for non-printable characters; count those as
        // zero columns.
        int w = wcwidth(wc);
        if (w > 0)
            width += (size_t)w;
    }

    return width;
}

/Ulf

> On Mon, Feb 16, 2015 at 4:36 PM, Chet Ramey <address@hidden> wrote:
>>
>> On 2/16/15 10:26 AM, Ulf Magnusson wrote:
>> > Okay, I'll roll my own then.
>>
>> If you don't have invisible characters in the prompt string or handle
>> line wrapping, wcswidth(3) can do the job for you.  Just assume the line
>> buffer contains multibyte characters, convert it to a wide character
>> string with mbstowcs(3), and compute the number of columns to get to the
>> buffer index indicated by rl_point.  If you want to make it easier, strdup
>> rl_line_buffer and null-terminate the new string at rl_point before
>> calling mbstowcs.
>>
>> Chet
>> --
>> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>>                  ``Ars longa, vita brevis'' - Hippocrates
>> Chet Ramey, ITS, CWRU    address@hidden
>> http://cnswww.cns.cwru.edu/~chet/
>
>
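
For comparison, the mbstowcs(3)/wcswidth(3) approach described in the quoted
message might look roughly like the following. This is a minimal sketch, not
code from the thread: the helper name prefix_columns is hypothetical, it
copies the prefix with malloc/memcpy rather than strdup, and it assumes the
caller has already selected a suitable locale (for example with
setlocale(LC_ALL, "")).

```c
#define _XOPEN_SOURCE 700  // For wcswidth() on glibc; must precede all #includes
#include <assert.h>
#include <stdlib.h>  // malloc(), free(), mbstowcs()
#include <string.h>  // strlen(), memcpy()
#include <wchar.h>   // wcswidth()

// Hypothetical helper (not part of readline): returns the number of columns
// occupied by the first 'point' bytes of 'line'. Returns -1 if allocation or
// conversion fails, or if the prefix contains a non-printable character.
static int prefix_columns(const char *line, size_t point) {
    assert(line != NULL);

    size_t len = strlen(line);
    if (point > len)
        point = len;

    // Copy the prefix and null-terminate it at 'point'.
    char *prefix = malloc(point + 1);
    if (prefix == NULL)
        return -1;
    memcpy(prefix, line, point);
    prefix[point] = '\0';

    // A multibyte string of 'point' bytes holds at most 'point' characters,
    // so 'point + 1' wide characters are enough (including the terminator).
    wchar_t *wide = malloc((point + 1) * sizeof *wide);
    int columns = -1;
    if (wide != NULL) {
        size_t nchars = mbstowcs(wide, prefix, point + 1);
        if (nchars != (size_t)-1)
            columns = wcswidth(wide, nchars);
    }

    free(wide);
    free(prefix);
    return columns;
}
```

With readline, the call would presumably be something like
prefix_columns(rl_line_buffer, (size_t)rl_point). Unlike strnwidth above,
this version reports failure for malformed input instead of guessing, and
wcswidth gives up with -1 as soon as it meets a non-printable character.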
