bug#24603: [RFC 16/18] Refactor character class checking; optimise ASCII
From: Eli Zaretskii
Subject: bug#24603: [RFC 16/18] Refactor character class checking; optimise ASCII case
Date: Tue, 04 Oct 2016 10:48:36 +0300
> From: Michal Nazarewicz <mina86@mina86.com>
> Date: Tue, 4 Oct 2016 03:10:39 +0200
>
> +const unsigned char category_char_bits[] = {
> + [UNICODE_CATEGORY_UNKNOWN] = 0,
> + [UNICODE_CATEGORY_Lu] = CHAR_BIT_ALPHA_ | CHAR_BIT_UPPER,
> + [UNICODE_CATEGORY_Ll] = CHAR_BIT_ALPHA_ | CHAR_BIT_LOWER,
Is this syntax portable enough for us to use it?
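For context on the portability question: the `[INDEX] = value` form is a designated initializer, which was added to the language in C99 and is not part of C89. The sketch below puts that form next to a positional table that a C89 compiler would also accept. The enum and macro names here are hypothetical stand-ins, not the identifiers from the patch, and the bit values are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the patch's category and bit names. */
enum { CAT_UNKNOWN, CAT_Lu, CAT_Ll, CAT_COUNT };
enum { BIT_ALPHA = 1, BIT_UPPER = 2, BIT_LOWER = 4 };

/* C99 designated initializers: each element is named by its index,
   and any element that is not named is zero-initialized.  */
static const unsigned char bits_c99[CAT_COUNT] = {
  [CAT_UNKNOWN] = 0,
  [CAT_Lu] = BIT_ALPHA | BIT_UPPER,
  [CAT_Ll] = BIT_ALPHA | BIT_LOWER
};

/* C89-compatible alternative: positional initializers, so the order
   of the entries must be kept in sync with the enum by hand.  */
static const unsigned char bits_c89[CAT_COUNT] = {
  0,                      /* CAT_UNKNOWN */
  BIT_ALPHA | BIT_UPPER,  /* CAT_Lu */
  BIT_ALPHA | BIT_LOWER   /* CAT_Ll */
};

/* Return 1 if both tables hold the same values, 0 otherwise.  */
static int
tables_agree (void)
{
  size_t i;
  for (i = 0; i < CAT_COUNT; i++)
    if (bits_c99[i] != bits_c89[i])
      return 0;
  return 1;
}

/* Abort if the two spellings of the table ever diverge.  */
static void
check_tables (void)
{
  assert (tables_agree ());
}
```

The designated form survives reordering of the enum, whereas the positional form silently breaks if a category is inserted in the middle; the cost is requiring a C99-capable compiler.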
> +/* Limited set of character categories which syntax-independent. Testing of
                                          ^^^^^^^^^^^^^^^^^^^^^^^^
"which are syntax-independent"
> + * those characters do not require any run-time data, e.g. do not depend on
                       ^^^^^^^^^^^^^^                         ^^^^^^^^^^^^^
"does not require" and "does not depend"
Thanks. I think this change will require a benchmark to make sure we
don't lose too much in terms of performance.