Re: "split-sentences"?
From: Tomas Hlavaty
Subject: Re: "split-sentences"?
Date: Sat, 23 Jan 2021 10:07:06 +0100
On Sat 23 Jan 2021 at 09:41, <tomas@tuxteam.de> wrote:
> On Sat, Jan 23, 2021 at 07:38:49AM +0100, moasenwood--- via Users list for
> the GNU Emacs text editor wrote:
>> Can I parse/split a string into sentences based on
>> human-language punctuation?
Not easily.
>> Did anyone do that already?
https://www.unicode.org/reports/tr29/#Sentence_Boundaries
Does Emacs expose Unicode text functions, for example to classify
characters or to determine grapheme, word, sentence and line-break
boundaries?
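As a partial illustration of the character-classification part of that
question, here is a minimal sketch. It relies on the built-in
`get-char-code-property`, which returns a character's Unicode general
category. The `my-` helper names are hypothetical, and this is not the
Sentence_Break property that TR29 is built on; whether Emacs exposes that
property directly is left open here.

```emacs-lisp
;; Sketch only: classify characters by Unicode general category.
;; The `my-' names are hypothetical helpers, not part of Emacs.

(defun my-char-category (char)
  "Return the Unicode general category of CHAR as a symbol, e.g. `Lu'."
  (get-char-code-property char 'general-category))

(defun my-punctuation-char-p (char)
  "Return non-nil if the general category of CHAR starts with `P'.
This is a rough approximation, not the TR29 Sentence_Break classes."
  (eq ?P (aref (symbol-name (my-char-category char)) 0)))

(mapcar #'my-char-category '(?A ?a ?. ?1))
;; => (Lu Ll Po Nd)
```

A general category only says that "." is punctuation; it cannot tell the
full stop in "Mr." from one that ends a sentence.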
>> I mean very mechanically is fine, no linguistics or anything.
>>
>> So this
>>
>> "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played
>> through amazon.com alexa speakers?"
>>
>> would be
>>
>> ("'" "This sentence is spoken by Mr" "." "W" "." "E" "." "B
>> Dubois" "," "Esq" "." "!" "'" "played through amazon" "."
>> "com" "alexa "speakers" "?")
That is not really splitting into sentences.
The example contains two sentences; moreover, the first sentence is the
subject of the second.
This would be represented something like this:
(sentence
 (sentence "This sentence is spoken by Mr. W. E. B Dubois, Esq.!")
 "played through amazon.com alexa speakers?")
But it depends on what you want to achieve.
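
If the goal is only the purely mechanical split from the quoted example,
that is closer to tokenizing on punctuation than to finding sentence
boundaries. A minimal sketch under that assumption follows; the function
name and the chosen punctuation set are hypothetical, and the result does
not give the nested structure shown above.

```emacs-lisp
;; Sketch only: split a string into text runs and punctuation tokens.
;; `my-split-on-punctuation' is a hypothetical helper, not part of Emacs.
(require 'subr-x)  ; string-trim, string-empty-p on older Emacs versions

(defun my-split-on-punctuation (string)
  "Split STRING into trimmed text runs and single punctuation characters.
Only the characters . , ! ? and ' are treated as punctuation."
  (let ((pos 0)
        (tokens '()))
    (while (string-match "[.,!?']" string pos)
      ;; Save the match bounds first: `string-trim' may change match data.
      (let ((beg (match-beginning 0))
            (end (match-end 0)))
        (let ((run (string-trim (substring string pos beg))))
          (unless (string-empty-p run)
            (push run tokens)))
        (push (substring string beg end) tokens)
        (setq pos end)))
    ;; Keep any text that follows the last punctuation character.
    (let ((rest (string-trim (substring string pos))))
      (unless (string-empty-p rest)
        (push rest tokens)))
    (nreverse tokens)))

(my-split-on-punctuation "Hi, Mr. X!")
;; => ("Hi" "," "Mr" "." "X" "!")
```

Note that this treats the full stop in "Mr." like any other, which is
exactly why it does not amount to sentence splitting.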