Re: "split-sentences"?
From: tomas
Subject: Re: "split-sentences"?
Date: Sat, 23 Jan 2021 09:41:37 +0100
User-agent: Mutt/1.5.21 (2010-09-15)
On Sat, Jan 23, 2021 at 07:38:49AM +0100, moasenwood--- via Users list for the
GNU Emacs text editor wrote:
> moasenwood--- via Users list for the GNU Emacs text editor wrote:
>
> > Can I parse/split a string into sentences based on
> > human-language punctuation?
> >
> > Did anyone do that already?
>
> I mean very mechanically is fine, no linguistics or anything.
>
> So this
>
> "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played
> through amazon.com alexa speakers?"
>
> would be
>
> ("'" "This sentence is spoken by Mr" "." "W" "." "E" "." "B
> Dubois" "," "Esq" "." "!" "'" "played through amazon" "."
> "com" "alexa "speakers" "?")
Not exactly your result, but this comes close:
(split-string
 "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played through amazon.com alexa speakers?"
 "[[:punct:]][[:space:]]*")
=>
(""
"This sentence is spoken by Mr"
"W"
"E"
"B Dubois"
"Esq"
""
""
"played through amazon"
"com alexa speakers"
"")
You can adjust the results by tweaking the regexp (try word
boundaries like '\<' and '\>' if you want to keep the punctuation)
or by using split-string's other optional parameters (e.g. a
non-nil OMIT-NULLS drops the empty strings).
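As a rough sketch of those two adjustments (the names `my-sample' and
`my-split-keep-punct' are made up for illustration and are not part of
Emacs): the first form drops the empty strings via OMIT-NULLS, and the
second walks the string with `string-match' so that each punctuation
character is kept as its own token. It swaps in a plain matching loop
rather than word-boundary regexps, so treat it as one possible approach
under those assumptions.

```elisp
(require 'subr-x)  ; for `string-trim' and `string-empty-p' on older Emacsen

;; Hypothetical sample text, bound here only to keep the forms self-contained.
(defvar my-sample
  "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played through amazon.com alexa speakers?")

;; Variant 1: a non-nil third argument (OMIT-NULLS) drops the empty strings.
(split-string my-sample "[[:punct:]][[:space:]]*" t)
;; => ("This sentence is spoken by Mr" "W" "E" "B Dubois" "Esq"
;;     "played through amazon" "com alexa speakers")

;; Variant 2: keep each punctuation character as a separate token.
(defun my-split-keep-punct (string)
  "Split STRING into single punctuation characters and the text between them.
Hypothetical helper for illustration; whitespace around text runs is trimmed."
  (let ((pos 0)
        (tokens '()))
    ;; Each match is either one punctuation char or a run of non-punctuation.
    (while (string-match "[[:punct:]]\\|[^[:punct:]]+" string pos)
      (let ((tok (string-trim (match-string 0 string))))
        (unless (string-empty-p tok)    ; skip whitespace-only runs
          (push tok tokens)))
      (setq pos (match-end 0)))
    (nreverse tokens)))

(my-split-keep-punct my-sample)
;; => ("'" "This sentence is spoken by Mr" "." "W" "." "E" "." "B Dubois"
;;     "," "Esq" "." "!" "'" "played through amazon" "." "com alexa speakers" "?")
```

Neither variant knows anything about abbreviations, so "Mr." and
"amazon.com" are still split at the dot, which is the purely
mechanical behaviour asked for.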
Cheers
- t