emacs-devel

Bidirectional text and URLs


From: Stephen J. Turnbull
Subject: Bidirectional text and URLs
Date: Fri, 28 Nov 2014 12:27:28 +0900

Lars Magne Ingebrigtsen writes:

 > Using right-to-left markers to do phishing and obscure URLs has gotten
 > some attention on the webs today.  For instance, can you easily tell
 > where the link below takes you if you click on it in Gnus and
 > (presumably) rmail?

Eli's the expert, but I would say that given that the UAX#9 bidi
algorithm does what's wanted 99.44% of the time, it makes sense to
mark text reordered by RTL markers with a warning face, and to the
extent that your UI recognizes URLs, you could even query the user:

    This link appears to have been obfuscated by using unusual
    characters or presentation techniques.  This link points to

    http://myspace.com/#/...

    Is that your intended destination?

if you recognize that the URL was obfuscated.  Obfuscation is not
limited to RTL markers: it also covers confusable characters from
outside the expected block, such as a Cyrillic A in an otherwise ASCII
URL, and HTML A elements whose displayed text looks like a URL but
doesn't match the href.
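
As a rough illustration of the first two checks (not Emacs code, and
not a full confusables analysis), here is a minimal Python sketch.
The names has_bidi_controls, host_scripts and looks_obfuscated are
made up for this example, and using the first word of a character's
Unicode name as its "script" is only an approximation, since Python's
standard library does not expose the Script property.

```python
import unicodedata
from urllib.parse import urlsplit

# Explicit bidi formatting characters and marks (see UAX #9).
BIDI_CONTROLS = frozenset(
    [chr(c) for c in range(0x202A, 0x202F)]      # LRE, RLE, PDF, LRO, RLO
    + [chr(c) for c in range(0x2066, 0x206A)]    # LRI, RLI, FSI, PDI
    + ["\u200e", "\u200f", "\u061c"]             # LRM, RLM, ALM
)

def has_bidi_controls(text):
    """True if text contains any explicit bidi control or mark."""
    return any(ch in BIDI_CONTROLS for ch in text)

def host_scripts(url):
    """Approximate the set of scripts used by letters in the host name.

    Assumption: the first word of the Unicode character name
    (LATIN, CYRILLIC, GREEK, ...) stands in for the script.
    """
    host = urlsplit(url).hostname or ""
    return {unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in host if ch.isalpha()}

def looks_obfuscated(url):
    """Crude heuristic: bidi controls anywhere, or mixed scripts in the host."""
    return has_bidi_controls(url) or len(host_scripts(url)) > 1
```

A host such as "ex\u0430mple.com" (with a Cyrillic a) is flagged, while
an all-Latin host with accented letters is not.  Legitimate hosts that
mix scripts, for example Japanese names using both kana and kanji,
would also be flagged, so this is a starting point at best.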

Personally I'll probably just add RTL characters to my .procmailrc,
and never see them in the first place. :-)  Sorry about not noticing
your post, larsi! ;^)

 >      Works on URLs too.
 >
 >http://myspace.com/#/segami/moc.koobecaf//:sptth
 >
 > Unless I messed something up while cut'n'pasting that, you should see
 > the problem.

Interestingly, it worked temporarily in Terminal.app but then stopped;
I'm not sure why.  A wormy Apple, I guess! ;-)

 > Now, should we do something about that?  And if so -- what?

I think that the query and the statistical analysis of confusables are
likely to be a fair amount of work, if you want to avoid confusing the
user more than the obfuscation does.  A different face should be easy
enough in cases where you have RTL markers or mixed charset blocks.
You do need a way to turn it off, or to make it reasonably smart, for
ASCII, which is often mixed with other charsets.
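
To make the "different face" idea concrete outside Emacs, here is a
small Python sketch that makes explicit bidi overrides, embeddings and
isolates visible instead of letting them silently reorder the text.
The function name mark_bidi_controls and the [U+XXXX] notation are
invented for this example; the enabled flag stands in for the "way to
turn it off".  Plain marks such as LRM and RLM are deliberately left
alone here.

```python
import unicodedata

# Bidi classes of the explicit formatting characters in UAX #9.
FLAGGED_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF",
                   "LRI", "RLI", "FSI", "PDI"}

def mark_bidi_controls(text, enabled=True):
    """Replace explicit bidi controls with a visible [U+XXXX] tag."""
    if not enabled:
        return text
    out = []
    for ch in text:
        if unicodedata.bidirectional(ch) in FLAGGED_CLASSES:
            out.append("[U+%04X]" % ord(ch))  # show the code point instead
        else:
            out.append(ch)
    return "".join(out)
```

With this, a string that starts with U+202E is shown in logical order
with a leading "[U+202E]", so the reader sees both the real URL and the
fact that something tried to reorder it.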



