On 03/21/2014 04:54 PM, Ángel González wrote:
On 21/03/14 21:13, Daniel Kahn Gillmor wrote:
i've just pushed some cleanup suggestions here:
https://github.com/rockdaboot/libpsl/pull/1
i see you've pulled them already, thanks!
i've got three more conceptual issues which warrant discussion rather
than a patch, though. If there's a better place to have this discussion
than this mailing list, i'm happy to move to it; please let me know
where.
psl_is_tld() semantics
----------------------
the way i see it, we know what it means for psl_is_tld() to return
"true" -- but "false" could mean either:
(A) "this zone is subordinate to a TLD" (as example.com is to com)
or
(B) "this zone is superior to a TLD" (as uk is to co.uk). Note that
"uk" is not a public suffix.
Hmm, actually uk is a public suffix: since it does not match anything
explicitly in the list, it is caught by the implicit last-resort
rule '*'.
Also, what would you do with a domain such as his.name?
It is both inferior to one public suffix (.name) and superior to
another (forgot.his.name).
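To make the '*' point concrete, here is a minimal sketch of the lookup as I understand it: the longest matching rule wins, and if no rule matches at all, the last label is the public suffix. Everything in it is an assumption for illustration: the four-entry rule list is a toy, the function names are made up and are not libpsl API, wildcard and exception rules are ignored, and input is assumed to be non-empty, lower-case ASCII with no trailing dot.

```c
#include <stddef.h>
#include <string.h>

/* Toy rule list, NOT the real effective_tld_names.dat. */
static const char *const toy_rules[] = {
    "com", "co.uk", "name", "forgot.his.name"
};

/* Returns 1 if `domain` ends with `rule` on a label boundary. */
static int ends_with_rule(const char *domain, const char *rule)
{
    size_t dlen = strlen(domain), rlen = strlen(rule);

    if (rlen > dlen)
        return 0;
    if (strcmp(domain + (dlen - rlen), rule) != 0)
        return 0;
    return dlen == rlen || domain[dlen - rlen - 1] == '.';
}

/* Hypothetical helper: 1 if `domain` is itself a public suffix. */
int toy_is_public_suffix(const char *domain)
{
    size_t i, best = 0;
    int matched = 0;

    for (i = 0; i < sizeof toy_rules / sizeof toy_rules[0]; i++) {
        if (ends_with_rule(domain, toy_rules[i])) {
            size_t rlen = strlen(toy_rules[i]);
            matched = 1;
            if (rlen > best)
                best = rlen;   /* the longest matching rule prevails */
        }
    }

    if (matched)
        return strlen(domain) == best;

    /* No explicit rule matched: the implicit '*' rule applies, so the
     * public suffix is the last label. A single label is one. */
    return strchr(domain, '.') == NULL;
}
```

With this toy list, "uk" comes out as a public suffix (no explicit rule matches it, so '*' applies), while "his.name" does not, because the longest rule matching it is "name".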
hm, the same problem is present for amazonaws.com; it is superior to
s3.amazonaws.com (and 32 other public suffixes), and subordinate to .com.
I think it should have a different return code, though.
can you propose a specific API? the devil is in the details.
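As one possible starting point for that discussion, here is a rough sketch of a return value made of independent bit flags, so that "is a suffix", "below a suffix" and "above a suffix" can be reported together. This is only an illustration under stated assumptions: the enum, the function name and the five-entry list are all hypothetical and not part of libpsl, wildcard and exception rules are ignored, and input is assumed to be non-empty, lower-case ASCII with no trailing dot.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flags; none of these names exist in libpsl. */
enum toy_psl_relation {
    TOY_PSL_IS_SUFFIX    = 1 << 0, /* the name is itself a public suffix */
    TOY_PSL_BELOW_SUFFIX = 1 << 1, /* some parent is a public suffix     */
    TOY_PSL_ABOVE_SUFFIX = 1 << 2  /* some child is a public suffix      */
};

/* Toy list, NOT the real effective_tld_names.dat. */
static const char *const toy_suffixes[] = {
    "com", "co.uk", "name", "forgot.his.name", "s3.amazonaws.com"
};

/* Returns 1 if `name` is strictly longer than `tail` and ends with it
 * on a label boundary. */
static int is_proper_subdomain(const char *name, const char *tail)
{
    size_t nlen = strlen(name), tlen = strlen(tail);

    if (nlen <= tlen)
        return 0;
    return name[nlen - tlen - 1] == '.' &&
           strcmp(name + (nlen - tlen), tail) == 0;
}

/* Hypothetical classifier returning an OR of toy_psl_relation flags. */
int toy_psl_classify(const char *domain)
{
    int flags;
    size_t i;

    /* Implicit '*' rule: every single label counts as a public suffix,
     * so every multi-label name sits below at least one suffix. */
    flags = strchr(domain, '.') ? TOY_PSL_BELOW_SUFFIX : TOY_PSL_IS_SUFFIX;

    for (i = 0; i < sizeof toy_suffixes / sizeof toy_suffixes[0]; i++) {
        if (strcmp(domain, toy_suffixes[i]) == 0)
            flags |= TOY_PSL_IS_SUFFIX;
        else if (is_proper_subdomain(toy_suffixes[i], domain))
            flags |= TOY_PSL_ABOVE_SUFFIX;
    }
    return flags;
}
```

Under these assumptions, his.name and amazonaws.com both come back as BELOW | ABOVE, uk as IS | ABOVE, and example.com as plain BELOW. Whether flags like these are preferable to separate functions is exactly the kind of detail that would need deciding.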
https://www.gnu.org/software/libidn/
I would expect the input in punycode, and optionally in UTF-8. This means
the original list needs a preprocessing step.
This implies that people wouldn't be able to use effective_tld_names.dat
as distributed, right? I can see this working for OS-level
distributions (I can preprocess effective_tld_names.dat when
distributing it in publicsuffix for debian), but for regular users it
sounds terrible.