emacs-orgmode

Re: [O] Org Mode and PDF Notes!


From: Ramon Diaz-Uriarte
Subject: Re: [O] Org Mode and PDF Notes!
Date: Fri, 13 Nov 2015 00:51:41 +0100
User-agent: mu4e 0.9.13; emacs 24.5.1



On Thu, 12-11-2015, at 23:52, Matt Price <address@hidden> wrote:
> On Thu, Nov 12, 2015 at 9:28 AM, Matt Lundin <address@hidden> wrote:
>
>> Ramon Diaz-Uriarte <address@hidden> writes:
>> >
>> > I'll do. In the meantime, I think this is a limitation coming from
>> > poppler. Other people have mentioned similar things (e.g.,
>> > http://coda.caseykuhlman.com/entries/2014/pdf-extract.html) and using other
>> > tools that depend on poppler (such as Leela:
>> > https://github.com/TrilbyWhite/Leela) also will not give us the text
>> > itself.
>>
>> I don't think this is a limitation of poppler so much as the way that
>> pdf annotations work. Typically, the subject/text field is not populated
>> by the text of the highlighted region. Rather, a highlight annotation
>> specifies bounds, color, style, etc. Basically what Repligo does (I
>> wouldn't recommend using it, as it is closed source and severely out of
>> date) is to grab the text *at the time of highlighting* and add it to
>> the notes field. I don't know of any other annotation tool that does the
>> same thing. Applications built on poppler could do it, though they
>> currently do not.
>>
>> For extracting the text of highlighted regions *after the fact*, I've
>> had good luck with this script that relies on the pdf-reader gem for
>> ruby:
>>
>> https://gist.github.com/danlucraft/5277732
>>
> This looks interesting. It searches for file "./markup_receiver", but
> doesn't provide that file, which does not appear to be a gem.  Any hints?


I think I got it from

https://www.omniref.com/github/danlucraft/pyranine/HEAD/files/lib/pyranine/markup_receiver.rb
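
As for what extraction "after the fact" boils down to: since a highlight
annotation only stores bounds, the text has to be recovered by checking
which words on the page fall inside those bounds. Here is a minimal sketch
of that idea in plain Python. It is not what the gist or pdf-reader
actually do; the Rect and Word classes, the coordinates, and the
"centre of the word inside the rectangle" rule are all assumptions made
up for illustration, and a real tool would get the word boxes from the
PDF library.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """Axis-aligned rectangle in page coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


@dataclass
class Word:
    """A word and its bounding box, as a PDF library might report it
    (hypothetical type; not the API of any real library)."""
    text: str
    box: Rect


def highlighted_text(words, highlight):
    """Join the words whose box centre lies inside the highlight bounds."""
    picked = []
    for w in words:
        cx = (w.box.x0 + w.box.x1) / 2
        cy = (w.box.y0 + w.box.y1) / 2
        if highlight.contains(cx, cy):
            picked.append(w.text)
    return " ".join(picked)


# Made-up page content and a made-up highlight rectangle.
words = [
    Word("Typically,", Rect(10, 100, 60, 110)),
    Word("the", Rect(65, 100, 80, 110)),
    Word("field", Rect(85, 100, 110, 110)),
    Word("next", Rect(10, 85, 30, 95)),
]
highlight = Rect(62, 98, 112, 112)

print(highlighted_text(words, highlight))
```

Real highlights can span several lines and are usually stored as a list
of quadrilaterals rather than one rectangle, so an actual script would
have to repeat this test per quadrilateral.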

>
> With politza's help I am getting close to being able to extract annotation
> text from within pdf-tools, but I am not quite there yet.


Neat!


R.


>
>
>> Matt
>>


-- 
Ramon Diaz-Uriarte
Department of Biochemistry, Lab B-25
Facultad de Medicina
Universidad Autónoma de Madrid 
Arzobispo Morcillo, 4
28029 Madrid
Spain

Phone: +34-91-497-2412

Email: address@hidden
       address@hidden

http://ligarto.org/rdiaz


