Re: ANN: Vindaloo 0.2 & PopplerKit
From: Yen-Ju Chen
Subject: Re: ANN: Vindaloo 0.2 & PopplerKit
Date: Thu, 21 Jul 2005 12:18:57 -0700
On 7/21/05, Stefan Kleine Stegemann <stefankst@gmail.com> wrote:
> > I am interested in using PopplerKit to extract text from PDF.
> > I look at the headers and there is no support for that.
> > It would be nice that PopplerKit offer this kind of functions,
> > like extracting text from document or each page, extracting outline, etc.
>
> The functionality you're looking for is planned but not there right now.
> However, text extraction at least is not very difficult to implement,
> and I can give you that, say, by the end of next week.
> Outline will take a bit longer.
Thanks a lot.
Extracting text is plenty for me now.
> > Actually I am porting LuceneKit, which is a search engine like
> > Google or Spotlight,
>
> Great, is this related to the apache lucene thing? Does it also work on OSX?
Yes, it is a port of Apache Lucene.
It works on both GNUstep and OSX, compiled with GNUstep-make.
It already works well on some sample data,
but I want to test it with real data.
I have half a GB of PDF files, which should make a good real-world test set.
[snip]
Yen-Ju
> greets
> Stefan
>
> --
> Stefan Kleine Stegemann
> Mail: stefankst at gmail.com
> Home: http://rzserv2.fhnon.de/~lg017420
> Weblog: http://stefankst.blogspot.com/
>