bug-ocrad

Re: [Bug-ocrad] A few ocrad problems


From: Antonio Diaz Diaz
Subject: Re: [Bug-ocrad] A few ocrad problems
Date: Sun, 02 Jun 2013 21:46:36 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.11) Gecko/20050905

Hello Don.

Don Moir wrote:
Results so far look good and better than other sources I have tried.

Thanks for the feedback.


1) An orphan capital letter I fails to be detected.

Textline::recognize2 is supposed to become some kind of expert system for post-processing of recognized text, but it still lacks a lot of "rules". This is one of them. I'll add rules for isolated 'I' and 'UP' in the next release of ocrad.
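
As a rough illustration of what such a post-processing "rule" might look like, here is a minimal Python sketch. It is not ocrad's code: the function name fix_isolated_i and the set of look-alike characters are hypothetical, and it assumes (for illustration only) that a lone capital 'I' comes out of recognition as 'l' or '|'.

```python
# Hypothetical post-processing rule, not taken from ocrad.
# Assumption: a stand-alone capital 'I' was recognized as a look-alike.
LOOKALIKES = {"l", "|"}

def fix_isolated_i(text):
    """Replace any one-character word that looks like 'I' with 'I'."""
    words = text.split(" ")
    return " ".join("I" if w in LOOKALIKES else w for w in words)
```

Only words that consist of a single look-alike character are touched, so "all" or "|||" would be left alone.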


3) Failure to detect a space character in latin_space.pbm.

This one is a little trickier. Currently ocrad measures the distance between "character boxes". It should measure the distance between the black blobs inside those character boxes, but this is more difficult to do. I plan to fix this in a future version of ocrad.
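
The difference between the two measurements can be sketched in Python as follows. This is only an illustration under stated assumptions, not ocrad's implementation: each glyph is a hypothetical dict with an "x" offset and equally tall "rows" of '#' (black) and '.' (white) pixels, and the names box_gap and ink_gap are made up for this example.

```python
# Illustrative sketch only; the data layout and names are assumptions.

def box_gap(left, right):
    """White columns between two bounding boxes (negative if they overlap)."""
    left_width = len(left["rows"][0])
    return right["x"] - (left["x"] + left_width)

def ink_gap(left, right):
    """Smallest per-row count of white pixels between the black pixels
    of two glyphs; None if no row has ink in both glyphs."""
    best = None
    for lrow, rrow in zip(left["rows"], right["rows"]):
        if "#" not in lrow or "#" not in rrow:
            continue  # this row has ink in at most one glyph
        l_end = left["x"] + lrow.rindex("#")    # rightmost black pixel
        r_start = right["x"] + rrow.index("#")  # leftmost black pixel
        gap = r_start - l_end - 1
        best = gap if best is None else min(best, gap)
    return best

# Two slanted strokes whose boxes nearly touch.
slash = ["..##", ".##.", "##.."]
a = {"x": 0, "rows": slash}
b = {"x": 5, "rows": slash}
```

For these two slanted strokes the boxes are 1 column apart while the ink is 3 pixels apart on every row, which is the kind of case where a box-based threshold could miss a space that an ink-based one would see.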


4) Failure to detect merged ti, vi, im, ll in merged_ti_vi_im_ll.pbm

This one is the most difficult. The latest version of ocrad has fixed some problems like those, but there are lots of them (even with more than two letters merged). I'll try to fix as many as I can, but I can't promise anything.


The attached zip contains 6 files:

Next time, please send the images to my email address, not to the list. Thanks.


I am wondering if possible merged characters should be added as special
characters, like TT, ti, etc., so that in the future it's easy to add such
combinations.

I have in fact removed some such combinations from the latest version of ocrad. There are just too many of them, and trying to recognize them worsens recognition results.


Best regards,
Antonio.


