[Bug-ocrad] A few ocrad problems


From: Don Moir
Subject: [Bug-ocrad] A few ocrad problems
Date: Sat, 1 Jun 2013 16:22:51 -0400

Hello,

I am a developer and just started using ocrad. I am using ocrad-0.22-rc2.

Results so far look good, and better than the other sources I have tried.

Here are some problems I have found:

1) An orphan capital letter I fails to be detected. The current code checks 
for an AlphaNum before or after it, but spaces are not taken into account.

So for example, if you have a<space>|<space>b, the I is not detected as a 
capital letter I and is left as a vertical bar. So when setting lcode and rcode 
in Textline::recognize2 while checking a vertical bar, you need to skip the 
spaces before and after it to see what lcode and rcode should be set to.

2) I have an example with the word UP in it. This is detected as uP (lower case 
u).

3) Failure to detect a space character in latin_space.pbm. The words como jamás 
are detected as comojamás; otherwise the recognition is perfect here.

4) Failure to detect merged ti, vi, im, ll in merged_ti_vi_im_ll.pbm.
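
For item 1, the space-skipping lookup could be sketched roughly as below. This 
is only an illustration, not ocrad's actual code: it works on a plain string 
instead of ocrad's character objects, the helper names left_neighbor, 
right_neighbor and bar_is_capital_i are made up, and the rule "treat the bar as 
I if either neighbor is alphanumeric" is an assumption about what the existing 
check does with lcode and rcode.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of ocrad): nearest non-space character
// to the left of position i, or 0 if there is none.
char left_neighbor(const std::string& line, std::size_t i) {
    while (i > 0) {
        --i;
        if (line[i] != ' ') return line[i];
    }
    return 0;
}

// Hypothetical helper (not part of ocrad): nearest non-space character
// to the right of position i, or 0 if there is none.
char right_neighbor(const std::string& line, std::size_t i) {
    for (std::size_t j = i + 1; j < line.size(); ++j)
        if (line[j] != ' ') return line[j];
    return 0;
}

// Assumed rule: a vertical bar is read as capital I when the nearest
// non-space character on either side is alphanumeric.
bool bar_is_capital_i(const std::string& line, std::size_t i) {
    if (i >= line.size() || line[i] != '|') return false;
    const unsigned char l = static_cast<unsigned char>(left_neighbor(line, i));
    const unsigned char r = static_cast<unsigned char>(right_neighbor(line, i));
    return std::isalnum(l) != 0 || std::isalnum(r) != 0;
}
```

With the line "a | b" and the bar at index 2, both lookups skip the space and 
find a and b, so the bar would be treated as a capital I under the assumed rule.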

The attached zip contains 6 files:

cap_I_and_UP.pbm (for items 1 and 2)
cap_I_and_UP.txt

latin_space.pbm (for item 3)
latin_space.txt

merged_ti_vi_im_ll.pbm (for item 4)
merged_ti_vi_im_ll.txt

ocrad is working better for me than anything else so far, so it looks very 
promising.

I am wondering whether possible merged characters should be added as special 
characters, like TT, ti, etc., so that in future it is easy to add such 
combinations.



