VICUG-L Archives

Visually Impaired Computer Users' Group List

VICUG-L@LISTSERV.ICORS.ORG

Subject:
From: Don Moore <[log in to unmask]>
Reply To: Don Moore <[log in to unmask]>
Date: Tue, 24 Jun 2008 11:27:31 -0400
Now available at
http://EmpowermentZone.com/p2tsetup.exe 

A few years after version 2.1, I have updated the program with two substantive enhancements that broaden the range of PDFs from which text can be obtained.  If a PDF is locked with a password that you know, type it in the edit box that has been added to the main dialog.  If the PDF is primarily an image without textual characters, e.g., the result of a scan, check the new checkbox so that optical character recognition (OCR) is performed instead of the usual text extraction.  OCR is done with Google's Tesseract engine, which is currently the best free OCR available.

Note that OCR should be used as a last resort, since it takes much longer and is more error prone.  Essentially, PDF to TXT now incorporates the PDF2OCR package, which has been available at
http://EmpowermentZone.com/pdf2ocr.zip
In exchange for the additional OCR capability, the new installer is much larger, about 22 megabytes.
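
For anyone scripting a similar workflow, the "last resort" idea can be sketched in Python as below. This is only an illustration and not how PDF to TXT itself works: in the program you decide by marking the checkbox. The names pdf_to_text, extract_text, run_ocr and min_chars are hypothetical, and the two callables stand in for whatever extraction and OCR tools you actually have.

```python
from typing import Callable, Tuple


def pdf_to_text(path: str,
                extract_text: Callable[[str], str],
                run_ocr: Callable[[str], str],
                min_chars: int = 20) -> Tuple[str, str]:
    """Return (method, text), preferring ordinary extraction over OCR.

    extract_text and run_ocr are hypothetical callables supplied by the
    caller; min_chars is an assumed threshold for "this PDF has real text".
    """
    text = extract_text(path)
    # Count only non-whitespace characters, so a PDF that yields nothing
    # but blank lines is treated as having no usable text.
    if len("".join(text.split())) >= min_chars:
        return "extract", text
    # Slower and more error prone, so it is tried only when needed.
    return "ocr", run_ocr(path)
```

The threshold is a guess; a scanned PDF with a small text watermark could still pass it, which is one reason a manual choice can be the safer design.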

The program's batch conversion features also work with the new password and OCR options.  Thus, all the PDFs in a directory, or all those on a web page, may be processed with a single command, provided they all share the same password or are all image-based.
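
As a rough picture of what a directory batch involves, here is a minimal Python sketch that pairs each PDF in a folder with the text file it would produce. The function name plan_batch and the choice of writing each .txt next to its PDF are my assumptions for illustration, not a description of the program's internals.

```python
from pathlib import Path
from typing import List, Tuple


def plan_batch(directory: str) -> List[Tuple[Path, Path]]:
    """Pair every PDF in `directory` with the .txt path it would produce."""
    pdfs = sorted(
        p for p in Path(directory).iterdir()
        # Match the extension case-insensitively so REPORT.PDF is included.
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    # Assumed convention: same folder, same base name, .txt extension.
    return [(pdf, pdf.with_suffix(".txt")) for pdf in pdfs]
```

Each pair could then be handed to one conversion routine with a single shared password or OCR setting, which is why the files in a batch need to have those in common.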

Jamal


    VICUG-L is the Visually Impaired Computer User Group List.
Archived on the World Wide Web at
    http://listserv.icors.org/archives/vicug-l.html
