Brian,
Well, sort of. I've seen PDF files from which you could extract text, but
some characters were represented by graphics, and without sighted help you'd
never know you lost them.
I recall some Microchip datasheets where the logical "NOT", as in an
inverted logic input, was represented as a graphic, so you'd never know an
input was active low.
The "scanned pdf" files can be run through OCR programs, with highly
variable results.
When I was working as an engineer (now gladly retired), PDF data sheets were
the major barrier to doing research.
When I'm being sort of polite, I call them Perverted Dribble Font files.
When I'm not being polite, I say "Pretty D---ned F--ked".
When Adobe first introduced pdf, the purpose was to make everything "LOOK"
just like the original document. They never thought that content is more
important than form. They got it wrong for years and only later developed
what they call "searchable" pdf, which means the text is extractable.
So, although the original intent was to make documents consistent in
appearance with the original, what you get for accessibility is all over the
map, depending on how the person who made the pdf chose to do it.
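
For a rough first guess at which kind of pdf you have, here is a minimal
sketch in Python. The function name guess_pdf_kind is made up for
illustration, and this is not a real PDF parser: it only scans the raw bytes
for the /Font and /Subtype /Image markers that PDF dictionaries use. Newer
files can keep those dictionaries inside compressed streams, in which case
the scan sees nothing and answers "unknown".

```python
import re


def guess_pdf_kind(data: bytes) -> str:
    """Crude guess from raw PDF bytes (hypothetical helper, not a parser).

    Returns "text", "image-only", "mixed", or "unknown".
    """
    # "/Font" followed by a non-word character, so "/FontDescriptor"
    # and "/FontFile" alone do not count.
    has_font = re.search(rb"/Font\b", data) is not None
    # Image XObjects are declared with "/Subtype /Image"
    # (the whitespace between the two names is optional).
    has_image = re.search(rb"/Subtype\s*/Image\b", data) is not None

    if has_font and has_image:
        return "mixed"
    if has_font:
        return "text"
    if has_image:
        return "image-only"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python guess_pdf_kind.py datasheet.pdf
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as f:
            print(guess_pdf_kind(f.read()))
```

An "image-only" answer suggests a scan that needs OCR. A "mixed" answer is the
troublesome case: most of the text may extract fine, but the check cannot
tell you which characters, if any, were drawn as graphics.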
Tom Fowle WA6IVG
On Thu, Nov 12, 2015 at 10:15:53AM -0600, Brian Tew wrote:
> This is my current theory about pdf files.
> Please correct if I have it wrong.
>
> There are pdf files that represent text directly.
> There are other pdf files that represent pictures of text.
> We can convert and read the first type but not the second.
> Is that right?
> Thanks.
>
> Brian Tew