A common way to create a PDF document is to scan a paper copy and open the newly scanned file as a PDF in Adobe Acrobat. Unfortunately, a scanner captures only an image of the text, not the text itself. As a result, the content is not accessible to users who rely on assistive technology until further modifications are made to the document.
If the PDF document was not scanned, or if it has already undergone optical character recognition (OCR), skip this discussion and proceed to “Step 4: Add Form Fields and Set the Tab Order”.
There are many ways to determine if a PDF file originated from a scanned page:
The Page Appears to be Skewed
Sometimes sheets are not fed properly into the scanner. As a result, the page appears crooked, or skewed, on the screen. Lines of text are not straight but slant up or down.
Figure 5. Skewed Text Indicates a Scanned PDF
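The slant described above can be expressed as an angle. The following is a minimal Python sketch, not part of Acrobat, that assumes the two ends of one line of text have been measured as (x, y) positions with y increasing upward. The function names and the 0.5 degree tolerance are hypothetical choices made for illustration.

```python
import math


def skew_angle_degrees(x_start, y_start, x_end, y_end):
    """Return the angle of a text baseline relative to horizontal, in degrees.

    The arguments are hypothetical measured positions of the left and right
    ends of one line of text. 0 means the line is level, a positive value
    means it slants up to the right, a negative value means it slants down.
    """
    return math.degrees(math.atan2(y_end - y_start, x_end - x_start))


def is_skewed(angle_degrees, tolerance_degrees=0.5):
    """Treat a line as skewed when it tilts more than the tolerance.

    The default tolerance is an assumed threshold, not an Acrobat setting.
    """
    return abs(angle_degrees) > tolerance_degrees


# A line that rises 35 units over a width of 1000 units.
angle = skew_angle_degrees(0, 0, 1000, 35)
print(f"skew angle: {angle:.2f} degrees, skewed: {is_skewed(angle)}")
```

A level line gives an angle of 0, so it is not reported as skewed. A visibly slanted line is a strong hint that the page was scanned.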
Search for Characters that Appear on the Page
Use the Find command in Acrobat to search for text that appears on the page. Select Edit > Find and, in the search field, type a term that is visible on the page.
If the document was scanned, Acrobat will not find the search term and will instead display the message: “Acrobat has finished searching the document. No matches were found.”
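The same check can be approximated outside Acrobat. The sketch below assumes the text of each page has already been extracted into a list of strings, for example with a PDF library such as pypdf, whose page objects provide an extract_text() method. The function names are hypothetical, and the search ignores letter case as a simplifying assumption.

```python
def pages_containing(page_texts, term):
    """Return the 1-based page numbers whose extracted text contains term.

    page_texts is assumed to hold one string per page, as produced by a
    PDF text-extraction library. A page with no text may be "" or None.
    """
    needle = term.casefold()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in (text or "").casefold()
    ]


def appears_image_only(page_texts):
    """Return True when no page yielded any non-whitespace text.

    This suggests, but does not prove, that the pages are scanned images
    that have not been through OCR.
    """
    return not any((text or "").strip() for text in page_texts)


# Hypothetical extraction results: a scanned file and a text-based file.
scanned_pages = ["", None, "   "]
text_pages = ["Step 4: Add Form Fields", "Set the Tab Order"]

print(pages_containing(scanned_pages, "form"), appears_image_only(scanned_pages))
print(pages_containing(text_pages, "form"), appears_image_only(text_pages))
```

An empty result for a term that is clearly visible on the page corresponds to the “No matches were found” message described above.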
Zoom in and Check for Jagged Edges on Smooth Characters
Scanned images are bitmaps (see “Figure 6. Bitmapped Text Appearance”). The edges of curves in a bitmapped image do not appear smooth or rounded but jagged, as shown in the sample illustrating the word “Writing” in Figure 6. Use the Marquee Zoom tool in Acrobat to define the area and magnify the edges of curved letters such as “c”, “s”, and “o”. Text that has undergone the OCR process using the ClearScan option has smoother edges, but they are still uneven or lumpy where there should be smooth curves, as shown in the illustration of the words “Quality” and “region” in Figure 7.
Figure 6. Bitmapped Text Appearance
Figure 7. ClearScan Text Appearance
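To see why magnification exposes jagged edges, consider what happens when a bitmap is enlarged: no new detail is created, so each pixel simply becomes a larger square. The sketch below is an illustration only and does not reproduce how Acrobat renders pages. It assumes a tiny hypothetical 1-bit bitmap written as rows of characters, where “#” is ink and “.” is paper, and a hypothetical magnify function.

```python
def magnify(bitmap, factor):
    """Enlarge a bitmap by repeating every pixel factor times in each direction.

    bitmap is a list of equal-length strings, one string per pixel row.
    Because no detail is added, each diagonal step in the original
    becomes a larger stair step in the result.
    """
    enlarged = []
    for row in bitmap:
        wide_row = "".join(pixel * factor for pixel in row)
        enlarged.extend([wide_row] * factor)
    return enlarged


# A hypothetical fragment of a curved stroke, three pixels high.
curve = [
    "..##",
    ".#..",
    "#...",
]

for line in magnify(curve, 3):
    print(line)
```

Printed at three times the size, the curve becomes a visible staircase of blocks, which is the effect to look for when zooming in on letters such as “c”, “s”, and “o”.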
Acrobat Pro DC can detect the presence of assistive technology. If it then encounters a scanned document, Acrobat announces an empty page warning and displays the Scanned Page Alert dialog (see “Figure 8. Scanned Page Alert and Recognize Text Dialogs”).
Figure 8. Scanned Page Alert and Recognize Text Dialogs
Perform Optical Character Recognition (OCR) to convert the bitmap image of text to actual characters. In Acrobat Pro DC, this can be performed two ways:
Text can be recognized in the entire document, on the current page, or in a range of pages within the document. Use the Edit button in the scanned page dialog to set the desired characteristics for the resulting file. The “Recognize Text - General Settings” dialog also appears when the Make Accessible Wizard is run. The options to choose are:
Figure 9. Recognize Text - Settings
For additional information on performing optical character recognition using Adobe Acrobat, refer to the Acrobat Pro DC Help.
Proceed to Step 4: Add Form Fields and Set the Tab Order.