How to scan and get text from an image with OCR.

Wondering how to get text from an image using OCR but not sure where to begin? Learn how you can use OCR technology to transform text from image files into editable PDF documents.

It’s possible to scan documents into many different formats, including images. Because image files are easy to share, they can come in handy. But if you need to copy text from an image, you’ll hit a wall: the image stores the text as pixels, so there is nothing to select or copy.

Luckily, you may be able to turn an image into an editable text file and avoid retyping the entire document. Let’s find out how to read text from image files using OCR technology.

What is OCR technology?

Optical character recognition (OCR) is a technology that can scan uneditable files, identify the text elements on the page, and use the scanned data to produce an editable text file, like a PDF. OCR is available in two main forms.

Some PDF editing software solutions can read through image files and recognize character shapes. The software then reconstructs the image as a PDF file. In the best-case scenario, it can even identify and reproduce the original font.

Certain scanners can read text on physical documents and automatically turn them into text files. You can easily turn paper documents into PDFs without retyping all of the text.
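
To make the idea of recognizing character shapes more concrete, here is a minimal Python sketch. It is a toy illustration only, not how Acrobat or any particular OCR engine works: it assumes each character arrives as a tiny 3x5 bitmap and simply picks the stored template that differs in the fewest pixels. The templates and the function names (`pixel_distance`, `recognize_glyph`, `recognize_line`) are hypothetical and exist only for this example.

```python
# Toy shape matching: each glyph is a 3x5 bitmap, '#' marks an inked pixel.
# The templates below are invented for illustration only.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
}


def pixel_distance(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize_glyph(bitmap):
    """Return the template character that differs from bitmap in the fewest pixels."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(bitmap, TEMPLATES[ch]))


def recognize_line(glyphs):
    """Recognize a sequence of glyph bitmaps and join the results into text."""
    return "".join(recognize_glyph(g) for g in glyphs)


# A slightly smudged "L" (one stray pixel) followed by clean "O" and "T" glyphs.
smudged_l = ["#..", "#..", "#.#", "#..", "###"]
print(recognize_line([smudged_l, TEMPLATES["O"], TEMPLATES["T"]]))
```

Real OCR software also has to find lines and characters on the page, cope with different fonts and sizes, and correct for skew, which is why image quality matters so much in practice.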

How scanning text from an image makes work easier.

At work, you might need to update, edit, or repurpose the copy from a paper document, such as an old marketing brochure, a contract, or user instructions. Without a digital copy, you’d have to fall back on the old-fashioned method of retyping the information by hand.

Sometimes clients or colleagues provide paper copies or image files only. If you can extract text from image files, you’ll be able to make the text both editable and searchable. Not only will you be able to correct mistakes or make needed updates, but you’ll also be able to search the document for key terms and even index those terms or use them to organize the information into a larger database. Especially if you’re working with a large number of documents for legal or research purposes, that functionality might be crucial for the success of your work.
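
Once text has been extracted, searching and indexing become ordinary text processing. The Python sketch below shows one simple way to index key terms across several documents. It assumes the OCR text is already available as plain strings; the file names, sample sentences, and the helper names `build_index` and `search` are made up for illustration.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(name)
    return index


def search(index, term):
    """Return the sorted names of the documents that contain the term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical text extracted from two scanned files.
documents = {
    "contract.pdf": "The Supplier shall deliver the goods within 30 days.",
    "brochure.pdf": "Our goods ship worldwide.",
}
index = build_index(documents)
print(search(index, "goods"))
print(search(index, "Supplier"))
```

For a large legal or research collection you would more likely rely on a dedicated search tool, but the principle is the same: the text has to be machine-readable before it can be indexed.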

How to get text from images easily.

You might think you need expensive or complicated software to recognize text from image files, but OCR software has become commonplace. You’ll find several different options and methods to get text from an image.

How to OCR a PDF.

You can use OCR to scan text from image files in multiple ways. The easiest method is to use a PDF editing application. Many modern apps have OCR features and can read through image files in seconds.

Another workable option is converting an image to a PDF. Some PDF converters have OCR functionality and can also read and convert text. Not all converters can do it, but it’s worth a try.

Finally, you can use an OCR-capable scanner or free scanner app if you have the original paper document. This way, you can save time and turn physical documents directly into machine-readable PDFs.

Get text from a single image.

Sometimes all you need is the text from a single image or one-page PDF file. To get the text from the image, follow these steps to apply OCR:

  1. Open a PDF file containing a scanned image in Adobe Acrobat for Mac or PC.
  2. Click the Edit PDF tool in the right pane. Acrobat automatically applies OCR to your document and converts it to a fully editable copy of your PDF.
  3. Click the text element you wish to edit and start typing. New text matches the look of the original fonts in your scanned image.
  4. Choose File > Save As and type a new name for your editable document.

Get text from images of a multiple-page file.

The steps to get text from multiple images in a multiple-page file are the same. If Acrobat still doesn’t recognize the text in your PDF’s images, you can use Adobe Acrobat Pro to get text from all pages and all images at once by following these steps:

  1. Open Adobe Acrobat Pro.
  2. Choose Tools > Export PDF.
  3. Export to a Word document or rich text file.
  4. Choose Include Images from the advanced options.

Are there any use cases where OCR image-to-text may not work?

If OCR isn’t working, it’s often due to poor image quality, especially for scanned documents. Ensure that you have good lighting when taking a picture of a document and that the document is straight when scanned.

You might get an error message if the document contains renderable text. If the text isn’t renderable, but you get that error message anyway, you can try converting the PDF to a TIFF and then open the TIFF file as a PDF to rerun OCR.

OCR can also be less accurate when text and graphics are heavily mixed or distorted, because the software struggles to separate the visual information from the copy. It works best on straight lines of copy.

Can I scan text from an image with any file type?

There are various converters online that are designed to recognize text from different file types, but you can also convert almost any file type to a PDF to activate OCR. Just convert the file to PDF and then open the file in Acrobat. Click on the text to edit.
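
If you handle many files, it can help to check what a file actually is before deciding whether to convert it to PDF. The Python sketch below is one possible approach, not a feature of Acrobat: it compares the first bytes of a file against the well-known signatures for PDF, PNG, JPEG, and TIFF. The function names are hypothetical, and the list covers only these four formats.

```python
# File signatures ("magic bytes") for a few common formats.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
]

IMAGE_TYPES = {"png", "jpeg", "tiff"}


def detect_type(header):
    """Guess the file type from the first bytes of a file."""
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    return "unknown"


def detect_file_type(path):
    """Read the first eight bytes of the file at path and guess its type."""
    with open(path, "rb") as f:
        return detect_type(f.read(8))


def needs_pdf_conversion(header):
    """True for recognized image formats that would be converted to PDF first."""
    return detect_type(header) in IMAGE_TYPES


print(detect_type(b"%PDF-1.7"))
print(needs_pdf_conversion(b"\x89PNG\r\n\x1a\n"))
```

Checking the content rather than the file extension is useful because extensions can be wrong or missing, especially on files received from clients or colleagues.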

Is extracting text from an image the same as getting text from an image?

Extracting text from image files is the same as getting text from an image. If you want to edit text in its original format, you can turn your image file into an editable PDF, but if you want to extract the text to a new file type, you can do that too by copying and pasting the editable text into another document.

More resources on documents and PDFs.

Now that you’ve learned how to scan text from an image using OCR, here are other ways to work with your documents and PDFs:

Explore everything you can do with Acrobat online services to convert, edit, and share your files.