Do you ever need to extract text from images or scanned PDF files? If you have been in that situation, you should know something about OCR.
OCR technology converts images or scanned PDF files into editable and searchable text. There are plenty of free OCR tools on the market, and we have tested some of them to give a thorough review.
Free Online OCR is a free online OCR tool. It lets users extract the text from PDF, JPG, GIF, TIFF or BMP files without registration. You do not even need to leave your email address, as the text recognition result is displayed as soon as the source file is uploaded.
We upload an image and get the result in seconds. The illustration below shows that the free OCR tool has done a good job: the text in the source image file has been recognized correctly and extracted successfully.
Pros: 1. It is a free and useful tool that requires neither registration nor an email address.
2. It can handle images with multi-column text and also supports multiple languages.
Cons: 1. There is a size limit: an uploaded file can be at most 2MB. If the PDF file is larger than 2MB, which is often the case, you have to split it into smaller parts.
2. The image should be no wider or higher than 5000 pixels and there is a limit of 10 image uploads per hour.
Overview: The free online OCR tool works well in general and will help when you need a quick conversion. But if you have a batch of scanned PDF files and images to convert, it is not a good choice.
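The limits listed for Free Online OCR can be checked before uploading. The following is only a minimal sketch: the function names are hypothetical, and treating "2MB" as 2 * 1024 * 1024 bytes is an assumption, since the review does not say how the service counts megabytes.

```python
# Limits as stated in the review. Reading "2MB" as 2 * 1024 * 1024 bytes
# is an assumption; the service may count megabytes differently.
MAX_BYTES = 2 * 1024 * 1024
MAX_SIDE_PX = 5000
MAX_UPLOADS_PER_HOUR = 10


def upload_problems(size_bytes, width_px, height_px, uploads_this_hour):
    """Return the reasons an upload would be rejected (empty list if none)."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 2MB")
    if width_px > MAX_SIDE_PX or height_px > MAX_SIDE_PX:
        problems.append("image wider or higher than 5000 pixels")
    if uploads_this_hour >= MAX_UPLOADS_PER_HOUR:
        problems.append("hourly limit of 10 uploads reached")
    return problems


def parts_needed(size_bytes):
    """Lower bound on the number of parts an oversized file must be split into."""
    return max(1, -(-size_bytes // MAX_BYTES))  # ceiling division
```

Note that parts_needed gives only a lower bound: pages of a scanned PDF differ in size, so a real split may need more parts than this estimate.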
PDFMate Free PDF Converter is a free desktop program that can convert PDF documents to multiple file formats. It is not only a PDF converter but also a PDF merger. With its built-in OCR technology, the freeware lets users convert scanned PDFs to editable text or Microsoft Word files. The interface of the program is quite different from those of the two programs discussed below, but it is also dead simple.
We add an image PDF file to the program, move to the advanced settings to enable OCR, and then choose Text as the output format. We click the convert button, and the status bar shows success a few seconds later. We open the destination folder to find the output file, and all the characters are correctly recognized. We then choose Word as the output format, to see whether it preserves the formatting, and find that all the layouts in the source PDF document are well preserved.
Pros: 1. Easy to use. The interface is quite simple and clean. All functions are straightforward.
2. Excellent recognition result with all the formats and layouts well preserved.
3. More than an OCR tool. It is a PDF to Word + PDF to Text + PDF to Image + PDF to HTML + PDF to SWF + PDF to Epub converter and a PDF merger.
Cons: 1. The current version can only recognize English characters.
2. It has a 3-page limit. However, with PDFMate Free PDF Merger, you can split the scanned PDF file into parts and then convert the parts into editable files.
Overview: As a newcomer among free OCR tools, PDFMate Free PDF Converter has done a very good job in its first outing, even better than its predecessors. If you ever need to edit scanned PDF files, it is worth a try.
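To work within the 3-page limit, a longer scanned PDF has to be cut into consecutive page ranges first. As a rough illustration (the helper name page_chunks is hypothetical, and the actual splitting would still be done in a PDF tool), the ranges can be computed like this:

```python
def page_chunks(total_pages, limit=3):
    """Split pages 1..total_pages into (first, last) ranges of at most `limit` pages."""
    if total_pages < 0 or limit < 1:
        raise ValueError("total_pages must be >= 0 and limit must be >= 1")
    return [
        (first, min(first + limit - 1, total_pages))
        for first in range(1, total_pages + 1, limit)
    ]


print(page_chunks(7))  # prints [(1, 3), (4, 6), (7, 7)]
```

Each range is 1-based and inclusive, so a 7-page scan would be converted in three passes.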
FreeOCR is a free desktop program that can import images and scanned PDF files and export plain text or directly to Microsoft Word format. Microsoft users will be quite familiar with the neat and friendly user interface.
The program is simple to install and more importantly, free to use.
We add an image PDF file to the program and OCR a random page. To be honest, the output text is much better than expected, as we had read some negative comments about the program. The characters are extracted correctly and can be edited directly within the program.
Pros: 1. Easy to use. Output text can be edited and saved as a text file or Word document.
2. No size limitation and multiple languages supported, including English, Danish, German, Finnish, French, Italian, Dutch, Norwegian, Polish, Spanish and Swedish.
Cons: 1. The conversion quality is not so great. The program still makes some mistakes on certain font types. It fails to load certain scanned PDF files at times.
2. The program does not preserve formatting, nor does it reproduce fonts or sizes.
Overview: Based on the latest Tesseract OCR engine, FreeOCR has been doing a good job and providing a reasonable conversion service. If you want to extract text from images and PDFs, then you can at least try the program. It is free.
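For readers who want to try the underlying Tesseract engine directly, its command-line tool takes an image, an output base name and a language code. The sketch below only builds that command; the function name is hypothetical, and whether FreeOCR uses these same language codes internally is an assumption.

```python
# Tesseract language codes for the languages the review lists for FreeOCR.
# Whether FreeOCR uses these exact codes internally is an assumption.
TESSERACT_LANG = {
    "English": "eng", "Danish": "dan", "German": "deu", "Finnish": "fin",
    "French": "fra", "Italian": "ita", "Dutch": "nld", "Norwegian": "nor",
    "Polish": "pol", "Spanish": "spa", "Swedish": "swe",
}


def tesseract_command(image_path, output_base, language="English"):
    """Build the argument list for: tesseract <image> <output_base> -l <code>."""
    return ["tesseract", image_path, output_base, "-l", TESSERACT_LANG[language]]
```

If Tesseract and the matching language data are installed, passing the returned list to subprocess.run should write the recognized text to a file named after the output base with a .txt extension.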
Free OCR to Word is another powerful but free OCR desktop program that lets you extract the text from images and export it to MS Word or save it as a plain text file. On the surface, its interface has a lot in common with that of FreeOCR.
We first try to add an image PDF to the program, but unfortunately it seems that the program can only load images. The good thing is that it supports JPG, PNG, BMP, TIFF and various other image formats. We then load an image and click the OCR button. The editable text is displayed in the right column within seconds. We also perform some minor editing.
Pros: 1. It supports various image formats and the accuracy rate is relatively high.
2. You can edit the text before saving and export the OCR-converted text to Microsoft Word format. (This requires Microsoft Word to be installed.)
Cons: 1. You can only add image files to the program.
2. It seems that the program does not support multiple languages either.
Overview: Free OCR to Word is generally very good. Though it had difficulty with some documents and produced gobbledegook text, that is a common problem with OCR, and no program has solved it entirely. If you are looking for a free OCR converter for image files, it is worth trying.