Convert Scanned PDF to Searchable PDF with OCR in C#

Some PDF files consist only of page images, usually produced by a scanner or another imaging device, so their text cannot be searched or selected. Running OCR on such a file produces a searchable PDF in which the recognized text can be found, copied, and updated. This article explains how to convert a scanned PDF to a searchable PDF with OCR programmatically using C#.

Scanned PDF to Searchable PDF by OCR – C# API Installation

You can perform OCR operations on a scanned PDF file with the Aspose.OCR for .NET API. To set it up, either download the DLL file from the New Releases section or install the package with the following NuGet command:

PM> Install-Package Aspose.OCR

Convert Scanned PDF to Searchable PDF Programmatically using C#

To convert a scanned PDF file to a searchable PDF document, follow the steps below:

  1. Initialize an instance of the AsposeOcr class.
  2. Set the OCR recognition properties with the DocumentRecognitionSettings class.
  3. Recognize the page images in the PDF using the RecognizePdf method.
  4. Save the OCR result as a searchable PDF file.

The code snippet below explains how to convert a scanned PDF to a searchable PDF document programmatically using C#:
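
A minimal sketch of this conversion, assuming a recent Aspose.OCR package is referenced. The AsposeOcr, RecognizePdf, and DocumentRecognitionSettings names come from the steps above. The file names are hypothetical, and the static SaveMultipageDocument call with SaveFormat.Pdf is an assumption about how the result is saved. Newer package versions may expose a different entry point, so check the API reference for the version you install.

```csharp
using System.Collections.Generic;
using Aspose.OCR;

class ScannedPdfToSearchablePdf
{
    static void Main()
    {
        // Hypothetical file names; replace them with your own paths.
        string inputPath = "scanned.pdf";
        string outputPath = "searchable.pdf";

        // Create the OCR engine.
        AsposeOcr api = new AsposeOcr();

        // Recognition settings for PDF input; the defaults are used here.
        DocumentRecognitionSettings settings = new DocumentRecognitionSettings();

        // Run OCR on the scanned PDF; one result is expected per recognized page.
        List<RecognitionResult> results = api.RecognizePdf(inputPath, settings);

        // Assumed save call: writes all recognized pages into one searchable PDF.
        AsposeOcr.SaveMultipageDocument(outputPath, SaveFormat.Pdf, results);
    }
}
```

If the call succeeds, opening the output file in a PDF viewer and searching for a word that appears on a scanned page is a quick way to confirm that the text was recognized.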

Get Free Evaluation License

You can request a free temporary license to evaluate OCR on scanned PDF files without any limitations.
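
Once you have a license file, it is typically applied once at application startup, before any OCR call. The sketch below assumes the License class and its SetLicense method in the Aspose.OCR namespace, which is the pattern Aspose libraries commonly follow. The license file name is hypothetical.

```csharp
using Aspose.OCR;

class LicenseSetup
{
    static void Main()
    {
        // Hypothetical path to the temporary license file you received.
        string licensePath = "Aspose.OCR.lic";

        // Assumed API: load the license before creating the OCR engine.
        License license = new License();
        license.SetLicense(licensePath);
    }
}
```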

Conclusion

In this article, you have learned how to convert a scanned PDF file to a searchable PDF document by running OCR on it programmatically in C#. You can explore other OCR-related features of the API in the documentation. Please feel free to contact us at the forum if you have any inquiries.

See Also

Convert Image to Word Document (DOCX) with OCR using C#