The find and replace feature lets you replace every occurrence of a piece of text in a document in one go, so you do not have to locate and update each occurrence manually. This article goes one step further and covers how to automate find and replace in PDF documents. In particular, you will learn how to find and replace text in a whole PDF, on a particular page, in a page region, or by regular expression using C#.
- C# API to Find and Replace Text in PDF
- Find and Replace Text in PDF using C#
- Replace Text in a Particular Page in PDF
- Replace Text in PDF Page Region
- Find and Replace Text in PDF using Regex
C# API to Find and Replace Text in PDF
Aspose.PDF for .NET is a C# class library that provides basic as well as advanced PDF manipulation features for .NET applications. The API also lets you find and replace text in PDF files in several ways: across the whole document, on a single page, within a page region, or by regular expression. You can either download the API’s DLL or install it using NuGet.
PM> Install-Package Aspose.PDF
Find and Replace Text in PDF using C#
The following are the steps to find and replace text in a PDF document.
- Use the Document class to load the PDF document using its path.
- Create an instance of the TextFragmentAbsorber class and provide the search phrase to its constructor.
- Accept the text absorber for all the pages of the PDF using Document.Pages.Accept(TextFragmentAbsorber).
- Get the found text fragments as a TextFragmentCollection object from the TextFragmentAbsorber.TextFragments property.
- Loop through the TextFragmentCollection and replace the text in each fragment.
- Save the updated PDF document using the Document.Save(String) method.
The following code sample shows how to find and replace text in a PDF using C#.
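A minimal sketch of the steps above might look like this. The file names input.pdf and output.pdf and the strings "old text" and "new text" are placeholders chosen for illustration, not values from the API.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ReplaceTextInWholePdf
{
    static void Main()
    {
        // Load the PDF document (placeholder path).
        Document pdfDocument = new Document("input.pdf");

        // Create the absorber with the phrase to search for (placeholder text).
        TextFragmentAbsorber absorber = new TextFragmentAbsorber("old text");

        // Run the absorber over every page of the document.
        pdfDocument.Pages.Accept(absorber);

        // Get the fragments that matched the search phrase.
        TextFragmentCollection fragments = absorber.TextFragments;

        // Replace the text in each matching fragment.
        foreach (TextFragment fragment in fragments)
        {
            fragment.Text = "new text";
        }

        // Save the updated document (placeholder path).
        pdfDocument.Save("output.pdf");
    }
}
```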
Replace Text in a Particular PDF Page using C#
The following are the steps to find and replace text on a particular page in a PDF document.
- Use the Document class to load the PDF document using its path.
- Create an instance of the TextFragmentAbsorber class and provide the search phrase to its constructor.
- Accept the text absorber for the desired page, for example the first page, using Document.Pages[1].Accept(TextFragmentAbsorber). Page indexes start at 1.
- Loop through the TextFragmentAbsorber.TextFragments collection and replace the text in each fragment.
- Save the updated PDF document using the Document.Save(String) method.
The following code sample shows how to find and replace text on a particular page of a PDF using C#.
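As a hedged sketch of these steps, the example below restricts the search to the first page. The file names and the search and replacement strings are placeholders for illustration.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ReplaceTextOnSinglePage
{
    static void Main()
    {
        // Load the PDF document (placeholder path).
        Document pdfDocument = new Document("input.pdf");

        // Create the absorber with the phrase to search for (placeholder text).
        TextFragmentAbsorber absorber = new TextFragmentAbsorber("old text");

        // Search only the first page; the page collection is 1-based.
        pdfDocument.Pages[1].Accept(absorber);

        // Replace the text in each fragment found on that page.
        foreach (TextFragment fragment in absorber.TextFragments)
        {
            fragment.Text = "new text";
        }

        // Save the updated document (placeholder path).
        pdfDocument.Save("output.pdf");
    }
}
```

To target a different page, change the index passed to Pages, keeping in mind that the first page is 1 rather than 0.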
Replace Text in PDF Page Region using C#
You can also find and replace text in a particular region of a page in a PDF document. The following steps show how to define the region and then replace text within it.
- Use the Document class to load the PDF document using its path.
- Create an instance of the TextFragmentAbsorber class and provide the search phrase to its constructor.
- Define the page region using the Rectangle class and assign it to the TextFragmentAbsorber.TextSearchOptions.Rectangle property.
- Accept the text absorber for the desired page using Document.Pages[1].Accept(TextFragmentAbsorber). Page indexes start at 1.
- Loop through the TextFragmentAbsorber.TextFragments collection and replace the text in each fragment.
- Save the updated PDF document using the Document.Save(String) method.
The following code sample shows how to find and replace text in a particular page region in a PDF using C#.
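One way to sketch the region-limited search is shown below. The rectangle coordinates (100, 100, 200, 200), the file names, and the search and replacement strings are illustrative assumptions; the region has to be set on the search options before the absorber is accepted by the page.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ReplaceTextInPageRegion
{
    static void Main()
    {
        // Load the PDF document (placeholder path).
        Document pdfDocument = new Document("input.pdf");

        // Create the absorber with the phrase to search for (placeholder text).
        TextFragmentAbsorber absorber = new TextFragmentAbsorber("old text");

        // Plain-text search (false = no regular expression).
        TextSearchOptions options = new TextSearchOptions(false);

        // Limit the search to page bounds.
        options.LimitToPageBounds = true;

        // Restrict the search to a region given as lower-left x, lower-left y,
        // upper-right x, upper-right y (example coordinates, assumed).
        options.Rectangle = new Aspose.Pdf.Rectangle(100, 100, 200, 200);

        absorber.TextSearchOptions = options;

        // Search the first page only; the page collection is 1-based.
        pdfDocument.Pages[1].Accept(absorber);

        // Replace the text in each fragment found inside the region.
        foreach (TextFragment fragment in absorber.TextFragments)
        {
            fragment.Text = "new text";
        }

        // Save the updated document (placeholder path).
        pdfDocument.Save("output.pdf");
    }
}
```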
Replace Text in PDF using Regular Expression in C#
You can also use regular expressions to find and replace text occurrences that match a particular pattern. For this, you only need to provide a regular expression instead of a plain search phrase and enable regex-based search through TextSearchOptions. The following are the steps to do so.
- Use the Document class to load the PDF document using its path.
- Create an instance of the TextFragmentAbsorber class and provide the regular expression to its constructor.
- Create an instance of the TextSearchOptions class and pass true to its constructor to enable the regex-based search.
- Assign the TextSearchOptions object to the TextFragmentAbsorber.TextSearchOptions property.
- Accept the text absorber for the desired page using Document.Pages[1].Accept(TextFragmentAbsorber). Page indexes start at 1.
- Loop through the TextFragmentAbsorber.TextFragments collection and replace the text in each fragment.
- Save the updated PDF document using the Document.Save(String) method.
The following code sample shows how to find and replace text in a PDF using a regular expression in C#.
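A minimal sketch of the regex-based variant follows. The pattern \d{4}-\d{4}, which would match year ranges such as 1999-2000, is an assumption chosen for illustration, as are the file names and the replacement string.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ReplaceTextWithRegex
{
    static void Main()
    {
        // Load the PDF document (placeholder path).
        Document pdfDocument = new Document("input.pdf");

        // Pass a regular expression instead of a plain phrase.
        // Example pattern (assumed): four digits, a hyphen, four digits.
        TextFragmentAbsorber absorber = new TextFragmentAbsorber(@"\d{4}-\d{4}");

        // Passing true tells the absorber to treat the phrase as a regex.
        TextSearchOptions options = new TextSearchOptions(true);
        absorber.TextSearchOptions = options;

        // Search the first page; the page collection is 1-based.
        pdfDocument.Pages[1].Accept(absorber);

        // Replace each match with the placeholder replacement text.
        foreach (TextFragment fragment in absorber.TextFragments)
        {
            fragment.Text = "new text";
        }

        // Save the updated document (placeholder path).
        pdfDocument.Save("output.pdf");
    }
}
```

To apply the same pattern to the whole document instead of one page, the absorber could be passed to pdfDocument.Pages.Accept in place of the single-page call.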
Conclusion
PDF automation is widely used to manipulate PDF documents from within web and desktop applications. This article covered one useful PDF automation feature: finding and replacing text programmatically using C#. The step-by-step guide has shown how to find and replace text in a whole PDF, on a particular page, in a page region, or by regular expression. You can explore more advanced features using the documentation of the API.