In this article, I’ll demonstrate how to find and replace text in Word (DOC/DOCX) documents programmatically using Java. The step-by-step guide and code samples cover several find and replace scenarios.
MS Word provides an easy way to find and replace text in documents. A popular use case is removing or replacing sensitive information before documents are shared with other parties. However, the manual process requires MS Word to be installed and every document to be updated separately. Integrating find and replace into your own desktop or web application saves that time and effort. So let’s see how to find and replace text in Word documents using Java in various scenarios.
- Find and Replace Text in Word DOC/DOCX using Java
- Replace Similar Words based on Regex Pattern in Word DOC/DOCX
- Find and Replace Text in the Header/Footer of Word Document
- Find and Replace Text with Meta-Characters in Word DOC/DOCX
Java API to Find and Replace Text in Word Documents
To implement the find and replace feature, we’ll use Aspose.Words for Java, a feature-rich word processing API for the Java platform. You can either download its JAR or add it to your Maven-based application using the following configuration.
Repository
<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://repository.aspose.com/repo/</url>
</repository>
Dependency
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-words</artifactId>
    <version>20.5</version>
    <classifier>jdk17</classifier>
</dependency>
Find and Replace Text in Word Documents (DOC/DOCX) using Java
Let’s start with a simple scenario in which we find the word “sad” in the input Word document and replace it. The following are the steps to perform this operation.
- Create an instance of the Document class and pass it the Word document’s path.
- Find and replace text using the Document.getRange().replace(String, String, FindReplaceOptions) method.
- Save the document using the Document.save(String) method.
The following code sample shows how to find and replace text in Word DOCX documents using Java.
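As a minimal sketch of the three steps above, the following assumes the Aspose.Words JAR is on the classpath. The file names `input.docx` and `output.docx`, the replacement word “bad”, the helper name `replaceAll`, and the chosen matching options are illustrative assumptions, not values confirmed by the original sample.

```java
import com.aspose.words.Document;
import com.aspose.words.FindReplaceOptions;

public class SimpleFindReplace {

    // Replaces every occurrence of `find` in the whole document
    // and returns the number of replacements made.
    static int replaceAll(Document doc, String find, String replacement) throws Exception {
        FindReplaceOptions options = new FindReplaceOptions();
        options.setMatchCase(false);          // treat "Sad" and "sad" alike
        options.setFindWholeWordsOnly(true);  // skip words that merely contain the text
        return doc.getRange().replace(find, replacement, options);
    }

    public static void main(String[] args) throws Exception {
        // Step 1: load the Word document (placeholder path).
        Document doc = new Document("input.docx");

        // Step 2: find and replace the text.
        int count = replaceAll(doc, "sad", "bad");
        System.out.println("Replacements made: " + count);

        // Step 3: save the updated document (placeholder path).
        doc.save("output.docx");
    }
}
```

The `replace` method returns the number of replacements, which is useful for logging or for checking that the expected text was actually found.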
Below is the input Word document that we have used in this article.
The following is the output after finding and replacing the word “sad”.
Find and Replace Similar Words in Word DOC/DOCX using Java
You can also find and replace text based on similarity. For example, the words “sad”, “mad” and “bad” follow a similar pattern: each ends in “ad”. Email addresses are another example of such text. In such cases, you can define a regex pattern to find and replace every occurrence that matches it. The following are the steps to achieve this.
- Create an instance of the Document class and pass it the Word document’s path.
- Define a regex pattern using the Pattern.compile() method and pass it to the Document.getRange().replace(Pattern pattern, String replacement, FindReplaceOptions options) method.
- Save the updated document using the Document.save(String) method.
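Before a pattern is passed to the API, it can help to check what it matches using plain `java.util.regex`. The sketch below uses only the JDK. The pattern `\b[smb]ad\b`, the class name, and the sample sentence are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimilarWordsPattern {

    // Matches the whole words "sad", "mad" and "bad".
    // The \b word boundaries keep longer words such as "salad" from matching.
    static final Pattern AD_WORDS = Pattern.compile("\\b[smb]ad\\b");

    // Collects every match of the pattern in the given text, in order.
    static List<String> findMatches(String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = AD_WORDS.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // prints [sad, mad, bad]
        System.out.println(findMatches("A sad, mad and bad salad."));
    }
}
```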
The following code sample shows how to find and replace similar words based on a particular pattern using Java.
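A minimal sketch of the regex-based replacement could look like this, assuming the Aspose.Words JAR is on the classpath. The pattern `[sm]ad`, the replacement “bad”, the file names, and the helper name `replaceByPattern` are assumptions chosen for illustration.

```java
import com.aspose.words.Document;
import com.aspose.words.FindReplaceOptions;
import java.util.regex.Pattern;

public class RegexFindReplace {

    // Replaces every match of `pattern` in the document and returns the count.
    static int replaceByPattern(Document doc, Pattern pattern, String replacement) throws Exception {
        return doc.getRange().replace(pattern, replacement, new FindReplaceOptions());
    }

    public static void main(String[] args) throws Exception {
        // Load the Word document (placeholder path).
        Document doc = new Document("input.docx");

        // "[sm]ad" matches "sad" and "mad"; both are replaced with "bad".
        Pattern pattern = Pattern.compile("[sm]ad");
        int count = replaceByPattern(doc, pattern, "bad");
        System.out.println("Replacements made: " + count);

        // Save the updated document (placeholder path).
        doc.save("output.docx");
    }
}
```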
The following is the screenshot of the Word document after updating similar words.
Replace Text in the Header/Footer of Word Document
Aspose.Words also allows you to find and replace text only in the header/footer of the Word document. The following are the steps to perform this operation.
- Create an instance of the Document class and pass it the Word document’s path.
- Get the HeaderFooterCollection of the document using the Document.getFirstSection().getHeadersFooters() method.
- Retrieve the required header or footer as a HeaderFooter object.
- Use the HeaderFooter.getRange().replace() method to find and replace text.
- Save the updated Word document.
The following code sample shows how to find and replace text in the header/footer of Word document using Java.
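The steps above can be sketched as follows for the primary footer of the first section, assuming the Aspose.Words JAR is on the classpath. The search text “2006”, the replacement “2020”, the file names, and the helper name `replaceInPrimaryFooter` are hypothetical values used only for illustration.

```java
import com.aspose.words.Document;
import com.aspose.words.FindReplaceOptions;
import com.aspose.words.HeaderFooter;
import com.aspose.words.HeaderFooterCollection;
import com.aspose.words.HeaderFooterType;

public class FooterFindReplace {

    // Replaces text only inside the primary footer of the first section.
    // Returns 0 when the section has no primary footer.
    static int replaceInPrimaryFooter(Document doc, String find, String replacement) throws Exception {
        HeaderFooterCollection headersFooters = doc.getFirstSection().getHeadersFooters();
        HeaderFooter footer = headersFooters.getByHeaderFooterType(HeaderFooterType.FOOTER_PRIMARY);
        if (footer == null) {
            return 0;
        }
        FindReplaceOptions options = new FindReplaceOptions();
        options.setMatchCase(false);
        options.setFindWholeWordsOnly(false);
        // The range of the footer limits the search to the footer text.
        return footer.getRange().replace(find, replacement, options);
    }

    public static void main(String[] args) throws Exception {
        Document doc = new Document("input.docx");              // placeholder path
        int count = replaceInPrimaryFooter(doc, "2006", "2020"); // hypothetical texts
        System.out.println("Replacements made in footer: " + count);
        doc.save("output.docx");                                 // placeholder path
    }
}
```

Because the replacement runs on the footer’s own range, matching text in the document body is left untouched.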
The following screenshot shows the updated text in the footer of the Word document.
Find and Replace Text with Meta-Characters in Word DOCX using Java
Sometimes you need to find and replace a phrase that spans multiple lines or paragraphs. In such cases, you have to account for the paragraph, section, page, or line breaks within the phrase. Aspose.Words for Java handles this through meta-characters that stand for the different breaks:
- &p: paragraph break
- &b: section break
- &m: page break
- &l: line break
The following code sample demonstrates how to find and replace the text with a paragraph break in a Word document.
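A minimal sketch of using the `&p` meta-character is shown below, assuming the Aspose.Words JAR is on the classpath. The document is built in memory so that the example needs no input file. Its text, the dashed line, the output file name, and the helper names are all made up for illustration.

```java
import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;
import com.aspose.words.FindReplaceOptions;

public class MetaCharacterReplace {

    // Builds a small in-memory document; the text is invented for this example.
    static Document buildSampleDocument() throws Exception {
        Document doc = new Document();
        DocumentBuilder builder = new DocumentBuilder(doc);
        builder.writeln("First section");
        builder.writeln("1st paragraph");
        builder.writeln("Second section");
        builder.writeln("1st paragraph");
        return doc;
    }

    // Finds "section" followed by a paragraph break (&p) and inserts
    // a dashed line as a new paragraph right after it.
    static int addRuleAfterSectionTitles(Document doc) throws Exception {
        return doc.getRange().replace("section&p", "section&p----------&p", new FindReplaceOptions());
    }

    public static void main(String[] args) throws Exception {
        Document doc = buildSampleDocument();
        int count = addRuleAfterSectionTitles(doc);
        System.out.println("Replacements made: " + count);
        doc.save("output.docx"); // placeholder path
    }
}
```

The same approach should apply to the other meta-characters listed above, for example using `&b` in the replacement string to insert a section break.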
The following is the screenshot of the output Word document.
Conclusion
In this article, you have seen how to find and replace text in Word DOC/DOCX documents programmatically using Java. Various scenarios of finding and replacing text in MS Word DOCX files have been addressed with the help of code samples. You can learn more about Aspose.Words for Java from the documentation.