Convert PDF Files to MS Word Documents (DOC/DOCX) in Java

PDF is one of the most commonly used formats for sharing documents with third parties, largely because a PDF renders consistently across platforms regardless of the hardware or software in use. In some cases, however, you need to convert a PDF document into an editable format, and DOC or DOCX is usually the first choice. To automate that conversion, this article shows how to convert PDF to Word programmatically in Java.

In this article, you will learn how to:

  • Convert PDF to DOC using Java.
  • Convert PDF to DOCX format using Java.
  • Customize PDF to Word (DOC/DOCX) conversion.

Java PDF to Word Converter Library

Aspose.PDF for Java is a PDF manipulation API that provides easy ways to convert PDF files to a variety of other formats, including DOC and DOCX. You can download the API's JAR file and add it to your project, or reference it using the following Maven configuration:

Repository

<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://repository.aspose.com/repo/</url>
</repository>

Dependency

<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-pdf</artifactId>
    <version>19.12</version>
</dependency>

Convert PDF to DOC using Java

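Once you have referenced Aspose.PDF for Java in your application, you can convert any PDF document to DOC format in a couple of lines of code. The conversion takes two steps: load the input PDF by creating a Document object from the file's path, then call the Document.save() method with the output file name and the SaveFormat.Doc argument.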

The following code sample shows how to convert PDF to DOC in Java.
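
A minimal sketch of this conversion, assuming the Aspose.PDF for Java JAR from the Maven configuration above is on the classpath. The class name PdfToDoc and the paths input.pdf and output.doc are placeholders chosen for illustration.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;

public class PdfToDoc {
    public static void main(String[] args) {
        // Load the source PDF ("input.pdf" is a placeholder path)
        Document doc = new Document("input.pdf");

        // Write the content out as a DOC file ("output.doc" is a placeholder path)
        doc.save("output.doc", SaveFormat.Doc);
    }
}
```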

Convert PDF to DOCX using Java

DOCX is the newer and now more widely used Word format. In contrast to DOC, which is a binary format, a DOCX file is a ZIP package of XML files. To convert PDF to DOCX, pass the SaveFormat.DocX argument to the Document.save() method.
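
The structural difference between the two formats shows up in the first bytes of a file: a legacy DOC file starts with the OLE compound file signature, while a DOCX file starts with a ZIP header because it is a ZIP package of XML parts. The sketch below is not part of Aspose.PDF; WordFormatSniffer and its method names are hypothetical, and it uses only the JDK. It may be handy as a quick sanity check on a converted file. Note that a ZIP signature only shows the file is a ZIP archive, not that it is a valid DOCX.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class WordFormatSniffer {

    // Signature of the OLE compound file container used by binary .doc files
    private static final byte[] OLE_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header; .docx packages are ZIP archives
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Classifies a file by its leading bytes. */
    public static String detect(byte[] header) {
        if (startsWith(header, OLE_MAGIC)) {
            return "OLE_BINARY";   // container of a legacy DOC file
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "ZIP_PACKAGE";  // container of a DOCX file
        }
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java WordFormatSniffer <file>");
            return;
        }
        byte[] header = new byte[8];
        int total = 0;
        // Read up to the first 8 bytes of the file
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int n;
            while (total < header.length
                    && (n = in.read(header, total, header.length - total)) != -1) {
                total += n;
            }
        }
        System.out.println(detect(Arrays.copyOf(header, total)));
    }
}
```

Running it against a converted file, for example `java WordFormatSniffer output.docx`, prints the detected container type.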

The following code sample shows how to convert PDF to DOCX in Java.
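
A minimal sketch of the DOCX conversion, again assuming the Aspose.PDF for Java JAR is on the classpath. The class name PdfToDocx and the file paths are placeholders. The only change from a DOC conversion is the format argument passed to Document.save().

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;

public class PdfToDocx {
    public static void main(String[] args) {
        // Load the source PDF ("input.pdf" is a placeholder path)
        Document doc = new Document("input.pdf");

        // SaveFormat.DocX tells the API to produce a DOCX file
        doc.save("output.docx", SaveFormat.DocX);
    }
}
```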

Convert PDF to Word with Additional Options

Aspose.PDF for Java also provides additional options that you can use in PDF to Word conversion, such as the output format, the image resolution, and the distance between text lines. The DocSaveOptions class is used for this purpose.

The following code sample shows how to use DocSaveOptions class in PDF to DOCX conversion using Java.
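
A sketch of a customized conversion, assuming the Aspose.PDF for Java JAR is on the classpath. The class name PdfToDocxWithOptions, the file paths, and the option values are illustrative assumptions rather than recommended settings. The setter names are given as they are understood to exist in DocSaveOptions, so check them against the API reference for the release you use.

```java
import com.aspose.pdf.DocSaveOptions;
import com.aspose.pdf.Document;

public class PdfToDocxWithOptions {
    public static void main(String[] args) {
        // Load the source PDF ("input.pdf" is a placeholder path)
        Document doc = new Document("input.pdf");

        DocSaveOptions saveOptions = new DocSaveOptions();
        // Choose DOCX as the output format
        saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
        // Flow recognition mode aims for an easily editable document
        saveOptions.setMode(DocSaveOptions.RecognitionMode.Flow);
        // Horizontal proximity used when grouping text (2.5 is an example value)
        saveOptions.setRelativeHorizontalProximity(2.5f);
        // Try to recognize bullets during the conversion
        saveOptions.setRecognizeBullets(true);

        // Passing the options object instead of a SaveFormat applies the settings
        doc.save("output.docx", saveOptions);
    }
}
```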

Conclusion

In this article, you have learned how to convert PDF documents to Word formats using Java. You can convert PDF to either DOC or DOCX, depending on your requirements. The options for customizing the PDF to Word conversion have also been discussed. You can learn more about converting PDF to other formats from the documentation.
