In various cases, you may need to convert HTML content to a Word document, for example, to generate a document from a WYSIWYG HTML editor or to save a web page in DOCX or DOC format. This article covers how to perform that conversion programmatically and convert HTML files to Word DOCX, DOC, DOCM, or other formats in Java.
- Library to Convert HTML to Word
- Convert an HTML File to DOCX/DOC/DOCM etc.
- Convert a Web Page to Word using URL
- HTML String to Word Conversion
Info: If you ever need to get a Word document from a PowerPoint presentation, you can use Aspose Presentation to Word Document converter.
Java Library to Convert HTML to Word
To convert HTML to DOCX, DOC, DOT, DOCM, and other Word formats, we will use Aspose.Words for Java. It is a library for creating and manipulating Word documents programmatically, and it includes a built-in converter that offers high-fidelity conversion to and from word processing formats. You can download the API's JAR from the downloads section or install it using the following Maven configuration in pom.xml.
<repository>
<id>AsposeJavaAPI</id>
<name>Aspose Java API</name>
<url>https://repository.aspose.com/repo/</url>
</repository>
<dependency>
<groupId>com.aspose</groupId>
<artifactId>aspose-words</artifactId>
<version>21.11</version>
<classifier>jdk17</classifier>
</dependency>
Convert HTML to DOCX/DOC/DOCM in Java
Using Aspose.Words for Java, converting an HTML file to a Word format takes just two steps:
- Load the HTML file using the Document class.
- Save the HTML file as a Word document using the Document.save(String, int) method, passing a SaveFormat constant as the second argument.
The SaveFormat value passed to Document.save() specifies the format to which the HTML file is converted. The following code sample shows how to generate a Word document from HTML in Java.
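As a minimal sketch of the two steps above: the class name HtmlFileToWord, the convert helper, and the file names are placeholders chosen for illustration, not part of the library. It assumes the Aspose.Words for Java JAR is on the classpath.

```java
import com.aspose.words.Document;
import com.aspose.words.SaveFormat;

public class HtmlFileToWord {

    // Loads an HTML file and saves it in the Word format selected by saveFormat,
    // which is one of the int constants on SaveFormat (DOCX, DOC, DOCM, ...).
    static void convert(String htmlPath, String outputPath, int saveFormat) throws Exception {
        Document doc = new Document(htmlPath); // step 1: load the HTML file
        doc.save(outputPath, saveFormat);      // step 2: save in the chosen format
    }

    public static void main(String[] args) throws Exception {
        // "input.html" and the output names are placeholder paths (assumptions).
        convert("input.html", "html-to-word.docx", SaveFormat.DOCX);
        convert("input.html", "html-to-word.doc", SaveFormat.DOC);
    }
}
```

Changing only the SaveFormat constant (and the output extension) should be enough to target DOCM, DOT, or another supported format.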
Convert a Web Page to Word using URL in Java
You can also convert a web page to a Word document directly from its URL. The following are the steps to convert HTML to DOCX using URL in Java.
- Create an instance of the URL class and initialize it with the desired URL.
- Open the URL as an InputStream object.
- Create an instance of the HtmlLoadOptions class.
- Create an instance of the Document class and initialize it with the InputStream and HtmlLoadOptions objects.
- Save the web page as a Word document using the Document.save(String, int) method with a SaveFormat constant.
The following code sample shows how to convert a web page to a Word document using a URL.
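One way these steps might look in code is sketched here. The class name WebPageToWord, the convert helper, the example address, and the output file name are illustrative assumptions. The setBaseUri call is an optional addition beyond the listed steps, included so that relative image and stylesheet links can resolve.

```java
import com.aspose.words.Document;
import com.aspose.words.HtmlLoadOptions;
import com.aspose.words.SaveFormat;

import java.io.InputStream;
import java.net.URI;
import java.net.URL;

public class WebPageToWord {

    // Downloads the page at pageAddress and saves it as a DOCX file.
    static void convert(String pageAddress, String outputPath) throws Exception {
        // Build the java.net.URL instance for the page.
        URL url = URI.create(pageAddress).toURL();

        HtmlLoadOptions loadOptions = new HtmlLoadOptions();
        // Optional (not in the listed steps): resolve relative links against the page address.
        loadOptions.setBaseUri(pageAddress);

        // try-with-resources closes the stream once the document is loaded and saved.
        try (InputStream stream = url.openStream()) {
            Document doc = new Document(stream, loadOptions);
            doc.save(outputPath, SaveFormat.DOCX);
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder address and output path (assumptions).
        convert("https://example.com/", "url-to-word.docx");
    }
}
```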
Convert an HTML String to Word using Java
Aspose.Words for Java also allows you to generate a Word document from an HTML string dynamically. The following are the steps to perform this operation.
- Create an instance of the Document class.
- Create an instance of the DocumentBuilder class and initialize it with the Document object.
- Insert HTML into the document using the DocumentBuilder.insertHtml(String) method.
- Save the Word document using the Document.save(String, int) method with a SaveFormat constant.
The following code sample shows how to convert an HTML string to a DOCX file using Java.
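A short sketch of these four steps follows. The class name HtmlStringToWord, the convert helper, the sample markup, and the output file name are assumptions made for illustration.

```java
import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;
import com.aspose.words.SaveFormat;

public class HtmlStringToWord {

    // Inserts the given HTML into a new, empty document and saves it as DOCX.
    static void convert(String html, String outputPath) throws Exception {
        Document doc = new Document();                      // new blank document
        DocumentBuilder builder = new DocumentBuilder(doc); // cursor starts at the document start
        builder.insertHtml(html);                           // parse the HTML and insert it at the cursor
        doc.save(outputPath, SaveFormat.DOCX);
    }

    public static void main(String[] args) throws Exception {
        // Sample markup and output path are placeholders (assumptions).
        String html = "<html><body><h1>Report</h1><p>Generated from an <b>HTML string</b>.</p></body></html>";
        convert(html, "html-string-to-word.docx");
    }
}
```

Because insertHtml works at the builder's current position, the same approach could be used to add HTML fragments to an existing document rather than a blank one.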
Get a Free API License
You can use Aspose.Words for Java without evaluation limitations by getting a free temporary license.
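Once you have a license file, it is typically applied once at application startup, before any documents are processed. The sketch below assumes a license file named Aspose.Words.Java.lic in the working directory. That file name, the LicenseSetup class, and the applyIfPresent helper are hypothetical.

```java
import com.aspose.words.License;

import java.nio.file.Files;
import java.nio.file.Paths;

public class LicenseSetup {

    // Applies the license file if it exists. Returns false when the file is missing,
    // in which case the library keeps running in evaluation mode.
    static boolean applyIfPresent(String licensePath) throws Exception {
        if (!Files.isRegularFile(Paths.get(licensePath))) {
            return false;
        }
        License license = new License();
        license.setLicense(licensePath);
        return true;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical license file name (assumption).
        boolean licensed = applyIfPresent("Aspose.Words.Java.lic");
        System.out.println(licensed ? "License applied" : "Running in evaluation mode");
    }
}
```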
Conclusion
In this article, you have learned how to convert HTML files to Word DOCX, DOC, DOCM, or other formats programmatically using Java. You have also seen how to convert an HTML string or a web page from a URL to a Word document dynamically. You can install Aspose.Words for Java and use the provided code to build your own HTML to Word converter. To learn more about Aspose.Words for Java, visit the documentation, and share your questions with us on our forum.