You may often need to convert a PDF file into HTML to embed it within a web page. This article shows how to perform that conversion programmatically using Java. Furthermore, you will learn how to use additional options to customize the PDF to HTML conversion.
Java API for PDF to HTML Conversion
To convert PDF files to HTML, we’ll use Aspose.PDF for Java. It is a feature-rich PDF manipulation API that lets you generate, modify, and convert PDF files, and it provides easy-to-use methods to convert PDF to HTML pages. You can either download the API or install it using the following Maven configuration.
<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://repository.aspose.com/repo/</url>
</repository>
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-pdf</artifactId>
    <version>21.5</version>
</dependency>
Convert PDF to HTML in Java
The following are the steps to convert a PDF file to HTML using Java.
- Load the PDF document using the Document class.
- Convert PDF to HTML using the Document.save("pdf-to-html.html", SaveFormat.Html) method.
The following code sample shows how to convert PDF files to HTML.
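As a minimal sketch of the two steps above, the conversion could look like the following. It assumes the Aspose.PDF for Java dependency is on the classpath; the class name PdfToHtml and the input file name input.pdf are placeholders chosen for illustration and are not part of the API.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;

public class PdfToHtml {

    // Loads the PDF at inputPath and writes it as HTML to outputPath.
    static void convert(String inputPath, String outputPath) {
        // Step 1: load the PDF document
        Document document = new Document(inputPath);

        // Step 2: save the document in HTML format
        document.save(outputPath, SaveFormat.Html);
    }

    public static void main(String[] args) {
        // Placeholder file names; replace them with your own paths
        convert("input.pdf", "pdf-to-html.html");
    }
}
```

Resources referenced by the page, such as images and fonts, are typically written next to the HTML file, so it is worth saving into a dedicated output folder.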
PDF to HTML with Additional Options
Aspose.PDF for Java also provides additional options to customize the PDF to HTML conversion. For example, you can compress the images, split the output into multiple HTML pages, or specify a folder for images. The following are the steps to specify additional options for PDF to HTML conversion.
- Load the PDF document using the Document class.
- Create an instance of the HtmlSaveOptions class.
- Set the desired options, such as HtmlSaveOptions.setSplitIntoPages(true) to split the HTML content into multiple pages.
- Convert PDF to HTML using the Document.save("pdf-to-html.html", HtmlSaveOptions) method.
The following code sample shows how to customize PDF to HTML conversion.
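The steps above can be sketched as follows, again assuming the Aspose.PDF for Java dependency is available. Only the split-into-pages option named in the steps is set here; the class name PdfToHtmlWithOptions and the file name input.pdf are placeholders for illustration.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.HtmlSaveOptions;

public class PdfToHtmlWithOptions {

    // Converts the PDF at inputPath to HTML, writing one HTML file per page.
    static void convert(String inputPath, String outputPath) {
        // Step 1: load the PDF document
        Document document = new Document(inputPath);

        // Step 2: create the HTML save options
        HtmlSaveOptions htmlOptions = new HtmlSaveOptions();

        // Step 3: split the HTML output into multiple pages
        htmlOptions.setSplitIntoPages(true);

        // Step 4: save the document using the configured options
        document.save(outputPath, htmlOptions);
    }

    public static void main(String[] args) {
        // Placeholder file names; replace them with your own paths
        convert("input.pdf", "pdf-to-html.html");
    }
}
```

Other options are set on the same HtmlSaveOptions instance before calling save, so the structure of the sketch stays the same when more settings are added.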
You can explore other options using the following documentation links.
- Avoid Saving Images in SVG Format
- Compressing SVG Images During Conversion
- Specify Images Folder
- Create Subsequent Files with Body Contents Only
- Transparent Text rendering
- PDF document layers rendering
- Exclude Font Resources
Get a Free License
You can use Aspose.PDF for Java without evaluation limitations by applying a temporary license. Get yours now.
Conclusion
In this article, you have learned how to convert PDF files to HTML using Java. Furthermore, you have seen how to use different options to customize PDF to HTML conversion. You can explore more about the API using the documentation. In addition, you can post your queries to our forum.