PDF is a standard document format widely used for exchanging documents between individuals and organizations. Despite its popularity, it is not always the ideal choice for displaying content. On web pages, for example, HTML usually gives a better user experience. If you want to display PDF content on a website, converting it to HTML can help. This article shows how to convert PDF documents to HTML format using C++.
- C++ API for Converting PDF Documents to HTML Format
- Convert PDF Documents to HTML Format using C++
- Convert PDF Documents to HTML Format with Additional Options using C++
C++ API for Converting PDF Documents to HTML Format
Aspose.PDF for C++ is a C++ library that allows you to create, read and update PDF documents. Furthermore, the API supports converting PDF files to HTML format. You can either install the API through NuGet or download it directly from the downloads section.
PM> Install-Package Aspose.PDF.Cpp
Convert PDF Documents to HTML Format using C++
Converting a PDF document to HTML format is a breeze with the Aspose.PDF for C++ API. You can perform the conversion with just two lines of code. To convert a PDF document to HTML format, please follow the steps given below.
- Load the PDF document using the Document class.
- Save the HTML output using the Document->Save(System::String outputFileName, SaveFormat format) method.
The following sample code shows how to convert PDF documents to HTML format using C++.
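The sketch below is a minimal illustration of the two steps, not an official sample. Only the `Document` class and the `Save(outputFileName, SaveFormat)` overload come from the text; the header paths, the `HAVE_ASPOSE_PDF` macro, the `System::String::FromUtf8` call, and the `HtmlPathFor` and `ConvertPdfToHtml` helpers are assumptions made for illustration. The `__has_include` guard is there so the file still compiles on a machine where the library is not installed.

```cpp
#include <string>

// Pull in the Aspose.PDF headers only when they are available.
// The header paths are an assumption; adjust them to your installation.
#if defined(__has_include)
#  if __has_include(<Aspose.PDF.Cpp/Document.h>)
#    include <Aspose.PDF.Cpp/Document.h>
#    include <Aspose.PDF.Cpp/SaveFormat.h>
#    define HAVE_ASPOSE_PDF 1
#  endif
#endif

// Hypothetical helper: "dir/sample.pdf" -> "dir/sample.html".
// A dot that belongs to a directory name is left alone.
std::string HtmlPathFor(const std::string& pdfPath)
{
    const std::size_t slash = pdfPath.find_last_of("/\\");
    const std::size_t dot = pdfPath.find_last_of('.');
    const bool hasExtension = dot != std::string::npos &&
                              (slash == std::string::npos || dot > slash);
    return (hasExtension ? pdfPath.substr(0, dot) : pdfPath) + ".html";
}

// Hypothetical wrapper around the two steps described above.
// Returns false when the library was not available at compile time.
bool ConvertPdfToHtml(const std::string& pdfPath)
{
#ifdef HAVE_ASPOSE_PDF
    using namespace Aspose::Pdf;

    // Step 1: load the PDF document.
    auto document = System::MakeObject<Document>(
        System::String::FromUtf8(pdfPath)); // FromUtf8 is an assumption

    // Step 2: save it in HTML format.
    document->Save(System::String::FromUtf8(HtmlPathFor(pdfPath)),
                   SaveFormat::Html);
    return true;
#else
    (void)pdfPath; // nothing to convert without the library
    return false;
#endif
}
```

Calling `ConvertPdfToHtml("SourceDirectory/sample.pdf")` (a placeholder path) would write the HTML file next to the source PDF. Deriving the output name from the input is a convenience of this sketch, not something the API requires.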
[Image: Source PDF File]
[Image: Output HTML File]
Convert PDF Documents to HTML Format with Additional Options using C++
The Aspose.PDF for C++ API lets you customize the HTML generated by the conversion process through the HtmlSaveOptions class. The following are some of the options this class provides.
- FontSavingMode: Sets how fonts are saved during the conversion. Its value comes from the FontSavingModes enum.
- RasterImagesSavingMode: Sets how raster images are handled during the conversion. Its value comes from the RasterImagesSavingModes enum.
- LettersPositioningMethod: Sets how letters are positioned within words. Its value comes from the LettersPositioningMethods enum.
- SpecialFolderForAllImages: Sets the path where the images will be saved.
- SplitIntoPages: Sets whether each page of the PDF is converted to a separate HTML page or the whole document is converted to a single HTML file.
- SplitCssIntoPages: When SplitIntoPages is set to true, sets whether the CSS is saved as a single file or as a separate file for each HTML page.
The following are the steps to convert a PDF document to HTML format with additional options.
- Load the PDF document using the Document class.
- Create an instance of the HtmlSaveOptions class.
- Set the desired options.
- Save the HTML output using the Document->Save(System::String outputFileName, System::SharedPtr<SaveOptions> options) method.
The following is the C++ sample code that demonstrates the use of the HtmlSaveOptions class to customize the HTML output.
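As a hedged sketch of the four steps above, the function below sets a few of the listed options and saves the result. The option names come from the list above, but the `set_...` accessor form, the header paths, `System::String::FromUtf8`, and the `ImagesFolderFor` and `ConvertPdfToHtmlWithOptions` helpers are assumptions for illustration. Some library versions may expose the options as public members instead (for example `options->SplitIntoPages = true;`), so check the headers of the version you have installed.

```cpp
#include <string>

// Pull in the Aspose.PDF headers only when they are available.
// The header paths are an assumption; adjust them to your installation.
#if defined(__has_include)
#  if __has_include(<Aspose.PDF.Cpp/Document.h>)
#    include <Aspose.PDF.Cpp/Document.h>
#    include <Aspose.PDF.Cpp/HtmlSaveOptions.h>
#    define HAVE_ASPOSE_PDF 1
#  endif
#endif

// Hypothetical helper: "out/sample.html" -> "out/sample_images".
// Gives the images a folder of their own next to the HTML output.
std::string ImagesFolderFor(const std::string& htmlPath)
{
    const std::size_t slash = htmlPath.find_last_of("/\\");
    const std::size_t dot = htmlPath.find_last_of('.');
    const bool hasExtension = dot != std::string::npos &&
                              (slash == std::string::npos || dot > slash);
    return (hasExtension ? htmlPath.substr(0, dot) : htmlPath) + "_images";
}

// Hypothetical wrapper around the four steps described above.
// Returns false when the library was not available at compile time.
bool ConvertPdfToHtmlWithOptions(const std::string& pdfPath,
                                 const std::string& htmlPath)
{
#ifdef HAVE_ASPOSE_PDF
    using namespace Aspose::Pdf;

    // Step 1: load the PDF document.
    auto document = System::MakeObject<Document>(
        System::String::FromUtf8(pdfPath)); // FromUtf8 is an assumption

    // Step 2: create the options object.
    auto options = System::MakeObject<HtmlSaveOptions>();

    // Step 3: set the desired options (set_* accessors are an assumption).
    options->set_SplitIntoPages(true);    // one HTML page per PDF page
    options->set_SplitCssIntoPages(true); // one CSS file per HTML page
    options->set_SpecialFolderForAllImages(
        System::String::FromUtf8(ImagesFolderFor(htmlPath)));

    // Step 4: save using the Save(outputFileName, options) overload.
    document->Save(System::String::FromUtf8(htmlPath), options);
    return true;
#else
    (void)pdfPath;  // nothing to convert without the library
    (void)htmlPath;
    return false;
#endif
}
```

The enum-valued options (FontSavingMode, RasterImagesSavingMode, LettersPositioningMethod) are left at their defaults here because their enum member names are not given in this article.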
Get a Free License
You can try the API without evaluation limitations by requesting a free temporary license.
Conclusion
In this article, you have learned how to convert PDF documents to HTML format using C++. You have also learned how to use the additional options provided by the Aspose.PDF for C++ API to customize the generated HTML. The API provides many other features for automating PDF-related tasks, which you can explore in the official documentation. If you have any questions, please feel free to reach out to us on the free support forum.