You may often need to extract text from PowerPoint slides, either to perform text analysis or to save the text in a file or database for further processing. This article covers how to extract text from PowerPoint presentations using Java. In particular, you will learn how to extract text from a specific slide or from a whole presentation.
- Java API to Extract Text from PowerPoint PPTX
- Extract Text from a PowerPoint Slide in Java
- Extract Text from Whole PowerPoint Presentation
Java API to Extract Text from PowerPoint PPTX
To manipulate PowerPoint presentations, Aspose offers Aspose.Slides for Java. The API is designed to implement PowerPoint automation features in Java applications, and it also provides simple ways of extracting text from PPT/PPTX presentations. You can either download the API or install it using the following Maven configuration.
<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>http://repository.aspose.com/repo/</url>
</repository>
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-slides</artifactId>
    <version>21.7</version>
    <classifier>jdk16</classifier>
</dependency>
Extract Text from a PowerPoint Slide in Java
The following are the steps to extract text from a slide in a PowerPoint presentation using Java.
- Load the presentation using the Presentation class.
- Get all the text frames from a slide into an ITextFrame array using the SlideUtil.getAllTextBoxes() method.
- Loop through each ITextFrame and access its paragraphs using the ITextFrame.getParagraphs() method.
- Retrieve and print the text from each IPortion of each paragraph.
The following code sample shows how to extract text from a PowerPoint slide.
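The traversal in the steps above is a three-level walk: text frames contain paragraphs, and paragraphs contain portions. The sketch below illustrates only that walk and is not the Aspose listing itself. So that it runs without the Aspose.Slides JAR, it uses hypothetical stand-in classes (`Frame`, `Paragraph`, `Portion`, and the helper `portionTexts` are assumptions, not Aspose types); the comments note which Aspose.Slides calls would take their place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SlideTextSketch {
    // Hypothetical stand-ins for Aspose's IPortion, IParagraph and ITextFrame.
    static final class Portion {
        final String text;
        Portion(String text) { this.text = text; }
    }

    static final class Paragraph {
        final List<Portion> portions;
        Paragraph(Portion... portions) { this.portions = Arrays.asList(portions); }
    }

    static final class Frame {
        final List<Paragraph> paragraphs;
        Frame(Paragraph... paragraphs) { this.paragraphs = Arrays.asList(paragraphs); }
    }

    // Collects the text of every portion, in frame, paragraph, portion order.
    static List<String> portionTexts(List<Frame> frames) {
        List<String> texts = new ArrayList<>();
        for (Frame frame : frames) {                    // Aspose: frames from SlideUtil.getAllTextBoxes()
            for (Paragraph para : frame.paragraphs) {   // Aspose: ITextFrame.getParagraphs()
                for (Portion portion : para.portions) { // Aspose: IParagraph.getPortions()
                    texts.add(portion.text);            // Aspose: IPortion.getText()
                }
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for one slide's text frames.
        Frame title = new Frame(new Paragraph(new Portion("Quarterly "), new Portion("Report")));
        Frame body = new Frame(new Paragraph(new Portion("Revenue grew.")));
        for (String text : portionTexts(Arrays.asList(title, body))) {
            System.out.println(text);
        }
    }
}
```

A paragraph can be split into several portions when its formatting changes mid-sentence, which is why the walk goes down to the portion level instead of stopping at the paragraph.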
Extract Text from Whole PowerPoint Presentation
You can also extract text from the whole PowerPoint presentation. The following are the steps to perform this operation.
- Load the presentation using the Presentation class.
- Get all the text frames in the presentation using the SlideUtil.getAllTextFrames() method.
- Loop through each ITextFrame and access its paragraphs.
- Access the portions of the paragraphs and print their text.
The following code sample shows how to extract text from a PowerPoint presentation.
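Once the portions of a whole presentation have been read, a common next step is to store the text for later processing instead of printing it. The following is a minimal sketch of that step only, assuming the portion texts have already been collected per paragraph (for example, from the frames returned by SlideUtil.getAllTextFrames()). The class `PresentationTextSketch`, its methods, and the nested-list input shape are hypothetical and use no Aspose types, so the code runs without the library.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PresentationTextSketch {
    // Joins the portion texts of each paragraph into a single line.
    // Input shape (an assumption): one inner list of portion strings per paragraph.
    static List<String> paragraphLines(List<List<String>> paragraphs) {
        List<String> lines = new ArrayList<>();
        for (List<String> portions : paragraphs) {
            lines.add(String.join("", portions));
        }
        return lines;
    }

    // Writes one paragraph per line as UTF-8 and returns the path written.
    static Path saveLines(List<String> lines, Path target) {
        try {
            return Files.write(target, lines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data standing in for text read from a presentation.
        List<List<String>> paragraphs = Arrays.asList(
                Arrays.asList("Quarterly ", "Report"),
                Arrays.asList("Revenue grew."));
        Path out = Files.createTempFile("slides-text", ".txt");
        saveLines(paragraphLines(paragraphs), out);
        System.out.println("Wrote " + out);
    }
}
```

Joining portions before writing keeps each paragraph on one line, which tends to be easier to feed into text analysis than one line per formatting run.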
Get a Free API License
To use the API without evaluation limitations, you can get a free temporary license.
Try Online
You may also try the free online presentation parser, which is developed using Aspose.Slides.
Conclusion
In this article, you have learned how to extract text from PowerPoint presentations using Java, either from a specific slide or from the whole presentation. You can explore more about Aspose.Slides for Java in the documentation. If you have any questions, contact us via our forum.