Split MS Word Documents using Java


You often need to split an MS Word document into multiple documents, for example to create a separate document for each page, each section, or a range of pages. To automate this task, this article shows how to split MS Word DOCX files programmatically using Java. The following sections provide step-by-step instructions and code samples for each of these splitting criteria.

Java API to Split Word DOCX

Aspose.Words for Java is a document manipulation API that lets you create and process MS Word documents. Along with its basic and advanced Word automation features, the API lets you split a Word document into multiple documents. You can either download the API or install it in your Maven-based application by adding the following repository and dependency entries to your pom.xml.

<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://repository.aspose.com/repo/</url>
</repository>
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-words</artifactId>
    <version>21.1</version>
    <classifier>jdk17</classifier>
</dependency>

Word Document Splitter – Helper Classes

Before you start splitting documents, add to your project the helper classes that implement a Java document splitter based on Aspose.Words for Java. The page-based samples below rely on the DocumentPageSplitter class from these helpers. Once you have added the classes, you can split documents using the code samples in the following sections.

Split a Word DOCX using Java

First, let’s look at how to split an MS Word document by page. In this case, each page of the source document is saved as a separate Word document. The steps are: load the source file with the Document class, create a DocumentPageSplitter for it, then loop over the pages, get each page as a document with getDocumentOfPage(), and save it with save().

The following code sample shows how to split a Word document using Java.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java
// Open a Word document
Document doc = new Document("Word.docx");
// Split nodes in the document into separate pages
DocumentPageSplitter splitter = new DocumentPageSplitter(doc);
// Save each page as a separate document
for (int page = 1; page <= doc.getPageCount(); page++)
{
    Document pageDoc = splitter.getDocumentOfPage(page);
    pageDoc.save("SplitDocumentByPage_" + page + ".docx");
}
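The loop above names files with the bare page number, so in a plain alphabetical listing "SplitDocumentByPage_10.docx" sorts before "SplitDocumentByPage_2.docx". If that matters, the page number can be zero-padded. The following is a minimal sketch using only the Java standard library; PageFileNames and forPage are hypothetical names introduced here for illustration and are not part of Aspose.Words.

```java
import java.util.Locale;

// Hypothetical helper (not part of Aspose.Words): builds output file names
// whose page number is zero-padded to the width of the total page count.
final class PageFileNames {

    private PageFileNames() {}

    // page is 1-based, matching the loop in the sample above.
    static String forPage(String prefix, int page, int pageCount) {
        if (page < 1 || page > pageCount) {
            throw new IllegalArgumentException("page must be between 1 and pageCount");
        }
        // Number of digits needed for the largest page number
        int width = String.valueOf(pageCount).length();
        // Locale.ROOT keeps the digits ASCII regardless of the default locale
        String number = String.format(Locale.ROOT, "%0" + width + "d", page);
        return prefix + number + ".docx";
    }
}
```

With this helper, the save call in the loop could use PageFileNames.forPage("SplitDocumentByPage_", page, doc.getPageCount()) in place of the string concatenation.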

Use Page Range to Split Word DOCX in Java

You can also extract a specific page range from the source Word document. To do so, load the document, create a DocumentPageSplitter for it, call getDocumentOfPageRange() with the start and end page numbers, and save the returned document.

The following code sample shows how to split a Word document by a page range using Java.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java
// Open a Word document
Document doc = new Document("Word.docx");
// Split nodes in the document into separate pages
DocumentPageSplitter splitter = new DocumentPageSplitter(doc);
// Get the page range 3-6 as a separate document
Document pageDoc = splitter.getDocumentOfPageRange(3, 6);
pageDoc.save("SplitDocumentByPageRange.docx");
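To split a whole document into parts of several pages each, the page-range call can be repeated over consecutive ranges. The following is a minimal sketch of how those ranges might be computed with the Java standard library alone. PageRanges and chunk are hypothetical names introduced here, not part of Aspose.Words, and the sketch assumes 1-based page numbers as in the samples above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Aspose.Words): computes consecutive,
// 1-based page ranges of at most pagesPerPart pages each.
final class PageRanges {

    private PageRanges() {}

    // Returns {startPage, endPage} pairs that together cover pages 1..pageCount.
    static List<int[]> chunk(int pageCount, int pagesPerPart) {
        if (pageCount < 0 || pagesPerPart < 1) {
            throw new IllegalArgumentException("pageCount must be >= 0 and pagesPerPart >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        int start = 1;
        while (start <= pageCount) {
            // The last range may be shorter than pagesPerPart
            int end = (int) Math.min((long) start + pagesPerPart - 1, pageCount);
            ranges.add(new int[] {start, end});
            if (end == pageCount) {
                break;
            }
            start = end + 1;
        }
        return ranges;
    }
}
```

For a 10-page document and 4 pages per part, chunk(10, 4) yields the pairs (1, 4), (5, 8), and (9, 10). Each pair could then be passed to splitter.getDocumentOfPageRange(range[0], range[1]) and the result saved under its own file name.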

Split a Word Document by Sections using Java

Aspose.Words for Java also allows you to split a Word document by section breaks. To do so, load the document, loop through its sections, clone each section, import the clone into a new empty document, and save that document.

The following code sample shows how to split a Word document by sections using Java.

// Load a Word DOCX document
Document doc = new Document("word.docx");
for (int i = 0; i < doc.getSections().getCount(); i++) {
    // Split a document into smaller parts, in this instance split by section
    Section section = doc.getSections().get(i).deepClone();
    // Create a new document
    Document newDoc = new Document();
    newDoc.getSections().clear();
    // Add section
    Section newSection = (Section) newDoc.importNode(section, true);
    newDoc.getSections().add(newSection);
    // Save each section as a separate document
    newDoc.save("splitted_" + i + ".docx");
}

Get a Free API License

You can get a free temporary license in order to try the API without evaluation limitations.

Conclusion

In this article, you have learned how to split MS Word DOCX/DOC files using Java. The step-by-step guide and code samples have shown how to split a Word document by pages, by a range of pages, or by sections. You can learn more about the Java Word API in the documentation.
