Writing Microsoft Word Documents in Java With Apache POI (Part 1 – Writing Paragraphs)

Apache POI and OOXML

Apache POI (the name originally stood for “Poor Obfuscation Implementation”) is a collection of Java APIs for manipulating various file types, most notably those based on the Office Open XML standard (OOXML: docx, xlsx, pptx) and the older Microsoft OLE2 formats (doc, xls, ppt, etc.). Keep in mind that while POI covers most simple use cases, it may lack more advanced functionality, especially when it comes to editing documents. I successfully used it for a small side project at work: a simple application that generates docx files from a template. The latest stable release at the time, 3.6, was missing functionality I needed, so I ended up using 3.7-beta2 and implementing some of it by manipulating the underlying XML myself. While there are some examples out there, I couldn’t find much on reading and editing Word documents and had to rely on the documentation, which is also far from perfect. I am planning to dedicate several blog posts to POI, going over some of the goodies it offers with simple examples that will hopefully demystify it a little.

I will focus on the Office Open XML file types rather than the old OLE2 ones. It is worth mentioning that OLE2 Word documents generated by POI 3.6 gave me problems when I opened them in MS Word 2010, so I resorted to writing docx files. OOXML is an open file standard developed by Microsoft. It stores data in XML files that are zipped up in a package, and it has been the default format since MS Office 2007. You can verify this by renaming a docx file (or any other OOXML-based file, for that matter) so that its extension is zip and opening it with any archive tool. If you want to learn more about Apache POI, visit the project website. To learn more about OOXML, just Google it.
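If you would rather check the "zipped XML" claim from code than by renaming files, here is a minimal sketch that uses only the JDK (no POI involved). The class name OoxmlPackageLister and the method listParts are made up for this illustration; the program simply prints the name of every entry in the ZIP container you point it at.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of POI: lists the parts stored inside an OOXML package.
final class OoxmlPackageLister {

    // Returns the name of every entry in the ZIP container, in the order they are stored.
    static List<String> listParts(Path file) throws IOException {
        List<String> names = new ArrayList<>();
        try (InputStream in = Files.newInputStream(file);
             ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a docx, xlsx or pptx file
        for (String name : listParts(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```

Run against a docx file, the listing should typically include entries such as [Content_Types].xml and word/document.xml; the exact set of parts depends on what the document contains.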

Writing a Word file using Apache POI

Let’s shift gears and start with Word files. This is definitely not POI's strongest side, but it will probably be sufficient for most basic needs. What became clear to me through my (rather limited) experience with POI is that it is better at reading these files than at writing them. Then again, POI is an open source project, so if you need something that is missing, you can implement it and submit a patch, or log a request for it to be added.

Let’s take a look at a very simple example that writes out a docx file:

import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class CreateDocumentFromScratch {

    public static void main(String[] args) {
        XWPFDocument document = new XWPFDocument();

        // First paragraph, made up of two text runs
        XWPFParagraph paragraphOne = document.createParagraph();
        XWPFRun paragraphOneRunOne = paragraphOne.createRun();
        paragraphOneRunOne.setText("Hello world! This is paragraph one!");
        XWPFRun paragraphOneRunTwo = paragraphOne.createRun();
        paragraphOneRunTwo.setText(" More text in paragraph one...");

        // Second paragraph with a single run
        XWPFParagraph paragraphTwo = document.createParagraph();
        XWPFRun paragraphTwoRunOne = paragraphTwo.createRun();
        paragraphTwoRunOne.setText("And this is paragraph two.");

        // args[0] is the path of the docx file to create.
        // try-with-resources closes the stream even if write() fails, and
        // write() is never called with a null stream when the file cannot be opened.
        try (FileOutputStream outStream = new FileOutputStream(args[0])) {
            document.write(outStream);
        } catch (IOException e) {
            // FileNotFoundException is an IOException, so one catch covers both
            e.printStackTrace();
        }
    }

}

From the simple example above you will notice that we are creating a document, which contains paragraphs, which in turn contain “text runs”. As you may have guessed by now, a text run is a block of text whose characters share the same formatting, such as font, style, etc.
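The same document, paragraph, run hierarchy shows up in the XML that ends up inside the docx package. The sketch below uses only the JDK's DOM parser on a small hand-written fragment shaped like word/document.xml (w:p for a paragraph, w:r for a run, w:t for the run's text). The class name RunStructureDemo and the sample fragment are made up for this illustration; a real document carries many more elements and attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class: reads paragraphs and runs from a WordprocessingML fragment.
final class RunStructureDemo {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hand-written sample: paragraph one has two runs (the first one bold), paragraph two has one.
    static final String SAMPLE =
        "<w:document xmlns:w=\"" + W_NS + "\"><w:body>"
      + "<w:p>"
      + "<w:r><w:rPr><w:b/></w:rPr><w:t>Hello world!</w:t></w:r>"
      + "<w:r><w:t xml:space=\"preserve\"> More text.</w:t></w:r>"
      + "</w:p>"
      + "<w:p><w:r><w:t>Paragraph two.</w:t></w:r></w:p>"
      + "</w:body></w:document>";

    // Returns one string per w:p element, built by joining the text of its runs.
    static List<String> paragraphTexts(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<String> result = new ArrayList<>();
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element paragraph = (Element) paragraphs.item(i);
            StringBuilder text = new StringBuilder();
            NodeList texts = paragraph.getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < texts.getLength(); j++) {
                text.append(texts.item(j).getTextContent());
            }
            result.add(text.toString());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (String paragraph : paragraphTexts(SAMPLE)) {
            System.out.println(paragraph);
        }
    }
}
```

Each run carries its own formatting (the w:rPr element of the first run here), which is why POI asks you to create a new run whenever the formatting changes within a paragraph.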

Let’s explore styling in a little more detail:

import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xwpf.usermodel.Borders;
import org.apache.poi.xwpf.usermodel.ParagraphAlignment;
import org.apache.poi.xwpf.usermodel.VerticalAlign;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class CreateDocumentFromScratch {

    public static void main(String[] args) {
        XWPFDocument document = new XWPFDocument();

        // Paragraph-level formatting: centered text with a single-line border on every side
        XWPFParagraph paragraphOne = document.createParagraph();
        paragraphOne.setAlignment(ParagraphAlignment.CENTER);
        paragraphOne.setBorderBottom(Borders.SINGLE);
        paragraphOne.setBorderTop(Borders.SINGLE);
        paragraphOne.setBorderRight(Borders.SINGLE);
        paragraphOne.setBorderLeft(Borders.SINGLE);
        paragraphOne.setBorderBetween(Borders.SINGLE);

        // Run one: bold italic text followed by a line break
        XWPFRun paragraphOneRunOne = paragraphOne.createRun();
        paragraphOneRunOne.setBold(true);
        paragraphOneRunOne.setItalic(true);
        paragraphOneRunOne.setText("Hello world! This is paragraph one!");
        paragraphOneRunOne.addBreak();

        // Run two: text shifted vertically relative to the baseline
        XWPFRun paragraphOneRunTwo = paragraphOne.createRun();
        paragraphOneRunTwo.setText("Run two!");
        paragraphOneRunTwo.setTextPosition(100);

        // Run three: struck-through, larger, subscript text
        XWPFRun paragraphOneRunThree = paragraphOne.createRun();
        paragraphOneRunThree.setStrike(true);
        paragraphOneRunThree.setFontSize(20);
        paragraphOneRunThree.setSubscript(VerticalAlign.SUBSCRIPT);
        paragraphOneRunThree.setText(" More text in paragraph one...");

        // Second paragraph: distributed alignment with a right indent
        XWPFParagraph paragraphTwo = document.createParagraph();
        paragraphTwo.setAlignment(ParagraphAlignment.DISTRIBUTE);
        paragraphTwo.setIndentationRight(200);
        XWPFRun paragraphTwoRunOne = paragraphTwo.createRun();
        paragraphTwoRunOne.setText("And this is paragraph two.");

        // args[0] is the path of the docx file to create.
        // try-with-resources closes the stream even if write() fails.
        try (FileOutputStream outStream = new FileOutputStream(args[0])) {
            document.write(outStream);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}

The result:

The code is largely self-explanatory, so I will not go into detail on what each line is doing, but it should give you a good idea of what POI is capable of. You can explore the methods on XWPFParagraph and XWPFRun to learn more.
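One thing that is not obvious from the code is what the bare numbers mean. In WordprocessingML, the format behind docx, paragraph indentation is stored in twentieths of a point and a run's vertical text position in half-points. Assuming setIndentationRight and setTextPosition pass their arguments straight through to those attributes (worth checking against the Javadoc of your POI version), a small conversion helper keeps the arithmetic readable. The class WordUnits and its methods are made up for this illustration and are not part of POI.

```java
// Hypothetical helper, not part of POI: converts everyday units to the
// raw integer units used in WordprocessingML attributes.
final class WordUnits {

    // Indentation and spacing are measured in twentieths of a point ("twips").
    static int pointsToTwips(double points) {
        return (int) Math.round(points * 20);
    }

    // Vertical text position is measured in half-points.
    static int pointsToHalfPoints(double points) {
        return (int) Math.round(points * 2);
    }

    // There are 72 points to an inch, so one inch is 1440 twips.
    static int inchesToTwips(double inches) {
        return pointsToTwips(inches * 72);
    }

    public static void main(String[] args) {
        System.out.println(pointsToTwips(10));      // 200
        System.out.println(pointsToHalfPoints(50)); // 100
        System.out.println(inchesToTwips(1));       // 1440
    }
}
```

Under that assumption, the 200 passed to setIndentationRight in the styling example is a right indent of 10 points, and the 100 passed to setTextPosition raises the text by 50 points.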

This should be a good start. In part two of this series, I will show you how to create tables in a Word document. Until then, post questions in the comments section.

Posted in Apache POI by TK at August 26th, 2010.
  • Robert Waltz

    XWPFRun.setText() doesn’t seem to respect line breaks or tabs? Do I have to reformat multi-line text in the run?

    thanks

  • Clement Hoffmann

    I spent hours and hours over the internet to find some good examples about the POI API.
    But those who are on this blog are far the best i’ve ever seen.
    Thank you for for sharing.
    Excellent job and quality teaching !

    I hope there is more to come about XWPF…

  • Nitesh

    Hi!!

    I am trying to copy content of one .docx to another but I am not able to retain the formatting..
    Could you help me that?
    I am using similar code for writing the docx.

  • TK

    I haven’t done this with POI before, but you should be able to apply the same formatting, even if you have to manually replicate it. I would be interested to look at the solution, if you find one.

  • Krish

    Hey, this is useful.. but can you please help me with servlets i mean once i click the button it shoud get the data from the databse and should generate a word doc which should contain the data.. ex: – name : – adasd, age : – 12
    like wise…
    Thanks

  • Kamal Mansoor

    Thank you, very useful.

  • Mohit

    Hey thanks for the code above.
    if u can post a code for XWPFDocument.setParagraph()

    how to replace a whole paragraph with POI, so that we can get the paragraph where we want to replace text, and create a new paragraph with the result and set it in place of the original.

    Pls do it ….

    Thanks

  • Adnan

    Any example to insert the image?

  • Raj kumar

    Hi
    I want to write a file using java and the output should be in ms word format.can any one help me?

  • William

    Hi,

    Thanks for the tutorial. Its nice and its was of great help so far.

    I am not able to change the font family though:

    paragraphOneRunOne.setFontFamily("Arial Unicode MS"); does not seem to work. Do you know a work around for this? What wrong Am i doing?

  • S.deenathiyalan

    hi

    i want Apache POI word for free download

  • Ankit

    Am trying to add bullets to my text in a word document. Any ideas?

  • maroua

    thx sir for ur perfect code :)

  • DevXXX

    I need to replace strings in the word temlate with value passed from java, Please suggest

  • Laxman Rana

    I want add bullets in word …how to achieve this??

  • Jeremy Atkinson

    Hi, this post is brilliant very much appreciated.

  • loch

    Is it possible to underline a word and add a comment to that word.

  • Piyush

    HI,
    I am getting below exception when I tried to run the example. Can anyne please help.

    Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/xmlbeans/XmlException
    at ApachePOIWord.main(ApachePOIWord.java:12)
    Caused by: java.lang.ClassNotFoundException: org.apache.xmlbeans.XmlException
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    … 1 more

    Thanks,
    Piyush

  • Piyush

    I got the solution, I was not having xmlbeans-2.3.0.jar in my class path…thanks anways..

  • http://www.facebook.com/mudassar.hussain.94402 Mudassar Hussain

    Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/poi/util/POILogFactory
    at org.apache.poi.POIXMLDocumentPart.(POIXMLDocumentPart.java:49)
    at demo.CreateDocumentFromScratch.main(CreateDocumentFromScratch.java:21)
    Caused by: java.lang.ClassNotFoundException: org.apache.poi.util.POILogFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    … 2 more

    hey i get This exception plz help me………

  • sandesh

    how to open a already existing doc file and append text into it?

  • sarah

    In which directory should I put my POI files? my complier is not recognizing imports

  • rehan mustafa

    You can do that by using this Java library : http://www.aspose.com/java/word-component.aspx

  • kingxxxx

    In a word, how to locate?If use bookmarks to positioning, how to add information in a fixed place?

  • Gajendra kangokar

    Is it possible to count number of pages from a .doc(word document) file?

  • Carlos Barros

    copy the code and paste into ecilpse, will get a compilation error about token something. blame these thing you use to highlight the code.

  • http://jrphub.github.io/jrpwilt/ Jyotiranjan

    Thanks , you made my day :)
