Reading data from a Word document (.doc or .docx) in Java.

1. The user can supply a Word document in either format (.doc or .docx).
2. Identify the extension of the Word file.
3. Read and print all the content of the Word file on the console.

Step #1 Create two Word documents, one named tempdata.doc and the other tempdata_1.docx.
Step #2 Download the jar files and add them to your Java project as described in the ‘Reference Jar Files’ section.
Step #3 Copy and paste the code into your class file and run it to observe the output.
Step #4 To identify the extension of the Word file, we use the getExtension() method of the FilenameUtils class (part of Apache Commons IO, package org.apache.commons.io):
             String fileExtension = FilenameUtils.getExtension(filePath);
Step #5 Once we have the file extension, we call the matching read method.
Step #6 For a file with the extension “.docx” we use the XWPFDocument and XWPFParagraph classes, where fis is a FileInputStream opened on the file:
              XWPFDocument doc = new XWPFDocument(fis);
              List<XWPFParagraph> getDocParagraphs = doc.getParagraphs();
Step #7 For a file with the extension “.doc” we use the HWPFDocument and WordExtractor classes:
               HWPFDocument doc = new HWPFDocument(fis);
               WordExtractor extractor = new WordExtractor(doc);
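
Steps #4 and #5 can also be sketched in plain Java, which may help if the Commons IO jar is not on your build path. The class name ExtensionSketch and the helpers extensionOf() and readerFor() are hypothetical names made up for this illustration; they are not part of POI or Commons IO. The sketch only reports which reader would be chosen and does not open the file.

```java
public class ExtensionSketch {

    // Hypothetical helper: returns the text after the last dot of the file name,
    // or an empty string when the file name has no dot.
    static String extensionOf(String filePath) {
        int lastSeparator = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'));
        int lastDot = filePath.lastIndexOf('.');
        if (lastDot <= lastSeparator) {
            // Either there is no dot at all, or the last dot belongs to a folder name.
            return "";
        }
        return filePath.substring(lastDot + 1);
    }

    // Hypothetical helper: names the POI classes that the steps above pick for each extension.
    static String readerFor(String filePath) {
        String extension = extensionOf(filePath);
        if (extension.equalsIgnoreCase("docx")) {
            return "XWPFDocument";
        }
        if (extension.equalsIgnoreCase("doc")) {
            return "HWPFDocument + WordExtractor";
        }
        return "unsupported";
    }

    public static void main(String[] args) {
        System.out.println(readerFor("input_Word/tempdata.doc"));     // HWPFDocument + WordExtractor
        System.out.println(readerFor("input_Word/tempdata_1.docx"));  // XWPFDocument
    }
}
```

The comparison is case-insensitive, so a file named TEMPDATA.DOCX is routed the same way as tempdata.docx.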

Note: Please change the path of the Word document file accordingly.

Reference Jar files

1. Navigate to the download page for poi-bin-3.16-20170419.tar.gz.
2. Click the first link, poi-bin-3.16-20170419.tar.gz.
3. The archive containing the jar files is downloaded automatically.
4. Extract the archive and add the jar files to your project using the ‘Configure Build Path’ option.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;
import org.apache.commons.io.FilenameUtils;                         // Provided by the Apache Commons IO jar
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

public class WordHandling {
	public static void main(String[] args) throws IOException {
		String filePath="input_Word//tempdata_1.doc";               // Change this path to point to your Word file
		loadFile(filePath);
	}

	public static void loadFile(String filePath) throws IOException {
		File file=new File(filePath);                               // Creating File Object
		String fileExtension=FilenameUtils.getExtension(filePath);  // Getting extension of file
		if(fileExtension.equalsIgnoreCase("docx")) {
			readDocxFile(file);
		} else if(fileExtension.equalsIgnoreCase("doc")) {
			readDocFile(file);
		}
	}

	// Reading data from ".docx" file.
	public static void readDocxFile(File file) throws IOException {
		try (FileInputStream fis=new FileInputStream(file)) {       // The stream is closed automatically
			XWPFDocument doc =new XWPFDocument(fis);
			List<XWPFParagraph> getDocParagraphs= doc.getParagraphs(); // Getting all the paragraphs from the document as a List
			int totalParagraphs=getDocParagraphs.size();               // Getting total number of paragraphs in word document.
			System.out.println("Total number of paragraphs : "+totalParagraphs);
			for (XWPFParagraph currentParagraph : getDocParagraphs) {
				System.out.println(currentParagraph.getText());        // Printing the text of each paragraph
			}
		}
	}

	// Reading data from ".doc" file.
	public static void readDocFile(File file) throws IOException {
		try (FileInputStream fis =new FileInputStream(file)) {      // The stream is closed automatically
			HWPFDocument doc=new HWPFDocument(fis);
			WordExtractor extractor=new WordExtractor(doc);
			String[] getDocParagraphs= extractor.getParagraphText();        // Getting all the paragraphs from the document as a String array.
			int totalParagraphs=getDocParagraphs.length;                    // Getting total number of paragraphs in word document.
			System.out.println("Total count of paragraphs : "+totalParagraphs+"\n");
			for (String currentPara : getDocParagraphs) {
				System.out.println(currentPara);                            // Printing the text of each paragraph
			}
		}
	}
}
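
The program above trusts the file extension, so a .docx file renamed to .doc would be sent to the wrong reader. As an optional extra check, the following is a minimal sketch that looks at the first bytes of the file instead. It assumes only two well-known signatures: binary .doc files are stored in an OLE2 compound file, and .docx files are ZIP archives. The class WordFormatSniffer and its methods are hypothetical names for this illustration, not part of POI. Note that other Office formats share these containers (for example .xls and .xlsx), so the result only narrows the choice and does not prove the file is a Word document.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class WordFormatSniffer {

    // Signature at the start of an OLE2 compound file (the container used by binary .doc files).
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // "PK" followed by 0x03 0x04: the local file header that starts a non-empty ZIP archive.
    static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Reads up to 8 bytes from the start of the stream; returns fewer if the stream is shorter.
    static byte[] readHeader(InputStream in) throws IOException {
        byte[] header = new byte[8];
        int total = 0;
        while (total < header.length) {
            int n = in.read(header, total, header.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return Arrays.copyOf(header, total);
    }

    // True when data begins with all the bytes of prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns "ole2" (consistent with .doc), "zip" (consistent with .docx) or "unknown".
    static String classify(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "ole2";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "zip";
        }
        return "unknown";
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the first bytes of a .docx file.
        byte[] fakeDocx = { 0x50, 0x4B, 0x03, 0x04, 0x14, 0x00, 0x06, 0x00 };
        try (InputStream in = new ByteArrayInputStream(fakeDocx)) {
            System.out.println(classify(readHeader(in)));  // zip
        }
    }
}
```

To use it with a real file, pass a FileInputStream opened on the Word file to readHeader() and compare the result with the extension before choosing a reader.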



Ashok Kumar works in an IT company as a QA Consultant. He started his career as a Test Trainee in manual testing in August 2010 and moved to automation testing four years later. He taught himself Java and Selenium to build his automation skills.

While learning these tools and working on multiple projects, he found that people sometimes get stuck on live scenarios in their projects and have to do a lot of R&D to get past them. So he decided to start blogging about such scenarios, so that anyone facing a problem in their project can ask a question or share a solution, or an alternative solution, to achieve the goal.

Later on, he observed that some people want to learn Java but have questions such as how to start with Java and whether to take an online or offline course. So he started writing tutorials on Java, Jira, Selenium, Excel, etc.