File Handling | Reading data from a Word document (.doc or .docx) in Java

Problem:
1. The user can supply a Word document in either format (extension .doc or .docx).
2. Identify the extension of the Word file.
3. Read the content of the Word file and print all of it to the console.

Reference Documents:
tempdata.doc
tempdata_1.docx

Answer

Step #1 Create two Word documents, one named tempdata.doc and the other tempdata_1.docx.
Step #2 Download the jar files and add them to your Java project, as described in the ‘Reference Jar Files’ section.
Step #3 Copy the code into your class file and run it to observe the output.
Step #4 To identify the extension of the Word file, we use the getExtension() method of the FilenameUtils class (from Apache Commons IO). It returns the extension without the leading dot, for example "docx":
             String fileExtension = FilenameUtils.getExtension(filePath);
Step #5 Once we have the file extension, we call the matching read method.
Step #6 For a file with the extension “.docx” we use the XWPFDocument and XWPFParagraph classes, where fis is a FileInputStream opened on the file:
              XWPFDocument doc = new XWPFDocument(fis);
              List<XWPFParagraph> getDocParagraphs = doc.getParagraphs();
Step #7 For a file with the extension “.doc” we use the HWPFDocument and WordExtractor classes:
               HWPFDocument doc = new HWPFDocument(fis);
               WordExtractor extractor = new WordExtractor(doc);
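
Steps #4 and #5 can be tried out without any jar files. The sketch below is not the Commons IO implementation: extensionOf and readerFor are hypothetical helpers written only for illustration, and extensionOf merely approximates what FilenameUtils.getExtension() does for simple paths. It shows how the extension decides which set of POI classes is used.

```java
import java.util.Locale;

public class ExtensionSketch {

    // Hypothetical helper: returns the text after the last dot of the file name,
    // or "" when the file name itself contains no dot.
    static String extensionOf(String filePath) {
        int lastSeparator = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'));
        int lastDot = filePath.lastIndexOf('.');
        if (lastDot <= lastSeparator) {
            return ""; // no dot at all, or the dot belongs to a directory name
        }
        return filePath.substring(lastDot + 1);
    }

    // Hypothetical helper: names the POI classes the tutorial uses for a given path.
    static String readerFor(String filePath) {
        String extension = extensionOf(filePath).toLowerCase(Locale.ROOT);
        switch (extension) {
            case "docx":
                return "XWPFDocument";
            case "doc":
                return "HWPFDocument + WordExtractor";
            default:
                return "unsupported";
        }
    }

    public static void main(String[] args) {
        System.out.println(readerFor("input_Word/tempdata.doc"));
        System.out.println(readerFor("input_Word/tempdata_1.docx"));
        System.out.println(readerFor("input_Word/notes.txt"));
    }
}
```

Lower-casing the extension before the comparison plays the same role as equalsIgnoreCase() in the full program, so a file named TEMPDATA.DOCX is routed the same way as tempdata.docx.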

Note: Please change the path of the Word document file accordingly.

Reference Jar files

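1. Navigate to: poi-bin-3.16-20170419.tar.gz
2. Click the first link, poi-bin-3.16-20170419.tar.gz.
3. The archive downloads automatically; extract it to get the jar files.
4. Add the jar files to your project using the ‘Configure Build Path’ option.
5. FilenameUtils belongs to Apache Commons IO, so add the commons-io jar to the build path as well if it is not already there.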
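
A quick way to confirm that the jars really reached the build path is to ask the class loader for the classes the program imports. This is only a sketch: PoiClasspathCheck and isOnClasspath are hypothetical names, and the check uses nothing but the standard Class.forName() call, so it runs (and reports MISSING) even when no jar has been added yet.

```java
public class PoiClasspathCheck {

    // Hypothetical helper: true when the named class can be located
    // by the class loader that loaded this class.
    static boolean isOnClasspath(String className) {
        try {
            // 'false' means: locate the class, but do not run its static initializers
            Class.forName(className, false, PoiClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The classes imported by the WordHandling program
        String[] required = {
            "org.apache.commons.io.FilenameUtils",
            "org.apache.poi.hwpf.HWPFDocument",
            "org.apache.poi.hwpf.extractor.WordExtractor",
            "org.apache.poi.xwpf.usermodel.XWPFDocument",
            "org.apache.poi.xwpf.usermodel.XWPFParagraph"
        };
        for (String name : required) {
            System.out.println((isOnClasspath(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

A class can be found and still fail at run time if one of its own dependencies is absent, so treat a "found" line as a first check rather than a guarantee.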

Code:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;
import org.apache.commons.io.FilenameUtils;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

public class WordHandling 
{

	public static void main(String[] args) throws IOException 
	{
		// Path to the input document; point it at either reference file
		String filePath="input_Word/tempdata_1.docx";
		loadFile(filePath);
	}
	
	public static void loadFile(String filePath) throws IOException
	{
		File file=new File(filePath);                               // Creating File Object
		
		String fileExtension=FilenameUtils.getExtension(filePath);  // Getting extension of file
		if(fileExtension.equalsIgnoreCase("docx"))
		{
			readDocxFile(file);
		}
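		else if(fileExtension.equalsIgnoreCase("doc"))
		{
			readDocFile(file);
		}
		else
		{
			// Report anything that is neither .doc nor .docx instead of exiting silently
			System.out.println("Unsupported file extension: "+fileExtension);
		}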
	}
	
	// Reading data from ".docx" file.
	public static void readDocxFile(File file) throws IOException
	{
		// try-with-resources closes the stream and the document even if reading fails
		try (FileInputStream fis=new FileInputStream(file);
		     XWPFDocument doc=new XWPFDocument(fis))
		{
			List<XWPFParagraph> getDocParagraphs=doc.getParagraphs(); // All paragraphs of the document body, in order
			int totalParagraphs=getDocParagraphs.size();              // Total number of paragraphs in the Word document
			
			System.out.println("Total number of paragraphs : "+totalParagraphs);
			
			for (XWPFParagraph currentParagraph : getDocParagraphs)
			{
				System.out.println(currentParagraph.getText());
			}
		}
	}
	
	// Reading data from ".doc" file.
	public static void readDocFile(File file) throws IOException
	{
		// try-with-resources closes the stream, the document and the extractor even if reading fails
		try (FileInputStream fis=new FileInputStream(file);
		     HWPFDocument doc=new HWPFDocument(fis);
		     WordExtractor extractor=new WordExtractor(doc))
		{
			String[] getDocParagraphs=extractor.getParagraphText();  // All paragraphs of the document as a String array
			int totalParagraphs=getDocParagraphs.length;             // Total number of paragraphs in the Word document
			System.out.println("Total count of paragraphs : "+totalParagraphs+"\n");
			for (String currentPara : getDocParagraphs)
			{
				System.out.print(currentPara);   // print(), not println(): the extracted paragraph text carries its own line break
			}
		}
	}
	

}
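
The program above trusts the file name, so a .docx file that was renamed to .doc is sent to the wrong reader. As an optional cross-check, the first bytes of the file can be inspected: .docx files are ZIP archives and start with the bytes 50 4B 03 04, while .doc files are OLE2 containers and start with D0 CF 11 E0 A1 B1 1A E1. The sketch below uses only the JDK; WordFormatSniffer, readHeader and sniff are hypothetical names and not part of Apache POI. The signatures identify the container only: other Office formats such as .xlsx or .xls share them.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class WordFormatSniffer {

    // Leading bytes of an OLE2 container (used by .doc) and of a ZIP archive (used by .docx)
    static final int[] OLE2_SIGNATURE = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    static final int[] ZIP_SIGNATURE = {0x50, 0x4B, 0x03, 0x04};

    // Hypothetical helper: reads up to the first 8 bytes of a file; unread positions stay 0.
    static byte[] readHeader(String filePath) throws IOException {
        byte[] header = new byte[8];
        try (InputStream in = new FileInputStream(filePath)) {
            int total = 0;
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break; // file is shorter than 8 bytes
                }
                total += n;
            }
        }
        return header;
    }

    // Hypothetical helper: "ZIP" (possibly .docx), "OLE2" (possibly .doc) or "UNKNOWN".
    static String sniff(byte[] header) {
        if (startsWith(header, ZIP_SIGNATURE)) {
            return "ZIP";
        }
        if (startsWith(header, OLE2_SIGNATURE)) {
            return "OLE2";
        }
        return "UNKNOWN";
    }

    static boolean startsWith(byte[] data, int[] signature) {
        if (data.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) { // mask to compare as unsigned bytes
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // In-memory headers keep the demo runnable without any input file
        byte[] zipHeader = {0x50, 0x4B, 0x03, 0x04, 0, 0, 0, 0};
        byte[] ole2Header = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                             (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};
        System.out.println(sniff(zipHeader));
        System.out.println(sniff(ole2Header));
        System.out.println(sniff(new byte[8]));
    }
}
```

With a real file, sniff(readHeader(filePath)) could be compared against the extension before loadFile() chooses a reader.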
