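Reading Word content in Java
To read the content of a Word document in Java, you can use the Apache POI library. Two implementations follow, one for each Word file format.
Method 1: Using the XWPF reader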
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class WordReader {
    public static void main(String[] args) {
        // try-with-resources closes the document and the stream even if reading fails
        try (InputStream fis = new FileInputStream("path/to/your/word/document.docx");
             XWPFDocument document = new XWPFDocument(fis)) {
            // getParagraphs() returns body paragraphs only; text inside tables is not included
            for (XWPFParagraph paragraph : document.getParagraphs()) {
                System.out.println(paragraph.getText());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
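To give a sense of what XWPFDocument is parsing, the following is a minimal sketch that reads paragraph text from a .docx using only the JDK. It relies on two facts about the format: a .docx file is a ZIP archive, and the main body lives in the entry word/document.xml, with paragraphs stored as w:p elements and text runs as w:t elements. The class name DocxPeek and its method names are hypothetical, and the sketch assumes Java 9 or later and the transitional WordprocessingML namespace. It ignores tabs, line breaks, headers, footers, and nested content such as text boxes, so it illustrates the file layout rather than replacing POI.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Apache POI.
final class DocxPeek {
    // Namespace of transitional WordprocessingML (assumed; strict-mode files use another URI).
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Scans the ZIP archive for word/document.xml and extracts its paragraphs.
    static List<String> paragraphs(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals("word/document.xml")) {
                    // readAllBytes() stops at the end of the current ZIP entry
                    return parseBody(zip.readAllBytes());
                }
            }
        }
        throw new IOException("word/document.xml not found");
    }

    // Collects the text of every w:t element under each w:p element.
    static List<String> parseBody(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations so untrusted files cannot pull in external entities
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

        List<String> result = new ArrayList<>();
        NodeList paras = dom.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paras.getLength(); i++) {
            NodeList texts = ((Element) paras.item(i)).getElementsByTagNameNS(W_NS, "t");
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < texts.getLength(); j++) {
                sb.append(texts.item(j).getTextContent());
            }
            result.add(sb.toString());
        }
        return result;
    }
}
```

Calling DocxPeek.paragraphs(new FileInputStream("path/to/your/word/document.docx")) returns one string per paragraph. For real documents the POI reader above remains the better choice, since it handles the parts of the format this sketch skips.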
Method 2: Using the HWPF reader
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class WordReader {
    public static void main(String[] args) {
        // try-with-resources closes the extractor, the document, and the stream even if reading fails
        try (InputStream fis = new FileInputStream("path/to/your/word/document.doc");
             HWPFDocument document = new HWPFDocument(fis);
             WordExtractor extractor = new WordExtractor(document)) {
            // getParagraphText() returns the document text split into paragraphs
            String[] paragraphs = extractor.getParagraphText();
            for (String paragraph : paragraphs) {
                System.out.println(paragraph);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
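Note: Method 1 reads Word documents in the .docx format and requires the poi-ooxml dependency. Method 2 reads Word documents in the .doc format and requires the poi-scratchpad dependency, because the HWPF classes are not part of the core poi artifact.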
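Because each reader accepts only one format, a file with the wrong extension will make the reader throw. One way to guard against this is to look at the first bytes of the file before choosing a method. The sketch below assumes Java 11 or later and uses a hypothetical helper class, WordFormatSniffer, built only on the JDK. It relies on the fact that .docx files are ZIP archives and .doc files are OLE2 compound documents. Other Office files share the same containers (for example .xlsx and .xls), so the result only narrows down the container type and does not prove the file is a Word document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Apache POI: guesses the container format
// from the file's leading bytes instead of trusting the file extension.
final class WordFormatSniffer {
    // ZIP archives (the container used by .docx) start with "PK" 0x03 0x04.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // OLE2 compound documents (the container used by .doc) start with this signature.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns "doc", "docx", or "unknown" based on the given leading bytes.
    static String detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "doc";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "docx";
        }
        return "unknown";
    }

    // Reads up to the first 8 bytes of the file and classifies them.
    static String detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(8));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A caller could then pick Method 1 when WordFormatSniffer.detect(Path.of("path/to/your/word/document")) returns "docx", pick Method 2 when it returns "doc", and report an error otherwise.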