TIKA-123: Structured MS Office parsing
commit 848b72148b2a4daeb0560ab2939f8b52728ad8d7
author Jukka Lauri Zitting <jukka@apache.org>
Sun, 9 Mar 2008 11:47:54 +0000
committer Jukka Lauri Zitting <jukka@apache.org>
Sun, 9 Mar 2008 11:47:54 +0000
tree 2cb520c6f7ac23cf392f087cef9bb7bd945f992b
parent 1f15082885e88f22d2007ec757dddd9d6f2f70fb
TIKA-123: Structured MS Office parsing
    - Consolidated all MS Office parsing to a single class
    - Reliable MIME magic for pseudo type application/x-tika-msoffice
    - Added MIME magic for RTF

git-svn-id: https://svn.eu.apache.org/repos/asf/incubator/tika/trunk@635224 13f79535-47bb-0310-9956-ffa450edef68
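
The MIME magic mentioned in the commit message can be illustrated with a minimal sketch of prefix-based detection. This is not the code or the patterns from this commit: the class name MagicSketch and its methods are hypothetical, and the byte patterns are the commonly documented OLE2 compound document signature and the "{\rtf" prefix, which may differ from what tika-mimetypes.xml actually declares. The RTF type name application/rtf is likewise an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of magic-byte detection; not Tika's implementation.
final class MagicSketch {

    // Well-known signature at the start of OLE2 compound documents
    // (the container used by legacy Word, Excel and PowerPoint files).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // RTF documents begin with the ASCII characters {\rtf
    private static final byte[] RTF_MAGIC =
        "{\\rtf".getBytes(StandardCharsets.US_ASCII);

    // Returns true if data begins with the given magic bytes.
    static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, magic.length), magic);
    }

    // Maps a document prefix to a media type name (type names assumed).
    static String detect(byte[] prefix) {
        if (startsWith(prefix, OLE2_MAGIC)) {
            return "application/x-tika-msoffice";
        }
        if (startsWith(prefix, RTF_MAGIC)) {
            return "application/rtf";
        }
        return "application/octet-stream";
    }
}
```

Because all legacy MS Office formats share the same container signature, a check like this can only identify the pseudo type; telling Word, Excel and PowerPoint apart requires looking inside the container, which is consistent with consolidating the parsing into a single class.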
src/main/java/org/apache/tika/parser/microsoft/ExcelExtractor.java [moved from src/main/java/org/apache/tika/parser/microsoft/ExcelParser.java with 92% similarity]
src/main/java/org/apache/tika/parser/microsoft/OfficeParser.java
src/main/java/org/apache/tika/parser/microsoft/PowerPointParser.java [deleted file]
src/main/java/org/apache/tika/parser/microsoft/PropertyParser.java [deleted file]
src/main/java/org/apache/tika/parser/microsoft/WordParser.java [deleted file]
src/main/resources/mime/tika-mimetypes.xml
src/main/resources/tika-config.xml
src/test/java/org/apache/tika/parser/microsoft/ExcelParserTest.java
src/test/java/org/apache/tika/parser/microsoft/PowerPointParserTest.java
src/test/java/org/apache/tika/parser/microsoft/WordParserTest.java