Supported File Formats
Fess has been verified to crawl and search the following file formats.
Text (txt)
XML (xml, xhtml, mm, etc.)
HTML (html, htm)
MS Office (doc, xls, ppt, docx, xlsx, pptx, etc.)
PDF (pdf, etc.)
Source Code (js, c, h, java, etc.)
Compressed Files (gz, tar, zip, etc.)
Rich text (rtf)
ePub
Audio/Image/Video (metadata extraction)
mbox
Adobe Illustrator (ai, PDF-compatible)
Fess can also extract text from many file types that are not listed above, so those files can be crawled and searched as well. If you would like a file type verified, please submit a pull request to the Test Data Repository for Search Systems.
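As a rough illustration of how the list above might be used when preparing content for crawling, the following sketch maps a file name to its verified category. It is not part of Fess: the names `VERIFIED_EXTENSIONS` and `verified_category` are hypothetical, and only the extensions written out explicitly in the list are included (the "etc." entries are not enumerated).

```python
from pathlib import Path

# Hypothetical lookup table built from the extensions named in the list above.
# Entries covered only by "etc." are intentionally left out.
VERIFIED_EXTENSIONS = {
    "Text": {"txt"},
    "XML": {"xml", "xhtml", "mm"},
    "HTML": {"html", "htm"},
    "MS Office": {"doc", "xls", "ppt", "docx", "xlsx", "pptx"},
    "PDF": {"pdf"},
    "Source Code": {"js", "c", "h", "java"},
    "Compressed Files": {"gz", "tar", "zip"},
    "Rich text": {"rtf"},
}


def verified_category(filename):
    """Return the listed category for a file's extension, or None if not listed."""
    # Use only the final suffix, lowercased and without the leading dot.
    ext = Path(filename).suffix.lower().lstrip(".")
    for category, extensions in VERIFIED_EXTENSIONS.items():
        if ext in extensions:
            return category
    return None
```

A result of `None` only means the extension is not written out in the verified list; as noted above, Fess may still be able to extract text from such a file.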
Other
The following file formats are supported through commercial support:
Ichitaro
OASYS for Windows
DocuWorks
AutoCAD