Building an Apache Solr-based search server with Fess: Introduction

This page is generated by Machine Translation from Japanese.

Introduction

The number of documents that need to be managed grows every day, and those documents are expected to be managed effectively. The more documents there are, the harder it becomes to find specific information among them. One solution is to introduce a full-text search server that can search across a large volume of information.

Fess is an easy-to-deploy, Java-based, open source full-text search server. Its search engine component uses Apache Solr. Solr is a very powerful search engine that is said to be able to index as many as 200 million documents. On the other hand, when you try to build a search system directly on Apache Solr, you may need to implement parts such as the crawler yourself. Fess uses S2Robot, the crawler provided by the Seasar Project, to collect many types of documents from the Web and from file systems and make them searchable.

This article therefore introduces how to build a search server with Fess.

Intended audience

  • Those who want to build a search system
  • Those who are considering adding search functionality to existing systems
  • Those who are interested in Apache Solr

Required environment

The content of this article has been verified in the following environment.

  • Windows 7 (Service Pack1)
  • JDK 1.7.0_21

What is Fess?

Fess is an open source full-text search system for the Web and file systems. It is available from the Fess site on SourceForge.jp under the Apache License.

Fess features

Java-based search system

As the following figure shows, Fess is built from a variety of open source products.

Fess structure image0

The distribution is a Tomcat installation with the Fess and Solr war files deployed. The Fess war file provides the search and administration screens. Fess uses Seasar2 as its development framework and SAStruts in the presentation layer, so you can customize the screens easily by modifying the JSP files.

Fess stores its settings and crawl data in the embedded database H2 Database and accesses it through the O/R mapper DBFlute. S2Chronos, the scheduling framework provided by the Seasar Project, runs crawls at the times configured in Fess. Solr and S2Robot are described below.

Because Fess is built as a Java-based system, it can run on any platform that has a Java runtime. It also provides a UI for changing settings easily from a Web browser.

Apache Solr as the search engine

Apache Solr is an enterprise search server based on Lucene and is available from the Apache Software Foundation. Its features include faceted search, search result highlighting, and multiple output formats. The number of searchable documents depends on the Solr server configuration, but Solr can scale out to serve as the search server for a large site with several hundred million documents. It is widely used in Japan as well and can fairly be called a search engine that is attracting attention.

Fess uses Apache Solr as its search engine. Solr is bundled in the Fess distribution, but the Solr server that Fess uses can also be split out onto a separate server. Fess can also manage multiple Solr servers as a group to form a redundant configuration. In this way, Fess is designed to take advantage of the scalability of Solr.

S2Robot as the crawling engine

S2Robot is the crawler framework provided by the Seasar Project. S2Robot traverses documents on the Web or on a file system and collects them. It can process multiple documents concurrently in multiple threads, which makes document collection efficient. In addition to HTML, it can handle numerous formats, including MS Office files such as Word and Excel, archive files such as zip, and image and audio files (for image and audio files, it retrieves the meta-information).

Fess uses S2Robot to traverse documents on the Web and on file systems and to collect their text. Any file format that S2Robot can handle can be made searchable. The parameters for crawling with S2Robot can be set from the Fess administration UI.

Mobile support

Fess supports viewing on docomo, au, and SoftBank mobile phones. When you register documents to be indexed, you can specify on which devices they appear in search results. Viewing on mobile devices is not covered in this article and will be described next time.

Installation and startup

This section describes the steps to start Fess and run a search. The explanation assumes Windows, but you can install and start Fess with almost the same steps on Mac OS X and Linux.

Download and installation

Download the latest package from http://sourceforge.jp/projects/fess/releases/. The most recent version at the time of writing (June 2013) is 8.1.0. When the download has finished, unzip the package into any directory.

Download Fess image1

Launch

Set the CATALINA_HOME and JAVA_HOME environment variables appropriately, then run %CATALINA_HOME%\bin\startup.bat. For example, if you unzip fess-8.1.0.zip into C:\fess, CATALINA_HOME is C:\fess\fess-server-8.1.0.

Starting Fess

C:\fess\fess-server-8.1.0> set "JAVA_HOME=C:\Program Files\Java\jdk1.7.0_21"
C:\fess\fess-server-8.1.0> set CATALINA_HOME=C:\fess\fess-server-8.1.0
C:\fess\fess-server-8.1.0> cd bin
C:\fess\fess-server-8.1.0\bin> startup.bat

Once Fess has started, access http://localhost:8080/fess/ in a browser and the following screen appears.

Search top screen image2
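Tomcat can take a little while to deploy the war files after startup.bat returns, so the first browser request may fail. The following is a minimal Python sketch, not part of Fess, that polls the search top page until it responds. The function names, the number of attempts, and the delay are assumptions chosen for illustration, and the URL assumes the default port 8080 and the /fess/ context path.

```python
import time
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code of a GET request to url."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def wait_for_fess(url="http://localhost:8080/fess/", attempts=30, delay=2.0,
                  fetch=http_status):
    """Poll url until it answers with HTTP 200; return True on success.

    fetch is injectable so the polling logic can be exercised without a server.
    """
    for attempt in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            # Connection refused, timeouts and HTTP error responses end up here
            # (urllib's URLError and HTTPError derive from OSError).
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_for_fess() right after startup.bat returns True as soon as the page is served, or False if nothing answers within roughly a minute under these assumed defaults.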

Stop

To stop Fess, run shutdown.bat.

Stop Fess

C:\fess\fess-server-8.1.0\bin> shutdown.bat

Directory structure

The directory structure is as follows.

Directory structure

fess-server-8.1.0/
|-- LICENSE
|-- NOTICE
|-- RELEASE-NOTES
|-- RUNNING.txt
|-- bin/
|-- conf/
|-- extension/
|-- lib/
|-- logs/
|-- solr/
|   |-- contrib/
|   |-- core1/
|   |   |-- bin/
|   |   |-- conf/
|   |   |-- data/
|   |   `-- txlog/
|   |-- dist/
|   `-- lib/
|-- temp/
|-- webapps/
|   |-- fess/
|   |   |-- META-INF/
|   |   |-- WEB-INF/
|   |   |   |-- cachedirs/
|   |   |   |-- classes/
|   |   |   |-- db/
|   |   |   |-- cmd/
|   |   |   |-- conf/
|   |   |   |-- lib/
|   |   |   |-- orig/
|   |   |   |-- logs/
|   |   |   |-- view/
|   |   |   |-- fe.tld
|   |   |   |-- struts-config.xml
|   |   |   |-- validator-rules.xml
|   |   |   `-- web.xml
|   |   |-- css/
|   |   |-- js/
|   |   |-- images/
|   |   `-- jar/
|   |-- fess.war
|   |-- solr/
|   |-- solr.war
|   |-- manager/
|   `-- manager.war
`-- work/

The layout directly under the ‘fess-server-8.1.0’ directory is the same as Tomcat 7, with the Solr data directory ‘solr’ added and ‘fess.war’ and ‘solr.war’ deployed. When ‘fess.war’ is deployed, the JSP files for the search and administration screens are placed in ‘webapps/fess/WEB-INF/view’. The CSS files are placed in ‘webapps/fess/css’, so edit those files if you need to customize the appearance of the screens.

From indexing to searching

Immediately after startup no index has been created, so a search returns no results. You must therefore create the index first. As an example, this section creates an index of the pages under http://fess.codelibs.org/ja/ and then searches it.

Logging in to the administration page

First, access the administration page at http://localhost:8080/fess/admin and log in. By default, the user name and password are both admin.

Login to the management page image3

Registering the crawl target

Next, register the crawl target. Because the target here is a Web page, select [Web] from the menu on the left of the administration page. Nothing is registered in the initial state, so select [Create new].

Selecting [Create new] image4

In the Web crawl settings, this example crawls all pages under http://fess.codelibs.org/ja/. So that the results appear when searching from any PC or mobile phone, also select all browser types.

Web crawl settings image5
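This article does not show how Fess stores the restriction to pages under the start URL. Purely to illustrate what "all pages under a URL" means, the following Python sketch builds a regular expression from the start URL and tests candidate URLs against it. The function names are hypothetical and the pattern is an assumption for illustration, not a value taken from the Fess settings screen.

```python
import re


def under_prefix_pattern(start_url):
    """Build a pattern that matches start_url itself and everything below it."""
    # re.escape keeps characters such as "." in the host name literal.
    return re.compile(re.escape(start_url) + ".*")


def should_crawl(url, pattern):
    """Return True when the whole URL matches the pattern."""
    return pattern.fullmatch(url) is not None


pattern = under_prefix_pattern("http://fess.codelibs.org/ja/")
in_scope = should_crawl("http://fess.codelibs.org/ja/index.html", pattern)
out_of_scope = should_crawl("http://fess.codelibs.org/en/index.html", pattern)
```

Under these assumptions in_scope is True and out_of_scope is False, which is the behavior the crawl setting in this example is meant to achieve.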

Then click [Create] on the confirmation screen to register the crawl target. You can change the registered settings later from [Edit].

Web crawl settings after registration image6

Crawl schedule

Next, set the crawl schedule that determines when documents are collected. The crawl schedule is set from Crawl General in the menu on the left of the administration page.

The format is similar to Unix cron. From left to right, the fields represent seconds, minutes, hours, day of month, month, and day of week. For example, to crawl at 12:10 every day, enter ‘0 10 12 * * ?’.

Crawl schedule image7
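The six-field order is easy to get wrong because Unix cron has no seconds field. The following Python sketch, which is not part of Fess, labels the fields of an expression and builds a daily schedule string. The function and field names are hypothetical, and the trailing ‘?’ is simply copied from the example in this article.

```python
FIELD_NAMES = ("seconds", "minutes", "hours", "day_of_month", "month", "day_of_week")


def describe_schedule(expression):
    """Map each of the six whitespace-separated fields to its meaning."""
    fields = expression.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected 6 fields, got %d" % len(fields))
    return dict(zip(FIELD_NAMES, fields))


def daily_schedule(hour, minute, second=0):
    """Build an expression that fires once a day at hour:minute:second."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 59):
        raise ValueError("time of day out of range")
    # Day of month and month are wildcards; "?" follows the article's example.
    return "%d %d %d * * ?" % (second, minute, hour)


fields = describe_schedule(daily_schedule(12, 10))
```

Here daily_schedule(12, 10) yields the same string as the example, and fields["hours"] is "12" while fields["minutes"] is "10".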

You can check whether the crawl has started and the index is being created from Session Information in the menu on the left. When the crawl is complete, the number of documents is shown under the search index size (Web/file) in the session information.

Checking the crawl status image8

Example of a completed crawl image9

Search examples

After the crawl completes, a search returns results like those in the image below.

Search example image10

Customizing the search screen

This section shows how to customize the screens that users see most often: the search top screen and the search results list screen.

Here we show how to change the logo file name. The design itself is written in simple JSP files, so if you want to change it you can do so with a knowledge of HTML.

First, the search top screen is the ‘webapps/fess/WEB-INF/view/index.jsp’ file.

JSP file for the search top screen

<%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"%>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="content-style-type" content="text/css">
<meta http-equiv="content-script-type" content="text/javascript">
<title>Fess</title>
<link href="${f:url('/css/style.css')}" rel="stylesheet" type="text/css">
</head>
<body>
<div id='main'>
<s:form action="search">
  <table>
    <tbody>
      <tr>
        <td><img id="logo" src="${f:url('/images/logo.gif')}" alt="<bean:message key="labels.search_top_logo_alt"/>" /></td>
      </tr>
      <tr>
        <td><div class="input">
          <html:text styleClass="query" property="query" title="Search" size="50" maxlength="1000" />
          <input class="btn" type="submit" value="<bean:message key="labels.top.search"/>" name="search" />
        </div></td>
      </tr>
    </tbody>
  </table>
</s:form>
</div>
</body>
</html>

To change the image shown on the search top screen, change the ‘logo.gif’ file name in the JSP to the name of the file you want to use. Place the file in ‘webapps/fess/images’.
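The two manual steps (placing the image and editing the file name in the JSP) can also be scripted. The following Python sketch is only an illustration under stated assumptions: the function name is hypothetical, the webapp directory is the deployed ‘webapps/fess’ directory, and the JSP refers to the image as ‘/images/logo.gif’ as in the listing above. Back up the JSP before trying anything like this.

```python
import shutil
from pathlib import Path


def use_custom_logo(webapp_dir, new_logo, jsp_name="index.jsp", old_name="logo.gif"):
    """Copy new_logo into images/ and point the JSP at it; return the JSP path."""
    webapp_dir = Path(webapp_dir)
    new_logo = Path(new_logo)

    # Place the image in the directory the article names for logo files.
    shutil.copy(new_logo, webapp_dir / "images" / new_logo.name)

    # Rewrite the image reference inside the JSP (the page directive declares UTF-8).
    jsp_path = webapp_dir / "WEB-INF" / "view" / jsp_name
    text = jsp_path.read_text(encoding="utf-8")
    jsp_path.write_text(
        text.replace("/images/" + old_name, "/images/" + new_logo.name),
        encoding="utf-8",
    )
    return jsp_path
```

For example, use_custom_logo(r"C:\fess\fess-server-8.1.0\webapps\fess", "mylogo.png") would copy a hypothetical mylogo.png and update index.jsp; passing jsp_name="search.jsp" and old_name="logo-head.gif" would do the same for the results screen.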

<s:form> and <bean:message> are JSP tags. For example, <s:form> is converted to an HTML form tag when the page is actually displayed. For details, see the SAStruts site or sites about JSP.

The search results list screen is the ‘webapps/fess/WEB-INF/view/search.jsp’ file.

Part of the JSP file for the search results list screen

<div id="header">
  <s:form action="search">
    <div class="input">
      <s:link action="index" title="Fess Home">
        <img class="logo" src="${f:url('/images/logo-head.gif')}" alt="<bean:message key="labels.search_header_logo_alt"/>" />
      </s:link>
      <html:text styleClass="query" property="query" title="Search" size="50" maxlength="1000" />
      <input class="btn" type="submit" value="<bean:message key="labels.search"/>" name="search" />
    </div>
  </s:form>
</div>

To change the image shown at the top of the search results screen, change the ‘logo-head.gif’ file name. As with ‘logo.gif’, place the file in ‘webapps/fess/images’.

To change the CSS used by the JSP files, edit ‘style.css’ in ‘webapps/fess/css’.

Summary

This article introduced Fess, a full-text search system, from installation through searching and simple customization. It showed that you can easily build a search system with only a Java runtime environment and no other special environment. Fess can also be used when you want to add site search to an existing system, so please give it a try.

Next time, I would like to introduce the mobile site search feature that Fess supports.