Installing Coreseek

2017-06-26 20:14:23  对羊弹琴

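Sphinx is an open-source search engine that supports full-text search for English out of the box, so a stock Sphinx install already gives you full-text indexing. Usually, though, what we need is Chinese search. For that there is Coreseek, a Chinese full-text search engine built on Sphinx by Chinese developers and intended for production use. In other words, the core of Coreseek is still Sphinx.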

I went with version 4.1.

Download the tarball and extract it:

wget http://files.opstool.com/man/coreseek-4.1-beta.tar.gz
tar -zxvf coreseek-4.1-beta.tar.gz 

The directory layout looks like this:

[root@centos6 coreseek-4.1-beta]# ll
total 16
drwxrwxrwx. 15 root root 4096 Oct 18  2011 csft-4.1
drwxrwxrwx.  9 root root 4096 Oct 18  2011 mmseg-3.2.14
-rwxrwxrwx.  1 root root 2467 Jan 16  2011 README.txt
drwxrwxrwx.  6 root root 4096 Oct 18  2011 testpack

Install mmseg first. mmseg is the Chinese word-segmentation component; without it Coreseek cannot do Chinese search.

[root@centos6 coreseek-4.1-beta]# cd mmseg-3.2.14/
[root@centos6 mmseg-3.2.14]# ./bootstrap 
+ aclocal -I config
./bootstrap: line 23: aclocal: command not found
+ libtoolize --force --copy
./bootstrap: line 24: libtoolize: command not found
+ autoheader
./bootstrap: line 25: autoheader: command not found
+ automake --add-missing --copy
./bootstrap: line 26: automake: command not found
+ autoconf
./bootstrap: line 27: autoconf: command not found

bootstrap fails here because the autotools are not installed, so install the required packages:

yum -y install glibc-common libtool autoconf automake mysql-devel expat-devel
./bootstrap
./configure --prefix=/usr/local/mmseg3        # fails with: configure: error: C++ compiler cannot create executables
yum install gcc gcc-c++ gcc-g77    # install the compilers
./configure --prefix=/usr/local/mmseg3     # succeeds this time
make && make install
[root@centos6 mmseg-3.2.14]# /usr/local/mmseg3/bin/mmseg    # check that the install worked
Coreseek COS(tm) MM Segment 1.0
Copyright By Coreseek.com All Right Reserved.....
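
The two failures above (missing autotools, then a missing C++ compiler) can be caught before running ./bootstrap. The following is a minimal sketch, not part of the Coreseek tooling: the helper name `missing_tools` is made up for illustration, and the tool list in the usage line is taken from the error output above plus g++ and make.

```shell
# Report which of the given commands are missing from PATH.
# Prints one missing name per line; returns 1 if anything is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
# missing_tools aclocal libtoolize autoheader automake autoconf g++ make
```

If the function prints nothing and returns 0, the bootstrap and configure steps should at least get past the "command not found" stage.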

Install Coreseek:

cd ../csft-4.1/
sh buildconf.sh
./configure --prefix=/usr/local/coreseek  --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql
make && make install

Test mmseg segmentation and Coreseek search:

cd ../testpack
cat var/test/test.xml  # Chinese text should display correctly here
/usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc var/test/test.xml
/usr/local/coreseek/bin/indexer -c etc/csft.conf --all
/usr/local/coreseek/bin/search -c etc/csft.conf 网络搜索    # result:
words:
1. '网络': 1 documents, 1 hits
2. '搜索': 2 documents, 5 hits
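
Whether `cat` shows readable Chinese depends on the terminal as well as the file, so it can help to check the file's encoding separately. The sketch below assumes `iconv` is installed; the helper name `is_utf8` is made up for illustration. It relies on iconv exiting with an error when it meets a byte sequence that is not valid UTF-8.

```shell
# Return 0 if the file decodes cleanly as UTF-8, non-zero otherwise.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Usage, from the testpack directory:
# is_utf8 var/test/test.xml && echo "test.xml is valid UTF-8"
```

If the file passes but the terminal still shows garbage, the terminal locale is the more likely culprit.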

Configuration:

cd /usr/local/coreseek/etc
cp sphinx-min.conf.dist csft.conf
vim csft.conf

source article_src
{
    type            = mysql

    sql_host        = 192.168.189.128
    sql_user        = remote
    sql_pass        = 123456
    sql_db            = coreseek_test
    sql_port        = 3306    # optional, default is 3306

    sql_query         = SELECT * from article
    sql_query_pre        = SET NAMES utf8

#    sql_attr_uint        = group_id
#    sql_attr_timestamp    = date_added

#    sql_query_info        = SELECT * FROM documents WHERE id=$id
}


index article_index
{
    source            = article_src
    path            = /usr/local/coreseek/var/data/article_index
    docinfo            = extern
    charset_dictpath    = /usr/local/mmseg3/etc/
    charset_type        = zh_cn.utf-8
    min_word_len        = 1
}
indexer
{
        mem_limit               = 32M
}


searchd
{
        listen                  = 9312
        listen                  = 9306:mysql41
        log                     = /usr/local/coreseek/var/log/searchd.log
        query_log               = /usr/local/coreseek/var/log/query.log
        read_timeout            = 5
        max_children            = 30
        pid_file                = /usr/local/coreseek/var/log/searchd.pid
        max_matches             = 1000
        seamless_rotate         = 1
        preopen_indexes         = 1
        unlink_old              = 1
        workers                 = threads # for RT to work
}
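
The source above indexes an `article` table with `SELECT *`, but the table itself is never shown. Sphinx expects the first column returned by sql_query to be a unique, non-zero unsigned integer document ID, so the table needs such a column in first position. The following is a hypothetical schema with two sample rows: only the table name, host, user, and database come from the config, while the columns `id`, `title`, and `content` and the row contents are assumptions for illustration.

```shell
# Print a hypothetical schema plus sample rows for the `article` table.
# Column names and row contents are assumptions, not taken from the original setup.
article_sample_sql() {
    cat <<'SQL'
SET NAMES utf8;
CREATE TABLE IF NOT EXISTS article (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title   VARCHAR(255) NOT NULL,
    content TEXT NOT NULL
) DEFAULT CHARSET=utf8;
INSERT INTO article (title, content) VALUES
    ('Search engines', '网络搜索引擎的基本原理'),
    ('Networking', '移动通信网络简介');
SQL
}

# Usage (-p prompts for the password):
# article_sample_sql | mysql -h 192.168.189.128 -u remote -p coreseek_test
```

With `id` as the first column, `SELECT * from article` hands Sphinx the document ID where it expects it; listing the columns explicitly in sql_query would make that ordering independent of the table definition.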

Testing:

Insert a few rows into the article table, then build the index:

/usr/local/coreseek/bin/indexer --all --rotate
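
With `--rotate`, indexer writes the new index files next to the live ones and signals a running searchd to swap them in. On a first build, or whenever searchd is not running, there is nothing to signal and plain `--all` is enough. A minimal sketch of choosing between the two, assuming the pid_file path from the searchd section above; the helper name `indexer_flags` is made up for illustration.

```shell
# Print the indexer flags to use: add --rotate only when the pid file
# exists and names a process that is still alive.
indexer_flags() {
    pid_file=$1
    if [ -f "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        echo "--all --rotate"
    else
        echo "--all"
    fi
}

# Usage (the substitution is left unquoted on purpose so the flags split into words):
# /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf \
#     $(indexer_flags /usr/local/coreseek/var/log/searchd.pid)
```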

Test a search:

[root@centos6 coreseek]# /usr/local/coreseek/bin/search 通信
Coreseek Fulltext 4.1 [ Sphinx 2.0.2-dev (r2922)]
Copyright (c) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)

 using config file '/usr/local/coreseek/etc/csft.conf'...
index 'article_index': query '通信 ': returned 1 matches of 1 total in 0.000 sec

displaying matches:
1. document=2, weight=1680

words:
1. '通信': 1 documents, 1 hits
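
The `search` tool used above reads the index files directly, so the searchd section of csft.conf only comes into play once the daemon is started. Because the config listens on 9306 with the mysql41 protocol, a regular mysql client can then send SphinxQL queries. The following is a sketch under those assumptions, not something from the original setup; the helper name `sphinxql_match` is made up, and it assumes the search phrase contains no quote characters.

```shell
# Build a SphinxQL full-text query for one index.
# $1 = index name, $2 = search phrase (assumed to contain no quotes).
sphinxql_match() {
    printf "SELECT * FROM %s WHERE MATCH('%s');\n" "$1" "$2"
}

# Usage:
# /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft.conf          # start the daemon
# sphinxql_match article_index 通信 | mysql -h 127.0.0.1 -P 9306 --default-character-set=utf8
# /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft.conf --stop   # stop it again
```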


