A Brief Introduction to Elasticlunr.js

Elasticlunr.js

Project site: http://elasticlunr.com/
Source code: https://github.com/weixsong/elasticlunr.js
Documentation: http://elasticlunr.com/docs/index.html

Elasticlunr.js is a lightweight full-text search engine written in JavaScript for in-browser and offline search.
Elasticlunr.js is based on Lunr.js but is more flexible: it provides query-time boosting and field search.
Elasticlunr.js is a bit like Solr, but much smaller and not as bright; it still offers flexible configuration and query-time boosting.

Key Features Compared with Lunr.js

  • Query-time boosting: you don’t need to set boosting weights while building the index, which makes it easy to try different boosting schemes.
  • More rational scoring mechanism: Elasticlunr.js uses much the same scoring mechanism as Elasticsearch, which is also the one used by Lucene.
  • Field search: you can choose which fields to index and which fields to search.
  • Boolean model: you can set which fields to search and the Boolean model applied to the query tokens, such as “OR” or “AND”.
  • Combined Boolean model, TF/IDF model and vector space model, which makes result ranking more reliable.
  • Fast: Elasticlunr.js removed TokenCorpus and Vector from lunr.js. With the combined model there is no need to compute a document vector and a query vector in order to measure the similarity between the query and a matched document, which improves search speed significantly.
  • Small index file: because no query or document vectors are computed, Elasticlunr.js does not store TokenCorpus, so the index file is very small. This is especially helpful when Elasticlunr.js is used for offline search.
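
The combined model in the list above can be illustrated with a toy scorer: a Boolean "OR" pass over an inverted index selects candidate documents, and a TF/IDF weight ranks them, without building any document or query vector. This is only a sketch and not the Elasticlunr.js scoring code; the helper names (`tokenize`, `buildInvertedIndex`, `scoreQuery`) and the IDF formula `1 + log(N / (df + 1))` are assumptions chosen for illustration.

```javascript
// Toy illustration only; not the Elasticlunr.js implementation.
function tokenize(text) {
    return text.toLowerCase().split(/\W+/).filter(function (t) {
        return t.length > 0;
    });
}

// Build token -> { docRef: termFrequency } for a single field.
function buildInvertedIndex(docs, field) {
    var inverted = Object.create(null);
    docs.forEach(function (doc) {
        tokenize(doc[field]).forEach(function (token) {
            inverted[token] = inverted[token] || {};
            inverted[token][doc.id] = (inverted[token][doc.id] || 0) + 1;
        });
    });
    return inverted;
}

// "OR" semantics: any document containing any query token is a candidate.
// Each matching token adds sqrt(tf) * idf to the document's score.
function scoreQuery(inverted, docCount, query) {
    var scores = {};
    tokenize(query).forEach(function (token) {
        var postings = inverted[token];
        if (!postings) { return; }
        var df = Object.keys(postings).length;
        var idf = 1 + Math.log(docCount / (df + 1)); // assumed formula
        Object.keys(postings).forEach(function (ref) {
            scores[ref] = (scores[ref] || 0) + Math.sqrt(postings[ref]) * idf;
        });
    });
    return Object.keys(scores)
        .map(function (ref) { return { ref: ref, score: scores[ref] }; })
        .sort(function (a, b) { return b.score - a.score; });
}

var toyDocs = [
    { id: 1, title: 'Oracle released its latest database' },
    { id: 2, title: 'Oracle released its profit report' },
    { id: 3, title: 'Weather report' }
];
var toyIndex = buildInvertedIndex(toyDocs, 'title');
var toyResults = scoreQuery(toyIndex, toyDocs.length, 'oracle database');
```

In this sketch document 1 ranks above document 2 because it matches both query tokens, and document 3 is never scored because it matches none.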

Example

A very simple search index can be created using the following script:

var index = elasticlunr(function () {
    this.addField('title');
    this.addField('body');
    this.setRef('id');
});

Adding documents to the index is as simple as:

var doc1 = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

var doc2 = {
    "id": 2,
    "title": "Oracle released its profit report of 2015",
    "body": "As expected, Oracle released its profit report of 2015, during the good sales of database and hardware, Oracle's profit of 2015 reached 12.5 Billion."
}

index.addDoc(doc1);
index.addDoc(doc2);

Searching is just as simple:

index.search("Oracle database profit");

You can also do query-time boosting by passing in a configuration:

index.search("Oracle database profit", {
    fields: {
        title: {boost: 2},
        body: {boost: 1}
    }
});

This returns a list of matching documents with a score of how closely they match the search query:

[{
    "ref": 1,
    "score": 0.5376053707962494
},
{
    "ref": 2,
    "score": 0.5237481076838757
}]
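
Each result carries only a `ref` and a `score`, so an application usually keeps its own lookup from reference to document. A minimal sketch, assuming the documents are held in a plain object keyed by `id`; the `docsByRef` store and the `resolveResults` helper are hypothetical names, not part of Elasticlunr.js:

```javascript
// Hypothetical document store, keyed by the same value used as the ref.
var docsByRef = {
    1: { id: 1, title: 'Oracle released its latest database Oracle 12g' },
    2: { id: 2, title: 'Oracle released its profit report of 2015' }
};

// Results in the shape shown above: [{ ref, score }, ...].
var sampleResults = [
    { ref: 1, score: 0.5376053707962494 },
    { ref: 2, score: 0.5237481076838757 }
];

// Attach the full document to each hit, preserving the score order.
function resolveResults(results, store) {
    return results.map(function (hit) {
        return { score: hit.score, doc: store[hit.ref] };
    });
}

var hits = resolveResults(sampleResults, docsByRef);
```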

API documentation is available, as well as a full working example.

Description

Elasticlunr.js is based on Lunr.js but is more flexible: it provides query-time boosting and field search.
It is a bit like Solr, but much smaller and not as bright; it still offers flexible configuration and query-time boosting.

Why

  1. In some systems you don’t want to deploy a web server (such as Apache or Nginx); you only serve static web pages and provide search on the client side. You can then build the index in advance and load it on the client.
  2. Offline search. Users often download documentation; you can build an index, ship it in the documentation package, and provide offline search.
  3. For limited or restricted networks, such as a WAN or LAN, offline search is a better choice.
  4. On mobile devices such as an iPhone or Android phone, network traffic may be very expensive, so offline search is a good choice.

Installation

Simply include the elasticlunr.js source file in the page where you want to use it. Elasticlunr.js is supported in all modern browsers.

Browsers that do not support ES5 will require a JavaScript shim for Elasticlunr.js to work. You can either use Augment.js, ES5-Shim or any library that patches old browsers to provide an ES5 compatible JavaScript environment.

Documentation

This section covers only the important aspects of Elasticlunr.js; for the full documentation, see the API documentation.

1. Build Index

When you first create an index instance, you need to specify which fields to index. If you do not specify any, no field of your documents will be searchable.
You can specify fields like this:

var index = elasticlunr(function () {
    this.addField('title');
    this.addField('body');
    this.setRef('id');
});

You can also set the document reference field with this.setRef('id'); if you do not set it, Elasticlunr.js uses ‘id’ by default.

The same index setup can also be written as follows:

var index = elasticlunr();
index.addField('title');
index.addField('body');
index.setRef('id');

English is the default supported language of Elasticlunr.js. To index documents in other languages, use Elasticlunr.js together with lunr-languages.
Assuming you are using lunr-languages in a Node.js environment, you can import it as follows:

var lunr = require('./lib/lunr.js');
require('./lunr.stemmer.support.js')(lunr);
require('./lunr.de.js')(lunr);

var idx = lunr(function () {
    // use the language (de)
    this.use(lunr.de);
    // then, the normal lunr index initialization
    this.field('title');
    this.field('body');
});

For more details, see lunr-languages.

2. Add document to index

Adding a document to the index is simple: prepare your document in JSON format, then add it to the index.

var doc1 = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

var doc2 = {
    "id": 2,
    "title": "Oracle released its profit report of 2015",
    "body": "As expected, Oracle released its profit report of 2015, during the good sales of database and hardware, Oracle's profit of 2015 reached 12.5 Billion."
}

index.addDoc(doc1);
index.addDoc(doc2);

If your JSON document contains a field that is not configured in the index, that field will not be indexed and therefore is not searchable.
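
The effect of field configuration can be pictured with a small stand-alone model. This is a sketch of the behaviour described above, not Elasticlunr.js code; `indexConfiguredFields` and the sample document are hypothetical.

```javascript
// Toy model of field selection; not the Elasticlunr.js implementation.
// Only fields named in `fields` are tokenized; everything else is ignored.
function indexConfiguredFields(doc, fields) {
    var indexed = {};
    fields.forEach(function (field) {
        if (typeof doc[field] === 'string') {
            indexed[field] = doc[field].toLowerCase().split(/\W+/).filter(Boolean);
        }
    });
    return indexed;
}

var docWithExtra = {
    id: 3,
    title: 'Quarterly report',
    body: 'Sales grew',
    author: 'Alice' // not configured below, so it never reaches the index
};
var indexedTokens = indexConfiguredFields(docWithExtra, ['title', 'body']);
```

Here a query for "alice" could never match, because the `author` field produced no tokens.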

3. Remove document from index

Elasticlunr.js supports removing a document from the index: pass the JSON document to the elasticlunr.Index.prototype.removeDoc() function.

For example:

var doc = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

index.removeDoc(doc);

Removing a document removes every token of each of its fields from the corresponding field-specific inverted index.
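
To make the removal step concrete, here is a toy field-specific inverted index with add and remove operations. It is a sketch under assumed data structures (`field -> token -> { ref: true }`); the names `addToIndex` and `removeFromIndex` are hypothetical and do not reflect how Elasticlunr.js stores its index internally.

```javascript
// Toy field-specific inverted index: field -> token -> { ref: true }.
// Illustration only; not the Elasticlunr.js implementation.
function tokens(text) {
    return text.toLowerCase().split(/\W+/).filter(Boolean);
}

function addToIndex(index, doc, fields) {
    fields.forEach(function (field) {
        index[field] = index[field] || Object.create(null);
        tokens(doc[field]).forEach(function (token) {
            index[field][token] = index[field][token] || {};
            index[field][token][doc.id] = true;
        });
    });
}

function removeFromIndex(index, doc, fields) {
    fields.forEach(function (field) {
        var postingsByToken = index[field];
        if (!postingsByToken) { return; }
        // Re-tokenize the document to find which postings to delete.
        tokens(doc[field]).forEach(function (token) {
            var postings = postingsByToken[token];
            if (!postings) { return; }
            delete postings[doc.id];
            // Drop the token entirely once no document references it.
            if (Object.keys(postings).length === 0) {
                delete postingsByToken[token];
            }
        });
    });
}

var toyFields = ['title'];
var fieldIndex = {};
var first = { id: 1, title: 'Oracle released its latest database' };
var second = { id: 2, title: 'Oracle released its profit report' };
addToIndex(fieldIndex, first, toyFields);
addToIndex(fieldIndex, second, toyFields);
removeFromIndex(fieldIndex, first, toyFields);
```

After the removal, tokens that occurred only in the first document (such as "latest") are gone, while shared tokens (such as "oracle") still point to the second document.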

4. Update a document in index

Elasticlunr.js supports updating a document in the index: pass the JSON document to the elasticlunr.Index.prototype.update() function.

For example:

var doc = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

index.update(doc);

5. Query from Index

Elasticlunr.js provides flexible query configuration and supports query-time boosting and Boolean logic settings.
You can pass a configuration that tells Elasticlunr.js how to do query-time boosting, which fields to search, and which Boolean logic to apply.
Or you can simply provide a query string; this also works well because the scoring mechanism is effective.

5.1 Simple Query

Because of the scoring mechanism in Elasticlunr.js, a simple search is enough for most requirements.

index.search("Oracle database profit");

The output is a results array. Each element is an object containing a ref field and a score field.
ref is the document reference.
score is the similarity measurement.

The results array is sorted by score in descending order.

5.2 Configuration Query

5.2.1 Query-Time Boosting

Specify which fields to search by passing in a JSON configuration, and set a boost for each search field.
With this configuration, Elasticlunr.js searches for the query string only in the specified fields, applying the given boost weights.

The scoring mechanism used in Elasticlunr.js is fairly complex; see the details for more information.

index.search("Oracle database profit", {
    fields: {
        title: {boost: 2},
        body: {boost: 1}
    }
});
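
Conceptually, query-time boosting means the same index can be re-ranked with different weights on each search. The toy function below combines per-field scores as a boost-weighted sum; this is an assumed simplification for illustration, not the actual Elasticlunr.js scoring formula, and `combineFieldScores` and the sample scores are hypothetical.

```javascript
// Toy model: combine per-field scores using query-time boosts.
// fieldScores: field -> { ref: score }. Fields absent from config are skipped.
function combineFieldScores(fieldScores, config) {
    var combined = {};
    Object.keys(config.fields).forEach(function (field) {
        var boost = config.fields[field].boost;
        var scores = fieldScores[field] || {};
        Object.keys(scores).forEach(function (ref) {
            combined[ref] = (combined[ref] || 0) + boost * scores[ref];
        });
    });
    return Object.keys(combined)
        .map(function (ref) { return { ref: ref, score: combined[ref] }; })
        .sort(function (a, b) { return b.score - a.score; });
}

// Made-up per-field scores for two documents.
var perField = {
    title: { 1: 0.2, 2: 0.5 },
    body: { 1: 0.6, 2: 0.1 },
    tags: { 1: 9 } // ignored below because it is not listed in the config
};

var titleHeavy = combineFieldScores(perField, {
    fields: { title: { boost: 2 }, body: { boost: 1 } }
});
var bodyHeavy = combineFieldScores(perField, {
    fields: { title: { boost: 1 }, body: { boost: 3 } }
});
```

With the title-heavy weights document 2 ranks first; with the body-heavy weights document 1 does, and neither change requires touching the index.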

5.2.2 Boolean Model

Elasticlunr.js also supports a Boolean logic setting. If none is set, Elasticlunr.js uses “OR” logic by default, which yields high recall.

index.search("Oracle database profit", {
    fields: {
        title: {boost: 2},
        body: {boost: 1}
    },
    boolean: "OR"
});

The Boolean operation is performed per field. If you choose “AND” logic, documents that contain all the query tokens in the queried field are returned as that field's results. If you query multiple fields, the results of the different fields are merged to give the final query results.
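
The per-field behaviour can be sketched with a toy matcher: Boolean logic is applied inside each field, and the per-field matches are then merged. This models only the description above and is not Elasticlunr.js code; `fieldTokens`, `matchField` and `searchFields` are hypothetical names with made-up data.

```javascript
// Toy model of per-field Boolean matching; not the Elasticlunr.js implementation.
// fieldTokens: field -> ref -> tokens found in that field of that document.
var fieldTokens = {
    title: { 1: ['oracle', 'database'], 2: ['oracle', 'profit'], 3: ['oracle'] },
    body: { 1: ['oracle', 'database', 'profit'], 2: ['profit'], 3: ['database', 'profit'] }
};

// Apply AND/OR within a single field and return the matching refs.
function matchField(tokensByRef, queryTokens, bool) {
    return Object.keys(tokensByRef).filter(function (ref) {
        var has = function (token) {
            return tokensByRef[ref].indexOf(token) !== -1;
        };
        return bool === 'AND' ? queryTokens.every(has) : queryTokens.some(has);
    });
}

// Merge (union) the per-field results into the final result set.
function searchFields(fields, queryTokens, bool) {
    var merged = {};
    fields.forEach(function (field) {
        matchField(fieldTokens[field], queryTokens, bool).forEach(function (ref) {
            merged[ref] = true;
        });
    });
    return Object.keys(merged).sort();
}

var andRefs = searchFields(['title', 'body'], ['oracle', 'profit'], 'AND');
var orRefs = searchFields(['title', 'body'], ['oracle', 'profit'], 'OR');
```

In this sketch document 3 has "oracle" in its title and "profit" in its body, yet it is excluded under "AND" because no single field contains both tokens; under "OR" all three documents match.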

Published by: 全栈程序员-站长. When reposting, please credit the source: https://javaforall.net/115845.html Original link: https://javaforall.net
