Lucene Full-Text Search: Add, Delete, Update, and Query Examples Explained

Time: 2023-06-13 09:11:51

Creating an Index

The previous post already covered the overall flow of creating an index with Lucene, so here is just a brief recap:

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
Directory directory = FSDirectory.open(new File("/tmp/testindex"));
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
IndexWriter iwriter = new IndexWriter(directory, config);
Document doc = new Document();
String text = "This is the text to be indexed.";
doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
// Add the document to the index, then close the writer to commit.
iwriter.addDocument(doc);
iwriter.close();

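1 Create a Directory that points at the index directory

2 Create the analyzer and the IndexWriter object

3 Create a Document object to hold the data

4 Close the IndexWriter, which commits the changes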

/**
 * Build the index.
 *
 * @throws Exception if the index cannot be written
 */
public static void index() throws Exception {
    String text1 = "hello,man!";
    String text2 = "goodbye,man!";
    String text3 = "hello,woman!";
    String text4 = "goodbye,woman!";

    Date date1 = new Date();
    analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
    directory = FSDirectory.open(new File(INDEX_DIR));
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
    indexWriter = new IndexWriter(directory, config);

    Document doc1 = new Document();
    doc1.add(new TextField("filename", "text1", Store.YES));
    doc1.add(new TextField("content", text1, Store.YES));
    indexWriter.addDocument(doc1);

    Document doc2 = new Document();
    doc2.add(new TextField("filename", "text2", Store.YES));
    doc2.add(new TextField("content", text2, Store.YES));
    indexWriter.addDocument(doc2);

    Document doc3 = new Document();
    doc3.add(new TextField("filename", "text3", Store.YES));
    doc3.add(new TextField("content", text3, Store.YES));
    indexWriter.addDocument(doc3);

    Document doc4 = new Document();
    doc4.add(new TextField("filename", "text4", Store.YES));
    doc4.add(new TextField("content", text4, Store.YES));
    indexWriter.addDocument(doc4);

    indexWriter.commit();
    indexWriter.close();

    Date date2 = new Date();
    System.out.println("Index creation took: " + (date2.getTime() - date1.getTime()) + "ms\n");
}

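Adding to an Index Incrementally

Lucene supports incremental indexing: new documents can be added without affecting what is already indexed, and Lucene merges the index files automatically at an appropriate time.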

/**
 * Add a document to the existing index.
 *
 * @throws Exception if the index cannot be written
 */
public static void insert() throws Exception {
    String text5 = "hello,goodbye,man,woman";

    Date date1 = new Date();
    analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
    directory = FSDirectory.open(new File(INDEX_DIR));
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
    indexWriter = new IndexWriter(directory, config);

    Document doc1 = new Document();
    doc1.add(new TextField("filename", "text5", Store.YES));
    doc1.add(new TextField("content", text5, Store.YES));
    indexWriter.addDocument(doc1);

    indexWriter.commit();
    indexWriter.close();

    Date date2 = new Date();
    System.out.println("Adding to the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
}

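Deleting from an Index

Deletion also goes through IndexWriter, using its delete methods (deleteDocuments in the code here). Given a keyword, it deletes every document that matches that keyword. If you only want to delete a single document, it is best to define a unique ID field and delete by that field.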

/**
 * Delete documents from the index.
 *
 * @param str the keyword to delete by, matched against the "filename" field
 * @throws Exception if the index cannot be written
 */
public static void delete(String str) throws Exception {
    Date date1 = new Date();
    analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
    directory = FSDirectory.open(new File(INDEX_DIR));
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
    indexWriter = new IndexWriter(directory, config);

    // Deletes every document whose "filename" field contains this term.
    indexWriter.deleteDocuments(new Term("filename", str));
    indexWriter.close();

    Date date2 = new Date();
    System.out.println("Deleting from the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
}
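To make the difference between deleting by a shared keyword and deleting by a unique ID concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not Lucene's implementation: the class ToyDeleteIndex and the fields "id" and "category" are hypothetical names chosen only for illustration, and values are matched exactly with no analysis.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory stand-in for an index; NOT Lucene's implementation. */
final class ToyDeleteIndex {
    // Each document is a map from field name to an exact (un-analyzed) value.
    private final List<Map<String, String>> docs = new ArrayList<>();

    public void addDocument(String id, String category) {
        Map<String, String> doc = new HashMap<>();
        doc.put("id", id);             // hypothetical unique ID field
        doc.put("category", category); // hypothetical field shared by several documents
        docs.add(doc);
    }

    /** Removes every document whose field equals value; returns how many were removed. */
    public int deleteDocuments(String field, String value) {
        int before = docs.size();
        docs.removeIf(doc -> value.equals(doc.get(field)));
        return before - docs.size();
    }

    public int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        ToyDeleteIndex index = new ToyDeleteIndex();
        index.addDocument("1", "greeting");
        index.addDocument("2", "greeting");
        index.addDocument("3", "farewell");
        // Deleting by the unique ID can remove at most one document.
        System.out.println("removed by id: " + index.deleteDocuments("id", "3"));
        // Deleting by a shared value removes every document that matches it.
        System.out.println("removed by category: " + index.deleteDocuments("category", "greeting"));
    }
}
```

The point of the sketch is only the matching rule: a delete keyed on a shared value is a bulk operation, so a dedicated unique field is the safer handle when exactly one document should go.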

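Updating an Index

Lucene has no true in-place update. You can update the documents that match a term in a given field, but under the hood it first deletes the matching documents and then indexes the new document.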

/**
 * Update a document in the index.
 *
 * @throws Exception if the index cannot be written
 */
public static void update() throws Exception {
    String text1 = "update,hello,man!";

    Date date1 = new Date();
    analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
    directory = FSDirectory.open(new File(INDEX_DIR));
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
    indexWriter = new IndexWriter(directory, config);

    Document doc1 = new Document();
    doc1.add(new TextField("filename", "text1", Store.YES));
    doc1.add(new TextField("content", text1, Store.YES));

    // Deletes the documents matching the term, then adds doc1.
    indexWriter.updateDocument(new Term("filename", "text1"), doc1);
    indexWriter.close();

    Date date2 = new Date();
    System.out.println("Updating the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
}
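The "delete first, then add" behaviour described above can be sketched in plain Java as follows. This is a toy in-memory model written for illustration, not Lucene's code: ToyUpdateIndex, its Doc class, and the helper methods are hypothetical names, and filenames are matched exactly rather than through an analyzer.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of "update = delete, then add"; NOT Lucene's implementation. */
final class ToyUpdateIndex {
    /** A document with two fields, mirroring the "filename" and "content" fields used above. */
    static final class Doc {
        final String filename;
        final String content;

        Doc(String filename, String content) {
            this.filename = filename;
            this.content = content;
        }
    }

    private final List<Doc> docs = new ArrayList<>();

    public void addDocument(Doc doc) {
        docs.add(doc);
    }

    /** Removes every document with the given filename. */
    public void deleteByFilename(String filename) {
        docs.removeIf(doc -> doc.filename.equals(filename));
    }

    /** Update is modelled as two steps: delete all matches, then add the replacement. */
    public void updateDocument(String filename, Doc replacement) {
        deleteByFilename(filename);
        addDocument(replacement);
    }

    /** Returns the content of every document with the given filename. */
    public List<String> contentsOf(String filename) {
        List<String> result = new ArrayList<>();
        for (Doc doc : docs) {
            if (doc.filename.equals(filename)) {
                result.add(doc.content);
            }
        }
        return result;
    }

    public int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        ToyUpdateIndex index = new ToyUpdateIndex();
        index.addDocument(new Doc("text1", "hello,man!"));
        index.updateDocument("text1", new Doc("text1", "update,hello,man!"));
        System.out.println(index.contentsOf("text1"));
    }
}
```

One consequence worth noticing in this model: when nothing matches the filename, the update simply behaves like an add, because the delete step removes nothing.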

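Searching the Index by Keyword

Lucene offers many kinds of queries, which are not covered in detail here. A search returns a collection of ScoreDoc objects, somewhat like a ResultSet, and for each hit we can read the stored values we want by field name.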

/**
 * Search by keyword.
 *
 * @param str the query string, parsed against the "content" field
 * @throws Exception if the index cannot be read or the query cannot be parsed
 */
public static void search(String str) throws Exception {
    directory = FSDirectory.open(new File(INDEX_DIR));
    analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
    DirectoryReader ireader = DirectoryReader.open(directory);
    IndexSearcher isearcher = new IndexSearcher(ireader);

    QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content", analyzer);
    Query query = parser.parse(str);

    // Fetch at most 1000 hits, with no filter.
    ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
    for (int i = 0; i < hits.length; i++) {
        Document hitDoc = isearcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("filename"));
        System.out.println(hitDoc.get("content"));
    }

    ireader.close();
    directory.close();
}
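The two-step pattern above (get back hits that carry only a document number and a score, then load the stored fields by name) can be sketched with a toy searcher in plain Java. Everything here is an assumption made for illustration: ToySearcher and Hit are hypothetical names, the tokenizer is a crude split on non-alphanumeric characters, and the score is just an occurrence count, which is not how Lucene scores documents.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy keyword searcher; NOT Lucene's implementation or scoring. */
final class ToySearcher {
    /** Mirrors the shape of a hit: a document number plus a score. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Stored fields per document; the list position is the document number.
    private final List<Map<String, String>> stored = new ArrayList<>();

    public void addDocument(String filename, String content) {
        Map<String, String> fields = new HashMap<>();
        fields.put("filename", filename);
        fields.put("content", content);
        stored.add(fields);
    }

    /** Returns up to n hits for a single keyword, highest score first. */
    public List<Hit> search(String field, String keyword, int n) {
        List<Hit> hits = new ArrayList<>();
        String wanted = keyword.toLowerCase();
        for (int doc = 0; doc < stored.size(); doc++) {
            String value = stored.get(doc).get(field);
            if (value == null) {
                continue;
            }
            int count = 0;
            // Lowercase and split on non-alphanumerics: a crude stand-in for an analyzer.
            for (String token : value.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
                if (token.equals(wanted)) {
                    count++;
                }
            }
            if (count > 0) {
                hits.add(new Hit(doc, count)); // toy score: number of occurrences
            }
        }
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        return hits.size() > n ? new ArrayList<>(hits.subList(0, n)) : hits;
    }

    /** Loads the stored fields of one document by its number. */
    public Map<String, String> doc(int docId) {
        return stored.get(docId);
    }

    public static void main(String[] args) {
        ToySearcher searcher = new ToySearcher();
        searcher.addDocument("text1", "hello,man!");
        searcher.addDocument("text3", "hello,woman!");
        for (Hit hit : searcher.search("content", "man", 1000)) {
            System.out.println(searcher.doc(hit.doc).get("filename"));
        }
    }
}
```

Keeping hits lightweight and fetching stored fields only for the documents you actually display is the design idea the sketch is meant to show.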

Complete Code

package test;

import java.io.File;
import java.util.Date;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class TestLucene {
    // Where the index is stored
    private static String INDEX_DIR = "D://luceneIndex";
    private static Analyzer analyzer = null;
    private static Directory directory = null;
    private static IndexWriter indexWriter = null;

    public static void main(String[] args) {
        try {
            // index();
            search("man");
            // insert();
            // delete("text5");
            // update();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Update a document in the index.
     *
     * @throws Exception if the index cannot be written
     */
    public static void update() throws Exception {
        String text1 = "update,hello,man!";

        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text1", Store.YES));
        doc1.add(new TextField("content", text1, Store.YES));

        // Deletes the documents matching the term, then adds doc1.
        indexWriter.updateDocument(new Term("filename", "text1"), doc1);
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Updating the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }

    /**
     * Delete documents from the index.
     *
     * @param str the keyword to delete by, matched against the "filename" field
     * @throws Exception if the index cannot be written
     */
    public static void delete(String str) throws Exception {
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        // Deletes every document whose "filename" field contains this term.
        indexWriter.deleteDocuments(new Term("filename", str));
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Deleting from the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }

    /**
     * Add a document to the existing index.
     *
     * @throws Exception if the index cannot be written
     */
    public static void insert() throws Exception {
        String text5 = "hello,goodbye,man,woman";

        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text5", Store.YES));
        doc1.add(new TextField("content", text5, Store.YES));
        indexWriter.addDocument(doc1);

        indexWriter.commit();
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Adding to the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }

    /**
     * Build the index.
     *
     * @throws Exception if the index cannot be written
     */
    public static void index() throws Exception {
        String text1 = "hello,man!";
        String text2 = "goodbye,man!";
        String text3 = "hello,woman!";
        String text4 = "goodbye,woman!";

        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text1", Store.YES));
        doc1.add(new TextField("content", text1, Store.YES));
        indexWriter.addDocument(doc1);

        Document doc2 = new Document();
        doc2.add(new TextField("filename", "text2", Store.YES));
        doc2.add(new TextField("content", text2, Store.YES));
        indexWriter.addDocument(doc2);

        Document doc3 = new Document();
        doc3.add(new TextField("filename", "text3", Store.YES));
        doc3.add(new TextField("content", text3, Store.YES));
        indexWriter.addDocument(doc3);

        Document doc4 = new Document();
        doc4.add(new TextField("filename", "text4", Store.YES));
        doc4.add(new TextField("content", text4, Store.YES));
        indexWriter.addDocument(doc4);

        indexWriter.commit();
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Index creation took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }

    /**
     * Search by keyword.
     *
     * @param str the query string, parsed against the "content" field
     * @throws Exception if the index cannot be read or the query cannot be parsed
     */
    public static void search(String str) throws Exception {
        directory = FSDirectory.open(new File(INDEX_DIR));
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        DirectoryReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);

        QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content", analyzer);
        Query query = parser.parse(str);

        // Fetch at most 1000 hits, with no filter.
        ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
        for (int i = 0; i < hits.length; i++) {
            Document hitDoc = isearcher.doc(hits[i].doc);
            System.out.println(hitDoc.get("filename"));
            System.out.println(hitDoc.get("content"));
        }

        ireader.close();
        directory.close();
    }
}
