Enhancing Keyphrase Extraction from Academic Articles

Date: 2023-03-31 10:32:56

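As Internet technology has developed, information overload has become increasingly pronounced, and users spend a lot of time finding the information they need. Keyphrases that concisely summarize a document help users locate and understand it quickly. For academic resources, most existing studies extract keyphrases from a paper's title and abstract. We find that the titles of a paper's references also contain author-assigned keyphrases. This article therefore adds reference information to the source text and measures its effect on keyphrase extraction using two typical unsupervised methods (TF*IDF and TextRank), two representative traditional supervised learning algorithms (Naïve Bayes and Conditional Random Field), and one supervised deep learning model (BiLSTM-CRF). The aim is to improve the quality of keyphrase recognition by expanding the source text. The experimental results show that reference information can improve the precision, recall, and F1 of automatic keyphrase extraction to a certain extent. This indicates that reference information is useful for keyphrase extraction from academic papers and offers a new direction for future research on automatic keyphrase extraction.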
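The idea of expanding the source text with reference titles can be illustrated with a minimal TF*IDF sketch. This is not the paper's implementation: the tokenizer, the smoothed IDF formula log((1 + N) / (1 + df)), the unigram-only scoring, the helper name `expand_with_references`, and all of the toy texts are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def expand_with_references(text, reference_titles):
    """Append reference titles to the title-and-abstract text (hypothetical helper)."""
    return " ".join([text, *reference_titles])


def tfidf_keywords(doc_tokens, corpus_tokens, top_k=5):
    """Rank the unigrams of one document by TF*IDF against a small corpus."""
    n_docs = len(corpus_tokens)
    # Document frequency: how many corpus documents contain each term.
    df = Counter(term for tokens in corpus_tokens for term in set(tokens))
    tf = Counter(doc_tokens)
    # Assumed smoothed IDF: log((1 + N) / (1 + df)).
    scores = {
        term: (count / len(doc_tokens)) * math.log((1 + n_docs) / (1 + df[term]))
        for term, count in tf.items()
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Toy data, invented for this example only.
title_abstract = "Keyphrase extraction from academic articles using graph ranking"
reference_titles = [
    "TextRank: bringing order into texts",
    "Automatic keyphrase extraction with conditional random fields",
]
background = [
    "Image classification with deep convolutional networks",
    "A survey of graph algorithms",
]

base_tokens = tokenize(title_abstract)
expanded_tokens = tokenize(expand_with_references(title_abstract, reference_titles))
background_tokens = [tokenize(doc) for doc in background]

base_keywords = tfidf_keywords(base_tokens, [base_tokens] + background_tokens, top_k=3)
expanded_keywords = tfidf_keywords(
    expanded_tokens, [expanded_tokens] + background_tokens, top_k=3
)

print("title + abstract only:", base_keywords)
print("with reference titles:", expanded_keywords)
```

In this toy setup, terms that recur in the reference titles gain term frequency in the expanded text, which is the intuition behind using references as an extra signal. A real system would score multi-word candidate phrases rather than single tokens.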

Original title: Enhancing Keyphrase Extraction from Academic Articles with their Reference Information

Original abstract: With the development of Internet technology, the phenomenon of information overload is becoming more and more obvious. It takes a lot of time for users to obtain the information they need. However, keyphrases that summarize document information highly are helpful for users to quickly obtain and understand documents. For academic resources, most existing studies extract keyphrases through the title and abstract of papers. We find that title information in references also contains author-assigned keyphrases. Therefore, this article uses reference information and applies two typical methods of unsupervised extraction methods (TF*IDF and TextRank), two representative traditional supervised learning algorithms (Naïve Bayes and Conditional Random Field) and a supervised deep learning model (BiLSTM-CRF), to analyze the specific performance of reference information on keyphrase extraction. It is expected to improve the quality of keyphrase recognition from the perspective of expanding the source text. The experimental results show that reference information can increase precision, recall, and F1 of automatic keyphrase extraction to a certain extent. This indicates the usefulness of reference information on keyphrase extraction of academic papers and provides a new idea for the following research on automatic keyphrase extraction.
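Precision, recall, and F1 for keyphrase extraction are commonly computed by comparing the predicted phrases with the author-assigned ones. The sketch below assumes exact matching after lowercasing and trimming, which may differ from the matching rule used in the paper. The function name `prf1` and the phrase lists are invented for illustration and are not results from the paper.

```python
def prf1(predicted, gold):
    """Exact-match precision, recall, and F1 over two collections of phrases."""
    pred = {p.lower().strip() for p in predicted}
    ref = {g.lower().strip() for g in gold}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy phrases, invented for this example only.
gold_keyphrases = [
    "keyphrase extraction",
    "reference information",
    "TextRank",
    "BiLSTM-CRF",
]
predicted_keyphrases = [
    "Keyphrase Extraction",
    "reference information",
    "information overload",
]

precision, recall, f1 = prf1(predicted_keyphrases, gold_keyphrases)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of the three predictions match the four gold phrases, so precision is 2/3, recall is 1/2, and F1 is their harmonic mean, 4/7.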