HTMOT: Hierarchical Topic Modelling Over Time

Time: 2023-04-18 14:48:17

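For years, topic models have offered an efficient way to extract insights from text. Although many models have been proposed, none can model topic temporality and hierarchy jointly. Modelling time yields more precise topics by separating topics that are lexically close but temporally distinct, while modelling hierarchy gives a more detailed view of the content of a document corpus. In this study, we therefore propose a new method, HTMOT, to perform hierarchical topic modelling over time. We train HTMOT with a new, more efficient implementation of Gibbs sampling. Specifically, we show that applying time modelling only to deep sub-topics provides a way to extract specific stories or events, while high-level topics capture broader themes in the corpus. Our results show that the training procedure is fast and extracts accurate high-level topics and temporally precise sub-topics. We measured the model's performance with the Word Intrusion task and outline some limitations of this evaluation method, especially for hierarchical models. As a case study, we focus on developments in the space industry in 2020.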

Original title: HTMOT: Hierarchical Topic Modelling Over Time

Original abstract: Over the years, topic models have provided an efficient way of extracting insights from text. However, while many models have been proposed, none are able to model topic temporality and hierarchy jointly. Modelling time provides more precise topics by separating lexically close but temporally distinct topics, while modelling hierarchy provides a more detailed view of the content of a document corpus. In this study, we therefore propose a novel method, HTMOT, to perform Hierarchical Topic Modelling Over Time. We train HTMOT using a new implementation of Gibbs sampling, which is more efficient. Specifically, we show that only applying time modelling to deep sub-topics provides a way to extract specific stories or events, while high-level topics extract larger themes in the corpus. Our results show that our training procedure is fast and can extract accurate high-level topics and temporally precise sub-topics. We measured our model's performance using the Word Intrusion task and outlined some limitations of this evaluation method, especially for hierarchical models. As a case study, we focused on the various developments in the space industry in 2020.
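The abstract names the Word Intrusion task without describing it. As a rough illustration only, here is a minimal Python sketch of how such a task is commonly set up: the top words of one topic are mixed with a single "intruder" word drawn from another topic, and the score is the fraction of trials in which an annotator spots the intruder. This is not the authors' evaluation code; the function names `build_intrusion_set` and `intrusion_accuracy` and the toy topics are hypothetical, and the sketch ignores hierarchy entirely.

```python
import random

# Toy, made-up topics (words sorted by decreasing probability); not from the paper.
TOY_TOPICS = {
    "launch": ["rocket", "launch", "booster", "orbit", "payload"],
    "mars": ["rover", "mars", "landing", "sample", "orbit"],
}


def build_intrusion_set(topics, topic_id, n_top=5, rng=None):
    """Return (shown_words, intruder) for one topic.

    topics maps a topic id to its words sorted by decreasing probability.
    The intruder is a top word of another topic that does not appear
    anywhere in the target topic's word list.
    """
    rng = rng or random.Random()
    top_words = topics[topic_id][:n_top]
    candidates = [
        word
        for other_id, words in topics.items()
        if other_id != topic_id
        for word in words[:n_top]
        if word not in topics[topic_id]
    ]
    if not candidates:
        raise ValueError("no valid intruder word for topic %r" % (topic_id,))
    intruder = rng.choice(candidates)
    shown = top_words + [intruder]
    rng.shuffle(shown)  # hide the intruder's position from the annotator
    return shown, intruder


def intrusion_accuracy(answers):
    """Fraction of (chosen_word, intruder) pairs where the choice was correct."""
    if not answers:
        raise ValueError("answers must not be empty")
    return sum(chosen == intruder for chosen, intruder in answers) / len(answers)


if __name__ == "__main__":
    shown, intruder = build_intrusion_set(TOY_TOPICS, "launch", n_top=3)
    print("Which word does not belong?", shown)
    print("Intruder:", intruder)
```

A higher accuracy suggests that a topic's top words are coherent enough for the intruder to stand out. Excluding any word that also appears in the target topic (here "orbit") is one simple design choice for keeping the intruder unambiguous.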
