許安順

An-Shun Hsu

Master's thesis (2008)

語意感知為基之資訊檢索機制研發

Development of a Semantic Awareness-based Information Retrieval Mechanism

關鍵字 Keywords

資訊檢索, 語意擷取, 潛在語意分析, 支持向量機制

Support vector machines, Latent semantic analysis, Semantic extraction, Information retrieval

摘要

資訊科技的進步與網際網路的快速發展,實現了便利與通透的資訊分享。由於數位資訊快速累積,致使透過網際網路搜尋資訊常存在下列問題:(1)傳統以關鍵字為基的搜尋方法僅能比對資訊部份概念,使用者必須進行多次修改查詢才能得到所需之內容;(2)相對於一般的文章,查詢通常以較少的內容構成,導致因比對資訊量不足所造成的主題不易判定與適當內容不易搜尋的困難;(3)人類語言具曖昧性,造成語意落差,也易導致搜尋結果錯誤。 為解決上述問題,本研究發展一個語意感知為基之資訊檢索機制。透過「內容語意擷取與鑑定」、「查詢內容語意圖像之語意擴張」與「內容語意圖像之搜尋」,本機制可提供更正確之搜尋結果。經由語意分析、語意探勘與語意比較,可解決傳統關鍵字為基礎之資訊檢索技術所無法克服的語意曖昧問題,有效提升資訊檢索正確性與效率。

Abstract

Rapid advances in information technology and the fast growth of the Internet have made information sharing convenient and transparent. However, because digital information accumulates so quickly, searching for content on the Internet often runs into the following problems. (1) Conventional keyword-based search methods match only part of the concepts in the content, so users must revise a query several times before obtaining the content they need. (2) Queries typically contain far less text than ordinary documents, so there is often too little information to match against, which makes it difficult to determine the search topic and to find appropriate content. (3) Human language is ambiguous, which creates semantic gaps and easily leads to erroneous search results. To address these problems, this study develops a semantic awareness-based information retrieval mechanism. Through “content semantic extraction and identification”, “semantic expansion of the query's semantic pattern”, and “content semantic pattern search”, the mechanism provides more accurate search results than traditional keyword-based methods. Through semantic analysis, latent semantic mining, and semantic comparison, it resolves the semantic ambiguity that traditional keyword-based information retrieval cannot overcome, and thereby improves the accuracy and efficiency of information retrieval.
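
The gap between keyword matching and latent semantic matching described above can be illustrated with a small sketch. It is not the mechanism developed in the thesis, whose details the abstract does not give. It is a generic latent semantic analysis (LSA) example built on NumPy's singular value decomposition. The toy corpus, the function names (`build_term_document_matrix`, `keyword_scores`, `lsa_scores`), and the choice of rank k = 2 are all assumptions made for illustration.

```python
import numpy as np

# Toy corpus (hypothetical data, not taken from the thesis).
DOCS = [
    "car engine",
    "automobile engine",
    "car automobile",
    "flower petal",
    "flower garden",
]


def build_term_document_matrix(docs):
    """Return (vocabulary, matrix): one row per term, one column per document."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.split():
            matrix[index[word], j] += 1.0
    return vocab, matrix


def keyword_scores(query, docs):
    """Count how many query terms literally appear in each document."""
    query_terms = set(query.split())
    return [len(query_terms & set(doc.split())) for doc in docs]


def lsa_scores(query, docs, k=2):
    """Cosine similarity between the query and each document in a rank-k latent space."""
    vocab, matrix = build_term_document_matrix(docs)
    # Left singular vectors span the latent "concept" space of the terms.
    u, _, _ = np.linalg.svd(matrix, full_matrices=False)
    basis = u[:, :k]
    query_vec = np.array([float(query.split().count(word)) for word in vocab])
    # Project the query and every document with the same map.
    query_latent = basis.T @ query_vec
    docs_latent = basis.T @ matrix  # shape (k, number of documents)
    scores = []
    for j in range(docs_latent.shape[1]):
        denom = np.linalg.norm(query_latent) * np.linalg.norm(docs_latent[:, j])
        # Guard against zero vectors, which have no defined cosine.
        scores.append(float(query_latent @ docs_latent[:, j] / denom) if denom > 1e-12 else 0.0)
    return scores


if __name__ == "__main__":
    query = "automobile"
    for doc, kw, lsa in zip(DOCS, keyword_scores(query, DOCS), lsa_scores(query, DOCS)):
        print(f"{doc!r}: keyword overlap={kw}, latent similarity={lsa:.2f}")
```

In this toy setting, the query "automobile" shares no keyword with the document "car engine", so its keyword overlap is 0, yet its latent similarity is close to 1 because "car" and "automobile" co-occur with the same terms. The two flower documents score close to 0 under both measures.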