Name-alias Relationship Identification in Thai News Articles: A Comparison of Co-occurrence Matrix Construction Methods


Paper Type 
Contributed Paper
Author 
Thawatchai Suwanapong*, Thanaruk Theeramunkong and Ekawit Nantajeewarawat
Email 
amaritsar@yahoo.com
Abstract:
Named entity disambiguation is one of the most challenging tasks in natural language processing. Referential ambiguity is common in many Thai news categories: in addition to its formal names, an entity is often referred to by other names, called name aliases. Name co-occurrence information is very useful for identifying name-alias relationships, and it is usually represented by a co-occurrence matrix in the vector space model. Traditionally, a co-occurrence matrix is constructed by multiplying a weighted, possibly normalized, name-by-document matrix by its transpose. This paper proposes an alternative co-occurrence matrix construction method that uses association measures. The effects of association measures are investigated by comparing this method with the traditional construction method. Several complementary factors are considered in the comparison, e.g., weighting schemes, a normalization process, and linkage functions for hierarchical clustering. Two collections of Thai news articles, 1,000 articles on football and 1,000 articles on politics, are used in the experiments. The experimental results show that co-occurrence matrix construction using association measures yields the highest performance in both news domains.
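
The two construction methods contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy name-by-document matrix, the binary weighting, the L2 row normalization, and the choice of Dice's coefficient as the association measure are all assumptions made for illustration, since the abstract does not name the specific weighting schemes or measures used.

```python
from math import sqrt

# Hypothetical toy data: rows are names, columns are documents,
# entries are binary occurrence weights (1 = name appears in the document).
names = ["formal_name", "alias", "other_name"]
name_by_doc = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

def l2_normalize(rows):
    """Scale each row to unit Euclidean length (all-zero rows are left unchanged)."""
    out = []
    for row in rows:
        norm = sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row] if norm else list(row))
    return out

def product_cooccurrence(rows):
    """Traditional construction: C = A * A^T for a name-by-document matrix A."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]

def dice_cooccurrence(rows):
    """Alternative construction: each cell holds an association measure.

    Dice's coefficient over binary occurrences is an assumed example here;
    other association measures could be substituted in the same way.
    """
    n = len(rows)
    doc_sets = [{j for j, x in enumerate(row) if x} for row in rows]
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            total = len(doc_sets[i]) + len(doc_sets[k])
            if total:
                matrix[i][k] = 2 * len(doc_sets[i] & doc_sets[k]) / total
    return matrix

raw = product_cooccurrence(name_by_doc)                   # unnormalized A * A^T
cosine = product_cooccurrence(l2_normalize(name_by_doc))  # normalized A * A^T
dice = dice_cooccurrence(name_by_doc)                     # association-measure matrix

for label, matrix in [("raw", raw), ("cosine", cosine), ("dice", dice)]:
    print(label)
    for name, row in zip(names, matrix):
        print(f"  {name:12s}", [round(v, 3) for v in row])
```

In either case the resulting name-by-name matrix would then be passed to hierarchical clustering with a chosen linkage function to group candidate aliases; that step is omitted here.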

Start & End Page 
1805 - 1821
Received Date 
2015-05-29
Revised Date 
Accepted Date 
2016-09-07
Keyword 
name alias, association measure, Thai news article, relationship identification, co-occurrence matrix, preprocessing factor, name clustering
Volume 
Vol.44 No.4 (October 2017)
Citation 
Suwanapong T., Theeramunkong T. and Nantajeewarawat E., Name-alias Relationship Identification in Thai News Articles: A Comparison of Co-occurrence Matrix Construction Methods , Chiang Mai J. Sci., 2017; 44(4): 1805-1821.
Chiang Mai Journal of Science

Faculty of Science, Chiang Mai University
239 Huaykaew Road, Tumbol Suthep, Amphur Muang, Chiang Mai 50200 THAILAND