xcit'ed

- paper management system


by matton

Tag search results for 'cutoff'

James R. Curran and Marc Moens.
Improvements in Automatic Thesaurus Extraction.
In Proceedings of the Workshop of the ACL Special Interest Group on the Lexicon (SIGLEX),
pp. 59-66,
2002.
Abstract: The use of semantic resources is common in modern NLP systems, but methods to extract lexical semantics have only recently begun to perform well enough for practical use. We evaluate existing and new similarity metrics for thesaurus extraction, and experiment with the tradeoff between extraction performance and efficiency. We propose an approximation algorithm, based on canonical attributes and coarse- and fine-grained matching, that reduces the time complexity and execution time of thesaurus extraction with only a marginal performance penalty.
Thesaurus extraction systems differ in how they define "context".
Contexts are extracted with a statistical shallow parser.
A frequency cutoff speeds up the calculation without decreasing extraction performance.
Other topics covered: weight functions, similarity measures, cutoff frequency, and speed-up via canonical vectors.

Canonical vectors: subj + dobj + iobj relations, weighted with TTestLog, plus a maximum frequency cutoff.
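To make the frequency-cutoff note concrete, here is a minimal sketch rather than the paper's implementation. It assumes each headword is stored as a count vector over (relation, word) attributes and that the cutoff is a minimum total frequency per headword. The function name `apply_min_frequency_cutoff`, the threshold, and the toy data are all hypothetical.

```python
from collections import Counter


def apply_min_frequency_cutoff(vectors, min_freq):
    """Keep only headwords whose total attribute count is at least min_freq.

    Assumption: `vectors` maps headword -> Counter of (relation, word) counts.
    """
    return {
        word: attrs
        for word, attrs in vectors.items()
        if sum(attrs.values()) >= min_freq
    }


def pairwise_comparisons(n):
    """Number of headword pairs an all-pairs similarity pass must score."""
    return n * (n - 1) // 2


# Toy data, invented for illustration only.
vectors = {
    "car": Counter({("dobj", "drive"): 4, ("subj", "stop"): 2}),
    "automobile": Counter({("dobj", "drive"): 3, ("subj", "stop"): 1}),
    "zeppelin": Counter({("dobj", "fly"): 1}),
}

kept = apply_min_frequency_cutoff(vectors, min_freq=3)
before = pairwise_comparisons(len(vectors))
after = pairwise_comparisons(len(kept))
```

Because an all-pairs comparison grows quadratically with the number of headwords, dropping rare headwords shrinks the work quickly. That is one plausible reading of why the cutoff speeds things up.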
updated at: 2007/07/07 17:25:42
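Akira Terada (寺田昭), Minoru Yoshida (吉田稔), Hiroshi Nakagawa (中川裕志)
A Tool for Supporting Synonym Dictionary Construction Using Contextual Information (文脈情報による同義語辞書作成支援ツール)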
IPSJ SIG Technical Report, NL176, pp. 87-94,
2006.
Abstract: To improve the performance of text processing tasks such as information retrieval or text mining, it is necessary to construct a synonym dictionary, but building one by hand is very tedious. In some fields, such as aviation, synonymous nouns appear in a mix of kanji/hiragana, katakana, alphabetic forms, and their abbreviations. Because new words constantly come into use, keeping the dictionary up to date is a major issue. In this paper, we propose a tool for constructing a synonym dictionary. The system returns synonym candidates for a query, and a synonym can be registered in the dictionary easily by reviewing those candidates. We tested the system on aviation pilot reports and evaluated it by average precision.
Adjusting the raw frequency as log(x_i + 1) was effective.
A [2,2] context window performed best.
Spiral construction did not improve results.
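The first two notes can be sketched as follows. This is an illustration under stated assumptions, not the authors' tool: it reads "[2,2]" as two tokens to the left and two to the right of the target word, assumes pre-tokenized input, and takes x_i to be a raw co-occurrence count. The names `window_contexts` and `log_damp` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def window_contexts(tokens, left=2, right=2):
    """Count co-occurring tokens within a [left, right] window of each token."""
    vectors = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo = max(0, i - left)
        hi = min(len(tokens), i + right + 1)
        for j in range(lo, hi):
            if j != i:  # skip the target position itself
                vectors[target][tokens[j]] += 1
    return vectors


def log_damp(vector):
    """Replace each raw count x_i with log(x_i + 1)."""
    return {ctx: math.log(count + 1) for ctx, count in vector.items()}


# Toy sentence, invented for illustration only.
tokens = "the pilot reported heavy turbulence".split()
vectors = window_contexts(tokens)
damped = log_damp(vectors["reported"])
```

The log(x_i + 1) adjustment compresses large counts, so a few very frequent context words do not dominate the vector, while contexts with a count of zero stay at zero.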
updated at: 2007/01/23 10:01:54