MARC: Addressing polysemy in bilingual lexicon extraction from comparable corpora

This paper presents an approach to extract translation equivalents from comparable corpora for polysemous nouns. As opposed to the standard approaches that build a single context vector for all occurrences of a given headword, we first disambiguate the headword with third-party sense taggers and the...

Permalink:	http://skupnikatalog.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:318225/Details
Parent publication:	Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12) Istanbul : 2012
Main authors:	Fišer, Darja (-), Kubelka, Ozren (Author), Ljubešić, Nikola, computer scientist
Material type:	Article
Language:	eng
Online access:	http://www.lrec-conf.org/proceedings/lrec2012/index.html


LEADER	02185naa a2200265uu 4500
008	131111s2012 xx 1 eng\|d
035			\|a (CROSBI)616769
040			\|a HR-ZaFF \|b hrv \|c HR-ZaFF \|e ppiak
100	1		\|a Fišer, Darja
245	1	0	\|a Addressing polysemy in bilingual lexicon extraction from comparable corpora / \|c Fišer, Darja ; Ljubešić, Nikola ; Kubelka, Ozren.
246	3		\|i Title in English: \|a Addressing polysemy in bilingual lexicon extraction from comparable corpora
300			\|f pp.
520			\|a This paper presents an approach to extract translation equivalents from comparable corpora for polysemous nouns. As opposed to the standard approaches that build a single context vector for all occurrences of a given headword, we first disambiguate the headword with third-party sense taggers and then build a separate context vector for each sense of the headword. Since state-of-the-art word sense disambiguation tools are still far from perfect, we also tried to improve the results by combining the sense assignments provided by two different sense taggers. Evaluation of the results shows that we outperform the baseline (0.473) in all the settings we experimented with, even when using only one sense tagger, and that the best-performing results are indeed obtained by taking into account the intersection of both sense taggers (0.720).
536			\|a MZOS project \|f 130-1301679-1380
536			\|a MZOS project \|f FP7-248347
546			\|a ENG
690			\|a 5.04
693			\|a bilingual lexicon extraction, comparable corpora, polysemy \|l hrv \|2 crosbi
693			\|a bilingual lexicon extraction, comparable corpora, polysemy \|l eng \|2 crosbi
700	1		\|a Kubelka, Ozren \|4 aut
700	1		\|9 445 \|a Ljubešić, Nikola, \|c computer scientist \|4 aut
773	0		\|a Eighth International Conference on Language Resources and Evaluation (LREC'12) (21-27.05.2012. ; Istanbul, Turkey) \|t Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12) \|d Istanbul : 2012 \|n Nicoletta Calzolari et al.
856			\|u http://www.lrec-conf.org/proceedings/lrec2012/index.html
942			\|c RZB \|u 2 \|v Peer review \|z Scientific - Lecture - Full paper \|t 1.08
999			\|c 318225 \|d 318223
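
The abstract in field 520 describes building one context vector per sense of a headword, and reports the best results when only the occurrences on which two sense taggers agree are used. The following is a minimal sketch of that idea, not the authors' implementation. The function name, the input format, the sense labels, the window size and the use of raw co-occurrence counts are all assumptions made for illustration; the record does not say which features or weighting the paper actually uses.

```python
from collections import Counter, defaultdict


def sense_context_vectors(occurrences, window=2):
    """Build one bag-of-words context vector per sense of a headword.

    occurrences: iterable of (tokens, head_index, sense_a, sense_b), where
    sense_a and sense_b are the labels assigned by two sense taggers
    (hypothetical input format). Only occurrences on which both taggers
    agree are counted, i.e. the intersection of the two taggers.
    """
    vectors = defaultdict(Counter)
    for tokens, head_index, sense_a, sense_b in occurrences:
        if sense_a != sense_b:
            continue  # taggers disagree: skip this occurrence
        start = max(0, head_index - window)
        left = tokens[start:head_index]
        right = tokens[head_index + 1:head_index + 1 + window]
        vectors[sense_a].update(left + right)
    return dict(vectors)


# Toy data with invented sense labels for the headword "bank".
occurrences = [
    (["deposit", "money", "in", "the", "bank", "account", "today"], 4, "bank#1", "bank#1"),
    (["walk", "along", "the", "river", "bank", "at", "dusk"], 4, "bank#2", "bank#2"),
    (["the", "bank", "was", "closed"], 1, "bank#1", "bank#2"),  # disagreement
]
vectors = sense_context_vectors(occurrences)
```

In a lexicon-extraction setting, each per-sense vector would then be compared against context vectors of candidate translations in the other language. That step is omitted here because the record gives no details about it.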
