Commit

mf
mam10eks committed Jun 30, 2024
1 parent 92f241c commit 5ed5e04
Showing 6 changed files with 3,616 additions and 1,512 deletions.
255 changes: 154 additions & 101 deletions — ui/qrel-details.jsonl (large diff not rendered by default)

255 changes: 154 additions & 101 deletions — ui/run-details.jsonl (large diff not rendered by default)

30 changes: 30 additions & 0 deletions ui/src/example-documents.json
@@ -28,5 +28,35 @@
"url": "https://aclanthology.org/O02-2002"
}
}
},
"ir-lab-sose-2024/ir-acl-anthology-topics-augsburg-20240525_0-test": {
"O02-2002": {
"docno": "O02-2002",
"text": "A Study on Word Similarity using Context Vector Models\n\n\n There is a need to measure word similarity when processing natural languages, especially when using generalization, classification, or example -based approaches. Usually, measures of similarity between two words are defined according to the distance between their semantic classes in a semantic taxonomy . The taxonomy approaches are more or less semantic -based that do not consider syntactic similarit ies. However, in real applications, both semantic and syntactic similarities are required and weighted differently. Word similarity based on context vectors is a mixture of syntactic and semantic similarit ies. In this paper, we propose using only syntactic related co-occurrences as context vectors and adopt information theoretic models to solve the problems of data sparseness and characteristic precision. The probabilistic distribution of co-occurrence context features is derived by parsing the contextual environment of each word , and all the context features are adjusted according to their IDF (inverse document frequency) values. The agglomerative clustering algorithm is applied to group similar words according to their similarity values. It turns out that words with similar syntactic categories and semantic classes are grouped together.",
"original_document": {
"title": "A Study on Word Similarity using Context Vector Models",
"abstract": "There is a need to measure word similarity when processing natural languages, especially when using generalization, classification, or example -based approaches. Usually, measures of similarity between two words are defined according to the distance between their semantic classes in a semantic taxonomy . The taxonomy approaches are more or less semantic -based that do not consider syntactic similarit ies. However, in real applications, both semantic and syntactic similarities are required and weighted differently. Word similarity based on context vectors is a mixture of syntactic and semantic similarit ies. In this paper, we propose using only syntactic related co-occurrences as context vectors and adopt information theoretic models to solve the problems of data sparseness and characteristic precision. The probabilistic distribution of co-occurrence context features is derived by parsing the contextual environment of each word , and all the context features are adjusted according to their IDF (inverse document frequency) values. The agglomerative clustering algorithm is applied to group similar words according to their similarity values. It turns out that words with similar syntactic categories and semantic classes are grouped together.",
"year": "2002",
"authors": "Chen, Keh-Jiann and\nYou, Jia-Ming",
"anthology": "acl",
"anthology_id": "O02-2002",
"url": "https://aclanthology.org/O02-2002"
}
}
},
"ir-lab-sose-2024/ir-acl-anthology-topics-leipzig-20240423-test": {
"O02-2002": {
"docno": "O02-2002",
"text": "A Study on Word Similarity using Context Vector Models\n\n\n There is a need to measure word similarity when processing natural languages, especially when using generalization, classification, or example -based approaches. Usually, measures of similarity between two words are defined according to the distance between their semantic classes in a semantic taxonomy . The taxonomy approaches are more or less semantic -based that do not consider syntactic similarit ies. However, in real applications, both semantic and syntactic similarities are required and weighted differently. Word similarity based on context vectors is a mixture of syntactic and semantic similarit ies. In this paper, we propose using only syntactic related co-occurrences as context vectors and adopt information theoretic models to solve the problems of data sparseness and characteristic precision. The probabilistic distribution of co-occurrence context features is derived by parsing the contextual environment of each word , and all the context features are adjusted according to their IDF (inverse document frequency) values. The agglomerative clustering algorithm is applied to group similar words according to their similarity values. It turns out that words with similar syntactic categories and semantic classes are grouped together.",
"original_document": {
"title": "A Study on Word Similarity using Context Vector Models",
"abstract": "There is a need to measure word similarity when processing natural languages, especially when using generalization, classification, or example -based approaches. Usually, measures of similarity between two words are defined according to the distance between their semantic classes in a semantic taxonomy . The taxonomy approaches are more or less semantic -based that do not consider syntactic similarit ies. However, in real applications, both semantic and syntactic similarities are required and weighted differently. Word similarity based on context vectors is a mixture of syntactic and semantic similarit ies. In this paper, we propose using only syntactic related co-occurrences as context vectors and adopt information theoretic models to solve the problems of data sparseness and characteristic precision. The probabilistic distribution of co-occurrence context features is derived by parsing the contextual environment of each word , and all the context features are adjusted according to their IDF (inverse document frequency) values. The agglomerative clustering algorithm is applied to group similar words according to their similarity values. It turns out that words with similar syntactic categories and semantic classes are grouped together.",
"year": "2002",
"authors": "Chen, Keh-Jiann and\nYou, Jia-Ming",
"anthology": "acl",
"anthology_id": "O02-2002",
"url": "https://aclanthology.org/O02-2002"
}
}
}
}
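The hunk above extends `ui/src/example-documents.json` with a two-level mapping: a dataset identifier (e.g. `ir-lab-sose-2024/ir-acl-anthology-topics-leipzig-20240423-test`) maps to document numbers, each of which maps to a record with `docno`, the extracted `text`, and the `original_document` metadata. A minimal sketch of how a consumer might look up such an entry, assuming only the structure shown in the diff (the `lookup_example` helper and the truncated `text` are hypothetical, not part of this repository):

```python
# Hypothetical sketch of the structure added in ui/src/example-documents.json:
# dataset id -> docno -> document record. The helper below is illustrative only.
example_documents = {
    "ir-lab-sose-2024/ir-acl-anthology-topics-leipzig-20240423-test": {
        "O02-2002": {
            "docno": "O02-2002",
            # Full abstract text omitted here for brevity.
            "text": "A Study on Word Similarity using Context Vector Models ...",
            "original_document": {
                "title": "A Study on Word Similarity using Context Vector Models",
                "year": "2002",
                "anthology_id": "O02-2002",
                "url": "https://aclanthology.org/O02-2002",
            },
        }
    }
}


def lookup_example(dataset_id, docno):
    """Return the example document record for a dataset, or None if absent."""
    return example_documents.get(dataset_id, {}).get(docno)


doc = lookup_example(
    "ir-lab-sose-2024/ir-acl-anthology-topics-leipzig-20240423-test", "O02-2002"
)
print(doc["original_document"]["url"])  # https://aclanthology.org/O02-2002
```

Nesting the records under the dataset id (rather than a flat docno key) is what allows the same docno, such as `O02-2002` here, to appear under several datasets without collision.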
2 changes: 1 addition & 1 deletion — ui/src/run_overview.json (large diff not rendered by default)
