MatchSim: a novel neighbor-based similarity measure with maximum neighborhood matching

by Zhenjiang Lin, Michael R Lyu, Irwin King
Abstract:
The problem of measuring similarity between web pages arises in many important Web applications, such as search engines and Web directories. In this paper, we propose a novel neighbor-based similarity measure called MatchSim, which uses only the neighborhood structure of web pages. Technically, MatchSim recursively defines similarity between web pages by the average similarity of the maximum matching between their neighbors. Our method extends the traditional methods which simply count the numbers of common and/or different neighbors. It also successfully overcomes a severe counterintuitive loophole in SimRank, due to its strict consistency with the intuitions of similarity. We give the computational complexity of MatchSim iteration. The accuracy of MatchSim is compared against others on two real datasets. The results show that our method performs best in most cases.
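The recursive definition in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the "average similarity of the maximum matching" means the weight of a maximum-weight matching between the two neighbor sets divided by the size of the larger set, that a page is fully similar to itself, and that pages without neighbors have similarity 0 to any other page. The names `matchsim`, `max_matching_weight`, and `neighbors` are hypothetical, and the matching is found by brute force, which is only practical for very small neighborhoods.

```python
from itertools import permutations


def max_matching_weight(left, right, sim):
    """Weight of a maximum-weight matching between two neighbor lists.

    Brute force over all assignments; assumes sim is symmetric.
    """
    if len(left) > len(right):
        left, right = right, left  # enumerate assignments of the shorter side
    if not left:
        return 0.0
    return max(
        sum(sim[(u, v)] for u, v in zip(left, perm))
        for perm in permutations(right, len(left))
    )


def matchsim(neighbors, iterations=5):
    """Iterate the assumed MatchSim update on a small graph.

    neighbors maps every page to the list of its neighbor pages;
    every neighbor must also appear as a key.
    """
    nodes = list(neighbors)
    # Start from the identity: a page is similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new = {}
        for a in nodes:
            for b in nodes:
                na, nb = neighbors[a], neighbors[b]
                if a == b:
                    new[(a, b)] = 1.0
                elif not na or not nb:
                    new[(a, b)] = 0.0  # assumption: no neighbors, no evidence
                else:
                    weight = max_matching_weight(na, nb, sim)
                    new[(a, b)] = weight / max(len(na), len(nb))
        sim = new
    return sim
```

In this sketch, two pages with identical neighbor sets receive similarity 1, while a page whose single neighbor is one of another page's two neighbors receives 0.5. A practical implementation would replace the brute-force search with a polynomial-time assignment algorithm.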
Reference:
MatchSim: a novel neighbor-based similarity measure with maximum neighborhood matching (Zhenjiang Lin, Michael R Lyu, Irwin King), In Proceedings of the 18th ACM Conference on Information and Knowledge Management, ACM, 2009.
Bibtex Entry:
@inproceedings{Lin09,
abstract = {The problem of measuring similarity between web pages arises in many important Web applications, such as search engines and Web directories. In this paper, we propose a novel neighbor-based similarity measure called MatchSim, which uses only the neighborhood structure of web pages. Technically, MatchSim recursively defines similarity between web pages by the average similarity of the maximum matching between their neighbors. Our method extends the traditional methods which simply count the numbers of common and/or different neighbors. It also successfully overcomes a severe counterintuitive loophole in SimRank, due to its strict consistency with the intuitions of similarity. We give the computational complexity of MatchSim iteration. The accuracy of MatchSim is compared against others on two real datasets. The results show that our method performs best in most cases.},
address = {New York, NY, USA},
author = {Lin, Zhenjiang and Lyu, Michael R and King, Irwin},
booktitle = {Proceedings of the 18th ACM Conference on Information and Knowledge Management},
keywords = {SML-LIB-BIBLIO,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG},
pages = {1613--1616},
publisher = {ACM},
series = {CIKM '09},
title = {{MatchSim: a novel neighbor-based similarity measure with maximum neighborhood matching}},
url = {http://doi.acm.org/10.1145/1645953.1646185},
year = {2009}
}