A Measure of Similarity between Graph Vertices: Applications to Synonym Extraction and Web Searching

by Vincent D. Blondel, Anahí Gajardo, Maureen Heymans, Pierre Senellart, Paul Van Dooren
Abstract:
We introduce a concept of "similarity" between vertices of directed graphs. Let $G_A$ and $G_B$ be two directed graphs with, respectively, $n_A$ and $n_B$ vertices. We define an $n_B \times n_A$ similarity matrix $S$ whose real entry $s_{ij}$ expresses how similar vertex $j$ (in $G_A$) is to vertex $i$ (in $G_B$): we say that $s_{ij}$ is their similarity score. The similarity matrix can be obtained as the limit of the normalized even iterates of $S_{k+1} = B S_k A^T + B^T S_k A$, where $A$ and $B$ are adjacency matrices of the graphs and $S_0$ is a matrix whose entries are all equal to 1. In the special case where $G_A = G_B = G$, the matrix $S$ is square and the score $s_{ij}$ is the similarity score between the vertices $i$ and $j$ of $G$. We point out that Kleinberg's "hub and authority" method to identify web-pages relevant to a given query can be viewed as a special case of our definition in the case where one of the graphs has two vertices and a unique directed edge between them. In analogy to Kleinberg, we show that our similarity scores are given by the components of a dominant eigenvector of a nonnegative matrix. Potential applications of our similarity concept are numerous. We illustrate an application for the automatic extraction of synonyms in a monolingual dictionary.
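The iteration described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names (`similarity_matrix`, `matmul`, `transpose`, `frobenius_normalize`) are hypothetical, the normalization by the Frobenius norm is an assumption (the abstract only says "normalized"), and a fixed number of steps stands in for a proper convergence test.

```python
import math


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*y))  # columns of y
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def frobenius_normalize(m):
    """Scale a matrix to unit Frobenius norm (assumed choice of norm)."""
    norm = math.sqrt(sum(v * v for row in m for v in row))
    if norm == 0.0:
        raise ValueError("iterate vanished; do both graphs have edges?")
    return [[v / norm for v in row] for row in m]


def similarity_matrix(A, B, double_steps=100):
    """Approximate the n_B x n_A similarity matrix of graphs G_A and G_B.

    A and B are adjacency matrices (lists of rows). The update
    S_{k+1} = B S_k A^T + B^T S_k A is applied 2 * double_steps times,
    so the returned matrix is always an even iterate.
    """
    n_a, n_b = len(A), len(B)
    S = [[1.0] * n_a for _ in range(n_b)]  # S_0: all entries equal to 1
    At, Bt = transpose(A), transpose(B)
    for _ in range(2 * double_steps):
        left = matmul(matmul(B, S), At)    # B S_k A^T
        right = matmul(matmul(Bt, S), A)   # B^T S_k A
        total = [[l + r for l, r in zip(lrow, rrow)]
                 for lrow, rrow in zip(left, right)]
        S = frobenius_normalize(total)
    return S


# Self-similarity of the directed path 1 -> 2 -> 3.
path = [[0, 1, 0],
        [0, 0, 1],
        [0, 0, 0]]
for row in similarity_matrix(path, path):
    print(["%.4f" % v for v in row])
```

For this path graph the even iterates approach a diagonal matrix with entries of about 0.5774 (that is, 1/sqrt(3)), so each vertex ends up similar only to itself. Passing the two-vertex graph with a single edge as `A` gives an n_B x 2 matrix whose first column behaves like hub scores and whose second column behaves like authority scores, which is the special case mentioned in the abstract.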
Reference:
A Measure of Similarity between Graph Vertices: Applications to Synonym Extraction and Web Searching (Vincent D. Blondel, Anahí Gajardo, Maureen Heymans, Pierre Senellart, Paul Van Dooren), In SIAM Review, volume 46, 2004.
Bibtex Entry:
@article{Blondel2004,
abstract = {We introduce a concept of "similarity" between vertices of directed graphs. Let $G_A$ and $G_B$ be two directed graphs with, respectively, $n_A$ and $n_B$ vertices. We define an $n_B \times n_A$ similarity matrix $S$ whose real entry $s_{ij}$ expresses how similar vertex $j$ (in $G_A$) is to vertex $i$ (in $G_B$): we say that $s_{ij}$ is their similarity score. The similarity matrix can be obtained as the limit of the normalized even iterates of $S_{k+1} = B S_k A^T + B^T S_k A$, where $A$ and $B$ are adjacency matrices of the graphs and $S_0$ is a matrix whose entries are all equal to 1. In the special case where $G_A = G_B = G$, the matrix $S$ is square and the score $s_{ij}$ is the similarity score between the vertices $i$ and $j$ of $G$. We point out that Kleinberg's "hub and authority" method to identify web-pages relevant to a given query can be viewed as a special case of our definition in the case where one of the graphs has two vertices and a unique directed edge between them. In analogy to Kleinberg, we show that our similarity scores are given by the components of a dominant eigenvector of a nonnegative matrix. Potential applications of our similarity concept are numerous. We illustrate an application for the automatic extraction of synonyms in a monolingual dictionary.},
author = {Blondel, Vincent D. and Gajardo, Anahí and Heymans, Maureen and Senellart, Pierre and {Van Dooren}, Paul},
doi = {10.1137/S0036144502415960},
issn = {00361445},
journal = {SIAM Review},
keywords = {05c50,05c85,15a18,68r10,SML-LIB-BIBLIO,algorithms,efficient web search engines,eigenvalues of graphs,generalizing hubs and authorities,graph algorithms,graph theory,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG},
number = {4},
pages = {647},
title = {{A Measure of Similarity between Graph Vertices: Applications to Synonym Extraction and Web Searching}},
url = {http://link.aip.org/link/SIREAD/v46/i4/p647/s1\&Agg=doi},
volume = {46},
year = {2004}
}