Project overview

The Semantic Measures Library (SML) is a generic, open-source Java library dedicated to the computation and analysis of semantic measures such as semantic similarity, semantic relatedness and semantic distance. Based on the SML, we also develop the SML-Toolkit, a command-line program that gives access to some of the functionalities of the library, e.g. to compute measure scores.

  • SML Library: The Java Library - documentation of the SML (e.g. source code examples) and guidelines on using the library in applications.
  • SML-Toolkit: The Program - a user-friendly way for non-programmers to benefit from SML functionalities.

The SML and the toolkit can be used to compute semantic similarity and semantic relatedness between semantically characterised entities (concepts, instances), e.g. concepts defined in a taxonomy, or entities defined in a semantic graph such as documents, genes or products annotated by concepts defined in an ontology. For more information on semantic measures you can read this brief introduction.
You can also consult the more technical introductory survey we wrote, as well as the list of semantic measures proposed by the Semantic Measures Library.
Since the library is generic, it can be used with numerous ontologies and knowledge organisation systems, among others RDF(S) graphs, OWL ontologies, WordNet, MeSH, Gene Ontology (OBO ontologies) and SNOMED-CT.
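To give a concrete idea of what a semantic similarity between two concepts of a taxonomy can look like, here is a minimal, self-contained sketch in plain Java. It does not use the SML API: the class TaxonomySimilaritySketch, its methods and the toy animal taxonomy are all hypothetical and introduced only for illustration. It assumes a tree-shaped taxonomy (one parent per concept, no cycles) and uses the Wu and Palmer measure, one well-known taxonomy-based similarity, with depth counted in nodes so that the root has depth 1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: this is NOT the SML API. All names are hypothetical.
final class TaxonomySimilaritySketch {

    // Maps each concept to its single parent; the root has no entry.
    // Assumption: the taxonomy is a tree without cycles.
    private final Map<String, String> parentOf = new HashMap<>();

    void addEdge(String child, String parent) {
        parentOf.put(child, parent);
    }

    // Returns the path from a concept up to the root, the concept itself first.
    private List<String> pathToRoot(String concept) {
        List<String> path = new ArrayList<>();
        for (String c = concept; c != null; c = parentOf.get(c)) {
            path.add(c);
        }
        return path;
    }

    // Depth counted in nodes: the root has depth 1.
    int depth(String concept) {
        return pathToRoot(concept).size();
    }

    // Deepest concept that is an ancestor of (or equal to) both a and b,
    // or null if the two concepts share no ancestor.
    String lowestCommonAncestor(String a, String b) {
        List<String> ancestorsOfA = pathToRoot(a);
        for (String c : pathToRoot(b)) {
            if (ancestorsOfA.contains(c)) {
                return c;
            }
        }
        return null;
    }

    // Wu and Palmer similarity: 2 * depth(lca) / (depth(a) + depth(b)).
    double wuPalmer(String a, String b) {
        String lca = lowestCommonAncestor(a, b);
        if (lca == null) {
            return 0.0;
        }
        return 2.0 * depth(lca) / (depth(a) + depth(b));
    }

    public static void main(String[] args) {
        TaxonomySimilaritySketch t = new TaxonomySimilaritySketch();
        t.addEdge("mammal", "animal");
        t.addEdge("bird", "animal");
        t.addEdge("dog", "mammal");
        t.addEdge("cat", "mammal");

        // dog and cat share the specific ancestor "mammal": 2*2 / (3+3)
        System.out.println("sim(dog, cat)  = " + t.wuPalmer("dog", "cat"));
        // dog and bird only share the root "animal": 2*1 / (3+2)
        System.out.println("sim(dog, bird) = " + t.wuPalmer("dog", "bird"));
    }
}
```

In this toy taxonomy, dog and cat score about 0.667 while dog and bird score 0.4, because the first pair shares a deeper, more specific common ancestor. Real ontologies are usually graphs with multiple inheritance, which this sketch deliberately leaves out.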

News:

[31/01/17]
Version 0.9.4 of the SML and the SML-Toolkit has been released.

[29/05/15]
We published a book on semantic measures:
Semantic Similarity from Natural Language and Ontology Analysis

[10/09/14]
Version 0.9 of the SML and the SML-Toolkit has been released.
SML is also now available on Maven Central!

[10/05/14]
Version 0.8 of the SML is out.

[05/05/14]
A tutorial on the SML will be given at the 25es Journées francophones d'Ingénierie des Connaissances (in French, 12-16 May 2014 - Montpellier, France).
You can download the resources used in the tutorial (in French).
A demo session is also planned at the 19th International Conference on Application of Natural Language to Information Systems (18-20 June 2014 - Montpellier, France).

[11/03/14]
The beta version 0.7.3 of the SML-Toolkit is out.
Specific profiles, i.e. command-line interfaces, have been added to facilitate the use of the toolkit with specific knowledge representations.
For instance, this beta version integrates a profile dedicated to the Medical Subject Headings (MeSH).
Give it a try: java -jar sml-toolkit-0.7.3.jar -t sm -profile MESH

[09/11/13]
We published a survey on semantic measures:
Semantic Measures for the Comparison of Units of Language, Concepts or Instances from Text and Knowledge Base Analysis.
Authors: Sébastien Harispe*, Sylvie Ranwez, Stefan Janaqi, Jacky Montmain
arXiv.org

[27/11/13]
We are pleased to announce the publication of a theoretical study of Semantic Measures in which the SML has been used:
A framework for unifying ontology-based semantic similarity measures: A study in the biomedical domain.
Authors: Sébastien Harispe*, David Sánchez, Sylvie Ranwez, Stefan Janaqi, Jacky Montmain
Journal of Biomedical Informatics 2013.

[10/10/13]
We are pleased to announce the publication of the paper presenting the Semantic Measures Library and associated toolkit.
See references. A new version of the toolkit, with minor bug fixes, has been released.

[08/10/13]
The bibliography associated with the project has been updated.

[29/08/13]
Addition of an evaluation section in which SML performance is discussed and compared with other domain-specific tools, see here. A community open-source project has also been created to evaluate software solutions dedicated to semantic measures: https://github.com/sharispe/sm-tools-evaluation.

[23/07/13]
Version 0.6 of both the SML and the toolkit has been released. See the download section.

[15/07/13]
The SML in action: Demo of a Linked Data content-based recommendation system based on the Semantic Measures Library (to appear in ODBASE 2013), see website.

[08/04/13]
The beta version of the release 0.6 is out.
A profile dedicated to the Gene Ontology has been included in the SML-Toolkit to facilitate semantic measure computations using the GO.
Just type 'java -jar sml-toolkit-0.6.jar -t sm -profile GO' to compute semantic measure scores.

Main goals of the project

  • To provide software solutions giving both developers and practitioners access to state-of-the-art approaches and methods for the computation of semantic measures.
  • To facilitate the comparison of semantic measures by providing both tools and gold standard benchmarks for their evaluation.
  • To facilitate the selection of semantic measures according to a specific usage context.
  • To summarize theoretical findings related to semantic measures. The bibliography associated with the project is publicly available.

References

The main publications of the project:
  • The SML or the toolkit:

The Semantic Measures Library and Toolkit: fast computation of semantic similarity and relatedness using biomedical ontologies
Sébastien Harispe*, Sylvie Ranwez, Stefan Janaqi and Jacky Montmain
Bioinformatics 2014 30(5): 740-742. doi: 10.1093/bioinformatics/btt581

  • A book on semantic measures (254 pages):

Semantic Similarity from Natural Language and Ontology Analysis
Sébastien Harispe*, Sylvie Ranwez, Stefan Janaqi and Jacky Montmain
Synthesis Lectures on Human Language Technologies, May 2015, Vol. 8, No. 1, Pages 1-254. doi: 10.2200/S00639ED1V01Y201504HLT027

  • A theoretical study of knowledge-based semantic similarity measures:

A framework for unifying ontology-based semantic similarity measures: A study in the biomedical domain.
Sébastien Harispe*, David Sánchez, Sylvie Ranwez, Stefan Janaqi and Jacky Montmain
Journal of Biomedical Informatics 2013; http://dx.doi.org/10.1016/j.jbi.2013.11.006

About us

The project was initiated by Sébastien Harispe (PhD - project leader) from the LGI2P laboratory, Ecole des mines d'Alès.

Questions, issues, remarks? Don't hesitate to contact us and help us improve both the software and the documentation.