slib.sml.sm.core.engine
Class SM_Engine

java.lang.Object
  extended by slib.sml.sm.core.engine.SM_Engine

public class SM_Engine
extends Object

This class defines a Semantic Measures Engine giving access to numerous methods commonly used to define graph-based semantic measures.
The engine distinguishes two types of vertices in the graph: classes and instances.

More information on the distinction between classes and instances can be found on the web site of the library. Note that this documentation refers to classes and instances even though the underlying objects referring to them in the graph are URIs; we always mean the classes/instances identified by those URIs. Access to ancestors/parents and descendants is constrained only by the partial ordering defined by the RDFS.SUBCLASSOF relationships. For example, an ancestor of a class is any class linked to it by a path composed of RDFS.SUBCLASSOF relationships.
Important:
Note that the graph associated to the engine is expected to be immutable, although this condition is not checked. The engine does not work on a copy of the given graph and stores some results for performance; consequently, the coherency of results will be affected if the graph is modified after engine construction. Some methods provided by the class expect the underlying taxonomic graph to be transitively reduced. In other words, if z is a subclass of y and y is a subclass of x, an edge stating that z is a subclass of x is not expected. This is important, for example, to ensure the coherency of parent retrieval. Such a transitive reduction can be performed through the GraphReduction_Transitive class or, even more easily, using the GraphActionExecutor class. The engine stores commonly accessed results (e.g. the ancestors of a class), which can lead to high memory consumption when dealing with large graphs.
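
The transitive-reduction requirement can be illustrated without the library. The following is a minimal, self-contained sketch: the class name RedundantEdgeSketch, the string identifiers and the map-based taxonomy are illustrative assumptions and are not part of this API, which works on URIs and a G graph. It lists the direct subclass edges that are already implied by a longer path, i.e. the edges a transitive reduction would be expected to remove.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of the library: the taxonomy is a plain map child -> direct parents.
public class RedundantEdgeSketch {

    /** Every class reachable from 'start' by following one or more subclass edges upwards. */
    public static Set<String> strictAncestors(Map<String, Set<String>> parents, String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>(parents.getOrDefault(start, Set.of()));
        while (!todo.isEmpty()) {
            String c = todo.pop();
            if (seen.add(c)) {
                todo.addAll(parents.getOrDefault(c, Set.of()));
            }
        }
        return seen;
    }

    /** Edges "child -> parent" that are also implied by a path going through another direct parent. */
    public static List<String> redundantEdges(Map<String, Set<String>> parents) {
        List<String> redundant = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : parents.entrySet()) {
            for (String p : e.getValue()) {
                for (String other : e.getValue()) {
                    if (!other.equals(p) && strictAncestors(parents, other).contains(p)) {
                        redundant.add(e.getKey() + " -> " + p);
                        break;
                    }
                }
            }
        }
        return redundant;
    }

    public static void main(String[] args) {
        // z is a subclass of y, y is a subclass of x, plus the unexpected direct edge z -> x.
        Map<String, Set<String>> parents = new TreeMap<>();
        parents.put("y", Set.of("x"));
        parents.put("z", new TreeSet<>(List.of("x", "y")));
        System.out.println(redundantEdges(parents)); // prints [z -> x]
    }
}
```

With the edge z -> x present, a parent lookup on z would return both x and y, which is the kind of incoherency the requirement above guards against.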

Author:
Harispe Sébastien

Constructor Summary
SM_Engine(slib.sglib.model.graph.G g)
          Constructor of an engine associated to the given graph.
 
Method Summary
 double computeGroupwiseAddOnSim(SMconf confGroupwise, SMconf confPairwise, Set<org.openrdf.model.URI> setA, Set<org.openrdf.model.URI> setB)
          Compute the indirect groupwise semantic measure score considering the two sets of vertices and the semantic measure configuration.
 double computeGroupwiseStandaloneSim(SMconf confGroupwise, Set<org.openrdf.model.URI> setA, Set<org.openrdf.model.URI> setB)
          Compute the direct groupwise semantic measure score considering the two sets of vertices and the semantic measure configuration.
 Map<org.openrdf.model.URI,Double> computeIC(ICconf icConf)
          Compute the information content for all classes.
 double computePairwiseSim(SMconf pairwiseConf, org.openrdf.model.URI a, org.openrdf.model.URI b)
          Compute the pairwise semantic measure score considering the two vertices and the semantic measure configuration.
 Map<org.openrdf.model.URI,Set<org.openrdf.model.URI>> getAllAncestorsInc()
          Access to the inclusive ancestors for all classes.
 Map<org.openrdf.model.URI,Set<org.openrdf.model.URI>> getAllDescendantsInc()
          Access to a view of the inclusive descendants for all classes.
 Map<org.openrdf.model.URI,Integer> getAllNbAncestorsInc()
          Compute the number of inclusive ancestors for all classes.
 Map<org.openrdf.model.URI,Integer> getAllNbDescendantsInc()
          Compute the number of inclusive descendants for all classes.
 Map<org.openrdf.model.URI,Integer> getAllNbReachableLeaves()
          Compute, for each class x, the number of leaf classes subsumed by x.
 Map<org.openrdf.model.URI,Double> getAllShortestPath(org.openrdf.model.URI a, slib.sglib.model.graph.weight.GWS weightingScheme)
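          Compute the weights of the shortest paths linking the given vertex to the other vertices. The result is cached by the engine.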
 slib.sglib.algo.graph.extraction.rvf.AncestorEngine getAncestorEngine()
           
 Set<org.openrdf.model.URI> getAncestorsInc(Set<org.openrdf.model.URI> setClasses)
          Compute the union of the inclusive ancestors of a set of classes.
 Set<org.openrdf.model.URI> getAncestorsInc(org.openrdf.model.URI v)
          Give access to a view of the inclusive ancestors of a class.
 Set<org.openrdf.model.URI> getClasses()
          Access to the set of URI of the graph considered as classes.
 slib.sglib.algo.graph.extraction.rvf.DescendantEngine getDescendantEngine()
           
 Set<org.openrdf.model.URI> getDescendantsInc(org.openrdf.model.URI v)
          Give access to a view of the inclusive descendants of a class.
 slib.sglib.model.graph.G getGraph()
          Access to the graph associated to the engine.
 Set<org.openrdf.model.URI> getHypoAncEx(org.openrdf.model.URI a, org.openrdf.model.URI b)
          The result is not cached by the engine.
 double getIC_MICA(ICconf icConf, org.openrdf.model.URI a, org.openrdf.model.URI b)
          Get the information content of the most informative common ancestor (MICA) of two classes.
 Map<org.openrdf.model.URI,Double> getIC_results(ICconf icConf)
          Access to a view of the information content of all classes.
 double getIC(ICconf icConf, org.openrdf.model.URI v)
          Get the Information Content of a class.
 Set<org.openrdf.model.URI> getInstances()
          Access to the set of URI of the graph considered as instances.
 Set<org.openrdf.model.URI> getLCAs(org.openrdf.model.URI a, org.openrdf.model.URI b)
           
 slib.utils.impl.MatrixDouble<org.openrdf.model.URI,org.openrdf.model.URI> getMatrixScore(Set<org.openrdf.model.URI> setA, Set<org.openrdf.model.URI> setB, SMconf pairwiseConf)
          Compute the matrix of similarity for two sets of vertex/concepts/classes.
 int getMaxDepth()
          Access to the maximal depth of a class in the underlying taxonomic graph.
 Map<org.openrdf.model.URI,Integer> getMaxDepths()
          Give access to a view of the maximal depth of all classes.
 org.openrdf.model.URI getMICA(ICconf icConf, org.openrdf.model.URI a, org.openrdf.model.URI b)
          Get the most informative common ancestor (MICA) of two classes.
 Map<org.openrdf.model.URI,Integer> getMinDepths()
          Give access to a view of the minimal depth of all classes.
 org.openrdf.model.URI getMSA(org.openrdf.model.URI a, org.openrdf.model.URI b, slib.sglib.model.graph.weight.GWS weightingScheme)
          The result is not cached by the engine.
 Map<org.openrdf.model.URI,Integer> getNbInstancesInferredPropFromCorpus()
           
 Map<org.openrdf.model.URI,Integer> getNbOccurrenceProp()
          Topological propagation considering one occurrence per term
 Map<org.openrdf.model.URI,Integer> getnbPathLeadingToAllVertex()
           
 double getP_MICA(ICconf conf, org.openrdf.model.URI a, org.openrdf.model.URI b)
           
 Set<org.openrdf.model.URI> getParents(org.openrdf.model.URI v)
          Get the parents of a class, that is to say its direct ancestors.
 Map<org.openrdf.model.URI,Set<org.openrdf.model.URI>> getReachableLeaves()
          Compute, for each class x, the leaf classes subsumed by x.
 org.openrdf.model.URI getRoot()
          Get the root of the taxonomic graph contained in the graph associated to the engine.
 double getShortestPath(org.openrdf.model.URI a, org.openrdf.model.URI b, slib.sglib.model.graph.weight.GWS weightingScheme)
          The result is cached by the engine. Be careful: modifying the RelTypes requires clearing the cache.
 Set<org.openrdf.model.URI> getTaxonomicLeaves()
          Access to a view of the set of leaves of the underlying taxonomic graph.
 Map<org.openrdf.model.URI,Double> getVector(Set<org.openrdf.model.URI> set, SMconf groupwiseconf)
           
 slib.sglib.model.graph.weight.GWS getWeightingScheme(String param)
          TODO: store the weighting schemes in a Map. Provide a way to load edge weights from a file or to compute them using specific methods.
 boolean isCachePairwiseResults()
          Check if the engine is configured to store the results of the pairwise semantic measure computation.
 void setCachePairwiseResults(boolean cachePairwiseResults)
          Set the configuration of the engine regarding pairwise semantic measure score caching.
 void setICSvalues(ICconf icConf, Map<org.openrdf.model.URI,Double> ics)
          Set the ICs (information content values) stored for the given IC configuration to the specified values.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SM_Engine

public SM_Engine(slib.sglib.model.graph.G g)
          throws slib.utils.ex.SLIB_Ex_Critic
Constructor of an engine associated to the given graph. Note that the engine expects the graph not to be modified; the coherency of results is only ensured in this case. Please refer to the general documentation of the class for more information on this restriction. Engine creation is expensive, so avoid unnecessary calls to the constructor: some information, such as the classes and instances of the graph, is computed at engine creation, which can lead to performance issues when dealing with large graphs.

Parameters:
g - the graph associated to the engine.
Throws:
slib.utils.ex.SLIB_Ex_Critic
Method Detail

getAncestorsInc

public Set<org.openrdf.model.URI> getAncestorsInc(Set<org.openrdf.model.URI> setClasses)
Compute the union of the inclusive ancestors of a set of classes. The given classes are included in the result (inclusive). This process can be computationally expensive if the number of ancestors is large. The result is not cached by the engine.

Parameters:
setClasses - the set of classes considered
Returns:
the union of the inclusive ancestors of the given classes
Throws:
IllegalAccessException - if the given set contains an URI which cannot be associated to a class

getAncestorsInc

public Set<org.openrdf.model.URI> getAncestorsInc(org.openrdf.model.URI v)
Give access to a view of the inclusive ancestors of a class. The given class is therefore included in the result. The result is cached by the engine for fast access.

Parameters:
v - the considered class
Returns:
the set of inclusive ancestors of the given class (v included)
Throws:
IllegalAccessException - if the given URI cannot be associated to a class
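
To make the difference between the two getAncestorsInc variants concrete, here is a small sketch under stated assumptions: InclusiveAncestorsSketch is a hypothetical class, classes are plain strings rather than URIs, and the taxonomy is a map from each class to its direct parents. It only mirrors the documented behaviour (single-class result cached, union rebuilt on each call); it is not the engine's implementation.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the engine: strings replace URIs, a map replaces the graph.
public class InclusiveAncestorsSketch {
    private final Map<String, Set<String>> parents;                 // child -> direct parents
    private final Map<String, Set<String>> cache = new HashMap<>(); // per-class results

    public InclusiveAncestorsSketch(Map<String, Set<String>> parents) {
        this.parents = parents;
    }

    /** Inclusive ancestors of one class: computed once, then served from the cache as a read-only view. */
    public Set<String> ancestorsInc(String v) {
        Set<String> cached = cache.get(v);
        if (cached != null) {
            return cached;
        }
        Set<String> result = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(v);
        while (!todo.isEmpty()) {
            String c = todo.pop();
            if (result.add(c)) { // 'v' itself is added first, hence "inclusive"
                todo.addAll(parents.getOrDefault(c, Set.of()));
            }
        }
        Set<String> view = Collections.unmodifiableSet(result);
        cache.put(v, view);
        return view;
    }

    /** Union of the inclusive ancestors of several classes: rebuilt on every call, never cached. */
    public Set<String> ancestorsInc(Set<String> classes) {
        Set<String> union = new TreeSet<>();
        for (String c : classes) {
            union.addAll(ancestorsInc(c));
        }
        return union;
    }
}
```

Because the cached sets are computed from the graph as it was at the time of the first call, modifying the parent map afterwards would leave stale results, which is why the class documentation asks for an immutable graph.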

getDescendantsInc

public Set<org.openrdf.model.URI> getDescendantsInc(org.openrdf.model.URI v)
Give access to a view of the inclusive descendants of a class. The given class is therefore included in the result. The result is cached by the engine for fast access.

Parameters:
v - the considered class
Returns:
the set of inclusive descendants of the given class (v included)
Throws:
IllegalAccessException - if the given URI cannot be associated to a class

getParents

public Set<org.openrdf.model.URI> getParents(org.openrdf.model.URI v)
Get the parents of a class, that is to say its direct ancestors.
Important:
The direct parents of a class c are all the classes x linked to c by an edge c RDFS.SUBCLASSOF x. The result is not cached by the engine. To ensure result coherency, the underlying taxonomic graph must be transitively reduced; refer to the class documentation for more information.

Parameters:
v - the focus vertex
Returns:
the set of parents of the given vertex
Throws:
IllegalAccessException - if the given URI cannot be associated to a class

getMaxDepths

public Map<org.openrdf.model.URI,Integer> getMaxDepths()
                                                throws slib.utils.ex.SLIB_Ex_Critic
Give access to a view of the maximal depth of all classes. The result is stored by the engine.

Returns:
a map containing the maximal depth of each class
Throws:
slib.utils.ex.SLIB_Ex_Critic

getMinDepths

public Map<org.openrdf.model.URI,Integer> getMinDepths()
                                                throws slib.utils.ex.SLIB_Ex_Critic
Give access to a view of the minimal depth of all classes. The result is stored by the engine.

Returns:
a map containing the minimal depth of each class
Throws:
slib.utils.ex.SLIB_Ex_Critic
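
Minimal and maximal depths differ only when a class can reach the root through paths of different lengths. The sketch below illustrates this on a toy taxonomy. It assumes that a root has depth 0 and that depth counts subclass edges, which this page does not state; DepthSketch and its map-based input are hypothetical names, not library API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration: depth = number of subclass edges to a root, root depth = 0 (assumption).
public class DepthSketch {

    /** Depth of every class; max=true uses the longest path to a root, max=false the shortest. */
    public static Map<String, Integer> depths(Map<String, Set<String>> parents,
                                              Set<String> classes, boolean max) {
        Map<String, Integer> memo = new HashMap<>();
        for (String c : classes) {
            depth(parents, c, max, memo);
        }
        return new TreeMap<>(memo);
    }

    // Memoized recursion over the parents; terminates because a taxonomy is acyclic.
    private static int depth(Map<String, Set<String>> parents, String c,
                             boolean max, Map<String, Integer> memo) {
        Integer known = memo.get(c);
        if (known != null) {
            return known;
        }
        Set<String> ps = parents.getOrDefault(c, Set.of());
        int d = 0; // a class without parents is a root
        if (!ps.isEmpty()) {
            int best = max ? Integer.MIN_VALUE : Integer.MAX_VALUE;
            for (String p : ps) {
                int pd = depth(parents, p, max, memo);
                best = max ? Math.max(best, pd) : Math.min(best, pd);
            }
            d = best + 1;
        }
        memo.put(c, d);
        return d;
    }
}
```

For a taxonomy where d has the parents e and c, e is below b, and b and c are below the root a, this sketch gives d a minimal depth of 2 (through c) and a maximal depth of 3 (through e and b).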

getIC

public double getIC(ICconf icConf,
                    org.openrdf.model.URI v)
             throws slib.utils.ex.SLIB_Exception
Get the Information Content of a class. The information content to consider is defined by the given configuration.

Parameters:
icConf - The configuration of the information content
v - the class
Returns:
the information content of the specified class according to the specified configuration.
Throws:
IllegalAccessException - if the given URI cannot be associated to a class
slib.utils.ex.SLIB_Exception

getMaxDepth

public int getMaxDepth()
                throws slib.utils.ex.SLIB_Exception
Access to the maximal depth of a class in the underlying taxonomic graph.

Returns:
the maximal depth of the graph.
Throws:
slib.utils.ex.SLIB_Exception

getRoot

public org.openrdf.model.URI getRoot()
                              throws slib.utils.ex.SLIB_Ex_Critic
Get the root of the taxonomic graph contained in the graph associated to the engine. An exception will be thrown if the taxonomic graph contains multiple roots.

Returns:
the class corresponding to the root.
Throws:
slib.utils.ex.SLIB_Ex_Critic

getIC_MICA

public double getIC_MICA(ICconf icConf,
                         org.openrdf.model.URI a,
                         org.openrdf.model.URI b)
                  throws slib.utils.ex.SLIB_Exception
Get the information content of the most informative common ancestor (MICA) of two classes. The MICA is the class with the maximal IC among the common ancestors of the two given classes.

Parameters:
icConf - the configuration of the information content
a - the first class
b - the second class
Returns:
the IC of the most informative common ancestor of the two classes.
Throws:
slib.utils.ex.SLIB_Exception - if no common ancestor is found between the two classes
IllegalAccessException - if the given URIs cannot be associated to a class

getMICA

public org.openrdf.model.URI getMICA(ICconf icConf,
                                     org.openrdf.model.URI a,
                                     org.openrdf.model.URI b)
                              throws slib.utils.ex.SLIB_Exception
Get the most informative common ancestor (MICA) of two classes. The MICA is the class with the maximal IC among the common ancestors of the two given classes.

Parameters:
icConf - the configuration of the information content
a - the first class
b - the second class
Returns:
the most informative common ancestor of the two classes.
Throws:
slib.utils.ex.SLIB_Exception - if no common ancestor is found between the two classes
IllegalAccessException - if the given URIs cannot be associated to a class
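
The MICA definition can be sketched in a few lines once the inclusive ancestor sets and the IC values are available. The code below is an illustration only: MicaSketch, the string identifiers and the IC values are made up, and the choice made on ties (first candidate encountered) is an assumption, since this page does not say how ties are resolved.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical helper: picks the common ancestor with the highest IC.
public class MicaSketch {

    /**
     * @param ic   IC value of every class appearing in the ancestor sets
     * @param ancA inclusive ancestors of the first class
     * @param ancB inclusive ancestors of the second class
     * @return the common ancestor with the maximal IC
     * @throws IllegalStateException if the two classes share no ancestor
     */
    public static String mica(Map<String, Double> ic, Set<String> ancA, Set<String> ancB) {
        String best = null;
        for (String c : ancA) {
            // Only classes present in both sets are candidates.
            if (ancB.contains(c) && (best == null || ic.get(c) > ic.get(best))) {
                best = c;
            }
        }
        if (best == null) {
            throw new IllegalStateException("no common ancestor");
        }
        return best;
    }
}
```

In this sketch the value described for getIC_MICA is simply the IC of the returned class, for example ic.get(mica(ic, ancA, ancB)).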

getAllNbDescendantsInc

public Map<org.openrdf.model.URI,Integer> getAllNbDescendantsInc()
                                                          throws slib.utils.ex.SLIB_Ex_Critic
Compute the number of inclusive descendants for all classes.

Returns:
a map containing the number of inclusive descendants
Throws:
slib.utils.ex.SLIB_Ex_Critic

getAllDescendantsInc

public Map<org.openrdf.model.URI,Set<org.openrdf.model.URI>> getAllDescendantsInc()
                                                                           throws slib.utils.ex.SLIB_Ex_Critic
Access to a view of the inclusive descendants for all classes.

Returns:
the inclusive descendants for all classes
Throws:
slib.utils.ex.SLIB_Ex_Critic

getAllAncestorsInc

public Map<org.openrdf.model.URI,Set<org.openrdf.model.URI>> getAllAncestorsInc()
                                                                         throws slib.utils.ex.SLIB_Ex_Critic
Access to the inclusive ancestors for all classes.

Returns:
the inclusive ancestors for all classes
Throws:
slib.utils.ex.SLIB_Ex_Critic

getIC_results

public Map<org.openrdf.model.URI,Double> getIC_results(ICconf icConf)
                                                throws slib.utils.ex.SLIB_Ex_Critic
Access to a view of the information content of all classes.

Parameters:
icConf - the information content considered.
Returns:
the information content of all classes
Throws:
slib.utils.ex.SLIB_Ex_Critic

computeIC

public Map<org.openrdf.model.URI,Double> computeIC(ICconf icConf)
                                            throws slib.utils.ex.SLIB_Ex_Critic
Compute the information content for all classes. Results are stored for fast access.

Parameters:
icConf - the configuration to consider
Returns:
the IC for all classes
Throws:
slib.utils.ex.SLIB_Ex_Critic
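
As an illustration of what an information content computed from the taxonomy alone can look like, the sketch below uses one common intrinsic definition based on inclusive descendant counts, IC(c) = 1 - log(nbDescendantsInc(c)) / log(nbClasses). Whether an ICconf can select exactly this formula is not stated on this page, so treat the formula, the class name IntrinsicIcSketch and the map-based input as assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of an intrinsic IC; not the library's implementation.
public class IntrinsicIcSketch {

    /**
     * @param nbDescendantsInc number of inclusive descendants of every class (at least two classes)
     * @return IC(c) = 1 - log(nbDescendantsInc(c)) / log(number of classes) for every class c
     */
    public static Map<String, Double> ic(Map<String, Integer> nbDescendantsInc) {
        int n = nbDescendantsInc.size();
        if (n < 2) {
            throw new IllegalArgumentException("at least two classes are required");
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, Integer> e : nbDescendantsInc.entrySet()) {
            result.put(e.getKey(), 1.0 - Math.log(e.getValue()) / Math.log(n));
        }
        return result;
    }
}
```

Under this definition a leaf (one inclusive descendant) gets the maximal value 1, and a root subsuming every class gets 0, so more specific classes are more informative.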

getReachableLeaves

public Map<org.openrdf.model.URI,Set<org.openrdf.model.URI>> getReachableLeaves()
Compute, for each class x, the leaf classes subsumed by x. The computation is inclusive, i.e. a leaf contains itself in its set of reachable leaves. The result is cached for fast access.

Returns:
the subsumed leaves for each class
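
The inclusive notion of reachable leaves can be sketched as a memoized traversal of the subclass hierarchy. ReachableLeavesSketch, the string identifiers and the parent-to-children map are hypothetical and only illustrate the definition above.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration: the taxonomy is a map class -> direct subclasses.
public class ReachableLeavesSketch {

    /** For every class, the leaves it subsumes; a leaf is mapped to the singleton containing itself. */
    public static Map<String, Set<String>> reachableLeaves(Map<String, Set<String>> children,
                                                           Set<String> classes) {
        Map<String, Set<String>> memo = new TreeMap<>();
        for (String c : classes) {
            collect(children, c, memo);
        }
        return memo;
    }

    // Leaves below 'c', reusing the sets already computed for its subclasses.
    private static Set<String> collect(Map<String, Set<String>> children, String c,
                                       Map<String, Set<String>> memo) {
        Set<String> known = memo.get(c);
        if (known != null) {
            return known;
        }
        Set<String> leaves = new TreeSet<>();
        Set<String> kids = children.getOrDefault(c, Set.of());
        if (kids.isEmpty()) {
            leaves.add(c); // inclusive: a leaf reaches itself
        }
        for (String k : kids) {
            leaves.addAll(collect(children, k, memo));
        }
        memo.put(c, leaves);
        return leaves;
    }
}
```

The counts described for getAllNbReachableLeaves correspond, in this sketch, to the sizes of the returned sets.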

getTaxonomicLeaves

public Set<org.openrdf.model.URI> getTaxonomicLeaves()
Access to a view of the set of leaves of the underlying taxonomic graph.

Returns:
the set of classes which are leaves

getAllNbReachableLeaves

public Map<org.openrdf.model.URI,Integer> getAllNbReachableLeaves()
Compute, for each class x, the number of leaf classes subsumed by x. The computation is inclusive, i.e. a leaf contains itself in its set of reachable leaves. The result is cached for fast access.

Returns:
the number of subsumed leaves for each class

getAllNbAncestorsInc

public Map<org.openrdf.model.URI,Integer> getAllNbAncestorsInc()
                                                        throws slib.utils.ex.SLIB_Ex_Critic
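Compute the number of inclusive ancestors for all classes.

Returns:
a map containing the number of inclusive ancestors of each class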
Throws:
slib.utils.ex.SLIB_Ex_Critic

computePairwiseSim

public double computePairwiseSim(SMconf pairwiseConf,
                                 org.openrdf.model.URI a,
                                 org.openrdf.model.URI b)
                          throws slib.utils.ex.SLIB_Ex_Critic
Compute the pairwise semantic measure score considering the two vertices and the semantic measure configuration.

Parameters:
pairwiseConf - the pairwise semantic measure configuration
a - the first vertex/class/concept
b - the second vertex/class/concept
Returns:
the pairwise semantic measure score
Throws:
slib.utils.ex.SLIB_Ex_Critic
IllegalAccessException - if the given URIs cannot be associated to classes defined in the graph

computeGroupwiseStandaloneSim

public double computeGroupwiseStandaloneSim(SMconf confGroupwise,
                                            Set<org.openrdf.model.URI> setA,
                                            Set<org.openrdf.model.URI> setB)
                                     throws slib.utils.ex.SLIB_Ex_Critic
Compute the direct groupwise semantic measure score considering the two sets of vertices and the semantic measure configuration.

Parameters:
confGroupwise - the direct groupwise semantic measure configuration
setA - the first set of vertices/classes/concepts
setB - the second set of vertices/classes/concepts
Returns:
the groupwise semantic measure score
Throws:
slib.utils.ex.SLIB_Ex_Critic
IllegalAccessException - if the given URIs cannot be associated to classes defined in the graph

computeGroupwiseAddOnSim

public double computeGroupwiseAddOnSim(SMconf confGroupwise,
                                       SMconf confPairwise,
                                       Set<org.openrdf.model.URI> setA,
                                       Set<org.openrdf.model.URI> setB)
                                throws slib.utils.ex.SLIB_Ex_Critic
Compute the indirect groupwise semantic measure score considering the two sets of vertices and the semantic measure configurations. An indirect measure aggregates the pairwise scores computed between the classes of the two sets.

Parameters:
confGroupwise - the indirect aggregation strategy configuration
confPairwise - the pairwise semantic measure configuration
setA - the first set of vertices/classes/concepts
setB - the second set of vertices/classes/concepts
Returns:
the groupwise semantic measure score
Throws:
slib.utils.ex.SLIB_Ex_Critic
IllegalAccessException - if the given URIs cannot be associated to classes defined in the graph
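
To show how an indirect groupwise score can be built from pairwise scores, the sketch below implements one widely used aggregation strategy, the best-match average. It is an assumption that this strategy is among those selectable through confGroupwise; BestMatchAverageSketch, the string identifiers and the pairwise function passed as a lambda are hypothetical stand-ins.

```java
import java.util.Set;
import java.util.function.ToDoubleBiFunction;

// Hypothetical illustration of an indirect groupwise measure (best-match average).
public class BestMatchAverageSketch {

    /** Average, over the classes of 'from', of the best pairwise score each one reaches in 'to'. */
    private static double averageBestMatch(Set<String> from, Set<String> to,
                                           ToDoubleBiFunction<String, String> sim) {
        double sum = 0.0;
        for (String a : from) {
            double best = Double.NEGATIVE_INFINITY;
            for (String b : to) {
                best = Math.max(best, sim.applyAsDouble(a, b));
            }
            sum += best;
        }
        return sum / from.size();
    }

    /** Mean of the two directed best-match averages (setA towards setB and setB towards setA). */
    public static double bma(Set<String> setA, Set<String> setB,
                             ToDoubleBiFunction<String, String> sim) {
        if (setA.isEmpty() || setB.isEmpty()) {
            throw new IllegalArgumentException("both sets must be non-empty");
        }
        double aToB = averageBestMatch(setA, setB, sim);
        // Swap the arguments so that the pairwise measure always receives (element of A, element of B).
        double bToA = averageBestMatch(setB, setA, (x, y) -> sim.applyAsDouble(y, x));
        return (aToB + bToA) / 2.0;
    }
}
```

With a toy pairwise measure that returns 1 for identical classes and 0 otherwise, the sets {a, b} and {b, c} obtain a score of 0.5, since one class out of two finds a perfect match in each direction.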

getnbPathLeadingToAllVertex

public Map<org.openrdf.model.URI,Integer> getnbPathLeadingToAllVertex()
                                                               throws slib.utils.ex.SLIB_Ex_Critic
Returns:
Throws:
slib.utils.ex.SLIB_Ex_Critic

getNbInstancesInferredPropFromCorpus

public Map<org.openrdf.model.URI,Integer> getNbInstancesInferredPropFromCorpus()
Returns:

getNbOccurrenceProp

public Map<org.openrdf.model.URI,Integer> getNbOccurrenceProp()
                                                       throws slib.utils.ex.SLIB_Exception
Topological propagation considering one occurrence per term

Returns:
Throws:
slib.utils.ex.SLIB_Exception

getMatrixScore

public slib.utils.impl.MatrixDouble<org.openrdf.model.URI,org.openrdf.model.URI> getMatrixScore(Set<org.openrdf.model.URI> setA,
                                                                                                Set<org.openrdf.model.URI> setB,
                                                                                                SMconf pairwiseConf)
                                                                                         throws slib.utils.ex.SLIB_Ex_Critic
Compute the similarity matrix for two sets of vertices/concepts/classes. In other words, the matrix contains the semantic scores computed for every pair of concepts which can be built from the two sets.

Parameters:
setA - the first set of vertices/classes/concepts
setB - the second set of vertices/classes/concepts
pairwiseConf - the pairwise semantic measure configuration which must be used to compute the score of a pair of vertices
Returns:
the matrix filled with the scores.
Throws:
slib.utils.ex.SLIB_Ex_Critic
IllegalAccessException - if the given URIs cannot be associated to a class
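
The structure of such a score matrix can be sketched with nested maps. This is an illustration only: ScoreMatrixSketch is a hypothetical class, a nested Map stands in for MatrixDouble, and the pairwise measure is supplied as a lambda instead of an SMconf.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.ToDoubleBiFunction;

// Hypothetical illustration: one row per element of 'rows', one column per element of 'cols'.
public class ScoreMatrixSketch {

    public static Map<String, Map<String, Double>> matrix(Set<String> rows, Set<String> cols,
                                                          ToDoubleBiFunction<String, String> sim) {
        Map<String, Map<String, Double>> m = new TreeMap<>();
        for (String r : rows) {
            Map<String, Double> row = new TreeMap<>();
            for (String c : cols) {
                row.put(c, sim.applyAsDouble(r, c)); // one pairwise computation per cell
            }
            m.put(r, row);
        }
        return m;
    }
}
```

The number of pairwise computations is the product of the two set sizes, which is worth keeping in mind for large sets.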

isCachePairwiseResults

public boolean isCachePairwiseResults()
Check if the engine is configured to store the results of the pairwise semantic measure computation.

Returns:
true if the engine stores the results.

setCachePairwiseResults

public void setCachePairwiseResults(boolean cachePairwiseResults)
Set the configuration of the engine regarding pairwise semantic measure score caching.
Important:
Storing the results can be very useful in specific cases, e.g. when the same pairs are evaluated repeatedly. However, storing the results can also lead to high memory consumption and therefore slow down or even crash the process.

Parameters:
cachePairwiseResults - set to true if the engine must store the results.
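
The trade-off described above can be illustrated with a small memoizing wrapper. PairwiseCacheSketch is a hypothetical class that mimics the switch with plain strings and a lambda; how the engine actually stores its results is not documented here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

// Hypothetical illustration of optional pairwise-score caching.
public class PairwiseCacheSketch {
    private final ToDoubleBiFunction<String, String> measure;
    private final Map<List<String>, Double> cache = new HashMap<>(); // key: ordered pair (a, b)
    private boolean cachePairwiseResults;
    private int computations; // number of times the underlying measure was really evaluated

    public PairwiseCacheSketch(ToDoubleBiFunction<String, String> measure) {
        this.measure = measure;
    }

    public void setCachePairwiseResults(boolean cachePairwiseResults) {
        this.cachePairwiseResults = cachePairwiseResults;
    }

    public boolean isCachePairwiseResults() {
        return cachePairwiseResults;
    }

    public int getComputations() {
        return computations;
    }

    /** Score of the pair (a, b); recomputed on every call unless caching is enabled. */
    public double sim(String a, String b) {
        if (!cachePairwiseResults) {
            return compute(a, b);
        }
        List<String> key = List.of(a, b);
        Double known = cache.get(key);
        if (known == null) {
            known = compute(a, b);
            cache.put(key, known); // memory grows with the number of distinct pairs
        }
        return known;
    }

    private double compute(String a, String b) {
        computations++;
        return measure.applyAsDouble(a, b);
    }
}
```

With n classes the cache in this sketch can hold up to n * n entries, which is the kind of memory growth the warning above refers to.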

getVector

public Map<org.openrdf.model.URI,Double> getVector(Set<org.openrdf.model.URI> set,
                                                   SMconf groupwiseconf)
Parameters:
set -
groupwiseconf -
Returns:

getGraph

public slib.sglib.model.graph.G getGraph()
Access to the graph associated to the engine.

Returns:
the graph associated to the engine

setICSvalues

public void setICSvalues(ICconf icConf,
                         Map<org.openrdf.model.URI,Double> ics)
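Set the ICs (information content values) stored for the given IC configuration to the specified values.

Parameters:
icConf - the IC configuration the values are associated with
ics - the IC values to store, indexed by class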

getLCAs

public Set<org.openrdf.model.URI> getLCAs(org.openrdf.model.URI a,
                                          org.openrdf.model.URI b)
                                   throws slib.utils.ex.SLIB_Exception
Throws:
slib.utils.ex.SLIB_Exception

getWeightingScheme

public slib.sglib.model.graph.weight.GWS getWeightingScheme(String param)
TODO: store the weighting schemes in a Map. Provide a way to load edge weights from a file or to compute them using specific methods.

Parameters:
param - the key corresponding to the id of the weighting scheme to retrieve
Returns:

getAncestorEngine

public slib.sglib.algo.graph.extraction.rvf.AncestorEngine getAncestorEngine()

getDescendantEngine

public slib.sglib.algo.graph.extraction.rvf.DescendantEngine getDescendantEngine()

getClasses

public Set<org.openrdf.model.URI> getClasses()
Access to the set of URI of the graph considered as classes.

Returns:
the set of classes

getInstances

public Set<org.openrdf.model.URI> getInstances()
Access to the set of URI of the graph considered as instances.

Returns:
the set of instances

getShortestPath

public double getShortestPath(org.openrdf.model.URI a,
                              org.openrdf.model.URI b,
                              slib.sglib.model.graph.weight.GWS weightingScheme)
                       throws slib.utils.ex.SLIB_Ex_Critic
The result is cached by the engine. Be careful: modifying the RelTypes requires clearing the cache.

Parameters:
a -
b -
Returns:
Throws:
slib.utils.ex.SLIB_Ex_Critic

getMSA

public org.openrdf.model.URI getMSA(org.openrdf.model.URI a,
                                    org.openrdf.model.URI b,
                                    slib.sglib.model.graph.weight.GWS weightingScheme)
                             throws slib.utils.ex.SLIB_Ex_Critic
The result is not cached by the engine.

Parameters:
a -
b -
Returns:
Throws:
slib.utils.ex.SLIB_Ex_Critic
IllegalAccessException - if the given URI cannot be associated to a class

getAllShortestPath

public Map<org.openrdf.model.URI,Double> getAllShortestPath(org.openrdf.model.URI a,
                                                            slib.sglib.model.graph.weight.GWS weightingScheme)
                                                     throws slib.utils.ex.SLIB_Ex_Critic
The result is cached by the engine.

Parameters:
a -
Returns:
a map containing, for each vertex, the weight of the shortest path linking it to the given vertex a.
Throws:
slib.utils.ex.SLIB_Ex_Critic

getP_MICA

public double getP_MICA(ICconf conf,
                        org.openrdf.model.URI a,
                        org.openrdf.model.URI b)
                 throws slib.utils.ex.SLIB_Exception
Parameters:
conf -
a -
b -
Returns:
Throws:
slib.utils.ex.SLIB_Exception
IllegalAccessException - if the given URI cannot be associated to a class

getHypoAncEx

public Set<org.openrdf.model.URI> getHypoAncEx(org.openrdf.model.URI a,
                                               org.openrdf.model.URI b)
The result is not cached by the engine.

Parameters:
a -
b -
Returns:
Throws:
IllegalAccessException - if the given URI cannot be associated to a class


Copyright © 2013. All Rights Reserved.