
fabiovalse

Instance similarity II

This experiment tries to cluster a set of instances belonging to a Linked Data set. Different DBpedia classes (Guitarist, Poet, Cheese, ...) have been used in order to understand how the clustering approach behaves. Use the control at the top left of the visualization to change the class and update the matrix.

All the instances (n) of a certain class are placed on the columns, while their predicates (m) are placed on the rows. The cells of the resulting m x n matrix show whether an instance holds a certain predicate (black cells) or not (gray cells). Hence, each column describes an instance in terms of the predicates held by the class to which it belongs, and corresponds to a vector of zeros and ones.
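As a rough illustration of the matrix described above, here is a minimal Python sketch. The function names `build_matrix` and `column` and the toy data are hypothetical and are not taken from the visualization's code; the sketch only assumes that each instance comes with the set of predicates it holds.

```python
def build_matrix(instance_predicates):
    """Build a binary matrix with one row per predicate and one column per instance.

    `instance_predicates` maps an instance name to the set of predicates it holds.
    A cell is 1 when the instance holds the predicate (black cell), 0 otherwise (gray cell).
    """
    instances = sorted(instance_predicates)
    # The rows are all the predicates used by at least one instance of the class.
    predicates = sorted({p for preds in instance_predicates.values() for p in preds})
    matrix = [
        [1 if p in instance_predicates[i] else 0 for i in instances]
        for p in predicates
    ]
    return predicates, instances, matrix


def column(matrix, j):
    """Return the 0/1 vector describing the instance in column j."""
    return [row[j] for row in matrix]


# Hypothetical toy data, not real DBpedia content.
toy = {
    "Brie": {"country", "milk"},
    "Gouda": {"country", "milk", "texture"},
    "Feta": {"country"},
}
predicates, instances, matrix = build_matrix(toy)
```

Here `column(matrix, instances.index("Brie"))` gives the vector for the hypothetical instance "Brie" over the predicates `country`, `milk`, `texture`.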

Hierarchical clustering is used to compute a sequence of instances in which the most similar ones are adjacent. This sequence is then used to sort the columns of the matrix, so that clusters of similar instances appear as groups of adjacent columns.
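The ordering step can be sketched as follows. This is only an illustration under stated assumptions: the page does not say which distance or linkage is used, so the sketch assumes Hamming distance between the 0/1 column vectors and average linkage, and the names `hamming` and `cluster_order` are hypothetical. Each cluster keeps its members as an ordered list, and merging two clusters concatenates their lists, so the final list is a dendrogram leaf order.

```python
def hamming(a, b):
    """Number of positions where two 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_order(vectors):
    """Agglomerative clustering (average linkage, assumed) returning a leaf order.

    `vectors` is a list of equal-length 0/1 vectors, one per instance.
    The result is a list of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def avg_dist(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(hamming(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters (first pair wins on ties).
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = avg_dist(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merging concatenates the two leaf sequences, keeping members adjacent.
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0] if clusters else []


# Hypothetical column vectors for four instances.
vectors = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
order = cluster_order(vectors)
```

The columns of the matrix would then be rearranged according to `order`. The quadratic search for the closest pair is kept for readability; it is not meant to scale to large classes.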

Predicates on the rows are sorted in descending order of frequency, starting from the most frequent one.
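A minimal sketch of this row ordering, assuming the frequency of a predicate is the number of instances that hold it (the sum of its row). The function name `sort_rows_by_frequency` and the sample values are hypothetical.

```python
def sort_rows_by_frequency(predicates, matrix):
    """Reorder rows so that the most frequent predicate comes first.

    `matrix[r]` is the 0/1 row of `predicates[r]`; its sum is the number of
    instances holding that predicate. Ties keep their original relative order.
    """
    row_order = sorted(range(len(matrix)), key=lambda r: -sum(matrix[r]))
    sorted_predicates = [predicates[r] for r in row_order]
    sorted_matrix = [matrix[r] for r in row_order]
    return sorted_predicates, sorted_matrix


# Hypothetical predicates and rows (three instances).
names = ["texture", "country", "milk"]
rows = [
    [0, 1, 0],  # held by 1 instance
    [1, 1, 1],  # held by 3 instances
    [1, 1, 0],  # held by 2 instances
]
sorted_names, sorted_rows = sort_rows_by_frequency(names, rows)
```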