Understanding PageRank in a network graph
Visualizing the PageRank scores of nodes in a graph.
The example studied is extracting the most informative sentences from a text document for automatic summarization, using the LexRank algorithm.
A graph G is created where:
- Nodes: All sentences of the document.
- Edges: There is an edge between two nodes if the frequency vectors of the corresponding sentences have (cosine) similarity above a threshold.
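The graph construction described above can be sketched in a few lines of Python. This is a minimal illustration and not the project's own code: the tokenizer, the function names (`term_frequencies`, `cosine_similarity`, `build_sentence_graph`) and the default threshold of 0.1 are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_frequencies(sentence):
    """Word counts for one sentence (simple regex tokenizer, an assumption)."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_sentence_graph(sentences, threshold=0.1):
    """Return an undirected graph as {node index: set of neighbour indices}."""
    vectors = [term_frequencies(s) for s in sentences]
    graph = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        # Connect two sentences only if they are similar enough.
        if cosine_similarity(vectors[i], vectors[j]) > threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


sentences = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "quantum physics is hard",
]
graph = build_sentence_graph(sentences, threshold=0.3)
```

In this toy input the first two sentences share enough words to be linked, while the third stays isolated. Raising the threshold makes the graph sparser; lowering it links more sentences.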
For more details on SentenceGraph, refer to TextGraphics.
LexRank asserts that the PageRank scores of sentences in such a graph (called LexRank scores in this context) can be used to rank the sentences by their relevance to the document, and in turn to generate a summary of the document.
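As a rough illustration of how such scores can be computed, here is a sketch of PageRank by power iteration on an undirected, unweighted graph. It is not taken from this project: the adjacency-dict representation, the function name `pagerank`, the damping factor of 0.85 and the small star-shaped sample graph are assumptions chosen for the example.

```python
def pagerank(graph, damping=0.85, max_iter=100, tol=1e-9):
    """Power-iteration PageRank on {node: set of neighbours}.

    Assumes the graph is undirected, i.e. the adjacency is symmetric.
    """
    n = len(graph)
    if n == 0:
        return {}
    scores = {node: 1.0 / n for node in graph}
    for _ in range(max_iter):
        # Score held by isolated nodes is spread evenly over all nodes.
        dangling = sum(scores[node] for node, nbrs in graph.items() if not nbrs)
        new_scores = {}
        for node in graph:
            # Each neighbour passes on an equal share of its own score.
            incoming = sum(scores[nbr] / len(graph[nbr]) for nbr in graph[node])
            new_scores[node] = (1 - damping) / n + damping * (incoming + dangling / n)
        delta = sum(abs(new_scores[node] - scores[node]) for node in graph)
        scores = new_scores
        if delta < tol:
            break
    return scores


# Hypothetical sentence graph: sentence 0 is similar to all the others.
sample_graph = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
scores = pagerank(sample_graph)

# Sentence indices ordered from most to least central.
ranking = sorted(scores, key=scores.get, reverse=True)
```

The top few entries of `ranking` would be the candidate summary sentences. In the sample graph, sentence 0 ranks first because every other sentence links to it.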
Usage:
- The landing page has a sentence graph, created from the text of my answer to a question on Quora.
- Click the Rank Nodes button to resize the nodes according to their LexRank scores.
- Clicking on a node prints the associated sentence on the right side of the canvas. Try different nodes to explore the sentences.
- Clicking on the text of a sentence on the right side removes that sentence from the view.