This is the third post about using random walk vectors for clustering. The main idea, as stated before, is to represent a point cloud as a graph based on similarities between points. The similarities are encoded in a matrix, which is then treated as the weight matrix of a graph. The graph is traversed randomly, resulting in a set of random walk vectors (each walk uses a seed vector focused on a different starting point). Each random walk vector again represents similarities between points, but this time it encodes the global shape of the dataset around the given starting point.
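As a quick reminder of that pipeline, here is a minimal sketch. It is not necessarily the exact construction from the earlier posts: the Gaussian kernel, the `sigma` bandwidth, the restart probability, and the function names `similarity_matrix` and `random_walk_vector` are all assumptions made for illustration.

```python
import numpy as np

def similarity_matrix(points, sigma=1.0):
    """Gaussian similarities between all pairs of points (assumed kernel)."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)  # no self-loops in the graph
    return weights

def random_walk_vector(weights, seed_index, restart=0.15, n_steps=50):
    """Random walk with restart, seeded at a single starting point."""
    # Column-stochastic transition matrix: column j holds the
    # probabilities of stepping from node j to every other node.
    transition = weights / weights.sum(axis=0, keepdims=True)
    seed = np.zeros(weights.shape[0])
    seed[seed_index] = 1.0
    walk = seed.copy()
    for _ in range(n_steps):
        # Follow an edge with probability 1 - restart, jump back to the seed otherwise.
        walk = (1.0 - restart) * (transition @ walk) + restart * seed
    return walk

# Toy point cloud: two well-separated groups of three points each.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])
weights = similarity_matrix(points)
walk = random_walk_vector(weights, seed_index=0)
```

The resulting `walk` is a probability distribution over all points, and on this toy data almost all of its mass stays inside the group that contains the seed point.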

In this post we will try to combine many random walk vectors into one matrix that will be used as a data matrix. Each original point will be represented as a sequence of numbers reflecting that point's membership in different clusters. Given that representation, we will use the NMF (non-negative matrix factorization) clustering technique to cluster these new data points. (I think it might be considered a type of consensus clustering as well.)
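To make the plan concrete, here is a rough sketch of the clustering step under some assumptions: the random walk vectors are stacked as columns of a non-negative matrix (rows are points), NMF is computed with plain multiplicative updates, and each point is assigned to the component with the largest weight. The `walks` matrix below is hand-made toy data, and the `nmf` helper is a hypothetical stand-in, not a library function.

```python
import numpy as np

def nmf(data, n_components, n_iter=500, eps=1e-9, random_state=0):
    """Factorize data ~ w @ h with non-negative w and h (multiplicative updates)."""
    rng = np.random.default_rng(random_state)
    n_rows, n_cols = data.shape
    w = rng.random((n_rows, n_components)) + eps
    h = rng.random((n_components, n_cols)) + eps
    for _ in range(n_iter):
        # Each update rescales the factor element-wise, so entries stay non-negative.
        h *= (w.T @ data) / (w.T @ w @ h + eps)
        w *= (data @ h.T) / (w @ h @ h.T + eps)
    return w, h

# Toy data matrix: rows are points, columns are random walk vectors.
# Walks 0 and 1 were "seeded" in the first group, walks 2 and 3 in the second.
walks = np.array([
    [0.50, 0.30, 0.00, 0.00],
    [0.30, 0.50, 0.00, 0.00],
    [0.20, 0.20, 0.00, 0.00],
    [0.00, 0.00, 0.60, 0.40],
    [0.00, 0.00, 0.40, 0.60],
])

w, h = nmf(walks, n_components=2)
# Assign every point to the component where its membership weight is largest.
labels = np.argmax(w, axis=1)
```

Here each row of `w` plays the role of a soft cluster membership for one point, and taking the `argmax` turns it into a hard label. Which component ends up with which index depends on the random initialization, so only the grouping is meaningful, not the label values themselves.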