Posts tagged with: random walks

## Random walk vectors for clustering (III)

This is the third post about how to use random walk vectors for clustering. The main idea as was stated before is to represent point cloud as a graph based on similarities between points. Similarities between points are encoded in the form of a matrix and the matrix is then treated as a weight matrix of a graph. Such a graph is then traversed randomly resulting in set of random walk vectors (with seed vectors being focused on one different starting point each walk). Each random walk vector represents similarities between points once again – but this time it encodes global dataset shape around given starting point.

In this post we will try to combine many random walk vectors into one matrix that will be used as a data matrix. We will represent each original point as a sequence of numbers that reflect given point different clusters memberships. Having that representation we will use NMF clustering technique to cluster these new data points. (I think it might be considered as a type of consensus clustering as well)

---

## Random walk vectors for clustering (part II – perspective switch)

This post is a second part of the series of posts that will result in combining random walk vectors for clustering. So far we have understood what the similarity is, how to build similarity matrix based on distance matrix. We know the similarity matrix is a structure that encodes similarities between all objects in our dataset.

Today we will further motivate our quest for similarity matrix started in the previous post. We will tell a little bit about what the graph is and how to switch from point cloud perspective to graph perspective. The bridge between these two worlds is the similarity matrix introduced last time.

Basics first.

A wide range of structures occurring in nature are naturally expressed as networks. Whenever a set of objects and relations among them appears, the whole setting can be interpreted this way. A typical example of such network is – becoming hot topic in last decade – social network. A node in that type of a network is an abstraction of a social unit (mainly a person) and an edge represents social relation between pairs of nodes like friendship, business relation, co-interest etc. Another type of network is the internet itself where websites are represented as nodes and links between websites as connections between them. In fact, any set of objects with relation between them can be modeled as a network.

Formally a network is called graph and it consists of set of nodes and set of edges and is usually formalized as a pair $$G = (V,E)$$ where $$V$$ is a set of $$N$$ objects indexed with integers $$\{1, 2, \dots, N \}$$ and $$E$$ is a function:

$E : V^2 \rightarrow \mathcal{R}$

---