## Monte Carlo Tree Search – beginners guide

For a long time, the common opinion in the academic world was that a machine reaching human master level in the game of Go was far from realistic. It was considered a ‘holy grail’ of AI – a milestone we were unlikely to reach within the coming decade. Deep Blue had its moment more than 20 years ago, and since then no Go engine had come close to human masters. The opinion about ‘numerical chaos’ in Go became so well established that it was referenced in movies, too.

Surprisingly, in March 2016 an algorithm developed by Google DeepMind called AlphaGo defeated the Korean world champion in Go 4-1, proving fictional and real-life sceptics wrong. Around a year after that, AlphaGo Zero – the next generation of AlphaGo Lee (the version that beat the Korean master) – was reported to have beaten its predecessor 100-0, a level that is very unlikely to be reachable for humans.

---

## Variational Autoencoder in Tensorflow – facial expression low dimensional embedding

### Digest

The main motivation of this post is to use a Variational Autoencoder (VAE) model to embed unseen faces into the space of pre-trained, single-actor-centric facial expressions. Two datasets are used in the experiments later in this post. Both are based on YouTube videos passed through the OpenFace feature extraction utility:

• Donald Trump faces – because of the recent presidential election in the USA, it was very easy to get videos of Donald Trump's face in frontal position and use them as an input dataset

• Edward Snowden faces – he held a long Q&A session for internet users, which is a good source of faces

The high-level idea is to build a VAE facial expression model for a single actor only, and then embed a new, unseen face into the VAE latent space, from which the original actor's face with a similar expression is reconstructed. The code in Python (using Google TensorFlow) is available on GitHub.

Example videos showing the results of embedding my face into the latent facial expression space of different actors are presented below:

---

## Chess position evaluation with convolutional neural network in Julia

In this post we will tackle the problem of chess position evaluation using a convolutional neural network (CNN) – a type of neural network designed to deal with spatial data. We will first explain why we need CNNs, then present two fundamental CNN layers. With some knowledge of the inside of the black box, we will apply a CNN to the binary classification problem of chess position evaluation using the Julia deep learning library Mocha.jl.

### Introduction – data representation

One of the challenges that frequently occurs in machine learning is the proper representation of the input data. Ideally, data should be represented so that it carries as much information as possible while remaining digestible for ML algorithms. Digestibility means fitting into existing mathematical frameworks where known abstract tools can be applied.

A common, convenient representation of a single observation is a vector in $$\mathbb{R}^n$$. Assuming such a representation, ML problems may be seen from many different angles, with the benefit of using well-known abstractions and interpretations. One very common perspective is the algebraic one: with the input data as a matrix (one vector per column), its eigendecomposition or various factorizations may be considered, and both yield important results in the context of machine learning. A set of vectors in $$\mathbb{R}^n$$ forms a point cloud; when the geometry of such a cloud is considered, manifold learning methods emerge. A linear model with least squares error has a closed-form solution in the algebraic framework. In all of these cases, representing input data as vectors opens up a broad range of tools for handling the problem effectively.
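
As a worked instance of the closed-form claim above, here is the standard least squares derivation, keeping the convention of one observation per column. The symbols $$X$$, $$y$$, $$w$$ and $$m$$ are notation introduced for this sketch rather than taken from the post, and the final step assumes that $$XX^\top$$ is invertible.

$$
\begin{aligned}
L(w) &= \lVert X^\top w - y \rVert_2^2, \qquad X \in \mathbb{R}^{n \times m},\; y \in \mathbb{R}^m,\; w \in \mathbb{R}^n \\
\nabla_w L(w) &= 2X\left(X^\top w - y\right) = 0 \\
w^\ast &= \left(XX^\top\right)^{-1} X y
\end{aligned}
$$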

For some domains, though, it is not obvious how to represent the input as vectors while preserving the original information contained in the data. An example of such a domain is text. A text document is rich in various types of information: there are the semantics and syntax of the text, or even the personal style of the writer. It is not clear how to represent this implicit information contained in text. People tend to simplify and use the Bag of Words (BoW) approach to represent text, which completely ignores the ordering of words in a document and treats it as a set.
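
To make the loss of word order concrete, here is a minimal Python sketch. The helper `bag_of_words` and the two sample sentences are made up for illustration, the tokenizer is deliberately naive (lowercase, split on whitespace), and word counts are kept, so this is strictly a multiset rather than a plain set.

```python
from collections import Counter


def bag_of_words(text):
    """Map a document to its word counts, discarding word order."""
    # Naive tokenizer (an assumption of this sketch): lowercase, split on whitespace.
    return Counter(text.lower().split())


doc_a = "the dog bit the man"
doc_b = "the man bit the dog"

# The sentences mean different things but get identical representations.
print(bag_of_words(doc_a) == bag_of_words(doc_b))  # True
```

Any model that only sees these counts cannot tell the two sentences apart.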

Another domain that suffers from a similar problem is images. When an image is represented as a vector whose dimensionality equals the total number of pixels, the spatial information is lost: the algorithm that later consumes the input vectors is usually not aware that the original structure of an image is a set of 2-dimensional grids (one matrix per channel).
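
The following small Python sketch shows what flattening does to neighbouring pixels. The 3x4 single-channel grid and the helpers `flatten` and `flat_index` are hypothetical, and row-major ordering is an assumption of the example.

```python
def flatten(image):
    """Concatenate the rows of a 2-D grid into one flat list (row-major order)."""
    return [pixel for row in image for pixel in row]


def flat_index(row, col, width):
    """Position of pixel (row, col) in the row-major flattened vector."""
    return row * width + col


# A tiny 3x4 single-channel "image"; the values are arbitrary.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
width = len(image[0])
vector = flatten(image)

# Pixels (0, 1) and (1, 1) are vertical neighbours in the grid,
# but they end up `width` positions apart in the flattened vector.
a = flat_index(0, 1, width)
b = flat_index(1, 1, width)
print(len(vector), b - a)  # 12 4
```

Nothing in the flat vector says that positions 1 and 5 were adjacent; a model consuming it has to learn that relation from the data.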

So far our neural network has not been aware of the two-dimensional nature of the input data (MNIST). It could, of course, discover it by itself by learning relations between neighboring pixels, but the fact is that it has had no clue so far.

---