November 19, 2014

Stacking of Classifiers

Often a single feature extraction and classification combination does not give good results. One simple remedy is to stack the available combinations and then deploy a linear classifier (for example, logistic regression) to produce the final prediction. The structure of stacking is shown in the figure below: the outputs of several combinations of feature extraction methods and training models are stacked and fed to the final classifier. 
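As a runnable sketch of this idea (using scikit-learn, which the post does not specify, so the data and model names here are purely illustrative): out-of-fold predictions from a few base models become the input features of a logistic regression meta-classifier.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Toy data standing in for the outputs of different feature extraction methods
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_models = [DecisionTreeClassifier(random_state=0), GaussianNB()]

# Level 0: out-of-fold predictions of each base model become new features
stack_train = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

# Level 1: logistic regression combines the stacked predictions
meta = LogisticRegression().fit(stack_train, y_train)

stack_test = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1] for m in base_models
])
pred = meta.predict(stack_test)
print(pred.shape)          # (75,)
```

Out-of-fold predictions are used for the level-1 training set so the meta-classifier does not learn from predictions the base models made on their own training data.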



October 29, 2014

Weka Tool for Classification

Weka [1][2] is an open-source tool widely used for performing various Machine Learning and Data Mining tasks. Weka is written in Java. 

Download the zip file for Weka from [1] and extract it. Change into the extracted directory and run

java -jar weka.jar

This will start the Weka GUI. Weka provides many options for data mining tasks such as clustering, classification, preprocessing, and feature extraction. An important advantage of Weka is its visualization of results: we just load the data and start the task we need. 

[1] http://www.cs.waikato.ac.nz/ml/weka/
[2] Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, Ian H. Witten (2009); The WEKA Data Mining Software: An Update; SIGKDD Explorations, Volume 11, Issue 1.

September 14, 2014

Topic Modelling of Text Data using LDA in Python

Topic Modelling using LDA:

In Machine Learning, feature extraction of text data is a very critical step. Some commonly used approaches are LDA (Latent Dirichlet Allocation), TF-IDF (Term Frequency-Inverse Document Frequency), etc. In this post, I will go through the basics of topic modelling using LDA in Python.

The algorithm by M. Hoffman et al. [1] uses stochastic optimization to maximize the variational objective function of the Latent Dirichlet Allocation (LDA) topic model. From a programming point of view, it is easiest to use the LDA implementation in Python [2]. Finding the optimal number of topics is an important step; it depends on the data and on the number and types of attributes, so it is important to try different values by trial and error. 

The basic flow to create a classifier is as follows. First, feature extraction is done using LDA. Singular Value Decomposition (SVD) is then applied to the generated feature vectors for feature reduction, since some unimportant features may be present. Finally, different classifiers can be trained on the reduced feature vectors.
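This flow might be sketched with scikit-learn (an assumption on my part; the post itself points to the lda package [2], which exposes a similar fit/transform interface). The documents, labels, and component counts below are illustrative only:

```python
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["the cat sat on the mat", "dogs are loyal pets",
        "stock prices rose sharply", "markets fell on weak earnings"] * 10
labels = [0, 0, 1, 1] * 10

counts = CountVectorizer().fit_transform(docs)    # term-count matrix

# Feature extraction with LDA: each document becomes a topic-weight vector
topics = LatentDirichletAllocation(n_components=6,
                                   random_state=0).fit_transform(counts)

# Feature reduction with SVD, dropping the less important directions
reduced = TruncatedSVD(n_components=3, random_state=0).fit_transform(topics)
print(reduced.shape)       # (40, 3)

# Any classifier can now be trained on the reduced feature vectors
clf = LogisticRegression().fit(reduced, labels)
```

The number of topics (6 here) is exactly the parameter the paragraph above suggests tuning by trial and error.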

[1] Matthew D. Hoffman, David M. Blei, Francis Bach (2010); Online Learning for Latent Dirichlet Allocation; NIPS 2010.
[2] https://pypi.python.org/pypi/lda

August 17, 2014

Independent Component Analysis (ICA) in Matlab

While performing data analysis tasks, it is generally assumed that the feature attributes are independent of each other, but real data may not come in that form. So it is very important to transform the feature attributes so that they are independent of each other. This can be done with a simple ICA function in Matlab, which also reduces the data to the required number of feature attributes. 

The ICA technique essentially removes linear dependence among the feature attributes. Matlab does not provide a built-in function for ICA, but several ICA packages for Matlab are available [1][2][3]. ICA can also be viewed as a dimensionality reduction method, projecting the data into N dimensions. Note that ICA can only separate linearly mixed signals, and it assumes the source signals are non-Gaussian. 
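For illustration, here is the same idea in Python via scikit-learn's FastICA (the Matlab packages in [1][2][3] expose analogous functions); the two sources and the mixing matrix are arbitrary examples chosen to satisfy the non-Gaussian, linearly mixed assumptions above:

```python
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)                       # source 1
s2 = np.sign(np.cos(3 * t))              # source 2 (non-Gaussian, as ICA assumes)
S = np.column_stack([s1, s2])

A = np.array([[1.0, 0.5], [0.5, 2.0]])   # mixing matrix (linear mixing only)
X = S @ A.T                              # observed mixed feature attributes

ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(X)         # estimated independent components
print(recovered.shape)                   # (2000, 2)
```

Setting n_components below the input dimension is what gives the dimensionality reduction mentioned above.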

[1] http://www.mathworks.com/matlabcentral/fileexchange/38300-pca-and-ica-package
[2] http://research.ics.aalto.fi/ica/fastica/
[3] http://jim-stone.staff.shef.ac.uk/bookmatlabcode.html

August 04, 2014

Policy Iteration for Grid World Games

In my previous post, I discussed the Value Iteration algorithm for Grid World games. Policy Iteration is another algorithm commonly used for solving MDP problems. Here I will discuss the Policy Iteration algorithm for Grid World games. 

Consider the same problem as in the previous post. Both Value Iteration and Policy Iteration are initialized with arbitrary, random values that are far from the optimal solution. This initialization does not take into account the specification of the states S and actions A, which can slow convergence. In each iteration, the algorithms explore the full set A of actions defined in the MDP to improve the current policy, which may cause unnecessary calculations. The transition model used here is fixed; as we shall see later, in robotics a model of this type can degrade the results and disagree with MDP theory.
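A minimal sketch of policy iteration on a small deterministic grid (a 4x4 Python toy with an assumed step cost of -0.04 and discount 0.9, not the 10x10 game from these posts):

```python
import numpy as np

# 4x4 grid, states 0..15; state 15 is the reward cell (utility held at 1.0)
N, GOAL, GAMMA = 4, 15, 0.9
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]      # up, down, left, right

def step(s, a):
    r, c = divmod(s, N)
    nr, nc = r + a[0], c + a[1]
    # The border acts as a wall: bumping into it leaves the agent in place
    return nr * N + nc if 0 <= nr < N and 0 <= nc < N else s

policy = np.zeros(N * N, dtype=int)               # arbitrary initial policy
U = np.zeros(N * N)
stable = False
while not stable:
    # Policy evaluation: sweep the Bellman expectation under the current policy
    for _ in range(50):
        for s in range(N * N):
            U[s] = 1.0 if s == GOAL else -0.04 + GAMMA * U[step(s, ACTIONS[policy[s]])]
    # Policy improvement: act greedily with respect to the evaluated utilities
    stable = True
    for s in range(N * N):
        best = max(range(4), key=lambda a: U[step(s, ACTIONS[a])])
        if best != policy[s]:
            policy[s], stable = best, False

print(U[0] > 0)    # the start cell ends up with positive utility
```

The alternation of evaluation and improvement is exactly where the "unnecessary calculations" above arise: every sweep re-examines all four actions in every state.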

June 30, 2014

Value Iteration for Grid World Games

Solving Grid World problems is a very interesting topic in the AI community. In this post, I will cover the Value Iteration method for the Grid World problem. 

Let's consider a 10*10 grid consisting of an agent, a reward cell, some wall cells, and some penalty cells. The reward, wall, and penalty cells are placed at random locations. The agent tries to reach the reward along the shortest path while avoiding the wall and penalty cells. This can be treated as a sequential decision problem in computer games.

Consider a two-stage procedure to solve this problem. In the first stage, the value iteration algorithm is applied, which gives a policy for reaching the reward cell. In the second stage, a path planning algorithm selects the shortest path for the agent to reach the reward cell. The modified value iteration algorithm takes all the wall and penalty cells into consideration and returns the policy. We analyze the policy returned after a certain number of iterations and then apply the path planning algorithm; the policy, and hence the path, changes with the number of iterations. During value iteration, we directly select the optimal choice for each cell.

Assumptions:

1) The positions of the walls are not fixed, but they are known to the agent.
2) The agent can move only in the horizontal and vertical directions.
3) The agent can move only one cell per unit time.
4) The positions of the reward cell, penalty cells, and wall cells are randomly generated for each new game.
5) The starting point of the agent is fixed at cell (1, 1).
6) The outermost rows and columns (the eleventh row and eleventh column) are treated as walls.
7) The agent moves continuously, without stopping, until it finds the reward.

Algorithm:

We can consider the given task as a sequential decision problem. We use a fully observable MDP, as the agent has complete knowledge of the grid.

The modified value iteration is an iterative algorithm based on the direct resolution of the Bellman optimality equation. It requires an initialization of the utility values, searches over those values, and finds the resulting policy. It calculates the expected utility of each state from the utilities of the neighboring states until the utilities calculated on two successive steps are close enough:

U_{i+1}(s) = R(s) + γ max_a Σ_{s'} P(s' | s, a) U_i(s'),   stopping when max_s |U_{i+1}(s) − U_i(s)| < ε

where ε is a predefined threshold value. The smaller the threshold, the higher the precision of the algorithm. Given certain conditions on the environment, the utility values are guaranteed to converge.
Given a utility matrix, we can calculate a corresponding policy according to the maximum expected utility principle, i.e. choosing an optimal policy as:

π*(s) = argmax_a Σ_{s'} P(s' | s, a) U(s')
As with policy iteration, each sweep explores the full set A of actions defined in the MDP to improve the current policy, which may cause unnecessary calculations, and the fixed transition model can be problematic in robotics settings. Our modified value iteration takes multiple penalty cells and multiple wall cells into account, and it performs well even when several penalty states or walls are present.

Path Planning: 

Path planning is used extensively in the field of Artificial Intelligence to find the shortest path for the agent to the goal state. Various informed and uninformed search strategies are used for path planning, and the algorithm should return the optimal path to the goal state.
A grid-based algorithm first overlays a grid on the maze space and identifies each state with a grid point. At each grid point, the agent may move to adjacent grid points as long as the line between them is unobstructed. In our work, we use path planning to take the agent to the reward cell in the best possible way.

Some sample results are as follows. The value of each cell is shown.  


Figure 1: Sample Grid 1


Figure 2: Sample Grid 2

Discussion:

For various maze configurations, the agent reaches the positive (reward) state using the policy produced by value iteration. Even when the policy produces different paths on a given maze, the number of steps is exactly the same, because the agent reaches the positive state with equivalent rectangular or zigzag movements. As the number of iterations grows from 20 to 60, the smallest utility value in each maze increases by up to about +0.7.



January 31, 2014

Unsupervised Image Segmentation Using Expectation-Maximization Algorithm

In this post, I would like to present my work on unsupervised image segmentation using clustering algorithms, namely K-means clustering and the Expectation-Maximization (EM) algorithm.


The EM algorithm for image segmentation is based on maximum likelihood parameter estimation. The important difference between K-means and the EM algorithm is that EM assigns each point a fractional membership in the clusters.

I will discuss the general procedures of these algorithms.  Matlab is used as the tool.


K-means Clustering Algorithm:


Step 1: Initialize the centroids with k random intensities
Step 2: Repeat until the algorithm has converged:
             • Cluster the points based on the distance of their intensities from the centroid intensities
             • Compute the new centroid for each of the clusters
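The two steps above can be sketched as a small runnable example (in Python with NumPy rather than Matlab, purely for illustration; the quantile-based initialization stands in for "k random intensities" to keep the run deterministic):

```python
import numpy as np

def kmeans_intensity(pixels, k, n_iter=100):
    """Cluster 1-D pixel intensities with the two steps listed above."""
    # Step 1: spread the initial centroids over the intensity range
    centroids = np.quantile(pixels, np.linspace(0, 1, k))
    for _ in range(n_iter):
        # Cluster points by distance of their intensity to each centroid
        labels = np.argmin(np.abs(pixels[:, None] - centroids[None, :]), axis=1)
        # Compute the new centroid of each cluster
        new = np.array([pixels[labels == j].mean() if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):      # converged
            break
        centroids = new
    return labels, centroids

# Synthetic "image": two well-separated intensity populations
pixels = np.concatenate([np.linspace(20, 40, 50), np.linspace(190, 210, 50)])
labels, centroids = kmeans_intensity(pixels, k=2)
print(centroids)          # [ 30. 200.]
```

On a real image, pixels would be the flattened grayscale intensities and labels the segmentation map.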



EM Algorithm:

Given data X = {x(1), ..., x(N)}, where x(k) ∈ ℝ^n

Step 1: Introduce hidden or latent variables Z such that ln p(X|Z, Θ) has an analytically tractable form.
Then construct complete data model p(X, Z|Θ) = p(X|Z, Θ)p(Z|Θ)

Step 2: Initialize parameter guess Θ(0)

Step 3: Repeat for t = 1, 2, 3 . . .

E step: Compute
Q(Θ, Θ(t) ) = E(Z|X,Θ(t)) [ln p(X, Z|Θ)];

M step: Choose
Θ(t+1) = arg max Q(Θ, Θ(t) )

Stop when |Θ(t+1) − Θ(t)| ≤ ε, where ε is some convergence criterion
Output Final Parameter estimation Θ
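The E and M steps above can be sketched for a one-dimensional two-component Gaussian mixture, where Z is the (hidden) component membership of each point. This is a NumPy illustration, not the Matlab code used for the actual segmentation; the data and initial guess Θ(0) are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
# Observed data X; the latent variable Z is which Gaussian produced each point
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(5, 1, 200)])

# Step 2: initial parameter guess Theta(0) = (weights, means, variances)
w = np.array([0.5, 0.5])
mu = np.array([1.0, 4.0])
var = np.array([1.0, 1.0])

def gauss(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

for t in range(100):                              # Step 3
    # E step: posterior responsibility of each component for each point
    resp = w * gauss(x[:, None], mu, var)
    resp /= resp.sum(axis=1, keepdims=True)
    # M step: closed-form maximizers of Q(Theta, Theta(t))
    Nk = resp.sum(axis=0)
    w = Nk / len(x)
    mu_new = (resp * x[:, None]).sum(axis=0) / Nk
    var = (resp * (x[:, None] - mu_new) ** 2).sum(axis=0) / Nk
    if np.max(np.abs(mu_new - mu)) < 1e-6:        # stopping criterion on Theta
        mu = mu_new
        break
    mu = mu_new

print(np.round(np.sort(mu), 1))
```

The responsibilities resp are exactly the fractional memberships that distinguish EM from K-means.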


The Image Segmentation procedure using EM algorithm is shown below. 




Unsupervised EM Algorithm for Image Segmentation: 

Goal: Estimate Θ and the number of segments

Input: Image vector X = { x[1], x[2], . . . , x[N^2] }^T

Step 1: Set initial minimum message length Lmin = +∞
Step 2: for m = Mmax down to Mmin

             Initialize θ(0) = θ_k-means
             Refine by EM: θm = θ_EM

             Compute the message length L(θm, x)
             if L(θm, x) ≤ Lmin then
                 Lmin = L(θm, x); θ = θm; Mbest = m
             end if
         end for

Output: Estimated parameters θ and number of segments Mbest
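The message-length criterion L(θ, x) is specific to this algorithm; as a stand-in, the following sketch runs the same descending search loop using scikit-learn's GaussianMixture (k-means initialization is its default) with the BIC as the model selection score. This is an assumed analogue, not the author's criterion, and the synthetic intensities are illustrative:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic pixel intensities drawn from three well-separated segments
x = np.concatenate([rng.normal(40, 5, 300), rng.normal(120, 5, 300),
                    rng.normal(200, 5, 300)]).reshape(-1, 1)

M_max, M_min = 6, 1
L_min, M_best = np.inf, None
for m in range(M_max, M_min - 1, -1):
    # Initialize via k-means (sklearn's default), then refine with EM
    gm = GaussianMixture(n_components=m, random_state=0).fit(x)
    L = gm.bic(x)                  # stand-in for the message length L(theta, x)
    if L <= L_min:
        L_min, M_best = L, m

print(M_best)
```

Like the minimum-message-length loop above, this keeps the model whose score is smallest, so the number of segments is estimated rather than fixed in advance.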


Results:

Peppers Image


City Image