== Introduction ==

This package contains the source code of the closed-form EM algorithm for the 
spike-and-slab or Gaussian sparse coding (GSC) model as described in [1]. 
The Python-based implementation of the GSC algorithm can be run either on 
a single core or on a multi-core/parallel architecture, as described below. 
The results of each iteration can be stored in the standard 
HDF5 file format, which other software packages such as Matlab can read. 
Please read the following sections for further instructions on how to run the code. 
This software is released under the Academic Free License (AFL), v3.0.

If you have problems running the code, please contact:
Abdul-Saboor Sheikh <sheikh [aT] fias.uni-frankfurt.de>

== Overview ==

pulp/       - Python library/framework for MPI-parallelized 
               EM-based algorithms. The GSC implementation
               is in pulp/em/camodels/gsc.py

examples/   - Some examples showing how to run the code

data/       - Training data used in one of the examples

== Software dependencies ==
 
 * Python (>= 2.6)
 * NumPy (reasonably recent)
 * SciPy (reasonably recent)
 * pytables (reasonably recent)
 * mpi4py (>= 1.2)

== Running ==

First, try running one of the examples, e.g.:

  $ cd examples
  $ python gsc-test.py

This should run the GSC algorithm on synthetic data generated by the model.
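For orientation, here is a minimal sketch of what "synthetic data generated by
the model" can look like. It is not taken from gsc-test.py. It assumes the
generative form read from [1]: binary spikes s drawn with probability pi,
standard normal slabs z, and observations y = W (s * z) + noise. The function
name sample_gsc and the isotropic noise level sigma are illustrative
assumptions (the model itself uses a noise covariance Sigma).

```python
import numpy as np


def sample_gsc(n_samples, W, pi, sigma, seed=0):
    """Draw samples from a spike-and-slab sparse coding model (sketch).

    W     : (D, H) dictionary matrix
    pi    : prior probability that a latent is active (scalar in [0, 1])
    sigma : noise standard deviation (assumption: isotropic noise)
    """
    rng = np.random.RandomState(seed)
    D, H = W.shape
    s = rng.rand(n_samples, H) < pi          # binary "spike" variables
    z = rng.randn(n_samples, H)              # Gaussian "slab" variables
    noise = sigma * rng.randn(n_samples, D)  # observation noise
    y = np.dot(s * z, W.T) + noise           # y = W (s * z) + noise
    return y, s, z
```

With a small pi, most entries of s are zero, so each data point is a
combination of only a few columns of W.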

== Running on a parallel architecture ==

The code uses MPI-based parallelization. If you have parallel resources
(a multi-core system or a compute cluster), the code can make
use of them by evenly distributing the training data among multiple cores.
See examples/gsc-test.py for an example. Here is how you can run it on:

a) a multi-core machine, say with 32 cores:

 $ mpirun -np 32 python gsc-test.py

b) a cluster:

 $ mpirun --hostfile machines python gsc-test.py

 where 'machines' contains a list of suitable machines.

See your MPI documentation for further details on how to start/configure MPI-based programs.
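To illustrate what "evenly distributing the training data" means, the sketch
below computes which contiguous block of data points a given process would
handle. It is an illustration only and not necessarily how pulp splits the
data; the helper name local_slice is hypothetical. In an MPI program, rank and
n_procs would typically come from mpi4py's MPI.COMM_WORLD.Get_rank() and
MPI.COMM_WORLD.Get_size().

```python
def local_slice(n_items, n_procs, rank):
    """Return (start, stop) of the block of items assigned to `rank`.

    Block sizes differ by at most one item; the first
    (n_items % n_procs) ranks receive the extra item.
    """
    base, extra = divmod(n_items, n_procs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Example: 10 data points split over 4 processes.
blocks = [local_slice(10, 4, r) for r in range(4)]
# blocks == [(0, 3), (3, 6), (6, 8), (8, 10)]
```

Each process would then work on data[start:stop] only, so per-process memory
and computation shrink roughly in proportion to the number of cores.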

== Results/Output ==

For every run, the code stores the results in a 'results.h5' file 
under "./output/.../" in the directory containing the Python script that
was run. The results file stores the model parameters (W, pi and Sigma) 
for each EM iteration performed. To read the results file, you can use the
openFile function of the PyTables package ('tables') in Python (renamed to
open_file in PyTables 3.0 and later). The results file can also be read
by other packages that support HDF5, such as Matlab.
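As a starting point for inspecting a results file, the sketch below lists every
array stored in an HDF5 file together with its shape. It assumes PyTables 3.0
or later (open_file instead of openFile) and makes no assumption about how the
nodes inside 'results.h5' are named; the helper name list_hdf5_arrays is
hypothetical.

```python
def list_hdf5_arrays(path):
    """Return a dict mapping each leaf node's path to its shape."""
    import tables  # PyTables, imported here so the helper stays optional

    shapes = {}
    h5 = tables.open_file(path, mode="r")
    try:
        # Visit every leaf (array or table) below the root group.
        for leaf in h5.walk_nodes("/", classname="Leaf"):
            shapes[leaf._v_pathname] = tuple(leaf.shape)
    finally:
        h5.close()
    return shapes
```

Once the node names are known, an individual array can be loaded into NumPy by
calling read() on the corresponding node while the file is open.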


== References ==

[1] Luecke, J. and Sheikh, A.-S. 
    A Closed-Form EM Algorithm for Sparse Coding and Its Application to Source Separation.
    International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), 213-221, 2012.