== Introduction ==

This package contains the source code of the closed-form EM algorithm for the
spike-and-slab, or Gaussian sparse coding (GSC), model as described in [1].

The Python implementation of the GSC algorithm can be run either on a single
core or on a multi-core/parallel architecture, as described below. The results
of each iteration can readily be stored in the standard HDF5 file format,
which can be read by other software packages such as Matlab.

Please read the following for further instructions on how to run the code.

This software is released under the Academic Free License (AFL), v3.0.

If you have problems running the code, please contact: Abdul-Saboor Sheikh

== Overview ==

pulp/      - Python library/framework for MPI-parallelized EM-based algorithms.
             The GSC implementation is in pulp/em/camodels/gsc.py
examples/  - Some examples showing how to run the code
data/      - Training data used in one of the examples

== Software dependencies ==

 * Python (>= 2.6)
 * NumPy (reasonably recent)
 * SciPy (reasonably recent)
 * pytables (reasonably recent)
 * mpi4py (>= 1.2)

== Running ==

First, run some examples, e.g.

  $ cd examples
  $ python gsc-test.py

This should run the GSC algorithm on synthetic data generated by the model.

== Running on a parallel architecture ==

The code uses MPI-based parallelization. If you have parallel resources (a
multi-core system or a compute cluster), the code can make use of them by
evenly distributing the training data among multiple cores. See
examples/gsc-test.py as an example. Here is how you can run it on:

a) a multi-core machine, say with 32 cores:

  $ mpirun -np 32 python gsc-test.py

b) a cluster:

  $ mpirun --hostfile machines python gsc-test.py

   where 'machines' contains a list of suitable machines.

See your MPI documentation for further details on how to start and configure
MPI-based programs.

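
To illustrate what "evenly distributing the training data" can mean, here is a
minimal sketch of a block partition in which each process owns one contiguous
slice of the data and slice sizes differ by at most one. It is not taken from
pulp, and the library may divide the data differently; the function name
shard_bounds and the example sizes are hypothetical.

```python
def shard_bounds(n_items, n_ranks, rank):
    """Return (start, stop) of the contiguous block owned by `rank`.

    Hypothetical helper, not part of pulp. Block sizes differ by at most
    one: the first `n_items % n_ranks` ranks each receive one extra item.
    """
    if not 0 <= rank < n_ranks:
        raise ValueError("rank must satisfy 0 <= rank < n_ranks")
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # In an MPI program, rank and n_ranks would come from mpi4py's
    # MPI.COMM_WORLD.Get_rank() and MPI.COMM_WORLD.Get_size().
    # Fixed values are assumed here so the sketch runs without MPI.
    n_items, n_ranks = 10, 4
    for rank in range(n_ranks):
        print(rank, shard_bounds(n_items, n_ranks, rank))
```

Each process would then work only on data[start:stop]. Because the blocks are
contiguous and cover all the data without overlap, concatenating the
per-process slices in rank order recovers the full training set.
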
== Results/Output ==

For every run, the code stores the results in a 'results.h5' file under
"./output/.../" in the directory containing the Python script that was run.
The results file stores the model parameters (W, pi and Sigma) for each EM
iteration performed. To read the results file, you can use the openFile
function of the standard tables package in Python (named open_file in
PyTables 3.0 and later). The results files can also be read by other packages
such as Matlab.

== References ==

[1] Luecke, J. and Sheikh, A.-S. A Closed-Form EM Algorithm for Sparse Coding
    and Its Application to Source Separation. International Conference on
    Latent Variable Analysis and Signal Separation (LVA/ICA), 213-221, 2012.