Learning and Cognitive Systems

BCM 'Exam Scores' Model for Two Latent Classes

Lee & Wagenmakers (Bayesian Cognitive Modeling, 2013, ch. 6.1, p.77-78) file this model under Latent-mixture models without proper definition of this term and without making it evident where the mixing takes place. We think that the label Latent-Class is more appropriate than Latent-Mixture for the Exam Scores model.

"A mixture model, ..., usually means that each individual data point y(i) is assumed drawn from one of a list of possible distributions. This can be considered as a clustering of the points into groups G(j) (j=1,...,J), where y(i) is a member of group T(i) and has a distribution parameterised by theta(T(i)). We may write this general model as

y(i) ~ p(y(i) | theta(T(i)), G(T(i)),    T(i) ~ Categorical(p[])

so that the probability that y(i) is in the jth group G(j) is PR(T(i)=j)=p(j)." (Lunn et al., The BUGS Book, 2013, p.280f)

In this vague sense the Exam Scores model is a latent mixture model, but the data in this application are assumed not to be generated by a mixture of distributions but by a set of mutual exclusive classes with according latent parameters: group membership probabilities are discrete values {0,1}. So the label Latent-Class seems to be more appropriate for the Exam Scores model.

The task in its original wording: "Suppose a group of 15 people sit an exam made up of 40 true-or-false questions, and they get 21, 17, 21, 18, 22, 31, 31, 34, 34, 35, 35, 36, 39, 36, and 35 right. These scores suggest that the first 5 people were just guessing, but the last 10 had some level of knowledge.

One way to make statistical inferences along these lines is to assume there are two different groups of people. These groups have different probabilities of success, with the guessing group having a probability of 0.5, and the knowledge group having a probability greater than 0.5. Whether each person belongs to the first or the second group is a latent or unobserved variable that can take just two values. Using this approach, the goal is to infer to which group each person belongs, and also the rate of success for the knowledge group." (Lee & Wagenmakers, 2013, p.77)

The probabilistic graphical model and the Bugs/Jags code are presented below. Discrete variables are designated by square nodes, continous by circle nodes. White nodes are latent and have to be inferred. Grey nodes are manifest; either observed data or set as constants. Single bordered nodes are stochastic and double bordered are deterministic derived variables.

The symbols denote:

(1) n = #questions (here set to n = 40)

(2) k(i) = #right or correct answers of person i

(3) psi = ability parameter of the guessing group (here set to 0.5)

(4) phi = ability parameter of the knowledge group (constrained to be between 0.5 and 1.0)

(5) z(i) = group membership of person i (is either 0 or 1; 0 is guessing group)

(6) theta(i) = ability parameter of person i

The only row of code which could be interpreted as a kind of mixing ist the fourth line from below:

theta[i] <- equals(z[i],0)*psi+equals(z[i],1)*phi.

But this is not true, because the semantics of this BUGS line-of-code is a conditional expression. Because BUGS does not know IF-THEN-ELSE constructs, conditional expressions have to be simulated by indicator functions (e.g. equal(X, Y)):

 equal (X, Y) means IF(X == Y) THEN 1.

So the meaning of the above line-of-code is (translated to R):

theta[i] <- if(z[i] == 0) psi else phi.

Plot and summary below show that the 95%-posterior-high-density region of the phi-parameter is 0.828-0.894. Furthermore the latent class indicator z indicates the class membership without any error. This is not astonishing because the Exam Scores problem is similar to hiding and looking for easter eggs because the data were constructed according to the hypothesis that there should be the two classes guessers and knowers.

Modified BCM Exam Scores Model for Two Latent Classes

We modified the Lee & Wagenmakers model in various ways: (1) We introduced a bias parameter gbias for estimating the group size, (2) both group-specific ability parameters theta_0 and theta_1 are unconstrained; their priors are uniform flat with dbeta(1,1), and (3) we use only one for loop running over persons.

We ran the simulation with OpenBUGS. 10000 samples with thinning=10. The results show that the knowledge group 1 has an estimated sizeof 64.7% (mean of gbias = 0.647). The ability of the guessing group is the mean of the posterior theta_0; which is 0.4962. The ability of the knowledge group is 0.8634. All 15 class memberships could be inferred correctly.

Latent Class Model for G Classes of Binomial Distributed Variables

In this model the priors of the class membership g_bias[] are sampled from the Dirichlet distribution with hyperparameter alpha[] = 1. This means that we assume equal probable class memberships. The class membership indicators z[i] are discrete from the set {1,2,...ng} and the theta[i] are not the expected mean of the group-specific theta[i,g] but selected from the set {theta[i,1], theta[i,2], ..., theta[i, ng]}.

The results show that persons 1-10 with scores 10-15 constitute the 3rd latent class; accordingly persons 11-15 with scores 30-34 the 2nd latent class, and persons 16-20 with scores 46-50 the 1st latent class. We suspect that the method is not very discriminative. We found that the z[i] will loose their discrete nature with data having smaller between group variance.