Bayesian test of independence of discrete events A and B

I’m looking to implement a Bayesian test of whether discrete events / outcomes A and B are independent or not, as an alternative to the frequentist chi-squared test (or similar).

I have paired count data arranged in the classic A vs B grid:

    | B=0 | B=1 | B=2 | sum
---------------------------
A=0 | 33  | 343 |  30 | 406
A=1 | 64  | 360 |  70 | 494
---------------------------
sum | 97  | 703 | 100 | 900

If A and B were independent then the cell probabilities would be P(A,B) = P(A).P(B), and the expected cell counts would thus be:

    | B=0 | B=1 | B=2 |
-----------------------
A=0 | 44  | 317 |  45 |
A=1 | 53  | 386 |  55 |
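
For reference, these expected counts are just the total count times the outer product of the marginal proportions; a quick numpy check of the numbers above:

import numpy as np

observed = np.array([[33, 343, 30],
                     [64, 360, 70]])
total = observed.sum()                 # 900
p_a = observed.sum(axis=1) / total     # marginal P(A)
p_b = observed.sum(axis=0) / total     # marginal P(B)
expected = total * np.outer(p_a, p_b)  # expected counts under independence
print(expected.round())
# [[ 44. 317.  45.]
#  [ 53. 386.  55.]]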

These expected counts differ quite considerably from my observed counts, so A and B are likely not independent. But I want to develop a hypothesis test to determine this more rigorously.

One idea I had was to define Dirichlet priors for P(A) and P(B), together with a grid of multipliers k[A,B] such that the cell probabilities are P(A=a,B=b) = k[A=a,B=b].P(A=a).P(B=b). The idea is that if the posterior for each k[A,B] straddles 1.0, this implies independence.
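
Roughly, the structure I have in mind is something like the sketch below (written with a Multinomial over the cell counts just to keep it short; the Normal prior on log k and the renormalisation of the scaled products are just my first guesses, and I realise the renormalisation probably means the k values are only identified up to scale):

import numpy as np
import pymc as pm

counts = np.array([[33, 343, 30],
                   [64, 360, 70]])
n_a, n_b = counts.shape

with pm.Model() as interaction_model:
    p_a = pm.Dirichlet("p_a", a=np.ones(n_a))  # marginal P(A)
    p_b = pm.Dirichlet("p_b", a=np.ones(n_b))  # marginal P(B)
    log_k = pm.Normal("log_k", mu=0, sigma=1, shape=(n_a, n_b))
    # k == 1 everywhere would correspond to independence
    k = pm.Deterministic("k", pm.math.exp(log_k))
    cell = k * p_a[:, None] * p_b[None, :]
    # renormalise so the n*m cell probabilities sum to 1
    p_cell = pm.Deterministic("p_cell", cell / cell.sum())
    pm.Multinomial("obs", n=counts.sum(), p=p_cell.flatten(), observed=counts.flatten())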

I would likely expand the count data (which could be modelled via multinomial RVs) into a long format so that Categorical RVs can be used; I expect the model may then be easier to code.
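
By "long format" I mean one row per observation, recording its (a, b) outcome pair, something like:

import numpy as np

counts = np.array([[33, 343, 30],
                   [64, 360, 70]])
a_idx, b_idx = np.indices(counts.shape)
a_long = np.repeat(a_idx.flatten(), counts.flatten())  # 900 A outcomes
b_long = np.repeat(b_idx.flatten(), counts.flatten())  # 900 B outcomes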

Does this sound like the best way to proceed, or is there a more established test / model structure I should use instead?

In the above, to keep things simple, I have A as a 2-outcome event and B as a 3-outcome event. In general, though, I need these to be n-outcome and m-outcome.

Hi @cowirihy,

This is interesting! I am by no means an expert but thought I could offer a different perspective.

First, I am not sure the Dirichlet would work, because if you constrain the sum of your event probabilities to 1 then you are telling the model to treat the events as being dependent.

I think an alternative would be similar to what Kruschke does in chapter 12 of DBDA2 and model each joint event as either having happened or not. That way you can model each joint event probability with a Beta distribution. To check for dependence, you would compute the marginal event probabilities from the posterior and subtract the product of those marginals from the joint posterior probability. You can then define a region of practical equivalence (ROPE) to decide how much posterior mass would constitute dependence/independence.

Here is some code showing how this could be done with PyMC:

import arviz as az
import numpy as np
import pymc as pm

# From table in post
a0b0 = 33
a0b1 = 343
a0b2 = 30
a1b0 = 64
a1b1 = 360
a1b2 = 70

# Long format: each row is [A event type, B event type, whether that joint event occurred (1) or not (0)]
d = np.concatenate(
    (
        np.stack([np.array([0, 0, 1]) for _ in range(a0b0)]),
        np.stack([np.array([0, 0, 0]) for _ in range(900 - a0b0)]), # these are the failures
        np.stack([np.array([0, 1, 1]) for _ in range(a0b1)]),
        np.stack([np.array([0, 1, 0]) for _ in range(900 - a0b1)]),
        np.stack([np.array([0, 2, 1]) for _ in range(a0b2)]),
        np.stack([np.array([0, 2, 0]) for _ in range(900 - a0b2)]),
        np.stack([np.array([1, 0, 1]) for _ in range(a1b0)]),
        np.stack([np.array([1, 0, 0]) for _ in range(900 - a1b0)]),
        np.stack([np.array([1, 1, 1]) for _ in range(a1b1)]),
        np.stack([np.array([1, 1, 0]) for _ in range(900 - a1b1)]),
        np.stack([np.array([1, 2, 1]) for _ in range(a1b2)]),
        np.stack([np.array([1, 2, 0]) for _ in range(900 - a1b2)]),
    )
)

a_event_uniques, a_event_idx = np.unique(d[:, 0], return_inverse=True)
b_event_uniques, b_event_idx = np.unique(d[:, 1], return_inverse=True)
y_event_counts = d[:, 2]

coords = {
    "a_event": a_event_uniques,
    "b_event": b_event_uniques,
    "n_obs": np.arange(len(y_event_counts))
}

with pm.Model(coords=coords) as model:
    covariate_a = pm.Data("covariate_a", a_event_idx, dims="n_obs")
    covariate_b = pm.Data("covariate_b", b_event_idx, dims="n_obs")

    joint_prob = pm.Beta("joint_prob", alpha=2, beta=2, dims=("a_event", "b_event"))
    pm.Bernoulli("likelihood", p=joint_prob[covariate_a, covariate_b], observed=y_event_counts, dims="n_obs")

with model:
    idata = pm.sample()

# Get the marginals
a0_marginal = idata.posterior['joint_prob'].sel(a_event=0).mean(('b_event'))
b0_marginal = idata.posterior['joint_prob'].sel(b_event=0).mean(('a_event'))

# Difference between the joint and the product of marginals (should be close to zero if independent)
difference_between_joint_and_prod_marginal = idata.posterior['joint_prob'].sel(a_event=0, b_event=0) - (a0_marginal * b0_marginal)

# Let's say that they are independent if most of the posterior mass is inside 0% to 5%.
az.plot_posterior(difference_between_joint_and_prod_marginal, rope=[0, 0.05])

You’ll get something like this (the az.plot_posterior output for the A=0, B=0 difference, with the ROPE marked):

Here you can see that, for the particular events A=0 and B=0, with a ROPE spanning 0 to 0.05 (which may be too wide), we would say the events are independent, because almost all of the posterior mass lies inside the region.

Thanks for this reply and taking the time to set it out via model code.

If I understand your reply correctly then:

  • Modelled probabilities are the joint_prob, i.e. the 2x3 joint cell probabilities P(A=a,B=b) (or n x m in my more general case).
  • You then aggregate these in a row-wise or column-wise sense to obtain marginal probabilities P(A=a) and P(B=b).
  • If pairs of A and B were independent then the following would hold true: P(A=a,B=b) = P(A=a)P(B=b)
  • Therefore you evaluate P(A=a,B=b) - P(A=a).P(B=b) and inspect the posterior of this difference; if it straddles zero, or has most of its density within the ROPE, then we can conclude that A=a and B=b are independent (or rather, not reject the null hypothesis that they are independent).
  • And we can conclude that A and B are independent only if this holds true across all possible pairings (a, b); see the loop sketched below.
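
If so, I guess the check over all pairings would look roughly like this, reusing idata and the mean-based marginals from your code above (the [0, 0.05] ROPE is just carried over from your example):

# Repeat the marginal / difference check for every (a, b) cell
posterior = idata.posterior["joint_prob"]

for a in posterior.coords["a_event"].values:
    for b in posterior.coords["b_event"].values:
        a_marginal = posterior.sel(a_event=a).mean("b_event")
        b_marginal = posterior.sel(b_event=b).mean("a_event")
        diff = posterior.sel(a_event=a, b_event=b) - a_marginal * b_marginal
        # fraction of posterior mass inside the ROPE
        in_rope = ((diff >= 0) & (diff <= 0.05)).mean().item()
        print(f"A={a}, B={b}: {in_rope:.1%} of posterior mass in ROPE [0, 0.05]")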

Is that the correct interpretation? Or have I misunderstood? I will give this a go at any rate, thanks!

Yes, this is exactly what I was trying to convey! I apologize, I should’ve been more clear.

You are very welcome. I hope this helps!


Why not just use the standard frequentist test?

You can see what Andrew Gelman has to say about this situation in the blog post:

What is the Bayesian counterpart to the so-called Fisher exact test?

The blog to which I link in my comment to Gelman’s post is gone, but it’s on the Wayback Machine:

I can’t believe I used to enjoy R! That was before I had to try to use it for real work.

Converting to Bayes literally is impossible, because hypothesis testing is by definition a frequentist notion: a point null hypothesis (like exact independence here) typically has zero probability in a Bayesian analysis.

An alternative approach would be to build multiple models and compare them using cross-validation or something similar. This generalizes to other cases.
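
For the table in the original post, a rough sketch of that idea in PyMC might look like the following: an independence model versus a saturated model, each fit to the long-format data and compared with PSIS-LOO via az.compare (the flat Dirichlet priors are placeholder choices):

import arviz as az
import numpy as np
import pymc as pm

# Observed counts from the original post, expanded to one categorical
# outcome per observation: cell index = a * 3 + b
counts = np.array([[33, 343, 30], [64, 360, 70]])
cell_idx = np.repeat(np.arange(6), counts.flatten())

with pm.Model() as independent_model:
    p_a = pm.Dirichlet("p_a", a=np.ones(2))
    p_b = pm.Dirichlet("p_b", a=np.ones(3))
    p_cell = (p_a[:, None] * p_b[None, :]).flatten()
    pm.Categorical("obs", p=p_cell, observed=cell_idx)
    idata_ind = pm.sample(idata_kwargs={"log_likelihood": True})

with pm.Model() as saturated_model:
    p_cell = pm.Dirichlet("p_cell", a=np.ones(6))
    pm.Categorical("obs", p=p_cell, observed=cell_idx)
    idata_sat = pm.sample(idata_kwargs={"log_likelihood": True})

az.compare({"independent": idata_ind, "saturated": idata_sat})

If the saturated model comes out clearly ahead, that is evidence against independence; if the two are indistinguishable, the simpler independence model is adequate.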


Thanks so much! I will look that up (and also likely purchase his 3rd edition book - looks good)

There’s a free pdf with the publisher’s permission on BDA3’s home page:

https://sites.stat.columbia.edu/gelman/book/

There’s also a free pdf with the publisher’s permission of Regression and Other Stories, Gelman, Hill, and Vehtari’s expanded regression book (fifth bullet down—the home page buries the lead):

This expands and updates the first half of Gelman and Hill’s multilevel regression book (if you want the 500 page abridged version!). We’re all waiting for part II, which will focus on hierarchical models.
