We formalize the anti-causal representation learning problem as follows: given a causal structure where label $Y$ causes the observation $X$ and environment $E$ also influences $X$ ($Y \rightarrow X \leftarrow E$), our goal is to learn representations that capture the causal generative invariant from $Y$ to $X$.
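As a concrete illustration of this anti-causal generating process, the following toy sketch (a hypothetical example, not the paper's actual data model) samples the label first and then generates the observation from both the label and an environment-specific shift, so the $Y \rightarrow X$ mechanism is invariant while $E$'s contribution varies across environments:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, env_shift):
    """Toy anti-causal generator: Y and E jointly cause X (Y -> X <- E)."""
    y = rng.integers(0, 2, size=n)             # label Y is drawn first
    # X is generated FROM the label (anti-causal direction), with an
    # additive environment-specific effect and observation noise.
    x = 2.0 * y + env_shift + rng.normal(0.0, 0.5, size=n)
    return x, y

# Two environments with different shifts: the Y -> X mechanism
# (coefficient 2.0) is shared, while the environment offset differs.
x1, y1 = sample(1000, env_shift=0.0)
x2, y2 = sample(1000, env_shift=3.0)
```

Here the class-conditional gap $\mathbb{E}[X \mid Y{=}1] - \mathbb{E}[X \mid Y{=}0]$ is the same in both environments, which is exactly the kind of invariant the learned representations should capture.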
Given observations from environments $\mathcal{E} = \{e_i\}_{i=1}^n$ with corresponding datasets $\mathcal{D} = \{D_{e_i}\}_{i=1}^n$, we aim to learn two-level representations: a low-level encoder $\phi_L: \mathcal{X} \rightarrow \mathcal{Z}_L$ and, on top of it, a high-level encoder $\phi_H: \mathcal{Z}_L \rightarrow \mathcal{Z}_H$.
The predictor $\mathcal{C}: \mathcal{Z}_H \rightarrow \mathcal{Y}$ then maps high-level representations to labels. With a loss function $\ell: \mathcal{Y} \times \mathcal{Y} \rightarrow \mathbb{R}_+$ defined across all environments $\mathcal{E}$, the full model $f = \mathcal{C} \circ \phi_H \circ \phi_L$ can be trained in an end-to-end fashion.
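A minimal sketch of this composition and the multi-environment training objective (hypothetical linear maps and a squared loss, chosen purely for illustration):

```python
import numpy as np

# Hypothetical linear maps standing in for phi_L, phi_H, and C.
W_L = np.array([[1.0, 0.5]])      # phi_L: X (2-d) -> Z_L (1-d)
W_H = np.array([[2.0]])           # phi_H: Z_L -> Z_H
w_C = np.array([[0.5]])           # C:    Z_H -> predicted label

def f(x):
    """Full model f = C ∘ phi_H ∘ phi_L applied to a batch of observations."""
    z_l = x @ W_L.T               # low-level representation Z_L
    z_h = z_l @ W_H.T             # high-level representation Z_H
    return z_h @ w_C.T            # prediction in label space

def risk(datasets):
    """Average squared loss over all environments' datasets."""
    losses = [np.mean((f(x) - y) ** 2) for x, y in datasets]
    return float(np.mean(losses))
```

In practice the encoders and predictor would be neural networks trained jointly by gradient descent on this averaged risk plus the regularizers introduced below.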
More specifically, we introduce causal dynamics (Theorem 3) to facilitate learning low-level representations $\mathcal{Z}_L$ by jointly optimizing the loss with a causal structure consistency regularizer ($R_2$); minimizing this regularizer encourages the low-level representations to align with the true causal mechanisms underlying the data.
On top of $\mathcal{Z}_L$, we further introduce causal abstraction (Theorem 4) to learn high-level representations, guided by another environment independence regularizer ($R_1$). This regularizer measures the discrepancy between the expected high-level representations across environments conditioned on the label $Y$. Minimizing it can remove environment-specific information while retaining label-relevant causal features.
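One plausible instantiation of $R_1$, sketched under the assumption that the discrepancy is measured as a squared distance between per-environment, label-conditional means of $\mathcal{Z}_H$ (the paper may use a different discrepancy measure):

```python
import numpy as np
from itertools import combinations

def r1(env_reps, env_labels):
    """Environment-independence penalty: squared distance between the
    label-conditional means of high-level representations Z_H, summed
    over all environment pairs and all labels.

    env_reps:   list of (n_i, d) arrays of high-level representations
    env_labels: list of (n_i,) arrays of labels (each label must occur
                in every environment for the conditional mean to exist)
    """
    labels = np.unique(np.concatenate(env_labels))
    penalty = 0.0
    for i, j in combinations(range(len(env_reps)), 2):
        for y in labels:
            mu_i = env_reps[i][env_labels[i] == y].mean(axis=0)
            mu_j = env_reps[j][env_labels[j] == y].mean(axis=0)
            penalty += float(np.sum((mu_i - mu_j) ** 2))
    return penalty
```

When the conditional means of $\mathcal{Z}_H$ given $Y$ agree across environments the penalty is zero, so driving it down pushes the high-level representation toward environment independence given the label.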
This hierarchical structure enables two procedures: