A probability space $(\Omega, \mathscr{F}, \mu)$ consists of a sample space $\Omega$, a $\sigma$-algebra $\mathscr{F}$ of measurable sets, and a probability measure $\mu$. In the causal context, $\Omega$ represents the possible states of the world, $\mathscr{F}$ represents the events we can measure, and $\mu$ assigns probabilities to these events. The main notation used in this paper is summarized in a table in the Appendix.
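As a concrete illustration, the three ingredients can be written out for a finite sample space. The sketch below is only an assumed toy example (two coin flips with uniform point masses); the names `omega`, `power_set`, `mass`, and `mu` are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical finite sample space: the outcomes of two coin flips.
omega = ["HH", "HT", "TH", "TT"]

def power_set(points):
    """All subsets of a finite set; this is the largest sigma-algebra on it."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(points, r) for r in range(len(points) + 1))]

sigma_algebra = power_set(omega)

# Uniform point masses (an assumption made for illustration only).
mass = {w: Fraction(1, 4) for w in omega}

def mu(event):
    """Probability of an event, obtained by summing its point masses."""
    return sum((mass[w] for w in event), Fraction(0))

first_heads = frozenset({"HH", "HT"})
print(first_heads in sigma_algebra)  # True: the event is measurable
print(mu(first_heads))               # 1/2
```

On a finite space the power set is always a valid choice of $\mathscr{F}$; on uncountable spaces a smaller $\sigma$-algebra is typically needed, which this sketch does not attempt to model.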
Given a finite set of environments $\mathcal{E}$, each $e \in \mathcal{E}$ is associated with a measurable input space $(\mathcal{X}_e, \mathscr{F}_{\mathcal{X}_e})$, a measurable output space $(\mathcal{Y}_e, \mathscr{F}_{\mathcal{Y}_e})$, and a probability measure $P_e$ on the product space $(\mathcal{X}_e \times \mathcal{Y}_e, \mathscr{F}_{\mathcal{X}_e} \otimes \mathscr{F}_{\mathcal{Y}_e})$.
For each environment $e \in \mathcal{E}$, the data space is a tuple $(D_{e}, \mathscr{F}_{D_{e}}, p_e)$, where $D_{e} = \{(x^{e}_j, y^{e}_j)\}_{j=1}^{|D_{e}|}$ is a finite collection of input-output pairs from environment $e$ with $x^{e}_j \in \mathcal{X}_{e}$ and $y^{e}_j \in \mathcal{Y}_{e}$, $\mathscr{F}_{D_{e}}$ is a $\sigma$-algebra on $D_{e}$, and $p_{e}$ is a probability measure on $D_{e}$ defining the distribution of $(x^{e}_j, y^{e}_j)$. Separately, $T_{e}$ denotes the index set that defines the component-wise structure of the sample space for environment $e$: it indexes the components of the product space, so that $\Omega_{e} = \times_{t \in T_{e}} E_t$, where each $E_t$ is a measurable component space at index $t$.
A representation is a measurable function $\phi: \mathcal{X} \rightarrow \mathcal{R}$ mapping inputs to a latent space $\mathcal{R}$, where $(\mathcal{R}, \mathscr{F}_{\mathcal{R}})$ is a measurable space. A representation is causal if it captures the underlying causal mechanisms generating the data.
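A kernel $K$ is a function $K: \Omega \times \mathscr{F} \rightarrow [0,1]$ such that (i) for each fixed $A \in \mathscr{F}$, the map $\omega \mapsto K(\omega, A)$ is measurable, and (ii) for each fixed $\omega \in \Omega$, the map $A \mapsto K(\omega, A)$ is a probability measure on $(\Omega, \mathscr{F})$.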
Intuitively, $K(\omega, A)$ represents the probability of $A$ conditioned on the information encoded in $\omega$. Properties of kernels being used in this work are discussed in the Appendix.
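On a finite sample space a kernel reduces to a table with one probability distribution per state $\omega$. The following minimal sketch assumes a hypothetical two-state space and made-up transition probabilities; `K` and `kernel` are illustrative names, not objects defined in the paper.

```python
from fractions import Fraction

# Hypothetical two-point sample space, chosen only for illustration.
omega = ["rain", "sun"]

# K[w] is a probability distribution over omega, so for each fixed w
# the map A -> kernel(w, A) is a probability measure.
K = {
    "rain": {"rain": Fraction(7, 10), "sun": Fraction(3, 10)},
    "sun": {"rain": Fraction(2, 10), "sun": Fraction(8, 10)},
}

def kernel(w, event):
    """K(w, A): probability of event A given the information encoded in w."""
    return sum((K[w][v] for v in event), Fraction(0))

print(kernel("rain", {"sun"}))          # 3/10
print(kernel("sun", {"rain", "sun"}))   # 1
```

Because every subset of a finite space is measurable, measurability of $\omega \mapsto K(\omega, A)$ holds automatically here; the sketch only makes the "one probability measure per $\omega$" reading explicit.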
In the measure-theoretic framework, interventions modify kernels rather than structural equations, enabling unified treatment of both perfect and imperfect interventions.
An intervention is a measurable mapping $\mathbb{Q}(\cdot|\cdot): \mathscr{H} \times \Omega \rightarrow [0,1]$ that modifies causal kernels by modifying the underlying probability structure. There are two types of intervention in causal representation learning:
This work builds on a precise understanding of causal dependence and causal spaces.
Variables $X$ and $Y$ are causally independent given $Z$, denoted $X \perp\!\!\!\perp_c Y | Z$, if $P(Y|do(X=x), Z) = P(Y|Z)$ for all $x$ in the support of $X$, and $P(X|do(Y=y), Z) = P(X|Z)$ for all $y$ in the support of $Y$. The do-operator $do(X=x)$ represents an intervention that sets the variable $X$ to the value $x$, thereby severing the influence of all causes of $X$. (Note: $P(Y|do(X=x))$ differs from $P(Y|X=x)$, which conditions on observing $X=x$ while leaving all causal relationships intact.)
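The gap between $P(Y|do(X=x))$ and $P(Y|X=x)$ can be checked by hand on a small model. The sketch below assumes a hypothetical binary model in which a common cause $U$ drives both $X$ and $Y$, and $Y$ has no dependence on $X$; the model, its probabilities, and the function names `observe` and `intervene` are assumptions made for illustration, with the conditioning set $Z$ taken to be empty.

```python
from fractions import Fraction as F

# Hypothetical structural model (an assumption, not taken from the paper):
#   U ~ Bernoulli(1/2) is a common cause; X and Y each depend on U only.
p_u = {0: F(1, 2), 1: F(1, 2)}
p_x1_given_u = {0: F(1, 10), 1: F(9, 10)}   # P(X=1 | U=u)
p_y1_given_u = {0: F(1, 5), 1: F(4, 5)}     # P(Y=1 | U=u)

def p_x_given_u(x, u):
    """P(X=x | U=u) for binary x."""
    return p_x1_given_u[u] if x == 1 else 1 - p_x1_given_u[u]

def observe(x):
    """P(Y=1 | X=x): conditioning on X=x also updates beliefs about U."""
    joint = sum(p_u[u] * p_x_given_u(x, u) * p_y1_given_u[u] for u in p_u)
    marginal = sum(p_u[u] * p_x_given_u(x, u) for u in p_u)
    return joint / marginal

def intervene(x):
    """P(Y=1 | do(X=x)): X is set by hand, so U keeps its prior.

    In this model Y does not depend on X, so the result ignores x.
    """
    return sum(p_u[u] * p_y1_given_u[u] for u in p_u)

print(observe(1), observe(0))      # 37/50 13/50
print(intervene(1), intervene(0))  # 1/2 1/2
```

In this toy model $X$ and $Y$ are statistically dependent, since observing $X$ shifts the distribution of $Y$, yet intervening on $X$ leaves the distribution of $Y$ unchanged, which is the first of the two conditions in the definition of causal independence.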
For an environment $e$, a causal space is a tuple $(\Omega_{e}, \mathscr{H}_{e}, P_{e}, K_{e})$, where $\Omega_{e} = \times_{t \in T_{e}} E_t$ is the sample space, $P_e$ is the probability measure on $(\Omega_{e}, \mathscr{H}_{e})$, and $K_{e}$ is a kernel function for environment $e$. For each $t \in T_e$, $\mathscr{A}_t$ is the $\sigma$-algebra on component space $E_t$, and the overall $\sigma$-algebra $\mathscr{H}_e = \otimes_{t \in T_e} \mathscr{A}_t$ is the tensor product of these component $\sigma$-algebras. This definition is the backbone of measure-theoretic causality in this paper.
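To make the product structure $\Omega_{e} = \times_{t \in T_{e}} E_t$ tangible, the sketch below enumerates a small product sample space and one measurable rectangle; the product $\sigma$-algebra is the one generated by such rectangles. The index set `T`, the component spaces `E`, and the helper `rectangle` are hypothetical choices for illustration only.

```python
from itertools import product

# Hypothetical index set and finite component spaces E_t.
T = ["t1", "t2"]
E = {"t1": [0, 1], "t2": ["a", "b", "c"]}

# Omega is the product of the E_t; each outcome is a dict keyed by index t.
omega = [dict(zip(T, values)) for values in product(*(E[t] for t in T))]

def rectangle(sides):
    """Measurable rectangle: outcomes whose t-th coordinate lies in sides[t]."""
    return [w for w in omega if all(w[t] in sides[t] for t in T)]

rect = rectangle({"t1": {1}, "t2": {"a", "b"}})
print(len(omega), len(rect))  # 6 2
```

An event that constrains only some coordinates, for example `rectangle({"t1": {1}, "t2": set(E["t2"])})`, corresponds to an event in the sub-$\sigma$-algebra associated with those indices.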