For each $S \in \mathscr{P}(T)$, the kernel $K_S: \Omega \times \mathscr{H} \rightarrow [0,1]$ extends from the component kernels as follows:
Intuition: Sub-$\sigma$-algebras capture partial information from subsets of environments. These properties ensure our hierarchical structure is well-behaved and consistent across different environment combinations. The following propositions formalize these characteristics.
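Let $(\Omega, \mathscr{H}, \mathbb{P}, \mathbb{K})$ be a product causal space and let $\mathscr{H}_S$ be the sub-$\sigma$-algebra associated with $S \subseteq T$. Then (i) $\mathscr{H}_S \subseteq \mathscr{H}$; (ii) if $S_1 \subseteq S_2 \subseteq T$, then $\mathscr{H}_{S_1} \subseteq \mathscr{H}_{S_2}$; and (iii) $\mathscr{H}_T = \mathscr{H}$.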
(i) By construction, $\mathscr{H}_S$ is generated by the measurable rectangles $A_i \times A_j$, with $A_i \in \mathscr{H}_{e_i}$ and $A_j \in \mathscr{H}_{e_j}$, that correspond to events in the time indices $S$. The product $\sigma$-algebra $\mathscr{H} = \mathscr{H}_{e_i} \otimes \mathscr{H}_{e_j}$ contains every measurable rectangle, and $\mathscr{H}_S$ is the smallest $\sigma$-algebra containing its generators, so $\mathscr{H}_S \subseteq \mathscr{H}$.
(ii) Let $S_1 \subseteq S_2 \subseteq T$. Every measurable rectangle generating $\mathscr{H}_{S_1}$ corresponds to events in time indices from $S_1$, and since $S_1 \subseteq S_2$, it also belongs to the generating set of $\mathscr{H}_{S_2}$. Thus $\mathscr{H}_{S_2}$ is a $\sigma$-algebra containing the generators of $\mathscr{H}_{S_1}$, and because $\mathscr{H}_{S_1}$ is the smallest such $\sigma$-algebra, $\mathscr{H}_{S_1} \subseteq \mathscr{H}_{S_2}$.
(iii) When $S = T$, the generating rectangles of $\mathscr{H}_S$ include all possible measurable rectangles from $\mathscr{H}_{e_i}$ and $\mathscr{H}_{e_j}$ that can be formed from the complete set of time indices. These rectangles generate $\mathscr{H} = \mathscr{H}_{e_i} \otimes \mathscr{H}_{e_j}$, so $\mathscr{H}_T = \mathscr{H}$.
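The three properties can be checked by brute force on a small finite example. The sketch below is illustrative only: it assumes a toy space of three binary components indexed by $T = \{0, 1, 2\}$ and takes $\mathscr{H}_S$ to be the events determined by the coordinates in $S$. The names `OMEGA`, `atoms`, and `sigma_algebra` are hypothetical helpers, not part of the construction above.

```python
from itertools import combinations, product

# Toy setup (assumption): three binary components indexed by T = {0, 1, 2}.
T = (0, 1, 2)
OMEGA = list(product([0, 1], repeat=len(T)))

def atoms(S):
    """Partition OMEGA into blocks of outcomes that agree on the coordinates in S."""
    groups = {}
    for w in OMEGA:
        key = tuple(w[i] for i in sorted(S))
        groups.setdefault(key, set()).add(w)
    return [frozenset(g) for g in groups.values()]

def sigma_algebra(S):
    """On a finite space, H_S is the collection of all unions of atoms of S."""
    parts = atoms(S)
    events = set()
    for r in range(len(parts) + 1):
        for combo in combinations(parts, r):
            events.add(frozenset().union(*combo))
    return events

H_0 = sigma_algebra({0})
H_01 = sigma_algebra({0, 1})
H_T = sigma_algebra(T)

# (i) and (ii): the sub-sigma-algebras are nested along S_1 <= S_2 <= T.
assert H_0 <= H_01 <= H_T
# (iii): H_T contains every subset of OMEGA, i.e. it is the full sigma-algebra.
assert len(H_T) == 2 ** len(OMEGA)
```

In the finite case each $\mathscr{H}_S$ is determined by a partition of $\Omega$, so the nesting in (ii) corresponds to the partition for $S_2$ refining the partition for $S_1$.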
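For any $S \subseteq T$, the restriction $\mathbb{P}|_{\mathscr{H}_S}$ of the product probability measure to $\mathscr{H}_S$ is a well-defined probability measure, and for $S_1 \subseteq S_2 \subseteq T$ the restrictions are consistent:
$$\mathbb{P}|_{\mathscr{H}_{S_2}}(A) = \mathbb{P}|_{\mathscr{H}_{S_1}}(A) \quad \text{for all } A \in \mathscr{H}_{S_1}.$$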
First, we establish that $\mathbb{P}|_{\mathscr{H}_S}$ is a well-defined probability measure. $\mathscr{H}_S$ is a $\sigma$-algebra by construction, $\mathbb{P}|_{\mathscr{H}_S}(A) = \mathbb{P}(A)$ for all $A \in \mathscr{H}_S$, $\mathbb{P}|_{\mathscr{H}_S}(\Omega) = \mathbb{P}(\Omega) = 1$, and $\mathbb{P}|_{\mathscr{H}_S}$ inherits countable additivity from $\mathbb{P}$.
To prove the consistency of the restrictions for $S_1 \subseteq S_2 \subseteq T$, let $A \in \mathscr{H}_{S_1}$. Since $S_1 \subseteq S_2$, by the previous proposition, we have $A \in \mathscr{H}_{S_2}$.
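For any measurable rectangle $A = A_i \times A_j$ where $A_i \in \mathscr{H}_{e_i}$ and $A_j \in \mathscr{H}_{e_j}$ corresponding to events in time indices $S_1$:
$$\mathbb{P}|_{\mathscr{H}_{S_1}}(A) = \mathbb{P}(A_i \times A_j) = \mathbb{P}|_{\mathscr{H}_{S_2}}(A),$$
since both restrictions agree with the product measure $\mathbb{P}$ on their domains. The measurable rectangles form a $\pi$-system generating $\mathscr{H}_{S_1}$, so this equality extends to all sets in $\mathscr{H}_{S_1}$ by the uniqueness of measure extension.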
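For $S_1 \subseteq S_2 \subseteq T$ and any integrable $\mathscr{H}$-measurable random variable $X$:
$$\mathbb{E}\big[\mathbb{E}[X \mid \mathscr{H}_{S_2}] \mid \mathscr{H}_{S_1}\big] = \mathbb{E}[X \mid \mathscr{H}_{S_1}] \quad \text{almost surely.}$$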
Let $S_1 \subseteq S_2 \subseteq T$ and let $X$ be any $\mathscr{H}$-measurable random variable. By the previous proposition, we have $\mathscr{H}_{S_1} \subseteq \mathscr{H}_{S_2}$. This nested relationship between the sub-$\sigma$-algebras is crucial for applying the tower property of conditional expectation.
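By the tower property of conditional expectation, for nested $\sigma$-algebras $\mathscr{G}_1 \subseteq \mathscr{G}_2 \subseteq \mathscr{F}$:
$$\mathbb{E}\big[\mathbb{E}[X \mid \mathscr{G}_2] \mid \mathscr{G}_1\big] = \mathbb{E}[X \mid \mathscr{G}_1] \quad \text{almost surely.}$$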
Applying this to our sub-$\sigma$-algebras $\mathscr{H}_{S_1} \subseteq \mathscr{H}_{S_2} \subseteq \mathscr{H}$ gives the desired result.
Information-theoretic interpretation: Conditioning on a larger $\sigma$-algebra ($\mathscr{H}_{S_2}$) provides more refined information than conditioning on a smaller one ($\mathscr{H}_{S_1}$). The tower property shows that the expected value of this refined information, when further conditioned on the smaller $\sigma$-algebra, equals the direct conditioning on the smaller $\sigma$-algebra.
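The tower property can also be verified numerically on a finite product space. The following is a minimal sketch under stated assumptions: three independent binary components with arbitrarily chosen marginals, $\mathscr{H}_S$ taken as the events determined by the coordinates in $S$, and an arbitrary random variable `X`. All names (`MARGINALS`, `prob`, `cond_exp`) are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Toy product space (assumption): three independent binary components.
OMEGA = list(product([0, 1], repeat=3))
# Assumed marginals: MARGINALS[i] = P(coordinate i equals 1).
MARGINALS = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]

def prob(w):
    """Product-measure probability of a single outcome w."""
    p = Fraction(1)
    for i, bit in enumerate(w):
        p *= MARGINALS[i] if bit else 1 - MARGINALS[i]
    return p

def cond_exp(X, S):
    """E[X | H_S] as a dict outcome -> value; it is constant on the atoms of H_S."""
    idx = sorted(S)
    num, den = {}, {}
    for w in OMEGA:
        key = tuple(w[i] for i in idx)
        num[key] = num.get(key, 0) + X[w] * prob(w)
        den[key] = den.get(key, 0) + prob(w)
    return {w: num[tuple(w[i] for i in idx)] / den[tuple(w[i] for i in idx)]
            for w in OMEGA}

# An arbitrary random variable that depends on all three coordinates.
X = {w: Fraction(w[0] + 2 * w[1] * w[2] + 3 * w[2]) for w in OMEGA}

S1, S2 = {0}, {0, 1}  # nested index sets, S1 <= S2
lhs = cond_exp(cond_exp(X, S2), S1)  # E[ E[X | H_{S2}] | H_{S1} ]
rhs = cond_exp(X, S1)                # E[X | H_{S1}]
assert lhs == rhs
```

Exact rational arithmetic via `Fraction` is used so that the two sides can be compared with `==` rather than up to floating-point tolerance.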
For a causal space $(\Omega, \mathscr{H}, \mathbb{P}, \mathbb{K})$, the relationship between causal kernels and conditional probabilities is characterized as follows:
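For a product causal space with kernel $K_S$ and any measurable event $A \in \mathscr{H}$: if $A$ is a causal event, then $K_S(\omega, A) \neq \mathbb{P}(A)$ for some $\omega \in \Omega$; if $A$ is an anti-causal event, then
$$K_S(\omega, A) = K_{S \setminus U}(\omega, A) \quad \text{for all } \omega \in \Omega,$$
where $U \subseteq S \in \mathscr{P}(T)$.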
Note: In our setting, an intervention is a measurable mapping $\mathbb{Q}(\cdot|\cdot): \mathscr{H} \times \Omega \rightarrow [0,1]$. A hard intervention is $\mathbb{Q}(A|\omega') = P(X \in A \mid do(Y=y'))$, and a soft intervention is $\mathbb{Q}(A|\omega') = P(X \in A \mid Y=y', E \in S)$, where $y'$ denotes the $Y$-component of $\omega'$.
We establish the distinct properties of causal and anti-causal events through their behavior under the causal kernel.
Part 1: Causal events
Let $A$ be causally dependent on variables in $\mathscr{H}_S$. By definition of causal dependence, there exist $\omega, \omega' \in \Omega$ such that $K_S(\omega, A) \neq K_S(\omega', A)$. The quantity $\mathbb{P}(A) = \int_{\Omega} K_S(\omega, A) \, d\mathbb{P}(\omega)$ is a fixed constant, namely a weighted average of the kernel values over all $\omega$. Because the two kernel values above differ, at most one of them can equal this constant, so $K_S(\omega, A) = \mathbb{P}(A)$ cannot hold for all $\omega \in \Omega$. Hence $K_S(\omega, A) \neq \mathbb{P}(A)$ for some $\omega$, which is consistent with the causal structure of $A$.
Part 2: Anti-causal events
Let $A \in \mathscr{H}$ be an anti-causal event, and let $U \subseteq S \in \mathscr{P}(T)$. We need to show that $K_S(\omega, A) = K_{S \setminus U}(\omega, A)$ for all $\omega \in \Omega$.
By definition, an anti-causal event is one whose probability is invariant to certain interventions. Specifically, removing a subset $U$ from the conditioning information $S$ does not change the kernel's value if $A$ is anti-causal with respect to $U$.
From the definition of causal kernels:
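For an anti-causal event $A$, the information in $\mathscr{H}_U$ (corresponding to indices in $U$) has no causal influence on $A$ when conditioning on $\mathscr{H}_{S \setminus U}$. Formally, this means that $A$ is conditionally independent of $\mathscr{H}_U$ given $\mathscr{H}_{S \setminus U}$:
$$A \perp\!\!\!\perp \mathscr{H}_U \mid \mathscr{H}_{S \setminus U}.$$
By the properties of conditional expectation under conditional independence:
$$\mathbb{E}[\mathbf{1}_A \mid \mathscr{H}_S] = \mathbb{E}[\mathbf{1}_A \mid \mathscr{H}_{S \setminus U}] \quad \text{almost surely.}$$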
Therefore: $K_S(\omega, A) = K_{S \setminus U}(\omega, A)$ for all $\omega \in \Omega$. This equality demonstrates that anti-causal events exhibit invariance with respect to certain subsets of the conditioning information, reflecting their position in the causal structure.
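The conditional-independence step in Part 2 can be illustrated on a finite product space. The sketch below is not the causal kernel $K_S$ itself: as a simplifying assumption it uses ordinary conditional probabilities under a product measure as a stand-in, with three independent binary components and hypothetical helper names (`prob`, `cond_prob`). It shows that dropping $U$ from the conditioning set leaves the conditional probability of an event unchanged when the event does not depend on the coordinates in $U$, and changes it otherwise.

```python
from fractions import Fraction
from itertools import product

# Toy product space (assumption): three independent binary components.
OMEGA = list(product([0, 1], repeat=3))
MARGINALS = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]  # assumed P(coord i = 1)

def prob(w):
    """Product-measure probability of a single outcome w."""
    p = Fraction(1)
    for i, bit in enumerate(w):
        p *= MARGINALS[i] if bit else 1 - MARGINALS[i]
    return p

def cond_prob(event, S):
    """P(event | H_S) as a dict outcome -> value (stand-in for the kernel, by assumption)."""
    idx = sorted(S)
    num, den = {}, {}
    for w in OMEGA:
        key = tuple(w[i] for i in idx)
        num[key] = num.get(key, 0) + (prob(w) if w in event else 0)
        den[key] = den.get(key, 0) + prob(w)
    return {w: num[tuple(w[i] for i in idx)] / den[tuple(w[i] for i in idx)]
            for w in OMEGA}

S, U = {0, 1}, {1}

# Event A ignores coordinate 1, so it is independent of H_U given H_{S \ U}.
A = {w for w in OMEGA if w[0] == 1 or w[2] == 1}
invariant = cond_prob(A, S) == cond_prob(A, S - U)

# Event B depends on coordinate 1, so removing U changes the conditional probability.
B = {w for w in OMEGA if w[1] == 1}
changed = cond_prob(B, S) != cond_prob(B, S - U)

assert invariant and changed
```

The contrast between `A` and `B` mirrors the distinction drawn in the proof: invariance under removal of $U$ holds only for events on which $\mathscr{H}_U$ has no influence once $\mathscr{H}_{S \setminus U}$ is known.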