Properties

Properties of Kernel

For each $S \in \mathscr{P}(T)$, the kernel $K_S: \Omega \times \mathscr{H} \rightarrow [0,1]$ extends from the component kernels as follows:

For measurable rectangles $A_i \times A_j$ with $A_i \in \mathscr{H}_{e_i}$ and $A_j \in \mathscr{H}_{e_j}$, and $\omega = (\omega_i, \omega_j) \in \Omega$:
$$K_S(\omega, A_i \times A_j) = K_{e_i}(\omega_i, A_i) \cdot K_{e_j}(\omega_j, A_j)$$
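For general measurable sets $A \in \mathscr{H}$, the rectangle formula extends uniquely to a measure $K_S(\omega, \cdot)$ on $\mathscr{H}$ by the Carathéodory extension theorem, and the resulting kernel is identified with the conditional expectation: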
$$K_S(\omega, A) = \mathbb{E}[\mathbf{1}_A \mid \mathscr{H}_S](\omega)$$
where $\mathscr{H}_S$ is the sub-$\sigma$-algebra corresponding to indices in $S$.
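
To make the rectangle formula concrete, here is a minimal sketch on a finite toy space. The two-point component spaces, the kernel tables `K_ei` and `K_ej`, and the helper names are assumptions made for illustration and are not part of the construction above. On a finite space the extension to general sets reduces to summing point masses, so the check below only confirms that this sum factorizes on a rectangle.

```python
from fractions import Fraction as F

# Hypothetical component kernels on the two-point space {0, 1}.
# K[w][a] is the mass the kernel started at w assigns to the point a.
K_ei = {0: {0: F(3, 4), 1: F(1, 4)}, 1: {0: F(1, 3), 1: F(2, 3)}}
K_ej = {0: {0: F(1, 2), 1: F(1, 2)}, 1: {0: F(1, 5), 1: F(4, 5)}}

def component(K, w, A):
    """Component kernel K(w, A) for a subset A of {0, 1}."""
    return sum(K[w][a] for a in A)

def product_kernel(omega, A):
    """K_S(omega, A) for any subset A of the product space, by summing point masses."""
    wi, wj = omega
    return sum(K_ei[wi][a] * K_ej[wj][b] for (a, b) in A)

omega = (0, 1)
A_i, A_j = {1}, {0, 1}
rectangle = {(a, b) for a in A_i for b in A_j}

lhs = product_kernel(omega, rectangle)                          # K_S(omega, A_i x A_j)
rhs = component(K_ei, 0, A_i) * component(K_ej, 1, A_j)         # product of component kernels
total = product_kernel(omega, {(a, b) for a in (0, 1) for b in (0, 1)})
```

In this toy model `lhs` and `rhs` agree, and `total` equals 1, so $K_S(\omega, \cdot)$ is a probability measure for the chosen $\omega$.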

Properties of Product Causal Space Sub-$\sigma$-algebra

Intuition: Sub-$\sigma$-algebras capture partial information from subsets of environments. These properties ensure our hierarchical structure is well-behaved and consistent across different environment combinations. The following propositions formalize these characteristics.

Proposition (Properties of Sub-$\sigma$-algebras)

Let $(\Omega, \mathscr{H}, \mathbb{P}, \mathbb{K})$ be a product causal space and $\mathscr{H}_S$ be a sub-$\sigma$-algebra for $S \subseteq T$. Then:

(i) $\mathscr{H}_S \subseteq \mathscr{H}$ for all $S \subseteq T$
(ii) If $S_1 \subseteq S_2 \subseteq T$, then $\mathscr{H}_{S_1} \subseteq \mathscr{H}_{S_2}$
(iii) $\mathscr{H}_T = \mathscr{H}$

Proof

(i) By construction, $\mathscr{H}_S$ is generated by measurable rectangles $A_i \times A_j$ where $A_i \in \mathscr{H}_{e_i}$ and $A_j \in \mathscr{H}_{e_j}$ correspond to events in the time indices $S$. The product $\sigma$-algebra $\mathscr{H} = \mathscr{H}_{e_i} \otimes \mathscr{H}_{e_j}$ contains all measurable rectangles, and $\mathscr{H}_S$ is the smallest $\sigma$-algebra containing its generators, so $\mathscr{H}_S \subseteq \mathscr{H}$.

(ii) Let $S_1 \subseteq S_2 \subseteq T$. Any measurable rectangle generating $\mathscr{H}_{S_1}$ corresponds to events in time indices from $S_1$. Since $S_1 \subseteq S_2$, these same rectangles are also in the generating set of $\mathscr{H}_{S_2}$. Thus $\mathscr{H}_{S_2}$ is a $\sigma$-algebra containing every generator of $\mathscr{H}_{S_1}$, and because $\mathscr{H}_{S_1}$ is the smallest such $\sigma$-algebra, $\mathscr{H}_{S_1} \subseteq \mathscr{H}_{S_2}$.

(iii) When $S = T$, the generating rectangles of $\mathscr{H}_S$ include all possible measurable rectangles from $\mathscr{H}_{e_i}$ and $\mathscr{H}_{e_j}$ that can be formed from the complete set of time indices. These rectangles generate $\mathscr{H} = \mathscr{H}_{e_i} \otimes \mathscr{H}_{e_j}$, so $\mathscr{H}_T = \mathscr{H}$.
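
The three properties can be checked by brute force on a small finite model. The sketch below assumes a hypothetical index set `T = (0, 1, 2)` with binary coordinates and takes $\mathscr{H}_S$ to be the $\sigma$-algebra generated by the coordinates in $S$. This is one simple reading of the construction, chosen for illustration only.

```python
from itertools import combinations, product

T = (0, 1, 2)                                  # hypothetical index set
Omega = list(product((0, 1), repeat=len(T)))   # finite product sample space

def atoms(S):
    """Partition Omega into blocks on which the coordinates in S are constant."""
    groups = {}
    for w in Omega:
        key = tuple(w[t] for t in sorted(S))
        groups.setdefault(key, set()).add(w)
    return [frozenset(g) for g in groups.values()]

def sigma_algebra(S):
    """All unions of atoms: the sigma-algebra generated by the coordinates in S."""
    parts = atoms(S)
    return {
        frozenset().union(*combo)
        for r in range(len(parts) + 1)
        for combo in combinations(parts, r)
    }

H_1 = sigma_algebra({0})
H_2 = sigma_algebra({0, 1})
H_T = sigma_algebra(set(T))
power_set = {frozenset(c) for r in range(len(Omega) + 1) for c in combinations(Omega, r)}

nested = H_1 <= H_2          # property (ii) for S_1 = {0}, S_2 = {0, 1}
contained = H_2 <= H_T       # property (i), using H = H_T on this finite space
full = H_T == power_set      # property (iii)
```

Here $\mathscr{H}_{\{0\}}$ has 4 sets, $\mathscr{H}_{\{0,1\}}$ has 16, and $\mathscr{H}_T$ is the full power set with 256 sets.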

Proposition (Probability Measure Restriction)

For any $S \subseteq T$, the restriction $\mathbb{P}|_{\mathscr{H}_S}$ of the product probability measure to $\mathscr{H}_S$ is a well-defined probability measure, and for $S_1 \subseteq S_2 \subseteq T$:

$$\mathbb{P}|_{\mathscr{H}_{S_2}}(A) = \mathbb{P}|_{\mathscr{H}_{S_1}}(A) \text{ for all } A \in \mathscr{H}_{S_1}$$

Proof

First, we establish that $\mathbb{P}|_{\mathscr{H}_S}$ is a well-defined probability measure. $\mathscr{H}_S$ is a $\sigma$-algebra by construction, $\mathbb{P}|_{\mathscr{H}_S}(A) = \mathbb{P}(A)$ for all $A \in \mathscr{H}_S$, $\mathbb{P}|_{\mathscr{H}_S}(\Omega) = \mathbb{P}(\Omega) = 1$, and $\mathbb{P}|_{\mathscr{H}_S}$ inherits countable additivity from $\mathbb{P}$.

To prove the consistency of the restrictions for $S_1 \subseteq S_2 \subseteq T$, let $A \in \mathscr{H}_{S_1}$. Since $S_1 \subseteq S_2$, by the previous proposition, we have $A \in \mathscr{H}_{S_2}$.

For any measurable rectangle $A = A_i \times A_j$ where $A_i \in \mathscr{H}_{e_i}$ and $A_j \in \mathscr{H}_{e_j}$ corresponding to events in time indices $S_1$:

$$\mathbb{P}|_{\mathscr{H}_{S_1}}(A) = \mathbb{P}(A) = \mathbb{P}_{e_i}(A_i) \cdot \mathbb{P}_{e_j}(A_j) = \mathbb{P}|_{\mathscr{H}_{S_2}}(A)$$

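by definition of product measure. The measurable rectangles are closed under finite intersections and generate $\mathscr{H}_{S_1}$, so two probability measures that agree on them agree on all of $\mathscr{H}_{S_1}$ by the uniqueness of measure extension.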

Proposition (Monotonicity of Information)

For $S_1 \subseteq S_2 \subseteq T$ and any integrable $\mathscr{H}$-measurable random variable $X$, the following holds $\mathbb{P}$-almost surely:

$$\mathbb{E}[\mathbb{E}[X|\mathscr{H}_{S_2}]|\mathscr{H}_{S_1}] = \mathbb{E}[X|\mathscr{H}_{S_1}]$$

Proof

Let $S_1 \subseteq S_2 \subseteq T$ and let $X$ be any integrable $\mathscr{H}$-measurable random variable. By part (ii) of the proposition on properties of sub-$\sigma$-algebras, we have $\mathscr{H}_{S_1} \subseteq \mathscr{H}_{S_2}$. This nesting is exactly what the tower property of conditional expectation requires.

By the tower property of conditional expectation, for nested $\sigma$-algebras $\mathscr{G}_1 \subseteq \mathscr{G}_2 \subseteq \mathscr{F}$:

$$\mathbb{E}[\mathbb{E}[X|\mathscr{G}_2]|\mathscr{G}_1] = \mathbb{E}[X|\mathscr{G}_1]$$

Applying this to our sub-$\sigma$-algebras $\mathscr{H}_{S_1} \subseteq \mathscr{H}_{S_2} \subseteq \mathscr{H}$ gives the desired result.

Information-theoretic interpretation: Conditioning on a larger $\sigma$-algebra ($\mathscr{H}_{S_2}$) provides more refined information than conditioning on a smaller one ($\mathscr{H}_{S_1}$). The tower property shows that averaging the refined estimate $\mathbb{E}[X|\mathscr{H}_{S_2}]$ over the coarser information $\mathscr{H}_{S_1}$ recovers $\mathbb{E}[X|\mathscr{H}_{S_1}]$, the result of conditioning on the smaller $\sigma$-algebra directly.
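
The tower property can be verified exactly on a finite space, where conditional expectation is an average over atoms. The sketch below is a toy check under assumed data: the weights defining `P`, the random variable `X`, and the choice $S_1 = \{0\}$, $S_2 = \{0, 1\}$ are all hypothetical, and $\mathscr{H}_S$ is taken to be generated by the coordinates in $S$.

```python
from fractions import Fraction as F
from itertools import product

Omega = list(product((0, 1), repeat=3))
# Hypothetical probability weights 1/36, ..., 8/36 (not a product measure).
P = {w: F(k, 36) for w, k in zip(Omega, range(1, 9))}

def X(w):
    """An arbitrary random variable on Omega."""
    return w[0] + 2 * w[1] * w[2]

def cond_exp(f, S):
    """Return E[f | H_S] as a function on Omega (average of f over the atom containing w)."""
    def g(w):
        atom = [v for v in Omega if all(v[t] == w[t] for t in S)]
        mass = sum(P[v] for v in atom)
        return sum(f(v) * P[v] for v in atom) / mass
    return g

S1, S2 = {0}, {0, 1}
inner = cond_exp(X, S2)          # E[X | H_{S2}]
lhs = cond_exp(inner, S1)        # E[ E[X | H_{S2}] | H_{S1} ]
rhs = cond_exp(X, S1)            # E[X | H_{S1}]
tower_holds = all(lhs(w) == rhs(w) for w in Omega)
```

Exact rational arithmetic is used so that the equality is checked without floating-point tolerance.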

Properties of Anti-Causal Kernels

Remark (Characterization of Causal Kernels)

For a causal space $(\Omega, \mathscr{H}, \mathbb{P}, \mathbb{K})$, the relationship between causal kernels and conditional probabilities is characterized as follows:

Kernel-induced conditional probabilities: For any $S \in \mathscr{P}(T)$:
$$K_S(\omega, A) = \mathbb{E}[\mathbf{1}_A \mid \mathscr{H}_S](\omega)$$
where $\mathbf{1}_A$ is the indicator function of set $A$.
Regular conditional probabilities: Versions that are measurable in $\omega$ arise as a special case:
$$K_S(\omega, A) = \mathbb{P}(A \mid \mathscr{H}_S)(\omega)$$
In anti-causal structures:
$$K_S(\omega, A) = \mathbb{P}(A \mid Y=y, E \in S)$$
where $y$ is the $Y$-component of $\omega$.

Properties of Causal and Anti-Causal Events

Proposition (Causal and Anti-Causal Event Properties)

For a product causal space with kernel $K_S$ and any measurable event $A \in \mathscr{H}$:

For a causal event $A$: $K_S(\omega, A) \neq \mathbb{P}(A)$ for some $\omega \in \Omega$
For an anti-causal event $A$: $K_S(\omega, A) = K_{S \setminus U}(\omega, A)$ for all $\omega \in \Omega$

where $U \subseteq S \in \mathscr{P}(T)$.

Note: In our setting, an intervention is a measurable mapping $\mathbb{Q}(\cdot|\cdot): \mathscr{H} \times \Omega \rightarrow [0,1]$. A hard intervention is $\mathbb{Q}(A|\omega') = P(X \in A \mid do(Y=y'))$, and a soft intervention is $\mathbb{Q}(A|\omega') = P(X \in A \mid Y=y', E \in S)$, where $y'$ denotes the $Y$-component of $\omega'$.

Proof

We establish the distinct properties of causal and anti-causal events through their behavior under the causal kernel.

Part 1: Causal events

Let $A$ be causally dependent on variables in $\mathscr{H}_S$. By definition of causal dependence, there exist $\omega, \omega' \in \Omega$ such that $K_S(\omega, A) \neq K_S(\omega', A)$. If $K_S(\omega, A) = \mathbb{P}(A)$ held for every $\omega \in \Omega$, the kernel would be constant in $\omega$, which contradicts this. Hence $K_S(\omega, A) \neq \mathbb{P}(A)$ for some $\omega$.

This is consistent with the averaging identity $\mathbb{P}(A) = \int_{\Omega} K_S(\omega, A) \, d\mathbb{P}(\omega)$: $\mathbb{P}(A)$ is a fixed constant equal to a weighted average of the kernel values, so a kernel that varies with $\omega$ cannot equal this average everywhere.
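
As a numerical illustration of the averaging argument, the sketch below builds a two-coordinate toy space in which the event depends on the conditioning coordinate. The distribution `P`, the event `A`, and the use of the conditional probability given coordinate 0 as the kernel are assumptions made for the example.

```python
from fractions import Fraction as F
from itertools import product

Omega = list(product((0, 1), repeat=2))
# Hypothetical joint distribution with positively correlated coordinates.
P = {(0, 0): F(4, 10), (0, 1): F(1, 10), (1, 0): F(1, 10), (1, 1): F(4, 10)}
A = {w for w in Omega if w[1] == 1}   # event determined by the second coordinate

def K(w, A, S=(0,)):
    """Toy kernel: conditional probability of A given the coordinates of w in S."""
    atom = [v for v in Omega if all(v[t] == w[t] for t in S)]
    return sum(P[v] for v in atom if v in A) / sum(P[v] for v in atom)

P_A = sum(P[w] for w in A)                                # marginal probability of A
average = sum(K(w, A) * P[w] for w in Omega)              # integral of the kernel against P
varies = len({K(w, A) for w in Omega}) > 1                # kernel is not constant in omega
differs_somewhere = any(K(w, A) != P_A for w in Omega)    # conclusion of Part 1
```

In this example the kernel takes the values $1/5$ and $4/5$, while their $\mathbb{P}$-average is $\mathbb{P}(A) = 1/2$.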

Part 2: Anti-causal events

Let $A \in \mathscr{H}$ be an anti-causal event, and let $U \subseteq S \in \mathscr{P}(T)$. We need to show that $K_S(\omega, A) = K_{S \setminus U}(\omega, A)$ for all $\omega \in \Omega$.

By definition, an anti-causal event is one whose probability is invariant to certain interventions. Specifically, removing a subset $U$ from the conditioning information $S$ does not change the kernel's value if $A$ is anti-causal with respect to $U$.

From the definition of causal kernels:

$$K_S(\omega, A) = \mathbb{E}[\mathbf{1}_A \mid \mathscr{H}_S](\omega), \quad K_{S \setminus U}(\omega, A) = \mathbb{E}[\mathbf{1}_A \mid \mathscr{H}_{S \setminus U}](\omega)$$

For an anti-causal event $A$, the information in $\mathscr{H}_U$ (corresponding to indices in $U$) has no causal influence on $A$ when conditioning on $\mathscr{H}_{S \setminus U}$. Formally, this means:

$$A \perp\!\!\!\perp \mathscr{H}_U \mid \mathscr{H}_{S \setminus U}$$

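Since $\mathscr{H}_S$ is generated by $\mathscr{H}_{S \setminus U}$ and $\mathscr{H}_U$ together, the properties of conditional expectation under conditional independence give: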

$$\mathbb{E}[\mathbf{1}_A \mid \mathscr{H}_S] = \mathbb{E}[\mathbf{1}_A \mid \mathscr{H}_{S \setminus U}]$$

Therefore: $K_S(\omega, A) = K_{S \setminus U}(\omega, A)$ for all $\omega \in \Omega$. This equality demonstrates that anti-causal events exhibit invariance with respect to certain subsets of the conditioning information, reflecting their position in the causal structure.
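
The invariance can be illustrated on a finite toy model in which the conditional independence holds by construction. In the sketch below, the three binary coordinates, the tables `p0`, `p1`, `p2`, and the choice $S = \{0, 1\}$, $U = \{1\}$ are hypothetical. Coordinate 2 (which determines `A`) depends on coordinate 0 only, so it is independent of coordinate 1 given coordinate 0.

```python
from fractions import Fraction as F
from itertools import product

Omega = list(product((0, 1), repeat=3))
p0 = {0: F(1, 3), 1: F(2, 3)}                                      # law of coordinate 0
p1 = {0: {0: F(1, 4), 1: F(3, 4)}, 1: {0: F(1, 2), 1: F(1, 2)}}    # p1[w0][w1]
p2 = {0: {0: F(1, 5), 1: F(4, 5)}, 1: {0: F(3, 5), 1: F(2, 5)}}    # p2[w0][w2]
P = {w: p0[w[0]] * p1[w[0]][w[1]] * p2[w[0]][w[2]] for w in Omega}

A = {w for w in Omega if w[2] == 1}   # event determined by coordinate 2

def K(S, w, A):
    """Toy kernel: conditional probability of A given the coordinates of w in S."""
    atom = [v for v in Omega if all(v[t] == w[t] for t in S)]
    return sum(P[v] for v in atom if v in A) / sum(P[v] for v in atom)

S, U = {0, 1}, {1}
invariant = all(K(S, w, A) == K(S - U, w, A) for w in Omega)   # K_S = K_{S \ U}
```

Dropping $U$ from the conditioning set leaves every kernel value unchanged here. If `p2` were instead made to depend on coordinate 1, the conditional independence would fail and the equality would generally break.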