Step 4 — Orientation¶
After Step 3 produces the contemporaneous skeleton and the confirmed changing-module set, Step 4 determines as many edge directions as the algorithm can. Four sub-passes, run in this order:
1. Surrogate orientation¶
Every variable still in changing_modules is treated as receiving an
implicit directed edge C → X_i. This is encoded by membership in the
set rather than as a materialized adjacency-matrix entry — the surrogate
is special and has no contemporaneous edges to it.
2. V-structure detection¶
For every pair of contemporaneous neighbors (a, c) of a middle node b:
- If
aandcare adjacent → shielded triple; skip. - If
bis in the witness set that separatedaandcin Step 3 →bis a fork or chain on thea-cpath; skip. - Otherwise → orient as
a → b ← c(unshielded collider).
A small CDANs-specific tie-breaker: if exactly one of a and c is a
changing module while b is not, prefer the changing module as cause when
only one of the two edges can be oriented. Consistent with the
surrogate-priority intuition from CD-NOD.
3. Independent-change-principle sink-finding¶
The novel orientation primitive. Used for undirected edges between two changing modules — exactly the case where v-structures and Meek's rules together cannot decide direction.
Algorithm¶
candidates = {changing modules with at least one undirected edge}
while len(candidates) > 1:
for each candidate c in candidates:
parents_of_c = {everyone with an edge into c}
∪ {everyone with an undirected edge to c}
score[c] = independent_change_score(
parents=data[:, parents_of_c],
effect =data[:, c],
surrogate=surrogate,
)
sink = argmin(scores)
orient every undirected edge from sink's parents toward sink
candidates.remove(sink)
The intuition: under the independent-change principle (Huang & Zhang, 2017),
the causal mechanism P(effect | parents) and the cause distribution
P(parents) should change independently as the system evolves under
non-stationarity. The score quantifies the dependence between kernel
representations of these two distributions as functions of the surrogate.
Lower means more independent → stronger evidence the assumed direction is
correct → the candidate really is a sink.
Independent change principle: details¶
For each candidate-as-sink, the score is computed as follows.
- Standardize
parents,effect, andsurrogate(zero mean, unit std). - Build Gaussian kernel matrices
K_x,K_y,K_tfor the three. - Module 1 —
P(effect | parents):Mₗ = (1/T²) · K_t · (K_x³ ⊙ (invK · K_y · invK)) · K_twhereinvK = (K_x ⊙ K_t + λ I)⁻¹. ConvertMₗto a Gaussian-kernel form via the inner-product → squared-distance identity. - Module 2 —
P(parents):Mₗ' = K_t · (K_t + λ I)⁻¹ · K_x · (K_t + λ I)⁻¹ · K_t. Convert similarly. - HSIC-style dependence statistic: double-center both modules with
H = I - 1/T · 1 1ᵀ, then compute(1/T²) · sum(M_centered ⊙ M'_centered).
This is a direct port of infer_nonsta_dir.m from the MATLAB reference.
Auto kernel bandwidth¶
The X/Y kernel bandwidth matters a lot for this score. The default
independent_change_width="auto" selects via:
bandwidth =
0.05 · median_bandwidth(standardized parents)
The 0.05 multiplier is empirically tuned: the algorithm needs a narrow
kernel (the MATLAB reference hard-codes width = 0.1 on standardized data),
but a fixed constant doesn't adapt to non-Gaussian data. A bandwidth sweep
on synthetic non-stationary X → Y data over 60 trials:
| fraction of median | direction-recovery accuracy |
|---|---|
| 0.02–0.08 | ~92–93% |
| 0.10 | 72% |
| 0.15 | 30% |
| 1.00 (pure median) | ~30% |
The algorithm has a sharp performance cliff above fraction ≈ 0.1, so 0.05 is a safe choice in the working range.
GP-learned kernel bandwidth¶
Setting independent_change_width="gp" triggers an alternative bandwidth
selection: fit a Gaussian process with an ARD-RBF kernel on
(parents, surrogate) → effect and use the marginal-likelihood-optimized
length scales as the bandwidths.
This is the Python analog of the MATLAB reference's if_GP2=1 path. It's
adaptive (the bandwidths change per call to fit the local data) but
costs roughly 10× more per score call than "auto".
Empirical comparison on a 6-variable lag-3 DGP with two changing modules connected by a contemporaneous edge that's hard for IC to orient:
| Bandwidth mode | Direction recovery (8 seeds) | Per-call time |
|---|---|---|
"auto" |
2/8 (25%) | ~3 s |
"gp" |
5/8 (62%) | ~30 s |
GP isn't a strict improvement on every seed; it can overfit at small T
or pick suboptimal length scales when the optimizer hits a local minimum.
But on hard cases where "auto" fails consistently across all manual
bandwidths, "gp" is often the only thing that flips them.
Picking between modes¶
| Use ... | when |
|---|---|
"auto" (default) |
Starting from scratch, throughput matters, or the recovered graph already looks right. Run this first. |
"gp" |
Skeleton looks correct but a contemp edge between two changing modules is oriented wrong, AND you've checked across multiple seeds. Also: if you're trying to reproduce MATLAB-paper numbers for T ≤ 1000. |
| Manual float | Reproducing a specific reference configuration (0.1 for the MATLAB original) or you've done your own bandwidth tuning on held-out data. |
When "auto" and "gp" disagree on a direction: trust "gp" more on
small T (≤ 1000); on larger T neither bandwidth heuristic is
necessarily correct on its own — run multiple seeds and report the
majority-vote orientation, or fall back to whatever orientation evidence
is available (v-structures, prior knowledge, downstream task accuracy).
To override: pass a positive float for fully manual control:
CDANs(independent_change_width=0.1) # MATLAB-faithful fixed value
CDANs(independent_change_width=0.03) # a bit narrower
The surrogate-side bandwidth uses the standard median heuristic on the standardized surrogate, so users can pass raw time indices without scaling.
4. Meek's rules¶
Apply rules R1–R4 repeatedly until nothing changes.
- R1 —
a → bandb — candanot adjacent toc⇒b → c. (No new v-structure.) - R2 —
a → b → canda — c⇒a → c. (No cycle.) - R3 —
a — b,a — c,a — d,c → b,d → b,canddnot adjacent ⇒a → b. - R4 —
a — b,a — c,c → d,d → b,aanddnot adjacent ⇒a → b.
The rules run on the contemporaneous PDAG only — lagged edges and surrogate edges are excluded.
API¶
orient_edges ¶
orient_edges(
graph: TimeSeriesGraph,
*,
data: NDArray[float64] | None = None,
surrogate: NDArray[float64] | None = None,
use_independent_change: bool = True,
independent_change_width: WidthSpec = AUTO,
) -> TimeSeriesGraph
Orient contemporaneous edges in the graph in place.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
graph
|
TimeSeriesGraph
|
Graph from Step 3 (lagged + contemporaneous skeleton + confirmed
changing modules + |
required |
data
|
NDArray[float64] | None
|
Time series, shape |
None
|
surrogate
|
NDArray[float64] | None
|
Distribution-shift surrogate, shape |
None
|
use_independent_change
|
bool
|
Whether to run the independent-change-principle sink-finding
sub-pass. Setting this to |
True
|
independent_change_width
|
WidthSpec
|
Bandwidth for the kernels inside
:func: |
AUTO
|
Returns:
| Type | Description |
|---|---|
TimeSeriesGraph
|
The same graph with as many contemporaneous edges oriented as the algorithm can determine. |
independent_change_score ¶
independent_change_score(
parents: NDArray[float64],
effect: NDArray[float64],
surrogate: NDArray[float64],
*,
width: WidthSpec = AUTO,
width_t: WidthSpec = AUTO,
lambda_reg: float = 2.0,
) -> float
Score the hypothesis parents -> effect under non-stationarity.
Returns a scalar dependence statistic between the kernel
representation of P(effect | parents) and the kernel
representation of P(parents), both as they vary with the
surrogate. Lower values support the direction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
parents
|
NDArray[float64]
|
Putative cause(s). Shape |
required |
effect
|
NDArray[float64]
|
Putative effect. Shape |
required |
surrogate
|
NDArray[float64]
|
Distribution-shift surrogate (time index, domain id, etc.).
Shape |
required |
width
|
WidthSpec
|
Bandwidth for the Gaussian kernels on
The MATLAB reference uses a fixed |
AUTO
|
width_t
|
WidthSpec
|
Bandwidth for the Gaussian kernel on the surrogate. Accepts the
same three forms as |
AUTO
|
lambda_reg
|
float
|
Tikhonov regularization for the kernel matrix inverses. |
2.0
|
Returns:
| Type | Description |
|---|---|
float
|
Dependence statistic. Lower is better evidence for the
|