# MDP tabular setting

$$E_{\tau_1,\cdots,\tau_{N} \sim P^{\pi}_{\mu}} (1_{A}) = E_{s_1,\cdots,s_{N \times H} \sim d^{\pi}_{\mu}} E_{a_1 \sim \pi(\cdot \mid s_1), \cdots, a_{N \times H} \sim \pi(\cdot \mid s_{N \times H})}(1_A)$$
With the usual notation: $$\tau$$ denotes a path of length $$H$$, $$\pi$$ is a policy, $$\mu$$ is the initial distribution, and $$d^{\pi}_{\mu}$$ is the probability that a state appears (i.e. $$P(s)$$).
$$A$$ is an event involving all the data points (for example, observing $$(s,a)$$ $$k$$ times). The identity seems intuitively true, but I am not sure how to show it.
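
One way to probe the identity before trying to prove it is to compute both sides exactly on a tiny example. The sketch below does this by brute-force enumeration. Everything in it is an assumption made for illustration and not part of the question: the toy MDP (two states, a single action, so the policy is trivial and counting $$(s,a)$$ reduces to counting $$s$$), the choice $$N=1$$, $$H=2$$, the event $$A$$ = "state 0 is observed exactly 2 times", and above all the reading of $$d^{\pi}_{\mu}$$ as the time-averaged state distribution along one path.

```python
import itertools
import math

# Hypothetical toy MDP (not from the question): 2 states, 1 action,
# so the policy is trivial and only the states matter.
H = 2              # path length
N = 1              # number of paths
STATES = (0, 1)
mu = [1.0, 0.0]    # initial distribution: always start in state 0
P = [[0.0, 1.0],   # P[s][s_next]: state 0 always moves to state 1
     [0.0, 1.0]]   # state 1 stays in state 1


def path_prob(path):
    """Probability of one length-H state path under mu and P."""
    p = mu[path[0]]
    for s, s_next in zip(path, path[1:]):
        p *= P[s][s_next]
    return p


def event(states, k=2):
    """A = 'state 0 is observed exactly k times' among all N*H states."""
    return sum(1 for s in states if s == 0) == k


all_paths = list(itertools.product(STATES, repeat=H))

# ASSUMED definition: d(s) = (1/H) * sum_h P(s_h = s), the time-averaged
# state distribution along one path.
d = [0.0, 0.0]
for path in all_paths:
    for s in path:
        d[s] += path_prob(path) / H

# Left-hand side: N independent paths drawn from the path distribution.
lhs = 0.0
for paths in itertools.product(all_paths, repeat=N):
    states = [s for path in paths for s in path]
    if event(states):
        lhs += math.prod(path_prob(path) for path in paths)

# Right-hand side: N*H states drawn independently from d.
rhs = 0.0
for states in itertools.product(STATES, repeat=N * H):
    if event(states):
        rhs += math.prod(d[s] for s in states)

print("d   =", d)    # [0.5, 0.5]
print("lhs =", lhs)  # 0.0
print("rhs =", rhs)  # 0.25
```

In this toy case the two sides differ (0 versus 0.25): within a path the states are dependent, while the right-hand side treats them as independent draws. Under the assumed reading of $$d^{\pi}_{\mu}$$, this suggests the identity may only hold for restricted events (or in expectation of counts), or that a different definition of $$d^{\pi}_{\mu}$$ is intended.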