Computation of finite time transiton probabilities and inference for general Birth-Death-Process (BDP)

Refs:

F.W. Crawford, M.A. Suchard (2012). Transition probabilities for general birth–death processes with applications in ecology, genetics, and evolution. J. Math. Biol., 65:553–580
F.W. Crawford, V.N. Minin, M.A. Suchard (2014). Estimation for General Birth-Death Processes. JASA, 109 (506): 730-747.

Compute transition probabilities for general BDP

Find robust computational method for computing the finite-time transition probabilities of a general BDPs with arbitrary rates.
Assume birth rate \(\lambda_n\) and death rate \(\mu_n\) (time-homogeneous). Let \(\nu_n=\lambda_n+\mu_n\).
Denote transition prob: \(P_{m,n}(t)=\Pr(X(t)=n|X(0)=m)\)
- note \(P_{0,0}(0)=1\) and \(P_{0,n}(0)=0, n\geq 1\).
Forward/backward eqns
- \(P'_{m,0}(t) = \mu_1P_{m,1}(t)-\lambda_0P_{m,0}(t)\)
- \(P'_{m,n}(t) = \lambda_{n-1}P_{m,n-1}(t) + \mu_{n+1}P_{m,n+1}(t)-\nu_nP_{m,n}(t)\), \(n\geq 1\)
Laplace transform
- \(f_{m,n}(s) = L[P_{m,n}(t)](s) = \int_{-\infty}^{\infty} e^{-st}P_{m,n}(t)dt\)
- derive
  - \(sf_{0,0}(s)-P_{0,0}(0)=\mu_1f_{0,1}(s)-\lambda_0f_{0,0}(s)\)
  - \(sf_{0,n}(s)-P_{0,n}(0)=\lambda_{n-1}f_{0,n-1}(s)+\mu_{n+1}f_{0,n+1}(s)-\nu_nf_{0,n}(s)\), \(n\geq 1\).
- equivalent to
  - \(f_{0,1}(s)=\frac{1}{\mu_1}[(s+\lambda_0)f_{0,0}(s)-1]\)
  - \(f_{0,n}(s)=\frac{1}{\mu_n}[(s+\nu_{n-1})f_{0,n-1}(s)-\lambda_{n-2}f_{0,n-2}(s)]\), \(n\geq 2\).
- and hence the forward system of recurrence relations
  - \(f_{0,0}(s)=\frac{1}{s+\lambda_0 -\mu_1 \frac{f_{0,1}(s)}{f_{0,0}(s)}}\)
  - \(\frac{f_{0,n}(s)}{f_{0,n-1}(s)}=\frac{\lambda_{n-1}} {s+\nu_n -\mu_{n+1}\frac{f_{0,n+1}(s)}{f_{0,n}(s)}}\).
In the end, we derive the generalized continued fraction
- \(f_{0,0}(s)=\frac{1}{s+\lambda_0 - \frac{\lambda_0\mu_1}{s+\nu_1-\frac{\lambda_1\mu_2}{s+\nu_2-\cdots}}}\)
- an exact expression for the Laplace transoform of \(P_{0,0}(t)\).
- Define \(a_1=1,a_n=-\lambda_{n-2}\mu_{n-1}\), \(b_1=s+\lambda_0, b_n=s+\nu_{n-1}, n\geq 2\).
  - \(f_{0,0}(s)=\frac{a_1}{b_1 + \frac{a_2}{b_2+\frac{a_3}{b_3+\cdots}}} = \frac{a_1}{b_1+}\frac{a_2}{b_2+}\frac{a_3}{b_3+}\cdots\)
Denote the k-th approximation
- \(f_{0,0}^{(k)}(s)= \frac{a_1}{b_1+}\frac{a_2}{b_2+}\frac{a_3}{b_3+}\cdots\frac{a_k}{b_k}=\frac{A_k(s)}{B_k(s)}\).
- Note \(A_0=0,A_1=a_1,B_0=1,B_1=b_1\).
- Generally
  - \(A_k=b_kA_{k-1}+a_kA_{k-2}\), \(B_k=b_kB_{k-1}+a_kB_{k-2}\)
  - \(A_kB_{k-1}-A_{k-1}B_k=(-1)^{k-1}\prod_{i=1}^ka_i\).
- numerical calculation
  - \(f_{0,0}^{(k)} = f_{0,0}^{(k-1)}C_kD_k\), where \(C_k=A_k/A_{k-1}, D_k=B_{k-1}/B_k\).
  - Note \(C_k=b_k+a_k/C_{k-1}, D_k=1/(b_k+a_kD_{k-1})\)
General formula to compute \(f_{m,n}(s)\)
- for \(n\leq m\)
  - \((\prod_{j=n+1}^m\mu_j)\frac{B_n(s)}{B_{m+1}(s)+}\frac{B_m(s)a_{m+2}}{b_{m+2}+}\frac{a_{m+3}}{b_{m+3}+}\cdots\)
- for \(n\geq m\)
  - \((\prod_{j=m}^{n-1}\lambda_j)\frac{B_m(s)}{B_{n+1}(s)+}\frac{B_n(s)a_{n+2}}{b_{n+2}+}\frac{a_{n+3}}{b_{n+3}+}\cdots\)
- Numerical calculation!
Numerical inversion of Laplace tranformation
- \(P_{m,n}(t)=\frac{2e^{\epsilon t}}{\pi}\int_0^{\infty}Re(f_{m,n}(\epsilon+iu))cost(ut)du\).
  - \(\epsilon\) is a large positive number to be chosen.
- integration approximation
  - let \(A=2t\epsilon\)
  - \(P_{m,n}(t)\approx \frac{e^{A/2}}{2t}Re(f_{m,n}(A/2t)) + \frac{e^{A/2}}{t}\sum_{k=1}^{\infty}(-1)^kRe(f_{m,n}((A+2k\pi i)/2t))\)
  - take \(\epsilon = \frac{\gamma\log(10)}{2t}\), where \(10^{-\gamma}\) controls the approximation error.

Estimation and inference for general BDP

Continuously observed BDP

Observation \(Y=(X(0)=a,X(t)=b,t)\)
- a single realization of the BDP starting at \(X(0)=a\) and ending at \(X(t)=b\).
rate \(\lambda_k(\theta), \mu_k(\theta)\) depend on finite-dimensional parameter vector \(\theta\).
- \(\nu_k(\theta)=\lambda_k(\theta)+\mu_k(\theta)\)
Likelihood function
- \(\ell(\theta) = \sum_k [U_k\log[\lambda_k(\theta)] +D_k\log[\mu_k(\theta)] - \nu_k(\theta)T_k]\).
- \(T_k\) : total time spent in state \(k\)
- \(U_k\): number of births from state \(k\)
- \(D_k\): number of deaths from state \(k\)
- Note \(T_k\sim \exp(\nu_k)\), and observing birth/death is Bernoulli with prob of \(\lambda_k/\nu_k\).
  - So \(\log\Pr(T_k) = \log(\nu_k)-\nu_kT_k\)
  - \(\Pr(birth) = \log(\lambda_k) - \log(\nu_k)\)
  - \(\Pr(death) = \log(\mu_k) - \log(\nu_k)\)

Discretely observed BDP

only observe \(Y=\{X(0)=a, X(t)=b\}\).
\(T_k\), \(U_k\) and \(D_k\) are all missing data.
Using EM algorithm for estimation
- need to compute \(E(T_k|Y), E(U_k|Y), E(D_k|Y)\).
Some useful results
- \(E(U_k|Y) = \frac{\int_0^tP_{ak}(\tau)\lambda_kP_{k+1,b}(t-\tau)d\tau}{P_{ab}(t)}\)
- \(E(D_k|Y) = \frac{\int_0^tP_{ak}(\tau)\mu_kP_{k-1,b}(t-\tau)d\tau}{P_{ab}(t)}\)
- \(E(T_k|Y) = \frac{\int_0^tP_{ak}(\tau)P_{k,b}(t-\tau)d\tau}{P_{ab}(t)}\)
- so the E-step can be computed using integration over finite time transition prob of BDP
- Further simplication using Laplace transformation. Can show
  - \(E(U_k|Y) =\lambda_k \frac{L^{-1}[f_{ak}(s)f_{k+1,b}(s)](t)}{P_{ab}(t)}\)
  - \(E(D_k|Y) = \mu_k \frac{L^{-1}[f_{ak}(s)f_{k-11,b}(s)](t)}{P_{ab}(t)}\)
  - \(E(T_k|Y) = \frac{L^{-1}[f_{ak}(s)f_{k,b}(s)](t)}{P_{ab}(t)}\)
M-step maximization depends on various forms of birth and death rates.