Motivation: Estimating gene regulatory networks over biological lineages is central to a deeper understanding of how cells evolve during development and differentiation. However, one challenge in estimating such evolving networks is that their host cells are not only continuously evolving, but also branching over time. For example, a stem cell gives rise to two more specialized daughter cells at each division, so the networks of the resulting cell types form a tree. Another example arises in a laboratory setting: a biologist may apply several different drugs to a malignant cancer cell to analyze the changes each drug produces in the treated cells. Each treated cell is not directly related to another treated cell, but rather to the malignant cancer cell from which it was derived.
Results: We propose a novel algorithm, Treegl, which builds on l1 plus total variation penalized logistic regression to estimate multiple gene networks corresponding to cell types related by a tree-genealogy, based on only a few samples from each cell type. Treegl takes advantage of the similarity between related networks along the biological lineage, while at the same time exposing sharp differences between the networks. We demonstrate via simulation that our algorithm performs significantly better than existing methods. Furthermore, we explore an application to a small-scale breast cancer analysis. Based on only a few microarray measurements, our algorithm produces biologically valid results that provide insight into the progression and reversion of breast cancer.
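To make the penalty structure concrete, the following is a minimal sketch of how an l1 plus total variation penalized logistic objective over a tree might be evaluated for a single target gene. It is an illustration under stated assumptions, not the Treegl implementation: the function names, the {-1, +1} coding of binarized expression, the per-node averaging of the loss, and the parameters `lam1` and `lam2` are all hypothetical choices made here.

```python
import numpy as np


def logistic_loss(X, y, theta):
    """Mean logistic loss for labels y in {-1, +1} and scores X @ theta."""
    margins = y * (X @ theta)
    # log(1 + exp(-margin)), computed stably
    return np.mean(np.logaddexp(0.0, -margins))


def tree_tv_objective(data, thetas, edges, lam1, lam2):
    """Evaluate a tree-structured l1 + total variation objective (assumed form).

    data   : dict mapping node name -> (X, y), the samples for that cell type
    thetas : dict mapping node name -> weight vector for the target gene
    edges  : list of (parent, child) pairs describing the tree-genealogy
    lam1   : weight of the l1 (sparsity) penalty
    lam2   : weight of the total variation penalty along tree edges
    """
    loss = sum(logistic_loss(X, y, thetas[n]) for n, (X, y) in data.items())
    # l1 term: keeps each network sparse
    sparsity = sum(np.abs(t).sum() for t in thetas.values())
    # total variation term: penalizes differences between parent and child
    variation = sum(np.abs(thetas[c] - thetas[p]).sum() for p, c in edges)
    return loss + lam1 * sparsity + lam2 * variation


# Toy tree: one parent cell type "r" with two derived cell types "a" and "b".
edges = [("r", "a"), ("r", "b")]
X = np.ones((2, 3))
y = np.array([1.0, -1.0])
data = {n: (X, y) for n in ("r", "a", "b")}
thetas = {
    "r": np.array([1.0, 0.0, 0.0]),
    "a": np.array([1.0, 0.0, 0.0]),  # identical to parent: no variation cost
    "b": np.array([0.0, 0.0, 0.0]),  # differs from parent in one coordinate
}
value = tree_tv_objective(data, thetas, edges, lam1=1.0, lam2=1.0)
```

In this sketch the l1 term encourages sparse networks, while the total variation term is zero on any edge where parent and child share identical weights and grows only where they differ, which is one way to read the combination of shared structure and sharp differences described above.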