Complexity and algorithms for copy-number evolution problems

Abstract Background Cancer is an evolutionary process characterized by the accumulation of somatic mutations in a population of cells that form a tumor. One frequent type of mutations is copy number aberrations, which alter the number of copies of genomic regions. The number of copies of each positi...

Full description

Bibliographic Details
Main Authors: Mohammed El-Kebir, Benjamin J. Raphael, Ron Shamir, Roded Sharan, Simone Zaccaria, Meirav Zehavi, Ron Zeira
Format: Article
Language:English
Published: BMC 2017-05-01
Series:Algorithms for Molecular Biology
Subjects:
Online Access:http://link.springer.com/article/10.1186/s13015-017-0103-2
Description
Summary:Abstract Background Cancer is an evolutionary process characterized by the accumulation of somatic mutations in a population of cells that form a tumor. One frequent type of mutations is copy number aberrations, which alter the number of copies of genomic regions. The number of copies of each position along a chromosome constitutes the chromosome’s copy-number profile. Understanding how such profiles evolve in cancer can assist in both diagnosis and prognosis. Results We model the evolution of a tumor by segmental deletions and amplifications, and gauge distance from profile $$\mathbf {a}$$ a to $$\mathbf {b}$$ b by the minimum number of events needed to transform $$\mathbf {a}$$ a into $$\mathbf {b}$$ b . Given two profiles, our first problem aims to find a parental profile that minimizes the sum of distances to its children. Given k profiles, the second, more general problem, seeks a phylogenetic tree, whose k leaves are labeled by the k given profiles and whose internal vertices are labeled by ancestral profiles such that the sum of edge distances is minimum. Conclusions For the former problem we give a pseudo-polynomial dynamic programming algorithm that is linear in the profile length, and an integer linear program formulation. For the latter problem we show it is NP-hard and give an integer linear program formulation that scales to practical problem instance sizes. We assess the efficiency and quality of our algorithms on simulated instances. Availability https://github.com/raphael-group/CNT-ILP
ISSN:1748-7188