Efficient construction of match strength distributions for uncertain multi-locus genotypes

Natural variation in biological evidence leads to uncertain genotypes. Forensic comparison of a probabilistic genotype with a person's reference gives a numerical strength of DNA association. The distribution of match strength for all possible references usefully represents a genotype's po...


Bibliographic Details
Main Author: Mark W. Perlin
Format: Article
Language: English
Published: Elsevier 2018-10-01
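Series: Heliyon
Subjects: Molecular biology; Mathematical biosciences; Genetics; Computational biology; Biotechnology; Bioinformatics
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844018336168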
id doaj-e294dd87743f41c3843b8efdca4ca952
record_format Article
spelling doaj-e294dd87743f41c3843b8efdca4ca952; 2020-11-25T03:04:50Z; eng; Elsevier; Heliyon; 2405-8440; 2018-10-01; vol. 4, no. 10, e00824; Efficient construction of match strength distributions for uncertain multi-locus genotypes; Mark W. Perlin (corresponding author; Cybergenetics, Pittsburgh, PA, USA); http://www.sciencedirect.com/science/article/pii/S2405844018336168; Molecular biology; Mathematical biosciences; Genetics; Computational biology; Biotechnology; Bioinformatics
collection DOAJ
language English
format Article
sources DOAJ
author Mark W. Perlin
title Efficient construction of match strength distributions for uncertain multi-locus genotypes
publisher Elsevier
series Heliyon
issn 2405-8440
publishDate 2018-10-01
description Natural variation in biological evidence leads to uncertain genotypes. Forensic comparison of a probabilistic genotype with a person's reference gives a numerical strength of DNA association. The distribution of match strength for all possible references usefully represents a genotype's potential information. But testing more genetic loci exponentially increases the number of multi-locus possibilities, making direct computation infeasible. At each locus, Bayesian probability can quickly assemble a match strength random variable. Multi-locus match strength is the sum of these independent variables. A multi-locus genotype's match strength distribution is efficiently constructed by convolving together the separate locus distributions. This convolution construction can accurately collate all trillion trillion reference outcomes in a fraction of a second. This paper shows how to rapidly construct multi-locus match strength distributions by convolution. Function convergence demonstrates that distribution accuracy increases with numerical resolution. Convolution construction has quadratic computational complexity, relative to the exponential number of reference genotypes. A suitably defined random variable reduces high-dimensional computational cost to fast real-line arithmetic. Match strength distributions are used in forensic validation studies. They provide error rates for match results. The convolution construction applies to discrete or continuous variables in the forensic, natural and social sciences. Computer-derived match strength distributions elicit the information inherent in DNA evidence, often overlooked by human analysis.
topic Molecular biology
Mathematical biosciences
Genetics
Computational biology
Biotechnology
Bioinformatics
url http://www.sciencedirect.com/science/article/pii/S2405844018336168
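
The description above says that multi-locus match strength is a sum of independent per-locus variables, so its distribution can be built by convolving the locus distributions. The following is a minimal sketch of that idea, not the article's implementation. It assumes that match strength at a locus is log10 of the ratio of an evidence genotype probability to a population genotype probability, that references are drawn from the population probabilities, and that every genotype has nonzero evidence probability. All function names, the bin width `step`, and the toy genotype probabilities are hypothetical.

```python
import math
from collections import defaultdict
from functools import reduce


def locus_distribution(posterior, prior, step=0.01):
    """Distribution of binned log10(LR) at one locus for a random reference.

    Keys are integer bin indices k, standing for a match strength of k * step.
    Assumes every genotype has a positive posterior and prior probability.
    """
    dist = defaultdict(float)
    for genotype, p in prior.items():
        q = posterior[genotype]
        k = round(math.log10(q / p) / step)
        dist[k] += p  # a random reference has this genotype with probability p
    return dict(dist)


def convolve(a, b):
    """Distribution of the sum of two independent binned variables."""
    out = defaultdict(float)
    for i, pa in a.items():
        for j, pb in b.items():
            out[i + j] += pa * pb
    return dict(out)


def multi_locus_distribution(loci, step=0.01):
    """Convolve the per-locus distributions into one multi-locus distribution."""
    per_locus = [locus_distribution(post, prior, step) for post, prior in loci]
    return reduce(convolve, per_locus)


# Hypothetical toy data: two loci, three genotypes each (not from the article).
PRIOR = {"aa": 0.25, "ab": 0.50, "bb": 0.25}
LOCI = [
    ({"aa": 0.70, "ab": 0.20, "bb": 0.10}, PRIOR),
    ({"aa": 0.05, "ab": 0.90, "bb": 0.05}, PRIOR),
]
STEP = 0.01

dist = multi_locus_distribution(LOCI, STEP)
for k in sorted(dist):
    print(f"log10(LR) = {k * STEP:+.2f}  probability = {dist[k]:.4f}")
```

Each convolution works on the one-dimensional grid of binned match strengths rather than on the product of genotype sets, so the size of the working distribution is bounded by the width of the grid instead of growing with the number of multi-locus genotypes. A smaller `step` gives a finer grid at a higher cost per convolution.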