Discovery of diverse anellovirus sequences in Thai human sequencing data

ABSTRACT Anelloviruses are part of the normal human viral flora. Although their diversity in humans has been investigated in many countries, and despite their initial detection in Thailand in 1999, knowledge of Thai anelloviruses remains very limited. This study analyzed 1,175 whole-genome sequencin...

Bibliographic Details
Published in: Microbiology Spectrum
Main Authors: Worakorn Phumiphanjarphak, Jinjutha Parkbhorn, Chumpol Ngamphiw, Sissades Tongsima, Pakorn Aiewsakun
Format: Article
Language: English
Published: American Society for Microbiology 2025-10-01
Subjects: anellovirus, Anelloviridae, virus discovery, virome
Online Access: https://journals.asm.org/doi/10.1128/spectrum.00866-25
author Worakorn Phumiphanjarphak
Jinjutha Parkbhorn
Chumpol Ngamphiw
Sissades Tongsima
Pakorn Aiewsakun
collection DOAJ
container_title Microbiology Spectrum
description ABSTRACT Anelloviruses are part of the normal human viral flora. Although their diversity in humans has been investigated in many countries, and despite their initial detection in Thailand in 1999, knowledge of Thai anelloviruses remains very limited. This study analyzed 1,175 whole-genome sequencing data sets from Thai individuals to mine for potential anellovirus sequences. Our analyses detected anellovirus sequences in 149 data sets (12.68%), uncovering 434 partial anellovirus sequences and 77 complete genome sequences, characterized by the presence of terminal redundancy, complete orf1, and the conserved untranslated region upstream of the orf1 gene. Sequence analyses indicated that these viruses belong to seven genera, including Alphatorquevirus, Betatorquevirus, Gammatorquevirus, Hetorquevirus, Lamedtorquevirus, Samektorquevirus, and Yodtorquevirus. Notably, Hetorquevirus, Lamedtorquevirus, Samektorquevirus, and Yodtorquevirus had not previously been reported in Thailand. Phylogenetic analysis of ORF1 protein sequences showed that Thai anelloviruses form multiple phylogenetic clusters with non-Thai anelloviruses, indicating frequent cross-country transmission and multiple origins of the virus in Thailand. Furthermore, sequence similarity network analysis identified 33 potentially novel anellovirus species in our data set. Our findings greatly expand the knowledge of anellovirus diversity in Thailand and demonstrate the potential of human whole-genome sequencing data as a valuable resource for viral discovery. Lastly, we highlight and discuss some challenges with the use of the current pairwise sequence similarity-based classification scheme, in particular, how gaps can influence similarity calculation and potentially lead to inconsistencies with a phylogenetic-based classification scheme. IMPORTANCE Anelloviruses are widespread in humans, yet their diversity remains poorly characterized in many regions, including Thailand. 
Here, we demonstrate that human sequencing data sets, originally generated without the intention for virome research, can be effectively mined for anellovirus sequences, including complete genomes. Our findings reveal a substantial number of previously unreported anelloviruses in Thailand, significantly expanding the known diversity of the virus. We also highlight potential limitations of the current anellovirus species classification scheme, which is based on pairwise orf1 sequence similarity analysis with a hard threshold cutoff at 69%. Our results reveal that the current scheme can sometimes yield taxonomic groupings that are inconsistent with phylogenetic relationships, particularly when significant alignment gaps are present. Overall, our results show that existing human sequencing data can be effectively repurposed for virus discovery research and suggest the need for more robust and phylogenetically informed classification frameworks as viral sequence databases continue to expand.
format Article
id doaj-art-bb99cd7e07d14e0eacf8de01dbfb6aad
institution Directory of Open Access Journals
issn 2165-0497
language English
publishDate 2025-10-01
publisher American Society for Microbiology
record_format Article
spelling doaj-art-bb99cd7e07d14e0eacf8de01dbfb6aad
2025-10-07T13:08:59Z
eng
American Society for Microbiology
Microbiology Spectrum
2165-0497
2025-10-01
13
10
10.1128/spectrum.00866-25
Discovery of diverse anellovirus sequences in Thai human sequencing data
Worakorn Phumiphanjarphak [0]
Jinjutha Parkbhorn [1]
Chumpol Ngamphiw [2]
Sissades Tongsima [3]
Pakorn Aiewsakun [4]
Department of Microbiology, Faculty of Science, Mahidol University, Bangkok, Thailand
Pornchai Matangkasombut Center for Microbial Genomics, Department of Microbiology, Faculty of Science, Mahidol University, Bangkok, Thailand
National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Pathum Thani, Thailand
National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Pathum Thani, Thailand
Department of Microbiology, Faculty of Science, Mahidol University, Bangkok, Thailand
https://journals.asm.org/doi/10.1128/spectrum.00866-25
anellovirus
Anelloviridae
virus discovery
virome
title Discovery of diverse anellovirus sequences in Thai human sequencing data
topic anellovirus
Anelloviridae
virus discovery
virome
url https://journals.asm.org/doi/10.1128/spectrum.00866-25