Perturbation AUTOVC: Voice Conversion From Perturbation and Autoencoder Loss
AUTOVC is a zero-shot voice-conversion method that performs self-reconstruction with an autoencoder structure. Its advantage is that it is simple to train, because it uses only the autoencoder loss. However, it performs voice conversion by disentangli...
| Published in: | IEEE Access |
|---|---|
| Main Authors: | Hwa-Young Park, Young Han Lee, Chanjun Chun |
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2023-01-01 |
| Online Access: | https://ieeexplore.ieee.org/document/10353951/ |
Similar Items
Stop Voicing and F0 Perturbation in Pahari
by: Nazia Rashid, et al.
Published: (2021-01-01)
Enrichment of Oesophageal Speech: Voice Conversion with Duration–Matched Synthetic Speech as Target
by: Sneha Raman, et al.
Published: (2021-06-01)
Streaming ASR Encoder for Whisper-to-Speech Online Voice Conversion
by: Anastasia Avdeeva, et al.
Published: (2024-01-01)
Effective Zero-Shot Multi-Speaker Text-to-Speech Technique Using Information Perturbation and a Speaker Encoder
by: Chae-Woon Bang, et al.
Published: (2023-12-01)
Machine Learning Approaches for Whisper to Normal Speech Conversion
by: Marco A. Oliveira
Published: (2022-04-01)
Non-Parallel Whisper-to-Normal Speaking Style Conversion Using Auxiliary Classifier Variational Autoencoder
by: Shogo Seki, et al.
Published: (2023-01-01)
Reimagining speech: a scoping review of deep learning-based methods for non-parallel voice conversion
by: Anders R. Bargum, et al.
Published: (2024-08-01)
Accent conversion method with real-time voice cloning based on a nonautoregressive neural network model
by: V. A. Nechaev, et al.
Published: (2025-06-01)
Overview of Voice Conversion Methods Based on Deep Learning
by: Tomasz Walczyna, et al.
Published: (2023-02-01)
Jointly Trained Conversion Model With LPCNet for Any-to-One Voice Conversion Using Speaker-Independent Linguistic Features
by: Ivan Himawan, et al.
Published: (2022-01-01)
CCLCap-AE-AVSS: Cycle consistency loss based capsule autoencoders for audio–visual speech synthesis
by: Ghosh Subhayu, et al.
Published: (2024-06-01)
MPFM-VC: A Voice Conversion Algorithm Based on Multi-Dimensional Perception Flow Matching
by: Yanze Wang, et al.
Published: (2025-05-01)
Intelligibility Improvement of Esophageal Speech Using Sequence-to-Sequence Voice Conversion with Auditory Attention
by: Kadria Ezzine, et al.
Published: (2022-07-01)
Zero-Shot Unseen Speaker Anonymization via Voice Conversion
by: Hyung-Pil Chang, et al.
Published: (2022-01-01)
Voice Analysis and Classification System Based on Perturbation Parameters and Cepstral Presentation in Psychoacoustic Scales
by: M. I. Vashkevich, et al.
Published: (2022-03-01)
Voice Conversion Using a Perceptual Criterion
by: Ki-Seung Lee
Published: (2020-04-01)
Breathy Voice and the Oxytonic or the Paroxytonic Rhythm in the Reproduction of Thought Creations in German-Language Conversations
by: Dario Marić
Published: (2024-07-01)
The Perturbing Mediatization of Voice-based Virtual Assistants: The Case of Alexa
by: Leopoldina Fortunati, et al.
Published: (2024-01-01)
Speaker Anonymization: Disentangling Speaker Features from Pre-Trained Speech Embeddings for Voice Conversion
by: Marco Matassoni, et al.
Published: (2024-04-01)
A noise-robust voice conversion method with controllable background sounds
by: Lele Chen, et al.
Published: (2024-02-01)
Arabic Emotional Voice Conversion Using English Pre-Trained StarGANv2-VC-Based Model
by: Ali H. Meftah, et al.
Published: (2022-11-01)
CycleDiffusion: Voice Conversion Using Cycle-Consistent Diffusion Models
by: Dongsuk Yook, et al.
Published: (2024-10-01)
A survey of voice conversion based on non-parallel data
by: Pengcheng LI, et al.
Published: (2024-05-01)
Scalability and diversity of StarGANv2-VC in Arabic emotional voice conversion: Overcoming data limitations and enhancing performance
by: Ali H. Meftah, et al.
Published: (2024-07-01)
Multi-Scale Recurrence Quantification Measurements for Voice Disorder Detection
by: Xin-Cheng Zhu, et al.
Published: (2022-09-01)
Is Natural Necessary? Human Voice versus Synthetic Voice for Intelligent Virtual Agents
by: Amal Abdulrahman, et al.
Published: (2022-06-01)
“Eh? Aye!”: Categorisation bias for natural human vs AI-augmented voices is influenced by dialect
by: Neil W. Kirk
Published: (2025-05-01)
Wavelet-based Robust Voice conversion systems
by: Morteza Farhid, et al.
Published: (2009-03-01)
Forensic analysis of auditorily similar voices
by: Sandra Carmo, et al.
Published: (2023-06-01)
Singing Voice Therapy Revisited
by: Ilter Denizoglu, et al.
Published: (2021-12-01)
Voice Conversion Based on Hybrid SVR and GMM
by: Peng SONG, et al.
Published: (2013-10-01)
Wav2wav: Wave-to-Wave Voice Conversion
by: Changhyeon Jeong, et al.
Published: (2024-05-01)
Speech Emotion Recognition Based on Voice Fundamental Frequency
by: Teodora DIMITROVA-GREKOW, et al.
Published: (2019-04-01)
Improving the Efficiency of Dysarthria Voice Conversion System Based on Data Augmentation
by: Wei-Zhong Zheng, et al.
Published: (2023-01-01)
ASSESSMENT CRITERIA OF VOICE SIGNAL LEAKAGE PROTECTION
by: V. K. Zheleznyak, et al.
Published: (2017-04-01)
A Deep Semantic Comprehension-Based Automatic Conversation Model Based on Autoencoder-Enhanced Transformer
by: Shouyu Liang, et al.
Published: (2025-01-01)
STRAIGHTMORPH: A Voice Morphing Tool for Research in Voice Communication Sciences [version 2; peer review: 2 approved, 1 approved with reservations]
by: P Belin, et al.
Published: (2025-01-01)
Generative autoencoder to prevent overregularization of variational autoencoder
by: YoungMin Ko, et al.
Published: (2025-02-01)
Aeroacoustic Sound Source Characterization of the Human Voice Production-Perturbed Convective Wave Equation
by: Stefan Schoder, et al.
Published: (2021-03-01)
An Evasion Attack against Stacked Capsule Autoencoder
by: Jiazhu Dai, et al.
Published: (2022-01-01)
