Summary:
The rise of synthetic and manipulated audio content, especially partial fake speech, presents significant challenges for verifying audio authenticity. Partial fake speech refers to utterances in which only certain segments have been altered or synthesized, which makes it harder to detect than fully synthetic speech. This paper introduces a novel detection model specifically designed to identify partial fake speech. Our approach uses Wav2Vec 2.0 as a feature extractor, followed by max pooling, conformer blocks, attention-based pooling, and fully connected layers. Experimental results on two datasets demonstrate the model's effectiveness in detecting partial fake speech: it outperforms existing methods in terms of Equal Error Rate (EER), achieving 0% on the RFP dataset and 2.99% on the ASVspoof 2019 LA dataset.
Author(s):
Abdulazeez AlAli
Cardiff University
United Kingdom
George Theodorakopoulos
Cardiff University
United Kingdom
He is a Reader at the School of Computer Science & Informatics, Cardiff University. From 2007 to 2011, he was a Senior Researcher at the École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. He is a coauthor (with John Baras) of the book Path Problems in Networks (Morgan & Claypool, 2010).
He received the Best Paper award at the ACM Workshop on Wireless Security, October 2004, for "Trust evaluation in ad-hoc networks" and the 2007 IEEE ComSoc Leonard Abraham prize for "On trust models and trust evaluation metrics for ad hoc networks." He coauthored the paper "Quantifying Location Privacy," which was runner-up for the 2012 PET Award (Award for Outstanding Research in Privacy Enhancing Technologies) and received a Test of Time award at the 2021 IEEE Symposium on Security and Privacy.
Abdullah Emad
Zewail City
Egypt