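Unsupervised cross-domain speaker recognition fusing distribution alignment and adversarial learning
Cite this article: Chen Zhigao, Zhao Qingwei, Wang Li, Wang Wenchao. Unsupervised cross-domain speaker recognition fusing distribution alignment and adversarial learning [J]. Acta Acustica (声学学报), 2021, 46(5): 767-774.
Authors: Chen Zhigao (陈志高), Zhao Qingwei (赵庆卫), Wang Li (王丽), Wang Wenchao (王文超)
Affiliation: 1. Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190
Funding: Supported by the National Natural Science Foundation of China (11590774, 11590772, 11590770)
Abstract: To address domain mismatch in speaker recognition when the target domain lacks labeled data, an unsupervised domain adaptation method is proposed that fuses distribution alignment with adversarial learning. Aligning statistical distributions during training reduces the domain discrepancy, so that more speaker-discriminative features are extracted from speech and performance improves consistently. In text-dependent speaker recognition tasks, adversarial learning and distribution alignment work together, lowering the equal error rate by a relative 11%. In text-independent tasks, the effect of adversarial learning is unstable, while distribution alignment still yields a relative 8% improvement. The experimental results show that, under domain mismatch and without labeled target-domain data, the method effectively extracts speaker-discriminative information from speech and steadily improves recognition performance.

Keywords: speaker recognition; unsupervised; cross-domain; discriminative features
Received: 2020-09-28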

Unsupervised cross-domain speaker recognition based on distribution alignment and adversarial learning
Affiliation: 1. Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190; 2. University of Chinese Academy of Sciences, Beijing 100049
Abstract: Domain mismatch has become one of the biggest challenges for realistic speaker recognition systems, especially when labeled data in the target domain are unavailable. The proposed method fuses distribution alignment with adversarial learning to extract speaker-discriminative features, reducing domain discrepancy by aligning statistical distributions during training. Consistent performance improvements are achieved under a variety of domain-mismatch conditions. For text-dependent tasks, adversarial learning and distribution alignment work together to reduce the equal error rate by a relative 11%. For text-independent tasks, adversarial learning contributes little, while distribution alignment still achieves a relative 8% improvement. The proposed method thus steadily improves performance for unsupervised cross-domain speaker recognition.
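Illustrative note: the abstract does not say which statistics are aligned or how the loss is weighted. The sketch below assumes, for illustration only, that alignment means matching the means and covariances of source-domain and target-domain embedding batches (a CORAL-style discrepancy). It also shows how a relative equal error rate reduction is computed. The function names `alignment_loss` and `relative_reduction`, the scaling constant, and the sample EER values are hypothetical and are not taken from the paper.

```python
import numpy as np


def alignment_loss(source, target):
    """Discrepancy between first- and second-order statistics of two batches.

    ASSUMPTION: the abstract only states that statistical distributions are
    aligned; mean and covariance matching is one common choice, used here
    purely as an example. Inputs are (n_samples, n_dims) with n_dims >= 2.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    # Gap between the per-dimension means of the two domains.
    mean_gap = source.mean(axis=0) - target.mean(axis=0)
    # Gap between the feature covariance matrices of the two domains.
    cov_gap = np.cov(source, rowvar=False) - np.cov(target, rowvar=False)
    dim = source.shape[1]
    # Squared mean gap plus scaled squared Frobenius norm of the covariance gap.
    return float(mean_gap @ mean_gap + (cov_gap ** 2).sum() / (4 * dim ** 2))


def relative_reduction(baseline_eer, adapted_eer):
    """Relative EER reduction, e.g. 0.11 for the 11% quoted in the abstract."""
    return (baseline_eer - adapted_eer) / baseline_eer
```

In a training loop, a term like this would typically be added to the speaker classification loss so that embeddings from unlabeled target-domain speech are pulled toward the source-domain statistics. As a hypothetical reading of the reported figure, a baseline EER of 5.00% falling to 4.45% corresponds to `relative_reduction(5.0, 4.45)`, a relative reduction of about 11%.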