Yan Zhao (赵岩)
    
    About me
    
		I received my Ph.D. in Computer Science from The Ohio State University, where I was advised by Prof. DeLiang Wang in the Perception and Neurodynamics Laboratory (PNL). I now work as a Research Scientist at ByteDance.
	
    
		My research interests include speech enhancement and separation, audio processing, and multimodal LLMs.
	
	
	
	
    Contact
    
	
    Experience
    
- Jul. 2020 - present: Research Scientist at Speech Team, ByteDance (San Jose, CA)
- May 2019 - Aug. 2019: Applied Scientist Intern at AWS, Amazon (East Palo Alto, CA)
- May 2018 - Aug. 2018: Research Intern at Machine Intelligence Technology, DAMO Academy, Alibaba Group (Bellevue, WA)
- May 2017 - Aug. 2017: Research Intern at Signal Processing Research Department, Starkey Hearing Technologies (Eden Prairie, MN)
        
Teaching
    
- Instructor: CSE 1223: Introduction to Computer Programming in Java, Fall 2018, OSU
      
Dissertation
    
    Publications
	Full List: Google Scholar
    
- Liu Y., Liu X., Zhao Y., Wang Y., Xia R., Tian P., and Wang Y. (2024): Audio prompt tuning for universal sound separation. Proceedings of ICASSP-24, pp. 1446-1450.
- Liu X., Kong Q., Zhao Y., Liu H., Yuan Y., Liu Y., Xia R., Wang Y., Plumbley M., and Wang W. (2023): Separate anything you describe. arXiv preprint arXiv:2308.05037.
- Tam K., Li L., Zhao Y., and Xu C. (2023): FedCoop: Cooperative federated learning for noisy labels. Proceedings of ECAI-23, pp. 2298-2306.
- Shu X., Chen Y., Shang C., Zhao Y., Zhao C., Zhu Y., Huang C., and Wang Y. (2022): Non-intrusive speech quality assessment with a multi-task learning based subband adaptive attention temporal convolutional neural network. Proceedings of INTERSPEECH-22, pp. 3298-3302.
- Liu H., Liu X., Kong Q., Tian Q., Zhao Y., Wang D.L., Huang C., and Wang Y. (2022): VoiceFixer: A unified framework for high-fidelity speech restoration. Proceedings of INTERSPEECH-22, pp. 4232-4236.
- Liu H., Kong Q., Tian Q., Zhao Y., Wang D.L., Huang C., and Wang Y. (2021): VoiceFixer: Toward general speech restoration with neural vocoder. arXiv preprint arXiv:2109.13731.
- Zhao Y., and Wang D.L. (2020): Noisy-reverberant speech enhancement using DenseUNet with time-frequency attention. Proceedings of INTERSPEECH-20, pp. 3261-3265.
- Zhao Y., Wang D.L., Xu B., and Zhang T. (2020): Monaural speech dereverberation using temporal convolutional networks with self attention. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 1598-1607.
- Zhao Y., Wang Z.-Q., and Wang D.L. (2019): Two-stage deep learning for noisy-reverberant speech enhancement. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, pp. 53-62.
- Zhao Y., Wang D.L., Johnson E.M., and Healy E.W. (2018): A deep learning based segregation algorithm to increase speech intelligibility for hearing-impaired listeners in reverberant-noisy conditions. Journal of the Acoustical Society of America, vol. 144, pp. 1627-1637.
- Zhao Y., Wang D.L., Xu B., and Zhang T. (2018): Late reverberation suppression using recurrent neural networks with long short-term memory. Proceedings of ICASSP-18, pp. 5434-5438.
- Zhao Y., Xu B., Giri R., and Zhang T. (2018): Perceptually guided speech enhancement using deep neural networks. Proceedings of ICASSP-18, pp. 5074-5078.
- Zhao Y., Wang Z.-Q., and Wang D.L. (2017): A two-stage algorithm for noisy and reverberant speech enhancement. Proceedings of ICASSP-17, pp. 5580-5584.
- Zhao Y., Wang D.L., Merks I., and Zhang T. (2016): DNN-based enhancement of noisy and reverberant speech. Proceedings of ICASSP-16, pp. 6525-6529.
- Wang Z.-Q., Zhao Y., and Wang D.L. (2016): Phoneme-specific speech separation. Proceedings of ICASSP-16, pp. 146-150.
 
Service
      	
      	Journal/Conference Reviewer:
	
         
- IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
- IEEE Signal Processing Letters (SPL)
- IEEE Transactions on Multimedia (TMM)
- JASA Express Letters (JASA-EL)
- Journal of Speech, Language, and Hearing Research (JSLHR)
- Computer Speech and Language
- Digital Signal Processing
- ICASSP/INTERSPEECH/AAAI