Nabarun Goswami

PhD Student in Machine Intelligence

University of Tokyo

Biography

I am a PhD student at the Machine Intelligence Laboratory, The University of Tokyo, under the supervision of Prof. Tatsuya Harada. My research focuses on learning unsupervised representations for speech synthesis.

Download my resumé.

Interests
  • Artificial Intelligence
  • Speech Synthesis
  • Audio Source Separation
Education
  • PhD in Advanced Interdisciplinary Studies, 2021 – 2024

    University of Tokyo, Japan

  • Master of Information Science and Technology, 2019 – 2021

    University of Tokyo, Japan

  • Bachelor of Technology in Electronics and Communication Engineering, 2008 – 2012

    Tezpur University, India

Skills

Python
Deep Learning
PyTorch

Experience

NABLAS
Jr. Research Engineer (part time)
Oct 2021 – Present Tokyo

Responsibilities include:

  • Research on deepfake technology

Tokyo Coding Club
Programming and Robotics Instructor (part time)
Feb 2020 – Oct 2021 Tokyo
Taught basic robotics and programming to middle and high school students

Sony India Software Centre
Technical Specialist
Oct 2016 – Feb 2019 Bangalore, Tokyo

Responsibilities include:

  • Research on Audio Source Separation in collaboration with Sony R&D Center, Japan
  • Led the newly formed Machine Learning Technology Group

Sony India Software Centre
Senior Software Engineer
Aug 2012 – Sep 2016 Bangalore

Responsibilities include:

  • Software development for various products (PlayStation 4, Bravia TV, Xperia, etc.)
  • Prototyping of new ideas using Sony products
  • Development of in-house test automation tools

Recent Publications

(2022). SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate. Proc. Interspeech 2022.

(2020). System and method for sharing multimedia content with synched playback controls. Google Patents.

(2020). System and method for processing video content based on emotional state detection. Google Patents.

(2019). Device and method for generating a panoramic image. Google Patents.

(2019). Recursive Speech Separation for Unknown Number of Speakers. Proc. Interspeech 2019.

(2018). MMDenseLSTM: An Efficient Combination of Convolutional and Recurrent Neural Networks for Audio Source Separation. 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC).

(2018). PhaseNet: Discretized Phase Modeling with Deep Neural Networks for Audio Source Separation. Proc. Interspeech 2018.

(2017). DenseNet with pre-activated deconvolution for estimating depth map from single image. AMMDS 2017, Workshop on Activity Monitoring by Multiple Distributed Sensing.

(2012). Video Noise Reduction based on Statistical Modelling of Wavelet Coefficients. Bachelor Thesis, Tezpur University, Assam, India.

Contact