Vijayaditya Peddinti




Contact Info:

vijay [dot] p [at] jhu [dot] edu





About me

I graduated from the PhD program of the Electrical and Computer Engineering department at Johns Hopkins University. I am currently a research scientist at Google Speech.

Previously, I worked in the Center for Language and Speech Processing on acoustic models for speech recognition, with Dan Povey and Sanjeev Khudanpur. I contribute to the acoustic modelling code in the Kaldi project.

Before that, I worked with Hynek Hermansky on distortion-invariant feature design for acoustic models. For my Master's (by research), I worked in the Speech and Vision Lab at IIIT-Hyd with Kishore Prahallad on efficient back-off strategies for quality speech synthesis.

Research Interests: Speech Recognition, Machine Learning


Academics


  • Johns Hopkins University, Maryland, US
     PhD in Electrical and Computer Engineering, 2011 - 2017
  • International Institute of Information Technology, Hyderabad, India
     Master of Science (by Research) in Computer Science, 2011
       Thesis: Synthesis of missing units in Telugu text-to-speech system
  • Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, India
     Bachelor of Technology in Information and Communication Technology, 2007

Publications


  

2016

Far-field ASR without parallel data
Vijayaditya Peddinti, Vimal Manohar, Yiming Wang, Daniel Povey and Sanjeev Khudanpur
Submitted to Interspeech, 2016


@inproceedings{peddinti2016ami,
author = {Peddinti, Vijayaditya and Manohar, Vimal and Wang, Yiming and Povey, Daniel and Khudanpur, Sanjeev},
title = {Far-field ASR without parallel data},
booktitle = {Submitted to Interspeech},
year = {2016}
}

Purely sequence-trained neural networks for ASR based on lattice-free MMI
Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahremani, Vimal Manohar, Yiming Wang, Xingyu Na and Sanjeev Khudanpur
Submitted to Interspeech, 2016


@inproceedings{povey2016,
author = {Povey, Daniel and Peddinti, Vijayaditya and Galvez, Daniel and Ghahremani, Pegah and Manohar, Vimal and Wang, Yiming and Na, Xingyu and Khudanpur, Sanjeev},
title = {Purely sequence-trained neural networks for ASR based on lattice-free MMI},
booktitle = {Submitted to Interspeech},
year = {2016}
}

2015

Winner of the IARPA ASpIRE challenge [press announcement]

Reverberation robust acoustic modeling using i-vectors with time delay neural networks
Vijayaditya Peddinti, Guoguo Chen, Daniel Povey and Sanjeev Khudanpur
Proceedings of Interspeech, 2015


@inproceedings{peddinti2015reverb,
author = {Peddinti, Vijayaditya and Chen, Guoguo and Povey, Daniel and Khudanpur, Sanjeev},
title = {Reverberation robust acoustic modeling using i-vectors with time delay neural networks},
booktitle = {Proceedings of Interspeech},
year = {2015}
}

Audio Augmentation for Speech Recognition
Tom Ko, Vijayaditya Peddinti, Daniel Povey and Sanjeev Khudanpur
Proceedings of Interspeech, 2015


@inproceedings{ko2015augmentation,
author = {Tom Ko and Peddinti, Vijayaditya and Povey, Daniel and Khudanpur, Sanjeev},
title = {Audio Augmentation for Speech Recognition},
booktitle = {Proceedings of Interspeech},
year = {2015}
}

Best paper award

A time delay neural network architecture for efficient modeling of long temporal contexts
Vijayaditya Peddinti, Daniel Povey and Sanjeev Khudanpur
Proceedings of Interspeech, 2015


@inproceedings{peddinti2015multisplice,
author = {Peddinti, Vijayaditya and Povey, Daniel and Khudanpur, Sanjeev},
title = {A time delay neural network architecture for efficient modeling of long temporal contexts},
booktitle = {Proceedings of Interspeech},
publisher = {ISCA},
year = {2015}
}


2014

Deep Scattering Spectrum with deep neural networks
Vijayaditya Peddinti, T. Sainath, S. Maymon, B. Ramabhadran, D. Nahamoo and Vaibhava Goel
Proceedings of ICASSP, 2014


@inproceedings{peddinti2014,
author = {Peddinti, Vijayaditya and T. Sainath and S. Maymon and B. Ramabhadran and D. Nahamoo and Goel, Vaibhava},
title = {Deep Scattering Spectrum with deep neural networks},
booktitle = {Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on},
pages = {210--214},
year = {2014}
}

Evaluating speech features with the Minimal-Pair ABX task (II): Resistance to noise
Thomas Schatz, Vijayaditya Peddinti, Yuan Cao, Francis Bach, Hynek Hermansky and Emmanuel Dupoux
Proceedings of Interspeech, 2014


@inproceedings{schatz-peddinti-cao-bach-hermansky-dupoux:is2014c,
author = {Thomas Schatz and Peddinti, Vijayaditya and Cao, Yuan and Francis Bach and Hermansky, Hynek and Emmanuel Dupoux},
title = {Evaluating speech features with the Minimal-Pair ABX task (II): Resistance to noise},
booktitle = {Proc. of INTERSPEECH},
year = {2014}
}

Deep Scattering Spectra with Deep Neural Networks for LVCSR Tasks
Tara N Sainath, Vijayaditya Peddinti, Brian Kingsbury, Petr Fousek, Bhuvana Ramabhadran and David Nahamoo
Proceedings of Interspeech, 2014


@inproceedings{sainath2014deep,
author = {Tara N Sainath and Peddinti, Vijayaditya and Brian Kingsbury and Petr Fousek and Bhuvana Ramabhadran and David Nahamoo},
title = {Deep Scattering Spectra with Deep Neural Networks for LVCSR Tasks},
booktitle = {Proceedings of Interspeech},
publisher = {ISCA},
year = {2014},
url = {http://ttic.uchicago.edu/~haotang/speech/IS140389.pdf}
}


2013

Evaluating speech features with the Minimal-Pair ABX task: Analysis of the classical MFC/PLP pipeline
Thomas Schatz, Vijayaditya Peddinti, Francis Bach, Aren Jansen, Hynek Hermansky and Emmanuel Dupoux
Proceedings of Interspeech, 2013


@inproceedings{schatz-peddinti-bach-jansen-hermansky-dupoux:is2013,
author = {Thomas Schatz and Peddinti, Vijayaditya and Francis Bach and Jansen, Aren and Hermansky, Hynek and Emmanuel Dupoux},
title = {Evaluating speech features with the Minimal-Pair ABX task: Analysis of the classical MFC/PLP pipeline},
booktitle = {Proc. INTERSPEECH},
year = {2013}
}

A Summary Of The 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition
Aren Jansen, Emmanuel Dupoux, Sharon Goldwater, Mark Johnson, Sanjeev Khudanpur, Kenneth Church, Naomi Feldman, Hynek Hermansky, Florian Metze, Richard Rose, Michael Seltzer, Pascal Clark, Ian Mcgraw, Balakrishnan Varadarajan, Erin Bennett, Benjamin Borschinger, Justin Chiu, Ewan Dunbar, Abdellah Fourtassi, David Harwath, Chia-Ying Lee, Keith Levin, Atta Norouzain, Vijayaditya Peddinti, Rachael Richardson, Thomas Schatz and Samuel Thomas
Proceedings of ICASSP, 2013


@inproceedings{jansen-dupoux-goldwater-johnson-khudanpur-church-feldman-hermansky-metze-rose-seltzer-clark-mcgraw-varadarajan-bennett-borschinger-chiu-dunbar-fourtassi-harwath-lee-levin-norouzain-peddinti-richardson-schatz-thomas:icassp2013,
author = {Jansen, Aren and Emmanuel Dupoux and Sharon Goldwater and Mark Johnson and Khudanpur, Sanjeev and Church, Kenneth and Naomi Feldman and Hermansky, Hynek and Florian Metze and Richard Rose and Michael Seltzer and Pascal Clark and Ian Mcgraw and Varadarajan, Balakrishnan and Erin Bennett and Benjamin Borschinger and Justin Chiu and Ewan Dunbar and Abdellah Fourtassi and David Harwath and Chia-Ying Lee and Levin, Keith and Atta Norouzain and Peddinti, Vijayaditya and Rachael Richardson and Thomas Schatz and Thomas, Samuel},
title = {A Summary Of The 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada},
year = {2013}
}

Mean Temporal Distance: Predicting ASR Error from Temporal Properties of Speech Signal
Hynek Hermansky, Ehsan Variani and Vijayaditya Peddinti
Proceedings of ICASSP, 2013


@inproceedings{hermansky-variani-peddinti:icassp2013,
author = {Hermansky, Hynek and Variani, Ehsan and Peddinti, Vijayaditya},
title = {Mean Temporal Distance: Predicting ASR Error from Temporal Properties of Speech Signal},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada},
year = {2013}
}

Filter-Bank Optimization for Frequency Domain Linear Prediction
Vijayaditya Peddinti and Hynek Hermansky
Proceedings of ICASSP, 2013


@inproceedings{peddinti2013filterbank,
author = {Peddinti, Vijayaditya and Hermansky, Hynek},
title = {Filter-Bank Optimization for Frequency Domain Linear Prediction},
booktitle = {Proceedings of ICASSP},
address = {Vancouver, Canada},
publisher = {IEEE},
pages = {7102--7106},
year = {2013}
}


2011

Significance of vowel epenthesis in Telugu text-to-speech synthesis
Vijayaditya Peddinti and K. Prahallad
Proceedings of ICASSP, 2011


@inproceedings{peddinti2011,
author = {Peddinti, Vijayaditya and K. Prahallad},
title = {Significance of vowel epenthesis in Telugu text-to-speech synthesis},
booktitle = {Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on},
pages = {5348--5351},
year = {2011}
}

Exploiting Phone-Class Specific Landmarks for Refinement of Segment Boundaries in TTS Databases
Vijayaditya Peddinti and Kishore Prahallad
Proceedings of Interspeech, 2011


@inproceedings{peddinti2011exploiting,
author = {Peddinti, Vijayaditya and Kishore Prahallad},
title = {Exploiting Phone-Class Specific Landmarks for Refinement of Segment Boundaries in TTS Databases},
booktitle = {Proceedings of Interspeech 2011},
year = {2011}
}



Experience



Projects
