About me

I'm an ML engineer from Nepal working on audio and speech processing. I enjoy turning complex problems into solutions and finding new ways to tackle them.

I am also involved in research in audio and speech processing, with interests in automatic speech recognition, singing voice synthesis, speech enhancement, and audio classification. I am also working to document datasets for low-resource languages in Nepal for speech and NLP purposes. If you would like to collaborate on collecting this data, please feel free to connect. Find my CV.

Outside of work, I am interested in music, sports, and sports analytics.

What I'm doing

  • ASR

    ASR for Low-Resource Languages

    Building robust ASR systems for low-resource languages in Nepal.

  • Speech Enhancement

    Building different architectures for speech enhancement.

  • Audio Classification

    Detecting early mild cognitive impairment (MCI) from speech, and detecting audio manipulation such as splicing and pasting.

  • Speech Synthesis

    Building a robust TTS system for the Nepali language.

Past/Present Affiliations for Research

Resume

Education

  1. Vellore Institute of Technology, Vellore, India

    2020 — 2024

    • Received the full-ride COMPEX Scholarship from the Embassy of India to Nepal.
    • Graduated with a first-class degree with a CGPA of 9.09 out of 10.
    • Main courses: Data Structures and Algorithms, Neural Networks, Statistics, DBMS, Operating System, Cloud Computing, Cyber Security, Machine Learning, Deep Learning, Computer Vision, Natural Language

  2. Capital Secondary School

    2018 — 2020

    • Completed +2 under the National Examinations Board (Nepal) with a GPA of 3.64/4.

Experience

  1. Research Associate @ SH-RI, Kathmandu

    Aug 2024 — Present

    Working on ASR and TTS for low-resource languages in Nepal.

  2. Research Student @ A*STAR, Singapore

    Feb 2024 — July 2024

    • Awarded the Singapore International Pre-Graduate Award (SIPGA). Worked on automatic speech recognition, singing voice synthesis and conversion, speech enhancement, and audio classification using neural network architectures such as RNNs, GANs, diffusion models, transformers, and encoder-decoder models.
    • Worked with large-scale audio datasets, including an Indonesian corpus (18k hours), an ATC corpus, and a singing corpus.
    • Pretrained and fine-tuned models such as Wav2Vec2, Whisper, WavLM, and AST for ASR and classification, and Bark and VITS for singing generation and conversion.

  3. Research Student @ Samsung Research Institute, Bangalore

    Dec 2022 — Aug 2023

    • Selected to work on a Samsung IoT Edge project as part of the PRISM program.
    • Worked on the kernel of TizenRT, a real-time operating system.
    • Implemented a network file system client library on TizenRT and used it for low-storage embedded devices.

  4. Full Stack Developer Intern @ BitsKraft, Kathmandu

    May 2023 — July 2023

    • Developed a MERN-based Employee Data Management System and contributed to RESTful APIs for a Drone Delivery System.

Blog

Contact