Journal of Data Acquisition and Processing

07 April 2023, Volume 38 Issue 2

Article

RajBharath, V. Mahalakshmi, S.R.Rajeshwari, S. Hemamalini

Journal of Data Acquisition and Processing, 2023, 38 (2): 4298-4313.

Abstract

Real-time voice translation is an AI-powered technology that translates speech from one language into another. It consists of three processes: speech recognition, machine translation, and speech synthesis. Speech recognition is a machine's ability to recognize words spoken aloud and convert them into readable text. Machine translation converts text in one language into another language of the user's choice. Speech synthesis acts as a text-to-speech converter, generating an artificial reproduction of human speech. A voice translation application integrates these three processes to deliver translated speech to the user. This project aims to develop a real-time language translation system in the Python programming language. First, the audio is extracted from the video using Python's Moviepy library. The speech recognition module, which uses the Hidden Markov Model algorithm, then converts the source audio to text. The source text is translated into the target language using the GoogleTrans API, and Google Text to Speech (GTTS) performs the speech synthesis. The generated target audio is then merged with the original video to produce the final output. Real-time language translation is a promising technology that can break down language barriers and improve communication between people from different language backgrounds.
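
The five steps described in the abstract (extract audio, recognize speech, translate text, synthesize speech, merge audio with video) can be sketched as a chain of pluggable stages. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Stages`, `translate_video`, and `demo_stages` are hypothetical, and the stand-in stages only mimic the data flow so the sketch runs without any third-party package or network access. In a real system each hook would presumably wrap the corresponding library named in the abstract.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stages:
    """Pluggable pipeline stages (hypothetical hooks, not an API from the paper)."""
    extract_audio: Callable[[str], str]        # video path -> source audio path
    recognize: Callable[[str, str], str]       # (audio path, source lang) -> source text
    translate: Callable[[str, str, str], str]  # (text, source lang, target lang) -> target text
    synthesize: Callable[[str, str], str]      # (text, target lang) -> target audio path
    merge: Callable[[str, str], str]           # (video path, audio path) -> output video path


def translate_video(video_path: str, src_lang: str, dest_lang: str, stages: Stages) -> str:
    """Run the stages in the order the abstract describes and return the output path."""
    audio_path = stages.extract_audio(video_path)                     # step 1: audio from video
    source_text = stages.recognize(audio_path, src_lang)              # step 2: speech recognition
    target_text = stages.translate(source_text, src_lang, dest_lang)  # step 3: machine translation
    dubbed_audio = stages.synthesize(target_text, dest_lang)          # step 4: speech synthesis
    return stages.merge(video_path, dubbed_audio)                     # step 5: merge with video


def demo_stages() -> Stages:
    """Stand-in stages that fake each step with plain strings (assumption: demo only)."""
    phrasebook = {("en", "fr", "hello"): "bonjour"}
    return Stages(
        extract_audio=lambda video: video.rsplit(".", 1)[0] + ".wav",
        recognize=lambda audio, lang: "hello",
        # Unknown phrases fall through untranslated in this toy phrasebook.
        translate=lambda text, src, dest: phrasebook.get((src, dest, text), text),
        synthesize=lambda text, lang: f"{text}.{lang}.mp3",
        merge=lambda video, audio: f"{video}+{audio}",
    )


if __name__ == "__main__":
    print(translate_video("talk.mp4", "en", "fr", demo_stages()))
```

Keeping each stage behind a plain callable means that a recognizer, translator, or synthesizer can be swapped or tested in isolation without touching the orchestration logic.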

Keywords

text-to-speech translator, Google Text to Speech (GTTS), GoogleTrans API, Python's Moviepy library.
