Bimonthly Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
Abstract
Real-time voice translation is an AI-powered technology that translates speech from one language into another. It consists of three processes: speech recognition, machine translation, and speech synthesis. Speech recognition is a machine's ability to recognize spoken words and convert them into readable text. Machine translation converts text in one language into another language of the user's choice. Speech synthesis performs text-to-speech conversion, generating an automated replication of human speech. A voice translation application integrates these three processes to deliver the translated output to the user. This project aims to develop a real-time language translation system in the Python programming language. First, the audio is extracted from the video using Python's Moviepy library. The speech recognition module, which uses the Hidden Markov Model algorithm, converts the source audio to text. The source text is then translated into the target language using the GoogleTrans API. Google Text to Speech (GTTS) performs the speech synthesis. The generated target audio is then merged with the original video to produce the final output. Real-time language translation is a promising technology that can break down language barriers and improve communication between people from different language backgrounds.
Keywords
text-to-speech translator, Google Text to Speech (GTTS), GoogleTrans API, Python's Moviepy library.
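The five steps described in the abstract (audio extraction, speech recognition, translation, speech synthesis, and merging) can be sketched as a chain of interchangeable stages. This is a minimal illustration, not the authors' implementation: the names `TranslationStages`, `translate_video`, and `demo_stages` are hypothetical, and the stub stages only stand in for the places where Moviepy, an HMM-based recognizer, the GoogleTrans API, and GTTS would be called.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationStages:
    """Hypothetical container for the five pipeline stages named in the abstract."""
    extract_audio: Callable[[str], str]    # video path -> source audio path (Moviepy's role)
    recognize: Callable[[str], str]        # audio path -> source text (speech recognition)
    translate: Callable[[str, str], str]   # (text, target language) -> target text (GoogleTrans's role)
    synthesize: Callable[[str, str], str]  # (text, target language) -> target audio path (GTTS's role)
    merge: Callable[[str, str], str]       # (video path, audio path) -> output video path


def translate_video(video_path: str, target_lang: str, stages: TranslationStages) -> str:
    """Run the stages in the order the abstract describes and return the output path."""
    source_audio = stages.extract_audio(video_path)
    source_text = stages.recognize(source_audio)
    target_text = stages.translate(source_text, target_lang)
    target_audio = stages.synthesize(target_text, target_lang)
    return stages.merge(video_path, target_audio)


def demo_stages() -> TranslationStages:
    """Stub stages (assumptions for illustration only); no real audio or network access."""
    phrasebook = {("hello", "es"): "hola"}
    return TranslationStages(
        extract_audio=lambda video: video.rsplit(".", 1)[0] + ".wav",
        recognize=lambda audio: "hello",
        translate=lambda text, lang: phrasebook.get((text, lang), text),
        synthesize=lambda text, lang: f"{text}_{lang}.mp3",
        merge=lambda video, audio: f"{video}+{audio}",
    )


if __name__ == "__main__":
    print(translate_video("talk.mp4", "es", demo_stages()))
```

Keeping each stage behind a plain callable means a real library call can replace a stub one stage at a time, and the ordering logic can be checked without audio files or network access.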