Bimonthly    Since 1986
ISSN 1004-9037
Publication Details
Edited by: Editorial Board of Journal of Data Acquisition and Processing
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by:
China: All Local Post Offices
 
   
      09 May 2023, Volume 38 Issue 3
    Article

    IMAGE CAPTION GENERATOR WITH VOICE USING LSTM AND CNN ALGORITHMS
    1Dr. Dattatray G. Takale, 2Dr. Dattatray S. Galhe, 3Dr. Parishit N. Mahalle, 4Dr. Chitrakant O. Banchhor, 5Prof. Piyush P. Gawali, 6Prof. Gopal Deshmukh, 7Dr. Vajid Khan, 8Prof. Madhuri Karnik
    Journal of Data Acquisition and Processing, 2023, 38(3): 1121-1132.

    Abstract

    Combining the VGG16 convolutional neural network (CNN) with long short-term memory (LSTM) networks has shown promise for generating image captions with voice output. In this study, we present a system that uses this combination to produce both a caption and a spoken description for an image. The VGG16 CNN extracts high-level features from each input image, giving a rich representation of its content. The LSTM network then takes these features and uses its memory of sequential context to generate a descriptive caption. The system is trained and evaluated on the well-known Flickr8k dataset, a collection of photographs paired with human-written captions. By combining the strengths of the CNN and the LSTM, our method generates accurate and contextually appropriate captions and audio descriptions for a variety of images. The experimental results demonstrate the value of the proposed approach and open the way to further work on image captioning and on accessibility for people with visual impairments.
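
The caption-generation step described above can be illustrated with a minimal sketch. The abstract does not state how words are chosen during decoding, so the greedy strategy here is an assumption, as are the names `greedy_caption` and `toy_step`, the `startseq`/`endseq` boundary tokens, and the 4096-element feature vector (the size of VGG16's fully connected layers, which may not be the layer the authors used). `toy_step` is a scripted stand-in for a trained CNN+LSTM decoder, not a real model, and the voice step (passing the finished caption to a text-to-speech engine) is left out.

```python
from typing import Callable, Dict, List, Sequence

START, END = "startseq", "endseq"  # hypothetical caption boundary tokens

# Hypothetical decoder step: (image features, words so far) -> score per candidate word.
StepFn = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(features: Sequence[float], step: StepFn, max_len: int = 20) -> str:
    """Build a caption by repeatedly appending the highest-scoring next word."""
    words = [START]
    for _ in range(max_len):
        scores = step(features, words)
        best = max(scores, key=scores.get)
        if best == END:  # the decoder signals that the caption is complete
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_step(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for a trained decoder: ignores the features and follows a fixed script."""
    script = ["a", "dog", "runs", END]
    position = len(words) - 1  # number of words generated so far
    target = script[min(position, len(script) - 1)]
    return {w: (1.0 if w == target else 0.0) for w in script}


# Placeholder feature vector standing in for the CNN output of one image.
caption = greedy_caption([0.0] * 4096, toy_step)
print(caption)
```

In a real system, `toy_step` would be replaced by a call to the trained LSTM decoder, and the returned string would be handed to whichever speech synthesizer the system uses.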

    Keywords

    Image caption generation, voice synthesis, VGG16, Convolutional Neural Network, LSTM, Long Short-Term Memory, Flickr8k dataset


Journal of Data Acquisition and Processing
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: info@sjcjycl.cn
 