Publication Date: 2023/01/19
Abstract: This text-to-image converter examines the conversion of data between modalities (text, image), motivated by the evolution of human-machine communication, which has introduced modalities that are natural to humans, such as gestures, speech, sound, and vision. One of the main challenges of this "multimodal" learning is learning a shared representation between the distinct modalities and predicting missing data (by retrieval or synthesis) in one modality conditioned on another. Researchers have worked on various kinds of conversion, such as text-to-speech, speech-to-image, and text-to-image synthesis, and vice versa; in this paper we focus on image-to-audio and image-to-text synthesis.
DOI: https://doi.org/10.5281/zenodo.7551296
PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT22DEC1083_(2).pdf
REFERENCES