Publication Date: 2023/09/18
Abstract: Text summarization is an area of natural language processing (NLP) concerned with producing brief, condensed summaries of longer passages of text. The exponential growth of digital content has produced a vast quantity of textual information, making it difficult for individuals to keep up. Although earlier work on text summarization has achieved significant results, a gap remains in addressing the specific requirements of summarizing general textual content. This project aims to build a system that generates concise summaries using natural language processing and machine learning techniques, bridging the gap between lengthy texts and condensed summaries. The primary objective is an efficient and effective summarization model that also supports speech synthesis: the gTTS library converts the generated summaries into speech. Customization options let users define summary attributes such as length and style, producing personalized, tailored output. The project integrates web scraping, frequency-based text summarization, and a user-friendly Flask interface. Users enter a URL in the Flask interface; the system then extracts the essential text from the page by web scraping, generates a concise summary using frequency-based scoring, and estimates the reading time. The project finds applications in content understanding, gTTS-enabled accessibility, and efficient information management.
In education, it supports quick comprehension of complex subjects, aided by the estimated reading time. By combining technology with user-centric design, it supports learning, research, and content assimilation across domains, serving academics, professionals, and individual readers. The integrated approach of web scraping, frequency-based text summarization, and a Flask interface yields efficient content extraction, concise summaries, and estimated reading times. Quantitative analysis compares the quality, coherence, and accuracy of the generated summaries with existing literature.
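The frequency-based scoring and reading-time estimate described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the small stop-word list, the `max_sentences` parameter (standing in for the user-defined summary length), and the 200 words-per-minute reading rate are all assumptions made for illustration. Web scraping, the Flask interface, and gTTS speech output are omitted so that the example runs with the Python standard library alone.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption); a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that", "for", "on"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order.

    Each sentence is scored by summing the normalized frequencies of its
    non-stop-words. Scores are not length-normalized, so longer sentences
    tend to score higher.
    """
    sentences = split_sentences(text)
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    freq = Counter(words)
    if not freq:
        return ""
    top = max(freq.values())

    scored = []
    for index, sentence in enumerate(sentences):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        score = sum(freq[t] / top for t in tokens if t in freq)
        scored.append((score, index))

    # Pick the best sentences (ties broken by position), then restore document order.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]
    chosen = sorted(index for _, index in best)
    return " ".join(sentences[i] for i in chosen)


def estimate_reading_time(text, words_per_minute=200):
    """Estimated reading time in minutes, assuming a fixed reading rate."""
    return len(text.split()) / words_per_minute


if __name__ == "__main__":
    sample = "Cats sleep a lot. Cats chase mice and cats climb trees. Dogs bark."
    print(summarize(sample, max_sentences=1))
    print(estimate_reading_time(sample))
```

In a full pipeline of the kind the abstract describes, the `text` argument would come from the scraped page content, and the returned summary could be passed on to a text-to-speech step.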
Keywords: NLP, gTTS library, Flask, TextRank algorithm, URLs
DOI: https://doi.org/10.5281/zenodo.8355268
PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT23SEP011.pdf