A Learning Based Approach for Automatic Text Document Classification

Ravi Prasad Ravuri

Publication Date: 2023/07/14

Abstract: Text documents on the Internet, on social media, and in the internal applications of organizations such as the judiciary are increasing exponentially. Manually inspecting such documents and classifying them for further processing is a tedious task, so automatic text document classification is needed. Traditional heuristic-based approaches do not scale to the volume of input documents. Machine learning (ML) techniques address this problem because they learn from training data to perform classification and can handle large corpora. However, when existing ML models are used directly, their performance deteriorates because of poor training quality. In this paper we propose a framework that takes a hybrid approach, combining feature selection with ML models to improve prediction performance. The framework is named Learning based Text Document Classification Framework (LbTDCF). We also propose an algorithm, the Intelligent Document Classification Algorithm (IDCA), to realize the framework.
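The abstract does not spell out how IDCA works, so the following is only a minimal sketch of the general idea it describes: feature selection followed by a supervised classifier. It assumes chi-square term scoring and a multinomial naive Bayes classifier as stand-ins; these choices, the toy documents, and every function name (tokenize, select_features, train_nb, predict) are hypothetical illustrations, not the paper's method.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for the toy data.
    return text.lower().split()

def select_features(docs, labels, k):
    """Keep the k terms with the highest chi-square score over all classes."""
    n = len(docs)
    class_count = Counter(labels)
    term_class = defaultdict(Counter)  # term -> class -> docs containing term
    for text, label in zip(docs, labels):
        for term in set(tokenize(text)):
            term_class[term][label] += 1
    scores = {}
    for term, per_class in term_class.items():
        df = sum(per_class.values())
        best = 0.0
        for cls, n_cls in class_count.items():
            a = per_class[cls]        # in class, contains term
            b = df - a                # other classes, contains term
            c = n_cls - a             # in class, lacks term
            d = n - df - c            # other classes, lacks term
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])

def train_nb(docs, labels, vocab):
    """Count class priors and per-class term frequencies over selected terms."""
    prior = Counter(labels)
    counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        counts[label].update(t for t in tokenize(text) if t in vocab)
    return prior, counts

def predict(text, model, vocab):
    """Return the class with the highest Laplace-smoothed log posterior."""
    prior, counts = model
    total_docs = sum(prior.values())
    terms = [t for t in tokenize(text) if t in vocab]
    best_label, best_score = None, -math.inf
    for label in sorted(prior):
        total = sum(counts[label].values())
        score = math.log(prior[label] / total_docs)
        for t in terms:
            score += math.log((counts[label][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy corpus (invented for illustration only).
docs = [
    "the court granted the appeal",
    "the judge dismissed the appeal",
    "the team won the match",
    "the striker scored in the match",
]
labels = ["legal", "legal", "sports", "sports"]

vocab = select_features(docs, labels, k=2)
model = train_nb(docs, labels, vocab)
print(sorted(vocab))
print(predict("the appeal was heard by the court", model, vocab))
```

In this sketch the selection step discards terms such as "the" that occur equally in every class, so the classifier is trained only on terms that discriminate between classes. That is one plausible reading of how feature selection could improve training quality, not a claim about the paper's results.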

Keywords: Machine Learning, Text Document Classification, Supervised Learning, Intelligent Document Classification

DOI: https://doi.org/10.5281/zenodo.8146672

PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT23JUN1309.pdf
