Publication Date: 2022/03/30
Abstract: When protecting an organization's information, professionals treat the confidentiality and sensitivity of the data as a major concern. In practice this is a manual process in which data owners and other professionals classify data according to their own ideas, decisions, and expectations. Because the classification depends on human judgment, it can expose sensitive data to users who are not authorized to access or alter it. This research aims to reduce human involvement in data classification decisions by dividing documents into clusters according to their level of confidentiality. The system divides documents into three major categories (confidential, sensitive, and public) using the unsupervised self-organizing map method, an artificial neural network originally designed for clustering high-dimensional data samples onto a low-dimensional map.
Keywords: Information Technology, Intellectual Property, Self-Organizing Map, Information Retrieval, Statistical Natural Language Processing
DOI: https://doi.org/10.5281/zenodo.6395394
PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT22MAR205_(1).pdf
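Illustration: the abstract describes clustering documents into three groups with a self-organizing map. The following is a minimal sketch of that general technique, not the authors' implementation. It assumes each document has already been turned into a small numeric feature vector; the toy vectors, the one-dimensional three-node map, and every function name and parameter here are hypothetical choices made for illustration. Because the method is unsupervised, deciding which node corresponds to "confidential", "sensitive", or "public" would still require inspecting the resulting clusters.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )


def train_som(samples, n_nodes=3, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D self-organizing map and return its node weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialise each node from a randomly chosen sample (an assumption;
    # other initialisation schemes are common).
    weights = [list(rng.choice(samples)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.01)   # neighbourhood radius shrinks
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for j in range(n_nodes):
                # Gaussian neighbourhood: nodes near the BMU move more.
                h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    weights[j][d] += lr * h * (x[d] - weights[j][d])
    return weights


# Hypothetical feature vectors, e.g. normalised counts of three term groups.
samples = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.1],   # documents dominated by term group 1
    [0.1, 0.9, 0.1], [0.2, 0.8, 0.0],   # documents dominated by term group 2
    [0.0, 0.1, 0.9], [0.1, 0.0, 0.8],   # documents dominated by term group 3
]
weights = train_som(samples)
labels = [best_matching_unit(weights, x) for x in samples]
print(labels)  # one node index (0, 1, or 2) per document
```

Each document is assigned to the map node nearest to its feature vector, so documents with similar vectors tend to share a node. A real system would need a feature extraction step and probably a larger map; neither is specified in the abstract.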