Publication Date: 2022/03/07
Abstract: Conventional healthcare database management systems serve as data repositories and process structured data efficiently, but they become hard to use when the data is both highly varied and very large in volume. The question then arises of what to process, and how to process it in a distributed manner, when the data comes from various sources and may be structured or unstructured. Hadoop is an open-source framework based on distributed computing that can store and process Big Data, which may comprise structured, semi-structured, and unstructured data. In this paper, we summarize the basic operations performed on healthcare data in a data management lifecycle.
Keywords: Big Data, Data Analysis, Distributed Computing, ETL, Hadoop, Healthcare, MapReduce.
DOI: https://doi.org/10.5281/zenodo.6334712
PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT22FEB711.pdf