Collating SQL Databases, No-SQL Databases and Machine Learning Algorithms for Data Analysis

Dylan Coelho; Cliff Machado; Leon Correia; Shree Jaswal; Neil Fernando

Publication Date: 2022/05/19

Abstract: Big Data tools and machine learning algorithms are frequently applied to data analytics and prediction. This paper evaluates and illustrates the differences between SQL and NoSQL databases for the storage and processing of Big Data, and compares various algorithms used for analysis and prediction. It presents our basic understanding of Hadoop and Spark and compares the two platforms on parameters such as the time taken to input data, the time taken to output data, and the total memory used by the databases. The system implements the databases in Hadoop and Spark. In Hadoop, the Hive database is used for the SQL part and Cassandra for the NoSQL part. In Spark, the SQL part is implemented using PostgreSQL and the NoSQL part using MongoDB. The results of comparing these parameters (input time, output time and total memory used) are represented graphically, so that a user is in a position to choose the appropriate database according to their requirements. Additionally, we study and compare various machine learning algorithms by implementing them on the selected dataset. To compare the algorithms, we consider Accuracy, Root Mean Square Error and Mean Absolute Value. Choosing the right machine learning algorithm can be difficult, but doing so is essential to answering the given question with speed and accuracy. For the user to obtain the required insights, the algorithms must be carefully analysed against parameters like these. The final research results are illustrated with graphs on a UI, which helps to better understand the results obtained on the dataset selected for this paper.
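
The three comparison measures named in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, and "Mean Absolute Value" is assumed here to mean the mean absolute error between predicted and true values.

```python
import math


def _check(y_true, y_pred):
    # Both sequences must be non-empty and of equal length.
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    _check(y_true, y_pred)
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared difference."""
    _check(y_true, y_pred)
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mean_absolute_error(y_true, y_pred):
    """Mean of absolute differences (assumed reading of 'Mean Absolute Value')."""
    _check(y_true, y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Made-up toy values, purely for illustration.
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
    print(root_mean_square_error([3.0, 5.0], [1.0, 5.0]))
    print(mean_absolute_error([3.0, 5.0], [1.0, 5.0]))
```

Accuracy applies to classification outputs, while the two error measures apply to numeric predictions; lower error and higher accuracy indicate a better fit on the chosen dataset.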

Keywords: Hadoop, NoSQL, Spark, SQL.

DOI: https://doi.org/10.5281/zenodo.6562532

PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT22APR1087.pdf
