Research on consistency and performance of NoSQL replicated databases
Abstract
This paper evaluates the performance of distributed fault-tolerant computer systems and replicated NoSQL databases, and studies the impact of data consistency on latency and throughput using a three-replica Cassandra cluster as an example. The paper presents results of heavy-load testing (benchmarking) of the read and write performance of a Cassandra cluster whose replicas were deployed in the Amazon EC2 cloud. The quantitative results show how different consistency settings affect the performance of a Cassandra cluster under different workloads in two deployment scenarios: when all cluster replicas are located in the same data center, and when they are geographically distributed across different data centers (i.e. Amazon availability zones). We propose a new method of minimizing Cassandra response time while ensuring strong data consistency. The method is based on optimizing the consistency settings according to the current workload and the proportion of read and write operations.
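The idea of tuning consistency settings to the read/write mix can be illustrated with a minimal sketch. It is not the authors' method as published; it only assumes the well-known rule that a replicated store is strongly consistent when the number of replicas acknowledging a read (R) plus the number acknowledging a write (W) exceeds the replication factor (N), and it picks the pair of Cassandra consistency levels with the lowest expected latency for a given share of reads. The function names `strong_pairs` and `best_pair` are hypothetical, and the latency figures are invented for illustration rather than taken from the paper's measurements.

```python
from itertools import product

# Replicas that must acknowledge a request at each Cassandra
# consistency level in a three-replica cluster (QUORUM = 3 // 2 + 1).
REPLICATION_FACTOR = 3
ACKS = {"ONE": 1, "QUORUM": 2, "ALL": 3}


def strong_pairs(rf=REPLICATION_FACTOR):
    """Return (read_level, write_level) pairs that satisfy R + W > RF."""
    return [
        (r, w)
        for r, w in product(ACKS, repeat=2)
        if ACKS[r] + ACKS[w] > rf
    ]


def best_pair(read_fraction, read_latency_ms, write_latency_ms):
    """Pick the strongly consistent pair with the lowest mean latency.

    read_fraction: share of reads in the workload, between 0 and 1.
    read_latency_ms, write_latency_ms: mean latency per consistency
    level (hypothetical inputs that would come from benchmarking).
    """
    def expected(pair):
        r, w = pair
        return (read_fraction * read_latency_ms[r]
                + (1 - read_fraction) * write_latency_ms[w])

    return min(strong_pairs(), key=expected)


# Made-up latencies for illustration only, not results from the paper.
READ_MS = {"ONE": 2.0, "QUORUM": 3.5, "ALL": 6.0}
WRITE_MS = {"ONE": 1.5, "QUORUM": 2.5, "ALL": 5.0}

if __name__ == "__main__":
    for share in (0.1, 0.5, 0.9):
        print(share, best_pair(share, READ_MS, WRITE_MS))
```

With these invented numbers, a read-heavy workload favours cheap reads and expensive writes, a write-heavy workload favours the opposite, and a balanced workload favours QUORUM on both sides. In practice the latency tables would be replaced by measurements from the cluster under test.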