Number of Nodes Added to Computing

02 Nov 2017

Abstract:

This paper discusses how the modeling of object dependencies varies between NOSQL and SQL databases built on open-source technologies, and how those technologies work. The modeling of object dependencies is assessed in terms of performance and scalability. Performance refers to high output (throughput); scalability refers to how well the quality of output is maintained as more resources are used. Big data is now a key concern for database systems, and the two kinds of database differ considerably in performance and scalability. Many open-source technologies are available for both. NOSQL technologies include Cassandra, HBase, MongoDB, Accumulo, HPP, Amazon SimpleDB, Hypertable, Stratosphere, etc. SQL technologies include VoltDB, NuoDB, Sensei DB, GenieDB, ScalArc, ScaleDB, SQL, MySQL Cluster, etc. Performance and scalability are measured in terms of CPU workload, data transactions, and read and write times. Each open-source technology performs differently, but each aims at high performance and scalability. Open-source software makes its source code available, and anyone can use it without paying a fee. The main purpose of this paper is to discuss SQL and NOSQL design patterns and how they achieve good performance, scalability, CPU workload and memory use, and to investigate how performance varies between the two systems when data is inserted.

Introduction

NOSQL and SQL databases are both used on different platforms to store large amounts of data.

SQL (Structured Query Language) is a language used for database manipulation. It covers creating, inserting and deleting data, along with many other operations for organizing stored data. It comprises a data definition language (DDL) and a data manipulation language (DML).
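The split between data definition and data manipulation can be illustrated with a short sketch. It uses Python's built-in sqlite3 module purely as a convenient stand-in for any SQL database; the table and column names are hypothetical and are not taken from the systems discussed in this paper.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for any SQL system.
# The "customer" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Data definition language (DDL): describe the structure of the data.
cur.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Data manipulation language (DML): insert, update, delete and query rows.
cur.executemany(
    "INSERT INTO customer (id, name) VALUES (?, ?)",
    [(1, "Asha"), (2, "Ben"), (3, "Chen")],
)
cur.execute("UPDATE customer SET name = ? WHERE id = ?", ("Benjamin", 2))
cur.execute("DELETE FROM customer WHERE id = ?", (3,))
conn.commit()

rows = cur.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
print(rows)
```

The CREATE TABLE statement is the DDL part; the INSERT, UPDATE, DELETE and SELECT statements are the DML part.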

NOSQL has no fixed structure, and "NOSQL involves and the way it converts large data into small data and stored in several servers which implies towards good performance and scalability. Notes that NOSQL applications are still relatively few and small, but they are becoming more widely used in enterprise, government telecommunications companies (Telco’s), and banks" (Leonard J 2012, pp. 20-23).

Applications mainly access a database management system (DBMS) through ODBC or JDBC.

1. ODBC (Open Database Connectivity): a standard C language middleware API for accessing databases. An application written using ODBC can also be ported to other platforms.

2. JDBC (Java Database Connectivity): a standard Java language middleware API for accessing relational databases.

NOSQL refers to non-relational database management systems, which come in several design structure types:

Key-value stores

Document databases

Object store

Graph style
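As a rough illustration of how these structure types differ, the sketch below models three of them with plain Python dictionaries (object stores are left out for brevity). All keys, field names and values are hypothetical, and no real NOSQL product is involved.

```python
from collections import deque

# Key-value store: an opaque value looked up by a unique key.
key_value = {"session:42": b"\x01\x02\x03"}

# Document database: each value is a self-describing, nested document.
documents = {
    "user:1": {
        "name": "Asha",
        "emails": ["asha@example.com"],
        "address": {"city": "Leeds"},
    },
}

# Graph style: nodes plus explicit edges between them.
graph = {
    "asha": {"follows": ["ben"]},
    "ben": {"follows": ["asha", "chen"]},
    "chen": {"follows": []},
}


def reachable(graph, start):
    """Return every node reachable from start by following edges (BFS)."""
    seen = {start}
    todo = deque([start])
    while todo:
        node = todo.popleft()
        for neighbour in graph[node]["follows"]:
            if neighbour not in seen:
                seen.add(neighbour)
                todo.append(neighbour)
    return seen


print(documents["user:1"]["address"]["city"])
print(sorted(reachable(graph, "asha")))
```

The key-value store only supports lookup by key, the document can be queried by its inner fields, and the graph is queried by traversing relationships.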

Literature review:

SQL (Structured Query Language) is used for querying and editing the information stored in a database.

SQL database systems mostly rely on the ACID properties:

A stands for Atomicity: a transaction is indivisible and irreducible; either all of its changes are applied or none of them are.

C stands for Consistency: every transaction takes the database from one valid state to another, so it continues to give correct results.

I stands for Isolation: it governs when and how the changes made by one transaction become visible to other transactions.

D stands for Durability: committed transactions are stored permanently in the database, so they remain intact even if the system crashes.
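Atomicity and consistency can be seen in a minimal sketch of a money transfer. It again uses Python's sqlite3 module as a stand-in for a full SQL server; the account table, the balances and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the consistency rule: no negative balances.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the failing transfer the credit to the destination account has already been executed when the debit violates the constraint, and the rollback undoes it, which is the all-or-nothing behaviour that atomicity describes.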

"MySQL Cluster shards data over multiple database servers "shared nothing" architecture. Every shard is replicated, to support recovery. Bi-directional geographic replication is also supported" (Rick Cattell, 2010).

VoltDB: a new open-source relational database system that gives high throughput.

Minhas, Umar Farooq et al. (2012) describe VoltDB in their paper as follows: "Voltdb has been designed to provide very high throughput and fault tolerance for transactional workloads". VoltDB makes the following design choices:

1. All data is stored in main memory, which avoids slow disk operations.

2. Transactions are run on the server side as stored procedures.

3. Transactions are executed serially at each database partition; there are no concurrent transactions within a single partition.

4. This type of partitioning provides durability and fault tolerance.

VoltDB supports two types of transactions:

1. Single partition transactions: each transaction touches only one partition of the database, hence it is very fast.

2. Multi partition transactions: each transaction takes data from more than one partition, so transaction speed is lower.
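The difference between the two transaction types can be sketched as follows. This is not VoltDB's API: the partition count, the CRC32 hash and the function names are assumptions made only to show how the set of keys a transaction touches decides which type it is.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical number of partitions


def partition_of(key: str) -> int:
    """Map a partitioning key to a partition with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def classify(keys):
    """A transaction is single-partition if all its keys land on one partition."""
    partitions = {partition_of(k) for k in keys}
    return "single-partition" if len(partitions) == 1 else "multi-partition"


# A transaction that only touches one customer's rows stays on one partition.
print(classify(["customer-17", "customer-17"]))

# A transaction over many customers will usually span several partitions.
print(classify([f"customer-{i}" for i in range(20)]))
```

A single-partition transaction can run on one partition without coordinating with the others, whereas a multi-partition transaction has to involve every partition that holds one of its keys.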

With a k-safety value of k, VoltDB keeps k+1 copies of each partition, so that up to k node failures cause no loss of service. Performance in VoltDB can be increased in two ways:

1. Increase the size of the cluster.

2. Database information can be transferred between the nodes.

2.0 NOSQL data storage techniques

Oracle (www.oracle.com) describes its NOSQL database as follows: "The recent launch of Oracle NOSQL Database has further spurred interest and excitement. Oracle NOSQL Database is a horizontally scalable key-value database. Built by the acclaimed Berkeley DB team, it features excellent performance, tunable consistency, integration with Hadoop, with a simple but powerful client API".

NOSQL mainly builds on the CAP theorem and the BASE properties. CAP theorem: defined by Brewer in the year 2000…

Seth Gilbert and Nancy A. Lynch write that "the CAP Theorem can be stated as follows: In a network subject to communication failures, it is impossible for any web service to implement an atomic read/write shared memory that guarantees a response to every request".

C indicates Consistency: every client sees the same, most recent data, whichever server answers the request.

A indicates Availability: whether the system is available at all times, so that every request receives a response.

P indicates Partition Tolerance: when the servers are split into groups and communication between them is delayed or lost, the system should keep working.

BASE properties:

BA stands for Basically Available. S stands for Soft state.

E stands for Eventual Consistency.
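Eventual consistency can be illustrated with a small simulation. The Replica and Cluster classes below are hypothetical and greatly simplified (a single writer, a version counter and last-writer-wins merging are all assumptions); they only show that a read from a lagging replica may be stale until the pending updates are delivered.

```python
class Replica:
    """One copy of the data, holding key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def apply(self, key, version, value):
        # Last-writer-wins: keep only the highest version seen so far.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


class Cluster:
    """Writes go to replica 0 at once and reach the others only on sync()."""

    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.pending = []  # undelivered updates: (replica index, key, version, value)
        self.clock = 0

    def write(self, key, value):
        self.clock += 1
        self.replicas[0].apply(key, self.clock, value)
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, self.clock, value))

    def sync(self):
        for i, key, version, value in self.pending:
            self.replicas[i].apply(key, version, value)
        self.pending.clear()


cluster = Cluster(3)
cluster.write("x", "v1")
before = cluster.replicas[2].read("x")  # stale: the update has not arrived yet
cluster.sync()
after = cluster.replicas[2].read("x")   # all replicas now agree
print(before, after)
```

The system stays available for reads on every replica throughout, at the price of temporarily returning old data (the soft state), and converges once the updates have propagated.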

Stonebraker explains in his paper that NOSQL is considered for OLTP workloads. OLTP means Online Transaction Processing. It is mainly used in banking, railways, supermarkets, etc. There are two ways to improve OLTP performance: the first is automatic sharding over a shared-nothing processing environment, and the second is to improve per-server OLTP performance.

OLTP time depends on four factors:

1. Logging: every change is written to a log so that it can be recovered after a failure.

2. Locking: a record is locked before it is accessed so that concurrent transactions do not conflict.

3. Latching: shared in-memory data structures are protected by short-term locks (latches) while they are updated.

4. Buffer management: disk pages are cached in memory so that requests can be answered quickly.

DSS (Decision support system): used for reports, analytics, data warehouses, etc.

A database can be used in two primary ways: OLTP and DSS. Database designs are of two kinds:

Normalized database: large data is divided into smaller tables, with relationships set between them.

Denormalized database: the same data is copied into multiple documents or tables in order to reduce query time.
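The trade-off between the two designs can be sketched with plain Python data. The customers and orders below are hypothetical; the point is only that the normalized form needs a join at query time, while the denormalized form answers the same question directly at the cost of duplicated data.

```python
# Normalized: each fact is stored once, and orders refer to customers by id.
customers = {1: {"name": "Asha", "city": "Leeds"}}
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25},
    {"order_id": 11, "customer_id": 1, "total": 40},
]


def orders_with_customer(orders, customers):
    """Join at query time: look up the customer for every order."""
    return [
        {
            "order_id": o["order_id"],
            "customer_name": customers[o["customer_id"]]["name"],
            "total": o["total"],
        }
        for o in orders
    ]


# Denormalized: the customer name is copied into every order document.
denormalized_orders = [
    {"order_id": 10, "customer_name": "Asha", "total": 25},
    {"order_id": 11, "customer_name": "Asha", "total": 40},
]

print(orders_with_customer(orders, customers) == denormalized_orders)
```

Renaming the customer is a single update in the normalized design but has to touch every copied order in the denormalized one, which is the cost paid for the faster read.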

NOSQL performance can be increased in two ways:

1. Add more nodes to the computing cluster.

2. Increase the performance of each node.

Guy Harrison (2011) describes Oracle NOSQL in his paper as follows: "Oracle NOSQL is a distributed key-value store: values written to nodes in a cluster based on a hash of the key value. Unlike some NOSQL databases, there is no support for a partitioning scheme that allows adjacent keys to be located on the same node. However, Oracle NOSQL supports the concept of major and minor key paths:

A major key may have sub-keys all stored on the same node. These may be used to optimize the retrieval of master detail records".
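The idea in the quotation can be sketched as follows. This is not the Oracle NOSQL client API: the node names, the CRC32 hash and the put and get_all helpers are assumptions chosen only to show that hashing the major key alone keeps all of its sub-keys on one node.

```python
import zlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster
store = {node: {} for node in NODES}    # per-node storage


def node_for(major_key: str) -> str:
    """Only the major key is hashed, so its sub-keys share one node."""
    return NODES[zlib.crc32(major_key.encode("utf-8")) % len(NODES)]


def put(major_key, minor_key, value):
    store[node_for(major_key)][(major_key, minor_key)] = value


def get_all(major_key):
    """Fetch a master record and its details from a single node."""
    node_data = store[node_for(major_key)]
    return {
        minor: value
        for (major, minor), value in node_data.items()
        if major == major_key
    }


put("user/1", "name", "Asha")
put("user/1", "email", "asha@example.com")
put("user/2", "name", "Ben")
print(get_all("user/1"))
```

Because every record under "user/1" lives on the same node, a master-detail lookup needs one request to one node rather than a scatter across the cluster.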

Oracle NOSQL is mainly aimed at big data processing, spreading large data sets across many systems.

Cassandra:

Bagade, Prasanna et al. (2012) define it as follows: "Cassandra is NOSQL distributed database system which is known for managing large amount of distributed data. It provides high availability without single point of failure, the reason behind this is that it treats failure of node as norm rather than exception. It is also famous for high write throughput without harming read efficiency".

Cassandra is a leading new NOSQL technology, and it is very easy to configure as a distributed database. It has a column-oriented structure, and it partitions data across the cluster using either random partitioning or order-preserving partitioning.

Cassandra follows the SEDA (staged event-driven architecture) design: work passes through stages, each with its own queue and thread pool. Active requests are processed first and waiting requests are taken from the queue afterwards, which gives high performance.

Cassandra provides the nodetool utility to report on resources such as CPU utilization, memory statistics and column family statistics.
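The staged design can be sketched in a few lines. This is an illustration of the general SEDA idea rather than Cassandra's actual implementation: the Stage class, the two stages (parse and write) and the in-memory table are all hypothetical.

```python
import queue
import threading


class Stage:
    """One SEDA-style stage: an event queue drained by its own thread pool."""

    def __init__(self, handler, workers, downstream=None):
        self.inbox = queue.Queue()
        self.handler = handler
        self.downstream = downstream
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            event = self.inbox.get()  # waiting events queue up here
            try:
                result = self.handler(event)
                if self.downstream is not None:
                    self.downstream.inbox.put(result)  # hand over to next stage
            finally:
                self.inbox.task_done()

    def wait(self):
        """Block until every event queued so far has been handled."""
        self.inbox.join()


memtable = {}
memtable_lock = threading.Lock()


def parse(raw):
    key, value = raw.split("=", 1)
    return key.strip(), value.strip()


def apply_write(pair):
    key, value = pair
    with memtable_lock:
        memtable[key] = value


write_stage = Stage(apply_write, workers=2)
parse_stage = Stage(parse, workers=2, downstream=write_stage)

for raw in ["a = 1", "b = 2", "c = 3"]:
    parse_stage.inbox.put(raw)

parse_stage.wait()  # all requests parsed and forwarded
write_stage.wait()  # all writes applied
print(sorted(memtable.items()))
```

Each stage can be given a different number of workers, so a slow stage can be scaled without changing the others, and a burst of requests waits in a queue instead of blocking the caller.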

Jason Brooks (2011) also describes Oracle NOSQL in his paper as "a key-value data store on which Oracle has layered services supporting scale out over large numbers of nodes". Here the key-value function is the main asset.

Erik Meijer and Gavin Bierman (2011) compare key-value storage with relational tables: "While we don’t often think of it this way, the RAM for storing object graphs is actually a key-value store where keys are addresses (l-values) and values are the data stored at some address in memory (R-values)".

Apache HBase: this is an open-source technology used for random read/write access to big data, and it is column oriented. It hosts very large tables with billions of rows.

Features

Linear and modular scalability

Strictly consistent reads and writes

Automatic failover support between region servers

Automatic and configurable sharding of tables

Easy to use java API for client access

Query predicate push down via server side filters

- (hbase.apache.org)

HBase is mainly used when a huge amount of data has to be stored.

Mehul Nalin Vora (2011) defines HBase as follows: "HBase, an Apache open-source project, is a distributed fault-tolerant and highly scalable, column-oriented, NOSQL database built on top of HDFS".

In his paper, Mehul Nalin Vora evaluates HBase performance by comparing it with SQL and MySQL. The data is stored as image files on HDFS, and the HDFS locations of the files are stored in HBase and in MySQL. This affects performance because each query has to search two platforms.

HBase performance and response times depend on:

Random read

Random write

Equal read and write

Heavy read and write

3.0 Comparisons between SQL and NOSQL:

SQL:

Query translation time: number of storages, number of translation algorithms, query types, network environment

Query transmission time: number of storages, query type, network environment, storage structure, size of data set

- Jiseong Son, Jeong-Dong Kim et al (2011).

The table shows that performance and scalability depend on the query translation and transmission times, which vary with the user's input.

SQL performance depends on the following:

Storage dependent systems: data is stored on multiple disks, so a query passed from the user takes more time to produce output.

Storage independent systems: data is stored on a single disk, so the user gets output quickly and there is no transmission delay.

When the same query is passed to a dependent and an independent system, the output times can therefore differ.

Nicolae M. and Victor V. (2010) discuss improving SQL performance through:

Designing an efficient data schema

Optimizing indexes, stored procedures and transactions

Analyzing indexes, stored procedures and transactions

Monitoring access to data

Optimizing queries

NOSQL:

Column family: 1. Cassandra 2. Hypertable 3. Cloud data 4. Amazon SimpleDB

Document store: 1. CouchDB 2. MongoDB 3. SisoDB 4. RavenDB

Key value: 1. Azure Table Storage 2. GenieDB 3. HamsterDB 4. Cloud DB

Eventual consistent key value store: 1. Mamgo DB 2. Dovetail DB 3. Amazon Dynamo 4. Voldemort

- Tudorica, Bogdan George et al (2011)

The table shows that NOSQL performance and scalability depend on four types of store: column family, document store, key value, and eventually consistent key value store. These differ in each open-source technology, so operating those technologies gives different performance and scalability. For both kinds of database, performance and scalability vary according to the system used.

Erik Meijer and Gavin Bierman (2011) discuss NOSQL performance in terms of maintaining a key-value store, which is the central requirement when operating NOSQL technology.

Specifications: the following specifications will be used in my final dissertation.

Software requirements:

I would like to use the following open-source software technologies for my project:

Apache Hbase latest version

MySQL latest version

VMware workstation

Ubuntu iso image file

Hardware requirements:

The following configuration reflects my requirements and may vary.

Default setting for both systems

Memory: 2GB

Hard disk space: 20GB

Implementation:

MySQL:

Results:

Expected results for my project

Runtime

Throughput

Average latency

Max latency

Min latency
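The metrics listed above could be collected with a small timing harness such as the sketch below. The benchmark function and its result keys are hypothetical, and a Python dictionary stands in for the store under test; in the actual project the operation would be an insert or read against MySQL or HBase.

```python
import time


def benchmark(operation, n_ops):
    """Run operation(i) n_ops times and report the metrics listed above."""
    latencies = []
    start = time.perf_counter()
    for i in range(n_ops):
        t0 = time.perf_counter()
        operation(i)
        latencies.append(time.perf_counter() - t0)
    runtime = time.perf_counter() - start
    return {
        "runtime_s": runtime,
        "throughput_ops_per_s": n_ops / runtime if runtime > 0 else float("inf"),
        "avg_latency_s": sum(latencies) / n_ops,
        "max_latency_s": max(latencies),
        "min_latency_s": min(latencies),
    }


# Stand-in workload: insert 1000 records into an in-memory dictionary.
store = {}
results = benchmark(lambda i: store.__setitem__(i, f"value-{i}"), 1000)
for name, value in results.items():
    print(f"{name}: {value:.6f}")
```

Running the same harness with the same number of operations against both systems would give directly comparable figures for each metric.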

Conclusion:

In this paper I discussed various SQL and NOSQL technologies and how their performance and scalability vary. The modeling of object dependencies does vary between SQL and NOSQL environments. In an SQL environment, performance varies with query translation time and transmission time. In a NOSQL environment, four factors affect performance and scalability. Performance and scalability vary between SQL and NOSQL environments according to their structures, but both are good real-world technologies for storing large data. NOSQL platforms in particular allow organizations to store a wide range of data. Performance tools keep a watchful eye on the components and how they are working. NOSQL duplicates data across systems, which makes it easy to retrieve data at any time. Optimization is also a way to improve SQL performance, and simple code helps produce good performance. Monitoring each component helps a lot in achieving high performance. SQL technology is mainly used for online transactions that produce instant results, such as railway tickets, pay slips in the banking sector and printed output, and it then automatically stores that information in the database. NOSQL technology is very useful for storing or creating large data.

A recent trend towards the use of non-relational NoSQL databases raises the question where to store application data when part of it is perfectly relational. Dividing data over separate SQL and NoSQL


