How Does NoSQL Fit Into Corporate IT

02 Nov 2017

Petroleum-Gas University of Ploiesti, Romania

The Bucharest University of Economic Studies, Romania

ABSTRACT

The term NoSQL (Not Only SQL) describes a database that is distributed, may not require fixed table schemas, usually avoids join operations, is typically horizontally scalable, does not offer an SQL query interface and is in most cases available as open source. Some bibliographic sources use the term to refer to entirely different systems. Sources in the academic world also equate this concept with structured storage. The two terms are not entirely equivalent, however: relational databases, for example, also meet the formal definition of structured storage, yet their qualities are largely opposite to those of NoSQL. The aim of this paper is to discuss the challenges facing NoSQL solutions and to propose ways of addressing them.

Keywords: NoSQL, product adoption, portability, scalability, performance

How does NoSQL fit into corporate IT

Not all corporate applications today are easily modeled relationally, and some do not require strict ACID properties (consistency and isolation in particular may be relaxed). The present situation differs from that of the 1980s and 1990s, when most of a company's data were structured, were generated and accessed in a controlled manner, and constituted "records" of business transactions. Such data undoubtedly continue to exist and will continue to be modeled, stored and accessed using relational DBMS. In addition to these data, however, the last 15 years have brought an explosion of large, uncontrolled, unstructured, information-oriented data volumes with the advent of the web, digital commerce, social applications, etc. Companies do not need relational DBMS to store and retrieve such data, as the key properties of relational DBMS do not fit their nature and use.

NoSQL databases are a better choice than relational DBMS for handling these trends, given their support for unstructured data, horizontal scalability through partitioning, high availability, etc.

The following are some uses that support the view above:

Log mining: server logs, application logs and user activity logs are generated on multiple cluster nodes. To diagnose production problems, log mining tools can access logs across multiple servers and can correlate and analyze the data in them. Such analytical solutions are easy to implement with NoSQL databases.

Understanding social features: Many companies today provide their users (internal users, customers and partners) with social features through forums, blogs, etc. Mining these unstructured data can yield important conclusions about user feedback, which can then be used to improve services. NoSQL systems fit perfectly in this context.

Integrating external data feeds: In many cases, companies must use data coming from their partners. Even after a number of discussions and negotiations, companies have little control over the format of these data. There are also many situations where the formats change frequently because the business partners change. NoSQL databases can be used with great success to solve this problem as long as a solution for ETL (extract, transform, load) is designed and developed.

EAI (enterprise application integration) systems with large amounts of data: Most companies have large volumes of data flowing through EAI systems (whether product-based or custom-developed). The messages flowing through EAI usually have to be stored persistently for security and audit purposes. Again, NoSQL databases may be appropriate as data storage systems for this scenario, given the variation in data structure between the source and target systems and the large volume of data in question.

Front-end processing systems: Given the explosion of digital trade (in terms of volume of orders), the number of applications and service requests submitted through various channels by retail vendors, banks, insurance providers, entertainment providers, logistics firms, etc. is enormous. Also, given the restrictions and behavioral patterns associated with different channels, the formats in which information is captured differ slightly in each case and various types of rules are imposed. Moreover, many of these requests do not require immediate processing and reconciliation at the back end. Rather, they need to be captured without interruption whenever and wherever users place orders. Later, the captured information can be reconciled and updated with data from the back-end systems, where the status of the orders is maintained. This scenario is another example in which NoSQL systems can be used initially to store input from users. NoSQL databases are well suited here given the large volumes of data handled, the differences in input structure and the fact that eventual consistency is acceptable, since consistency is obtained at the reconciliation stage.

Enterprise content management services: Content management is widely used today for various purposes in various functional areas of the company such as sales, marketing, retail and human resources. Most often this entails pooling the requirements of different groups (which translate into differences in metadata) into a common platform for knowledge management services. NoSQL databases are a good match in this context.

Mergers and acquisitions: Companies face huge challenges during mergers and acquisitions, as they are forced to unify different systems that serve the same functions. NoSQL databases can help solve this problem either by quickly creating a temporary database cluster or by creating a permanent structure that accommodates the existing applications of the merging companies.
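The front-end capture scenario described above can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory dict stands in for a NoSQL store, and the names `capture_order` and `reconcile`, the channel names and the field names are hypothetical rather than taken from any product.

```python
import itertools

# In-memory stand-in for a NoSQL store (assumption: a real deployment
# would use a distributed document or key-value database).
captured = {}
_ids = itertools.count(1)


def capture_order(payload, channel):
    """Store the order exactly as received; no schema is enforced."""
    order_id = next(_ids)
    captured[order_id] = {"channel": channel, "payload": payload, "status": "captured"}
    return order_id


def reconcile(backend_status):
    """Later step: bring captured orders in line with the back-end system."""
    for order_id, record in captured.items():
        if order_id in backend_status:
            record["status"] = backend_status[order_id]


# Two channels deliver differently shaped input; both are accepted as-is.
web_id = capture_order({"sku": "A-1", "qty": 2}, channel="web")
phone_id = capture_order({"item": "A-1", "quantity": "2", "agent": "jd"}, channel="phone")

# Hypothetical back-end feed that arrives some time later.
reconcile({web_id: "shipped"})
```

Until `reconcile` runs, the captured records and the back end may disagree; that temporary disagreement is the eventual consistency the scenario tolerates.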

In the following paragraphs, I discuss some major benefits derived from the basic features of NoSQL databases (as presented earlier) in three key decision areas for any company: cost reduction, lower response time and high quality.

Business agility – reduced response time

NoSQL databases can increase business agility in two main ways:

A schema-less data model accommodates business changes without schema modifications, which lowers response time and reduces the impact on existing applications and functionality. In most cases, the migration effort for a change will be almost zero.

Horizontal scalability creates an inherent ability to handle much larger user traffic, for example to adjust to seasonal variations or sudden changes in usage patterns. Architectures oriented toward horizontal scalability are also a first step toward cloud systems, which help ensure business continuity in different situations.
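The first point can be illustrated with a small sketch. It assumes a plain Python list of dicts as a stand-in for a document collection; the field names and the helper `loyalty_tier` are hypothetical.

```python
# A plain list of dicts stands in for a document collection (assumption).
customers = [
    {"_id": 1, "name": "Ana", "email": "ana@example.com"},
]

# A new business requirement adds loyalty data. Only new documents carry
# the extra field; existing documents are left untouched.
customers.append({
    "_id": 2,
    "name": "Dan",
    "email": "dan@example.com",
    "loyalty": {"tier": "gold", "points": 1200},
})


def loyalty_tier(doc):
    """Readers tolerate missing fields instead of requiring a migration."""
    return doc.get("loyalty", {}).get("tier", "none")


tiers = [loyalty_tier(doc) for doc in customers]
```

The trade-off in this style is that the tolerance for missing fields moves into application code, which is one reason a data access abstraction layer is useful.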

Greater customer satisfaction - higher quality

In today's corporate IT, application quality is mainly determined by user satisfaction. Using NoSQL databases can help achieve this goal by addressing the following issues raised by users (which are also the most common and the most difficult to solve):

NoSQL databases create opportunities to improve application performance dramatically. Distributing the data means that disk I/O is no longer the bottleneck of application performance. Rather, performance is governed by the transfer rate. Moreover, many NoSQL solutions offer new-generation techniques for fast data processing such as MapReduce, sorted columns, Bloom filters, append-only B-trees, memtables, etc.

Another important aspect of user satisfaction is availability. Users want to access applications as and when they wish, so that they can carry out their tasks when they have the time. Unavailability of applications is therefore something to be avoided at all costs. Most NoSQL databases are equipped to support such availability requirements by offering a choice between strict and eventual consistency.
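As an illustration of one of the processing paradigms mentioned above, the following is a minimal single-process sketch of the MapReduce pattern applied to log data. The log lines and function names are hypothetical, and the two inner lists only imitate partitions that would live on separate nodes in a real cluster.

```python
from collections import defaultdict

# Hypothetical log lines, partitioned as they might be across two nodes.
partitions = [
    ["ERROR disk full", "INFO started", "ERROR timeout"],
    ["INFO started", "WARN slow query", "ERROR timeout"],
]


def map_phase(lines):
    """Emit (level, 1) for every log line in one partition."""
    return [(line.split()[0], 1) for line in lines]


def shuffle(mapped_partitions):
    """Group emitted values by key across all partitions."""
    groups = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values for each key."""
    return {key: sum(values) for key, values in groups.items()}


counts = reduce_phase(shuffle(map_phase(p) for p in partitions))
```

Because the map step only looks at one partition at a time, it can run where the data are stored, which is the property that lets this pattern scale horizontally.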

Lower total cost of ownership

In today's competitive market, where companies' IT costs are scrutinized intensely, obtaining the desired quality at the lowest possible cost is a primary goal. From this point of view, NoSQL databases outperform relational DBMS, especially when the volumes of data to be stored and processed are large:

The basic requirement of horizontal scalability ensures that NoSQL systems can run on commodity computers. This reduces not only the cost of purchasing hardware but also operating costs (electricity, maintenance and so on). It also ensures that such applications can use next-generation infrastructure (cloud, virtualized data centers and so on).

Lower long-term maintenance requirements are also a cost benefit. For example, tuning the performance of relational DBMS that store large amounts of data is an art that requires specialized knowledge (which itself comes at a high cost). By comparison, NoSQL databases provide quick and uniform response even if the amount of data grows by leaps and bounds. Indexing and caching behave in the same way. With NoSQL products, developers need to care less about hardware, disks, file layout, etc. The time gained can be spent on the actual implementation of the application.

CHALLENGES FACING NOSQL SYSTEMS

The promises made by NoSQL systems have generated much enthusiasm, but a number of obstacles must still be overcome before these systems become attractive to a large share of companies. This chapter comments on some of the biggest challenges.

In addition to the strong resistance caused by misconceptions and mistrust, the most important tactical challenges currently faced are as follows:

Identify applications / scenarios suitable for NoSQL databases

While it is easy to prove theoretically that not all company data require relational ACID constraints, the years in which only relational solutions were used make it difficult to identify the data that can be processed and stored using non-relational solutions. Most IT managers (and IT staff with application-related responsibilities) do not have a clear idea of what performance can be achieved and continue to oppose the idea of a (partial) move away from relational DBMS. Data are the most valuable asset of corporate IT. From this point of view, deciding to manage these data with a solution that is not as entrenched and widespread requires a different mental framework and sustained support (and even encouragement or insistence) from top management.

How to select the best product / best solution(s)

The next challenge is to identify the product or tool appropriate for use as a NoSQL database. Currently there are more than 25 different products/solutions [1] with different approaches to the four fundamental features. Because each of the available products treats these four fundamental features a little differently, it is usually very difficult to select one product that covers all needs. In the recent past, this situation has resulted in different NoSQL solutions being adopted in the various departments of a company. This had negative effects, sometimes pushing companies to return to relational solutions simply out of the need for standardization.

How to achieve scale economies

This challenge derives basically from the previous point. If an organization is forced to use multiple non-relational solutions (because none of them satisfies all conditions), achieving economies of scale in terms of staff (developers, administrators, support staff), infrastructure (hardware costs, software, licensing and consulting support) and structure (common components and services) becomes a difficult problem. Compared with the use of a single relational DBMS, this problem becomes really significant given that most organizations run their data stores as common data services.

How to achieve solution portability

Given the early stage of NoSQL databases, many changes are expected in the coming years in terms of solution providers, features and standards. From this point of view, the best strategy for a company is not to bind itself strongly to a particular product/solution available today, so that it can easily move to a possibly better product. Given that current NoSQL products/services most often work in divergent ways, portability becomes an important issue to consider when the managers of an organization decide to use non-relational products. This is necessary simply to protect the current investment.

How to get the right type of support. How complex are application development and management

Few NoSQL databases come with support offerings for external organizations. Even the vendors that have such offerings cannot be compared to big names like Oracle, IBM and Microsoft. Support for data recovery, backup and ad hoc data repair in particular weighs heavily on the minds of decision makers, since many NoSQL databases do not provide robust mechanisms for solving these problems.

There are millions of developers around the world, in every business segment, who are familiar with programming concepts for relational DBMS. By contrast, almost all NoSQL application developers are still learning. The situation will of course resolve itself over time, but for now it is much easier to find an experienced programmer or manager for a relational application than a NoSQL expert.

It is true that many NoSQL products aim to offer zero-administration solutions, but the current reality falls short of this goal. NoSQL systems today require considerable knowledge to install and considerable effort to maintain.

How to forecast the total cost

Compared with the major relational DBMS, little performance or scalability data is typically available for NoSQL DBMS (the subject was described in detail in the previous report). This leaves a company's decision makers without any indication of what must be spent on hardware, licenses, infrastructure management and support, which is a major obstacle to making budget estimates. For this reason, NoSQL solutions are still avoided in favor of known relational solutions.

Sometimes, even if such values are available, they may not be sufficient to build a model of total cost of ownership (in the sense of a Capex + Opex cost comparison with a relational product). Often, the large number of computers needed to ensure horizontal scalability makes decision makers prefer the classical solution of vertical scalability (although the costs of the latter could be higher in a full analysis of total cost of ownership).

Maturity

Relational DBMS have been in use for a long time. NoSQL proponents will argue that this "old age" is a sign of obsolescence, but for most IT managers the maturity of relational solutions provides a sense of security. For the most part, relational systems are stable and rich in functionality. By comparison, most NoSQL alternatives are in pre-production versions with many key features still missing. One feature that is often absent is a mature management tool (with a user-friendly graphical interface and a wide range of functions).

Working at the leading edge of technology is an exciting prospect for many developers, but companies should exercise extreme caution with such products.

OLAP and Business Intelligence

NoSQL databases have evolved to meet the increasing requirements of modern Web 2.0 applications. Consequently, most of their features are geared toward these requirements. But the data in an application have value to the business beyond the insert-read-update-delete cycle of a typical Web application. Companies analyze information from corporate databases to increase their efficiency and competitiveness, and Business Intelligence is an important issue for all medium and large companies.

NoSQL databases offer few facilities for ad hoc queries. Even simple queries require some programming expertise, and the business intelligence tools available today do not provide connectivity to NoSQL databases.

Some relief is provided by emerging solutions such as Hive or Pig, which can facilitate access to data stored in Hadoop clusters and perhaps, in time, to other NoSQL databases. Quest Software has also developed a product, Toad for Cloud Databases [2], which allows ad hoc queries on a variety of NoSQL databases.

POSSIBLE DIRECTIONS FOR RESEARCH AND DEVELOPMENT. WAYS OF IMPROVING NOSQL PRODUCTS TO EASE THEIR ADOPTION

What was presented in the previous chapter does not mean that companies should refrain from adopting NoSQL solutions (while waiting to see how these products evolve). It is true that non-relational solutions are at an early stage of wide adoption by companies. But the major potential of NoSQL databases for future businesses should not be lost sight of. This is especially true given that businesses will face increasingly large volumes of semi-structured or unstructured, eventually consistent data, while the volumes of strictly structured data that follow the ACID philosophy will stay roughly constant, remaining at low levels. It is therefore important, at this point at least, to start accustoming company decision makers to the need to use non-relational products for handling company data. To this end, it seems necessary to introduce, solve, research or develop at least some key elements of a technological, human and process nature.

Adoption of a (single) product

There are many NoSQL solutions currently on the market that take different approaches to the four key features presented before. At the same time, the different use cases within a company may require different types of features. As argued above, using several products in a company is not desirable, at least from the perspective of economies of scale. The solution is to use a single product, chosen according to the target applications. It should also be noted that although some products may lack certain features, these can usually be obtained through alternative means or will be introduced later. Also, most products will reach maturity in the near future and will by then offer various options through configuration. So as long as the product covers most needs (though not all), it can be considered a good option to start with.

Rules of good practice for the selection of a product/ solution are:

Greater importance should be given to the product's support for the logical data model required by the application. This essentially decides how easily the application will adapt to current and future business needs.

The physical data model offered by the product should be investigated to determine whether the product provides the horizontal scalability, availability, consistency and partition tolerance that the application needs. The available backup and recovery mechanisms should be assessed at this stage as well.

The interface must conform to the standard operating environment used in the company. Given that each NoSQL product usually offers a variety of interfaces, this requirement can be covered easily.

The persistence model is not important as long as the product provides horizontal scalability.

To illustrate the rules of good practice, in what follows I will make a comparative analysis of a set of NoSQL products. Such an analysis can be a good starting point for companies that are serious about adopting a non-relational product right now.

The filtering steps are:

For many companies the ability to support reasonably complex data structures is mandatory. Otherwise, implementing complex structures would be the responsibility of the application, which is very difficult. This condition is met by products whose data models go beyond plain key-value pairs. From this point of view, products like Voldemort, Tokyo Cabinet etc. cannot be used.

A second criterion is the ability to handle large amounts of data through horizontal scalability based on partitioning/sharding, at a low price. The lack of this ability eliminates almost all the advantages of NoSQL databases over relational ones. On this basis, products such as Neo4J (although it has a very rich graph-based data model), Redis or CouchDB are removed from the list.

A final criterion is the availability of some form of commercial support for the company. This criterion eliminates products such as Cassandra (although chances are high that this product will soon get support from either Rackspace or Cloudera, given that Cassandra is already used in commercial systems such as Twitter, Digg and Facebook).

After the above filtering process, the list of products to compare is reduced to MongoDB, Riak, Hypertable and HBase. The following table summarizes the key features of these four products. A company can evaluate its detailed requirements against this table when drawing up a short list of NoSQL solutions.
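The filtering process can be expressed as a short sketch. The mapping below records only what the text states, namely the first criterion at which each product is eliminated (or `None` if it passes all three). The names `FIRST_FAILURE` and `apply_filters` are hypothetical, and the assessments are those of this analysis, not a statement about the products' current capabilities.

```python
# Criteria in the order in which they are applied in the text.
CRITERIA = ["complex data structures", "horizontal scalability", "commercial support"]

# First criterion each product fails according to the text (None = passes all).
FIRST_FAILURE = {
    "Voldemort": "complex data structures",
    "Tokyo Cabinet": "complex data structures",
    "Neo4J": "horizontal scalability",
    "Redis": "horizontal scalability",
    "CouchDB": "horizontal scalability",
    "Cassandra": "commercial support",
    "MongoDB": None,
    "Riak": None,
    "Hypertable": None,
    "HBase": None,
}


def apply_filters(candidates, criteria):
    """Apply the criteria in order, recording what each step removes."""
    remaining = list(candidates)
    removed = {}
    for criterion in criteria:
        removed[criterion] = [p for p in remaining if candidates[p] == criterion]
        remaining = [p for p in remaining if candidates[p] != criterion]
    return remaining, removed


shortlist, removed = apply_filters(FIRST_FAILURE, CRITERIA)
```

A company repeating the exercise would replace the mapping with its own assessment and could reorder or extend `CRITERIA` to match its priorities.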

Table 1 - A comparative analysis used to determine the NoSQL system suitable for a particular application (table built from the product specifications analyzed)

| Facilities | MongoDB | Riak | HyperTable | HBase |
|---|---|---|---|---|
| Logical data model | Rich document with support for nested documents | Rich document | Column family | Column family |
| Support for CAP | CA | AP | CA | CA |
| Add or remove a node dynamically | Supported | Supported | Supported | Supported |
| Multi data center | Supported | Supported | Supported | Supported |
| Interfaces | A variety of language-specific APIs (Java, Python, Perl, C# etc.) + REST | JSON over HTTP | REST, Thrift, Java | C++, Thrift |
| Persistence model | Disk | Disk | Memory + disk (adjustable) | Memory + disk (adjustable) |
| Comparative performance (on a scale from 1 to 5) | 4 (product written in C++) | 5 (product written in Erlang) | 4 (product written in C++) | 3 (product written in Java) |
| Commercial support | 10gen.com | Basho Technologies | Hypertable Inc. | Cloudera |

Building abstractions for data access

A separate abstraction layer for data access is required when working with NoSQL databases. Such an abstraction is advantageous in several ways. Firstly, application developers do not need to deal with the low-level details of the solution, which helps scale the pool of trained personnel. It also makes it easier to replace the underlying solution in the future if necessary. Finally, it allows the various requirements of multiple applications to be met in a standardized way (similar to SQL, possibly without complex facilities such as JOIN, GROUP BY and so on). For example, the C#/.NET interface for MongoDB recently gained such an abstraction layer in the form of the MongoDB.Linq library [3], which implements .NET Language Integrated Query (LINQ) over MongoDB.
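A minimal sketch of such an abstraction layer is given below. It is not the API of any particular product: the interface `DocumentStore`, its methods and the `InMemoryStore` backend are hypothetical, and a real adapter would wrap the driver of the chosen NoSQL product behind the same interface.

```python
from abc import ABC, abstractmethod


class DocumentStore(ABC):
    """Product-neutral interface the application codes against (hypothetical)."""

    @abstractmethod
    def put(self, collection, key, document): ...

    @abstractmethod
    def get(self, collection, key): ...

    @abstractmethod
    def find(self, collection, **criteria): ...


class InMemoryStore(DocumentStore):
    """Stand-in backend; a real adapter would wrap a specific product's driver."""

    def __init__(self):
        self._data = {}

    def put(self, collection, key, document):
        self._data.setdefault(collection, {})[key] = dict(document)

    def get(self, collection, key):
        return self._data.get(collection, {}).get(key)

    def find(self, collection, **criteria):
        docs = self._data.get(collection, {}).values()
        return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def open_orders(store):
    """Application code depends only on the DocumentStore interface."""
    return store.find("orders", status="open")


store = InMemoryStore()
store.put("orders", 1, {"customer": "acme", "status": "open"})
store.put("orders", 2, {"customer": "globex", "status": "closed"})
result = open_orders(store)
```

Because `open_orders` never touches a product-specific call, switching products means writing one new adapter class rather than changing application code, which is the portability benefit argued for earlier.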

Creating a model for performance and scalability

Regardless of the solution chosen, modeling scalability and performance with standard techniques (such as Queuing Network Models, Layered Queuing Networks etc.) is highly recommended. This analysis produces data that can be used for rough server sizing and cluster topology design, and for estimating the cost of licensing, administration etc. It will essentially become the primary input for all types of budgeting, helping decision makers reach a decision.
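As a rough illustration of this kind of sizing, the sketch below uses the simplest queuing building block, an M/M/1 queue per node, rather than a full queuing network model. It assumes that requests are spread evenly over identical nodes and that each node behaves as an independent M/M/1 queue, for which the mean response time is 1 / (service rate - arrival rate). The function names and the numbers in the usage example are hypothetical.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time of an M/M/1 queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def nodes_needed(total_arrival_rate, service_rate, target_response_time):
    """Smallest node count meeting the target, assuming load is split evenly."""
    # Even an idle node needs 1 / service_rate per request, so a target at
    # or below that value can never be met by adding nodes.
    if target_response_time <= 1.0 / service_rate:
        raise ValueError("target is below the bare service time of one node")
    n = 1
    while True:
        per_node = total_arrival_rate / n
        if (per_node < service_rate
                and mm1_response_time(per_node, service_rate) <= target_response_time):
            return n
        n += 1


# Hypothetical workload: 900 requests/s in total, each node serves
# 200 requests/s, and the mean response time should not exceed 25 ms.
nodes = nodes_needed(900, 200, 0.025)
```

With these illustrative inputs the function returns 6 nodes. A real study would calibrate the service rate from measurements and would model replication, network hops and uneven key distribution, which this sketch ignores.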

Construction of explicit redundancy

To prevent data loss, there is no alternative to replicating data to a backup server. Although many NoSQL database servers provide automatic replication, they have a single point of failure (the master node). For this reason it is better for the data to be protected by a secondary backup and recovery mechanism, with automated data repair available through prepared scripts.

To make this possible, it is necessary to understand the physical model of the NoSQL product, to identify the options for recovery mechanisms and to examine whether these options meet the company's requirements and practices. From this point of view, too, MongoDB offers superior functionality, allowing the creation of replica sets [4].

Building a common platform for data services

As with common relational database services, a common NoSQL database service can be built to achieve economies of scale in terms of infrastructure and support. Such unification makes it easier to change and improve applications in the future. It would be the final item on the list of needs as the product matures (to be achieved in the medium to long term). Even though it is a long-term target, keeping it in view from the very beginning helps in making the right decisions when they are needed.

The development of a dedicated team in the company

In every company there are a few people who are interested in learning new, unconventional concepts. Forming a group of such individuals who spend part or even all of their time investigating and evaluating progress in the NoSQL field, its issues and challenges, and emerging next-generation ideas will help in steering projects that use this technology. Such a group can also help decision makers by dispelling myths and providing correct information.

Development of relations with the product community

After a product has been adopted, it makes sense to develop relations with the product community so that the company and the community can help each other. Most NoSQL products have very active communities that are more than willing to help. Knowing the problems and solutions from the beginning helps the company make decisions about particular features or versions. The company can also influence the product by requesting features (important both for the organization and for the community). The community, in turn, gains familiarity with detailed real-world problems, knowledge that is necessary to make a product robust and rich in features. Successful deployments at medium and large companies also help the community grow.

Iterative development

Given the relative maturity of relational DBMS, the only way for NoSQL adoption to reach the same level with minimum risk is through iterative development methodologies.

For example, building a common platform for data services with a standardized data abstraction cannot happen overnight, in a single step. Rather, that outcome results from working with a model based on iteration and refactoring. In this type of technological development, changing the solution halfway is not recommended. A flexible way of seeing things also helps create a mental framework for accepting change, both for management and for those implementing the product.

However, for an iterative approach it is important to define a set of decision criteria. Examples include a guide (with models) for classifying an object as more suitable for relational or for NoSQL modeling, an infrastructure sizing guide, a list of required test cases etc.

Discussion and conclusions

In the corporate adoption of NoSQL databases, the biggest challenge is changing the minds of decision makers, that is, convincing them that not all data/objects match the relational model. The best thing to do is to test a NoSQL solution on the right type of use cases, demonstrating that NoSQL-based applications can be more effective than relational solutions if used in the right context. It also helps to identify some projects (not necessarily critical for the company, but with good visibility) for which non-relational implementations would fit. The success (or failure) of such a project helps change minds. Such a project also helps in learning what needs to be done differently to implement NoSQL solutions better. This step-by-step policy is vital if the company wants to reshape its information management mechanisms in the near future in order to adopt non-relational solutions.

In addition to the rules of good practice and the information obtained through benchmarking, the work in this chapter led to the selection of the target product for the practical research in the PhD thesis. It can be seen from the previous subsections that MongoDB, besides being among the four products that meet the minimum requirements for use in a company, also meets some additional requirements.


