Development And Maintenance Of Large Database Systems


02 Nov 2017

Disclaimer:
This essay has been written and submitted by students and is not an example of our work. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of EssayCompany.

Definition: A database is an organized collection of data. The data is typically organized to model relevant aspects of reality (for example, the availability of rooms in hotels), in a way that supports processes requiring this information (for example, finding a hotel with vacancies).

Database management systems (DBMSs) are specially designed applications that interact with the user, other applications, and the database itself to capture and analyze data. A general-purpose DBMS is a software system designed to allow the definition, creation, querying, update, and administration of databases. Well-known DBMSs include MySQL, PostgreSQL, SQLite, Microsoft SQL Server, Microsoft Access, Oracle, Sybase, dBASE, FoxPro, and IBM DB2. A database is not generally portable across different DBMSs, but different DBMSs can interoperate by using standards such as SQL and ODBC or JDBC to allow a single application to work with more than one database.

Efficient data management typically requires the use of a computer database. A database is a shared, integrated computer structure that stores a collection of:

• End-user data—that is, raw facts of interest to the end user.

• Metadata, which is data about data, through which the end-user data are integrated and managed.

The metadata describe the data characteristics and the set of relationships that links the data found within the database.

For example, the metadata component stores information such as the name of each data element, the type of values (numeric, text, or dates) stored in each data element, and whether the data element can be null. The metadata provide information that complements and expands the value and use of the data. In fact, metadata present a more complete picture of the data in the database.
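As a minimal sketch of what such metadata look like in practice, the following uses SQLite through Python's standard `sqlite3` module (a choice made here for illustration; the essay does not prescribe a DBMS). The `customer` table and its columns are hypothetical.

```python
import sqlite3

# Hypothetical table, used only to illustrate metadata.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY NOT NULL,
        name        TEXT NOT NULL,
        signup_date TEXT,          -- dates stored as ISO-8601 text in this sketch
        balance     NUMERIC
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
metadata = {
    row[1]: {"type": row[2], "nullable": not row[3]}
    for row in conn.execute("PRAGMA table_info(customer)")
}

for column, info in metadata.items():
    print(column, info)
```

The DBMS answers the question "what does this table contain?" from its own catalog, without reading any end-user data.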

Role and Advantages of the DBMS


The DBMS serves as the intermediary between the user and the database. The database structure itself is stored as a collection of files, and the only way to access the data in those files is through the DBMS. The DBMS presents the end user (or application program) with a single, integrated view of the data in the database. The DBMS receives all application requests and translates them into the complex operations required to fulfill those requests. The DBMS hides much of the database’s internal complexity from the application programs and users. The application program might be written by a programmer using a programming language such as C#, Java, or Visual Basic.NET, or it might be created through a DBMS utility program.

Having a DBMS between the end user’s applications and the database offers some important advantages. First, the DBMS enables the data in the database to be shared among multiple applications or users. Second, the DBMS integrates the many different users’ views of the data into a single all-encompassing data repository.

A DBMS provides advantages such as:

• Improved data security. The more users access the data, the greater the risks of data security breaches. Corporations invest considerable amounts of time, effort, and money to ensure that corporate data are used properly. A DBMS provides a framework for better enforcement of data privacy and security policies.

• Better data integration. Wider access to well-managed data promotes an integrated view of the organization’s operations and a clearer view of the database. It becomes much easier to see how actions in one part of the company affect other parts.

• Better data access. The DBMS makes it possible to produce quick answers to ad hoc queries. From a database perspective, a query is a specific request issued to the DBMS for data manipulation—for example, to read or update the data. Simply put, a query is a question, and an ad hoc query is a spur-of-the-moment question.

The DBMS sends back an answer (called the query result set) to the application. For example, when dealing with large amounts of sales data, end users might want quick answers to questions (ad hoc queries).

• Improved data sharing. The DBMS helps create an environment in which end users have better access to more and better-managed data. Such access makes it possible for end users to respond quickly to changes in their environment.

• Minimized data inconsistency. Data inconsistency exists when different versions of the same data appear in different places. For example, data inconsistency exists when a company’s sales department stores a sales representative’s name as James and the company’s personnel department stores that same person’s name as Ronny, or when the company’s regional sales office shows the price of a product as $85.06 and its national sales office shows the same product’s price as $90.50. The probability of data inconsistency is greatly reduced in a properly designed database.
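The idea of an ad hoc query and its result set can be sketched as follows. This is an illustration only: it assumes SQLite via Python's `sqlite3` module, and the `sale` table and its figures are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?)",
    [
        ("North", "Widget", 120.0),
        ("North", "Gadget", 80.0),
        ("South", "Widget", 200.0),
        ("South", "Gadget", 50.0),
    ],
)

# Ad hoc question: "Which regions sold more than 150 in total?"
result_set = conn.execute(
    "SELECT region, SUM(amount) AS total FROM sale "
    "GROUP BY region HAVING SUM(amount) > 150 ORDER BY region"
).fetchall()
print(result_set)
```

The question was not anticipated when the table was designed; the DBMS still answers it from the stored data.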

The Database System Environment

The term database system refers to an organization of components that define and regulate the collection, management, storage, and use of data within a database environment. From a general management point of view, the database system is composed of five major parts: hardware, software, procedures, people, and data.

• Hardware refers to all of the system’s physical devices, including computers (PCs, workstations, servers, and supercomputers), storage devices, printers, network devices (hubs, switches, routers, fiber optics), and other devices (automated teller machines, ID readers, and so on).

• Software. Although the most readily identified software is the DBMS itself, three types of software are needed to make the database system function fully: DBMS software, operating system software, and application programs and utilities.

- DBMS software manages the database within the database system. Some examples of DBMS software include Microsoft’s SQL Server, Oracle Corporation’s Oracle Database and MySQL, and IBM’s DB2.

- Operating system software manages all hardware components and makes it possible for all other software to run on the computers. Examples of operating system software include Microsoft Windows, Linux, Mac OS, UNIX, and MVS.

- Application programs and utility software are used to access and manipulate data in the DBMS and to manage the computer environment in which data access and manipulation take place. Application programs are most commonly used to access data within the database to generate tabulations, reports, and other information to facilitate decision making. Utilities are the software tools used to help manage the database system’s computer components. For example, all of the major DBMS vendors now provide graphical user interfaces (GUIs) to control database access, help create database structures, and monitor database operations.

• People. This component includes all users of the database system. On the basis of primary job functions, five types of users can be identified in a database system: system administrators, database administrators, database designers, system analysts and programmers, and end users. Each user type, described below, performs both unique and complementary functions.

- System administrators control the database system’s general operations.

- Database administrators, also known as DBAs, manage the DBMS, ensure that the database is functioning properly, and repair the DBMS if anything goes wrong. The DBA’s role is sufficiently important to warrant a detailed exploration of its own under the heading of database administration and security. The DBA is also responsible for setting up backup and disaster recovery procedures.

- Database designers design the database structure. They are, in effect, the database architects. If the database design is poor, even the best application programmers and the most dedicated DBAs cannot produce a useful database environment. Because organizations strive to optimize their data resources, the database designer’s job description has expanded to cover new dimensions and growing responsibilities.

- System analysts and programmers design and implement the application programs. They design and create the forms, reports, and procedures through which end users access and manipulate the database data.

- End users are the people who use the application programs to run the organization’s daily operations. For example, bank clerks, supervisors, managers, and directors are all classified as end users. High-level end users employ the information obtained from the database to make tactical and strategic business decisions.

• Procedures. Procedures are the instructions and rules that govern the design and use of the database system.

Procedures are a critical, although occasionally forgotten, component of the system. Procedures play an important role in a company because they enforce the standards by which business is conducted within the organization and with customers. Procedures also help to ensure that companies have an organized way to monitor and audit the data that enter the database and the information generated from those data.

• Data. The word data covers the collection of facts stored in the database. Because data are the raw material from which information is generated, determining what data to enter into the database and how to organize those data is a vital part of the database designer’s job.

Database design and modeling

The main task of a database designer is to produce a conceptual data model that reflects the structure of the information to be held in the database. A common approach to this is to develop an entity-relationship model, often with the aid of drawing tools. Another popular approach is the Unified Modeling Language. A successful data model will accurately reflect the possible state of the external world being modeled: for example, if people can have more than one phone number, it will allow this information to be captured. Designing a good conceptual data model requires a good understanding of the application domain; it typically involves asking deep questions about the things of interest to an organization, like "can a customer also be a supplier?", or "if a product is sold with two different forms of packaging, are those the same product or different products?", or "if a plane flies from New York to Dubai via Frankfurt, is that one flight or two (or maybe even three)?". The answers to these questions establish definitions of the terminology used for entities (customers, products, flights, flight segments) and their relationships and attributes.

Producing the conceptual data model sometimes involves input from business processes, or the analysis of workflow in the organization. This can help to establish what information is needed in the database, and what can be left out. For example, it can help when deciding whether the database needs to hold historic data as well as current data.

Having produced a conceptual data model that users are happy with, the next stage is to translate this into a schema that implements the relevant data structures within the database. This process is often called logical database design, and the output is a logical data model expressed in the form of a schema. Whereas the conceptual data model is (in theory at least) independent of the choice of database technology, the logical data model will be expressed in terms of a particular database model supported by the chosen DBMS. (The terms data model and database model are often used interchangeably, but in this article we use data model for the design of a specific database, and database model for the modeling notation used to express that design.)

The most popular database model for general-purpose databases is the relational model, which is most commonly used through the SQL query language. The process of creating a logical database design using this model follows a methodical approach known as normalization. The goal of normalization is to ensure that each elementary fact is recorded in only one place, so that insertions, updates, and deletions automatically maintain consistency and redundancy is avoided.
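A small sketch of a normalized design, using the "more than one phone number" case mentioned earlier. It assumes SQLite via Python's `sqlite3` module, and the `person` and `phone` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    -- One row per phone number, so a person can have any number of them.
    CREATE TABLE phone (
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        number    TEXT NOT NULL,
        PRIMARY KEY (person_id, number)
    );
    INSERT INTO person VALUES (1, 'A. Example');
    INSERT INTO phone VALUES (1, '555-0100'), (1, '555-0101');
    """
)

# The name is stored once, so a single UPDATE corrects it everywhere.
conn.execute("UPDATE person SET name = 'Alex Example' WHERE person_id = 1")

rows = conn.execute(
    "SELECT p.name, ph.number FROM person AS p "
    "JOIN phone AS ph ON ph.person_id = p.person_id ORDER BY ph.number"
).fetchall()
print(rows)
```

Had the name been repeated on every phone row, the same correction would have required updating several rows and could have left them disagreeing.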

The final stage of database design is to make the decisions that affect performance, scalability, recovery, maintainability, and security. This is often called physical database design. A key goal during this stage is data independence, meaning that the decisions made for performance optimization purposes should be invisible to end users and applications. Physical design is driven mainly by performance requirements, and it requires a good knowledge of the expected workload and access patterns, as well as a good understanding of the features offered by the chosen DBMS, which can make the DBA's job much easier.
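Data independence can be illustrated with a short sketch: a physical design decision (adding an index) changes how the DBMS executes a query but not the query text or its answer. The example assumes SQLite via Python's `sqlite3` module; the `orders` table, the index name, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"customer-{i % 50}", float(i)) for i in range(1000)],
)

QUERY = "SELECT COUNT(*), SUM(total) FROM orders WHERE customer = ?"


def plan(sql, params):
    """Return the human-readable steps of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


before_result = conn.execute(QUERY, ("customer-7",)).fetchone()
before_plan = plan(QUERY, ("customer-7",))

# Physical design decision: add an index. The application's SQL is unchanged.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

after_result = conn.execute(QUERY, ("customer-7",)).fetchone()
after_plan = plan(QUERY, ("customer-7",))

print(before_plan, after_plan)
```

Only the plan differs between the two runs; the application sees the same result either way.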

Another aspect of physical database design is security. It involves both defining access control to database objects by providing levels of user access and privileges, as well as defining security levels and methods for the data itself such as encryption.

Database models

A database model is a type of data model that determines the logical structure of a database and fundamentally determines in which way data can be organized, stored, and manipulated. The most popular example of a database model is the relational model, which uses a table-based format and defines relationships between tables.

Common logical data models for databases include:

Hierarchical database model

Network model

Relational model

Entity–relationship model

Enhanced entity–relationship model

Object model

Document model

Entity–attribute–value model

Star schema

An object-relational database combines two of these structures: the relational model and the object model.

Physical data models include:

Inverted index

Flat file

Other models include:

Associative model

Multidimensional model

Multivalue model

Semantic model

XML database

Named graph

Performance, security, and availability

Because of the high importance of database technology to the smooth running of an enterprise, database systems include complex mechanisms to deliver the required performance, security, and availability, and allow database administrators to control the use of these features.

Database storage

Database storage is the container of the physical materialization of a database. It comprises the internal (physical) level in the database architecture. It also contains all the information needed, including metadata and internal data structures, to reconstruct the conceptual level and external level from the internal level whenever that is required. Putting data into permanent storage is generally the responsibility of the database engine (also called the storage engine). Although the DBMS typically accesses storage through the underlying operating system (and often uses the operating system's file systems as intermediaries for storage layout), storage properties and configuration settings are extremely important for the efficient operation of the DBMS and are therefore closely maintained by database administrators. A DBMS in operation always has its database residing in several types of storage, such as main memory and external storage. The database data and the additional information needed, possibly in very large amounts, are coded into bits. Data typically reside in storage in structures that look completely different from the way the data look at the conceptual and external levels, but that are chosen to make reconstructing those levels as efficient as possible when users and programs need them, and to support computing additional information from the data, such as derived attributes.

Some DBMSs support specifying which character encoding is used to store data, so multiple encodings can be used in the same database, which provides more flexibility.

Various low-level database storage structures are used by the storage engine to serialize the data model so it can be written to the medium of choice. Techniques such as indexing may be used to improve performance, but indexes should be chosen carefully: each index must be maintained on every insert, update, and delete, so an index in the wrong place can slow the database down rather than speed it up. Conventional storage is row-oriented, but there are also column-oriented and correlation databases, which suit different kinds of workload.
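The difference between row-oriented and column-oriented layouts can be sketched with plain Python data structures. This is a toy illustration under simplifying assumptions, not how any particular storage engine lays out its pages; the records are made up.

```python
# Three hypothetical records.
records = [
    {"id": 1, "city": "Oslo", "amount": 10.0},
    {"id": 2, "city": "Lima", "amount": 25.5},
    {"id": 3, "city": "Oslo", "amount": 4.5},
]

# Row-oriented: the values of one record are stored together.
row_store = [(r["id"], r["city"], r["amount"]) for r in records]

# Column-oriented: the values of one attribute are stored together.
column_store = {
    "id": [r["id"] for r in records],
    "city": [r["city"] for r in records],
    "amount": [r["amount"] for r in records],
}

# Fetching one whole record touches a single entry in the row store...
second_record = row_store[1]

# ...whereas aggregating one attribute touches a single list in the column store.
total_amount = sum(column_store["amount"])

print(second_record, total_amount)
```

Row layouts tend to favor reading or writing whole records, while column layouts tend to favor scanning one attribute across many records.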

Database materialized views

Often storage redundancy is employed to increase performance. A common approach is storing materialized views, which consist of frequently accessed external views or query results. Storing such views saves the cost of repeating expensive operations (such as joins) each time they are queried. The downsides of materialized views are the overhead incurred when updating them to keep them synchronized with the underlying data, and the cost of storage redundancy.
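The trade-off can be sketched as follows. SQLite has no built-in materialized views, so this example emulates one with an ordinary summary table and an explicit refresh function; the table names and the helper functions `refresh_sales_by_region` and `read_summary` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sale (region TEXT, amount REAL);
    INSERT INTO sale VALUES ('North', 100), ('South', 40), ('North', 60);
    -- Stored query result, standing in for a materialized view.
    CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total REAL);
    """
)


def refresh_sales_by_region(connection):
    """Recompute the stored summary from the base table (a full refresh)."""
    with connection:  # commit on success, roll back on error
        connection.execute("DELETE FROM sales_by_region")
        connection.execute(
            "INSERT INTO sales_by_region "
            "SELECT region, SUM(amount) FROM sale GROUP BY region"
        )


def read_summary(connection):
    """Read the stored result without touching the base table."""
    return dict(connection.execute("SELECT region, total FROM sales_by_region"))


refresh_sales_by_region(conn)
fresh = read_summary(conn)

conn.execute("INSERT INTO sale VALUES ('South', 10)")
stale = read_summary(conn)  # the stored copy does not see the new sale yet

refresh_sales_by_region(conn)
refreshed = read_summary(conn)

print(fresh, stale, refreshed)
```

Reads are cheap because the aggregation is already done, but the stored copy is out of date until the refresh runs, which is exactly the synchronization overhead described above.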

Database and database object replication

Occasionally a database employs storage redundancy by replicating database objects (keeping one or more copies) to increase data availability: both to improve the performance of simultaneous accesses by multiple end users to the same database object, and to provide resiliency in the case of partial failure of a distributed database. Updates to a replicated object must be synchronized across all copies; otherwise the database may become inconsistent. In many cases the entire database is replicated.
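The need to synchronize copies can be shown with a toy model. The class below is a hypothetical sketch in plain Python, not a real replication protocol: it only shows that reads can be served by any copy and that an update applied to a single copy leaves the copies inconsistent.

```python
class ReplicatedObject:
    """Toy model: one logical object stored as several copies."""

    def __init__(self, value, copies=3):
        self.replicas = [value for _ in range(copies)]

    def write_all(self, value):
        """Synchronized update: every copy is changed before returning."""
        self.replicas = [value for _ in self.replicas]

    def write_one(self, index, value):
        """Unsynchronized update of a single copy (shown to expose the risk)."""
        self.replicas[index] = value

    def read(self, index):
        """Any copy can serve a read, which is what spreads the load."""
        return self.replicas[index]

    def is_consistent(self):
        """True when all copies hold the same value."""
        return len(set(self.replicas)) == 1


price = ReplicatedObject(85.06)
price.write_all(90.50)
consistent_after_sync = price.is_consistent()

price.write_one(0, 99.0)
consistent_after_partial = price.is_consistent()

print(consistent_after_sync, consistent_after_partial)
```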

Database security

Database security deals with the various aspects of protecting the database content, its owners, and its users. It ranges from protection against intentional unauthorized database use to protection against unintentional database access by unauthorized entities (e.g., a person or a computer program).

Database access control deals with controlling who (a person or a certain computer program) is allowed to access what information in the database. The information may comprise specific database objects (e.g., record types, specific records, data structures), certain computations over certain objects (e.g., query types, or specific queries), or the use of specific access paths to the former (e.g., using specific indexes or other data structures to access information). Database access controls are set by personnel specially authorized by the database owner, who use dedicated, protected security interfaces of the DBMS.

This may be managed directly on an individual basis, or by the assignment of individuals and privileges to groups, or (in the most elaborate models) through the assignment of individuals and groups to roles which are then granted entitlements. Data security prevents unauthorized users from viewing or updating the database. Using passwords, users are allowed access to the entire database or subsets of it called "subschemas". For example, an employee database can contain all the data about an individual employee, but one group of users may be authorized to view only payroll data, while others are allowed access to only work history and medical data. If the DBMS provides a way to interactively enter and update the database, as well as interrogate it, this capability allows for managing personal databases.
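The payroll versus work-history example can be sketched as a small role-based filter. This is an application-level illustration in plain Python, not the access-control mechanism of any particular DBMS; the user names, roles, and record fields are all hypothetical.

```python
# Hypothetical employee record and role definitions.
EMPLOYEE = {
    "name": "A. Example",
    "salary": 52000,
    "tax_code": "T1",
    "work_history": ["Analyst", "Senior Analyst"],
    "medical_notes": "none",
}

# Each role is granted a subschema: the set of attributes it may see.
ROLE_SUBSCHEMA = {
    "payroll": {"name", "salary", "tax_code"},
    "hr": {"name", "work_history", "medical_notes"},
}

# Individuals are assigned to roles rather than given privileges one by one.
USER_ROLES = {"dana": {"payroll"}, "lee": {"hr"}, "sam": set()}


def visible_view(user, record):
    """Return only the attributes that the user's roles entitle them to see."""
    allowed = set()
    for role in USER_ROLES.get(user, set()):
        allowed |= ROLE_SUBSCHEMA[role]
    return {key: value for key, value in record.items() if key in allowed}


print(visible_view("dana", EMPLOYEE))
print(visible_view("lee", EMPLOYEE))
```

Changing what a whole group may see then means editing one role definition instead of every individual's privileges.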

Data security in general deals with protecting specific chunks of data, both physically (i.e., from corruption, destruction, or removal; see physical security) and against the interpretation of them, or parts of them, as meaningful information (e.g., by looking at the strings of bits that they comprise and recognizing specific valid credit-card numbers; see data encryption).

Change and access logging records who accessed which attributes, what was changed, and when it was changed. Logging services allow for a forensic database audit later by keeping a record of access occurrences and changes. Sometimes application-level code is used to record changes rather than leaving this to the database. Monitoring can be set up to attempt to detect security breaches.
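Change logging inside the database can be sketched with a trigger. The example assumes SQLite via Python's `sqlite3` module, and the `product` and `price_audit` tables and the trigger name are hypothetical. SQLite has no notion of a logged-in user, so this sketch records what changed and when, but not who made the change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, price REAL NOT NULL);
    CREATE TABLE price_audit (
        product_id INTEGER,
        old_price  REAL,
        new_price  REAL,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Record every price change: what changed and when.
    CREATE TRIGGER log_price_change AFTER UPDATE OF price ON product
    BEGIN
        INSERT INTO price_audit (product_id, old_price, new_price)
        VALUES (OLD.product_id, OLD.price, NEW.price);
    END;
    INSERT INTO product VALUES (1, 85.06);
    """
)

# An ordinary update; the trigger writes the audit row automatically.
conn.execute("UPDATE product SET price = 90.50 WHERE product_id = 1")

audit_rows = conn.execute(
    "SELECT product_id, old_price, new_price FROM price_audit"
).fetchall()
print(audit_rows)
```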

Migration

A database built with one DBMS is generally not portable to another DBMS (i.e., the other DBMS cannot run it). However, in some situations it is desirable to migrate a database from one DBMS to another. The reasons are primarily economic (different DBMSs may have different total costs of ownership, or TCOs), functional, and operational (different DBMSs may have different capabilities). Migration involves transforming the database from one DBMS type to another. The transformation should, if possible, keep the database-related applications (i.e., all related application programs) intact. Thus, the database's conceptual and external architectural levels should be maintained in the transformation. It may also be desirable to maintain some aspects of the internal architectural level. A complex or large database migration can be a complicated and costly one-time project in its own right, and this should be factored into the decision to migrate, even though tools may exist to help migration between specific DBMSs. Typically a DBMS vendor provides tools to help import databases from other popular DBMSs.
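One common migration technique, exporting the database as a SQL script and replaying it elsewhere, can be sketched as follows. For the sake of a runnable example both ends are SQLite databases (via Python's `sqlite3` module), which is an assumption made here; moving between different DBMS products usually also requires adjusting SQL dialect differences. The `hotel` table is hypothetical.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.executescript(
    """
    CREATE TABLE hotel (hotel_id INTEGER PRIMARY KEY, name TEXT, vacancies INTEGER);
    INSERT INTO hotel VALUES (1, 'Harbour Inn', 4), (2, 'Hilltop Lodge', 0);
    """
)

# Export schema and data as a SQL text script.
dump_script = "\n".join(source.iterdump())

# Replay the script into the target database.
target = sqlite3.connect(":memory:")
target.executescript(dump_script)

# The application's query is unchanged and gives the same answer on both sides.
QUERY = "SELECT name FROM hotel WHERE vacancies > 0"
source_answer = source.execute(QUERY).fetchall()
target_answer = target.execute(QUERY).fetchall()
print(source_answer, target_answer)
```

Running the same application queries against both databases, as in the last lines, is a simple check that the conceptual and external levels survived the move.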

DBMS tuning

DBMS tuning refers to tuning of the DBMS and the configuration of the memory and processing resources of the computer running the DBMS. This is typically done through configuring the DBMS, but the resources involved are shared with the host system.

Tuning the DBMS can involve setting the recovery interval (the time needed to restore the state of the data to a particular point in time), assigning parallelism (the breaking up of work from a single query into tasks assigned to different processing resources), and choosing the network protocols used to communicate with database consumers.

Memory is allocated for data, execution plans, procedure cache, and work space. It is much faster to access data in memory than data on storage, so maintaining a sizable cache of data makes activities perform faster. The same consideration is given to work space. Caching execution plans and procedures means that they are reused instead of recompiled when needed. It is important to give the DBMS as much memory as possible, while leaving enough for other processes and the OS to run without excessive paging of memory to storage.
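As a small illustration of a memory setting, the sketch below adjusts SQLite's page cache through Python's `sqlite3` module. SQLite is used here only as an example; server DBMSs expose their own, differently named settings, and the 64 MiB figure is an arbitrary assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

default_cache = conn.execute("PRAGMA cache_size").fetchone()[0]

# A negative value asks SQLite for a cache of about that many KiB
# (here roughly 64 MiB); a positive value would be a number of pages.
conn.execute("PRAGMA cache_size = -65536")
tuned_cache = conn.execute("PRAGMA cache_size").fetchone()[0]

print(default_cache, tuned_cache)
```

The setting applies to this connection only and does not reserve the memory up front; it is an upper bound on how much the cache may grow.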

Processing resources are sometimes assigned to specific activities to improve concurrency. On a server with eight processors, six could be reserved for the DBMS to maximize available processing resources for the database.

Database maintenance

Database maintenance includes backups, column statistics updates, and defragmentation of data inside the database files.

On a heavily used database, the transaction log grows rapidly. Transaction log entries must be removed from the log to make room for future entries. Frequent transaction log backups are smaller, so they interrupt database activity for shorter periods of time.

DBMSs use statistics, such as histograms of column values, to estimate how much data falls within a given range in a table or index. Statistics updates should be scheduled frequently and should sample as much of the underlying data as possible. Accurate, up-to-date statistics allow query engines to make good decisions about execution plans and to locate data efficiently.
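A statistics update can be sketched with SQLite's `ANALYZE` command, run through Python's `sqlite3` module. This is an illustration under stated assumptions: by default SQLite stores simple per-index summaries in its `sqlite_stat1` table rather than full histograms, and the `reading` table and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (sensor INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?)",
    [(i % 10, float(i)) for i in range(1000)],
)
conn.execute("CREATE INDEX idx_reading_sensor ON reading (sensor)")

# Gather statistics; SQLite stores them in the sqlite_stat1 table.
conn.execute("ANALYZE")

stat = conn.execute(
    "SELECT stat FROM sqlite_stat1 WHERE idx = 'idx_reading_sensor'"
).fetchone()[0]

# The first number in the stat string is the row count seen for the index.
row_estimate = int(stat.split()[0])
print(stat, row_estimate)
```

The query planner consults these stored figures instead of scanning the table each time it has to choose between plans.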

Defragmentation of table and index data increases efficiency in accessing data. The amount of fragmentation depends on the nature of the data, how it is changed over time, and the amount of free space in database pages to accept inserts of data without creating additional pages.
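The effect of defragmentation can be sketched with SQLite's `VACUUM` command through Python's `sqlite3` module. The example assumes SQLite's default configuration (auto-vacuum off) and writes a throwaway database file in a temporary directory; the `blob_store` table and its contents are hypothetical.

```python
import os
import sqlite3
import tempfile

# A file-backed database in a temporary directory, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE blob_store (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO blob_store (payload) VALUES (?)",
    [("x" * 1000,) for _ in range(2000)],
)
conn.commit()

# Deleting most rows leaves unused and partly filled pages inside the file.
conn.execute("DELETE FROM blob_store WHERE id % 10 != 0")
conn.commit()
free_pages_before = conn.execute("PRAGMA freelist_count").fetchone()[0]
pages_before = conn.execute("PRAGMA page_count").fetchone()[0]

# VACUUM rebuilds the file compactly; it must run outside a transaction.
conn.execute("VACUUM")
free_pages_after = conn.execute("PRAGMA freelist_count").fetchone()[0]
pages_after = conn.execute("PRAGMA page_count").fetchone()[0]
remaining_rows = conn.execute("SELECT COUNT(*) FROM blob_store").fetchone()[0]
conn.close()

print(pages_before, free_pages_before, pages_after, free_pages_after)
```

The remaining rows are untouched; only their physical placement in the file changes.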

Backup and restore

Sometimes it is desirable to bring a database back to a previous state, for example when the database is found to be corrupted because of a software error, or when it has been updated with erroneous data. To make this possible, a backup operation is performed occasionally or continuously, in which each desired database state (i.e., the values of its data and their embedding in the database's data structures) is kept in dedicated backup files (many techniques exist to do this effectively). When a database administrator decides to bring the database back to such a state (e.g., by specifying a desired point in time at which the database was in that state), these files are used to restore it.
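A backup followed by a restore can be sketched with the `Connection.backup` method of Python's `sqlite3` module (available from Python 3.7). In-memory SQLite databases stand in for the live database and the backup file here, which is an assumption made to keep the example self-contained; the `account` table and the helper `total_balance` are hypothetical.

```python
import sqlite3


def total_balance(connection):
    """Sum of all balances, used here as a simple check of database state."""
    return connection.execute("SELECT SUM(balance) FROM account").fetchall()[0][0]


live = sqlite3.connect(":memory:")
live.executescript(
    """
    CREATE TABLE account (account_id INTEGER PRIMARY KEY, balance REAL);
    INSERT INTO account VALUES (1, 100.0), (2, 250.0);
    """
)

# Take a backup of the current (known-good) state.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulate an erroneous update that damages the live data.
live.execute("UPDATE account SET balance = 0")
live.commit()
damaged = total_balance(live)

# Restore: copy the saved state back over the live database.
backup_copy.backup(live)
restored = total_balance(live)

print(damaged, restored)
```

Anything written after the backup was taken is lost by this kind of restore, which is why frequent backups or continuous log backups matter on busy systems.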

Resources

Database Systems, Design, Implementation, and Management by Carlos Coronel, Steven Morris, Peter Rob.

Fundamentals of Database Systems, 6th edition by Ramez Elmasri, University of Texas & Shamkant B. Navathe, Georgia Institute of Technology.

http://en.wikipedia.org/wiki/Database

http://en.wikipedia.org/wiki/Database_tuning

http://www2.amk.fi/digma.fi/www.amk.fi/opintojaksot/0303011/1146161367915/1146161680673/1146161836562/1146161929756.html


