The Businesses and XML

02 Nov 2017

IEE 530

Literature Review

XML

Aditya Patil 1205089427

Ratik Mittal 1204023452

XML Overview:

XML, short for Extensible Markup Language, is a markup language used to describe data and configuration details in a format that both humans and machines can read. It was developed by the World Wide Web Consortium (W3C). [5] It is derived from an older language, the Standard Generalized Markup Language (SGML), which has largely been replaced by its descendants. SGML is an ISO standard for defining generalized markup languages. Its descendants, namely HTML, XML and the XML-based XHTML, are now widely used. One of the design goals of XML is that it should be cross-platform, meaning that it can be used on a variety of devices, from personal computers to mobile devices, and across different operating systems and applications. [4] It is designed to be independent of both hardware and software. It aims at handling data in a structured way in multi-tier enterprise applications. These applications may be for internal use or may be deployed on the Internet. In keeping with its basic goals of simplicity and ease of use, XML can be used for storing data in a structured manner, for storing configuration information and for interoperability. The stored data can be presented on a variety of targets such as web browsers or mobile devices, or it can be processed independently by different applications and programs. XML files also hold the configuration details of most application servers and web servers. A generalized format allows diverse applications, each running on its own hardware platform, to work together. Software developers in a wide range of enterprises, from insurance companies, travel agencies and social networks to other organizations built around data repositories, make use of XML or its cousin languages at some point in their applications. Most mainstream programming languages, particularly Java and C#, support the use of XML in software development. [10]

XML has also come to play an important role in handling Big Data. With the growth of global enterprises that have locations spread throughout the world, a need has arisen for multiple sites for storing data. This involves developing a variety of applications, databases and directory services that together make up distributed systems. Data transferred across such distances must keep its integrity, and not even a small amount of data loss can be tolerated. The data transfer rate is also a factor to be considered. Such data management is possible by adopting XML for development, along with its companion technologies: XSLT, PIT, DTD, XHTML.

The diagram below represents a multi-tier application of the kind most businesses use today. Adopting such an architecture introduces loose coupling between the data and the logic that is implemented to finally display the data to clients. A change in one tier will not significantly affect another tier. Also, the tiers need not be located at the same site.

Figure 1: Multi-tier client-server architecture. Source: datahousecorp.com [9]

Businesses and XML:

Today a great deal of valuable information is shared among businesses worldwide. The web has enabled businesses to reach a global audience easily. They are able to sell products to customers on the other side of the world. Such capabilities require a streamlined flow of data that can also cope with the volume of data to be handled, its integrity and its security. A company may choose any type of web service to provide data to its customers on the Internet. Here it is important to distinguish XML from HTML, another markup language, which deals only with the presentation of the data it receives. The business logic that runs when users carry out their job functions passes data to the layer that determines how the data is to be represented. This is the layer where XML is used. To better understand this role, consider the case of N-tier business software applications.

Business Benefits:

Adopting XML technologies matters for a company's bottom line. To deliver a real and sustainable advantage, a company that shares information with customers as a critical part of its value chain must keep a single source of information that can be delivered to current and future media. [7]

The benefits of doing this can be generalized into the following:

1. Increasing Revenues.

2. Decreasing Costs.

1. Increasing Revenues:

There are multiple ways by which XML authoring systems can increase a company’s revenue.

Customer Satisfaction and retention:

Customer retention can be achieved by offering more services to the customer. In terms of documentation, this means offering more personalized content, which is an added competitive advantage. Product-specific guides, repair information and other guidelines are of particular value to a customer. In reality, the impact of increased customer satisfaction on revenue is very difficult to measure.

Products to market faster.

Most products require some documentation at release, and documentation that is not ready may delay the launch. An earlier launch, in turn, means more revenue.

Expansion into new markets.

Establishing businesses in a foreign market could mean translating documentation into different languages. Using structured XML documents would decrease translation costs and would cut costs for new market entry.

While some of the benefits mentioned above are difficult to quantify, the management of a business may require solid numbers for the revenue that can be generated. Improvements can be estimated if enough is known about the processes involved, such as the bottlenecks that slow a product's time to market, ways to reduce customer attrition and increase satisfaction, the approach to market expansion, and agility in responding to competitive threats.

In such cases, the effects of an XML system clearly reach beyond the developers who work with it.

2. Decreasing Costs:

Besides providing increased revenue opportunities, an XML authoring system can also produce substantial cost savings:

Increasing authoring productivity:

Instead of working on trivial tasks such as font selection, size and indentation, the author of a document can concentrate on the subject matter and leave the page layout work to the XML authoring system.

Reducing publishing effort

After an XML document has been authored, automated publishing processes can begin to generate the required outputs. These may be print, CD-ROM, Internet, PDA or mobile phone output.

Increasing Information Reuse.

Unlike a typical document workflow, an XML authoring system can be used together with a Content Management System, which makes it possible to store and reuse similar information objects. This is done through the use of metadata.

Reducing translation costs:

The amount of new or changed text can be reduced by using a glossary or controlled vocabulary, by reusing information objects via the XML authoring system, or by using a translation memory system to find matching sentence pairs. This reduces the cost of translating text. Eliminating manual page layout for each translated language reduces translation costs further.

Reducing maintenance and distribution costs:

Object reuse and lower maintenance costs go hand in hand. Distributing information via the web is also far cheaper than distributing paper manuals.

Future proofing data:

Moving from one system to another can bring various data conversion problems. Text published in one format may not be compatible with another system. Native XML data is independent of software product and operating system.

Because XML offers so many different functionalities, a variety of rules need to be followed at every step to obtain a fully functioning file. These rules address issues such as correct parsing of the document, its validity, its well-formedness, keeping complexity from growing with size, and security. The problems that can affect these functionalities are best identified with the help of the fishbone (Ishikawa) diagram given below. The rules that need to be followed at every step are then described in turn.

Figure 2: XML Fishbone Diagram

XML Syntax:

For writing an XML document there are simple but strict rules [10]:

1. Elements are the primary building blocks of the document.

2. XML elements are not predefined. This gives the user greater control over them.

3. Element names are case sensitive, e.g. the element tag <Age> is not the same as <age>.

4. All XML elements can have attributes in name/value pairs, just as in HTML, but the attribute values must always be quoted.

e.g.: <pen color="red">Sheaffer</pen>

5. Elements must be nested properly.

6. Elements must have opening and closing tags.

7. The document must have a root element.

8. Comments may be written as <!--Insert Comment Here-->

Example 1:

<parent>

<child>

<subchild>.....</subchild>

</child>

</parent>

In the example above, the element parent is the root.

Example 2:

<name>Jill<lname>Jack</name></lname>

The example above shows incorrect tag usage. The element <lname> must close before <name>, as it is opened after <name>. The correct form is <name>Jill<lname>Jack</lname></name>.
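The syntax rules above can be checked mechanically. The following is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module; the helper name is_well_formed and the extra test strings are illustrative and not part of the original examples. It only tests well-formedness, not validity.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

example_1 = "<parent><child><subchild>.....</subchild></child></parent>"
example_2 = "<name>Jill<lname>Jack</name></lname>"  # tags closed in the wrong order

results = {
    "example_1": is_well_formed(example_1),
    "example_2": is_well_formed(example_2),
    # Rule 3: names are case sensitive, so <Age> cannot be closed by </age>.
    "case_mismatch": is_well_formed("<Age>30</age>"),
    # Rule 4: attribute values must be quoted.
    "unquoted_attr": is_well_formed("<pen color=red>Sheaffer</pen>"),
}
print(results)
```

Only Example 1 is accepted; each of the other strings breaks one of the rules listed above and is rejected by the parser.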

Given below is an example of a complete XML document:

<?xml version="1.0" encoding="ISO-8859-1"?>

<customers>

<customer>

<customerId>1001</customerId>

<name>John</name>

</customer>

<customer>

<customerId>1002</customerId>

<name/>

</customer>

</customers>

<?xml version="1.0" encoding="ISO-8859-1"?>

The first statement in the XML document is the XML declaration, which states the XML version and the character encoding used in the document. When the encoding is omitted, the default is "UTF-8".

The next line is the root element (<customers>). It indicates that the document stores customer information.

Here we have two customer records (<customer>). Each record contains two fields, <customerId> and <name>.

<name/> indicates an empty element. Although this is a valid declaration, the user can specify constraints on whether such elements may be empty.
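To illustrate how an application might read this document, here is a small sketch using Python's standard xml.etree.ElementTree module. The variable names are illustrative; the document text is the customer example from above.

```python
import xml.etree.ElementTree as ET

# Bytes are used so that the parser honours the declared ISO-8859-1 encoding.
document = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<customers>
  <customer>
    <customerId>1001</customerId>
    <name>John</name>
  </customer>
  <customer>
    <customerId>1002</customerId>
    <name/>
  </customer>
</customers>"""

root = ET.fromstring(document)

records = []
for customer in root.findall("customer"):
    customer_id = customer.findtext("customerId")
    name = customer.find("name").text  # None for the empty element <name/>
    records.append((customer_id, name))

print(root.tag)
print(records)
```

The empty <name/> element is still present in the tree; it simply carries no text, which is why the second record has no name value.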

XML Validation:

XML documents need to be validated to check that the elements within them are specified as per their definition. There are two ways to do this: creating a Document Type Definition or defining an XML Schema. We first look at the Document Type Definition (DTD) method. [10]

Document Type Definition:

By using a DTD, we can define the structure of an XML document by listing its valid elements and attributes.

A DTD can either be defined within an XML document or be referenced externally.

In essence, it can be used to enforce structural requirements on an XML document.

Following is an example of an Embedded DTD:

<?xml version="1.0"?>

<!DOCTYPE IEE530students [

<!ELEMENT IEE530students (student+)>

<!ELEMENT student (ASUId,name)>

<!ELEMENT ASUId (#PCDATA)>

<!ELEMENT name (#PCDATA)>

]>

<IEE530students>

<student>

<ASUId>1001</ASUId>

<name>Aditya</name></student>

<student>

<ASUId>1002</ASUId>

<name>Ratik</name></student>

</IEE530students>

In a similar fashion we can have an external DTD. This is created as an additional file with the extension ‘.dtd’. The example stated above is now split into two files:

IEE530Students.dtd

<!ELEMENT IEE530students (student+)>

<!ELEMENT student (ASUId,name)>

<!ELEMENT ASUId (#PCDATA)>

<!ELEMENT name (#PCDATA)>

Students.xml

<?xml version="1.0"?>

<!DOCTYPE IEE530students SYSTEM "IEE530Students.dtd">

<IEE530students>

<student>

<ASUId>1001</ASUId>

<name>Adi</name></student>

<student>

<ASUId>1002</ASUId>

<name>Ratik</name></student>

</IEE530students>

A third option combines the two techniques above.

Like XML itself, a Document Type Definition follows a strict set of syntax rules.
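The content model expressed by the embedded DTD above can be made concrete with a short sketch. Python's standard-library parsers are non-validating, so the code below does not perform DTD validation; it only mirrors the two rules (student+) and (ASUId,name) as hand-written checks. The function name check_structure and the swapped test document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE IEE530students [
<!ELEMENT IEE530students (student+)>
<!ELEMENT student (ASUId,name)>
<!ELEMENT ASUId (#PCDATA)>
<!ELEMENT name (#PCDATA)>
]>
<IEE530students>
  <student><ASUId>1001</ASUId><name>Aditya</name></student>
  <student><ASUId>1002</ASUId><name>Ratik</name></student>
</IEE530students>"""

def check_structure(root):
    """Hand-written checks mirroring the DTD's content model; returns a list of problems."""
    problems = []
    if root.tag != "IEE530students":
        problems.append("root must be IEE530students")
    children = list(root)
    # (student+): one or more student elements and nothing else
    if not children or any(child.tag != "student" for child in children):
        problems.append("IEE530students must contain one or more student elements only")
    for student in root.iter("student"):
        # (ASUId,name): exactly these two children, in this order
        if [field.tag for field in student] != ["ASUId", "name"]:
            problems.append("student must contain ASUId followed by name")
    return problems

valid_problems = check_structure(ET.fromstring(DOCUMENT))

# Swapping the order of the two fields keeps the document well-formed but breaks the model.
swapped = DOCUMENT.replace(
    "<ASUId>1002</ASUId><name>Ratik</name>",
    "<name>Ratik</name><ASUId>1002</ASUId>",
)
invalid_problems = check_structure(ET.fromstring(swapped))
print(valid_problems, invalid_problems)
```

The swapped document still parses, which shows the difference between a well-formed document and one that conforms to its DTD. A validating parser would report the same kind of problem automatically.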

The XML Schema

There are certain limitations when validating XML documents with Document Type Definitions. Suppose that, in our example above, we want to restrict the student ID to exactly 10 digits or the name to a minimum of 5 letters. Implementing such a rule with a DTD would be difficult. To overcome this difficulty, we have another option for validation, the XML Schema.

Unlike a DTD, a schema uses pure XML-based syntax, which makes it more consistent. It supports more data types, such as string and integer formats. It supports field-level validation as well as more complex types. Just as a DTD has the ‘.dtd’ extension, a schema is stored in a file with the ‘.xsd’ extension. These features make XML Schema a more useful alternative to DTDs and, most importantly, make data type validation possible.
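The two field-level rules mentioned above (an ID of exactly 10 digits, a name of at least 5 letters) can be sketched as follows. Python's standard library has no XML Schema validator, so this is not schema validation: the rules are written as regular expressions, in the spirit of the pattern and length restrictions a schema can express. The rule names and the sample values are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Rules taken from the text: ID has exactly 10 digits, name has at least 5 letters.
ASU_ID_RULE = re.compile(r"[0-9]{10}")
NAME_RULE = re.compile(r"[A-Za-z]{5,}")

def field_errors(student):
    """Return the list of field-level rule violations for one <student> element."""
    errors = []
    if not ASU_ID_RULE.fullmatch(student.findtext("ASUId", default="")):
        errors.append("ASUId must have exactly 10 digits")
    if not NAME_RULE.fullmatch(student.findtext("name", default="")):
        errors.append("name must have at least 5 letters")
    return errors

good = ET.fromstring("<student><ASUId>1234567890</ASUId><name>Aditya</name></student>")
bad = ET.fromstring("<student><ASUId>1001</ASUId><name>Adi</name></student>")

print(field_errors(good))
print(field_errors(bad))
```

Both sample elements satisfy the DTD shown earlier, yet only the first one passes these stricter checks, which is the gap that schema-based validation is meant to close.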

XML Schema makes use of the concept of namespaces. Because the elements of an XML document are not predefined, a conflict can arise if two different documents use the same names for elements. A namespace is a method of avoiding conflicts when elements have the same names.

Example:

Consider an example from web development in which two documents each have the elements head and body. They contain different data: one deals with a bank, whereas the other deals with the account holder.

Document 1:

<head>

BankOfAmerica

</head>

<body>

Deposit

Withdraw

</body>

Document 2:

<head>

Name

AccountNumber

</head>

<body>

Savings

Checking

</body>

Without namespaces it is difficult to tell which head and body is being referred to, the bank's or its account holder's. By introducing names for the specific elements, we can differentiate the references clearly. The corresponding documents now look as follows:

Document 1:

<bank:head>

BankOfAmerica

</bank:head>

<bank:body>

Deposit

Withdraw

</bank:body>

Document 2:

<AccountHolder:head>

Name

AccountNumber

</AccountHolder:head>

<AccountHolder:body>

Savings

Checking

</AccountHolder:body>

In this manner an XML namespace constitutes a collection of element and attribute names that is identified by a specific URI/URL. XML Schema, unlike DTD, is designed to be aware of the namespace concept. Within a document, elements in a namespace are marked with a namespace prefix rather than the full URI/URL. The name of a namespace is defined with the help of a Uniform Resource Locator (URL). This URL may or may not point to a real resource, and the prefix acts as a user-defined alias for it. In the examples shown above the prefixes are bank and AccountHolder respectively. Elements written without a prefix belong to the default namespace, if one is declared. XML Schema allows the data type of an element to be declared in the following manner:

<xsd:element name="name" type="xsd:string"/>
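Returning to the namespace prefixes discussed above, the sketch below shows how a parser tells the two head elements apart. It uses Python's standard xml.etree.ElementTree module. The two URIs and the wrapping statement element are made up for illustration, since the original examples do not show where the prefixes are bound.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names; they are identifiers only and need not resolve.
BANK_NS = "http://example.com/bank"
HOLDER_NS = "http://example.com/accountholder"

# The xmlns:prefix attributes bind each prefix to its namespace name.
document = f"""<statement xmlns:bank="{BANK_NS}" xmlns:AccountHolder="{HOLDER_NS}">
  <bank:head>BankOfAmerica</bank:head>
  <AccountHolder:head>Name AccountNumber</AccountHolder:head>
</statement>"""

root = ET.fromstring(document)
prefixes = {"bank": BANK_NS, "AccountHolder": HOLDER_NS}

bank_head = root.find("bank:head", prefixes)
holder_head = root.find("AccountHolder:head", prefixes)

# ElementTree stores each tag as {namespace-name}local-name.
print(bank_head.tag)
print(holder_head.tag)
```

Both elements have the local name head, but their fully qualified tags differ, so the application can no longer confuse the bank's data with the account holder's.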

Following the concepts mentioned so far, two terms describe the quality of an XML document. Well-formed means that the document is syntactically correct; this is the minimum requirement. Valid means that the document also conforms to either a DTD or a schema. A valid document is always well-formed, but the reverse is not necessarily true.

XML Parsing:

An XML parser is used to extract the contents of an XML document and to validate it. This helps in transferring data to another application. The parser first checks whether the document is well-formed and, while doing so, separates the different data elements so that they can be passed on to the next application layer. An application using such a parser can access data through the element tag names. [11]

The diagram below shows how XML documents are parsed:

Figure 3: Diagram representing the XML parsing process. Source: oracle [8]

There are different types of XML parsers, all of which check the well-formedness of XML documents. They also provide an option to switch validation on or off, and are accordingly classified as validating or non-validating parsers. Parsers also differ according to the application architecture. For example, Java developers widely use Simple API (Application Programming Interface) for XML (SAX) and Document Object Model (DOM) parsers. The Java API for XML Processing (JAXP) provides SAX and DOM interfaces for Java applications. [11]

The DOM parser operates by creating a tree representation of the XML document. It is useful when an application changes data frequently by adding, deleting and reordering elements. The SAX parser processes the document element by element and reports events and data to the functions that the application declares. SAX is more useful when an application searches and retrieves data that does not change over time.

Looking at our IEE530students example above, the following represents the parsed data formats of DOM and SAX. [11]

DOM processing:

   IEE530students
         |
      student
      /     \
  ASUId     name

SAX Processing:

start document

start element: IEE530students

start element: student

start element: ASUId

characters: 1001

end element: ASUId

start element: name

characters: Aditya

end element: name

end element: student

end element: IEE530students

end document

From the diagrams we see that the DOM technique is easier to work with and allows random access to the objects it creates. SAX, on the other hand, processes documents much faster, although it allows only sequential access. Retrieval with a SAX parser requires that the data within the XML document not change over time. DOM places no restriction on the number of times data changes. The size of the tree built by a DOM parser increases significantly if the document contains large amounts of data. [11]

Figure 4: The XML DOM Parser. Source: oracle [8]

The DOM parser is not tied to a specific parsing implementation. Any third-party developer can create their own implementation for parsing the document.
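As a rough illustration of the DOM approach described above, the sketch below uses Python's standard xml.dom.minidom module instead of the Java parsers named in the text. The third student record added by the code is invented for the example.

```python
from xml.dom import minidom

document = (
    "<IEE530students>"
    "<student><ASUId>1001</ASUId><name>Aditya</name></student>"
    "<student><ASUId>1002</ASUId><name>Ratik</name></student>"
    "</IEE530students>"
)

# The whole document is loaded into an in-memory tree.
dom = minidom.parseString(document)
students = dom.getElementsByTagName("student")

# Random access: go straight to the second student without visiting the first.
second_name = students[1].getElementsByTagName("name")[0].firstChild.data

# In-place modification: append a third (made-up) student to the tree.
new_student = dom.createElement("student")
for tag, value in (("ASUId", "1003"), ("name", "Example")):
    field = dom.createElement(tag)
    field.appendChild(dom.createTextNode(value))
    new_student.appendChild(field)
dom.documentElement.appendChild(new_student)

student_count = len(dom.getElementsByTagName("student"))
print(second_name, student_count)
```

Because the full tree is held in memory, any node can be reached or changed at any time, at the cost of memory use that grows with the size of the document.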

Figure 5: XML SAX Parser (JAXP). Source: oracle [8]

The SAX parser generates events that are passed to the application. For each event, a user-defined function can be called.
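The event-driven model can be sketched with Python's standard xml.sax module, used here in place of the Java SAX interfaces mentioned in the text. The class name EventRecorder is illustrative; its methods are the callbacks that the parser invokes for each event.

```python
import xml.sax

class EventRecorder(xml.sax.ContentHandler):
    """Collects the callbacks fired by the SAX parser as readable strings."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("start document")

    def endDocument(self):
        self.events.append("end document")

    def startElement(self, name, attrs):
        self.events.append(f"start element: {name}")

    def endElement(self, name):
        self.events.append(f"end element: {name}")

    def characters(self, content):
        if content.strip():  # ignore whitespace-only text
            self.events.append(f"characters: {content}")

document = (
    b"<IEE530students><student>"
    b"<ASUId>1001</ASUId><name>Aditya</name>"
    b"</student></IEE530students>"
)

recorder = EventRecorder()
xml.sax.parseString(document, recorder)
print("\n".join(recorder.events))
```

The recorded events follow the document order, matching the SAX processing listing shown earlier. Nothing is kept in memory unless the handler chooses to store it, which is why access is sequential only.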

Use of XML in Managing Big Data:

Many companies today are expanding multinationally. This global reach has in turn given rise to a large amount of information for them to store, often as a large number of records. There may be a plethora of similarly structured data that needs to be delivered from one end of the world to a site located overseas within minutes. XML has the potential to handle such data transfer, since its elements and values can be read efficiently. Many companies today use XML and its related forms to assist in data management across a variety of sources, such as search engines, database repositories and other query systems. By defining their own sets of rules for the kind of data they handle, companies release their own XML-based formats.

Recently, the U.S. National Library of Medicine (NLM), which is affiliated with the National Institutes of Health and is one of the world's largest medical libraries, released its own XML format for handling data stored in its IndexCat database. The researchers at NLM validate the data using a Document Type Definition (DTD). The XML holds data collected from journal and newspaper articles, obituaries, letters, dissertations and monographs, as well as portraits, covering a wide range of topics such as scientific research, military and civilian medicine, hospital management and public health. The data set consists of records for more than 3.7 million bibliographic items. By releasing the Index Catalogue in XML format, NLM has opened an important source on the history of medicine and science to new uses and users. It forms a systematically indexed body of medical literature. [12]

For around 40 years, Acxiom has provided data-driven insights into customer traits in the markets so that professional goods manufacturers can produce the right household products. By analyzing large databases, analysts at the Little Rock, Ark.-based company are able to create analytical models that help them understand the production environment and derive efficient development systems. Acxiom has been able to meet its goal of increasing interoperability with its clients by using the Predictive Model Markup Language (PMML), an XML-based language developed by the Data Mining Group. The Data Mining Group aims to make it easier to move predictive models from an analyst's site to a remote data warehouse. [1]

Although XML plays an important part in the storage and retrieval of Big Data, there are certain challenges that need to be addressed: [2]

Converting data types:

Software independence requires data conversion: data must be parsed, translated and serialized whenever it is moved from a user program to a database or vice versa. This can be time-consuming, especially when more sites with different hardware are involved.
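The conversion steps named above can be made concrete with a small sketch in Python. The Customer class and the two helper functions are hypothetical; they reuse the customer record format from the earlier example and show one parse step and one serialize step.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: int
    name: str

def parse_customer(element):
    """Parse: XML text values -> typed Python values."""
    return Customer(
        customer_id=int(element.findtext("customerId")),
        name=element.findtext("name", default=""),
    )

def serialize_customer(customer):
    """Serialize: typed Python values -> XML text."""
    element = ET.Element("customer")
    ET.SubElement(element, "customerId").text = str(customer.customer_id)
    ET.SubElement(element, "name").text = customer.name
    return ET.tostring(element, encoding="unicode")

source = "<customer><customerId>1001</customerId><name>John</name></customer>"
customer = parse_customer(ET.fromstring(source))
round_trip = serialize_customer(customer)
print(customer)
print(round_trip)
```

Every record that crosses the boundary between program and storage pays for both steps, which is where the conversion overhead described above comes from.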

Number of Records in a Data Site:

XML documents are designed to be readable by humans as well as processed by machines. For both purposes there has to be an upper bound on the amount of data in a single document. Storage needs and processing time grow with document size. This means that CPU processing, storage and memory costs are important factors to consider when designing documents for efficiency.
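One common way to limit memory cost for documents with many records is to process them incrementally. The sketch below, which is one possible approach and not something the cited sources prescribe, uses iterparse from Python's standard xml.etree.ElementTree module. The 1,000-record document is synthetic and stands in for a large data file.

```python
import io
import xml.etree.ElementTree as ET

def count_records(stream, tag="customer"):
    """Count `tag` elements, clearing each one after it is handled to limit memory use."""
    count = 0
    for _event, element in ET.iterparse(stream, events=("end",)):
        if element.tag == tag:
            count += 1
            element.clear()  # release the record's children and text
    return count

# Synthetic input: 1,000 small customer records inside one root element.
body = "".join(
    f"<customer><customerId>{i}</customerId></customer>" for i in range(1000)
)
stream = io.BytesIO(f"<customers>{body}</customers>".encode("utf-8"))

total = count_records(stream)
print(total)
```

The parser reads the stream piece by piece, so the content of each record can be discarded as soon as it has been handled instead of being held until the whole document is loaded.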

Complexity in format:

The strict syntax and semantic rules of XML documents, such as the need for opening and closing tags and the nesting of tags, can make it harder for a person to look directly into a document and read the data. The inline tagging of elements is redundant for someone who could read the document more easily had the data been separated by commas or full stops. Moreover, different XML formats impose different markup rules, which makes it even more difficult to decode the data in a file.

To overcome such difficulties, several rules can be suggested:

Reducing the number of new formats:

Instead of having too many independent formats for a particular data set, the formats could be reduced in number and standardized to a certain extent. This can be done by following how other languages, particularly LaTeX and MathML, handle data.

Simplicity from a developer's perspective:

Design a format that is easy for a developer to write to and read from.

James Clark, one of the leading contributors to XML, has said that a technology that is too complicated should not be adopted on a wide scale, even though it may prove very useful to a limited number of users. [13]

Adopting Lazy Data Formatting:

Many Big Data developers are looking for ways to eliminate the time taken for data conversion by building data stores that accommodate a broader range of data types and formats. For applications yet to be developed, the right schema for the data depends on future use cases. This kind of modeling is similar to lazy evaluation. By storing data in "as is" format, we can deal with its transformation in the future when the need arises.
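The idea of storing data "as is" and transforming it only when needed can be sketched as follows. The LazyRecord class is hypothetical and uses functools.cached_property, which requires Python 3.8 or later; the parse_count attribute exists only to make the deferred parsing visible.

```python
import xml.etree.ElementTree as ET
from functools import cached_property

class LazyRecord:
    """Keeps the raw XML text and parses it only on first access."""

    def __init__(self, raw: str):
        self.raw = raw          # stored "as is"
        self.parse_count = 0    # how many times the text was actually parsed

    @cached_property
    def tree(self):
        self.parse_count += 1
        return ET.fromstring(self.raw)

record = LazyRecord(
    "<customer><customerId>1001</customerId><name>John</name></customer>"
)

before = record.parse_count                       # nothing parsed yet
name = record.tree.findtext("name")               # first access triggers the parse
customer_id = record.tree.findtext("customerId")  # reuses the cached tree
after = record.parse_count
print(before, name, customer_id, after)
```

Records that are never read are never parsed, so the conversion cost is paid only for the data that a later use case actually needs.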

Security and XML:

Transferring data between sites can require security and authentication in addition to data integrity. These factors are especially important for businesses that use the Internet. In earlier years, when handling data involved more physical interaction with computers, the methods for securing it were also physically oriented. Today, as more businesses grow day by day, the demand for automation has grown roughly in proportion, and software and hardware have developed in diverse directions. Guarding such data from a centralized source is therefore difficult, and standards must be made available to ensure data integrity. It is difficult to create and scale up a centrally pervasive structure on the Internet, as many different hardware and software requirements come into the picture. Such standards need to cover different functionalities and, for the sake of transfer speed, avoid unnecessary repeated checks, so that data can be managed effectively over distributed systems. XML has now established itself as a standard for effective content transfer on the web. Various data encryption algorithms can be adapted for use in XML security. These security-oriented standards do not follow all the technical norms for designing XML documents mentioned earlier. Instead, XML security provides a common framework through which various applications can provide security. This eliminates the need for excessive customization of programs in other tiers of the application. For data security, standards have been developed such as XML Digital Signatures for integrity and signatures, XML Encryption for confidentiality and the XML Key Management Specification (XKMS). By exploiting popular security and cryptography techniques, they are able to satisfy data safety requirements. The goal is to provide security for Web Services. [3]
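The integrity aspect can be illustrated with a small sketch. This is not an implementation of XML Digital Signatures: it only shows the underlying idea of computing a digest over a canonical form of the document, so that equivalent markup produces the same digest while changed content does not. It assumes Python 3.8 or later for xml.etree.ElementTree.canonicalize, and the customer attributes id and region are invented for the example.

```python
import hashlib
import xml.etree.ElementTree as ET

def canonical_digest(xml_text: str) -> str:
    """SHA-256 digest over the canonical (C14N 2.0) form of the document."""
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

original = '<customer id="1001" region="west"><name>John</name></customer>'
# Same content, different attribute order and quoting style.
reordered = "<customer region='west' id='1001'><name>John</name></customer>"
# Content changed in transit.
tampered = '<customer id="1001" region="west"><name>Jane</name></customer>'

same = canonical_digest(original) == canonical_digest(reordered)
changed = canonical_digest(original) != canonical_digest(tampered)
print(same, changed)
```

A receiver that recomputes the digest can detect the altered name, while harmless differences in how the markup was written do not raise a false alarm. A real signature scheme would additionally protect the digest with a key.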

A variety of standards have been developed, each playing its part in offering data security. These languages are text-based and are designed to be integrated as well as extended: [3]

1. Security Assertion Markup Language (SAML): Authentication, attribute assertions and authorization of data.

2. eXtensible Access Control Markup Language (XACML): Data access control rules.

3. Platform for Privacy Preferences (P3P): Defining privacy policies and preferences.

4. eXtensible Rights Markup Language 2.0 (XrML): Digital Rights Management.

Using these standards for the whole or for portions of a document does not prevent further processing of the XML document by tools such as XMLRef. It is important that XML security integrates with XML in such a way that adding security capabilities does not hamper the advantages and capabilities already obtained. This is a requirement for XML protocols, in which messages may be changed and otherwise processed frequently.

Conclusion:

To date, XML has offered data management capabilities to developers and businesses alike. The development of various cousin languages has further extended its uses, such as providing a loosely coupled model, authenticating data and providing security by means of encryption. Although XML is intended to be simple and easy to use, documents are required to follow the stringent standards used to build them; failing this, data management issues can arise in the steps that immediately follow the document's use or at later stages. Given the current growth of businesses, the volume of data they handle is expected to keep rising. XML has so far proved its worth in maintaining structured data for presentation to clients, in preserving integrity, in configuring the components of the web architecture and in interoperability.

Because hardware and software differ from site to site, it is important to develop standards that help integrate data while keeping the components that present it loosely coupled, a goal XML has been able to accomplish.


