Integrating Web Caching and Pre-Fetching

Published Date: 02 Nov 2017

Abstract: The phenomenal growth of the World Wide Web has increased both network load and the response time for accessing documents online. Many researchers use Web caching and pre-fetching techniques to overcome these difficulties. With the increasing commercialization of the Web, exceeding the "eight-second rule" for downloading page content annoys users and may result in a significant loss of revenue, because many users switch to other sites when they are not satisfied with the performance of the current page. Web caching and pre-fetching are techniques for increasing the scalability of the Web. By integrating the two, the performance of the Web can be improved efficiently.

Keywords: web caching, pre-fetching, eight-second rule.

1. INTRODUCTION

The World Wide Web is the Internet's most widely used tool for information access, but today's users often experience long latency due to network congestion, particularly during peak hours. Caching frequently used data at proxies close to clients is an effective way to alleviate these problems. Researchers have studied this management task broadly in other systems, such as distributed file-sharing systems and memory hierarchies. Even so, deploying Web caching proxies on the Internet calls for novel solutions. Here, we offer an overview of the key management problems for Web proxy caching and pre-fetching and present solutions to these problems. Our focus is on the distribution of conventional Web objects, such as HTML pages and images, but we also address some of the issues raised by other emerging applications.

2. WEB CACHING TECHNIQUE

Web caching technology improves client download times and reduces network traffic by caching copies of frequently accessed objects. The major research issues in Web caching are where to maintain cached copies of objects, how to keep the cached copies consistent, and how to redirect clients to the most favourable cache server. A proxy is frequently deployed at a network's edge, such as an enterprise network's gateway. The browser first attempts to satisfy a request from its local cache; if that fails, it sends the unresolved request to its proxy. If the proxy does not hold the object either, it fetches it from the origin server. The proxy then relays the object to the client and, if required, saves a copy in its cache. If a request is fulfilled from the cache, it is called a cache hit; otherwise, it is a cache miss.

Today, most Internet media objects are still accessed via downloading or pseudo-streaming instead of streaming, which wastes roughly 56% and 32% of bandwidth respectively, according to one study. In a web service environment, a continuous streaming session (often lasting minutes or hours, compared to milliseconds or seconds for traditional Web pages) keeps consuming network bandwidth and disk bandwidth on the hosting server. Multiple concurrent streaming sessions can easily exhaust the available network bandwidth and overload the media content server.
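The request path described in this section (browser cache first, then the proxy, then the origin server) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the `Cache` class, the `fetch` function and the `origin` callable are hypothetical names introduced here, objects never expire, and cache capacity is unlimited.

```python
from typing import Callable, Dict, Optional


class Cache:
    """A minimal in-memory cache that counts hits and misses (hypothetical)."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> Optional[bytes]:
        # A successful lookup is a cache hit; anything else is a cache miss.
        if url in self.store:
            self.hits += 1
            return self.store[url]
        self.misses += 1
        return None

    def put(self, url: str, body: bytes) -> None:
        self.store[url] = body


def fetch(url: str, browser: Cache, proxy: Cache,
          origin: Callable[[str], bytes]) -> bytes:
    """Resolve a request: browser cache, then proxy cache, then origin server."""
    body = browser.get(url)
    if body is not None:
        return body                      # served locally by the browser
    body = proxy.get(url)
    if body is None:
        body = origin(url)               # proxy miss: ask the origin server
        proxy.put(url, body)             # the proxy keeps a copy for other clients
    browser.put(url, body)               # the browser keeps its own copy as well
    return body
```

With two browsers behind one proxy, the second browser's first request for a page already fetched by the first is a proxy hit, so the origin server is contacted only once.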

2.1 TYPES OF WEB CACHING

Web caching keeps local copies of Web pages in places close to the end user. Caches are found in browsers and in any of the Web intermediaries between the user agent and the origin server.

a. Browser cache: It is located in the client. The user can inspect the cache settings of any modern Web browser, such as Internet Explorer, Safari, Google Chrome, Netscape or Mozilla Firefox. This cache is useful especially when users hit the "back" button or click a link to see a page they have just looked at. In addition, if the same navigation images are used throughout a site, they will be served from the browser's cache almost instantaneously.

b. Proxy server cache: It is found in the proxy server, which is located between client machines and origin servers. It works on the same principle as the browser cache, but on a much larger scale. Unlike the browser cache, which deals with only a single user, a proxy serves thousands of users in the same way. When a request is received, the proxy server checks its cache. If the object is available, it sends the object to the client. If the object is not available, or it has expired, the proxy server requests the object from the origin server and forwards it to the client.

c. Origin server cache: Even at the origin server, web documents can be kept in a server-side cache to reduce the need for redundant computations. Thus, the server load can be reduced if an origin server cache is employed.

The availability of Web access log files that can be exploited as training data is the main motivation for adopting intelligent Web caching approaches. Machine learning techniques can adapt to important changes through a training phase. Although there are many studies of Web caching, enhancing Web caching performance with intelligent techniques is still new. Recent studies have shown that intelligent approaches are more efficient and more adaptive to the Web caching environment.
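The proxy behaviour in item b (serve from the cache unless the object is missing or expired) and the server-side cache in item c can be illustrated as follows. This is a sketch under assumptions that are not in the text: a fixed time-to-live stands in for real HTTP freshness rules, and `ExpiringProxyCache`, `fetch_from_origin`, `ttl_seconds`, `clock` and `render_page` are hypothetical names.

```python
import time
from functools import lru_cache
from typing import Callable, Dict, Tuple


class ExpiringProxyCache:
    """Proxy cache whose entries expire after a fixed time-to-live (assumption)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[bytes, float]] = {}  # url -> (body, expiry)

    def request(self, url: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now < entry[1]:
            return entry[0]                  # available and still fresh
        body = self._fetch(url)              # missing or expired: go to the origin
        self._entries[url] = (body, now + self._ttl)
        return body


@lru_cache(maxsize=256)
def render_page(path: str) -> str:
    """Stand-in for an expensive server-side computation (hypothetical).

    lru_cache keeps recent results, so a repeated request for the same
    path does not redo the work.
    """
    words = [w for w in path.strip("/").split("/") if w]
    title = " ".join(w.capitalize() for w in words) or "Home"
    return f"<html><head><title>{title}</title></head><body>{path}</body></html>"
```

Passing the clock as a parameter is a design choice made only so that expiry can be exercised without waiting; a deployed proxy would derive freshness from response headers instead of a single constant.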

3. PRE-FETCHING

The amount of traffic over the Internet has grown tremendously in recent years, largely due to the wide adoption of World Wide Web technologies and the resulting explosion of Web-based content development and dissemination. The expansion of Internet bandwidth capacity, on the other hand, is lagging behind, making the Web a major performance bottleneck. The gap between Web infrastructure capacity and demand will continue to exist, if not widen, as information search and business transactions are increasingly conducted over the Web. A compounding factor is the recent development of Web technologies such as Web services, which will potentially bring in large numbers of new distributed applications that communicate with one another over the Internet and consume network bandwidth. Caching is an established approach to meeting this capacity challenge and addressing related issues such as user-perceived network latency. Broadly speaking, caching can be defined as serving user Web requests from places other than the Web servers that publish the original copies of the requested objects.

Recent years have seen significant growth in the Web caching literature and a large number of commercial offerings from both established network vendors and start-up companies that focus exclusively on caching-related hardware and software solutions (see http://www.web-caching.com/ for a partial list of Web caching products). In effect, caching is ubiquitous in today's computing environment. All of the major Internet backbone providers and Internet service providers (ISPs) now implement Web caching as part of their infrastructure, often transparently to end users and service subscribers. Many medium-to-large enterprises use a variety of caching products and services to improve network performance and reduce network connection costs. Many end-user programs, including Web browsers, also maintain local caches to reduce user-perceived network latency.

According to the location of the caches, Web caching systems can be classified into three types: browser caches, proxy caches, and surrogate caches. Browser caches are located within user browser programs. Surrogate caches are typically located near the Web servers and are owned and operated by the Web content providers. Proxy caches are located between end-user client sites and original Web servers, typically closer to the clients than to the servers. Proxy caches are typically configured and operated by ISPs and by enterprises operating internal networks that are connected to the Internet.

This paper mainly focuses on proxy caching for the following reasons. First, a dominant portion of the current caching literature is directly related to various technical aspects of proxy caches. Although surveys on Web caching technology exist in the literature, new developments in proxy caching and its extended applications in areas such as caching "uncacheable" Web objects and differentiated services are of practical significance and call for an updated survey. Second, from the point of view of system deployment, proxy caching does not require major changes in the networking environment and can achieve economies of scale because multiple users are served. In addition, proxy caching does not rely on any major changes (e.g., with respect to protocols) to original Web servers and, in most cases, does not require much end-user configuration effort.

4. RELATED WORK

The WWW continues to grow at an astonishing pace as an information gateway and as a medium for conducting commerce. Web mining is the extraction of interesting and useful knowledge and implicit information from artefacts or activity related to the WWW. Based on several research studies, Web mining can be broadly classified into three domains: content, structure and usage mining. This work is concerned with Web usage mining. Web servers record and accumulate data about user interactions whenever requests for resources are received. Analyzing the Web access logs can help in understanding user behaviour and the structure of the Web site. From the business and applications point of view, knowledge obtained from Web usage patterns can be directly applied to manage activities related to e-business, e-services, e-education and so on efficiently. Accurate Web usage information can help to attract new customers, retain current customers, improve cross marketing and sales, improve the effectiveness of promotional campaigns, track departing customers and find the most effective logical structure for a Web space. User profiles can be built by combining users' navigation paths with other data features, such as page viewing time, hyperlink structure, and page content.

What makes the discovered knowledge interesting has been addressed by several works. Results that are already known are very often considered uninteresting, so the key to making discovered knowledge interesting is its novelty or unexpectedness. Whenever a visitor accesses the server, the visit leaves behind the IP address, authenticated user ID, time and date, request method, status, bytes transferred, referrer, user agent and so on. The available data fields are specified by the HTTP protocol. Several commercial software packages can provide Web usage statistics. These statistics can help Web administrators get a sense of the actual load on the server. However, the statistical data available from standard Web log files, or even the information provided by Web trackers, can only convey explicit information because of the nature and limitations of the method itself. Generally, one could say that the analysis relies on three general sets of information given a current focus of attention: past usage patterns, degree of shared content and inter-memory associative link structures. After browsing through the features of some of the best trackers available, it is easy to conclude that, beyond generating statistical data and text, they do not help to find much meaningful information. For small web servers, the usage statistics provided by conventional Web site trackers may be adequate to analyze usage patterns and trends. However, as the size and complexity of the data increase, the statistics provided by existing Web log file analysis tools may prove inadequate, and more intelligent knowledge mining techniques will be necessary.
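To make the listed log fields concrete, the following sketch parses one access-log line into the fields named above (IP, user ID, time/date, request method, status, bytes, referrer, agent). The text does not name a log format, so the Apache-style Combined Log Format is an assumption here, and `LOG_PATTERN` and `parse_line` are hypothetical names.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional

# Assumed layout (Combined Log Format):
# ip ident user [time] "method path protocol" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line: str) -> Optional[Dict[str, Any]]:
    """Return the fields of one log line, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields: Dict[str, Any] = match.groupdict()
    # Convert the textual fields that are really numbers or timestamps.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields
```

Records parsed this way are the raw material for usage mining: grouping them by IP address or user ID yields navigation paths, and grouping them by time yields load patterns.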

To demonstrate the efficiency of the proposed frameworks, Web access log data from Monash University's Web site were used for experimentation. The University's server receives over 7 million hits a week, so finding and extracting hidden usage-pattern information is a real challenge. To illustrate the University's Web usage patterns, average daily and hourly access patterns over 5 weeks were examined. Although the average daily and hourly patterns tend to follow a similar trend, the differences tend to increase during high-traffic days (Monday to Friday) and during peak hours (11:00 to 17:00). Due to the enormous traffic volume and chaotic access behaviour, determining user access patterns becomes complex.
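Daily and hourly access patterns of the kind described above can be derived from request timestamps with simple counting. The sketch below is illustrative only: the function names are hypothetical, no Monash data is reproduced, and treating the peak window as the half-open interval from 11:00 up to (but not including) 17:00 is an assumption.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def hourly_hits(timestamps: Iterable[datetime]) -> Counter:
    """Number of requests per hour of the day (0 to 23)."""
    return Counter(ts.hour for ts in timestamps)


def daily_hits(timestamps: Iterable[datetime]) -> Counter:
    """Number of requests per day of the week."""
    return Counter(DAY_NAMES[ts.weekday()] for ts in timestamps)


def peak_share(timestamps: Iterable[datetime],
               start_hour: int = 11, end_hour: int = 17) -> float:
    """Fraction of requests that fall inside the peak window."""
    stamps = list(timestamps)
    if not stamps:
        return 0.0
    in_peak = sum(1 for ts in stamps if start_hour <= ts.hour < end_hour)
    return in_peak / len(stamps)
```

Feeding these functions the timestamps of parsed log records gives the per-hour and per-weekday histograms from which peak periods can be read off.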

Earlier work offered approaches for discovering and tracking evolving user profiles. It also described how the discovered user profiles can be enriched with explicit information needs inferred from search queries extracted from Web log data. An objective validation strategy was also used to assess the quality of the mined profiles, in particular their adaptability in the face of evolving user behaviour. However, this earlier work concentrated only on user profiling from application-level data and did not link it to the web server. A user profile maintained by the web server would improve the authenticity of a user's sessions across different locations.

5. INTEGRATING WEB CACHING AND PRE-FETCHING

Web proxy caching and pre-fetching are the most popular techniques for improving Web performance, and both play a key role. Combining caching with pre-fetching helps to improve the hit ratio and reduce user-perceived latency. However, if web caching and pre-fetching are integrated inefficiently, the combination may increase network traffic, and the cache space will not be used efficiently. Therefore, the pre-fetching approach should be designed carefully in order to overcome these limitations.
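The quantities discussed here (hit ratio, user-perceived latency, and the extra traffic caused by poor pre-fetching) can be written down as small formulas. The following is a simplified model under assumptions not taken from the text: every hit costs a fixed cache latency, every miss costs a fixed origin latency, and wasted traffic is counted in objects, not bytes. All function names are hypothetical.

```python
def hit_ratio(hits: int, requests: int) -> float:
    """Fraction of requests served from the cache."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return hits / requests


def mean_latency(ratio: float, cache_ms: float, origin_ms: float) -> float:
    """Expected latency per request: hits cost cache_ms, misses cost origin_ms."""
    return ratio * cache_ms + (1.0 - ratio) * origin_ms


def prefetch_waste(prefetched: int, used: int) -> float:
    """Fraction of pre-fetched objects that were never requested afterwards."""
    if prefetched == 0:
        return 0.0
    return (prefetched - used) / prefetched
```

In this model, raising the hit ratio lowers the mean latency linearly, while a high waste fraction signals that pre-fetching is adding traffic without adding hits.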

On the whole, web pre-fetching requires two steps: predicting the pages a user will request next, and preloading them into a cache. In the past, however, many researchers have studied web caching and pre-fetching independently, and it is important to consider the impact of the two techniques combined. Few studies have examined the integration of web caching and web pre-fetching. One study examined the effect of combining caching and pre-fetching on end-user latency and concluded that the combination can potentially improve latency by up to 60%, whereas web caching alone improves latency by up to 26%. Another suggested applying web log mining to obtain web-document access patterns and used these patterns to extend the well-known GDSF caching policy and pre-fetching policies. A third proposed a cache replacement algorithm called IWCP for integrating Web caching and Web pre-fetching in client-side proxies; its authors formulated a normalized profit function to evaluate the profit from caching an object, whether a non-implied object or an object implied by some pre-fetching rule.
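The two steps of pre-fetching (predict, then preload) can be sketched with a very simple history-based predictor. The first-order Markov model below is a stand-in chosen for brevity and is not the method of any study cited in this section; `MarkovPrefetcher`, `serve`, `min_confidence` and `origin` are hypothetical names, and the cache is an unbounded dictionary.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Optional


class MarkovPrefetcher:
    """Predicts the next page from counts of observed page-to-page transitions."""

    def __init__(self, min_confidence: float = 0.5) -> None:
        self.transitions: Dict[str, Counter] = defaultdict(Counter)
        self.min_confidence = min_confidence

    def train(self, sessions: List[List[str]]) -> None:
        # Count every consecutive pair (current page, next page) in each session.
        for session in sessions:
            for current, following in zip(session, session[1:]):
                self.transitions[current][following] += 1

    def predict(self, page: str) -> Optional[str]:
        followers = self.transitions.get(page)
        if not followers:
            return None
        candidate, count = followers.most_common(1)[0]
        # Pre-fetch only when the most likely successor is likely enough.
        if count / sum(followers.values()) >= self.min_confidence:
            return candidate
        return None


def serve(page: str, cache: Dict[str, bytes], prefetcher: MarkovPrefetcher,
          origin: Callable[[str], bytes]) -> bytes:
    """Serve a page through the cache, then preload its predicted successor."""
    if page not in cache:
        cache[page] = origin(page)             # demand fetch on a miss
    predicted = prefetcher.predict(page)       # step 1: predict the next page
    if predicted is not None and predicted not in cache:
        cache[predicted] = origin(predicted)   # step 2: preload it into the cache
    return cache[page]
```

The confidence threshold is the knob that trades hit ratio against wasted traffic: lowering it pre-fetches more pages, including more that are never requested.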

One approach depended on keywords in URL anchor text to predict the user's future requests. The most significant factors, recency and frequency, were ignored in its web cache replacement decisions. Moreover, since the keywords extracted from web documents were given as inputs to an artificial neural network (ANN), applying the ANN in this way may cause extra overhead on the server. A further work proposed a framework for combining Web caching and pre-fetching in mobile environments, using a hybrid technique (Rough Neuro-PSO) that combines an ANN with particle swarm optimization (PSO) to classify Web objects. Rules are then generated from log data by a Rough Set technique on the proxy server.

In summary, previous works have integrated web pre-fetching with caching; however, these approaches are still not efficient enough. Most of them use association rules for pre-fetching, which are inaccurate and inefficient because they predict a particular page from patterns observed across all users' references. Moreover, these approaches employ conventional replacement policies that are not efficient in web caching.
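As a point of reference for the replacement policies discussed in this section, the GDSF (Greedy-Dual-Size-Frequency) policy named earlier can be sketched as follows. Each object receives the priority L + frequency * cost / size, where L is an inflation value set to the priority of the most recently evicted object, and the object with the lowest priority is evicted first. The class below is a simplified illustration: `GDSFCache` and its method names are hypothetical, the cost defaults to 1, and the victim is found by a linear scan where a real implementation would typically use a priority queue.

```python
from typing import Dict


class GDSFCache:
    """Size-aware cache that evicts the object with the lowest GDSF priority."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        self.clock = 0.0                      # inflation value L
        self.entries: Dict[str, dict] = {}    # url -> size, cost, freq, priority

    def _priority(self, freq: int, cost: float, size: int) -> float:
        return self.clock + freq * cost / size

    def access(self, url: str, size: int, cost: float = 1.0) -> bool:
        """Record a request for url; return True on a hit, False on a miss."""
        entry = self.entries.get(url)
        if entry is not None:
            # Hit: bump the frequency and recompute with the current clock.
            entry["freq"] += 1
            entry["priority"] = self._priority(
                entry["freq"], entry["cost"], entry["size"])
            return True
        if size > self.capacity:
            return False                      # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.entries,
                         key=lambda u: self.entries[u]["priority"])
            self.clock = self.entries[victim]["priority"]   # age the cache
            self.used -= self.entries[victim]["size"]
            del self.entries[victim]
        self.entries[url] = {"size": size, "cost": cost, "freq": 1,
                             "priority": self._priority(1, cost, size)}
        self.used += size
        return False
```

Because the clock only ever increases, objects that have not been requested for a long time end up with relatively low priorities, which is how the policy accounts for recency alongside frequency and size.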

6. CONCLUSION

Web caching and pre-fetching are two successful solutions for relieving Web service bottlenecks, reducing traffic over the Internet and improving the scalability of the Web system. Web caching and pre-fetching complement each other: web caching exploits temporal locality to predict revisits to previously requested objects, while web pre-fetching exploits spatial locality to predict the objects related to the one currently requested. Thus, combining web caching with web pre-fetching doubles the performance compared to caching alone. This paper reviews the principles of web caching and pre-fetching and some of the existing approaches. First, we reviewed the principles and existing works of web caching, covering both conventional and intelligent web caching. Second, the types and categories of pre-fetching were presented and discussed briefly. Moreover, history-based pre-fetching approaches were examined, together with a review of the related works for each approach. Finally, this survey presented some studies that discuss the integration of web caching and web pre-fetching.
