Characterization of YouTube Video Streaming Traffic


02 Nov 2017


Online digital video has evolved dramatically since video sharing sites such as YouTube and Hulu emerged. These websites make video access easy and only a click away. Ever-increasing Internet traffic and a very significant increase in the use of videos in social networking have led to the problem of network congestion. Consequently, it becomes essential to analyze the traffic flow and to comprehend how it is delivered from the server. If the methodology behind the traffic flow is analyzed appropriately, service providers can understand the reasons for network congestion and avoid them.

Given this context, a few research studies have shed light on the video delivery procedure of YouTube, with emphasis on methodology. A few other works have examined the location of video storage and the strategy involved in sending the video packets, yet the packet delivery strategy for different types of videos has not been explained in detail. Our research is an attempt to fill this gap. We explore the origin of the video and the exact strategy followed by YouTube in delivering popular and non-popular videos. We present the methodology involved in packet delivery in detail.

Keywords: YouTube, video delivery, burst analysis.

ACKNOWLEDGEMENT

First and foremost, we are deeply indebted to Prof. Markus Fiedler, our Professor and thesis supervisor, for his hand-holding guidance. His inspiring tutelage brought forth the best in us. We shall ever remain thankful to him for sharpening our insights, kindling creativity in us and molding us into what we are today. Big thanks to him from the bottom of our hearts. We record our sincere thanks to Prof. Patrik Arlos for his kind co-operation and encouragement. A word of thanks to Mr. Junaid Shaik for his valuable tips and suggestions.

We record our thanks to JNTU, Hyderabad, India and BTH, Karlskrona, Sweden for providing us the opportunity of a lifetime to study under teachers of global repute, and we thank all those who made our stay in Sweden most memorable.

Finally we would like to thank our parents for all their support and love. Without them we would have never reached this stage.

Radha Ravattu

Prudviraj Balasetty

Contents

Abstract

1 Introduction

2 KEY CONCEPTS

2.1 Video Streaming

2.2 Quality of Service (QoS) and Quality of Experience (QoE):

2.3 Mathematical Background

2.4 Methodology:

2.5 Related work

3 Video Delivery Procedure

3.1 INTERNET

3.2 History of YouTube and Delivery Methodology

3.3 Communication between client, YouTube and CDN

3.4 User experience on real time videos

3.5 Waiting intervals

3.6 Burst Analysis

4 Design and Implementation

4.1 RESEARCH QUESTIONS

4.2 Research Methodology

4.3 Design

5 Analysis

5.1 Case – I:

5.2 Case – II:

5.3 Case – III:

5.4 Case – IV:

6 Results and Discussions

6.1 Duration of Burst

6.2 Inter-Burst Time

6.3 Length of the Burst

6.4 Coefficient of Throughput Variation

6.5 Server Selection Strategy

7 Conclusion

8 References

Figure 3‑I: Brief overview of communication between client and YouTube

Figure 3‑II: Buffering of a video

Figure 3‑III Burst

Table 4‑IV: Videos used for the experiment

Table 1: User perspective of watching videos

LIST OF ACRONYMS

CCDF – Complementary Cumulative Distribution Function

CDF – Cumulative Distribution Function

CDN – Content Delivery Network

DNS – Domain Name System

HTML – Hyper Text Markup Language

HTTP – Hyper Text Transfer Protocol

LAN – Local Area Network

MSN – Microsoft Network

NAT – Network Address Translation

QoE – Quality of Experience

QoS – Quality of Service

TCP – Transmission Control Protocol

UDP – User Datagram Protocol

URL – Uniform Resource Locator

WAN – Wide Area Network

Introduction

The contemporary world is in the midst of a communication technology explosion. Computers, the Internet and social networking have revolutionized the dissemination of and access to information. Vast and varied information is available to a willing user with a click of the mouse. From sports and entertainment content to education and business themes, almost everything is readily available and accessible on the Internet, from any corner of the world. The availability of images and videos, and the ability to upload, download, exchange and share them, has greatly enhanced the value, quality and accessibility of information on the Internet. The Internet, a network of networks whose scope extends from local to global services, has become an integral part of our day-to-day life. As the Internet increasingly carried multimedia applications, the development of such applications also gained momentum. The current tech-savvy generation, with its wide use of devices such as laptops, iPads, tablets and cell phones, increasingly demands access to a wide variety of high-quality streaming multimedia on the Internet. The dissemination of multimedia information has tremendously enhanced the utility and usage of the Internet. The wide variety of streaming videos available on the Internet matches the needs of this video-hungry generation. The demand for video on demand has grown so much that it often results in network congestion. Ever-increasing Internet traffic poses an open challenge to video service providers like YouTube and Hulu: deliver user-friendly access to video streaming or lose their users.

To overcome these challenges, quite a few research works have investigated video streaming, Internet and WAN traffic. In the last few decades, analyzing web video traffic has become a major area of enquiry in the quest for determining, improving and optimizing the dynamic characteristics of traffic structures. Video delivery methodology, being a significant criterion in determining the traffic, has attracted researchers' scrutiny. YouTube, being the largest video sharing site, has been a constant subject of research. After YouTube was acquired by Google, the video delivery infrastructure was completely restructured. YouTube employed data centers in the U.S. With this powerful infrastructure and an exponentially increasing number of videos and users, YouTube soon contributed a large portion of web traffic. In order to avoid congestion, YouTube employed a third-party Content Delivery Network (CDN) [25].

In this thesis we attempt to determine the video delivery methodology of YouTube. We examine the internal policy for delivering the video packets to the end users.

KEY CONCEPTS

Video Streaming

Information presented as a continuous video stream is often more engaging than simple text or images. The emergence of video-based social networks such as YouTube, Hulu and MSN is a clear example of this. Online digital videos have become a very important aspect of Internet services and one of the most used media of communication for business, education, social and entertainment purposes. Online video services also offer real-time playback, so users need not wait for hours to download their favorite video before watching it. The enormous growth in the deployment and usage of the Internet in recent years has led to a rapid increase in network traffic. Quite a few video streaming services use the User Datagram Protocol (UDP) as the transport layer protocol. UDP-based applications can choose their own data transfer rate, because UDP has no congestion control and does not retransmit packets that are discarded in the network. However, UDP-based communication is often blocked by firewalls or Network Address Translation (NAT) devices. Considering this, most leading video service providers deliver their videos over the Hyper Text Transfer Protocol (HTTP), which runs on top of the Transmission Control Protocol (TCP) [1].

An increase in traffic leads to network congestion and packet loss. In particular, outages in the network are quite frequent and result in longer waiting times. The consequences of problems in the network are experienced immediately in the form of freezes in the video. In practice, many users face volatile service performance, e.g. bad network conditions and congested media streaming servers that waste time through re-buffering. In addition, degradation can occur when the video is encoded, during transmission of the packets across the IP network, and/or during decoding and playback. Video quality degradation becomes visible in various ways such as jerkiness, freezes, gaps in playback, and image-related impairments such as stalling or blurred video [r1] [r2].

Quality of Service (QoS) and Quality of Experience (QoE):

One of the fundamental factors that determine the user's satisfaction with a video stream is a substantially high and constant level of perceptual QoS. QoS can be defined as [2] the ability of a network to provide a service with an assured service level [r3] [r4].

The Internet offers only best-effort service, and hence it becomes important to put maximum effort into improving the QoS levels, as they reflect the quality of the video stream. Many studies and proposals have addressed improving the QoS levels as well as the performance of video streaming applications. QoS depends on various factors such as the packet loss rate, the end-to-end transmission delay, the video transmission rate and jitter.

The uncertainty of packet transmission in a real-time video streaming service always poses a problem for both service providers and users. Hence it becomes difficult to ensure an appropriate QoE for the end user. The QoS and QoE concepts were introduced in IP networks to describe the end user's satisfaction with the quality. QoE can be defined as "The overall performance of a system from the user perspective."

In simple words, QoE describes the satisfaction of customers with the service provider. The lower the QoE, the greater the dissatisfaction of customers with the service provider. QoE covers not only the network performance parameters but also service quality parameters such as cost, accessibility, reliability and availability. As QoE is subjective, it must be assessed with careful planning in order to be represented realistically. It gives both the service provider and the customer a measure of the satisfaction with the network performance.

As stated in the first section, the video is streamed over TCP. When a packet is lost during transmission, TCP detects the loss and decreases the transfer rate. If this rate is less than the playback rate, playback will stop and wait for the next set of video data. No new video is displayed until the new packets arrive. This can severely degrade the quality perceived by the user (QoE). The quality of the video and audio and the smoothness of playback also affect QoE.
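The stalling behaviour described above can be illustrated with a minimal sketch. It assumes a strongly simplified player that works in one-second slots and plays one second of video whenever enough data is buffered; the function name `count_stalled_seconds` and all parameters are hypothetical and do not describe the actual player.

```python
def count_stalled_seconds(download_rates, playback_rate, startup_buffer=0.0):
    """Count the one-second slots in which playback has to pause.

    download_rates: data received in each one-second slot (e.g. in kbit).
    playback_rate:  data consumed per second of video (same unit).
    startup_buffer: data already buffered before playback starts.
    """
    buffered = startup_buffer
    stalled = 0
    for received in download_rates:
        buffered += received
        if buffered >= playback_rate:
            buffered -= playback_rate  # one second of video is played
        else:
            stalled += 1  # not enough data: the player freezes and waits
    return stalled
```

For example, with a playback rate of 400 and per-second downloads of 500, 100, 100 and 500, the sketch reports two stalled seconds, because the transfer rate drops below the playback rate in the second and third slots.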

QoS as well as QoE have gained substantial consideration as they play a major role in stating the quality of the video and the user satisfaction over the streaming.

Mathematical Background

In order to understand this thesis completely, it is essential to understand a few mathematical calculations and their usage. A reader who is already familiar with this kind of math can skip this section.

As we are going to model a system based on the data obtained by experimentation, it is important to select the relevant and useful data from the large amount of collected data. The following mathematical calculations help in understanding the obtained data.

Mean

As the data obtained is in the form of packets, it is essential to look at the number of packets arriving at a particular instant of time and at the time that elapses between the arrival of one packet and the next. Hence, to find the average packet arrival interval, we calculate the average of the time differences between consecutive packets.

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Equation 1: MEAN

Where $n$ is the total number of packets and

$x_i$ is the time difference between two consecutive packets.

Variance

Variance can be defined as the measure of how far a set of numbers is spread out from the expected average. It is given by

$$\operatorname{Var}(X) = E\left[(X - \mu)^2\right]$$

Where $\mu = E[X]$ is the expected value

And $X$ is a random variable.

It is the square of the standard deviation.

STDEV

Standard deviation is defined as the deviation from the expected average. It is calculated as the square root of the variance and is denoted by 'σ'.

$$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Equation 2: STDEV

Where $\bar{x}$ is the average and

$x_i$ are the time differences between two packets (in our case).

It is a measure of the spread of the numbers. A small standard deviation shows that the number of packets per interval in the burst is almost the same, i.e. the burst is not bumpy and has a smooth flow.

Co-efficient of Variation:

Coefficient of variation is defined as the normalized measure of dispersion of a distribution. It is the ratio of STDEV and mean.

$$c_v = \frac{\sigma}{\bar{x}}$$

Equation 3: Co-efficient of Variation

Where $\sigma$ is the standard deviation and

$\bar{x}$ is the average.

In the present context, a low value, i.e. approximately 0 to 0.3, indicates that the system delivers the traffic smoothly, with few outages.
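The three measures above can be computed from a list of packet arrival times as in the following minimal sketch. The function name `interarrival_stats` is hypothetical, and the sketch assumes the population form of the variance (division by the number of samples); spreadsheet functions such as Excel's STDEV divide by one less than the number of samples, so their results can differ slightly.

```python
import math


def interarrival_stats(arrival_times):
    """Return (mean, standard deviation, coefficient of variation)
    of the time differences between consecutive packets."""
    diffs = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    n = len(diffs)
    if n == 0:
        raise ValueError("need at least two arrival times")
    mean = sum(diffs) / n
    # Population variance: assumption, see the note above the code.
    variance = sum((x - mean) ** 2 for x in diffs) / n
    stdev = math.sqrt(variance)
    cov = stdev / mean if mean > 0 else float("nan")
    return mean, stdev, cov
```

For arrival times 0, 1 and 4 seconds the inter-packet times are 1 and 3 seconds, which gives a mean of 2, a standard deviation of 1 and a coefficient of variation of 0.5.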

Cross-Correlation:

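Cross-correlation is the measure of similarity between two waveforms as a function of a time lag applied to one of them. It is given by

$$(f \star g)(\tau) = \int_{-\infty}^{\infty} \overline{f(t)}\, g(t + \tau)\, dt$$

Where $f$ and $g$ are continuous functions, $\overline{f(t)}$ is the complex conjugate of $f(t)$ (equal to $f(t)$ for real-valued signals) and $\tau$ is the time lag.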

Auto-Correlation:

Auto-correlation is the cross-correlation of signal with itself. It can be stated as similarity between observations as a function of the time separation between them. Using this we will be able to find repeating patterns such as presence of periodic signal which has been buried under noise.

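$$R(\tau) = \frac{E\left[(X_t - \mu)(X_{t+\tau} - \mu)\right]}{\sigma^2}$$

Where $\sigma^2$ is the variance,

$\mu$ is the mean and $\tau$ is the time lag.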
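As an illustration of how auto-correlation reveals repeating patterns, the following minimal sketch computes a normalized sample auto-correlation for a series such as the number of packets per interval. The function name `autocorrelation` and the choice of estimator (division by the full series length) are assumptions made for the example.

```python
def autocorrelation(series, lag):
    """Normalized sample auto-correlation of `series` at a non-negative lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / n
    if variance == 0:
        raise ValueError("auto-correlation is undefined for a constant series")
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    ) / n
    return covariance / variance
```

For a series that alternates between 1 and 0, the value at lag 2 is strongly positive while the value at lag 1 is negative, which reflects the period of two samples.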

Methodology:

Research methods are classified into two groups, quantitative and qualitative. Quantitative methods involve testing and experimentation, whereas qualitative methods involve surveys, case studies and analysis.

In our thesis we followed the method of experimentation, which is a quantitative research method. This type of research method provides a controlled environment during testing. The main steps involved in the experimentation procedure are:

Selection of samples from a known population

Allocation of samples to different experimental conditions

Measurement of a small number of variables

The utilization and execution of the above steps are clearly demonstrated in the ensuing chapters of this report.

Related work

Video streaming has taken the Internet to the next level. As online digital videos play a key role in Internet usage, many studies have addressed video quality from the user's perspective [3]. With the rapid increase in Internet usage [4] [5], many problems caused by network traffic congestion have appeared. Many authors have proposed and studied the location management congestion problem arising in different network scenarios. Previous works [6] [7] proposed different methods and algorithms for congestion control.

The growth of video sharing networks and sites has drastically increased Internet traffic. With this increased traffic, good-quality live streaming is in jeopardy. Several researchers have developed schemes and methods, such as increasing bandwidth utilization and maintaining a reliable, high-performance infrastructure, which can help enhance the video quality over different networks [8] [9] [10] [11].

The multimedia methodology has gained wide acceptance due to the flexibility and reliability it offers, and it plays a key role in video streaming [12]. The success of video streaming has opened new vistas in that field of knowledge and created research interest in the delivery architecture. Many authors have proposed video delivery architectures for different scenarios [13] [14].

Delivery methodology has become an essential factor in analyzing video streaming. When one closely examines the video delivery technologies of today, the landscape turns out to be surprisingly fragmented, even as IP becomes the common infrastructure [15]. There are different architectural approaches for enterprise and service provider networks; different technologies used for "linear" (live) and "nonlinear" (e.g., video on demand) content; and differences in the delivery of content in "over-the-top" environments (e.g., Hulu, Netflix, YouTube) versus managed environments, such as the IPTV services currently offered by many broadband service providers.

YouTube emerged as one of the most popular and effective video sharing websites. With its extensive number of videos and its effective serving strategy, it became a subject of continuing research. Many research works have tried to analyze the architecture and video delivery methodology of YouTube [16] [n1] [n2].

A few research works give a brief description of the architecture, and their analysis of YouTube streaming provides a means to examine the performance of video delivery [17]. Existing media providers like YouTube and Hulu deliver videos through progressive download [18]. For an in-depth analysis of the video packet delivery methodology, profound knowledge of trace collection, probing bursts, and the classification of off-times and on-times is essential.

A brief introduction to ON-OFF models is given in [19] and [20]. In general, ON-OFF models capture the essential phases of user communication in wireless networks [21] [22] [19]. The most commonly used packet monitoring software tool for analyzing data traffic is Wireshark [23] [24].

The current state of the art investigates YouTube's architecture with respect to its content delivery infrastructure [25]. The authors describe the strategy behind YouTube's design and its distributed delivery infrastructure, which matches the geographical span of its users and meets varying user demands. The underlying methodology by which YouTube sends the correct video from the nearest CDN is explained in detail.

However, there is not much research on YouTube's methodology regarding the packet delivery rate and the streaming of video at the user end. Moreover, the differences between the strategies employed for different types of user-demanded videos have received little attention.

Video Delivery Procedure

INTERNET

The use of computers and, subsequently, of the Internet has comprehensively revolutionized the domain of communication. Modern computer technology coupled with the Internet has facilitated access to vast and varied information for a willing user and supports the daily tasks of the common man. The impact of computer and Internet services on human life is so great that a failure of these services would throw normal life out of gear. Accessing, uploading, downloading, sharing and exchanging information are integral and vital components of Internet use.

Internet usage has become vastly popular in the contemporary world. The Internet continues to be the ideal technological platform for introducing online applications; a useful application gains acceptance and becomes popular within no time. Surveys conducted by Cisco suggest that Internet traffic has reached the zettabyte level [27]. The number of global online video users touched 1 billion in 2010 and is expected to reach the 500 million mark in 2015 [26]. A wide variety of videos is easily available and accessible online. Different video types are readily available to users, from short videos that run for 1 to 15 minutes to full-length movies with a running time of 2 to 3 hours, live shows, and repeats of programs that were telecast on digital TV. Live streaming and online videos have become the most popular Internet services. Flash players are widely used, and almost 99% of Internet users make use of them [n5]. This shows that online video streaming has an important role to play and is here to stay. Using these services, one can upload and download any number of videos, irrespective of place and time. YouTube, with its tag line 'Broadcast yourself', is a widely used video sharing service provider and delivers videos through progressive download.

History of YouTube and Delivery Methodology

YouTube is the most prominent video streaming portal, serving more than two billion videos a day [17]. The domain name for YouTube was registered on February 14, 2005, and YouTube was purchased by Google in November 2006 [n4]. Initially, YouTube offered videos at 320 × 240 pixels. As it gained popularity, it started providing videos in different resolutions in order to mitigate bandwidth problems. When a video is uploaded, YouTube generates versions of it in different resolutions so that it can accommodate clients with different bandwidths as well as mobile applications, which need a low resolution.

The YouTube video delivery consists of three basic parts

Video Id space

Hierarchical cache server DNS namespaces

Physical server cache hierarchy.

Video Id Space:

Every YouTube video has a unique identifier; the set of all such identifiers is known as the video ID space. An identifier is eleven characters long and consists of letters [A-Z] and [a-z], digits [0-9], and the characters '-' and '_'.
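As a small illustration, the video ID can be extracted from a watch URL and checked against the eleven-character format. This is only a sketch: the function name `extract_video_id` is hypothetical, and the accepted alphabet (upper- and lowercase letters, digits, '-' and '_') is an assumption based on the video URLs used later in this report.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed format: exactly eleven characters from A-Z, a-z, 0-9, '-' and '_'.
VIDEO_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(url):
    """Return the `v` query parameter of a watch URL if it looks like a video ID."""
    query = parse_qs(urlparse(url).query)
    candidates = query.get("v", [])
    if candidates and VIDEO_ID_PATTERN.fullmatch(candidates[0]):
        return candidates[0]
    return None
```

For http://www.youtube.com/watch?v=t4H_Zoh7G5A the sketch returns t4H_Zoh7G5A; for a URL without a well-formed `v` parameter it returns None.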

DNS namespaces:

YouTube operates with the help of DNS namespaces. The DNS namespaces represent collections of logical video servers with definite roles. Collectively, these DNS namespaces form a layered organization of logical video servers. The three DNS namespaces are:

lscache

tccache

cache

Physical server cache hierarchy:

After thorough research, it was concluded that YouTube employs a 3-tier physical cache hierarchy with 38 primary locations, 8 secondary locations and 5 tertiary locations [25].

Figure 3‑I: Brief overview of communication between client and YouTube

Communication between client, YouTube and CDN

When a client opens a video URL such as http://www.youtube.com/watch?v=t4H_Zoh7G5A, the server returns an HTML page with embedded URLs that point to the Flash video server responsible for serving that particular video. When the client clicks the play button of the selected video, an HTTP GET message is sent from the client to the server. From this message the server learns which video, identified by its unique video ID, the client is requesting. After receiving the GET message, the server replies with an HTTP 303 response whose Location header redirects the client to the video server from which the video is streamed. Redirecting requests in this way introduces load balancing. Since the main server redirects the client to the relevant video servers, the main server needs complete knowledge of all the videos and their servers. YouTube, however, follows a different and better strategy: the use of a CDN. The CDN servers send the video content over TCP in a single message, an HTTP 200 OK response [28]. Given YouTube's tremendous growth and its ever-increasing demand and popularity, network traffic engineering is essential for sustainable development. Even with many such strategies in place, clients often experience problems such as freezing and re-buffering. Hence, considering the user experience plays a key role.

User experience on real time videos

The number of viewers accessing the Internet on the one hand, and the number of videos offered on the Internet on the other, have grown to such an extent that network congestion has become a regular phenomenon. Due to network congestion, viewers experience many problems, particularly outages in the network that lead to longer waiting times. Network congestion leads to freezes in the video, volatile service performance, congested media streaming and time wasted on re-buffering. When viewing a video or image takes longer than usual, a busy and time-conscious user stops using the service. Present-day viewers expect prompt access to video, and a service provider found wanting in this respect is likely to lose clientele [28].

Figure 3‑II: Buffering of a video

Waiting intervals

As stated above, the user grows weary after waiting for a certain time. Waiting intervals are classified into three categories. The intervals and the corresponding user experience are stated below [19].

S.no   Time elapsed   Type of reply
1      0.1-0.2 sec    Instant reply
2      1-5 sec        Immediate reply
3      5-10 sec       Slow reply

Table 1: User perspective of watching videos

If the reply comes within 0.1 to 0.2 seconds, it is considered an instant reply: the response is very quick and the user need not wait to watch the video.

If the reply comes within 1 to 5 seconds, it is considered an immediate reply: the users notice the delay, but they are not interrupted and keep watching.

If the reply comes within 5 to 10 seconds, it is considered a slow reply: the response is slow and the users lose all interest.
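The three categories above can be expressed as a small lookup, sketched below. The function name `classify_reply` is hypothetical. The table leaves some delays uncovered (for example 0.5 seconds) and lists 5 seconds in two categories; the sketch assumes that exactly 5 seconds counts as an immediate reply and labels uncovered delays as unclassified.

```python
def classify_reply(delay_seconds):
    """Map a waiting time in seconds to the reply type of Table 1."""
    if 0.1 <= delay_seconds <= 0.2:
        return "instant reply"
    if 1 <= delay_seconds <= 5:
        return "immediate reply"  # assumption: 5 s belongs to this category
    if 5 < delay_seconds <= 10:
        return "slow reply"
    return "unclassified"  # delay not covered by the table
```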

As the number of online users increases, so does the demand. A user watching a video expects smooth playback without any delays. If the video is blurred, or if it stops and buffers at any point, the user becomes dissatisfied and may move on without watching it. Hence, it is critical for service providers to address this issue and to come up with strategies that keep the user from getting bored and quitting.

Burst Analysis

Video traffic in the Internet can be evaluated by analyzing the traces closely. The traces can be divided into three parts:

Off times

On times

Bursts

Off time

Off time is defined as the time during which there is no data transfer. In this thesis we consider an interval an off time if no data transfer takes place for at least two to three seconds.

When we observe the traces, we receive packets at one instant, then there is no packet transfer for a few seconds, and then we receive another bunch of data packets; this process repeats for a few minutes. We treat intervals with zero packets as off times, but when the time scale is made too fine, zeroes also appear within a burst. Hence, not all zeroes are off times; the time scale has to be taken into account. In our observations the off time varied from 17 to 25 seconds.

On time

On time is defined as the time during which packets or data arrive in response to a request. For example, if the client requests a video from the server, the server starts transferring the data packets. The time during which the data is transferred is called the on time.

Burst

Regions of high intensity in the traces can be marked as bursts.

In simple words, if the packets of a particular data transfer arrive in very fast succession, this is called a burst. In our case we define a burst as the large amount of data that arrives after a certain off time.

For instance, when we collected the traces of a video, we observed a set of nearly 1221 packets after nearly 28 seconds of no data delivery. These figures vary a little and are given precisely in the later sections.
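The definitions of off time, on time and burst can be sketched in code as follows. This is an illustration under stated assumptions rather than the procedure used in the experiments: the function name `find_bursts` is hypothetical, the arrival times are assumed to be sorted and given in seconds, and a gap of at least `off_threshold` seconds (two seconds by default, following the off time definition above) is taken to separate two bursts.

```python
def find_bursts(arrival_times, off_threshold=2.0):
    """Group sorted packet arrival times (in seconds) into bursts.

    Returns (bursts, off_times). Each burst is a tuple
    (start_time, end_time, packet_count); off_times lists the gaps
    between consecutive bursts.
    """
    bursts, off_times = [], []
    if not arrival_times:
        return bursts, off_times
    start = prev = arrival_times[0]
    count = 1
    for t in arrival_times[1:]:
        gap = t - prev
        if gap >= off_threshold:
            # The gap is long enough to count as an off time:
            # close the current burst and start a new one.
            bursts.append((start, prev, count))
            off_times.append(gap)
            start, count = t, 0
        count += 1
        prev = t
    bursts.append((start, prev, count))
    return bursts, off_times
```

The on time of a burst is then its end time minus its start time, and the length of the burst is its packet count.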

Figure 3‑III Burst

Design and Implementation

RESEARCH QUESTIONS

Which are the most annoying QoE problems in the YouTube video delivery?

What is the video delivery methodology of YouTube?

How does the video delivery methodology of popular videos differ from the non-popular ones?

What are the main parameters considered in analyzing the bursts?

What type of distribution is being followed by the video delivery?

Research Methodology

Our present research is analysis oriented and empirical. In it we observe the main differences in video delivery between popular and non-popular videos.

Traces are collected by playing various popular YouTube videos from different networks at different resolutions.

Three videos were uploaded from three different continents and their traces were collected in the same conditions as the above.

Collection of traces was followed by identification of bursts in the traces and performing mathematical calculations.

The origin of the video and the methodology being deployed for popular and non-popular videos has been analyzed.

The time taken for the arrival of video packets and their inter-packet times have been calculated and the bursts were sorted.

After analyzing the bursts and ON-OFF times the strategy of the traffic flow was examined so as to identify whether the data flow was smooth or disturbed.

From the identified ON-OFF times, bursts pattern and their distribution over multiple time scales were investigated.

The Cumulative Distribution Function (CDF) and Complementary Cumulative Distribution Function (CCDF) graphs were plotted for the bursts in order to find the type of distribution they follow.
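The last step of the list above, computing the CDF and CCDF of a burst measure, can be sketched as follows. The function name `empirical_cdf` is hypothetical, and the sketch uses the plain empirical distribution of the samples (for example burst durations), which is an assumption about how the graphs could be produced.

```python
def empirical_cdf(samples):
    """Return a list of (value, cdf, ccdf) tuples for the distinct sample values.

    cdf  is the fraction of samples less than or equal to the value,
    ccdf is the fraction of samples greater than the value (1 - cdf).
    """
    ordered = sorted(samples)
    n = len(ordered)
    points = []
    for i, value in enumerate(ordered, start=1):
        # Emit a point only at the last occurrence of each distinct value.
        if i == n or ordered[i] != value:
            cdf = i / n
            points.append((value, cdf, 1.0 - cdf))
    return points
```

Plotting the CCDF values on a logarithmic axis is a common way to judge by eye whether the tail of the distribution decays exponentially or more slowly.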

Design

This section presents the methodology and explains how the experiment was carried out. The experiment deals with two aspects. Firstly, it examines the strategy followed for popular videos that already exist on YouTube. Secondly, it examines the strategies followed when we upload videos to YouTube ourselves and view them.

System Requirements:

A PC/laptop

Web browser with embedded Flash player

Wireshark

For our experiment, we used a Sony Vaio (VPCSB26FG) laptop with a 2.4 GHz Core i5 processor running the 64-bit version of Windows 7. Wireshark version 1.8.3 (win64), the latest version at the time, was installed on the laptop.

Experimentation:

Our thesis covers two different aspects of the experiment. The first aspect critically examines the video delivery strategy followed by the service provider for popular videos. The second aspect deals with the change in strategy when the videos are not popular. The underlying process is almost the same in both aspects; only the selection of videos differs. In this section, both aspects are discussed in detail.

Aspect-1:

The place of origin, content and language of a video largely determine the number of viewers; a video in a regional language, for example, may not have an international following. For the purpose of the experiment, we selected three sets of videos based on their popularity and categorized them as popular, moderately popular and least popular. The number of views and the place of origin of the video were the basic selection criteria, and two videos from each of the above categories were selected.

For this, we selected two of the most popular English pop songs, two widely watched Indian songs, two popular and two least viewed South Indian regional songs, and one Indian song that is quite popular everywhere. The URLs of all the songs and their descriptions are given in the following table.

S.no   Video URL                                     Description
1      http://www.youtube.com/watch?v=t4H_Zoh7G5A    Very popular English pop song
2      http://www.youtube.com/watch?v=2up_Eq6r6Ko    Very popular English pop song
3      http://www.youtube.com/watch?v=eHZn85RsrCE    Semi popular Hindi song from India
4      http://www.youtube.com/watch?v=h7j17dx_rjw    Semi popular Hindi song from India
5      http://www.youtube.com/watch?v=1aVVwG6W59Y    Semi popular South Indian regional song
6      http://www.youtube.com/watch?v=Y1gKGTAVDNo    Semi popular South Indian regional song
7      http://www.youtube.com/watch?v=5Tyi_tNYOeg    Less popular South Indian regional video
8      http://www.youtube.com/watch?v=_GMuCjESLZ0    Less popular South Indian regional song
9      http://www.youtube.com/watch?v=N3bUt10-FIQ    Popular Indian song

Table 4‑IV: Videos used for the experiment

Wireshark was kept running in the background with a filter combining two conditions: (ip.dst==80.78.216.226)&&(tcp.port==80). Here 80.78.216.226 was the IP address of our computer at the time of capture; because the address changes with the Internet connection, it must be checked before the filter is set. The second condition keeps only traffic on TCP port 80, that is, HTTP. Since Wireshark captures the traffic of every program running on the computer, filtering down to the required packets removes many unwanted traces and reduces the manual work of picking out the video packets.

Firstly, the videos were played in the Google Chrome web browser on a fixed network in Karlskrona, Sweden. We set the video quality to 360p and played each video while Wireshark collected the traces. We saved the traces in text format and imported them into a Microsoft Excel sheet, where the calculations were easy to perform. Only the time, source and length fields were needed, so the remaining fields were removed. We calculated the inter-packet time, that is, the difference between the arrival times of two consecutive packets. We then binned the packets on three time scales (1 second, 0.1 second and 0.01 second) and counted the number of packets that arrived in each interval. The experiment was repeated in the same way at a lower resolution, 240p, and the same calculations were carried out.
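The spreadsheet calculations described above can also be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure we actually used (that was done in Excel); the function names and the sample arrival times are hypothetical.

```python
from collections import Counter


def inter_packet_times(arrival_times):
    """Return the gaps (in seconds) between consecutive packet arrivals."""
    times = sorted(arrival_times)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def packets_per_interval(arrival_times, scale):
    """Count packets in consecutive intervals of `scale` seconds.

    Interval k covers [k*scale, (k+1)*scale). Empty intervals are kept as
    zeros so that gaps between bursts stay visible. Times must be >= 0.
    """
    if not arrival_times:
        return []
    # round() guards against float artefacts such as 1.2 / 0.1 = 11.999...
    counts = Counter(int(round(t / scale, 9)) for t in arrival_times)
    return [counts.get(k, 0) for k in range(max(counts) + 1)]


# Hypothetical arrival times in seconds (not taken from our traces).
sample = [0.05, 0.12, 0.18, 0.95, 1.20]
print(inter_packet_times(sample))
for scale in (1.0, 0.1, 0.01):
    print(scale, packets_per_interval(sample, scale))
```

Keeping the empty intervals matters for the later burst analysis, because the zero-count intervals are what separate one burst from the next.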

This entire experiment was repeated on a wireless network. For this we collected the traces for the same set of videos during a busy hour at the BTH Library, Karlskrona. The traces were then imported into Excel sheets to perform the same calculations.

Aspect-2:

Our main motive was to find the differences in the video delivery strategy for distinct videos based on their origin and popularity on YouTube. Hence, with local help, we uploaded three videos ourselves from locations on three different continents: Hyderabad, India; New Jersey, United States; and Birmingham, United Kingdom.

To ensure that the video length and type did not differ, we downloaded the same popular video from YouTube and re-uploaded it. We also ensured that these videos were not opened anywhere near Sweden. To avoid any influence of cached search results, we did not search for the videos. Instead, we pasted the URL of each video directly into the address bar and started collecting the traces using Wireshark.

The URLs of the uploaded videos are given below.

S.no | URL | Location
1 | http://youtu.be/AX1Orh9CW6k | India
2 | http://www.youtube.com/watch?v=sStxsORRaOs&feature=youtu.be | U.S
3 | http://www.youtube.com/watch?v=IJ-9dQAbCjY | U.K

Table 5-II: Videos Uploaded by Us

We then performed the Wireshark experiment in the same scenarios, fixed network and wireless LAN. We saved the packets in an Excel sheet and calculated the inter-packet times and the number of packets arriving per interval on the same time scales.

Figure 5‑I: Capturing traces using Wireshark

As stated in section 2.4, the experimental method consists of three major steps. We selected a set of videos from the wide range available on YouTube. We then collected traces of these videos under various experimental conditions: fixed and wireless networks, at two different resolutions. Finally, we analyzed the on times and off times of all the videos, with emphasis on the bursts and their behavior.

Analysis

Chapter 4 explained how the traces were collected. The present chapter explains how the collected traces were analyzed. As explained, the experiment has two aspects. We provide an overview of YouTube's video delivery methodology, in which the packet arrival times and the strategy followed in burst delivery were analyzed thoroughly. We divide the videos into two categories, popular and non-popular, and explain the delivery methodology with a focus on the differences between the two.

Burst: In general, a burst is any relatively high-bandwidth transmission over a short period, and arbitrary high-intensity regions of a trace can be marked as bursts. In our case, we define a burst as a set of packets delivered in a very short span of time.

We consider a burst to be a continuous flow of packets. For our research, if there is no data flow for more than 0.1 second, we treat the packets that follow as the start of a new burst. In the traces the gap between two bursts was typically more than twenty seconds, so the bursts are easy to identify and sort.
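The 0.1-second rule above amounts to a simple gap-based segmentation of the packet arrival times. A minimal sketch follows, assuming the arrival times are available as a list of seconds; the function names and the sample values are hypothetical and only illustrate the rule.

```python
def split_into_bursts(arrival_times, max_gap=0.1):
    """Group arrival times into bursts.

    A new burst starts whenever the gap to the previous packet exceeds
    `max_gap` seconds (0.1 s is the threshold used in our analysis).
    """
    bursts = []
    current = []
    for t in sorted(arrival_times):
        if current and t - current[-1] > max_gap:
            bursts.append(current)
            current = []
        current.append(t)
    if current:
        bursts.append(current)
    return bursts


def describe(bursts):
    """Return (start, end, packet_count) for every burst."""
    return [(b[0], b[-1], len(b)) for b in bursts]


# Hypothetical arrival times: three packets, a long pause, two packets,
# another pause, then a single packet.
times = [0.00, 0.02, 0.05, 20.00, 20.03, 45.50]
for start, end, packets in describe(split_into_bursts(times)):
    print(start, end, packets)
```

Because the pauses between bursts are far longer than 0.1 second, the result is not sensitive to the exact threshold as long as it stays well below the pause length.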

Time Scale Selection: As mentioned earlier, we used several time scales, but the statistics given below are for the 0.01-second scale. On finer scales the packets within a burst can be observed more clearly, but below 0.01 second many empty (zero-packet) intervals appear in the middle of a burst, which is confusing. On coarser scales the packet arrival pattern is no longer clear. Hence we set the 0.01-second scale as the standard for our analysis.

As part of the experiment, comparisons between different sets of videos were made and are presented below. The first case presents the burst strategy for a popular video on fixed LAN and wireless networks. The second case presents the same for an uploaded video. The third and fourth cases compare the popular and the non-popular (uploaded) video on the fixed and the wireless network respectively, so that the different strategies deployed in the two cases can be observed.

Case – I:

Comparison between Fixed LAN and Wireless Delivery in a Popular Video

S.no | Network | Burst timings (s) | Inter-burst time (s) | Packets
1 | LAN | 19.5842 – 20.49794 | - | 2428
1 | Wireless | 9.21627 – 10.7956 | - | 2448
2 | LAN | 33.5307 – 33.74371 | 13.03276 | 1239
2 | Wireless | 23.69603 – 24.36507 | 12.90043 | 1219
3 | LAN | 59.1077 – 59.25039 | 25.36399 | 1194
3 | Wireless | 49.10903 – 49.91003 | 24.74396 | 1222
4 | LAN | 84.50535 – 84.83664 | 25.25496 | 1204
4 | Wireless | 74.48594 – 75.11597 | 24.57591 | 1219
5 | LAN | 113.70724 – 113.91231 | 28.8706 | 1208
5 | Wireless | 103.8926 – 104.8277 | 28.77663 | 1199
6 | LAN | 145.1061 – 145.3521 | 31.19377 | 1221
6 | Wireless | 135.1518 – 135.8595 | 31.2542 | 1221
7 | LAN | 178.37667 – 178.75199 | 33.0246 | 1221
7 | Wireless | 168.4999 – 169.3528 | 32.6404 | 1194
8 | LAN | 209.72976 – 210.10278 | 30.97777 | 1221
8 | Wireless | 199.8512 – 200.5509 | 30.4984 | 1220
9 | LAN | 239.13356 – 239.48198 | 29.03078 | 1191
9 | Wireless | 229.1873 – 229.8208 | 28.6364 | 1220
10 | LAN | 270.58453 – 271.09081 | 31.10255 | 1221
10 | Wireless | 260.6478 – 261.3496 | 30.827 | 1221
Avg | LAN | - | 29.35238 | -
Avg | Wireless | - | 28.99411 | -

Table 5-I: Comparison between Fixed LAN and Wireless Networks of a popular video

In Table 5-I, the first (dark shaded) row of each pair gives the statistics for LAN and the second (unshaded) row gives those for wireless. The same pattern is followed in the next case. The Burst Timings column gives the starting and ending time of each burst. The Inter-burst time column shows the gap between the end of one burst and the start of the next. The averages in the last rows are taken over the inter-burst times of bursts 3 to 10, leaving out the shorter first gap. It can be observed that the inter-burst time is almost the same for both network types. In both cases the initial burst is a combination of two bursts.
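As a worked check of how the inter-burst column relates to the burst timings, the short sketch below recomputes the gaps for the LAN rows of Table 5-I. The variable and function names are ours and purely illustrative; the numbers are the start and end times from the table. Small differences in the last digit against the tabulated values can arise from rounding.

```python
# (start, end) of each LAN burst in seconds, copied from Table 5-I.
lan_bursts = [
    (19.5842, 20.49794),
    (33.5307, 33.74371),
    (59.1077, 59.25039),
    (84.50535, 84.83664),
    (113.70724, 113.91231),
    (145.1061, 145.3521),
    (178.37667, 178.75199),
    (209.72976, 210.10278),
    (239.13356, 239.48198),
    (270.58453, 271.09081),
]


def inter_burst_times(bursts):
    """Gap between the end of one burst and the start of the next."""
    return [nxt[0] - cur[1] for cur, nxt in zip(bursts, bursts[1:])]


gaps = inter_burst_times(lan_bursts)
# The tabulated average leaves out the short first gap (before burst 2).
steady_gaps = gaps[1:]
average = sum(steady_gaps) / len(steady_gaps)
print([round(g, 5) for g in gaps])
print(round(average, 5))
```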

Case – II:

Comparison between Fixed LAN and Wireless Delivery of Uploaded Video

S.no | Network | Burst timings (s) | Inter-burst time (s) | Packets
1 | LAN | 24.08368 – 27.228853 | - | 2435
1 | Wireless | 28.105929 – 30.831294 | - | 2447
2 | LAN | 28.331829 – 29.780078 | 1.102976 | 1221
2 | Wireless | 31.871215 – 33.343162 | 1.039921 | 1221
3 | LAN | 53.920937 – 55.370884 | 24.14086 | 1221
3 | Wireless | 57.860147 – 59.054956 | 24.51699 | 1229
4 | LAN | 81.771248 – 83.215627 | 26.40036 | 1221
4 | Wireless | 85.292794 – 86.753115 | 26.23784 | 1220
5 | LAN | 110.84133 – 112.630225 | 27.6257 | 1221
5 | Wireless | 115.391562 – 116.804061 | 28.63845 | 1220
6 | LAN | 145.271422 – 146.735107 | 32.6412 | 1215
6 | Wireless | 148.575039 – 150.527657 | 31.77098 | 1221
7 | LAN | 179.404464 – 181.135459 | 32.66936 | 1201
7 | Wireless | 182.777963 – 184.43573 | 32.25031 | 1207
8 | LAN | 210.1197 – 211.586933 | 28.98424 | 1222
8 | Wireless | 213.45918 – 214.974627 | 29.02345 | 1222
9 | LAN | 244.325938 – 245.750965 | 32.73901 | 1205
9 | Wireless | 247.589923 – 249.051766 | 32.6153 | 1221
10 | LAN | 278.172615 – 279.641148 | 32.42165 | 1209
10 | Wireless | 281.515927 – 283.008923 | 32.46416 | 1221
Avg | LAN | - | 29.7028 | -
Avg | Wireless | - | 29.68968 | -

Table 5-II: Comparison between Fixed LAN and Wireless Network of Uploaded video

Table 5-II presents the delivery statistics of the uploaded video in the LAN and wireless networks. The delivery in both networks is almost the same. The first two bursts arrive together, as in Case I, and the next burst follows after a short gap of about one second. In fact, these first three bursts arrive at almost the same time, but as stated earlier we count a new burst whenever the gap between two video packets exceeds 0.1 second. The inter-burst time is almost the same in both networks.

Case – III:

Comparison between Popular and Uploaded Video in Fixed Network

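S.no | Video | Burst duration (s) | Inter-burst time (s) | CoTV | Correlation (%)
1 | Popular | 0.91374 | - | 0.563702 | 99.55
1 | Uploaded | 3.145173 | - | 0.177135 | 94.93
2 | Popular | 0.21301 | 13.03276 | 0.41004 | 98.62
2 | Uploaded | 1.448249 | 1.102976 | 0.079226 | 94.47
3 | Popular | 0.14269 | 25.36399 | 0.369031 | 97.37
3 | Uploaded | 1.449947 | 24.14086 | 0.062403 | 93.26
4 | Popular | 0.33129 | 25.25496 | 0.472802 | 97.43
4 | Uploaded | 1.444379 | 26.40036 | 0.096203 | 95.30
5 | Popular | 0.20507 | 28.8706 | 0.349421 | 99.60
5 | Uploaded | 1.788895 | 27.6257 | 0.223138 | 90.60
6 | Popular | 0.246 | 31.19377 | 0.361452 | 98.78
6 | Uploaded | 1.463685 | 32.6412 | 0.240099 | 97.91
7 | Popular | 0.37532 | 33.0246 | 0.708073 | 99.49
7 | Uploaded | 1.730995 | 32.66936 | 0.398772 | 93.54
8 | Popular | 0.37302 | 30.97777 | 0.778645 | 98.93
8 | Uploaded | 1.467233 | 28.98424 | 0.161139 | 93.78
9 | Popular | 0.34842 | 29.03078 | 0.419414 | 97.93
9 | Uploaded | 1.425027 | 32.73901 | 0.130974 | 91.89
10 | Popular | 0.50628 | 31.10255 | 0.597891 | 98.42
10 | Uploaded | 1.468533 | 32.42165 | 0.194845 | 91.59
Avg | Popular | 0.332258 | 29.35238 | 0.503047 | 98.612
Avg | Uploaded | 1.530192 | 29.702797 | 0.176393 | 93.727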

Table 5-III: Comparison between Popular and Uploaded Video in Fixed LAN

Table 5-III shows the video delivery statistics of the popular and the uploaded (non-popular) video in the fixed network. The Burst Duration column gives the time that each burst lasts. In the popular video the first two bursts arrive as a set. Not much difference was observed in the inter-burst time.

Case – IV:

Comparison between Popular and Uploaded Video in Wireless Network

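S.no | Video | Burst duration (s) | Inter-burst time (s) | CoTV | Correlation (%)
1 | Popular | 1.57933 | - | 0.387033 | 92.12
1 | Uploaded | 2.725365 | - | 0.168367 | 94.93
2 | Popular | 0.66904 | 12.90043 | 0.201448 | 90.85
2 | Uploaded | 1.471947 | 1.039921 | 0.223493 | 94.47
3 | Popular | 0.801 | 24.74396 | 0.408963 | 88.22
3 | Uploaded | 1.194809 | 24.51699 | 0.458311 | 93.26
4 | Popular | 0.63003 | 24.57591 | 0.183063 | 94.22
4 | Uploaded | 1.460321 | 26.23784 | 0.218209 | 95.33
5 | Popular | 0.9351 | 28.77663 | 0.267653 | 91.97
5 | Uploaded | 1.41249 | 28.63845 | 0.262304 | 92.61
6 | Popular | 0.7077 | 31.2542 | 0.364916 | 87.52
6 | Uploaded | 1.952618 | 31.77098 | 0.415253 | 97.91
7 | Popular | 0.8529 | 32.6404 | 0.246851 | 90.67
7 | Uploaded | 1.65776 | 32.25031 | 0.325936 | 93.54
8 | Popular | 0.6997 | 30.4984 | 0.125357 | 96.17
8 | Uploaded | 1.515447 | 29.02345 | 0.346176 | 93.78
9 | Popular | 0.6335 | 28.6364 | 0.217714 | 95.82
9 | Uploaded | 1.461843 | 32.6153 | 0.254624 | 92.89
10 | Popular | 0.7018 | 30.827 | 0.306578 | 85.64
10 | Uploaded | 1.49299 | 32.46416 | 0.227936 | 92.59
Avg | Popular | 0.746373 | 28.99411 | 0.2709576 | 91.32
Avg | Uploaded | 1.485965 | 29.68968 | 0.2900609 | 94.131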

Table 5-IV: Comparison between Popular and Uploaded Video in Wireless Network

Table 5-IV presents the video delivery statistics of the popular and the uploaded (non-popular) video in the wireless network. In the popular video the first two bursts arrive as a set and the next burst follows after a gap. Not much difference is observed in the inter-burst time.

After analyzing all the scenarios thoroughly, we plotted the Cumulative Distribution Function (CDF) and Complementary Cumulative Distribution Function (CCDF) graphs for each burst individually.

Fig 5-I: Comparison of CDFs between Popular and Uploaded Videos in Fixed LAN

Figure 5-II: Comparison of CDFs between Popular and Uploaded Videos in Wireless Network

The graphs in Fig 5-I and Figure 5-II show the CDF of the popular video against those of the uploaded videos in both networks. All three uploaded videos have almost the same CDF, but the CDF of the popular video differs considerably from the uploaded ones in both scenarios. Now that the burst strategy has been analyzed, it is important to see which distribution it follows. The CDF plots were compared with various distributions, and the one that agreed most closely with the observed curves was the Normal distribution.
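For readers who want to reproduce this kind of comparison, the sketch below builds an empirical CDF and CCDF from a list of samples and measures how far the empirical CDF lies from a Normal CDF with the same mean and standard deviation. It is an illustration under our own assumptions: the samples are taken to be something like per-interval packet counts within a burst, the helper names are hypothetical, and the distance used is a rough indicator evaluated at the sample points rather than a formal goodness-of-fit test.

```python
import math
from statistics import mean, pstdev


def ecdf(samples):
    """Empirical CDF as sorted (value, fraction of samples <= value) pairs."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


def ccdf(samples):
    """Empirical complementary CDF: fraction of samples greater than value."""
    return [(x, 1.0 - p) for x, p in ecdf(samples)]


def normal_cdf(x, mu, sigma):
    """CDF of a Normal distribution with mean mu and std deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def max_cdf_distance(samples):
    """Largest gap between the ECDF and the fitted Normal CDF.

    Evaluated at the sample points only; requires a non-zero spread.
    """
    mu, sigma = mean(samples), pstdev(samples)
    return max(abs(p - normal_cdf(x, mu, sigma)) for x, p in ecdf(samples))


# Hypothetical per-interval packet counts of one burst.
counts = [10, 12, 11, 13, 9, 12, 11, 10, 12, 11]
print(ecdf(counts)[-1], ccdf(counts)[-1])
print(max_cdf_distance(counts))
```

A small distance means the Normal curve tracks the empirical CDF closely; repeating the same comparison with other candidate distributions shows which one matches best.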

Results and Discussions

The main focus of the thesis was the video delivery strategy of YouTube. As discussed in Chapter 5, the burst analysis was carried out, and the important conclusions are presented in the form of the graphs that follow.

Duration of Burst

Fig 6-I: Burst Duration in Popular video and Uploaded Video

For both the popular and the uploaded videos, the first two bursts arrive together. Hence the duration of the first burst is longer (highlighted in red).

Burst delivery is very fast for the popular video: the burst duration is comparatively short and, after the first burst, does not exceed 0.6 seconds. For the uploaded video, the burst duration in the fixed network was more than twice that of the popular one, at about 1.4 to 1.8 seconds after the first burst (Table 5-III).

For the wireless network, the burst duration was marginally longer than in the fixed network, and the same strategy was used as in the fixed network.

Fig 6-I is the error bar graph of the burst duration for popular and uploaded videos. The graph shows the positive/negative error between the bursts.

Inter-Burst Time

Fig 6-II: Inter-burst Time in Fixed LAN and Wireless Networks for Popular and Uploaded Videos

It is clear that the first two bursts arrive together and the next burst follows shortly. In popular videos this burst arrives after 12 to 13 seconds, about half of the later inter-burst times.

For the uploaded videos this burst arrives even sooner, after about one second.

Apart from these first gaps, the inter-burst time remained almost constant across the fixed LAN and wireless networks, averaging about 29 to 30 seconds for both popular and uploaded videos (Tables 5-I and 5-II).

Length of the Burst

Fig 6-III: Total Packets per Burst

A burst typically consists of 1221 video packets. Because of network conditions, however, the count may vary by about ±30.

The length of the burst was almost the same for all bursts, in both types of videos and in both networks.

As the first burst is a combination of two bursts, it contains about 2442 packets (2 × 1221), and the graph drops to about 1221 for the remaining bursts.

Coefficient of Throughput Variation

S.no | Network | Popular | Uploaded
1 | LAN | 0.503047 | 0.176393
  | Wireless | 0.2709576 | 0.2900609

Table 6-I: Average of CoTV for Popular and Uploaded Videos in Fixed LAN and Wireless Networks

A lower CoTV denotes that the data flow within the burst was smooth and the traffic was regular. Hence a lower CoTV is desirable.
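To make the interpretation concrete, the sketch below computes a coefficient of variation over per-interval throughput samples. It assumes that CoTV is the standard deviation of the per-interval throughput divided by its mean, and it uses packet counts per interval as a stand-in for throughput; the function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def cotv(throughput_samples):
    """Coefficient of throughput variation: standard deviation / mean.

    Assumption: the samples are per-interval throughput values (here,
    packet counts per interval) taken within one burst.
    """
    avg = mean(throughput_samples)
    if avg == 0:
        raise ValueError("CoTV is undefined when the mean throughput is zero")
    return pstdev(throughput_samples) / avg


# A perfectly regular flow has CoTV 0; an on/off flow has a high CoTV.
print(cotv([5, 5, 5, 5]))
print(cotv([0, 10, 0, 10]))
```

The two hypothetical inputs carry the same average load, so the difference in CoTV reflects only how evenly the packets are spread over the intervals.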

In the fixed network, the CoTV of the popular videos (0.503) was considerably higher than that of the uploaded videos (0.176), so the uploaded videos were delivered more smoothly.

In the wireless network, the CoTV of the uploaded videos (0.290) was slightly higher than that of the popular videos (0.271), so the popular videos were delivered slightly more smoothly.

When the time scale is scaled up, the CoTV values improve because the burst becomes smoother, as is evident in the figure given below.

Fig 6-IV: Number of packets per interval in different time scales.

In Fig 6-IV the graphs of a burst on the 0.01-second, 0.02-second and 0.04-second scales are presented (in clockwise order). In the first case the burst was rather uneven and bumpy, and the CoTV was 0.260358. In the second, the burst was more stable and the CoTV was 0.044766. In the third, the burst was quite stable, the traffic appeared nearly uniform and the CoTV was 0.029955. As the time scale increases, the burst becomes flatter and the CoTV improves. To analyze the real structure of the bursts, however, we need a scale on which the flow within the burst is not yet uniform.
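The smoothing effect of a coarser time scale can be illustrated by merging neighbouring intervals and recomputing the CoTV. The sketch below assumes CoTV is the standard deviation divided by the mean of the per-interval packet counts; the counts are hypothetical and are not taken from our traces.

```python
from statistics import mean, pstdev


def aggregate(counts, factor):
    """Merge every `factor` neighbouring intervals into one coarser interval.

    A trailing group with fewer than `factor` intervals is dropped.
    """
    usable = len(counts) - len(counts) % factor
    return [sum(counts[i:i + factor]) for i in range(0, usable, factor)]


def cotv(counts):
    """Standard deviation of the counts divided by their mean."""
    return pstdev(counts) / mean(counts)


# Hypothetical packet counts per 0.01 s interval inside one burst.
fine = [3, 9, 2, 12, 4, 8, 1, 9]
for factor, scale in ((1, 0.01), (2, 0.02), (4, 0.04)):
    coarse = aggregate(fine, factor)
    print(scale, coarse, round(cotv(coarse), 4))
```

Merging intervals averages out short-term fluctuations, which is why the coarser scales show a lower CoTV while hiding the fine structure of the burst.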

Server Selection Strategy

If the video is popular, YouTube maps the user to the nearest cache location; if the video is non-popular, the user is mapped to a cache location in the U.S.

To reduce network traffic, popular videos are replicated and placed in most of the hotspots, so the main server does not need to send the requested video to every client. Instead, it redirects the request to the hotspot nearest to the user, and the video is delivered from that hotspot (cache location).

Conclusion

This thesis presents the results of an experiment performed to characterize YouTube video streaming. An experimental set-up was created to capture, using Wireshark, the packet transfer of selected videos from the server to the end user in different networks. The collected data was filtered and processed on different time scales.

The thesis set out to analyze three points about YouTube.

The first concerns the most annoying QoE problems in YouTube.

After a thorough literature review, we found that these were stalling and re-buffering. Stalling was the most pertinent and annoying problem which is caused due to

The second concerns the video delivery methodology of YouTube and how it differs in serving videos based on their popularity.

YouTube employs a CDN architecture in order to avoid network congestion, and popular videos are served from local hotspots. Non-popular videos are delivered by the main cache server located in the U.S. The results have shown the origin of popular and non-popular videos.

The third concerns the distribution followed by the packet delivery within a burst and the main parameters determining the video delivery.

After a thorough analysis of a number of bursts from 50 videos, we concluded that the packet delivery within a burst follows a Normal distribution. The main parameters that characterize the delivery of video packets in bursts are the burst duration, the inter-burst time and the length of the burst.


