Evaluation of the YouTube Architecture


02 Nov 2017


This report evaluates the architecture and technologies used in YouTube, along with their strengths and weaknesses. The evaluation focuses on the scalability and failure handling of the system. Potential bottlenecks and some improvements are also discussed.

Information technology has had a tremendous influence on every aspect of daily life. Compared with a conventional centralized system, a distributed system makes better use of the Internet and of the computers themselves, and it can achieve better performance in operation. "We define a distributed system as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages." (Coulouris, 2005, pg. 2 from lecture note). A distributed system enables people to cooperate and share effectively: users connect their devices to one another and share resources over the network. Distributed systems are widely used by governments, banks, hospitals and supermarkets. Based on these concepts, a distributed system can be analyzed in more detail in terms of resource sharing, scalability, concurrency, transparency, failure handling and openness.

A new trend in multimedia consumption is emerging in which everyone is able to create and share digital content. Multimedia content distribution over the Internet has become a major application because of the increasing demand for multimedia content and improvements in technology. YouTube, founded in 2005, is the leading online video sharing website. It is a well-known video streaming system built as a distributed system, and it makes video sharing easier: people can upload and share videos through websites, mobile devices and e-mail (Song, Ronaldo & Marius, 2010). The main purpose of this report is to evaluate one specific distributed system, the YouTube video streaming system. After this introduction, the report describes the YouTube architecture and evaluates the strengths and weaknesses of the technologies it uses. The final part covers potential bottlenecks and recommendations for YouTube.

System Architecture

YouTube is a video sharing website that uses a client/server architecture. This report discusses the architecture under the following headings: web servers, video servers, thumbnail servers, databases and delivery policy.

Web servers

Fig 1. Basic design of web servers (components shown: Routers, Apache, mod_fastcgi, Python App Server, Databases)

A NetScaler is placed in front of the web servers. NetScaler provides optimization and security services and controls application delivery. With NetScaler, YouTube achieves load balancing, flexible cloud connectivity, and content caching and exchanging. NetScaler also provides protection against advanced application-layer attacks and increases the scalability of the web applications. YouTube uses Apache with mod_fastcgi as its web server. Apache is a language-independent, scalable web server: it is not tied to a specific language and runs on various platforms. The application itself is written in Python, which allows rapid, flexible development and deployment. Requests from the router are handled by the Python application, which also connects to the databases and other resources to exchange information. The web servers run Linux as the operating system because it makes the application efficient to maintain.
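
As a rough illustration of the load-balancing role described above, the following Python sketch hands incoming requests to a pool of web servers in round-robin order. It is not NetScaler's actual algorithm, which this report does not describe; the class name `RoundRobinBalancer` and the server names are assumptions made for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that hands out backends in rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._rotation = cycle(self._backends)

    def pick(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._rotation)


# Illustrative server names, not real YouTube hosts.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.pick() for _ in range(5)]
```

A production balancer would typically also weigh server load and health, which this sketch leaves out.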

Rapid growth is a significant issue that YouTube has to cope with as a video sharing website. On the web servers, Psyco, a dynamic (just-in-time) compiler for Python, is used to speed up the Python code. For highly CPU-intensive activities, YouTube uses C extensions and so benefits from the efficiency of C. Pre-generated, cached HTML is another method used to cope with growth.
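
The idea of pre-generated, cached HTML can be sketched as follows. This is a minimal illustration, assuming pages are keyed by a video identifier; `HtmlCache`, `render_page` and the identifiers are hypothetical names, not part of YouTube's code.

```python
class HtmlCache:
    """Hypothetical cache that stores rendered HTML ahead of requests."""

    def __init__(self, render):
        self._render = render  # expensive function: key -> HTML string
        self._pages = {}

    def pregenerate(self, keys):
        # Render pages before any user asks for them.
        for key in keys:
            self._pages[key] = self._render(key)

    def get(self, key):
        # Serve the stored page if present; otherwise render once and keep it.
        if key not in self._pages:
            self._pages[key] = self._render(key)
        return self._pages[key]


render_log = []  # records every real rendering, for demonstration only


def render_page(video_id):
    # Stand-in for an expensive template rendering step.
    render_log.append(video_id)
    return f"<html><body>Video {video_id}</body></html>"


cache = HtmlCache(render_page)
cache.pregenerate(["abc123", "xyz789"])
first = cache.get("abc123")
second = cache.get("abc123")
```

Both requests are answered from the stored copy, so the rendering function runs only during pre-generation.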

Video servers

In video streaming, bandwidth, hardware and power consumption are three important issues in system design. YouTube uses several methods to reduce the consumption of these resources.

YouTube uses clustering to address the consumption problem. Each video is hosted by a mini-cluster, meaning that it is stored on several machines. This is an effective way to increase video download speed, and the extra copies serve as a backup in case of machine failure. YouTube also uses Lighttpd to serve video. Lighttpd consumes less hardware than Apache, and it can be switched from a single-process to a multi-process configuration in order to handle more connections.
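
The mini-cluster idea, where several machines hold the same video and any healthy one can serve it, might look like the sketch below. The class `MiniCluster`, the machine names and the modulo-based selection are assumptions for illustration only.

```python
class MiniCluster:
    """Hypothetical group of machines that all hold copies of the same videos."""

    def __init__(self, machines):
        self.machines = list(machines)
        self.down = set()

    def mark_down(self, machine):
        # Record that a machine has failed and must not receive requests.
        self.down.add(machine)

    def pick_machine(self, request_number):
        # Spread requests across healthy machines; skip any that have failed.
        healthy = [m for m in self.machines if m not in self.down]
        if not healthy:
            raise RuntimeError("no healthy machine left in the mini-cluster")
        return healthy[request_number % len(healthy)]


cluster = MiniCluster(["m1", "m2", "m3"])
before_failure = cluster.pick_machine(1)
cluster.mark_down("m2")
after_failure = cluster.pick_machine(1)
```

Because every machine in the mini-cluster stores the video, the request that would have gone to the failed machine is simply served by another one.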

YouTube also uses different strategies for the most popular and the less popular videos. The most popular videos are moved to a content delivery network (CDN), where the content is replicated in several places. When a user makes a request, the video is delivered from the replica that is geographically closest to the user, so the transfer takes fewer network hops. CDN machines serve mostly out of memory, because the content is so popular that little of it is moved in and out of memory. The less popular videos, by contrast, are stored on a few YouTube colo servers. These videos are played only a few times a day, so spending a large amount of cache on them would not be a reasonable strategy.
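
The two decisions described above, which tier serves a video and which replica is closest to the user, can be sketched as follows. The popularity threshold, the coordinate representation and both function names are assumptions; the report does not state how YouTube makes either decision.

```python
def choose_tier(daily_views, popularity_threshold=10_000):
    """Return "cdn" for popular videos and "colo" otherwise.

    The threshold is a made-up value used only for illustration.
    """
    return "cdn" if daily_views >= popularity_threshold else "colo"


def nearest_replica(user_location, replicas):
    """Pick the replica with the smallest straight-line distance to the user.

    replicas maps a replica name to hypothetical (x, y) coordinates;
    plain distance stands in for real geographic proximity.
    """
    ux, uy = user_location

    def squared_distance(name):
        rx, ry = replicas[name]
        return (rx - ux) ** 2 + (ry - uy) ** 2

    return min(replicas, key=squared_distance)


popular_tier = choose_tier(50_000)
unpopular_tier = choose_tier(12)
closest = nearest_replica((0, 0), {"eu": (1, 1), "us": (10, 10)})
```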

The YouTube video server design follows four principles. First, the network path between videos and users is kept as simple as possible. Second, commodity hardware is used, which makes scaling easier. Third, simple tools that run on Linux are used. Finally, random seeks are handled well.

Thumbnail Servers

There are approximately four thumbnails for each video, so the number of thumbnails is much larger than the number of videos. With so many thumbnails, disk seeks and the caching of inodes and pages at the OS level become problems. The limit on the number of files in a directory is another issue on Linux. Finding an efficient way to store this many thumbnails is a challenge.

A single YouTube web page can show up to 60 thumbnails, so the number of thumbnail requests from users is very high. Apache and Lighttpd could not handle this load. Because of the large number of thumbnails, it also takes a long time to rebuild the cache when a machine is set up or rebooted. YouTube uses BigTable, a Google product with a distributed multi-level cache, to solve this issue. BigTable replicates images across different data centers, which helps YouTube achieve fault tolerance over an unreliable network.
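
To make the notion of a multi-level cache concrete, here is a minimal two-level lookup in Python. It does not use or imitate the BigTable API; `TwoLevelCache`, the dictionary-based levels and the thumbnail key are all assumptions chosen for the sketch.

```python
class TwoLevelCache:
    """Hypothetical two-level cache in front of a slow backing store."""

    def __init__(self, backing_store):
        self.l1 = {}  # small, fast level (for example, process memory)
        self.l2 = {}  # larger, slower level (for example, a shared cache)
        self.backing_store = backing_store
        self.backing_reads = 0  # how often the slow store was consulted

    def get(self, key):
        if key in self.l1:
            return self.l1[key]
        if key in self.l2:
            # Promote the entry so the next lookup is served from level 1.
            self.l1[key] = self.l2[key]
            return self.l1[key]
        # Miss at both levels: read the slow store once and fill both levels.
        value = self.backing_store[key]
        self.backing_reads += 1
        self.l2[key] = value
        self.l1[key] = value
        return value


store = {"video42_thumb0": b"jpeg-bytes"}  # illustrative thumbnail data
thumbs = TwoLevelCache(store)
first_read = thumbs.get("video42_thumb0")
second_read = thumbs.get("video42_thumb0")
```

Only the first request reaches the backing store; repeated requests for the same thumbnail are answered from the cache levels.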

Databases

YouTube used MySQL as its database in the early years. Metadata such as users, tags and descriptions is stored in MySQL. However, replication problems arose as the amount of video data held in the databases grew rapidly. Replication was the main downside of MySQL, and replication lag was severe. A sharded architecture was designed to solve this problem: database partitioning assigns different users to different shards, which solves the replication problem caused by MySQL. This approach also reduced replica lag to zero and achieved a 30% hardware reduction. It allows faster backup and recovery, and it significantly improved the scalability of the databases.
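
One common way to assign users to shards is to hash the user identifier, as in the sketch below. The report does not say how YouTube maps users to shards, so the hash-based rule, the function name `shard_for_user` and the shard count are assumptions.

```python
import hashlib


def shard_for_user(user_id, num_shards):
    """Map a user ID to a shard number in the range 0..num_shards-1.

    A stable hash is used so that the same user always lands on the
    same shard, regardless of which process computes the mapping.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Hypothetical user names and shard count.
alice_shard = shard_for_user("alice", 4)
bob_shard = shard_for_user("bob", 4)
```

Since each user's data lives on one shard, a query for that user touches a single database, and each shard can be backed up and recovered independently.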

Delivery policy

Some research has been done to evaluate and analyze YouTube's delivery policy. The evaluation takes into account content, sound and picture quality, playback smoothness and loading speed. In this research, a reliable tool automatically evaluates the playback quality of YouTube videos as experienced by users. The tool is designed around YouTube's delivery policy and its DNS resolution policy. The results indicate that network/server load balancing and ISP-dependent policies affect the quality of user experience, but geographical proximity does not. "The QoE (quality of experience) is no longer impact by access capacity, but by peering agreement of ISPs and by the server load." (Louis, Ernst & Parikshit, 2012). The analysis also shows that YouTube and the CDNs have three ways to control content delivery. "First, customizing the URL of the video server is done by the YouTube front-end servers. Second, the YouTube authoritative DNS server resolves the URL of the video server to a different IP address. Third, cache site level uses HTTP redirect messages at the video server level." (Louis, Ernst & Parikshit, 2012).

The delivery policy has some disadvantages. YouTube limits the data rate by pacing data into the connection in blocks of a fixed size of 64 KB, which matches the block size used by GFS. Sending data in blocks has a detrimental effect on YouTube flow performance. When the client's request crosses a congested network, each block transmission can aggravate the congestion and reduce network throughput. One study reports packet loss of over 40% in a congested network, resulting in data retransmission (Shane & Richard, 2011).
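
The arithmetic behind block pacing can be worked through in a few lines of Python. The 64 KB block size comes from the text above; the target rate and the 1460-byte segment size are assumptions chosen to make the example concrete.

```python
BLOCK_SIZE_BYTES = 64 * 1024  # fixed block size stated in the text


def block_interval_seconds(target_rate_bps):
    """Time between block starts needed to average target_rate_bps."""
    if target_rate_bps <= 0:
        raise ValueError("target rate must be positive")
    return BLOCK_SIZE_BYTES * 8 / target_rate_bps


def burst_packets(mss_bytes=1460):
    """Number of packets that leave back to back when one block is written."""
    # Ceiling division: a partly filled last packet still counts as a packet.
    return -(-BLOCK_SIZE_BYTES // mss_bytes)


interval = block_interval_seconds(1_048_576)  # hypothetical 1 Mibit/s target
packets = burst_packets()                     # assumed 1460-byte segments
```

Under these assumptions the server writes one block every half second, and each write can put a burst of 45 packets on the wire at once, which is the kind of burst that hurts on a congested path.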

Evaluation

Scalability

Scalability is one of the main features of a distributed system. It is the ability of a system to grow to accommodate an increasing amount of work, for example by increasing total throughput when resources are added. YouTube achieves scalability in part by using NetScaler in front of the web servers: NetScaler not only accelerates applications but also improves the scalability of YouTube. Using Linux as the operating system also makes it easy to add new server machines. The core of scalability for video streaming is the database layer. YouTube uses database partitioning to deal with the rapid growth of data, and this method handles data growth well while using less hardware. With these technologies, YouTube provides a high quality of service for users. However, a burst problem arises as the number of service requests increases. It is a serious issue for YouTube given the dramatic growth in video sharing, and it worsens when the network is congested.

Failure Handling

Another feature evaluated is fault tolerance. Computer software, hardware and network infrastructure may fail for a variety of reasons. A distributed system handles failure well: it can continue to operate when parts of it have failed. In the video servers, YouTube uses different strategies for the most popular and the less popular videos. Replicating the most popular videos not only improves service quality but also helps the system handle failure, because these videos remain available while some servers are down. In the thumbnail servers, Google's BigTable also helps YouTube achieve fault tolerance. In the databases, the advantage of sharding is fast backup and recovery, which further contributes to fault tolerance.

Future Recommendations

YouTube is one of the most heavily used applications today and accounts for a large share of overall Internet traffic. Some research has evaluated the quality of experience over mobile broadband networks. It shows that YouTube quality of experience is sensitive to downlink bottlenecks, meaning that user satisfaction depends on the ratio between video bitrate and downlink bandwidth (Pedro, Andreas, Sebastian & Rainmund, 2012). To improve the user experience, a dedicated server could be established to optimize video delivery. This server would store content delivery network information, users' IP addresses and network bandwidth information. With the help of a delivery algorithm, the server could provide the best video playback performance for each user.
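
One possible form of such a delivery algorithm is to pick the highest video bitrate that fits within the user's measured downlink bandwidth. The sketch below assumes a fixed set of available encodings and a safety margin of 80%; both values and the function name `pick_bitrate` are illustrative assumptions, not taken from the cited research.

```python
def pick_bitrate(available_kbps, downlink_kbps, headroom=0.8):
    """Choose the highest bitrate that fits within a share of the downlink.

    headroom is the fraction of the downlink the video may use; it is an
    assumed safety margin. If no encoding fits, the lowest one is returned
    so that playback can still start.
    """
    if not available_kbps:
        raise ValueError("at least one bitrate is required")
    budget = downlink_kbps * headroom
    fitting = [rate for rate in available_kbps if rate <= budget]
    return max(fitting) if fitting else min(available_kbps)


encodings = [400, 1000, 2500]             # hypothetical encodings in kbit/s
fast_link = pick_bitrate(encodings, 2000)  # budget is 1600 kbit/s
slow_link = pick_bitrate(encodings, 300)   # budget is 240 kbit/s
```

Keeping the ratio of video bitrate to downlink bandwidth below one is the point of the margin: the player then receives data faster than it consumes it.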

As the edge of the Internet grows exponentially, more and more devices can access YouTube, for instance phones, tablets, laptops and e-book readers. To improve system performance, YouTube needs to optimize the architecture of the website and invest more in its infrastructure.

As noted in the description of YouTube's delivery policy above, bursts are a problem for YouTube. Monia, Yuchung, Ankur and Matt studied this issue and proposed a solution called Trickle, which removes large bursts by limiting the TCP rate. "This method solves the problem by dynamic set a maximum cwnd to control the streaming rate and strictly limit the maximum size of bursts." (Monia, Yuchung, Ankur & Matt, 2011) In their experiments, Trickle performed well on the burst issue.
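
The idea of capping the congestion window can be illustrated with a simplified calculation. The sketch assumes that the cap is roughly the target streaming rate multiplied by the round-trip time, expressed in segments; the published algorithm may differ in its details, and the function name, segment size and example values are assumptions.

```python
def cwnd_clamp_segments(target_rate_bps, rtt_ms, mss_bytes=1460):
    """Upper bound on cwnd (in segments) for a target rate and round-trip time.

    Data sent per round trip is limited to target_rate * RTT. Integer
    arithmetic is used so that the result is exact.
    """
    if target_rate_bps <= 0 or rtt_ms <= 0 or mss_bytes <= 0:
        raise ValueError("all arguments must be positive")
    bits_times_ms = target_rate_bps * rtt_ms
    bits_per_segment_times_ms = 8 * 1000 * mss_bytes
    # Ceiling division, with a floor of one segment so the flow can progress.
    segments = -(-bits_times_ms // bits_per_segment_times_ms)
    return max(1, segments)


# Hypothetical stream: 1.168 Mbit/s target over a 100 ms round trip.
clamp = cwnd_clamp_segments(1_168_000, 100)
tiny_clamp = cwnd_clamp_segments(8_000, 10)
```

With the window capped at 10 segments in this example, TCP cannot send more than about 10 packets per round trip, so the large block-sized bursts described earlier cannot form.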

Conclusion

This report has presented the architecture and technologies used by YouTube and evaluated the strengths and weaknesses of this distributed system. Potential bottlenecks and improvements have also been discussed.

In conclusion, technology brings change to all of society. Innovations in distributed systems play a vital role in people's daily lives, in areas such as education, finance, investment and tourism. YouTube will continue to be beneficial as a simple, efficient website for sharing and uploading video.



