The Packet Level Erasure Recovery

02 Nov 2017

A testbed is presented for the evaluation of real-time encoding of interactive video applications with packet erasure coding at the IP layer. Specifically, practical video CODECs such as H.264/AVC with different packetization strategies are deployed to test the improvements obtained in the recovered video stream for different coding rates of packet erasure codes. Video quality is analyzed using the PSNR of reconstructed video frames under different raw packet loss rates (PLRs) and for different packet sizes.

Introduction

Advances in video coding techniques, along with the rapid development of network infrastructure, are enabling an increasing number of multimedia applications [1]. Video streaming over the Internet has already become an essential part of personal communications and everyday access to broadcast and entertainment media [2-5]. As the Internet evolves into a set of heterogeneous networks, with differing bandwidths and link qualities, and because of congestion in routers, different video streams experience different PLRs. In addition, the delay tolerance for interactive applications such as video telephony is less than 200 ms, which limits data buffering, so packets that arrive too late for playback are also considered lost. As a result, compressed video streamed over the Internet can suffer severe degradation in the quality perceived by the user. Packet loss resilience is therefore one of the main requirements for IP video transmission. While some video encoders and decoders already have built-in error-resilience and error-concealment mechanisms, there are also techniques at the network transport layer designed to reduce the impact of raw PLRs in the network. The most popular among them are: (i) joint source-channel coding; (ii) advanced forward error correction (FEC); and (iii) feedback-based techniques [5-10]. The key idea in all of them is to control end-to-end packet loss by modifying the transmitted packets so that, after reception and additional data processing, the decoded video is less adversely affected by the raw PLR in the network [4]. In this project, the focus is on PLR control systems that rely only on packet erasure coding.

The H.264/AVC video CODEC provides enhanced compression and network-adaptable coded video for conventional and scalable applications [5]. Moreover, this CODEC is designed to discard an entire packet if it contains a single bit error, and to discard an entire frame if a single packet within the frame is lost [6]. In some cases, when slices and slice groups are used, appropriate error concealment has to be invoked for macroblocks for which no data has been received before the frame is forwarded to the reference and display buffer. Even though this video CODEC uses various techniques to conceal channel imperfections, there is still considerable room to reduce the effects of PLR at the network interface in order to provide acceptable perceived QoS [11-14].

For real-time communications, the most commonly used technique to recover from packet loss in the network is packet erasure coding (EC). Packet-level EC provides better than best-effort IP service without the need for retransmissions. Erasure codes in packet-level FEC add parity packets to a media stream prior to transmission, so that packet losses can be partially repaired at the receiver. This reduces the effective PLR at the receiver below the raw PLR caused by network congestion. The impact of EC on the performance of a video CODEC with different packet loss concealments under realistic PLR conditions, which is addressed in this paper, is still an open problem [15].

End-to-end applications and the related network performance depend mainly on the transport layer protocols, such as the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP, which is equivalent to an ARQ strategy, can suffer long delays in scenarios such as poor channel conditions (particularly in wireless networks), multicast and long-distance transmission. UDP, in contrast, offers fast data delivery because it performs no retransmission, but it cannot guarantee reliable service because it does not recover lost or corrupted packets. It is therefore a major challenge for conventional IP-based networks to meet the increasing demand for multimedia distribution that requires both real-time and high-quality performance. This leads naturally to the consideration of erasure coding techniques, combined with ARQ, to tackle these problems [7].

In this project, a testbed for the evaluation of real-time encoding in an interactive video call application is presented, with realistic packet loss and a feasible packet loss recovery technique. Video quality is evaluated by subjective and objective measurements comparing the original frames with the frames recovered after packet loss. In contrast to similar work such as [7], our model evaluates real-time encoded video using the H.264 CODEC with a simple packet loss recovery technique based on the generation of even parity check redundant packets. We accomplish this with specialized Python software that emulates a Wide Area Network (WAN) by manipulating packets captured with the Wireshark protocol analyzer. Specifically, this paper describes open-source software that allows us to test real-world video applications with emulated PLRs under erasure coding, implementing more realistic statistics for packet losses. In addition, our findings can be used to tune network operation and performance control parameters, as required to maintain acceptable video quality measured through the Peak Signal-to-Noise Ratio (PSNR).

Erasure codes

In IP networks, one can model the path between the source and the destination as a packet erasure channel, which assumes that a transmitted packet is either correctly received or lost. Erasure coding techniques are a form of FEC in which redundant packets are used at the destination to recover some of the packets lost in the network. In basic applications of packet EC, the source uses linear block codes, grouping k information packets and r = n − k redundancy packets into a coded packet group of n packets. Every EC technique has an associated erasure recovery capability, denoted e, which is the number of lost packets in a coded packet block that the code can guarantee to recover. The values of n, k and e, normally written as (n, k, e), are specified by the erasure code being employed. The packets are marked with sequence numbers, so the receiver can detect the exact location of the lost packets to be recovered [16-18]. There are a number of well-documented EC algorithms, such as Reed-Solomon (RS), Raptor and Tornado codes [7], which may introduce prohibitive delays in real-time video streaming or may be too complex to implement in real time. In this paper, a parity check code is used for the erasure coding, as presented next.

Packet Level Erasure Recovery

In [5], XOR operations are performed to generate the group of n packets out of k information packets, with linear block codes applied column-wise to packets arranged in rows. In the general case, this may increase the overall bandwidth requirements and contribute further to network congestion. In this paper, even parity codes are generated to reduce the size of the redundant data. Figure 1 shows the number of packets in a block of the transmitted packet group before and after the addition of the redundant packet at the source. Figure 2 (a) shows how the even parity packet (PE) is generated by bit-wise XORing the k = 4 information packets, giving a total of n = 5 coded packets sent from the source. Figure 2 (b) shows the packets received at the destination, with packet 3 lost. To recover an erased packet, the k received packets are simply XORed together bit-wise to recreate the missing packet. This is a simplistic form of EC and offers only a limited amount of protection to the data packets being transmitted: it protects against a single packet loss in a block of n transmitted packets (e = 1). Its simplicity, however, lends itself to ease of implementation and makes this kind of EC attractive in real-world applications. Encoding and loss recovery can be performed as the packets are transmitted and received, and so do not introduce prohibitive delay, which is a key factor in real-time video. This holds provided that the value of k is small.

Figure 1. Number of packets before and after the addition of the redundant packet (information packets P1, P2, ..., Pk, followed by n − k redundant packets).

Figure 2. (a) Even parity packet generated at the source; (b) packet loss recovered using the redundant packet at the destination.
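The even parity scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the script used in the testbed: the function names `make_parity` and `recover` are hypothetical, packets are modelled as plain byte strings, and shorter packets are padded with zero bytes so that all packets in a block have the same length (the caller is assumed to know the original length of a recovered packet).

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bit-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Return the even parity packet for a block of k information packets."""
    size = max(len(p) for p in packets)
    # Assumption: shorter packets are zero-padded to a common length.
    padded = [p.ljust(size, b"\x00") for p in packets]
    return reduce(xor_bytes, padded)


def recover(received, parity):
    """Rebuild the single missing information packet in a block.

    `received` holds the k information packets in order, with None in
    place of the lost one. Returns (index, recovered_packet). Only one
    erasure per block can be repaired (e = 1).
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("exactly one lost packet can be recovered per block")
    size = len(parity)
    survivors = [p.ljust(size, b"\x00") for p in received if p is not None]
    # XOR of the parity packet with all surviving packets yields the lost one.
    return missing[0], reduce(xor_bytes, survivors, parity)


# Example with k = 4 and n = 5, where the third packet is lost.
block = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = make_parity(block)
index, packet = recover([b"AAAA", b"BBBB", None, b"DDDD"], parity)
```

If the parity packet itself is the one that is lost, no recovery is needed, since all k information packets have arrived.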

Erasure coding on video streaming

A video sequence consists of a number of video frames or images. There are three basic types of coded frames: (1) intra-coded frames, or I-frames, which are coded independently of all other frames; (2) predictively coded frames, or P-frames, which are coded based on a previously coded frame; and (3) bi-directionally predicted frames, or B-frames, which are coded using both previous and future coded frames. Figure 2 illustrates the different coded frames and prediction dependencies in an MPEG Group of Pictures (GOP), as an example. The selection of prediction dependencies between frames can have a significant effect on video streaming performance, e.g. in terms of compression efficiency and error resilience.

Most current video coding schemes, such as MPEG-1/2/4 and H.261/263/264, compress video by applying the same basic principles. Temporal redundancy is exploited by motion-compensated prediction, spatial redundancy by the Discrete Cosine Transform (DCT), and color space redundancy by a color space conversion. The resulting DCT coefficients are quantized, and the nonzero quantized DCT coefficients are run-length and Huffman coded to produce the compressed bitstream. Compression creates strong spatiotemporal dependencies in the video data. When these compressed data are transmitted over lossy networks, packet losses can severely affect the quality of the streamed video. For example, as little as 3% MPEG packet loss can cause 30% of the frames to be undecodable [7].

A video communication system is designed with error control to combat the effect of losses. There are four rough classes of approaches for error control: retransmissions, FEC, error concealment, and error-resilient video coding. The last two classes of approaches are source coding approaches for error control. A video streaming system is typically designed using a number of these different approaches. In addition, joint design of the source coding and channel coding is very important. FEC provides a number of advantages. For example, compared to retransmissions, FEC does not require a back-channel and may provide lower delay since it does not depend on the round-trip-time of retransmits.

Most importantly, FEC-based approaches are designed to overcome a predetermined amount of loss, and they are quite effective if they are appropriately matched to the channel. If the losses are below a threshold, the transmitted data can be perfectly recovered from the received data. However, if the losses exceed the threshold, only a portion or none of the data can be recovered, depending on the type of FEC used. Although it is difficult to find those thresholds, a testbed or framework that can investigate the effects of packet loss on FEC-based video communication systems is very useful for verifying that new algorithms and schemes work properly [7].
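For the single parity code used in this paper, the threshold is one lost packet per block of n. As a rough illustration, the residual loss rate after recovery can be estimated under the simplifying assumption that packets are lost independently with probability p; real networks often lose packets in bursts, so this is only a sketch and not a result of the testbed. An information packet stays lost only if it is lost and at least one of the other n − 1 packets in its block is lost as well. The function name `residual_plr` is hypothetical.

```python
def residual_plr(p: float, n: int) -> float:
    """Residual loss rate of information packets after single-parity recovery.

    Assumes independent losses with probability p and a block of n coded
    packets (k = n - 1 information packets plus one parity packet).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if n < 2:
        raise ValueError("a coded block needs at least two packets")
    # Lost packet AND at least one further loss among the other n - 1 packets.
    return p * (1.0 - (1.0 - p) ** (n - 1))


# Raw PLR of 2% with the two block sizes considered in this paper.
small_block = residual_plr(0.02, 5)   # k = 4,  n = 5,  roughly 0.0016
large_block = residual_plr(0.02, 11)  # k = 10, n = 11, roughly 0.0037
```

Under this assumption the smaller block leaves fewer unrecovered losses, at the cost of a higher redundancy overhead (one parity packet per four information packets instead of one per ten).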

Video Over IP Test Model

This section presents the configuration of the video server and client for testing Video over IP. The schematic diagram is shown in Figure 3. The processes involved in encoding and decoding the video packets, together with the components of the testbed and the operations used to add EC to the network layer packets and drop packets according to the different PLR conditions, are as follows:

Figure 3. Conceptual diagram of test model

1 Kamailio 2.3, an open-source SIP server that offers secure communication via TLS for video;

2 Jitsi, a video communicator based on the SIP protocol that supports the H.264 codec for video, used as a softphone;

3 Wireshark, a free packet analyzer used to capture the packets between the video server and the client and store them as a .pcap file for processing;

4 A custom Python script that implements EC over the captured IP packets with the desired PLR and block size values.

In this model, the client and the server are connected via a 100 Mbps Ethernet cable, as shown in Figure 3, and the sender encapsulates the packets with the EC. The WAN environment is simulated by the custom Python script with a defined PLR and block size. The packets are then captured at the client using Wireshark and stored as a .pcap file, from which the H.264 video files are extracted for further processing using the Videosnarf software.

The quality of the Video over IP transmission using EC was examined as follows:

Reference video file

In step one, a test video call was established between the server and the client via the SIP server over UDP. During the video call, packets were captured at the client using Wireshark and stored as a reference .pcap file.

Emulation with special module

In step two, we used a Python script that acts as a packet loss emulator: the RTP packets are filtered from the data packets, and some of the RTP packets are randomly marked as lost according to the PLR. The output of this stage is the unconcealed .pcap file. The EC method was then used to recover the lost RTP packets based on the block size, as explained in Section 3. The output of this step is the concealed .pcap file, which contains the unconcealed .pcap file's RTP packets together with those recovered on the client side. These steps are repeated under various PLRs and FEC block sizes for the same reference file.
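The loss-marking and recovery logic of this step can be sketched as follows. This is a simplified illustration rather than the script used in the experiments: it works on packet counts instead of a .pcap file, assumes uniformly random and independent losses, and uses the hypothetical function name `emulate_block_losses`. Each block of k information packets is followed by one parity packet, and a block is fully repaired when at most one of its k + 1 packets is lost.

```python
import random


def emulate_block_losses(num_packets, plr, k, seed=0):
    """Mark packets as lost at rate `plr`, then apply single-parity recovery.

    Returns (raw_lost, residual_lost): the number of information packets
    lost before recovery (unconcealed) and after recovery (concealed).
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    raw_lost = 0
    residual_lost = 0
    for start in range(0, num_packets, k):
        block_len = min(k, num_packets - start)
        # One loss flag per information packet, plus one for the parity packet.
        lost = [rng.random() < plr for _ in range(block_len + 1)]
        data_lost = sum(lost[:block_len])
        raw_lost += data_lost
        # A single erasure in the coded block is repairable (e = 1);
        # with two or more erasures the lost information packets stay lost.
        if sum(lost) > 1:
            residual_lost += data_lost
    return raw_lost, residual_lost


# Example: 10000 RTP packets, raw PLR of 2%, block size k = 4 (n = 5).
raw, residual = emulate_block_losses(10000, 0.02, 4, seed=1)
```

Running the same function with k = 10 on the same packet count gives a like-for-like comparison of the two block sizes.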

Transcoding Video data to frames

In step three, the Videosnarf tool is used to extract the RTP packets from the .pcap file. It decodes the hexadecimal packet data, detects the RTP packet information, and stores the audio and video files according to the CODEC used to encode the data at the transmitter. In our case, H.264 video files are obtained; this compressed data is converted into a raw .avi video file using the FFmpeg decoder tool. The .avi file is then read into MATLAB, where the video is split into individual frames.

Examining the video quality

In the last step, the video quality of the reference (original), concealed and unconcealed videos is measured by comparing the individual frames of the videos. The MSE and PSNR are calculated per pixel over the RGB components by comparing the frames. PSNR is then plotted against frame number for the different block and packet sizes.
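The frame comparison can be illustrated with the standard definitions of MSE and PSNR for 8-bit samples, PSNR = 10 log10(255^2 / MSE). The measurements in this paper were made in MATLAB; the Python sketch below is only an illustration, and it assumes that each frame is given as a flat sequence of 8-bit sample values (for example interleaved R, G and B values). The function names are hypothetical.

```python
import math


def mse(frame_a, frame_b):
    """Mean squared error between two frames of equal size."""
    if len(frame_a) != len(frame_b) or len(frame_a) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)


def psnr(frame_a, frame_b, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    error = mse(frame_a, frame_b)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


# Example: two samples that each differ by 10 give an MSE of 100.
reference = [100, 100]
decoded = [110, 90]
value = psnr(reference, decoded)
```

Applying `psnr` to each pair of reference and decoded frames yields the per-frame series that is plotted against frame number.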

Results on Various Packet Analyses

This section presents the video quality comparisons made for different packet EC block sizes, packet sizes and video resolutions. A raw PLR of 2% is used for the entire experiment. Two block sizes are considered: (i) k=4 and n=5, and (ii) k=10 and n=11. Three packet sizes are considered: (i) 600 bytes, (ii) 1024 bytes and (iii) 5000 bytes. The packet size is changed both at the application and at the interface to ensure that packets are transmitted without fragmentation. The video call experiment was also run at different video resolutions, namely 480p and 720p. Images of the frames obtained under the different scenarios are displayed for the 480p videos, together with plots of the PSNR, as an average and for the RGB components, of the individual frames. Finally, Table 1 summarizes the average PSNR with our erasure recovery for different packet sizes and resolutions.

Subjective quality vs. error length and PSNR drop

To examine the impact of error length and loss severity on perceptual quality, we created several sequences with a single loss (two consecutive lost frames) in the middle of the sequence, but differing in error length and loss severity. The error length is defined as the number of frames starting from the lost frame and ending at the next IDR frame in the bitstream, because error propagation usually does not stop until the I-frame in the next GOP. The only exception is when a scene change occurs before the end of the GOP. Sequences with different error lengths are created by varying the loss position within a GOP. To measure the severity of a loss, we first determine the PSNR drop for each affected frame, which is the difference between the PSNR of the frame decoded in the absence of the loss and the PSNR of the same frame decoded with packet loss. We then find the largest PSNR drop among all affected frames, which is referred to simply as the PSNR drop. In total, there are 8 sequences with similar error lengths (12 frames) but different PSNR drops, and 6 sequences with similar PSNR drops (about 10 dB) but different error lengths.
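The PSNR drop defined above can be written down directly. The sketch below is illustrative only; the function name `psnr_drop` and the sample values are hypothetical and are not measurements from the experiments. It takes the per-frame PSNR of the affected frames decoded without and with the loss, and returns the largest difference together with the per-frame differences.

```python
def psnr_drop(psnr_without_loss, psnr_with_loss):
    """Return (largest drop, per-frame drops) in dB for the affected frames."""
    if len(psnr_without_loss) != len(psnr_with_loss) or not psnr_without_loss:
        raise ValueError("both PSNR series must be non-empty and of equal length")
    drops = [clean - lossy
             for clean, lossy in zip(psnr_without_loss, psnr_with_loss)]
    return max(drops), drops


# Hypothetical values for three affected frames.
largest, per_frame = psnr_drop([35.0, 36.0, 34.0], [35.0, 26.0, 30.0])
```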

Figure 4.1. PSNR between original frames and frames without EC recovery (average packet length of 1024 bytes and raw PLR of 2%).

Packet EC, as presented in Section 2, assumes data packets of uniform length, which could be considered a limitation; however, this is easily overcome by padding the packets to the same length.

Figure 4.1 plots the PSNR between the original frames and the frames not concealed by EC. The average value is 4.73 dB, reflecting the raw PLR of 2% in the network. The average packet size for this video is 1024 bytes. Next, the concealed video is compared for different block sizes in packet EC.

Figure 4.2. PSNR between original frames and frames recovered using erasure code with k=4 and n=5.

Figure 4.2 plots the PSNR between the original and the concealed video with a raw PLR of 2% and an average packet size of 1000 bytes using EC with a block size of k=4, n=5; the average PSNR is 18.2 dB. Figure 4.3 shows the same results for a block size of k=10, n=11, with an average PSNR of 13.0 dB.

Figure 4.3. PSNR between original frames and frames recovered using erasure code with k=10 and n=11.

Figure 4.4. PSNR between original frames and frames with EC recovery using erasure code with average packet length of 600 bytes.

Figure 4.4 plots the PSNR between the original and the concealed video for a packet size of 600 bytes, with an average PSNR of 7.6 dB, and Figure 4.5 does the same for a packet size of 5000 bytes, with an average PSNR of 19.2 dB. This suggests that a larger packet size gives a higher PSNR value, implying superior video quality.

A similar comparison is made for the unconcealed frames using different packet sizes. The PSNR is plotted in Figure 4.6 for 600 bytes, with an average PSNR of 3.6 dB, and in Figure 4.7 for 5000 bytes, with an average PSNR of 6.563 dB.

Figure 4.5. PSNR between original frames and frames with EC recovery using erasure code with average packet length of 5000 bytes.

Figure 4.6. PSNR between original frames and frames without EC with average packet length of 600 bytes.

Figure 4.7. PSNR between original frames and frames without EC with average packet length of 5000 bytes.

Table 1. Average PSNR values (dB) for various packet sizes and resolutions.

Payload size (bytes)      480p      720p

PSNR with PLR recovery with k=4 and n=5
600                       7.6       8.7
1500                      18.2      10.4
5000                      19.2      11.3

PSNR without PLR recovery
600                       3.6       3.3
1500                      4.7       4.7
5000                      6.5       3.8

Conclusions

Based on the variety of experimental results presented for real-time video streaming, packet EC deployed with the minimum block size and the maximum packet size gives superior video quality compared with moderate packet sizes and other EC parameters. However, 1500 B payloads are recommended in order to avoid IP fragmentation caused by the Maximum Transmission Unit (MTU) of Ethernet, where a single packet loss causes a much bigger segment of data to be discarded. For future networks with a higher MTU, this paper provides evidence that, when deploying EC, video quality can be improved further by working with bigger packet sizes, limited by the acceptable delay.


