Cascade Decoder And Encoder Architecture

Published Date: 02 Nov 2017

Keywords: Video Transcoding, Spatial Domain, Transcoding Architecture, Error Resilience.

Introduction

Video transcoding is the process of converting one compressed video stream into another compressed video stream. The type of transcoding is determined by the difference between the original stream and the transcoded stream: the target may differ in bit rate, frame rate, frame size, or even compression standard.

Figure1. Video Transcoding (source bitstream -> transcoder -> target bitstream)

The earliest transcoding application was adapting the bit rate of a compressed video stream to the channel bandwidth. For instance, a program originally compressed at a high bit rate may need to be at a lower bit rate when it is transmitted over a channel [5].

One scenario is delivering high quality video content over a more restricted wireless network to be accessed by mobile phones. To deliver multimedia data to multiple users, the content needs to be adapted dynamically to each user's environment, and transcoding technology is needed to fulfil this task [5]. A transcoder converts the video stream from one format to another in multimedia applications such as digital video broadcasting, teleconferencing and video on demand [3].

VIDEO TRANSCODING ARCHITECTURE

CASCADE DECODER AND ENCODER ARCHITECTURE

The cascade video transcoder is the simplest transcoder. The source bitstream is first decoded by a Variable Length Decoder (VLD). The decoded coefficients are inversely quantised and passed through the Inverse Discrete Cosine Transform (IDCT), which yields the decoded pixel values. These pixels are added to the motion-compensated prediction taken from the reference frame to reconstruct the picture, which is then passed to the encoder [4] [3].

Figure2. Cascade Video Transcoder

Advantages:

It is simple

It is flexible

It reduces computational complexity

OPEN LOOP ARCHITECTURE

In the open loop architecture, the decoder end passes the source bitstream through the VLD to obtain the DCT coefficients, which are then inversely quantised. At the encoder end, the DCT coefficients are re-quantised using a different quantiser (Qt) and then passed through Variable Length Coding (VLC).

Figure3. Open Loop Architecture

Advantages:

It is computationally efficient.

It is fast.

It has minimum transcoding complexity.

Disadvantage:

It introduces drift error.

Drift Error:

A video picture is predicted from the reference frame (R), and only the prediction errors are coded. To work correctly, the decoder predictor and the encoder predictor must hold the same picture in their reference frames. Any difference between the pictures stored in the two predictors causes an error that accumulates from frame to frame. This error is called drift error [5].

CLOSED LOOP ARCHITECTURE

The closed loop transcoder contains a feedback loop. This gives it the ability to remove the mismatch between the residual and the predicted frame. The difference between the closed loop architecture and the cascaded decoder-encoder is the reconstruction loop operating in the pixel domain: the closed loop architecture has only one DCT/IDCT pair.

Figure4. Closed Loop Architecture
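The drift introduced by the open loop architecture, and the way a feedback loop contains it, can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not the architecture in the figures: each "frame" is a single number, prediction is simply the previous reconstruction (no motion compensation or DCT), and all function names are hypothetical.

```python
def quantise(value, step):
    """Uniform quantiser: snap value to the nearest multiple of step."""
    return step * round(value / step)


def encode(samples, step):
    """Predictive encoder: code each sample as a quantised residual
    against the previous reconstruction (the reference)."""
    reference, residuals = 0, []
    for sample in samples:
        residual = quantise(sample - reference, step)
        residuals.append(residual)
        reference += residual
    return residuals


def decode(residuals):
    """Predictive decoder: accumulate the residuals onto the reference."""
    reference, output = 0, []
    for residual in residuals:
        reference += residual
        output.append(reference)
    return output


def open_loop_transcode(residuals, step):
    """Re-quantise every residual independently; the error is never fed back."""
    return [quantise(r, step) for r in residuals]


def closed_loop_transcode(residuals, step):
    """Re-quantise with feedback: the re-quantisation error of one sample
    is added to the next residual before that residual is quantised."""
    error, output = 0, []
    for residual in residuals:
        corrected = residual + error
        requantised = quantise(corrected, step)
        error = corrected - requantised
        output.append(requantised)
    return output


def max_drift(reference, transcoded):
    """Largest absolute difference between two decoded sequences."""
    return max(abs(a - b) for a, b in zip(reference, transcoded))


residuals = encode([10, 13, 16, 19, 22, 25, 28, 31], step=1)
source = decode(residuals)
open_loop = decode(open_loop_transcode(residuals, step=4))
closed_loop = decode(closed_loop_transcode(residuals, step=4))
print("open loop drift:  ", max_drift(source, open_loop))
print("closed loop drift:", max_drift(source, closed_loop))
```

In this model the open loop error keeps growing because every re-quantisation error is added to the decoder's reference, while the feedback version keeps the error within half a quantiser step.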

FREQUENCY DOMAIN TRANSCODING

In the Frequency Domain Transcoding Architecture [4], the decoder end obtains the DCT coefficients by passing the source bitstream (Vs) through the VLD and then performing inverse quantization. At the encoder end, the motion-compensated coefficients are re-quantized and passed through VLC, and the DCT values are stored in the reference frame (R).

Figure5. Frequency Domain Transcoding

Advantages:

It needs less computation.

Disadvantages:

It introduces drift error.

It lacks flexibility.

SPATIAL DOMAIN TRANSCODING

In this architecture, at the decoder end, the source bitstream (Vs) is passed through the Variable Length Decoder (VLD) and inverse quantization to obtain the DCT coefficients, which are then passed through the IDCT to invert the DCT. The resulting values are stored in the reference frame (R).

This Architecture has two functional blocks:

MV Composition and Refinement (MVCR)

Spatial/Temporal Resolution Reduction (STR)

The two functional blocks sit between the decoder and the encoder. MVCR is used to adjust the motion vectors (MVs). STR is used to adjust the spatial and temporal resolution of the target video stream [4].

Figure6. Spatial Domain Transcoding
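As an illustration of what an MV composition step might do when the spatial resolution is halved, the sketch below merges the motion vectors of the four source macroblocks that collapse into one target macroblock. Averaging followed by scaling is only one possible rule and is an assumption here; the source does not say which rule MVCR uses, and the function name is hypothetical.

```python
def compose_motion_vector(source_mvs, scale=2):
    """Average the (x, y) motion vectors of the merged macroblocks,
    then divide by the downscaling factor of the target picture."""
    count = len(source_mvs)
    mean_x = sum(x for x, _ in source_mvs) / count
    mean_y = sum(y for _, y in source_mvs) / count
    return (mean_x / scale, mean_y / scale)


# Four neighbouring macroblocks that become one macroblock after 2:1 downscaling.
merged = compose_motion_vector([(4, 2), (6, 2), (4, 4), (6, 4)])
print(merged)
```

A refinement stage would then typically search a small window around the composed vector, which is cheaper than a full motion search.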

Advantages:

It is flexible.

It is drift free.

Information such as motion information and mode decisions can be reused in the encoder.

HOMOGENEOUS TRANSCODING

In the homogeneous architecture, the video content is recompressed within the same compression standard. This transcoding type provides several functions, such as bit rate adjustment and picture format conversion. Multimedia communications use transcoding to match the particular constraints of each channel, since each channel may have different bandwidth limitations and different target decoders.

Bitrate reduction

The first application of the homogeneous transcoder is bitrate reduction, which is used to match the available network resources. Bitrate reduction can be performed using four different techniques:

Coefficient truncation - This technique exploits the unequal distribution of energy among the DCT coefficients. The major portion of the coefficient energy lies in the low frequency band, and the high frequency band has minimal impact on video quality, so high frequency coefficients can be discarded. This technique avoids inverse quantization and re-quantization.

Figure7. Coefficient Truncation

Re-quantisation - Quantisation is an important tool for bitrate control in encoding. The most common method of bitrate reduction is to increase the quantisation step size. This also increases the number of zero coefficients, so re-quantisation decreases the number of symbols to be encoded and increases the compression.

Figure8. Re-quantization

Re-encoding with reuse of motion vectors and mode decisions - This technique re-encodes the video by reusing the original motion vectors and mode decisions embedded in the bitstream. It eliminates drift error, because the reference frames are reconstructed and the residual information is recompressed. This enhances image quality.

Figure9. Re-encoding with reuse of motion vectors and mode decision

Re-encoding reusing input motion vectors - This technique is an extension of the previous one in which the coding modes may be changed. For higher bit rate reductions, the motion information overhead becomes too high, constraining the bitrate and resulting in poorly encoded residual information. This technique still reuses the motion information, but it modifies the coding modes to reach new optimal coding decisions based on the output bitrate.
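The first two techniques in the list above, coefficient truncation and re-quantisation, can be sketched on a single 8x8 block of quantised DCT levels. This is a minimal illustration under assumptions: the zigzag scan, the `keep` parameter, the step sizes and the sample block are all hypothetical choices, not values taken from the source.

```python
def zigzag_order(size=8):
    """Return the (row, col) positions of a size x size block in zigzag
    scan order, running from low to high frequency."""
    positions = ((r, c) for r in range(size) for c in range(size))
    return sorted(
        positions,
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def truncate_coefficients(levels, keep):
    """Keep the first `keep` levels in zigzag order and zero the rest.
    Works on the quantised levels directly, so no inverse quantisation."""
    truncated = [[0] * len(levels) for _ in levels]
    for row, col in zigzag_order(len(levels))[:keep]:
        truncated[row][col] = levels[row][col]
    return truncated


def requantise(levels, old_step, new_step):
    """Map levels quantised with old_step onto a coarser new_step."""
    return [[round(v * old_step / new_step) for v in row] for row in levels]


def count_zeros(block):
    """Number of zero-valued levels in a block."""
    return sum(v == 0 for row in block for v in row)


# Hypothetical block whose energy is concentrated at low frequencies.
levels = [[32 // (1 + r + c) for c in range(8)] for r in range(8)]
truncated = truncate_coefficients(levels, keep=10)
requantised = requantise(levels, old_step=2, new_step=8)
print(count_zeros(levels), count_zeros(truncated), count_zeros(requantised))
```

Both operations increase the number of zero levels, which is what shortens the variable length code that follows.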

CASCADED DCT- DOMAIN TRANSCODER FOR SPATIAL RESOLUTION DOWNCONVERSION

The transcoder is divided into four main blocks: decoder, downscaler, encoder, MV composer. The architecture avoids DCT and IDCT computation.

Figure10. Cascaded DCT-Domain transcoder for spatial resolution down conversion

The DCT-domain down conversion transcoder can be simplified by moving the downscaling operations into the decoder loop, so that the decoder only needs to decode one quarter of the original picture size [2].

FAST CLOSED LOOP TRANSCODER

In this architecture, a switch controls whether or not the accumulated error is added to each 8x8 block. If the accumulated error of a block is greater than a threshold, the block follows the traditional transcoder path. If the error is less than the threshold, the block goes directly to re-quantization (Q2). The accumulated error of each block is measured as the sum of absolute errors. With a larger threshold, fewer blocks are compensated with the accumulated error and the transcoding process is simplified, but the picture quality can degrade.

Figure11. Fast Closed-Loop Transcoder
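The per-block switch can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the blocks are 2x2 instead of 8x8 to keep the example short, and the sketch stops at the point where the block would be handed to Q2.

```python
def sum_abs_error(error_block):
    """Sum of absolute errors accumulated for one block."""
    return sum(abs(v) for row in error_block for v in row)


def prepare_block(block, error_block, threshold):
    """Return the block to hand to Q2 and whether compensation was applied.
    Compensation is applied only when the accumulated error exceeds the threshold."""
    if sum_abs_error(error_block) > threshold:
        compensated = [
            [value + error for value, error in zip(block_row, error_row)]
            for block_row, error_row in zip(block, error_block)
        ]
        return compensated, True
    return block, False


block = [[10, 20], [30, 40]]
accumulated_error = [[1, -2], [0, 3]]  # sum of absolute errors is 6
print(prepare_block(block, accumulated_error, threshold=5))
print(prepare_block(block, accumulated_error, threshold=10))
```

Raising the threshold skips compensation for more blocks, which matches the trade-off between complexity and picture quality described above.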

ERROR RESILIENCE TECHNIQUE

As technology grows rapidly, video can be transferred over wired or wireless media. Because wireless video transfer is more prone to errors, Error Resiliency Techniques are used [6]. In video coding scheme design, coding efficiency is also essential. The strategies that can be employed during coding are:

Localization:

In localization, the spatial and temporal dependencies are removed. The predictive coding loops are broken so that an error, if it occurs, does not affect the other parts of the video. There are two types of localization:

Spatial Localization: If a bit is lost anywhere in the bitstream, the video stream cannot be decoded until the decoder is re-synchronized. In spatial localization, re-synchronization markers are added to the bitstream periodically, at the boundaries of particular frames. Header information is added after each re-synchronization marker and is used to restart the decoding process. The data between the first error and the next re-synchronization marker is discarded.

Temporal Localization: This is reference picture selection, which was introduced in H.263 and MPEG-4 to improve error resilience. The system is assumed to be feedback based. The decoder sends information about corrupted areas to the encoder, which then alters its operation by choosing non-corrupted parts of the stream.
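The discard-and-resume behaviour of spatial localization can be sketched with a toy symbol stream. The marker value, the use of `None` for a corrupted symbol and the function name are assumptions made for illustration. Real bitstreams use reserved bit patterns and header fields instead.

```python
RESYNC = "RESYNC"  # hypothetical stand-in for a re-synchronization marker


def decode_with_resync(symbols):
    """Decode a symbol stream in which None marks a corrupted symbol.
    Everything from the first error up to the next marker is discarded."""
    decoded, skipping = [], False
    for symbol in symbols:
        if symbol == RESYNC:
            skipping = False  # decoding restarts after the marker
            continue
        if symbol is None:
            skipping = True   # error detected: drop data until the next marker
        if not skipping:
            decoded.append(symbol)
    return decoded


stream = ["a", "b", None, "c", RESYNC, "d", "e"]
print(decode_with_resync(stream))
```

Here the symbol "c" is lost even though it arrived intact, because the decoder cannot trust anything between the error and the next marker.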

Data partitioning:

In data partitioning, bits are grouped according to their importance to decoding. If the bitstream is sent over an error-prone channel, the important bits are protected.

In MPEG-2, the data is divided into two partitions:

Low priority

High priority

The high priority partition stores all the important information, such as picture type, DCT coefficients and header information. All other information is stored in the low priority partition. In MPEG-4, data partitioning is achieved by separating the motion and MB header information from the texture information. It requires a marker between the motion and texture information. If the texture information is lost, the motion information is used to conceal the error and the texture information is discarded.

Redundant coding:

Redundant coding enhances error resilience by adding redundant information to the coded video. The redundancy can be added either:

Implicitly (using Reversible Variable Length Coding or Multiple Description Coding) or

Explicitly (Redundant Slices).

Reversible Variable Length Coding (RVLC) is used for data recovery: the variable length codes are designed so that they can be read in both directions. In Multiple Description Coding (MD Coding), the source is encoded into multiple bitstreams. If one bitstream is received correctly, a basic quality reconstruction is achieved; if more than one is received correctly, an enhanced quality reconstruction is achieved. Redundant slices represent the same source data differently, using different encoding parameters. The video stream may carry a primary slice and a redundant slice of the same data, and if the primary slice is lost, the redundant slice is used.
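The idea of a code that can be read in both directions can be shown with a small symmetric codebook. The codebook below is a made-up example: each codeword is a palindrome, and no codeword is a prefix or a suffix of another. It is not taken from any standard.

```python
# Hypothetical symmetric codebook: every codeword reads the same in both directions.
CODEBOOK = {"a": "0", "b": "11", "c": "101", "d": "1001"}
INVERSE = {bits: symbol for symbol, bits in CODEBOOK.items()}


def encode(symbols):
    """Concatenate the codewords of the given symbols."""
    return "".join(CODEBOOK[s] for s in symbols)


def decode(bits):
    """Greedy decoding: emit a symbol as soon as the buffer matches a codeword."""
    symbols, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in INVERSE:
            symbols.append(INVERSE[buffer])
            buffer = ""
    return symbols


def decode_backward(bits):
    """Read the bitstream from its end, then restore the original symbol order."""
    return decode(bits[::-1])[::-1]


message = list("abcdab")
bitstream = encode(message)
print(decode(bitstream) == message, decode_backward(bitstream) == message)
```

When an error corrupts the middle of a packet, a decoder can use the forward pass for the data before the error and the backward pass for the data after it, instead of discarding everything that follows the error.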

Concealment-driven:

Concealment-driven techniques include:

Concealment motion vector

Flexible Macroblock Ordering (FMO)

Concealment motion vectors are motion vectors that may be carried by intra-coded MBs for the purpose of error concealment. In FMO, a specified pattern allocates the MBs in a picture to slice groups in a flexible manner, so that spatially consecutive MBs may be assigned to different slice groups. If a slice group is lost, the image pixels in spatially neighbouring MBs that belong to other, received slice groups can be used for error concealment.
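A checkerboard allocation is one simple FMO pattern, and the sketch below uses it to show why spreading neighbouring MBs over different slice groups helps concealment. Each MB is reduced to a single number (for example its mean luminance), and concealment by averaging the received neighbours is an assumption chosen for brevity. The function names are hypothetical.

```python
def checkerboard_slice_groups(rows, cols):
    """Assign each macroblock to slice group 0 or 1 in a checkerboard pattern."""
    return [[(r + c) % 2 for c in range(cols)] for r in range(rows)]


def conceal(picture, groups, lost_group):
    """Replace every MB of the lost slice group with the average of its
    horizontal and vertical neighbours that belong to a received group."""
    rows, cols = len(picture), len(picture[0])
    concealed = [row[:] for row in picture]
    for r in range(rows):
        for c in range(cols):
            if groups[r][c] != lost_group:
                continue
            neighbours = [
                picture[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols and groups[nr][nc] != lost_group
            ]
            if neighbours:
                concealed[r][c] = sum(neighbours) / len(neighbours)
    return concealed


groups = checkerboard_slice_groups(4, 4)
original = [[10 * c for c in range(4)] for _ in range(4)]
# Simulate the loss of slice group 1: its MBs arrive as None.
received = [
    [None if groups[r][c] == 1 else original[r][c] for c in range(4)]
    for r in range(4)
]
print(conceal(received, groups, lost_group=1))
```

Because every lost MB is surrounded by MBs from the other slice group, each one has received neighbours to interpolate from.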

Scalable Video Coding

In scalable coding, video is encoded once using a flexible syntax that allows the receiver to partially decode the bitstream at different levels defined at the encoding stage. The goal of Scalable Video Coding is to provide adaptation to different network types and terminals, which makes the approach friendly to heterogeneous environments. Scalable Video Coding is implemented by encoding several layers, which are combined to achieve high quality video. The video has a base layer and one or more enhancement layers. The base layer is required in order to decode the enhancement layers.

Figure12. SVC

Types of Scalability:

Temporal Scalability: In temporal scalability, the upper layers have a higher temporal resolution, obtained by inserting frames between those of the lower layers. The enhancement frames may be predicted either from one another or from the lower layer frames.

Spatial Scalability: Spatial scalability provides an efficient coding method to increase the spatial resolution. The base layer is composed of a spatially reduced version of the whole video. This downsizing operation can be performed either in the pixel domain or in the compressed domain, in order to save computational resources at the encoding stage.

SNR Scalability: Quality scalability, also known as Signal to Noise Ratio (SNR) scalability, aims to provide an additional layer with increased quality. In this scalability type, pictures of the same spatial resolution are produced at different quality levels.
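The layered structure can be illustrated for the SNR case with a two-layer scalar quantiser: the base layer uses a coarse step, and the enhancement layer codes what the base layer missed with a finer step. The step sizes, sample values and function names are assumptions for illustration only.

```python
def encode_layers(samples, base_step, enh_step):
    """Coarsely quantise the samples (base layer), then quantise the
    remaining error with a finer step (enhancement layer)."""
    base = [round(s / base_step) for s in samples]
    base_rec = [level * base_step for level in base]
    enhancement = [round((s - r) / enh_step) for s, r in zip(samples, base_rec)]
    return base, enhancement


def decode_layers(base, base_step, enhancement=None, enh_step=None):
    """Reconstruct from the base layer alone, or refine it with the
    enhancement layer when that layer was received."""
    reconstruction = [level * base_step for level in base]
    if enhancement is not None:
        reconstruction = [
            r + level * enh_step for r, level in zip(reconstruction, enhancement)
        ]
    return reconstruction


def total_error(samples, reconstruction):
    """Sum of absolute reconstruction errors."""
    return sum(abs(s - r) for s, r in zip(samples, reconstruction))


samples = [100, 37, 58, 203]
base, enhancement = encode_layers(samples, base_step=16, enh_step=2)
base_only = decode_layers(base, 16)
full = decode_layers(base, 16, enhancement, 2)
print(total_error(samples, base_only), total_error(samples, full))
```

The enhancement layer is useless without the base layer, since it only codes a correction to the base reconstruction.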

CONCLUSION

Video transcoding is a technology that gives internet users with different access links and devices access to multimedia content. This paper reviewed several existing video transcoding techniques. The open loop transcoder is the fastest and simplest of the transcoders reviewed, but it introduces drift error. The frequency domain transcoding architecture (FDTA) needs less computation, but it lacks flexibility and introduces drift error. The spatial domain transcoding architecture (SDTA) is flexible and drift free.
