Scalable coding is the most important technology enabling the seamless and dynamic adaptation of content to network and terminal characteristics
and user requirements. It is meant to decouple the processes of encoding, stream management (including rate control) and decoding.
It is crucial to master different media coding technologies for video, audio and also synthetic content (vector graphics and 3D content)
in order to address the broad spectrum of possible applications. Complexity scalability of the encoding and decoding processes and the combination
of scalable streams with error resilience mechanisms (at bit and packet level) are also important issues.
a) Scalable Video
Scalability as available in existing Video Coding standards (in particular MPEG-4, which provides the most scalability tools among all standards
developed so far) clearly lacks efficiency, which has prevented its wider acceptance so far:
- Invoking scalable tools is usually penalized by a decrease in compression performance, particularly for fine-granular scalable approaches and for scalability over broad ranges of bandwidths;
- Free combination of different dimensions of scalability (spatial, temporal, SNR) is highly restricted by available tools.
Simulcast and simulstore solutions (parallel provision of media streams) have been widely used as an alternative, but they are not a viable solution for flexible media adaptation because they are limited to pre-determined stream configurations. Transcoding is not a viable solution either, due to the high effort needed for real-time processing at server, proxy or client.
The goal of DANAE is the development of an efficient scalable video coding solution, overcoming the limitations listed above, which should particularly provide high compression efficiency at low and medium bandwidths. The DANAE project intends to contribute to upcoming standardization efforts for efficient and flexible scalable coding in MPEG, where in the area of video the most prominent advanced technology is Motion-Compensated Interframe Wavelet Coding. This class of new codecs, recently investigated in an MPEG exploration, provides numerous advantages over conventional techniques based on motion-compensated prediction:
- No recursive predictive loop, such that no drift occurs if decoding is performed at various bit-rates and resolutions;
- Separation of noise and sampling artefacts from the content through use of longer temporal filters;
- Consistent removal of both long range and short range temporal redundancies;
- Flexibility in the spatial and temporal filtering methods, number of decomposition levels, and filter choices, which allows many improvements not feasible in predictive coding.
As a consequence, wavelet video coding schemes can provide flexible spatial, temporal, SNR and complexity scalability with fine granularity over a wide range of bit rates, while maintaining a high coding efficiency. By concentrating a critical mass of leading companies in the subject, DANAE prepares to actively define, enhance and implement this promising new standard. In this context, DANAE will concentrate on evaluation and development of tools for motion-compensated 2D+t Wavelet filtering, including optimisation of layered quantisation and entropy coding exploiting spatio-temporal contexts, optimised inter-working of spatio-temporal filters with motion estimation and compensation, scalable motion vector representations, efficient combinations of different scalable dimensions and related stream management. Development of software both for standards reference purposes and real-time applications will be an inherent part of this activity.
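The open-loop property claimed for interframe wavelet coding can be illustrated with a minimal sketch. It assumes a single level of temporal Haar lifting and, for brevity, omits motion compensation entirely; the function names `haar_analysis` and `haar_synthesis` are illustrative and are not taken from any DANAE or MPEG software. Frames are represented as flat lists of pixel values.

```python
def haar_analysis(frames):
    """One level of temporal Haar lifting over consecutive frame pairs.

    frames: list with an even number of frames, each a list of floats.
    Returns (low, high): temporal low-pass and high-pass subbands.
    Motion compensation is deliberately left out of this sketch.
    """
    low, high = [], []
    for a, b in zip(frames[0::2], frames[1::2]):
        h = [y - x for x, y in zip(a, b)]       # predict step: frame difference
        l = [x + d / 2 for x, d in zip(a, h)]   # update step: pair average
        low.append(l)
        high.append(h)
    return low, high


def haar_synthesis(low, high):
    """Invert haar_analysis by undoing the lifting steps in reverse order."""
    frames = []
    for l, h in zip(low, high):
        a = [x - d / 2 for x, d in zip(l, h)]   # undo update
        b = [x + d for x, d in zip(a, h)]       # undo predict
        frames.extend([a, b])
    return frames


frames = [[1.0, 2.0], [3.0, 6.0], [5.0, 5.0], [7.0, 1.0]]
low, high = haar_analysis(frames)

# Full-rate decoding uses both subbands and reconstructs every frame.
full_rate = haar_synthesis(low, high)

# Half-rate decoding simply drops the high-pass subband: the low-pass
# frames are already a valid sequence at half the frame rate.
half_rate = low
```

Because the decoder never feeds reconstructed frames back into a prediction loop, discarding or coarsely quantising the high-pass subband affects only the frames it belongs to, which is why no drift accumulates across bit rates. Applying the same analysis again to `low` would give a further temporal level.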
b) Scalable audio
The same approach will be followed for the standardization effort concerning low bitrate Scalable Audio, which is planned to build mainly on extensions of the MPEG-4 audio coding tools. Innovative proposals addressing the project's key features, for example scalability and multi-channel coding, will be studied in particular and proposed to MPEG-4 in that context. Multi-channel audio coding solutions will be investigated and developed as improvements over existing MPEG-4 tools such as SBR.
c) 2D/3D Graphics
The current Graphics Coding formats, whether used for scene composition (SVG, BIFS) or for streaming graphics such as cartoons or ads (SWF, BIFS), are all inappropriate for the mobile environment. The formats were designed for the PC and the Internet. The players are heavyweight programs that do not scale down to mobile devices. The authoring tools do not provide for easy profiling. The goal of DANAE is to design a graphics environment that is as compatible as possible with the existing standards and yet is the right answer for the mobile environment.
For scalable representation of Avatars, DANAE will focus on both facial animation and virtual character animation. Particular attention will be paid to algorithms that enable implementation on (mobile) devices with very limited resources, e.g. methods for transforming / adapting 3D models to 2D models and a low-complexity 2D face animation player implementation.
d) Error resilience
Both consumption and production of content, particularly in mobility situations, are strongly affected by network characteristics: bandwidth, latency, packet losses and also bit errors. Resilience to packet losses and errors needs to be dealt with specifically, especially for mobile networks, and affects the very specification of codecs. Not only does error resilience have to be taken into account at the very early stages of codec design, it may also have to be adapted dynamically, depending on network characteristics. In order to optimize bandwidth usage while minimizing latency, many applications require that UDP packets are not completely discarded if there are residual errors in the payload. Following this trend, techniques for the UDP-Lite protocol have recently been proposed that allow a subtler handling of damaged UDP packets. In line with this trend, a kind of paradigm change can be observed in the IETF and other related bodies, suggesting that the strict separation of OSI layers is not optimal in all application cases. 'Inter Layer Signaling' has been suggested to resolve this. Another related approach is 'Robust Header Compression' (ROHC), which is especially useful in multimedia messaging and streaming applications over wireless networks, where significant bandwidth savings can be achieved, and which is unavoidable in future wireless and 4G networks. The consequence is that the application layer (the media decoders) will have to handle both losses (erasures) and bit errors in the payload. Joint Source Channel Coding approaches will be investigated thoroughly in DANAE in this context.
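The idea of not discarding packets with residual payload errors can be sketched as a partial checksum, in the spirit of the UDP-Lite checksum coverage field. The sketch below is a simplification under stated assumptions: it applies the standard 16-bit ones' complement Internet checksum to a prefix of the packet only, and it ignores the pseudo-header that a real UDP-Lite checksum also covers. The helper names `internet_checksum` and `accept_packet` are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum (Internet checksum) of data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def accept_packet(packet: bytes, coverage: int, expected: int) -> bool:
    """Accept the packet if the first `coverage` bytes match the checksum.

    Bytes beyond `coverage` are not verified, so bit errors there are
    passed up to the application (the media decoder) instead of causing
    the whole packet to be dropped.
    """
    return internet_checksum(packet[:coverage]) == expected


def flip_bit(packet: bytes, byte_index: int, bit: int = 0) -> bytes:
    """Return a copy of packet with one bit inverted (simulated channel error)."""
    damaged = bytearray(packet)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


header = bytes([0x12, 0x34, 0x56, 0x78])     # sensitive part, protected
payload = b"error-tolerant media bits"       # insensitive part, unprotected
packet = header + payload
checksum = internet_checksum(packet[:len(header)])

payload_error = flip_bit(packet, len(header) + 3)
header_error = flip_bit(packet, 1)

print(accept_packet(payload_error, len(header), checksum))  # payload damage is tolerated
print(accept_packet(header_error, len(header), checksum))   # header damage is rejected
```

The design choice is where to place the coverage boundary: everything the decoder cannot interpret safely when corrupted (headers, synchronisation data) goes inside it, and the error-tolerant media bits stay outside.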
Classical coding schemes (Huffman and arithmetic) used in nearly every compression standard are intrinsically very sensitive to transmission noise. Reversible variable length codes have been introduced in MPEG-4 and H.263+ to reduce the impact of the noise on the quality of the received content. However, these codes lead to some loss in compression efficiency while at the same time not completely avoiding error propagation within packets.
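The sensitivity of variable length codes to noise can be seen in a small experiment. The sketch below uses a toy four-symbol prefix code chosen for illustration (it is not a code table from any standard) and flips a single bit in the encoded stream; the names `CODE`, `encode`, `decode` and `flip` are hypothetical.

```python
# Toy prefix (Huffman-style) code, chosen only for illustration.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
INVERSE = {bits: symbol for symbol, bits in CODE.items()}


def encode(message: str) -> str:
    """Concatenate the codewords of each symbol into one bit string."""
    return "".join(CODE[symbol] for symbol in message)


def decode(bits: str) -> str:
    """Greedy prefix decoding; trailing bits that form no codeword are dropped."""
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in INVERSE:
            out.append(INVERSE[buffer])
            buffer = ""
    return "".join(out)


def flip(bits: str, index: int) -> str:
    """Invert the bit at position index (simulated transmission error)."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]


message = "badcab"
received = decode(flip(encode(message), 3))

print(message)    # sent symbols
print(received)   # decoded symbols after a single bit error
```

A single inverted bit changes codeword boundaries, so several decoded symbols differ and even the number of symbols changes. With longer codes or arithmetic coding the decoder can stay de-synchronised for much longer, which is the propagation effect the text refers to.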
DANAE will work on the robust decoding of classical entropy codes used in emerging standards (arithmetic codes) and will also design a new family of codes avoiding error propagation and decoder de-synchronisation at no cost in terms of compression efficiency.
Another type of degradation that the data will suffer on wired and wireless links is packet loss. Traditional transform coding systems widely used in compression standards such as MPEGx and H26x offer some solutions in the direction of higher robustness to erasures, such as restricting prediction modes to avoid loss propagation, or choosing per-block coding modes that take into account the signal distortion induced by the erasures, but these solutions are ad hoc and sub-optimal. To compensate for this sub-optimality, classical solutions use forward error correction at the packet level and/or ARQ mechanisms. These error correction schemes are most efficient if they are dynamically adapted to the QoS provided by the network in use. Therefore such dynamic adaptation approaches based on MPEG-21 DIA will be explored in the project. However, there is another approach to the problem of fighting losses and errors, known as Joint Source Channel Coding (JSCC). These techniques can take the best possible account of the source characteristics and the rate-distortion performance of the coder. Optimal rate-distortion performance can be achieved by keeping or introducing a controlled amount of redundancy in the compressed source representation. In the presence of erasures, rather than trying to de-correlate the signal samples with orthogonal transforms so as to produce close to independent samples, it is preferable to maintain or introduce redundancy in the compressed source representation. Therefore, in DANAE, solutions based on overcomplete frame expansions will be developed.
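How an overcomplete frame expansion tolerates erasures can be shown with the smallest possible case. The sketch below assumes a frame of three unit vectors spaced 120 degrees apart in the plane, so a two-dimensional signal is represented by three coefficients instead of two. This particular frame and the names `FRAME`, `expand` and `reconstruct` are illustrative choices, not the expansions DANAE will develop.

```python
import math

# Three unit vectors 120 degrees apart: an overcomplete frame for the plane.
ANGLES = (math.pi / 2, 7 * math.pi / 6, 11 * math.pi / 6)
FRAME = [(math.cos(t), math.sin(t)) for t in ANGLES]


def expand(x):
    """Frame coefficients: inner product of x with each frame vector."""
    return [fx * x[0] + fy * x[1] for fx, fy in FRAME]


def reconstruct(received):
    """Recover x from any two surviving coefficients.

    received: dict mapping frame index -> coefficient (at least two entries).
    Any two vectors of this frame are linearly independent, so the
    resulting 2x2 linear system always has a unique solution.
    """
    (i, ci), (j, cj) = list(received.items())[:2]
    (a, b), (c, d) = FRAME[i], FRAME[j]
    det = a * d - b * c
    return ((d * ci - b * cj) / det, (a * cj - c * ci) / det)


x = (3.0, -1.0)
coefficients = expand(x)             # three coefficients, e.g. one per packet

for lost in range(3):                # erase each coefficient in turn
    survivors = {k: v for k, v in enumerate(coefficients) if k != lost}
    print(lost, reconstruct(survivors))
```

With an orthogonal transform, losing one of the two coefficients would make exact recovery impossible; here any single erasure is harmless, at the cost of 50% redundancy. When all three coefficients arrive, a least-squares reconstruction could use the redundancy to reduce quantisation noise instead, which is not shown in this sketch.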