In recent years, the aerospace industry, particularly in the military market, has seen the proliferation of unmanned vehicles in a variety of end-use applications. This is especially true for unmanned aerial vehicles with deployments ranging from long-distance flights at high altitudes to ultra-short-range tactical vehicles launched from a person’s shoulder.
A particularly challenging area for unmanned vehicles is video processing. Video is the primary link between the vehicle and its operator, who is located on the ground some distance away (Figure 1). Some UAV pilots have likened controlling an unmanned aerial vehicle to “flying an airplane while looking through a straw.” The challenges are only increasing as the demand grows for greater visual awareness and more detailed surveillance data.
New unmanned systems must support a growing number of video sources, including visible image capture, infrared imaging, radar imaging, and other sources – and sometimes multiple instances of each. These devices are also increasing in resolution, from NTSC/PAL standard-definition sensors (~20 MBps raw data) to high-definition sensors at 720p60 and even 1080p30 resolutions (~125 MBps). By these figures, the raw video data per stream increases roughly sixfold. At the same time, available bandwidth has not increased by a similar ratio and, in fact, is at times more limited because of increased demands on lower-bandwidth satellite links to numerous unmanned aerial vehicles in a given theatre of operation. As a result, efficient data compression is imperative to enable an operator to view meaningful video at a control center while also providing valuable mission data. Codecs, whether JPEG2000 or H.264, are critical in transmitting and storing video data, but which of the two is better suited to video compression/decompression for modern UAV requirements?
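The raw data rates quoted above can be approximated with simple arithmetic. The sketch below is illustrative only: it assumes 2 bytes per pixel (8-bit 4:2:2 sampling), a 720 x 480 active picture for standard definition, and a nominal 30 frames/sec, none of which is stated in the article. The function name is hypothetical.

```python
def raw_rate_bytes_per_sec(width, height, fps, bytes_per_pixel=2):
    """Uncompressed video data rate in bytes per second.

    bytes_per_pixel=2 is an assumption (8-bit 4:2:2 sampling).
    """
    return width * height * fps * bytes_per_pixel


# Frame sizes and rates are nominal values chosen for illustration.
FORMATS = {
    "SD (720x480 @ 30)": (720, 480, 30),
    "720p60": (1280, 720, 60),
    "1080p30": (1920, 1080, 30),
}

for name, (width, height, fps) in FORMATS.items():
    rate = raw_rate_bytes_per_sec(width, height, fps)
    print(f"{name}: {rate / 1e6:.1f} MB/s raw")
```

Under these assumptions the results land close to the ~20 MBps and ~125 MBps figures cited for standard-definition and 1080p30 sensors. A different bit depth or chroma sampling would shift the numbers proportionally.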
Codecs are the key
There are many factors that contribute to the effectiveness of the video subsystem onboard an unmanned aerial vehicle. The subsystem must be compact, lightweight, draw little power, and dissipate little heat. It must be capable of working with a broad range of video standards. One of several vital decisions to be made by the system designer, however, concerns the codecs to be implemented on the vehicle.
A video codec encodes and decodes video so that it can be transmitted and/or stored, generally with a greater or lesser degree of compression and decompression: Transmitting and storing a “raw” video stream will generally require far more bandwidth than can be made available. Unmanned vehicles are now being designed to capture not only more video information, but also more detailed video information; hence, the codec challenge is magnified as the quantity, resolution, and frame rate of the video increase. The objective of any compression solution is to deliver an application-appropriate trade-off that balances image quality against bandwidth consumption.
Today, two codecs dominate the video market: JPEG2000 and ITU-T H.264, also known as MPEG-4 Part 10 or AVC (see Figure 2). Each codec has clear advantages and disadvantages. The challenge is to identify, for any given application, which set of advantages is most desirable, or even necessary, and which disadvantages have low net impact. The following brief descriptions cite the major benefits and limitations of each codec.
JPEG2000: True lossless compression
JPEG2000 is a standard originally developed for still photos by the Joint Photographic Experts Group. When applied to motion video, JPEG2000 is called MJPEG2000, which is simply a sequence of compressed “static” images: Each frame is compressed independently, and all of the data needed to decode a frame is contained within the bitstream for that frame.
Because JPEG2000 was developed for static images, its features and functionality were inevitably tailored for that environment, rather than for motion video. Design priorities included the ability to support lossless compression, and the ability to display a low-resolution “thumbnail” of the original for quick indexing views. There are two versions of the JPEG2000 transform: one that has a reversible complement for true mathematically lossless compression and one that provides greater compression ratios but is lossy.
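To illustrate what “reversible” means in practice, the following toy sketch applies one level of an integer 5/3 lifting transform to a single row of samples and then inverts it exactly. The article does not name the filter; the 5/3 integer wavelet is assumed here because it is the one commonly associated with JPEG2000's lossless path. The function names, the even-length restriction, and the sample row are hypothetical simplifications, not a codec implementation.

```python
def forward_53(x):
    """One level of a reversible 5/3 integer lifting transform.

    Toy sketch: handles even-length lists of integers only.
    Returns (low-pass samples, high-pass samples).
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length signals only"
    half = n // 2
    # Predict: each odd sample minus the floored average of its even neighbours.
    d = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[2 * i]  # mirror at the edge
        d.append(x[2 * i + 1] - (left + right) // 2)
    # Update: each even sample plus a rounded quarter of the neighbouring details.
    s = []
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]  # mirror at the edge
        s.append(x[2 * i] + (d_prev + d[i] + 2) // 4)
    return s, d


def inverse_53(s, d):
    """Undo forward_53 exactly by reversing the lifting steps."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    # Undo the update step to recover the even samples.
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (d_prev + d[i] + 2) // 4
    # Undo the predict step to recover the odd samples.
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[2 * i]
        x[2 * i + 1] = d[i] + (left + right) // 2
    return x


row = [12, 14, 200, 198, 17, 16, 15, 90]  # made-up pixel values
low, high = forward_53(row)
print(inverse_53(low, high) == row)
```

Because every step uses integer arithmetic that the inverse repeats in mirror order, the original samples come back bit for bit. A lossy transform gives up that guarantee in exchange for higher compression ratios.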
Lossless JPEG2000 compression has a maximum compression ratio of about 3:1. At that ratio, a single 30 frame/sec HDTV signal (1,920 x 1,080 pixels) would still require a data rate of approximately 316 Mbps. Given this reality, lossless compression is not practical in an unmanned vehicle environment, as sufficient communications bandwidth is simply not available. However, JPEG2000 still provides a convenient approach to viewing low-resolution versions of static images via its “thumbnail” feature.
Lossy JPEG2000 can increase the compression ratio up to 20:1 with reasonable visual quality, reducing the compressed data rate to around 30 Mbps. The limiting factor is the degree of quality degradation that can be endured. JPEG2000 is limited by the need for all of the information for each frame to be fully transmitted and received: The only way to reduce bandwidth requirements in JPEG2000 is to reduce the video quality.
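Because every frame must stand on its own, the link rate translates directly into a fixed byte budget per frame. The short sketch below makes that relationship concrete. The function name and the 10 and 5 Mbps link rates are hypothetical; the 30 Mbps rate and 30 frames/sec come from the figures above.

```python
def frame_budget_bytes(link_bps, fps):
    """Average bytes available per frame when every frame is coded independently."""
    return link_bps / fps / 8


# 30 Mbps is the figure from the text; 10 and 5 Mbps are illustrative.
for link_mbps in (30, 10, 5):
    budget = frame_budget_bytes(link_mbps * 1e6, 30)
    print(f"{link_mbps} Mbps at 30 frames/sec -> {budget / 1000:.1f} kB per frame")
```

At 30 Mbps each frame gets about 125 kB. Halving the link rate halves the budget for every frame, and with no data shared between frames the only place to absorb that cut is image quality.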
H.264: Visually lossless compression
ITU-T H.264 (also known as MPEG-4 Part 10) was developed jointly by the International Telecommunication Union and the Moving Picture Experts Group. Both groups are interested in the efficient transfer of moving images across low-bandwidth media. Thus, H.264 was designed specifically to provide optimal motion video quality over a low-bandwidth link – making it ideal for many UAV applications. It was originally targeted for telephone lines, but it has now been extended to Ethernet and RF applications as well.
Various methods were developed to exploit the fact that not all information in a given frame changes from the previous frame. In fact, in many applications, only a small amount of information changes from frame to frame. Additionally, much of the information that does change was already present in the preceding frame, just in a different location. Motion video, especially from live camera inputs, exhibits significant frame-to-frame similarities. A codec like H.264 can exploit these similarities and redundancies to significantly increase compression performance with little or no impact on perceived video quality. Although the result is not mathematically lossless, the reconstructed image can be considered visually lossless. A visually lossless image is typically defined as one in which the missing data has no noticeable effect for a human viewer short of very close inspection or comparison with the original data. Meanwhile, as mentioned, the inability to capitalize on frame-to-frame efficiencies is one of the key weaknesses of the JPEG2000 codec when applied to motion-video streaming applications.
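The idea that changed content often already exists in the preceding frame at a different location can be shown with a toy block-matching search. This is a minimal sketch of the general principle, not H.264's actual motion estimation: the 16 x 16 synthetic frames, the sum-of-absolute-differences cost, the exhaustive search, and all function names are assumptions made for illustration.

```python
def make_frame(obj_x, obj_y, size=16, obj=4):
    """Synthetic frame: dark background with one bright square (made-up data)."""
    frame = [[10] * size for _ in range(size)]
    for y in range(obj_y, obj_y + obj):
        for x in range(obj_x, obj_x + obj):
            frame[y][x] = 200
    return frame


def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a shifted reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def best_motion_vector(cur, ref, bx, by, size=8, search=3):
    """Exhaustively search +/- `search` pixels for the best-matching reference block."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


previous = make_frame(4, 4)  # object at (4, 4)
current = make_frame(6, 5)   # same object, moved 2 right and 1 down

no_motion_cost = sad(current, previous, 4, 4, 0, 0, 8)
vector, residual_cost = best_motion_vector(current, previous, 4, 4)
print("difference without motion search:", no_motion_cost)
print("motion vector:", vector, "remaining difference:", residual_cost)
```

In this contrived case the search finds that the block matches the previous frame exactly once shifted, so an encoder working on this principle would only need to send a small motion vector rather than the pixels themselves. Real video leaves a nonzero residual that must also be coded, but it is typically far smaller than the full frame.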
The compression ratios that can be attained by the H.264 codec, using motion-video redundancies, can reach up to 100:1 or 150:1 while maintaining excellent video quality. Thus, a high-quality HDTV video stream at 30 frames/sec can be transmitted with as little as 5 to 10 Mbps of data bandwidth. Lower data rates can be attained if needed by slightly reducing the video quality or limiting the frame size or frame rate of the transmitted stream. Acceptable video could be transmitted with less than 1 Mbps in some cases.
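The relationship between compression ratio and link rate can be checked with a few lines of arithmetic. The sketch below assumes 16 bits per pixel for the raw stream, which the article does not specify. The 8 Mbps and 1 Mbps link budgets and the reduced 640 x 360, 15 frames/sec format are hypothetical examples, as are the function names.

```python
def raw_bps(width, height, fps, bits_per_pixel=16):
    """Uncompressed bit rate; 16 bits per pixel is an assumption."""
    return width * height * fps * bits_per_pixel


def compressed_mbps(width, height, fps, ratio):
    """Bit rate in Mbps after compressing the raw stream by ratio:1."""
    return raw_bps(width, height, fps) / ratio / 1e6


def required_ratio(width, height, fps, link_mbps):
    """Compression ratio needed to fit the raw stream into the given link."""
    return raw_bps(width, height, fps) / (link_mbps * 1e6)


# 1080p30 at the compression ratios quoted in the text.
for ratio in (100, 150):
    print(f"1080p30 at {ratio}:1 -> {compressed_mbps(1920, 1080, 30, ratio):.1f} Mbps")

# Ratio needed for illustrative link budgets, full size versus a reduced format.
print(f"1080p30 into 8 Mbps needs {required_ratio(1920, 1080, 30, 8):.0f}:1")
print(f"1080p30 into 1 Mbps needs {required_ratio(1920, 1080, 30, 1):.0f}:1")
print(f"640x360 @ 15 into 1 Mbps needs {required_ratio(640, 360, 15, 1):.0f}:1")
```

Under these assumptions, 100:1 and 150:1 correspond to roughly 10 and 6.6 Mbps, in line with the range given above. The last two lines show why shrinking the frame size and frame rate makes sub-1 Mbps links workable: the required ratio falls from nearly 1,000:1 to about 55:1.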
Figure 3 contrasts JPEG2000 (top) with H.264 (bottom). How many cars are parked in front of the building? In a bandwidth-limited (8 Mbps) environment using the same compression ratio, the H.264 codec can deliver greater image accuracy than JPEG2000. JPEG2000 can deliver very high-quality images, but only at much lower compression ratios.
Trading image quality, performance, bandwidth
Choosing an appropriate video codec is fundamental to the success of a UAV-based mission. JPEG2000 and H.264 offer designers two alternatives, each with advantages and disadvantages. JPEG2000 provides substantial flexibility in its implementation, in addition to truly lossless compression and the ability to generate low-resolution “thumbnail” images. H.264 can compress video at much higher ratios, resulting in lower bandwidth utilization. And, while not truly lossless, H.264 is visually lossless – which can be sufficient. As always, it is the specific demands of the application that determine whether one, the other, or both are implemented.
GE Intelligent Platforms www.ge-ip.com