Similar literature
Found 20 similar documents; search time: 15 ms
1.
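郑兆青, 桑红石, 黄卫锋, 沈绪榜. 《电子学报》 (Acta Electronica Sinica), 2007, 35(10): 1921-1926
This paper proposes a Level-D data-reuse integer motion estimation VLSI architecture for H.264/AVC. Built on a fixed-block-size motion estimation VLSI architecture, it uses a crossbar network to compute variable block sizes and a multi-bank memory organization, so the read/write rules of the on-chip memory stay simple and motion estimation for different search ranges and video sizes is easy to handle. The architecture is described in Verilog HDL and logic-synthesized with Synopsys DC in the HJTC 0.18 μm process. Compared with existing architectures, the added on-chip memory gives a high data reuse ratio and greatly reduces the memory bandwidth requirement; the data throughput is also high, meeting the needs of high-performance video coding.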

2.
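Compared with the advanced prediction modes of conventional MPEG-4 and H.263, the multi-mode motion estimation algorithm adopted in the MPEG-4 AVC/ITU-T H.264 video coding standard greatly improves coding efficiency and performance. However, issues such as mode decision impose very high computational complexity on motion estimators, hardware motion estimators in particular. This paper proposes a hardware architecture for an H.264 motion estimator that adopts a new mode decision algorithm and a fast motion estimation algorithm. Simulation results show that the two algorithms not only lower the hardware implementation cost of the motion estimator but also reduce the time spent on mode decision and motion estimation.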

3.
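李宇, 梅顺良. 《电视技术》 (Video Engineering), 2007, 31(8): 23-26
The overall algorithms of H.264/AVC and AVS and their local similarities and differences are analyzed, and a system architecture for a video decoder chip supporting both H.264/AVC and AVS is proposed to meet the requirements of high processing capability and high throughput. In this architecture, the hybrid video coding framework is divided into five processing cores; each core implements the processing of the corresponding standard through different parameter settings, which makes the hardware reusable. A multi-level hybrid pipeline exploits the task-level parallelism of video processing to raise throughput. A three-level memory hierarchy is adopted, and each of its three levels is optimized separately, effectively improving the efficiency and parallelism of data access.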

4.
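A general-purpose parallel transform architecture for H.264/AVC codecs is proposed, and the circuit is designed in Verilog. The parallel architecture consists mainly of 4 shifters and 16 adders and can perform all of the 4×4 transforms in H.264/AVC, including the 4×4 Hadamard transform and the 4×4 discrete cosine transform together with their inverses, at a rate of one pixel per clock cycle. The architecture was synthesized in the SMIC 0.18 μm process; the circuit area is 3757 gates, and the critical path when operating at a 100 MHz clock is reported as 10.3 ms.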
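
As a rough illustration of how a 4×4 transform can be built from shifts and additions only, the sketch below computes the forward 4×4 integer core transform of H.264/AVC in Python. It is not the circuit described in the abstract: the function names `forward_1d` and `forward_4x4` are hypothetical, and the scaling normally folded into quantization is left out.

```python
# Forward 4x4 integer core transform of H.264/AVC, Y = Cf * X * Cf^T, with
# Cf = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]].
# Hypothetical helper names; post-scaling/quantization is omitted.

def forward_1d(a, b, c, d):
    """One 4-point butterfly: additions, subtractions and 1-bit shifts only."""
    s0 = a + d
    s1 = b + c
    d0 = a - d
    d1 = b - c
    return (
        s0 + s1,          # row [1,  1,  1,  1]
        (d0 << 1) + d1,   # row [2,  1, -1, -2]
        s0 - s1,          # row [1, -1, -1,  1]
        d0 - (d1 << 1),   # row [1, -2,  2, -1]
    )

def forward_4x4(block):
    """Apply the 1-D butterfly to every row, then to every column."""
    rows = [forward_1d(*r) for r in block]        # X * Cf^T
    cols = [forward_1d(*c) for c in zip(*rows)]   # Cf * (X * Cf^T), stored column-wise
    return [list(r) for r in zip(*cols)]          # transpose back to row-major order

if __name__ == "__main__":
    residual = [[5, 11, 8, 10],
                [9, 8, 4, 12],
                [1, 10, 11, 4],
                [19, 6, 15, 7]]
    for row in forward_4x4(residual):
        print(row)
```

Because the 2-D transform is separable, the same four-input butterfly serves both passes, which is the property that lets hardware designs share one small datapath across transform types.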

5.
This letter presents a novel approach for organizing computational resources into groups within H.264/AVC motion estimation architectures, leading to reductions of up to 75% in the equivalent gate count with respect to state‐of‐the‐art designs.

6.
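Based on the sub-pixel filtering algorithm of H.264/AVC, a new sub-pixel interpolation architecture is proposed. It avoids storing large amounts of intermediate data and offers a regular data flow, simple control, continuous interpolation in the vertical direction, and reusability. In a 0.18 μm process at a maximum frequency of 125 MHz, the synthesized design uses 18k logic gates and can meet the requirements of real-time processing of SDTV (1280×720, 30 frames/s) video.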
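
For orientation, H.264/AVC derives luma half-sample positions with a six-tap filter whose coefficients are (1, -5, 20, 20, -5, 1), followed by rounding and clipping to the 8-bit range. The Python sketch below shows that filter on a single row of integer samples; it illustrates the standard filter only, not the interpolation architecture of the abstract, and the names `half_pel` and `half_pel_row` are hypothetical.

```python
# Six-tap luma half-sample filter of H.264/AVC applied along one row.
# Hypothetical helper names; 8-bit samples are assumed.

def half_pel(e, f, g, h, i, j):
    """Half-sample value between g and h from six neighbouring integer samples."""
    acc = e - 5 * f + 20 * g + 20 * h - 5 * i + j
    return min(255, max(0, (acc + 16) >> 5))  # round, divide by 32, clip to 0..255

def half_pel_row(row):
    """Half-sample values between row[k + 2] and row[k + 3] for every valid k."""
    return [half_pel(*row[k:k + 6]) for k in range(len(row) - 5)]

if __name__ == "__main__":
    print(half_pel_row([10, 20, 30, 40, 50, 60, 70, 80]))
```

Each output needs six inputs, so a vertical pass over a block touches five extra rows; that overhead is one reason interpolation designs pay attention to intermediate storage.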

7.
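An efficient block-matching search algorithm based on H.264/AVC   Total citations: 15, self-citations: 2, citations by others: 13
薛金柱, 沈兰荪. 《电子学报》 (Acta Electronica Sinica), 2004, 32(4): 583-586
Targeting the coding characteristics of H.264/AVC, this paper proposes a fast block-matching search algorithm that exploits motion correlation in the temporal and spatial domains. The algorithm makes full use of the correlation between the degree of motion in a video sequence and the macroblock coding mode, as well as the statistical characteristics of motion vectors, and markedly reduces the search complexity of motion estimation. Experiments show that the search speed of the proposed method is on average 77.96% and 32.19% higher than that of the FS and DS algorithms, respectively; the PSNR of the reconstructed images is on average 0.06 dB higher than with DS, which is closer to the coding quality of FS.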

8.
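A hardware implementation architecture for the motion estimation algorithm in H.264   Total citations: 1, self-citations: 1, citations by others: 0
A hardware architecture is proposed for the variable-block-size motion estimation algorithm in the H.264 standard. It uses a one-dimensional array of processing elements (PEs) to search for matching blocks, and it computes the block error (SAD) of blocks of different sizes by reusing the SADs of smaller sub-blocks. Compared with conventional architectures that process a single motion vector, this architecture can process up to 41 motion vectors within a given number of clock cycles, and it is small in area and fast.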
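
The SAD reuse mentioned above can be sketched as follows, assuming the sixteen 4×4 sub-block SADs of one macroblock at one candidate position are already available. Summing them in groups gives the SADs of all 41 H.264 partitions (16 + 8 + 8 + 4 + 2 + 2 + 1). This is a plain Python illustration under those assumptions, not the PE-array design of the abstract; `combine_sads` and the `WxH` dictionary keys are hypothetical names.

```python
# Derive the SADs of all 41 variable-size blocks of a 16x16 macroblock from
# the sixteen 4x4 sub-block SADs. Hypothetical names; keys read "width x height".

def combine_sads(s):
    """s is a 4x4 grid: s[r][c] is the SAD of the 4x4 sub-block at row r, column c."""
    out = {}
    out["4x4"] = [s[r][c] for r in range(4) for c in range(4)]
    out["8x4"] = [s[r][c] + s[r][c + 1] for r in range(4) for c in (0, 2)]
    out["4x8"] = [s[r][c] + s[r + 1][c] for r in (0, 2) for c in range(4)]
    # 8x8 quadrants in raster order: top-left, top-right, bottom-left, bottom-right
    q = [s[r][c] + s[r][c + 1] + s[r + 1][c] + s[r + 1][c + 1]
         for r in (0, 2) for c in (0, 2)]
    out["8x8"] = q
    out["16x8"] = [q[0] + q[1], q[2] + q[3]]   # top half, bottom half
    out["8x16"] = [q[0] + q[2], q[1] + q[3]]   # left half, right half
    out["16x16"] = [sum(q)]
    return out

if __name__ == "__main__":
    grid = [[r * 4 + c for c in range(4)] for r in range(4)]
    sads = combine_sads(grid)
    print(sum(len(v) for v in sads.values()))  # number of block SADs produced
```

Only additions are needed beyond the sixteen base SADs, so a single pass over the pixel differences serves every block size.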

9.
Several specific features have been incorporated into motion estimation (ME) in the H.264 coding standard to improve its coding efficiency. However, they result in a very high computational load. In this paper, a fast ME algorithm is proposed to reduce the computational complexity. First, a mode discriminant method is used to free the encoder from checking the small block size modes in homogeneous regions. Second, a condensed hierarchical block matching method and a spatial neighbor searching scheme are employed to find the best full-pixel motion vector. Finally, a direction-based selection rule is utilized to reduce the search range in the sub-pixel ME process. Experimental results on commonly used QCIF and CIF format test sequences show that the proposed algorithm reduces ME processing time by 88% on average, while incurring only a 0.033 dB loss in PSNR and a 0.50% increase in total bit rate compared with the exhaustive ME process, which is the default approach adopted in the JM reference software.

10.
This paper proposes a novel cost-effective and programmable architecture of a CAVLC decoder for H.264/AVC, including decoders for Coeff_token, T1_sign, Level, Total_zeros and Run_before. To simplify the hardware architecture and provide programmability, we propose four new techniques: a new group-based VLD with efficient memory (NG–VLDEM) for the Coeff_token decoder, a novel combined architecture (NCA) for the Level decoder, a new group-based VLD with memory access once (GMAO) for the Total_zeros decoder and a new VLD architecture based on multiplexers instead of searching memory (MISM) for the Run_before decoder. With these four techniques, the proposed CAVLC decoder can decode every syntax element within one clock cycle. Synthesis results show that the hardware cost is 3,310 gates in 0.18 μm CMOS technology at a clock constraint of 125 MHz. Therefore, the proposed design is suitable for real-time applications such as H.264/AVC HD1080i video decoding.
Contact author: Shunliang Mei

11.
In this paper, we present a high-performance motion compensation architecture for an H.264/AVC HDTV decoder. The bottleneck of an efficient motion compensation implementation lies primarily in the high memory bandwidth demand and the complexity of six-tap fractional interpolation. To remove this bottleneck for H.264/AVC HD applications, three combined bandwidth optimization strategies are proposed to minimize the memory bandwidth of the MB-based decoding process. To improve interpolation hardware utilization and reduce interpolation cycles, an interpolation classification scheme is proposed. By classifying the fifteen fractional pixels into five types and processing each type accordingly, the number of interpolation cycles decreases significantly. A direct-mapped memory cache featuring circular addressing, byte-aligned addressing, and parallel horizontal and vertical access is designed to support the proposed scheme. The proposed motion compensation hardware is implemented at 100 MHz with 31.841 K logic gates. It reduces memory bandwidth by 70–80% on average, keeps the interpolation hardware fully utilized, and interpolates one MB within 304 cycles, which satisfies the real-time constraint of an H.264/AVC HD (1,920 × 1,088) 30 fps decoder. The design is implemented in UMC 0.18 μm technology, and the synthesis results and comparisons are shown.
Contact author: Yu Li

12.
This paper presents a compact hardware architecture of a Context-Based Adaptive Binary Arithmetic Coding (CABAC) codec for H.264/AVC. The similarities between the encoding and decoding algorithms are exploited to achieve substantial hardware reuse. System-level hardware/software partitioning is conducted to improve overall performance. Meanwhile, the characteristics of the CABAC algorithm are used to implement a dynamic pipeline scheme, which increases processing throughput with very small hardware overhead. The proposed architecture is implemented in 0.18 μm technology. Results show that the core area of the proposed design is 0.496 mm² at a maximum clock frequency of 230 MHz. It is estimated that the proposed architecture can support CABAC encoding or decoding for HD1080i resolution at 30 frames/s.
Contact author: Lingfeng Li

13.
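VLSI implementation of the deblocking loop filter in H.264/AVC   Total citations: 2, self-citations: 0, citations by others: 2
A VLSI architecture for the in-loop deblocking filter of H.264 codecs is proposed. The data storage order is organized according to the data dependencies between adjacent 4×4 pixel blocks, and a local SRAM is added so that the data for vertical filtering come from local storage. This halves the number of external SDRAM reads and writes and thus greatly reduces the number of filtering cycles. With a transpose register, horizontal and vertical filtering share a single one-dimensional filter circuit. Simulation results show that deblocking one macroblock takes only 230 cycles. In a 0.18 μm process at a maximum frequency of 100 MHz, the synthesized design uses 14K logic gates.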

14.
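A fast motion estimation algorithm based on H.264/AVC   Total citations: 5, self-citations: 5, citations by others: 0
A new motion estimation algorithm is proposed for the multiple block-mode motion estimation of the H.264/AVC video coding standard. It covers both integer-pixel and sub-pixel search: an unsymmetrical hybrid diamond grid search (UDiamondGS, unsymmetrical diamond grading integer pixel search) and a hierarchical vector-biased search (HVBFDS, hierarchical vector biased fractional pixel search). Verification and comparison against JM85, provided by the Joint Video Team (JVT), show that the algorithm preserves the original distortion characteristics of the encoder while raising encoding speed by a factor of 16 to 20.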

15.
A Highly Parallel Joint VLSI Architecture for Transforms in H.264/AVC   Total citations: 1, self-citations: 0, citations by others: 1
In H.264/AVC, the concept of adapting the transform size to the block size of the motion-compensated prediction residue has proven to be an important coding tool. This paper presents a highly parallel joint circuit architecture for 8 × 8 and 4 × 4 adaptive block-size transforms in H.264/AVC. By decomposing the 8 × 8 transform into basic 4 × 4 transforms, a unified architecture is designed for both the 8 × 8 and 4 × 4 transforms, and the transform data-path can be efficiently reused for six kinds of transforms, i.e., 8 × 8 forward, 8 × 8 inverse, 4 × 4 forward, 4 × 4 inverse, forward Hadamard, and inverse Hadamard transforms. Linear shift mapping is applied to the memory buffer to support parallel access in both row and column directions, which eliminates the need for a transpose circuit. For a reusable and configurable transform data-path, a multiple-stage pipeline is designed to reduce the critical path length and increase throughput. The design is implemented in UMC 0.18 μm technology at 200 MHz with 13.651 K logic gates and can support a 1,920 × 1,088 30 fps H.264/AVC HDTV decoder.
Contact author: Yu Li

16.
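Several motion estimation algorithms in JM, the reference software of the H.264/AVC standard, are analyzed and studied. By comparing and analyzing core techniques such as motion vector prediction, search pattern design, and early termination, the strengths and weaknesses of these algorithms are identified, and several important principles for designing motion estimation algorithms are summarized.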

17.
In this paper, we propose a hardware architecture for a high‐speed context‐adaptive variable length coding (CAVLC) decoder in H.264. In the CAVLC decoder, the codeword length of the current decoding block is used to determine the next input bitstream (valid bits). Since the computation of valid bits increases the total processing time of CAVLC, we propose two techniques to reduce processing time: one reduces the number of decoding steps by introducing a lookup table, and the other reduces the cycles needed to calculate the valid bits. The proposed CAVLC decoder can decode 1920×1088 30 fps video in real time at a 30.8 MHz clock.
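
To put the closing figure in perspective, the short Python calculation below derives the average cycle budget per macroblock implied by 1920×1088 video at 30 fps and a 30.8 MHz clock. It uses only the numbers quoted in the abstract plus the 16×16 macroblock size of H.264; the function names are hypothetical, and the result is a back-of-the-envelope average, not a measurement reported by the authors.

```python
# Average cycle budget per macroblock for real-time decoding.
# Hypothetical helper names; inputs are the figures quoted in the abstract.

def macroblocks_per_frame(width, height, mb_size=16):
    """Number of 16x16 macroblocks in a frame whose sides are multiples of 16."""
    return (width // mb_size) * (height // mb_size)

def cycles_per_macroblock(width, height, fps, clock_hz):
    """Clock cycles available per macroblock at the given frame rate."""
    return clock_hz / (macroblocks_per_frame(width, height) * fps)

if __name__ == "__main__":
    print(macroblocks_per_frame(1920, 1088))
    print(cycles_per_macroblock(1920, 1088, 30, 30.8e6))
```

A frame holds 120 × 68 = 8160 macroblocks, so 244,800 macroblocks must be handled each second, leaving on average about 125.8 cycles per macroblock for all CAVLC syntax elements in it.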

18.
Deblocking filter is one of the most time consuming modules in the H.264/AVC decoder as indicated in many studies. Therefore, accelerating deblocking filter is critical for improving the overall decoding performance. This paper proposes a novel parallel algorithm for H.264/AVC deblocking filter to speed the H.264/AVC decoder up. We exploit pixel-level data parallelism among filtering steps, and observe that results of each filtering step only affect a limited region of pixels. We call this “the limited propagation effect”. Based on this observation, the proposed algorithm could partition a frame into multiple independent rectangles with arbitrary granularity. The proposed parallel deblocking filter algorithm requires very little synchronization overhead, and provides good scalability. Experimental results show that applying the proposed parallelization method to a SIMD optimized sequential deblocking filter achieves up to 95.31% and 224.07% speedup on a two-core and four-core processor, respectively. We have also observed a significant speedup for H.264/AVC decoding, 21% and 34% on a two-core and four-core processor, respectively.
Contact author: Ja-Ling Wu

Sung-Wen Wang received his Ph.D. degree in computer science from National Taiwan University, Taipei, Taiwan, in 2008. His general research interests are in the field of digital video coding, codec-processor architecture co-design and multimedia systems optimization, especially video coding technology optimization.

Shu-Sian Yang received the B.S. and M.S. degrees in computer science and information engineering from National Taiwan University, Taiwan, in 2005 and 2007, respectively. His current research interests include video compression, image processing, and multimedia applications. He is currently working at PixArt Imaging Inc., HsinChu, Taiwan, as a senior engineer.

Hong-Ming Chen received the B.S. degree in computer science and information engineering from National Taiwan University, Taiwan, in 2007. He is currently pursuing the M.S. degree in the same department at National Taiwan University. His current research interests include video compression, image processing, digital content analysis, and multimedia applications.

Chia-Lin Yang received the B.S. degree from National Taiwan Normal University, Taiwan, R.O.C., in 1989, the M.S. degree from the University of Texas at Austin in 1992, and the Ph.D. degree from the Department of Computer Science, Duke University, Durham, NC, in 2001. In 1993, she joined VLSI Technology Inc. (now Philips Semiconductors) as a Software Engineer. She is currently an Associate Professor in the Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, R.O.C. Her research interests include energy-efficient microarchitectures, memory hierarchy design, and multimedia workload characterization. Dr. Yang is the recipient of a 2000-2001 Intel Foundation Graduate Fellowship Award and a 2005 IBM Faculty Award.

Ja-Ling Wu (SM ’98, Fellow ’08) received his Ph.D. degree in electrical engineering from Tatung Institute of Technology, Taipei, Taiwan, in 1986. From 1986 to 1987, he was an Associate Professor in the Electrical Engineering Department, Tatung Institute of Technology. In 1987, he transferred to the Department of Computer Science and Information Engineering (CSIE), National Taiwan University (NTU), Taipei, where he is presently a Professor. From 1996 to 1998, he was assigned to be the first Head of the CSIE Department, National Chi Nan University, Puli, Taiwan. During his sabbatical leave (from 1998 to 1999), Prof. Wu was invited to be the Chief Technology Officer of Cyberlink Corp. In this one-year term, he was involved in the development of some well-known audio-video software, such as PowerDVD. Since Aug. 2004, Prof. Wu has headed the Graduate Institute of Networking and Multimedia, NTU. Prof. Wu has published more than 200 technical and conference papers. His research interests include digital signal processing, image and video compression, digital content analysis, multimedia systems, digital watermarking, and digital rights management systems. Prof. Wu was the recipient of the Outstanding Young Medal of the Republic of China in 1987 and received the Outstanding Research Award of the National Science Council, Republic of China, three times, in 1998, 2000 and 2004. In 2001, his paper “Hidden Digital Watermark in Images” (co-authored with Prof. Chiou-Ting Hsu), published in IEEE Transactions on Image Processing, was selected as one of the winners of the “Honoring Excellence in Taiwanese Research Award” offered by ISI Thomson Scientific. Moreover, his paper “Tiling Slideshow” (co-authored with his students) won the Best Full Technical Paper Award at ACM Multimedia 2006. Professor Wu was selected as one of the lifetime Distinguished Professors of NTU in November 2006. Prof. Wu was elected an IEEE Fellow, effective 1 January 2008, for his contributions to image and video analysis, coding, digital watermarking, and rights management.

19.
Motion estimation (ME) is the most critical component of a video coding standard. H.264/AVC adopts variable block size motion estimation (VBSME) to obtain excellent coding efficiency, but the high computational complexity makes design difficult. This paper presents an effective processor chip for integer motion estimation (IME) in H.264/AVC based on the full-search block-matching algorithm (FSBMA). It uses an architecture with a configurable 2D systolic array to obtain high data reuse of the search area. This systolic array supports a three-direction scan format in which only one row of pixels changes between two adjacent subblocks, thus reducing memory accesses and saving clock cycles. A computing array of 64 PEs calculates the SAD of the basic 4×4 subblocks, and a modified Lagrangian cost is used as the matching criterion to find the best 41 variable-size blocks by means of a tree pipeline parallel architecture. Finally, a mode decision module uses serial data flow to find the best mode by comparing the total minimum Lagrangian costs. The IME processor chip was designed in UMC 0.18 μm technology, resulting in a circuit with only 32.3 k gates and 6 RAMs (59 kbits of on-chip memory in total). Under typical working conditions (25 °C, 1.8 V), a clock frequency of 300 MHz is estimated, with a processing capacity for HDTV (1920×1088 @ 30 fps) and a search range of 32×32.

20.
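A video information hiding algorithm based on H.264/AVC   Total citations: 4, self-citations: 0, citations by others: 4
胡洋, 张春田, 苏育挺. 《电子学报》 (Acta Electronica Sinica), 2008, 36(4): 690-694
In the intra prediction stage of H.264/AVC, information is hidden by modulating the intra prediction modes of the 4×4 luma blocks of I frames. The modulation follows a mapping rule between the prediction mode and the bits to be hidden. The exact positions of the host 4×4 blocks are determined by the characteristics of each block together with an embedding position template specified by a key. Extraction requires neither the original video content nor full decoding; decoding the intra prediction modes in the bitstream is sufficient.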
