
1.

Background memory area estimation for multidimensional signal processing systems

Balasa F. Catthoor F. Hugo De Man 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(2):157-172

Memory cost is responsible for a large amount of the chip and/or board area of customized video and image processing system realizations. In this paper, we present a novel technique, founded on data-flow analysis, which allows one to address the problem of background memory size evaluation for a given nonprocedural algorithm specification, operating on multidimensional signals with affine indexes. Most of the target applications are characterized by a huge number of signals, so a new polyhedral data-flow model operating on groups of scalar signals is proposed. These groups are obtained by a novel analytical partitioning technique, which allows a desired granularity to be selected depending on the application complexity. The method incorporates a way to trade off memory size against computational and controller complexity.

2.

A combined DMA and application-specific prefetching approach for tackling the memory latency bottleneck

Dasygenis M. Brockmeyer E. Durinck B. Catthoor F. Soudris D. Thanailakis A. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(3):279-291

Memory latency has always been a major issue in embedded systems that execute memory-intensive applications. This is even more true as the gap between processor and memory speed continues to grow. Hardware and software prefetching have been shown to be effective in tolerating the large memory latencies inherent in large off-chip memories; however, both types of prefetching have their shortcomings. Hardware schemes are more complex and require extra circuitry to compute data access strides, while software schemes generate prefetch instructions, which, if not computed carefully, may hamper performance. On the other hand, some application domains (such as multimedia) have a uniform memory access pattern that is known a priori and that, if exploited, could yield significant application performance improvement. With this characteristic in mind, we present our findings on hiding memory latency using the direct memory access (DMA) mode, which is present in all modern systems, combined with a software prefetch mechanism and a customized on-chip memory hierarchy mapping. Compared to previous approaches, we are able to estimate the performance and power metrics without actually implementing the embedded system. Experimental results on nine well-known multimedia and imaging applications prove the efficiency of our technique. Finally, we verify the performance estimations by implementing and simulating the algorithms on the TI C6201 processor.
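As a rough illustration of why overlapping DMA transfers with computation hides latency when the access pattern is known in advance, the following minimal sketch compares a fully sequential schedule with a double-buffered one. It is a toy analytical model under stated assumptions, not the estimation method of the paper: the function names, the uniform per-block transfer and compute times, and the example figures are all hypothetical.

```python
def sequential_time(num_blocks, transfer, compute):
    """Total time when each block is fetched and then processed, with no overlap."""
    return num_blocks * (transfer + compute)


def double_buffered_time(num_blocks, transfer, compute):
    """Total time when the fetch of block i+1 overlaps the processing of block i.

    Assumes (hypothetically) uniform block costs and two on-chip buffers.
    """
    if num_blocks == 0:
        return 0
    # The first fetch cannot be hidden; each later step costs the slower of the
    # two overlapped activities; the last block still has to be processed.
    return transfer + (num_blocks - 1) * max(transfer, compute) + compute


# Hypothetical figures: 4 blocks, 3 time units per transfer, 5 per computation.
print(sequential_time(4, 3, 5))       # 32
print(double_buffered_time(4, 3, 5))  # 23
```

In this model the transfer cost disappears from the steady state whenever the computation on a block takes at least as long as fetching the next one.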

3.

Energy Aware Algorithm and Implementation of SDR Oriented HSDPA Chip Level Equalizer

Min Li Bruno Bougard Liesbet Van Der Perre Francky Catthoor 《Journal of Signal Processing Systems》2009,56(2-3):327-340

The flexibility and programmability of SDR come at the expense of reduced efficiency and increased energy consumption. This is usually considered the penalty of SDR. However, flexibility and programmability have great potential for improving system-wide efficiency if they are properly exploited. In this paper, we present an HSDPA chip equalizer that is explicitly designed for SDR implementations. The first SDR-specific feature of our work is the multi-mode operation based on heterogeneous algorithms. The proposed equalizer combines an optimized LMS variant (with subspace-aware extension) and an optimized SRI-RLS algorithm based on QRD. Instead of always applying the powerful SRI-RLS algorithm, the equalizer switches to the simple LMS variant when possible. With negligible BER degradation, the multi-mode operation can reduce the cycle count by 60% on the TI TMS320C6713 for 3GPP case 4 with 16QAM modulation. The proposed equalizer framework also incorporates a generic, robust and efficient scheme for equalization-length adaptation. The length-adaptation scheme can make very fast run-time decisions based on an efficient policy template, which is optimized with a large training set at design time. We test 14 representative channel profiles specified in ITU-R M.1225, 3GPP TR 25.943 and 3GPP TS 25.101. Compared to a worst-case-based design, the length adaptation achieves more than 10× cycle-count reductions for ten of the cases.

4.

Design exploration of a NVM based hybrid instruction memory organization for embedded platforms

Manu Perumkunnil Komalan José Ignacio Gómez Pérez Christian Tenllado José Miguel Montañana Antonio Artés José Francisco Tirado Fernández Francky Catthoor 《Design Automation for Embedded Systems》2013,17(3-4):459-483


5.

Exploiting Varying Resource Requirements in Wavelet-based Applications in Dynamic Execution Environments

Bert Geelen Vissarion Ferentinos Francky Catthoor Spyridon Toulatos Gauthier Lafruit Thanos Stouraitis Rudy Lauwereins Diederik Verkest 《Journal of Signal Processing Systems》2009,56(2-3):125-139

In the context of future dynamic applications, systems will exhibit unpredictably varying platform resource requirements. To deal with this, they will not only need to be programmable in terms of instruction set processors, but at least partial reconfigurability will also be required. In this context, it is important for applications to optimally exploit the memory hierarchy under varying memory availability. This article presents a mapping strategy for wavelet-based applications: depending on the encountered conditions, it switches to different memory-optimized instantiations or localizations, permitting up to 51% energy gains in memory accesses. Systematic and parameterized mapping guidelines indicate which localization should be selected when, for varying algorithmic wavelet parameters. The results have been formalized and generalized to be applicable to more general wavelet-based applications.

6.

Green Reconfigurable Radio Systems (Cited by: 1; self-citations: 0; citations by others: 1)

Dejonghe A. Bougard B. Pollin S. Craninckx J. Bourdoux A. Van der Perre L. Catthoor F. 《Signal Processing Magazine, IEEE》2007,24(3):90-101

The wireless standards scene and its evolution strengthen the need for functional flexibility in future radios. Multimode terminals supporting an increasingly large variety of standards (cellular, WLANs, WMANs, WPANs) are subject to a cost increase that is addressed by more flexible radio interfaces. Energy efficiency, however, is the main obstacle to successfully deploying such reconfigurable radios. To address this, it is essential to design energy-scalable SDRs, both for the radio front-end and the digital baseband platform. Complementing this, an essential ingredient is an intelligent controller that optimally exploits this scalability and the run-time dynamics to translate potential energy scalability into actual low-power operation. To realize this goal, an energy-aware cross-layer radio management framework is introduced. It was instantiated in different case studies, showing the applicability of this approach in realistic setups. Results have shown that substantial gains can be achieved through effective cross-layer optimization and problem partitioning. Next, it was shown that SDRs will play a crucial role in enabling CRs, which will enable saving on both the scarce radio spectrum and battery lifetime. A key building block for the design of such CRs, i.e., the appropriate control intelligence to make the SDR platform cognitive, can be derived by incrementally building on the proposed framework. As a result, green (or environment-friendly) reconfigurable radio systems will emerge, which offer a wide variety and ubiquitous availability of wireless services, while overcoming energy and spectrum scarcity.

7.

Energy Aware Signal Processing for Software Defined Radio Baseband Implementation

Min Li David Novo Bruno Bougard Claude Desset Antoine Dejonghe Liesbet Van Der Perre Francky Catthoor 《Journal of Signal Processing Systems》2011,63(1):13-25

The rapid diversification and evolution of wireless communications require a wide variety of baseband implementations within a short time-to-market. In addition, the exponentially increasing design complexity and design cost of deep sub-micron silicon make it highly desirable to reuse designs as much as possible. This yields an increasing demand for reconfigurable/programmable baseband solutions. Implementing all baseband functionalities on programmable architectures, as foreseen in the tier-2 SDR, will become necessary in the future. However, the energy efficiency of SDR baseband platforms is a major concern. This creates a challenging gap that is continuously widened by the exploding baseband complexity. We advocate a system-level approach to bridge the gap. Specifically, we fully leverage the advantages (programmability) of SDR platforms to compensate for their disadvantages (energy efficiency). Highly flexible and dynamic baseband signal processing algorithms are designed and implemented to exploit the abundant dynamics in the environment and the user requirements. Instead of always performing at best effort, the baseband can dynamically and autonomously adjust its workload to optimize the average energy consumption. In this paper, we introduce such baseband signal processing techniques optimized for SDR implementations. The methodology and design steps are presented together with three representative case studies in HSDPA, WiMAX and 3GPP LTE.

8.

Experience with Widening Based Equivalence Checking in Realistic Multimedia Systems

Sven Verdoolaege Martin Palkovič Maurice Bruynooghe Gerda Janssens Francky Catthoor 《Journal of Electronic Testing》2010,26(2):279-292

The application of loop and data transformations to array- and loop-intensive programs is crucial to obtain good performance. Designers often apply these transformations manually or semi-automatically. For the class of static affine programs, automatic methods exist for proving the correctness of these transformations. Realistic multimedia systems, however, often contain constructs that fall outside of this class. We present an extension of a widening-based approach to handle the most relevant of these constructs, viz. accesses to array slices, data-dependent accesses and data-dependent assignments, and report on some experiments with non-trivial applications.

9.

Minimizing the required memory bandwidth in VLSI system realizations

Wuytack S. Catthoor F. De Jong G. De Man H.J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(4):433-441

In this paper, we present the problem of storage bandwidth optimization (SBO) in VLSI system realizations. Our goal is to minimize the required memory bandwidth within the given cycle budget by adding ordering constraints to the flow graph. This allows the subsequent memory allocation and assignment tasks to come up with a cheaper memory architecture with fewer memories and memory ports. The importance and the effect of SBO are shown on realistic examples in both the video and asynchronous transfer-mode (ATM) domains. We show that it is important to take into account which data is being accessed in parallel, instead of only considering the number of simultaneous memory accesses. Our problem formulation leads to the optimization of a conflict (hyper)graph. For the target domain of ATM, only flat graphs without loops have to be treated. For this subproblem, a prototype tool has been implemented to demonstrate the feasibility of automating this important system design step.
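The point that it matters which data is accessed in parallel, and not only how many accesses happen at once, can be sketched with a small conflict graph. The code below is an illustrative assumption, not the SBO algorithm of the paper: it builds a plain (not hyper) conflict graph from a hypothetical access schedule and uses a simple greedy colouring as a stand-in for assigning arrays to single-port memories. All function names, array names and schedules are made up for the example.

```python
from itertools import combinations


def build_conflict_graph(schedule):
    """Map each array to the set of arrays accessed in the same cycle as it.

    `schedule` is a list of cycles; each cycle is a tuple of array names.
    """
    graph = {}
    for cycle in schedule:
        arrays = sorted(set(cycle))
        for name in arrays:
            graph.setdefault(name, set())
        for a, b in combinations(arrays, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def greedy_memory_assignment(graph):
    """Greedy colouring: conflicting arrays are placed in different memories."""
    assignment = {}
    # Visit the most constrained arrays first; ties are broken by name.
    for name in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {assignment[m] for m in graph[name] if m in assignment}
        memory = 0
        while memory in used:
            memory += 1
        assignment[name] = memory
    return assignment


def memories_needed(schedule):
    assignment = greedy_memory_assignment(build_conflict_graph(schedule))
    return max(assignment.values()) + 1 if assignment else 0


# Two hypothetical 3-cycle schedules, each with at most two parallel accesses.
schedule_a = [("A", "B"), ("A", "C"), ("B", "C")]  # every pair conflicts
schedule_b = [("A", "B"), ("A", "C"), ("A", "B")]  # B and C never conflict

print(memories_needed(schedule_a))  # 3
print(memories_needed(schedule_b))  # 2
```

Both schedules have the same peak number of simultaneous accesses, yet the first forces three separate single-port memories while the second allows B and C to share one. Greedy colouring is only a heuristic and is used here purely to make the contrast visible.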

10.

A Framework for Data Partitioning for C++ Data-Intensive Applications

Milidonis A. Dimitroulakos G. Galanis M. D. Kakarountas A. P. Theodoridis G. Goutis C. Catthoor F. 《Design Automation for Embedded Systems》2004,9(2):101-121

We present an automated framework that partitions the code and data types for the needs of data management in object-oriented source code. The goal is to identify the crucial data types from a data management perspective and separate these from the rest of the code. In this way, the design complexity is reduced, allowing the designer to easily focus on the important parts of the code to perform further refinements and optimizations. To achieve this, static and dynamic analysis is performed on the initial C++ specification code. Based on the analysis results, the data types of the application are characterized as crucial or non-crucial. Next, the initial code is rewritten automatically in such a way that the crucial data types and the code portions that manipulate them are separated from the rest of the code. Experiments on well-known multimedia and telecom applications demonstrate the correctness of the performed automated analysis and code rewriting, as well as the applicability of the introduced framework in terms of execution time and memory requirements. Comparisons with Rational’s Quantify™ suite show the failure of Quantify™ to correctly analyze the initial code for the needs of data management.