5 results found (search time: 568 ms)
1.
We present an automated framework that partitions the code and data types of an object-oriented application for the needs of data management. The goal is to identify the data types that are crucial from a data management perspective and to separate them from the rest of the code. In this way, the design complexity is reduced, allowing the designer to focus on the important parts of the code for further refinement and optimization. To achieve this, static and dynamic analyses are performed on the initial C++ specification code. Based on the analysis results, the application's data types are characterized as crucial or non-crucial. The initial code is then rewritten automatically so that the crucial data types, and the code portions that manipulate them, are separated from the rest of the code. Experiments on well-known multimedia and telecom applications demonstrate the correctness of the automated analysis and code rewriting, as well as the applicability of the framework in terms of execution time and memory requirements. Comparisons with Rational's Quantify™ suite show that Quantify™ fails to analyze the initial code correctly for the needs of data management.
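The crucial/non-crucial characterization described above can be illustrated with a minimal sketch. All names, statistics, and the threshold below are invented for illustration; the actual framework operates on C++ source via static and dynamic analysis:

```python
# Hypothetical sketch: classify data types as "crucial" for data management
# based on dynamic-analysis statistics (per-type access counts). The 10%
# access-share threshold is an illustrative assumption, not the paper's rule.

def classify_types(profile, access_share_threshold=0.10):
    """profile: dict mapping type name -> (access_count, bytes_allocated)."""
    total_accesses = sum(a for a, _ in profile.values())
    crucial, non_crucial = [], []
    for type_name, (accesses, _) in profile.items():
        share = accesses / total_accesses if total_accesses else 0.0
        (crucial if share >= access_share_threshold else non_crucial).append(type_name)
    return crucial, non_crucial

# Invented profile for a streaming-style application
profile = {
    "PacketBuffer": (900_000, 4 << 20),   # heavily accessed container
    "FrameQueue":   (250_000, 1 << 20),
    "ConfigRecord": (1_200,   4096),      # touched only at start-up
}
crucial, non_crucial = classify_types(profile)
```

Once the split is known, a rewriting pass can move the crucial types and the code manipulating them into a separate module for focused refinement.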
2.
We present an architecture of decoupled processors with a memory hierarchy consisting only of scratch-pad memories and a main memory. This architecture exploits the more efficient prefetching of decoupled processors, which take advantage of the parallelism between address computation and application data processing that exists mainly in streaming applications. Combined with the ability of scratch-pad memories to store data with no conflict misses and low energy per access, this contributes significantly to increasing the system's performance. The application code is split into two parallel programs: the first runs on the Access processor and computes the addresses of the data in the memory hierarchy; the second processes the application data and runs on the Execute processor, a processor with a limited address space of just the register-file addresses. Every transfer of a block in the memory hierarchy, up to the Execute processor's register file, is controlled by the Access processor and the DMA units. This strongly differentiates the architecture from traditional uniprocessors and from existing decoupled processors with cache-based memory hierarchies. The architecture is compared in performance with uniprocessor architectures with (a) scratch-pad and (b) cache memory hierarchies, and with (c) existing decoupled architectures, showing higher normalized performance. The gain stems from the efficient data transfers that the scratch-pad memory hierarchy provides, combined with the decoupled processors' ability to hide memory latency by using memory-management techniques for transferring data instead of fixed prefetching methods. Experimental results show that performance is increased by up to almost 2 times compared to uniprocessor architectures with scratch-pad memories, and by up to 3.7 times compared to those with caches. The proposed architecture achieves this performance without penalties in energy-delay product.
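The access/execute split can be sketched abstractly: one side resolves addresses and stages blocks into a scratch-pad-like buffer, while the other only ever sees staged operands, never addresses. This is a purely illustrative model, not the paper's architecture or ISA:

```python
from collections import deque

# Illustrative model of the decoupled split: the "access" side performs all
# address computation and stages data (as the Access processor and DMA units
# would into scratch-pad); the "execute" side consumes staged operands only.

def access_processor(memory, addresses, stage):
    for addr in addresses:                    # address computation + staging
        stage.append(memory[addr])

def execute_processor(stage, op):
    result = []
    while stage:
        result.append(op(stage.popleft()))    # pure data processing, no addresses
    return result

memory = {100 + 4 * i: i * i for i in range(8)}   # word-addressed toy memory
stage = deque()
access_processor(memory, [100 + 4 * i for i in range(8)], stage)
out = execute_processor(stage, lambda x: x + 1)
```

Because staging runs ahead of consumption, the execute side never stalls on address computation, which is the source of the prefetching benefit the abstract describes.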
3.
In this work, the authors propose a microscopic particle tracking system based on their previous work (Tien et al. in Exp Fluids 44(6):1015–1026, 2008). A three-pinhole plate, color-coded with color filters of different wavelengths, is used to create a triple-exposure pattern on the image-sensor plane for each particle, so that each color channel of the color camera acts as an independent image sensor. This modification triples the particle image density of the original monochrome system and eliminates the ambiguities caused by overlapping triangular exposure patterns. A novel lighting method and a color-separation algorithm are proposed to overcome measurement errors due to crosstalk between the color filters. A complete post-processing procedure is developed to identify, locate, and track the Lagrangian motion of the tracer particles and to reconstruct the flow field; it includes a cascade correlation peak-finding algorithm to resolve overlapping particles, a calibration-based method that computes the depth location via an epipolar line search, and a vision-based particle tracking algorithm. The flow is imaged through a 10X infinity-corrected microscope, back-lit by three individual high-power color LEDs, each aligned with one of the pinholes. The imaging volume is 600 × 600 × 600 μm³. Experimental verification shows that the location uncertainties are less than 0.10 and 0.08 μm for the in-plane components and less than 0.82 μm for the out-of-plane component. The displacement uncertainties are 0.62 and 0.63 μm for the in-plane components and 0.77 μm for the out-of-plane component. The technique is applied to measure flow over a backward-facing step in a micro-channel with a channel/step height of 600/250 μm. A steady flow with low particle density and an accelerating flow with high particle density are measured and compared to validate the flow field resolved by a two-frame tracking method. The Reynolds number in this work varies from 0.033 to 0.825. A total of 20,592 vectors are reconstructed by time-averaged tracking of 156 image pairs for the steady-flow case, and roughly 400 vectors per image pair are reconstructed by two-frame tracking for the accelerating-flow case.
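The geometric idea behind the triple-exposure decoding can be sketched as follows: the three pinhole images of one particle form a triangle whose centroid gives the in-plane position, while the pattern's size maps to depth through calibration. The linear calibration and all numbers below are hypothetical; the actual system uses a calibration-based epipolar line search:

```python
# Illustrative decoding of a three-pinhole triple-exposure pattern.
# spots: the three spot positions (one per color channel), in pixels.
# The linear size-to-depth calibration here is a hypothetical stand-in.

def decode_pattern(spots, depth_gain_um_per_px=2.5, depth_offset_um=0.0):
    xs = [s[0] for s in spots]
    ys = [s[1] for s in spots]
    cx, cy = sum(xs) / 3.0, sum(ys) / 3.0          # in-plane centroid
    # mean spot distance from the centroid as a pattern-size measure
    size = sum(((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 for x, y in spots) / 3.0
    z = depth_gain_um_per_px * size + depth_offset_um
    return (cx, cy, z)

# One particle's spots as seen in the red/green/blue channels (invented)
spots = [(10.0, 10.0), (14.0, 10.0), (12.0, 13.0)]
x, y, z = decode_pattern(spots)
```

Separating the three exposures into independent color channels is what removes the pattern-overlap ambiguity: each channel contains only one spot per particle.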
4.
In this paper, we propose a methodology for accelerating application segments by partitioning them between reconfigurable hardware blocks of different granularity. Critical parts are sped up on the coarse-grain reconfigurable hardware to meet the timing requirements of the application code mapped onto the reconfigurable logic. The reconfigurable processing units are embedded in a generic hybrid system architecture that can model a large number of existing heterogeneous reconfigurable platforms. The fine-grain reconfigurable logic is realized by an FPGA unit, while the coarse-grain reconfigurable hardware is realized by our high-performance data-path. The methodology consists mainly of three stages: analysis, mapping of the application parts onto fine- and coarse-grain reconfigurable hardware, and the partitioning engine. A prototype software framework realizes the partitioning flow. The methodology is validated on five real-life applications. Analytical partitioning experiments show that the speedup relative to an all-FPGA mapping ranges from 1.5 to 4.0, while the specified timing constraints are satisfied for all applications.
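The partitioning decision between coarse- and fine-grain fabric can be sketched as a simple timing-driven assignment. Kernel names, speedup figures, and the greedy rule below are invented for illustration and are not the paper's partitioning engine:

```python
# Hypothetical partitioning sketch: place each critical kernel on the
# coarse-grain data-path when that meets its timing constraint, otherwise
# fall back to the fine-grain (FPGA) fabric, otherwise leave it in software.

def partition(kernels):
    """kernels: list of (name, sw_time, coarse_speedup, fpga_speedup, deadline)."""
    mapping = {}
    for name, sw_time, cg_sp, fpga_sp, deadline in kernels:
        if sw_time / cg_sp <= deadline:        # coarse-grain meets timing
            mapping[name] = "coarse-grain"
        elif sw_time / fpga_sp <= deadline:    # fine-grain as fallback
            mapping[name] = "fine-grain"
        else:
            mapping[name] = "software"         # no hardware option meets timing
    return mapping

# Invented kernels: (name, SW time, coarse speedup, FPGA speedup, deadline)
kernels = [
    ("fft",    40.0, 8.0, 4.0, 6.0),   # 40/8 = 5.0 <= 6  -> coarse-grain
    ("filter", 30.0, 3.0, 6.0, 6.0),   # 30/3 = 10 > 6; 30/6 = 5 <= 6 -> fine-grain
    ("parser", 50.0, 2.0, 2.5, 10.0),  # 25 and 20 both > 10 -> software
]
mapping = partition(kernels)
```

Preferring the coarse-grain data-path when it suffices mirrors the methodology's goal: critical parts go to the coarse-grain hardware, and the FPGA handles what remains.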
5.
In this paper, we propose a method for speeding up digital signal processing applications by partitioning them between reconfigurable hardware blocks of different granularity and mapping the critical parts onto coarse-grain reconfigurable hardware. The reconfigurable hardware blocks are embedded in a heterogeneous reconfigurable system architecture. The fine-grain part is implemented by an embedded FPGA unit, while the coarse-grain reconfigurable hardware uses our high-performance coarse-grain data-path. The design flow consists mainly of three steps: the analysis procedure, the mapping onto the coarse-grain blocks, and the mapping onto the fine-grain hardware. The methodology is validated on five real-life applications: an OFDM transmitter, a medical imaging technique, a wavelet-based image compressor, a video compression scheme, and a JPEG encoder. Experimental results show that the speedup relative to an all-FPGA solution ranges from 1.55 to 4.17 for the considered applications.