
1.

A floating-point genetic algorithm for solving the unit commitment problem

Chuangyin Dang Minqiang Li 《European Journal of Operational Research》2007

This paper proposes a floating-point genetic algorithm (FPGA) to solve the unit commitment problem (UCP). Based on the characteristics of typical load demand, a floating-point chromosome representation and an encoding–decoding scheme are designed to reduce the complexity of handling the minimum up/down time limits. The strategic parameters of the FPGA are characterized in detail: the evaluation function and its constraints, the population size, the selection scheme, the crossover operation and its probability, and the mutation operation and its probability. A dynamic combination scheme of genetic operators is formulated to balance exploration and exploitation in the non-convex solution space with its multimodal objective function. Experimental results show that the FPGA is more effective than other styles of genetic algorithm and can be applied to practical scheduling tasks in utility power systems.

2.

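Design and Implementation of a High-Speed Floating-Point Adder

Tang Shiqing (唐世庆) Yin Yongsheng (尹勇生) Liu Cong (刘聪) 《微电子学与计算机》 (Microelectronics & Computer) 2003,20(8):163-166

The floating-point adder is the core arithmetic unit of a coprocessor and the basis for executing the various floating-point instructions, so optimizing its design is a key route to improving the speed and precision of floating-point arithmetic. The article presents a design method from the standpoints of the adder algorithm and its circuit implementation. It proposes a carry chain that combines dynamic and static logic, together with an area/speed trade-off for leading-zero prediction. The combined dynamic/static carry chain effectively reduces power consumption, increases speed, and improves performance. The adder has been embedded in a coprocessor design and has passed tape-out testing.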

3.

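Design of a Double-Precision Floating-Point Multiplier  Total citations: 2, self-citations: 0, citations by others: 2

He Jing (何晶) Han Yueqiu (韩月秋) 《微电子学》 (Microelectronics) 2003,33(4):331-334

A double-precision floating-point multiplier is designed. The device uses a modified Booth algorithm to generate the partial products and a hybrid array-and-tree structure to sum them. A fast rounding algorithm is also adopted to improve the multiplier's performance. The multiplier is divided into a four-stage pipeline and verified by simulation on an FPGA, with correct results. The timing results of the FPGA implementation are also analyzed.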
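
As an illustration of how a modified (radix-4) Booth algorithm produces partial products, here is a minimal Python sketch on plain integers. It is not the circuit from the article. The function names `booth_radix4_digits` and `booth_partial_products` are hypothetical, and the sketch assumes the multiplier fits in `bits` bits of two's complement.

```python
def booth_radix4_digits(multiplier, bits):
    """Recode a two's-complement multiplier into radix-4 Booth digits in {-2..2}.

    Digit k is y[2k-1] + y[2k] - 2*y[2k+1], with y[-1] = 0 (hypothetical helper).
    """
    if bits % 2:
        bits += 1  # sign-extend to an even width; Python ints sign-extend on >>
    digits = []
    prev = 0  # the implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        low = (multiplier >> i) & 1
        high = (multiplier >> (i + 1)) & 1
        digits.append(low + prev - 2 * high)
        prev = high
    return digits


def booth_partial_products(multiplicand, multiplier, bits):
    """Return one shifted partial product per Booth digit (about bits/2 of them)."""
    return [(d * multiplicand) << (2 * k)
            for k, d in enumerate(booth_radix4_digits(multiplier, bits))]


# Summing the partial products reproduces the ordinary product.
print(booth_radix4_digits(7, 4))            # digits, least significant first
print(sum(booth_partial_products(5, 7, 4)))
```

Each digit selects 0, ±1 or ±2 times the multiplicand, so the recoding halves the number of partial products that the adder array or tree has to sum.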

4.

Computation of exact inertia and inclusions of eigenvalues (singular values) of tridiagonal (bidiagonal) matrices

K.V. Fernando 《Linear algebra and its applications》2007,422(1):77-99

This report may be considered as a non-trivial extension of an unpublished report by William Kahan (Accurate Eigenvalues of a symmetric tri-diagonal matrix, Technical Report CS 41, Computer Science Department, Stanford University, 1966). His interplay between matrix theory and computer arithmetic led to the development of algorithms for computing accurate eigenvalues and singular values. His report is generally considered as the precursor for the development of IEEE standard 754 for binary arithmetic. This standard has been universally adopted by virtually all PC, workstation and midrange hardware manufacturers and tens of billions of such machines have been produced. Now we use the features in this standard to improve the original algorithm. In this paper, we describe an algorithm in floating-point arithmetic to compute the exact inertia of a real symmetric (shifted) tridiagonal matrix. The inertia, denoted by the integer triplet (π, ν, ζ), is defined as the number of positive, negative and zero eigenvalues of a real symmetric (or complex Hermitian) matrix and the adjective exact refers to the eigenvalues computed in exact arithmetic. This requires the floating-point computation of the diagonal matrix D of the LDL^t factorization of the shifted tridiagonal matrix T − τI with the +∞ and −∞ rounding modes defined in the IEEE 754 standard. We are not aware of any other algorithm which gives the exact answer to a numerical problem when implemented in floating-point arithmetic in standard working precisions. The guaranteed intervals for eigenvalues are obtained by bisection or multisection with this exact inertia information. Similarly, using the Golub-Kahan form, guaranteed intervals for singular values of bidiagonal matrices can be computed. The diameter of the eigenvalue (singular value) intervals depends on the number of shifts with inconsistent inertia in two rounding modes.
Our algorithm not only guarantees the accuracy of the solutions but is also consistent across different IEEE 754 standard compliant architectures. The unprecedented accuracy provided by our algorithms could also be used to debug and validate standard floating-point algorithms for computation of eigenvalues (singular values). Accurate eigenvalues (singular values) are also required by certain algorithms to compute accurate eigenvectors (singular vectors). We demonstrate the accuracy of our algorithms by using standard matrix examples. For the Wilkinson matrix, the eigenvalues (in IEEE double precision) are very accurate with an (open) interval diameter of 6 ulps (units of the last place held of the mantissa) for one of the eigenvalues and smaller (down to 2 ulps) for the others. These results are consistent across many architectures including Intel, AMD, SGI and DEC Alpha. However, by enabling IEEE double extended precision arithmetic in Intel/AMD 32-bit architectures at no extra computational cost, the (open) interval diameters were reduced to one ulp, which is the best possible solution for this problem. We have also computed the eigenvalues of a tridiagonal matrix which manifests in Gauss-Laguerre quadrature and the results are extremely good in double extended precision but less so in double precision. To demonstrate the accuracy of computed singular values, we have also computed the eigenvalues of the Kac₃₀ matrix, which is the Golub-Kahan form of a bidiagonal matrix. The tridiagonal matrix has known integer eigenvalues. The bidiagonal Cholesky factor of the Gauss-Laguerre tridiagonal is also included in the singular value study.
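
To make the inertia count concrete, the following Python sketch computes the pivots of the LDL^t factorization of T − τI and counts their signs. It swaps in exact rational arithmetic (`fractions.Fraction`) for the directed-rounding floating-point scheme the abstract describes, so it illustrates the quantity being computed, not the paper's algorithm. The name `exact_inertia` is hypothetical, and the sketch does not handle a zero pivot before the last row.

```python
from fractions import Fraction


def exact_inertia(diag, offdiag, shift):
    """Return (positive, negative, zero) eigenvalue counts of T - shift*I.

    T is symmetric tridiagonal with diagonal `diag` and off-diagonal `offdiag`.
    Pivots follow d[0] = a[0] - tau, d[i] = (a[i] - tau) - b[i-1]**2 / d[i-1];
    by Sylvester's law of inertia their signs match those of the eigenvalues.
    Floats are converted to Fraction exactly, so no rounding occurs.
    """
    tau = Fraction(shift)
    pos = neg = zero = 0
    d = None
    for i, a in enumerate(diag):
        d_new = Fraction(a) - tau
        if i > 0:
            if d == 0:
                # Assumption of this sketch: only the last pivot may vanish.
                raise ZeroDivisionError("zero pivot before the last row")
            d_new -= Fraction(offdiag[i - 1]) ** 2 / d
        d = d_new
        if d > 0:
            pos += 1
        elif d < 0:
            neg += 1
        else:
            zero += 1
    return pos, neg, zero


# T = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
for tau in (0, 1, 2.5, 4):
    print(tau, exact_inertia([2, 2], [1], tau))
```

The negative count at a shift τ is the number of eigenvalues below τ, which is the information a bisection step needs to decide which half of an interval contains a given eigenvalue.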

5.

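Implementation and Optimization of Arithmetic Pipelines

Chen Xian (陈弦) Yu Lunzheng (于伦正) 《微电子学与计算机》 (Microelectronics & Computer) 2006,23(1):134-136,139

Building on an analysis of pipeline performance, and taking a double-precision floating-point arithmetic pipeline as an example, the article describes a method for implementing multiple arithmetic pipelines. For a single pipeline, it details design optimizations from two angles: the design structure and the division of the operation into stages. It then compares the optimized pipelined design with a conventional pipelined design in terms of reliability and speed, and finds that the speed can be nearly doubled.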

6.

Precise evaluation of a polynomial at a point given in staggered correction format

Luc Paquet 《Journal of Computational and Applied Mathematics》1994,50(1-3):433-454

The problem of the evaluation in floating-point arithmetic of a polynomial with floating-point coefficients at a point which is a finite sum of floating-point numbers is studied. The solution is obtained as an infinite convergent series of floating-point numbers. The algorithm requires a precise scalar product, but this can always be implemented by software in a high-level language without assembly language routines as we indicate. A convergence result is proved under a very weak restriction on the size of the degree of the polynomial in terms of the unit roundoff u; roughly speaking, the degree should not be larger than the square root of (1 + u)/(2u). Even in the particular case when the point at which to evaluate the polynomial reduces to one floating-point number, we find a new simplified algorithm among the whole family that the preceding convergence result allows.

This problem occurs, among others, in the convergence of the Newton method to some real root of the given polynomial p. If we simply use the Horner scheme to evaluate the polynomial p in a neighbourhood of the root, in some cases the evaluation will contain no correct digits and will prevent us from getting convergence even to machine accuracy. The convergence of iterative methods, among which the Newton method, with added perturbations was the central theme of my talk given at the ICCAM'92. The second part will appear in a forthcoming paper. These added perturbations can represent for example forward or backward errors occurring in finite-precision computations.

The problem discussed here appears in validating some hypotheses of these general convergence results (see the forthcoming paper).
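
The loss of all correct digits under the Horner scheme near a root can be reproduced in a few lines. The Python sketch below compares Horner's rule in double precision with an exact rational reference. It is not the series algorithm of the paper. The names `horner_float` and `horner_exact` are hypothetical, and the exact reference takes the evaluation point as a list of floats whose sum is the point, loosely mirroring the staggered correction format.

```python
from fractions import Fraction


def horner_float(coeffs, x):
    """Horner's rule in ordinary floating point; coeffs are highest degree first."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def horner_exact(coeffs, point_parts):
    """Exact reference: the point is the exact sum of the floats in point_parts."""
    x = sum(Fraction(p) for p in point_parts)
    acc = Fraction(0)
    for c in coeffs:
        acc = acc * x + Fraction(c)
    return acc


# p(x) = (x - 1)**3 expanded, evaluated close to its triple root at 1.
coeffs = [1.0, -3.0, 3.0, -1.0]
x = 1.0 + 2.0 ** -20                       # exactly representable in double precision
print(horner_float(coeffs, x))             # floating-point Horner result
print(float(horner_exact(coeffs, [1.0, 2.0 ** -20])))  # correctly rounded exact value
```

For this input the floating-point Horner result is 0.0 while the exact value is 2^-60, so a Newton iteration driven by the Horner value would see a zero residual and stop short of the root.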

7.

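Research on Medical Image Segmentation Based on an Improved Watershed Algorithm  Total citations: 12, self-citations: 0, citations by others: 12

Liu Xiying (刘喜英) Wu Shuquan (吴淑泉) Xu Xiangmin (徐向民) 《微电子技术》 (Microelectronic Technology) 2003,31(4):39-42

Taking the characteristics of medical images into account, this paper improves the watershed algorithm for medical images and applies it to their segmentation. The watershed algorithm is commonly used for this kind of segmentation, but it suffers from severe over-segmentation. The paper describes several improvements built on the watershed algorithm. Before the watershed step, a floating-point activity image is introduced as the input to the watershed algorithm. After the watershed step, building on area control, some of the small segmented regions are merged into larger neighbouring regions according to combined area-control and contrast-control criteria. The improved method suppresses over-segmentation well, and small lesion regions in the medical images are segmented out, with good results.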

8.

Multi-functional floating-point MAF designs with dot product support

Mustafa Gök Metin Mete Özbilen 《Microelectronics Journal》2008,39(1):30-43

This paper presents multi-functional double-precision and quadruple-precision floating-point multiply-add fused (FPMAF) designs. The double-precision FPMAF design can execute a double-precision floating-point multiply-add, two single-precision floating-point multiplications, or a single-precision floating-point dot product. The quadruple-precision FPMAF can perform similar operations with quadruple-, double- and single-precision operands. These architectures can perform a dot-product operation at least twice as fast as a basic FPMAF design. The presented multi-functional designs are compared with basic double-precision and quadruple-precision FPMAF designs by ASIC synthesis. The synthesis results show that the proposed double-precision implementation has 8% more area than a standard double-precision FPMAF implementation, and the proposed quadruple-precision design has 12.5% more area than a standard quadruple-precision FPMAF. Both proposed designs have one more pipeline stage than the basic designs.
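
The numerical effect of fusing the multiply and the add can be shown in software. The Python sketch below emulates a double-precision fused multiply-add by forming a*b + c exactly with `fractions.Fraction` and rounding once, then uses it in a dot product. It models only the arithmetic, not the hardware designs of the paper. The names `fma`, `dot_fused` and `dot_unfused` are hypothetical, and infinities and NaNs are not handled.

```python
from fractions import Fraction


def fma(a, b, c):
    """Emulated fused multiply-add: a*b + c computed exactly, then rounded once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def dot_fused(xs, ys):
    """Dot product accumulated with one rounding per term."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = fma(x, y, acc)
    return acc


def dot_unfused(xs, ys):
    """Dot product with separate multiply and add, hence two roundings per term."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = acc + x * y
    return acc


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
xs, ys = [1.0, a], [-1.0, b]   # exact dot product is a*b - 1 = -2**-60
print(dot_fused(xs, ys))
print(dot_unfused(xs, ys))
```

In this example the unfused product a*b rounds to 1.0 before the addition, so the unfused dot product returns 0.0, while the fused version keeps the exact product until the single final rounding and returns -2^-60.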

9.

A reconfigurable 4-GS/s power-efficient floating-point FFT processor design and implementation based on single-sided binary-tree decomposition

《Integration, the VLSI Journal》2019

This paper presents a high-throughput, size-configurable floating-point (FP) Fast Fourier Transform (FFT) processor that implements 8-parallel multi-path delay feedback (MDF) functions and is suited to real-time radar imaging systems. In floating-point FFT design, achieving high throughput within a restricted area and power budget is a greater challenge than in fixed-point design, because FP operations are more complex to realize than their fixed-point counterparts. To address these issues, a novel mixed-radix FFT algorithm featuring a single-sided binary-tree decomposition strategy is proposed, aiming to contain the complexity of the multiplications for any 2^k-point FFT. To this end, the parallel-processing twiddle factor generator and the dual addition-and-rounding fused FP arithmetic units are optimized to meet the demand for high computational accuracy within a low power budget. The proposed FP FFT processor has been designed in silicon in SMIC's 28 nm CMOS technology with an active area of 1.39 mm². The prototype delivers a throughput of 4 GSample/s at 500 MHz with a peak power consumption of 84.2 mW. The proposed design approach thus improves power efficiency by approximately 14 times on average over previously reported FP FFT processors.

10.

Configurable Floating-Point FFT Accelerator on FPGA Based Multiple-Rotation CORDIC

《电子学报:英文版》 (Chinese Journal of Electronics) 2016,(6):1063-1070

The fast Fourier transform (FFT) accelerator and the coordinate rotation digital computer (CORDIC) algorithm play important roles in signal processing. We propose a configurable floating-point FFT accelerator based on CORDIC rotation, in which twiddle direction prediction is presented to reduce hardware cost and twiddle angles are generated in real time to save memory. To finish the CORDIC rotation efficiently, a novel approach is presented that uses segmented-parallel iteration and compressed iteration based on CSA, and redundant CORDIC is used to reduce the latency of each iteration. To prove the efficiency of our FFT accelerator, four FFT accelerators are prototyped in an FPGA chip to perform a batch FFT. Experimental results show that our structure, which is composed of four butterfly units and handles FFT sizes ranging from 64 to 8192 points, occupies 33230 (3%) REGs and 143006 (30%) LUTs. The clock frequency can reach 122 MHz. The resources of the double-precision FFT are only about 2.5 times those of the single-precision FFT, while the theoretical value is 4. Moreover, only 13331 cycles are required to perform an 8192-point double-precision FFT with four butterfly units in parallel.
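
For readers unfamiliar with CORDIC rotation, the Python sketch below shows the basic rotation-mode iteration that a twiddle-factor multiplication relies on. It is the textbook algorithm run in floating point, not the segmented-parallel, CSA-based or redundant variants of the paper. The name `cordic_rotate` and the default of 32 iterations are assumptions made for illustration.

```python
import math


def cordic_rotate(x, y, angle, iterations=32):
    """Rotate (x, y) by `angle` radians using basic rotation-mode CORDIC.

    Each step rotates by +/- atan(2**-i), which in hardware needs only shifts
    and adds. The accumulated gain is divided out at the end. The plain
    iteration converges for |angle| up to roughly 1.74 rad.
    """
    z = angle
    gain = 1.0
    for i in range(iterations):
        t = 2.0 ** -i
        sigma = 1.0 if z >= 0 else -1.0      # rotate toward the residual angle
        x, y = x - sigma * y * t, y + sigma * x * t
        z -= sigma * math.atan(t)
        gain *= math.sqrt(1.0 + t * t)
    return x / gain, y / gain


# Twiddle factor W_8^1 = exp(-2*pi*1j/8) obtained by rotating (1, 0) by -pi/4.
re, im = cordic_rotate(1.0, 0.0, -math.pi / 4)
print(re, im)
print(math.cos(-math.pi / 4), math.sin(-math.pi / 4))
```

Generating twiddle factors by rotation in this way avoids storing a table of sines and cosines, which is the memory saving the abstract refers to.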