Modeling the memory and performance impacts of loop fusion
Authors: Ian Karlin, Elizabeth Jessup, Erik Silkensen
Affiliation: 1. Barcelona Supercomputing Center, BSC-CNS, Spain; 2. Artificial Intelligence Research Institute (IIIA), Spanish National Research Council (CSIC), Spain; 3. Universitat Politècnica de Catalunya (UPC), Spain; 4. High Performance Computing Center Stuttgart (HLRS), University of Stuttgart, Germany; 1. EECS, University of Tennessee, 1122 Volunteer Boulevard, Knoxville, TN 37996-3450, USA; 2. Oak Ridge National Laboratory, Oak Ridge, TN, USA; 3. University of Manchester, Manchester, UK; 1. Max-Planck-Institut für Plasmaphysik, Boltzmannstr. 2, 85748 Garching, Germany; 2. Unlimited Computer Systems, Seeshaupter Str. 15, 82393 Iffeldorf, Germany; 1. Department of Computer Science, Christian-Albrechts Universität zu Kiel, 24098 Kiel, Germany; 2. Engineering Optimization & Modeling Center, School of Science and Engineering, Reykjavik University, Menntavegur 1, 101 Reykjavik, Iceland; 3. GEOMAR – Helmholtz Centre for Ocean Research Kiel, Düsternbrooker Weg 20, 24105 Kiel, Germany
Abstract: On modern processors, data transfer exceeds floating-point operations as the predominant cost in many linear algebra computations. One tuning technique that focuses on reducing memory accesses is loop fusion. Determining the optimum amount of loop fusion to apply to a routine is difficult as fusion can both positively and negatively impact memory traffic. We present a model that accurately and efficiently evaluates how loop fusion choices affect data movement through the memory hierarchy. We show how to convert the model's memory traffic predictions to runtime estimates that can be used to compare loop fusion variants.
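
As a rough illustration of what loop fusion changes, the sketch below compares an unfused and a fused version of the same two-step vector computation and counts array-element loads and stores. This is not the model presented in the paper: the function names (`unfused`, `fused`), the example computation, and the flat per-element traffic count are all assumptions chosen for illustration, and the count ignores caches, registers, and the rest of the memory hierarchy, so it cannot show the cases where fusion increases traffic.

```python
def unfused(x, y, w):
    """Two separate loops; the temporary array t is stored, then reloaded."""
    n = len(x)
    traffic = 0  # hypothetical metric: array-element loads plus stores
    t = [0.0] * n
    for i in range(n):
        t[i] = x[i] + y[i]   # 2 loads (x, y), 1 store (t)
        traffic += 3
    z = [0.0] * n
    for i in range(n):
        z[i] = t[i] * w[i]   # 2 loads (t, w), 1 store (z)
        traffic += 3
    return z, traffic


def fused(x, y, w):
    """One loop; the intermediate value stays in a scalar, never in an array."""
    n = len(x)
    traffic = 0
    z = [0.0] * n
    for i in range(n):
        t = x[i] + y[i]      # 2 loads (x, y); t is a local scalar
        z[i] = t * w[i]      # 1 load (w), 1 store (z)
        traffic += 4
    return z, traffic


if __name__ == "__main__":
    x, y, w = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [2.0, 2.0, 2.0]
    print(unfused(x, y, w))
    print(fused(x, y, w))
```

Under this simplified count, both versions compute the same result, but the fused loop avoids one store and one load of the temporary per element. A realistic comparison would also need to account for cache capacity and reuse, which is the kind of effect the abstract says the model evaluates.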
This article is indexed in ScienceDirect and other databases.