Showing 1–20 of 20 results for author: Grosser, T

  1. arXiv:2407.13726  [pdf, other]

    cs.PL cs.LG cs.MS

    Compressing Structured Tensor Algebra

    Authors: Mahdi Ghorbani, Emilien Bauer, Tobias Grosser, Amir Shaikhha

    Abstract: Tensor algebra is a crucial component for data-intensive workloads such as machine learning and scientific computing. As the complexity of data grows, scientists often encounter a dilemma between the highly specialized dense tensor algebra and efficient structure-aware algorithms provided by sparse tensor algebra. In this paper, we introduce DASTAC, a framework to propagate the tensors' captured…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.03685  [pdf, other]

    cs.PL cs.LO

    Verifying Peephole Rewriting In SSA Compiler IRs

    Authors: Siddharth Bhat, Alex Keizer, Chris Hughes, Andrés Goens, Tobias Grosser

    Abstract: There is an increasing need for domain-specific reasoning in modern compilers. This has fueled the use of tailored intermediate representations (IRs) based on static single assignment (SSA), like in the MLIR compiler framework. Interactive theorem provers (ITPs) provide strong guarantees for the end-to-end verification of compilers (e.g., CompCert). However, modern compilers and their IRs evolve a…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: accepted at ITP 2024

  3. arXiv:2405.11244  [pdf, other]

    cs.SC cs.PL

    Strided Difference Bound Matrices

    Authors: Arjun Pitchanathan, Albert Cohen, Oleksandr Zinenko, Tobias Grosser

    Abstract: A wide range of symbolic analysis and optimization problems can be formalized using polyhedra. Sub-classes of polyhedra, also known as sub-polyhedral domains, are sought for their lower space and time complexity. We introduce the Strided Difference Bound Matrix (SDBM) domain, which represents a sweet spot in the context of optimizing compilers. Its expressiveness and efficient algorithms are parti…

    Submitted 4 July, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

    Comments: Preprint and extended from the CAV 2024 conference version. Fixed issue in arxiv version where URLs were not wrapped

  4. arXiv:2404.02218  [pdf, other]

    cs.DC cs.MS

    A shared compilation stack for distributed-memory parallelism in stencil DSLs

    Authors: George Bisbas, Anton Lydike, Emilien Bauer, Nick Brown, Mathieu Fehr, Lawrence Mitchell, Gabriel Rodriguez-Canal, Maurice Jamieson, Paul H. J. Kelly, Michel Steuwer, Tobias Grosser

    Abstract: Domain Specific Languages (DSLs) increase programmer productivity and provide high performance. Their targeted abstractions allow scientists to express problems at a high level, providing rich details that optimizing compilers can exploit to target current- and next-generation supercomputers. The convenience and performance of DSLs come with significant development and maintenance costs. The siloe…

    Submitted 2 April, 2024; originally announced April 2024.

  5. arXiv:2311.07422  [pdf, other]

    cs.PL

    Sidekick compilation with xDSL

    Authors: Mathieu Fehr, Michel Weber, Christian Ulmann, Alexandre Lopoukhine, Martin Lücke, Théo Degioanni, Michel Steuwer, Tobias Grosser

    Abstract: Traditionally, compiler researchers either conduct experiments within an existing production compiler or develop their own prototype compiler; both options come with trade-offs. On one hand, prototyping in a production compiler can be cumbersome, as they are often optimized for program compilation speed at the expense of software simplicity and development speed. On the other hand, the transition…

    Submitted 16 June, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: 14 pages, 15 figures; updated twice to include acknowledgements

  6. arXiv:2310.04196  [pdf, other]

    cs.PL cs.CL cs.DC cs.PF

    mlirSynth: Automatic, Retargetable Program Raising in Multi-Level IR using Program Synthesis

    Authors: Alexander Brauckmann, Elizabeth Polgreen, Tobias Grosser, Michael F. P. O'Boyle

    Abstract: MLIR is an emerging compiler infrastructure for modern hardware, but existing programs cannot take advantage of MLIR's high-performance compilation if they are described in lower-level general purpose languages. Consequently, to avoid programs needing to be rewritten manually, this has led to efforts to automatically raise lower-level to higher-level dialects in MLIR. However, current methods rely…

    Submitted 6 October, 2023; originally announced October 2023.

  7. arXiv:2310.01914  [pdf, ps, other]

    cs.DC cs.PF cs.PL

    Stencil-HMLS: A multi-layered approach to the automatic optimisation of stencil codes on FPGA

    Authors: Gabriel Rodriguez-Canal, Nick Brown, Maurice Jamieson, Emilien Bauer, Anton Lydike, Tobias Grosser

    Abstract: The challenges associated with effectively programming FPGAs have been a major blocker in popularising reconfigurable architectures for HPC workloads. However, new compiler technologies, such as MLIR, are providing new capabilities which potentially deliver the ability to extract domain specific information and drive automatic structuring of codes for FPGAs. In this paper we explore domain specif…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: Author accepted version which appears in ACM Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (SC-W 2023)

  8. Fortran performance optimisation and auto-parallelisation by leveraging MLIR-based domain specific abstractions in Flang

    Authors: Nick Brown, Maurice Jamieson, Anton Lydike, Emilien Bauer, Tobias Grosser

    Abstract: MLIR has become popular since it was open sourced in 2019. A sub-project of LLVM, the flexibility provided by MLIR to represent Intermediate Representations (IR) as dialects at different abstraction levels, to mix these, and to leverage transformations between dialects provides opportunities for automated program optimisation and parallelisation. In addition to general purpose compilers built upon…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: Author accepted version of paper in ACM Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (SC-W 2023)

  9. arXiv:2212.00873  [pdf, other]

    cs.AR

    CONVOLVE: Smart and seamless design of smart edge processors

    Authors: M. Gomony, F. Putter, A. Gebregiorgis, G. Paulin, L. Mei, V. Jain, S. Hamdioui, V. Sanchez, T. Grosser, M. Geilen, M. Verhelst, F. Zenke, F. Gurkaynak, B. Bruin, S. Stuijk, S. Davidson, S. De, M. Ghogho, A. Jimborean, S. Eissa, L. Benini, D. Soudris, R. Bishnoi, S. Ainsworth, F. Corradi, et al. (3 additional authors not shown)

    Abstract: With the rise of Deep Learning (DL), our world braces for AI in every edge device, creating an urgent need for edge-AI SoCs. This SoC hardware needs to support high throughput, reliable and secure AI processing at Ultra Low Power (ULP), with a very short time to market. With its strong legacy in edge solutions and open processing platforms, the EU is well-positioned to become a leader in this SoC…

    Submitted 2 May, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

  10. arXiv:2210.04795  [pdf, other]

    cs.DC cs.PF

    TensorFlow as a DSL for stencil-based computation on the Cerebras Wafer Scale Engine

    Authors: Nick Brown, Brandon Echols, Justs Zarins, Tobias Grosser

    Abstract: The Cerebras Wafer Scale Engine (WSE) is an accelerator that combines hundreds of thousands of AI-cores onto a single chip. Whilst this technology has been designed for machine learning workloads, the significant amount of available raw compute means that it is also a very interesting potential target for accelerating traditional HPC computational codes. Many of these algorithms are stencil-based,…

    Submitted 26 August, 2022; originally announced October 2022.

    Comments: This preprint has not undergone any post-submission improvements or corrections. Preprint of paper submitted to Euro-Par DSL-HPC workshop

  11. arXiv:2208.10391  [pdf]

    cs.PL cs.MS

    MOM: Matrix Operations in MLIR

    Authors: Lorenzo Chelini, Henrik Barthels, Paolo Bientinesi, Marcin Copik, Tobias Grosser, Daniele G. Spampinato

    Abstract: Modern research in code generators for dense linear algebra computations has shown the ability to produce optimized code with a performance which compares and often exceeds the one of state-of-the-art implementations by domain experts. However, the underlying infrastructure is often developed in isolation making the interconnection of logically combinable systems complicated if not impossible. In…

    Submitted 22 August, 2022; originally announced August 2022.

    Comments: 3 pages, 1 figure, 1 table, and 3 listings. Short paper presented at 12th International Workshop on Polyhedral Compilation Techniques (IMPACT 22)

  12. arXiv:2201.07272  [pdf, other]

    cs.PL

    Lambda the Ultimate SSA: Optimizing Functional Programs in SSA

    Authors: Siddharth Bhat, Tobias Grosser

    Abstract: Static Single Assignment (SSA) is the workhorse of modern optimizing compilers for imperative programming languages. However, functional languages have been slow to adopt SSA and prefer to use intermediate representations based on minimal lambda calculi due to SSA's inability to express higher order constructs. We exploit a new SSA construct -- regions -- in order to express functional optimizatio…

    Submitted 18 January, 2022; originally announced January 2022.

    ACM Class: D.3

  13. arXiv:2012.15592  [pdf, other]

    cs.DC cs.PF

    Extracting Clean Performance Models from Tainted Programs

    Authors: Marcin Copik, Alexandru Calotoiu, Tobias Grosser, Nicolas Wicki, Felix Wolf, Torsten Hoefler

    Abstract: Performance models are well-known instruments to understand the scaling behavior of parallel applications. They express how performance changes as key execution parameters, such as the number of processes or the size of the input problem, vary. Besides reasoning about program behavior, such models can also be automatically derived from performance data. This is called empirical performance modelin…

    Submitted 31 December, 2020; originally announced December 2020.

    Comments: Accepted at PPoPP 2021

  14. arXiv:2010.12478  [pdf, other]

    cs.DC

    Work-stealing prefix scan: Addressing load imbalance in large-scale image registration

    Authors: Marcin Copik, Tobias Grosser, Torsten Hoefler, Paolo Bientinesi, Benjamin Berkels

    Abstract: Parallelism patterns (e.g., map or reduce) have proven to be effective tools for parallelizing high-performance applications. In this paper, we study the recursive registration of a series of electron microscopy images - a time consuming and imbalanced computation necessary for nano-scale microscopy analysis. We show that by translating the image registration into a specific instance of the prefix…

    Submitted 23 October, 2020; originally announced October 2020.

  15. arXiv:2005.13014  [pdf, other]

    cs.PL

    Domain-Specific Multi-Level IR Rewriting for GPU

    Authors: Tobias Gysi, Christoph Müller, Oleksandr Zinenko, Stephan Herhut, Eddie Davis, Tobias Wicky, Oliver Fuhrer, Torsten Hoefler, Tobias Grosser

    Abstract: Traditional compilers operate on a single generic intermediate representation (IR). These IRs are usually low-level and close to machine instructions. As a result, optimizations relying on domain-specific information are either not possible or require complex analysis to recover the missing information. In contrast, multi-level rewriting instantiates a hierarchy of dialects (IRs), lowers programs…

    Submitted 27 July, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: 12 pages, 16 figures

  16. arXiv:2004.03494  [pdf, other]

    cs.PL

    LLHD: A Multi-level Intermediate Representation for Hardware Description Languages

    Authors: Fabian Schuiki, Andreas Kurth, Tobias Grosser, Luca Benini

    Abstract: Modern Hardware Description Languages (HDLs) such as SystemVerilog or VHDL are, due to their sheer complexity, insufficient to transport designs through modern circuit design flows. Instead, each design automation tool lowers HDLs to its own Intermediate Representation (IR). These tools are monolithic and mostly proprietary, disagree in their implementation of HDLs, and while many redundant IRs ex…

    Submitted 7 April, 2020; originally announced April 2020.

  17. arXiv:2003.04293  [pdf, other]

    cs.DC cs.ET

    Compiling Neural Networks for a Computational Memory Accelerator

    Authors: Kornilios Kourtis, Martino Dazzi, Nikolas Ioannou, Tobias Grosser, Abu Sebastian, Evangelos Eleftheriou

    Abstract: Computational memory (CM) is a promising approach for accelerating inference on neural networks (NN) by using enhanced memories that, in addition to storing data, allow computations on them. One of the main challenges of this approach is defining a hardware/software interface that allows a compiler to map NN models for efficient execution on the underlying CM accelerator. This is a non-trivial tas…

    Submitted 24 April, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

    Comments: Accepted at SPMA '20

  18. A Fast Analytical Model of Fully Associative Caches

    Authors: Tobias Gysi, Tobias Grosser, Laurin Brandner, Torsten Hoefler

    Abstract: While the cost of computation is an easy to understand local property, the cost of data movement on cached architectures depends on global state, does not compose, and is hard to predict. As a result, programmers often fail to consider the cost of data movement. Existing cache models and simulators provide the missing information but are computationally expensive. We present a lightweight cache mo…

    Submitted 6 January, 2020; originally announced January 2020.

    Comments: 14 pages, 16 figures, PLDI19

  19. arXiv:1712.04892  [pdf, other]

    cs.AR

    Accelerator Codesign as Non-Linear Optimization

    Authors: Nirmal Prajapati, Sanjay Rajopadhye, Hristo Djidjev, Nandkishore Santhi, Tobias Grosser, Rumen Andonov

    Abstract: We propose an optimization approach for determining both hardware and software parameters for the efficient implementation of a (family of) applications called dense stencil computations on programmable GPGPUs. We first introduce a simple, analytical model for the silicon area usage of accelerator architectures and a workload characterization of stencil computations. We combine this characterizati…

    Submitted 13 December, 2017; originally announced December 2017.

    Comments: 10 pages, 4 figures, 2 tables

  20. arXiv:1302.5586  [pdf, other]

    cs.PL cs.DC

    PENCIL: Towards a Platform-Neutral Compute Intermediate Language for DSLs

    Authors: Riyadh Baghdadi, Albert Cohen, Serge Guelton, Sven Verdoolaege, Jun Inoue, Tobias Grosser, Georgia Kouveli, Alexey Kravets, Anton Lokhmotov, Cedric Nugteren, Fraser Waters, Alastair F. Donaldson

    Abstract: We motivate the design and implementation of a platform-neutral compute intermediate language (PENCIL) for productive and performance-portable accelerator programming.

    Submitted 22 February, 2013; originally announced February 2013.