
Karu Sankaralingam
aka Karthikeyan Sankaralingam
Mark D. Hill and David A. Wood Professor
Computer Sciences
Affiliate Faculty: Electrical and Computer Engineering
Affiliate Faculty: Robert M. La Follette School of Public Affairs

karu@cs.wisc.edu, Phone: (608) 890-0121
Office hours: by appointment.

Research (CV)

  • I lead the Vertical Research Group. I also founded a chip startup, SimpleMachines.
  • Interests: Computer Architecture, Microarchitecture, VLSI, computing devices.
  • If you are a graduate or undergraduate student interested in pursuing research with me, read this.

Select Publications

  • A Journey of a 1,000 Kernels Begins with a Single Step: A Retrospective of Deep Learning on GPUs, ASPLOS 2024, to appear
  • LookupFFN: Making Transformers Compute-lite for CPU inference, ICML 2023
  • The Mozart Reuse Exposed Dataflow Processor for AI and Beyond, ISCA 2022, pdf
  • Mozart: Designing for Software Maturity and the Next Paradigm for Chip Architectures, HOTCHIPS 2021
  • Heterogeneous Von Neumann/Dataflow Microprocessors, CACM Research Highlights 2019
  • Stream-Dataflow Acceleration, ISCA 2017, pdf
  • Kickstarting Semiconductor Innovation with Open Source Hardware, IEEE Computer, June 2017, arXiv pre-print
  • Pushing the Limits of Accelerator Efficiency While Retaining General-Purpose Programmability, HPCA 2016, pdf, IEEE Micro Top Picks
  • Analyzing Behavior Specialized Acceleration, ASPLOS 2016, pdf
  • Cross-Architecture Performance Prediction (XAPP) Using CPU Code to Predict GPU Performance, MICRO 2015, pdf, IEEE Micro Top Picks Honorable Mention
  • MIAOW: An Open source GPGPU, HOTCHIPS 2015, homepage, IEEE Spectrum, The Register, EE Times article, EnterpriseTech
  • Enabling GPGPU Low-level Hardware Explorations with MIAOW - An Open Source RTL Implementation of a GPGPU, ACM TACO paper
  • Efficient Execution of Memory Access Phases Using Dataflow Specialization, ISCA 2015, pdf, IEEE Micro Top Picks
  • Exploring the Potential of Heterogeneous Von Neumann/Dataflow Execution Models, ISCA 2015, pdf, IEEE Micro Top Picks, CACM Research Highlights
  • Architecture Simulators Considered Harmful, IEEE Micro, WDDD '14 version pdf
  • Virtually-Aged Sampling DMR: Unifying Circuit Failure Prediction and Circuit Failure Detection, MICRO 2013, pdf
  • A General Constraint-centric Scheduling Framework for Spatial Architectures, PLDI 2013, Distinguished Paper Award, pdf
  • Power Struggles: Revisiting the RISC vs. CISC Debate on Contemporary ARM and x86 Architectures, HPCA 2013, pdf, extended tech-report, web-page, ACM TOCS ISA Wars journal version
  • iGPU: Exception Support and Speculative Execution on GPUs, ISCA 2012, pdf
  • Idempotent Processor Architecture, MICRO 2011, pdf
  • Dark Silicon and the End of Multicore Scaling, ISCA 2011, pdf; IEEE Micro Top Picks 2012, Invited ACM TOCS, Communications of the ACM Research Highlights 2013
  • Sampling + DMR: Practical and Low-overhead Permanent Fault Detection, ISCA 2011, pdf
  • Dynamically Specialized Datapaths for Energy Efficient Computing, HPCA 2011, pdf
  • Relax: An Architectural Framework for Software Recovery of Hardware Faults, ISCA 2010, pdf

Research Summary

I am interested in microarchitecture, architecture, and software issues for future computation systems, and I lead the Vertical Research Group. Technology constraints such as unreliable hardware, process variation, and energy efficiency are going to define the computation substrates of the future. My research goal is to understand the constraints of the underlying technology and applications and use these to derive architecture and microarchitecture solutions. Vertical Research Group Page.


  • Michael Davies
  • Ian McDougall


Teaching (Teaching evaluations)


  • B.Tech - Indian Institute of Technology, Madras, 1999
  • MS - The University of Texas at Austin, August 2006
  • PhD - The University of Texas at Austin, December 2006

Bio: Karu Sankaralingam is a Professor at UW-Madison, an entrepreneur, and an inventor, and he leads the Vertical Research Group. His work has been featured in industry forums of Mentor and Synopsys, and has been covered by the New York Times, Wired, and IEEE Spectrum. He has published over 100 research papers, has graduated 9 PhD students, is an inventor on 21 patents, and has authored 9 award papers. He is a Fellow of IEEE. He is a recipient of the Vilas Faculty Early Career Investigator Award in 2018, the Wisconsin Innovation Award in 2016, the IEEE TCCA Young Computer Architecture Award in 2012, the Emil H Steiger Distinguished Teaching Award in 2014, the Letters and Science Philip R. Certain - Gary Sandefur Distinguished Faculty Award in 2013, and the NSF CAREER award in 2009. In 2017 he founded SimpleMachines, which developed chip designs applying dataflow computing to push the limits of AI generality in hardware. In his career, he has led three chip projects: Mozart (16nm, HBM2 based design), the MIAOW open source GPU on FPGA, and the TRIPS chip as a student during his PhD. In his research he has pioneered the principles of dataflow computing, focusing on the role of architecture, microarchitecture, and the compiler. His research breakthroughs include constraint-theory based compilation for spatial architectures, specialized datapaths that can be dynamically configured, hybrid dataflow/von Neumann execution, modularizing specialization principles to allow programmability while retaining specialization benefits, new dataflow execution models that combine streaming and dataflow, and sampling theory applied to reliable computing.

Page last modified on December 20, 2023