Satish Kumar Sadasivam is a Senior Performance Engineer and Master Inventor at IBM STG, responsible for compiler and hardware performance analysis and optimization of IBM Power processors and compilers. He has more than nine years of experience in computer architecture, covering a wide range of domains including performance analysis, compiler optimization, HPC, competitive analysis, and processor validation. He is currently responsible for delivering performance leadership for the Power 8 processor on emerging workloads. He also evaluates competitor (Intel) microarchitecture designs in detail and provides feedback to the Power 9 hardware design to address next-generation computing needs. He has filed more than 15 patents, achieved his 5th Invention Plateau, and has several publications to his credit.
IBM Systems and Technology Group
The primary objective of this presentation is to give the OpenPower user community a methodology for evaluating performance using the advanced instrumentation capabilities available in the Power 8 microprocessor. It also presents a case study showing how the CPI stack cycle accounting model can be used effectively to evaluate the performance of SPEC 2006 workloads in various SMT modes.
The presentation is split into two sections. The first section covers the key performance instrumentation capabilities of the Power 8 microprocessor and how they can be used effectively to understand and resolve performance bottlenecks in code. It covers in detail the CPI stack cycle accounting model of the Power 8 microprocessor and how it differs from that of the previous Power 7 processor architecture, including the improvements to the Power 8 CPI stack that make its cycle accounting very precise.
The second section covers single-core SMT performance analysis of the SPEC 2006 workloads on the Power 8 microprocessor, along with the methodology used to evaluate SMT performance. We will describe in detail how building the CPI stack at each SMT level helps identify the root cause of the key performance bottlenecks in the code at higher SMT levels, and how those bottlenecks can be attributed to the different units of the microprocessor.
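As a rough illustration of the general CPI stack idea (not the actual Power 8 event set), the sketch below divides per-category cycle counts by completed instructions and then reports which category grows most between two SMT levels. The category names, the helper functions, and all numbers are hypothetical and made up for illustration; they are not measurements or real counter names.

```python
from typing import Dict


def cpi_stack(cycle_buckets: Dict[str, int], instructions: int) -> Dict[str, float]:
    """Turn per-category cycle counts into per-category CPI contributions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return {name: cycles / instructions for name, cycles in cycle_buckets.items()}


def total_cpi(stack: Dict[str, float]) -> float:
    """Overall CPI is the sum of the stack's components."""
    return sum(stack.values())


def largest_growth(base: Dict[str, float], other: Dict[str, float]) -> str:
    """Return the category whose CPI contribution grows most from base to other."""
    deltas = {name: other[name] - base.get(name, 0.0) for name in other}
    return max(deltas, key=deltas.get)


# Hypothetical per-thread cycle buckets (illustrative values only).
st_cycles = {"completion": 400, "frontend_stall": 100, "lsu_stall": 300, "other_stall": 200}
smt4_cycles = {"completion": 400, "frontend_stall": 200, "lsu_stall": 900, "other_stall": 500}

st_stack = cpi_stack(st_cycles, instructions=1000)
smt4_stack = cpi_stack(smt4_cycles, instructions=1000)

print("ST CPI:", total_cpi(st_stack))
print("SMT4 CPI:", total_cpi(smt4_stack))
print("Largest growth:", largest_growth(st_stack, smt4_stack))
```

Because the components sum to the total CPI, comparing stacks across SMT levels shows where the additional cycles per instruction accumulate, which is the kind of attribution to processor units described above.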