Performance evaluation is an integral part of computer architecture research and is typically carried out with the help of benchmark suites. A benchmark suite consists of a number of workloads that are generally representative of a particular application domain. To demonstrate the benefits of a proposed enhancement to an existing architecture, computer architects must select the right set of workloads. This selection is not an easy task, for several reasons:

  • A benchmark suite comprises a number of workloads that are representative of the typical characteristics and behaviors of a class of applications.
  • Evaluating the same subsystem in different contexts can call for different benchmark suites.
  • A benchmark suite is an aggregation of a large number of disparate applications where each workload can potentially have multiple phases of execution. Typically, in a given phase, a workload stresses one or a small subset of system components.

This varied space can be especially difficult to navigate for architects and designers who are trying to optimize a specific component of the architecture, say the floating point subsystem. Once an enhancement has been designed, the architects would like to study the system-level implications of the proposed optimization in terms of various performance and/or energy consumption metrics. To do so, the architect would like to select, from all available suites, the workloads that show a significant amount of floating point activity.

Currently, selecting a set of workloads that satisfies the criteria specified by an architect is a very difficult task, mostly because there is no central repository that lets architects compare workload characteristics across multiple suites in one place. Further, there are very few well-defined axes along which an architect can specify the requirements of the workloads she is interested in. FAB aims to facilitate the process of workload selection. The block level diagram of FAB is shown below:

Block Level Model of FAB

FAB’s workflow currently consists of two parts. The first part, the backend, is a Pin-based flow that takes a workload binary and its associated input files as input and generates an instruction mix for the workload as output. This has to be done once for every new workload. The instruction mix is then used for various kinds of analyses in the tool’s frontend, which is a Jupyter notebook. The frontend also contains the pre-classified instruction bins, which can be referenced directly from within the notebook. The notebook takes the benchmark and the instruction bins as input and produces stacked bar charts and dendrograms that assist in analysis. Examples of the instruction mix charts and workload similarity dendrograms generated by FAB are shown below. Using these plots, architects can select workloads that show a particular kind of activity. The similarity plot also allows them to select a subset of diverse workloads that is representative of the entire benchmark suite, which is useful for reducing simulation time.

Figures: Instruction Mix, Workload Similarity
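To make the frontend's analysis more concrete, the following is a minimal sketch of the idea and not FAB's actual code. It assumes the backend output can be read as a dictionary of per-mnemonic instruction counts for each workload. The bin table, the workload names, the counts, and the helper functions (`bin_fractions`, `select_by_activity`, `distance`, `most_similar_pair`) are all hypothetical and chosen purely for illustration. The pairwise distance shown here is only the kind of input a similarity dendrogram is typically built from; it does not reproduce the notebook's clustering.

```python
from itertools import combinations
from math import sqrt

# Hypothetical instruction bins (mnemonic -> category); FAB ships its own
# pre-classified bins, which are NOT reproduced here.
INSTRUCTION_BINS = {
    "ADD": "integer", "SUB": "integer", "IMUL": "integer",
    "ADDSD": "floating_point", "MULSD": "floating_point",
    "DIVSD": "floating_point",
    "MOV": "memory", "PUSH": "memory",
    "JMP": "control", "JNZ": "control", "CALL": "control",
}


def bin_fractions(counts, bins=INSTRUCTION_BINS):
    """Collapse per-mnemonic counts into per-category fractions."""
    totals = {}
    for mnemonic, n in counts.items():
        category = bins.get(mnemonic, "other")  # unknown mnemonics -> "other"
        totals[category] = totals.get(category, 0) + n
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {category: n / grand_total for category, n in totals.items()}


def select_by_activity(mixes, category, threshold):
    """Names of workloads whose share of `category` is at least `threshold`."""
    return sorted(
        name for name, mix in mixes.items()
        if mix.get(category, 0.0) >= threshold
    )


def distance(mix_a, mix_b):
    """Euclidean distance between two instruction mixes."""
    categories = set(mix_a) | set(mix_b)
    return sqrt(sum(
        (mix_a.get(c, 0.0) - mix_b.get(c, 0.0)) ** 2 for c in categories
    ))


def most_similar_pair(mixes):
    """The two workloads whose instruction mixes are closest."""
    return min(
        combinations(sorted(mixes), 2),
        key=lambda pair: distance(mixes[pair[0]], mixes[pair[1]]),
    )


# Made-up counts for three made-up workloads (illustration only).
RAW_COUNTS = {
    "fp_kernel": {"ADDSD": 40, "MULSD": 30, "MOV": 20, "JNZ": 10},
    "int_kernel": {"ADD": 50, "SUB": 10, "MOV": 30, "JMP": 10},
    "fp_solver": {"ADDSD": 35, "MULSD": 25, "DIVSD": 10, "MOV": 20, "JNZ": 10},
}
MIXES = {name: bin_fractions(counts) for name, counts in RAW_COUNTS.items()}

if __name__ == "__main__":
    print(select_by_activity(MIXES, "floating_point", 0.5))
    print(most_similar_pair(MIXES))
```

With this made-up data, an architect interested in the floating point subsystem would filter on the `floating_point` category, and two workloads with near-identical mixes would be candidates for keeping only one of them in a reduced simulation set.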

FAB was accepted as a work-in-progress paper at ICPE 2019.
You can find the resources here: Paper, Slides, BibTeX