Brigham Young University
Department of Electrical & Computer Engineering

Zhuo Ruan's Publications

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

If you have institutional or personal access to the ACM Digital Library, IEEE Xplore, and/or SpringerLink, the DOI links will give you the official versions of papers.

Interface Design and Synthesis for Structural Hybrid Microarchitectural Simulators (PDF)
Zhuo Ruan
Master's Thesis, Department of Electrical and Computer Engineering, Brigham Young University, December 2013.

Computer architects have discovered the potential of using FPGAs to accelerate software microarchitectural simulators. One type of FPGA-accelerated microarchitectural simulator, the hybrid structural microarchitectural simulator, is very promising because it combines structural software and hardware, an organization that provides both modeling flexibility and fast simulation speed. The performance of a hybrid simulator is significantly affected by how the interface between software and hardware is constructed. This thesis creates an infrastructure, named Simulator Partitioning Research Infrastructure (SPRI), to implement the synthesis of hybrid structural microarchitectural simulators, which includes simulator partitioning, simulator-to-hardware synthesis, and interface synthesis. With the support of SPRI, this thesis characterizes the design space of interfaces for synthesized hybrid structural microarchitectural simulators and provides implementations for several such interfaces. The evaluation thoroughly studies the important design tradeoffs and performance factors (e.g., hardware capacity, design scalability, and interface latency) involved in choosing an efficient interface. The work of this thesis is essential to the computer architecture research community. It not only contributes a complete synthesis infrastructure, but also provides guidelines to architects on how to organize software microarchitectural models and choose a proper software/hardware interface so that the hybrid microarchitectural simulators synthesized from these software models can achieve desirable speedup.

Interface Design for Synthesized Structural Hybrid Microarchitectural Simulators (DOI, PDF)
Zhuo Ruan and David A. Penry
Proceedings of the 2012 IEEE International Conference on Computer Design (ICCD), October 2012.

Computer designers rely upon near-cycle-accurate microarchitectural simulators to explore the design space of new systems. Hybrid simulators which offload simulation work onto FPGAs overcome the speed limitations of software-only simulators as systems become more complex; however, such simulators must be automatically synthesized or the time to design them becomes prohibitive. The performance of a hybrid simulator is significantly affected by how the interface between software and hardware is constructed. We characterize the design space of interfaces for synthesized structural hybrid microarchitectural simulators, provide implementations for several such interfaces, and determine the tradeoffs involved in choosing an efficient design candidate.

Techniques for LI-BDN Synthesis for Hybrid Microarchitectural Simulation (DOI, PDF)
Tyler S. Harris, Zhuo Ruan, and David A. Penry
Proceedings of the 2011 IEEE International Conference on Computer Design (ICCD), October 2011.

Computer designers rely upon near-cycle-accurate microarchitectural simulation to explore the design space of new systems. Unfortunately, such simulators are becoming increasingly slow as systems become more complex. Hybrid simulators which offload some of the simulation work onto FPGAs can increase the speed; however, such simulators must be automatically synthesized or the time to design them becomes prohibitive. Furthermore, FPGA implementations of simulators may require multiple FPGA clock cycles to implement behavior that takes place within one simulated clock cycle, making correct arbitrary composition of simulator components impossible and limiting the amount of hardware concurrency which can be achieved.

Latency-Insensitive Bounded Dataflow Networks (LI-BDNs) have been suggested as a means to permit composition of simulator components in FPGAs. However, previous work has required that LI-BDNs be created manually. This paper introduces techniques for automated synthesis of LI-BDNs from the processes of a SystemC microarchitectural model. We demonstrate that LI-BDNs can be successfully synthesized. We also introduce a technique for reducing the overhead of LI-BDNs when the latency-insensitive property is unnecessary, resulting in up to a 60% reduction in FPGA resource requirements.
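
As a rough illustration of the latency-insensitive idea only (not the synthesis technique of the paper), the sketch below wraps a component with bounded input and output FIFOs. The component advances one simulated cycle only when every input holds a token and the output has room, so the number of host clock steps spent per simulated cycle can vary without changing the simulated result. The class name `LIWrapper`, the FIFO depth, and the adder example are hypothetical, and a full LI-BDN imposes further requirements that this simplified wrapper does not attempt to meet.

```python
from collections import deque


class LIWrapper:
    """Hypothetical latency-insensitive wrapper around a pure function."""

    def __init__(self, func, num_inputs, depth=2):
        self.func = func
        self.inputs = [deque() for _ in range(num_inputs)]
        self.output = deque()
        self.depth = depth          # bounded FIFO capacity (assumed value)
        self.simulated_cycles = 0

    def push(self, port, token):
        """Offer a token to an input port; refuse it if the FIFO is full."""
        if len(self.inputs[port]) >= self.depth:
            return False
        self.inputs[port].append(token)
        return True

    def host_step(self):
        """One host clock step; fires only when all input tokens are present."""
        ready = all(self.inputs) and len(self.output) < self.depth
        if not ready:
            return False
        args = [queue.popleft() for queue in self.inputs]
        self.output.append(self.func(*args))
        self.simulated_cycles += 1
        return True


adder = LIWrapper(lambda a, b: a + b, num_inputs=2)
adder.push(0, 3)
fired_early = adder.host_step()   # second input has not arrived yet
adder.push(1, 4)
fired_late = adder.host_step()    # both tokens are now present
```

Here two host steps elapse but only one simulated cycle is counted, which is the decoupling between host time and simulated time that the abstract relies on.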

Elaboration-time Synthesis of High-level Language Constructs in SystemC-based Microarchitectural Simulators (DOI, PDF)
Zhuo Ruan, Kurtis Cahill, and David A. Penry
Proceedings of the 2010 IEEE International Conference on Computer Design (ICCD), October 2010.

Structural modeling serves as an efficient method for creating detailed microarchitectural models of complex microprocessors. High-level language constructs such as templates and object polymorphism are used to achieve a high degree of code reuse, thereby reducing development time. However, these modeling frameworks are currently too slow to evaluate future designs of multicore microprocessors. The synthesis of portions of these models into hardware to form hybrid simulators promises to improve their speed substantially. Unfortunately, the high-level language constructs used in structural simulation frameworks are not typically synthesizable. One factor which limits their synthesis is that it is very difficult to determine statically what exactly the code and data to synthesize are. We propose an elaboration-time synthesis method for SystemC-based microarchitectural simulators. As part of the runtime environment of our infrastructure, the synthesis tool extracts architectural information after elaboration, binds dynamic information to a low-level intermediate representation (IR), and synthesizes the IR to VHDL. We show that this approach permits the synthesis of high-level language constructs which could not be easily synthesized before.
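
To make the static-versus-elaboration distinction concrete, here is a minimal sketch in Python rather than SystemC. The concrete predictor class is chosen by run-time configuration, so a static reading of `Core` cannot tell which code and data would need to be synthesized; after the model has been built, the same information can be read directly off the instance. All class and function names are hypothetical, and the sketch stops at extraction: it does not model the IR or the VHDL generation described in the abstract.

```python
class Predictor:
    def predict(self, pc):
        raise NotImplementedError


class AlwaysTaken(Predictor):
    def predict(self, pc):
        return True


class Bimodal(Predictor):
    def __init__(self, entries):
        self.entries = entries
        self.table = [1] * entries  # 2-bit counters, initially weakly not-taken

    def predict(self, pc):
        return self.table[pc % self.entries] >= 2


class Core:
    def __init__(self, predictor):
        self.predictor = predictor  # concrete type is unknown until construction


def elaborate(config):
    """Build the model from run-time configuration (the elaboration step)."""
    if config["predictor"] == "bimodal":
        return Core(Bimodal(config["entries"]))
    return Core(AlwaysTaken())


def extract(core):
    """After elaboration, record the concrete type and scalar parameters."""
    pred = core.predictor
    params = {k: v for k, v in vars(pred).items() if not isinstance(v, list)}
    return {"type": type(pred).__name__, "params": params}


info = extract(elaborate({"predictor": "bimodal", "entries": 16}))
```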

Partitioning and Synthesis for Hybrid Architecture Simulators (DOI, PDF)
Zhuo Ruan and David A. Penry
Proceedings of the 2010 IEEE International Symposium on Circuits and Systems (ISCAS), June 2010.
Finalist for Best Student Paper Award.

Pure software simulators are too slow to simulate modern complex computer architectures and systems. Hybrid software/hardware simulators have been proposed to accelerate architecture simulation. However, the design of the hardware portions and hardware/software interface of the simulator is time-consuming, making it difficult to modify and improve these simulators. Here we describe the Simulator Partitioning Research Infrastructure (SPRI), an infrastructure which partitions the software architectural model under user guidance and automatically synthesizes hybrid simulators. We also present a case study using SPRI to investigate the performance limitations and bottlenecks of the generated hybrid simulators.

Issues in Hybrid Simulator Synthesis (PDF)
Zhuo Ruan, Koy Rehme, and David A. Penry
Proceedings of the 4th Workshop on Architectural Research Prototyping (WARP), June 2009.

The Simulator Partitioning Research Infrastructure (SPRI) is a project to automate the generation of hybrid architectural simulators. In this paper, we examine the interesting issues and challenges in hybrid simulator synthesis.

SPRI: Simulator Partitioning Research Infrastructure (PDF)
Zhuo Ruan, Koy Rehme, and David A. Penry
Proceedings of the 3rd Workshop on Architectural Research Prototyping (WARP), June 2008.

Using FPGAs as architectural simulation accelerators has been widely discussed in the computer architecture design community. We previously proposed a hybrid SW/HW simulation infrastructure named SPRI (Simulator Partitioning Research Infrastructure) which automatically partitions the general timing model into software and hardware portions for simulation speedup, conforming to the set-based partitioning specification. The SPRI platform takes two main inputs, the partitioning specification and the architectural model, and produces a modified SW architectural binary and a HW-accelerated RTL description which can communicate with each other. Together these form the hybrid SW/HW co-simulator, the final output of SPRI. Various experimental cases have also been run through the SPRI infrastructure to test its partitioning functionality and API wrapper generation.
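
The two inputs described above can be pictured with a small sketch. It assumes that a set-based specification simply names the set of model components placed in hardware, which the abstract does not spell out, and it represents the architectural model as a list of (source, destination) connections. The function `partition` and the component names are hypothetical and are not SPRI's API. The connections that cross the boundary are the ones that would need SW/HW communication wrappers.

```python
def partition(connections, hw_modules):
    """Classify each (src, dst) connection by which side of the cut it is on."""
    sw_only, hw_only, crossing = [], [], []
    for src, dst in connections:
        src_hw = src in hw_modules
        dst_hw = dst in hw_modules
        if src_hw and dst_hw:
            hw_only.append((src, dst))
        elif not src_hw and not dst_hw:
            sw_only.append((src, dst))
        else:
            crossing.append((src, dst))  # needs an interface wrapper
    return sw_only, hw_only, crossing


# Hypothetical four-stage model with two components placed in hardware.
connections = [
    ("fetch", "decode"),
    ("decode", "execute"),
    ("execute", "cache"),
    ("cache", "fetch"),
]
hw_modules = {"decode", "execute"}
sw_only, hw_only, crossing = partition(connections, hw_modules)
```

Changing only `hw_modules` yields a different cut of the same model, which is the kind of exploration an automated partitioning flow is meant to make cheap.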

An Infrastructure for HW/SW Partitioning and Synthesis of Architectural Simulators (PDF)
David A. Penry, Zhuo Ruan, and Koy Rehme
Proceedings of the 2nd Workshop on Architectural Research Prototyping (WARP), June 2007.

Many researchers are interested in using FPGAs to accelerate architectural simulation. Partitioning of the simulator between hardware and software is an important problem which has not been explored because of the enormous effort required to develop different RTL and communication infrastructure for each potential partition. We are developing a hybrid HW/SW simulation infrastructure which will provide tools for partitioning architectural simulators and synthesizing RTL for the hardware portions. This infrastructure will allow the community to explore and understand the partitioning problem and will eventually lead to automated partitioning algorithms.