## Zhiru Zhang

## List of Publications by Citations

Source: https://exaly.com/author-pdf/2767337/zhiru-zhang-publications-by-citations.pdf

Version: 2024-04-25

This document has been generated based on the publications and citations recorded by exaly.com. For the latest version of this publication list, visit the link given above.

The third column is the impact factor (IF) of the journal, and the fourth column is the number of citations of the article.

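| Papers | Citations | h-index | g-index | Ext. papers | Ext. citations | Avg. IF | L-index |
|--------|-----------|---------|---------|-------------|----------------|---------|---------|
| 84     | 1,770     | 19      | 39      | 101         | 2,559          | 1.8     | 4.95    |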

| #  | Paper                                                                                                                                                                              | IF  | Citations |
|----|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|-----------|
| 84 | High-Level Synthesis for FPGAs: From Prototyping to Deployment. <i>IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems</i> , <b>2011</b> , 30, 473-491   | 2.5 | 405       |
| 83 | Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs 2017,                                                                                        |     | 195       |
| 82 | Application-specific instruction generation for configurable processor architectures 2004,                                                                                         |     | 133       |
| 81 | An efficient and versatile scheduling algorithm based on SDC formulation 2006,                                                                                                     |     | 80        |
| 80 | AutoPilot: A Platform-Based ESL Synthesis System <b>2008</b> , 99-112                                                                                                              |     | 50        |
| 79 | SDC-based modulo scheduling for pipeline synthesis <b>2013</b> ,                                                                                                                   |     | 46        |
| 78 | Architecture and synthesis for on-chip multicycle communication. <i>IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems</i> , <b>2004</b> , 23, 550-564  | 2.5 | 46        |
| 77 | The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips. <i>IEEE Micro</i> , <b>2018</b> , 38, 30-41        | 1.8 | 34        |
| 76 | Platform-Based Behavior-Level and System-Level Synthesis <b>2006</b> ,                                                                                                             |     | 34        |
| 75 | Instruction set extension with shadow registers for configurable processors 2005,                                                                                                  |     | 34        |
| 74 | High-Level Power Estimation and Low-Power Design Space Exploration for FPGAs 2007,                                                                                                 |     | 29        |
| 73 | Rosetta <b>2018</b> ,                                                                                                                                                              |     | 28        |
| 72 | Fast and Accurate Estimation of Quality of Results in High-Level Synthesis with Machine Learning <b>2018</b> ,                                                                     |     | 27        |
| 71 | HeteroCL <b>2019</b> ,                                                                                                                                                             |     | 24        |
| 70 | Platform choices and design demands for IoT platforms: cost, power, and performance tradeoffs. <i>IET Cyber-Physical Systems: Theory and Applications</i> , <b>2016</b> , 1, 70-77 | 2.5 | 24        |
| 69 | Flushing-Enabled Loop Pipelining for High-Level Synthesis <b>2014</b> ,                                                                                                            |     | 22        |
| 68 | A Parallel Bandit-Based Approach for Autotuning FPGA Compilation 2017, |     | 21 |
| 67 | Reverse engineering convolutional neural networks through side-channel information leaks 2018, |     | 20 |
| 66 | Dynamic Hazard Resolution for Pipelining Irregular Loops in High-Level Synthesis 2017, |     | 19 |
| 65 | ElasticFlow: A complexity-effective approach for pipelining irregular loop nests 2015, |     | 19 |
| 64 | A New Approach to Automatic Memory Banking using Trace-Based Address Mining <b>2017</b> , |     | 17 |
| 63 | T2S-Tensor: Productively Generating High-Performance Spatial Hardware for Dense Tensor Computations <b>2019</b> , |     | 17 |
| 62 | Architectural Specialization for Inter-Iteration Loop Dependence Patterns 2014, |     | 17 |
| 61 | Tensaurus: A Versatile Accelerator for Mixed Sparse-Dense Tensor Computations 2020, |     | 16 |
| 60 | Evaluation of Static Analysis Techniques for Fixed-Point Precision Optimization 2009, |     | 15 |
| 59 | MatRaptor: A Sparse-Sparse Matrix Multiplication Accelerator Based on Row-Wise Product 2020, |     | 15 |
| 58 | FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations 2021, |     | 15 |
| 57 | Scheduling with soft constraints <b>2009</b> , |     | 14 |
| 56 | Painting on Placement <b>2019</b> , |     | 13 |
| 55 | Area-efficient pipelining for FPGA-targeted high-level synthesis 2015, |     | 13 |
| 54 | Multithreaded pipeline synthesis for data-parallel kernels <b>2014</b> , |     | 13 |
| 53 | Bit-level optimization for high-level synthesis and FPGA-based acceleration 2010, |     | 13 |
| 52 | High-level Synthesis for Low-power Design. <i>IPSJ Transactions on System LSI Design Methodology</i> , <b>2015</b> , 8, 12-25 | 0.2 | 12 |
| 51 | Building Efficient Deep Neural Networks With Unitary Group Convolutions 2019, |     | 12 |
| 50 | Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating <b>2019</b> , |     | 11 |
| 49 | Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration <b>2017</b> , |     | 11 |
| 48 | A Parallelized Iterative Improvement Approach to Area Optimization for LUT-Based Technology Mapping <b>2017</b> ,                                    |     | 10 |
| 47 | LAMDA: Learning-Assisted Multi-stage Autotuning for FPGA Design Closure 2019,                                                                        |     | 10 |
| 46 | PRIMAL <b>2019</b> ,                                                                                                                                 |     | 10 |
| 45 | Mapping-Aware Constrained Scheduling for LUT-Based FPGAs <b>2015</b> ,                                                                               |     | 10 |
| 44 | Statistically certified approximate logic synthesis <b>2017</b> ,                                                                                    |     | 10 |
| 43 | Architecture and synthesis for multi-cycle communication 2003,                                                                                       |     | 10 |
| 42 | Accurate operation delay prediction for FPGA HLS using graph neural networks 2020,                                                                   |     | 10 |
| 41 | Architecture-level synthesis for automatic interconnect pipelining 2004,                                                                             |     | 9  |
| 40 | High-level synthesis with timing-sensitive information flow enforcement 2018,                                                                        |     | 9  |
| 39 | A 1.4 GHz 695 Giga Risc-V Inst/s 496-Core Manycore Processor With Mesh On-Chip Network and an All-Digital Synthesized PLL in 16nm CMOS <b>2019</b> , |     | 8  |
| 38 | Bitwidth-aware scheduling and binding in high-level synthesis <b>2005</b> ,                                                                          |     | 8  |
| 37 | Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Codesign. <i>IEEE Design and Test</i> , <b>2021</b> , 38, 7-26       | 1.4 | 8  |
| 36 | Improving high-level synthesis with decoupled data structure optimization 2016,                                                                      |     | 7  |
| 35 | CASA <b>2014</b> ,                                                                                                                                   |     | 7  |
| 34 | AutoBridge: Coupling Coarse-Grained Floorplanning and Pipelining for High-Frequency HLS Design on Multi-Die FPGAs <b>2021</b> , 2021, 81-92          |     | 7  |
| 33 | Reverse Engineering Convolutional Neural Networks Through Side-channel Information Leaks <b>2018</b>                                                 |     | 7  |
| 32 | Accelerating Face Detection on Programmable SoC Using C-Based Synthesis <b>2017</b> , |     | 6 |
| 31 | A Scalable Approach to Exact Resource-Constrained Scheduling Based on a Joint SDC and SAT Formulation <b>2018</b> , |     | 6 |
| 30             | Behavior-Level Observability Analysis for Operation Gating in Low-Power Behavioral Synthesis. <i>ACM Transactions on Design Automation of Electronic Systems</i> , <b>2010</b> , 16, 1-29                                                                                                                                                                                                                                                                                                                 | 1.5 | 6      |
| 29             | Architecture and Compiler Optimizations for Data Bandwidth Improvement in Configurable Processors. <i>IEEE Transactions on Very Large Scale Integration (VLSI) Systems</i> , <b>2006</b> , 14, 986-997                                                                                                                                                                                                                                                                                                    | 2.6 | 6      |
| 28             | Predictable accelerator design with time-sensitive affine types <b>2020</b> ,                                                                                                                                                                                                                                                                                                                                                                                                                             |     | 6      |
| 27             | Improving Scalability of Exact Modulo Scheduling with Specialized Conflict-Driven Learning 2019,                                                                                                                                                                                                                                                                                                                                                                                                          |     | 5      |
| 26             | Enabling adaptive loop pipelining in high-level synthesis 2017,                                                                                                                                                                                                                                                                                                                                                                                                                                           |     | 5      |
| 25             | Scheduling with integer time budgeting for low-power optimization 2008,                                                                                                                                                                                                                                                                                                                                                                                                                                   |     | 5      |
| 24             | Behavior and communication co-optimization for systems with sequential communication media <b>2006</b> ,                                                                                                                                                                                                                                                                                                                                                                                                  |     | 5      |
| 23             | SuSy <b>2020</b> ,                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |     | 5      |
| 22             | Evaluating Celerity: A 16-nm 695 Giga-RISC-V Instructions/s Manycore Processor With Synthesizable PLL. <i>IEEE Solid-State Circuits Letters</i> , <b>2019</b> , 2, 289-292                                                                                                                                                                                                                                                                                                                                | 2   | 5      |
| 21 | Layout Symmetry Annotation for Analog Circuits with Graph Neural Networks <b>2021</b> , |     | 5 |
| 20 | Architecture and Synthesis for Area-Efficient Pipelining of Irregular Loop Nests. <i>IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems</i> , <b>2017</b> , 36, 1817-1830 | 2.5 | 4 |
| 19 | A reconfigurable analog substrate for highly efficient maximum flow computation <b>2015</b> , |     | 4 |
| 18 | Behavior-level observability don't-cares and application to low-power behavioral synthesis <b>2009</b> , |     | 3 |
| 17 | Designing Secure Cryptographic Accelerators with Information Flow Enforcement <b>2019</b> , |     |   |
| 13 | FPGA-Based Real-Time Charged Particle Trajectory Reconstruction at the Large Hadron Collider <b>2017</b> , |     | 2 |
| 12 | Behavioral synthesis with activating unused flip-flops for reducing glitch power in FPGA 2008,                                                                                           |     | 2 |
| 11 | Architectural synthesis Integrated with global placement for multi-cycle communication                                                                                                   |     | 2 |
| 10 | Architecture and compilation for data bandwidth improvement in configurable embedded processors                                                                                          |     | 2 |
| 9  | Dagger: efficient and fast RPCs in cloud microservices with near-memory reconfigurable NICs 2021,                                                                                        |     | 2 |
| 8  | GraphLily: Accelerating Graph Linear Algebra on HBM-Equipped FPGAs 2021,                                                                                                                 |     | 2 |
| 7  | PIMap. <i>ACM Transactions on Reconfigurable Technology and Systems</i> , <b>2019</b> , 11, 1-23 | 2.7 | 1 |
| 6  | Programming and Synthesis for Software-defined FPGA Acceleration: Status and Future Prospects. <i>ACM Transactions on Reconfigurable Technology and Systems</i> , <b>2021</b> , 14, 1-39 | 2.7 | 1 |
| 5  | Reverse Engineering CNN Models using Side-Channel Attacks. <i>IEEE Design and Test</i> , <b>2022</b> , 1-1 | 1.4 | 0 |
| 4  | . <i>IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems</i> , <b>2021</b> , 1-1 | 2.5 | 0 |
| 3  | ESL Design Methodology. <i>Journal of Electrical and Computer Engineering</i> , <b>2012</b> , 2012, 1-2 | 1.9 |   |
| 2  | Introduction Of Special Issue on FPGA-Based Computing [From The Guest Editors]. <i>IEEE Circuits and Systems Magazine</i> , <b>2021</b> , 21, 3-3 | 3.2 |   |
| 1  | FPGA-Specific Compilers <b>2022</b> , 1-37 |     |   |