## Zhiru Zhang

List of Publications by Citations in descending order

Source: https://exaly.com/author-pdf/2767337/publications.pdf Version: 2024-02-01




| #  | Article                                                                                                                                                              | IF  | CITATIONS |
|----|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|-----------|
| 1  | High-Level Synthesis for FPGAs: From Prototyping to Deployment. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2011, 30, 473-491.     | 1.9 | 594       |
| 2  | Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs. , 2017, , .                                                                   |     | 286       |
| 3  | Application-specific instruction generation for configurable processor architectures. , 2004, , .                                                                    |     | 177       |
| 4  | An efficient and versatile scheduling algorithm based on SDC formulation. , 2006, , .                                                                                |     | 130       |
| 5  | MatRaptor: A Sparse-Sparse Matrix Multiplication Accelerator Based on Row-Wise Product. , 2020, , .                                                                  |     | 83        |
| 6  | HeteroCL. , 2019, , .                                                                                                                                                | |     | 73        |
| 7  | AutoPilot: A Platform-Based ESL Synthesis System. , 2008, , 99-112.                                                                                                  |     | 71        |
| 8  | SDC-based modulo scheduling for pipeline synthesis. , 2013, , .                                                                                                      |     | 67        |
| 9  | Rosetta. , 2018, , .                                                                                                                                                 |     | 67        |
| 10 | Tensaurus: A Versatile Accelerator for Mixed Sparse-Dense Tensor Computations. , 2020, , .                                                                           |     | 66        |
| 11 | Architecture and Synthesis for On-Chip Multicycle Communication. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2004, 23, 550-564.   | 1.9 | 65        |
| 12 | Fast and Accurate Estimation of Quality of Results in High-Level Synthesis with Machine Learning. , 2018, , .                                                        |     | 65        |
| 13 | The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips. IEEE Micro, 2018, 38, 30-41.         | 1.8 | 58        |
| 14 | Instruction set extension with shadow registers for configurable processors. , 2005, , .                                                                             |     | 51        |
| 15 | Painting on Placement. , 2019, , .                                                                                                                                   |     | 51        |
| 16 | Reverse engineering convolutional neural networks through side-channel information leaks. , 2018, , .                                                                |     | 50        |
| 17 | FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations. , 2021, , .                                                                 |     | 49        |
| 18 | High-Level Power Estimation and Low-Power Design Space Exploration for FPGAs. , 2007, , .                                                                            |     | 45        |
| 19 | Reverse Engineering Convolutional Neural Networks Through Side-channel Information Leaks. , 2018, , .                                                                  |     | 44        |
| 20 | PRIMAL. , 2019, , .                                                                                                                                                    |     | 44        |
| 21 | Platform-Based Behavior-Level and System-Level Synthesis. , 2006, , .                                                                                                  |     | 43        |
| 22 | AutoBridge. , 2021, 2021, 81-92.                                                                                                                                       |     | 43        |
| 23 | Accurate operation delay prediction for FPGA HLS using graph neural networks. , 2020, , .                                                                              |     | 42        |
| 24 | FPGA HLS Today: Successes, Challenges, and Opportunities. ACM Transactions on Reconfigurable Technology and Systems, 2022, 15, 1-42.                                   | 1.9 | 40        |
| 25 | A Parallel Bandit-Based Approach for Autotuning FPGA Compilation. , 2017, , .                                                                                          |     | 36        |
| 26 | Platform choices and design demands for IoT platforms: cost, power, and performance tradeoffs. IET Cyber-Physical Systems: Theory and Applications, 2016, 1, 70-77.    | 1.9 | 35        |
| 27 | T2S-Tensor: Productively Generating High-Performance Spatial Hardware for Dense Tensor Computations. , 2019, , .                                                       |     | 34        |
| 28 | Architecture and synthesis for multi-cycle communication. , 2003, , .                                                                                                  |     | 29        |
| 29 | Dynamic Hazard Resolution for Pipelining Irregular Loops in High-Level Synthesis. , 2017, , .                                                                          |     | 29        |
| 30 | ElasticFlow: A complexity-effective approach for pipelining irregular loop nests. , 2015, , .                                                                          |     | 28        |
| 31 | Flushing-Enabled Loop Pipelining for High-Level Synthesis. , 2014, , .                                                                                                 |     | 27        |
| 32 | A New Approach to Automatic Memory Banking using Trace-Based Address Mining. , 2017, , .                                                                               |     | 27        |
| 33 | FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems. , 2020, , .                                                                              |     | 26        |
| 34 | GraphLily: Accelerating Graph Linear Algebra on HBM-Equipped FPGAs. , 2021, , .                                                                                        |     | 26        |
| 35 | Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating. , 2019, , .                                                                     |     | 24        |
| 36 | Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Codesign. IEEE Design and Test, 2021, 38, 7-26.                                        | 1.1 | 24        |
| 37 | High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS. , 2022, , .                                 |     | 23        |
| 38 | High-level Synthesis for Low-power Design. IPSJ Transactions on System LSI Design Methodology, 2015, 8, 12-25.      | 0.5 | 22        |
| 39 | Predictable accelerator design with time-sensitive affine types. , 2020, , .                                        |     | 22        |
| 40 | Architectural Specialization for Inter-Iteration Loop Dependence Patterns. , 2014, , .                              |     | 21        |
| 41 | A Parallelized Iterative Improvement Approach to Area Optimization for LUT-Based Technology Mapping. , 2017, , .    |     | 21        |
| 42 | SuSy. , 2020, , .                                                                                                   |     | 21        |
| 43 | Bitwidth-aware scheduling and binding in high-level synthesis. , 2005, , .                                          |     | 20        |
| 44 | High-level synthesis with timing-sensitive information flow enforcement. , 2018, , .                                |     | 20        |
| 45 | Scheduling with soft constraints. , 2009, , .                                                                       |     | 20        |
| 46 | Architecture-level synthesis for automatic interconnect pipelining. , 2004, , .                                     |     | 19        |
| 47 | Area-efficient pipelining for FPGA-targeted high-level synthesis. , 2015, , .                                       |     | 19        |
| 48 | LAMDA: Learning-Assisted Multi-stage Autotuning for FPGA Design Closure. , 2019, , .                                |     | 19        |
| 49 | Layout Symmetry Annotation for Analog Circuits with Graph Neural Networks. , 2021, , .                              |     | 19        |
| 50 | Dagger: efficient and fast RPCs in cloud microservices with near-memory reconfigurable NICs. , 2021, , .            |     | 19        |
| 51 | RapidStream. , 2022, , .                                                                                            |     | 19        |
| 52 | Evaluation of Static Analysis Techniques for Fixed-Point Precision Optimization. , 2009, , .                        |     | 18        |
| 53 | Statistically certified approximate logic synthesis. , 2017, , .                                                    |     | 18        |
| 54 | Building Efficient Deep Neural Networks With Unitary Group Convolutions. , 2019, , .                                |     | 18        |
| 55 | Mapping-Aware Constrained Scheduling for LUT-Based FPGAs. , 2015, , .                                                                                                                      |     | 17        |
| 56 | Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration. , 2017, , .                                                                            |     | 17        |
| 57 | Bit-level optimization for high-level synthesis and FPGA-based acceleration. , 2010, , .                                                                                                   |     | 16        |
| 58 | Multithreaded pipeline synthesis for data-parallel kernels. , 2014, , .                                                                                                                    |     | 16        |
| 59 | Evaluating Celerity: A 16-nm 695 Giga-RISC-V Instructions/s Manycore Processor With Synthesizable PLL. IEEE Solid-State Circuits Letters, 2019, 2, 289-292.                                | 1.3 | 16        |
| 60 | A Scalable Approach to Exact Resource-Constrained Scheduling Based on a Joint SDC and SAT Formulation. , 2018, , .                                                                         |     | 15        |
| 61 | Programming and Synthesis for Software-defined FPGA Acceleration: Status and Future Prospects. ACM Transactions on Reconfigurable Technology and Systems, 2021, 14, 1-39.                  | 1.9 | 15        |
| 62 | A 1.4 GHz 695 Giga RISC-V Inst/s 496-Core Manycore Processor With Mesh On-Chip Network and an All-Digital Synthesized PLL in 16nm CMOS. , 2019, , .                                        |     | 14        |
| 63 | Accelerating Face Detection on Programmable SoC Using C-Based Synthesis. , 2017, , .                                                                                                       |     | 11        |
| 64 | Architecture and Compiler Optimizations for Data Bandwidth Improvement in Configurable Processors. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2006, 14, 986-997.    | 2.1 | 10        |
| 65 | CASA. , 2014, , .                                                                                                                                                                          |     | 10        |
| 66 | Enabling adaptive loop pipelining in high-level synthesis. , 2017, , .                                                                                                                     |     | 10        |
| 67 | Logic Synthesis Meets Machine Learning: Trading Exactness for Generalization. , 2021, , .                                                                                                  |     | 10        |
| 68 | Scheduling with integer time budgeting for low-power optimization. , 2008, , .                                                                                                             |     | 9         |
| 69 | Improving high-level synthesis with decoupled data structure optimization. , 2016, , .                                                                                                     |     | 9         |
| 70 | PIMap. ACM Transactions on Reconfigurable Technology and Systems, 2018, 11, 1-23.                                                                                                          | 1.9 | 9         |
| 71 | Analysis and Optimization of the Implicit Broadcasts in FPGA HLS to Improve Maximum Frequency. , 2020, , .                                                                                 |     | 9         |
| 73 | Behavior-Level Observability Analysis for Operation Gating in Low-Power Behavioral Synthesis. ACM Transactions on Design Automation of Electronic Systems, 2010, 16, 1-29.            | 1.9 | 8         |
| 74 | Designing Secure Cryptographic Accelerators with Information Flow Enforcement. , 2019, , .                                                                                            |     | 8         |
| 75 | Behavior-level observability don't-cares and application to low-power behavioral synthesis. , 2009, , .                                                                               |     | 7         |
| 76 | Architecture and Synthesis for Area-Efficient Pipelining of Irregular Loop Nests. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2017, 36, 1817-1830. | 1.9 | 7         |
| 77 | Behavior and communication co-optimization for systems with sequential communication media. , 2006, , .                                                                               |     | 6         |
| 78 | Improving Scalability of Exact Modulo Scheduling with Specialized Conflict-Driven Learning. , 2019, , .                                                                               |     | 6         |
| 79 | Dagger: Towards Efficient RPCs in Cloud Microservices With Near-Memory Reconfigurable NICs. IEEE Computer Architecture Letters, 2020, 19, 134-138.                                    | 1.0 | 6         |
| 80 | A-QED Verification of Hardware Accelerators. , 2020, , .                                                                                                                              |     | 6         |
| 81 | Architecture and compilation for data bandwidth improvement in configurable embedded processors. , 0, , .                                                                             |     | 4         |
| 82 | Challenges and opportunities of ESL design automation. , 2012, , .                                                                                                                    |     | 4         |
| 83 | A reconfigurable analog substrate for highly efficient maximum flow computation. , 2015, , .                                                                                          |     | 4         |
| 84 | IMpress: Large Integer Multiplication Expression Rewriting for FPGA HLS. , 2022, , .                                                                                                  |     | 4         |
| 85 | MGX. , 2022, , .                                                                                                                                                                      |     | 4         |
| 86 | Characterizing the Benefits and Limitations of Smart Building Meeting Room Scheduling. , 2016, , .                                                                                    |     | 3         |
| 87 | Rapid Generation of High-Quality RISC-V Processors from Functional Instruction Set Specifications. , 2019, , .                                                                        |     | 3         |
| 88 | A Tensor Processing Framework for CPU-Manycore Heterogeneous Systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022, 41, 1620-1635.             | 1.9 | 3         |
| 89 | Architectural synthesis integrated with global placement for multi-cycle communication. , 0, , .                                                                                      |     | 2         |
| 90 | Behavioral synthesis with activating unused flip-flops for reducing glitch power in FPGA. , 2008, , .                                                                                 |     | 2         |
| 91  | FPGA-Based Real-Time Charged Particle Trajectory Reconstruction at the Large Hadron Collider. , 2017, , .                          |     | 2         |
| 92  | Reverse-Engineering CNN Models Using Side-Channel Attacks. IEEE Design and Test, 2022, 39, 15-22.                                  | 1.1 | 2         |
| 93  | DA systemization of knowledge: A catalog of prior forward-looking initiatives. , 2015, , .                                         |     | 1         |
| 94  | GLAIVE: Graph Learning Assisted Instruction Vulnerability Estimation. , 2021, , .                                                  |     | 1         |
| 95  | Distilling Arbitration Logic from Traces using Machine Learning: A Case Study on NoC. , 2021, , .                                  |     | 1         |
| 96  | Gradual relaxation techniques with applications to behavioral synthesis. , 2003, , .                                               |     | 0         |
| 97  | ESL Design Methodology. Journal of Electrical and Computer Engineering, 2012, 2012, 1-2.                                           | 0.6 | 0         |
| 98  | Guest Editors' Introduction: Machine Intelligence at the Edge. IEEE Design and Test, 2021, 38, 5-6.                                | 1.1 | 0         |
| 99  | Introduction Of Special Issue on FPGA-Based Computing [From The Guest Editors]. IEEE Circuits and Systems Magazine, 2021, 21, 3-3. | 2.6 | 0         |
| 100 | Architecture and synthesis for multi-cycle on-chip communication. , 2003, , .                                                      |     | 0         |