ABSTRACT
Randomized differential testing of compilers has had great success in finding compiler crashes and silent miscompilations. In this paper we investigate whether we can use similar techniques to improve the quality of the generated code: Can we compare the code generated by different compilers to find optimizations performed by one but missed by another?
We have developed a set of tools for running such tests. We compile C code generated by standard random program generators and use a custom binary analysis tool to compare the output programs. Depending on the optimization of interest, the tool can be configured to compare features such as the total number of instructions, multiply or divide instructions, function calls, stack accesses, and more. A standard test-case reduction tool produces minimal examples once an interesting difference has been found.
We have used our tools to compare the code generated by GCC, Clang, and CompCert. We have found previously unreported missing arithmetic optimizations in all three compilers, as well as individual cases of unnecessary register spilling, missed opportunities for register coalescing, dead stores, redundant computations, and missing instruction selection patterns.
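The comparison workflow described in the abstract can be sketched in a few lines of Python. This is a hypothetical illustration, not the authors' actual tool: it assumes `gcc`, `clang`, and `objdump` are available, counts instruction mnemonics in each compiler's disassembled output, and reports the difference in a chosen feature (e.g. multiply/divide instructions). All function names here are invented for illustration.

```python
import subprocess
import tempfile
from collections import Counter

def count_mnemonics(disassembly: str) -> Counter:
    """Count instruction mnemonics in `objdump -d` output."""
    counts = Counter()
    for line in disassembly.splitlines():
        # Code lines look like: "  401002:\t0f af c7\timul   %edi,%eax";
        # the third tab-separated field starts with the mnemonic.
        parts = line.split("\t")
        if len(parts) >= 3 and parts[2].strip():
            counts[parts[2].split()[0]] += 1
    return counts

def compile_and_count(compiler: str, c_file: str) -> Counter:
    """Compile with `compiler -O2`, disassemble, and count mnemonics."""
    with tempfile.NamedTemporaryFile(suffix=".o") as obj:
        subprocess.run([compiler, "-O2", "-c", c_file, "-o", obj.name],
                       check=True)
        disasm = subprocess.run(["objdump", "-d", obj.name],
                                capture_output=True, text=True,
                                check=True).stdout
    return count_mnemonics(disasm)

def diff_feature(c1: Counter, c2: Counter, mnemonics) -> int:
    """Difference in a feature of interest, e.g. {"imul", "mul", "idiv", "div"}."""
    return sum(c1[m] for m in mnemonics) - sum(c2[m] for m in mnemonics)
```

A driver would then run `compile_and_count` for each compiler on a randomly generated test program and hand any input with a nonzero `diff_feature` to a test-case reducer.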
Index Terms: Finding missed compiler optimizations by differential testing