Hiding critical program components via ambiguous translation

Authors:
Chijung Jung (University of Virginia)
Doowon Kim (University of Tennessee)
An Chen (University of Georgia)
Weihang Wang (University at Buffalo)
Yunhui Zheng (IBM Research)
Kyu Hyung Lee (University of Georgia)
Yonghwi Kwon (University of Virginia)

ICSE '22: Proceedings of the 44th International Conference on Software Engineering, May 2022, Pages 1120–1132
https://doi.org/10.1145/3510003.3510139

Published: 05 July 2022

ABSTRACT

Software systems may contain critical program components such as patented program logic or sensitive data. When adversaries reverse-engineer those components, the result can be significant damage (e.g., financial loss or operational failures). While protecting critical program components (e.g., code or data) in software systems is of utmost importance, existing approaches unfortunately have two major weaknesses: (1) they can be reverse-engineered via various program analysis techniques and (2) when an adversary obtains a legitimate-looking critical program component, he or she can be sure that it is genuine.

In this paper, we propose Ambitr, a novel technique that hides critical program components. The core of Ambitr is the Ambiguous Translator, which generates the critical program components when the input is the correct secret key. The translator is ambiguous in that it accepts any input and produces a number of legitimate-looking outputs, making it difficult to know whether an input is the correct secret key or not. The executions of the translator when it processes the correct secret key and when it processes other inputs are also indistinguishable, making the analysis inconclusive. Our evaluation results show that static, dynamic, and symbolic analysis techniques fail to identify the hidden information in Ambitr. We also demonstrate that manual analysis of Ambitr is extremely challenging.
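
The property described above (any key produces a legitimate-looking output, and only the correct key produces the real one) can be illustrated with a toy sketch. This is not Ambitr's actual construction, which the abstract does not detail; the vocabulary `VOCAB` and the functions `keystream`, `hide`, and `translate` are hypothetical names introduced here, and the scheme is a simple key-derived offset over a fixed token list.

```python
import hashlib

# Hypothetical vocabulary: every index maps to a plausible-looking token,
# so no output can be rejected as "obviously wrong".
VOCAB = ["open", "read", "write", "close", "send", "recv", "exec", "sleep"]


def keystream(key: str, n: int) -> list:
    """Derive n vocabulary offsets from the key (assumed derivation: SHA-256)."""
    out = []
    for i in range(n):
        digest = hashlib.sha256(f"{key}:{i}".encode()).digest()
        out.append(int.from_bytes(digest[:4], "big") % len(VOCAB))
    return out


def hide(secret: list, key: str) -> list:
    """Store the secret tokens as offsets relative to the key's keystream."""
    ks = keystream(key, len(secret))
    return [(VOCAB.index(tok) - k) % len(VOCAB) for tok, k in zip(secret, ks)]


def translate(hidden: list, key: str) -> list:
    """Translate with ANY key: same code path, always valid tokens."""
    ks = keystream(key, len(hidden))
    return [VOCAB[(h + k) % len(VOCAB)] for h, k in zip(hidden, ks)]


secret = ["open", "read", "send", "close"]
hidden = hide(secret, "correct-key")

print(translate(hidden, "correct-key"))  # recovers the secret sequence
print(translate(hidden, "some-guess"))   # another well-formed sequence
```

In this sketch `translate` never branches on whether the key is right, so an observer sees the same execution for every input, and every output is drawn from the same vocabulary. A real system would additionally need outputs that are plausible as whole programs or data, not just token by token.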

Index Terms

Hiding critical program components via ambiguous translation
1. Security and privacy
  1. Software and application security
    1. Software reverse engineering
    2. Software security engineering

Published in
ICSE '22: Proceedings of the 44th International Conference on Software Engineering
May 2022
2508 pages
ISBN:9781450392211
DOI:10.1145/3510003
General Chair: Matthew B Dwyer (University of Virginia)
Program Chairs: Daniela Damian (University of Victoria, Canada), Andreas Zeller (CISPA, Germany)
Copyright © 2022 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
program translation
reverse engineering
software protection
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate: 276 of 1,856 submissions, 15%
