2025 IEEE International Conference on Cyber Security and Resilience


Summary:

Recovering meaningful function names from stripped executables is a difficult problem in software reverse engineering and security analysis. When binaries lack debug symbols, functions receive only generic names, which complicates both manual and automated analysis. Building on AsmDepictor, this work proposes architectural refinements that reduce repetitive and ambiguous predictions, together with an API-based integration of a Large Language Model (LLM) to further improve function name suggestions. The proposed enhancements are evaluated on 26 million functions extracted from approximately 6,800 open-source Windows binaries compiled with MSVC 2017-2022 for both x86 and x86_64 architectures. Preliminary evaluations incorporating the LLM show notable reductions in repetitive naming errors. Future directions include fine-tuning BERT-based masked language models on sequences of function names and incorporating graph structure to better capture contextual relationships. Overall, the approach highlights the potential to restore relevant semantic information in stripped binaries, supporting more efficient reverse engineering and malware analysis.

Author(s):

Remus Petrache
Romania

Camelia Lemnaru
Romania

 

