Summary:
This research aims to improve malware classification by focusing on malware-specific characteristics and applying feature engineering to enrich datasets, so that machine learning (ML) models classify malicious traffic more accurately. It also examines different dataset labelling strategies to determine whether it is better to label only the infection indicators of compromise (IoCs) or the whole communication between the adversary and the victim. The models are trained on datasets of real captured malware traffic, refined with malware-specific engineered features to enhance detection capabilities. Indicative results show that Random Forest and Extreme Gradient Boosting classifiers achieve over 98% accuracy in identifying malware families such as WarmCookie, FormBook, and AgentTesla using malware-unique network-based parameters. These findings contribute to advancing intrusion detection systems (IDS) by providing an automated, network-based threat detection methodology.
Author(s):
Martin Martijan
Vilnius University
Lithuania
Virgilijus Krinickij
Vilnius University
Lithuania
Linas Bukauskas
Vilnius University
Lithuania