I'm working on a cybersecurity application, and one of its features uses machine learning to detect SQL injection attacks. I'm in the research phase right now, looking at data collection and preprocessing. So far I've written a script that extracts recognizable SQL statements from Apache logs.
I started looking into doing something similar for IIS logs and came across an article about encoded attack statements in request query parameters, which are visible in the logs. The article focused only on attacks that use the SQL CAST() function with statements encoded as hexadecimal. Since the encoding is applied to the query parameters, I assume similar encoded statements can show up in Apache logs as well. Are there other encoding methods (besides SQL functions such as CAST() and CONVERT()) that would appear in access log files? I've also read papers that don't consider encoded statements at all. Does that mean encoded statements are rare, or that they don't occur at all?
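To make the pattern concrete, here is a minimal Python sketch of the decoding step I have in mind for a logged query string. It is only an illustration under assumptions: the sample request is made up (not taken from a real log), `decode_hex_literals` is a hypothetical helper name, and it assumes the payload appears as a `0x...` hex literal containing ASCII/UTF-8 text.

```python
import re
from typing import List
from urllib.parse import unquote_plus

# Matches a hex literal such as 0x53454C454354 (assumed payload form).
HEX_LITERAL = re.compile(r"0x((?:[0-9a-fA-F]{2})+)")


def decode_hex_literals(query: str) -> List[str]:
    """Percent-decode a logged query string, then decode any 0x... literals in it."""
    text = unquote_plus(query)  # undo URL percent-encoding first
    decoded = []
    for match in HEX_LITERAL.finditer(text):
        raw = bytes.fromhex(match.group(1))
        # Assumption: the bytes are ASCII/UTF-8; invalid bytes are replaced, not raised.
        decoded.append(raw.decode("utf-8", errors="replace"))
    return decoded


# Made-up query string for illustration only.
sample = (
    "id=1;DECLARE%20@s%20VARCHAR(4000);"
    "SET%20@s=CAST(0x53454C4543542031%20AS%20VARCHAR(4000));EXEC(@s)"
)
print(decode_hex_literals(sample))
```

The idea would be to run something like this as a preprocessing step, so that the hidden statement is exposed before feature extraction.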
I'm not sure what else to research to answer this question or to point me in the right direction; none of my efforts have turned up anything useful so far. I'd appreciate any help! Thanks.