I woke up this morning and decided that it was a good idea to write a blog post about one of my favorite subjects, Antivirus Evasion. For many years I was torn about writing on this topic, because the knowledge may help not only pentesters during their engagements, but also blackhat hackers in their malicious activities. However, we have been hearing a lot about new antivirus technologies that have emerged in the last 10 years, like behavior analysis, emulation engines, sandboxing, cloud-based reputation services, artificial intelligence and machine learning. Are these technologies just security buzzwords, or do they actually deliver? How far have the antivirus vendors managed to evolve their products from the traditional signature-based antiviruses that existed 10-15 years ago? Well, I decided to find out.

So two weeks ago, on a Saturday morning like this one, I wrote a small Python script that modifies the text strings in a compiled executable file in order to make it undetectable by antivirus products. I don’t mean one or two strings but hundreds, with their letter case randomly modified or converted to leetspeak. This way I could make my executable undetectable by many antivirus products without having to hunt down the signature used by each and every one of them. Bear in mind that some antivirus products use signatures that are not text based but binary (assembly instructions), and my script will not evade those. The purpose of the PoC script was to check how much antivirus products depend on text-based signatures rather than binary ones. For my tests I chose a precompiled version of PrintSpoofer (a privilege escalation tool) because it is a small file with long text strings, which makes it a good candidate for demo purposes. You can get the source code at https://github.com/itm4n/PrintSpoofer or the compiled version at https://github.com/dievus/printspoofer.

Running my Python PoC script with the -h option provides usage and parameter information. You can find the PoC in our GitHub repository here.

To work properly, the script needs as input a file containing the strings that it will search for and replace in the source file (i.e. the binary executable file). An easy way to get them is by using the Linux strings command or its equivalent for MS Windows, which you can download from http://ftp.gnu.org/gnu/binutils/ or Sysinternals (https://docs.microsoft.com/en-us/sysinternals/downloads/strings).

The default setting of the Python script is to search and replace ASCII strings in the source binary file. However, some executables also contain Unicode (UTF-16) strings, which should be replaced as well since they can be part of an antivirus’s signature pattern. Therefore, I need to extract both the ASCII and the Unicode strings from the binary file.
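For readers who do not have the strings utility at hand, the extraction can be approximated in a few lines of Python. This is a minimal sketch and not part of the PoC: the function name extract_strings and the minimum length of 4 are assumptions made for illustration, and it only recognizes printable ASCII characters and their UTF-16LE form.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Return (ascii_strings, utf16_strings) found in a blob of bytes.

    Hypothetical helper for illustration; real tools handle more encodings.
    """
    printable = rb"[\x20-\x7e]"
    # Runs of at least min_len printable single-byte characters.
    ascii_re = re.compile(printable + rb"{%d,}" % min_len)
    # Runs of printable characters each followed by a zero byte (UTF-16LE).
    utf16_re = re.compile(rb"(?:" + printable + rb"\x00){%d,}" % min_len)

    ascii_strings = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    utf16_strings = [m.group().decode("utf-16-le") for m in utf16_re.finditer(data)]
    return ascii_strings, utf16_strings

# Small in-memory demo; for a real file use open(path, "rb").read().
sample = (b"\x00\x01" + b"\\pipe\\spoolss" + b"\x00\xff"
          + "Usage text".encode("utf-16-le") + b"\x00\x00")
ascii_found, utf16_found = extract_strings(sample)
print(ascii_found)
print(utf16_found)
```

Keeping the two lists separate mirrors the workflow in this post, where the ASCII and Unicode strings are stored in different files and processed in separate runs.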

You should be careful with your selection of text strings, because some of the ones extracted using the strings command will be part of the executable’s PE header, others will be binary instructions whose hex bytes happened to show up as regular ASCII characters, and others will be the names of library functions imported from DLLs, which the Windows loader uses to fill in the Import Address Table. If you change these strings, your modified executable file will fail to run properly. Samples of these strings are shown below.

In the end I selected only a single ASCII string to modify (shown below). Modifying the rest would most probably break the executable.

As you can see, the string above is the path to the Print Spooler service’s named pipe. If I modify it extensively, I could break the executable’s functionality. Therefore, I will set the Python script to run only the letter case modification algorithm (-c) and avoid the leetspeak mode (the default mode when no argument is given). If MS Windows encounters the named pipe \p1p3\sp00ls5 it will definitely complain, whereas \PipE\SpOOlSs is more Windows friendly (remember that Windows path names are case-insensitive).
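The difference between the two modes is easy to see on the string itself. The following is a minimal sketch, not the PoC's actual code: the function names and the substitution table are assumptions, and this version substitutes every mapped letter rather than a random subset.

```python
import random

# Hypothetical substitution table; the PoC's real mapping may differ.
LEET = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}

def random_case(text: str, rng: random.Random) -> str:
    """Randomly upper- or lower-case each character (length is unchanged)."""
    return "".join(ch.upper() if rng.random() < 0.5 else ch.lower() for ch in text)

def leet(text: str) -> str:
    """Replace mapped letters with look-alike digits (length is unchanged)."""
    return "".join(LEET.get(ch.lower(), ch) for ch in text)

pipe = r"\pipe\spoolss"
cased = random_case(pipe, random.Random(1234))
leeted = leet(pipe)

# Case changes survive a case-insensitive comparison, leetspeak does not.
print(cased.lower() == pipe.lower())   # True
print(leeted)                          # \p1p3\5p00l55
print(leeted.lower() == pipe.lower())  # False
```

Both transformations keep the string length, so nothing else in the file moves, but only the letter case variant still names the same pipe.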

Now that I have finished with the ASCII strings, I need to work on the Unicode ones. I have generated a file with the Unicode strings of the PrintSpoofer executable which looks like this.

It is evident that I have selected text sentences used by the binary executable to display usage instructions, progress status messages and error messages. No third-party DLLs, function names, variable names or instructions were included. Therefore, the selected strings can be modified using the leetspeak algorithm, since the chance of breaking the program’s functionality is very small.

The only Unicode strings that I kept out of the leetspeak list were again the named pipe strings because, as I mentioned before, modifying them that way could break the program. Although they are in Unicode this time, they will again only have their letter case changed, and no leetspeak modifications will be performed on them.

Normally I should sort my lists based on their length (longest first) and remove any duplicate entries that may exist, but I will leave this as an exercise to the reader. Also bear in mind that with most executables you will end up having single words in your ASCII and Unicode lists. You should manually review them and remove the ones that may be included in paths, function names or variable names. For example, if you include the word “Length” in a list that will be modified using the leetspeak algorithm, you may end up modifying a function or variable name that includes this text (e.g. ObjectLength).
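For those who want a head start on that exercise, here is a minimal sketch of the clean-up step. The function name prepare_list is hypothetical, and the reasoning for the longest-first order is an inference rather than something the PoC documents: it presumably prevents a short entry from being replaced inside a longer entry before the longer one has been matched.

```python
def prepare_list(lines):
    """Drop blanks and duplicates, then sort longest first.

    Hypothetical helper; `lines` is any iterable of strings, e.g. an open
    text file containing one extracted string per line.
    """
    seen = set()
    unique = []
    for line in lines:
        entry = line.rstrip("\r\n")
        if entry and entry not in seen:
            seen.add(entry)
            unique.append(entry)
    # sorted() is stable, so entries of equal length keep their file order.
    return sorted(unique, key=len, reverse=True)

raw = ["Length\n", "ObjectLength\n", "Length\n", "\n", "Usage"]
print(prepare_list(raw))  # ['ObjectLength', 'Length', 'Usage']
```

Note that this does not replace the manual review: an entry such as "Length" still has to be removed by hand if it can appear inside a function or variable name.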

Now that I have my Unicode list I can proceed with modifying the executable file. First I modify the named pipe strings using the letter case modification algorithm (-c) in combination with the Unicode mode (-u).

Once done, I proceed with the leetspeak mode using the list of Unicode strings I have already set up.

If I open the final executable in a hex editor I can verify that the text strings were modified successfully. First I check whether the ASCII string was changed correctly (i.e. letter case only, no leetspeak).

Then I check whether the Unicode strings were modified successfully using the leetspeak mode.
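The same check can be scripted instead of eyeballed in a hex editor. The sketch below is an illustration only, with a hypothetical helper named occurrences: it counts how often a string still appears in its original form and how often it appears when letter case is ignored. Note that bytes.lower() only folds ASCII letters, which is enough for the strings discussed here.

```python
def occurrences(data: bytes, text: str, encoding: str = "ascii"):
    """Return (exact, case_insensitive) match counts of text in data.

    Use encoding="utf-16-le" for the Unicode strings.
    """
    needle = text.encode(encoding)
    exact = data.count(needle)
    case_insensitive = data.lower().count(needle.lower())
    return exact, case_insensitive

# In-memory stand-in for a modified file; use open(path, "rb").read() instead.
modified = b"\x90\x90" + b"\\PipE\\SpOOlSs" + b"\x00" + "Us4g3".encode("utf-16-le")

print(occurrences(modified, "\\pipe\\spoolss"))       # (0, 1)
print(occurrences(modified, "Usage", "utf-16-le"))    # (0, 0)
```

A case-modified string should report no exact match but one case-insensitive match, while a string rewritten in leetspeak should report no match at all.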

As a final step I run the final executable to verify that I didn’t break anything and that it functions properly.

I execute the modified version of PrintSpoofer as an administrator (i.e. in a High Integrity process) in order to have the required SeImpersonatePrivilege, which allows me to test the program up to the point of successful privilege escalation. It looks like the program is working fine.

As a side note, if you want to use the script in live engagements, you get two benefits. First, once you set up your string lists for an executable (e.g. PrintSpoofer, Mimikatz) you can reuse them on newer versions of that binary file as well. Second, if your modified executable ever gets caught by an antivirus product that you had previously bypassed using this technique, chances are you can quickly generate another undetectable version from your existing string lists, without losing valuable time during your pentest engagement.

All that is good, but what about the newly generated executable’s antivirus detection results? The original precompiled version of PrintSpoofer that I downloaded from GitHub had a detection rate of 40/69, whereas the modified version has 19/72. For those of you who want to check it out yourselves, here are the file hashes:

Original file SHA-256: 9D6F82C75B90CFAD4907CF4EB8EC1ED57B21725A24257F444AF46D9C486BA0CB

Modified file SHA-256: CBB2AF9E99347C52FA00E16938D8540D97A4BF62A6BF4DEC453A3C83FF603A81
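If you want to confirm that you are looking at the same files, you can compute the hash locally and compare it with the values above. A minimal sketch using Python's standard hashlib module follows; the function name sha256_hex and the chunk size are arbitrary choices.

```python
import hashlib

def sha256_hex(path: str, chunk_size: int = 65536) -> str:
    """Return the upper-case SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()
```

Calling sha256_hex on your local copy of the file and comparing the result with the hashes listed above is enough; the digest is upper-cased only to match the way the hashes are written in this post.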

Going from 40/69 (about 58%) to 19/72 (about 26%) cuts the detection rate by more than half, which I think is quite good for a simple letter case or leetspeak modification script. It proves my point.
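The figures can be checked with a few lines of arithmetic. Because the two scans used a different number of engines (69 and 72), the sketch below compares rates rather than raw counts; the variable names are only for illustration.

```python
original = 40 / 69   # detections / engines for the original file
modified = 19 / 72   # detections / engines for the modified file

# Share of the original detection rate that disappeared.
relative_drop = 1 - modified / original

print(f"original: {original:.1%}")        # original: 58.0%
print(f"modified: {modified:.1%}")        # modified: 26.4%
print(f"relative drop: {relative_drop:.1%}")  # relative drop: 54.5%
```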

With this Proof of Concept exercise we have evidence that there are many antivirus products out there that still depend heavily on pattern matching and signature strings to detect malware. I think the antivirus market should really move away from traditional malware detection techniques and invest more in new technologies that will be harder to bypass.

Stay tuned for more posts like this one.