Python Malware: Persistence with Windows

Red Skål
Jan 22, 2021

When it comes to malware, persistence is a high priority for developers. It allows you to maintain access, increase propagation, or maximise infection rates (depending on your malware’s intent). My focus here will be on remote control backdoors, but I’m borrowing concepts from viruses, so it should be applicable across the board.

Years ago, when I first took an interest in all things InfoSec-related, the go-to method for persistence on Windows systems was the well-documented registry addition. Every now and again a developer would have the sense to hijack one of these entries and use their malware as a proxy. Similar hijacking methods can be used against shortcut files in the Startup directory, shell extensions in the registry, and so on. The possibilities are vast.

Figure: Selecting a library to infect (routine for searching the library directory and selecting a file at random).

Alas, these methods leave a trail on the filesystem, and modern tools can reveal them fairly easily. So what are the other options? Well, if I were using binary malware today, I would create a parasitic infection of a program I know is set to start with Windows.

Forcing a self-decrypting payload into another binary has been written about for many years. I recommend VX Underground for tutorials — 29A are a great starting point.

It’s worth taking into account whether we want the malware set to be ‘always-on’. That adds noise and negates some of the stealth we gain by using parasitic persistence methods. Therein lies the beauty of using Python malware in commercial environments.

Python is arguably the most popular scripting language at this point in time. It makes automation easy and lends itself to parsing pretty well. It stands to reason that it will be present in environments where system administrators use scheduled Python scripts to perform routine tasks daily, or maybe weekly, to ease their workload. This is the vulnerability we exploit in this situation: the human propensity to reduce the effort required.

The problem with infecting Python scripts is that we would need a good idea of where they are kept on disk. This leads to more fine-tuning than is sometimes realistic to entertain. So what is our next best option?

The majority of Python scripts will import libraries to complete their tasks. There are some usual suspects among these that we could target specifically. However, limiting our target scope too much means we again lose some of the stealth we’re working towards; we want to be as unpredictable as possible. This does risk limiting how often we execute, but that’s a necessary risk to avoid detection. Obviously the trade-off between stealth and reliability depends on the task you’re performing, but it’s a decision that needs to be made.

My proof-of-concept implementation will attempt to infect any Python file within the Libraries directory. If I were to rework it into something more stable, I would probably give my library-hunting class a whitelist of libraries. That can then be fine-tuned for each engagement after some reconnaissance. I think that would be the most logical trade-off between stealth and reliability for red teams/pentesters.

The Python Infection Engine (PIE) that I have developed attempts to infect Python libraries by randomly selecting a host, parsing said host, then removing any leading comment section and dumping it back into the host file.

After the comments, we assume there will be a section of imports. We parse this, making note of the existing imports, then we add whichever of our payload’s imports are not already present and dump the result to file. It’s worth noting that my PoC is a little unstable in this part; I believe that is due to the incredibly inconsistent commenting conventions among the Python libraries. I also haven’t accounted for imports in the form of from x import (a, b, c) spread across multiple lines. That will come if I develop this into something more than a PoC.

To achieve parasitic infection we need to inject our payload class and create an instance of it. In order to do that, we parse the remainder of the host file for function and class declarations and add them to two lists. An entry point for our instance creation is selected at random; from that index we count backwards to the first available point for payload injection. Using these indices, we then write the remainder of the host file content, plus our additions, to disk.

Figure: Enumerate host for possible entry points and payload areas (host file enumeration routine).

There are lots of improvements that can be made to my implementation. Stability is probably the biggest issue with my code; certain scripts with odd layouts tend to cause crashes around the import parsing routine. I think the best mitigation for this is to narrow the scope of targeted libraries. This would also solve the issue of infrequent executions, but may require prior reconnaissance of the host system. Some frequency analysis on known scripts may also make for a nice, curated list of libraries.

My code can be found on my GitHub.

Red Skål

Tech hobbyist hoping for a new career in InfoSec. Currently a field engineer. Keen on reverse engineering, malware research, pentesting and CTFs.