The DGA of CoreBot


These are just unpolished notes. The content likely lacks clarity and structure, and the results might be incomplete or not adequately verified.


The DGA in this blog post has been implemented by the DGArchive project.


For more information about the malware in this blog post see the Malpedia entry on Corebot.

Recently, IBM’s Security X-Force researchers analysed and reported a new banking trojan called CoreBot. They note that CoreBot features an inactive domain generation algorithm (DGA). The DGA has since been activated as observed by Kleissner & Associates who sinkholed some of the domains.

Since I couldn’t find a description of the DGA elsewhere, here is my short write-up about the DGA of CoreBot. I looked at this sample provided by @benkow_ and referenced in a tweet by @Bry_Campbell. Here are the first 10 DGA domains from the malwr report:

Edit 2015-09-28: The analysed sample turned out to be a debugging exemplar. I revised the post to highlight the difference.


The DGA is configured with the following routine init_dga_config:

Call to init_dga_config

The meanings of these values are:

  • charset_len: This is the length of the charset array containing ASCII characters used for the DGA. The actual array is initialized later (see below).
  • r: this is the random number, initialized to the hardcoded seed 1DB98930. Other samples of CoreBot use a different hardcoded seed, see Section Samples in the Wild.
  • len_l: this is the inclusive lower bound on the length of the generated subdomains.
  • len_u: this is the exclusive upper bound on the length of the generated subdomains.

The set of characters for the domains is initialized as follows:


This code fills the charset array with “abcdefghijklmnopqrstuvwxy012345678”. Note that “z” and “9” are missing due to an off-by-one error. This bug seems to be widespread among VXers: Necurs, Ramnit, and Ranbyus all have similar errors that lead to a missing “z”. Edit 2015-09-17: Tinba, Geodo/Emotet, and Cryptolocker also have the missing “z” problem, thanks to Daniel Plohmann for pointing that out.
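The off-by-one can be modelled in a few lines of Python. This is only a sketch of the effect, not the malware's native code: it assumes the loop stops one character early, which is exactly what Python's half-open range() does. The name intended is hypothetical and shows what the charset would look like without the bug.

```python
# range() excludes its end point, so 'z' and '9' are never added.
letters = [chr(x) for x in range(ord('a'), ord('z'))]
digits = [chr(x) for x in range(ord('0'), ord('9'))]
charset = "".join(letters + digits)
print(charset, len(charset))

# Hypothetical "intended" charset with the end points included.
intended = "".join(chr(x) for x in range(ord('a'), ord('z') + 1)) + \
           "".join(chr(x) for x in range(ord('0'), ord('9') + 1))
print(intended, len(intended))
```

The buggy charset has 34 characters instead of 36, which also changes every modulo operation that picks a character.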


The DGA is time dependent. The time is determined by making an HTTP request to Google:

Request to Google

… and querying the date and time with the WinHTTP function WinHttpQueryHeaders:

Systemtime from Google’s Response Header
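To illustrate the idea of taking the date from a server response instead of the local clock, here is a minimal Python sketch. It is not what the malware does internally (that goes through WinHTTP); it only parses a Date header value into year, month and day. The helper name date_from_http_header and the example header value are made up for illustration.

```python
from email.utils import parsedate_to_datetime

def date_from_http_header(date_header):
    """Parse an HTTP Date header value and return (year, month, day)."""
    dt = parsedate_to_datetime(date_header)
    return dt.year, dt.month, dt.day

# Made-up example value in the usual HTTP date format.
example_header = "Tue, 08 Sep 2015 12:00:00 GMT"
year, month, day = date_from_http_header(example_header)
print(year, month, day)
```

Relying on a remote clock makes the generated domains independent of a wrong or manipulated system time on the infected machine.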

My sample later overwrites the day with 8. While this could be to reduce the granularity of the DGA from days to months, it is more likely a debugging measure:

Overwriting the day

The next screenshot shows another sample that doesn’t overwrite the day. Notice that the offsets line up nicely; the two samples are identical except for the removed “day ← 8” statement.

Without debug

Apart from the year, month, and day (set to 8), there is a fourth value used for seeding, the group. This value is read from the configuration:

reading the group

In my sample the returned value was NULL, and the group was set to 1. I have yet to see a sample that uses the config value.

The year, month, day (set to 8) and the group are then applied to the random number:


The above disassembly boils down to:

	r = r + year + ((group << 16) + (month << 8) | day)
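As a worked example of the formula, the following sketch seeds the random number for the debug sample on 2015-09-08 with group 1. The function name seed_state is hypothetical, and the 32-bit mask is an assumption taken from the Python script in this post. Note the operator precedence: + binds tighter than |, so the day is OR-ed into the sum of the shifted group and month.

```python
def seed_state(seed, year, month, day, group=1):
    """Mix the date and the group into the hardcoded seed."""
    # Explicit parentheses spell out what the one-liner evaluates to.
    date_part = ((group << 16) + (month << 8)) | day
    return (seed + year + date_part) & 0xFFFFFFFF

# Debug sample seed from this post, date 2015-09-08, group 1.
r = seed_state(0x1DB98930, 2015, 9, 8)
print(hex(r))  # 0x1dba9a17
```

Here the date part is 0x10908 (group 1, month 9, day 8), and adding the year 0x7DF to the seed gives 0x1DBA9A17.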


The DGA itself is very simple. It generates up to 40 subdomains (configurable with core.dga.domains_count) using the common linear congruential generator with multiplier 1664525 and increment 1013904223:

the dga

The disassembly decompiles to:

	r = (1664525 * r + 1013904223) & 0xFFFFFFFF
	domain_len = len_l + r % (len_u - len_l)
	domain = ""
	for i in range(domain_len):
	    r = (1664525 * r + 1013904223) & 0xFFFFFFFF
	    domain += charset[r % charset_len]
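The length calculation can be checked in isolation. The sketch below assumes len_l = 0xC and len_u = 0x18, the values used in the Python script of this post; the helper names lcg_next and label_length are hypothetical. It runs the generator for a while and records which lengths occur.

```python
MASK32 = 0xFFFFFFFF

def lcg_next(r):
    """Advance the linear congruential generator by one step."""
    return (1664525 * r + 1013904223) & MASK32

def label_length(r, len_l=0xC, len_u=0x18):
    """Map a random value to a length in the half-open range [len_l, len_u)."""
    return len_l + r % (len_u - len_l)

r = 0x1DBA9A17  # arbitrary 32-bit starting state, for illustration only
lengths = set()
for _ in range(1000):
    r = lcg_next(r)
    lengths.add(label_length(r))
print(sorted(lengths))
```

Because the upper bound is exclusive, no generated subdomain is ever longer than 23 characters.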

Python Code

The following Python code generates the domains for any given date. It takes the following arguments:

  • -s, --seed: the seed as a hex string. If none is provided, the script uses 1DBA8930
  • -d, --date: the date for which to generate the domains. If none is provided, then the current date is used. If you would like the domains for the debug sample, use the next option, --debug.
  • -t, --debug: overwrite the day with 8, as the debug sample in this blog post does.
  • -n, --nr: number of domains to generate, default 40.

You can also find the code in my GitHub repository:

	import argparse
	from datetime import datetime

	def init_rand_and_chars(year, month, day, nr_b, r):
	    # mix the date and the group (nr_b) into the hardcoded seed
	    r = (r + year + ((nr_b << 16) + (month << 8) | day)) & 0xFFFFFFFF
	    # range() excludes its end point, reproducing the missing 'z' and '9'
	    charset = [chr(x) for x in range(ord('a'), ord('z'))] +\
	            [chr(x) for x in range(ord('0'), ord('9'))]
	    return charset, r

	def generate_domain(charset, r):
	    len_l = 0xC    # inclusive lower bound on the length
	    len_u = 0x18   # exclusive upper bound on the length
	    r = (1664525*r + 1013904223) & 0xFFFFFFFF
	    domain_len = len_l + r % (len_u - len_l)
	    domain = ""
	    for i in range(domain_len, 0, -1):
	        r = ((1664525 * r) + 1013904223) & 0xFFFFFFFF
	        domain += charset[r % len(charset)]
	    domain += ""   # suffix (top and second level) goes here
	    print(domain)
	    # return the state so the next domain continues the sequence
	    return r

	if __name__=="__main__":
	    parser = argparse.ArgumentParser()
	    parser.add_argument("-s", "--seed", help="seed", default="1DBA8930")
	    parser.add_argument("-d", "--date", help="date for which to generate domains")
	    parser.add_argument("-t", "--debug", help="debug DGA (day set to 8)",
	            action="store_true")
	    parser.add_argument("-n", "--nr", help="nr of domains to generate",
	            type=int, default=40)
	    args = parser.parse_args()
	    d = datetime.strptime(args.date, "%Y-%m-%d") if args.date else datetime.now()
	    day = 8 if args.debug else d.day

	    charset, r = init_rand_and_chars(d.year, d.month, day, 1,
	            int(args.seed, 16))
	    for _ in range(args.nr):
	        r = generate_domain(charset, r)

Samples in the Wild

The sample in this blog post (first entry in the following table) turns out to be a special case: the day is set to 8 for debugging purposes, and the seed differs slightly from that of the “productive” samples. All other samples have the same seed.

c40a5db6c20ba4316edd64d612481c41 [2] | 1DBA8930 | unknown [3]

[1] meaning the day is set to 8.
[2] md5 sum of the JavaScript that dropped CoreBot.
[3] the sample was submitted September 8th.


The following table summarizes the properties of Corebot’s DGA:

seed: magic number and current date
granularity: 1 day
domains per seed and day: 40
wait time between domains: none
top and second level:
third level characters: lower case letters except ‘z’, digits except ‘9’
third level length: 12 to 23 characters