cover image for post 'Yet Another Bazar Loader DGA'

Yet Another Bazar Loader DGA

Table of Contents
Disclaimer

These are just unpolished notes. The content likely lacks clarity and structure; and the results might not be adequately verified and/or incomplete.

Changes
  • 2021-03-23 21:24:10: second seed and details for the IP transformation
Aliases

The malware in this blog post is also known as BazarBackdoor, Team9Backdoor, BazDor, BazarLoader and BazaLoader

Malpedia

For more information about the malware in this blog post see the Malpedia entry on BazarBackdoor.

URLhaus

This page lists malware URLs that are tagged with BazarLoader.

MalwareBazaar

You can download bazaloader samples from this page.

Cover Image Cover Photo by Naim Benjelloun on Pexels

Bazar Loader decided to change its perfectly fine domain generation algorithm (DGA) once again. The change in the algorithm is very minor, but it yields more domain names.

Sample

I looked at this sample:

MD5
9ad20d0e6da3cf135a93bf162a0a8cfb
SHA1
a97893ab95f794cabc261483423f942f552926d0
SHA256
8e244f1a5b4653d6dbb4cc3978c7dd773b227a443361fbc30265b79f102f7eed
Size
288 KB (295616 Bytes)
Compile Timestamp
2021-01-20 19:37:37 UTC
Links
MalwareBazaar, Malpedia, Dropping_sha256, Cape, VirusTotal
Filenames
Preview_report20-01.exe (VirusTotal)
Detections
MalwareBazaar: BazaLoader, Virustotal: 33/76 as of 2021-01-23 07:31:37 - Trojan.Win32.Zenpak.4!c (AegisLab), Backdoor:Win32/KZip.90c5e0b2 (Alibaba), BackDoor.Bazar.55 (DrWeb), Trojan.Win32.Zenpak.bfcu (Kaspersky), Trojan:Win64/Bazarldr.BMB!MSR (Microsoft), Trojan.Win32.Zenpak.bfcu (ZoneAlarm)

it unpacks to this

MD5
63784053ac2f608d94c18b17c46ab5d4
SHA1
e01c814d6a4993c74a2bfb87b1b661fe78c41291
SHA256
c0a087a520fdfb5f1e235618b3a5101969c1de85b498bc4670372c02756efd55
Size
98 KB (100864 Bytes)
Compile Timestamp
2021-01-20 19:10:11 UTC
Links
MalwareBazaar, Malpedia, Dropping_sha256, Dropping_sha256, Cape, VirusTotal
Filenames
none
Detections
MalwareBazaar: BazaLoader, Virustotal: 21/75 as of 2021-01-23 13:37:55 - Gen:Variant.Bulz.163525 (ALYac), Gen:Variant.Bulz.163525 (Ad-Aware), Trojan.Bulz.D27EC5 (Arcabit), Gen:Variant.Bulz.163525 (BitDefender), Gen:Variant.Bulz.163525 (B) (Emsisoft), Gen:Variant.Bulz.163525 (GData), Gen:Variant.Bulz.163525 (MicroWorld-eScan), Trojan:Win32/TrickBot.VSF!MTB (Microsoft), Trojan.TrickBot!8.E313 (TFE:5:6iToUtBEDBC) (Rising)

which finally drops

MD5
7e8eddaef14aa8de2369d1ca6347b06d
SHA1
4543e6da0515bb7d93e930c9f30e40912d495373
SHA256
f29253139dab900b763ef436931213387dc92e860b9d3abb7dcd46040ac28a0e
Size
89 KB (91136 Bytes)
Compile Timestamp
2021-01-18 14:29:29 UTC
Links
MalwareBazaar, Malpedia, Dropped_by_sha256, Cape, VirusTotal
Filenames
none
Detections
MalwareBazaar: None, Virustotal: 19/76 as of 2021-01-23 15:04:35 - Gen:Variant.Bulz.163525 (ALYac), Gen:Variant.Bulz.163525 (Ad-Aware), Trojan.Win32.Bulz.4!c (AegisLab), Trojan.Bulz.D27EC5 (Arcabit), Gen:Variant.Bulz.163525 (BitDefender), Gen:Variant.Bulz.163525 (B) (Emsisoft), Gen:Variant.Bulz.163525 (FireEye), Gen:Variant.Bulz.163525 (GData), Gen:Variant.Bulz.163525 (MicroWorld-eScan)

Difference from the Last Version

The current version is just a slight modification to the version from December. Like the previous version of the algorithm, this version calculates all ordered pairs of 19 consonants and 6 vowels (including y ). These pairs are then permuted based on a fixed value. This value is the same, so the resulting list of 228 pairs is also the same.

The calculation of the first four letters is the same – that is, the selection of the first two pairs of letters: The permuted list of letters is divided into groups of 19 pairs. Then the two digits of the current month determine which group is selected. From these, one pair at a time is randomly – and unpredictably – selected.

First four letters of bazar loader DGA domains

The last four letters (two pairs) are still determined by the two year digits. However, the division of letter pairs into groups is different. Based on the current decade, two letters are chosen from a group of 22 pairs. The groups of 22 pairs partly overlap, so that theoretically after every 10 years identical domains could be generated again. This in contrast to the version from December, where the decade still determined a non-overlapping group of 6 pairs only. The last two letters are picked from groups of 4 — instead of 6 — letter pairs.

Last four letters of bazar loader DGA domains

The DGA still generates 10'000 domains. But because there are 88 potential monthly combinations for the last for letters instead of just 36 previously, the excepted number of unique domains is larger:

$$ \mathbb{E} = 31768 \left( 1 - \left(\frac{31768 -1}{31768 }\right)^{10000} \right) \approx 8579 $$

Since domain names partially repeat after each decade, domains can no longer be uniquely assigned to a seed. But since I strongly doubt that the domain generation algorithm will still have any relevance in a few months, let alone 10 years, the domain to seed tool assumes domains are from the 20s.

IP Transformation

The A record of the domains is encrypted, just as in previous versions. Meaning, the four bytes of the IP are XORed with 0xFE. The IP is then used in URLs of format https://{ip}:443. For example, if the domain omleekyw.bazar has an A record of 220.39.239.27, then the actual contacted URL is https://34.217.17.229:443.

First four letters of bazar loader DGA domains

Reimplementation in Python

This is the new version reimplemented in Python

from itertools import product
from datetime import datetime
import argparse
from collections import namedtuple

Param = namedtuple('Param', 'mul mod idx')
pool = (
    "qeewcaacywemomedekwyuhidontoibeludsocuexvuuftyliaqydhuizuctuiqow"
    "agypetehfubitiaziceblaogolryykosuptaymodisahfiybyxcoleafkudarapu"
    "qoawyluxqagenanyoxcygyqugiutlyvegahepovyigqyqibaeqynyfkiobpeepby"
    "xaciyvusocaripfyoftesaysozureginalifkazaadytwuubzuvoothymivazyyz"
    "hoevmeburedeviihiravygkemywaerdonoyryqloammoseweesuvfopiriboikuz"
    "orruzemuulimyhceukoqiwfexuefgoycwiokitnuneroxepyanbekyixxiuqsias"
    "xoapaxmaohezwoildifaluzihipanizoecxyopguakdudyovhaumunuwsusyenko"
    "atugabiv"
)



def dga(date):
    seed = date.strftime("%m%Y")
    params = [
        Param(19, 19, 0),
        Param(19, 19, 1),
        Param(4, 22, 4),
        Param(4, 4, 5)
    ]
    ranges = []
    for p in params:
        s = int(seed[p.idx])
        lower = p.mul*s
        upper = lower + p.mod
        ranges.append(list(range(lower, upper)))

    for indices in product(*ranges):
        domain = ""
        for index in indices:
            domain += pool[index*2:index*2 + 2]
        domain += ".bazar"
        yield domain


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-d", "--date", help="date used for seeding, e.g., 2020-06-28",
        default=datetime.now().strftime('%Y-%m-%d'))
    args = parser.parse_args()
    d = datetime.strptime(args.date, "%Y-%m-%d")
    for domain in dga(d):
        print(domain)

Edit 23 March, 2021: There is also a version with a different character pool, but otherwise same algorithm (see my GitHub repo for the full code).

pool = (
    "yzewevmeywreomviekwyavygontowaerudsoyr"
    "exvuamtyseweesuvizpituiqowuzoretzemuul"
    "tiazicukoqiwolxuykosupwiymitisneroxeyx"
    "anlekyixxirasiasxoapuxqaohezwooxdigyqu"
    "ziutpavezohexyvyguqyqidyovynumunuwsusy"
    "enxaatyvusivaripfyoftesaysozureginalif"
)

Characteristics

Except for the number of domains per month, the characteristics are the same as for the previous verion:

propertyvalue
typeTDD (time-dependent-deterministic)
generation schemearithmetic
seedcurrent date
domain change frequencyevery month
unique domains per month19·19·22·4 = 31'768
sequencerandom selection, might pick domains multiple times
wait time between domains10 seconds
top level domain.bazar
second level charactersa-z, without j
regex[a-ik-z]{8}\.bazar
second level domain length8