1. Introduction

In this post, I’ll cover how I implemented the GDB Remote Serial Protocol (GDBRSP) for my Game Boy simulator TLMBoy. To see the whole code in action, check out gdb_server.cpp and gdb_server.h in my GitHub repo. While I used the Game Boy as the target architecture, the principles and details presented here apply to other platforms as well. In fact, you just need a GDB for your desired CPU architecture!
Since the Game Boy’s CPU (basically a Z80 clone) isn’t natively supported by GDB, I’ll show you how to get a Z80 GDB first.
If you don’t mind the extra work, you could also extend GDB by adding support for your favorite architecture. But in the case of the Z80, someone already went ahead ;)

2. Motivation

Before we dive into the technical details, let’s answer some simple yet important questions first:

2.1 What is GDBRSP?

GDB Remote Serial Protocol (GDBRSP) is the name of the protocol that GDB uses to communicate with so-called GDBstubs. The protocol defines what packets have to look like and how servers and clients communicate. As transport, usually either TCP or a plain serial connection is used. Extensive documentation can be found in the GDB docs.

2.2 What is it good for?

Because why don’t we just use plain GDB to debug stuff?
Imagine you’re programming a Game Boy simulator like in my case. You end up with a piece of software (a Game Boy simulator) that executes another piece of software (for instance Pokémon Red).
To debug your simulator, you’d probably just use GDB, which is perfectly fine.
But how do you debug the software inside the software (Pokémon Red) from the Game Boy’s perspective?
One common approach is to incorporate a so-called GDBstub into your simulator. This stub receives messages from GDB, for example, via TCP, and translates them to simulator-specific instructions as depicted in the following illustration:

[Figure: GDB talking to the GDBstub inside TLMBoy]

Implementing this stub for your specific simulator requires some work by you, which is mainly covered in this post.
But trust me, having a GDBstub in your simulator is a really cool feature. Because once you have your stub, you can just use the typical GDB frontend and start your debug sessions.
This is why many well-known simulators like QEMU or gem5 also have implemented their own GDBstub.
Before I explain the details of implementing a Game Boy GDBstub, let’s take a look at how to get a GDB with Z80 (the Game Boy’s CPU) support first. If you already have one, feel free to skip the next section. Note: you can also use my TLMBoy’s docker container, which includes said Z80 GDB (start it with z80-unknown-elf-gdb).

3. Getting Z80 GDB

I guess you probably already consulted Google searching for a Z80 GDB, which might have led you to the following GitHub repository. However, most of this code is more than 10 years old, and compiling it is a pain in the *** if you’re using a fairly recent Linux environment.
As it happens, a few months ago (September 2020), some cool guy submitted a patch to the GDB team, including architecture support for Z80 CPUs and even the Game Boy’s modified version. But as stated in the given link, it might take a while until this patch lands upstream. And I guess adding support for an antiquated architecture isn’t really the first item on the maintainers’ priority list…
So, in the meantime, let’s just compile it ourselves!
Fortunately, the glorious Z80 patcher provided a GitHub repository which can be found here. The next steps are just cloning the repository and building that stuff as follows:

git clone https://github.com/b-s-a/binutils-gdb.git
cd binutils-gdb
mkdir build
./configure --target=z80-unknown-elf --prefix=$(pwd)/build --exec-prefix=$(pwd)/build
make
make install

Depending on your preferences, you may want to change things like the build directory or the executable format. Since the Game Boy doesn’t really have an executable format, I just took elf, but other file formats like coff should work as well.
At that point, you should find an executable Z80 GDB in the bin directory:

[Screenshot: directory listing showing the Z80 GDB executable]

4. Exploring the Protocol

4.1 General Considerations

In this section, we’ll take a closer look at the protocol and what GDB expects from us.
As already mentioned, I want to implement a GDBstub for my Game Boy simulator. Depending on your GDBstub, you might have to meet different design considerations at some points. For example, my first few steps were implementing a TCP server (which is not covered in this post), but if you’re implementing a GDBstub for some embedded device, a serial connection might be a better choice. Anyway, let’s get down to business!

The typical GDBRSP packet uses the following pattern:

$packet-data#checksum

It comprises a “$” to indicate the beginning of a packet, some packet data, usually human-readable, and a two-digit hex checksum that is preceded by a “#”. For instance, a packet may look like this:

$m0,8#01

In this case m0,8 tells us to read 8 bytes beginning at memory location 0. The checksum is calculated by summing up the ASCII values of each character of the packet data (“$” and “#” are excluded!) and keeping the lowest 8 bits of the result (which corresponds to modulo 256). Or to formulate it as C++ code:

std::string GdbServer::GetChecksumStr(const std::string &msg) {
  uint checksum = 0;
  for (const char& c : msg) {
    checksum += static_cast<uint>(c);
  }
  checksum &= 0xff;
  return fmt::format("{:02x}", checksum);
}

Or in Python:

def GetChecksumStr(msg):
  return "{:02x}".format(sum(ord(c) for c in msg) & 0xff)

In theory, verifying the checksum doesn’t make sense from our stub’s perspective as the TCP protocol already has some error detection under the hood. But for the sake of completeness, I implemented it anyway.

Checking the checksum is one thing, but how does one check if the message is syntactically correct? The messages used by GDB are pretty simple and don’t contain any nested structures. Hence, I used a big chonky regex to detect all the packets that I want to support:

std::vector<std::string> GdbServer::SplitMsg(const std::string &msg) {
  static std::regex reg(
    R"(^(\?)|(D)|(g))"
    R"(|(c)([0-9]*))"
    R"(|(G)([0-9A-Fa-f]+))"
    R"(|(M)([0-9A-Fa-f]+),([0-9A-Fa-f]+):([0-9A-Fa-f]+))"
    R"(|(m)([0-9A-Fa-f]+),([0-9A-Fa-f]+))"
    R"(|([zZ])([0-1]),([0-9A-Fa-f]+),([0-9]))"
    R"(|(qAttached)$)"
    R"(|(qSupported):((?:[a-zA-Z-]+\+?;?)+))"
  );
  std::vector<std::string> res;
  std::smatch sm;
  regex_match(msg, sm, reg);
  for (uint i = 1; i < sm.size(); ++i) {
    if (sm[i].str() != "") {
      res.push_back(sm[i].str());
    }
  }
  return res;
}

Besides the standard packet, there is also an acknowledge packet + and a not-acknowledge packet -. Every message transmitted via GDBRSP needs a response in the form of + or -. With that in mind, let’s take a look at some first packets that GDB sends to a stub when initiating a connection!
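
The framing and acknowledgment rules described so far fit into a few lines. The following Python sketch is only an illustration: the function names are made up and not taken from TLMBoy, and it ignores the escaping and run-length encoding that the full protocol also defines.

```python
def checksum(data: str) -> str:
    """Two-digit hex checksum: sum of the character codes modulo 256."""
    return "{:02x}".format(sum(ord(c) for c in data) & 0xFF)


def packetify(data: str) -> str:
    """Wrap packet data into the $packet-data#checksum frame."""
    return "${}#{}".format(data, checksum(data))


def unpack(raw: str):
    """Return the packet data if frame and checksum are valid, else None."""
    if len(raw) < 4 or raw[0] != "$" or raw[-3] != "#":
        return None
    data, received = raw[1:-3], raw[-2:]
    return data if received.lower() == checksum(data) else None


def ack_for(raw: str) -> str:
    """'+' acknowledges a well-formed packet, '-' asks for a retransmission."""
    return "+" if unpack(raw) is not None else "-"


print(packetify("m0,8"))    # $m0,8#01
print(ack_for("$m0,8#01"))  # +
print(ack_for("$m0,8#02"))  # -
```

A stub would send the result of ack_for first and only then the actual reply packet.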

4.2 First Contact

To do this, we first need something that accepts GDB’s TCP connection. You can program a small TCP server yourself, or just use a Linux network tool like netcat. For instance:

netcat -l 1337

This starts a TCP server listening on port 1337. As a second step, GDB has to be started and connected, which can be achieved with the following commands:

z80-unknown-elf-gdb
(gdb) set arch gbz80
(gdb) set debug remote 1
(gdb) target remote localhost:1337

With set arch gbz80, we tell GDB to switch to the modified Z80 instruction set that is used by the Game Boy. I also added the set debug remote 1 to make GDB more verbose and provide us with some interesting insights. The connection is finally established with target remote localhost:1337. If everything goes well, netcat should output the TCP messages sent by GDB. Let’s analyze them in the next section!
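
If you prefer a script over netcat, a listener can be sketched with Python’s standard socket module. This is a rough sketch under assumptions: read_packet and sniff are hypothetical helper names, the port matches the example above, and reading byte by byte is slow but keeps the framing logic obvious.

```python
import socket


def read_packet(conn: socket.socket) -> str:
    """Read one '$...#xx' frame, skipping '+'/'-' acks in front of it."""
    buf = b""
    while True:
        byte = conn.recv(1)
        if not byte:                  # peer closed the connection
            return buf.decode("ascii", errors="replace")
        if not buf and byte != b"$":  # ignore anything before the '$'
            continue
        buf += byte
        # A frame ends two characters after the '#'.
        if len(buf) >= 4 and buf[-3:-2] == b"#":
            return buf.decode("ascii", errors="replace")


def sniff(port: int = 1337) -> None:
    """Accept one GDB connection and print the first packet it sends."""
    with socket.create_server(("localhost", port)) as server:
        conn, _ = server.accept()
        with conn:
            print(read_packet(conn))


# sniff()  # blocks until GDB runs: target remote localhost:1337
```

Since this sketch never answers, GDB will eventually give up, just like with netcat. It is only meant for peeking at the first packet.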

4.3 qSupported

The first packet which arrives at our GDBstub looks as follows:

$qSupported:multiprocess+;swbreak+;hwbreak+;qRelocInsn+;fork-events+;vfork-events+;exec-events+;vContSupported+;QThreadEvents+;no-resumed+#df

Using the GDB docs, let’s break down the message into its individual parts. With qSupported, GDB tells us about all the fancy features it supports. This message is not only a statement, it also asks the stub which features it supports. So let’s take a look at the individual parameters and consider which ones we need:

multiprocess: Indicates support of the multiprocess extensions. However, the Game Boy doesn’t really have multiple processes, so there’s no need to support it.
swbreak: Indicates support of software breakpoint stop reason. With a software breakpoint, you basically replace the instruction with another instruction that triggers some behavior detected by the debugger. I chose not to support this as hardware breakpoints are a simpler alternative.
hwbreak: Indicates support of hardware breakpoint stop reason. Hardware breakpoints use special hardware registers that trigger some behavior if, for instance, a specified program counter value is reached. This is quite easy to implement in a simulator, so I chose to support this.
qRelocInsn: Indicates support for relocating instructions, a feature needed for so-called tracepoints. Tracepoints aren’t really interesting for us, so skip them.
fork-events: The Game Boy doesn’t have an OS. Consequently, there are no child processes (forks) to debug. Skip it.
vfork-events: Pretty similar to fork-events. Skip it.
exec-events: Indicates support of the Linux execve system call. Again there’s not really an OS, so we’ll skip that one.
vContSupported: Indicates support for vCont. Might be useful if your system supports multiple threads, which isn’t the case for the Game Boy. Skip it.
QThreadEvents: Again thread-related stuff which we can skip.
no-resumed: More thread-related stuff … skipped.

So, we only support hardware breakpoints. Consequently, the answer looks like this:

$hwbreak+#0f

And the C++ part:

void GdbServer::CmdSupported(const std::vector<std::string> &msg_split) {
  std::string msg_resp;
  if (msg_split[1].find("hwbreak+;") != std::string::npos) {
    msg_resp.append("hwbreak+;");
  }
  msg_resp = Packetify(msg_resp);
  DBG_LOG_GDB("sending supported features");
  tcp_server_.SendMsg(msg_resp.c_str());
}
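
The same negotiation can be written down compactly in Python. This is a minimal sketch, not the stub’s actual code: supported_reply and the stub_features parameter are made up for illustration, and only features that GDB offered with a trailing “+” are considered.

```python
def supported_reply(data: str, stub_features: set) -> str:
    """Build the qSupported reply data from GDB's feature list."""
    _, _, feature_list = data.partition(":")
    offered = {f[:-1] for f in feature_list.split(";") if f.endswith("+")}
    # Advertise only features that GDB offered and the stub implements.
    return ";".join(f + "+" for f in sorted(offered & stub_features))


gdb_query = ("qSupported:multiprocess+;swbreak+;hwbreak+;qRelocInsn+;"
             "fork-events+;vfork-events+;exec-events+;vContSupported+;"
             "QThreadEvents+;no-resumed+")
print(supported_reply(gdb_query, {"hwbreak"}))  # hwbreak+
```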

In general, the minimum set of commands and features that a GDBstub needs to support is relatively small. The gdb docs state:

At a minimum, a stub is required to support the ‘?’ command to tell GDB the reason for halting, ‘g’ and ‘G’ commands for register access, and the ‘m’ and ‘M’ commands for memory access. Stubs that only control single-threaded targets can implement run control with the ‘c’ (continue) command, and if the target architecture supports hardware-assisted single-stepping, the ‘s’ (step) command. Stubs that support multi-threading targets should support the ‘vCont’ command. All other commands are optional.

4.4 vMustReplyEmpty

After sending our response, GDB immediately sends another packet to our stub:

$vMustReplyEmpty#3a

According to the docs, this command tests how our server responds to unknown packets (vMustReplyEmpty is deliberately left undefined). The correct response to an unknown packet is an empty response:

$#00

Apparently, some older stubs would incorrectly respond with an ‘OK’ to unknown packets. To test this, vMustReplyEmpty was introduced. The C++ code looks as follows:

// With: char const *kMsgEmpty = "+$#00";
void GdbServer::CmdNotFound(const std::vector<std::string> &msg_split) {
  tcp_server_.SendMsg(kMsgEmpty);
}

4.5 Hg

GDB doesn’t get tired of sending us packets and directly follows up with:

$Hg0#df

With this command, all following ‘g’ commands (read register) refer to the thread of the given thread id. However, thread id ‘0’ is a special case, as can be read in the gdb docs: A thread-id can also be a literal ‘-1’ to indicate all threads, or ‘0’ to pick any thread. Since this command is not in the minimum set, and we don’t have multiple threads, we can send an empty response (command unknown) again:

$#00

4.6 qTStatus

The next incoming packet is:

$qTStatus#49

GDB is asking us whether a trace experiment is currently running. Well, we’re not supporting tracing anyway, so respond empty:

$#00

4.7 ?

With the ‘?’ packet, GDB asks for a reason why the target halted. Since we’re stopping our process once GDB connects, we have to reply with one of the responses listed in gdb docs. I felt like the following response was a good choice:

$S05#b8

Here ‘S05’ corresponds to the POSIX signal SIGTRAP. It’s the typical signal being triggered when running into a software breakpoint, often leading to a halt. For instance, QEMU uses the same signal in its stub. The guy from this cool tutorial uses SIGTRAP as well. Since the Game Boy doesn’t really have an OS, it doesn’t have POSIX signals either. Hence, it’s more like a dummy answer to satisfy GDB. In theory, using any other signal number should work as well. The C++ looks as follows:

void GdbServer::CmdHalted(const std::vector<std::string> &msg_split) {
  std::string msg_resp = Packetify(fmt::format("S{:02x}", SIGTRAP));
  cpu_->Halt();
  tcp_server_.SendMsg(msg_resp.c_str());
}

4.8 qfThreadInfo, qL, Hc, qC

GDB seems to be happy with our ‘S05’ response and sends us the following packet afterward:

$qfThreadInfo#bb

With that packet, GDB is asking us about which threads are active. We’ll just respond empty as we’re not supporting threads:

$#00

GDB is really persistent about threads and sends us the predecessor of the qfThreadInfo packet:

$qL1160000000000000000#55

Guess what we respond?

$#00

The next incoming packet is:

$Hc-1#09

This packet is similar to the ‘Hg’ packet and indicates that all following ‘c’ packets refer to all threads (-1). Let’s respond with an empty response, as we haven’t changed our opinion about threads in the meantime. The subsequent packet asks for the current thread ID:

$qC#b4

… Insert generic statement about threads here …

4.9 qAttached

GDB seems to be unstoppable and proceeds with the following packet:

$qAttached#8f

Here we have to respond either with ‘1’ indicating that our remote server is attached to an existing process or with a ‘0’ indicating that the remote server created a new process itself. Depending on our answer here, we either get a kill or detach command when invoking ‘quit’. Since I want to keep the Game Boy running even when quitting GDB, the appropriate answer is ‘1’:

$1#31
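
The handshake so far boils down to a small lookup with an empty fallback. The following Python sketch summarizes the replies chosen in this post. The table and function names are made up for illustration, and the actual stub dispatches via the regex shown earlier.

```python
# Replies chosen in this post for packets GDB sends while connecting.
HANDSHAKE_REPLIES = {
    "?": "S05",        # halted, reported as SIGTRAP (signal 5)
    "qAttached": "1",  # attached to an existing process
}


def handshake_reply(data: str) -> str:
    """Return the reply data; unknown packets get the empty reply."""
    return HANDSHAKE_REPLIES.get(data, "")


for data in ["vMustReplyEmpty", "Hg0", "qTStatus", "?", "qfThreadInfo", "qC", "qAttached"]:
    print(data, "->", repr(handshake_reply(data)))
```

Everything thread- or trace-related simply falls through to the empty reply, which GDB interprets as “not supported”.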

4.10 g

The next packet received is:

$g#67

Here GDB wants to read our CPU’s registers. The documentation provides more information about the response format:

Each byte of register data is described by two hex digits. The bytes with the register are transmitted in target byte order. The size of each register and their position within the ‘g’ packet is determined by the GDB internal gdbarch functions DEPRECATED_REGISTER_RAW_SIZE and gdbarch_register_name. When reading registers from a trace frame (see Using the Collected Data), the stub may also return a string of literal ‘x’’s in place of the register data digits, to indicate that the corresponding register has not been collected; thus its value is unavailable.

This means, in order to put the correct register value in the correct place, I have to search through GDB’s source code… I feel like this is not a well-conceived solution, especially if multiple debuggers are used with each having a different ordering of the registers. It would be better if there was some kind of message to define the layout, or if the GDB team would just establish a standard per ISA.

Anyway, I traced the function gdbarch_register_name through z80-tdep.c until I found the corresponding array:

// Frame 2
set_gdbarch_register_name (gdbarch, z80_register_name);

// Frame 1
/* Return the name of register REGNUM.  */
static const char *
z80_register_name (struct gdbarch *gdbarch, int regnum)
{

  if (regnum >= 0 && regnum < ARRAY_SIZE (z80_reg_names))
    return z80_reg_names[regnum];

  return NULL;
}

// Frame 0
static const char *z80_reg_names[] =
{
  /* 24 bit on eZ80, else 16 bit */
  "af", "bc", "de", "hl",
  "sp", "pc", "ix", "iy",
  "af'", "bc'", "de'", "hl'",
  "ir",
  /* eZ80 only */
  "sps"
};

Hence, our response starts with the “af” register and continues up to the “pc” register. Any subsequent registers are marked as unavailable (‘x’) due to the reduced register set of the Game Boy’s CPU. Melting this into C++ code may look like this:

void GdbServer::CmdReadReg(const std::vector<std::string> &msg_split) {
  std::string msg_resp;
  msg_resp = fmt::format("{:04x}{:04x}{:04x}{:04x}{:04x}{:04x}{:x>{}}",
                         std::rotl(cpu_->reg_file.AF.val(), 8), std::rotl(cpu_->reg_file.BC.val(), 8),
                         std::rotl(cpu_->reg_file.DE.val(), 8), std::rotl(cpu_->reg_file.HL.val(), 8),
                         std::rotl(cpu_->reg_file.SP.val(), 8), std::rotl(cpu_->reg_file.PC.val(), 8),
                         "", 7*4);
  DBG_LOG_GDB("reading general registers");
  msg_resp = Packetify(msg_resp);
  tcp_server_.SendMsg(msg_resp.c_str());
}

Please note that the Z80 is a little-endian system, requiring us to send the LSB first. Hence the usage of std::rotl, an amazing new C++20 feature, to swap the two bytes of each 16-bit register. An example response may look like this:

$0000000000000000feff0000xxxxxxxxxxxxxxxxxxxxxxxxxxxx#77

Here only the stack pointer is initialized (SP=0xfffe) while all other registers are 0.
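
To double-check the byte order, here is the same layout as a small Python sketch. The helper names are hypothetical. The decoder assumes that register data coming from GDB, for example in a ‘G’ packet from the minimum set, uses the same layout; I did not verify how GDB fills the unavailable fields in that direction.

```python
REGISTER_NAMES = ("af", "bc", "de", "hl", "sp", "pc")
UNAVAILABLE = 7  # ix, iy, af', bc', de', hl', ir do not exist on the Game Boy


def encode_registers(registers: dict) -> str:
    """Build the 'g' reply: 16-bit values, least significant byte first."""
    fields = [registers[name].to_bytes(2, "little").hex() for name in REGISTER_NAMES]
    return "".join(fields) + "xxxx" * UNAVAILABLE


def decode_registers(payload: str) -> dict:
    """Parse register data in the same layout, skipping 'x' fields."""
    registers = {}
    for index, name in enumerate(REGISTER_NAMES):
        field = payload[4 * index:4 * index + 4]
        if len(field) == 4 and "x" not in field:
            registers[name] = int.from_bytes(bytes.fromhex(field), "little")
    return registers


regs = {"af": 0, "bc": 0, "de": 0, "hl": 0, "sp": 0xFFFE, "pc": 0}
reply = encode_registers(regs)
print(reply)                            # 0000000000000000feff0000xxxxxxxxxxxxxxxxxxxxxxxxxxxx
print(decode_registers(reply) == regs)  # True
```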

5. Connection Established

After answering more than 10 packets, GDB finally seems to be satisfied and offers me its terminal! See the debug log:

(gdb) target remote localhost:1337
Remote debugging using localhost:1337
Sending packet: $qSupported:multiprocess+;swbreak+;hwbreak+;qRelocInsn+;fork-events+;vfork-events+;exec-events+;vContSupported+;QThreadEvents+;no-resumed+#df...Ack
Packet received: swbreak+;
Packet qSupported (supported-packets) is supported
Sending packet: $vMustReplyEmpty#3a...Ack
Packet received:
Sending packet: $Hg0#df...Ack
Packet received:
Sending packet: $qTStatus#49...Ack
Packet received:
Packet qTStatus (trace-status) is NOT supported
Sending packet: $?#3f...Ack
Packet received: S05
Sending packet: $qfThreadInfo#bb...Ack
Packet received:
Sending packet: $qL1160000000000000000#55...Ack
Packet received:
Sending packet: $Hc-1#09...Ack
Packet received:
Sending packet: $qC#b4...Ack
Packet received:
Sending packet: $qAttached#8f...Ack
Packet received: 1
Packet qAttached (query-attached) is supported
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
Sending packet: $g#67...Ack
Packet received: 000000000000000000000000xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Sending packet: $qL1160000000000000000#55...Ack
Packet received:
0x00000000 in ?? ()
(gdb)

Yet we are not done, as some of the mandatory GDB commands aren’t implemented yet (like G, m, M, s, and c). I think the best way to explore them is to look at them in the context of GDB terminal commands. Hence, let’s start with some basic commands such as info registers and then work our way up to stuff like setting breakpoints.

5.1 Reading Registers

The command info registers prints out the values of the CPU’s registers:

(gdb) info registers
af             0x0                 [ ]
bc             0x0                 0
de             0x0                 0x0
hl             0x0                 0x0
sp             0xfffe              0xfffe
pc             0x0                 0x0
ix             <unavailable>
iy             <unavailable>
af'            <unavailable>
bc'            <unavailable>
de'            <unavailable>
hl'            <unavailable>
ir             <unavailable>

As you might see in the debug log, there’s actually no message being sent! This is because GDB already has all the information thanks to the ‘g’ packet that was sent while establishing the connection.

5.2 Displaying Assembly

With display/5i $pc GDB shows us the next 5 assembly instructions:

(gdb) display/5i $pc
1: x/5i $pc
=> 0x0: Sending packet: $m0,1a#5b...Ack
Packet received: 31feffaf21ff9f32cb7c20fb2126ff0e113e8032e20c3ef3e232
Sending packet: $m1a,1a#bd...Ack
Packet received: 3e77773efce0471104012110801acd9500cd9600137bfe3420f3
Sending packet: $m34,c#63...Ack
Packet received: 11d80006081a1322230520f9
ld sp,0xfffe
   0x3: xor a
   0x4: ld hl,0x9fff
   0x7: ld (0x7ccb),a
   0xa: jr nz,0x0007

The debug log reveals that this command comprises a bunch of m packets. For instance, the first incoming packet looks like this:

$m0,1a#5b

A quick lookup in the docs reveals that GDB wants to read a chunk of size 0x1a from memory location 0x00. Nothing easier than that. Let’s code some reply:

void GdbServer::CmdReadMem(const std::vector<std::string> &msg_split) {
  std::string msg_resp;
  std::string addr_str = msg_split[1];
  std::string length_str = msg_split[2];
  uint addr = std::stoi(addr_str, nullptr, 16);
  uint length = std::stoi(length_str, nullptr, 16);
  for (uint i = 0; i < length; ++i) {
    u8 data = cpu_->ReadBusDebug(addr + i);
    msg_resp.append(fmt::format("{:02x}", data));
  }
  DBG_LOG_GDB("reading 0x" << length_str << " bytes at address 0x" << addr_str);
  msg_resp = Packetify(msg_resp);
  tcp_server_.SendMsg(msg_resp.c_str());
}
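
For comparison, the same reply logic as a Python sketch. It assumes a flat bytes object instead of the simulator’s bus, the function name is made up, and the ‘E01’ error reply for out-of-range reads is my own addition that the C++ version above does not have.

```python
def read_memory_reply(data: str, memory: bytes) -> str:
    """Answer an 'm addr,length' packet with the requested bytes as hex."""
    addr_str, length_str = data[1:].split(",")
    addr, length = int(addr_str, 16), int(length_str, 16)
    if addr + length > len(memory):
        return "E01"  # error reply for reads outside the (assumed) flat memory
    return memory[addr:addr + length].hex()


boot_rom_start = bytes.fromhex("31feffaf21ff9f32")  # first bytes from the log above
print(read_memory_reply("m0,8", boot_rom_start))    # 31feffaf21ff9f32
print(read_memory_reply("m3,1", boot_rom_start))    # af
```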

5.3 Step Instruction

As a next typical GDB command, we’ll take a look at si, which is short for step instruction and tells our program to execute the next assembly instruction. So, let’s just take a look at the debug log and see what happens:

(gdb) si
Sending packet: $mffe0,1a#8c...Ack
Packet received: 0000000000000000000000000000000000000000000000000000
Sending packet: $mfffa,6#62...Ack
Packet received: 000000000000
Sending packet: $m0,8#01...Ack
Packet received: 31feffaf21ff9f32
Sending packet: $m3,1#fd...Ack
Packet received: af
Sending packet: $Z0,3,8#4d...Ack
Packet received: OK
Packet Z0 (software-breakpoint) is supported
Sending packet: $vCont?#49...Ack
Packet received:
Packet vCont (verbose-resume) is NOT supported
Sending packet: $Hc0#db...Ack
Packet received:
Sending packet: $c#63...Ack
Packet received: S05
Sending packet: $g#67...Ack
Packet received: 0000000000000000feff0300xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Sending packet: $z0,3,8#6d...Ack
Packet received: OK
Sending packet: $mffe0,1a#8c...Ack
Packet received: 0000000000000000000000000000000000000000000000000000
Sending packet: $mfffa,6#62...Ack
Packet received: 000000000000
Sending packet: $qL1160000000000000000#55...Ack
Packet received:
Sending packet: $mffe0,1a#8c...Ack
Packet received: 0000000000000000000000000000000000000000000000000000
Sending packet: $mfffa,6#62...Ack
Packet received: 000000000000

As you can see, the first few packets are multiple memory reads at different addresses. These reads are issued as GDB wants to know the instructions that follow after the current one. At first, I was like: “Why doesn’t GDB just read program counter + 1?” Well, the next instruction to be executed isn’t necessarily the one at the next program counter address! For example, in the case of a return instruction, GDB has to determine the next instruction by unwinding the call stack. This finally explains why GDB read those 32 bytes beginning from 0xFFE0 (the current stack pointer at that time) and the following instruction (program counter was 0x0 at that time). Warning: There are some cases in which this command might blow up. See section Final Thoughts for more information.

The next packet sent is a ‘Z’ packet telling us to insert a software breakpoint (=0) with kind 8 at address 0x3. But… didn’t we tell GDB that we don’t support software breakpoints in the initialization phase? Well, I tried to reject that packet, but this then led to no breakpoint being inserted at all.

At this point, I was a little unsure about how to proceed and implement stuff. So, I took a look at other emulators/simulators/frameworks, namely qemu, gem5 and vcml, and they all do it the same way:

Every kind of breakpoint, be it software or hardware, is mapped onto some kind of virtual hardware breakpoint. For instance, qemu:

switch (type) {
case GDB_BREAKPOINT_SW:
case GDB_BREAKPOINT_HW:
    CPU_FOREACH(cpu) {
        err = cpu_breakpoint_insert(cpu, addr, BP_GDB, NULL);
        if (err) {
            break;
        }
    }

This method is quite easy to implement and avoids changing the memory’s content. We just insert a given address into a data structure, for example a set, and check in the simulator’s main loop whether we reached one of the breakpoints. This led me to the following implementation:

void GdbServer::CmdInsertBp(const std::vector<std::string> &msg_split) {
  std::string msg_resp = "";
  if (msg_split[1] == "0" || msg_split[1] == "1") {
    msg_resp = "OK";
    uint addr = std::stoi(msg_split[2], nullptr, 16);
    DBG_LOG_GDB("set breakpoint at address 0x" << msg_split[2]);
    bp_set_.insert(addr);
  } else {
    DBG_LOG_GDB("watchpoints aren't supported yet");
  }
  msg_resp = Packetify(msg_resp);
  tcp_server_.SendMsg(msg_resp.c_str());
}

After the breakpoint was set, GDB tells us to continue execution with the ‘c’ packet. My implementation of that is quite simple:

void GdbServer::CmdContinue(std::vector<std::string> msg_split) {
  cpu_->Continue();
}

Our CPU will now continue its execution until it encounters a breakpoint which is already the next instruction in case of si. We tell GDB about this event by sending a SIGTRAP signal:

void GdbServer::SendBpReached() {
  // Packetify() must wrap the payload exactly once.
  std::string msg_resp = Packetify(fmt::format("S{:02x}", SIGTRAP));
  DEBUG_LOG("GDB: sending breakpoint reached");
  tcp_server_.SendMsg(msg_resp.c_str());
}

We then get asked to return the current register data (‘g’) and to remove the current breakpoint (‘z’). Removing the breakpoint is pretty much the same as inserting it, just vice versa:

void GdbServer::CmdRemoveBp(const std::vector<std::string> &msg_split) {
  std::string msg_resp = "";
  if (msg_split[1] == "0" || msg_split[1] == "1") {
    msg_resp = "OK";
    uint addr = std::stoi(msg_split[2], nullptr, 16);
    DBG_LOG_GDB("removed breakpoint at address 0x" << msg_split[2]);
    bp_set_.erase(addr);
  } else {
    DBG_LOG_GDB("watchpoints aren't supported yet");
  }
  msg_resp = Packetify(msg_resp);
  tcp_server_.SendMsg(msg_resp.c_str());
}
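
To see how the breakpoint set and the main loop interact, here is a toy model in Python. TinyTarget is a hypothetical stand-in for the simulator in which every instruction is one byte long. It is not how TLMBoy executes code, but the Z/z/c flow is the same as in the log above.

```python
class TinyTarget:
    """Hypothetical stand-in for the simulator: every instruction is 1 byte."""

    def __init__(self):
        self.pc = 0
        self.breakpoints = set()

    def handle_breakpoint(self, data: str) -> str:
        """Handle 'Z type,addr,kind' (insert) and 'z type,addr,kind' (remove)."""
        bp_type, addr_str, _kind = data[1:].split(",")
        if bp_type not in ("0", "1"):
            return ""  # watchpoints: reply empty, i.e. unsupported
        addr = int(addr_str, 16)
        if data[0] == "Z":
            self.breakpoints.add(addr)
        else:
            self.breakpoints.discard(addr)
        return "OK"

    def continue_until_breakpoint(self, max_steps: int = 1000) -> str:
        """Run until the program counter hits a breakpoint, then report SIGTRAP."""
        for _ in range(max_steps):
            self.pc = (self.pc + 1) & 0xFFFF  # "execute" one instruction
            if self.pc in self.breakpoints:
                return "S05"
        return ""  # still running; a real stub would keep going


target = TinyTarget()
print(target.handle_breakpoint("Z0,3,8"))  # OK
print(target.continue_until_breakpoint())  # S05
print(hex(target.pc))                      # 0x3
print(target.handle_breakpoint("z0,3,8"))  # OK
```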

After that, only a few memory reads follow, and that’s it!

6. Demo: Custom Logo

Nothing beats a fancy demo, so I made a video showing how you can use GDB to boot up the Game Boy with a custom logo:

In the video I used the following command to start the TLMBoy:

./tlmboy -r ../roms/tetris.bin --wait-for-gdb

To attach GDB to the simulation, use:

target remote localhost:1337

Once GDB is attached, the simulation halts at PC=0x0, and you are free to throw in some commands. In my case I want to replace the Nintendo logo with my own custom logo. The logo resides at address 0x104 and upwards, hence I replace this data:

set {char[48]} 0x104 = {0x03, 0x22, 0x09, 0x11, 0x02, 0x2e, 0x07, 0x44, \
     0x02, 0x22, 0x04, 0x45, 0x01, 0x91, 0x0c, 0x00, 0x09, 0xdb, 0x00, \
     0x00, 0x00, 0x00, 0x00, 0x00, 0x22, 0x30, 0x11, 0x90, 0x22, 0x20, \
     0x44, 0x70, 0x22, 0x20, 0x65, 0x40, 0x11, 0x90, 0xc0, 0xc0, 0xb9, \
     0x90, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
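
On the wire, such a set command ends up as a memory write. Assuming GDB uses the hex-encoded ‘M’ packet accepted by the regex earlier (addr,length:data), handling it could be sketched in Python as follows. The function name and the flat bytearray are assumptions for illustration, and bank switching is ignored.

```python
def write_memory(data: str, memory: bytearray) -> str:
    """Apply an 'M addr,length:XX...' packet to memory and return the reply."""
    header, _, payload = data[1:].partition(":")
    try:
        addr_str, length_str = header.split(",")
        addr, length = int(addr_str, 16), int(length_str, 16)
        new_bytes = bytes.fromhex(payload)
    except ValueError:
        return "E01"  # malformed header or hex data
    if len(new_bytes) != length or addr + length > len(memory):
        return "E01"  # length mismatch or write outside the flat memory
    memory[addr:addr + length] = new_bytes
    return "OK"


memory = bytearray(0x10000)                     # flat 64 KiB, no banking
print(write_memory("M104,4:03220911", memory))  # OK
print(memory[0x104:0x108].hex())                # 03220911
```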

As seen in the video, a high-quality “CHCIKEN” logo is rendered instead of the Nintendo logo. However, changing the boot logo results in a bricked boot process. The logo keeps being displayed, but it doesn’t advance past this point. This is Nintendo’s way of preventing the execution of non-licensed games (see my boot post for more information). So, I pressed Ctrl + C and called display/7i $pc to examine the situation. It can be seen that the Game Boy is stuck in the loop which compares the logo in the cartridge against the logo in the boot ROM.

1: x/7i $pc
=> 0xe9:        jr nz,0x00e9
   0xeb:        inc hl
   0xec:        ld a,l
   0xed:        cp 0x34
   0xef:        jr nz,0x00e6
   0xf1:        ld b,0x19
   0xf3:        ld a,b

The easiest way to resolve this awkward situation is to skip the check. With GDB, this can be achieved by advancing the program counter a few instructions:

set $pc = 0xfc

Alternatively, you can reload the Nintendo logo shortly before the check starts (your custom logo will remain displayed):

set {char[48]} 0x104 = {0xce, 0xed, 0x66, 0x66, 0xcc, 0x0d, 0x00, 0x0b, 0x03, 0x73, \
                        0x00, 0x83, 0x00, 0x0c, 0x00, 0x0d, 0x00, 0x08, 0x11, 0x1f, \
                        0x88, 0x89, 0x00, 0x0e, 0xdc, 0xcc, 0x6e, 0xe6, 0xdd, 0xdd, \
                        0xd9, 0x99, 0xbb, 0xbb, 0x67, 0x63, 0x6e, 0x0e, 0xec, 0xcc, \
                        0xdd, 0xdc, 0x99, 0x9f, 0xbb, 0xb9, 0x33, 0x3e}

And with that, Tetris finally starts 😊.

7. Final Thoughts

So, in this post I covered the basics of the GDB Remote Serial Protocol (GDBRSP) and how one can embed it into a Game Boy emulator (or any other application). Due to the enormous scope of GDBRSP, this post just scratched the surface. Nevertheless, I hope that it provides a good starting point for further adventures.

Last but not least, I want to share some limitations, questions, and thoughts that came up during development.

Let’s start with the limited debuggability of the Game Boy’s ROMs. These are basically chunks of handcrafted assembly that don’t require any specific file format or underlying operating system. Consequently, there are no such things as debug symbols or calling conventions that the debugger could use. In some cases, I even observed crashes as GDB was trying to unwind call stacks that weren’t really call stacks. For instance, if you execute step instruction directly after connecting, GDB (or my TLMBoy) will say “goodbye”. This is because GDB tries to determine the call stack with a stack pointer that points to 0, leading to multiple, seemingly random reads from non-mapped addresses. Unfortunately, there’s not much one can do about it except avoid commands that lead to undefined behavior.

Another thing that I didn’t consider at first, but that later required some problem solving, was bank switching. It is used to circumvent the 64 KiB limit imposed by the Game Boy’s 16-bit address bus. With bank switching, some parts of the ROM are swapped out for other parts of the ROM that weren’t directly accessible prior to the switch. This mechanism is triggered by writing a specific value to a specific location. But in debug mode, I might want to write to certain locations to alter the memory’s value, not to trigger a bank switch. So, how can I distinguish between a bank switch and an actual memory write? The best solution I could come up with is so-called custom queries. These can be invoked with monitor data from the GDB terminal. As the name implies, a custom query can convey a custom message that triggers a custom behavior in the stub. Actually, this is so versatile that many other problems can probably be solved with it as well.
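
According to the GDB docs, a monitor command reaches the stub as a ‘qRcmd’ packet with the command text hex-encoded. The Python sketch below shows how such a packet could be decoded. The “bank <n>” command and the state dictionary are purely hypothetical examples, not the commands TLMBoy actually implements.

```python
def decode_monitor_command(data: str):
    """Return the text of a 'qRcmd,<hex>' packet, or None for other packets."""
    prefix = "qRcmd,"
    if not data.startswith(prefix):
        return None
    return bytes.fromhex(data[len(prefix):]).decode("ascii")


def handle_monitor(data: str, state: dict) -> str:
    """Hypothetical 'bank <n>' command that selects a ROM bank explicitly."""
    command = decode_monitor_command(data)
    if command is None:
        return ""     # not a monitor packet
    words = command.split()
    if len(words) == 2 and words[0] == "bank" and words[1].isdigit():
        state["rom_bank"] = int(words[1])
        return "OK"
    return "E01"      # unknown or malformed monitor command


state = {"rom_bank": 1}
packet = "qRcmd," + "bank 2".encode("ascii").hex()  # what "monitor bank 2" would send
print(packet)                         # qRcmd,62616e6b2032
print(handle_monitor(packet, state))  # OK
print(state["rom_bank"])              # 2
```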

So, this finally concludes my post. If there’s any feedback, be it good or bad, feel free to contact me.

8. References

[1] GDB’s online documentation. The first address to consult when questions about GDBRSP packets arise.
[2] QEMU and GDBRSP.
[3] gem5 and GDBRSP.
[4] Quite old GitHub repository containing a GDB with Z80 support.
[5] Discussion about the most recent Z80 GDB patch.
[6] Most up-to-date Z80 GDB GitHub repository.
[7] Super detailed and user-oriented post about GDBRSP.
[8] Cool blog post about the GDBRSP.