Re: [PATCH v4 15/15] docs: Add debugging chapter to development document
From: Daniel Kiper
Subject: Re: [PATCH v4 15/15] docs: Add debugging chapter to development documentation
Date: Wed, 21 Dec 2022 16:50:09 +0100
User-agent: NeoMutt/20170113 (1.7.2)
On Thu, Dec 15, 2022 at 11:29:38PM -0600, Glenn Washburn wrote:
> Signed-off-by: Glenn Washburn <development@efficientek.com>
> ---
> docs/grub-dev.texi | 191 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 191 insertions(+)
>
> diff --git a/docs/grub-dev.texi b/docs/grub-dev.texi
> index f76fc658bf..8171e91c33 100644
> --- a/docs/grub-dev.texi
> +++ b/docs/grub-dev.texi
> @@ -79,6 +79,7 @@ This edition documents version @value{VERSION}.
> * Contributing Changes::
> * Setting up and running test suite::
> * Updating External Code::
> +* Debugging::
> * Porting::
> * Error Handling::
> * Stack and heap size::
> @@ -595,6 +596,196 @@ cp minilzo-2.10/*.[hc] grub-core/lib/minilzo
> rm -r minilzo-2.10*
> @end example
>
> +@node Debugging
> +@chapter Debugging
> +
> +GRUB2 can be difficult to debug because it runs on bare metal and thus
> +does not have the debugging facilities normally provided by an operating
> +system. This chapter aims to provide useful information on some ways to
> +debug GRUB2 for some architectures. It by no means intends to be exhaustive.
> +The focus will be on X86_64 and i386 architectures. Luckily for some issues
s/X86_64/x86_64/
> +virtual machines have made the ability to debug GRUB2 much easier, and this
> +chapter will focus on debugging via the QEMU virtual machine. We will not be
> +going over debugging of the userland tools (e.g. grub-install), as there are
> +many tutorials on debugging programs in userland.
> +
> +You will need GDB and the QEMU binaries for your system. On Debian these
> +can be installed with the @samp{gdb} and @samp{qemu-system-x86} packages.
> +Also it is assumed that you have already successfully compiled GRUB2 from
> +source for the target specified in the section below and have some
> +familiarity with GDB. When GRUB2 is built it will create many different
> +binaries. The ones of concern will be in the @file{grub-core}
> +directory of the GRUB2 build dir. To aid in debugging we will want the
> +debugging symbols generated during the build because these symbols are not
> +kept in the binaries which get installed to the boot location. The build
> +process outputs two sets of binaries, one without symbols which gets executed
> +at boot, and another set of ELF images with debugging symbols. The built
> +images with debugging symbols will have a @file{.image} suffix, and the ones
> +without a @file{.img} suffix. Similarly, loadable modules with debugging
> +symbols will have a @file{.module} suffix, and ones without a @file{.mod}
> +suffix. In the case of the kernel the binary with symbols is named
> +@file{kernel.exec}.
> +
> +In the following sections, information will be provided on debugging on
> +various targets using @command{gdb} and the @samp{gdb_grub} GDB script.
> +
> +@menu
> +* i386-pc::
> +* x86_64-efi::
> +@end menu
> +
> +@node i386-pc
> +@section i386-pc
> +
> +The i386-pc target is a good place to start when first debugging GRUB2
> +because in some respects it's easier than EFI platforms, the reason being
> +that the initial load address is always known in advance. To start
> +debugging GRUB2, first QEMU must be started in GDB stub mode. The following
> +command is a simple illustration:
> +
> +@example
> +qemu-system-i386 -drive file=disk.img,format=raw \
> + -device virtio-scsi-pci,id=scsi0,num_queues=4 -S -s
I would drop "num_queues=4" as a non-essential thing here. Also, num_queues
seems to be deprecated today...
> +@end example
> +
> +This will start a QEMU instance booting from @file{disk.img}. It will pause
> +at start waiting for a GDB instance to attach to it. You should change
> +@file{disk.img} to something more appropriate. A block device can be used,
> +but you may need to run QEMU as a privileged user.
> +
> +To connect to this QEMU instance with GDB, the @code{target remote} GDB
> +command must be used. We also need to load a binary image, preferably with
> +symbols. This can be done using the GDB command @code{file kernel.exec}, if
> +GDB is started from the @file{grub-core} directory in the GRUB2 build
> +directory. GRUB2 developers have made this simpler by including a GDB
> +script which does much of the setup. This file is at @file{grub-core/gdb_grub}
> +of the build directory and is also installed via @command{make install}.
> +If not building GRUB, the distribution may have a package which installs
> +this GDB script along with debug symbol binaries, such as Debian's
> +@samp{grub-pc-dbg} package. The GDB script is intended to be used
> +like so, assuming the script is at @file{/path/to/script/gdb_grub}:
> +
> +@example
> +cd $(dirname /path/to/script/gdb_grub)
> +gdb -x gdb_grub
> +@end example
> +
> +Once GDB has been started with the @file{gdb_grub} script it will
> +automatically connect to the QEMU instance. You can then do things you
> +normally would in GDB like set a break point on @var{grub_main}.
> +
> +Setting breakpoints in modules is trickier since they haven't been loaded
> +yet and are loaded at addresses determined at runtime. The module could be
> +loaded to different addresses in different QEMU instances. The debug symbols
> +in the module's @file{.module} binary are thus always wrong, and GDB needs
> +to be told where to load the symbols to. But this must happen at runtime
> +after GRUB2 has determined where the module will get loaded. Luckily the
> +@file{gdb_grub} script takes care of this with the @command{runtime_load_module}
> +command, which configures GDB to watch for GRUB2 module loading and, when
> +a module is loaded, add the module symbols with the appropriate offset.
> +
> +@node x86_64-efi
> +@section x86_64-efi
> +
> +Using GDB to debug GRUB2 for the x86_64-efi target has some similarities with
> +the i386-pc target. Please be familiar with the @ref{i386-pc} section
> +when reading this one. Extra care must be used to run QEMU such that it boots
> +a UEFI firmware. This usually involves either using the @samp{-bios} option
> +with a UEFI firmware blob (e.g. @file{OVMF.fd}) or loading the firmware via
> +pflash. This document will not go further into how to do this as there are
> +ample resources on the web.
> +
> +Like all EFI implementations, on x86_64-efi the (U)EFI firmware that loads
> +the GRUB2 EFI application determines at runtime where the application will
> +be loaded. This means that we do not know where to tell GDB to load the
> +symbols for the GRUB2 core until the (U)EFI firmware determines it. There
s/There/There are/?
> +two good ways of figuring this out when running in QEMU: use a @ref{OVMF debug log,
> +debug build of OVMF} and check the debug log or have GRUB2 say where it is
> +loaded when it starts. Neither of these are ideal because they both
> +generally give the information after GRUB2 is already running, which makes
> +debugging early boot infeasible. Technically, the first method does give
> +the load address before GRUB2 is run, but without debugging the EFI firmware
> +with symbols, the author currently does not know how to cause the OVMF
> +firmware to pause at that point to use the load address before GRUB2 is run.
> +
> +Even after getting the application load address, the loading of core symbols
> +is complicated by the fact that the debugging symbols for the kernel are in
> +an ELF binary named @file{kernel.exec} while what is in memory are sections
> +for the PE32+ EFI binary. When @command{grub-mkimage} creates the PE32+
> +binary it condenses several segments from the ELF kernel binary into one
> +.data section in the PE32+ binary. This must be taken into account to
> +properly load the other non-text sections. Otherwise, GDB will work as
> +expected when breaking on functions, but, for instance, global variables
> +will point to the wrong address in memory and thus give incorrect values
> +(which can be difficult to debug).
> +
> +Calculating the correct offsets for sections when loading symbol
> +files is taken care of when loading the kernel symbols via the user-defined
> +GDB command @command{dynamic_load_kernel_exec_symbols}, which takes one
> +argument, the address where the text section is loaded, as determined by
> +one of the methods above. Alternatively, the command @command{dynamic_load_symbols}
> +with the text section address as an argument can be called to load the
> +kernel symbols and set up loading the module symbols as they are loaded at
> +runtime.
> +
> +In the author's experience, when debugging with QEMU and OVMF, to have
> +debugging symbols loaded at the start of GRUB2 execution the GRUB2 EFI
> +application must be run via QEMU at least once prior in order to get the
> +load address. Two methods for obtaining the load address are described in
> +two subsections below. Generally speaking, the load address does not change
> +between QEMU runs. There are exceptions to this, namely that different
> +GRUB2 EFI Applications can be run at different addresses. Also, it's been
> +observed that after running the EFI application for the first time, the
> +second run will many times have a different load address, but subsequent
> +runs of the same EFI application will have the same load address as the
> +second run. This predictability allows us to assume the load address on
> +subsequent runs and thus load the symbols before GRUB2 starts. The following
> +command illustrates this, assuming that QEMU is running and waiting for
> +a debugger connection and the current working directory is where
> +@file{gdb_grub} resides:
> +
> +@example
> +gdb -x gdb_grub -ex 'dynamic_load_symbols @var{load address}'
> +@end example
> +
> +If you load the symbols in this manner and, after continuing execution, do
> +not see output showing the loading of module symbols, then it's very likely
> +that the load address was incorrect.
> +
> +
> +@node OVMF debug log
> +@subsection OVMF debug log
> +
> +In order to get the GRUB2 load address from OVMF, first, a debug build
> +of OVMF must be obtained (@uref{https://github.com/retrage/edk2-nightly/raw/master/bin/DEBUGX64_OVMF.fd,
> +here is one} which is not officially recommended). OVMF will output debug
> +messages to a special serial device, which we must add to QEMU. The following
> +QEMU command will run the debug OVMF and write the debug messages to a
> +file named @file{debug.log}. It is assumed that @file{disk.img} is a disk
> +image or block device that is set up to boot GRUB2 EFI.
> +
> +@example
> +qemu-system-x86_64 -bios /path/to/debug/OVMF.fd \
> + -drive file=disk.img,format=raw \
> + -device virtio-scsi-pci,id=scsi0,num_queues=4 \
Please drop num_queues here too...
> + -debugcon file:debug.log -global isa-debugcon.iobase=0x402
> +@end example
> +
> +If GRUB2 was started by the (U)EFI firmware, then in the @file{debug.log}
> +file one of the last lines should be a log message like:
> +@code{Loading driver at 0x00006AEE000 EntryPoint=0x00006AEE756}. This
> +means that the GRUB2 EFI application was loaded at @code{0x00006AEE000} and
> +its .text section is at @code{0x00006AEE756}.
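One possible way to pull that address out of the log from a shell is sketched below. The helper name ovmf_load_addr is made up for illustration, and the sketch assumes that the last "Loading driver at" line in debug.log belongs to the GRUB2 EFI application, which may not hold if the firmware loads other images afterwards.

```shell
# Hypothetical helper: print the load address from the last
# "Loading driver at" line of an OVMF debug log.
# Assumption: the last such line was written for the GRUB2 EFI application.
ovmf_load_addr() {
    grep 'Loading driver at' "$1" | tail -n 1 |
        sed -n 's/.*Loading driver at \(0x[0-9A-Fa-f]*\).*/\1/p'
}

# Usage: ovmf_load_addr debug.log
```

For the log line quoted above this prints 0x00006AEE000.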
> +
> +@node Build GRUB2 to print out the load address
> +@subsection Build GRUB2 to print out the load address
> +
> +GRUB2 can be specially built to output the address of its .text section in
> +memory by defining @code{PRINT_GDB_SYM_LOAD_CMD} to @code{1} in @file{config.h.in}
> +before running @command{configure}. The benefit of this method is that it
> +will work on non-virtualized hardware where the (U)EFI firmware may not
> +be modifiable.
> +
> @node Porting
> @chapter Porting
Additionally, as Jeremy Szu pointed out in the other email, it would be
beneficial to at least briefly mention the differences between the "break"/"b"
and "hbreak"/"hb" GDB commands.
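As a rough illustration of the difference: "break" normally plants a trap instruction in target memory, so a breakpoint set before the code is loaded or relocated can be lost, while "hbreak" uses a CPU debug register and leaves memory untouched (x86 has only four such registers). A minimal sketch follows; the file name early-break.gdb is an arbitrary choice, and it assumes QEMU was started with -S -s and that gdb_grub has already loaded the kernel symbols.

```shell
# Write a small GDB command file (hypothetical name) that sets a
# hardware-assisted breakpoint instead of a software one.
cat > early-break.gdb <<'EOF'
# hbreak uses a debug register rather than patching target memory,
# so it still triggers if the code is loaded or relocated later.
hbreak grub_main
continue
EOF

# Then, from the directory containing gdb_grub, with QEMU waiting:
#   gdb -x gdb_grub -x early-break.gdb
```

Replacing hbreak with break in that file gives the software variant for comparison.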
Otherwise, +/- some minor things, the patches LGTM...
Thank you for doing this work!
Daniel