Discussion:
Supporting cross-debugging vmcores in libkvm
John Baldwin
2015-08-04 17:56:09 UTC
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.

The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.

I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.

To do this I've made some additions to the libkvm API:

1) A new 'kvaddr_t' type represents a kernel virtual address. This is
similar to the psaddr_t type used for MI process addresses in userland
debugging. I almost reused psaddr_t directly, but that would have made
<libkvm.h> depend on <sys/procfs.h>. Instead, I opted for a separate
type. It is currently a uint64_t.

2) A new 'struct kvm_nlist'. This is a stripped-down version of
'struct nlist' that uses kvaddr_t for n_value instead of an unsigned
long.

3) kvm_native() returns true if an open kvm descriptor is for a native
kernel and memory image.

4) kvm_nlist2() is like kvm_nlist() but it uses 'struct kvm_nlist'
instead of 'struct nlist'. Internally symbol names are always
resolved to kvaddr_t addresses rather than u_long addresses.
Native kernels still use _fdnlist() from libc to resolve symbols.
Cross kernels use a caller supplied function to resolve symbols
(the older cross code for libkvm required the caller to provide
a global ps_pglobal_lookup symbol typically provided for
<proc_service.h>).

5) kvm_open2() is like kvm_openfiles() except that it drops the unused
'swapfile' argument and adds a new function pointer argument to a
symbol resolving function. The function pointer can be NULL in
which case only native kernels can be opened. Kernels used with
/dev/mem or /dev/kmem must be native.

6) kvm_read2() is like kvm_read() except that it uses kvaddr_t
instead of unsigned long for the kernel virtual address.

Adding new symbols (specifically kvm_nlist2 and kvm_read2) preserves
ABI and API compatibility. Note that most libkvm functions such as
kvm_getprocs(), etc. only work with native kernels. I have not yet
done a full sweep to force them to fail for non-native kernels.
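
To make the caller-supplied resolver from items 4 and 5 concrete, here is a minimal sketch of what such a callback could look like. The callback shape (name in, kvaddr_t out, 0 on success and -1 on failure) is an assumption for illustration, and demo_symbols, demo_resolver and the addresses are made up; a real debugger would consult its own symbol tables instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the proposal: a kernel virtual address is always 64 bits wide. */
typedef uint64_t kvaddr_t;

/* Hypothetical symbol table a debugger might build from the kernel file. */
struct demo_symbol {
	const char	*name;
	kvaddr_t	 addr;
};

static const struct demo_symbol demo_symbols[] = {
	{ "allproc",	UINT64_C(0xffffffff81a2b3c0) },	/* made-up address */
	{ "dumppcb",	UINT64_C(0xffffffff81c01000) },	/* made-up address */
};

/*
 * Assumed callback shape: store the address of 'name' in '*addr' and
 * return 0, or return -1 if the symbol is unknown.
 */
int
demo_resolver(const char *name, kvaddr_t *addr)
{
	size_t i;

	for (i = 0; i < sizeof(demo_symbols) / sizeof(demo_symbols[0]); i++) {
		if (strcmp(demo_symbols[i].name, name) == 0) {
			*addr = demo_symbols[i].addr;
			return (0);
		}
	}
	return (-1);
}
```

Because the result is a 64-bit kvaddr_t rather than a u_long, an address such as 0xffffffff81a2b3c0 survives intact even when the debugger itself is a 32-bit binary.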

Also, the vnet and dpcpu support currently only works for native kernels,
though that can be fixed at some point in the future.

For the MD backends, I've added a new kvm_arch switch:

struct kvm_arch {
	int	(*ka_probe)(kvm_t *);
	int	(*ka_initvtop)(kvm_t *);
	void	(*ka_freevtop)(kvm_t *);
	int	(*ka_kvatop)(kvm_t *, kvaddr_t, off_t *);
	int	(*ka_uvatop)(kvm_t *, const struct proc *, kvaddr_t, off_t *);
	int	ka_native;
};

Each backend implements the necessary callbacks (uvatop is optional)
and is added to a global linker set that kvm_open2() walks to find the
appropriate kvm_arch for a given kernel + vmcore. On x86 I've used
separate kvm_arch structures for "plain" vs minidumps.
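
The probe-and-select step can be sketched as follows. This is a simplified model, not the code from the patch: a plain array stands in for the linker set, and struct demo_kvm, struct demo_arch and the probe functions are hypothetical reductions of kvm_t and struct kvm_arch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for kvm_t: just enough state to probe. */
struct demo_kvm {
	int	dk_machine;	/* ELF e_machine value of the kernel */
	int	dk_minidump;	/* non-zero if the vmcore is a minidump */
};

enum { DEMO_EM_386 = 3, DEMO_EM_X86_64 = 62 };

/* Reduced switch: only a name and the probe callback. */
struct demo_arch {
	const char	*da_name;
	int		(*da_probe)(const struct demo_kvm *);
};

static int
probe_i386(const struct demo_kvm *kd)
{
	return (kd->dk_machine == DEMO_EM_386 && !kd->dk_minidump);
}

static int
probe_amd64(const struct demo_kvm *kd)
{
	return (kd->dk_machine == DEMO_EM_X86_64 && !kd->dk_minidump);
}

static int
probe_amd64_minidump(const struct demo_kvm *kd)
{
	return (kd->dk_machine == DEMO_EM_X86_64 && kd->dk_minidump);
}

const struct demo_arch demo_i386 = { "i386", probe_i386 };
const struct demo_arch demo_amd64 = { "amd64", probe_amd64 };
const struct demo_arch demo_amd64_minidump = {
	"amd64 minidump", probe_amd64_minidump
};

/* A plain array stands in for the global linker set. */
static const struct demo_arch *const demo_archs[] = {
	&demo_i386, &demo_amd64, &demo_amd64_minidump,
};

/* Return the first backend whose probe accepts 'kd', or NULL if none does. */
const struct demo_arch *
demo_find_arch(const struct demo_kvm *kd)
{
	size_t i;

	for (i = 0; i < sizeof(demo_archs) / sizeof(demo_archs[0]); i++) {
		if (demo_archs[i]->da_probe(kd))
			return (demo_archs[i]);
	}
	return (NULL);
}
```

Keeping "plain" and minidump handling in separate entries means each probe stays a simple predicate and the open path needs no per-format special cases.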

The backends now have to avoid using native headers. For ELF handling
this means using libelf instead of <machine/elf.h> and raw mmap(). For
the x86 backends it meant defining some duplicate constants for certain
page table fields since <machine/pmap.h> can't be relied on (e.g.
I386_PG_V instead of PG_V). I added static assertions in the "native"
case (e.g. building kvm_i386.c on i386) to ensure the duplicate constants
match the originals.
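
The duplicated-constant pattern looks roughly like this. To keep the sketch buildable on any host, DEMO_NATIVE_BUILD and the local PG_V definition stand in for "compiling on i386 with <machine/pmap.h> available"; demo_pte_valid is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Private copy of the i386 "page valid" PTE bit, usable on any host. */
#define	I386_PG_V	0x001

/*
 * Stand-ins for a native build: in the real tree PG_V would come from
 * <machine/pmap.h> and this branch would only be compiled on i386.
 */
#define	DEMO_NATIVE_BUILD	1
#define	PG_V			0x001

#ifdef DEMO_NATIVE_BUILD
/* Fail the build if the private copy ever drifts from the original. */
_Static_assert(PG_V == I386_PG_V, "PG_V mismatch");
#endif

/* Hypothetical helper: is this 32-bit i386 PTE marked valid? */
int
demo_pte_valid(uint32_t pte)
{
	return ((pte & I386_PG_V) != 0);
}
```

The assertion costs nothing at runtime and only fires where the native definition is visible, so cross builds are unaffected.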

You can see the current WIP patches here:

https://github.com/freebsd/freebsd/compare/master...bsdjhb:kgdb_enhancements

What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).

Oh, and I do hope to have a 'KGDB' option for the devel/gdb port in the
near future.
--
John Baldwin
John-Mark Gurney
2015-08-04 19:00:59 UTC
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
Have you looked at the work that my GSoC student, Daniel Lovasko, is
doing:
https://wiki.freebsd.org/SummerOfCode2015/TypeAwareKernelVirtualMemoryAccess

This uses libctf to completely abstract out the accessing of data
in libkvm so that it can be used w/ any arch as long as you have ctf
data... This means you could use netstat on amd64 on an armeb vmcore
w/o issues...

It does look like some of this is still useful, but want to make sure
that we aren't reproducing tons of work...

For example, he's working on procstat right now:
https://github.com/lovasko/taprocstat
--
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
John Baldwin
2015-08-05 00:09:39 UTC
Post by John-Mark Gurney
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
Have you looked at the work that my GSoC student, Daniel Lovasko, is doing:
https://wiki.freebsd.org/SummerOfCode2015/TypeAwareKernelVirtualMemoryAccess
This uses libctf to completely abstract out the accessing of data
in libkvm so that it can be used w/ any arch as long as you have ctf
data... This means you could use netstat on amd64 on an armeb vmcore
w/o issues...
It does look like some of this is still useful, but want to make sure
that we aren't reproducing tons of work...
https://github.com/lovasko/taprocstat
That doesn't seem to address the need of how you parse the actual vmcore
file itself to resolve a virtual address to a location in the vmcore file.
(e.g. on "plain" dumps on i386 this means walking the page tables whereas
for minidumps it means parsing a special set of PTEs and bitmap at the
start of the file).
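
As a rough illustration of the minidump half of that problem, the sketch below maps a physical address to a file offset using a bitmap of dumped pages. It assumes a simplified layout in which dumped pages are stored back to back in bitmap order starting at data_start; the function names, the 4 KB page size and the layout details are assumptions for illustration, and the virtual-to-physical step via the saved PTEs is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	DEMO_PAGE_SIZE	4096

/* Count the set bits in a 64-bit word. */
static unsigned
demo_popcount64(uint64_t w)
{
	unsigned n;

	for (n = 0; w != 0; w &= w - 1)
		n++;
	return (n);
}

/*
 * Return the file offset holding physical address 'pa', or -1 if the
 * page was not dumped.  Bit N of the bitmap is set when physical page N
 * is present; present pages are stored in ascending order.
 */
int64_t
demo_minidump_pa_to_off(const uint64_t *bitmap, size_t nwords,
    uint64_t data_start, uint64_t pa)
{
	uint64_t pfn, index;
	size_t i, word;
	unsigned bit;

	pfn = pa / DEMO_PAGE_SIZE;
	word = pfn / 64;
	bit = pfn % 64;
	if (word >= nwords || ((bitmap[word] >> bit) & 1) == 0)
		return (-1);

	/* The page's slot is the number of dumped pages that precede it. */
	index = 0;
	for (i = 0; i < word; i++)
		index += demo_popcount64(bitmap[i]);
	index += demo_popcount64(bitmap[word] & ((UINT64_C(1) << bit) - 1));

	return ((int64_t)(data_start + index * DEMO_PAGE_SIZE +
	    pa % DEMO_PAGE_SIZE));
}
```

None of this depends on the host architecture, which is why it can live in libkvm and work for foreign vmcores.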

To be clear, all that my work enables is doing a kvm_read() of a foreign
vmcore. All the logic to decide how many bytes to read and at what
address (and then decoding those appropriately) happens in the debugger
(gdb/lldb, etc.). The project here seems to be using CTF instead of
dwarf to do the sort of things the debugger does when you 'p *foo', but
you still need a way to find the 'foo' in the vmcore file.
--
John Baldwin
John Baldwin
2015-08-12 17:50:20 UTC
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.
I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.
...
What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).
I guess this is closer to a nuclear power plant than a bikeshed judging by the
feedback. I have ported the rest of the MD backends and verified that the
updated libkvm passes a universe build (including various static assertions
for the duplicated constants in other backends). What I have not done is any
runtime testing and I would like to ask for help with that now. In particular
I need someone to test that kgdb and/or ps works against a native core dump
on all platforms other than amd64 and i386. Note that some of the trickiness
is that the backends now have to make runtime decisions for things that were
previously compile-time decisions. The biggest one affected by this is the
MIPS backend as that backend handles three ABIs (mipso32, mipsn32, and mipsn64).
I believe I have the handling for that correct (mips[on]32 use 32-bit KSEGs
whereas mipsn64 uses the extended segments and compat32 KSEGs, and mipso32
uses 32-bit PTEs and mipsn32/n64 both use 64-bit PTEs) (plus both endians
for both in theory). The ARM backend also handles both endians (in theory).
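
To illustrate what "runtime decisions" means here, this sketch decodes a page table entry whose width and byte order are only known once the vmcore has been opened. The names are hypothetical and this is not the MIPS backend code; it just shows the shape of the decision.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_endian { DEMO_LE, DEMO_BE };

/*
 * Decode a PTE of 'pte_size' bytes (4 or 8) stored at 'p' in the given
 * byte order.  Both parameters come from the vmcore being examined, not
 * from the host the debugger was compiled for.
 */
uint64_t
demo_read_pte(const unsigned char *p, size_t pte_size, enum demo_endian e)
{
	uint64_t v;
	size_t i;

	v = 0;
	for (i = 0; i < pte_size; i++) {
		if (e == DEMO_BE)
			v = (v << 8) | p[i];
		else
			v |= (uint64_t)p[i] << (8 * i);
	}
	return (v);
}
```

A backend would pick pte_size from the ABI it detected (4 for a 32-bit PTE, 8 for a 64-bit one) and the byte order from the ELF header, then use the same lookup code for every combination.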

Another wrinkle is that sparc64 uses its own dump format instead of writing
out an ELF file. I had to convert the header structures to use fixed-width
types to be cross-friendly. It would be good to ensure that a new libkvm
can read a vmcore from an old kernel and vice versa to make sure my conversion
is correct (I added an explicit padding field that I believe was implicit
before).
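
The kind of conversion involved can be shown with a made-up header (struct demo_dump_hdr and its fields are illustrative, not the real sparc64 dump header). With fixed-width types and the padding spelled out, every host computes the same layout, and static assertions can pin it down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical on-disk header.  Without dh_pad, a host that aligns
 * uint64_t to 8 bytes would insert 4 hidden bytes after dh_nregions,
 * while a host that aligns it to 4 bytes would not.
 */
struct demo_dump_hdr {
	uint64_t	dh_hdr_size;
	uint64_t	dh_tsb_pa;
	uint32_t	dh_nregions;
	uint32_t	dh_pad;		/* explicit; previously implicit */
	uint64_t	dh_data_off;
};

/* Pin the layout so a change on any host breaks the build. */
_Static_assert(offsetof(struct demo_dump_hdr, dh_data_off) == 24,
    "dh_data_off moved");
_Static_assert(sizeof(struct demo_dump_hdr) == 32,
    "demo_dump_hdr size changed");
```

If the explicit pad matches what the compiler used to insert on the native platform, old and new vmcores stay byte-for-byte compatible, which is the property worth testing.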

The code is currently available for review in Phabricator at
https://reviews.freebsd.org/D3341

To test, you can run 'arc patch D3341' in a clean tree to apply the patch.
--
John Baldwin
Julian Elischer
2015-08-13 04:43:15 UTC
Post by John Baldwin
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.
I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.
...
What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).
I guess this is closer to a nuclear power plant than a bikeshed judging by the
feedback.
It has "keep well clear" and "beware of the leopard" written all over it.

and when you get past those you see "dragons be here".
John Baldwin
2015-08-31 21:21:19 UTC
Post by John Baldwin
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.
I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.
...
What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).
I guess this is closer to a nuclear power plant than a bikeshed judging by the
feedback. I have ported the rest of the MD backends and verified that the
updated libkvm passes a universe build (including various static assertions
for the duplicated constants in other backends). What I have not done is any
runtime testing and I would like to ask for help with that now. In particular
I need someone to test that kgdb and/or ps works against a native core dump
on all platforms other than amd64 and i386. Note that some of the trickiness
is that the backends now have to make runtime decisions for things that were
previously compile-time decisions. The biggest one affected by this is the
MIPS backend as that backend handles three ABIs (mipso32, mipsn32, and mipsn64).
I believe I have the handling for that correct (mips[on]32 use 32-bit KSEGs
whereas mipsn64 uses the extended segments and compat32 KSEGs, and mipso32
uses 32-bit PTEs and mipsn32/n64 both use 64-bit PTEs) (plus both endians
for both in theory). The ARM backend also handles both endians (in theory).
Another wrinkle is that sparc64 uses its own dump format instead of writing
out an ELF file. I had to convert the header structures to use fixed-width
types to be cross-friendly. It would be good to ensure that a new libkvm
can read a vmcore from an old kernel and vice versa to make sure my conversion
is correct (I added an explicit padding field that I believe was implicit
before).
The code is currently available for review in Phabricator at
https://reviews.freebsd.org/D3341
To test, you can run 'arc patch D3341' in a clean tree to apply the patch.
I've just rebased this to port aarch64's minidump support. I just need people
willing and able to test on non-x86. Testing with the in-tree kgdb using an
updated libkvm would be sufficient.
--
John Baldwin
Marius Strobl
2015-11-12 23:41:46 UTC
Post by John Baldwin
Post by John Baldwin
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.
I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.
...
What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).
I guess this is closer to a nuclear power plant than a bikeshed judging by the
feedback. I have ported the rest of the MD backends and verified that the
updated libkvm passes a universe build (including various static assertions
for the duplicated constants in other backends). What I have not done is any
runtime testing and I would like to ask for help with that now. In particular
I need someone to test that kgdb and/or ps works against a native core dump
on all platforms other than amd64 and i386. Note that some of the trickiness
is that the backends now have to make runtime decisions for things that were
previously compile-time decisions. The biggest one affected by this is the
MIPS backend as that backend handles three ABIs (mipso32, mipsn32, and mipsn64).
I believe I have the handling for that correct (mips[on]32 use 32-bit KSEGs
whereas mipsn64 uses the extended segments and compat32 KSEGs, and mipso32
uses 32-bit PTEs and mipsn32/n64 both use 64-bit PTEs) (plus both endians
for both in theory). The ARM backend also handles both endians (in theory).
Another wrinkle is that sparc64 uses its own dump format instead of writing
out an ELF file. I had to convert the header structures to use fixed-width
types to be cross-friendly. It would be good to ensure that a new libkvm
can read a vmcore from an old kernel and vice versa to make sure my conversion
is correct (I added an explicit padding field that I believe was implicit
before).
The code is currently available for review in Phabricator at
https://reviews.freebsd.org/D3341
To test, you can run 'arc patch D3341' in a clean tree to apply the patch.
I've just rebased this to port aarch64's minidump support. I just need people
willing and able to test on non-x86. Testing with the in-tree kgdb using an
updated libkvm would be sufficient.
After a lot of crickets, I have updated the manpages for the new API. I will
commit this "soon". If you want kgdb to keep working on your non-x86
platform, this is your chance to test this before it hits the tree.
What exact test procedure do you suggest for full coverage of an
architecture?

Marius
John Baldwin
2015-11-13 19:50:37 UTC
Post by Marius Strobl
Post by John Baldwin
Post by John Baldwin
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.
I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.
...
What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).
I guess this is closer to a nuclear power plant than a bikeshed judging by the
feedback. I have ported the rest of the MD backends and verified that the
updated libkvm passes a universe build (including various static assertions
for the duplicated constants in other backends). What I have not done is any
runtime testing and I would like to ask for help with that now. In particular
I need someone to test that kgdb and/or ps works against a native core dump
on all platforms other than amd64 and i386. Note that some of the trickiness
is that the backends now have to make runtime decisions for things that were
previously compile-time decisions. The biggest one affected by this is the
MIPS backend as that backend handles three ABIs (mipso32, mipsn32, and mipsn64).
I believe I have the handling for that correct (mips[on]32 use 32-bit KSEGs
whereas mipsn64 uses the extended segments and compat32 KSEGs, and mipso32
uses 32-bit PTEs and mipsn32/n64 both use 64-bit PTEs) (plus both endians
for both in theory). The ARM backend also handles both endians (in theory).
Another wrinkle is that sparc64 uses its own dump format instead of writing
out an ELF file. I had to convert the header structures to use fixed-width
types to be cross-friendly. It would be good to ensure that a new libkvm
can read a vmcore from an old kernel and vice versa to make sure my conversion
is correct (I added an explicit padding field that I believe was implicit
before).
The code is currently available for review in Phabricator at
https://reviews.freebsd.org/D3341
To test, you can run 'arc patch D3341' in a clean tree to apply the patch.
I've just rebased this to port aarch64's minidump support. I just need people
willing and able to test on non-x86. Testing with the in-tree kgdb using an
updated libkvm would be sufficient.
After a lot of crickets, I have updated the manpages for the new API. I will
commit this "soon". If you want kgdb to keep working on your non-x86
platform, this is your chance to test this before it hits the tree.
What exact test procedure do you suggest for full coverage of an
architecture?
Just ensuring that kgdb and things like ps -M <core> -N <kernel> still work.
Btw, Mark Linimon tried to generate a crashdump for me on his sparc64 running
HEAD recently so I could test the updated kgdb but it failed to generate a
dump. I was hoping that the thread on sparc64 with the guy from qemu would
result in working qemu as that would let me do the testing I need for this
and kgdb locally.
--
John Baldwin
Marius Strobl
2015-11-16 23:04:39 UTC
Post by John Baldwin
Post by Marius Strobl
Post by John Baldwin
Post by John Baldwin
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.
I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.
...
What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).
I guess this is closer to a nuclear power plant than a bikeshed judging by the
feedback. I have ported the rest of the MD backends and verified that the
updated libkvm passes a universe build (including various static assertions
for the duplicated constants in other backends). What I have not done is any
runtime testing and I would like to ask for help with that now. In particular
I need someone to test that kgdb and/or ps works against a native core dump
on all platforms other than amd64 and i386. Note that some of the trickiness
is that the backends now have to make runtime decisions for things that were
previously compile-time decisions. The biggest one affected by this is the
MIPS backend as that backend handles three ABIs (mipso32, mipsn32, and mipsn64).
I believe I have the handling for that correct (mips[on]32 use 32-bit KSEGs
whereas mipsn64 uses the extended segments and compat32 KSEGs, and mipso32
uses 32-bit PTEs and mipsn32/n64 both use 64-bit PTEs) (plus both endians
for both in theory). The ARM backend also handles both endians (in theory).
Another wrinkle is that sparc64 uses its own dump format instead of writing
out an ELF file. I had to convert the header structures to use fixed-width
types to be cross-friendly. It would be good to ensure that a new libkvm
can read a vmcore from an old kernel and vice versa to make sure my conversion
is correct (I added an explicit padding field that I believe was implicit
before).
The code is currently available for review in Phabricator at
https://reviews.freebsd.org/D3341
To test, you can run 'arc patch D3341' in a clean tree to apply the patch.
I've just rebased this to port aarch64's minidump support. I just need people
willing and able to test on non-x86. Testing with the in-tree kgdb using an
updated libkvm would be sufficient.
After a lot of crickets, I have updated the manpages for the new API. I will
commit this "soon". If you want kgdb to keep working on your non-x86
platform, this is your chance to test this before it hits the tree.
What exact test procedure do you suggest for full coverage of an
architecture?
Just ensuring that kgdb and things like ps -M <core> -N <kernel> still work.
With the patch from D3341 applied, kgdb(1) still seems to work fine on
sparc64. However, `ps -M <core> -N <kernel>` doesn't; it just prints
the header and then exits after a short pause. Using the same core and
kernel with ps(1) on a machine with userland built without your patch,
ps(1) just segfaults after a short period of time. I can't tell whether
that's a regression or not as I've never used ps(1) on a core before
and you also have added padding to struct sparc64_dump_hdr, which might
be responsible for triggering the segfault. On the other hand, an old
kgdb(1) seemingly works fine with the new core.

FYI, I needed the following patch on top of D3341 (based on the amd64
counterpart):
--- lib/libkvm/kvm_minidump_aarch64.c	2015-11-16 23:41:58.075242000 +0100
+++ lib/libkvm/kvm_minidump_aarch64.c	2015-11-16 13:25:26.411577000 +0100
@@ -122,7 +122,7 @@
 		return (-1);
 	}
 	if (pread(kd->pmfd, bitmap, vmst->hdr.bitmapsize, off) !=
-	    vmst->hdr.bitmapsize) {
+	    (ssize_t)vmst->hdr.bitmapsize) {
 		_kvm_err(kd, kd->program,
 		    "cannot read %d bytes for page bitmap",
 		    vmst->hdr.bitmapsize);
@@ -215,7 +215,7 @@
 	}
 
 invalid:
-	_kvm_err(kd, 0, "invalid address (0x%lx)", va);
+	_kvm_err(kd, 0, "invalid address (0x%jx)", (uintmax_t)va);
 	return (0);
 }

Also, parallel builds failed with something not finding libelf but
building with a single job succeeded. I don't know whether D3341
introduces that or if it's a bug in head (the latter probably is
unlikely but I didn't investigate).
Post by John Baldwin
Btw, Mark Linimon tried to generate a crashdump for me on his sparc64 running
HEAD recently so I could test the updated kgdb but it failed to generate a
dump.
Ah, that reminds me of something; fixed in r290957.
Post by John Baldwin
I was hoping that the thread on sparc64 with the guy from qemu would
result in working qemu as that would let me do the testing I need for this
and kgdb locally.
Yeah, I also thought that after all that time, OpenBIOS and QEMU
would have progressed some more. Well, some bugs are fixed now but
they're also still not quite there, yet.

Marius
John Baldwin
2015-11-17 00:37:32 UTC
Post by Marius Strobl
Post by John Baldwin
Post by Marius Strobl
Post by John Baldwin
Post by John Baldwin
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.
I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.
...
What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).
I guess this is closer to a nuclear power plant than a bikeshed judging by the
feedback. I have ported the rest of the MD backends and verified that the
updated libkvm passes a universe build (including various static assertions
for the duplicated constants in other backends). What I have not done is any
runtime testing and I would like to ask for help with that now. In particular
I need someone to test that kgdb and/or ps works against a native core dump
on all platforms other than amd64 and i386. Note that some of the trickiness
is that the backends now have to make runtime decisions for things that were
previously compile-time decisions. The biggest one affected by this is the
MIPS backend as that backend handles three ABIs (mipso32, mipsn32, and mipsn64).
I believe I have the handling for that correct (mips[on]32 use 32-bit KSEGs
whereas mipsn64 uses the extended segments and compat32 KSEGs, and mipso32
uses 32-bit PTEs and mipsn32/n64 both use 64-bit PTEs) (plus both endians
for both in theory). The ARM backend also handles both endians (in theory).
Another wrinkle is that sparc64 uses its own dump format instead of writing
out an ELF file. I had to convert the header structures to use fixed-width
types to be cross-friendly. It would be good to ensure that a new libkvm
can read a vmcore from an old kernel and vice versa to make sure my conversion
is correct (I added an explicit padding field that I believe was implicit
before).
The code is currently available for review in Phabricator at
https://reviews.freebsd.org/D3341
To test, you can run 'arc patch D3341' in a clean tree to apply the patch.
I've just rebased this to port aarch64's minidump support. I just need people
willing and able to test on non-x86. Testing with the in-tree kgdb using an
updated libkvm would be sufficient.
After a lot of crickets, I have updated the manpages for the new API. I will
commit this "soon". If you want kgdb to keep working on your non-x86
platform, this is your chance to test this before it hits the tree.
What exact test procedure do you suggest for full coverage of an
architecture?
Just ensuring that kgdb and things like ps -M <core> -N <kernel> still work.
With the patch from D3341 applied, kgdb(1) still seems to work fine on
sparc64. However, `ps -M <core> -N <kernel>` doesn't; it just prints
the header and then exits after a short pause. Using the same core and
kernel with ps(1) on a machine with userland built without your patch,
ps(1) just segfaults after a short period of time. I can't tell whether
that's a regression or not as I've never used ps(1) on a core before
and you also have added padding to struct sparc64_dump_hdr, which might
be responsible for triggering the segfault. On the other hand, an old
kgdb(1) seemingly works fine with the new core.
Hmm, I had thought that the old and new sparc64_dump_hdr would be the
same? I was just using fixed width types so that any platform could
#include the header and get the same layout. In particular, I don't
want the dump format to change on disk after this change so that once
kgdb (or lldb) has cross-debugging support we can read both old and
new sparc64 vmcores.
Post by Marius Strobl
FYI, I needed the following patch on top of D3341 (based on the amd64
--- lib/libkvm/kvm_minidump_aarch64.c 2015-11-16 23:41:58.075242000 +0100
+++ lib/libkvm/kvm_minidump_aarch64.c 2015-11-16 13:25:26.411577000 +0100
@@ -122,7 +122,7 @@
return (-1);
}
if (pread(kd->pmfd, bitmap, vmst->hdr.bitmapsize, off) !=
- vmst->hdr.bitmapsize) {
+ (ssize_t)vmst->hdr.bitmapsize) {
_kvm_err(kd, kd->program,
"cannot read %d bytes for page bitmap",
vmst->hdr.bitmapsize);
@@ -215,7 +215,7 @@
}
- _kvm_err(kd, 0, "invalid address (0x%lx)", va);
+ _kvm_err(kd, 0, "invalid address (0x%jx)", (uintmax_t)va);
return (0);
}
Oops, yes. I fixed this in my git branch when I built universe with it
recently but I might not have pushed that update to phabricator yet.
Post by Marius Strobl
Also, parallel builds failed with something not finding libelf but
building with a single job succeeded. I don't know whether D3341
introduces that or if it's a bug in head (the latter probably is
unlikely but I didn't investigate).
Hmm, it is true that libkvm now depends on libelf. My -j 16 tinderbox
builds did not trip over that, and lib/Makefile has libelf in its
"early" list of libraries (SUBDIR_ORDERED), so it seems like it should
be built before libkvm is tried?
Post by Marius Strobl
Post by John Baldwin
Btw, Mark Linimon tried to generate a crashdump for me on his sparc64 running
HEAD recently so I could test the updated kgdb but it failed to generate a
dump.
Ah, that reminds me of something; fixed in r290957.
Thanks!
--
John Baldwin
Marius Strobl
2015-11-17 22:45:05 UTC
Post by John Baldwin
Post by Marius Strobl
Post by John Baldwin
Post by Marius Strobl
Post by John Baldwin
Post by John Baldwin
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.
I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.
...
What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).
I guess this is closer to a nuclear power plant than a bikeshed judging by the
feedback. I have ported the rest of the MD backends and verified that the
updated libkvm passes a universe build (including various static assertions
for the duplicated constants in other backends). What I have not done is any
runtime testing and I would like to ask for help with that now. In particular
I need someone to test that kgdb and/or ps works against a native core dump
on all platforms other than amd64 and i386. Note that some of the trickiness
is that the backends now have to make runtime decisions for things that were
previously compile-time decisions. The biggest one affected by this is the
MIPS backend as that backend handles three ABIs (mipso32, mipsn32, and mipsn64).
I believe I have the handling for that correct (mips[on]32 use 32-bit KSEGs
whereas mipsn64 uses the extended segments and compat32 KSEGS, and mipso32
uses 32-bit PTEs and mipsn32/n64 both use 64-bit PTEs) (plus both endians
for both in theory). The ARM backend also handles both endians (in theory).
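As a rough illustration of what "runtime decisions" means here, the following sketch decodes a page table entry whose width and byte order are only known once the vmcore has been opened. The function name and parameters are hypothetical and do not come from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a page table entry of pte_size bytes (4 or 8) that is stored in
 * the target's byte order.  The result does not depend on the host's word
 * size or endianness because the value is assembled one byte at a time.
 */
static uint64_t
demo_decode_pte(const unsigned char *p, size_t pte_size, int big_endian)
{
	uint64_t v = 0;
	size_t i;

	assert(p != NULL);
	assert(pte_size == 4 || pte_size == 8);
	for (i = 0; i < pte_size; i++) {
		/* Always consume the most significant byte first. */
		size_t idx = big_endian ? i : pte_size - 1 - i;

		v = (v << 8) | p[idx];
	}
	return (v);
}
```

A backend built this way would pick pte_size and big_endian from the ELF header of the dump rather than from compile-time macros.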
Another wrinkle is that sparc64 uses its own dump format instead of writing
out an ELF file. I had to convert the header structures to use fixed-width
types to be cross-friendly. It would be good to ensure that a new libkvm
can read a vmcore from an old kernel and vice versa to make sure my conversion
is correct (I added an explicit padding field that I believe was implicit
before).
The code is currently available for review in Phabricator at
https://reviews.freebsd.org/D3341
To test, you can run 'arc patch D3341' in a clean tree to apply the patch.
I've just rebased this to port aarch64's minidump support. I just need people
willing and able to test on non-x86. Testing with the in-tree kgdb using an
updated libkvm would be sufficient.
After a lot of crickets, I have updated the manpages for the new API. I will
commit this "soon". If you want kgdb to keep working on your non-x86
platform, this is your chance to test this before it hits the tree.
What exact test procedure do you suggest for full coverage of an
architecture?
Just ensuring that kgdb and things like ps -M <core> -N <kernel> still work.
With the patch from D3341 applied, kgdb(1) still seems to work fine on
sparc64. However, `ps -M <core> -N <kernel>` doesn't; it just prints
the header and then exits after a short pause. Using the same core and
kernel with ps(1) on a machine with userland built without your patch,
ps(1) just segfaults after a short period of time. I can't tell whether
that's a regression or not as I've never used ps(1) on a core before
and you also have added padding to struct sparc64_dump_hdr, which might
be responsible for triggering the segfault. On the other hand, an old
kgdb(1) seemingly works fine with the new core.
Hmm, I had thought that the old and new sparc64_dump_hdr would be the
same? I was just using fixed width types so that any platform could
#include the header and get the same layout. In particular, I don't
want the dump format to change on disk after this change so that once
kgdb (or lldb) has cross-debugging support we can read both old and
new sparc64 vmcores.
Yes, you are right; you added dh_pad to struct sparc64_dump_hdr but
that doesn't change its size or the native offsets of the members.
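To make the padding point concrete, here is a sketch with a hypothetical header; it is not the real struct sparc64_dump_hdr, and only the dh_pad name is borrowed from the thread. With fixed-width members and explicit tail padding, the size and offsets are the same on any host that includes the definition:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dump header, for illustration only. */
struct demo_dump_hdr {
	uint64_t	dh_size;	/* offset 0 */
	uint64_t	dh_table_pa;	/* offset 8 */
	int32_t		dh_nregions;	/* offset 16 */
	int32_t		dh_pad;		/* offset 20; was implicit on LP64 */
};

/* Fail the build if any host lays the structure out differently. */
static_assert(sizeof(struct demo_dump_hdr) == 24, "header size changed");
static_assert(offsetof(struct demo_dump_hdr, dh_nregions) == 16,
    "dh_nregions moved");
static_assert(offsetof(struct demo_dump_hdr, dh_pad) == 20, "dh_pad moved");
```

Without the explicit pad, a host that aligns 64-bit integers to 4 bytes could compute a 20-byte structure where the native 64-bit target uses 24.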
Post by John Baldwin
Post by Marius Strobl
Also, parallel builds failed with something not finding libelf but
building with a single job succeeded. I don't know whether D3341
introduces that or if it's a bug in head (the latter probably is
unlikely but I didn't investigate).
Hmm, it is true that libkvm now depends on libelf. My -j 16 tinderbox
builds did not trip over that, and lib/Makefile has libelf in its
"early" list of libraries (SUBDIR_ORDERED), so it seems like it should
be built before libkvm is tried?
Well, I'd agree in principle but also just can say that -j16 builds
reliably fail here:
--- lib/libkvm__L ---
/home/marius/co/build/head3/i386.i386/usr/home/marius/co/head3/src/tmp/usr/bin/ld: cannot find -lelf
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [libkvm.so.6] Error code 1

Marius
John Baldwin
2015-11-18 19:32:57 UTC
Permalink
Post by Marius Strobl
Post by John Baldwin
Post by Marius Strobl
Also, parallel builds failed with something not finding libelf but
building with a single job succeeded. I don't know whether D3341
introduces that or if it's a bug in head (the latter probably is
unlikely but I didn't investigate).
Hmm, it is true that libkvm now depends on libelf. My -j 16 tinderbox
builds did not trip over that, and lib/Makefile has libelf in its
"early" list of libraries (SUBDIR_ORDERED), so it seems like it should
be built before libkvm is tried?
Well, I'd agree in principle but also just can say that -j16 builds
--- lib/libkvm__L ---
/home/marius/co/build/head3/i386.i386/usr/home/marius/co/head3/src/tmp/usr/bin/
ld: cannot find -lelf
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [libkvm.so.6] Error code 1
Ok. I'll see if I can't reproduce that on a test machine myself.
--
John Baldwin
John Baldwin
2015-11-19 19:16:24 UTC
Permalink
Post by Marius Strobl
Post by John Baldwin
Hmm, it is true that libkvm now depends on libelf. My -j 16 tinderbox
builds did not trip over that, and lib/Makefile has libelf in its
"early" list of libraries (SUBDIR_ORDERED), so it seems like it should
be built before libkvm is tried?
Well, I'd agree in principle but also just can say that -j16 builds
--- lib/libkvm__L ---
/home/marius/co/build/head3/i386.i386/usr/home/marius/co/head3/src/tmp/usr/bin/
ld: cannot find -lelf
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [libkvm.so.6] Error code 1
I found this. There are three(!) places I've had to annotate that libkvm now
depends on libelf though it seems only one of them is actually used by
buildworld (and that was the one I had missed).
--
John Baldwin
Marius Strobl
2015-11-20 01:56:21 UTC
Permalink
Post by John Baldwin
Post by Marius Strobl
Post by John Baldwin
Hmm, it is true that libkvm now depends on libelf. My -j 16 tinderbox
builds did not trip over that, and lib/Makefile has libelf in its
"early" list of libraries (SUBDIR_ORDERED), so it seems like it should
be built before libkvm is tried?
Well, I'd agree in principle but also just can say that -j16 builds
--- lib/libkvm__L ---
/home/marius/co/build/head3/i386.i386/usr/home/marius/co/head3/src/tmp/usr/bin/
ld: cannot find -lelf
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [libkvm.so.6] Error code 1
I found this. There are three(!) places I've had to annotate that libkvm now
depends on libelf though it seems only one of them is actually used by
buildworld (and that was the one I had missed).
Does this mean that the .WAITs in SUBDIR_ORDERED of lib/Makefile
don't have the desired effect? I see other build failures which
suggest that other .WAITs in tree just don't work as expected.
Usually I only hit these with -j128 or higher, though.

Marius
John Baldwin
2015-11-20 16:27:01 UTC
Permalink
Post by Marius Strobl
Post by John Baldwin
Post by Marius Strobl
Post by John Baldwin
Hmm, it is true that libkvm now depends on libelf. My -j 16 tinderbox
builds did not trip over that, and lib/Makefile has libelf in its
"early" list of libraries (SUBDIR_ORDERED), so it seems like it should
be built before libkvm is tried?
Well, I'd agree in principle but also just can say that -j16 builds
--- lib/libkvm__L ---
/home/marius/co/build/head3/i386.i386/usr/home/marius/co/head3/src/tmp/usr/bin/
ld: cannot find -lelf
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [libkvm.so.6] Error code 1
I found this. There are three(!) places I've had to annotate that libkvm now
depends on libelf though it seems only one of them is actually used by
buildworld (and that was the one I had missed).
Does this mean that the .WAITs in SUBDIR_ORDERED of lib/Makefile
don't have the desired effect? I see other build failures which
suggest that other .WAITs in tree just don't work as expected.
Usually I only hit these with -j128 or higher, though.
They work if you do 'make' in lib. But buildworld does 'make libraries'
and then later does 'make' in lib, and the 'make libraries' step uses a
separate set of variables (_prereq_libs, _startup_libs, etc.) defined in
Makefile.inc1 to set the order of 'make libraries'. :-(
--
John Baldwin
John-Mark Gurney
2015-08-28 18:27:05 UTC
Permalink
Post by John Baldwin
Many debuggers (recent gdb and lldb) support cross-architecture debugging
just fine. My current WIP port of kgdb to gdb7 supports cross-debugging for
remote targets already, but I wanted it to also support cross-debugging for
vmcores.
The existing libkvm/kgdb code in the tree has some limited support for
cross-debugging. It requires building a custom libkvm (e.g. libkvm-i386.a)
and custom kgdb for each target platform. However, gdb (and lldb) both
support multiple targets in a single binary, so I'd like to have a single
kgdb binary that can cross-debug anything.
I started hacking on libkvm last weekend and have a prototype that I've used
(along with some patches to my kgdb port) to debug an amd64 vmcore on an
i386 machine and vice versa.
1) A new 'kvaddr_t' type represents a kernel virtual address. This is
similar to the psaddr_t type used for MI process addresses in userland
debugging. I almost reused psaddr_t directly, but that would have made
<libkvm.h> depend on <sys/procfs.h>. Instead, I opted for a separate
type. It is currently a uint64_t.
I like this.. W/ the work Lovasko has been working on, having this
type is good and makes the most sense...
Post by John Baldwin
2) A new 'struct kvm_nlist'. This is a stripped-down version of
'struct nlist' that uses kvaddr_t for n_value instead of an unsigned
long.
3) kvm_native() returns true if an open kvm descriptor is for a native
kernel and memory image.
4) kvm_nlist2() is like kvm_nlist() but it uses 'struct kvm_nlist'
instead of 'struct nlist'. Internally symbol names are always
resolved to kvaddr_t addresses rather than u_long addresses.
Native kernels still use _fdnlist() from libc to resolve symbols.
Cross kernels use a caller supplied function to resolve symbols
(the older cross code for libkvm required the caller to provide
a global ps_pglobal_lookup symbol typically provided for
<proc_service.h>).
5) kvm_open2() is like kvm_openfiles() except that it drops the unused
'swapfile' argument and adds a new function pointer argument to a
symbol resolving function. The function pointer can be NULL in
which case only native kernels can be opened. Kernels used with
/dev/mem or /dev/kmem must be native.
6) kvm_read2() is like kvm_read() except that it uses kvaddr_t
instead of unsigned long for the kernel virtual address.
All the above looks good...
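For illustration, a caller-supplied resolver of the kind items 4 and 5 describe might look roughly like the sketch below. The prototype and the 0 / -1 return convention are assumptions, and the symbol table, names, and addresses are made up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: matches item 1 above, where kvaddr_t is a uint64_t. */
typedef uint64_t kvaddr_t;

/* Hypothetical symbols a debugger might already have loaded. */
static const struct demo_sym {
	const char	*name;
	kvaddr_t	 addr;
} demo_syms[] = {
	{ "demo_sym_a",	0xffffffff80000000ULL },
	{ "demo_sym_b",	0xffffffff81234560ULL },
};

/*
 * Look a symbol up and store its address as a kvaddr_t, so a 64-bit
 * kernel address survives even when the host's u_long is 32 bits.
 * Returns 0 on success and -1 if the symbol is unknown (assumed
 * convention).
 */
static int
demo_resolver(const char *name, kvaddr_t *addr)
{
	size_t i;

	assert(name != NULL && addr != NULL);
	for (i = 0; i < sizeof(demo_syms) / sizeof(demo_syms[0]); i++) {
		if (strcmp(demo_syms[i].name, name) == 0) {
			*addr = demo_syms[i].addr;
			return (0);
		}
	}
	return (-1);
}
```

A debugger would pass a function like this as the extra function pointer argument to kvm_open2(), backed by its own symbol reader instead of a static table.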

How will we prevent native only aware apps from getting confused when
accessing non-native cores? Will kvm_openfiles fail for non-native
cores? or will kvm_read fail for non-native cores?
Post by John Baldwin
Adding new symbols (specifically kvm_nlist2 and kvm_read2) preserves
ABI and API compatibility. Note that most libkvm functions such as
kvm_getprocs(), etc. only work with native kernels. I have not yet
done a full sweep to force them to fail for non-native kernels.
For things like getprocs, we could move to using Lovasko's project
to use ctf data to read the proc structures for non-native cores...
Then the core parts of getprocs would not have to change, and it'd
just work in all places..
Post by John Baldwin
Also, the vnet and dpcpu stuff only works for native kernels currently
though that can be fixed at some point in the future.
struct kvm_arch {
        int (*ka_probe)(kvm_t *);
        int (*ka_initvtop)(kvm_t *);
        void (*ka_freevtop)(kvm_t *);
        int (*ka_kvatop)(kvm_t *, kvaddr_t, off_t *);
        int (*ka_uvatop)(kvm_t *, const struct proc *, kvaddr_t, off_t *);
        int ka_native;
};
Each backend implements the necessary callbacks (uvatop is optional)
and is added to a global linker set that kvm_open2() walks to find the
appropriate kvm_arch for a given kernel + vmcore. On x86 I've used
separate kvm_arch structures for "plain" vs minidumps.
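The probe walk can be sketched as follows. This is a simplified stand-in, not the code from the branch: a plain array replaces the linker set, and the demo_* types, machine ids, and backend names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Made-up machine ids; a real probe would inspect the ELF header. */
enum demo_machine { DEMO_MACH_I386, DEMO_MACH_AMD64 };

struct demo_image {
	enum demo_machine	di_machine;
	int			di_minidump;	/* non-zero for a minidump */
};

struct demo_arch {
	const char	*da_name;
	int		(*da_probe)(const struct demo_image *);
};

static int
demo_probe_amd64(const struct demo_image *img)
{
	return (img->di_machine == DEMO_MACH_AMD64 && !img->di_minidump);
}

static int
demo_probe_amd64_minidump(const struct demo_image *img)
{
	return (img->di_machine == DEMO_MACH_AMD64 && img->di_minidump);
}

static int
demo_probe_i386(const struct demo_image *img)
{
	return (img->di_machine == DEMO_MACH_I386 && !img->di_minidump);
}

/* A plain array stands in for the global linker set. */
static const struct demo_arch demo_archs[] = {
	{ "amd64",		demo_probe_amd64 },
	{ "amd64 minidump",	demo_probe_amd64_minidump },
	{ "i386",		demo_probe_i386 },
};

/* Return the first backend whose probe accepts the image, or NULL. */
static const struct demo_arch *
demo_find_arch(const struct demo_image *img)
{
	size_t i;

	assert(img != NULL);
	for (i = 0; i < sizeof(demo_archs) / sizeof(demo_archs[0]); i++) {
		if (demo_archs[i].da_probe(img))
			return (&demo_archs[i]);
	}
	return (NULL);
}
```

Keeping "plain" and minidump backends as separate entries means each probe stays a simple yes/no test, and an unsupported image falls out naturally as a NULL result.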
The backends now have to avoid using native headers. For ELF handling
this means using libelf instead of <machine/elf.h> and raw mmap(). For
the x86 backends it meant defining some duplicate constants for certain
page table fields since <machine/pmap.h> can't be relied on (e.g.
I386_PG_V instead of PG_V). I added static assertions in the "native"
case (e.g. building kvm_i386.c on i386) to ensure the duplicate constants
match the originals.
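A sketch of the duplicate-constant pattern, under stated assumptions: DEMO_HAVE_NATIVE_PMAP and demo_native_pmap.h are hypothetical placeholders for "this is a native build and the machine header providing PG_V is available", and demo_pte_valid() is made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Private copy of the i386 PTE "valid" bit, usable on any host. */
#define	I386_PG_V	0x001

#ifdef DEMO_HAVE_NATIVE_PMAP
/*
 * Native build only: check the private copy against the original.
 * Both the macro above and this header name are placeholders.
 */
#include "demo_native_pmap.h"
static_assert(PG_V == I386_PG_V, "I386_PG_V does not match PG_V");
#endif

/* Test the valid bit of a 32-bit i386 page table entry. */
static int
demo_pte_valid(uint32_t pte)
{
	return ((pte & I386_PG_V) != 0);
}
```

The cross build never sees the machine header, while the native build still catches a copy that has drifted from the original.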
https://github.com/freebsd/freebsd/compare/master...bsdjhb:kgdb_enhancements
What I'm mostly after is comments on the API, etc. Once that is settled I
will move forward on converting and/or stubbing the other backends (the
stub route would be to only support other backends on native systems for
now).
Your API looks good... I can't see anything wrong w/ it...
Post by John Baldwin
Oh, and I do hope to have a 'KGDB' option for the devel/gdb port in the
near future.
--
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
John Baldwin
2015-08-28 20:39:31 UTC
Permalink
Post by John-Mark Gurney
How will we prevent native only aware apps from getting confused when
accessing non-native cores? Will kvm_openfiles fail for non-native
cores? or will kvm_read fail for non-native cores?
kvm_openfiles() will fail. kvm_open2() will fail for a non-native core
if a symbol resolving routine is not supplied.
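Stated as code, the rule is small. This is only a sketch of the policy described above, with a hypothetical helper name and an assumed resolver prototype; kvm_openfiles() corresponds to the case where no resolver is supplied:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t kvaddr_t;	/* assumption: as described in the thread */
typedef int demo_resolver_t(const char *, kvaddr_t *);

/*
 * Decide whether an open may proceed: native images are always allowed,
 * non-native images only when the caller supplied a symbol resolver.
 */
static int
demo_open_permitted(int image_is_native, demo_resolver_t *resolver)
{
	assert(image_is_native == 0 || image_is_native == 1);
	return (image_is_native || resolver != NULL);
}

/* Trivial resolver used only to have a non-NULL function pointer. */
static int
demo_null_resolver(const char *name, kvaddr_t *addr)
{
	(void)name;
	(void)addr;
	return (-1);
}
```

Existing native-only applications therefore see a failed open for a foreign core rather than silently misreading it.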

One API question I had is if it would be useful to allow a void * cookie
to be passed to the symbol resolving routine (the same cookie would be
passed to kvm_open2() and stored internally to be passed on each resolution
request). I think in practice we don't need that level of complexity
though (my kgdb changes did not).
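For comparison, the cookie variant being weighed here would look something like this. It is purely a sketch of the alternative; the prototype, struct, and names are hypothetical and not part of the proposed API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t kvaddr_t;	/* assumption: as described in the thread */

/* Hypothetical per-session symbol table, terminated by a NULL name. */
struct demo_symtab_entry {
	const char	*name;
	kvaddr_t	 addr;
};

/*
 * Resolver that receives caller state through an opaque cookie instead
 * of reaching for global variables.  Returns 0 on success, -1 if the
 * symbol is not in the table the cookie points at.
 */
static int
demo_resolve_with_cookie(void *cookie, const char *name, kvaddr_t *addr)
{
	const struct demo_symtab_entry *e;

	assert(cookie != NULL && name != NULL && addr != NULL);
	for (e = cookie; e->name != NULL; e++) {
		if (strcmp(e->name, name) == 0) {
			*addr = e->addr;
			return (0);
		}
	}
	return (-1);
}
```

The cookie only pays off when one process has several descriptors open against different kernels at once; a debugger with a single current target can keep that state globally, which fits the observation that kgdb did not need it.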

I will need to rebase this to port the arm64 minidump support over, but
I also need people to test this.
--
John Baldwin
John-Mark Gurney
2015-08-28 21:19:52 UTC
Permalink
Post by John Baldwin
Post by John-Mark Gurney
How will we prevent native only aware apps from getting confused when
accessing non-native cores? Will kvm_openfiles fail for non-native
cores? or will kvm_read fail for non-native cores?
kvm_openfiles() will fail. kvm_open2() will fail for a non-native core
if a symbol resolving routine is not supplied.
One API question I had is if it would be useful to allow a void * cookie
to be passed to the symbol resolving routine (the same cookie would be
passed to kvm_open2() and stored internally to be passed on each resolution
request). I think in practice we don't need that level of complexity
though (my kgdb changes did not).
I can't think of a reason it would be required, but that doesn't mean
someone else wouldn't need it...

Though wouldn't the core parser provide the symbol lookup function?
Post by John Baldwin
I will need to rebase this to port the arm64 minidump support over, but
I also need people to test this.
I'll see what I can do to help test it...
--
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
John Baldwin
2015-08-28 23:37:56 UTC
Permalink
Post by John-Mark Gurney
Post by John Baldwin
Post by John-Mark Gurney
How will we prevent native only aware apps from getting confused when
accessing non-native cores? Will kvm_openfiles fail for non-native
cores? or will kvm_read fail for non-native cores?
kvm_openfiles() will fail. kvm_open2() will fail for a non-native core
if a symbol resolving routine is not supplied.
One API question I had is if it would be useful to allow a void * cookie
to be passed to the symbol resolving routine (the same cookie would be
passed to kvm_open2() and stored internally to be passed on each resolution
request). I think in practice we don't need that level of complexity
though (my kgdb changes did not).
I can't think of a reason it would be required, but that doesn't mean
someone else wouldn't need it...
You need to resolve symbols to find the root of the global page table
structures that let you do virtual to physical translations.
Post by John-Mark Gurney
Though wouldn't the core parser provide the symbol lookup function?
Post by John Baldwin
I will need to rebase this to port the arm64 minidump support over, but
I also need people to test this.
I'll see what I can do to help test it...
Mostly it needs testing on non-x86.
--
John Baldwin