Discussion:
New device wiring option
Warner Losh
2021-04-14 18:35:48 UTC
Today, one can wire a PCI device like so:

hint.nvme.3.at="pci0:7:0:0"

to wire an instance to a unit number. This works well when you have a
relatively static configuration.

However, if you have a number of carrier cards that have a bunch of
storage, then you have a situation where you are wiring things like so:

hint.nvme.0.at="pci0:29:0:0" # card 0 in carrier 1
hint.nvme.4.at="pci0:30:0:0" # card 1 in carrier 1
hint.nvme.2.at="pci0:31:0:0" # card 2 in carrier 1
hint.nvme.3.at="pci0:32:0:0" # card 3 in carrier 1
hint.nvme.1.at="pci0:185:0:0" # card 0 in carrier 2
hint.nvme.5.at="pci0:186:0:0" # card 1 in carrier 2
hint.nvme.6.at="pci0:187:0:0" # card 2 in carrier 2
hint.nvme.7.at="pci0:188:0:0" # card 3 in carrier 2

where the bus numbers are stable from boot to boot... unless one of the
carrier cards isn't present, in which case the numbers change a bit, which
moves the nvme unit numbers around. So if carrier 1 goes away, the PCI bus
numbers on the second one may be 183, 184, 185, 186 so nvme1 becomes nvme8,
nvme5 becomes nvme9, nvme6 becomes nvme1 and nvme7 becomes nvme5. In our
application, this renumbering is undesirable. One might argue the
application shouldn't care about the numbering, but we have one that does
in a way that makes that knowledge and dependency tricky to remove.
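
To make the renumbering concrete, here is a small Python sketch (not kernel code) that replays the example above. The function name assign_units is made up for illustration, and it assumes that a device with no matching hint takes the lowest unit number that no hint reserves, which is what the numbers above imply.

```python
def assign_units(hints, present):
    """Map each present location to a unit number.

    hints:   dict of unit -> location (the hint.nvme.N.at values)
    present: locations found at boot, in probe order
    Assumption: an unwired device takes the lowest unit number that
    no hint reserves.
    """
    wired = {loc: unit for unit, loc in hints.items()}
    reserved = set(hints)
    result = {}
    next_free = 0
    for loc in present:
        if loc in wired:
            result[loc] = wired[loc]
            continue
        while next_free in reserved:
            next_free += 1
        result[loc] = next_free
        next_free += 1
    return result


HINTS = {
    0: "pci0:29:0:0", 4: "pci0:30:0:0", 2: "pci0:31:0:0", 3: "pci0:32:0:0",
    1: "pci0:185:0:0", 5: "pci0:186:0:0", 6: "pci0:187:0:0", 7: "pci0:188:0:0",
}

# Carrier 1 is missing, so carrier 2's cards show up on buses 183..186.
after = assign_units(HINTS, [f"pci0:{bus}:0:0" for bus in (183, 184, 185, 186)])
# Buses 183 and 184 match no hint and get units 8 and 9; buses 185 and 186
# match the hints meant for cards 0 and 1, so they get units 1 and 5.
print(after)
```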

Fortunately, UEFI has solved this problem with their device paths. UEFI
device paths are completely independent of PCI bus numbering, and other
items that are the arbitrary choice of the OS and/or the firmware booting
the system.

On a UEFI system, you might see paths more like the following for the above
devices:

PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x0,0x0)
PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x1,0x0)
PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x2,0x0)
PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x3,0x0)
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)

and if the first carrier card goes away, the path to the second one is
still the same. So one way out of this issue is to change the numbering to
be something more like:

hint.nvme.0.at="uefi:PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x0,0x0)"
hint.nvme.4.at="uefi:PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x1,0x0)"
hint.nvme.2.at="uefi:PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x2,0x0)"
hint.nvme.3.at="uefi:PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x3,0x0)"
hint.nvme.1.at="uefi:PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)"
hint.nvme.5.at="uefi:PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)"
hint.nvme.6.at="uefi:PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)"
hint.nvme.7.at="uefi:PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)"

which would solve the problem nicely (of course with a special case for
paths starting with "pci" for those use cases where that might still make
sense).
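
As a rough illustration of that special case, the following Python sketch splits a hint's "at" value into a locator name and a location. The name split_locator and the error handling are hypothetical; the real parsing would live in the kernel and may well look different.

```python
def split_locator(value):
    """Split a hint "at" value into (locator, location).

    Values starting with "pci" keep today's meaning and are passed
    through whole; anything else must carry a "name:" prefix.
    """
    if value.startswith("pci"):
        # Existing form, e.g. "pci0:7:0:0".
        return "pci", value
    name, sep, rest = value.partition(":")
    if not sep or not rest:
        raise ValueError(f"no locator prefix in {value!r}")
    return name.lower(), rest


HINTS = {
    1: "uefi:PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)",
    3: "pci0:32:0:0",
}
parsed = {unit: split_locator(value) for unit, value in HINTS.items()}
for unit, (locator, location) in parsed.items():
    print(unit, locator, location)
```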

I've started work on implementing this for PCI, and am looking for feedback
before I get too far down that path. I plan on making these case
insensitive because different UEFI tools render paths differently.
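
A minimal Python sketch of what case-insensitive matching could look like, assuming the paths are compared as folded strings. The names path_key and wired_unit are invented for this example. Note that it only folds case: renderings that differ in other ways, such as 0x01 versus 0x1, would still not match here.

```python
def path_key(path):
    """Fold a textual device path to a case-insensitive lookup key."""
    return path.strip().casefold()


# Wired units keyed by folded path (two of the hints from above).
WIRED = {
    path_key("PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x0,0x0)"): 0,
    path_key("PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)"): 1,
}


def wired_unit(path):
    """Return the wired unit for a path, or None if it is not wired."""
    return WIRED.get(path_key(path))


# The same device as another tool might render it.
print(wired_unit("pciroot(0X1)/PCI(0x1,0x1)/pci(0x0,0x0)"))
```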

One could take this further, of course. The full UEFI path to a
partition on one of these devices is:
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)/NVMe(0x1,A2-19-48-44-8B-44-1B-00)/HD(9,GPT,0F8518D9-2DE5-11E8-B5F1-3CFDFE9D5250,0x430,0x19000)
so constructs like the following might make sense:

hint.nda.7.at="uefi:NVMe(0x1,A2-19-48-44-8B-44-1B-00)"
hint.ada.44.at="uefi:Sata(0x0,0xFFFF,0x0)"

for wiring up CAM devices. However, while these extra uses would be nice,
supporting them is beyond the scope of the initial work (though hopefully
the initial work would make enabling these later easier). I plan on
implementing a generic locator KPI for this, but will focus on only the
uefi and newbus locators initially. Later acpi, ofw, fdt and other location
mechanisms can be added. The uefi path stuff, btw, does not require the
system to boot using UEFI.
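
To show what matching a partial path such as the NVMe(...) hint above might involve, here is a hedged Python sketch that splits a textual device path into nodes and checks whether a hint's nodes occur as a contiguous run in it. The helper names parse_path and contains are hypothetical, and whether the real locator would match partial paths this way is an open design question. The parser is also simplistic: it would not cope with node arguments that themselves contain "/".

```python
import re

# One node looks like Name(arg,arg,...).
NODE_RE = re.compile(r"^([A-Za-z][A-Za-z0-9]*)\((.*)\)$")


def parse_path(text):
    """Parse a textual device path into a list of (name, args) nodes."""
    nodes = []
    for part in text.split("/"):
        m = NODE_RE.match(part.strip())
        if m is None:
            raise ValueError(f"malformed node {part!r}")
        name, args = m.groups()
        # Lower-case everything so comparison is case insensitive.
        nodes.append((name.lower(),
                      tuple(a.strip().lower() for a in args.split(","))))
    return nodes


def contains(full, partial):
    """True if the nodes of partial appear as a contiguous run in full."""
    n = len(partial)
    return any(full[i:i + n] == partial for i in range(len(full) - n + 1))


FULL = ("PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)"
        "/NVMe(0x1,A2-19-48-44-8B-44-1B-00)"
        "/HD(9,GPT,0F8518D9-2DE5-11E8-B5F1-3CFDFE9D5250,0x430,0x19000)")

full = parse_path(FULL)
nvme_hint = parse_path("NVMe(0x1,A2-19-48-44-8B-44-1B-00)")
sata_hint = parse_path("Sata(0x0,0xFFFF,0x0)")
print(contains(full, nvme_hint), contains(full, sata_hint))
```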

So I'm writing today to solicit feedback on this approach. John Baldwin has
already offered some advice to structure this as a generic locator and to
have some newbus integration, but to also think about the larger picture.
I'm still working on the details about how to make the locators generic
enough to be widely useful to other locators, but also specific enough to deal
with the variations between these different systems.

Warner
Konstantin Belousov
2021-04-14 18:57:13 UTC
DMAR (Intel x86 IOMMU) has a similar issue: some configuration details
for DMA and MSI remapping require specifying PCIe bridges and PCIe
devices in a way that is invariant against bus renumbering and hot-plug.
They use paths from root ports through bridges down to the target. This
is encoded in the binary structures of the ACPI DMAR table, see the VT-d
document.
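
A small Python sketch of how such a path is resolved, assuming the DMAR device scope layout from the VT-d document: a start bus number followed by (device, function) pairs, where every pair except the last names a bridge whose secondary bus number leads to the next hop. The function name resolve_scope and the two topology tables are made up for illustration; real code would read the secondary bus number from each bridge's configuration space.

```python
def resolve_scope(start_bus, path, secondary_bus):
    """Resolve a DMAR-style device scope to the current (bus, dev, func).

    start_bus:     bus number of the first path element
    path:          list of (device, function) pairs, root side first
    secondary_bus: callable (bus, dev, func) -> secondary bus of that bridge
    """
    if not path:
        raise ValueError("empty path")
    bus = start_bus
    for dev, func in path[:-1]:
        # Each intermediate element is a bridge; follow it downstream.
        bus = secondary_bus(bus, dev, func)
    dev, func = path[-1]
    return bus, dev, func


# Hypothetical bridge tables for the same hardware under two bus numberings.
boot_a = {(0, 1, 3): 184, (184, 0, 0): 185}
boot_b = {(0, 1, 3): 182, (182, 0, 0): 183}

scope = [(1, 3), (0, 0), (0, 0)]
print(resolve_scope(0, scope, lambda b, d, f: boot_a[(b, d, f)]))
print(resolve_scope(0, scope, lambda b, d, f: boot_b[(b, d, f)]))
```

The path itself is identical in both cases; only the bus number it resolves to differs.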

But more, some devices that need configuration WRT DMAR, are not PCIe,
but still generate DMA and MSI interrupts. For them, DMAR table uses
ACPI ANDD (ACPI Namespace Device Declaration). Practically it is used
for devices behind LPC bridge.

So having such way to locate devices would be also useful for DMAR tweaking
and bhyve pass-through configs.
Bjoern A. Zeeb
2021-04-14 21:06:25 UTC
Post by Warner Losh
for wiring up CAM devices. However, while these extra uses would be nice,
supporting them is beyond the scope of the initial work (though hopefully
the initial work would make enabling these later easier). I plan on
implementing a generic locator KPI for this, but will focus on only the
uefi and newbus locators initially. Later acpi, ofw, fdt and other location
mechanisms can be added. The uefi path stuff, btw, does not require the
system boot using UEFI.
Probably USB as well? Having 10 serial consoles on USB Hubs and unplugging
one “early” one it is easy to end up re-numbering the entire chain after a
reboot. Not sure if this really fits into your problem/implementation domain.

/bz
Warner Losh
2021-04-14 21:41:55 UTC
On Wed, Apr 14, 2021 at 3:06 PM Bjoern A. Zeeb wrote:
Post by Warner Losh
for wiring up CAM devices. However, while these extra uses would be nice,
supporting them is beyond the scope of the initial work (though hopefully
the initial work would make enabling these later easier). I plan on
implementing a generic locator KPI for this, but will focus on only the
uefi and newbus locators initially. Later acpi, ofw, fdt and other location
mechanisms can be added. The uefi path stuff, btw, does not require the
system boot using UEFI.
Probably USB as well? Having 10 serial consoles on USB Hubs and unplugging
one “early” one it is easy to end up re-numbering the entire chain after a
reboot. Not sure if this really fits into your problem/implementation domain.
Yes. UEFI has a way to enumerate their physical location. However, each
device type has its own weirdness, and I'm not planning on doing them all,
at least not in the initial effort. I'd love help in this area, so if you
(generic you) would like to be part of the effort, please let me know. This
is doubly true where our device model doesn't quite match UEFI's device
model, or where we cross domains into CAM or the tty system...

Warner
Poul-Henning Kamp
2021-04-15 05:17:12 UTC
Post by Bjoern A. Zeeb
Probably USB as well? Having 10 serial consoles on USB Hubs and unplugging
one “early” one it is easy to end up re-numbering the entire chain after a
reboot. Not sure if this really fits into your problem/implementation domain.
My solution to that specific problem is the following entry in /etc/devd:

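# When an FTDI USB serial adapter attaches, link its callout device to a
# name derived from the adapter's serial number.
attach 500 {
    match "device-name" "uftdi[0-9]*";
    match "vendor" "0x0403";
    match "product" "(0x6001|0x6015)";
    action "ln -fs /dev/cua$ttyname /dev/cua_$sernum";
};

# When one of the same adapters detaches, remove that link again.
notify 500 {
    match "system" "USB";
    match "subsystem" "DEVICE";
    match "type" "DETACH";
    match "vendor" "0x0403";
    match "product" "(0x6001|0x6015)";
    action "rm -f /dev/cua_$sernum";
};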
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Daniel O'Connor via freebsd-arch
2021-04-16 11:35:20 UTC
Post by Poul-Henning Kamp
Post by Bjoern A. Zeeb
Probably USB as well? Having 10 serial consoles on USB Hubs and unplugging
one “early” one it is easy to end up re-numbering the entire chain after a
reboot. Not sure if this really fits into your problem/implementation domain.
attach 500 {
match "device-name" "uftdi[0-9]*";
match "vendor" "0x0403";
match "product" "(0x6001|0x6015)";
action "ln -fs /dev/cua$ttyname /dev/cua_$sernum";
};
notify 500 {
match "system" "USB";
match "subsystem" "DEVICE";
match "type" "DETACH";
match "vendor" "0x0403";
match "product" "(0x6001|0x6015)";
action "rm -f /dev/cua_$sernum";
};
I wrote a more general version of this, although when I did testing the serial number was not available so I had to store it for deletion later.

--
Daniel O'Connor
"The nice thing about standards is that there
are so many of them to choose from."
-- Andrew Tanenbaum
John-Mark Gurney
2021-05-17 17:56:01 UTC
Post by Daniel O'Connor via freebsd-arch
Post by Poul-Henning Kamp
Post by Bjoern A. Zeeb
Probably USB as well? Having 10 serial consoles on USB Hubs and unplugging
one “early” one it is easy to end up re-numbering the entire chain after a
reboot. Not sure if this really fits into your problem/implementation domain.
attach 500 {
match "device-name" "uftdi[0-9]*";
match "vendor" "0x0403";
match "product" "(0x6001|0x6015)";
action "ln -fs /dev/cua$ttyname /dev/cua_$sernum";
};
notify 500 {
match "system" "USB";
match "subsystem" "DEVICE";
match "type" "DETACH";
match "vendor" "0x0403";
match "product" "(0x6001|0x6015)";
action "rm -f /dev/cua_$sernum";
};
I wrote a more general version of this, although when I did testing the serial number was not available so I had to store it for deletion later.
Yeah, and I did a slight improvement here:
https://reviews.freebsd.org/D21886#554613

https://www.funkthat.com/~jmg/FreeBSD/usbserialsn

Which supports devices that don't have serial numbers by addressing them
via a path of hub port numbers which does not change...
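
The naming idea can be sketched in a few lines of Python. This is not the script linked above: the function stable_link_name and the cua_usb<bus>_<ports> format are invented here, and the sketch assumes the bus number and the hub port numbers stay the same across reboots.

```python
def stable_link_name(sernum, bus, ports):
    """Return a /dev link name for a USB serial adapter.

    sernum: serial number string, possibly empty
    bus:    USB bus number
    ports:  hub port numbers from the root hub down to the device
    """
    if sernum:
        # Same scheme as the devd entry quoted above: cua_$sernum.
        return f"cua_{sernum}"
    if not ports:
        raise ValueError("need a serial number or a port path")
    # No serial number: fall back to the physical port path.
    return "cua_usb{}_{}".format(bus, ".".join(str(p) for p in ports))


print(stable_link_name("A9014QZ", 0, [1]))
print(stable_link_name("", 1, [2, 4, 1]))
```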

One issue is that it isn't atomic, so there's always a slight race
between devd running and removing the link, and a new device appearing
at the old one's name.
--
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
Poul-Henning Kamp
2021-05-17 18:21:09 UTC
Post by John-Mark Gurney
Which supports devices that don't have serial numbers by addressing them
via a path of hub port numbers which does not change...
Remember two decades ago, when Microsoft wanted all USB devices to have a unique serial number and everybody freaked out ? :-)