Warner Losh
2021-04-14 18:35:48 UTC
Today, one can wire a PCI device like so:
hint.nvme.3.at="pci0:7:0:0"
to wire an instance to a unit number. This works well when you have a
relatively static configuration.
However, if you have a number of carrier cards that have a bunch of
storage, then you have a situation where you are wiring things like so:
hint.nvme.0.at="pci0:29:0:0" # card 0 in carrier 1
hint.nvme.4.at="pci0:30:0:0" # card 1 in carrier 1
hint.nvme.2.at="pci0:31:0:0" # card 2 in carrier 1
hint.nvme.3.at="pci0:32:0:0" # card 3 in carrier 1
hint.nvme.1.at="pci0:185:0:0" # card 0 in carrier 2
hint.nvme.5.at="pci0:186:0:0" # card 1 in carrier 2
hint.nvme.6.at="pci0:187:0:0" # card 2 in carrier 2
hint.nvme.7.at="pci0:188:0:0" # card 3 in carrier 2
where the bus numbers are stable from boot to boot... unless one of the
carrier cards isn't present, in which case the numbers change a bit, which
moves the nvme unit numbers around. So if carrier 1 goes away, the PCI bus
numbers on the second one may be 183, 184, 185, 186 so nvme1 becomes nvme8,
nvme5 becomes nvme9, nvme6 becomes nvme1 and nvme7 becomes nvme5. In our
application, this renumbering is undesirable. One might argue the
application shouldn't care about the numbering, but we have one that does,
in a way that makes that knowledge and dependency tricky to remove.
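To make the renumbering concrete, here is a small model of the wiring
described above. It is only a sketch, not FreeBSD code: the function name
and the allocation rule (an unhinted device gets the lowest unit number not
reserved by any hint) are assumptions made for illustration.

```python
# Hypothetical model of unit wiring by PCI bus number; not FreeBSD code.
# Hints from the example above: PCI bus number -> wired nvme unit.
HINTS = {29: 0, 30: 4, 31: 2, 32: 3, 185: 1, 186: 5, 187: 6, 188: 7}


def assign_units(buses, hints=HINTS):
    """Return {bus: unit} for the buses that were actually probed.

    Assumption: a device on a hinted bus gets the wired unit, and any
    other device gets the lowest unit number not reserved by a hint.
    """
    reserved = set(hints.values())
    assigned = {}
    next_free = 0
    for bus in buses:
        if bus in hints:
            assigned[bus] = hints[bus]
        else:
            # Skip unit numbers that some hint has claimed.
            while next_free in reserved:
                next_free += 1
            assigned[bus] = next_free
            next_free += 1
    return assigned


# Both carriers present: every device lands on its wired unit.
both = assign_units([29, 30, 31, 32, 185, 186, 187, 188])

# Carrier 1 absent: carrier 2 is now enumerated at buses 183-186.
shifted = assign_units([183, 184, 185, 186])
print(shifted)  # {183: 8, 184: 9, 185: 1, 186: 5}
```

The cards in carrier 2 that were nvme1, nvme5, nvme6 and nvme7 come up as
nvme8, nvme9, nvme1 and nvme5, because the hints key on bus numbers that
have shifted.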
Fortunately, UEFI has solved this problem with their device paths. UEFI
device paths are completely independent of PCI bus numbering, and other
items that are the arbitrary choice of the OS and/or the firmware booting
the system.
On a UEFI system, you might see paths more like the following for the above
devices:
PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x0,0x0)
PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x1,0x0)
PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x2,0x0)
PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x3,0x0)
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)
and if the first carrier card goes away, the path to the second one is
still the same. So one way out of this issue is to change the numbering to
be something more like:
hint.nvme.0.at="uefi:PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x0,0x0)"
hint.nvme.4.at="uefi:PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x1,0x0)"
hint.nvme.2.at="uefi:PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x2,0x0)"
hint.nvme.3.at="uefi:PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x3,0x0)"
hint.nvme.1.at="uefi:PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)"
hint.nvme.5.at="uefi:PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)"
hint.nvme.6.at="uefi:PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)"
hint.nvme.7.at="uefi:PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)"
which would solve the problem nicely (of course with a special case for
paths starting with "pci" for those use cases where that might still make
sense).
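As a rough illustration of that special case, the value of an "at" hint
could be split into a locator name and a path along these lines. This is a
sketch under assumptions, not the planned implementation: the function name
and the "legacy" label for values that keep today's meaning are invented
here.

```python
def split_hint_location(value):
    """Split the value of an "at" hint into (locator, path).

    Assumptions: values starting with "pci" (such as "pci0:7:0:0") and
    values without a colon keep their existing meaning, reported under
    the hypothetical locator name "legacy". Otherwise the text before
    the first colon names the locator, compared without regard to case.
    """
    if value.lower().startswith("pci"):
        return ("legacy", value)
    locator, sep, path = value.partition(":")
    if not sep:
        return ("legacy", value)
    return (locator.lower(), path)


print(split_hint_location("uefi:PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x0,0x0)"))
print(split_hint_location("pci0:7:0:0"))
```

Splitting on the first colon only means the path itself is passed through
untouched.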
I've started work on implementing this for PCI, and am looking for feedback
before I get too far down that path. I plan on making these case
insensitive because different UEFI tools render paths differently.
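A case-insensitive comparison of two path strings might look like the
following. This is a minimal sketch with an invented function name. It only
folds case; treating 0x1 and 0x01 as equal, for example, would need
additional normalization that is not shown here.

```python
def same_device_path(a, b):
    """Compare two UEFI device path strings node by node, ignoring case."""
    nodes_a = [node.strip().casefold() for node in a.split("/")]
    nodes_b = [node.strip().casefold() for node in b.split("/")]
    return nodes_a == nodes_b


print(same_device_path("PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x0,0x0)",
                       "pciroot(0x1)/PCI(0X1,0x1)/pci(0x0,0x0)"))  # True
```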
One could take this further, of course. The full UEFI path to a
partition on one of these devices is:
PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x3,0x0)/Pci(0x0,0x0)/NVMe(0x1,A2-19-48-44-8B-44-1B-00)/HD(9,GPT,0F8518D9-2DE5-11E8-B5F1-3CFDFE9D5250,0x430,0x19000)
so constructs like the following might make sense:
hint.nda.7.at="uefi:NVMe(0x1,A2-19-48-44-8B-44-1B-00)"
hint.ada.44.at="uefi:Sata(0x0,0xFFFF,0x0)"
for wiring up CAM devices. However, while these extra uses would be nice,
supporting them is beyond the scope of the initial work (though hopefully
the initial work would make enabling these later easier). I plan on
implementing a generic locator KPI for this, but will focus on only the
uefi and newbus locators initially. Later acpi, ofw, fdt and other location
mechanisms can be added. The uefi path stuff, btw, does not require the
system to boot using UEFI.
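One way to picture a generic locator is as a table of named functions,
each reporting a device's location in its own scheme, with hint matching
done against whichever locator the hint names. The sketch below is purely
illustrative: the registry, the function names and the dictionary standing
in for a device are all assumptions, and say nothing about what the real
KPI will look like.

```python
# Hypothetical registry: locator name -> function(device) -> location string.
LOCATORS = {}


def register_locator(name, get_location):
    """Register a function that reports a device's location for one scheme."""
    LOCATORS[name.casefold()] = get_location


def wired_unit(device, hints):
    """Return the unit wired to device, or None if no hint matches.

    hints is a list of (unit, locator, path) tuples. Locations are
    compared without regard to case.
    """
    for unit, locator, path in hints:
        get_location = LOCATORS.get(locator.casefold())
        if get_location is None:
            continue  # no such locator registered; ignore this hint
        location = get_location(device)
        if location is not None and location.casefold() == path.casefold():
            return unit
    return None


# A stand-in "uefi" locator that reads a precomputed path from the device.
register_locator("uefi", lambda dev: dev.get("uefi_path"))

hints = [(3, "uefi", "PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x3,0x0)")]
device = {"uefi_path": "PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x3,0x0)"}
print(wired_unit(device, hints))  # 3
```

Adding acpi, ofw or fdt in this model would mean registering one more
function, with the matching code left unchanged.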
So I'm writing today to solicit feedback on this approach. John Baldwin has
already offered some advice to structure this as a generic locator and to
have some newbus integration, but to also think about the larger picture.
I'm still working on the details about how to make the locators generic
enough to be widely useful to other locators, but also specific enough to
deal with the variations between these different systems.
Warner