Post by Garrett CooperPost by Adrian ChaddHi,
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy
I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.
I'd appreciate more reviews and some further testing.
Please create a dummy pull request or post the code up on Phabricator.
- Please put some of the items like policy_to_str and str_to_policy in a library.
I'll sketch out a libnuma after this lands. Yes, this stuff will be part of it.
Post by Garrett Cooper- Please use that code in the kernel as well for sysctl_vm_default_policy to reduce duplication.
ok. I'll do it after it lands.
Post by Garrett Cooper- Please note reasoning for why `options MAXMEMDOM=16` in numa(4).
There's no reasoning.
Post by Garrett Cooper- Why are checking for `if (p)` before calling PROC_UNLOCK(p) in sys_numa_getaffinity, but not sys_numa_setaffinity?
I'll check.
Post by Garrett Cooper- sys_numa_setaffinity and sys_numa_getaffinity look similar. Could something be implemented like sysctl(3) for handling getting/setting of affinities all in one shot?
Nope. Other things are the same (eg cpuset).
Post by Garrett Cooper- `if (p)` should be `if (p != NULL)`, etc per style(9).
- Is there a way that the affinity could be inherited/not inherited across threads, similar to what ktrace -i does? If so, how does one do that?
There isn't, much like how there isn't the same for that in cpuset.
Post by Garrett Cooper- In vm_domain_rr_selectdomain, should this use atomic(9), i.e. can multiple threads access/mangle the value of td_dom_rr_idx in parallel?
No - it's curthread. only curthread is modifying that.
Post by Garrett Cooper- In vm_domain_policy_validate, couldn’t you remove all of the intermediary `return (-1)`’s as long as you put `break;`s in the switch/case statements?
- Should vm_domain_policy_cleanup/vm_domain_policy_init be implemented? If so, what should they have in there?
They're placeholders - primarily because I think at some point the
policy definition will grow to be bigger and we may want to malloc /
reference them in some instances. (Much like cpuset.)
This way it's clear that they should be called, even if they don't do anything.
I like defining "object lifecycle", even if part of the create/destroy
lifecycle is null. Adding things like create/destroy functions later
on has proven to be more buggy.
Post by Garrett Cooper- Would it make sense for `struct vm_domain_iterator` to be a queue(9)-like type (just based on the name alone)? If not, what data structure do you anticipate it having, e.g. tree, queue, directed graph, etc?
No. It's an iterator - it's designed to abstract away the concept of
iterating over a domain policy. It won't have any kind of data
structure like you mentioned, as it's an iterator.
Post by Garrett Cooper- In vm_domain_iterator_run, vi->n is always decremented after the vi->n <= 0 run is done — why not move that outside the switch-case statement?
- Why did you hand roll `vm_domain_iterator_isdone` in `vm_domain_iterator_run` up at the top?
vm_domain_iterator_isdone() is (currently) defined as a tail loop
condition, not a head loop condition. Ie, you can only call it after
you've gone through the loop at least once.
Now, I don't necessarily like how the loop constructs work, but it's
designed to be a minimal set of changes to the vm_page.c code that was
doing this kind of iteration over the round-robin set.
I think it's because I wrote iterator_isdone() first before I finished
fleshing out iterator_run().
Post by Garrett Cooper- The parameter names for functions/syscalls can be omitted in the declarations.
ok
Post by Garrett Cooper- The change to vm_page.c seems like it could be committed separate from the NUMA changes.
It will be. It's in there because without it any significant imbalance
in page allocations between domains can end with exhaustion in one
domain and it all going bad.
Post by Garrett Cooper- `However without it'll kernel panic below - the code didn’t` -> `However without the following check, the kernel will panic below; the code didn’t`
- vm_domain_iterator_run seems like it could use one of the queue(9) data structures.
again, no - there's no queue struct underlying it. It's an iterator.
Post by Garrett Cooper- Why is OPT_TID 1001?
- `int error;` should come after `cpuset_t set;` per alignment and style(9).
- `atoi` parsing is better handled via strtoll, etc.
Il'l go sort those out in numactl.c soon.
Thanks!
-a