Discussion:
CFT/CFR: NUMA policy branch
Adrian Chadd
2015-05-13 01:32:16 UTC
Permalink
Hi,

I'm at the point now where I think this part is done and I'd like to
get it more formally reviewed and ready for commit into -HEAD.

Here's the branch:

https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy

* added syscalls;
* added vm domain policy and iterators;
* per-process, per-thread and global system policy;
* added numactl - a tool to manipulate, fetch and start processes w/ policies;;
* added manpages for everything above.

It's working fine for a handful of people who have been testing it.
I'd like to get some more wider testing and reporting.

I'd like to wrap this up in a formal phabricator review in the next
few days and kick start it along.

Thanks!


-adrian
Adrian Chadd
2015-06-01 03:47:46 UTC
Permalink
Hi,

It's in a review, and I've addressed most of the feedback:

https://reviews.freebsd.org/D2559

I'd like to get more feedback before I commit this. I'm hoping to
commit it next week so I can make a start on pushing numa awareness
into KVA allocation and contigmalloc/busdma.

Thanks!



-adrian
Post by Adrian Chadd
Hi,
I'm at the point now where I think this part is done and I'd like to
get it more formally reviewed and ready for commit into -HEAD.
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* added syscalls;
* added vm domain policy and iterators;
* per-process, per-thread and global system policy;
* added numactl - a tool to manipulate, fetch and start processes w/ policies;;
* added manpages for everything above.
It's working fine for a handful of people who have been testing it.
I'd like to get some more wider testing and reporting.
I'd like to wrap this up in a formal phabricator review in the next
few days and kick start it along.
Thanks!
-adrian
Warner Losh
2015-06-01 15:30:02 UTC
Permalink
Hi Adrian,

How does this work release to the branch that Jeff Roberson had for numaizing
the VM system? Can any of that work be carried forward?

Warner
Post by Adrian Chadd
Hi,
https://reviews.freebsd.org/D2559
I'd like to get more feedback before I commit this. I'm hoping to
commit it next week so I can make a start on pushing numa awareness
into KVA allocation and contigmalloc/busdma.
Thanks!
-adrian
Post by Adrian Chadd
Hi,
I'm at the point now where I think this part is done and I'd like to
get it more formally reviewed and ready for commit into -HEAD.
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* added syscalls;
* added vm domain policy and iterators;
* per-process, per-thread and global system policy;
* added numactl - a tool to manipulate, fetch and start processes w/ policies;;
* added manpages for everything above.
It's working fine for a handful of people who have been testing it.
I'd like to get some more wider testing and reporting.
I'd like to wrap this up in a formal phabricator review in the next
few days and kick start it along.
Thanks!
-adrian
_______________________________________________
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
Adrian Chadd
2015-06-01 16:57:55 UTC
Permalink
Where's it hiding? All I have seen is the projects/numa branch. This is a
subset of that.

Adrian
Post by Warner Losh
Hi Adrian,
How does this work release to the branch that Jeff Roberson had for numaizing
the VM system? Can any of that work be carried forward?
Warner
Post by Adrian Chadd
Hi,
https://reviews.freebsd.org/D2559
I'd like to get more feedback before I commit this. I'm hoping to
commit it next week so I can make a start on pushing numa awareness
into KVA allocation and contigmalloc/busdma.
Thanks!
-adrian
Post by Adrian Chadd
Hi,
I'm at the point now where I think this part is done and I'd like to
get it more formally reviewed and ready for commit into -HEAD.
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
Post by Adrian Chadd
Post by Adrian Chadd
* added syscalls;
* added vm domain policy and iterators;
* per-process, per-thread and global system policy;
* added numactl - a tool to manipulate, fetch and start processes w/
policies;;
Post by Adrian Chadd
Post by Adrian Chadd
* added manpages for everything above.
It's working fine for a handful of people who have been testing it.
I'd like to get some more wider testing and reporting.
I'd like to wrap this up in a formal phabricator review in the next
few days and kick start it along.
Thanks!
-adrian
_______________________________________________
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
Adrian Chadd
2015-06-25 00:49:57 UTC
Permalink
Hi,

I've updated the NUMA branch again:

https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy

The main fixes:

* (stas) Added short versions of options to numactl;
* (stas, wblock, rpaulo) Documentation and code fixes during review;
* (kib) updated the userland facing API to not include seq_t;
* fixed up a couple of silly bugs that gcc-4.2 picked up;
* fixed up compile issues on gcc-4.2.

I've tested this on mips32, mips64, x86 non-NUMA (GENERIC) and NUMA hardware.

kib@ has requested that this use the procctl() API rather than adding
new syscalls. procctl() currently doesn't support P_LWPID (ie thread)
based identifiers for any of its manipulation, so I'd have to go and
add that.

I think this is close to what I'd like to commit. I'd appreciate another review.

Thanks,


-adrian
Adrian Chadd
2015-07-06 02:06:27 UTC
Permalink
Hi,

I've done another update. kib@ has been beating me with the clue stick a bit.

https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy

* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy

I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.

I'd appreciate more reviews and some further testing.

Thanks!



-adrian
Konstantin Belousov
2015-07-07 09:37:47 UTC
Permalink
Post by Adrian Chadd
Hi,
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy
I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.
I'd appreciate more reviews and some further testing.
I did not found a fix for the wrong locking of seq_t.
Am I reading sys_numa_getaffinity() right that it does copyout() while
owning the process lock ?
The things are still syscalls instead of procctl() commands.

I did not read further, the patch is half-done at best.
Adrian Chadd
2015-07-07 22:53:18 UTC
Permalink
Post by Konstantin Belousov
Post by Adrian Chadd
Hi,
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy
I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.
I'd appreciate more reviews and some further testing.
I did not found a fix for the wrong locking of seq_t.
Which locking is wrong?

* the global policy is locked via the global mutex;
* the process/thread policy is locked via PROC_LOCK.

What did I miss?
Post by Konstantin Belousov
Am I reading sys_numa_getaffinity() right that it does copyout() while
owning the process lock ?
I've fixed and push that, thanks.
Post by Konstantin Belousov
The things are still syscalls instead of procctl() commands.
I haven't done that yet. procctl() doesn't have a concept of per-TID
operations at all, and it currently has a framework for iterating over
things that this would have to slot into somehow. I'm open to
suggestions on how you would integrate per-TID operations into
procctl().
Post by Konstantin Belousov
I did not read further, the patch is half-done at best.
That's lovely. Meanwhile, people are actively using this thing.




-adrian
Rui Paulo
2015-07-08 02:43:18 UTC
Permalink
Post by Adrian Chadd
Post by Konstantin Belousov
I did not read further, the patch is half-done at best.
That's lovely. Meanwhile, people are actively using this thing.
It may not be perfect, but it's way more than half done. You might object to
introducing the syscalls, but procctl is still annoyingly limited.
--
Rui Paulo
Alfred Perlstein
2015-07-08 05:33:19 UTC
Permalink
Post by Rui Paulo
Post by Adrian Chadd
Post by Konstantin Belousov
I did not read further, the patch is half-done at best.
That's lovely. Meanwhile, people are actively using this thing.
It may not be perfect, but it's way more than half done. You might object to
introducing the syscalls, but procctl is still annoyingly limited.
(not yelling at you Rui)... but really... Is that the problem?!!? Just
write a userland library to abstract the kernel interface!

-Alfred
Rui Paulo
2015-07-08 06:44:13 UTC
Permalink
Post by Alfred Perlstein
Post by Rui Paulo
Post by Adrian Chadd
Post by Konstantin Belousov
I did not read further, the patch is half-done at best.
That's lovely. Meanwhile, people are actively using this thing.
It may not be perfect, but it's way more than half done. You might object
to introducing the syscalls, but procctl is still annoyingly limited.
(not yelling at you Rui)... but really... Is that the problem?!!? Just
write a userland library to abstract the kernel interface!
How can a library help? If you can't tell the kernel to apply a policy per-
TID (procctl works by PID), it's useless for multi-threaded applications.
--
Rui Paulo
Alfred Perlstein
2015-07-08 07:36:07 UTC
Permalink
Post by Rui Paulo
Post by Alfred Perlstein
Post by Rui Paulo
Post by Adrian Chadd
Post by Konstantin Belousov
I did not read further, the patch is half-done at best.
That's lovely. Meanwhile, people are actively using this thing.
It may not be perfect, but it's way more than half done. You might object
to introducing the syscalls, but procctl is still annoyingly limited.
(not yelling at you Rui)... but really... Is that the problem?!!? Just
write a userland library to abstract the kernel interface!
How can a library help? If you can't tell the kernel to apply a policy per-
TID (procctl works by PID), it's useless for multi-threaded applications.
The library would abstract away the kernel boundary concerns. So let's say we did what kib wanted and extended procctl to support tids, well at that point the syscalls made could be garbage collected and the library updated to call the correct kernel entry point.
Alfred Perlstein
2015-07-08 05:32:11 UTC
Permalink
Post by Konstantin Belousov
I did not read further, the patch is half-done at best.
This is unproductive.

People are using Adrian's work and we plan on using it as well.

FreeBSD is behind in the NUMA game by at least a decade

Please describe in detail what the issues are, do not roadblock based on
vagueshedding.

JFYI:
From http://kernelnewbies.org/LinuxVersions ->

2.5.59 released Janury 17, 2003:
NUMA aware scheduler extensions

Let's move on and incrementally fix this as opposed to roadblocking.
It's 13 years late.

-Alfred
Garrett Cooper
2015-07-08 06:13:09 UTC
Permalink
Post by Adrian Chadd
Hi,
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy
I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.
I'd appreciate more reviews and some further testing.
Please create a dummy pull request or post the code up on Phabricator.

- Please put some of the items like policy_to_str and str_to_policy in a library.
- Please use that code in the kernel as well for sysctl_vm_default_policy to reduce duplication.
- Please note reasoning for why `options MAXMEMDOM=16` in numa(4).
- Why are checking for `if (p)` before calling PROC_UNLOCK(p) in sys_numa_getaffinity, but not sys_numa_setaffinity?
- sys_numa_setaffinity and sys_numa_getaffinity look similar. Could something be implemented like sysctl(3) for handling getting/setting of affinities all in one shot?
- `if (p)` should be `if (p != NULL)`, etc per style(9).
- Is there a way that the affinity could be inherited/not inherited across threads, similar to what ktrace -i does? If so, how does one do that?
- In vm_domain_rr_selectdomain, should this use atomic(9), i.e. can multiple threads access/mangle the value of td_dom_rr_idx in parallel?
- In vm_domain_policy_validate, couldn’t you remove all of the intermediary `return (-1)`’s as long as you put `break;`s in the switch/case statements?
- Should vm_domain_policy_cleanup/vm_domain_policy_init be implemented? If so, what should they have in there?
- Would it make sense for `struct vm_domain_iterator` to be a queue(9)-like type (just based on the name alone)? If not, what data structure do you anticipate it having, e.g. tree, queue, directed graph, etc?
- In vm_domain_iterator_run, vi->n is always decremented after the vi->n <= 0 run is done — why not move that outside the switch-case statement?
- Why did you hand roll `vm_domain_iterator_isdone` in `vm_domain_iterator_run` up at the top?
- The parameter names for functions/syscalls can be omitted in the declarations.
- The change to vm_page.c seems like it could be committed separate from the NUMA changes.
- `However without it'll kernel panic below - the code didn’t` -> `However without the following check, the kernel will panic below; the code didn’t`
- vm_domain_iterator_run seems like it could use one of the queue(9) data structures.
- Why is OPT_TID 1001?
- `int error;` should come after `cpuset_t set;` per alignment and style(9).
- `atoi` parsing is better handled via strtoll, etc.

Thanks!
-NGie
Adrian Chadd
2015-07-08 06:38:10 UTC
Permalink
There's a phabricator review. It's not up to date, because:

* it broke for a while, and
* kib requested he be sent patches, not a phabricator review.



-a
Post by Garrett Cooper
Post by Adrian Chadd
Hi,
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy
I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.
I'd appreciate more reviews and some further testing.
Please create a dummy pull request or post the code up on Phabricator.
- Please put some of the items like policy_to_str and str_to_policy in a library.
- Please use that code in the kernel as well for sysctl_vm_default_policy to reduce duplication.
- Please note reasoning for why `options MAXMEMDOM=16` in numa(4).
- Why are checking for `if (p)` before calling PROC_UNLOCK(p) in sys_numa_getaffinity, but not sys_numa_setaffinity?
- sys_numa_setaffinity and sys_numa_getaffinity look similar. Could something be implemented like sysctl(3) for handling getting/setting of affinities all in one shot?
- `if (p)` should be `if (p != NULL)`, etc per style(9).
- Is there a way that the affinity could be inherited/not inherited across threads, similar to what ktrace -i does? If so, how does one do that?
- In vm_domain_rr_selectdomain, should this use atomic(9), i.e. can multiple threads access/mangle the value of td_dom_rr_idx in parallel?
- In vm_domain_policy_validate, couldn’t you remove all of the intermediary `return (-1)`’s as long as you put `break;`s in the switch/case statements?
- Should vm_domain_policy_cleanup/vm_domain_policy_init be implemented? If so, what should they have in there?
- Would it make sense for `struct vm_domain_iterator` to be a queue(9)-like type (just based on the name alone)? If not, what data structure do you anticipate it having, e.g. tree, queue, directed graph, etc?
- In vm_domain_iterator_run, vi->n is always decremented after the vi->n <= 0 run is done — why not move that outside the switch-case statement?
- Why did you hand roll `vm_domain_iterator_isdone` in `vm_domain_iterator_run` up at the top?
- The parameter names for functions/syscalls can be omitted in the declarations.
- The change to vm_page.c seems like it could be committed separate from the NUMA changes.
- `However without it'll kernel panic below - the code didn’t` -> `However without the following check, the kernel will panic below; the code didn’t`
- vm_domain_iterator_run seems like it could use one of the queue(9) data structures.
- Why is OPT_TID 1001?
- `int error;` should come after `cpuset_t set;` per alignment and style(9).
- `atoi` parsing is better handled via strtoll, etc.
Thanks!
-NGie
Alfred Perlstein
2015-07-08 19:18:35 UTC
Permalink
Post by Adrian Chadd
* it broke for a while, and
* kib requested he be sent patches, not a phabricator review.
So Kib is complaining that his feedback is getting lost, but refuses to
use a review tracker?

MFW:
Loading Image...

-Alfred
Post by Adrian Chadd
-a
Post by Garrett Cooper
Post by Adrian Chadd
Hi,
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy
I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.
I'd appreciate more reviews and some further testing.
Please create a dummy pull request or post the code up on Phabricator.
- Please put some of the items like policy_to_str and str_to_policy in a library.
- Please use that code in the kernel as well for sysctl_vm_default_policy to reduce duplication.
- Please note reasoning for why `options MAXMEMDOM=16` in numa(4).
- Why are checking for `if (p)` before calling PROC_UNLOCK(p) in sys_numa_getaffinity, but not sys_numa_setaffinity?
- sys_numa_setaffinity and sys_numa_getaffinity look similar. Could something be implemented like sysctl(3) for handling getting/setting of affinities all in one shot?
- `if (p)` should be `if (p != NULL)`, etc per style(9).
- Is there a way that the affinity could be inherited/not inherited across threads, similar to what ktrace -i does? If so, how does one do that?
- In vm_domain_rr_selectdomain, should this use atomic(9), i.e. can multiple threads access/mangle the value of td_dom_rr_idx in parallel?
- In vm_domain_policy_validate, couldn’t you remove all of the intermediary `return (-1)`’s as long as you put `break;`s in the switch/case statements?
- Should vm_domain_policy_cleanup/vm_domain_policy_init be implemented? If so, what should they have in there?
- Would it make sense for `struct vm_domain_iterator` to be a queue(9)-like type (just based on the name alone)? If not, what data structure do you anticipate it having, e.g. tree, queue, directed graph, etc?
- In vm_domain_iterator_run, vi->n is always decremented after the vi->n <= 0 run is done — why not move that outside the switch-case statement?
- Why did you hand roll `vm_domain_iterator_isdone` in `vm_domain_iterator_run` up at the top?
- The parameter names for functions/syscalls can be omitted in the declarations.
- The change to vm_page.c seems like it could be committed separate from the NUMA changes.
- `However without it'll kernel panic below - the code didn’t` -> `However without the following check, the kernel will panic below; the code didn’t`
- vm_domain_iterator_run seems like it could use one of the queue(9) data structures.
- Why is OPT_TID 1001?
- `int error;` should come after `cpuset_t set;` per alignment and style(9).
- `atoi` parsing is better handled via strtoll, etc.
Thanks!
-NGie
_______________________________________________
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
Warner Losh
2015-07-08 19:33:40 UTC
Permalink
Post by Adrian Chadd
* it broke for a while, and
* kib requested he be sent patches, not a phabricator review.
So Kib is complaining that his feedback is getting lost, but refuses to use a review tracker?
http://images5.fanpop.com/image/photos/25000000/No-country-for-old-men-tommy-lee-jones-25069727-450-276.jpg
Do we really need to resort to name calling for a reviewer who is trying
to help make things better? kib has provided me good feedback on patches
I’ve done in the past, though it sometimes takes me a while to understand
his concerns. It is well worth the while to do that, and to engage him
constructively rather than belittling his efforts. And experience has shown
that phabricator is great for small patches, but terrible for large patches
that get revised over and over before going in. This is the 4th or 5th
review than I can recall where phabricator’s flaws went from minor
annoyances to major hassles. For really big reviews, I’m starting to think
after 2 or 3 rounds we should close the review and start a new one
to help work around the issues.

In other words, the right reaction to “I’m stopping the review here since it isn’t
half done” isn’t the defensive and belittling one one I’ve seen, but rather to
start a conversation about what he thinks is missing. Maybe he’s missed
something, or maybe you have. While it is cool people are using it, kib’s
concerns generally are looking past the initial glow of partial success for
how to climb the next mountain range and generally are worth the effort
to get.

Warner
-Alfred
Post by Adrian Chadd
-a
Post by Garrett Cooper
Post by Adrian Chadd
Hi,
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy
I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.
I'd appreciate more reviews and some further testing.
Please create a dummy pull request or post the code up on Phabricator.
- Please put some of the items like policy_to_str and str_to_policy in a library.
- Please use that code in the kernel as well for sysctl_vm_default_policy to reduce duplication.
- Please note reasoning for why `options MAXMEMDOM=16` in numa(4).
- Why are checking for `if (p)` before calling PROC_UNLOCK(p) in sys_numa_getaffinity, but not sys_numa_setaffinity?
- sys_numa_setaffinity and sys_numa_getaffinity look similar. Could something be implemented like sysctl(3) for handling getting/setting of affinities all in one shot?
- `if (p)` should be `if (p != NULL)`, etc per style(9).
- Is there a way that the affinity could be inherited/not inherited across threads, similar to what ktrace -i does? If so, how does one do that?
- In vm_domain_rr_selectdomain, should this use atomic(9), i.e. can multiple threads access/mangle the value of td_dom_rr_idx in parallel?
- In vm_domain_policy_validate, couldn’t you remove all of the intermediary `return (-1)`’s as long as you put `break;`s in the switch/case statements?
- Should vm_domain_policy_cleanup/vm_domain_policy_init be implemented? If so, what should they have in there?
- Would it make sense for `struct vm_domain_iterator` to be a queue(9)-like type (just based on the name alone)? If not, what data structure do you anticipate it having, e.g. tree, queue, directed graph, etc?
- In vm_domain_iterator_run, vi->n is always decremented after the vi->n <= 0 run is done — why not move that outside the switch-case statement?
- Why did you hand roll `vm_domain_iterator_isdone` in `vm_domain_iterator_run` up at the top?
- The parameter names for functions/syscalls can be omitted in the declarations.
- The change to vm_page.c seems like it could be committed separate from the NUMA changes.
- `However without it'll kernel panic below - the code didn’t` -> `However without the following check, the kernel will panic below; the code didn’t`
- vm_domain_iterator_run seems like it could use one of the queue(9) data structures.
- Why is OPT_TID 1001?
- `int error;` should come after `cpuset_t set;` per alignment and style(9).
- `atoi` parsing is better handled via strtoll, etc.
Thanks!
-NGie
_______________________________________________
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
_______________________________________________
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
Alfred Perlstein
2015-07-08 19:44:34 UTC
Permalink
What name calling?

Sent from my iPhone
Post by Warner Losh
Post by Adrian Chadd
* it broke for a while, and
* kib requested he be sent patches, not a phabricator review.
So Kib is complaining that his feedback is getting lost, but refuses to use a review tracker?
http://images5.fanpop.com/image/photos/25000000/No-country-for-old-men-tommy-lee-jones-25069727-450-276.jpg
Do we really need to resort to name calling for a reviewer who is trying
to help make things better? kib has provided me good feedback on patches
I’ve done in the past, though it sometimes takes me a while to understand
his concerns. It is well worth the while to do that, and to engage him
constructively rather than belittling his efforts. And experience has shown
that phabricator is great for small patches, but terrible for large patches
that get revised over and over before going in. This is the 4th or 5th
review than I can recall where phabricator’s flaws went from minor
annoyances to major hassles. For really big reviews, I’m starting to think
after 2 or 3 rounds we should close the review and start a new one
to help work around the issues.
In other words, the right reaction to “I’m stopping the review here since it isn’t
half done” isn’t the defensive and belittling one one I’ve seen, but rather to
start a conversation about what he thinks is missing. Maybe he’s missed
something, or maybe you have. While it is cool people are using it, kib’s
concerns generally are looking past the initial glow of partial success for
how to climb the next mountain range and generally are worth the effort
to get.
Warner
-Alfred
Post by Adrian Chadd
-a
Post by Garrett Cooper
Post by Adrian Chadd
Hi,
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy
I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.
I'd appreciate more reviews and some further testing.
Please create a dummy pull request or post the code up on Phabricator.
- Please put some of the items like policy_to_str and str_to_policy in a library.
- Please use that code in the kernel as well for sysctl_vm_default_policy to reduce duplication.
- Please note reasoning for why `options MAXMEMDOM=16` in numa(4).
- Why are checking for `if (p)` before calling PROC_UNLOCK(p) in sys_numa_getaffinity, but not sys_numa_setaffinity?
- sys_numa_setaffinity and sys_numa_getaffinity look similar. Could something be implemented like sysctl(3) for handling getting/setting of affinities all in one shot?
- `if (p)` should be `if (p != NULL)`, etc per style(9).
- Is there a way that the affinity could be inherited/not inherited across threads, similar to what ktrace -i does? If so, how does one do that?
- In vm_domain_rr_selectdomain, should this use atomic(9), i.e. can multiple threads access/mangle the value of td_dom_rr_idx in parallel?
- In vm_domain_policy_validate, couldn’t you remove all of the intermediary `return (-1)`’s as long as you put `break;`s in the switch/case statements?
- Should vm_domain_policy_cleanup/vm_domain_policy_init be implemented? If so, what should they have in there?
- Would it make sense for `struct vm_domain_iterator` to be a queue(9)-like type (just based on the name alone)? If not, what data structure do you anticipate it having, e.g. tree, queue, directed graph, etc?
- In vm_domain_iterator_run, vi->n is always decremented after the vi->n <= 0 run is done — why not move that outside the switch-case statement?
- Why did you hand roll `vm_domain_iterator_isdone` in `vm_domain_iterator_run` up at the top?
- The parameter names for functions/syscalls can be omitted in the declarations.
- The change to vm_page.c seems like it could be committed separate from the NUMA changes.
- `However without it'll kernel panic below - the code didn’t` -> `However without the following check, the kernel will panic below; the code didn’t`
- vm_domain_iterator_run seems like it could use one of the queue(9) data structures.
- Why is OPT_TID 1001?
- `int error;` should come after `cpuset_t set;` per alignment and style(9).
- `atoi` parsing is better handled via strtoll, etc.
Thanks!
-NGie
_______________________________________________
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
_______________________________________________
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
Andriy Gapon
2015-07-08 20:43:36 UTC
Permalink
Post by Adrian Chadd
* it broke for a while, and
* kib requested he be sent patches, not a phabricator review.
So Kib is complaining that his feedback is getting lost, but refuses to use a
review tracker?
How about phabricator losing diffs? Would that be a valid complaint?

P.S.
--
Andriy Gapon
Adrian Chadd
2015-07-08 20:57:41 UTC
Permalink
Post by Andriy Gapon
Post by Adrian Chadd
* it broke for a while, and
* kib requested he be sent patches, not a phabricator review.
So Kib is complaining that his feedback is getting lost, but refuses to use a
review tracker?
How about phabricator losing diffs? Would that be a valid complaint?
Hi,

Let's not get side tracked. I've invited a variety of people to review
and comment on this stuff. Some people want it in reviews.freebsd.org,
some want it via diffs. Different people want work done in different
units of work.

My plan is to get this into "good enough" state to throw into -HEAD. I
don't even care if in 12 months it's completely replaced with an
alternate implementation and/or API. What i care about right now is
getting the basic pieces in place so further work and experimentation
can be done. Right now the entry limit to evaluating any NUMA things
on FreeBSD is "you don't, without numa.diff", and that's unacceptable.
I completely expect that it'll change over the course of a few years.
But the fact we still don't have even the most basic userland exposed
API for controlling things is IMHO unacceptable and reflects poorly on
us as a community.

So, I'm looking for less nit-picking and more "this is wrong, you
should do this." A lot of kibs responses have been errors on my part
that I hadn't picked up on and weren't exposed during testing. I'm
looking for more of those. I haven't yet gone over Garrett's comments
in too much depth; I'll look at that tonight if I don't fall asleep
first.

The important thing here is to try and finally move the default
available functionality along a little bit so people can get
interested and start using this and contribute their own work.

Thanks,



-adrian
Alfred Perlstein
2015-07-08 21:53:11 UTC
Permalink
Post by Adrian Chadd
Post by Andriy Gapon
Post by Adrian Chadd
* it broke for a while, and
* kib requested he be sent patches, not a phabricator review.
So Kib is complaining that his feedback is getting lost, but refuses to use a
review tracker?
How about phabricator losing diffs? Would that be a valid complaint?
Hi,
Let's not get side tracked. I've invited a variety of people to review
and comment on this stuff. Some people want it in reviews.freebsd.org,
some want it via diffs. Different people want work done in different
units of work.
My plan is to get this into "good enough" state to throw into -HEAD. I
don't even care if in 12 months it's completely replaced with an
alternate implementation and/or API. What i care about right now is
getting the basic pieces in place so further work and experimentation
can be done. Right now the entry limit to evaluating any NUMA things
on FreeBSD is "you don't, without numa.diff", and that's unacceptable.
I completely expect that it'll change over the course of a few years.
But the fact we still don't have even the most basic userland exposed
API for controlling things is IMHO unacceptable and reflects poorly on
us as a community.
So, I'm looking for less nit-picking and more "this is wrong, you
should do this." A lot of kibs responses have been errors on my part
that I hadn't picked up on and weren't exposed during testing. I'm
looking for more of those. I haven't yet gone over Garrett's comments
in too much depth; I'll look at that tonight if I don't fall asleep
first.
The important thing here is to try and finally move the default
available functionality along a little bit so people can get
interested and start using this and contribute their own work.
Can we just use GitHub and move on already?

-Alfred
Adrian Chadd
2015-07-09 01:36:31 UTC
Permalink
Post by Garrett Cooper
Post by Adrian Chadd
Hi,
https://github.com/freebsd/freebsd/compare/master...erikarn:local/adrian_numa_policy
* (kib) (numactl.c) fix up sorting of include files
* (kib) (numactl.c) consistent use of values when calling err()
* (kib) (numactl.c) consistently wrap lines at 78 characters, don't
prematurely wrap lines
* (kib) don't use the old-style BSD licence mentioning "regents", use
the updated one
* (kib) (vm_domain.c) don't break out after iterating a few times and
have the API be unpredictable - so now the API will always succeed in
reading a vm_policy
I've tested the policies (first-touch, fixed-domain, round-robin) and
they all still work as advertised, both on threads and processes.
I'd appreciate more reviews and some further testing.
Please create a dummy pull request or post the code up on Phabricator.
- Please put some of the items like policy_to_str and str_to_policy in a library.
I'll sketch out a libnuma after this lands. Yes, this stuff will be part of it.
Post by Garrett Cooper
- Please use that code in the kernel as well for sysctl_vm_default_policy to reduce duplication.
ok. I'll do it after it lands.
Post by Garrett Cooper
- Please note reasoning for why `options MAXMEMDOM=16` in numa(4).
There's no reasoning.
Post by Garrett Cooper
- Why are checking for `if (p)` before calling PROC_UNLOCK(p) in sys_numa_getaffinity, but not sys_numa_setaffinity?
I'll check.
Post by Garrett Cooper
- sys_numa_setaffinity and sys_numa_getaffinity look similar. Could something be implemented like sysctl(3) for handling getting/setting of affinities all in one shot?
Nope. Other things are the same (eg cpuset).
Post by Garrett Cooper
- `if (p)` should be `if (p != NULL)`, etc per style(9).
- Is there a way that the affinity could be inherited/not inherited across threads, similar to what ktrace -i does? If so, how does one do that?
There isn't, much like how there isn't the same for that in cpuset.
Post by Garrett Cooper
- In vm_domain_rr_selectdomain, should this use atomic(9), i.e. can multiple threads access/mangle the value of td_dom_rr_idx in parallel?
No - it's curthread. only curthread is modifying that.
Post by Garrett Cooper
- In vm_domain_policy_validate, couldn’t you remove all of the intermediary `return (-1)`’s as long as you put `break;`s in the switch/case statements?
- Should vm_domain_policy_cleanup/vm_domain_policy_init be implemented? If so, what should they have in there?
They're placeholders - primarily because I think at some point the
policy definition will grow to be bigger and we may want to malloc /
reference them in some instances. (Much like cpuset.)
This way it's clear that they should be called, even if they don't do anything.

I like defining "object lifecycle", even if part of the create/destroy
lifecycle is null. Adding things like create/destroy functions later
on has proven to be more buggy.
Post by Garrett Cooper
- Would it make sense for `struct vm_domain_iterator` to be a queue(9)-like type (just based on the name alone)? If not, what data structure do you anticipate it having, e.g. tree, queue, directed graph, etc?
No. It's an iterator - it's designed to abstract away the concept of
iterating over a domain policy. It won't have any kind of data
structure like you mentioned, as it's an iterator.
Post by Garrett Cooper
- In vm_domain_iterator_run, vi->n is always decremented after the vi->n <= 0 run is done — why not move that outside the switch-case statement?
- Why did you hand roll `vm_domain_iterator_isdone` in `vm_domain_iterator_run` up at the top?
vm_domain_iterator_isdone() is (currently) defined as a tail loop
condition, not a head loop condition. Ie, you can only call it after
you've gone through the loop at least once.

Now, I don't necessarily like how the loop constructs work, but it's
designed to be a minimal set of changes to the vm_page.c code that was
doing this kind of iteration over the round-robin set.

I think it's because I wrote iterator_isdone() first before I finished
fleshing out iterator_run().
Post by Garrett Cooper
- The parameter names for functions/syscalls can be omitted in the declarations.
ok
Post by Garrett Cooper
- The change to vm_page.c seems like it could be committed separate from the NUMA changes.
It will be. It's in there because without it any significant imbalance
in page allocations between domains can end with exhaustion in one
domain and it all going bad.
Post by Garrett Cooper
- `However without it'll kernel panic below - the code didn’t` -> `However without the following check, the kernel will panic below; the code didn’t`
- vm_domain_iterator_run seems like it could use one of the queue(9) data structures.
again, no - there's no queue struct underlying it. It's an iterator.
Post by Garrett Cooper
- Why is OPT_TID 1001?
- `int error;` should come after `cpuset_t set;` per alignment and style(9).
- `atoi` parsing is better handled via strtoll, etc.
Il'l go sort those out in numactl.c soon.

Thanks!


-a

Loading...