Discussion:
syscall changes to deal with 32->64 changes.
Poul-Henning Kamp
2002-05-03 09:43:53 UTC
We have some 32 to 64 bit changes outstanding, which are necessary
to complete the UFS2 DARPA project.

So far the following three have been identified as having a syscall
impact:

ino_t
stat.st_gen
statfs.<many>

These are pretty trivial: each involves adding some new 64-bit versions
of the syscalls and corresponding data structures.

If we look a bit closer, a number of other things really need the
32->64 treatment, struct rlimit for instance. This is still
pretty trivial.

And then the bad one: time_t.

The time_t change is the most intrusive since it also changes the
size of struct timespec, and logically should change the size of
struct timeval but wouldn't since the second part there is defined
as "long". This makes rusage, itimerval, pps_info_t and SVID IPC
change size and affects quite a number of syscalls.
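
A minimal C sketch of the size change described above. The demo_* names are hypothetical stand-ins, not the real <sys/time.h> definitions, and the widths assume a 32-bit machine where "long" is 32 bits:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the real types. Names and widths are
 * assumptions for illustration (32-bit machine, 32-bit "long").
 */
typedef int32_t demo_time32_t;   /* time_t before the change */
typedef int64_t demo_time64_t;   /* time_t after the change */
typedef int32_t demo_long_t;     /* "long" on the assumed machine */

struct demo_timespec32 { demo_time32_t tv_sec; demo_long_t tv_nsec; };
struct demo_timespec64 { demo_time64_t tv_sec; demo_long_t tv_nsec; };

/* Any structure that embeds a time structure grows along with it. */
struct demo_interval32 { struct demo_timespec32 it_interval, it_value; };
struct demo_interval64 { struct demo_timespec64 it_interval, it_value; };

/* Bytes added to the timespec-like structure by widening tv_sec. */
size_t demo_timespec_growth(void)
{
    assert(sizeof(struct demo_timespec64) > sizeof(struct demo_timespec32));
    return sizeof(struct demo_timespec64) - sizeof(struct demo_timespec32);
}

/* Bytes added to a structure that embeds two such members. */
size_t demo_interval_growth(void)
{
    return sizeof(struct demo_interval64) - sizeof(struct demo_interval32);
}
```

The exact growth depends on how the ABI aligns 64-bit integers, so the sketch only relies on the structure getting larger, never on a specific padded size.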

And before anybody gets their knickers in a twist (there's a lot
of that going round these days as the doctor would say) this all
needs to be done and will be done in a backwards compatible way.

Questions:

1. Is there anything else that needs to change size while we're
at it ?

2. Is this a good occasion to create a new syscall vector for
FreeBSD 5.0 rather than embellish the existing one with even
more variations ?
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to ***@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message
Jonathan Mini
2002-05-03 10:48:38 UTC
Post by Poul-Henning Kamp
2. Is this a good occasion to create a new syscall vector for
FreeBSD 5.0 rather than embellish the existing one with even
more variations ?
I believe this is the cleanest solution. It makes sense to me that the entry
point for a function would change when the semantics or parameters change.
Also, folding more variations into the existing syscalls strikes me as a
mess for binary compatibility in the future. Especially since we're
dealing with a large number of syscalls here (changing the size of time_t is
going to hit a fair number, I would guess).

If we use a new syscall vector for the 64bit syscalls, we give ourselves the
opportunity to make a "clean break" away from the older ones.
--
Jonathan Mini <***@freebsd.org>
http://www.haikugeek.com

"He who is not aware of his ignorance will be only misled by his knowledge."
-- Richard Whatley

Bruce Evans
2002-05-03 19:56:33 UTC
Post by Poul-Henning Kamp
If we look a bit closer, a number of other things really need the
32->64 treatment, struct rlimit for instance. This is still
pretty trivial.
struct rlimit is already 64 bits.
Post by Poul-Henning Kamp
And then the bad one: time_t.
The time_t change is the most intrusive since it also changes the
size of struct timespec, and logically should change the size of
struct timeval but wouldn't since the second part there is defined
as "long". This makes rusage, itimerval, pps_info_t and SVID IPC
change size and affects quite a number of syscalls.
It would change the size of struct timeval on 32-bit machines, since
the alignment of the "long" should be 32 bits even if longs have the
correct size (2 * 32 bits).
Post by Poul-Henning Kamp
And before anybody gets their knickers in a twist (there's a lot
of that going round these days as the doctor would say) this all
needs to be done and will be done in a backwards compatible way.
1. Is there anything else that needs to change size while we're
at it ?
2. Is this a good occasion to create a new syscall vector for
FreeBSD 5.0 rather than embellish the existing one with even
more variations ?
Depends on how long you want to delay 5.0R by ;-).

Bruce


Robert Watson
2002-05-04 07:41:19 UTC
Post by Poul-Henning Kamp
1. Is there anything else that needs to change size while we're
at it ?
Well, you had the SysVIPC stuff in there anyway, so if you're going to
break the ABI, could you modify it so the uid and gid fields use uid_t and
gid_t, and if we don't use mode_t for the perm field already (I think we
do), start using mode_t there also?

I have patches that break out the user and kernel structures for sysvipc,
but they are pretty dated. I needed these to add MAC support for them,
and to start locking them down for SMPng, since both MAC labels and lock
entries shouldn't go in the user/kernel interface structure, which in
theory is a fixed and well-defined structure.
Post by Poul-Henning Kamp
2. Is this a good occasion to create a new syscall vector for
FreeBSD 5.0 rather than embellish the existing one with even
more variations ?
John Baldwin and I have thrown this idea around a number of times, as we
keep bumping into things that would change the ABI. Doing this doesn't
save you from having to write the compatibility code, but it would sure
make the syscalls.master file cleaner as we go forwards, and help a lot
with the binary compatibility issue. Here's my opinion: if it's a change
that can be made by one person in one week without boatloads of lasting
compatibility and ABI concerns, and you're volunteering to do it, then I
think it's a great idea. However, there are some technical questions to
answer, including how we identify binaries using the new ABI, etc. If
we're going to generate a new syscall table, 5.0 is definitely the time to
do it, however, so I think it's now or 6.0. And we should make sure we
really catch everything while we're here. I agree with Bruce's concerns
about the possible impact on a release date, hence specifying "a week" as
the implementation time frame.

<here ends material contribution to the discussion of whether we should go
with what you've proposed, what follows is more on the speculation side>

Another idea I've been bumping around for a bit is to better divorce the
"system call" and "service" implementations in the kernel. I don't
remember if I ever sent the picture out to -arch before, but it goes
something like this:

Now:

+------------------------------+ +-----------+ +---------------+
| FreeBSD ABI + basic service  | | Linux ABI | | Foo ABI ..... |
| implementation               | +-----------+ +---------------+
|                              |      ||              ||
|                              +-------------------------------+
|                                                              |
+--------------------------------------------------------------+
                               ||
                  ... VFS, Scheduler, VM, ...

Possible future layout:

+-----------------+ +---------------+ +-----------+ +-------------+
| FreeBSD Old ABI | | FreeBSD 5 ABI | | Linux ABI | | Foo ABI ... |
+-----------------+ +---------------+ +-----------+ +-------------+
        ||                 ||              ||             ||
+-----------------------------------------------------------------+
|                  Basic service implementation                   |
+-----------------------------------------------------------------+
                                ||
                    ... VFS, Scheduler, VM, ...

The goal of such a picture would be that, in general, the ABI layers would
have the sole purpose of bringing in arguments, mapping native ABI
structures to internal service structures, constructing uio's, etc. Then
the service implementation would largely deal with kernel-derived
arguments, or ones where a specific user vs. kernel flag would be present
to indicate the source of the arguments. This would facilitate writing
kernel code that invokes kernel APIs, such as mount, etc, by avoiding
situations where use of user addresses is hard-coded into kernel APIs. In
many cases, it actually wouldn't mean much in the way of changes, and the
lines might not always be clean (especially when a service implementation
was specific to an ABI because it wasn't widely required). If we moved to
a FreeBSD 5 ABI, now would be a good time to do that -- we'd make the
FreeBSD 5 structures be the "native" ones used for the service
implementations, then map past FreeBSD structures into them in the FreeBSD
Old ABI. We already have this sort of layout in many places in the VFS
code, to be honest.
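
A rough C sketch of that split, assuming a toy "stat"-like service. All names here (svc_getstat, abi5_getstat, oabi_getstat and both structures) are invented for illustration and are not existing kernel interfaces:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* "Native" structure used by the service implementation (hypothetical). */
struct svc_stat { uint64_t st_ino; int64_t st_mtime; };

/* Layout seen by binaries built against the old ABI (hypothetical). */
struct old_abi_stat { uint32_t st_ino; int32_t st_mtime; };

/* Service implementation: only ever deals with kernel-side structures. */
static int svc_getstat(uint64_t ino, struct svc_stat *sb)
{
    assert(sb != NULL);
    sb->st_ino = ino;                    /* toy body: echo the inode number */
    sb->st_mtime = INT64_C(1020000000);  /* arbitrary fixed timestamp */
    return 0;
}

/* New ABI front-end: the native layout is the ABI layout, so no mapping. */
int abi5_getstat(uint64_t ino, struct svc_stat *ub)
{
    return svc_getstat(ino, ub);
}

/* Old ABI front-end: call the service, then map into the old layout. */
int oabi_getstat(uint64_t ino, struct old_abi_stat *ub)
{
    struct svc_stat sb;
    int error = svc_getstat(ino, &sb);

    if (error != 0)
        return error;
    /* Refuse values the old structure cannot represent. */
    if (sb.st_ino > UINT32_MAX ||
        sb.st_mtime > INT32_MAX || sb.st_mtime < INT32_MIN)
        return EOVERFLOW;
    ub->st_ino = (uint32_t)sb.st_ino;
    ub->st_mtime = (int32_t)sb.st_mtime;
    return 0;
}
```

Returning EOVERFLOW for unrepresentable values is one possible policy for the old front-end; the thread itself does not say how such cases should be handled.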

Robert N M Watson FreeBSD Core Team, TrustedBSD Project
***@fledge.watson.org NAI Labs, Safeport Network Services



Poul-Henning Kamp
2002-05-04 09:44:55 UTC
Post by Robert Watson
Here's my opinion: if it's a change
that can be made by one person in one week without boatloads of lasting
compatibility and ABI concerns, and you're volunteering to do it, then I
think it's a great idea.
That's not even remotely close to realistic.

I think a realistic plan might look something like this:

1. Change the in-kernel types to 64 bits
   for each entity to change size:
      a) rewrite syscall entries to translate sizes.
      b) change size in kernel, but retain old size in userland

2. Add new syscall API
      a) write syscall implementations.

3. Change userland over.
      a) Add #ifdef NEWAPI all over the includes so people can select
         which API to use.
      b) Create new major-rev libc which uses new API
      c) Leave people time to find bugs in ports etc.
      d) Throw the switch for good.

Earliest realistic dates would be 3c on July 1st and 3d a month later.

Is that too late for 5.0-R ?
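
Step 1 (wide types in the kernel, old sizes in userland) could look roughly like the C sketch below for a time value. The structure and function names are hypothetical, and EOVERFLOW is only one plausible way to report a value that no longer fits:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Old userland layout, retained during step 1 (hypothetical). */
struct user_timespec32 { int32_t tv_sec; int32_t tv_nsec; };

/* Widened in-kernel layout (hypothetical). */
struct kern_timespec { int64_t tv_sec; int32_t tv_nsec; };

/* Syscall entry path: widening a user value cannot fail. */
void ts32_to_kern(const struct user_timespec32 *u, struct kern_timespec *k)
{
    assert(u != NULL && k != NULL);
    k->tv_sec = u->tv_sec;    /* sign extension keeps pre-1970 values */
    k->tv_nsec = u->tv_nsec;
}

/* Syscall return path: narrowing has to check the range first. */
int kern_to_ts32(const struct kern_timespec *k, struct user_timespec32 *u)
{
    assert(u != NULL && k != NULL);
    if (k->tv_sec > INT32_MAX || k->tv_sec < INT32_MIN)
        return EOVERFLOW;
    u->tv_sec = (int32_t)k->tv_sec;
    u->tv_nsec = k->tv_nsec;
    return 0;
}
```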
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Robert Watson
2002-05-04 18:42:43 UTC
Post by Poul-Henning Kamp
Post by Robert Watson
Here's my opinion: if it's a change
that can be made by one person in one week without boatloads of lasting
compatibility and ABI concerns, and you're volunteering to do it, then I
think it's a great idea.
That's not even remotely close to realistic.
Let me clarify what I meant there, it was
way-the-heck-early-in-the-morning. What I'd like to bound is the flag-day
procedure: we should try to avoid more than a week of disruption. Nothing
says the internal kernel interfaces, etc, can't be upgraded gradually as
you suggest, etc, just that we attempt to avoid extended breakage during
the change in default ABI.
Post by Poul-Henning Kamp
1. Change the in-kernel types to 64 bits
a) rewrite syscall entries to translate sizes.
b) change size in kernel, but retain old size in userland
2. Add new syscall API
a) write syscall implementations.
3. Change userland over.
a) Add #ifdef NEWAPI all over the includes so people can select
which API to use.
b) Create new major-rev libc which uses new API
c) Leave people time to find bugs in ports etc.
d) Throw the switch for good.
Earliest realistic dates would be 3c on July 1st and 3d a month later.
Is that too late for 5.0-R ?
That's probably something to discuss with re@, etc. My personal opinion
is that the ABI change would be alright through the end of August. I'd
like to see it more on the timeframe you suggest. If we do this, then we
do definitely need to have a DP3 using the new ABI, probably around Sept 1
or mid-August, to allow broader testing with the new ABI.

Robert N M Watson FreeBSD Core Team, TrustedBSD Project
***@fledge.watson.org NAI Labs, Safeport Network Services



Garance A Drosihn
2002-05-07 01:05:46 UTC
Post by Poul-Henning Kamp
1. Change the in-kernel types to 64 bits
a) rewrite syscall entries to translate sizes.
b) change size in kernel, but retain old size
in userland
2. Add new syscall API
a) write syscall implementations.
3. Change userland over.
a) Add #ifdef NEWAPI all over the includes so people
can select which API to use.
b) Create new major-rev libc which uses new API
c) Leave people time to find bugs in ports etc.
d) Throw the switch for good.
Earliest realistic dates would be 3c on July 1st and 3d
a month later.
Is that too late for 5.0-R ?
If that timetable is doable, then I think it's reasonable
to do. The alternative question is: if we did NOT go
with a new syscall vector, how much work would it be
to get these 32->64-bit changes done before 5.0?

I assume the variable would be something more distinctive
than "NEWAPI"...
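
For illustration, the header-side selection in step 3a might be sketched in C as follows. HYPOTHETICAL_FREEBSD5_ABI is an invented macro name (the thread never settles on one), and the demo_* types stand in for the real ones:

```c
#include <assert.h>
#include <stdint.h>

/*
 * HYPOTHETICAL_FREEBSD5_ABI is an invented name. Defining it before
 * including the header would select the new, wider types.
 */
#ifdef HYPOTHETICAL_FREEBSD5_ABI
typedef int64_t  demo_time_t;
typedef uint64_t demo_ino_t;
#define DEMO_ABI_NAME "new"
#else
typedef int32_t  demo_time_t;
typedef uint32_t demo_ino_t;
#define DEMO_ABI_NAME "old"
#endif

/* The same source-level structure compiles to two different layouts. */
struct demo_stat { demo_ino_t st_ino; demo_time_t st_mtime; };

/* Name of the API this translation unit was compiled against. */
const char *demo_abi_name(void)
{
    return DEMO_ABI_NAME;
}

/* Width of the selected time type, in bits. */
unsigned demo_time_bits(void)
{
    assert(sizeof(demo_time_t) == sizeof(demo_ino_t));
    return (unsigned)(sizeof(demo_time_t) * 8);
}
```

Compiling the same file with and without -DHYPOTHETICAL_FREEBSD5_ABI gives the two layouts, which is why libc and every port need a rebuild once the default changes.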
Post by Robert Watson
John Baldwin and I have thrown this idea around a number
of times, as we keep bumping into things that would change
the ABI.
What things would those be? Might as well get them listed,
and see how many of them (if any...) could be included in
this new vector without hurting the timetable.
--
Garance Alistair Drosehn = ***@gilead.netel.rpi.edu
Senior Systems Programmer or ***@freebsd.org
Rensselaer Polytechnic Institute or ***@rpi.edu

Terry Lambert
2002-05-07 02:22:41 UTC
Post by Garance A Drosihn
Post by Robert Watson
John Baldwin and I have thrown this idea around a number
of times, as we keep bumping into things that would change
the ABI.
What things would those be? Might as well get them listed,
and see how many of them (if any...) could be included in
this new vector without hurting the timetable.
Apart from the obvious "stat" values and time_t, there's also
nlink_t, dev_t, etc.... and if we're really clever, a version
number for the stat structure, as the first element.
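
One way to read the "version number as the first element" idea, sketched in C. The xstat_* names, version constants and field choices are all hypothetical; this is not an existing interface:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XSTAT_V1 1u   /* hypothetical: 32-bit inode and time fields */
#define XSTAT_V2 2u   /* hypothetical: 64-bit inode and time fields */

struct xstat_v1 { uint32_t xs_version; uint32_t xs_ino; int32_t xs_mtime; };
struct xstat_v2 { uint32_t xs_version; uint32_t xs_pad;
                  uint64_t xs_ino; int64_t xs_mtime; };

/*
 * The caller sets the version in the first field of its buffer; the
 * service fills in whichever layout was requested. Returns the number
 * of bytes written, or 0 for an unknown version or a short buffer.
 */
size_t xstat_fill(void *buf, size_t buflen, uint64_t ino, int64_t mtime)
{
    uint32_t version;

    assert(buf != NULL);
    if (buflen < sizeof(version))
        return 0;
    memcpy(&version, buf, sizeof(version));

    if (version == XSTAT_V1 && buflen >= sizeof(struct xstat_v1)) {
        /* Truncates silently; a real interface would report overflow. */
        struct xstat_v1 s = { XSTAT_V1, (uint32_t)ino, (int32_t)mtime };
        memcpy(buf, &s, sizeof(s));
        return sizeof(s);
    }
    if (version == XSTAT_V2 && buflen >= sizeof(struct xstat_v2)) {
        struct xstat_v2 s = { XSTAT_V2, 0, ino, mtime };
        memcpy(buf, &s, sizeof(s));
        return sizeof(s);
    }
    return 0;
}
```

With a scheme like this, a later layout change would add a new version constant instead of a new syscall number.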

-- Terry

Poul-Henning Kamp
2002-05-07 05:10:23 UTC
[Message body hidden by the archive.]
Terry Lambert
2002-05-07 07:08:49 UTC
Post by Terry Lambert
Post by Garance A Drosihn
Post by Robert Watson
John Baldwin and I have thrown this idea around a number
of times, as we keep bumping into things that would change
the ABI.
What things would those be? Might as well get them listed,
and see how many of them (if any...) could be included in
this new vector without hurting the timetable.
Apart from the obvious "stat" values and time_t, there's also
nlink_t, dev_t, etc.... and if we're really clever, a version
number for the stat structure, as the first element.
Can I officially move that PHK answer Garance Drosihn's original
question, in the context of the UFS2 project, which Poul has
already suggested will need the data structure changes which
spurred the current thread on the system call API (and thus ABI)
changing?

It's been 4 days since the question was asked, and his only comment
has been a request to squelch discussion of the list of things which
will have to change. I would prefer that such a list be as complete
as possible, to avoid additional changes being required as a result
of future work by others.

Thanks,
-- Terry

Robert Watson
2002-05-07 11:36:44 UTC
Post by Garance A Drosihn
Post by Robert Watson
John Baldwin and I have thrown this idea around a number
of times, as we keep bumping into things that would change
the ABI.
What things would those be? Might as well get them listed,
and see how many of them (if any...) could be included in
this new vector without hurting the timetable.
Apart from the obvious "stat" values and time_t, there's also nlink_t,
dev_t, etc.... and if we're really clever, a version number for the stat
structure, as the first element.
I have a few in my queue that don't relate to UFS2, and that includes
updating the ipcperm structure to use uid_t and gid_t rather than u_short.
This breaks the ABI for most of the sysv calls, if not all of them,
unfortunately. I have some increasingly dated patches that begin to
separate user and kernel structures for the sysvipc code, and probably
ought to update, complete, and commit them sometime. This makes the ABI
change easier to handle. There are already ofoo() calls for sysvipc, and
an interesting question is how long we have to wait before we can remove
those (that was from when someone updated u_short to pid_t, I think, but
didn't do the rest).
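
A small C sketch of why u_short owner fields are a problem. The demo_* structures are hypothetical simplifications (not the real ipc_perm layout), and 32-bit uid_t/gid_t plus a 16-bit mode are assumptions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: owner IDs stored as u_short. */
struct demo_ipc_perm_old { uint16_t uid; uint16_t gid; uint16_t mode; };

/* Hypothetical new layout: IDs as (assumed 32-bit) uid_t and gid_t. */
struct demo_ipc_perm_new { uint32_t uid; uint32_t gid; uint16_t mode; };

/* Nonzero if the uid can be stored in the old structure unchanged. */
int demo_uid_fits_old(uint32_t uid)
{
    return uid <= UINT16_MAX;
}

/* Store into the old layout; only the low 16 bits survive. */
uint16_t demo_set_uid_old(struct demo_ipc_perm_old *p, uint32_t uid)
{
    assert(p != NULL);
    p->uid = (uint16_t)uid;
    return p->uid;
}

/* Store into the new layout; the value is kept intact. */
uint32_t demo_set_uid_new(struct demo_ipc_perm_new *p, uint32_t uid)
{
    assert(p != NULL);
    p->uid = uid;
    return p->uid;
}
```

With the old layout, uid 70000 is stored as 4464 (70000 - 65536), so ownership checks would match the wrong user.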

Robert N M Watson FreeBSD Core Team, TrustedBSD Project
***@fledge.watson.org NAI Labs, Safeport Network Services



Matthew Dillon
2002-05-07 08:15:49 UTC
As a person who has already spent several days playing with
64 bit time_t's and associated syscalls I would say that
creating a new syscall vector for FreeBSD 5.0 (and leaving the
existing one intact for compatibility) is the correct solution.

I tried to add new syscalls to the existing vector but so many
syscalls had to be changed to support 64 bit time_t's that it
became a huge mess... so much so that I would expect the other
BSD's to cry foul on us if we tried to do it with the existing
vector. It will be far, far cleaner to simply implement an
entirely new syscall vector / ELF identifier.

In regards to other things that may need to change size: A complete
audit will have to be performed. I would be happy to take a run through.
Getting it right the first time is extremely important. A bunch of
things started leaking out of the woodwork as I was playing around
with 64 bit time_t's. At the very least I would pad the structures
to handle things like 64 bit dev_t, ino_t, and file flags, and I would
even consider padding now 96 bit structures like timespec (on IA32 with
64 bit time_t's + nanoseconds long = 96 bits) to 128 bits across the
board. It might also be worthwhile to make uid_t, gid_t, and pid_t
64 bits to support probable future work in those areas.
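
    A C sketch of the 96-to-128-bit padding idea. The structure name and
    the tv_pad field are hypothetical, and tv_nsec is written as int32_t
    to model "long" on IA32:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical padded layout: 64 + 32 + 32 = 128 bits, identical on
 * ABIs that align 64-bit integers to 4 bytes and to 8 bytes.
 */
struct demo_timespec_padded {
    int64_t tv_sec;    /* 64-bit time_t */
    int32_t tv_nsec;   /* "long" on IA32 */
    int32_t tv_pad;    /* explicit padding, kept zero */
};

/* Build a value with the padding explicitly cleared. */
struct demo_timespec_padded demo_timespec_make(int64_t sec, int32_t nsec)
{
    struct demo_timespec_padded ts;

    assert(nsec >= 0 && nsec < 1000000000);
    ts.tv_sec = sec;
    ts.tv_nsec = nsec;
    ts.tv_pad = 0;
    return ts;
}

/* Size of the padded structure, in bits. */
size_t demo_timespec_padded_bits(void)
{
    return sizeof(struct demo_timespec_padded) * 8;
}
```

    Making the padding an explicit field means the layout no longer
    depends on compiler-inserted tail padding, and the spare 32 bits stay
    available for later use.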

In regards to the 5.0R question that comes up later in the thread...
I just don't know. I will say that creating a new syscall vector
cannot be done piecemeal... you have to get it *all* in from the
get-go or you create huge issues with things like bootstrapping new
systems and general compatibility and usability, etc..

-Matt

:We have some 32 to 64 bit changes outstanding, which are necessary
:to complete the UFS2 DARPA project.
:...
:
:1. Is there anything else that needs to change size while we're
: at it ?
:
:2. Is this a good occasion to create a new syscall vector for
: FreeBSD 5.0 rather than embellish the existing one with even
: more variations ?
:
:--
:Poul-Henning Kamp | UNIX since Zilog Zeus 3.20

Poul-Henning Kamp
2002-05-07 12:16:19 UTC
Post by Matthew Dillon
In regards to other things that may need to change size: A complete
audit will have to be performed. I would be happy to take a run through.
No need to, I'm already busy doing that.
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Robert Watson
2002-05-07 13:46:56 UTC
Post by Matthew Dillon
I tried to add new syscalls to the existing vector but so many
syscalls had to be changed to support 64 bit time_t's that it
became a huge mess... so much so that I would expect the other
BSD's to cry foul on us if we tried to do it with the existing
vector. It will be far, far cleaner to simply implement an
entirely new syscall vector / ELF identifier.
Sounds like we actually have relatively firm consensus on this point [thus
far]. The only concern really has been level of effort and chances of
completeness in a reasonable time.
Post by Matthew Dillon
In regards to other things that may need to change size: A complete
audit will have to be performed. I would be happy to take a run through.
Getting it right the first time is extremely important. A bunch of
things started leaking out of the woodwork as I was playing around
with 64 bit time_t's. At the very least I would pad the structures
to handle things like 64 bit dev_t, ino_t, and file flags, and I would
even consider padding now 96 bit structures like timespec (on IA32 with
64 bit time_t's + nanoseconds long = 96 bits) to 128 bits across the
board. It might also be worthwhile to make uid_t, gid_t, and pid_t
64 bits to support probable future work in those areas.
It's certainly worth talking about, in that it would be a good breaking
point for any of these. One thing to keep in mind is how much inode
growth is acceptable -- we get some as part of the natural process of
moving to 64-bit block pointers, but we shouldn't let it get out of
control or risk serious reduction in cached inode information for the same
memory footprint.

My understanding from talking with Kirk and Poul-Henning is that ino_t is
definitely on the list already, as are a number of the others at the file
system image layer (such as block pointers, et al). They should already
have much of this underway, and the temptation is to allow them to do the
grunt work if they're already doing it :-).

dev_t I don't really have an opinion about.

For file flags, last I checked, the current leaning was to actually add a
new flags field to the inode for internal system use, and possibly
breaking out the current field into two fields. All of this is at the fs
image level however. This would allow moving the snapshot flag out of the
normal fields range and reducing the level of hard-to-read flag masking in
the UFS2 code. The system flags would also allow for some EA information
caching, improving performance for ACLs and related things.

No opinion on the time type stuff, but I'm sure phk has an opinion :-).

WRT uid_t and gid_t: I'm not sure if there's enough benefit here. The
temptation is certainly there, but unlike things like ino_t, this stuff
tends to get mixed up in data files a lot. Same with time type stuff.
This is more likely to cause problems with persistent application state --
for example, password databases.
Post by Matthew Dillon
In regards to the 5.0R question that comes up later in the thread...
I just don't know. I will say that creating a new syscall vector
cannot be done piecemeal... you have to get it *all* in from the
get-go or you create huge issues with things like bootstrapping new
systems and general compatibility and usability, etc..
Well, I think it's possible to start doing back-end stuff in the kernel
and just "cast down", but I haven't given it too much think-through yet.
Where we'll really need a flag day is with the default compiled ABI for
user binaries. It depends how we handle the change to some of the types,
I suppose -- whether we use "holding types" during a changeover, etc. One
of the reasons I mentioned the "one week" figure in my earlier responses
is that when we do the cut-over, we need to do it decisively and without a
lot of lag in getting stuff working. At the filesystem level, introducing
new types doesn't hurt us too much (ufs_ino_t, ufs2_ino_t, et al), but at
the system call layer it could be that this would hurt too much.

I guess the one opinion I haven't heard yet, and am a little surprised not
to have heard is:

No, we shouldn't do this on architectural grounds.

We've heard "yes" in various flavors, including moderated "yes if we can
manage it by the release". Not to invite a bikeshed, but if there's going
to be a strong argument against such a change, it would be nice to hear it
sooner.

Robert N M Watson FreeBSD Core Team, TrustedBSD Project
***@fledge.watson.org NAI Labs, Safeport Network Services




Garance A Drosihn
2002-05-07 22:52:59 UTC
Post by Robert Watson
A bunch of things started leaking out of the woodwork
as I was playing around with 64 bit time_t's. At the
very least I would pad the structures to handle things
like 64 bit dev_t, ino_t, and file flags, and I would
even consider padding now 96 bit structures like timespec
to 128 bits across the board. It might also be worthwhile
to make uid_t, gid_t, and pid_t 64 bits to support probable
future work in those areas.
My understanding from talking with Kirk and Poul-Henning is
that ino_t is definitely on the list already, as are a number
of the others at the file system image layer (such as block
pointers, et al). They should already have much of this
underway, and the temptation is to allow them to do the
grunt work if they're already doing it :-).
Certainly we should try to talk them into doing as much of
the work as we can convince them to do... :-)
Post by Robert Watson
dev_t I don't really have an opinion about.
I would really really like to have the larger dev_t, but I
expect everyone remembers my previous pleading on the topic.
So, I will just add a "pretty please?" here as a reminder.
I really think it's the right thing to do for OpenAFS, etc.
Post by Robert Watson
I guess the one opinion I haven't heard yet, and am a little
surprised not to have heard is:
No, we shouldn't do this on architectural grounds.
We've heard "yes" in various flavors, including moderated
"yes if we can manage it by the release". Not to invite a
bikeshed, but if there's going to be a strong argument
against such a change, it would be nice to hear it sooner.
My vote is a pretty strong "yes" for at least the changes
that Poul-Henning originally mentioned. I might want to
change that vote if we're at mid-July and we can't seem to
pull the changes together, but I think the new syscall
vector is less painful than trying to do all the UFS2 work
(which I *do* want in 5.0) via any other method. I would
not mind if 5.0 slipped a month for that work. (not that
I want it to slip, but I would not feel bad if we find out
that it had to slip one month for this change).

Where I start hemming and hawing is when it comes to how
many other changes should be added. Basically I want "as
many as we can do and still keep to the schedule". I would
not want 5.0 to slip for other syscall-ish changes.
--
Garance Alistair Drosehn = ***@gilead.netel.rpi.edu
Senior Systems Programmer or ***@freebsd.org
Rensselaer Polytechnic Institute or ***@rpi.edu

John Baldwin
2002-05-07 14:33:20 UTC
Post by Matthew Dillon
As a person who has already spent several days playing with
64 bit time_t's and associated syscalls I would say that
creating a new syscall vector for FreeBSD 5.0 (and leaving the
existing one intact for compatibility) is the correct solution.
I agree completely.
Post by Matthew Dillon
I tried to add new syscalls to the existing vector but so many
syscalls had to be changed to support 64 bit time_t's that it
became a huge mess... so much so that I would expect the other
BSD's to cry foul on us if we tried to do it with the existing
vector. It will be far, far cleaner to simply implement an
entirely new syscall vector / ELF identifier.
Right. I would like to use the ELF_ABI_OSVERSION thingie, but according to
O'Brien it violates the ELF spec to use any value other than 0. But then
why does it exist if we can't use it? :( One idea being thrown about is
that for the new ABI (It would be FreeBSD ABI 2) there would be a syscall
at syscall 0 that would never change that could be used to select what "minor"
version of the ABI you wanted to use, so if we wanted to add a new ABI at
some point in the future we could stick that syscall in crt.c (or whatever
the proper startup file is) to change to ABI 2.1 or 2.2 or something. However,
for the current change, I think bumping the ELF OS version and adding a new
ABI front-end for ABI 2 is the way to go.
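
The "syscall 0 selects the minor version" idea might be sketched in C like this. Everything here (the demo_* names, the table contents, EINVAL as the error) is hypothetical; the thread only describes the concept:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical description of one ABI revision. */
struct demo_abi { int major; int minor; const char *name; };

/* Hypothetical table of revisions the kernel knows about. */
static const struct demo_abi demo_abis[] = {
    { 2, 0, "FreeBSD ABI 2.0" },
    { 2, 1, "FreeBSD ABI 2.1" },
    { 2, 2, "FreeBSD ABI 2.2" },
};

/* Per-process state: which revision syscalls are dispatched through. */
struct demo_proc { const struct demo_abi *abi; };

/* A new process starts on the base revision of its major ABI. */
void demo_proc_init(struct demo_proc *p)
{
    assert(p != NULL);
    p->abi = &demo_abis[0];
}

/*
 * Stand-in for the never-changing "syscall 0": startup code would call
 * it once to pick the minor revision it was built against.
 */
int demo_sys_abi_select(struct demo_proc *p, int minor)
{
    size_t i;

    assert(p != NULL && p->abi != NULL);
    for (i = 0; i < sizeof(demo_abis) / sizeof(demo_abis[0]); i++) {
        if (demo_abis[i].major == p->abi->major &&
            demo_abis[i].minor == minor) {
            p->abi = &demo_abis[i];
            return 0;
        }
    }
    return EINVAL;   /* unknown revision: leave the process unchanged */
}
```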

Another advantage of this is that if our tools handle the OS version bit
properly, we can actually not have to worry about much of a flag day. Instead
we can gradually add the new ABI and support for it and then just quietly
throw the switch to change the default ABI in current one day. Old binaries
would still run fine. The only tricky part is for libc.so since you would
really need two versions of it (unless we also bumped to libc.so.6 for this..)
and we would need to make sure we linked to the right OSVERSION of the lib when
doing runtime linking, which could get ugly.
Post by Matthew Dillon
In regards to other things that may need to change size: A complete
audit will have to be performed. I would be happy to take a run through.
Getting it right the first time is extremely important. A bunch of
things started leaking out of the woodwork as I was playing around
with 64 bit time_t's. At the very least I would pad the structures
to handle things like 64 bit dev_t, ino_t, and file flags, and I would
even consider padding now 96 bit structures like timespec (on IA32 with
64 bit time_t's + nanoseconds long = 96 bits) to 128 bits across the
board. It might also be worthwhile to make uid_t, gid_t, and pid_t
64 bits to support probable future work in those areas.
I agree, here, too.
Post by Matthew Dillon
In regards to the 5.0R question that comes up later in the thread...
I just don't know. I will say that creating a new syscall vector
cannot be done piecemeal... you have to get it *all* in from the
get-go or you create huge issues with things like bootstrapping new
systems and general compatibility and usability, etc..
I think we can gradually overhaul it internally in the kernel before we throw
the switch to turn it on by default since all we would be breaking would be
test binaries and not "real" ones. Once the new ABI is golden, then we would
throw the switch to change the default.

I'm also not sure if we shouldn't wait to do this until 6.0.
Post by Matthew Dillon
-Matt
:We have some 32 to 64 bit changes outstanding, which are necessary
:to complete the UFS2 DARPA project.
:...
:1. Is there anything else that needs to change size while we're
: at it ?
:2. Is this a good occasion to create a new syscall vector for
: FreeBSD 5.0 rather than embellish the existing one with even
: more variations ?
:--
:Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
--
John Baldwin <***@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

Robert Watson
2002-05-07 15:28:26 UTC
Post by John Baldwin
Right. I would like to use the ELF_ABI_OSVERSION thingie, but according
to O'Brien it violates the ELF spec to use any value other than 0. But
then why does it exist if we can't use it? :( One idea being thrown
about is that for the new ABI (It would be FreeBSD ABI 2) there would be
a syscall at syscall 0 that would never change that could be used to
select what "minor" version of the ABI you wanted to use, so if we
wanted to add a new ABI at some point in the future we could stick that
syscall in crt.c (or whatever the proper startup file is) to change to
ABI 2.1 or 2.2 or something. However, for the current change, I think
bumping the ELF OS version and adding a new ABI front-end for ABI 2 is
the way to go.
Another advantage of this is that if our tools handle the OS version bit
properly, we can actually not have to worry about much of a flag day.
Instead we can gradually add the new ABI and support for it and then
just quietly throw the switch to change the default ABI in current one
day. Old binaries would still run fine. The only tricky part is for
libc.so since you would really need two versions of it (unless we also
bumped to libc.so.6 for this..) and we would need to make sure we
linked to the right OSVERSION of the lib when doing runtime linking,
which could get ugly.
My feeling on a strategy resembles the one phk proposed...

(1) Someone does a first pass. Since Kirk and Poul-Henning are doing some
of this anyway, they'd be good candidates. They do what they need,
but don't get carried away.

(2) They commit this, but it does not become the default. How to handle
#includes is an interesting question, perhaps using some new_foo_t
until the switchover is done. Or foo64_t or the like. Throwing a
date at this is a good idea. phk's dates looked good -- perhaps
mid-June or late June.

(3) We have a window where other developers now take care of the things
they care about in the new ABI. For example, I'd be happy to pick up
the sysvipc work to switch to uid_t and gid_t. Depending on our
definition of "get carried away" above, this might be a good time to
update the time-related types.

(4) The switch is thrown. This is the second date of interest. We don't
want to let this get beyond the end of July, so we can cut DP3 (which
would be inevitable with a new ABI) with the new ABI and get decent
testing.
Post by John Baldwin
I think we can gradually overhaul it internally in the kernel before we
throw the switch to turn it on by default since all we would be breaking
would be test binaries and not "real" ones. Once the new ABI is golden,
then we would throw the switch to change the default.
Yeah, if we can do that, it would be good. BTW, since this is -current, I'm
not sure we need to be too careful with shared library versions, but it
does raise the question how we have all this co-exist. Will oldabi and
newabi live together in /usr/lib, or should we be considering something
like /usr/lib/aout?
Post by John Baldwin
I'm also not sure if we shouldn't wait to do this until 6.0.
Well, that's the big question, really. If we can do it in the time frame,
why wait, as they say :-). If we can't, then it's a 6.0 thing. 5.0 is a
nice breaking point -- we're ripping up so much anyway, and we'd like to
get the ABI changes for UFS2 in. But if the schedule isn't realistic,
then we can't -- slipping 5.0-release any further isn't something I want
to let happen.

Robert N M Watson FreeBSD Core Team, TrustedBSD Project
***@fledge.watson.org NAI Labs, Safeport Network Services



Eivind Eklund
2002-05-07 16:51:19 UTC
Permalink
Post by Robert Watson
My feeling on a strategy resembles the one phk proposed...
(1) Someone does a first pass. Since Kirk and Poul-Henning are doing some
of this anyway, they'd be good candidates. They do what they need,
but don't get carried away.
(2) They commit this, but it does not become the default. How to handle
#includes is an interesting question, perhaps using some new_foo_t
until the switchover is done. Or foo64_t or the like. Throwing a
date at this is a good idea. phk's dates looked good -- perhaps
mid-June or late June.
(3) We have a window where other developers now take care of the things
they care about in the new ABI. For example, I'd be happy to pick up
the sysvipc work to switch to uid_t and gid_t. Depending on our
definition of "get carried away" above, this might be a good time to
update the time-related types.
(4) The switch is thrown. This is the second date of interest. We don't
want to let this get beyond the end of July, so we can cut DP3 (which
would be inevitable with a new ABI) with the new ABI and get decent
testing.
Somewhere between 0 and 3.5 we should have an

X) Write a list of proposed changes to the syscalls by just going through the
syscalls list and adding proposed changes, based on public discussion.

I think the right thing to do might be to just make a copy of syscalls.master
and let people commit suggested improvements, and then post it to arch for
discussion after a while (e.g, 3 weeks.)

Eivind.

Robert Watson
2002-05-07 17:53:35 UTC
Permalink
Post by Eivind Eklund
Somewhere between 0 and 3.5 we should have an
X) Write a list of proposed changes to the syscalls by just going
through the syscalls list and adding proposed changes, based on public
discussion.
I think the right thing to do might be to just make a copy of
syscalls.master and let people commit suggested improvements, and then
post it to arch for discussion after a while (e.g, 3 weeks.)
Probably close to the right strategy -- unfortunately most of the more
interesting changes are in the supporting structs (struct stat, struct
ipcperm, ...), which does complicate things. Also, many of the changes
have to do with changing a type -- ino_t, or the like, rather than
changing the arguments to the system call. This breaks the ABI, but
maintains source-level compatibility as visible in syscalls.master :-).

Robert N M Watson FreeBSD Core Team, TrustedBSD Project
***@fledge.watson.org NAI Labs, Safeport Network Services



Eivind Eklund
2002-05-08 15:49:48 UTC
Permalink
Post by Robert Watson
Post by Eivind Eklund
Somewhere between 0 and 3.5 we should have an
X) Write a list of proposed changes to the syscalls by just going
through the syscalls list and adding proposed changes, based on public
discussion.
I think the right thing to do might be to just make a copy of
syscalls.master and let people commit suggested improvements, and then
post it to arch for discussion after a while (e.g, 3 weeks.)
Probably close to the right strategy -- unfortunately most of the more
interesting changes are in the supporting structs (struct stat, struct
ipcperm, ...), which does complicate things. Also, many of the changes
have to do with changing a type -- ino_t, or the like, rather than
changing the arguments to the system call. This breaks the ABI, but
maintains source-level compatibility as visible in syscalls.master :-).
I was thinking of adding the suggested improvements as comments between the
lines in the copy of syscalls.master, just using it as a guiding thread to make
sure we examine a number of the relevant issues, and get a structured document
out.

And even the conversions you mention would end up being relevant there, as we
would add comments on which syscalls would need a converting frontend to gain
backwards compatibility.

Eivind.

Matthew Dillon
2002-05-07 19:31:13 UTC
Permalink
:
:> I'm also not sure if we shouldn't wait to do this until 6.0.
:
:Well, that's the big question, really. If we can do it in the time frame,
:why wait, as they say :-). If we can't, then it's a 6.0 thing. 5.0 is a
:nice breaking point -- we're ripping up so much anyway, and we'd like to
:get the ABI changes for UFS2 in. But if the schedule isn't realistic,
:then we can't -- slipping 5.0-release any further isn't something I want
:to let happen.
:
:Robert N M Watson FreeBSD Core Team, TrustedBSD Project

I think the one big advantage with having a new syscall vector is
precisely the fact that the new ABI can be worked on without affecting
the existing system. Internal system support that is still 32 bits
winds up being straight-up in the old syscall vector and extended out
in the new, and internal system support that has been changed to 64
bits winds up being truncated in the old syscall vector and straight
up in the new. Either way we are able to maintain ABI compatibility for
both syscall vectors even if it takes months to convert all the kernel
subsystems to the new sizes.
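
A minimal userland sketch of the straight-through versus truncated/extended behaviour described above. All names here (kern_small_get, kern_wide_get, old_*, new_*) are hypothetical stand-ins, not actual kernel interfaces; the two "subsystems" just return constants so the effect of each front-end is visible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subsystem that is still 32 bits wide inside the kernel. */
int32_t kern_small_get(void) { return 123456; }

/* Hypothetical subsystem already converted to 64 bits inside the kernel. */
int64_t kern_wide_get(void) { return INT64_C(5000000000); }

/* Old syscall vector: userland always sees 32-bit results. */
int32_t old_small(void) { return kern_small_get(); }   /* straight through */

int32_t old_wide(void)
{
    /* Truncated: keep only the low 32 bits, as an old binary expects. */
    return (int32_t)(uint32_t)kern_wide_get();
}

/* New syscall vector: userland always sees 64-bit results. */
int64_t new_small(void) { return (int64_t)kern_small_get(); }  /* extended */
int64_t new_wide(void) { return kern_wide_get(); }             /* straight through */

/* Both vectors must agree whenever the value fits in 32 bits. */
void check_small_value_agrees(void)
{
    assert((int64_t)old_small() == new_small());
}
```

Either vector stays usable whichever width the subsystem behind it currently has, which is the property that lets the conversion proceed one subsystem at a time.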

This seems to imply that the work will not directly interfere with 5.0R,
even if it is not 100% complete by the release. That would make the
work 'a go' in my book.

-Matt
Matthew Dillon
<***@backplane.com>

Poul-Henning Kamp
2002-05-07 15:37:16 UTC
Permalink
Well, seems like some sort of consensus is building.

Now for a bit of radical thinking.

The FreeBSD kernel is already a multi-API kernel, we have freebsd
syscalls, including various old compat stuff, we have NetBSD compat,
we have BSDI compat, Linuxolator, IBCS2, and we will probably have
Solaris compat on the sparc64 platform.

We cannot easily compile for two native APIs if we cannot have two
different sets of 'sys/*' and 'machine/*' files since a lot of the
types we want to change size on are defined in these files.

I would therefore like to propose that we do something like
the following:

Repocopy src/sys/sys/* to src/sys/include
Repocopy src/sys/sys/* to src/sys/abi4
Remove src/sys/sys/*

In src/sys/include we remove everything which deals with
syscall parameters, but retain kernel internal data structures.
Stuff like vnodes, and similar lives here.

In src/sys/abi4 we remove everything that we leave behind
in src/sys/include.

(A similar split should be done for src/sys/$arch/include
into a src/sys/$arch/abi4 directory)

In .c land we repocopy the relevant sys/kern/*.c files
into sys/abi4 and remove everything but the syscall
entry points and add explicit conversions to arguments
as needed.

We now have a clean split between the stuff which defines
what goes on in the kernel and the stuff which defines
the ABI to userland.

Adding a new ABI is now a matter of creating the relevant
directories (src/sys/abi5) and populating them with files which
do it right.

I see many advantages to this way of doing things:

We can remove practically all "#ifdef _KERNEL" from the .h files
we install in /usr/include/sys and we can get a fair bit closer to
whatever the standard-du-jour dictates that we put there.

Conversely we can clean the kernel side of things (src/sys/include)
for things we don't want people to use in the kernel, but which
standards or compatibility demand we put in <sys/blah.h>.

We also get a clear kernel / userland split on C types, we may find
it convenient to operate on a 64 bit foo_t in the kernel but leave
it 32 bit in userland and this lets us trivially do so.
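
A rough sketch of what such a kernel / userland type split could look like, assuming a hypothetical foo_t that is 64 bits inside the kernel and 32 bits in the old ABI. The type and function names (kern_foo_t, abi4_foo_t, abi4_rec_from_kern) are made up for illustration; only the idea of converting at the ABI boundary comes from the proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel-internal view: foo_t is 64 bits wide. */
typedef int64_t kern_foo_t;
struct kern_rec {
    kern_foo_t id;
    kern_foo_t count;
};

/* Hypothetical old-ABI view exported to userland: foo_t stays 32 bits. */
typedef int32_t abi4_foo_t;
struct abi4_rec {
    abi4_foo_t id;
    abi4_foo_t count;
};

/* Returns nonzero when v cannot be represented in the 32-bit ABI type. */
static int out_of_range32(kern_foo_t v)
{
    return v < INT32_MIN || v > INT32_MAX;
}

/*
 * Explicit conversion at the ABI boundary.  Returns 0 on success or
 * EOVERFLOW when a field does not fit, leaving *u untouched in that case.
 */
int abi4_rec_from_kern(const struct kern_rec *k, struct abi4_rec *u)
{
    assert(k != NULL && u != NULL);
    if (out_of_range32(k->id) || out_of_range32(k->count))
        return EOVERFLOW;
    u->id = (abi4_foo_t)k->id;
    u->count = (abi4_foo_t)k->count;
    return 0;
}
```

Reporting EOVERFLOW instead of silently truncating is one possible policy; the thread does not settle on either.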

This should put a good bit of infrastructure in to make current and
future API/ABI implementations simpler and more structured.

I guess a way to sum this up is that it will put all API/ABI's on
equal footing. With this change none of them will be any more
"native" than any other API/ABI.
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

David O'Brien
2002-05-07 17:55:26 UTC
Permalink
Post by Poul-Henning Kamp
I would therefore like to propose that we do something like
Repocopy src/sys/sys/* to src/sys/include
Repocopy src/sys/sys/* to src/sys/abi4
This is actually the FreeBSD 3.0 ABI, not an ABI introduced with 4.0.

John Baldwin
2002-05-07 18:01:10 UTC
Permalink
Post by David O'Brien
Post by Poul-Henning Kamp
I would therefore like to propose that we do something like
Repocopy src/sys/sys/* to src/sys/include
Repocopy src/sys/sys/* to src/sys/abi4
This is actually the FreeBSD 3.0 ABI, not an ABI introduced with 4.0.
I prefer a suggestion I think you had where the ABI isn't actually
linked to freebsd versions, but instead our current ABI is ABI 1 (or 0)
and the new ABI would be ABI 2 (or 1).
--
John Baldwin <***@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

Matthew Dillon
2002-05-07 19:40:43 UTC
Permalink
:On 07-May-2002 David O'Brien wrote:
:> On Tue, May 07, 2002 at 05:37:16PM +0200, Poul-Henning Kamp wrote:
:>> I would therefore like to propose that we do something like
:>> the following:
:>>
:>> Repocopy src/sys/sys/* to src/sys/include
:>> Repocopy src/sys/sys/* to src/sys/abi4
:>
:> This is actually the FreeBSD 3.0 ABI, not an ABI introduced with 4.0.
:
:I prefer a suggestion I think you had where the ABI isn't actually
:linked to freebsd versions, but instead our current ABI is ABI 1 (or 0)
:and the new ABI would be ABI 2 (or 1).
:
:--
:
:John Baldwin <***@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
:"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

Hmm. I think this may be a little too confusing. Remember that we
have serious issues with mainline files in /usr/src/include (aka
/usr/include), not just /usr/src/sys/sys. We also have issues with
/usr/lib, etc. Why not do something like this:

/usr/src/include/

Contains architecture and ABI independent files

/usr/src/include/ABIx/
/usr/src/include/ABIy/
/usr/src/include/ABIz/

Contains ABI specific files. For example, /usr/src/include/ABIx/sys

(mirrored in /usr/include, aka /usr/include/<BLAH>, /usr/include/ABI/<BLAH>)

Then we simply change the default compiler -I path from: /usr/include to
/usr/include:/usr/include/ABIxxx where the ABI may be chosen at
compile time with an option to cc, and the default is set to whatever
the appropriate default is. This makes the 'switch' easy to throw.
The compiler/linker is going to need knowledge of the ABI being used
anyway since it has to set the ELF parameter, we might as well leverage
that for the includes as well.

We will need ABI specific libraries as well. e.g. /usr/lib, /usr/lib/ABIx,
etc...

I really think this is the cleanest, safest way to do it, and it also paves
the way for us to allow natively compiled multi-architectural support.
e.g. consider this:

cc -ABI4 ...
cc -ABI5 ...
cc -ABILinux ...
cc -ABIOpenBSD ...

You see what I am getting at?

-Matt
Matthew Dillon
<***@backplane.com>

David O'Brien
2002-05-07 20:13:14 UTC
Permalink
Post by Matthew Dillon
the way for us to allow natively compiled multi-architectural support.
cc -ABI4 ...
cc -ABI5 ...
cc -ABILinux ...
cc -ABIOpenBSD ...
Honestly, why do we have this need? It seems to fall into the "it would
be nice" category, but would seldom be used.

Matthew Dillon
2002-05-07 21:06:34 UTC
Permalink
:
:On Tue, May 07, 2002 at 12:40:43PM -0700, Matthew Dillon wrote:
:> the way for us to allow natively compiled multi-architectural support.
:> e.g. consider this:
:>
:> cc -ABI4 ...
:> cc -ABI5 ...
:> cc -ABILinux ...
:> cc -ABIOpenBSD ...
:
:Honestly, why do we have this need? It seems to fall into the "it would
:be nice"; but seldomly used.

Well, how do you intend to test the new ABI vector?

-Matt
Matthew Dillon
<***@backplane.com>

David O'Brien
2002-05-07 23:17:30 UTC
Permalink
Post by Matthew Dillon
:> the way for us to allow natively compiled multi-architectural support.
:>
:> cc -ABI4 ...
:> cc -ABI5 ...
:> cc -ABILinux ...
:> cc -ABIOpenBSD ...
:Honestly, why do we have this need? It seems to fall into the "it would
:be nice"; but seldomly used.
Well, how do you intend to test the new ABI vector?
One moves forward and does not look back.

Matthew Dillon
2002-05-07 23:35:08 UTC
Permalink
:
:On Tue, May 07, 2002 at 02:06:34PM -0700, Matthew Dillon wrote:
:>
:> :
:> :On Tue, May 07, 2002 at 12:40:43PM -0700, Matthew Dillon wrote:
:> :> the way for us to allow natively compiled multi-architectural support.
:> :> e.g. consider this:
:> :>
:> :> cc -ABI4 ...
:> :> cc -ABI5 ...
:> :> cc -ABILinux ...
:> :> cc -ABIOpenBSD ...
:> :
:> :Honestly, why do we have this need? It seems to fall into the "it would
:> :be nice"; but seldomly used.
:>
:> Well, how do you intend to test the new ABI vector?
:
:One moves forward and does not look back.

Uh huh. Well, I have some experience with that when I tried changing
one of my test boxes over to a 64 bit time_t. I wound up having to
wipe the entire machine (raw dd from the backup partition). Twice.
To be blunt, making incremental changes and still having a working
system at the end of the day required extremely careful attention
to detail, and I made two mistakes over the period of several
days that I couldn't back out of.

In other words, my considered opinion is that it would actually be
*easier* to do the relatively modest amount of work required to generalize
the ABI linkage in order to save a whole lot more work down the line
when people actually try to test it. I will note that what I am
suggesting is considerably less work than the more radical suggestion
Poul had (note: I have no specific opinion on Poul's radical suggestion
at this time).

-Matt
Matthew Dillon
<***@backplane.com>

Michael C. Wu
2002-05-10 04:57:01 UTC
Permalink
On Tue, May 07, 2002 at 01:13:14PM -0700, David O'Brien scribbled:
| On Tue, May 07, 2002 at 12:40:43PM -0700, Matthew Dillon wrote:
| > the way for us to allow natively compiled multi-architectural support.
| > e.g. consider this:
| > cc -ABI4 ...
| > cc -ABI5 ...
| > cc -ABILinux ...
| > cc -ABIOpenBSD ...
| Honestly, why do we have this need? It seems to fall into the "it would
| be nice"; but seldomly used.

Vendor imported code (e.g. KAME(sigh..), USB, possibly 1394,
cardbus, possibly some userland important tools and libraries)
would have little or no diffs from the original vendor's code..

John Baldwin
2002-05-10 05:13:52 UTC
Permalink
Post by Michael C. Wu
| > the way for us to allow natively compiled multi-architectural support.
| > cc -ABI4 ...
| > cc -ABI5 ...
| > cc -ABILinux ...
| > cc -ABIOpenBSD ...
| Honestly, why do we have this need? It seems to fall into the "it would
| be nice"; but seldomly used.
Vendor imported code (e.g. KAME(sigh..), USB, possibly 1394,
cardbus, possibly some userland important tools and libraries)
would have little or no diffs from the original vendor's code..
Erm, all the changes so far aren't in API's, but rather changing the size
of types like time_t. The same KAME code would compile fine.
--
John Baldwin <***@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

Matthew Dillon
2002-05-10 06:56:35 UTC
Permalink
:>
:> Vendor imported code (e.g. KAME(sigh..), USB, possibly 1394,
:> cardbus, possibly some userland important tools and libraries)
:> would have little or no diffs from the original vendor's code..
:
:Erm, all the changes so far aren't in API's, but rather changing the size
:of types like time_t. The same KAME code would compile fine.
:
:--
:
:John Baldwin <***@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/

Huh? Changing the size of time_t sure sounds like an API change to me!
<GRIN>. Which was my original point... though I was considering those
ports which might not be 64-bit-time_t friendly (probably hundreds).

-Matt
Matthew Dillon
<***@backplane.com>

Marcel Moolenaar
2002-05-10 08:10:04 UTC
Permalink
Post by Matthew Dillon
:> Vendor imported code (e.g. KAME(sigh..), USB, possibly 1394,
:> cardbus, possibly some userland important tools and libraries)
:> would have little or no diffs from the original vendor's code..
:Erm, all the changes so far aren't in API's, but rather changing the size
:of types like time_t. The same KAME code would compile fine.
Huh? Changing the size of time_t sure sounds like an API change to me!
There's no programmer-visible change if you change the size of a type.
Only when you go down to the binary level, you'll notice things have
changed. I think this means that changing the definition of an abstract
type (which time_t is) is not an API change.

Put differently: you don't have to recode, just recompile (provided
you didn't make assumptions about the size of course :-)
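
To illustrate the "assumptions about the size" caveat, here is a small sketch of code that keeps working after a recompile whatever width time_t has. The helper names (format_time, time_to_wire, time_from_wire) are invented for this example, and it assumes time_t is an integer type, as it is on the platforms discussed here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/*
 * Size-agnostic printing: widen to intmax_t rather than assuming
 * time_t is an int or a long.  Returns what snprintf returns.
 */
int format_time(char *buf, size_t len, time_t t)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%jd", (intmax_t)t);
}

/*
 * Size-agnostic storage: use a fixed-width field for anything that
 * goes to disk or over the network, independent of sizeof(time_t).
 */
int64_t time_to_wire(time_t t) { return (int64_t)t; }
time_t time_from_wire(int64_t w) { return (time_t)w; }
```

Code that instead prints a time_t with "%d" or stores it in an int is the kind that needs recoding rather than just recompiling.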

I'm not 100% sure, but it's how I intuitively interpret API...
--
Marcel Moolenaar USPA: A-39004 ***@xcllnt.net

John Baldwin
2002-05-10 12:05:08 UTC
Permalink
Post by Matthew Dillon
:>
:> Vendor imported code (e.g. KAME(sigh..), USB, possibly 1394,
:> cardbus, possibly some userland important tools and libraries)
:> would have little or no diffs from the original vendor's code..
:Erm, all the changes so far aren't in API's, but rather changing the size
:of types like time_t. The same KAME code would compile fine.
:--
Huh? Changing the size of time_t sure sounds like an API change to me!
<GRIN>. Which was my original point... though I was considering those
ports which might not be 64-bit-time_t friendly (probably hundreds).
It's not one that other source needs to be changed for though. It's not like
you've changed the prototypes of functions. Instead, you've changed the binary
representation of certain types. We don't need to use the old ABI's for vendor
code like Michael suggested because vendor code will compile just fine with the
newer one.
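
A small sketch of the distinction, using two made-up struct layouts (ts_abi1 and ts_abi2 are hypothetical, not the real struct timespec). The same source text compiles against both, but field offsets and total size differ, which is what breaks already-compiled binaries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old ABI" layout: 32-bit seconds. */
struct ts_abi1 {
    int32_t tv_sec;
    int32_t tv_nsec;
};

/* Hypothetical "new ABI" layout: 64-bit seconds. */
struct ts_abi2 {
    int64_t tv_sec;
    int32_t tv_nsec;
};

/* One piece of source text, instantiated once per layout. */
#define DEFINE_ELAPSED(name, type)                              \
    int64_t name(const struct type *a, const struct type *b)   \
    {                                                           \
        assert(a != NULL && b != NULL);                         \
        return (int64_t)b->tv_sec - (int64_t)a->tv_sec;         \
    }

DEFINE_ELAPSED(elapsed_abi1, ts_abi1)
DEFINE_ELAPSED(elapsed_abi2, ts_abi2)
```

The function bodies are identical at the source level; only offsetof(tv_nsec) and sizeof change, so a recompile is enough while an old binary would read the wrong bytes.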
Post by Matthew Dillon
-Matt
Matthew Dillon
--
John Baldwin <***@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

Peter Wemm
2002-05-08 06:22:41 UTC
Permalink
Post by David O'Brien
Post by Poul-Henning Kamp
I would therefore like to propose that we do something like
Repocopy src/sys/sys/* to src/sys/include
Repocopy src/sys/sys/* to src/sys/abi4
This is actually the FreeBSD 3.0 ABI, not an ABI introduced with 4.0.
Also note that we already have a 32-bit freebsd abi emulation in
/sys/ia64/ia32.. The x86 syscalls are wrapperized and converted
to 64 bit.

Cheers,
-Peter
--
Peter Wemm - ***@wemm.org; ***@FreeBSD.org; ***@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5


Doug Rabson
2002-05-08 09:09:17 UTC
Permalink
Post by Peter Wemm
Post by David O'Brien
Post by Poul-Henning Kamp
I would therefore like to propose that we do something like
Repocopy src/sys/sys/* to src/sys/include
Repocopy src/sys/sys/* to src/sys/abi4
This is actually the FreeBSD 3.0 ABI, not an ABI introduced with 4.0.
Also note that we already have a 32-bit freebsd abi emulation in
/sys/ia64/ia32.. The x86 syscalls are wrapperized and converted
to 64 bit.
That place is not the final home for the thing either. I would like to be able
to use the same code for all ilp32 process on lp64 kernel situations (e.g.
sparc32 on sparc64, i386 on x86-64 etc.)
--
Doug Rabson Mail: ***@nlsystems.com
Phone: +44 20 8348 6160


Boris Popov
2002-05-08 02:53:28 UTC
Permalink
Post by Poul-Henning Kamp
We cannot easily compile for two native APIs if we cannot have two
different set of 'sys/*' and 'machine/*' files since a lot of the
types we want to change size on are defined in these files.
Not exactly, because #ifdefs can handle this perfectly. Performing
diffs on different files is a simpler task though.
Post by Poul-Henning Kamp
I would therefore like to propose that we do something like
Repocopy src/sys/sys/* to src/sys/include
Repocopy src/sys/sys/* to src/sys/abi4
Remove src/sys/sys/*
Sounds good, except #includes in various headers may look fuzzy
after that, but this is acceptable. However, this scheme doesn't account
for the data parameters passed to the VFS_MOUNT() call. The definitions of
these structures live in fs/*fs.h files. How do you plan to deal with it?
Post by Poul-Henning Kamp
This should put a good bit of infrastructure in to make current and
future API/ABI implementations simpler and more structured.
I guess a way to sum this up is that it will put all API/ABI's on
equal footing. With this change none of them will be any more
"native" than any other API/ABI.
Agreed.
--
Boris Popov
http://rbp.euro.ru


Matthew Dillon
2002-05-08 03:20:51 UTC
Permalink
:
:On Tue, 7 May 2002, Poul-Henning Kamp wrote:
:
:> We cannot easily compile for two native APIs if we cannot have two
:> different set of 'sys/*' and 'machine/*' files since a lot of the
:> types we want to change size on are defined in these files.
:
: Not exactly, because #ifdefs can handle this perfectly. Performing
:diffs on different files is a more simple task though.
:...
:Boris Popov
:http://rbp.euro.ru

#ifdef's are a bad idea for this case IMHO, at least in regards to
being able to develop the new ABI without interfering with the
release schedule. I think it is far less dangerous and far more
advantageous to simply give each ABI its own secondary include
(-I) path (not to mention making the include files far more readable
post-ABI-changes). There is absolutely no need to pollute the include
files with #ifdefs.

Also we should consider the fact that it may take considerably longer
for many ports to become 64-bit time_t safe (not to mention uids, gids,
and so forth). Doing the ABI properly with a compiler option and default
setting would allow unsafe ports to be compiled to the old ABI on
new systems. The power of this capability should not be underestimated.

-Matt
Matthew Dillon
<***@backplane.com>


Boris Popov
2002-05-08 03:28:08 UTC
Permalink
Post by Matthew Dillon
#ifdef's are a bad idea for this case IMHO, at least in regards to
being able to develop the new ABI without interfering with the
release schedule. I think it is far less dangerous and far more
advantageous to simply give each ABI it's own secondary include
(-I) path (not to mention making the include files far more readable
post-ABI-changes). There is absolutely no need to pollute the include
files with #ifdefs.
Yes, this is what I meant by the "performing diffs" phrase
:)
Post by Matthew Dillon
Also we should consider the fact that it may take considerably longer
for many ports to become 64-bit time_t safe (not to mention uids, gids,
and so forth). Doing the ABI properly with a compiler option and default
setting would allow unsafe ports to be compiled to the old ABI on
new systems. The power of this capability should not be underestimated.
Heh, different ABI in the different ports is the real fun of it.
--
Boris Popov
http://rbp.euro.ru


Peter Wemm
2002-05-08 07:00:01 UTC
Permalink
Post by Poul-Henning Kamp
Well, seems like some sort of consensus is building.
Now for a bit of radical thinking.
The FreeBSD kernel is already a multi-API kernel, we have freebsd
syscalls, including various old compat stuff, we have NetBSD compat,
we have BSDI compat, Linuxolator, IBCS2, and we will probably have
Solaris compat on the sparc64 platform.
Also, don't forget x86-64 and ia64 which will now have 3, 4 or 5 syscall
vectors to deal with.

1) 32 bit i386 a.out <= 4.x (yes, a.out is still supported)
2) 32 bit i386 ELF <= 4.x
3) 32 bit i386 ELF >= 5.x
4) 64 bit x86-64 ELF / 64 bit ia64 ELF
5) 32 bit IA64 ELF (you can compile in ILP32 mode)

Personally, that's several too many already. :-( (and yes, I know that
a.out and elf currently share the syscall vector part.. but not the
executable loader.. they have different startup mechanisms and different
kernel entry trap methods.)

We're already having enough trouble copying around the MPSAFE tag in all
those damn syscalls.master files. We don't need more.
Post by Poul-Henning Kamp
We cannot easily compile for two native APIs if we cannot have two
different set of 'sys/*' and 'machine/*' files since a lot of the
types we want to change size on are defined in these files.
I would therefore like to propose that we do something like
Repocopy src/sys/sys/* to src/sys/include
Repocopy src/sys/sys/* to src/sys/abi4
Remove src/sys/sys/*
In src/sys/include we remove everything which deals with
syscall parameters, but retain kernel internal data structures.
Stuff like vnodes, and similar lives here.
In src/sys/abi4 we remove everything that we leave behind
in src/sys/include.
(A similar split should be done for src/sys/$arch/include
into a src/sys/$arch/abi4 directory)
In .c land we repocopy the relevant sys/kern/*.c files
into sys/abi4 and remove everything but the syscall
entry points and add explicit conversions to arguments
as needed.
We now have a clean split between the stuff which defines
what goes on in the kernel and the stuff which defines
the ABI to userland.
Adding a new ABI is now a matter of creating the relevant
directories (src/sys/abi5) and populating with files which
does it right.
We can remove practically all "#ifdef _KERNEL" from the .h files
we install in /usr/include/sys and we can get a fair bit closer to
whatever the standard-du-jour dictates that we put there.
Conversely we can clean the kernel side of things (src/sys/include)
for things we don't want people to use in the kernel, but which
standards or compatibility demand we put in <sys/blah.h>.
We also get a clear kernel / userland split on C types, we may find
it convenient to operate on a 64 bit foo_t in the kernel but leave
it 32 bit in userland and this lets us trivially do so.
This should put a good bit of infrastructure in to make current and
future API/ABI implementations simpler and more structured.
I guess a way to sum this up is that it will put all API/ABI's on
equal footing. With this change none of them will be any more
"native" than any other API/ABI.
Personally, I think this is *way* overkill.

I think there is far more value to be had by divorcing the syscall
interfaces from the code that implements them so we can do away with the
damn stackgap stuff.

eg: instead of open() doing the copyin *and* the body of the work,
we should have sys_open (or abi4_open, linux_open, etc) which do the pathname
copyin, any args massaging etc, and then call open() with the cleaned up
arguments. open() shouldn't have to do copyin etc.

The ia64 32 bit emulator already does the 32 bit time_t <-> 64 bit time_t
conversions. There are quite a few that need translation, not the least of
which are: struct rusage (all the wait* syscalls), struct statfs,
things like setitimer etc which take timevals, select (timeval),
gettimeofday(timeval), struct stat, utimes(time_t), adjtime(timeval),
readv/writev/pread/pwrite etc and all the syscalls that use iovec's
(there is a 'long' in there if you're thinking about making it all
explicit sized as well).
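
As an illustration of the iovec case, a sketch of widening a 32-bit userland iovec into the native one. The struct user_iovec32 layout and both function names are assumptions made for this example; only struct iovec from <sys/uio.h> is a real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical 32-bit userland layout of struct iovec: both the
 * pointer and the length occupy 4 bytes.
 */
struct user_iovec32 {
    uint32_t iov_base;
    uint32_t iov_len;
};

/* Widen one 32-bit iovec into the native form. */
void iovec32_to_native(const struct user_iovec32 *in, struct iovec *out)
{
    assert(in != NULL && out != NULL);
    out->iov_base = (void *)(uintptr_t)in->iov_base;
    out->iov_len = (size_t)in->iov_len;
}

/* Convert a whole vector; returns the total byte count it describes. */
size_t iovec32_array_to_native(const struct user_iovec32 *in,
    struct iovec *out, int cnt)
{
    size_t total = 0;

    for (int i = 0; i < cnt; i++) {
        iovec32_to_native(&in[i], &out[i]);
        total += out[i].iov_len;
    }
    return total;
}
```

Every syscall taking an iovec array would need a front-end doing something of this shape before calling the common code.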

In fact, gettimeofday is a classic example, the following taken from
the sys/ia64/ia32/ia32_misc.c code that we're using now:

int
ia32_gettimeofday(struct thread *td, struct ia32_gettimeofday_args *uap)
{
	int error;
	caddr_t sg;
	struct timeval32 *p32, s32;
	struct timeval *p = NULL, s;

	p32 = SCARG(uap, tp);
	if (p32) {
		sg = stackgap_init();
		p = stackgap_alloc(&sg, sizeof(struct timeval));
		SCARG(uap, tp) = (struct timeval32 *)p;
	}
	error = gettimeofday(td, (struct gettimeofday_args *) uap);
	if (error)
		return (error);
	if (p32) {
		error = copyin(p, &s, sizeof(s));
		if (error)
			return (error);
		CP(s, s32, tv_sec);
		CP(s, s32, tv_usec);
		error = copyout(&s32, p32, sizeof(s32));
		if (error)
			return (error);
	}
	return (error);
}

Do you really want to impose all those copyin/outs etc on common paths for
4.x binaries? We need to spend more effort on things like having a separate
sys_gettimeofday(td, struct gettimeofday_args *uap) vs
gettimeofday(td, struct timeval *tv);

You then have:
int gettimeofday(td, struct timeval *tv)
{
	.. normal code but no copyout ...
}
int sys_gettimeofday(td, uap) /* native 5.x syscall */
{
	int error;
	struct timeval tv; /* native kernel timeval */

	error = gettimeofday(td, &tv);
	if (error == 0 && uap->tp)
		error = copyout(&tv, uap->tp, sizeof(tv));
	return error;
}
int sys_gettimeofday32(td, uap) /* 32 bit syscall interface */
{
	int error;
	struct timeval tv;
	struct timeval32 tv32; /* userland 32 bit timeval */

	error = gettimeofday(td, &tv);
	convert_tv_to_tv32(&tv, &tv32);
	if (error == 0 && uap->tp)
		error = copyout(&tv32, uap->tp, sizeof(tv32));
	return error;
}
and so on. Lots less bogus copyin/outs through the stackgap. You can use
your local stack for temporary conversions, or even malloc etc. But
trying to do it in userland because we don't cleanly divorce the syscall
ABI implementation from the functionality just sucks.
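
The convert_tv_to_tv32() helper used in the outline above is not spelled out in the message. One possible shape, purely as a sketch, is shown here. The struct names ktimeval and timeval32 are hypothetical stand-ins with explicitly sized members, and unlike the outline this version returns an error code so the caller can notice a seconds value that no longer fits in 32 bits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical native timeval with a 64-bit seconds field. */
struct ktimeval {
	int64_t	tv_sec;
	long	tv_usec;
};

/* Hypothetical 32-bit userland timeval. */
struct timeval32 {
	int32_t	tv_sec;
	int32_t	tv_usec;
};

/*
 * Narrow a native timeval into the 32-bit layout.
 * Returns 0 on success, or EOVERFLOW if tv_sec does not fit in 32 bits.
 */
int
convert_tv_to_tv32(const struct ktimeval *tv, struct timeval32 *tv32)
{
	assert(tv != NULL && tv32 != NULL);
	if (tv->tv_sec > INT32_MAX || tv->tv_sec < INT32_MIN)
		return (EOVERFLOW);
	tv32->tv_sec = (int32_t)tv->tv_sec;
	tv32->tv_usec = (int32_t)tv->tv_usec;	/* microseconds fit easily */
	return (0);
}
```

The conversion works entirely on the caller's stack, which is the point of the proposal: no stackgap and no extra copyin/copyout round trip.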

Finally, I really think the entire-new-syscall vector idea is sheer
wasteful overkill. I'd much rather we had COMPAT_FREEBSD4 kernel compile
options using the existing vector with new syscalls added in that we need
to translate. What I saw on SVR4 was much cleaner. They dealt with
different "struct stat"'s with no trouble at all. You could even compile
to the older interfaces.

Cheers,
-Peter
--
Peter Wemm - ***@wemm.org; ***@FreeBSD.org; ***@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5


Matthew Dillon
2002-05-08 07:58:19 UTC
:Personally, I think this is *way* overkill.
:
:I think there is far more value to be had by divorcing the syscall
:interfaces from the code that implements them so we can do away with the
:damn stackgap stuff.
:
:eg: instead of open() doing the copyin *and* the body of the work,
:we should have sys_open (or abi4_open, linux_open, etc) which do the pathname
:copyin, any args massaging etc, and then call open() with the cleaned up
:arguments. open() shouldn't have to do copyin etc.

I would love to see this too. Just looking at the hell the linux syscall
emulation code goes through is enough to convince me.

(I don't agree with your argument against adding another syscall vector,
however, because I see no other solution that is clean enough to be a
reasonable replacement).

:Finally, I really think the entire-new-syscall vector idea is sheer
:wasteful overkill. I'd much rather we had COMPAT_FREEBSD4 kernel compile
:options using the existing vector with new syscalls added in that we need
:to translate. What I saw on SVR4 was much cleaner. They dealt with
:different "struct stat"'s with no trouble at all. You could even compile
:to the older interfaces.
:
:Cheers,
:-Peter
:--
:Peter Wemm - ***@wemm.org; ***@FreeBSD.org; ***@yahoo-inc.com
:"All of this is for nothing if we don't go to the stars" - JMS/B5

This is what I tried to do when I was messing around with time_t and
I came to regret it. There are just too many system calls that need
to be changed to simply be able to add them to the existing vector.
Take the 'stat' mess and multiply by about 20 and the result is the
mess you would get if you tried to integrate the 64 bit calls into the
existing vector.

-Matt
Matthew Dillon
<***@backplane.com>

Peter Wemm
2002-05-08 08:38:06 UTC
Post by Matthew Dillon
:Finally, I really think the entire-new-syscall vector idea is sheer
:wasteful overkill. I'd much rather we had COMPAT_FREEBSD4 kernel compile
:options using the existing vector with new syscalls added in that we need
:to translate. What I saw on SVR4 was much cleaner. They dealt with
:different "struct stat"'s with no trouble at all. You could even compile
:to the older interfaces.
This is what I tried to do when I was messing around with time_t and
I came to regret it. There are just too many system calls that need
to be changed to simply be able to add them to the existing vector.
Take the 'stat' mess and multiply by about 20 and the result is the
mess you would get if you tried to integrate the 64 bit calls into the
existing vector.
struct stat is pretty easy. Have a look at what NetBSD did. It is quite
simple.

http://cvsweb.netbsd.org/cgi-bin/cvsweb.cgi/syssrc/sys/sys/stat.h?rev=1.41&content-type=text/x-cvsweb-markup

They define the different struct stat's and use a __RENAME() macro from
cdefs.h.
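
The renaming trick can be sketched with the GCC/Clang asm-label extension, which lets a header bind a declaration to a different linker symbol. Everything named demo_* or DEMO_* here is hypothetical, the real NetBSD macro and symbol names may differ, and the sketch assumes an ELF target where C symbols carry no leading underscore.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for a __RENAME()-style macro: bind a name to a symbol. */
#define DEMO_RENAME(sym)	__asm__(#sym)

/* Old and new structure layouts (hypothetical). */
struct demo_stat_v12 {
	uint32_t st_ino;
};
struct demo_stat_v13 {
	uint64_t st_ino;
};

/* The old entry point stays in the library for already-linked binaries. */
int
demo_stat12(int fd, struct demo_stat_v12 *sb)
{
	assert(sb != NULL);
	sb->st_ino = (uint32_t)fd;
	return (12);
}

/* The new entry point lives under a versioned symbol name. */
int
demo_stat13(int fd, struct demo_stat_v13 *sb)
{
	assert(sb != NULL);
	sb->st_ino = (uint64_t)fd;
	return (13);
}

/*
 * What the header would expose: source code keeps calling demo_stat(),
 * but the compiler emits a reference to the symbol demo_stat13.
 */
int demo_stat(int fd, struct demo_stat_v13 *sb) DEMO_RENAME(demo_stat13);
```

Old binaries keep resolving the old symbol, newly compiled code silently picks up the new one, and no second syscall vector is involved.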

They can even trivially compile a binary that uses their 1.2 binary interface
vs their current 1.3 interface.

IMHO, 'struct stat' is hardly a reason for a new syscall vector when it can
be solved other ways so easily. I would not be surprised if all the others
could be done pretty easily too. I'm not sure that a new syscall vector
will gain us much at the end of the day other than an even bigger kernel
with more tables to get out of sync...

This is right up there with "Hey! Let's reorder ASCII so that it is more
sensible!". At the end of the day we now have *two* character sets and now
have the difficulty of keeping track of which is which. And the end user sees
nothing new, he's still got all the same keys to type with... He's not
going to care all that much if it is easier to memorize the 'new-ascii'
code table. At face value it looks like a good idea, but I don't see any real
benefit, just more work that people have to do.

[Note, I am *not* suggesting that we keep time_t and other types small, I
am just questioning the elaborate path to get there. A cynic might wonder
if somebody was getting paid by the hour for this... ]

Cheers,
-Peter
--
Peter Wemm - ***@wemm.org; ***@FreeBSD.org; ***@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5


Poul-Henning Kamp
2002-05-08 10:30:28 UTC
Post by Peter Wemm
I think there is far more value to be had by divorcing the syscall
interfaces from the code that implements them so we can do away with the
damn stackgap stuff.
Uhm, that was what I tried to express with my proposal.
Post by Peter Wemm
eg: instead of open() doing the copyin *and* the body of the work,
we should have sys_open (or abi4_open, linux_open, etc) which do the pathname
copyin, any args massaging etc, and then call open() with the cleaned up
arguments. open() shouldn't have to do copyin etc.
Exactly.

And that means that we can ditch the MPSAFE thing in the syscall.master
file, since the *_open() function will be MPSAFE no matter what and if
open() isn't MPSAFE, then open() will grab and release Giant.
Post by Peter Wemm
Finally, I really think the entire-new-syscall vector idea is sheer
wasteful overkill.
Having looked at the number of syscalls we have to deal with I think
it is the only practically passable route.

I haven't heard any comments on the splitting of the #include files?
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

John Baldwin
2002-05-08 12:10:41 UTC
Post by Poul-Henning Kamp
And that means that we can ditch the MPSAFE thing in the syscall.master
file, since the *_open() function will be MPSAFE no matter what and if
open() isn't MPSAFE, then open() will grab and release GIANT.
Maxime Henrion is already working on this anyways.
--
John Baldwin <***@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

Nathan Hawkins
2002-05-07 16:18:31 UTC
Post by John Baldwin
Right. I would like to use the ELF_ABI_OSVERSION thingie, but according to
O'Brien it violates the ELF spec to use any value other than 0. But then
why does it exist if we can't use it? :( One idea being thrown about is
that for the new ABI (It would be FreeBSD ABI 2) there would be a syscall
at syscall 0 that would never change that could be used to select what "minor"
version of the ABI you wanted to use, so if we wanted to add a new ABI at
some point in the future we could stick that syscall in crt.c (or whatever
the proper startup file is) to change to ABI 2.1 or 2.2 or something. However,
for the current change, I think bumping the ELF OS version and adding a new
ABI front-end for ABI 2 is the way to go.
I'd like to see FreeBSD start using ELF .note.ABI-tag sections to handle
the binary OS type/versioning. I know that at least NetBSD, Linux and
Hurd do it this way, and I think most others do too. It looks like gcc
is already inserting them, (run readelf -S, and you should see it) so
it'd be a matter of adding to the binary emulation selector to check
.note.ABI-tag. For your new ABI, set a version number in the tag. I
believe the tag gcc is putting in already has a field for that.

This also has the benefit of making emulations work more transparently,
as it should obsolete brandelf.
Post by John Baldwin
Another advantage of this is that if our tools handle the OS version bit
properly, we can actually not have to worry about much of a flag day. Instead
we can gradually add the new ABI and support for it and then just quietly
throw the switch to change the default ABI in current one day. Old binaries
would still run fine. The only tricky part is for libc.so since you would
really need two versions of it (unless we also bumped to libc.so.6 for this..)
and we would need to make sure we linked to the right OSVERSION of the lib when
doing runtime linking, which could get ugly.
This would be about the same for the above suggestion.

I think bumping libc version would be good. It solves things neatly.

---Nathan


David O'Brien
2002-05-07 18:09:01 UTC
Post by Nathan Hawkins
I'd like to see FreeBSD start using ELF .note.ABI-tag sections to handle
the binary OS type/versioning. I know that at least NetBSD, Linux and
Hurd do it this way, and I think most others do too.
Why? Just to follow the NIH herd? The EI_OSABI and EI_ABIVERSION fields
were in the gABI spec before anyone started using .note sections for
this.

If .note.ABI-tag is the end all and be all, then why did the ELF spec
authors even create EI_OSABI and EI_ABIVERSION?? And why did they put
things in the ELF header such that things are very easy to parse just by
reading a small amount of a binary? A LOT of the things in an ELF header
could have been put in .note sections. But they weren't because it is so
much easier to read fixed structures.

That said, run readelf on any FreeBSD binary, you will find a
.note.ABI-tag section. I just never got around to adding support for it
to imgact_elf.
Post by Nathan Hawkins
It looks like gcc is already inserting them, (run readelf -S, and you
No, the section comes from crt1.o actually.
Post by Nathan Hawkins
should see it) so it'd be a matter of adding to the binary emulation
selector to check .note.ABI-tag.
For your new ABI, set a version number in the tag. I
believe the tag gcc is putting in already has a field for that.
We already put the value of __FreeBSD_version into .note.ABI-tag.
(see src/lib/csu/common/crtbrand.c)
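
For readers who have not looked at the note format: an ELF note is three 32-bit words (name size, descriptor size, type) followed by the padded name and the padded descriptor. A minimal sketch of pulling a 4-byte version out of such a note follows. The function name, the DEMO_ABI_NOTETYPE value of 1 and the assumption that the buffer is already in host byte order are all assumptions made for this illustration, not a description of imgact_elf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic ELF note header: three 32-bit words, then name, then descriptor. */
struct note_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/* Name and descriptor are each padded to a 4-byte boundary. */
#define NOTE_ALIGN(x)		(((size_t)(x) + 3) & ~(size_t)3)
#define DEMO_ABI_NOTETYPE	1u	/* assumed ABI-tag note type */

/*
 * Look for a 4-byte ABI-tag descriptor from the given vendor in a note
 * buffer that is already in host byte order.  Returns 0 and stores the
 * version on success, -1 if no matching, well-formed note is found.
 */
int
note_abi_version(const unsigned char *buf, size_t len, const char *vendor,
    int32_t *version)
{
	size_t off = 0;

	assert(buf != NULL && vendor != NULL && version != NULL);
	while (len - off >= sizeof(struct note_hdr)) {
		struct note_hdr nh;
		size_t namelen, desclen;

		memcpy(&nh, buf + off, sizeof(nh));
		off += sizeof(nh);
		namelen = NOTE_ALIGN(nh.n_namesz);
		desclen = NOTE_ALIGN(nh.n_descsz);
		if (namelen > len - off || desclen > len - off - namelen)
			return (-1);		/* truncated note */
		if (nh.n_type == DEMO_ABI_NOTETYPE &&
		    nh.n_namesz == strlen(vendor) + 1 &&
		    memcmp(buf + off, vendor, nh.n_namesz) == 0 &&
		    nh.n_descsz == sizeof(*version)) {
			memcpy(version, buf + off + namelen, sizeof(*version));
			return (0);
		}
		off += namelen + desclen;	/* skip to the next note */
	}
	return (-1);
}
```

An image activator would run something like this over the contents of the note section and hand the resulting version to whatever selects the syscall vector.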
--
-- David (***@FreeBSD.org)

Nathan Hawkins
2002-05-07 18:58:48 UTC
Post by David O'Brien
Post by Nathan Hawkins
I'd like to see FreeBSD start using ELF .note.ABI-tag sections to handle
the binary OS type/versioning. I know that at least NetBSD, Linux and
Hurd do it this way, and I think most others do too.
Why? Just to follow the NIH herd? The EI_OSABI and EI_ABIVERSION fields
were in the gABI spec before anyone started using .note sections for
this.
Because AFAICS, it's a de facto, unwritten standard. Even if it violates
spec.

NIH is a matter of perspective. FreeBSD could be considered to be in NIH
mode, because the other ELF based systems use a different method.
Post by David O'Brien
If .note.ABI-tag is the end all and be all, then why did the ELF spec
authors even create EI_OSABI and EI_ABIVERSION?? And why did they put
things in the ELF header such that things are very easy to parse just by
reading a small amount of a binary? A LOT of the things in an ELF header
could have been put in .note sections. But they weren't because it is so
much easier to read fixed structures.
I have no axe to grind here. I don't consider the .note.ABI-tag to be a
beautiful way to do things. You're right, it should be faster to get
this from a field in the ELF header. But you also use the .interp
section in the emulation selector code. I think that the .note.ABI-tag
would be a better choice.
Post by David O'Brien
That said, run readelf on any FreeBSD binary, you will find a
.note.ABI-tag section. I just never got around to adding support for it
to imgact_elf.
Yes, I can see that. I've looked at adding support to imgact_elf. I
haven't had time to try yet.

---Nathan


David O'Brien
2002-05-07 20:59:57 UTC
Post by Nathan Hawkins
Post by David O'Brien
Post by Nathan Hawkins
I'd like to see FreeBSD start using ELF .note.ABI-tag sections to handle
the binary OS type/versioning. I know that at least NetBSD, Linux and
Hurd do it this way, and I think most others do too.
Why? Just to follow the NIH herd? The EI_OSABI and EI_ABIVERSION fields
were in the gABI spec before anyone started using .note sections for
this.
Because AFAICS, it's a defacto, unwritten standard. Even if it violates
spec.
Your response is mostly content free. But I am honestly interested in your
response. Could you reread what I said and address it?

What is the de facto, unwritten standard? I assume you mean .note
sections.
Post by Nathan Hawkins
Even if it violates spec.
What is "it" and how does it violate the gABI spec?
Post by Nathan Hawkins
NIH is a matter of perspective. FreeBSD could be considered to be in NIH
mode, because the other ELF based systems use a different method.
FreeBSD was the first to have a need to "brand" binaries. So there was
nothing to follow, so no NIH.

[The need was to be able to run static Linux binaries. Note that if
Linux was strictly gABI compliant there would (1) be no static
binaries; and (2) we could not use the dynamic linker's name as a key
to know the binary is a Linux one.]
Post by Nathan Hawkins
this from a field in the ELF header. But you also use the .interp
section in the emulation selector code.
Not really. We do, but that is because few are strictly compliant with
the psABI's. For i386 FreeBSD and Linux should be using
"/usr/lib/libc.so.1". For Alpha "/usr/lib/ld.so", and for Sparc64
"/usr/libexec/ld-elf.so.1".

Nathan Hawkins
2002-05-08 03:00:42 UTC
Post by David O'Brien
Post by Nathan Hawkins
Post by David O'Brien
Post by Nathan Hawkins
I'd like to see FreeBSD start using ELF .note.ABI-tag sections to handle
the binary OS type/versioning. I know that at least NetBSD, Linux and
Hurd do it this way, and I think most others do too.
Why? Just to follow the NIH herd? The EI_OSABI and EI_ABIVERSION fields
were in the gABI spec before anyone started using .note sections for
this.
Because AFAICS, it's a defacto, unwritten standard. Even if it violates
spec.
Your response is mostly content free. But I am honestly interested in your
response. Could you reread what I said and address it?
All right:
1. Why?
How about being able to use stock binutils?

The original subject was how to handle some proposed ABI changes. The
.note.ABI-tag section has a version field, already in the binaries
(since 4.1, I believe.) It looks pretty simple to add a field to
Elf_Brandinfo that has the ABI version. exec_elf_imgact could then be
altered to select the right sysvec based on OS and ABI version from the
note.
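
A sketch of what that selection could look like, assuming each brand entry carries an inclusive range of ABI versions it serves. The structures and the function below are hypothetical illustrations, not the real Elf_Brandinfo or sysentvec definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a syscall vector. */
struct demo_sysvec {
	const char *sv_name;
};

/* Hypothetical brand entry extended with an ABI version range. */
struct demo_brandinfo {
	uint8_t	 osabi;		/* OS identifier */
	int32_t	 min_version;	/* lowest ABI version served (inclusive) */
	int32_t	 max_version;	/* highest ABI version served (inclusive) */
	const struct demo_sysvec *sysvec;
};

/*
 * Pick the syscall vector for a binary from its OS identifier and the
 * version found in its ABI note.  Returns NULL if no brand matches.
 */
const struct demo_sysvec *
demo_select_sysvec(const struct demo_brandinfo *brands, size_t n,
    uint8_t osabi, int32_t version)
{
	size_t i;

	assert(brands != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (brands[i].osabi == osabi &&
		    version >= brands[i].min_version &&
		    version <= brands[i].max_version)
			return (brands[i].sysvec);
	}
	return (NULL);	/* no brand matched: caller falls back or fails */
}
```

Old and new binaries would then differ only in the version recorded by the startup object they were linked with, and the table decides which vector each one gets.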

2. The EI_OSABI and EI_ABIVERSION fields were in the gABI spec before
anyone started using .note sections for this.

But there has since been agreement to use the .note sections. One good
reason is that it keeps OS specifics out of binutils, and makes it
possible to cross compile for another OS on the same architecture.

You'd have to alter binutils to change the default EI_ABIVERSION or
EI_OSABI field. If you do, how do you compile for the old ABI? The
ABI-tags method would allow binutils to support both ABI's, by linking
with a different copy of /usr/lib/crt1.o.

---Nathan



Terry Lambert
2002-05-08 06:06:06 UTC
Post by David O'Brien
Post by Nathan Hawkins
Even if it violates spec.
What is "it" and how does it violate the gABI spec?
I think he means a non-zero EI_OSABI and/or EI_ABIVERSION.
Basically, you are supposed to go through the standards process to
define values other than 0. It's like defining a new "MIME type".
Post by David O'Brien
Post by Nathan Hawkins
NIH is a matter of perspective. FreeBSD could be considered to be in NIH
mode, because the other ELF based systems use a different method.
FreeBSD was the first to have a need to "brand" binaries. So there was
nothing to follow, so no NIH.
[The need was to be able to run static Linux binaries. Note that if
Linux was strictly gABI compliant there would (1) be no static
binaries; and (2) we could not use the dynamic linker's name as a key
to know the binary is a Linux one.]
FreeBSD has static binaries, too. I almost suggested "branding"
the images, but that's really ridiculous. The closest thing I
have seen to a real answer is the ABI command line argument to
the compiler, and I think that's a really bad idea, too, though
you have to wonder how a commercial vendor will get the old ABI.

If I were a commercial vendor offering a binary, I would target
the binary to the lowest common denominator, which would mean I
would target the oldest ABI that could support the necessary
functions to make the program work.

The NetWare ABI has this same problem, with people targeting
older ABIs -- mostly because they can, since NetWare maintains
interfaces back to the 2.0a version (early 1980's), and ships
them with the libraries in the SDK. If you want to have the
largest possible audience/customer base, then you target the
least common denominator. Actually... I'm not really aware of
any installations which don't have the "bindery emulation" NLM
loaded so that they can run older software on their Novell
network.
Post by David O'Brien
Post by Nathan Hawkins
this from a field in the ELF header. But you also use the .interp
section in the emulation selector code.
Not really. We do, but that is because few are strictly compliant with
the psABI's. For i386 FreeBSD and Linux should be using
"/usr/lib/libc.so.1". For Alpha "/usr/lib/ld.so", and for Sparc64
"/usr/libexec/ld-elf.so.1".
Yep. The problem is that some of the system calls have been
co-opted, so binary compatibility with the EABI spec and
some commercial OS's isn't possible. Ideally, when Linux went
and exceeded the scope of the Solaris ABI, and added binary
incompatibilities (e.g. differences in manifest constant values,
elimination of character devices, etc.), they should have used a
different kernel entry vector. Even more ideally, they would
have just conformed to the Solaris EABI, and all the code would
"just run".

-- Terry

Peter Wemm
2002-05-08 06:28:37 UTC
Post by Nathan Hawkins
Post by David O'Brien
Post by Nathan Hawkins
I'd like to see FreeBSD start using ELF .note.ABI-tag sections to handle
the binary OS type/versioning. I know that at least NetBSD, Linux and
Hurd do it this way, and I think most others do too.
Why? Just to follow the NIH herd? The EI_OSABI and EI_ABIVERSION fields
were in the gABI spec before anyone started using .note sections for
this.
Because AFAICS, it's a defacto, unwritten standard. Even if it violates
spec.
NIH is a matter of perspective. FreeBSD could be considered to be in NIH
mode, because the other ELF based systems use a different method.
I seem to recall that NetBSD invented .note.ABI-tag and pushed it back
to the binutils folks. In a later revision of ELF, the EI_OSABI stuff
was added. .note.ABI-tag predated those additions. When FreeBSD first
started dabbling with ELF, the NetBSD folks were tinkering with the
.note (and PT_NOTE executable header) stuff, and Linux had a "personality"
syscall. For linux, ELF executables started up in "SVR4 mode" and the
personality syscall changed it to "linux". (SYS_personality was a syscall
number that didn't exist on SVR4). We didn't do *anything* with note
sections till much much later on.

We still have an emulation stub for it.. see sys/compat/linux/linux_misc.c

/*
 * UGH! This is just about the dumbest idea I've ever heard!!
 */
int
linux_personality(struct thread *td, struct linux_personality_args *args)
{
#ifdef DEBUG
	if (ldebug(personality))
		printf(ARGS(personality, "%d"), args->per);
#endif
#ifndef __alpha__
	if (args->per != 0)
		return EINVAL;
#endif

	/* Yes Jim, it's still a Linux... */
	td->td_retval[0] = 0;
	return 0;
}


Cheers,
-Peter
--
Peter Wemm - ***@wemm.org; ***@FreeBSD.org; ***@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5


Marcel Moolenaar
2002-05-08 07:18:32 UTC
Post by Peter Wemm
Post by Nathan Hawkins
Post by David O'Brien
Why? Just to follow the NIH herd? The EI_OSABI and EI_ABIVERSION fields
were in the gABI spec before anyone started using .note sections for
this.
Because AFAICS, it's a defacto, unwritten standard. Even if it violates
spec.
NIH is a matter of perspective. FreeBSD could be considered to be in NIH
mode, because the other ELF based systems use a different method.
I seem to recall that NetBSD invented .note.ABI-tag and pushed it back
to the binutils folks.
Interestingly, the LSB (Linux Standard Base) version 1.1.0 has it documented
as Linux specific.

My whole take on the ELF aspect is that we should use EI_OSABI and
EI_ABIVERSION and stop trying to be more compliant than the standard
allows. It's basically a mess and nobody is truly compliant anyway.
The new draft has had EI_OSABI and EI_ABIVERSION documented for years, so
I think we can speculatively use it.

If our toolchain throws in a .note.ABI-tag section then so be it;
we might as well give it sensible contents. I don't think we should
use it as the primary means to select the ABI though.

FWIW,
--
Marcel Moolenaar USPA: A-39004 ***@xcllnt.net

Christoph Hellwig
2002-05-08 08:17:04 UTC
Post by Peter Wemm
syscall. For linux, ELF executables started up in "SVR4 mode" and the
personality syscall changed it to "linux".
That's wrong. On Linux, ELF binaries start as normal Linux processes.
Depending on whether binary emulation is enabled and certain hints are
found (SCO elfmark branding, different interpreter) they are forced to
be foreign personalities. The same applies if a binary issues syscalls on
foreign syscall vectors (e.g. lcall27 for Solaris/ix86 or lcall7 for the
i386 SVR3/SVR4 derivatives).


Peter Wemm
2002-05-08 08:56:24 UTC
Post by Christoph Hellwig
Post by Peter Wemm
syscall. For linux, ELF executables started up in "SVR4 mode" and the
personality syscall changed it to "linux".
That's wrong. On Linux ELF binaries start as normal linux processes.
Depending on whether binary emulation is enabled and certain hints are
found (SCO elfmark branding, different interpreter) they are forced to
be foreign personalities. Also if binaries issues syscalls on foreign
syscalls vectors (e.g. lcall27 for Solaris/ix86 or lcall7 for the
i386 SVR3/SVR4 derivates.).
Bah, you are correct. The last I looked at the code was around 1.2 era,
and looking again:
	if (strcmp(elf_interpreter,"/usr/lib/libc.so.1") == 0 ||
	    strcmp(elf_interpreter,"/usr/lib/ld.so.1") == 0)
		ibcs2_interpreter = 1;
	..
	current->personality = (ibcs2_interpreter ? PER_SVR4 : PER_LINUX);

Cheers,
-Peter
--
Peter Wemm - ***@wemm.org; ***@FreeBSD.org; ***@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5


David O'Brien
2002-05-07 17:49:18 UTC
Post by John Baldwin
Right. I would like to use the ELF_ABI_OSVERSION thingie, but according to
O'Brien it violates the ELF spec to use any value other than 0.
Not quite. We cannot touch EI_VERSION (this is the ELF spec version).
We are able to play with EI_ABIVERSION since we also play with EI_OSABI.

EI_ABIVERSION
Byte e_ident[EI_ABIVERSION] identifies the version of the ABI
to which the object is targeted. This field is used to distinguish
among incompatible versions of an ABI. The interpretation of this
version number is dependent on the ABI identified by the EI_OSABI
field. Applications conforming to this specification use the value 0.

Well, it is a minor violation to use a non-0 value. But what that really
means is that given a non-0 value, a generic ELF program manipulating the
binary cannot make general assumptions.

Peter Wemm
2002-05-08 06:21:12 UTC
Post by John Baldwin
Post by Matthew Dillon
As a person who has already spent several days playing with
64 bit time_t's and associated syscalls I would say that
creating a new syscall vector for FreeBSD 5.0 (and leaving the
existing one intact for compatibility) is the correct solution.
I agree completely.
Post by Matthew Dillon
I tried to add new syscalls to the existing vector but so many
syscalls had to be changed to support 64 bit time_t's that it
became a huge mess... so much so that I would expect the other
BSD's to cry foul on us if we tried to do it with the existing
vector. It will be far, far cleaner to simply implement an
entirely new syscall vector / ELF identifier.
Right. I would like to use the ELF_ABI_OSVERSION thingie, but according to
O'Brien it violates the ELF spec to use any value other than 0. But then
why does it exist if we can't use it? :(
EI_OSABI (index 7) and EI_ABIVERSION (index 8) are *NOT* in the original
ELF spec. They were added relatively recently, and a SVR4 ABI compliant
binary must (by definition) have these at value zero since it doesn't
exist on SVR4.

#define EI_OSABI 7 /* Operating system / ABI identification */
#define EI_ABIVERSION 8 /* ABI version */

With EI_OSABI = 9 (ELFOSABI_FREEBSD), we are by definition not strictly
SVR4 ELF ABI compliant. But so what? We hardly provide the SVR4
layout for 'struct stat' etc anyway..

With EI_OSABI == FreeBSD, then EI_ABIVERSION == ours to play with as we see
fit. (EI_ABIVERSION being non-zero will be no more of a problem than
EI_OSABI being non-zero since traditional ELF tools know about *neither*
of them).
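
To make the byte positions concrete, here is a small sketch that reads the two fields out of the first 16 bytes of an ELF image, using the index and brand values quoted above. The function name and the DEMO_ prefixed macros are made up for this example; a real image activator would use the definitions from the system ELF headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_EI_NIDENT		16
#define DEMO_EI_OSABI		7	/* index quoted in the message above */
#define DEMO_EI_ABIVERSION	8	/* index quoted in the message above */
#define DEMO_ELFOSABI_FREEBSD	9	/* value quoted in the message above */

/*
 * Report the ABI version of a FreeBSD-branded ELF image.
 * Returns the EI_ABIVERSION byte (0..255), or -1 if the buffer is not an
 * ELF header or is not branded ELFOSABI_FREEBSD.
 */
int
demo_freebsd_abiversion(const unsigned char *ident, size_t len)
{
	static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

	assert(ident != NULL);
	if (len < DEMO_EI_NIDENT || memcmp(ident, magic, sizeof(magic)) != 0)
		return (-1);
	if (ident[DEMO_EI_OSABI] != DEMO_ELFOSABI_FREEBSD)
		return (-1);
	return (ident[DEMO_EI_ABIVERSION]);
}
```

Because both bytes sit at fixed offsets in the header, this check needs nothing beyond the first 16 bytes of the file.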

Cheers,
-Peter
--
Peter Wemm - ***@wemm.org; ***@FreeBSD.org; ***@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5


Matthew Dillon
2002-05-10 17:35:51 UTC
Garance A Drosihn
2002-05-10 21:40:07 UTC
: It's not one that other source needs to be changed for though.
: It's not like you've changed the prototypes of functions.
: Instead, you've changed the binary representation of certain
: types. We don't need to use the old ABI's for vendor code
: like Michael suggested because vendor code will compile just
: fine with the newer one.
[...] the off_t change that occurred when 64 bit cpu's
started hitting the mainstream UNIX world. It was *NASTY*.
time_t is going to be almost as nasty. Even many of our
current utilities, for example those that embed time_t in
data streams or data files, will either break or become
incompatible with their brethren.
This isn't a matter of brute-forcing the change. There is
just too much stuff out there to brute-force. All I am
suggesting is that we take the capabilities that we already
have to add to the compiler and extend them out to userland
as an option. It helps us, and it helps ports.
The more I think about this, the more I think Matt's suggestion
is a prudent one. There is still plenty of code which thinks
that size_t is 'int', or uses size_t when it really should be
using off_t. The code "works" mainly because we rarely have
really large files. Who knows how much code we have, both in
the base system and in all the ports, which makes invalid
assumptions about time_t, or of some of the other types that
might be changed with this new syscall vector.
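
One common way to keep data files readable across such a change, offered here only as a sketch, is to stop writing raw time_t values and store timestamps in a fixed 8-byte form instead. The function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Store a timestamp as 8 little-endian bytes, whatever sizeof(time_t) is. */
void
put_time64(unsigned char out[8], time_t t)
{
	uint64_t v = (uint64_t)(int64_t)t;
	int i;

	assert(out != NULL);
	for (i = 0; i < 8; i++)
		out[i] = (unsigned char)(v >> (8 * i));
}

/* Read the timestamp back; the caller checks that it fits its own time_t. */
int64_t
get_time64(const unsigned char in[8])
{
	uint64_t v = 0;
	int i;

	assert(in != NULL);
	for (i = 0; i < 8; i++)
		v |= (uint64_t)in[i] << (8 * i);
	return ((int64_t)v);
}
```

A utility written this way produces the same record layout whether it was built with a 32 bit or a 64 bit time_t, so old and new builds can read each other's files.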
I'm not an expert on GCC but I could probably do the work
if nobody else wants to.
Of course, the sticky question is that we're also in the
middle of a major change in the gcc world, so I don't know
how disruptive it would be to try and do these changes in
about the same timeframe. And certainly the latest and
greatest gcc is critical to a good 5.0-release.

Still, if we were going to "do it right", I think it does
make more sense to allow the ABI's to co-exist, instead of
trying to make a hard cutover. If someone does have a
problem with a port, it would be nice to just recompile
with "the old API" and see whether that affects the problem.

Perhaps we could volunteer Matt to at least look into the
idea a bit more, without making any actual changes yet, just
to see how hard it would be to do this. That work should be
unrelated to the actual syscall change, so looking into this
should not slow down that project.
--
Garance Alistair Drosehn = ***@gilead.netel.rpi.edu
Senior Systems Programmer or ***@freebsd.org
Rensselaer Polytechnic Institute or ***@rpi.edu
