To assert() or not to assert(), that is not really a question...

Discussion:

Poul-Henning Kamp

2018-05-26 07:19:14 UTC

When I started writing Varnish Cache, on of the first decision was
that I would pepper the source code with asserts even "pointless"
ones, and I did to the order of 10% of all source file lines, and
it is probably the best decision I have ever made.

The primary effects of all the asserts is that we get crash-dumps
from the earliest possible point in time the trouble could be
spotted. For a process which can rutinely have several thousands
threads, that is what makes debugging possible in the first place.

The secondary effect is that the sheer number of asserts means
that crashes can almost always be isolated to a very small stretch
of code based on the assert location.

The one benefit from all the asserts I had not anticipated is that
they almost always prevent program bugs from turning into information
leakage vulnerabilities: We stop before we send wrong data out.

The biggest impact of all the asserts however, is that Varnish Cache
went 11 years while moving a very large fraction of all HTTP traffic
on the net, without a major security issue.

When we finally got our first big CVE, it was not a remote excution,
an information leak or anything horrible bad: It was a "harmless"
DoS caused by a wrong assert test, but a DoS is still pretty bad
news when so much traffic goes through Varnish.

I think FreeBSD needs to learn from Varnish Cache's experience:
We should have far more asserts in FreeBSD.

We already have a "private" assert facility in the kernel, and
people should simply use it more.

But in userland our asserts come from <assert.h>, and that is a
problem.

The main trouble with using assert(3) is that random people
illadvisedly set NDEBUG expecting their code to run faster as a
result.

It does not.

Almost all the asserts in Varnish happens on values already in CPU
registers and/or cache and I have never been able to credibly measure
a performance impact from the asserts in Varnish.

Besides, many of the asserts never make it to the CPU, modern
compilers have strong analysis which can see that the can never
trigger, so they are removed in optimization.

We cannot change <assert.h> to get rid of the NDEBUG mistake, and
in a more abstract line of thought we probably should not even use
<assert.h> for FreeBSD source code - it sort of belongs to the users.

I suggest and strongly urge that we add a set of userland assert-macros
which are never compiled out, for use *only* in FreeBSD userland
source code, and that we sprinkle them liberally, in particular
anything setuid etc.

In Varnish Cache we have four "kinds" of asserts, which allows us
to communicate crucial information in the message to the users. I
don't know if a similar subdivision of asserts would make sense for
FreeBSD, but I mention it here as inspiration:

1. "Regular asserts" - things which are just plain wrong, which
probably means we have a genuine bug somewhere. Examples could
be null pointers where previous checks should have ensured this
not be so. Also error situations for which there is no saner
handling that killing the projcess.

2. "xxx asserts" - situations which should (almost) never happen,
and for which we could do more productive error handling, but
where the seldomness of the condition makes it a bad idea (ie:
untested code) and a bad investment of our developer time to do
so. Disk full is a good example for Varnish.

3. "wrong asserts" - Internal state is messed up, program flow
has taken a "impossible" branch. A good example is the
default branch of a switch on a finite input set.

4. "Incomplete asserts" - Code we have not written yet, extension
points not open for business yet, very strange corner-cases
belived to be impossible, but not proven to be so yet.

Poul-Henning

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Oliver Pinter

2018-05-26 08:09:57 UTC

Permalink

Post by Poul-Henning Kamp
When I started writing Varnish Cache, on of the first decision was
that I would pepper the source code with asserts even "pointless"
ones, and I did to the order of 10% of all source file lines, and
it is probably the best decision I have ever made.
The primary effects of all the asserts is that we get crash-dumps
from the earliest possible point in time the trouble could be
spotted. For a process which can rutinely have several thousands
threads, that is what makes debugging possible in the first place.
The secondary effect is that the sheer number of asserts means
that crashes can almost always be isolated to a very small stretch
of code based on the assert location.
The one benefit from all the asserts I had not anticipated is that
they almost always prevent program bugs from turning into information
leakage vulnerabilities: We stop before we send wrong data out.
The biggest impact of all the asserts however, is that Varnish Cache
went 11 years while moving a very large fraction of all HTTP traffic
on the net, without a major security issue.
When we finally got our first big CVE, it was not a remote excution,
an information leak or anything horrible bad: It was a "harmless"
DoS caused by a wrong assert test, but a DoS is still pretty bad
news when so much traffic goes through Varnish.
We should have far more asserts in FreeBSD.
We already have a "private" assert facility in the kernel, and
people should simply use it more.

This private assert facility exists in the kernel, but used only in a
limited scope.
It's only used in -CURRENT. It would be a really big step forward, if we
were enable them for -STABLE branches too, because these beaches changes
significantly too without enabled KASSERT.

Post by Poul-Henning Kamp
But in userland our asserts come from <assert.h>, and that is a
problem.
The main trouble with using assert(3) is that random people
illadvisedly set NDEBUG expecting their code to run faster as a
result.
It does not.
Almost all the asserts in Varnish happens on values already in CPU
registers and/or cache and I have never been able to credibly measure
a performance impact from the asserts in Varnish.
Besides, many of the asserts never make it to the CPU, modern
compilers have strong analysis which can see that the can never
trigger, so they are removed in optimization.
We cannot change <assert.h> to get rid of the NDEBUG mistake, and
in a more abstract line of thought we probably should not even use
<assert.h> for FreeBSD source code - it sort of belongs to the users.
I suggest and strongly urge that we add a set of userland assert-macros
which are never compiled out, for use *only* in FreeBSD userland
source code, and that we sprinkle them liberally, in particular
anything setuid etc.
In Varnish Cache we have four "kinds" of asserts, which allows us
to communicate crucial information in the message to the users. I
don't know if a similar subdivision of asserts would make sense for
1. "Regular asserts" - things which are just plain wrong, which
probably means we have a genuine bug somewhere. Examples could
be null pointers where previous checks should have ensured this
not be so. Also error situations for which there is no saner
handling that killing the projcess.
2. "xxx asserts" - situations which should (almost) never happen,
and for which we could do more productive error handling, but
untested code) and a bad investment of our developer time to do
so. Disk full is a good example for Varnish.
3. "wrong asserts" - Internal state is messed up, program flow
has taken a "impossible" branch. A good example is the
default branch of a switch on a finite input set.
4. "Incomplete asserts" - Code we have not written yet, extension
points not open for business yet, very strange corner-cases
belived to be impossible, but not proven to be so yet.
Poul-Henning
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
_______________________________________________
https://lists.freebsd.org/mailman/listinfo/freebsd-arch

Mark R V Murray

2018-05-26 12:50:10 UTC

Permalink

+1 from day-job experience!

M
--
Mark R V Murray

Ravi Pokala

2018-05-28 18:07:59 UTC

Permalink

-----Original Message-----
From: <owner-freebsd-***@freebsd.org> on behalf of Poul-Henning Kamp <***@phk.freebsd.dk>
Date: 2018-05-26, Saturday at 00:19
To: <***@freebsd.org>
Subject: To assert() or not to assert(), that is not really a question...

Post by Poul-Henning Kamp
...
1. "Regular asserts" - things which are just plain wrong, which
probably means we have a genuine bug somewhere. Examples could
be null pointers where previous checks should have ensured this
not be so. Also error situations for which there is no saner
handling that killing the projcess.
...
3. "wrong asserts" - Internal state is messed up, program flow
has taken a "impossible" branch. A good example is the
default branch of a switch on a finite input set.

Hi Poul-Henning,

I am in strong overall agreement with your argument. I am however confused as to how (1) and (3) are different; they're both irrevocably bad internal state.

Thanks,

Ravi (rpokala@)

Poul-Henning Kamp

2018-05-28 18:24:06 UTC

Permalink

--------

Post by Ravi Pokala

Post by Poul-Henning Kamp
1. "Regular asserts" - things which are just plain wrong, which
probably means we have a genuine bug somewhere. Examples could
be null pointers where previous checks should have ensured this
not be so. Also error situations for which there is no saner
handling that killing the projcess.
...
3. "wrong asserts" - Internal state is messed up, program flow
has taken a "impossible" branch. A good example is the
default branch of a switch on a finite input set.

Hi Poul-Henning,
I am in strong overall agreement with your argument. I am however
confused as to how (1) and (3) are different; they're both irrevocably
bad internal state.

The regular assert is assert() as we know and love it, and if it triggers
it reports the C-source of the failing condition.

The WRONG macro always triggers, and reports its string argument.

Here is a random snippet of varnish code showing both:

/* Per specification */
assert(sizeof vpx1_sig == 5);
assert(sizeof vpx2_sig == 12);

[...]
p = req->htc->rxbuf_b;
if (p[0] == vpx1_sig[0])
i = vpx_proto1(wrk, req);
else if (p[0] == vpx2_sig[0])
i = vpx_proto2(wrk, req);
else
WRONG("proxy sig mismatch");

Poul-Henning

PS:

You can explore the Varnish source code here:

https://github.com/varnishcache/varnish-cache

Asserts defined in:

.../include/vas.h

Custom backtrace/state dump in:

.../bin/varnishd/cache/cache_panic.c

Code coverage results:

http://varnish-cache.org/gcov/

You may also find the void-pointer paranoia interesting:

.../include/miniobj.h

Ravi Pokala

2018-05-28 20:07:35 UTC

Permalink

Interesting stuff. Thanks, phk!

-Ravi (rpokala@)

-----Original Message-----
From: Poul-Henning Kamp <***@phk.freebsd.dk>
Date: 2018-05-28, Monday at 11:24
To: Ravi Pokala <***@freebsd.org>
Cc: <***@freebsd.org>
Subject: Re: To assert() or not to assert(), that is not really a question...

--------

Post by Ravi Pokala

Hi Poul-Henning,
I am in strong overall agreement with your argument. I am however
confused as to how (1) and (3) are different; they're both irrevocably
bad internal state.