The Unclear Impact

C quiz. T/F: TFAE (the following are equivalent)

int x;
if (cond) x = (expr1);
else x = (expr2);

int x;
x = (cond) ? (expr1) : (expr2);

@dalias I want to say yes but the fact that you're asking has me wondering...

Is there some dark corner of the spec where they diverge if the expressions have side effects?

@azonenberg It's truly evil and the way I formulated the examples obscures how evil it is. 😈

@dalias can we assume the expressions are pure, aka don't have any side effects?

@Elmusfire I don't think it matters, does it?

@dalias idk, I am unsure how the elvis operator shortcuts/optimises in c.
For the if statement, only one of the expressions is executed, but I am not sure about the elvis operator. My gut says the actual execution is undefined and the compiler can choose which option optimizes better.

@Elmusfire Which operand is evaluated (on the abstract machine) is conditional on the truth of the first. The other operand is not evaluated.

@dalias voted True because I just can't see how it could be different…

@dalias it'll be something evil with sequence points, won't it? :blobfoxshocked:

@dalias false, i’m sure of it, but i haven’t the slightest idea why

@ariadne Smart reasoning. 😂

@dalias 20 years of holding the footgun makes one weary

@dalias https://godbolt.org/z/G7ePd494W clang generates different code for two cases that match this pattern, for something other than what you ran into earlier, although gcc's behaviour doesn't match

@mildsunrise came up with the idea

@dalias @mildsunrise https://godbolt.org/z/efGqTTeq5

This example shows that clang and gcc agree that the output should be different when everything is parameterised, so it seems that gcc is doing an invalid optimisation in the first case with specific values

@chozu @mildsunrise Ok but this version no longer has expressions there only the resulting values from those expressions. Refactoring this way is *very* different, as all the expressions get evaluated before passing to the functions.

@chozu @mildsunrise Seems you may have found a gcc bug tho 😂

@dalias @mildsunrise nice

I assume you know more about submitting gcc bugs than I do, how should I go about submitting it?

@dalias @mildsunrise I don't think it's a bug, because (float)INT_MAX is ub

@dalias @mildsunrise uh, or rather (int)(float)INT_MAX is

@chozu @dalias ohh sorry then, I didn't know implicit conversions could result in UB

@ariadne This one is a delightful footgun for folks who like to make gratuitous beautifying/style changes to code.

@dalias no, in the first case expr1 and expr2 get assigned directly, in the second case they go through the "common type" that is determined through the ternary operator rules; this can lead to surprises: if e.g. expr1 is a float, its type is going to win against e.g. an int expr2, possibly leading to loss of precision if expr2 is big.

@dalias `__LINE__` is cheating right

@dalias well i did *something*

C file with following contents that maybe counts?

#include <stdint.h>
#include <stdio.h>

#define int uintptr_t

#define cond x
#define expr1 (int)&a
#define expr2 (int)&x

int a = 1;

int func1(void) {
        int x;
        if (cond) x = (expr1);
        else x = (expr2);

        x = *(int*)x;
        return x;
}

int func2(void) {
        int x;
        x = (cond) ? (expr1) : (expr2);

        x = *(int*)x;
        return x;
}

int main(void) {
        printf("%lu %lu\n", (unsigned long)func1(), (unsigned long)func2());
}

Okay, so, following up now:

The correct answer is false. An "easy" unintended reason that I missed is types. The ?: operator produces a result in a type depending on the types of the second and third operand, so for example:

x = 1 ? INT_MAX/2 : 0.0f;

produces a different result from:

if (1) x = INT_MAX/2;
else x = 0.0f;

However let's say we get rid of that possibility (as I intended) by making it:

x = (cond) ? (int)(expr1) : (int)(expr2);

The answer is still false, and the reason is profoundly evil.

In C89 they might be equivalent now (exercise for reader), but C99 and later have compound literals, making an obscure rule relevant...

Per C99 6.8.4 Selection statements ¶3:

"A selection statement is a block whose scope is a strict subset of the scope of its enclosing block. Each associated substatement is also a block whose scope is a strict subset of the scope of the selection statement."

And 6.5.2.5 Compound literals ¶6:

"If the compound literal occurs outside the body of a function, the object has static storage duration; otherwise, it has automatic storage duration associated with the enclosing block."

A compound literal that appears in the body of an if/else has a lifetime that ends with the if/else (selection statement). One that appears in a ?: expression has a lifetime that persists to the end of the enclosing block.

So, how do you use that to make the original forms non-equivalent? If int can round-trip pointers (ILP32 systems), 1?(int)&(int){0}:0 produces a valid pointer when cast back to int*, while the if form doesn't. But there are other ways to do it too.

For example 1?(p=&(int){0}),0:0 has a side effect of storing a valid pointer in p, but the if version stores a dangerous dangling pointer.

This makes for a profound and glorious footgun for folks who like to make gratuitous style/"beautifying" changes to code they import/fork/inherit: changing "ugly, hackish looking" ?: to "nice" if/else can turn valid code into deviously subtle UB! 🤣 🤯 🤦