P2795R5 Erroneous behaviour for uninitialized reads #6897


Merged
merged 2 commits into from
Apr 15, 2024
85 changes: 70 additions & 15 deletions source/basic.tex
Expand Up @@ -3744,26 +3744,49 @@
\end{note}
\indextext{object lifetime|)}

\rSec2[basic.indet]{Indeterminate values}
\rSec2[basic.indet]{Indeterminate and erroneous values}

\pnum
\indextext{value!indeterminate}%
\indextext{indeterminate value}%
When storage for an object with automatic or dynamic storage duration
is obtained, the object has an \defnadj{indeterminate}{value}, and if
no initialization is performed for the object, that object retains an
indeterminate value until that value is replaced\iref{expr.ass}.
is obtained,
the bytes comprising the storage for the object
have the following initial value:
\begin{itemize}
\item
If the object has dynamic storage duration, or
is the object associated with a variable or function parameter
whose first declaration is marked with
the \tcode{[[indeterminate]]} attribute\iref{dcl.attr.indet},
the bytes have \defnadjx{indeterminate}{values}{value};
\item
otherwise, the bytes have erroneous values,
where each value is determined by the implementation
independently of the state of the program.
\end{itemize}
If no initialization is performed for an object (including subobjects),
such a byte retains its initial value
until that value is replaced\iref{dcl.init.general,expr.ass}.
If any bit in the value representation has an indeterminate value,
the object has an indeterminate value;
otherwise, if any bit in the value representation has an erroneous value,
the object has an erroneous value\iref{conv.lval}.
\begin{note}
Objects with static or thread storage duration are zero-initialized,
see~\ref{basic.start.static}.
\end{note}
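The note above can be checked with a small sketch. The names `global_counter`, `per_thread_count`, `global_stats`, and `read_static_local` are hypothetical and exist only for illustration. Objects with static or thread storage duration are zero-initialized, so reading them without an explicit initializer involves neither an indeterminate nor an erroneous value.

```cpp
#include <cassert>

// Hypothetical aggregate used only for this illustration.
struct Stats {
    int hits;
    double ratio;
    const char* label;
};

int global_counter;                 // static storage duration: zero-initialized
thread_local int per_thread_count;  // thread storage duration: zero-initialized
Stats global_stats;                 // every member is zero-initialized

// A block-scope static is also zero-initialized before any other initialization.
int read_static_local() {
    static int calls;
    return calls;
}
```

The same declarations at block scope without `static` or `thread_local` would instead fall under the rules of this subclause for automatic storage duration.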

\pnum
If an indeterminate value is produced by an evaluation,
the behavior is undefined except in the following cases:
Except in the following cases,
if an indeterminate value is produced by an evaluation,
the behavior is undefined, and
if an erroneous value is produced by an evaluation,
the behavior is erroneous and
the result of the evaluation is the value so produced but is not erroneous:
\begin{itemize}
\item
If an indeterminate value of
If an indeterminate or erroneous value of
unsigned ordinary character type\iref{basic.fundamental}
or \tcode{std::byte} type\iref{cstddef.syn}
is produced by the evaluation of:
Expand All @@ -3780,37 +3803,69 @@
\item
a discarded-value expression\iref{expr.context},
\end{itemize}
then the result of the operation is an indeterminate value.
then the result of the operation is an indeterminate value or
that erroneous value, respectively.
\item
If an indeterminate value of
If an indeterminate or erroneous value of
unsigned ordinary character type or \tcode{std::byte} type
is produced by the evaluation of
the right operand of a simple assignment operator\iref{expr.ass}
whose first operand is an lvalue of
unsigned ordinary character type or \tcode{std::byte} type,
an indeterminate value replaces
an indeterminate value or that erroneous value, respectively, replaces
the value of the object referred to by the left operand.
\item
If an indeterminate value of unsigned ordinary character type
If an indeterminate or erroneous value of unsigned ordinary character type
is produced by the evaluation of the initialization expression
when initializing an object of unsigned ordinary character type,
that object is initialized to an indeterminate
value.
value or that erroneous value, respectively.
\item
If an indeterminate or erroneous value of
unsigned ordinary character type or \tcode{std::byte} type
is produced by the evaluation of the initialization expression
when initializing an object of \tcode{std::byte} type,
that object is initialized to an indeterminate value.
that object is initialized to an indeterminate value or
that erroneous value, respectively.
\end{itemize}
Converting an indeterminate or erroneous value of
unsigned ordinary character type or \tcode{std::byte} type
produces an indeterminate or erroneous value, respectively.
In the latter case,
the result of the conversion is the value of the converted operand.
\begin{example}
\begin{codeblock}
int f(bool b) {
unsigned char c;
unsigned char d = c; // OK, \tcode{d} has an indeterminate value
unsigned char *c = new unsigned char;
unsigned char d = *c; // OK, \tcode{d} has an indeterminate value
int e = d; // undefined behavior
return b ? d : 0; // undefined behavior if \tcode{b} is \tcode{true}
}

int g(bool b) {
unsigned char c;
unsigned char d = c; // no erroneous behavior, but \tcode{d} has an erroneous value

assert(c == d); // holds, both integral promotions have erroneous behavior

int e = d; // erroneous behavior
return b ? d : 0; // erroneous behavior if \tcode{b} is \tcode{true}
}

void h() {
int d1, d2;

int e1 = d1; // erroneous behavior
int e2 = d1; // erroneous behavior

assert(e1 == e2); // holds
assert(e1 == d1); // holds, erroneous behavior
assert(e2 == d1); // holds, erroneous behavior

std::memcpy(&d2, &d1, sizeof(int)); // no erroneous behavior, but \tcode{d2} has an erroneous value
assert(e1 == d2); // holds, erroneous behavior
assert(e2 == d2); // holds, erroneous behavior
}
\end{codeblock}
\end{example}
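As a complement to the example above, the following is a minimal sketch of copying an object representation through `unsigned char` storage with `std::memcpy`. The helper name `copy_via_bytes` is hypothetical. The sketch is only exercised with initialized sources so that its result is deterministic. Following the `std::memcpy` line in `h()` above, the two copies would not themselves have erroneous behavior for an uninitialized automatic source, but the final read of `dst` as an `int` would.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: copies the bytes of src into a byte buffer and
// then into a fresh int, without reading src as an int along the way.
int copy_via_bytes(const int& src) {
    unsigned char bytes[sizeof(int)];       // fully overwritten by the copy below
    std::memcpy(bytes, &src, sizeof(int));  // byte-wise copy of the representation
    int dst;                                // fully overwritten by the copy below
    std::memcpy(&dst, bytes, sizeof(int));
    return dst;                             // the only place where an int value is read
}
```

With an initialized argument, `copy_via_bytes(42)` simply returns `42`.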

Expand Down
4 changes: 2 additions & 2 deletions source/classes.tex
Expand Up @@ -5505,7 +5505,7 @@
is neither initialized nor
given a value
during execution of the \grammarterm{compound-statement} of the body of the constructor,
the member has an indeterminate value.
the member has an indeterminate or erroneous value\iref{basic.indet}.
\end{note}
\begin{example}
\begin{codeblock}
Expand All @@ -5521,7 +5521,7 @@
C() { } // initializes members as follows:
A a; // OK, calls \tcode{A::A()}
const B b; // error: \tcode{B} has no default constructor
int i; // OK, \tcode{i} has indeterminate value
int i; // OK, \tcode{i} has indeterminate or erroneous value
int j = 5; // OK, \tcode{j} has the value \tcode{5}
};
\end{codeblock}
Expand Down
4 changes: 2 additions & 2 deletions source/compatibility.tex
Expand Up @@ -2881,8 +2881,8 @@
\end{codeblock}

\rationale
This is to avoid erroneous function calls (i.e., function calls
with the wrong number or type of arguments).
This is to avoid function calls
with the wrong number or type of arguments.
\effect
Change to semantics of well-defined feature.
This feature was marked as ``obsolescent'' in C.
Expand Down
46 changes: 46 additions & 0 deletions source/declarations.tex
Expand Up @@ -9189,6 +9189,52 @@
\end{codeblock}
\end{example}

\rSec2[dcl.attr.indet]{Indeterminate storage}
\indextext{attribute!indeterminate}

\pnum
The \grammarterm{attribute-token} \tcode{indeterminate} may be applied
to the definition of a block variable with automatic storage duration or
to a \grammarterm{parameter-declaration} of a function declaration.
No \grammarterm{attribute-argument-clause} shall be present.
The attribute specifies
that the storage of an object with automatic storage duration
is initially indeterminate rather than erroneous\iref{basic.indet}.

\pnum
If a function parameter is declared with the \tcode{indeterminate} attribute,
it shall be so declared in the first declaration of its function.
If a function parameter is declared with
the \tcode{indeterminate} attribute in the first declaration of its function
in one translation unit and
the same function is declared without the \tcode{indeterminate} attribute
on the same parameter in its first declaration in another translation unit,
the program is ill-formed, no diagnostic required.

\pnum
\begin{note}
Reading from an uninitialized variable
that is marked \tcode{[[indeterminate]]} can cause undefined behavior.
\begin{codeblock}
void f(int);
void g() {
int x [[indeterminate]], y;
f(y); // erroneous behavior\iref{basic.indet}
f(x); // undefined behavior
}

struct T {
T() {}
int x;
};
int h(T t [[indeterminate]]) {
f(t.x); // undefined behavior when called below
return 0;
}
int _ = h(T());
\end{codeblock}
\end{note}
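A use of the attribute that stays within defined behavior might look like the sketch below: a scratch buffer is marked `[[indeterminate]]` and every element is written before it is read. The macro `SKETCH_INDETERMINATE` and the function `sum_of_squares` are hypothetical names. Whether a given compiler reports support through `__has_cpp_attribute(indeterminate)` is an assumption, and the macro expands to nothing when it does not.

```cpp
#include <cassert>

// Assumption: implementations that support the attribute report it through
// __has_cpp_attribute. Otherwise the macro is empty and the buffer simply
// follows the default rules for automatic storage.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(indeterminate)
#    define SKETCH_INDETERMINATE [[indeterminate]]
#  endif
#endif
#ifndef SKETCH_INDETERMINATE
#  define SKETCH_INDETERMINATE
#endif

// Hypothetical helper: sums i*i for i in [0, n), clamped to the buffer size.
int sum_of_squares(int n) {
    SKETCH_INDETERMINATE int scratch[16];  // storage is initially indeterminate
    if (n < 0) n = 0;
    if (n > 16) n = 16;
    for (int i = 0; i < n; ++i) {
        scratch[i] = i * i;                // every element is written first...
    }
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += scratch[i];               // ...and only written elements are read
    }
    return total;
}
```

Reading `scratch[n]` or any later element would produce an indeterminate value and therefore undefined behavior when the attribute is in effect.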

\rSec2[dcl.attr.likelihood]{Likelihood attributes}%
\indextext{attribute!likely}
\indextext{attribute!unlikely}
Expand Down
19 changes: 15 additions & 4 deletions source/expressions.tex
Expand Up @@ -675,6 +675,9 @@

\item Otherwise, the object indicated by the glvalue is read\iref{defns.access},
and the value contained in the object is the prvalue result.
If the result is an erroneous value\iref{basic.indet} and
the bits in the value representation are not valid for the object's type,
the behavior is undefined.
\end{itemize}

\pnum
Expand Down Expand Up @@ -5299,7 +5302,7 @@
If the \grammarterm{expression} in a \grammarterm{noptr-new-declarator}
is present, it is implicitly converted to \tcode{std::size_t}.
\indextext{function!allocation}%
The \grammarterm{expression} is erroneous if:
The value of the \grammarterm{expression} is invalid if:
\begin{itemize}
\item
the expression is of non-class type and its value before converting to
Expand Down Expand Up @@ -5327,7 +5330,7 @@
number of elements to initialize.
\end{itemize}

If the \grammarterm{expression} is erroneous after converting to \tcode{std::size_t}:
If the value of the \grammarterm{expression} is invalid after converting to \tcode{std::size_t}:
\begin{itemize}
\item
if the \grammarterm{expression} is a potentially-evaluated core constant expression,
Expand Down Expand Up @@ -7519,7 +7522,7 @@
limits (see \ref{implimits});

\item
an operation that would have undefined behavior
an operation that would have undefined or erroneous behavior
as specified in \ref{intro} through \ref{cpp},
excluding \ref{dcl.attr.assume} and \ref{dcl.attr.noreturn};
\begin{footnote}
Expand Down Expand Up @@ -7937,7 +7940,7 @@

\item
if the value is an object of scalar type,
it does not have an indeterminate value\iref{basic.indet},
it does not have an indeterminate or erroneous value\iref{basic.indet},

\item
if the value is of pointer type, it contains
Expand Down Expand Up @@ -7973,6 +7976,14 @@
constexpr int r = h(); // OK
constexpr auto e = g(); // error: a pointer to an immediate function is
// not a permitted result of a constant expression

struct S {
int x;
constexpr S() {}
};
int i() {
constexpr S s; // error: \tcode{s.x} has erroneous value
}
\end{codeblock}
\end{example}
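For contrast with the ill-formed `constexpr S s;` above, here is a minimal sketch in which the constructor initializes the member, so the object is usable as a constant. The type and function mirror the names in the example but are redefined here for illustration only.

```cpp
#include <cassert>

struct S {
    int x;
    constexpr S() : x(0) {}  // x is initialized, so no erroneous value remains
};

int i() {
    constexpr S s;           // OK: every scalar subobject has a determinate value
    static_assert(s.x == 0, "s.x is usable in a constant expression");
    return s.x;
}
```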

Expand Down
31 changes: 28 additions & 3 deletions source/intro.tex
Expand Up @@ -253,6 +253,16 @@
\definition{dynamic type}{defns.dynamic.type.prvalue}
\defncontext{prvalue} \termref{defns.static.type}{static type}{} of the prvalue expression

\definition{erroneous behavior}{defns.erroneous}
well-defined behavior that the implementation is recommended to diagnose
\begin{defnote}
Erroneous behavior is always the consequence of incorrect program code.
Implementations are allowed, but not required,
to diagnose it\iref{intro.compliance.general}.
Evaluation of a constant expression\iref{expr.const}
never exhibits behavior specified as erroneous in \ref{intro} through \ref{cpp}.
\end{defnote}

\definition{expression-equivalent}{defns.expression.equivalent}
\defncontext{library}
\indexdefn{expression-equivalent}%
Expand Down Expand Up @@ -629,13 +639,13 @@
\begin{defnote}
Undefined behavior may be expected when
this document omits any explicit
definition of behavior or when a program uses an erroneous construct or erroneous data.
definition of behavior or when a program uses an incorrect construct or invalid data.
Permissible undefined behavior ranges
from ignoring the situation completely with unpredictable results, to
behaving during translation or program execution in a documented manner
characteristic of the environment (with or without the issuance of a
\termref{defns.diagnostic}{diagnostic message}{}), to terminating a translation or execution (with the
issuance of a diagnostic message). Many erroneous program constructs do
issuance of a diagnostic message). Many incorrect program constructs do
not engender undefined behavior; they are required to be diagnosed.
Evaluation of a constant expression\iref{expr.const} never exhibits behavior explicitly
specified as undefined in \ref{intro} through \ref{cpp}.
Expand Down Expand Up @@ -721,7 +731,8 @@
within its resource limits as described in \ref{implimits},
accept and correctly execute
\begin{footnote}
``Correct execution'' can include undefined behavior, depending on
``Correct execution'' can include undefined behavior
and erroneous behavior, depending on
the data being processed; see \ref{intro.defs} and~\ref{intro.execution}.
\end{footnote}
that program.
Expand Down Expand Up @@ -900,6 +911,20 @@
requirement on the implementation executing that program with that input
(not even with regard to operations preceding the first undefined
operation).
If the execution contains an operation specified as having erroneous behavior,
the implementation is permitted to issue a diagnostic and
is permitted to terminate the execution
at an unspecified time after that operation.

\pnum
\recommended
An implementation should issue a diagnostic when such an operation is executed.
\begin{note}
An implementation can issue a diagnostic
if it can determine that erroneous behavior is reachable
under an implementation-specific set of assumptions about the program behavior,
which can result in false positives.
\end{note}

\pnum
\indextext{conformance requirements}%
Expand Down
6 changes: 3 additions & 3 deletions source/lib-intro.tex
Expand Up @@ -1939,10 +1939,10 @@
\pnum
A value-initialized object of type \tcode{P} produces the null value of the type.
The null value shall be equivalent only to itself. A default-initialized object
of type \tcode{P} may have an indeterminate value.
of type \tcode{P} may have an indeterminate or erroneous value.
\begin{note}
Operations involving
indeterminate values can cause undefined behavior.
Operations involving indeterminate values can cause undefined behavior, and
operations involving erroneous values can cause erroneous behavior\iref{basic.indet}.
\end{note}
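The distinction between value-initialization and default-initialization can be sketched as follows. The helper template `value_initialized_is_null` is a hypothetical name, and it is instantiated here only with built-in pointer types and `std::nullptr_t`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: value-initializes two objects of a pointer-like
// type P and checks that both hold the null value.
template <class P>
bool value_initialized_is_null() {
    P p{};       // value-initialization
    P q = P();   // value-initialization, alternative spelling
    return p == nullptr && q == nullptr && p == q;
}
```

A default-initialized `P p;` at block scope would not be covered by this guarantee, which is why the sketch never declares one.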

\pnum
Expand Down
2 changes: 1 addition & 1 deletion source/threads.tex
Expand Up @@ -8376,7 +8376,7 @@
\pnum
\begin{note}
It is the user's responsibility to ensure that waiting threads
do not erroneously assume that the thread has finished if they experience
do not incorrectly assume that the thread has finished if they experience
spurious wakeups. This typically requires that the condition being waited
for is satisfied while holding the lock on \tcode{lk}, and that this lock
is not released and reacquired prior to calling \tcode{notify_all_at_thread_exit}.
Expand Down
26 changes: 20 additions & 6 deletions source/utilities.tex
Expand Up @@ -18865,13 +18865,27 @@
If there are multiple such values, which value is produced is unspecified.
A bit in the value representation of the result is indeterminate if
it does not correspond to a bit in the value representation of \tcode{from} or
corresponds to a bit of an object that is not within its lifetime or
corresponds to a bit
for which the smallest enclosing object is not within its lifetime or
has an indeterminate value\iref{basic.indet}.
For each bit in the value representation of the result that is indeterminate,
the smallest object containing that bit has an indeterminate value;
the behavior is undefined unless that object is
of unsigned ordinary character type or \tcode{std::byte} type.
The result does not otherwise contain any indeterminate values.
A bit in the value representation of the result is erroneous
if it corresponds to a bit
for which the smallest enclosing object has an erroneous value.
For each bit $b$ in the value representation of the result
that is indeterminate or erroneous,
let $u$ be the smallest object enclosing $b$:
\begin{itemize}
\item
If $u$ is of unsigned ordinary character type or \tcode{std::byte} type,
$u$ has an indeterminate value
if any of the bits in its value representation are indeterminate, or
otherwise has an erroneous value.
\item
Otherwise, if $b$ is indeterminate, the behavior is undefined.
\item
Otherwise, the behavior is erroneous, and the result is as specified above.
\end{itemize}
The result does not otherwise contain any indeterminate or erroneous values.
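A minimal sketch of the unproblematic case follows: when every bit of the source is initialized, no bit of the result is indeterminate or erroneous. The helper `float_bits` is a hypothetical name. The sketch assumes a 32-bit IEEE 754 `float`, and it falls back to `std::memcpy` when the library does not advertise `std::bit_cast` (a C++20 facility).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#if __has_include(<bit>)
#  include <bit>
#endif

// Hypothetical helper: returns the bits of f as a 32-bit unsigned integer.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    static_assert(std::numeric_limits<float>::is_iec559,
                  "sketch assumes IEEE 754 floats");
#if defined(__cpp_lib_bit_cast)
    return std::bit_cast<std::uint32_t>(f);  // source is fully initialized
#else
    std::uint32_t u = 0;
    std::memcpy(&u, &f, sizeof u);           // pre-C++20 fallback
    return u;
#endif
}
```

Under the stated assumptions, `float_bits(1.0f)` yields `0x3F800000`. A source with uninitialized automatic storage would instead fall under the itemized rules above.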

\pnum
\remarks
Expand Down