Distinguishing between maybe-null vs never-null is the important thing

Herb Sutter C++ 2017-02-15 2 Minutes

This discussion today on the Core Guidelines repo issues is probably of broad interest. It’s regarding why we chose to annotate not_null<T*> rather than the reverse in the Guidelines and the Guideline Support Library (GSL).

Pasting here:

I would take this interface reduction one step further and make an un-annotated T* implicitly “not null”.

I understand, and we considered that.

We decided against that for several reasons:

T*, smart_ptr<T>, span<T>, container<T>::iterator, range<T>, etc. are all non-owning indirections and should be consistent with each other — it would be strange for some to be nullable but not others. Iterators can be “null”, for example a default-constructed iterator is not referring to anything.
More generally, all of those can be default-constructed, and the only reasonable semantics for that are “doesn’t point to anything.” (This can be a springboard for a broader discussion about the situations where default-constructible types are important, Regular types, etc.)
A large fraction of existing uses of T* are deliberately intended to be nullable, because people by convention use references for not-null parameters in particular, and so in modern C++ code the presence of a T* parameter often (not always) implies nullability by that convention. So trying to annotate the “nullable” case would be a huge code churn, and not only unadoptable but actually against the intent of much existing code.
Even if we ignored that and changed the default for T*, then we’d need to invent yet another annotation wrapper such as nullable<T>, and have to teach and explain both not_null<T> and nullable<T> (inconsistently).

For these and other reasons, we think that pointers should be nullable by default unless annotated otherwise.
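
As a rough illustration of the convention mentioned above (a reference for a parameter that must be present, a T* for one that may be absent), here is a minimal sketch. The function names are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: the caller must supply a string, so take a reference.
std::size_t required_length(const std::string& s) {
    return s.size();
}

// Hypothetical helper: the argument is optional, so take a pointer and
// treat nullptr as "no value supplied".
std::size_t optional_length(const std::string* s) {
    return s ? s->size() : 0;
}
```

Under this convention the pointer parameter already signals "may be null" to the reader, which is the intent that a nullable-by-default T* preserves.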

valid concerns that are being dismissed because of a failure to distinguish between best practices for new code, and pragmatic recommendations for updating old code

I hope that helps reassure you that the concerns were considered deeply and aren’t being dismissed, and apply both to new code and old code. Defaults are important, and should reflect the common case especially for new code, but also for old code much of which is “correct” but just expressed without enough information about the intent because the programmer didn’t have the option or tool to express the intent.

The key issue is to distinguish maybe-null and never-null in the type system, and both of our approaches agree on doing that. Tony Hoare called null pointers his “billion-dollar mistake,” but in my opinion, and I think yours, the mistake was not maybe-null pointers (which are necessary, unavoidable, and pervasively present in every language with pointer/reference indirections, including Java, C#, C, C++, etc.), but rather in not distinguishing maybe-null and never-null pointers in the type system. You and we are both trying to do that, and so in the above I think we’re largely agreeing and our discussion is narrowly just about which one should be the default.
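
To make the maybe-null versus never-null distinction concrete, here is a minimal sketch of a never-null wrapper. The name never_null and the functions below are hypothetical, and this is a simplified stand-in written for illustration, not the GSL's not_null implementation.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical, simplified stand-in for a never-null pointer wrapper.
// This is not the GSL implementation.
template <typename T>
class never_null {
public:
    // Reject null at the boundary so the invariant holds afterwards.
    explicit never_null(T* p) : ptr_(p) {
        if (ptr_ == nullptr) throw std::invalid_argument("null pointer");
    }
    // Constructing from a literal nullptr is a compile-time error.
    never_null(std::nullptr_t) = delete;

    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }
    T* get() const { return ptr_; }

private:
    T* ptr_;
};

// Maybe-null: the callee must check before dereferencing.
int read_or_default(const int* p, int fallback) {
    return p ? *p : fallback;
}

// Never-null: the type carries the guarantee, so no check is needed.
int read(never_null<const int> p) {
    return *p;
}
```

The plain pointer stays nullable, and only the annotated parameter promises a pointee, so the two cases are visible in the function signatures.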

Published by Herb Sutter

Herb Sutter is an author and speaker, a software architect at Microsoft, and chair of the ISO C++ standards committee.


18 thoughts on “Distinguishing between maybe-null vs never-null is the important thing”

paveld500 says:

2017-08-08 at 8:18 am

I mean, how they work, where I should use them and how?
Pavel says:

2017-08-08 at 8:17 am

I’m sorry, but can somebody offer a deeper explanation on what not_null and maybe_null templates are?
mordachaiwolf says:

2017-03-21 at 9:34 am

Love this response! YES – T* is implicitly, by default, NULLABLE!!!!
Breaking that is INSANE.
Having a better type-system where you can say not_null is a step forward, without the foolish baggage of changing everything and sundry.
Daniele says:

2017-03-17 at 12:05 am

Let’s hope to see it in C++20 then!
GregM says:

2017-03-16 at 5:33 pm

Lots of people want it, but it has taken time to get it right, just like concepts, ranges, modules.
Daniele says:

2017-03-16 at 2:31 am

Thank you for your quick replies guys. I am really curious to read the trip report.

The new C++17 features seem really awesome. I would have liked to see enum iteration included as well, but it seems I’ll have to wait. Apparently not many people need this feature, but I think that, once available, it will enable many powerful new idioms.
Herb Sutter says:

2017-03-15 at 8:29 pm

@Daniele: Yes, compile-time reflection will enable that, and I’ll have a paper related to that this June as well for the next meeting.

@GregM: Right you are. As soon as I get caught up I’ll write that trip report (I may wait until the mailing is posted so I can link to papers, or maybe I’ll get it done sooner…).
GregM says:

2017-03-15 at 7:29 pm

Daniele, compile time reflection allows that.

Nov 2016 status from this blog:
the Reflection study group reviewed the latest merged static reflection proposal and found it ready to enter the main Evolution groups at our next meeting to start considering the unified static reflection proposal for a TS or for the next standard.

There was another meeting two weeks ago, but I haven’t seen any trip reports yet.
Daniele says:

2017-03-15 at 9:48 am

P.S. I forgot to say that I am talking about compile-time iteration.
Daniele says:

2017-03-15 at 9:19 am

Hi there,

this is totally offtopic, so apologies but I didn’t know how to ask you this question.
Do you know why, even in C++14, you guys have not provided a way to iterate over enum classes? Is there any hope of seeing it in C++17?

It is certainly possible to iterate over them if the enumerators are contiguous, but if they are not and you do not want to write the code manually (error-prone because of repetition) or use macros, it is (as far as I know) practically impossible to do it. This is because of the standard: converting from an int to an out of range enum value results in undefined behavior.
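
For reference, the manual approach described above might look like the following sketch. The enum Color and the table kAllColors are hypothetical names for this example. The table has to be kept in sync with the enum by hand, which is the repetition in question, but it avoids casting arbitrary integers to the enum type.

```cpp
#include <array>
#include <cstddef>

// Hypothetical enum with non-contiguous values.
enum class Color { Red = 1, Green = 4, Blue = 9 };

// Manually maintained list of enumerators; must be kept in sync with Color.
constexpr std::array<Color, 3> kAllColors = {Color::Red, Color::Green, Color::Blue};

// Sums the underlying values by iterating the table, not by casting integers.
int sum_of_values() {
    int total = 0;
    for (Color c : kAllColors) {
        total += static_cast<int>(c);
    }
    return total;
}
```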

By the way, what you guys are doing with the core guidelines rocks!

Thank you,
Daniele
Ivan says:

2017-02-16 at 2:04 am

valid_ptr is a better name because dereferencable_ptr is too long. Same for assigned_ptr. value_ptr is unclear, IMAO. If you have a better name I am all ears. Note that Code Complete advises against using negation in variable names; I think the same holds for types.
David Collier says:

2017-02-16 at 1:37 am

Non-null types would be great, but the problem is that in the current language, they can only be partly supported. You can implement a non-null raw pointer, but not a non-null unique_ptr. A non-null unique_ptr is the type you want to return from make_unique, but it’s useless if you can’t move from it. To make this work properly we need proper (“destructive”) move semantics which don’t require a moved-from object to be left in a valid state.
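
The moved-from state referred to above can be seen in a small sketch. The function name is hypothetical. It relies only on the standard guarantee that a std::unique_ptr is null after ownership has been moved out of it.

```cpp
#include <memory>
#include <utility>

// Returns true if the source unique_ptr is null after being moved from
// and the target now owns the original object.
bool moved_from_is_null() {
    std::unique_ptr<int> source = std::make_unique<int>(42);
    std::unique_ptr<int> target = std::move(source);  // ownership transfers
    return source == nullptr && target != nullptr && *target == 42;
}
```

A never-null owning pointer would have no such null state to fall back to after a move, which is the difficulty this comment describes.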
Vishal Oza says:

2017-02-15 at 3:16 pm
I agree with this post. I think nullptr is a necessary evil and the default should be nullable for all pointers. not_null should only be used when you know that the object the pointer points to has been created. This is one of the reasons I hate the this keyword: I think that this should be a reference and not a pointer. I also think checking against nullptr is a way to get optional arguments rather than using std::optional.
paercebal says:

2017-02-15 at 1:51 pm

@Ivan: “not_null” is so clear it shines (despite the negation), whereas valid_ptr is confusing. For example:

void foo()
{
    T * p0 = nullptr ;
    T * p1 ;
}

In the code above, p0 is a valid pointer that holds the null value. You can’t dereference it, but you know it (because it’s null).

Whereas p1’s value is… indeterminate. This means you have garbage that may or may not point to something, if you’re lucky. That pointer is invalid.

This works the same way for iterators. Consider a vector of integers:

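#include <cassert>
#include <vector>

void foo(std::vector<int> v)
{
    assert(! v.empty()) ;
    auto itBegin = v.begin() ;    // valid and dereferenceable
    auto itEnd = v.end() ;        // valid, but not dereferenceable

    v.resize(v.capacity() + 10) ; // new size exceeds capacity: reallocation
    // itBegin and itEnd are now invalid
}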

Before the resize, itBegin and itEnd are both valid iterators. The first is dereferenceable (you can retrieve the “pointed” value), while the second is not.

After the resize, we KNOW both itBegin and itEnd are now invalid (because the vector reallocated the underlying memory). If you try to dereference either of them, you’ll get undefined behavior… exactly like the uninitialized pointer p1 above.
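
One way to stay on the safe side here, sketched under the assumption that the caller only needs the first element, is to obtain a fresh iterator after the resize instead of reusing one taken before it. The function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Grows the vector past its capacity, then reads the first element through
// a freshly obtained iterator rather than one taken before the resize.
int first_after_growth(std::vector<int>& v) {
    const std::size_t old_capacity = v.capacity();
    v.resize(old_capacity + 10);   // must reallocate: new size > old capacity
    auto itBegin = v.begin();      // re-acquired after the reallocation
    return *itBegin;
}
```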

Truth is, you have two notions:
– the pointer is valid (null or valid pointed object) or invalid (indeterminate)
– the pointed object is valid (i.e. the pointer is non-null and points to a valid pointed object).

Now, we could use valid_pointed (which is quite ugly)… but what does that mean in the following case?

void foo(valid_pointed p)
{
    // I can’t dereference p, so the valid_pointed is a bit strange…
    // and what about pointers that have been reinterpret_cast-ed?
}

The fact your address points to something valid doesn’t mean that the current type of the pointer is the right one to use that memory.

This is why valid_ptr (or even valid_pointed) is unclear, IMHO.
Iama Hummingbird says:

2017-02-15 at 10:32 am

WOW!!! SO I DON’T HAVE TO BUY ANOTHER BOOK MADE FOR A HIGH SCHOOL KID!!!
Yuriy Grishin says:

2017-02-15 at 10:06 am

valid_ptr is worse: valid how exactly? Why is nullptr invalid? It is valid from some perspective. not_null is good enough, I think.

If you’re asking for a special thing to do, it very well might be ugly. Same story with casts here I guess.
Ivan says:

2017-02-15 at 9:45 am

wrt naming… not_null is ugly as hell,
I would prefer valid_ptr or something like that…