[Edit: I really like the ‘range of values’ several commenters proposed. We do need something like that in the standard library, and it may well come in with ranges, but as you can see there are several simple ways to roll your own in the meantime, and some third-party libraries have similar features already.]
Today a reader asked the following question:
So I’ve been reading all I can about c++11/c++14 and beyond when time permits. I like auto, I really do, I believe in it. I have a small problem I’m trying to decide what to do about. So in old legacy code we have things like this:
for (int i = 0; i < someObject.size(); i++) { … }
For some object types size might be unsigned, size_t, int, int64_t etc…
Is there a proper way to handle this generically with auto? The best I could come up with is:
auto mySize = someObject.size();
for (auto i = decltype(mySize){0}; i < mySize; i++) { … }
But I feel dirty for doing it because although it actually is very concise, it’s not newbie friendly to my surrounding coworkers who aren’t as motivated to be on the bleeding edge.
Good question.
First, of course, I’m sure you know the best choice is to use range-for where that’s natural, or failing that consider iterators and begin()/end() where auto naturally gets the iterator types right. Having said that, sometimes you do need an index variable (including for performance) so I’ll assume you’re in that case.
So here’s my off-the-cuff answer:
- If this isn’t in a template, then I think for( auto i = 0; etc. is fine, and if you get a warning about signed/unsigned mismatch just write for(auto i = 0u; etc. This is all I’ve usually needed to do.
- If this is truly generic code in a template, then I suppose for( auto i = 0*mySize; etc. isn’t too bad – it gets the type and it’s not terribly ugly. Disclaimer: I’ve never written this, personally, as I haven’t had a need (yet). And I definitely don’t know that I like it… just throwing it out as an idea.
But that’s an off-the-cuff answer. Dear readers, if you know other/better answers please tell us in the comments.
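As a rough sketch of the two options above (the function names here are made up for illustration and are not from the original question), the non-template case can use an unsigned literal, and the generic case can derive the index type from the size:

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Non-template code: the container type is known, so an unsigned literal
// is enough to avoid a signed/unsigned comparison warning.
inline unsigned count_positive(const std::vector<int>& v) {
    unsigned count = 0;
    for (auto i = 0u; i < v.size(); ++i) {
        if (v[i] > 0) ++count;
    }
    return count;
}

// Generic code: 0 * mySize has the type of mySize, provided that type is
// at least as wide as int (true for std::size_t on typical platforms).
template <typename Container>
long long sum_all(const Container& c) {
    auto mySize = c.size();
    long long total = 0;
    for (auto i = 0 * mySize; i < mySize; ++i) {
        static_assert(std::is_same<decltype(i), decltype(mySize)>::value,
                      "index type should match the size type");
        total += c[i];
    }
    return total;
}
```

The static_assert is only there to make the assumption visible; it would fire for a container whose size type is narrower than int.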
Adam, even better than both in that it doesn’t introduce variables into the outer scope:
That is, unless the loop body does something that can change the size of object, in which case you do need your second form, and you have to live with any extra size() calls.
Hi all, I’m not sure that putting object.size() (or any other function call) in the for statement is a good solution. I think the
it’s better than
because we don’t call object.size() function in every iteration (in second example, it’s called twice)
Should be able to use an STL algorithm if possible?
I want the nifty way python does it
Just need to be able to define variables in tie.
I would actually like to propose a breaking change; I think
should not compile. The reason is, 0 is a valid constant for all signed integral types, all unsigned integral types, all floating-point types and all pointer types. So the type really can’t be deduced. This could lead to code like
working, since i don’t participate in deduction, and the rest agree
I am going to go against the tide here and say that I *don’t* want i to become size_t.
For a counting for loop, especially one that goes from 0, there is never any reason to use an unsigned value, and since I try to use signed integers in most other places, this can lead to having to cast the index back to signed inside the loop.
So a nice benefit of creating an iterator factory function to use in a for-each loop is that it can actually convert the argument to a signed 32 or 64 bit integer (and for correctness sake throw an exception if the argument is too large to fit).
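A minimal sketch of such a factory, assuming std::ptrdiff_t as the signed index type; the names signed_index_range, signed_indices and sum_of_indices are hypothetical:

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: a half-open range [0, count) of signed indices.
class signed_index_range {
public:
    class iterator {
    public:
        explicit iterator(std::ptrdiff_t value) : value_(value) {}
        std::ptrdiff_t operator*() const { return value_; }
        iterator& operator++() { ++value_; return *this; }
        bool operator!=(const iterator& other) const { return value_ != other.value_; }
    private:
        std::ptrdiff_t value_;
    };

    explicit signed_index_range(std::ptrdiff_t count) : count_(count) {}
    iterator begin() const { return iterator(0); }
    iterator end() const { return iterator(count_); }
private:
    std::ptrdiff_t count_;
};

// Factory: converts an unsigned size to a signed count and throws
// if the value does not fit in the signed type.
inline signed_index_range signed_indices(std::size_t n) {
    const auto max = static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max());
    if (n > max) throw std::overflow_error("size does not fit in a signed index");
    return signed_index_range(static_cast<std::ptrdiff_t>(n));
}

// Example use: every index seen by the loop body is a std::ptrdiff_t.
inline std::ptrdiff_t sum_of_indices(const std::vector<int>& v) {
    std::ptrdiff_t total = 0;
    for (auto i : signed_indices(v.size())) total += i;
    return total;
}
```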
This would be trivial to do with proposal https://groups.google.com/a/isocpp.org/d/msg/std-proposals/oaEAiyoreV8/ELm56JGRSWIJ :)
A lot of the answers here seem to me to be needlessly complex.
If our concern is readability and newbie-friendliness, then all you need to do is set the tricky-ish definition on its own, so it’s clear:
I don’t see this as hurting our pro-auto-ness; decltype isn’t any less automatic or flexible than auto is. Our type is *still* deduced at compile-time according to the values it depends on, rather than being hard-coded. But you could also use the explicitly typed initializer idiom:
I think this is maybe a little less readable for novices, but the point is that the only conceivably-confusing part is on its own, doing *only* the job of declaration. That makes it easy to understand – you don’t need to understand the whole loop; and the loop (in and of itself) is still perfectly clear. All that’s happening is that the c++11-newbie says “Huh, a variable is being declared here and I don’t really understand its type or why it’s so complicated”; and then they look it up and have a clear, obvious answer.
(This is also why I left “i=0” in, even though I could rely on initialization at declaration – I want the loop to be crystal-clear, even if the type of the index variable isn’t.)
yeah auto is better make it simple !
That works. You could also multiply with 0:
for(auto mySize = someObject.size(), i = mySize * 0; i < mySize; i++) …
I wonder whether this is acceptable?
for(auto mySize = someObject.size(), i = mySize - mySize; i < mySize; i++) …
Thanks for catching that, Joe. The above versions indeed don’t work (they segfault). But the following works:
p sarkar, your code will result in an endless loop, at least for standard containers like vectors and strings. std::size_t is unsigned, so it is never less than 0.
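For reference, one common way to count down with an unsigned index without running into that problem is to test before decrementing. This is only a sketch, and reversed_copy is a made-up name:

```cpp
#include <cstddef>
#include <vector>

// Visit elements from back to front with an unsigned index.
// The condition "i-- > 0" compares the old value and then decrements,
// so the body never runs with a wrapped-around index.
inline std::vector<int> reversed_copy(const std::vector<int>& v) {
    std::vector<int> out;
    out.reserve(v.size());
    for (auto i = v.size(); i-- > 0; ) {
        out.push_back(v[i]);
    }
    return out;
}
```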
Would love range based solutions. If order of evaluation does not matter, one way to side-step the issue is:
Somehow my link didn’t make it into my reply, another shot:
https://gist.github.com/DieHertz/f83b33ffe33e1c07abfc
I use such code for the primary reason that we compile with -Wall, and signed/unsigned comparisons won’t let me get away with auto i = 0 :-)
Besides, it looks much more attractive, and compiles into optimal code just like hand-written loop.
Ranges is a good idea in general, but they wouldn’t resolve this issue, they would just provide a superficially different syntax.
The issue here is the type conversion, and it’s int 0 that should be converted to the size_type, not the other argument to int. And unless we decide on some construct that always casts the first argument to the second argument type (might not be a great idea) we have to specify the type explicitly. Thus, the obvious solutions are still the most readable and correct ones:
And if size_type is something frequently found in the code, then creating a user-defined literal will make it shorter, and easier to read:
Also, if 0 is frequently converted to size_type, it makes sense to define a constant of that type:
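A sketch of what a user-defined literal and a typed constant could look like, assuming the size_type in question is std::size_t; the suffix _z and the name zero_index are invented for illustration:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical user-defined literal: 0_z has type std::size_t.
constexpr std::size_t operator""_z(unsigned long long value) {
    return static_cast<std::size_t>(value);
}

// Hypothetical named constant of the same type.
constexpr std::size_t zero_index = 0;

inline int sum_with_literal(const std::vector<int>& v) {
    int total = 0;
    for (auto i = 0_z; i < v.size(); ++i) total += v[i];
    return total;
}

inline int sum_with_constant(const std::vector<int>& v) {
    int total = 0;
    // auto drops the const, so i is a plain, mutable std::size_t.
    for (auto i = zero_index; i < v.size(); ++i) total += v[i];
    return total;
}
```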
How about
One thing worth noting also is that the for-range-loop construct does not require real iterators (i.e., it does not check any iterator traits, and only uses a very small part of the public signature of a full-fledged iterator).
For a type to work with the for range loop, you only need to have a
and
returning an object supporting the following members:
So it is really easy to extend existing components to support a for-range-loop construct.
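To make that concrete, here is a small sketch of a type that works in a range-for loop while providing only begin(), end(), and a cursor with operator!=, operator* and prefix operator++. The countdown type is hypothetical and assumes start is not negative:

```cpp
#include <vector>

// Hypothetical example type: yields start, start - 1, ..., 1.
struct countdown {
    // Not a full iterator: no traits, no typedefs, just the three
    // operations the range-for loop actually uses.
    struct cursor {
        int value;
        int operator*() const { return value; }
        cursor& operator++() { --value; return *this; }
        bool operator!=(const cursor& other) const { return value != other.value; }
    };

    int start;  // assumed to be >= 0
    cursor begin() const { return cursor{start}; }
    cursor end() const { return cursor{0}; }
};

// Collects the values the range-for loop visits.
inline std::vector<int> collect(const countdown& c) {
    std::vector<int> out;
    for (int n : c) out.push_back(n);
    return out;
}
```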
It might be overkill, but one option is to define an inline function that returns a 0 of the correct type for the container.
Then user code could look like
A possible alternative name for this function is firstIndex.
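A minimal sketch of that idea, using the firstIndex name and assuming the container exposes a nested size_type as the standard containers do; sum_elements is a made-up caller:

```cpp
#include <cstddef>
#include <vector>

// Returns a zero of the container's own index type.
// Assumes Container::size_type exists, as it does for standard containers.
template <typename Container>
typename Container::size_type firstIndex(const Container&) {
    return 0;
}

// Example caller: the index type follows the container automatically.
template <typename Container>
int sum_elements(const Container& c) {
    int total = 0;
    for (auto i = firstIndex(c); i < c.size(); ++i) total += c[i];
    return total;
}
```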
{auto i = 0; for(auto e: container) {
...
++i;
}}
I usually just do this:
It’s not perfect, but the majority of the time my array will never have anything close to 2 billion elements. Having a habit of using signed ints also prevents writing infinite loops when you traverse the array from back to front using indices. Using ssize_t could be another option if you’re concerned about sizeof(int).
I like to use ints everywhere unless a more specific integral type is needed. If you make your loop variable unsigned then when you use it within the loop you’re likely to be doing math and comparisons between signed and unsigned.
Fancy ranges and meta programming are nice but a lot of times we just have 2 arrays and want to iterate over them both with a simple index. No reason to over think things.
Wow some really scary answers! If not a generic template, then leave as is, if a template then rewrite using the appropriate iteration style. Life can be simple sometimes…..even in c++!
Although this isn’t much different than other suggestions, I would suggest something like:
where `indices_of` is a function which returns a suitable range object. For example:
One nice thing about this approach is that `indices_of` could even be generalized to work with `std::map` or other containers whose indices aren’t integers, or whose indices don’t start at zero.
I have seen many suggestions for ranges and I like them. However, I would like to be able to quickly construct ranges for certain data structures:
I personally do not know if this is feasible; I am just looking at what the cleanest API would be. In general, this .range() function should only be available for data structures with a simple integer domain for keys.
Sorry, one function was missing:
I also like the ‘range of values’ approach because it enforces code locality as the iteration boundaries are tied together and it is also very succinct.
As for the “null of a certain index type” problem, we have some handy functions in our code base that merely call the default constructor (if any) and are basically syntactic sugar for the decltype(mySize){} approach, but clearly state the intent of the caller:
to be used like this:
To be honest, I just use
We can talk all we want about ranges. Sure, they would be nice, but we don’t have them yet, and especially for newbies, are we going to recommend a 3rd party library or writing our own solution for something so trivial? Just use size_t. It’s what the standard library uses, and so good practice IMO should be to use size_t (or a lesser unsigned type) for the size for your own classes.
It works seamlessly with the standard library and it should work with most 3rd party libraries and/or your own classes without signed/unsigned warnings and/or narrowing warnings.
It’s actually a very interesting topic. I see a lot of unsigned/signed mismatches from static analysis tools due to these situations.
I currently favour:
I hope in the near future to be able to use non-member std::size() which is more generic as it works with built-in arrays too:
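As a sketch of how the non-member form might be used (this assumes a C++17 compiler, where std::size is available from <iterator>; sum_sequence is a made-up name and not the commenter's code):

```cpp
#include <cstddef>
#include <iterator>  // std::size (C++17)
#include <vector>

// Works for standard containers and for built-in arrays alike,
// because std::size has an overload for each.
template <typename Sequence>
long long sum_sequence(const Sequence& seq) {
    long long total = 0;
    const auto n = std::size(seq);
    for (decltype(std::size(seq)) i = 0; i < n; ++i) total += seq[i];
    return total;
}
```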
The fact that some programmers may not be familiar with a language feature is not a reason to avoid it, particularly if it makes the code more correct! decltype is fine here, and you can write it even more neatly than the original:
for (decltype(someObject.size()) i = 0; i < someObject.size(); i++) { … }
All these answers suggesting ranges are great, but even farther afield and less searchable than decltype.
In N4254 I proposed the “z” suffix for size_t literals:
This allows code like this:
I wrote an interval arithmetic library (https://github.com/elliotgoodrich/EZInterval) that allows iterating over intervals. The ez::make_interval variable has overloaded operator[] and operator() to let you choose whether the interval is open or closed.
This works with any other type that acts like a numeric type, e.g., pointers and iterators.
I’ve made a change and will commit later so that the type of i in the examples above will be the std::common_type of the lower and upper bound variables so ez::make_interval[0](size) will give you back a type of std::size_t when you iterate over it.
(please excuse the spam, there’s no preview for comments…)
The 0*-trick only works if decltype(std::declval<int>() * std::declval<decltype(someObject.size())>()) is the same as decltype(someObject.size()). This is the case for int, uint, size_t and ssize_t, but not for “smaller” types, such as short or uchar, because they’re promoted to int for any arithmetic. That also rules out xor and subtracting the size from itself to create a 0 value.
In generic code, what’s wrong with plain old (for a certain definition of “old”)
?
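The promotion caveat for the 0* trick raised a few lines up can be checked at compile time. The following is only a sketch: zero_like is a hypothetical helper, and the first assertion assumes a typical platform where std::size_t is at least as wide as int:

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// 0 * size keeps the size type when that type is at least as wide as int
// (assumed here for std::size_t).
static_assert(std::is_same<decltype(0 * std::declval<std::size_t>()), std::size_t>::value,
              "0 * size_t stays size_t");

// A narrower type such as short is promoted to int before the multiplication,
// so the deduced index type would no longer match the size type.
static_assert(std::is_same<decltype(0 * std::declval<short>()), int>::value,
              "0 * short becomes int");

// Value-initializing the size type involves no arithmetic, so nothing is promoted.
template <typename Size>
constexpr Size zero_like(Size) {
    return Size{};
}

static_assert(std::is_same<decltype(zero_like(std::declval<short>())), short>::value,
              "zero_like preserves short");
```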
Similar to Herb’s solution, but without a multiplication in sight: for( auto i = mySize - mySize; …
As was mentioned earlier, irange(end), irange(begin, end) and irange(begin, end, step) range wrappers (returning iterators) are simple to write yourself. From what I’ve checked, even with Visual Studio (which has often failed to optimize enough), the performance (and the assembly, for anyone wondering) is the same as for a regular for loop, but you get terse syntax.
I proposed the “z” literal suffix for size_t variables (see N4254 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4254.html) so that one can write “auto s = 0z; // s has type size_t”
I sometimes do reverse loops for this reason:
Guys, what about performance of
compared to the simpler:
?
Is the compiler still able to optimize/vectorize? I’m asking because I don’t know and I’m sure some of you do.
Thanks!
Marco
Here’s an idea off the top of my head:
How about something like this? It’s a back-of-the-envelope solution, and I haven’t given more than a few seconds to the naming of it all, but it works for the obvious cases (I have not done anything that remotely looks like real testing, and it’s a bit late here):
@Juan Carlos Arevalo Baeza
Using range-v3 (and a range compatible range-for-loop) your idea looks like:
for (auto i : ranges::view::iota(0, size-1))
I guess we can also get indexes and values together:
for (auto i_v : ranges::view::zip(ranges::view::iota(0), v))
std::cout << "Index: " << std::get<0>(i_v) << " Value: " << std::get<1>(i_v) << '\n';
‘Generic’ code is good not just for templates, but also for portable code, where different platforms may define things differently. So in non-template code you might need `0` to avoid warnings on one platform and `0u` to avoid them on another.
Needing indexes and values seems common enough that perhaps there should be a special syntax for it.
Failing that, perhaps a standard algorithm: `for_each(Range &&r, Functor &&f)` where the functor can optionally accept both an index and the elements.
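A rough sketch of such an algorithm follows. The names for_each_indexed and label_all are hypothetical, and for simplicity this version always passes both the index and the element rather than making the index optional:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical algorithm: calls f(index, element) for each element of r,
// counting the index as a std::size_t starting from zero.
template <typename Range, typename Functor>
Functor for_each_indexed(Range&& r, Functor f) {
    std::size_t index = 0;
    for (auto&& element : r) {
        f(index, element);
        ++index;
    }
    return f;
}

// Example use: pair each name with its position.
inline std::vector<std::string> label_all(const std::vector<std::string>& names) {
    std::vector<std::string> out;
    for_each_indexed(names, [&out](std::size_t i, const std::string& name) {
        out.push_back(std::to_string(i) + ":" + name);
    });
    return out;
}
```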
And one maybe-improvement over the options presented here: `for (auto size = v.size(), i = 0*size; i < size; ++i) {`
Is indeed the way to go; aside from that though, we still need a (preferably core-language) integer-literal for std::size_t (also for the std::[u]int*_t-typedefs). It is really very sad that there is no simpler way of generating a zero of those types than writing out something like std::size_t{}. std::size_t is certainly one of the most often needed integer-types, but people still access containers with ints because they are so much more convenient to type. This is really a terrible situation.
Great points on the index-range examples. We do need an iterable ‘range of values’ in the standard library, but you can roll your own in the several ways suggested in the meantime.
@Herb, I was wondering if you agree with these statements:
1) “Using an unsigned instead of an int to gain one more bit to represent positive integers is almost never a good idea.” (Stroustrup)
2) Implicit conversion rules make unsigned types into bug attractors (http://stackoverflow.com/a/10168569/297451)
Eg: size_t x = 0; for(size_t i=10; i>=x; --i) {}
If so, is it reasonable to cast away the unsigned-ness with a static_cast or boost::numeric_cast?
I think a simple counting range would get the best of the two worlds:
(assuming the compiler can optimize this as efficiently as plain for loop)
Implementation of such a range is trivial and left as an exercise for a reader :)
How about:
I believe range-for is still the solution here. I’ll use boost::irange here, but you can roll your own simple wrapper for this quite easily:
This lets the usual integer type promotion machinery of C++ do its thing (which, depending how you see it, can be a good or bad thing—but it’s predictable at least).
The obvious ideal (IMHO) is to allow range-for with ranges of values instead of iterators. I don’t know if the ranges proposals or discussions are contemplating this. So for instance, this would be something like:
Names are, of course, up for grabs. This is implementable today. range_from_to “just” needs to return something that resembles input iterators (that implements the input iterator interface) when begin() and end() are called. The devil being potentially in the details (in the implementation of the end() sentinels and comparison operators, really).
With this in the standard library, classic for-loops should be relegated to one-time situations where we don’t have a range readily available, as it’d be more cumbersome to implement the range than to just use the raw loop.
JCAB
I sometimes use
.