templates - Volatile specifier ignored in C++
I'm pretty new to C++ and ran across some info on what it means for a variable to be volatile. As far as I understood, it means that a read or write to the variable can never be optimized out of existence.

However, a weird situation arises when I declare a volatile variable that isn't 1, 2, 4, or 8 bytes large: the compiler (GNU, with C++11 enabled) seemingly ignores the volatile specifier. The code kinda looks like this:

    #include <ctime>
    #include <iostream>
    using namespace std;

    #define expand1 a, a, a, a, a, a, a, a, a, a
    #define expand2 // ten expand1 here, expand3 to expand5 follow
    // expand5 is the equivalent of 1e+005 a, a, ....

    struct threebytes { char x, y, z; };
    struct fourbytes { char w, x, y, z; };

    template<typename T>
    void foo(); // declared here so main can call it; defined below

    int main()
    {
        // reads int 1e+010 times, requires ~1.5 sec
        foo<int>();
        // reads threebytes 1e+010 times, doesn't take any time, which means it didn't read
        foo<threebytes>();
        // reads fourbytes 1e+010 times, requires ~1.5 sec
        foo<fourbytes>();
    }

    template<typename T>
    void foo()
    {
        volatile T a;
        // time an empty loop
        // for my settings, this does indeed take time and isn't optimized out
        clock_t start = clock();
        for (int i = 0; i < 100000; i++);
        clock_t end = clock();
        int interval = end - start;
        // run expand5 1e+005 times
        start = clock();
        for (int i = 0; i < 100000; i++) expand5;
        end = clock();
        // compare the time difference and print the result
        cout << end - start - interval << endl;
    }

The result is that the int version gives me ~1.5 seconds while the threebytes version gives 0. I've tested different variables (user-defined or not) from 1 to 8 bytes, and only the 1, 2, 4, and 8 byte ones take time to run. Is this a bug that exists only on my PC, or is volatile a request to the compiler and not an absolute?

PS: the 4-byte versions take half the time of the others on my PC, which is a source of confusion.

The struct version is probably optimized out because the compiler realizes that there are no side effects (no read from or write to the variable a), regardless of the volatile. You basically have a no-op, a;, so the compiler can do whatever pleases it; it is not forced to unroll the loop or to optimize it out, so the volatile doesn't really matter here. In the case of ints, there seem to be no optimizations, and this is consistent with the use case of volatile: you should expect non-optimizations only when you have a possible "access to the object" (i.e. a read or write) in the loop. However, what constitutes an "access to the object" is implementation-defined (although most of the time it follows common sense); see Edit 3 at the bottom.
A toy example:

    #include <chrono>
    #include <cstddef>
    #include <iostream>

    int main()
    {
        volatile int a = 0;
        const std::size_t n = 100000000;

        // side effects, never optimized
        auto start = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < n; ++i)
            ++a; // side effect (write)
        auto end = std::chrono::steady_clock::now();
        std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()
                  << " ms" << std::endl;

        // no side effects, may or may not be optimized out
        start = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < n; ++i)
            a; // no side effect, this is a no-op
        end = std::chrono::steady_clock::now();
        std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()
                  << " ms" << std::endl;
    }

Edit
The no-op is not optimized out for scalar types, as you can see in this minimal example. For structs, though, it is optimized out. In the example I linked, clang doesn't optimize the code with no optimization, but optimizes both loops with -O3. gcc doesn't optimize out the loops either with no optimizations, but optimizes the first loop with optimizations on.
Edit 2
clang spits out a warning: warning: expression result unused; assign into a variable to force a volatile load [-Wunused-volatile-lvalue]. So my initial guess was correct: the compiler can optimize out no-ops, but it is not forced to. Why it does so for structs and not for scalar types is something I don't understand, but it is the compiler's choice, and it is standard compliant. For some reason it gives this warning only when the no-op is a struct, and doesn't give the warning when it's a scalar type.
Also note that you don't have a "read/write", you only have a no-op, so you shouldn't expect anything from volatile.
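The warning's advice ("assign into a variable to force a volatile load") can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names read_scalar, read_struct and sum_reads are hypothetical and not from the original code, and threebytes mirrors the struct from the question. Note that a volatile struct cannot simply be copied with threebytes t = a;, because the implicit copy constructor takes a const threebytes& and that cannot bind to a volatile object, so the sketch reads the members one by one.

```cpp
#include <cstddef>

struct threebytes { char x, y, z; };

// Scalar case: assigning the volatile object to a local variable is an
// explicit read, which is what the clang warning suggests doing.
inline int read_scalar(const volatile int& v) {
    int tmp = v;
    return tmp;
}

// Struct case (hypothetical helper): copy member by member, since the
// implicit copy constructor cannot bind to a volatile threebytes.
inline threebytes read_struct(const volatile threebytes& v) {
    threebytes t;
    t.x = v.x;
    t.y = v.y;
    t.z = v.z;
    return t;
}

// Reads the struct n times and sums its members; every iteration now
// contains explicit reads of the volatile members instead of a bare `a;`.
inline long sum_reads(const volatile threebytes& v, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        threebytes t = read_struct(v);
        total += t.x + t.y + t.z;
    }
    return total;
}
```

With reads written this way, the loop body is no longer a discarded expression, so the timing for threebytes would be expected to behave like the int case. Since what counts as an access is implementation-defined, it is worth checking the generated code on your own compiler.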
Edit 3
From the golden book (the C++ standard):
7.1.6.1/8 cv-qualifiers [dcl.type.cv]
What constitutes an access to an object that has volatile-qualified type is implementation-defined. ...
So it is up to the compiler to decide when to optimize out the loops. In most cases, it follows common sense: an access is a read from or a write to the object.