C++ - rint not present in Visual Studio 2010 math.h and equivalent of CUDA rint
I'm porting CUDA code to C++, using Visual Studio 2010. The CUDA code uses the rint function, which does not seem to be present in the Visual Studio 2010 math.h, so it seems I need to implement it myself.
According to this link, the CUDA rint function rounds x to the nearest integer value in floating-point format, with halfway cases rounded towards zero.
I think I could use the cast to int, which discards the fractional part, effectively rounding towards zero, so I ended up with the following function
    inline double rint(double x)
    {
        int temp;
        temp = (x >= 0. ? (int)(x + 0.5) : (int)(x - 0.5));
        return (double)temp;
    }
which has two different casts, one to int and one to double.
I have three questions:
- Is the above function equivalent to CUDA rint for "small" numbers?
- Will it fail for "large" numbers that cannot be represented as an int?
- Is there a more computationally efficient way (rather than using two casts) of defining rint?
Thank you in advance.
The cited description of rint() in the CUDA documentation is incorrect. Roundings to integer with a floating-point result map to the IEEE-754 (2008) specified rounding modes as follows:
    trunc()   // round towards zero
    floor()   // round down (towards negative infinity)
    ceil()    // round up (towards positive infinity)
    rint()    // round to nearest or even (i.e. ties are rounded to even)
    round()   // round to nearest, ties away from zero
Generally, these functions work as described in the C99 standard. For rint(), the standard specifies that the function rounds according to the current rounding mode (which defaults to round to nearest or even). Since CUDA does not support dynamic rounding modes, all functions that are defined to use the current rounding mode use the rounding mode "round to nearest or even". Here are some examples showing the difference between round() and rint():
    argument   rint()   round()
    1.5        2.0      2.0
    2.5        2.0      3.0
    3.5        4.0      4.0
    4.5        4.0      5.0
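The table can be reproduced with a small sketch like the one below. It assumes a C++11 toolchain whose <cmath> provides std::rint and std::round (so not MSVC 2010 itself) and the default round-to-nearest-even mode; the RoundingRow struct and the rounding_row helper are hypothetical names used only for illustration.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical helper type: one row of the comparison table.
struct RoundingRow {
    double arg;
    double rint_result;   // ties to even (under the default rounding mode)
    double round_result;  // ties away from zero
};

// Compute both roundings for one argument.
RoundingRow rounding_row(double x) {
    RoundingRow row = { x, std::rint(x), std::round(x) };
    return row;
}

// Print the rows for the halfway cases discussed in the answer.
void print_rounding_table() {
    const double args[] = { 1.5, 2.5, 3.5, 4.5 };
    std::printf("argument  rint()  round()\n");
    for (double a : args) {
        const RoundingRow r = rounding_row(a);
        std::printf("%8.1f  %6.1f  %7.1f\n", r.arg, r.rint_result, r.round_result);
    }
}
```

The two functions only disagree on exact halfway cases whose lower neighbour is even, such as 2.5 and 4.5.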
round() can be emulated fairly easily along the lines of the code you posted, but I am not aware of a simple emulation for rint(). Please note that you would not want to use an intermediate cast to integer, as 'int' supports a narrower numeric range than the integers that are exactly representable by a 'double'. Instead use trunc(), ceil(), floor() as appropriate.
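As a rough illustration of emulating round() without an intermediate integer cast, here is a minimal sketch. It assumes std::trunc and std::copysign from a C++11 <cmath> are available (they are not in MSVC 2010, where floor() and ceil() selected by sign could stand in for trunc()); the name round_half_away is hypothetical.

```cpp
#include <cmath>

// Sketch: round to nearest, ties away from zero, staying in double throughout
// so that values outside the range of 'int' are handled.
double round_half_away(double x) {
    double t = std::trunc(x);            // integral part, rounded toward zero
    // x - t is the fractional part; it is computed exactly in double.
    if (std::fabs(x - t) >= 0.5) {
        t += std::copysign(1.0, x);      // step one unit away from zero
    }
    return t;                            // NaN and infinities pass through
}
```

Comparing the fractional part against 0.5 avoids computing x + 0.5, which can itself round up for inputs just below 0.5.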
Since rint() is part of both the current C and C++ standards, I am a bit surprised that MSVC does not include this function; I would suggest checking MSDN to see whether a substitute is offered. If your platforms are SSE4 capable, you could use the SSE intrinsics _mm_round_sd() and _mm_round_pd(), defined in smmintrin.h, with the rounding mode set to _MM_FROUND_TO_NEAREST_INT, to implement the functionality of CUDA's rint().
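A minimal sketch of the intrinsic-based approach might look as follows. The function name rint_sse41 and the HAVE_SSE41_ROUND macro are hypothetical, and the preprocessor guard is an assumption: it selects the intrinsic path on x86/x64 MSVC or when the compiler advertises SSE4.1, and otherwise falls back to std::nearbyint (which needs a C++11 <cmath>). It does not check at run time that the CPU actually supports SSE4.1.

```cpp
#include <cmath>

#if defined(__SSE4_1__) || (defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64)))
#include <smmintrin.h>   // SSE4.1: _mm_round_sd
#define HAVE_SSE41_ROUND 1
#endif

// Sketch: round to nearest, ties to even, independent of the current
// floating-point rounding mode when the SSE4.1 path is taken.
double rint_sse41(double x) {
#ifdef HAVE_SSE41_ROUND
    __m128d v = _mm_set_sd(x);   // x in the low lane, 0.0 in the high lane
    v = _mm_round_sd(v, v, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
    return _mm_cvtsd_f64(v);     // extract the low lane
#else
    // Fallback (assumption): relies on the default round-to-nearest-even mode.
    return std::nearbyint(x);
#endif
}
```

With GCC or Clang the intrinsic path is only compiled when SSE4.1 is enabled, for example via -msse4.1.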
While (in my experience) the SSE intrinsics are portable across Windows, Linux, and Mac OS X, you may want to avoid hardware-specific code. In that case, you could try the following code (lightly tested):
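    #include <math.h>    /* fabs */
    #include <float.h>   /* _copysign (MSVC-specific) */

    double my_rint(double a)
    {
        const double two_to_52 = 4.5035996273704960e+15;  /* 2**52 */
        double fa = fabs(a);
        double r = two_to_52 + fa;    /* rounds fa at the integer unit bit */
        if (fa >= two_to_52) {
            r = a;                    /* already an integer */
        } else {
            r = r - two_to_52;
            r = _copysign(r, a);      /* restore the sign of the argument */
        }
        return r;
    }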
Note that MSVC 2010 seems to lack the standard copysign() function as well, so I had to substitute _copysign(). The above code assumes that the current rounding mode is round-to-nearest-even (which it is by default). Adding 2**52 makes sure that rounding occurs at the integer unit bit. Note that this also assumes that pure double-precision computation is performed. On platforms that use higher precision for intermediate results, one might need to declare 'fa' and 'r' as volatile.
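For compilers that do provide the standard function, the same idea can be sketched with std::copysign and the volatile qualifiers mentioned above. This is an untested-in-production variant, not the code from the answer: it assumes a C++11 <cmath>, the default round-to-nearest-even mode, and IEEE-754 doubles; the name rint_magic is hypothetical.

```cpp
#include <cmath>

// Sketch: round to nearest, ties to even, by adding and subtracting 2**52.
// 'volatile' forces each intermediate to be stored as a true double, which
// guards against extended-precision temporaries and against the compiler
// simplifying (c + fa) - c back to fa.
double rint_magic(double a) {
    const double two_to_52 = 4503599627370496.0;  // 2**52
    volatile double fa = std::fabs(a);
    if (!(fa < two_to_52)) {
        return a;            // already integral, infinite, or NaN
    }
    volatile double r = two_to_52 + fa;  // rounding happens in this addition
    r = r - two_to_52;                   // exact subtraction
    return std::copysign(r, a);          // restore the sign (keeps -0.0)
}
```

At magnitudes of 2**52 and above, adjacent doubles are at least 1 apart, so the sum has no room for fractional bits and the hardware rounding does the work.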