# About string to double !?!

strtod("0.9", NULL) gives 0.90000000000000002 !
So, strtod("0.9", NULL) + 0.1 > 1 !!!

How to deal with that ?
I don’t need 17 decimals but I’d prefer an ‘exact’ value.

Thanks,
Alain.

When dealing with floating point numbers, you don’t do exact comparisons
anyway. You always consider the resolution. If you have a resolution of
0.1, then if you want to find if some value is equal to some other value,
you might do something like this:

if( fvalue <= (calcval+0.001) && fvalue >= (calcval-0.001) )
…the value is considered equal
else
…it is not equal

Some might put this in terms of significant digits, but this way of thinking
somewhere, but I don’t know what it is…

Kevin

“Alain Bonnefoy” <alain.bonnefoy@icbt.com> wrote in message
news:3D37C459.5080004@icbt.com

strtod(“0.9”, NULL) give 0.90000000000000002 !
So, strtod(“0.9”, NULL) + 0.1 > 1 !!!

How to deal with that ?
I don’t need 17 decimals but I’d prefer an ‘exact’ value.

Thanks,
Alain.

The lazy and dangerous way is to declare doubles x and y,
then let x = strtod(), then y = x + 0.1, and then compare
y with 1. Common practice among scientific programmers
is to define a “threshold” filter that
limits the meaningful digits after the decimal point …

Weijie

“Alain Bonnefoy” <alain.bonnefoy@icbt.com> wrote in message
news:3D37C459.5080004@icbt.com

strtod(“0.9”, NULL) give 0.90000000000000002 !
So, strtod(“0.9”, NULL) + 0.1 > 1 !!!

How to deal with that ?
I don’t need 17 decimals but I’d prefer an ‘exact’ value.

Thanks,
Alain.

I remember almost 30 years ago writing my first Fortran program. It counted
1 to 10. It used real number but only printed out the integer part of the
real number. My output looked like this:
1
2
3
3
4
5
6
7
8
9
10
I noticed that three was in there twice. My program was only about 4 lines
and there was no room for such a screw up. It turned out that the math lib
we had back then added 3.000000 + 1.000000 and returned 3.999999. But of
course when you just print the integer part it looked like a 3. I remember
asking the teacher, “Are you sure they send astronauts to the moon using
COMPUTERS?”

“Alain Bonnefoy” <alain.bonnefoy@icbt.com> wrote in message
news:3D37C459.5080004@icbt.com

strtod(“0.9”, NULL) give 0.90000000000000002 !
So, strtod(“0.9”, NULL) + 0.1 > 1 !!!

How to deal with that ?
I don’t need 17 decimals but I’d prefer an ‘exact’ value.

Thanks,
Alain.

“Kevin Stallard” <kevin@ffflyingrobots.com> wrote in message
news:ah9ara$krm$1@inn.qnx.com

When dealing with floating point numbers, you don’t do exact comparisons
anyway. You always consider the resolution. If you have a resolution of
0.1, then if you want to find if some value is equal to some other value,
you might do something like this:

if( fvalue <= (calcval+0.001) && fvalue >= (calcval-0.001) )
…the value is considered equal
else
…it is not equal

ANSI needs to define the approximately equal operator (two ~ on top of each
other). Ah, but then we’d all have to run out and buy new keyboards.

That’s not a problem for display, and this is what printf does perfectly.
But I don’t see how to do it internally.
Maybe the comparison to 1, and not 1.0, is confusing in this article, but
in fact the compiler is able to deal with that. Writing 1.0 doesn’t
change anything.
I don’t know how strtod does it, and I even wonder how writing 1.0 in a ‘C’
source could be 1.0 in an executable!

My problem comes from the fact that the values come from a data server
where they are stored in ASCII in the databases.
The clients expect that requested values are good, and not estimated.

Alain.

Weijie Zhang wrote:

The lazy&danger way is first claim double x and y,
then let x = strtod(), then y = x + 0.1, and then compare
y with 1. Common practice in scientific workers’ world
is that you’d better to define a “threshold” filter to
limit meaningful digits after dot …

Weijie

“Alain Bonnefoy” <alain.bonnefoy@icbt.com> wrote in message
news:3D37C459.5080004@icbt.com

strtod(“0.9”, NULL) give 0.90000000000000002 !
So, strtod(“0.9”, NULL) + 0.1 > 1 !!!

How to deal with that ?
I don’t need 17 decimals but I’d prefer an ‘exact’ value.

Thanks,
Alain.

On Tue, 23 Jul 2002 08:23:20 +0200, Alain Bonnefoy <alain.bonnefoy@icbt.com> wrote:


That’s not a problem for display, and this is what printf does perfectly.
But I don’t see how to do it internally.
Maybe the comparison to 1, and not 1.0, is confusing in this article, but
in fact the compiler is able to deal with that. Writing 1.0 doesn’t
change anything.
I don’t know how strtod does it, and I even wonder how writing 1.0 in a ‘C’
source could be 1.0 in an executable!

My problem comes from the fact that the values come from a data server
where they are stored in ASCII in the databases.
The clients expect that requested values are good, and not estimated.

Most floating point values are estimates only of the “real” value.
One way is to work in BCD, but this may not be practical.

e.g. http://www.snippets.org/snippets/portable/portable.php3
(Assorted math function section)

Alternatively, use rounding routines when you have to
display a value.

Alain,

Using estimates isn’t reducing the value of the data you are giving the
client. Just because you are not displaying data to the 20th decimal place
doesn’t mean your data is less accurate. Proper floating point comparison
and greater-than or less-than detection techniques are done by estimation.
Even after rounding, I still use them. This estimation takes into account
the precision required for the application; therefore, in reality there is
no estimation taking place: you are just eliminating precision that has no
value.

Regards,
Kevin


Alain Bonnefoy <alain.bonnefoy@icbt.com> wrote:

strtod(“0.9”, NULL) give 0.90000000000000002 !
So, strtod(“0.9”, NULL) + 0.1 > 1 !!!

How to deal with that ?
I don’t need 17 decimals but I’d prefer an ‘exact’ value.

Unfortunately, you can’t have an ‘exact’ value – 0.9 (decimal) is
NOT exactly representable in binary (finite length) floating point.

You could have the same problem in decimal: take 1/3 as a (finite
length) decimal, add to it 1/3 as a (finite length) decimal twice,
– and they will NOT be equal to 1.

(That is .333333 + .333333 + .333333 = .999999 != 1)

The exact same thing happens when you try to represent most decimal
fractions as binary – they can’t be exactly represented, you just
get a reasonably close approximation. You CAN’T get any better.

Therefore, you have to write your code in such a way as to deal with
this intelligently.

– all comparisons should not be:
if (a == constant)
they should be
if ( fabs(a - constant) < close_enough )

– all print outs should only print an appropriate number of digits/
decimal places

This sort of handling of floating point numbers is normal programming
practice. They are NOT exact values.

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.

David Gibbs wrote:

…cut.

Unfortunately, you can’t have an ‘exact’ value – 0.9 (decimal) is
NOT exactly representable in binary (finite length) floating point.

I don’t really agree with that; (0.9 + 0.1 == 1.0) is right.

but (strtod(“0.9”,NULL) + 0.1 == 1.0) is wrong. This is what is a little
bit perturbing.

Shouldn’t it be more relevant that strtod() round the result to the
number of digits of the string value?

You could have the same problem in decimal: take 1/3 as a (finite
length) decimal, add to it 1/3 as a (finite length) decimal twice,
– and they will NOT be equal to 1.

(That is .333333 + .333333 + .333333 = .999999 != 1)

Yes but 0.333333 can be understood as 0.3333330 which is not equal to 1/3.

If I write in ‘C’ f = 0.333333, the value will be 0.333333 and not
0.33333300000000002.

The exact same thing happens when you try to represent most decimal
fractions as binary – they can’t be exactly represented, you just
get a reasonably close approximation. You CAN’T get any better.

Therefore, you have to write your code in such a way as to deal with
this intelligently.

– all comparisons should not be:
if (a == constant)
they should be
if ( fabs(a - constant) < close_enough )

– all print outs should only print an appropriate number of digits/
decimal places

This sort of handling of floating point numbers is normal programming
practice. They are NOT exact values.

-David

thanks,

Alain.

Alain Bonnefoy <alain.bonnefoy@icbt.com> wrote:

(That is .333333 + .333333 + .333333 = .999999 != 1)

Yes but 0.333333 can be understood as 0.3333330 which is not equal to 1/3.
If I write in ‘C’ f = 0.333333, the value will be 0.333333 and not
0.33333300000000002.

No it won’t:

$ cat foo.c
#include <stdio.h>

int main( void ) {
    float f = 0.333333;
    double d = 0.333333;
    printf( "%.20f\n%.20f\n", f, d );
    return 0;
}
$ ./foo
0.33333298563957214355
0.33333299999999999041

Wojtek Lerch QNX Software Systems Ltd.

Alain Bonnefoy <alain.bonnefoy@icbt.com> wrote:

David Gibbs wrote:

…cut.

Unfortunately, you can’t have an ‘exact’ value – 0.9 (decimal) is
NOT exactly representable in binary (finite length) floating point.

I don’t really agree with that; (0.9 + 0.1 == 1.0) is right.
but (strtod(“0.9”,NULL) + 0.1 == 1.0) is wrong. This is what is a little
bit perturbing.

Shouldn’t it be more relevant that strtod() round the result to the
number of digits of the string value?

You could have the same problem in decimal: take 1/3 as a (finite
length) decimal, add to it 1/3 as a (finite length) decimal twice,
– and they will NOT be equal to 1.

(That is .333333 + .333333 + .333333 = .999999 != 1)

Yes but 0.333333 can be understood as 0.3333330 which is not equal to 1/3.
If I write in ‘C’ f = 0.333333, the value will be 0.333333 and not
0.33333300000000002.

Here’s an exchange I had with a customer on this topic that might
help explain why this is the way it is:

-seanb

From seanb Wed Apr 16 13:26:32 1997
Subject: Re: arithmetic/precision error
To: gilmerjo@odessa.com (John H. Gilmer)
Date: Wed, 16 Apr 1997 13:26:32 -0400 (EDT)
From: “Sean Boudreau” <seanb@qnx.com>
In-Reply-To: <3353FEAC@286.odessa.com> from “John H. Gilmer” at Apr 15, 97 03:20:00 pm

I’ve distilled my problem down to the following section of code:

double x,y,z;
string Fixit;

x=9.6;
y=9.7;
z=(x+y)/2.0; // at this point, the debugger shows z has value 9.64999…

sprintf( Fixit, "%f", z ); // debugger shows Fixit has value "9.65"
z = strtod( Fixit, NULL ); // debugger shows z has value 9.65

My earlier assertion that one program computed z as 9.65 while another
computed it as 9.64999… was not correct. In program “A”, I used the
debugger to check the value and saw 9.64999…; in program “B”, I used a
printf() statement with “%f” format which apparently rounded the value to

I saw the same thing here. printf ("%.15f", z); will give you the whole
deal.

9.65. Anyhow, adding the sprintf() and strtod() as shown in the code
above appears to be a viable workaround - your thoughts? Why is this
kind of crap necessary? Is it the chip (I use a 486, not the buggy
Pentium) or the floating point library to blame?

Neither. What you are seeing arises from the fact that neither
0.6 nor 0.7 decimal can be represented as an exact binary number.

0.6 = 0.1001 with the last 4 digits repeated to infinity.
0.7 = 0.10110 with the last 4 digits repeated to infinity.

Therefore 9.6 and 9.7, as doubles, are represented internally as:

9.6 → 1.00110011 x 2^3 with the last 4 digits repeated 11 more times for a
total of 52 bits internally (the leading ‘1.’ are
assumed since all floating points are normalized
to this with the exponent).

9.7 → 1.00110110 x 2^3 with the last 4 digits repeated 11 more times.

These numbers, let’s call them 9.6i and 9.7i for internal, may be equal
to 9.600000000000000 and 9.700000000000000 respectively to 15 decimal
places’ precision, but they are not equal to 9.6 or 9.7.

It can be shown, using the 53 bits shown above that:

(9.6i + 9.7i)/2 = avgi

where avgi = 1.001101001100 x 2^3 with the last 4 digits repeated 10
more times.

avgi → 9.649999999999998 to 15 decimal places’ precision

If you don’t need such precision, only print out the precision you
need (printf ("%.5f", z); for example). If you do need such precision,
you have to account for such conversion / rounding off errors.

Sean Boudreau Technical Support QNX Software Systems Ltd.

From seanb Fri Apr 18 10:05:42 1997
Subject: Re: arithmetic/precision error
To: gilmerjo@odessa.com (John H. Gilmer)
Date: Fri, 18 Apr 1997 10:05:42 -0400 (EDT)
From: “Sean Boudreau” <seanb@qnx.com>
In-Reply-To: <3356BC9E@286.odessa.com> from “John H. Gilmer” at Apr 17, 97 05:14:00 pm

Neither. What you are seeing arises from the fact that neither
0.6 nor 0.7 decimal can be represented as an exact binary number.

0.6 = 0.1001 with the last 4 digits repeated to infinity.
0.7 = 0.10110 with the last 4 digits repeated to infinity.

Ok, I confess I’ve forgotten some boolean algebra. I know it’s not your
job to teach me, but how is it that 0.1001 is equal to .6? Apparently,
I don’t know how to do fractions in binary - is it easy enough for you to
explain?

0.1001 isn’t equal to 0.6 as I’ll show below. 0.6 = 0.1001 with the
last 4 digits repeated to infinity.

I’ll explain by example with an exact binary number:

what does 1100.1001 binary equal in decimal?

1100.1001 = 1x2^3 + 1x2^2 + 0x2^1 + 0x2^0 + 1x2^-1 + 0x2^-2 + 0x2^-3 + 1x2^-4
= 1x8 + 1x4 + 0 + 0 + 1/2 + 0 + 0 + 1/16
= 8 + 4 + 0.5 + 0.0625
= 12.5625

The above is a good example of precision errors. If I only had 8 bits
with which to represent real numbers instead of 53 as in the case of
a double, the closest I could get to 12.6 would be 12.5625 coming from
the bottom or 12.625 coming from the top. 53 bits gives me a lot better
precision but I’ll never get 12.6 since .6 is a repeating number in
binary.

Going the other way, what does 12.5625 decimal equal in binary?

Considering the whole part (12).

12/2 = 6 remainder 0 (Least significant bit)
6/2 = 3 remainder 0
3/2 = 1 remainder 1
1/2 = 0 remainder 1 (Most Significant bit)
therefore 12 = 1100

Considering the fractional part (.5625)

0.5625x2 = 1.125 whole part is 1 (Most significant bit)
0.125x2 = 0.25 whole part is 0
0.25x2 = 0.5 whole part is 0
0.5x2 = 1.0 whole part is 1 (Least significant bit)

therefore .5625 = .1001
therefore 12.5625 = 1100.1001

As another example. What does 0.7 decimal equal in binary?

0.7x2 = 1.4 whole part is 1
0.4x2 = 0.8 whole part is 0
0.8x2 = 1.6 whole part is 1
0.6x2 = 1.2 whole part is 1
0.2x2 = 0.4 whole part is 0

At this point the last 4 digits begin to repeat to infinity.

Therefore 9.6 and 9.7, as doubles, are represented internally as:

9.6 → 1.00110011 x 2^3 with the last 4 digits repeated 11 more times for a

I’m still having the same problem - I see the nine “1.001” but not the .6

total of 52 bits internally (the leading ‘1.’ are
assumed since all floating points are normalized
to this with the exponent).

9.7 → 1.00110110 x 2^3 with the last 4 digits repeated 11 more times.

These numbers, let’s call them 9.6i and 9.7i for internal, may be equal
to 9.600000000000000 and 9.700000000000000 respectively to 15 decimal
places’ precision, but they are not equal to 9.6 or 9.7.

It can be shown, using the 53 bits shown above that:

(9.6i + 9.7i)/2 = avgi

where avgi = 1.001101001100 x 2^3 with the last 4 digits repeated 10
more times.

avgi → 9.649999999999998 to 15 decimal places’ precision

If you don’t need such precision, only print out the precision you
need (printf ("%.5f", z); for example). If you do need such precision,
you have to account for such conversion / rounding off errors.

Sean Boudreau Technical Support QNX Software Systems Ltd.

I believe you, but I need to be able to understand it a little better
before I try to fix it. If a precision error occurs which causes a value
to be too large, it will not affect my application. However, ALL errors
which cause the value to be too low must be dealt with. In the example
above, the erroneous result has repeating 9’s to the end. Will this
always be the case? If so, I could fix this by adding .000000000000001
in my Round() function before I round to my desired precision (I am never
interested in more than 8 digits of precision).

Alternatively, I notice that printf("%f") will round when the rightmost
digit or digits are 9.
For example, 9.64999… rounds to 9.65, while
9.644999… rounds to 9.645

You can force printf() to round off after a certain place with the
precision specifier. If you only need 8 places precision then
printf ("%.8f", z) will solve your problem. A lot of rounding
off / representation errors can occur on a double before they
affect the 8th decimal place. If you don’t like the trailing
zeros, you can do something like the following:

#include <stdio.h>
#include <string.h>

int main (void)
{
    char buf [100];
    char *ptr;
    double a = 9.6;
    double b = 9.7;
    double avg;

    avg = (a + b) / 2.0;

    // to 8 decimal places
    printf ("avg = %.8f\n", avg);

    sprintf (buf, "%.8f", avg);

    // strip trailing zeros; the loop stops at the first non-zero
    // character from the right, which is at worst the decimal point
    ptr = buf + strlen (buf);
    while (*--ptr == '0')
        *ptr = '\0';

    // without the trailing zeros
    printf ("avg = %s\n", buf);
    return 0;
}

If all precision errors which err on the low side result in
“something.something999…999”, then I could use sprintf() in my Round()
function to fix the problem.

Sean Boudreau Technical Support QNX Software Systems Ltd.

From gilmerjo@odessa.com Fri Apr 18 14:34:30 1997
From: “John H. Gilmer” <gilmerjo@odessa.com>
To: Sean Boudreau <seanb@qnx.com>
Subject: RE: arithmetic/precision error
Date: Fri, 18 Apr 97 11:39:00 PDT

Got it. Thanks!


Has anyone come up with a math library that will do comparisons down to the
last bit that CAN be accurately identified?

I.E.
If I really want to test:
if( f == 0.9 )
I wouldn’t want to write in my code
if( f > 0.89999 && f < 0.90001 )
if the value of 0.900003 could pass as equal when in fact the correct value
should be
0.89999999927

Alain Bonnefoy <alain.bonnefoy@icbt.com> wrote:

David Gibbs wrote:

…cut.

Unfortunately, you can’t have an ‘exact’ value – 0.9 (decimal) is
NOT exactly representable in binary (finite length) floating point.

I don’t really agree with that; (0.9 + 0.1 == 1.0) is right.
but (strtod(“0.9”,NULL) + 0.1 == 1.0) is wrong. This is what is a little
bit perturbing.

Shouldn’t it be more relevant that strtod() round the result to the
number of digits of the string value?

It can’t. The number of binary digits needed to represent “0.9” is infinite.
How many binary digits after the point do you want kept? One binary
digit per decimal digit? Well, then your number will be rounded
to 1.0. Two? Then you still get rounded to 1.0. Three? Then
you will get rounded to 0.875.

What you have to do is decide on your precision, then specify,
for your output, exactly how many decimal digits you want it
rounded to.

You could have the same problem in decimal: take 1/3 as a (finite
length) decimal, add to it 1/3 as a (finite length) decimal twice,
– and they will NOT be equal to 1.

(That is .333333 + .333333 + .333333 = .999999 != 1)

Yes but 0.333333 can be understood as 0.3333330 which is not equal to 1/3.
If I write in ‘C’ f = 0.333333, the value will be 0.333333 and not
0.33333300000000002.

Not true – as Wojtek showed.

In fact, if you write f = 0.333333, you are in fact saying that:
f = 333333.0/1000000.0;

This will give you something that is NOT exactly 0.333333.

And, a mistaken analogy as well:

In fact, you did:
f = 9.0/10.0; (“0.9”)

and 9/10 is NOT representable as a finite-length binary number.

Remember, when you wrote “0.9” in ascii decimal, you were really
writing “9/10”, which can’t be exactly represented. You were not
writing some exact binary value.

thanks,
Alain.

You’re welcome.

And, welcome to the wonderful, bizarre, world of floating-point
representations of (seemingly) simple numbers.

(P.S. Just be glad that all base-2 numbers can be accurately
represented in base-10 as well. Imagine if we natively worked
in base-9, and base-2 fractions couldn’t be accurately translated
back either! That would be a double-joy.)

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.

“Bill Caroselli (Q-TPS)” <QTPS@EarthLink.net> wrote in message
news:ai6jht$dtf$1@inn.qnx.com

Has anyone come up with a math library that will do comparisons down to the
last bit that CAN be accurately identified?

There is no “standard” lib for that because the answer is embedded in your
definition of a “good” value. For example, you can use either the absolute
difference or the relative difference as the “cut-off” policy.
As to “go to the last bit” (how do you define it?), you can tell it is
dangerous simply by noticing that a school of scientific workers takes
advantage of its randomness …; they call that “Monte Carlo” simulation.
In the practical world, you may also consider this approach:

1. first abstract your model into the integer field
(simply because you have up to 2^31 values to play with,
and the space is huge enough for me (on the order of 2^32 elements! It may
cost my whole life to count them, though :) …))
2. play in the integer field instead.
Although a lot of C is used in numerical simulation, Fortran
sounds more serious to me.

Weijie

I.E.
If I really want to test:
if( f == 0.9 )
I wouldn’t want to write in my code
if( f > 0.89999 && f < 0.90001 )
if the value of 0.900003 could pass as equal when in fact the correct
value
should be
0.89999999927

Well by “to the last bit” I mean this:
for any given ‘float f’ consider what is the smallest
increment/decrement that can be represented by that constant. I.E. what is
the smallest amount that will generate a different binary value.

Then when we want:
if( float_1 == float_2 )
I’m willing to accept:
if( float_1 + smallest_increment <= float_2 - smallest_increment
&& float_1 - smallest_increment >= float_2 + smallest_increment )
as an acceptable calculation.

Yes, I know that this represents a lot of extra work. But you would only
have to do it when you needed this kind of testing.


It should be possible to set a global math precision at the beginning of
the program, say FLOAT_SIGNIFICANT_DIGITS = 15, so that all
comparisons are made up to the 15th decimal digit!

Alain.


Well by “to the last bit” I mean this:
for any given ‘float f’ consider what is the smallest
increment/decrement that can be represented by that constant. I.E. what is
the smallest amount that will generate a different binary value.

Then when we want:
if( float_1 == float_2 )
I’m willing to accept:
if( float_1 + smallest_increment <= float_2 - smallest_increment
&& float_1 - smallest_increment >= float_2 + smallest_increment )
as an acceptable calculation.

There are a couple pieces of fuzziness with this thinking. First,
floating points are represented by mantissa & exponent. So defining
the “smallest increment” is pretty tricky, as it won’t be a simple
number – you could look at smallest increment of mantissa, though.
(But as a REAL number, the smallest increment might be .5, might
be .000005, might be far smaller, or might be larger, depending on the
exponent.)

It sounds like you’re trying to say that f1 == f2 if they differ by
less than the smallest increment – but that is exactly what does
happen. Of course, your particular test is written incorrectly.
Consider what happens if f1 & f2 in your test were integers. The smallest
possible increment for an integer is 1.

So, you’re saying that f1 == f2 if:
(f1 + 1) <= (f2 -1) && (f1-1 >= f2 +1 )
Let’s try that with 5… 5=5 if:
(5+1) <= (5-1) && (5-1) >= (5+1)
if
(6 <=4) && (4 >= 6)

Oops.

But, the problem isn’t with this – the problem is with expecting that
0.9 can be exactly represented in float, when it can’t.

People will do some calculations, and expect that the answer for
the calculations, which if done in REAL numbers would exactly equal
0.9, will also exactly equal the best approximation to 0.9 that can
be represented – but what usually happens is that the approximation of
the result and the approximation of 0.9 will be slightly different.

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.

Alain Bonnefoy <alain.bonnefoy@icbt.com> wrote:

It should be possible to set a global math precision at the beginning of
the program, say FLOAT_SIGNIFICANT_DIGITS = 15, so that all
comparisons are made up to the 15th decimal digit!

Neat thought.

Of course… the compiler would have to be modified – right now
(f1 == f2) just does a straight bit-wise comparison. (At least
in C). You’d have to get into operator overloading & stuff like
that… though in C++ you could do this, define your own floating
point type, define your comparison operator, and have it look at
a global significant digits flag in order to know how many significant
digits to use for comparison.

Hm… prototypes for math libraries & calls to them might be a bit
messy, but could be worked around.

Of course, you also have to consider whether you mean binary or
decimal digits. 15 decimal digits would require about 50 binary
digits.

-David

QNX Training Services
http://www.qnx.com/support/training/
Please followup in this newsgroup if you have further questions.