Number precision and scale

Posts: 6
Joined: 2006.03
Post: #1
I'd like some insight from anyone who has worked on a project where they intentionally scaled values up to an integer base and then divided later when they wanted fractions. For example: you want to express "level finished is 50%". Fine.

double completed = 0.5;

But then you go and completed++ and unfortunately get an int back. Or maybe even:
completed = completed + 0.1;

But what I've found is an issue with precision that has finally hit my life outside of a textbook. What is normally a chapter I skip over, thinking it's not applicable, has landed square in my lap. Here's my test:

#include <iostream>

float addFloat(float f, float i) {
    return f + i;
}

int main(int argc, char* argv[]) {
    float f = 0.0f;  // running total, starts at zero
    while (f < 99.0) {
        f = addFloat(f, 0.1);
        std::cout << "Float: " << f << "\n";
    }
    return 0;
}

Eventually you'll see this.
Float: 21.7
Float: 21.8
Float: 21.9
Float: 22
Float: 22.1
Float: 22.2
Float: 22.3
Float: 22.4
Float: 22.5
Float: 22.6
Float: 22.7001
Float: 22.8001
Float: 22.9001
Float: 23.0001
Float: 23.1001

A similar thing happens with doubles. And some Unixes behave differently than OS X. And that's fine. It's been explained before. I'd simply multiply everything by 10, cast to int, and divide by 10 to get back to a float. Fine.

It's just, well, RGB values are nice as doubles. And I guess one has to plan ahead for how precise color will be then. If I wanted precision to the hundredths place, I'd have some constant int = 100 defined. So now I'm thinking "hmm, this could really be messy".
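A minimal sketch of the scaled-integer idea described above, assuming a scale of 100 (hundredths). The names kScale, addScaled, scaledToDouble, and accumulateTenths are hypothetical, chosen only for illustration:

```cpp
#include <cstdint>

// Hypothetical scale: values are stored as whole numbers of hundredths.
constexpr std::int32_t kScale = 100;

// Adds `step` hundredths to `value`. Integer addition is exact, so no
// rounding error builds up however many times this is called.
constexpr std::int32_t addScaled(std::int32_t value, std::int32_t step) {
    return value + step;
}

// Converts to floating point only at the point of use (display, rendering).
constexpr double scaledToDouble(std::int32_t value) {
    return static_cast<double>(value) / kScale;
}

// Adds ten hundredths (that is, 0.1) to a counter `count` times.
std::int32_t accumulateTenths(int count) {
    std::int32_t v = 0;
    for (int n = 0; n < count; ++n) {
        v = addScaled(v, 10);
    }
    return v;
}
```

With these definitions, accumulateTenths(227) is exactly 2270 hundredths. The trade-off is the one raised above: the scale has to be chosen ahead of time, and every value has to agree on it.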

So while this might seem like a good post for a C++ newsgroup instead of this, my real question is: What stories can you tell me about this in your project? Do you accept this as norm? How far do you take your color precision or anything else? I'm sure it depends on the project but I would like to hear any tales anyway.
Posts: 5,143
Joined: 2002.04
Post: #2
Float is usually fine except for time values, where double is necessary. Things like quaternions and unit vectors need frequent renormalization. For a game, though, you're not typically doing anything that requires extreme precision, so I've never had a real problem.
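As a rough illustration of renormalization, here is a sketch for a 3-component vector. Vec3, length, and renormalize are hypothetical names, not taken from any particular library:

```cpp
#include <cmath>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 {
    float x, y, z;
};

// Euclidean length of v.
float length(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Returns v rescaled to unit length. A zero-length vector is returned
// unchanged, since it has no direction to preserve.
Vec3 renormalize(const Vec3& v) {
    const float len = length(v);
    if (len == 0.0f) {
        return v;
    }
    return {v.x / len, v.y / len, v.z / len};
}
```

Calling renormalize after every few updates keeps accumulated rounding error from letting the length creep away from 1. A quaternion version would look the same with a fourth component.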

In a float, you've got 23 stored mantissa bits (24 bits of precision counting the implicit leading bit). In your framebuffer, you've got 8 bits of precision for a color channel. Float should be more than sufficient for most tasks.
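To make the comparison concrete, this sketch quantizes a float color channel to 8 bits and back. It assumes channels are stored in the range [0, 1]; toByte, toFloat, and countExactRoundTrips are hypothetical helper names:

```cpp
#include <cmath>
#include <cstdint>

// Quantizes a channel in [0, 1] to 8 bits, clamping out-of-range input.
std::uint8_t toByte(float c) {
    if (c < 0.0f) c = 0.0f;
    if (c > 1.0f) c = 1.0f;
    return static_cast<std::uint8_t>(std::lround(c * 255.0f));
}

// Expands an 8-bit channel back to a float in [0, 1].
float toFloat(std::uint8_t b) {
    return b / 255.0f;
}

// Counts how many of the 256 byte values survive byte -> float -> byte.
int countExactRoundTrips() {
    int ok = 0;
    for (int b = 0; b < 256; ++b) {
        if (toByte(toFloat(static_cast<std::uint8_t>(b))) == b) {
            ++ok;
        }
    }
    return ok;
}
```

Every one of the 256 values should survive the round trip, because the rounding error in a float is far smaller than one 8-bit step.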

The hardware and compiler are *much* more important than the OS in determining how floating-point math rounds. I guess the OS could conceivably matter, but it'd be an odd case.
Posts: 776
Joined: 2003.04
Post: #3
By default the compiler will use doubles when it sees things like "99.0" and "0.1". Since you actually need a float, it will be converting between double and float, possibly generating more precision errors. I usually round/truncate values for display purposes only.

Anyway, try this (I'm not on a Mac or near gcc right now) and see if it makes any difference:
int main(int argc, char* argv[]) {
    float f = 0.0f;
    while (f < 99.0f) {
        f = addFloat(f, 0.1f);
        std::cout << "Float: " << f << "\n";
    }
    return 0;
}
Posts: 1,563
Joined: 2003.10
Post: #4
milkfilk wrote: Do you accept this as norm?
Yup. I generally assume that all floating-point numbers of any type are imprecise. There are a number of techniques I use to get around this depending on the situation, which may include large integers of a known scale, or fixed-point numbers.
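For the fixed-point option, a minimal sketch of a 16.16 format might look like the following. The layout (16 integer bits, 16 fractional bits) and the names Fixed, kFracBits, kOne, fromInt, fixedToDouble, and mul are assumptions made for illustration:

```cpp
#include <cstdint>

// 16.16 fixed point: the low 16 bits hold the fraction.
using Fixed = std::int32_t;
constexpr int kFracBits = 16;
constexpr Fixed kOne = 1 << kFracBits;  // the value 1.0

// Converts a small integer to fixed point (must fit in 16 bits).
constexpr Fixed fromInt(int n) {
    return static_cast<Fixed>(n) * kOne;
}

// Converts to double for display or for handing off to float-based code.
constexpr double fixedToDouble(Fixed f) {
    return static_cast<double>(f) / kOne;
}

// Multiplies two fixed-point values. The product is formed in 64 bits so
// the intermediate does not overflow before shifting back down.
// Note: right-shifting a negative value is implementation-defined before
// C++20, though it is an arithmetic shift on common compilers.
constexpr Fixed mul(Fixed a, Fixed b) {
    return static_cast<Fixed>((static_cast<std::int64_t>(a) * b) >> kFracBits);
}
```

Addition and subtraction are plain integer operations on Fixed values. Unlike the decimal scale, a binary fixed-point format still cannot hold 0.1 exactly. What it offers is uniform spacing between values and results that do not depend on the magnitude of the number.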

milkfilk wrote: How far do you take your color precision or anything else?
The usual convention for colors is 8 bits per component (256 distinct values for each of red, green, blue, alpha). Situations exist where you might want more precision than that, but I wouldn't expect to run into them very often...
Posts: 373
Joined: 2006.08
Post: #5
I'm not positive, but I think it doesn't actually turn it back into an int; that's just something std::cout is doing automatically. Have you tried it with printf() in C to see if it does the same thing?
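One way to check this is to compare the default stream formatting with a higher precision. The sketch below uses two hypothetical helpers, defaultFormat and fullFormat, which write to a string stream so the results can be inspected:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats a float the way std::cout does by default: 6 significant
// digits, with trailing zeros and a bare decimal point dropped.
std::string defaultFormat(float f) {
    std::ostringstream out;
    out << f;
    return out.str();
}

// Formats with 9 significant digits, which is enough to tell any two
// distinct float values apart.
std::string fullFormat(float f) {
    std::ostringstream out;
    out << std::setprecision(9) << f;
    return out.str();
}
```

Here defaultFormat(22.0f) gives "22" even though the value is still a float, and fullFormat(0.1f) gives "0.100000001", which shows the error that the default formatting hides.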

Posts: 1,403
Joined: 2005.07
Post: #6
I learned a bit from writing (and graphing the results of) this program in the past.

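#include <stdio.h>
#include <string.h>

int main(void) {
    float f;
    unsigned int i = 0;  /* unsigned, so wrapping back to 0 is well defined */
    do {
        /* Reinterpret the bits of i as a float. memcpy avoids the aliasing
           problem of *((float*)&i). Assumes 32-bit int and float. */
        memcpy(&f, &i, sizeof f);
        printf("%.64f (%u)\n", f, i);
    } while (++i);
    return 0;
}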
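One thing a program like this makes visible is that floats are not evenly spaced: the gap between neighbouring values grows with magnitude. A small sketch of measuring that gap with the standard std::nextafter function (gapAbove is a hypothetical helper name):

```cpp
#include <cmath>

// Returns the distance from f to the next representable float above it.
float gapAbove(float f) {
    return std::nextafter(f, INFINITY) - f;
}
```

With IEEE 754 single precision, gapAbove(1.0f) is 2^-23 (about 1.19e-7), while gapAbove(22.0f) is sixteen times larger. So each addition near 22 is rounded more coarsely than one near 1.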

Sir, e^iπ + 1 = 0, hence God exists; reply!