Float vs Double

Apprentice
Posts: 19
Joined: 2005.11
Post: #1
I'm on winter break and decided to go through all my notes from my last C++ class, front to back. One of my first notes was: "Float vs. Double: Teacher says use double. Double more accurate, larger.

float myFloat = 10.5f ;
double myDouble = 10.5 ; " End notes.

Throughout the class, the teacher always used floats, not doubles, in his examples, and I in turn always used floats as well. I did a little of my own research with sizeof(float) and sizeof(double), and found that float is 4 bytes and double is 8 bytes.

So if I do 2.2 * 3.3, is float or double more accurate? Or will double only be more accurate with extremely large numbers? Are doubles then twice as slow to calculate as floats, since they're twice the bytes? And what's with putting the "f" on a single-precision floating-point literal like 10.5f? I know I've forgotten to put the "f" on a lot of my floats, yet I haven't gotten any errors or warnings.
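Something like this is what I have in mind, just as a sketch (I haven't actually compiled it; it reuses the myFloat/myDouble names from my notes):
Code:
#include <cstdio>

int main()
{
    // 2.2 and 3.3 can't be stored exactly in binary, so both types round,
    // but a double keeps roughly 15 significant decimal digits vs. about 7 for a float.
    float  f = 2.2f * 3.3f;
    double d = 2.2  * 3.3;

    std::printf("float : %.15f\n", f);   // drifts away from 7.26 after a few digits
    std::printf("double: %.15f\n", d);   // stays much closer to 7.26

    // Without the trailing f, 10.5 is a double literal that gets converted down to float.
    float  myFloat  = 10.5f;
    double myDouble = 10.5;
    std::printf("sizeof(float) = %lu, sizeof(double) = %lu\n",
                (unsigned long)sizeof(myFloat), (unsigned long)sizeof(myDouble));
    return 0;
}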


And another random basic programming question: I don't have a good sense of how fast things compute. (Obviously most programs I've written seem blazingly fast, but I'm also on a dual 2.5, so most everything is blazingly fast, and I sadly don't have any slower machines to test things on.)

How many clock cycles does it take to do integer addition, multiplication, division, and so on? And for float? Double? (And does a 32-bit vs. 64-bit processor make much of a difference?)
Member
Posts: 114
Joined: 2005.03
Post: #2
A float is 32-bit, a double is 64-bit. If you develop for 32-bit, float should be faster (also on a G5, if you do anything with a UI, as Cocoa, Carbon and so on are all 32-bit). The f at the end tells the compiler that it is a float constant. If you write
Code:
float foo = 2.0;
the finished program will load the number as a double and then convert it to a float, which is slower than
Code:
float foo = 2.0f;
where the number will be loaded directly as float.

double is more precise, but I've never had any problems with using floats anywhere. The difference is not that large for games, I suppose.

About your last question: it depends. A PowerPC can usually do an addition and a multiplication in a single instruction (a fused multiply-add). The G5, if the code is well optimized, can issue two of those per cycle. If you use AltiVec, that number goes up even further. On the other hand, it all depends on how well the code is written, and every instruction has to go through a pipeline, so the actual speed will be lower. In the end, just write your code, and if it's too slow, use Shark to find out where and optimize it. It is nearly impossible to predict how fast a given piece of code will be under real-life circumstances.
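If you still want a rough number, a crude micro-benchmark along these lines can give a ballpark (just a sketch; results will swing a lot with compiler flags and CPU, which is really the point):
Code:
#include <cstdio>
#include <ctime>

// Crude throughput test: a dependent multiply-add chain, once with floats
// and once with doubles. Treat the numbers as a ballpark at best.
template <typename T>
void timeLoop(const char *label)
{
    const int iterations = 20000000;
    T value = T(0);

    std::clock_t start = std::clock();
    for (int i = 0; i < iterations; ++i)
        value = value * T(1.0000001) + T(0.5);
    std::clock_t end = std::clock();

    // Printing the result keeps the compiler from throwing the loop away.
    std::printf("%s: %.3f seconds (result %f)\n", label,
                double(end - start) / CLOCKS_PER_SEC, double(value));
}

int main()
{
    timeLoop<float>("float ");
    timeLoop<double>("double");
    return 0;
}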
Apprentice
Posts: 19
Joined: 2005.11
Post: #3
Thanks, Cochrane! That makes perfect sense. Your reply and a chapter in my textbook (which I must never have read before) filled in all the holes.
zKing
Unregistered
 
Post: #4
Cochrane Wrote:If you do
Code:
float foo = 2.0;
the finished program will load the number as a double and then convert it to a float, which is slower than
Code:
float foo = 2.0f;
where the number will be loaded directly as float.

Well, in cases like that, as long as you compile with optimizations on (i.e., release mode), any compiler worth its salt will precompute known constant conversions as far as possible at compile time.

And things are a bit complicated in the float/double world; this link explains things better than I can:
http://cpptips.hyperformix.com/cpptips/flt_dbl
Luminary
Posts: 5,143
Joined: 2002.04
Post: #5
zKing Wrote:Well in cases like that, as long as you compile with optimizations on (i.e. release mode), any compiler worth its salt will precompute known constant conversions as far as possible at compile time.

That's precisely the problem; the compiler can't assume that it knows how the hardware will do the conversion at runtime. GCC at least, even with optimization on, will load a double and convert to a float, which is not fast.

In general, floating point expressions can't be simplified at compile time -- #define TWO_PI 6.28........ will produce better code than #define TWO_PI (2.0 * 3.14.............).
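Spelled out (digits filled in here just for illustration, and TWO_PI_EXPR is only named that to keep the two forms apart):
Code:
// Precomputed constant: the literal goes straight into the compiled code.
#define TWO_PI      6.28318530717958647692

// Expression form: with strict floating-point semantics, the compiler may
// leave the multiplication to be done at runtime instead of folding it.
#define TWO_PI_EXPR (2.0 * 3.14159265358979323846)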

To the OP: on the G4 and G5, computations with doubles are the same speed as computations with floats (except maybe division, and maybe only on the G4?). You do, however, have twice as much data to deal with, so doubles may still be slower.

I have no idea what the timings of instructions are for x86 CPUs, and a quick trip to Google turned up nothing... To anyone investigating: remember that Mac OS X for Intel does floating-point math using SSE and SSE2, so the relevant instructions are (for example) mulss and mulsd, rather than fmul.

Game physics is one area where using doubles for calculation can bring substantial improvements (things passing through other things less often, for example).
zKing
Unregistered
 
Post: #6
OneSadCookie Wrote:That's precisely the problem; the compiler can't assume that it knows how the hardware will do the conversion at runtime. GCC at least, even with optimization on, will load a double and convert to a float, which is not fast.

That's a darn good point, particularly in a compiler like GCC, where the back-end generator could be targeting lord-knows-what. I could see GCC not wanting to guess, even if the standard seems to say it's OK to be a bit lax in this area. It'd take an on-the-ball optimizing back end to pick it out further down the line. (I've heard that Visual Studio is starting to do some of these kinds of "after machine code generation" optimizations.)
Apprentice
Posts: 19
Joined: 2004.10
Post: #7
OneSadCookie Wrote:That's precisely the problem; the compiler can't assume that it knows how the hardware will do the conversion at runtime. GCC at least, even with optimization on, will load a double and convert to a float, which is not fast.

Actually, compilers DO NOT optimize float/double math because they aren't allowed to by the standard. The standard (C99) forces compilers to implement the code as written, because the compiler does not know the state of the error conditions or flags an instruction will produce on a given part. In other words, the C standard recognizes that different processors will set different error states or floating-point flags for an instruction, and those states might be outside the bounds of 'float' as defined. Rather than completely defining that context (and possibly hindering the usefulness present in some parts), the standard essentially says, "leave float operations alone".
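One concrete example of why "leave float operations alone" matters (this one is about rounding rather than error flags, but it's the same spirit): replacing a division with a multiplication by the reciprocal changes the result, so the compiler can't do it behind your back. A quick sketch:
Code:
#include <cstdio>

int main()
{
    float a = 10.0f;

    // The division as written.
    float byDivision  = a / 3.0f;

    // The "faster" rewrite the compiler is not free to make on its own:
    // 1.0f / 3.0f is already rounded, so the product can differ in the last bit.
    float byReciprocal = a * (1.0f / 3.0f);

    std::printf("division  : %.9f\n", byDivision);
    std::printf("reciprocal: %.9f\n", byReciprocal);
    std::printf("equal? %s\n", byDivision == byReciprocal ? "yes" : "no");
    return 0;
}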

I don't have my copy of the standard with me, or I'd quote the section as well.
Member
Posts: 131
Joined: 2004.10
Post: #8
Floats vs. Doubles.

Most video hardware (if not all, even pro-level) uses floats to render 3D data. So storing your models as doubles only benefits you if you need to maintain the precision of the original data in double form.

Something to know about doubles and floats: limited precision. Floats have about 6 decimal digits of precision (I've seen 5 in other places, but most say 6), and doubles generally have 15. This doesn't mean 6 or 15 digits after the decimal point for any number. Write the number out in scientific notation to understand the decimal precision.

Basically, for floats:
Values between -1 and 1 are accurate to 0.000001.
Values between +/-1 and +/-9 are accurate to 0.00001.
Values between +/-10 and +/-99 are accurate to 0.0001.
Values between +/-100 and +/-999 are accurate to 0.001.
...
Values between +/-100000 and +/-999999 are accurate to 1.0.

So a number that is, say, 500000 (0.500000 x 10^6 -- writing it this way shows the 6 digits of precision more clearly) is accurate to about 1. Adding 0.1 may not actually do anything to the value of the number.
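A quick sketch of what that looks like in code (using 0.01 instead of 0.1 so the effect is unambiguous -- 0.01 is smaller than the gap between neighboring floats at that size):
Code:
#include <cstdio>

int main()
{
    float  f = 500000.0f;
    double d = 500000.0;

    // The increment is below the float's spacing at this magnitude,
    // so the addition rounds straight back to 500000.
    f += 0.01f;
    d += 0.01;

    std::printf("float : %.6f\n", f);   // 500000.000000
    std::printf("double: %.6f\n", d);   // 500000.010000
    return 0;
}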

Doubles come into play when you need the precision, say for solid-solid intersection with large coordinate data. Using floats, two faces may look exactly the same, while using doubles they may not lie in the same plane, may or may not intersect, or may be on different sides. Add in all the calculations that may be required -- finding normals, intersecting lines with planes, etc. -- and floats may be off just enough to make your life miserable.

For the most part, games don't run into the problem of floating-point precision biting them in the butt too often. There are simple tricks to get around it if it does become an issue.
zKing
Unregistered
 
Post: #9
Great info people, thanks!

It's been a long while since I've needed to deal with much floating point math... you're teachin' this old dog some new tricks!
Apprentice
Posts: 10
Joined: 2005.08
Post: #10
As a last note, if you have an AltiVec-enabled CPU and want to use it for your computations, you must use floats, since doubles are not supported.
(Whether that still makes sense with the Intel transition is up to you, of course.)
Member
Posts: 749
Joined: 2003.01
Post: #11
Amazingly, I discovered a big bug in N-Ball due simply to these float limitations. As Zekaric said, it occurs when you need a precise distance between two objects with large coordinate data (i.e. in a level, the "x" component can get quite big).

Moderator
Posts: 1,560
Joined: 2003.10
Post: #12
Out of curiosity, how large is one unit in N-ball's coordinate system?
Member
Posts: 749
Joined: 2003.01
Post: #13
One pixel, but the problem wouldn't change if I scaled it bigger, as Zekaric explains.

Oldtimer
Posts: 834
Joined: 2002.09
Post: #14
Something that bit me badly the other day will perhaps save one of you guys some time: I've been working a bit on a new engine. About two months ago, I defined some time types: bmTimeInterval and bmTimeAbsolute. bmTimeInterval is for keeping track of, say, times between frames, while bmTimeAbsolute is used to keep track of actual points in time. The distinction is just for readability; I defined them both as floats.

Now, to grab time steps I use the Carbon function Microseconds and convert its UnsignedWide result to a double. That double is then converted to a bmTimeAbsolute. Right here, I should have been suspicious, but it ran fine on my iBook and on my G4 tower. So, I moved it to my G5 workstation. Immediately, all my animations ran unbelievably choppily. The framerate was good, but the time steps were off. So, I thought that perhaps the G5 casts between double and float differently, and changed bmTimeAbsolute to a double. It ran fine. Why was this?

Well, Microseconds returns the time since system start (in microseconds, which I was converting to seconds). My iBook and G4 get restarted every week. My G5 iMac is running a web server and typically gets rebooted every two months. Now, in this case, my uptime was about 17 days -- that's a lot of seconds. So, in line with Zekaric's post up there, I was down to tenths of a second of precision when I cast my double to float, and the time steps I calculated were pretty grainy.
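For the curious, here's roughly what was happening, as a sketch (numbers approximate):
Code:
#include <cstdio>

int main()
{
    // Roughly 17 days of uptime, in seconds.
    double now  = 17.0 * 24.0 * 60.0 * 60.0;   // about 1.47 million seconds
    double next = now + 1.0 / 60.0;            // one 60 Hz frame later

    // Squeezing those into floats throws away the low bits...
    float fNow  = static_cast<float>(now);
    float fNext = static_cast<float>(next);

    std::printf("double time step: %f\n", next - now);    // ~0.016667
    std::printf("float  time step: %f\n", fNext - fNow);  // 0.000000 -- the whole frame vanished
    return 0;
}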

Lesson: don't trust floats for absolute time. Mac OS X can get very good uptimes, and you don't want to look bad on them. Wink
Apprentice
Posts: 19
Joined: 2004.10
Post: #15
Fenris Wrote:Now, to grab time steps I use the Carbon function Microseconds and convert its UnsignedWide result to a double.

I'm missing something. Why didn't you just leave it as an UnsignedWide? An UnsignedWide is just a 64-bit unsigned integer, and C supports those (therefore you can do math with them).

{edited to fix typo}