sqrt speed test - help please!!!...

Apprentice
Posts: 13
Joined: 2008.05
Post: #1
Okay guys...

Please help me understand this!
I made a little test in Xcode using Obj-C / C. It looks like the following:

Code:
- (void)executionCode
{
    double k = 99999999999999999999999999999999999999.0;
    
    NSDate *start = [[NSDate alloc] init];
    [start timeIntervalSinceReferenceDate];
    
    for( unsigned long i = 0; i < 200000000; i++ )
    {
        k = sqrt(k);
    }
    
    NSDate *end = [[NSDate alloc] init];
    [end timeIntervalSinceReferenceDate];
    
    NSTimeInterval time = [end timeIntervalSinceDate:start];
    
    NSLog(@"Time: %f", time);

    
    [start release];
    [end release];
}

When I call this, the log says:

>Time: 1.0xxx....<


I then moved the sqrt call into a separate function:

Code:
double testSqrt(double number) {
    return sqrt(number);
}

and if I use this function in the executionCode method:

Code:
- (void)executionCode
{
    double k = 99999999999999999999999999999999999999.0;
    
    NSDate *start = [[NSDate alloc] init];
    [start timeIntervalSinceReferenceDate];
    
    for( unsigned long i = 0; i < 200000000; i++ )
    {
        k = testSqrt(k);
    }
    
    NSDate *end = [[NSDate alloc] init];
    [end timeIntervalSinceReferenceDate];
    
    NSTimeInterval time = [end timeIntervalSinceDate:start];
    
    NSLog(@"Time: %f", time);

    
    [start release];
    [end release];
}

the log says:

>Time: 3.xxx....<


Can someone try to explain this to me?

If I were to make a game using Cocoa, is it a bad approach to write C functions instead of using the Cocoa API?

Thanks for any reply!


brush
Luminary
Posts: 5,143
Joined: 2002.04
Post: #2
No, it means you need to understand your language and toolchain:

Code:
iMac:Desktop keith$ cat test.m
#import <Foundation/Foundation.h>

#define N 200000000

static __attribute__((noinline)) double sqrt1(double d)
{
    return sqrt(d);
}

static inline double sqrt2(double d)
{
    return sqrt(d);
}

int main(void)
{
    NSTimeInterval start, end;
    unsigned i;
    double x;
    
    x = 9999999;
    start = [NSDate timeIntervalSinceReferenceDate];
    for (i = 0; i < N; ++i)
    {
        x = sqrt(x);
    }
    end = [NSDate timeIntervalSinceReferenceDate];
    NSLog(@"sqrt: %f", end - start);
    
    x = 9999999;
    start = [NSDate timeIntervalSinceReferenceDate];
    for (i = 0; i < N; ++i)
    {
        x = sqrt1(x);
    }
    end = [NSDate timeIntervalSinceReferenceDate];
    NSLog(@"sqrt1: %f", end - start);
    
    x = 9999999;
    start = [NSDate timeIntervalSinceReferenceDate];
    for (i = 0; i < N; ++i)
    {
        x = sqrt2(x);
    }
    end = [NSDate timeIntervalSinceReferenceDate];
    NSLog(@"sqrt2: %f", end - start);
}
iMac:Desktop keith$ gcc -O0 test.m -framework Foundation
iMac:Desktop keith$ ./a.out
2008-06-21 23:12:58.151 a.out[70214:10b] sqrt: 1.140779
2008-06-21 23:13:01.990 a.out[70214:10b] sqrt1: 3.837325
2008-06-21 23:13:05.841 a.out[70214:10b] sqrt2: 3.850151
iMac:Desktop keith$ gcc -O2 test.m -framework Foundation
iMac:Desktop keith$ ./a.out
2008-06-21 23:12:43.184 a.out[70192:10b] sqrt: 0.110239
2008-06-21 23:12:43.296 a.out[70192:10b] sqrt1: 0.110598
2008-06-21 23:12:43.407 a.out[70192:10b] sqrt2: 0.110056
Apprentice
Posts: 13
Joined: 2008.05
Post: #3
Oh okay... so all of this comes down to the compiler's command-line optimization options, as described here: http://gcc.gnu.org/onlinedocs/gcc/Optimi...ze-Options ?!
Luminary
Posts: 5,143
Joined: 2002.04
Post: #4
I realized after I posted it that I don't understand why sqrt1 is fast in the optimized version ^_^
DoG
Moderator
Posts: 869
Joined: 2003.01
Post: #5
Apparently, GCC inlines it anyway. If sqrt1() is defined in another module, it becomes slow.

However, I found something else to be very weird.

Code:
// main.m
// define sqrt1() in another module
#include <stdio.h>

#import <Foundation/Foundation.h>
#import <Carbon/Carbon.h>

#define N 200000000

double sqrt1(double d) __attribute__((noinline));


static inline double sqrt2(double d)
{
    return sqrt(d);
}


void RunTest(void)
{
    double start, end;
    unsigned i;
    double x;
    
    x = 9999999;
    start = GetCurrentEventTime();
    for (i = 0; i < N; ++i)
    {
        x = sqrt(x);
        //x+=1.0;
    }
    end = GetCurrentEventTime();
    NSLog(@"sqrt: %f", end - start);
    
    x = 9999999;
    start = GetCurrentEventTime();
    for (i = 0; i < N; ++i)
    {
        x = sqrt1(x);
        //x+=1.0;
    }
    end = GetCurrentEventTime();
    NSLog(@"sqrt1: %f", end - start);
    
    x = 9999999;
    start = GetCurrentEventTime();
    for (i = 0; i < N; ++i)
    {
        x = sqrt2(x);
        //x+=1.0;
    }
    end = GetCurrentEventTime();
    NSLog(@"sqrt2: %f", end - start);
}

int main (int argc, const char * argv[])
{
    RunTest();
    RunTest();
    RunTest();
    return 0;
}

With the x+=1.0 lines commented out, the last round of test results is
Code:
2008-06-22 11:13:40.245 sqrt-test[1563:10b] sqrt: 0.085364
2008-06-22 11:13:41.780 sqrt-test[1563:10b] sqrt1: 1.533590
2008-06-22 11:13:41.866 sqrt-test[1563:10b] sqrt2: 0.085658

With the comment markers removed (so that x+=1.0 actually runs), however, the result becomes
Code:
2008-06-22 11:18:10.338 sqrt-test[1589:10b] sqrt: 0.085514
2008-06-22 11:18:17.052 sqrt-test[1589:10b] sqrt1: 6.713437
2008-06-22 11:18:17.139 sqrt-test[1589:10b] sqrt2: 0.085823

Apparently, the optimized calls aren't really affected by the x+=1.0, but the non-inlined function call becomes 4x slower?
DoG
Moderator
Posts: 869
Joined: 2003.01
Post: #6
Just as a quick update, GCC actually optimizes away everything but the loop increment itself for the sqrt() and sqrt2() cases, hence the large speed difference.

Also, sqrt() is optimized for magic values, so taking sqrt(1.0) will take a shortcut and be much faster than a regular sqrt. Unfortunately, 1.0 is reached after far fewer iterations than N.

In fact, it only takes 52 sqrt() calls when x=2.0 (52 is the number of bits in the mantissa of the double, and this is no coincidence), and 61 when x = 1e200.

So, here's some test code that behaves as one would expect, with the non-inlined call being a bit slower than the rest:

Code:
#include <stdio.h>

#import <Foundation/Foundation.h>
#import <Carbon/Carbon.h>

#define N 50
#define M 1000000

// again, sqrt1() defined in other module
double sqrt1(double d) __attribute__((noinline));


static inline double sqrt2(double d)
{
    return sqrt(d);
}

void RunTest(void)
{
    double start, end;
    unsigned i, j;
    double x, y, z = 0.0;
    
    y = 1.0;
    start = GetCurrentEventTime();
    for (j = 0; j < M; ++j)
    {
        x = 1e200;
        for (i = 0; i < N; ++i)
        {
            x = sqrt(x);
        }
        z+=x;
    }
    end = GetCurrentEventTime();
    NSLog(@"sqrt: %f", end - start);
    
    start = GetCurrentEventTime();
    for (j = 0; j < M; ++j)
    {
        x = 1e200;
        for (i = 0; i < N; ++i)
        {
            x = sqrt1(x);
        }
        z+=x;
    }
    end = GetCurrentEventTime();
    NSLog(@"sqrt1: %f", end - start);
    
    start = GetCurrentEventTime();
    for (j = 0; j < M; ++j)
    {
        x = 1e200;
        for (i = 0; i < N; ++i)
        {
            x = sqrt2(x);
        }
        z+=x;
    }
    end = GetCurrentEventTime();
    NSLog(@"sqrt2: %f", end - start);
    printf("%f\n", z);
}

int main (int argc, const char * argv[])
{
    RunTest();
    RunTest();
    RunTest();
    return 0;
}

Did somebody say benchmarking is easy?