Even *MORE* endian issues

Member
Posts: 161
Joined: 2005.07
Post: #16
Just because I'm in an explaining mood, I'll cover why arrays are created using arr[size], and the elements are numbered 0 through size-1.

When the program goes to create space on the stack for an array, it needs to know two important things:
1) the number of items you want
2) the size (number of bytes) of a single item

In C, when you allocate memory yourself with malloc, you need to tell it these things explicitly by taking sizeof(datatype) and multiplying that number by the number of items you want. With an array declaration (or new[] in C++), the compiler figures it out automatically from the datatype you're declaring (when you say int b[5], it knows you want 5 items, and each item should be the size of an int).

With the int b[5] example, that would create memory like so:
Code:
[____|____|____|____|____]
0123 4567 .... .... .... (etc.)
An int is generally 4 bytes, and you want 5 items, so it reserves 20 bytes for you.

Now, let's say you want to access an item in this newly allocated array. Both C and C++ automatically multiply the index by the size of a single item to get the byte offset of that item. For example, with int b[5] and accessing b[3], it would calculate it like this:
Code:
index is 3, and each item is 4 bytes, so start accessing at byte 12:

[____|____|____|____|____]
                ^ right there
That's the fourth item in the array! So that's how things work internally. Hopefully that will help you understand things a bit better. :)
Luminary
Posts: 5,143
Joined: 2002.04
Post: #17
imikedaman Wrote: To continue the old conversation about why using b[1] didn't cause a crash, here's a general trace of what the program is doing internally:
Code:
short henry;                 // 1)

union {                      // 2)
    short a;
    unsigned char b[1];
} george, jake;

jake.a = 72;                 // assign through the short member

george.b[0] = jake.b[1];     // 3)
george.b[1] = jake.b[0];     // 4)

henry = george.a;

cout << henry << endl;
Code:
For reference, let's pretend this is the chunk
of memory available for your program:
[_______________________]

1) creates space for two bytes:
[__|____________________]
  ^henry, 2 bytes for a short
2) creates space for george and jake:
[__|___|___|____________]
      ^ 3 bytes each, for a 2-byte short and 1-char array
3) george.b[0] is here in memory:
[__|___|___|____________]
      ^
jake.b[1] is here:
[__|___|___|____________]
            ^
4) Likewise, george.b[1] is here:
[__|___|___|____________]
        ^
In each case, b[1] is within your program's chunk of memory, so it does not crash. However, the memory you are accessing does not belong to where you think it does. george.b[1] is actually the first byte of jake's short (a), and jake.b[1] is probably random garbage.

Or, at least that's how I was taught to trace it. Either way, that's why you needed to use b[2] to avoid the problem.

The general principle here is OK, but the execution is a little off. For a start, george & jake are unions, so the b array overlaps the a. There's also the potential that the compiler will choose to align these structures to four bytes, or keep them in registers.

The net result is that b[1] in this case does exactly the same as it would if b were declared to have length 2, so the program works correctly. A good compiler would warn about overrunning that array, though.
Member
Posts: 161
Joined: 2005.07
Post: #18
OneSadCookie Wrote: The general principle here is OK, but the execution is a little off. For a start, george & jake are unions, so the b array overlaps the a.
Oh right, hehe. For whatever reason I treated them like structs.
Jones
Unregistered
 
Post: #19
Ooo... fancy diagrams.

Thanks, imikedaman, for your lesson in Memory Allocation 101. ;)

Well, the code works now, anyway. Unfortunately, the saved file is exactly 24 KB bigger than the file read in. OOPs. Hehe, get it? It's a class, and it's broken, so "OOPs". Hehe, yeeaaaah....

Something broke, but I'll fix it if it kills me.

Question:

It would seem that reading/writing "float[3] * 1" is the same as "float * 3", as I get the exact same result. Is either method "better" or "more accepted as correct"?

Thanks!
Luminary
Posts: 5,143
Joined: 2002.04
Post: #20
Jones Wrote: It would seem that reading/writing "float[3] * 1" is the same as "float * 3", as I get the exact same result. Is either method "better" or "more accepted as correct"?

Well, I have no idea what you meant by those two pieces of non-code, but whatever you meant, it's wrong :P
Jones
Unregistered
 
Post: #21
OneSadCookie Wrote: Well, I have no idea what you meant by those two pieces of non-code, but whatever you meant, it's wrong :P

Heh, is my code *that* predictable? ;)

You know how, when reading and writing with fread and fwrite, one must specify the size of one item and how many items to read. Essentially, any two numbers whose product is the total number of bytes will work here. But is it better to say:

Code:
/*myArray holds 3 int items, for example. */
fread(myArray, sizeof(int[3]), 1, myStream);

Or:

Code:
fread(myArray, sizeof(int), 3, myStream);

Thanks!
Moderator
Posts: 1,140
Joined: 2005.07
Post: #22
Option 2. That's what the third parameter is for.
Jones
Unregistered
 
Post: #23
akb825 Wrote: Option 2. That's what the third parameter is for.

Ah good, that's how I've been doing it so far.