Array access and virtual memory

(This applies to Java and C, but the code is given in Python for readability.)

Is it faster to iterate over multiple separate arrays (tuples) of simple variables?

for i in range(0, n):
	phone = phones[i];
	# ...
	email = emails[i];

Or over a single array of complex variables?

for i in range(0, n):
	phone = people[i].phone;
	# ...
	email = people[i].email;

One array is faster than multiple arrays. This is because an array is stored in a contiguous block of memory. Accessing data in different arrays at the same time can require several different pages to be loaded from virtual memory. Memory access, especially hard drive access, is slow. As your application and data set grows, a significant performance difference may manifest itself.

Arrays from the Second Dimension

When iterating over a multidimensional array with indexes i and j, is it faster to iterate over j inside i?

for i in range(0, n):
	for j in range(0, m):
		cell = cells[i][j];

Or over i inside j?

for j in range(0, m):
	for i in range(0, n):
		cell = cells[i][j];

In Java and C, a multidimensional array[n][m] is stored as contiguous m-block of contiguous n-blocks. Let i be in n and j be in m. For a given i-cell, j-cells will be far apart. For a given j-cell, i-cells will be adjacent. Accessing adjacent values in memory is always faster.

For an array[i][j], putting j in the outer loop and i in the inner loop will significantly reduce potential virtual memory slowdowns.

This is the right way:

for j in range(0, m):
	for i in range(0, n):
		cell = cells[i][j];

The above only makes a difference with large data sets, but I like to cultivate good habits.

3 thoughts on “Array access and virtual memory

  1. Hi Astarr,

    Try Googling:
    http://www.google.com/search?q=multidimensional+array+c%2B%2B

    This is how you declare a multidimensional array in Java:
    int[][] a = new int[10][5];

    This is how you do it in C/C++:
    int a[10][5];

    Because of the hacky way C implements arrays, you can also do this and get exactly the same array:
    int a[10 * 5];

    It’s a bit trickier in C/C++ when objects get involved because of manual memory allocation. I leave that as an exercise for the reader.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.