Discuss multi-dim indexing / array order #7

minnerbe · 2023-12-14T15:44:39Z

This is a replica of #1 from @bogovicj. In order to keep things simple, we decided to keep contributions local, rather than forking this repository.

The two commits that exceed the scope of #1 only fix minor typos and inconsistencies.

* matrix convention and example
* image convention

* change top level title
* add conclusion
* typo fixes

minnerbe · 2023-12-14T15:50:44Z

ArrayOrder.md

+or right) refers to rows vs columns for matrices in mathematics.
+
+* **Define:** Arrays storing matrices in "row-major" give columns stride 1. 
+* **Define:** Arrays storing matrices in "column-major" give rows stride 1. 


Is the use of rows and columns correct here? As far as I understand the concepts introduced here, "row-major" stores rows contiguous in memory, i.e., gives rows stride 1. I might be missing something, though.

Yea, this is backwards. We also may need to put a unit on 1.

In Julia, which is column-major, the strides are given in units of elements (here Float64, aka double):

julia> strides(zeros(5,6,7))
(1, 5, 30)

In Python (NumPy, row-major by default), the strides are given in bytes:

In [5]: np.zeros((5,6,7)).strides
Out[5]: (336, 56, 8)
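
The two outputs above agree once the unit is accounted for. Here is a minimal sketch in plain Python that reproduces both tuples. The helper names `f_order_strides` and `c_order_strides` are illustrative only and are not part of Julia or NumPy; the 8-byte item size assumes Float64 elements.

```python
def f_order_strides(shape):
    """Element strides for column-major (F-order) storage: first index varies fastest."""
    strides, step = [], 1
    for extent in shape:
        strides.append(step)
        step *= extent
    return tuple(strides)


def c_order_strides(shape):
    """Element strides for row-major (C-order) storage: last index varies fastest."""
    return tuple(reversed(f_order_strides(tuple(reversed(shape)))))


shape = (5, 6, 7)
itemsize = 8  # bytes per Float64 / double (assumption)

julia_like = f_order_strides(shape)                                # in elements
numpy_like = tuple(s * itemsize for s in c_order_strides(shape))   # in bytes

print(julia_like)  # (1, 5, 30)
print(numpy_like)  # (336, 56, 8)
```

Dividing the byte strides by the item size gives (42, 7, 1), which is the reverse pattern of the column-major case: the last index has the unit stride instead of the first.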

"Row is contiguous in memory" is equivalent to "column index has stride one" (not intuitive, I know).

"Rows" and "columns" mean things for matrices only. But @mkitti, make a 2D example. Then you'll see strides (1, 5), say. The first dimension indexes rows. Rows have stride 1 = column-major.

I see what you mean by this comment. I interpreted "gives columns stride 1" as "iterating within a column (i.e., changing the row index) has stride 1". In my view, my interpretation is consistent with your definition above:

the stride of a dimension is the (positive) step in the flat array that corresponds to the adjacent element along that dimension.

What about replacing "columns" and "rows" by "second" and "first" index, respectively?
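
To make the two readings concrete, here is a small sketch of the stride definition quoted above, applied to a hypothetical 5 x 6 matrix indexed as (row, column). The function `flat_index` and the variable names are illustrative, not taken from ArrayOrder.md.

```python
def flat_index(index, strides):
    """Offset into the flat array, in elements, for a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


# A 5 x 6 matrix indexed as (row, column).
column_major = (1, 5)  # first (row) index has stride 1
row_major = (6, 1)     # second (column) index has stride 1

# Step to the adjacent element along each dimension, starting from (2, 3).
for name, strides in [("column-major", column_major), ("row-major", row_major)]:
    here = flat_index((2, 3), strides)
    first_step = flat_index((3, 3), strides) - here   # first index + 1
    second_step = flat_index((2, 4), strides) - here  # second index + 1
    print(name, "first-index step:", first_step, "second-index step:", second_step)
```

In the row-major case the second index has stride 1, so the elements of one row sit next to each other in the flat array. Phrasing the definitions in terms of the first and second index avoids having to decide whether "rows have stride 1" refers to the row index or to the elements within a row.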

d-v-b · 2023-12-18T16:01:20Z

ArrayOrder.md

+## Multi-dimensional array indexing
+
+Zarr stores multi-dimensional arrays into regularly sized chunks.
+Chunks are themselves multi-dimensional arrays of a smaller size than


...a smaller size than

not necessarily. you can have a zarr array with 1 chunk
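
A quick sketch of the single-chunk case, using plain ceiling division rather than any Zarr API. The function names `chunk_grid_shape` and `chunk_count` are hypothetical.

```python
def chunk_grid_shape(array_shape, chunk_shape):
    """Number of chunks along each dimension (ceiling division)."""
    return tuple(-(-a // c) for a, c in zip(array_shape, chunk_shape))


def chunk_count(array_shape, chunk_shape):
    """Total number of chunks needed to cover the array."""
    total = 1
    for n in chunk_grid_shape(array_shape, chunk_shape):
        total *= n
    return total


print(chunk_count((100, 100), (32, 32)))    # 16 chunks, each smaller than the array
print(chunk_count((100, 100), (100, 100)))  # 1 chunk, the same size as the array
```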

d-v-b · 2023-12-18T16:03:13Z

ArrayOrder.md

+Zarr stores multi-dimensional arrays into regularly sized chunks.
+Chunks are themselves multi-dimensional arrays of a smaller size than
+the complete multidimensional array and are stored as a 1D array of
+values, called a "flattened" array.


aren't most (all?) n-dimensional arrays stored as 1D arrays of values? talking about zarr implies that this is a zarr thing, rather than a storing-arrays-in-computers thing.

minnerbe · 2024-01-24T18:39:23Z

Just a quick check-in: where are we on this issue @bogovicj, @d-v-b? Is there a need for a more focused discussion, e.g., over zoom?

bogovicj · 2024-01-24T21:05:02Z

I'll revisit this next week - happy to zoom chat if it will be useful

mkitti · 2024-06-06T23:04:53Z

posts/2024-05-22-multi-dim-arrays.qmd

+
+## Appendix
+
+### Programming languages


It is not clear to me that these orderings can be applied to "languages" as a whole. Most languages do not have multidimensional indexing or data structures at their base level. Rather, it is often libraries built and used with those languages that implement multidimensional data structures and indexing.

For example, the C++ library Eigen defaults to column-major, F-order for storage:
https://eigen.tuxfamily.org/dox/group__TopicStorageOrders.html

What is usually discussed with respect to languages is the storage order of the data rather than just the indexing order. Here things are murky, since languages such as Java do not guarantee contiguous memory storage. NumPy supports storage in either F or C order.

My suggestion is to list libraries such as NumPy, Eigen, or imglib2 rather than languages here.

I might just scrap the whole table, since I'm less and less sure what the added value is. The scope of this may now be such that anyone that this article is useful for already knows what's in that table and more details of the type you're pointing out...

I think it might be useful for someone moving data from NumPy to imglib2 to see what the default order is of those two libraries.
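
As a rough illustration of why the default order matters when handing a buffer from one library to another, the sketch below assumes one library defaults to last-index-fastest (NumPy's C order) and the other to first-index-fastest (assumed here for imglib2; check the library's documentation before relying on it). The functions `c_offset` and `f_offset` are hypothetical helpers in plain Python, not calls into either library.

```python
from itertools import product


def c_offset(index, shape):
    """Flat offset when the last index varies fastest (C order)."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def f_offset(index, shape):
    """Flat offset when the first index varies fastest (F order)."""
    offset, step = 0, 1
    for i, n in zip(index, shape):
        offset += i * step
        step *= n
    return offset


zyx_shape = (2, 3, 4)                 # indexed (z, y, x) in a C-order library
xyz_shape = tuple(reversed(zyx_shape))  # indexed (x, y, z) in an F-order library

# The same flat buffer is addressed by both conventions if shape and index are reversed.
same = all(
    c_offset((z, y, x), zyx_shape) == f_offset((x, y, z), xyz_shape)
    for z, y, x in product(*(range(n) for n in zyx_shape))
)
print(same)  # True
```

Under these assumptions no data needs to be copied or transposed; only the order in which the dimensions are listed changes.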

d-v-b · 2024-06-07T09:36:29Z

posts/2024-05-22-multi-dim-arrays.qmd

+array where `i` is the "first" index, and `k` is the "last" index. Here, we will consider only the non-negative integers as
+valid indexes for arrays, though different contexts may use a different index set.
+
+Multi-dimensional arrays are often stored as one-dimensional (1D), or "flat," arrays that are interpreted, or "reshaped," into


I think it's actually obligatory to store arrays in a 1D representation

By flat / 1D I mean that they're stored in contiguous memory, and that's not necessarily true of zarr arrays for example. While every chunk may be contiguous 1D, the "whole array" need not be.

nD arrays could be stored as an Iliffe vector. That's not particularly smart, but it is common when one does not have a supporting library.
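
For readers unfamiliar with the term, here is a small sketch contrasting the two layouts with Python lists. The lists only stand in for memory buffers, and all names are illustrative.

```python
rows, cols = 3, 4

# Iliffe vector: an outer list of references to separately allocated rows.
iliffe = [[r * cols + c for c in range(cols)] for r in range(rows)]

# Flat storage: one buffer plus index arithmetic (row-major here).
flat = [r * cols + c for r in range(rows) for c in range(cols)]


def flat_get(buffer, r, c, n_cols):
    """Look up element (r, c) in a row-major flat buffer."""
    return buffer[r * n_cols + c]


# Both layouts hold the same values at the same (r, c) positions.
matches = all(
    iliffe[r][c] == flat_get(flat, r, c, cols)
    for r in range(rows)
    for c in range(cols)
)
print(matches)  # True

# Nothing forces the rows of an Iliffe vector to share a length or to be adjacent.
iliffe[1].append(99)
print([len(row) for row in iliffe])  # [4, 5, 4]
```

The last two lines show why strides are not well defined for this layout: there is no single flat array to step through.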

d-v-b · 2024-06-07T09:39:42Z

posts/2024-05-22-multi-dim-arrays.qmd

+
+Two-dimensional images are often stored as arrays where two dimensions vary the horizontal and vertical positions of the
+samples, and as a result these dimensions should be displayed horizontally and vertically, respectively.  Most formats for
+storing "natural" images store data such that the "horizontal axis" / rows have a smaller stride than the "vertical axis" /


what is a natural image?

Tough to define precisely, but roughly "an image of physical objects that a human might see using the unaided eye". It's a pretty common term in computer vision.

Used here in contrast to "medical" or "microscopic" images, which are not "natural" images

It might be useful here to clarify that "horizontal" and "vertical" are relative to the camera sensor, not the subject. Landscape and portrait images of the same subject should probably not be displayed based on the memory layout.

bogovicj and others added 8 commits November 16, 2023 11:40

rough start to array order

1deb379

array ordering progress

b8fbf89

* matrix convention and example
* image convention

multi-dim arr indexing

135bfa5

* change top level title
* add conclusion
* typo fixes

typo fix, cartesian coords

7d0eb84

typo fix

d872033

flesh out grouping, first=left and last=right, interpretation

74daf99

Fix a few typos

6f8f165

Unify presentation of arrays

d680393

minnerbe commented Dec 14, 2023

d-v-b reviewed Dec 18, 2023

bogovicj added 2 commits December 19, 2023 17:14

array order: fix typos, add language indicating limited scope

695e01f

array order: add note on (co)lexicographic order

caf2644

bogovicj added 7 commits May 1, 2024 15:26

Merge remote-tracking branch 'origin/main' into arrayOrder

07172b9

refactor: make array order post

1fdca26

Merge branch 'main' into arrayOrder

993c869

rewrite of array order post (draft)

2d6cd37

rm definitions

7512503

authors and description

5226806

more examples and edits

4d5e6b4

mkitti reviewed Jun 7, 2024

d-v-b reviewed Jun 7, 2024

bogovicj added 3 commits June 24, 2024 15:19

add Mark as author to multi-dim arrays

94c960b

multi-dim arrays rm programming languages table

3cd7619

multi-dim arrays assorted edits

07a2423

* rework stride and related definitions
* clearer recommendations re memory layout
* recommendations re dimension naming

minnerbe commented Dec 14, 2023

minnerbe Dec 14, 2023

mkitti Dec 14, 2023

bogovicj Dec 18, 2023

minnerbe Dec 18, 2023

d-v-b Dec 18, 2023

d-v-b Dec 18, 2023

minnerbe commented Jan 24, 2024

bogovicj commented Jan 24, 2024

mkitti Jun 6, 2024

bogovicj Jun 17, 2024

mkitti Jun 17, 2024

d-v-b Jun 7, 2024

bogovicj Jun 17, 2024

mkitti Jun 17, 2024

d-v-b Jun 7, 2024

bogovicj Jun 17, 2024

d-v-b Jun 25, 2024
