Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discuss multi-dim indexing / array order #7

Open
wants to merge 20 commits into
base: main
Choose a base branch
from
Open
Changes from 17 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
283 changes: 283 additions & 0 deletions posts/2024-05-22-multi-dim-arrays.qmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,283 @@
---
title: "Multi-dimensional array indexing"
description: "Recommendations to ensure that array data and
index/coordinate data can be consistently interpreted
across software using different indexing conventions."
author:
- name: "John Bogovic"
- name: "Davis Bennett"
- name: "Michael Innerberger"
- name: "Stephan Saalfeld"
- name: "Virginia Scarlett"
date: "5/10/2024"
---

## Summary and recommendations

Conventions for indexing and storing multi-dimensional data vary across programming languages and software libraries.
Data sharing across software with different systems is technically simple *so long as the convention used is clearly
documented.* For example, one software library might store and save point coordinates as `(x,y,z)`, while another
stores them as `(z,y,x)`. Interoperability is easy (just "reverse the numbers") only if it is possible to determine
what convention was used to produce the data, and software can determine what convention is used only if it is
clearly and explicitly communicated.

To enable seamless data and metadata sharing, software and storage formats should clearly and explicitly communicate:

1. the relationship of array indexes to memory layout
2. the semantic meaning of array dimensions
3. the relationship of coordinate data to array dimensions

Ambiguities should be clarified by (1) explicitly documenting the array indexing convention (most often C-order or F-order),
and (2) referring to array dimensions and (3) coordinates with a consistent order and/or with labels that are
independent of order (e.g. `['x', 'y', 'z', 't', ...]`, see
[xarray](https://docs.xarray.dev/en/latest/getting-started-guide/why-xarray.html))

::: {.callout-tip collapse="true" appearance="minimal"}
# Examples

These are provided as good examples, not as the only way to communicate the above information.

#### Array size and dimension interpretation

> 3D displacement fields are stored as 4D arrays, whose dimensions are ordered `[3,X,Y,Z]` where the first dimension contains
> displacement vector components, and varies fastest in memory (i.e. F-order).

#### Point coordinates

> Points are stored as 3D coordinates in a CSV file ordered `X,Y,Z`
> where the `X` dimension varies fastest in memory.

:::


## Multi-dimensional array indexing and memory layout

Elements of an n-dimensional array are indexed by an ordered n-tuple, each element of which indexes into one of the array's
dimensions. We will also call elements of this tuple "indexes." For example, the tuple `(i,j,k)` indexes a three-dimensional
array where `i` is the "first" index, and `k` is the "last" index. Here, we will consider only the non-negative integers as
valid indexes for arrays, though different contexts may use a different index set.

Multi-dimensional arrays are often stored as one-dimensional (1D), or "flat," arrays that are interpreted, or "reshaped," into
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's actually obligatory to store arrays in a 1D representation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By flat / 1D I mean that they're stored in contiguous memory, and that's not necessarily true of zarr arrays for example. While every chunk may be contiguous 1D, the "whole array" need not be.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nD arrays could be stored as a Iliffe vector. That's not particularly smart, but it is a common when one does not have a supporting library.

a multi-dimensional array by mapping the n-tuple of coordinates to a single index into the 1D array. The two most common
conventions for this mapping are C-order and F-order.

In this article, we will refer to n-dimensional arrays as simply "arrays" and 1D arrays as "flat."


#### Reshaping arrays and stride

One can think of reshaping a 1D array as a recursive process of grouping a number of adjacent elements. An n-dimensional array
can be reshaped to an (n+1)-dimensional array by grouping a number adjacent elements belonging to the same dimension.

* **Define:** the *stride* of a dimension is the (positive) step in the flat array that corresponds to the adjacent element
along that dimension.

The stride of a dimension is the product of sizes of all previous dimensions.

::: {.callout-note collapse="true" appearance="minimal"}
# Note on "units" of stride

In this article, we will express stride in terms of elements. That is, stride 1 means the adjacent element in memory,
no matter the size in bytes per element.

In some other contexts, stride is expressed in terms of bytes. For example, an array containing elements of type `float32`,
the smallest stride possible would be 4 bytes: the next element is "4 bytes away".

:::

* **Define:** the "fastest" or "inner" dimension is the dimension with a stride of 1.
* **Define:** the "slowest" or "outer" dimension is the dimension with the largest stride.

* **Define:** the *size* of a dimension is the number of grouped elements.

The size of an n-dimensional array is described by a list of sizes per dimension. For example: `[3, 5, 7].` In this example,
the *first* dimension has size `3`, the *last* dimension has size `7`.

::: {.callout-tip collapse="true" appearance="minimal"}
# Example

Suppose we have this flat array:

`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]`


and three dimensions having sizes 2, 3, and 4. Their strides are 1, 2, and 6 where `2*3 = 6`. There is no grouping to be done
for a dimensions of stride 1, so the first step is to join elements into groups of 2 (the second stride):

`[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11), (12, 13), (14, 15), (16, 17), (18, 19), (20, 21), (22, 23)]`

Next, group elements of the new list (which are themselves groups) into groups of 3 (the largest stride).

`[((0, 1), (2, 3), (4, 5)), ((6, 7), (8, 9), (10, 11)), ((12, 13), (14, 15), (16, 17)), ((18, 19), (20, 21), (22, 23))]`

Notice:

* The element adjacent to `0` in the inner group is `1`, hence stride `1`.
* The element adjacent to `0` in the intermediate grouping is `2`, hence stride `2`.
* The element adjacent to `0` in the outer grouping is `6`, hence stride `6`.
:::


### C- and F-order

The terms C-order and F-order come from conventions for indexing arrays in the C and Fortran programming languages.

* **C-order indexing:**: the fastest dimension corresponds to the last index, the slowest dimension corresponds to the first index
* **F-order indexing:**: the slowest dimension corresponds to the last index, the fastest dimension corresponds to the first index


::: {.callout-tip collapse="true" appearance="minimal"}
# Examples

Using the same flat array as the above example:

`[((0, 1), (2, 3), (4, 5)), ((6, 7), (8, 9), (10, 11)), ((12, 13), (14, 15), (16, 17)), ((18, 19), (20, 21), (22, 23))]`

* The size of this array using C-order is: `[4, 3, 2]`
* The size of this array using F-order is: `[2, 3, 4]`
* The index of element `5` using C-order is: `[0][2][1]`
* The index of element `5` using F-order is: `[1][2][0]`

:::

If this array is indexed using C-order, then the last index has stride `1`. As a result, the middle index will have stride `7`,
and the *first* dimension will have stride `7*5 = 35`.

Consider again an array of size `[3, 5, 7]`, but using F-order indexing. Again, the *first* dimension has size `3`, the *last*
dimension has size `7`. Now, however, using F-order, the *first* dimension will have stride `1`, the *second* dimension will
have stride `3`, and the *third* dimension will have stride `3*5 = 15`.

## Dimension semantics

The dimensions of a multi-dimensional array sometimes come with additional semantics depending
on what data they store. We discuss interpreting arrays as matrices and images.

### Matrices

Matrices are often represented as a 2D array of numbers. Horizontal groupings of these numbers are called "rows" and vertical
groupings are called "columns." In mathematics, the entries of a matrix $A$ are denoted $a_{ij}$.

**Recommendation:** Software should be clearly communicate when arrays represent matrices, and follow the *Matrix indexing
convention*. Software and documentation should use the terms "row-major" and "column-major" only when referring to matrices.

::: {.callout-note appearance="minimal"}
# Matrix indexing convention

The first index of a matrix ($i$) refers to rows, the second index ($j$) refers to columns.
:::


::: {.callout-tip collapse=true appearance="minimal"}
# Row- and column-major

The terms row- and column-major derive for the storage of matrices. Defining these terms first depends on first agreeing which
index (first or last) refers to rows vs columns for matrices in mathematics.

* **Consequence:** Given matrix indexing conventions, C-order storage is equivalent to "row-major".
* **Consequence:** Given matrix indexing conventions, F-order storage is equivalent to "column-major".

#### example

As a result of the *Matrix Indexing Convention* the size of a matrix with `2` rows and `3` columns is `[2, 3]` for both C- and
F-orderings. Consider:

```
column 0 column 1 column 2
row 0 [ 0 1 2 ]
row 1 [ 3 4 5 ]
```

* The flat C-ordered array will be: `[0, 1, 2, 3, 4, 5]`
* The flat F-ordered array will be: `[0, 3, 1, 4, 2, 5]`

To reiterate, the multi-dimensional indexes for both C- and F-order are:

```
column 0 column 1 column 2
row 0 [ (0,0) (0,1) (0,2) ]
row 1 [ (1,0) (1,1) (1,2) ]
```

because, the row index is *always* the first index.

:::

For matrices, C- and F- order indexing will agree on the ordering of an array's indexes and dimensions, but will store the
arrays differently when flattened. This is because the matrix indexing convention attaches semantics (row/column) to
the index position (first/second).

### Images

Two-dimensional images are often stored as arrays where two dimensions vary the horizontal and vertical positions of the
samples, and as a result these dimensions should be displayed horizontally and vertically, respectively. Most formats for
storing "natural" images store data such that the "horizontal axis" / rows have a smaller stride than the "vertical axis" /
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is a natural image?

Copy link
Collaborator

@bogovicj bogovicj Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tough to define precisely, but roughly "an image of physical objects that a human might see using the unaided eye" It's a pretty common term in computer vision

Used here in contrast to "medical" or "microscopic" images, which are not "natural" images

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be useful here to clarify that "horizontal" and "vertical" are relative to the camera sensor, not the subject. Landscape and portrait images of the same subject should probably not be displayed based on the memory layout.

columns. Note: while rows have smaller stride than columns, it is common for rows not to have stride 1, for example when using
"interleaved" color components, the "color" dimension will have a stride of 1. Biomedical images do not typically have
"horizontal" or "vertical" dimensions, but may have other semantics (e.g., anatomical, or related to the imaging system).

**Recommendation:** Software and storage formats should be explicit about any semantics that are attached to dimensions and
not use stride as a proxy for or to imply semantics. Documentation may specify semantics in terms of fastest/slowest dimension,
but must explicitly communicate that fact.

For images, C- and F-order will historically disagree on the ordering of an array's indexes and dimensions, but will store the
arrays the same way when flattened. This is because the convention for natural images historically
associates dimension semantics to the memory layout (the fastest dimension has horizontal semantics).

## Coordinate data

Coordinate data are data that refer to "locations" of the multidimensional array. They may be discrete,
and refer to specific samples, or continuous, referring to points "in-between" array locations.

::: {.callout-tip collapse="true" appearance="minimal"}
# Coordinate data examples

* Point annotations
* "Structure `A` is located at point `(x,y)`"
* Bounding boxes / ROIs
* "Crop the image to bounding box `(min_x, min_y, width, height)`
* Other collections of points
* "My neuron skeleton consists of points `[[z0, y0, x0], ..., [zN, yN, xN]]`

:::




### cartesian coordinates

In contrast to the matrix row/column index convention, cartesian
coordinates label the horizontal and vertical dimensions `x` and `y`,
respectively. Referencing positions in the 2D plane is done using
ordered two-tuples `(x,y)`, where `x` is conventionally the left-index
and `y` is the right-index. Using cartesian coordinates, varying
the left dimensions varies horizontal position, and varying the right
dimension varies the vertical position.

Applications and workflows that make use of image geometry most commonly use cartesian coordinates.


## References

1) [hdf5 dataspaces](https://docs.hdfgroup.org/hdf5/develop/_h5_s__u_g.html#sec_dataspace)
2) [zarr arrays](https://zarr-specs.readthedocs.io/en/latest/v2/v2.0.html#arrays)
3) [nrrd axis ordering](https://teem.sourceforge.net/nrrd/format.html#general.4)
4) [n5 ordering discussion](https://github.com/saalfeldlab/n5/issues/31)
5) [multi-dimensional arrays in vigra](http://ukoethe.github.io/vigra/doc-release/vigranumpy/index.html#more-on-the-motivation-and-use-of-axistags)

## Appendix

### Programming languages
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear to me that these ordering can be applied to "languages" as a whole. Most languages do not have multidimensional indexing or data structures at their base level. Rather it is often libraries built and used with those languages that implement multidimensional data structures and indexing.

For example, the C++ library Eigen defaults to column-major, F-order for storage:
https://eigen.tuxfamily.org/dox/group__TopicStorageOrders.html

What is usually discussed with respect to languages is the storage order of the data rather than just the indexing order. Here things are murky since languages such as Java do not guarantee contiguous memory storage. Numpy supports storage in either F or C order.

My suggestion is to list libraries such as NumPy, Eigen, or imglib2 rather than languages here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might just scrap the whole table, since I'm less and less sure what the added value is. The scope of this may now be such that anyone that this article is useful for already knows what's in that table and more details of the type you're pointing out...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be useful for someone moving data from NumPy to imglib2 to see what the default order is of those two libraries.


| C-order | F-order |
| ------- | ------- |
| C | Fortran |
| C++ | Java |
| Python | Matlab |

### Related terms

| C-order | F-order |
| --------------------- | ------------------------ |
| lexicographical order | co-lexicographical order |
| row-major | column-major |
| matrix indexing | cartesian indexing |