Reading and Writing Tiled Images

Dirk Farin edited this page Oct 30, 2024 · 7 revisions

Introduction

When reading or writing high-resolution images, it may not be feasible to keep the whole image in memory. For those situations, libheif supports reading and writing images as individual tiles. When reading, only the accessed tiles are read from the file and decoded. Similarly, when writing, each tile can be encoded separately and added to the image file. The tiles can even be written in arbitrary order.

libheif supports three methods to store tiled images:

  • grid images - This format saves a tiled image as a collection of small images. Each tile is stored in the HEIF file as a separate image, and these images are then combined into one large grid image. It has a rather large overhead because metadata has to be stored for each tile image. The format is also limited to 65535 tiles. This is the format used by most mobile phone cameras, which usually split the image into 512x512 tiles.
  • unci images - These are "uncompressed" images according to ISO 23001-17. This format has tiling built in with little overhead, but it can only store images compressed with lossless compression algorithms (deflate, brotli).
  • tili images - This is a currently proprietary HEIF extension that, similarly to grid, compresses each tile independently, but avoids the metadata overhead and works for practically unlimited image sizes. It also supports skipped (no-data) tiles and higher-dimensional images (like multi-spectral images or 3D volumes). It is optimized for efficient streaming of the image data over networks. More information about tili images can be found here.

Reading any of these tiling formats with libheif uses the same API, so you do not need to know which format is used internally in the HEIF file.

Note: the functions for writing 'unci' and 'tili' images are still in the heif_experimental.h header until the API is considered stable. If you installed libheif through your distribution's package manager, this file might not be installed. In that case, compile libheif from source and enable WITH_EXPERIMENTAL_FEATURES in your cmake config.

Writing Tiled Images

Irrespective of the tiling format used, you will always first generate the tiled image and then add the individual tiles to it. The API for generating the tiled image depends on the tiling format used, while adding the tiles to the image is always done with the same API call.

Writing 'grid' Images

First, generate the grid image with

struct heif_error heif_context_add_grid_image(struct heif_context* ctx,
                                              uint32_t image_width,
                                              uint32_t image_height,
                                              uint32_t tile_columns,
                                              uint32_t tile_rows,
                                              const struct heif_encoding_options* encoding_options,
                                              struct heif_image_handle** out_grid_image_handle);

Note that image_width and image_height do not have to be integer multiples of the tile size. If they are smaller than the area covered by the tiles, the extra data at the right and bottom borders is removed from the decoded image.
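Because the image size need not be a multiple of the tile size, the number of tile columns and rows is a ceiling division. The following is a minimal sketch; `tiles_needed` and `grid_tile_count_ok` are hypothetical helper names that are not part of libheif, and the 65535 limit is the one stated in the introduction.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper, not part of libheif: number of tiles needed to cover
// image_size pixels with tiles of tile_size pixels (ceiling division written
// so that it cannot overflow).
static uint32_t tiles_needed(uint32_t image_size, uint32_t tile_size)
{
  assert(tile_size > 0);
  return image_size / tile_size + (image_size % tile_size != 0 ? 1u : 0u);
}

// Returns 1 if the total tile count is at least 1 and within the 65535-tile
// limit mentioned for grid images, 0 otherwise.
static int grid_tile_count_ok(uint32_t tile_columns, uint32_t tile_rows)
{
  uint64_t total = (uint64_t)tile_columns * tile_rows;
  return total >= 1 && total <= 65535;
}
```

For example, a 4000 pixel wide image with 512 pixel wide tiles needs `tiles_needed(4000, 512)` = 8 tile columns; the last column is only partially visible.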

Then add the image tiles one by one with a call to

struct heif_error heif_context_add_image_tile(struct heif_context* ctx,
                                              struct heif_image_handle* tiled_image,
                                              uint32_t tile_x, uint32_t tile_y,
                                              const struct heif_image* image,
                                              struct heif_encoder* encoder);

For grid images, it is required that all image tiles are filled. Skipped (no-data) tiles are not allowed. You may add the tiles in any order.

Writing unci Images (ISO 23001-17)

For tiled unci images, you first create the unci image with

struct heif_unci_image_parameters {
  int version;

  // --- version 1

  uint32_t image_width;
  uint32_t image_height;

  uint32_t tile_width;
  uint32_t tile_height;

  enum heif_metadata_compression compression;

  // ...
};

struct heif_error heif_context_add_unci_image(struct heif_context* ctx,
                                              const struct heif_unci_image_parameters* parameters,
                                              const struct heif_encoding_options* encoding_options,
                                              const struct heif_image* prototype,
                                              struct heif_image_handle** out_unci_image_handle);

The prototype parameter is an image with the same color channels and settings as you will use for the individual tiles. This dummy image is not coded; it is only used to specify the image format. You can use very small image planes (1x1) in the prototype image because their size is not used. Unlike for grid images, the image_width and image_height of an unci image must be integer multiples of the tile size. The compression parameter selects the lossless compression algorithm.
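Since unci images require exact multiples, it can be useful to check the dimensions before creating the image. This is a small sketch with hypothetical helper names (not libheif functions); whether padding the image up to the next multiple is acceptable depends on your application.

```c
#include <stdint.h>

// Hypothetical helper: returns 1 if both image dimensions are exact
// multiples of the tile size, as required for unci images.
static int unci_tiling_is_valid(uint32_t image_width, uint32_t image_height,
                                uint32_t tile_width, uint32_t tile_height)
{
  if (tile_width == 0 || tile_height == 0) {
    return 0;
  }
  return image_width % tile_width == 0 && image_height % tile_height == 0;
}

// Hypothetical helper: smallest multiple of tile_size that is >= image_size.
// The result is 64 bit because rounding up may exceed the uint32_t range.
static uint64_t round_up_to_tile_multiple(uint32_t image_size, uint32_t tile_size)
{
  if (tile_size == 0) {
    return 0;
  }
  uint64_t tiles = ((uint64_t)image_size + tile_size - 1) / tile_size;
  return tiles * tile_size;
}
```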

Now, you can add tiles to the unci image as above with

struct heif_error heif_context_add_image_tile(struct heif_context* ctx,
                                              struct heif_image_handle* tiled_image,
                                              uint32_t tile_x, uint32_t tile_y,
                                              const struct heif_image* image,
                                              struct heif_encoder* encoder);

The tile_x, tile_y parameters specify the tile position as indices (0;0), (0;1), (0;2). These are not pixel coordinates.
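The relation between tile indices and pixel coordinates is a plain division or multiplication by the tile size. The helpers below are a hypothetical illustration and not part of the libheif API.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: index of the tile (column or row) that contains the
// given pixel coordinate.
static uint32_t tile_index_of_pixel(uint32_t pixel, uint32_t tile_size)
{
  assert(tile_size > 0);
  return pixel / tile_size;
}

// Hypothetical helper: first pixel coordinate covered by the tile with the
// given index. Computed in 64 bit so the product cannot overflow.
static uint64_t tile_origin(uint32_t tile_index, uint32_t tile_size)
{
  return (uint64_t)tile_index * tile_size;
}
```

With 512 pixel tiles, the pixel column 1300 lies in tile column 2, which starts at pixel 1024.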

Writing tili Images

For tiled tili images, you also first create the tili image and then add the individual tiles to it. The image is first created with:

struct heif_tiled_image_parameters {
  int version;

  // --- version 1

  uint32_t image_width;
  uint32_t image_height;

  uint32_t tile_width;
  uint32_t tile_height;

  uint32_t compression_type_fourcc;

  uint8_t offset_field_length;   // one of: 32, 40, 48, 64 (bits)
  uint8_t size_field_length;     // one of:  0, 24, 32, 64 (bits)

  uint8_t number_of_extra_dimensions;  // 0 for normal images, 1 for volumetric (3D), ...
  uint32_t extra_dimensions[8];        // size of extra dimensions (first 8 dimensions)

  uint8_t tiles_are_sequential;  // (bool) hint whether all tiles are added in sequential order
};

struct heif_error heif_context_add_tiled_image(struct heif_context* ctx,
                                               const struct heif_tiled_image_parameters* parameters,
                                               const struct heif_encoding_options* options,
                                               struct heif_image_handle** out_tiled_image_handle);

The compression_type_fourcc corresponds to the image item type usually stored in the HEIF file, e.g. hvc1 for H.265 or av01 for AV1 (AVIF).

The tili image contains a table with offset pointers to the individual tiles in the file. You can choose the bit-length of these offsets and the tile sizes. When setting the size_field_length to 0, no tile size will be stored. Note that omitting the tile sizes will force the decoder to load the whole offset table when parsing the file, which may be undesirable when the file should be streamed over the network. The size of the table is the combined bit-length of the two fields times the number of tiles.
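The table size formula from the paragraph above can be written down directly. This is a sketch with hypothetical helper names; the allowed field lengths are the ones listed in the struct comments, and num_tiles is assumed to be the total tile count including any extra dimensions.

```c
#include <stdint.h>

// Returns 1 if both field lengths are among the values listed in the
// comments of heif_tiled_image_parameters, 0 otherwise.
static int tili_field_lengths_valid(uint8_t offset_bits, uint8_t size_bits)
{
  int offset_ok = offset_bits == 32 || offset_bits == 40 ||
                  offset_bits == 48 || offset_bits == 64;
  int size_ok = size_bits == 0 || size_bits == 24 ||
                size_bits == 32 || size_bits == 64;
  return offset_ok && size_ok;
}

// Size of the offset table in bytes: combined bit length of the two fields
// times the number of tiles. All allowed lengths are multiples of 8 bits,
// so dividing by 8 is exact for valid inputs.
static uint64_t tili_offset_table_bytes(uint64_t num_tiles,
                                        uint8_t offset_bits, uint8_t size_bits)
{
  uint64_t bits_per_tile = (uint64_t)offset_bits + size_bits;
  return num_tiles * (bits_per_tile / 8);
}
```

For example, one million tiles with 40-bit offsets and 24-bit sizes give a table of 8,000,000 bytes, while 32-bit offsets without sizes need 4,000,000 bytes.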

Now, using the same function as with the other tiling methods, you can add tiles:

struct heif_error heif_context_add_image_tile(struct heif_context* ctx,
                                              struct heif_image_handle* tiled_image,
                                              uint32_t tile_x, uint32_t tile_y,
                                              const struct heif_image* image,
                                              struct heif_encoder* encoder);

You have to use the same heif_encoder with the same settings for all tiles. The tile_x, tile_y parameters specify the tile position as indices (0;0), (0;1), (0;2). These are not pixel coordinates.

Reading Tiled Images

Reading tiled images works the same for all image types. The same code will work with all tiling schemes and even with non-tiled images, which will appear like images consisting of a single tile.

Before decoding a tile, the first step should be to get the tiling information with

struct heif_error heif_image_handle_get_image_tiling(const struct heif_image_handle* handle,
                                                     int process_image_transformations,
                                                     struct heif_image_tiling* out_tiling);

The boolean parameter process_image_transformations indicates whether libheif should take care of all image transformations (rotation, mirroring, cropping) internally, or whether you want to handle them yourself. If this is enabled, libheif will also convert the tile coordinates so that, to the client application, the image geometry looks untransformed.

The above function returns the following tiling information:

struct heif_image_tiling
{
  int version;

  // --- version 1

  uint32_t num_columns;
  uint32_t num_rows;
  uint32_t tile_width;
  uint32_t tile_height;

  uint32_t image_width;
  uint32_t image_height;

  // Position of the top left tile.
  // Usually, this is (0;0), but if a tiled image is rotated or cropped, it may be that the top left tile should be placed at a negative position.
  // The offsets define this negative shift.
  uint32_t top_offset;
  uint32_t left_offset;

  uint8_t number_of_extra_dimensions;  // 0 for normal images, 1 for volumetric (3D), ...
  uint32_t extra_dimension_size[8];    // size of extra dimensions (first 8 dimensions)
};

In general, you should not assume that image_width and image_height are integer multiples of the tile size. They always are for unci images, but not necessarily for the other tiling types. When a tile overlaps the image border, ignore the part that extends beyond the border and do not draw it. libheif will not crop the border tiles.

Internally, tiles may extend beyond the right and bottom borders only. When process_image_transformations is enabled, however, the image may be rotated and cropped, so tiles may extend beyond all four borders. For this reason, the fields top_offset and left_offset indicate how much of the top and left border should be removed when displaying the image.


Tiled image with ignored right and bottom border.


Image has been rotated by 90 degrees. Now the `left_offset` is greater than 0.
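The cropping that the client has to do can be sketched as follows. This assumes that a tile with index (tile_x; tile_y) is placed at pixel position (tile_x * tile_width - left_offset; tile_y * tile_height - top_offset) and is then clipped to the image area, which is one reading of the description above. The structs are local stand-ins for illustration, not the libheif types.

```c
#include <stdint.h>

// Local stand-in holding only the heif_image_tiling fields used here.
struct tiling_info {
  uint32_t tile_width, tile_height;
  uint32_t image_width, image_height;
  uint32_t top_offset, left_offset;
};

// Visible part of one decoded tile. The tile is invisible if width or
// height is 0.
struct visible_rect {
  uint32_t src_x, src_y;   // top-left corner inside the decoded tile
  uint32_t dst_x, dst_y;   // where that corner lands in the image
  uint32_t width, height;  // size of the visible part
};

// Clips the interval [tile_start, tile_start + tile_size) to [0, image_size).
static void clip_axis(int64_t tile_start, uint32_t tile_size, uint32_t image_size,
                      uint32_t* src, uint32_t* dst, uint32_t* length)
{
  int64_t start = tile_start < 0 ? 0 : tile_start;
  int64_t end = tile_start + tile_size;
  if (end > (int64_t)image_size) {
    end = image_size;
  }
  if (end <= start) {
    *src = 0; *dst = 0; *length = 0;
    return;
  }
  *src = (uint32_t)(start - tile_start);
  *dst = (uint32_t)start;
  *length = (uint32_t)(end - start);
}

static struct visible_rect visible_part_of_tile(const struct tiling_info* t,
                                                uint32_t tile_x, uint32_t tile_y)
{
  struct visible_rect r;
  int64_t x0 = (int64_t)tile_x * t->tile_width - (int64_t)t->left_offset;
  int64_t y0 = (int64_t)tile_y * t->tile_height - (int64_t)t->top_offset;
  clip_axis(x0, t->tile_width, t->image_width, &r.src_x, &r.dst_x, &r.width);
  clip_axis(y0, t->tile_height, t->image_height, &r.src_y, &r.dst_y, &r.height);
  return r;
}
```

A viewer would copy only the `width` x `height` region starting at (`src_x`; `src_y`) of the decoded tile to (`dst_x`; `dst_y`) in its output buffer.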

Now you can decode individual tiles with

struct heif_error heif_image_handle_decode_image_tile(const struct heif_image_handle* in_handle,
                                                      struct heif_image** out_img,
                                                      enum heif_colorspace colorspace,
                                                      enum heif_chroma chroma,
                                                      const struct heif_decoding_options* options,
                                                      uint32_t tile_x, uint32_t tile_y);

Like with encoding, tile_x and tile_y specify the tile position index, not the pixel coordinate. The other parameters are the same as for the usual heif_image_handle_decode_image() function. Make sure that the heif_decoding_options value ignore_transformations is set to !process_image_transformations.

Network Streaming

When opening a file, libheif parses the file structure and reads the meta box. When you access a single image tile, it loads only the data of that tile from the file. This is particularly important when you want to stream the image over a network without first downloading the whole file. In that case, you can implement the heif_reader interface so that it downloads the data from the network. Apart from the usual read and seek functions, version 2 of this interface also includes the function heif_reader_range_request_result request_range(uint64_t start_pos, uint64_t end_pos, void* userdata). libheif calls this function to tell the reader which file range it is going to read next. This should be a blocking call in which you can download the data from the network. The advantage over downloading the data in the read() function is that read() may be called for many small chunks, while request_range() requests larger file ranges that are more efficient to download. You may even download more data than requested and let libheif know in the result that this data is available. libheif may then use that extra data if it needs it.
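One simple way to download more than requested is to widen each requested range to whole blocks before fetching it. The sketch below is not libheif code: `byte_range` and `expand_to_blocks` are hypothetical names, and it assumes that end positions are exclusive (one past the last byte), which you should verify against the heif_reader documentation in heif.h.

```c
#include <assert.h>
#include <stdint.h>

// Half-open byte range [start, end). The exclusive end is an assumption
// made for this sketch.
struct byte_range {
  uint64_t start;
  uint64_t end;
};

// Widens [start, end) to block boundaries and clamps it to the file size,
// so that a network reader fetches a few large blocks instead of many
// small ranges.
static struct byte_range expand_to_blocks(uint64_t start, uint64_t end,
                                          uint64_t block_size, uint64_t file_size)
{
  assert(block_size > 0 && start <= end);

  struct byte_range r;
  r.start = (start / block_size) * block_size;

  uint64_t blocks = end / block_size + (end % block_size != 0 ? 1u : 0u);
  if (blocks > UINT64_MAX / block_size) {
    r.end = file_size;             // rounding up would overflow
  } else {
    r.end = blocks * block_size;
  }

  if (r.end > file_size) {
    r.end = file_size;
  }
  if (r.start > r.end) {
    r.start = r.end;               // request lies completely behind the file end
  }
  return r;
}
```

Inside a request_range() implementation, the widened range would be downloaded and its end reported back to libheif as the available range.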

Another, optional, function is void preload_range_hint(uint64_t start_pos, uint64_t end_pos, void* userdata). With this function, libheif lets you know of a file range that it may need in the future. Contrary to the above, this function should be non-blocking and return immediately. You may want to start a download in the background so that it is ready in case the range is requested later on.

If you are caching network data, you might also be interested in the callback void release_file_range(uint64_t start_pos, uint64_t end_pos, void* userdata), which libheif uses to tell you that it no longer needs a specific file range and that you can remove it from the cache. Note that you can remove any data from the cache whenever you want, but you may have to reload it if libheif requests it again.

Multiresolution Pyramids

Closely related to reading high-resolution images is the ability to store lower-resolution overview images in the same file. These make it easier to display zoomed-out views of the image without having to read large areas of the image at the highest resolution and scale them down.

Multiresolution image pyramids are stored as a set of images, one for each resolution layer. These layer images are combined into a pyramid that is stored as a pymd entity group.

A pymd entity group also contains some metadata for each layer:

struct heif_pyramid_layer_info {
  heif_item_id layer_image_id;
  uint16_t layer_binning;
  uint32_t tile_rows_in_layer;
  uint32_t tile_columns_in_layer;
};

This includes the subsampling factor (layer_binning) and the number of tiles in the layer. You can get the pymd metadata with

struct heif_pyramid_layer_info*
heif_context_get_pyramid_entity_group_info(struct heif_context*,
                                           heif_entity_group_id id,
                                           int* out_num_layers);
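Given the layer metadata, a viewer can pick the layer that best matches the current zoom level. The sketch below uses a local stand-in struct that mirrors heif_pyramid_layer_info (with the item ID assumed to be a 32-bit integer) and a hypothetical function `choose_layer`. It also assumes that a layer_binning of N means the layer is downscaled by a factor of N relative to the full resolution.

```c
#include <stdint.h>

// Local stand-in mirroring heif_pyramid_layer_info for this sketch.
struct layer_info {
  uint32_t layer_image_id;
  uint16_t layer_binning;
  uint32_t tile_rows_in_layer;
  uint32_t tile_columns_in_layer;
};

// Returns the index of the coarsest layer whose binning does not exceed
// wanted_downscale, so the view never has to be upscaled. If no such layer
// exists, the layer with the smallest binning is returned. Returns -1 for
// an empty pyramid. The layers may be given in any order.
static int choose_layer(const struct layer_info* layers, int num_layers,
                        uint32_t wanted_downscale)
{
  int best = -1;    // largest binning that is <= wanted_downscale
  int finest = -1;  // smallest binning overall (fallback)

  for (int i = 0; i < num_layers; i++) {
    if (finest < 0 || layers[i].layer_binning < layers[finest].layer_binning) {
      finest = i;
    }
    if (layers[i].layer_binning <= wanted_downscale &&
        (best < 0 || layers[i].layer_binning > layers[best].layer_binning)) {
      best = i;
    }
  }
  return best >= 0 ? best : finest;
}
```

The tiles of the chosen layer can then be decoded like the tiles of any other tiled image, using the layer's image item ID.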

Decoding the Multiresolution Pyramid Information

You can get the pymd entity group with heif_entity_group* heif_context_get_entity_groups(const struct heif_context*, uint32_t type_filter, uint32_t item_filter, int* out_num_groups). More specifically, you can set the type_filter to heif_fourcc('p','y','m','d') to get the pyramid directly if it is present. The image item IDs in the entity group are ordered from lowest resolution to highest resolution.

Encoding a Multiresolution Pyramid

First encode all the resolution layers as separate images. Each layer image in the pyramid can be a tiled image and you can also mix the image types. For example, the highest resolution layer could be an unci image that stores the losslessly compressed data, while the overview images use tili or grid with an efficient image codec. This has the additional benefit that software that cannot read unci images can at least show the overview images.
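How many layers to generate and at which sizes is up to the application. A common choice, assumed here and not mandated by libheif, is to halve the resolution per layer until the whole layer fits into a single tile. `plan_pyramid_layers` and `layer_size` are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stdint.h>

// Size of one planned layer together with its subsampling factor.
struct layer_size {
  uint32_t width;
  uint32_t height;
  uint32_t binning;
};

// Fills out[] with layer sizes, starting at full resolution and doubling
// the binning for each further layer. Stops when a layer fits into one
// tile, when max_layers entries were written, or when the binning would no
// longer fit into 16 bits. Returns the number of layers written.
static int plan_pyramid_layers(uint32_t width, uint32_t height, uint32_t tile_size,
                               struct layer_size* out, int max_layers)
{
  assert(tile_size > 0);

  int n = 0;
  uint32_t binning = 1;
  while (n < max_layers) {
    // Ceiling division so that no pixels are dropped at the border.
    out[n].width = width / binning + (width % binning != 0 ? 1u : 0u);
    out[n].height = height / binning + (height % binning != 0 ? 1u : 0u);
    out[n].binning = binning;
    n++;

    if (out[n - 1].width <= tile_size && out[n - 1].height <= tile_size) {
      break;
    }
    if (binning >= 32768) {
      break;
    }
    binning *= 2;
  }
  return n;
}
```

For a 4096x2048 image with 512 pixel tiles, this yields four layers with binning 1, 2, 4 and 8, the last one being 512x256 pixels.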

After you have created all layer images and know their item IDs, call

struct heif_error
heif_context_add_pyramid_entity_group(struct heif_context* ctx,
                                      const heif_item_id* layer_item_ids,
                                      size_t num_layers,
                                      heif_item_id* out_group_id);

This will group them into a pymd group. You can list the image IDs in any order. libheif sorts the layer images according to their size and automatically generates the meta-information stored in the pymd entity group.

If you use tiled images, you can add tiles after the pymd entity group has been generated. For example, you could first create "empty" images with heif_context_add_tiled_image() for each layer, generate the pymd entity group, and then start encoding the image tiles.

Example Software

There is an example viewer application that supports tiled image decoding to display images of arbitrary resolution. On that page you will also find links to example images.