Merge pull request #295 from t20100/add-blosc2-plugins
Added support for blosc2 plugins to the blosc2 filter; updated c-blosc2 to v2.13.2
vasole authored Feb 8, 2024
2 parents 2aaeb14 + 73b9478 commit a0f37f3
Showing 21 changed files with 365 additions and 42 deletions.
6 changes: 4 additions & 2 deletions .github/workflows/ci.yml
@@ -96,8 +96,8 @@ jobs:
ls dist
- name: Install from source package
run:
pip install --pre dist/hdf5plugin*
run: |
pip install --pre "$(ls dist/hdf5plugin-*)[test]" --only-binary blosc2 || pip install --pre dist/hdf5plugin-*
- name: Print python info
run: |
@@ -129,5 +129,7 @@ jobs:
CIBW_BUILD: cp39-macosx_*
CIBW_ARCHS_MACOS: arm64
HDF5PLUGIN_SSE2: False
HDF5PLUGIN_SSSE3: False
HDF5PLUGIN_AVX2: False
HDF5PLUGIN_AVX512: False
HDF5PLUGIN_NATIVE: False
12 changes: 7 additions & 5 deletions .github/workflows/release.yml
@@ -45,7 +45,7 @@ jobs:
name: cibw-sdist
path: dist
- name: Install sdist
run: pip install --pre dist/hdf5plugin*.tar.gz
run: pip install --pre "$(ls dist/hdf5plugin*.tar.gz)[test]"
- name: Run tests
run: python test/test.py

@@ -85,13 +85,14 @@ jobs:
HDF5PLUGIN_OPENMP: "False"
HDF5PLUGIN_NATIVE: "False"
HDF5PLUGIN_SSE2: ${{ matrix.with_sse2 && 'True' || 'False' }}
HDF5PLUGIN_SSSE3: "False"
HDF5PLUGIN_AVX2: "False"
HDF5PLUGIN_AVX512: "False"
HDF5PLUGIN_BMI2: "False"
HDF5PLUGIN_CPP11: "True"
HDF5PLUGIN_CPP14: "True"

CIBW_ENVIRONMENT_PASS_LINUX: HDF5PLUGIN_OPENMP HDF5PLUGIN_NATIVE HDF5PLUGIN_SSE2 HDF5PLUGIN_AVX2 HDF5PLUGIN_AVX512 HDF5PLUGIN_BMI2 HDF5PLUGIN_CPP11 HDF5PLUGIN_CPP14
CIBW_ENVIRONMENT_PASS_LINUX: HDF5PLUGIN_OPENMP HDF5PLUGIN_NATIVE HDF5PLUGIN_SSE2 HDF5PLUGIN_SSSE3 HDF5PLUGIN_AVX2 HDF5PLUGIN_AVX512 HDF5PLUGIN_BMI2 HDF5PLUGIN_CPP11 HDF5PLUGIN_CPP14

# Use Python3.11 to build wheels that are compatible with all supported version of Python
CIBW_BUILD: cp311-*
@@ -132,10 +133,11 @@ jobs:
pattern: cibw-wheels-*
path: dist
merge-multiple: true
- name: Install h5py and hdf5plugin
- name: Install hdf5plugin
# First select the right wheel from dist/ with pip download, then install it
run: |
pip install h5py
pip install --no-index --no-cache --find-links=./dist hdf5plugin --only-binary hdf5plugin
pip download --no-index --no-cache --no-deps --find-links=./dist --only-binary :all: hdf5plugin
pip install "$(ls ./hdf5plugin-*.whl)[test]" --only-binary blosc2 || pip install "$(ls ./hdf5plugin-*.whl)"
- name: Run test with latest h5py
run: python test/test.py
- name: Run test with oldest h5py
6 changes: 3 additions & 3 deletions doc/information.rst
@@ -76,7 +76,7 @@ HDF5 compression filters and compression libraries sources were obtained from:
* `hdf5-blosc plugin <https://github.com/Blosc/hdf5-blosc>`_ (v1.0.0)
using `c-blosc <https://github.com/Blosc/c-blosc>`_ (v1.21.5), LZ4, Snappy, ZLib and ZStd.
* hdf5-blosc2 plugin (from `PyTables <https://github.com/PyTables/PyTables/>`_ v3.9.2)
using `c-blosc2 <https://github.com/Blosc/c-blosc2>`_ (v2.13.1), LZ4, ZLib and ZStd.
using `c-blosc2 <https://github.com/Blosc/c-blosc2>`_ (v2.13.2), LZ4, ZLib and ZStd.
* `FCIDECOMP plugin <ftp://ftp.eumetsat.int/pub/OPS/out/test-data/Test-data-for-External-Users/MTG_FCI_Test-Data/FCI_Decompression_Software_V1.0.2>`_ (v1.0.2)
using `CharLS <https://github.com/team-charls/charls>`_
(1.x branch, commit `25160a4 <https://github.com/team-charls/charls/tree/25160a42fb62e71e4b0ce081f5cb3f8bb73938b5>`_).
@@ -93,9 +93,9 @@ HDF5 compression filters and compression libraries sources were obtained from:

Sources of compression libraries shared accross multiple filters were obtained from:

* `LZ4 v1.9.4 <https://github.com/Blosc/c-blosc2/tree/v2.13.1/internal-complibs/lz4-1.9.4>`_
* `LZ4 v1.9.4 <https://github.com/Blosc/c-blosc2/tree/v2.13.2/internal-complibs/lz4-1.9.4>`_
* `Snappy v1.1.10 <https://github.com/google/snappy>`_
* `ZStd v1.5.5 <https://github.com/Blosc/c-blosc2/tree/v2.13.1/internal-complibs/zstd-1.5.5>`_
* `ZStd v1.5.5 <https://github.com/Blosc/c-blosc2/tree/v2.13.2/internal-complibs/zstd-1.5.5>`_
* `ZLib v1.2.13 <https://github.com/Blosc/c-blosc/tree/v1.21.5/internal-complibs/zlib-1.2.13>`_

When compiled with Intel IPP, the LZ4 compression library is replaced with `LZ4 v1.9.3 <https://github.com/lz4/lz4/releases/tag/v1.9.3>`_ patched with a patch from Intel IPP 2021.7.0.
4 changes: 4 additions & 0 deletions doc/install.rst
@@ -50,6 +50,9 @@ Available options
* - ``HDF5PLUGIN_SSE2``
- Whether or not to compile with `SSE2`_ support.
Default: True on ppc64le and when probed on x86, False otherwise
* - ``HDF5PLUGIN_SSSE3``
- Whether or not to compile with `SSSE3`_ support.
Default: True when probed on x86, False otherwise
* - ``HDF5PLUGIN_AVX2``
- Whether or not to compile with `AVX2`_ support.
It requires enabling `SSE2`_ support.
@@ -80,4 +83,5 @@ Note: Boolean options are passed as ``True`` or ``False``.
.. _BMI2: https://en.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set
.. _IPP: https://en.wikipedia.org/wiki/Integrated_Performance_Primitives
.. _SSE2: https://en.wikipedia.org/wiki/SSE2
.. _SSSE3: https://en.wikipedia.org/wiki/SSSE3
.. _OpenMP: https://www.openmp.org/
5 changes: 5 additions & 0 deletions doc/usage.rst
@@ -18,6 +18,11 @@ In order to read compressed dataset with `h5py`_, use:
It registers ``hdf5plugin`` supported compression filters with the HDF5 library used by `h5py`_.
Hence, HDF5 compressed datasets can be read as any other dataset (see `h5py documentation <https://docs.h5py.org/en/stable/high/dataset.html#reading-writing-data>`_).

.. note::

HDF5 datasets compressed with `Blosc2`_ can require additional plugins to enable decompression, such as `blosc2-grok <https://pypi.org/project/blosc2-grok>`_ or `blosc2-openhtj2k <https://pypi.org/project/blosc2-openhtj2k>`_.
See list of Blosc2 `filters <https://www.blosc.org/c-blosc2/reference/utility_variables.html#codes-for-filters>`_ and `codecs <https://www.blosc.org/c-blosc2/reference/utility_variables.html#compressor-codecs>`_.
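
Reviewer note on the added paragraph: before reading such a dataset, it can help to check whether the optional plugin packages are installed at all. The following is a minimal sketch using only the standard library. The import names `blosc2_grok` and `blosc2_openhtj2k` are assumptions derived from the PyPI package names, and `missing_modules` is a hypothetical helper, not part of hdf5plugin.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


# Assumed import names for the PyPI packages blosc2-grok and blosc2-openhtj2k.
OPTIONAL_BLOSC2_PLUGINS = ("blosc2_grok", "blosc2_openhtj2k")

if __name__ == "__main__":
    missing = missing_modules(OPTIONAL_BLOSC2_PLUGINS)
    if missing:
        print("Optional Blosc2 plugin packages not found:", ", ".join(missing))
    else:
        print("All optional Blosc2 plugin packages are installed.")
```

This only checks that the packages can be located; it does not prove that the Blosc2 filter can load them at decompression time.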

Write compressed datasets
+++++++++++++++++++++++++

2 changes: 2 additions & 0 deletions package/debian11/rules
@@ -6,7 +6,9 @@ export PYBUILD_NAME=hdf5plugin
# Build options
export HDF5PLUGIN_NATIVE=False
export HDF5PLUGIN_SSE2=True
export HDF5PLUGIN_SSSE3=False
export HDF5PLUGIN_AVX2=False
export HDF5PLUGIN_AVX512=False
export HDF5PLUGIN_OPENMP=True
export HDF5PLUGIN_CPP11=True

2 changes: 2 additions & 0 deletions package/debian12/rules
@@ -6,7 +6,9 @@ export PYBUILD_NAME=hdf5plugin
# Build options
export HDF5PLUGIN_NATIVE=False
export HDF5PLUGIN_SSE2=True
export HDF5PLUGIN_SSSE3=False
export HDF5PLUGIN_AVX2=False
export HDF5PLUGIN_AVX512=False
export HDF5PLUGIN_OPENMP=True
export HDF5PLUGIN_CPP11=True

64 changes: 53 additions & 11 deletions setup.py
@@ -1,7 +1,7 @@
# coding: utf-8
# /*##########################################################################
#
# Copyright (c) 2016-2022 European Synchrotron Radiation Facility
# Copyright (c) 2016-2024 European Synchrotron Radiation Facility
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -158,6 +158,11 @@ def __init__(self, compiler=None):
else:
self.sse2_compile_args = ()

if self.ARCH in ('X86_32', 'X86_64'):
self.ssse3_compile_args = ('-mssse3',) # There is no /arch:SSSE3
else:
self.ssse3_compile_args = ()

if self.ARCH in ('X86_32', 'X86_64'):
self.avx2_compile_args = ('-mavx2', '/arch:AVX2')
else:
@@ -201,6 +206,16 @@ def has_sse2(self) -> bool:
return check_compile_flags(self.__compiler, "-msse2")
return False # Disabled by default

def has_ssse3(self) -> bool:
"""Check SSSE3 availability on host"""
if self.ARCH in ('X86_32', 'X86_64'):
if not has_cpu_flag('ssse3'):
return False # SSSE3 not available on host
if self.__compiler.compiler_type == "msvc":
return True
return check_compile_flags(self.__compiler, "-mssse3")
return False # Disabled by default

def has_avx2(self) -> bool:
"""Check AVX2 availability on host"""
if self.ARCH in ('X86_32', 'X86_64'):
@@ -273,20 +288,24 @@ def __init__(
use_sse2 = host_config.has_sse2() if env_sse2 is None else env_sse2 == "True"
self.__use_sse2 = bool(use_sse2)

env_ssse3 = os.environ.get("HDF5PLUGIN_SSSE3", None)
use_ssse3 = host_config.has_ssse3() if env_ssse3 is None else env_ssse3 == "True"
self.__use_ssse3 = bool(use_ssse3)

if use_avx2 is None:
env_avx2 = os.environ.get("HDF5PLUGIN_AVX2", None)
use_avx2 = host_config.has_avx2() if env_avx2 is None else env_avx2 == "True"
if use_avx2 and not use_sse2:
if use_avx2 and not (use_sse2 and use_ssse3):
logger.error(
"use_avx2=True disabled: incompatible with use_sse2=False")
"use_avx2=True disabled: incompatible with use_sse2=False and use_ssse3=False")
use_avx2 = False
self.__use_avx2 = bool(use_avx2)

env_avx512 = os.environ.get("HDF5PLUGIN_AVX512", None)
use_avx512 = host_config.has_avx512() if env_avx512 is None else env_avx512 == "True"
if use_avx512 and not (use_sse2 and use_avx2):
if use_avx512 and not (use_sse2 and use_ssse3 and use_avx2):
logger.error(
"use_avx512=True disabled: incompatible with use_sse2=False or use_avx2=False")
"use_avx512=True disabled: incompatible with use_sse2=False, use_ssse3=False and use_avx2=False")
use_avx512 = False
self.__use_avx512 = bool(use_avx512)
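
Reviewer note on this hunk: the dependency chain it introduces (AVX2 now needs both SSE2 and SSSE3, AVX512 needs all three) can be sketched in isolation as below. This is a simplified illustration, not the code from `setup.py`: `env_flag` and `resolve_simd` are hypothetical names, the probed host capabilities are passed in as a plain dict, and the `use_avx2` constructor argument is left out.

```python
import os


def env_flag(name, probed, environ=None):
    """Return a boolean build option.

    The environment variable wins when it is set; only the exact string
    "True" enables the option. Otherwise the probed value is used.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    return probed if value is None else value == "True"


def resolve_simd(probed, environ=None):
    """Resolve SIMD options, disabling any option whose prerequisites are off."""
    use_sse2 = env_flag("HDF5PLUGIN_SSE2", probed["sse2"], environ)
    use_ssse3 = env_flag("HDF5PLUGIN_SSSE3", probed["ssse3"], environ)

    use_avx2 = env_flag("HDF5PLUGIN_AVX2", probed["avx2"], environ)
    if use_avx2 and not (use_sse2 and use_ssse3):
        use_avx2 = False  # AVX2 requires SSE2 and SSSE3

    use_avx512 = env_flag("HDF5PLUGIN_AVX512", probed["avx512"], environ)
    if use_avx512 and not (use_sse2 and use_ssse3 and use_avx2):
        use_avx512 = False  # AVX512 requires SSE2, SSSE3 and AVX2

    return {
        "sse2": use_sse2,
        "ssse3": use_ssse3,
        "avx2": use_avx2,
        "avx512": use_avx512,
    }
```

One consequence worth noting: setting `HDF5PLUGIN_SSSE3=False` alone turns off AVX2 and AVX512 as well, which is presumably why the CI and packaging files in this commit set the new variable explicitly.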

@@ -304,6 +323,8 @@ def __init__(
compile_args = []
if self.__use_sse2:
compile_args.extend(host_config.sse2_compile_args)
if self.__use_ssse3:
compile_args.extend(host_config.ssse3_compile_args)
if self.__use_avx2:
compile_args.extend(host_config.avx2_compile_args)
if self.__use_avx512:
@@ -319,6 +340,7 @@ def __init__(
use_cpp11 = property(lambda self: self.__use_cpp11)
use_cpp14 = property(lambda self: self.__use_cpp14)
use_sse2 = property(lambda self: self.__use_sse2)
use_ssse3 = property(lambda self: self.__use_ssse3)
use_avx2 = property(lambda self: self.__use_avx2)
use_avx512 = property(lambda self: self.__use_avx512)
use_openmp = property(lambda self: self.__use_openmp)
@@ -346,6 +368,7 @@ def get_config_string(self):
'native': self.use_native,
'bmi2': self.USE_BMI2,
'sse2': self.use_sse2,
'ssse3': self.use_ssse3,
'avx2': self.use_avx2,
'avx512': self.use_avx512,
'cpp11': self.use_cpp11,
@@ -865,11 +888,28 @@ def get_blosc2_plugin():
"""
hdf5_blosc2_dir = 'src/PyTables/hdf5-blosc2/src'
blosc2_dir = 'src/c-blosc2'
plugins_dir = f'{blosc2_dir}/plugins'

# blosc sources
sources = glob(f'{blosc2_dir}/blosc/*.c')
include_dirs = [blosc2_dir, f'{blosc2_dir}/blosc', f'{blosc2_dir}/include']
define_macros = [('SHUFFLE_AVX512_ENABLED', 1), ('SHUFFLE_NEON_ENABLED', 1)]
sources += [ # Add embedded codecs, filters and tuners
src_file
for src_file in glob(f'{plugins_dir}/*.c') + glob(f'{plugins_dir}/*/*.c') + glob(f'{plugins_dir}/*/*/*.c')
if not os.path.basename(src_file).startswith("test")
]
sources += glob(f'{plugins_dir}/codecs/zfp/src/*.c') # Add ZFP embedded sources

include_dirs = [
blosc2_dir,
f'{blosc2_dir}/blosc',
f'{blosc2_dir}/include',
f'{blosc2_dir}/plugins/codecs/zfp/include',
]
define_macros = [
('HAVE_PLUGINS', 1),
('SHUFFLE_AVX512_ENABLED', 1),
('SHUFFLE_NEON_ENABLED', 1),
]
extra_compile_args = []
extra_link_args = []
libraries = []
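
Reviewer note on this hunk: the source collection globs up to three path levels below the plugins directory and skips files whose name starts with `test`. A standalone sketch of that selection follows; `collect_plugin_sources` is a hypothetical helper, and sorting is added here only to make the result deterministic.

```python
import os
from glob import glob


def collect_plugin_sources(plugins_dir):
    """Collect C sources up to three path levels deep, skipping test files."""
    patterns = ("*.c", "*/*.c", "*/*/*.c")
    found = []
    for pattern in patterns:
        found += glob(os.path.join(plugins_dir, pattern))
    # Files such as test_*.c are test programs, not part of the library.
    return sorted(
        src for src in found
        if not os.path.basename(src).startswith("test")
    )
```

Files four levels deep, such as those under `codecs/zfp/src/`, are not matched by these three patterns, which is consistent with the hunk adding the ZFP sources through a separate `glob` call.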
@@ -902,9 +942,8 @@ def get_blosc2_plugin():
include_dirs += get_zstd_clib('include_dirs')
define_macros.append(('HAVE_ZSTD', 1))

extra_compile_args += ['-std=gnu99'] # Needed to build manylinux1 wheels
extra_compile_args += ['-O3', '-ffast-math']
extra_compile_args += ['/Ox', '/fp:fast']
extra_compile_args += ['-O3', '-std=gnu99']
extra_compile_args += ['/Ox']
extra_compile_args += ['-pthread']
extra_link_args += ['-pthread']

@@ -1289,7 +1328,10 @@ def get_version():
ext_modules=extensions,
install_requires=['h5py'],
setup_requires=['setuptools', 'wheel'],
extras_require={'dev': ['sphinx', 'sphinx_rtd_theme']},
extras_require={
'dev': ['sphinx', 'sphinx_rtd_theme'],
'test': ['blosc2>=2.5.1', 'blosc2-grok>=0.2.2'],
},
cmdclass=cmdclass,
libraries=libraries,
zip_safe=False,
8 changes: 6 additions & 2 deletions src/c-blosc2/ANNOUNCE.md
@@ -1,9 +1,13 @@
# Announcing C-Blosc2 2.13.1
# Announcing C-Blosc2 2.13.2
A fast, compressed and persistent binary data store library for C.

## What is new?

This is a patch release for fixing a bug regarding the included files in `b2nd.h`.
This is a patch release for improving of SSSE3 detection on Visual Studio.
Also, documentation for the globally registered filters and codecs has been
added:
https://www.blosc.org/c-blosc2/reference/utility_variables.html#codes-for-filters
https://www.blosc.org/c-blosc2/reference/utility_variables.html#compressor-codecs

For more info, please see the release notes in:

11 changes: 11 additions & 0 deletions src/c-blosc2/RELEASE_NOTES.md
@@ -1,6 +1,17 @@
Release notes for C-Blosc2
==========================

Changes from 2.13.1 to 2.13.2
=============================

* Better checking for `SSSE3` availability in Visual Studio. Probably fixes #546 too.
Thanks to @t20100 (Thomas Vincent) for the PR (#586).

* Documented the globally registered filters and codecs. See:
https://www.blosc.org/c-blosc2/reference/utility_variables.html#codes-for-filters
https://www.blosc.org/c-blosc2/reference/utility_variables.html#compressor-codecs


Changes from 2.13.0 to 2.13.1
=============================

11 changes: 8 additions & 3 deletions src/c-blosc2/blosc/frame.c
@@ -1358,7 +1358,7 @@ static int get_meta_from_header(blosc2_frame_s* frame, blosc2_schunk* schunk, ui
schunk->metalayers[nmetalayer] = metalayer;

// Populate the metalayer string
int8_t nslen = *idxp & (uint8_t)0x1F;
uint8_t nslen = *idxp & (uint8_t)0x1F;
idxp += 1;
header_pos += nslen;
if (header_len < header_pos) {
@@ -1538,7 +1538,7 @@ static int get_vlmeta_from_trailer(blosc2_frame_s* frame, blosc2_schunk* schunk,
schunk->vlmetalayers[nmetalayer] = metalayer;

// Populate the metalayer string
int8_t nslen = *idxp & (uint8_t)0x1F;
uint8_t nslen = *idxp & (uint8_t)0x1F;
idxp += 1;
trailer_pos += nslen;
if (trailer_len < trailer_pos) {
@@ -1885,6 +1885,11 @@ blosc2_schunk* frame_to_schunk(blosc2_frame_s* frame, bool copy, const blosc2_io
}
if (chunk_cbytes > (int32_t)prev_alloc) {
data_chunk = realloc(data_chunk, chunk_cbytes);
if (data_chunk == NULL) {
BLOSC_TRACE_ERROR("Cannot realloc space for the data_chunk.");
rc = BLOSC2_ERROR_MEMORY_ALLOC;
break;
}
prev_alloc = chunk_cbytes;
}
if (!frame->sframe) {
@@ -3119,7 +3124,7 @@ void* frame_update_chunk(blosc2_frame_s* frame, int64_t nchunk, void* chunk, blo
}

// Add the new offset
int64_t sframe_chunk_id;
int64_t sframe_chunk_id = -1;
if (frame->sframe) {
if (offsets[nchunk] < 0) {
sframe_chunk_id = -1;
2 changes: 1 addition & 1 deletion src/c-blosc2/doc/Doxyfile
@@ -7,7 +7,7 @@ GENERATE_RTF = NO
CASE_SENSE_NAMES = NO
GENERATE_HTML = NO
GENERATE_XML = YES
RECURSIVE = NO
RECURSIVE = YES
QUIET = YES
JAVADOC_AUTOBRIEF = YES
WARN_IF_UNDOCUMENTED = NO
20 changes: 20 additions & 0 deletions src/c-blosc2/doc/reference/utility_variables.rst
@@ -33,6 +33,14 @@ Codes for filters

.. doxygenenumvalue:: BLOSC_TRUNC_PREC

.. doxygenenumvalue:: BLOSC_FILTER_NDCELL

.. doxygenenumvalue:: BLOSC_FILTER_NDMEAN

.. doxygenenumvalue:: BLOSC_FILTER_BYTEDELTA

.. doxygenenumvalue:: BLOSC_FILTER_INT_TRUNC


Compressor codecs
-----------------
@@ -46,6 +54,18 @@ Compressor codecs

.. doxygenenumvalue:: BLOSC_ZSTD

.. doxygenenumvalue:: BLOSC_CODEC_NDLZ

.. doxygenenumvalue:: BLOSC_CODEC_ZFP_FIXED_ACCURACY

.. doxygenenumvalue:: BLOSC_CODEC_ZFP_FIXED_PRECISION

.. doxygenenumvalue:: BLOSC_CODEC_ZFP_FIXED_RATE

.. doxygenenumvalue:: BLOSC_CODEC_OPENHTJ2K

.. doxygenenumvalue:: BLOSC_CODEC_GROK


Compressor names
----------------