Releases: David-McKenna/udpPacketManager
0.9.2: JOSS Release
Introduce changes to the JOSS paper source from @dfm.
This release will be used to generate the DOI for the JOSS submission.
0.9.1: Fixes and Tweaks
A number of small fixes for
- Metadata (nbit incorrectly set)
- Builds (move PSRDADA to the insecure git:// protocol, as Sourceforge's HTTPS interface is unstable)
- Documentation and CLI flags (improved in a few places)
0.9 (1.0 with a few known issues)
This should be a 1.0 release, but with a few issues that have arisen in testing (isolated to the stokes CLI), that version is being held back for now.
Rough overview of the changelog between 0.6.2 and 0.9:
- New I/O interface to simplify adding data sources/sinks
- Add PSRDADA ringbuffer support
- Add HDF5 ICD003 compliant writer (bitshuffle/zstd compression support)
- New metadata interface based upon parsing beamctl commands
- Supports GUPPI, SigProc, DADA and HDF5 ICD003 formats
- Removed mockHeader dependency
- New CLI for extended downsampling and channelisation
- Some bugs remain, not production ready.
- GitHub Actions-based CI implemented
- googletest-based test suite, with 78% code coverage
- dreamBeam calibration now uses shmem instead of FIFOs for stability and performance improvements
- JOSS Paper submission
- Improved documentation
0.6.2 -- Calibration via dreamBeam, mmap reader, back-end cleanup, Stokes U fix
0.6.0 introduces support for calibrating data via Jones matrices calculated by Tobia Carozzi's dreamBeam (2baOrNot2ba/dreamBeam). This will correct voltages to be in J2000-coordinates, though can be configured with any input coordinate system supported by casacore (J2000, SUN, JUPITER, AZELGO...). Applying calibration to your data will result in the output data type always being a 32-bit float as we do not apply type conversion to the results after applying the Jones matrix.
Similarly, we now use mmap(2) to read data for the zstandard reader, which has shown 5-15% improvements in read times over the old fread(3) methodology. The madvise(2) calls to de-allocate the memory are fairly slow (10-40ms on REALTA; without them the library will use all available memory on all CPUs), but there is still a net gain from the process.
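As a rough illustration of the mmap(2)/madvise(2) pattern described above, here is a minimal sketch. It is not the library's reader: the function name mapped_checksum, the chunked loop and the choice of MADV_DONTNEED as the advice flag are all assumptions made for the example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes madvise() under strict -std= modes */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sum every byte of a file through a read-only
 * mapping, handing each consumed chunk back to the kernel so resident
 * memory stays bounded. chunk must be a multiple of the page size.
 * Returns 0 on success, -1 on failure. */
int mapped_checksum(const char *path, size_t chunk, uint64_t *sum_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return -1; }
    size_t len = (size_t) st.st_size;

    unsigned char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (data == MAP_FAILED) return -1;

    uint64_t sum = 0;
    for (size_t off = 0; off < len; off += chunk) {
        size_t n = (len - off < chunk) ? len - off : chunk;
        for (size_t i = 0; i < n; i++) sum += data[off + i];
        /* off is a multiple of chunk, so data + off is page aligned.
         * MADV_DONTNEED is an assumed choice of advice flag. */
        madvise(data + off, n, MADV_DONTNEED);
    }

    munmap(data, len);
    *sum_out = sum;
    return 0;
}
```

Without the madvise call the kernel is free to keep every touched page resident until memory pressure forces it out, which matches the "will use all available memory" behaviour noted above.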
This release also cleans up parts of the back-end so that they are easier to manage and expand. This includes having a single C-C++ bridge function rather than one per mode, introducing inline functions for calculating standard offsets in the processing kernels (input, time-major output, frequency-major output), and adjusting input types to be an enum rather than a true/false check on zstandard compression. GCC-compiled outputs now also use the task-based processing that was used with ICC before, as it no longer takes an eternity (but it is still fairly slow compared to ICC).
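To make the idea of inline offset helpers concrete, the sketch below shows one way the three offsets could be computed. The function names, the 16-byte header, the 16 time samples per packet and the one-byte (8-bit) sample size are assumptions for illustration, as is the reading of "time-major" as "time varies fastest"; the library's own helpers may differ.

```c
#include <stddef.h>

/* Assumed packet geometry, for illustration only. */
enum { HDR_BYTES = 16, TS_PER_PACKET = 16, POLS = 4 };

/* Byte offset of one 8-bit sample in a raw packet stream laid out as
 * [packet][header, then beamlet][time][polarisation component]. */
static inline size_t input_offset(size_t packet, size_t beamlet, size_t ts,
                                  size_t pol, size_t packet_bytes) {
    return packet * packet_bytes + HDR_BYTES
         + (beamlet * TS_PER_PACKET + ts) * POLS + pol;
}

/* Frequency-major output: beamlet (frequency) varies fastest, so all
 * beamlets of a single time sample sit next to each other. */
static inline size_t freq_major_offset(size_t t, size_t beamlet,
                                       size_t n_beamlets) {
    return t * n_beamlets + beamlet;
}

/* Time-major output: time varies fastest, so all samples of a single
 * beamlet sit next to each other. */
static inline size_t time_major_offset(size_t t, size_t beamlet,
                                       size_t n_times) {
    return beamlet * n_times + t;
}
```

Keeping the index arithmetic in one place means each processing kernel only decides what to write, not where.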
The calibration process does need a bit more documentation, but time is short at the moment. TL;DR: posix_spawnp -> python -> pipe -> C function.
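The posix_spawnp -> pipe -> C function chain can be sketched roughly as follows. This is a generic example rather than the library's calibration code: spawn_and_read is a hypothetical name, an anonymous pipe is assumed (the notes do not say which kind of pipe is used), and the child command is left to the caller.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] (looked up on PATH), capture up to
 * cap - 1 bytes of its stdout into out and NUL-terminate the result.
 * Returns the number of bytes captured, or -1 on failure (including a
 * non-zero exit status or output that does not fit in the buffer). */
ssize_t spawn_and_read(char *const argv[], char *out, size_t cap) {
    int fds[2];
    if (cap == 0 || pipe(fds) != 0) return -1;

    /* In the child: stdout becomes the write end of the pipe. */
    posix_spawn_file_actions_t fa;
    posix_spawn_file_actions_init(&fa);
    posix_spawn_file_actions_adddup2(&fa, fds[1], STDOUT_FILENO);
    posix_spawn_file_actions_addclose(&fa, fds[0]);
    posix_spawn_file_actions_addclose(&fa, fds[1]);

    pid_t pid;
    int rc = posix_spawnp(&pid, argv[0], &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    close(fds[1]); /* the parent only reads */
    if (rc != 0) { close(fds[0]); return -1; }

    size_t used = 0;
    ssize_t n;
    while (used < cap - 1
           && (n = read(fds[0], out + used, cap - 1 - used)) > 0)
        used += (size_t) n;
    out[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)
        || WEXITSTATUS(status) != 0)
        return -1;
    return (ssize_t) used;
}
```

In the calibration case the child would be a Python process producing Jones matrices, and the captured text would then be parsed by the C side; the script name and output format are not specified in these notes.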
0.6.1 introduces a fix for an incorrect definition of Stokes U spotted by Tobia Carozzi (@2baOrNot2ba).
0.6.2 hides some verbose messages that were always printed on some execution paths rather than checking if the DEBUG flag was set.
Minor additions/changes:
- Added a 'stationID' meta parameter with the name of the station, as read from the CEP packets.
- Added support for time-major Stokes outputs, though this is currently not exposed to the reader.
- Added VERBOSEP for printing an error that mentions the function name (TODO: convert verbose messages over to this define)
- CLI: -c (clock mode) moved to -z to allow for calibration to use -c
- Stokes functions now use ints instead of floats to improve SIMD throughput (future: multiple functions of different types to improve speeds even more? Not sure how to get templates and typedefs working together though.)
- Add an extra verbose mode (meta->verbose = 2) for printing the outputs during the kernel processing
- lofar_udp_misc and the initialisation functions now operate directly on the data rather than using a union struct to get ints from chars
- Moved a lot of common defines/includes to lofar_udp_general.h
- Number of OpenMP threads can now be set by reader->ompThreads rather than being fixed at compile time
- Added a few more UDPHDROFF offsets to packet checks; not complete though.
- We no longer reset the number of OpenMP threads on every reader call
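For readers unfamiliar with the Stokes computation mentioned in the list above, a minimal integer-only sketch is given here. The struct and function names are hypothetical, and the sign of V depends on the polarisation convention in use; the convention below is an assumption, not necessarily the one the library follows.

```c
#include <stdint.h>

/* Hypothetical container for the four Stokes parameters. */
typedef struct { int32_t i, q, u, v; } stokes_t;

/* Form Stokes parameters from one complex X/Y voltage sample using
 * integer arithmetic only. Inputs are 8-bit, so every product fits
 * comfortably in 32 bits. */
static inline stokes_t stokes_from_voltages(int8_t xr, int8_t xi,
                                            int8_t yr, int8_t yi) {
    int32_t xx = xr * xr + xi * xi; /* |X|^2 */
    int32_t yy = yr * yr + yi * yi; /* |Y|^2 */
    stokes_t s;
    s.i = xx + yy;
    s.q = xx - yy;
    s.u = 2 * (xr * yr + xi * yi); /* 2 Re(X conj(Y)) */
    s.v = 2 * (xr * yi - xi * yr); /* 2 Im(conj(X) Y); sign is an assumed convention */
    return s;
}
```

Staying in integers until a final conversion avoids an int-to-float step per input value inside the hot loop.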
Minor fixes:
- Heavily desynced ports (> 1 iteration worth of packets) will now be correctly aligned rather than cause a segfault on the first iteration (fixes #5)
- Several smaller bounds checks to make sure we don't overrun our buffer when dealing with packet loss while skipping to/from packets (fixes #3)
- Fix the ascii_hdr_manager (GUPPI RAW output) not correctly resetting fargc for parsing metadata parameters (fixes #4)
- Error if the clock types are mixed
0.6.1 -- Calibration via dreamBeam, mmap reader, back-end cleanup, Stokes U fix
0.6.0 introduces support for calibrating data via Jones matrices calculated by Tobia Carozzi's dreamBeam (2baOrNot2ba/dreamBeam). This will correct voltages to be in J2000-coordinates, though can be configured with any input coordinate system supported by casacore (J2000, SUN, JUPITER, AZELGO...). Applying calibration to your data will result in the output data type always being a 32-bit float as we do not apply type conversion to the results after applying the Jones matrix.
Similarly, we now use mmap(2) to read data for the zstandard reader, which has shown 5-15% improvements in read times over the old fread(3) methodology. The madvise(2) calls to de-allocate the memory are fairly slow (10-40ms on REALTA; without them the library will use all available memory on all CPUs), but there is still a net gain from the process.
This release also cleans up parts of the back-end so that they are easier to manage and expand. This includes having a single C-C++ bridge function rather than one per mode, introducing inline functions for calculating standard offsets in the processing kernels (input, time-major output, frequency-major output), and adjusting input types to be an enum rather than a true/false check on zstandard compression. GCC-compiled outputs now also use the task-based processing that was used with ICC before, as it no longer takes an eternity (but it is still fairly slow compared to ICC).
The calibration process does need a bit more documentation, but time is short at the moment. TL;DR: posix_spawnp -> python -> pipe -> C function.
0.6.1 introduces a fix for an incorrect definition of Stokes U spotted by Tobia Carozzi (@2baOrNot2ba).
Minor additions/changes:
- Added a 'stationID' meta parameter with the name of the station, as read from the CEP packets.
- Added support for time-major Stokes outputs, though this is currently not exposed to the reader.
- Added VERBOSEP for printing an error that mentions the function name (TODO: convert verbose messages over to this define)
- CLI: -c (clock mode) moved to -z to allow for calibration to use -c
- Stokes functions now use ints instead of floats to improve SIMD throughput (future: multiple functions of different types to improve speeds even more? Not sure how to get templates and typedefs working together though.)
- Add an extra verbose mode (meta->verbose = 2) for printing the outputs during the kernel processing
- lofar_udp_misc and the initialisation functions now operate directly on the data rather than using a union struct to get ints from chars
- Moved a lot of common defines/includes to lofar_udp_general.h
- Number of OpenMP threads can now be set by reader->ompThreads rather than being fixed at compile time
- Added a few more UDPHDROFF offsets to packet checks; not complete though.
- We no longer reset the number of OpenMP threads on every reader call
Minor fixes:
- Heavily desynced ports (> 1 iteration worth of packets) will now be correctly aligned rather than cause a segfault on the first iteration (fixes #5)
- Several smaller bounds checks to make sure we don't overrun our buffer when dealing with packet loss while skipping to/from packets (fixes #3)
- Fix the ascii_hdr_manager (GUPPI RAW output) not correctly resetting fargc for parsing metadata parameters (fixes #4)
- Error if the clock types are mixed
0.6.0 -- Calibration via dreamBeam, mmap reader, back-end cleanup
0.6.0 introduces support for calibrating data via Jones matrices calculated by Tobia Carozzi's dreamBeam (2baOrNot2ba/dreamBeam). This will correct voltages to be in J2000-coordinates, though can be configured with any input coordinate system supported by casacore (J2000, SUN, JUPITER, AZELGO...). Applying calibration to your data will result in the output data type always being a 32-bit float as we do not apply type conversion to the results after applying the Jones matrix.
Similarly, we now use mmap(2) to read data for the zstandard reader, which has shown 5-15% improvements in read times over the old fread(3) methodology. The madvise(2) calls to de-allocate the memory are fairly slow (10-40ms on REALTA; without them the library will use all available memory on all CPUs), but there is still a net gain from the process.
This release also cleans up parts of the back-end so that they are easier to manage and expand. This includes having a single C-C++ bridge function rather than one per mode, introducing inline functions for calculating standard offsets in the processing kernels (input, time-major output, frequency-major output), and adjusting input types to be an enum rather than a true/false check on zstandard compression. GCC-compiled outputs now also use the task-based processing that was used with ICC before, as it no longer takes an eternity (but it is still fairly slow compared to ICC).
The calibration process does need a bit more documentation, but time is short at the moment. TL;DR: posix_spawnp -> python -> pipe -> C function.
Minor additions/changes:
- Added a 'stationID' meta parameter with the name of the station, as read from the CEP packets.
- Added support for time-major Stokes outputs, though this is currently not exposed to the reader.
- Added VERBOSEP for printing an error that mentions the function name (TODO: convert verbose messages over to this define)
- CLI: -c (clock mode) moved to -z to allow for calibration to use -c
- Stokes functions now use ints instead of floats to improve SIMD throughput (future: multiple functions of different types to improve speeds even more? Not sure how to get templates and typedefs working together though.)
- Add an extra verbose mode (meta->verbose = 2) for printing the outputs during the kernel processing
- lofar_udp_misc and the initialisation functions now operate directly on the data rather than using a union struct to get ints from chars
- Moved a lot of common defines/includes to lofar_udp_general.h
- Number of OpenMP threads can now be set by reader->ompThreads rather than being fixed at compile time
- Added a few more UDPHDROFF offsets to packet checks; not complete though.
- We no longer reset the number of OpenMP threads on every reader call
Minor fixes:
- Heavily desynced ports (> 1 iteration worth of packets) will now be correctly aligned rather than cause a segfault on the first iteration (fixes #5)
- Several smaller bounds checks to make sure we don't overrun our buffer when dealing with packet loss while skipping to/from packets (fixes #3)
- Fix the ascii_hdr_manager (GUPPI RAW output) not correctly resetting fargc for parsing metadata parameters (fixes #4)
- Error if the clock types are mixed
0.5.0 -- Limit Processed Beamlets, Add Test Cases, Numerous Small Fixes/QOL
Features:
A subset of the beamlets in the input ports can now be selected, such that when 488 beamlets are recorded, we can process all of them or just a subset on the range [lower, upper). The docs/newProcessingMode.md has been updated to reflect the changes in the kernels to support this.
Implemented a set of test cases in the makefile under target test, which generates outputs and compares them to a set of hashes. The hashes on hand were generated using -ffast-math, so debug builds will not pass tests.
The lofar_udp_reader option can now be set up using a struct rather than a large number of input parameters.
The CLIs can now start processing data from a non-0 base port
Some default structs have been provided for configuring the reader
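A struct-based configuration with a half-open beamlet range might look something like the sketch below. The struct, its field names and the default values are hypothetical and only illustrate the pattern; they are not the library's actual definitions.

```c
#include <stddef.h>

/* Hypothetical configuration struct; field names are illustrative. */
typedef struct {
    int processing_mode;
    int beamlet_lower; /* first beamlet to process (inclusive) */
    int beamlet_upper; /* one past the last beamlet (exclusive) */
    int omp_threads;
} reader_config;

/* A default instance that callers copy and then adjust, in the spirit
 * of the "default structs" mentioned above. 488 is the beamlet count
 * used as the example in these notes. */
static const reader_config reader_config_default = {
    .processing_mode = 0,
    .beamlet_lower = 0,
    .beamlet_upper = 488,
    .omp_threads = 1,
};

/* Validate a [lower, upper) selection against the number of recorded
 * beamlets. Returns the number of beamlets selected, or -1 if the
 * range is empty or out of bounds. */
static inline int selected_beamlets(const reader_config *cfg, int total) {
    if (cfg->beamlet_lower < 0 || cfg->beamlet_upper > total
        || cfg->beamlet_lower >= cfg->beamlet_upper)
        return -1;
    return cfg->beamlet_upper - cfg->beamlet_lower;
}
```

With a half-open range the selected count is simply upper minus lower, and adjacent selections such as [0, 244) and [244, 488) tile the band without overlap.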
Fixes:
Documentation pass across the repo
Makefile will no longer always try to use ICC if it is available, and will default to whatever is passed in by CC and CXX.
GCC compiles now use -ffast-math as well
Makefile now uses the -command syntax rather than command; exit 0 to work around calls expected to fail
Fixed lofar_udp_extractor passing the wrong number of beamlets to mockHeader for some processing modes
CLIs now determine the number of output files from the lofar_udp_reader struct rather than trying (and often failing. whoops.) to predict it themselves
CLIs will now raise an error if the input filename does not update when iterating over input filepaths
Removed inconsistent documentation about the GUPPI RAW CLI
Fixed some of the "Full Stokes Vector" processing modes (151-153) generating garbage on ICC by splitting the loop into two separate sets of function calls. (A compiler bug?)
Fixed an incorrect base offset in some of the Stokes decimation modes (1,2,3)
Fixed the Time-Major, Dual Pols (32) mode incorrectly calculating the output offset for the second half of data (missing bracket)
The main processing loop is now a compile-time fixed if-else statement rather than a runtime switch (I thought it was fixed at compile time, I was mistaken).
Added safeguards to prevent memory leaks from the CLI
Fixed the Useful Stokes mode (160, did not affect decimated versions) having an incorrect number of output files
Fixed the reader_step return value not being updated due to a missing OMP pragma
Fixed reader_step attempting to iterate when there was no work to perform (cosmetic change)
0.4.0 -- 4-bit Support, Mode 160 (Stokes I, V)
Bump version 0.3.5 -> 0.4.0
Add support for processing mode 160 - 164, generating Stokes I and V vectors simultaneously.
Add support for 4-bit observations
Move the 4-bit LUT to lofar_udp_packends
Main loop should be (probably insignificantly) more performant, swapping from a run-time switch statement to a compile-time if-else chain (apparently the switch isn't compiler optimised...)
Minor struct changes to handle the number of beamlets available with 4-bit mode
Some cleanup of constants
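As an illustration of how a lookup table can unpack 4-bit samples, consider the sketch below. It assumes two's-complement nibbles with the high nibble stored first in each byte; both points, along with the names lut4 and build_lut4, are assumptions for the example rather than a description of the library's table.

```c
#include <stdint.h>

/* Sign-extend the low 4 bits of v (two's complement), giving -8..7. */
static inline int8_t nibble_to_int8(uint8_t v) {
    v &= 0x0F;
    return (int8_t) ((v & 0x08) ? (int) v - 16 : (int) v);
}

/* Hypothetical table: lut4[byte] holds the two samples packed in byte. */
static int8_t lut4[256][2];

/* Fill the table once at start-up; unpacking a byte is then a single
 * indexed load instead of shifts and sign extension per sample. */
void build_lut4(void) {
    for (int b = 0; b < 256; b++) {
        lut4[b][0] = nibble_to_int8((uint8_t) (b >> 4));   /* high nibble (assumed first) */
        lut4[b][1] = nibble_to_int8((uint8_t) (b & 0x0F)); /* low nibble */
    }
}
```

Because each input byte expands to two output samples, a 4-bit observation yields twice as many values per byte as an 8-bit one, which is consistent with the struct changes noted above for the number of available beamlets.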
0.3.5 - GUPPI RAW Output Support + More Fixes
GUPPI_RAW: Check header exists before trying to read it
Everything: Fix an edge case with the alignment code and non-powers-of-2 buffer lengths causing the end of each buffer to be copied prematurely on shift_remainder_packet calls
Previously:
Adds LOFAR UDP -> GUPPI RAW support through a new CLI.
Added time-major processing modes (30, 31, 32)
Improved documentation
Cleaned up file structures
0.3.4 - GUPPI RAW Output Support + More Fixes
Fix an issue with the decimation Stokes modes
Implement mode 15[0..4], generating a full Stokes vector for a given input
Fully implements the changes meant to be implemented in 0.3.2/3
Previously:
Adds LOFAR UDP -> GUPPI RAW support through a new CLI.
Added time-major processing modes (30, 31, 32)
Improved documentation
Cleaned up file structures