Version 0.4.0 -- 4-bit Support, Mode 160 (Stokes I, V)

- Bump version 0.3.5 -> 0.4.0
- Add support for processing modes 160-164, generating Stokes I and V vectors simultaneously
- Add support for 4-bit observations
- Move the 4-bit LUT to lofar_udp_backends
- Main loop should be (probably insignificantly) more performant: swap from a run-time switch statement to a compile-time if-else chain (apparently the switch isn't optimised by the compiler...)
- Minor struct changes to handle the number of beamlets available with 4-bit mode
- Some cleanup of constants
David-McKenna · Oct 29, 2020 · 39ff9b8
1 parent e53d0e3
commit 39ff9b8
Showing 13 changed files with 512 additions and 310 deletions.
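
The commit message describes swapping a run-time switch statement for a compile-time if-else chain in the main loop. The snippet below is a minimal sketch of that general technique, not the library's actual code: `MODE_COPY`, `MODE_POWER`, `process_packet` and `dispatch` are hypothetical names, and the two modes are placeholders. Because the mode is a template parameter, every branch condition is a constant within each instantiation, so the compiler can discard the branches that are not taken.

```cpp
// Hypothetical mode identifiers, for illustration only.
constexpr int MODE_COPY = 0;
constexpr int MODE_POWER = 1;

// The mode is a template parameter, so each condition below is a compile-time
// constant inside a given instantiation. With C++17, `if constexpr` would make
// the branch removal explicit; a plain `if` is normally folded by the optimiser.
template <int mode>
int process_packet(const signed char *in, float *out, int nSamples) {
    int written = 0;
    for (int i = 0; i < nSamples; i++) {
        // Assumed input layout: interleaved (real, imaginary) pairs.
        const float re = in[2 * i];
        const float im = in[2 * i + 1];
        if (mode == MODE_COPY) {
            out[written++] = re;
            out[written++] = im;
        } else if (mode == MODE_POWER) {
            out[written++] = re * re + im * im;
        }
    }
    return written;
}

// The run-time mode value is inspected once, outside the hot loop.
// Returns the number of floats written, or -1 for an unknown mode.
int dispatch(int mode, const signed char *in, float *out, int nSamples) {
    switch (mode) {
        case MODE_COPY:
            return process_packet<MODE_COPY>(in, out, nSamples);
        case MODE_POWER:
            return process_packet<MODE_POWER>(in, out, nSamples);
        default:
            return -1;
    }
}
```

The bridge functions shown later in `docs/newProcessingMode.md` follow the same shape, selecting a `lofar_udp_raw_loop<...>` instantiation from `meta->inputBitMode`.
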
diff --git a/Makefile b/Makefile
@@ -9,8 +9,8 @@ CC		= gcc
 CXX		= g++
 endif
 
-LIB_VER = 0.3
-LIB_VER_MINOR = 5
+LIB_VER = 0.4
+LIB_VER_MINOR = 0
 CLI_VER = 0.2
 
 # Detemrine the max threads per socket to speed up execution via OpenMP with ICC (GCC falls over if we set too many)

diff --git a/README.md b/README.md
@@ -14,10 +14,8 @@ Caveats & TODOs
 While using the library, do be aware
 - CEP packets that are recorded out of order may cause issues, the best way to handle them as not been determined so they are currently skipped
 - The provided python dummy data script tends to generate errors in the output after around 5,000 packets are generated
-- 4-bit data is not yet supported
 
 Future work should not break the exiting load/process iteration loop, and may consist of
-- Implementing 4-bit
 - Creating a wrapper python library to allow for easer interfacing within python scripts rather than requiring a C program (CFFI if I can strip out ifdefs?)
 - Investigating [blosc](https://github.com/Blosc/) [(examples link)](https://github.com/Blosc/c-blosc2/tree/master/examples) as an option to speed up some processing modes
 - Specifying specific beamlets to process rather than entire ports

diff --git a/docs/README_CLI.md b/docs/README_CLI.md
@@ -150,9 +150,13 @@ By default, we define a number of base Stokes parameter outputs each at a multip
 - N input files -> 1 output file
 
 #### 150: "Stokes Vector"
-- Take the input data, apply (20), and then combine the polariszation to form 4 output 32-bit floating poit Stoke s(I, Q, U, V) filterbanks for each frequency sample
+- Take the input data, apply (20), and then combine the polariszation to form 4 output 32-bit floating point Stokes (I, Q, U, V) filterbanks for each frequency sample
 - N input files -> 4 output files
 
+#### 160: "Useful Stokes Vector"
+- Take the input data, apply (20), and then combine the polarisations to form 2 output 32-bit floating point Stokes (I, V) filterbanks for each frequency sample
+- N input files -> 2 output files
+
 #### Time decimation
 We also offer up to a 16x decimation during execution (the number of time samples per packet). To select this, choose a Stokes parameter and add a log 2 of the factor to the mode.
 

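Mode 160 forms Stokes I and V from the same X and Y polarisation voltages in one pass. The sketch below shows one way to do that using the standard definitions. The function names, the interleaved (real, imaginary) input layout and the sign convention for V are assumptions made for illustration; the library's own kernels may differ on all three.

```cpp
// Stokes I: total power summed over both polarisations.
inline float stokes_i(float xr, float xi, float yr, float yi) {
    return xr * xr + xi * xi + yr * yr + yi * yi;
}

// Stokes V: circular polarisation. The sign convention used here is an
// assumption; some conventions flip the sign.
inline float stokes_v(float xr, float xi, float yr, float yi) {
    return 2.0f * (xr * yi - xi * yr);
}

// Fill both output arrays in a single pass over the input, so each sample is
// only read once. x and y are assumed to hold interleaved (real, imaginary)
// 8-bit values for the X and Y polarisations respectively.
void stokes_iv(const signed char *x, const signed char *y,
               float *outI, float *outV, int nSamples) {
    for (int i = 0; i < nSamples; i++) {
        const float xr = x[2 * i], xi = x[2 * i + 1];
        const float yr = y[2 * i], yi = y[2 * i + 1];
        outI[i] = stokes_i(xr, xi, yr, yi);
        outV[i] = stokes_v(xr, xi, yr, yi);
    }
}
```

Writing both outputs in one loop is what lets a two-output mode avoid running the separate I and V modes back to back over the same packets.
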
diff --git a/docs/README_INTEGRATION.md b/docs/README_INTEGRATION.md
@@ -9,7 +9,7 @@ Include the reader header
 ```
 
 
-You may need to manually include some C functions for some compilers (CUDA needed this in my case). For a basic implementation, all you need is
+You may need to manually include some C functions for some compilers (CUDA needed this in my case, I haven't caught all of the prototypes for C++). For a basic implementation, all you need is
 ```
 extern "C"  {
   int lofar_udp_reader_step(lofar_udp_reader *reader);

diff --git a/docs/TODO b/docs/TODO
@@ -1,5 +1,4 @@
 TODO:
 
 Better approach to out of order packets?
-4-bit decoding + processing
 fread -> mmap operations
diff --git a/docs/newProcessingMode.md b/docs/newProcessingMode.md
@@ -8,14 +8,14 @@ int lofar_udp_raw_udp_my_new_kernel(lofar_udp_meta *meta);
 ```
 
 2. Create the CPP/C bridge in `lofar_udp_backends.cpp`. You will need to pick both a processing mode int enum (any value greater than 0 and not in use by other modes) and an output data format. For copy methods, the output datatype should be the same as the input. Though you can change it, eg to convert to float by using float as the output datatype. 
+The 4-bit processing enum is (almost) always 4000 larger than the default enum, to signal to the processing loop that the data packet needs to have the bits unpacked before proceeding. If you have a processing mode that just performs a data move, eg memcpy, this change is not needed.
 
 ```
 int lofar_udp_raw_udp_my_new_kernel(lofar_udp_meta *meta) {
 	VERBOSE(if (meta->VERBOSE) printf("Entered C++ call for lofar_udp_raw_udp_my_new_kernel\n"));
 	switch(meta->inputBitMode) {
 		case 4:
-			fprintf(stderr, "4-bit mode is not yet supported, exiting.\n");
-			return 1;
+			return lofar_udp_raw_loop<signed char, OUTPUT_DTYPE, KERNEL_ENUM_VAL (+ 4000)>(meta);
 		case 8:
 			return lofar_udp_raw_loop<signed char, OUTPUT_DTYPE, KERNEL_ENUM_VAL>(meta);
 		case 16:
@@ -82,6 +82,6 @@ case KERNEL_ENUM_VAL:
 	break;
 ```
 
-6. Staying in `int lofar_udp_setup_processing(lofar_udp_meta *meta)`, run the maths on the input / output data sizes and add your case to the switch statement. If adding a completely new calculation, be sue to add a `break;` statement afterwards, as the compiler warning is disabled for this switch statement. In the case of a re-rodering operation, you will just need to define the number of output arrays.
+6. Staying in `int lofar_udp_setup_processing(lofar_udp_meta *meta)`, run the maths on the input / output data sizes and add your case to the switch statement. If adding a completely new calculation, be sure to add a `break;` statement afterwards, as the compiler warning is disabled for this switch statement. In the case of a re-rodering operation, you will just need to define the number of output arrays.
 
 7. Add documentation to `README_CLI.md` and `lofar_cli_meta.c`.
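
The 4-bit modes need each packed byte split into two signed values before the normal kernels can run, and the commit message mentions a lookup table (LUT) for this. A minimal sketch of LUT-based unpacking follows, assuming two's-complement nibbles with the low nibble stored first. `NIBBLE_LUT` and `unpack_4bit` are hypothetical names, and the nibble order is an assumption rather than the library's confirmed layout.

```cpp
#include <cstddef>

// 16-entry table mapping a 4-bit two's-complement nibble (0x0-0xF) to its
// signed value in the range -8..7.
static const signed char NIBBLE_LUT[16] = {
    0, 1, 2, 3, 4, 5, 6, 7, -8, -7, -6, -5, -4, -3, -2, -1
};

// Unpack nBytes packed bytes into 2 * nBytes signed 8-bit values.
// Assumption: the low nibble holds the first value and the high nibble the
// second; swap the two lines in the loop if the data is packed the other way.
void unpack_4bit(const unsigned char *packed, signed char *unpacked,
                 std::size_t nBytes) {
    for (std::size_t i = 0; i < nBytes; i++) {
        unpacked[2 * i] = NIBBLE_LUT[packed[i] & 0x0F];
        unpacked[2 * i + 1] = NIBBLE_LUT[packed[i] >> 4];
    }
}
```

After unpacking, the data looks like 8-bit input, which is consistent with the 4-bit case above instantiating the loop with `signed char` and only changing the enum value.
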
diff --git a/src/CLI/lofar_cli_extractor.c b/src/CLI/lofar_cli_extractor.c
@@ -162,6 +162,7 @@ int main(int argc, char  *argv[]) {
 	if (processingMode == 2 || processingMode == 11 || processingMode == 21) outputFilesCount = UDPNPOL;
 	else if (processingMode == 10 || processingMode == 20 || (processingMode > 99 && processingMode < 140)) outputFilesCount = 1;
 	else if (processingMode > 149 && processingMode < 160) outputFilesCount = 4;
+	else if (processingMode > 159 && processingMode < 170) outputFilesCount = 2;
 
 	// Sanity check a few inputs
 	if ( (strcmp(inputFormat, "") == 0) || (ports == 0) || (packetsPerIteration < 2)  || (replayDroppedPackets > 1 || replayDroppedPackets < 0) || (processingMode > 1000 || processingMode < 0) || (seconds < 0)) {

diff --git a/src/CLI/lofar_cli_meta.c b/src/CLI/lofar_cli_meta.c
@@ -26,6 +26,7 @@ void processingModes(void) {
 	printf("130: Raw UDP to Stokes V: Form a 32-bit float Stokes V for the input.\n\n");
 
 	printf("150: Raw UDP to Full Stokes: Form a 32-bit float Stokes Vector for the input (I, Q, U, V output files)\n\n");
+	printf("160: Raw UDP to Full Stokes: Form a 32-bit float Stokes Vector for the input (I, V output files)\n\n");
 
 	printf("Stokes outputs can be decimated in orders of 2, up to 16x by adjusting the last digit of their processing mode.\n");
 	printf("This is handled in orders of two, so 101 will give a Stokes I with 2x decimation, 102, will give 4x, 103 will give 8x and 104 will give 16x.\n");
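
The mode numbering described in the help text (last digit selects the decimation order) and the output file counts set in `lofar_cli_extractor.c` can be summarised in a small decoder. This is an illustrative sketch only: `StokesMode` and `decode_stokes_mode` are hypothetical names that do not exist in the library, and the range checks simply mirror the ones in the diff above.

```cpp
// Hypothetical container for a decoded Stokes processing mode.
struct StokesMode {
    int base;         // mode with the decimation digit removed, e.g. 100, 150, 160
    int decimation;   // time decimation factor: 1, 2, 4, 8 or 16
    int outputFiles;  // number of output files the mode produces
};

// Decode a Stokes processing mode. Returns false if the mode is outside the
// Stokes ranges used by the extractor or asks for more than 16x decimation.
bool decode_stokes_mode(int mode, StokesMode *result) {
    if (mode < 100 || mode > 169) return false;

    const int order = mode % 10;  // log2 of the decimation factor
    if (order > 4) return false;  // decimation is only offered up to 16x

    int files;
    if (mode < 140) files = 1;        // single Stokes parameter
    else if (mode < 150) return false; // no mode defined in this range
    else if (mode < 160) files = 4;   // I, Q, U, V
    else files = 2;                   // I, V

    result->base = mode - order;
    result->decimation = 1 << order;
    result->outputFiles = files;
    return true;
}
```

For example, mode 104 decodes to base 100 with 16x decimation and one output file, while mode 160 decodes to base 160 with no decimation and two output files.
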