diff --git a/images/drishti-logo.png b/images/drishti-logo.png
index a75388e..d370941 100644
Binary files a/images/drishti-logo.png and b/images/drishti-logo.png differ
diff --git a/images/sample-io-insights-issues.svg b/images/sample-io-insights-issues.svg
index 5b3259a..8c5d20e 100644
--- a/images/sample-io-insights-issues.svg
+++ b/images/sample-io-insights-issues.svg
@@ -1,211 +1 @@
- Drishti
- ╭─ DRISHTI v.0.3 ───────────────────────────────────────────────────────────────────────────────────────────────────── ─╮
- │ │
- │ JOB : 1190243 │
- │ EXECUTABLE : bin/8_benchmark_parallel │
- │ DARSHAN : jlbez_8_benchmark_parallel_id1190243_7-23-45631-11755726114084236527_1.darshan │
- │ EXECUTION DATE : 2021-07-23 16:40:31+00:00 to 2021-07-23 16:40:32+00:00 (0.00 hours) │
- │ FILES : 6 files (1 use STDIO, 2 use POSIX, 1 use MPI-IO) │
- │ PROCESSES 64 │
- │ HINTS : romio_no_indep_rw=true cb_nodes=4 │
- │ │
- ╰─ 1 critical issues , 5 warnings , and 5 recommendations ────────────────────────────────────────────────────────────── ─╯
-
- ╭─ METADATA ────────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮
- │ │
- │ ▶ Application is read operation intensive (6.34% writes vs. 93.66% reads) │
- │ ▶ Application might have redundant read traffic (more data was read than the highest read offset) │
- │ ▶ Application might have redundant write traffic (more data was written than the highest write offset) │
- │ │
- ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
- ╭─ OPERATIONS ──────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮
- │ │
- │ ▶ Application issues a high number (285) of small read requests (i.e., < 1MB) which represents 37.11% of all │
- │ read/write requests │
- │ ↪ 284 (36.98%) small read requests are to "benchmark.h5" │
- │ ▶ Application mostly uses consecutive (2.73%) and sequential (90.62%) read requests │
- │ ▶ Application mostly uses consecutive (19.23%) and sequential (76.92%) write requests │
- │ ▶ Application uses MPI-IO and read data using 640 (83.55%) collective operations │
- │ ▶ Application uses MPI-IO and write data using 768 (100.00%) collective operations │
- │ ▶ Application could benefit from non-blocking (asynchronous) reads │
- │ ▶ Application could benefit from non-blocking (asynchronous) writes │
- │ ▶ Application is using inter-node aggregators (which require network communication) │
- │ │
- ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
-
- 2022 | LBL | Drishti report generated at 2022-08-05 13:19:59.787458 in 0.955 seconds
+Drishti ╭─ DRISHTI v.0.3 ───────────────────────────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ JOB : 1190243 │ │ EXECUTABLE : bin/8_benchmark_parallel │ │ DARSHAN : jlbez_8_benchmark_parallel_id1190243_7-23-45631-11755726114084236527_1.darshan │ │ EXECUTION DATE : 2021-07-23 16:40:31+00:00 to 2021-07-23 16:40:32+00:00 (0.00 hours) │ │ FILES : 6 files (1 use STDIO, 2 use POSIX, 1 use MPI-IO) │ │ PROCESSES 64 │ │ HINTS : romio_no_indep_rw=true cb_nodes=4 │ │ │ ╰─ 1 critical issues , 5 warnings , and 5 recommendations ────────────────────────────────────────────────────────────── ─╯ ╭─ METADATA ────────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ ▶ Application is read operation intensive (6.34% writes vs. 93.66% reads) │ │ ▶ Application might have redundant read traffic (more data was read than the highest read offset) │ │ ▶ Application might have redundant write traffic (more data was written than the highest write offset) │ │ │ ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ ╭─ OPERATIONS ──────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ ▶ Application issues a high number (285) of small read requests (i.e., < 1MB) which represents 37.11% of all │ │ read/write requests │ │ ↪ 284 (36.98%) small read requests are to "benchmark.h5" │ │ ▶ Application mostly uses consecutive (2.73%) and sequential (90.62%) read requests │ │ ▶ Application mostly uses consecutive (19.23%) and sequential (76.92%) write requests │ │ ▶ Application uses MPI-IO and read data using 640 (83.55%) collective operations │ │ ▶ Application uses MPI-IO and write data using 768 (100.00%) collective operations │ │ ▶ Application could benefit from non-blocking (asynchronous) reads │ │ ▶ Application could benefit from non-blocking (asynchronous) writes │ │ ▶ Application is using inter-node aggregators (which require network communication) │ │ │ ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ 2022 | LBL | Drishti report generated at 2022-08-05 13:19:59.787458 in 0.955 seconds
\ No newline at end of file
diff --git a/images/sample-io-insights-verbose.svg b/images/sample-io-insights-verbose.svg
index 2a608f8..3e3b310 100644
--- a/images/sample-io-insights-verbose.svg
+++ b/images/sample-io-insights-verbose.svg
@@ -1,634 +1 @@
- Drishti
- ╭─ DRISHTI v.0.3 ───────────────────────────────────────────────────────────────────────────────────────────────────── ─╮
- │ │
- │ JOB : 1190243 │
- │ EXECUTABLE : bin/8_benchmark_parallel │
- │ DARSHAN : jlbez_8_benchmark_parallel_id1190243_7-23-45631-11755726114084236527_1.darshan │
- │ EXECUTION DATE : 2021-07-23 16:40:31+00:00 to 2021-07-23 16:40:32+00:00 (0.00 hours) │
- │ FILES : 6 files (1 use STDIO, 2 use POSIX, 1 use MPI-IO) │
- │ PROCESSES 64 │
- │ HINTS : romio_no_indep_rw=true cb_nodes=4 │
- │ │
- ╰─ 1 critical issues , 5 warnings , and 5 recommendations ────────────────────────────────────────────────────────────── ─╯
-
- ╭─ METADATA ────────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮
- │ │
- │ ▶ Application is read operation intensive (6.34% writes vs. 93.66% reads) │
- │ ▶ Application might have redundant read traffic (more data was read than the highest read offset) │
- │ ▶ Application might have redundant write traffic (more data was written than the highest write offset) │
- │ │
- ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
- ╭─ OPERATIONS ──────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮
- │ │
- │ ▶ Application issues a high number (285) of small read requests (i.e., < 1MB) which represents 37.11% of all │
- │ read/write requests │
- │ ↪ 284 (36.98%) small read requests are to "benchmark.h5" │
- │ ↪ Recommendations: │
- │ ↪ Consider buffering read operations into larger more contiguous ones │
- │ ↪ Since the appplication already uses MPI-IO, consider using collective I/O calls (e.g. MPI_File_read_all() or │
- │ MPI_File_read_at_all()) to aggregate requests into larger ones │
- │ │
- │ ╭─ Solution Example Snippet ─────────────────────────────────────────────────────────────────────────────── ─╮ │
- │ │ │ │
- │ │ 1 MPI_File_open ( MPI_COMM_WORLD , " output-example.txt " , MPI_MODE_CREATE | MPI_MODE_RDONLY , MPI_INFO_NULL , │ │
- │ │ 2 . . . │ │
- │ │ 3 MPI_File_read_all ( fh , & buffer , size , MPI_INT , & s ) ; │ │
- │ │ │ │
- │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │
- │ │
- │ ▶ Application mostly uses consecutive (2.73%) and sequential (90.62%) read requests │
- │ ▶ Application mostly uses consecutive (19.23%) and sequential (76.92%) write requests │
- │ ▶ Application uses MPI-IO and read data using 640 (83.55%) collective operations │
- │ ▶ Application uses MPI-IO and write data using 768 (100.00%) collective operations │
- │ ▶ Application could benefit from non-blocking (asynchronous) reads │
- │ ↪ Recommendations: │
- │ ↪ Since you use MPI-IO, consider non-blocking/asynchronous I/O operations (e.g., MPI_File_iread(), │
- │ MPI_File_read_all_begin/end(), or MPI_File_read_at_all_begin/end()) │
- │ │
- │ ╭─ Solution Example Snippet ─────────────────────────────────────────────────────────────────────────────── ─╮ │
- │ │ │ │
- │ │ 1 MPI_File fh ; │ │
- │ │ 2 MPI_Status s ; │ │
- │ │ 3 MPI_Request r ; │ │
- │ │ 4 . . . │ │
- │ │ 5 MPI_File_open ( MPI_COMM_WORLD , " output-example.txt " , MPI_MODE_CREATE | MPI_MODE_RDONLY , MPI_INFO_NULL │ │
- │ │ 6 . . . │ │
- │ │ 7 MPI_File_iread ( fh , & buffer , BUFFER_SIZE , n , MPI_CHAR , & r ) ; │ │
- │ │ 8 . . . │ │
- │ │ 9 // compute something │ │
- │ │ 10 . . . │ │
- │ │ 11 MPI_Test ( & r , & completed , & s ) ; │ │
- │ │ 12 . . . │ │
- │ │ 13 if ( ! completed ) { │ │
- │ │ 14 // compute something │ │
- │ │ 15 │ │
- │ │ 16 MPI_Wait ( & r , & s ) ; │ │
- │ │ 17 } │ │
- │ │ │ │
- │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │
- │ │
- │ ▶ Application could benefit from non-blocking (asynchronous) writes │
- │ ↪ Recommendations: │
- │ ↪ Since you use MPI-IO, consider non-blocking/asynchronous I/O operations (e.g., MPI_File_iwrite(), │
- │ MPI_File_write_all_begin/end(), or MPI_File_write_at_all_begin/end()) │
- │ │
- │ ╭─ Solution Example Snippet ─────────────────────────────────────────────────────────────────────────────── ─╮ │
- │ │ │ │
- │ │ 1 MPI_File fh ; │ │
- │ │ 2 MPI_Status s ; │ │
- │ │ 3 MPI_Request r ; │ │
- │ │ 4 . . . │ │
- │ │ 5 MPI_File_open ( MPI_COMM_WORLD , " output-example.txt " , MPI_MODE_CREATE | MPI_MODE_WRONLY , MPI_INFO_NULL │ │
- │ │ 6 . . . │ │
- │ │ 7 MPI_File_iwrite ( fh , & buffer , BUFFER_SIZE , MPI_CHAR , & r ) ; │ │
- │ │ 8 . . . │ │
- │ │ 9 // compute something │ │
- │ │ 10 . . . │ │
- │ │ 11 MPI_Test ( & r , & completed , & s ) ; │ │
- │ │ 12 . . . │ │
- │ │ 13 if ( ! completed ) { │ │
- │ │ 14 // compute something │ │
- │ │ 15 │ │
- │ │ 16 MPI_Wait ( & r , & s ) ; │ │
- │ │ 17 } │ │
- │ │ │ │
- │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │
- │ │
- │ ▶ Application is using inter-node aggregators (which require network communication) │
- │ ↪ Recommendations: │
- │ ↪ Set the MPI hints for the number of aggregators as one per compute node (e.g., cb_nodes=32) │
- │ │
- │ ╭─ Solution Example Snippet ─────────────────────────────────────────────────────────────────────────────── ─╮ │
- │ │ │ │
- │ │ 1 # ------------------------------- # │ │
- │ │ 2 # MPICH # │ │
- │ │ 3 # ------------------------------- # │ │
- │ │ 4 export MPICH_MPIIO_HINTS = "*:cb_nodes=16:cb_buffer_size=16777216:romio_cb_write=enable:romio_ds_wri │ │
- │ │ 5 │ │
- │ │ 6 # * means it will apply the hints to any file opened with MPI-IO │ │
- │ │ 7 # cb_nodes ---> number of aggregator nodes, defaults to stripe count │ │
- │ │ 8 # cb_buffer_size ---> controls the buffer size used for collective buffering │ │
- │ │ 9 # romio_cb_write ---> controls collective buffering for writes │ │
- │ │ 10 # romio_cb_read ---> controls collective buffering for reads │ │
- │ │ 11 # romio_ds_write ---> controls data sieving for writes │ │
- │ │ 12 # romio_ds_read ---> controls data sieving for reads │ │
- │ │ 13 │ │
- │ │ 14 # to visualize the used hints for a given job │ │
- │ │ 15 export MPICH_MPIIO_HINTS_DISPLAY = 1 │ │
- │ │ 16 │ │
- │ │ 17 # ------------------------------- # │ │
- │ │ 18 # OpenMPI / SpectrumMPI (Summit) # │ │
- │ │ 19 # ------------------------------- # │ │
- │ │ 20 export OMPI_MCA_io = romio321 │ │
- │ │ 21 export ROMIO_HINTS = ./my-romio-hints │ │
- │ │ 22 │ │
- │ │ 23 # the my-romio-hints file content is as follows: │ │
- │ │ 24 cat $ROMIO_HINTS │ │
- │ │ 25 │ │
- │ │ 26 romio_cb_write enable │ │
- │ │ 27 romio_cb_read enable │ │
- │ │ 28 romio_ds_write disable │ │
- │ │ 29 romio_ds_read disable │ │
- │ │ 30 cb_buffer_size 16777216 │ │
- │ │ 31 cb_nodes 8 │ │
- │ │ │ │
- │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │
- │ │
- │ │
- ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
-
- 2022 | LBL | Drishti report generated at 2022-08-05 13:20:09.160753 in 0.965 seconds
+Drishti ╭─ DRISHTI v.0.3 ───────────────────────────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ JOB : 1190243 │ │ EXECUTABLE : bin/8_benchmark_parallel │ │ DARSHAN : jlbez_8_benchmark_parallel_id1190243_7-23-45631-11755726114084236527_1.darshan │ │ EXECUTION DATE : 2021-07-23 16:40:31+00:00 to 2021-07-23 16:40:32+00:00 (0.00 hours) │ │ FILES : 6 files (1 use STDIO, 2 use POSIX, 1 use MPI-IO) │ │ PROCESSES 64 │ │ HINTS : romio_no_indep_rw=true cb_nodes=4 │ │ │ ╰─ 1 critical issues , 5 warnings , and 5 recommendations ────────────────────────────────────────────────────────────── ─╯ ╭─ METADATA ────────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ ▶ Application is read operation intensive (6.34% writes vs. 93.66% reads) │ │ ▶ Application might have redundant read traffic (more data was read than the highest read offset) │ │ ▶ Application might have redundant write traffic (more data was written than the highest write offset) │ │ │ ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ ╭─ OPERATIONS ──────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ ▶ Application issues a high number (285) of small read requests (i.e., < 1MB) which represents 37.11% of all │ │ read/write requests │ │ ↪ 284 (36.98%) small read requests are to "benchmark.h5" │ │ ↪ Recommendations: │ │ ↪ Consider buffering read operations into larger more contiguous ones │ │ ↪ Since the appplication already uses MPI-IO, consider using collective I/O calls (e.g. MPI_File_read_all() or │ │ MPI_File_read_at_all()) to aggregate requests into larger ones │ │ │ │ ╭─ Solution Example Snippet ─────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ │ │ │ │ 1 MPI_File_open ( MPI_COMM_WORLD , " output-example.txt " , MPI_MODE_CREATE | MPI_MODE_RDONLY , MPI_INFO_NULL , │ │ │ │ 2 . . . │ │ │ │ 3 MPI_File_read_all ( fh , & buffer , size , MPI_INT , & s ) ; │ │ │ │ │ │ │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │ │ │ │ ▶ Application mostly uses consecutive (2.73%) and sequential (90.62%) read requests │ │ ▶ Application mostly uses consecutive (19.23%) and sequential (76.92%) write requests │ │ ▶ Application uses MPI-IO and read data using 640 (83.55%) collective operations │ │ ▶ Application uses MPI-IO and write data using 768 (100.00%) collective operations │ │ ▶ Application could benefit from non-blocking (asynchronous) reads │ │ ↪ Recommendations: │ │ ↪ Since you use MPI-IO, consider non-blocking/asynchronous I/O operations (e.g., MPI_File_iread(), │ │ MPI_File_read_all_begin/end(), or MPI_File_read_at_all_begin/end()) │ │ │ │ ╭─ Solution Example Snippet ─────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ │ │ │ │ 1 MPI_File fh ; │ │ │ │ 2 MPI_Status s ; │ │ │ │ 3 MPI_Request r ; │ │ │ │ 4 . . . │ │ │ │ 5 MPI_File_open ( MPI_COMM_WORLD , " output-example.txt " , MPI_MODE_CREATE | MPI_MODE_RDONLY , MPI_INFO_NULL │ │ │ │ 6 . . . │ │ │ │ 7 MPI_File_iread ( fh , & buffer , BUFFER_SIZE , n , MPI_CHAR , & r ) ; │ │ │ │ 8 . . . │ │ │ │ 9 // compute something │ │ │ │ 10 . . . │ │ │ │ 11 MPI_Test ( & r , & completed , & s ) ; │ │ │ │ 12 . . . │ │ │ │ 13 if ( ! completed ) { │ │ │ │ 14 // compute something │ │ │ │ 15 │ │ │ │ 16 MPI_Wait ( & r , & s ) ; │ │ │ │ 17 } │ │ │ │ │ │ │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │ │ │ │ ▶ Application could benefit from non-blocking (asynchronous) writes │ │ ↪ Recommendations: │ │ ↪ Since you use MPI-IO, consider non-blocking/asynchronous I/O operations (e.g., MPI_File_iwrite(), │ │ MPI_File_write_all_begin/end(), or MPI_File_write_at_all_begin/end()) │ │ │ │ ╭─ Solution Example Snippet ─────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ │ │ │ │ 1 MPI_File fh ; │ │ │ │ 2 MPI_Status s ; │ │ │ │ 3 MPI_Request r ; │ │ │ │ 4 . . . │ │ │ │ 5 MPI_File_open ( MPI_COMM_WORLD , " output-example.txt " , MPI_MODE_CREATE | MPI_MODE_WRONLY , MPI_INFO_NULL │ │ │ │ 6 . . . │ │ │ │ 7 MPI_File_iwrite ( fh , & buffer , BUFFER_SIZE , MPI_CHAR , & r ) ; │ │ │ │ 8 . . . │ │ │ │ 9 // compute something │ │ │ │ 10 . . . │ │ │ │ 11 MPI_Test ( & r , & completed , & s ) ; │ │ │ │ 12 . . . │ │ │ │ 13 if ( ! completed ) { │ │ │ │ 14 // compute something │ │ │ │ 15 │ │ │ │ 16 MPI_Wait ( & r , & s ) ; │ │ │ │ 17 } │ │ │ │ │ │ │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │ │ │ │ ▶ Application is using inter-node aggregators (which require network communication) │ │ ↪ Recommendations: │ │ ↪ Set the MPI hints for the number of aggregators as one per compute node (e.g., cb_nodes=32) │ │ │ │ ╭─ Solution Example Snippet ─────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ │ │ │ │ 1 # ------------------------------- # │ │ │ │ 2 # MPICH # │ │ │ │ 3 # ------------------------------- # │ │ │ │ 4 export MPICH_MPIIO_HINTS = "*:cb_nodes=16:cb_buffer_size=16777216:romio_cb_write=enable:romio_ds_wri │ │ │ │ 5 │ │ │ │ 6 # * means it will apply the hints to any file opened with MPI-IO │ │ │ │ 7 # cb_nodes ---> number of aggregator nodes, defaults to stripe count │ │ │ │ 8 # cb_buffer_size ---> controls the buffer size used for collective buffering │ │ │ │ 9 # romio_cb_write ---> controls collective buffering for writes │ │ │ │ 10 # romio_cb_read ---> controls collective buffering for reads │ │ │ │ 11 # romio_ds_write ---> controls data sieving for writes │ │ │ │ 12 # romio_ds_read ---> controls data sieving for reads │ │ │ │ 13 │ │ │ │ 14 # to visualize the used hints for a given job │ │ │ │ 15 export MPICH_MPIIO_HINTS_DISPLAY = 1 │ │ │ │ 16 │ │ │ │ 17 # ------------------------------- # │ │ │ │ 18 # OpenMPI / SpectrumMPI (Summit) # │ │ │ │ 19 # ------------------------------- # │ │ │ │ 20 export OMPI_MCA_io = romio321 │ │ │ │ 21 export ROMIO_HINTS = ./my-romio-hints │ │ │ │ 22 │ │ │ │ 23 # the my-romio-hints file content is as follows: │ │ │ │ 24 cat $ROMIO_HINTS │ │ │ │ 25 │ │ │ │ 26 romio_cb_write enable │ │ │ │ 27 romio_cb_read enable │ │ │ │ 28 romio_ds_write disable │ │ │ │ 29 romio_ds_read disable │ │ │ │ 30 cb_buffer_size 16777216 │ │ │ │ 31 cb_nodes 8 │ │ │ │ │ │ │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │ │ │ │ │ ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ 2022 | LBL | Drishti report generated at 2022-08-05 13:20:09.160753 in 0.965 seconds
\ No newline at end of file
diff --git a/images/sample-io-insights.svg b/images/sample-io-insights.svg
index ab1f312..f351ed5 100644
--- a/images/sample-io-insights.svg
+++ b/images/sample-io-insights.svg
@@ -1,259 +1 @@
- Drishti
- ╭─ DRISHTI v.0.3 ───────────────────────────────────────────────────────────────────────────────────────────────────── ─╮
- │ │
- │ JOB : 1190243 │
- │ EXECUTABLE : bin/8_benchmark_parallel │
- │ DARSHAN : jlbez_8_benchmark_parallel_id1190243_7-23-45631-11755726114084236527_1.darshan │
- │ EXECUTION DATE : 2021-07-23 16:40:31+00:00 to 2021-07-23 16:40:32+00:00 (0.00 hours) │
- │ FILES : 6 files (1 use STDIO, 2 use POSIX, 1 use MPI-IO) │
- │ PROCESSES 64 │
- │ HINTS : romio_no_indep_rw=true cb_nodes=4 │
- │ │
- ╰─ 1 critical issues , 5 warnings , and 5 recommendations ────────────────────────────────────────────────────────────── ─╯
-
- ╭─ METADATA ────────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮
- │ │
- │ ▶ Application is read operation intensive (6.34% writes vs. 93.66% reads) │
- │ ▶ Application might have redundant read traffic (more data was read than the highest read offset) │
- │ ▶ Application might have redundant write traffic (more data was written than the highest write offset) │
- │ │
- ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
- ╭─ OPERATIONS ──────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮
- │ │
- │ ▶ Application issues a high number (285) of small read requests (i.e., < 1MB) which represents 37.11% of all │
- │ read/write requests │
- │ ↪ 284 (36.98%) small read requests are to "benchmark.h5" │
- │ ↪ Recommendations: │
- │ ↪ Consider buffering read operations into larger more contiguous ones │
- │ ↪ Since the appplication already uses MPI-IO, consider using collective I/O calls (e.g. MPI_File_read_all() or │
- │ MPI_File_read_at_all()) to aggregate requests into larger ones │
- │ ▶ Application mostly uses consecutive (2.73%) and sequential (90.62%) read requests │
- │ ▶ Application mostly uses consecutive (19.23%) and sequential (76.92%) write requests │
- │ ▶ Application uses MPI-IO and read data using 640 (83.55%) collective operations │
- │ ▶ Application uses MPI-IO and write data using 768 (100.00%) collective operations │
- │ ▶ Application could benefit from non-blocking (asynchronous) reads │
- │ ↪ Recommendations: │
- │ ↪ Since you use MPI-IO, consider non-blocking/asynchronous I/O operations (e.g., MPI_File_iread(), │
- │ MPI_File_read_all_begin/end(), or MPI_File_read_at_all_begin/end()) │
- │ ▶ Application could benefit from non-blocking (asynchronous) writes │
- │ ↪ Recommendations: │
- │ ↪ Since you use MPI-IO, consider non-blocking/asynchronous I/O operations (e.g., MPI_File_iwrite(), │
- │ MPI_File_write_all_begin/end(), or MPI_File_write_at_all_begin/end()) │
- │ ▶ Application is using inter-node aggregators (which require network communication) │
- │ ↪ Recommendations: │
- │ ↪ Set the MPI hints for the number of aggregators as one per compute node (e.g., cb_nodes=32) │
- │ │
- ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
-
- 2022 | LBL | Drishti report generated at 2022-08-05 13:20:19.715639 in 0.996 seconds
+Drishti ╭─ DRISHTI v.0.3 ───────────────────────────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ JOB : 1190243 │ │ EXECUTABLE : bin/8_benchmark_parallel │ │ DARSHAN : jlbez_8_benchmark_parallel_id1190243_7-23-45631-11755726114084236527_1.darshan │ │ EXECUTION DATE : 2021-07-23 16:40:31+00:00 to 2021-07-23 16:40:32+00:00 (0.00 hours) │ │ FILES : 6 files (1 use STDIO, 2 use POSIX, 1 use MPI-IO) │ │ PROCESSES 64 │ │ HINTS : romio_no_indep_rw=true cb_nodes=4 │ │ │ ╰─ 1 critical issues , 5 warnings , and 5 recommendations ────────────────────────────────────────────────────────────── ─╯ ╭─ METADATA ────────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ ▶ Application is read operation intensive (6.34% writes vs. 93.66% reads) │ │ ▶ Application might have redundant read traffic (more data was read than the highest read offset) │ │ ▶ Application might have redundant write traffic (more data was written than the highest write offset) │ │ │ ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ ╭─ OPERATIONS ──────────────────────────────────────────────────────────────────────────────────────────────────────── ─╮ │ │ │ ▶ Application issues a high number (285) of small read requests (i.e., < 1MB) which represents 37.11% of all │ │ read/write requests │ │ ↪ 284 (36.98%) small read requests are to "benchmark.h5" │ │ ↪ Recommendations: │ │ ↪ Consider buffering read operations into larger more contiguous ones │ │ ↪ Since the appplication already uses MPI-IO, consider using collective I/O calls (e.g. MPI_File_read_all() or │ │ MPI_File_read_at_all()) to aggregate requests into larger ones │ │ ▶ Application mostly uses consecutive (2.73%) and sequential (90.62%) read requests │ │ ▶ Application mostly uses consecutive (19.23%) and sequential (76.92%) write requests │ │ ▶ Application uses MPI-IO and read data using 640 (83.55%) collective operations │ │ ▶ Application uses MPI-IO and write data using 768 (100.00%) collective operations │ │ ▶ Application could benefit from non-blocking (asynchronous) reads │ │ ↪ Recommendations: │ │ ↪ Since you use MPI-IO, consider non-blocking/asynchronous I/O operations (e.g., MPI_File_iread(), │ │ MPI_File_read_all_begin/end(), or MPI_File_read_at_all_begin/end()) │ │ ▶ Application could benefit from non-blocking (asynchronous) writes │ │ ↪ Recommendations: │ │ ↪ Since you use MPI-IO, consider non-blocking/asynchronous I/O operations (e.g., MPI_File_iwrite(), │ │ MPI_File_write_all_begin/end(), or MPI_File_write_at_all_begin/end()) │ │ ▶ Application is using inter-node aggregators (which require network communication) │ │ ↪ Recommendations: │ │ ↪ Set the MPI hints for the number of aggregators as one per compute node (e.g., cb_nodes=32) │ │ │ ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ 2022 | LBL | Drishti report generated at 2022-08-05 13:20:19.715639 in 0.996 seconds
\ No newline at end of file