Releases: ovis-hpc/ldms
OVIS-4.2.2
This is the stable OVIS-4.2.X release.
Soon master will move from V3 to V4.
OVIS-3.4.13
Changes:
- LDMS_LOG_PATH and gender attribute variable may now contain ${} for shell variable expansion in the daemon's runtime environment.
- ldms-sensors-config expanded to support newer library use of openat.
- simplified self check for host; fixes corner cases seen on clusters.
- extended systemd init scripts to take default schema names from plugin names.
- extended systemd init scripts to provide hooks for general site-specific extensions such as prepopulating job data on non-slurm nodes with the empty data set. The default hook handles the slurm case for the jobid/job_info plugins and may need disabling or tailoring per site in ldmsd.local.conf.
- added csv utilities specific to ldms data export tasks.
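As a rough illustration of the `${}` expansion mentioned in the first item above, the following Python sketch expands shell-style variable references from the process environment. It is not the daemon's implementation: the variable name `LDMS_CLUSTER` and the path are hypothetical, and `os.path.expandvars` only stands in for whatever expansion the daemon and init scripts actually perform.

```python
import os

def expand_log_path(template: str) -> str:
    """Expand ${VAR} references using the current process environment."""
    # os.path.expandvars leaves references to undefined variables unchanged.
    return os.path.expandvars(template)

# Hypothetical variable and path, for illustration only.
os.environ["LDMS_CLUSTER"] = "testcluster"
print(expand_log_path("/var/log/ldmsd/${LDMS_CLUSTER}/ldmsd.log"))
```

The point of the sketch is that the value is resolved in the environment of the running process, so a variable that is unset there will not be substituted.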
OVIS-4.2.1-rc1
OVIS-4.2.1-rc1.
OVIS-4.2_Beta will be deprecated.
OVIS-3.4.12
This release provides testing improvements and bug fixes, usability improvements, and a new sampler feature 'filesingle' for single-metric files typical of sysfs.
General
- Improvements in user input checking in daemon and systemd scripts.
- Added in-daemon check to prevent misconfiguration of aggregators collecting from themselves.
Testing tools
- Slurm-based parallel testing tool pll-ldms-static-test.sh added.
- Option transportdata= added to store_csv to enable collection of transport debugging info.
- Added configure option --enable-mmdebug which disables mmap of transport data and detection of buffer overruns (but only for the sock transport) if environment variable LDMS_ENABLE_MMALLOC_DEBUG is also defined.
Plugins
- Sampler filesingle added for collecting sysfs metrics (temps, volts, speeds, lustre, etc); config helper ldms-sensors-config provided.
- Errors in Lustre 2.8 sampler corrected.
- Added alternate store_csvdbg plugin, which is store_csv compiled with storage of transport data enabled.
- Added chkmeminfo plugin, which is the meminfo plugin compiled with data-corruption-check stored in high bits of metrics.
Systemd
- Provided do-not-repeat-yourself genders configuration of aggregators and stores by adding LDMSD_GENDERS_1 and _2 to allow L1 and L2 aggregators to inspect L0 genders files for connection data.
- Fixed scripting errors in interpretation of genders for certain storage policy specifications.
- Added LDMSD_DEBUG_CONFIG_FILE option to ldmsd.%I.conf which allows arbitrary ldmsd scripting to be appended to genders-based configuration output in /var/run/ldmsd/all-config.%I.
- Fixed error messages from systemd scripts to be tagged with the correct daemon identity instead of 'root'.
Notes:
The lustre2_client sampler in this release does not support Lustre 2.10 and later due to refactoring of the Lustre /proc/sys interface.
OVIS-3.4.11
- Add metric whitelist and blacklist options to store_flatfile plugin.
- Add rolltype=5 rollagain={period} (periodic rollover based on the wall clock time) to store_csv plugin.
- ldmsd genders support changes:
  - Add ldmsd_strgp_POLICYNAME support to customize containers.
  - Fix ldmsaggd_event_thds gender support.
- Add humane diagnostic of missing input files to ldms-static-test.sh.
- Fix ldmsd mis-handling of empty aggregator interval specifications.
- Fix ldms-static-test.sh bug (bug could NOT affect TOSS3 users).
- Fix misformat of network port numbers in some log messages.
OVIS-3.4.10
Changes since 3.4.9:
- Add %{env} support in csv rename template option.
- Add -h option and rollover_created function to ldms-static-test.sh utility.
- Updated man pages.
- Fixes to insecure directory permissions (755) on rename/create in csv store.
- Fix generation order for updtr_start command in ldmsctl_args3.
- Fix init script miscomputed 'instance=' on certain sampler configuration cases.
- Fix ldmsd@agg local example genders file.
- Fix to SLURM prolog example in Plugin_jobid man page.
OVIS-3.4.9
Changes since 3.4.8:
- lnet_stats bug fixed to not report stale data.
- systemd init scripts updated to better handle custom schema names (or lack of them).
- systemd libgenders specification error detection improved.
OVIS-3.4.8
New since 3.4.7:
- a store_rabbitkw (see man Plugin_store_rabbitkw).
- a new script command 'lsdate' for those working with timestamped csv files from store_csv. (man lsdate)
- a bunch of minor improvements to test scripts.
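For readers who have not worked with timestamped csv files, the Python sketch below shows the general idea of turning an epoch-seconds filename suffix into a readable date. It assumes, purely for illustration, files named like `meminfo.1500000000`; it is not the lsdate script and does not reproduce its options or output format (see man lsdate for those).

```python
import re
from datetime import datetime, timezone

def describe(filename: str) -> str:
    """Append a UTC date for a trailing '.<epoch seconds>' suffix, if present."""
    match = re.search(r"\.(\d+)$", filename)
    if match is None:
        # No numeric suffix: return the name unchanged.
        return filename
    stamp = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    return f"{filename}  {stamp:%Y-%m-%d %H:%M:%S} UTC"

# Hypothetical file name, for illustration only.
print(describe("meminfo.1500000000"))
```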
OVIS-3.4.7
Changes in 3.4.7 since 3.4.6
FEATURE ADDS:
- Made rate computation in sysclassib optional.
- Added options to csv store to define uid/gid/perm at file create time.
- Include timezone offset in ldms_ls output date stamps.
- Extended gender options with ldmsd_idsuffix and ldmsd_id.
BUG FIXES:
- Fixed rate computation in sysclassib so that reset-drops appear as negative rates. Before they appeared as large random numbers.
- Fixed computation of host component ids in systemd scripts to account for fixed width (leading 0) integer fields in hostnames.
- Fixed exclusion of 0 and uid/gid > 65536 on csv uid/gid options.
- Detect and log once certain schema name conflicts at store_csv and store_rabbitkw. When the first instance of a schema name is larger than the second to hit the store, missing metrics are detectable. The reverse is not true, and data mislabelling may occur in this case.
- Better logging of store_rabbitkw issues.
- Fixed opa2 sampler log message priority (error -> info).
- Fixed incorrect (excess) detection of comments in config parameter names containing #.
A # following whitespace or beginning a line begins a comment.
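The comment rule stated above (a # begins a comment only at the start of a line or after whitespace) can be sketched in Python as follows. This is a minimal illustration of the rule as worded in these notes, not the ldmsd parser; the function name `strip_comment` is hypothetical and the sketch ignores any quoting or escaping the real parser may support.

```python
import re

# A '#' starts a comment only at the start of the line or after whitespace.
_COMMENT = re.compile(r"(^|\s)#.*$")

def strip_comment(line: str) -> str:
    """Remove a trailing comment, keeping '#' characters embedded in tokens."""
    return _COMMENT.sub("", line).rstrip()

# '#' inside a parameter token is kept; '#' after whitespace starts a comment.
print(strip_comment("config name=a#b value=1   # set up sampler"))
```

Under this rule a parameter such as `name=a#b` survives intact, which is the case the fix addresses.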
OVIS-3.4.6
Changes in 3.4.6 since 3.4.4
FUNCTIONAL CHANGES:
- Added /usr/bin/ldms-static-test.sh and numerous test examples of ldms configuration in /usr/share/doc/ovis-ldms-3.4.6/examples/static-test. See man ldms-static-test. Includes store, sampler, and multilevel aggregation examples.
- Added dstat sampler for monitoring ldmsd itself. Expected use is to be loaded on aggregator and storage ldmsd instances. See Plugin_dstat man page.
- Added jobid collection support to lustre2_client sampler.
- Added opa2 sampler to collect omnipath hfi interface metrics. See Plugin_opa2 man page.
- Updated libgenders support for managing ports in init scripts (see man ldms-attributes):
  - ldmsd_use_unix_socket
  - ldmsd_sockpath
  - ldmsd_use_inet_socket
  - ldmsd_config_port
  - ldmsd_log
  - ldmsd_vg
  - ldmsd_vgargfile
- Added filters to trap and warn about common gender spelling and punctuation errors.
- Split the build/install of libgenders/boost tool from install of systemd scripts. Systemd scripts can be used without the ldmsctl_args3 tool if the user provides the daemon configuration commands in a named script listed in ldmsd.local.conf.
- Added missing man pages for samplers ported from LDMS v2: clock, procstat, sysclassib, jobid, lustre2_client, procsensors.
- New/updates to man, plugins for cray samplers aries_linkstatus, aries_mmr.
- Changed defaults in systemd scripts to allow more open files at aggregators and syslogid.
- Fixed overzealous failure condition handling in ldms_jobid.
- Added debug output of registered memory (mmalloc) in use at exit to better bound -m option value needed for ldmsd instances. New mm_stat call in lib/mmalloc supplies the data.
SECURITY CHANGES:
- Fixed default insecure (commonly known secret) ldmsauth file. Now it is invalid by default (too short).
RUNTIME CHANGES/BUG FIXES:
- Fixed C bugs in store related code:
  - idx_delete
  - notification (memory leak)
  - avl (attribute/value list handling of error conditions)
  - thread locking error in store_csv
- Fixed C bugs in network transports:
  - rdma connection resource leaks in error handling cases.
- Fixed C bugs in samplers:
  - jobid minor fixes
  - procnfs sampler now accounts for variations in nfs file layout. The procnfs sampler has never supported nfsv4 metrics and does not now.
  - Reduced repetitive logging of the same transient failure conditions.
  - Updated several samplers to run through transient disappearance of /proc.
HOUSEKEEPING CHANGES:
- Removed LDMS_BUILDTYPE from systemd control scripts (it was preventing relocatability, and is in any case obsolete).
- Remove most old packaging scripts from ldms source tree packaging/ directory.
- Change install permissions on pedigree script.
- Update rpath macro in build (deprecates some old apple os versions).
- Made rpms fully relocatable without forcing the user to manually set ld and zap related environment variables before invocation. This entails wrapping all the sbin/ldms binaries in .ldms-wrapper. Thanks to cray for assistance in this.
DEVELOPER CHANGES:
- Updated installed include files and /usr/lib/ovis-[ldms/lib]-configvars.sh so that 3rd party plugins can be built when only the installed ldms binaries and headers are used.
- Updated .gitignore settings.