All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
0.24.2 - 01-24-2024
- 823 Corrected a mistake in converting duration to seconds.
- 821 add container_start_args to pass options to the start command.
0.24.1 - 11-29-2023
0.24.0 - 11-28-2023
- Code cleanup and separate arguments with whitespace in Fujitsu TCS adapter by @mnakao in #808
- Add OUT_OF_MEMORY state for Slurm by @robinkar in #809
- find_port: avoid infinite loop by @utkarshayachit in #811
- handle find_port error codes by @utkarshayachit in #812
- vnc: run websockify as background process by @utkarshayachit in #813
- Add working_dir option for Fujitsu TCS job scheduler by @mnakao in #816
- Minor fix for Fujitsu TCS by @mnakao in #817
- Update rake requirement from ~> 13.0.1 to ~> 13.1.0 by @dependabot in #814
- Changes default return value for cluster.batch_connect_ssh_allow? by @HazelGrant in #818
0.23.5 - 04-10-2023
- 804 fixed a kubernetes bug in the `info_all` code path.
- Slurm `-M` flag now correctly accounts for full path `sacctmgr` commands in 807.
0.23.4 - 03-06-2023
- 800 fixed some Fujitsu bugs.
0.23.3 - 02-17-2023
- ACLs now respond to `allowlist` and `blocklist` in 795.
- Sites can now use `OOD_SSH_PORT` to use a nonstandard port in 797.
0.23.2 - 02-02-2023
- The linux host adapter should correctly extract the full apptainer pid in 794.
0.23.1 - 02-01-2023
- `QueueInfo` objects also upcase accounts when applicable in 792.
- `queue_name` has the alias `queue` in 790.
0.23.0 - 01-17-2023
- 787 added the `queues` API to the adapter class with support for Slurm.
- 783 added the `accounts` API to the adapter class with support for Slurm.
- The linux host adapter now supports apptainer in 788.
0.22.0 - 10-31-2022
- Added the `vnc_container` batch connect template in 774.
- https://osc.github.io/ood_core is now updated on every commit to master in 765.
- Kubernetes can now read multiple secrets in 778.
- PBSPro correctly reads usernames with periods in them in 780.
0.21.0 - 08-01-2022
- Added the `fujitsu_tcs` adapter in 766.
0.20.2 - 07-28-2022
0.20.1 - 07-21-2022
- Fixed turbovnc compatibility issue with the -nohttpd flag in 767.
0.20.0 - 06-03-2022
- Adapters can now respond to `cluster_info` in 752. This returns information about the cluster like how many nodes are available and so on. Only Slurm support in this release.
- `OodCore::Job::Info` now has a `gpus` attribute in 753. Only Slurm support in this release.
- Support Ruby 3 in 759
0.19.0 - 02-03-2022
- Systemd adapter in 743.
- The linux host adapter is a little more portable in 333.
- Improved pod security for the k8s adapter in 748.
0.18.1 - 10-18-2021
- Fixed kubernetes initialization in 331.
0.18.0 - 10-18-2021
- Fixed LHA crashing on strange bash output in 322.
- All adapters now respond to #{adapter}? methods like slurm?, pbspro?, kubernetes? and so on in 326.
- The kubernetes adapter now expects to set context statically in 324 and can now accept context as a part of its interface. It will now also always send --context when using OIDC, and that context defaults to the clustername in 327.
- Removed the activesupport dependency in 329.
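
The `#{adapter}?` predicates from 326 can be pictured with the sketch below. It is a minimal illustration, not ood_core's actual implementation: the `SketchAdapter` class and the `ADAPTER_NAMES` list are hypothetical names introduced here.

```ruby
# Hypothetical stand-in for an adapter base class; not ood_core's real code.
class SketchAdapter
  # Assumed list of adapter names, for illustration only.
  ADAPTER_NAMES = %w[slurm pbspro kubernetes].freeze

  attr_reader :adapter_name

  def initialize(adapter_name)
    @adapter_name = adapter_name.to_s
  end

  # Define slurm?, pbspro?, kubernetes? and so on, one predicate per known name.
  ADAPTER_NAMES.each do |name|
    define_method(:"#{name}?") { adapter_name == name }
  end
end

adapter = SketchAdapter.new(:slurm)
adapter.slurm?      # true
adapter.kubernetes? # false
```

Generating the predicates from a single list keeps them in step with the set of known adapters, which is one plausible reason to prefer this over hand-written methods.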
0.17.6 - 8-24-2021
- kubernetes now allows for arbitrary labels to be set in 317.
- kubernetes now allows for limits and requests to be different in 318.
0.17.5 - 8-20-2021
- kubernetes jobs delete without waiting in 314.
0.17.4 - 7-29-2021
Functionally the same as 0.17.3 but with some CI updates.
0.17.3 - 7-29-2021
- Fixed handling of pods in a startup phase in 303.
- Enable automatic population of supplemental groups in 305.
0.17.2 - 7-14-2021
- Fixed k8s adapter to only show Running pods as running in 300.
0.17.1 - 6-14-2021
- Fixed 278 where unschedulable pods will now show up as queued_held status.
- KUBECONFIG now defaults to /dev/null in the kubernetes adapter in 292.
- Sites can now set `batch_connect.ssh_allow` on the cluster to disable the buttons to start a shell session to compute nodes in 289.
- `POD_PORT` is now available to jobs in the kubernetes adapter in 290.
- Kubernetes pods now support a startProbe in 291.
0.17.0 - 5-26-2021
- All Kubernetes resources now have the same labels in 280.
- Kubernetes does not crash when no configmap is defined in 282.
- Kubernetes will not specify init containers if there are none in 284.
- Kubernetes, Slurm and Torque now support the script option `gpus_per_node` in 266.
- Kubernetes will now save the pod.yml into the staged root in 277.
- Kubernetes now allows for node selector in 264.
- Kubernetes pods now have access to the environment variable POD_NAMESPACE in 275.
- Kubernetes pods can now specify the image pull policy in 272.
- Cluster config's batch_connect now supports `ssh_allow` to disable sshing to compute nodes per cluster in 286.
- Kubernetes will now add the templated script content to a configmap in 273.
- Kubernetes username prefix no longer appends a - in 271.
0.16.1 - 2021-04-23
- memoized some allow? variables to have better support around ACLs in 267
0.16.0 - 2021-04-20
- Changed how k8s configmaps are defined in 251. The data structure now expects a key called files which is an array of objects that hold filename, data, mount_path, sub_path and init_mount_path. 255 also relates to this interface change.
- The k8s adapter can now specify environment variables and creates defaults in 252.
- The k8s adapter can now specify image pull secrets in 253.
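
As a rough sketch of the configmap shape described for 251, each entry under files carries the five keys named in that entry. Only the key names come from the changelog. Every value below is invented for illustration, and where this structure sits inside a real job script is not shown here.

```ruby
# Illustrative only: key names are from the 251 entry, all values are hypothetical.
configmap = {
  files: [
    {
      filename: "config.yml",          # hypothetical value
      data: "key: value\n",            # hypothetical value
      mount_path: "/etc/app",          # hypothetical value
      sub_path: "config.yml",          # hypothetical value
      init_mount_path: "/etc/app-init" # hypothetical value
    }
  ]
}

# Simple check that every file entry carries the expected keys.
required = %i[filename data mount_path sub_path init_mount_path]
configmap[:files].each do |file|
  missing = required - file.keys
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?
end
```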
0.15.1 - 2021-02-25
- kubernetes adapter uses the full module for helpers in 245.
- kubernetes pods spawn with runAsNonRoot set to true in 247.
- kubernetes pods can spawn with supplemental groups along with some other security defaults in 246.
0.15.0 - 2021-01-26
- ccq adapter now accepts job names with spaces in 210
- k8s correctly handles having no mount volumes in 239
- k8s adapter now applies account metadata to resources in 216 and 231
- k8s adapter can now prefix namespaces in 218
- k8s adapter now applies time limits to pods in 224
- testing automation is now done in github actions in 221
- updated bundler to 2.1.4 and ruby to 2.7 in 235
- k8s adapter more appropriately labels unschedulable pods as queued in 230
- k8s adapter now uses the script#ood_connection_info API instead of script#native in 222
0.14.0 - 2020-10-01
- Kubernetes adapter in PR 156
0.13.0 - 2020-08-10
- CloudyCluster CCQ Adapter
0.12.0 - 2020-08-05
- qos option to Slurm and Torque #205
- native hash returned in qstat for SGE adapter #198
- option for specifying `submit_host` to submit jobs via ssh on other host #204
- SGE handle milliseconds instead of seconds when milliseconds used #206
- Torque's native "hash" for job submission now handles env vars values with spaces #202
0.11.4 - 2020-05-27
0.11.3 - 2020-05-11
- LinuxHost Adapter to work with any login shell (#188)
- LinuxHost Adapter needs to display long lines in pstree to successfully parse output (#188)
0.11.2 - 2020-04-23
- fix signature of `LinuxHost#info_where_owner`
0.11.1 - 2020-03-18
- Only the version changed. Had to republish to rubygems.org
0.11.0 - 2020-03-18
- Added directive prefixes to each adapter (e.g. `#QSUB`) (#161)
- LHA supports `submit_host` field in native (#164)
- Cluster files can be yaml or yml extensions (#171)
- Users can add a flag `OOD_JOB_NAME_ILLEGAL_CHARS` to sanitize job names (#183)
- Simplified job array parsing (#144)
- Issue where environment variables were not properly exported to the job (#158)
- Parsing bad cluster files (#150 and #178)
- netcat is no longer a hard dependency. Now lsof, python and bash can be used (#153)
- GE crash when nil config file was given (#175)
- GE sometimes reported incorrect core count (#168)
0.10.0 - 2019-11-05
- Added an adapter for submitting work on Linux hosted systems without using a scheduler
- Fixed bug where an unreadable cluster config would cause crashes
0.9.3 - 2019-05-08
- Fixed bug relating to cluster comparison
0.9.2 - 2019-05-08
- When `squeue` returns '(null)' for an account the Slurm adapter will now convert that to `nil`
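
The '(null)' handling in 0.9.2 amounts to a small normalization step. The sketch below shows the idea only. `normalize_account` is a hypothetical helper, not the Slurm adapter's actual code, and the account name is made up.

```ruby
# Hypothetical helper, not the Slurm adapter's actual code.
# Maps the literal string squeue prints for a missing account to nil.
def normalize_account(raw)
  raw == "(null)" ? nil : raw
end

normalize_account("(null)")  # nil
normalize_account("PZS0001") # "PZS0001" (made-up account name)
```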
0.9.1 - 2019-05-07
- Added logic to `OodCore::Job::ArrayIds` to return an empty array when the array request is invalid
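
The 0.9.1 behaviour, an empty array rather than an error for an invalid array request, might look like the sketch below. `sketch_array_ids` is a hypothetical function, and the grammar it accepts (comma-separated ids and `a-b` ranges) is a simplifying assumption, not the full syntax `OodCore::Job::ArrayIds` handles.

```ruby
# Hypothetical parser for specs like "1-3,5"; not the real OodCore::Job::ArrayIds.
def sketch_array_ids(spec)
  spec.to_s.split(",").flat_map do |part|
    case part
    when /\A(\d+)-(\d+)\z/ then ($1.to_i..$2.to_i).to_a
    when /\A\d+\z/         then [part.to_i]
    else raise ArgumentError, "invalid array request: #{spec}"
    end
  end
rescue ArgumentError
  [] # an invalid request yields an empty array instead of an error
end

sketch_array_ids("1-3,5") # [1, 2, 3, 5]
sketch_array_ids("bogus") # []
```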
0.9.0 - 2019-05-04
- Job array support for LSF and PBSPro
- Slurm adapter uses `squeue` owner filter (`-u`) for `info_where_owner`
- Grid Engine adapter now starts scripts in the current directory like all other adapters
- Fixed issue where Slurm comment field might break job info parsing
- Fixed possible crash when comparing two clusters if the id of one of the clusters is nil
- Fixed bug with the live system test that impacted non-LSF systems
- Fixed bug with Slurm adapter when submit time is not available
0.8.0 - 2019-01-29
- info_all_each and info_where_owner_each super class methods
- job array support for Torque, Slurm, and SGE (currently missing from LSF and PBSPro)
- `OodCore::Job::Status#precedence` for the ability to get an overall status for a group of jobs
- Fix SGE adapter to specify `-u '*'` when calling qstat to get all jobs
0.7.1 - 2019-01-11
- Fixed crash when libdrmaa is used to query for a job no longer in the queue
0.7.0 - 2018-12-26
- Addition of an optional live system test of a configurable job adapter
- Fix Torque adapter crash by fixing scope resolution on Attrl and Attropl
- Fix SGE adapter crash in `OodCore::Job::Adapters::Sge::Batch#get_info_enqueued_job` when libdrmaa is not available (DRMAA constant not defined)
- Always set `SGE_ROOT` env var, for both SGE commands via popen and when using libdrmaa
- Use libdrmaa only when libdrmaa is set in the cluster config
0.6.0 - 2018-12-19
- Added ability to override the default password length
- Merge the pbs-ruby gem removing that as a dependency, but adding FFI
- Added support for overriding resource manager client executables using `bin_overrides` in the cluster configs
- Add support for the Grid Engine resource manager (tested on GE 6.2u5 and UGE 8.0.1)
- Fixed a bug in password creation where certain locales resulted in invalid passwords #91
0.5.1 - 2018-05-14
0.5.0 - 2018-04-30
- Added missing "Waiting" state to the Torque adapter as `:queued_held`.
- Changed the "Waiting" state in the PBSPro adapter to `:queued_held`.
0.4.0 - 2018-04-20
- Updated Torque adapter to take into account the new `Script#native` format allowing for arrays. #65
0.3.0 - 2018-04-05
- Basic multi-cluster support for LSF by specifying name of cluster for -m argument. #24
- Added `OodCore::Job::Script#shell_path` as an option to all adapters. #82
- Added `header` and `footer` options to a Batch Connect template. #64
- Replaced `Fixnum` code comments with `Integer`. #67
0.2.1 - 2018-01-26
- Updated the date in the `LICENSE.txt` file.
- Fixed bug where LSF adapter would sometimes return `nil` when getting job info. #75
- Fixed list of allocated nodes for LSF adapter when single node is expanded for each core. #71
- Clean up children processes in forked Batch Connect main script before cleaning up batch script. #69
- Fix bug when detecting open ports using the bash helpers in the Batch Connect template. #70
0.2.0 - 2017-10-11
- Added Batch Connect helper function to wait for port to be used. #57
- Can include Batch Connect helper functions when writing to files or running remote code. #58
- The Batch Connect helper functions are now available to use in the forked Batch Connect main script. #59
- The `host` and `port` environment variables are now available to use in the forked Batch Connect main script. #60
- Fixed a bug with the `nc` command used in the Batch Connect helper functions for CentOS 7. #55
- Fixed not correctly detecting open ports for specific ip address in Batch Connect helper functions. #56
- Fixed a bug when parsing nodes in the Slurm adapter. #54
0.1.1 - 2017-09-08
- fix crash when calling `Adapters::Lsf#info(id:)` with "invalid" id
- optimize `Adapters::Lsf#info_where_owner` by using `bjobs -u $USER` when a single user is specified
0.1.0 - 2017-07-17
- Setting the host in a batch_connect batch script can now be directly manipulated through the `set_host` initialization parameter. #42
0.0.5 - 2017-07-05
- Add wallclock time limit to `OodCore::Job::Info` object.
- Add further support for the LSF adapter.
- Add a new Batch Connect template feature that builds batch scripts to launch web servers.
- Add support for the PBS Professional resource manager.
- Add method to filter list of batch jobs for a given owner or owners.
- Torque adapter provides nodes/procs info if available for non-running jobs.
- Slurm adapter provides node info if available for non-running jobs.
- Changed the `CHANGELOG.md` formatting.
- Remove deprecated tests for the Slurm adapter.
- Fix parsing bjobs output for LSF 9.1, which has extra SLOTS column.
0.0.4 - 2017-05-17
- By default all PBS jobs output stdout & stderr to output path unless an error path is specified (mimics behavior of Slurm and LSF)
- Remove `OodCore::Job::Script#min_phys_memory` due to lack of commonality across resource managers.
- Remove `OodCore::Job::Script#join_files` due to lack of support in resource managers.
0.0.3 - 2017-04-28
- Provide support for Slurm conf file.
- Correct code documentation for `Script#min_phys_memory`.
- Add fix for login feature being allowed on all clusters even if not defined.
0.0.2 - 2017-04-27
- Remove the `OodCore::Job::NodeRequest` object.
- Initial release!