Support for 23.11.x (#332)
* upgrade: generate new header files (using 23.11.1 as base)

* upgrade: fixes for slurm 23.11.x compatibility

- slurm_kill_job_step has a new flags param
- hostlist_t typedef has changed
- CR_OTHER_CONS_RES removed upstream
- whitespace fixes

* upgrade: bump version

* upgrade: fixes for pyslurm.pyx

- remove route_plugin key from slurm config
- remove job_credential_* from slurm_config
- resv_msg core_cnt/node_cnt is now a simple uint32_t
- remove obsolete constants
- whitespace fixes

* upgrade: update additional struct definitions

* update README

* update setup.cfg

* update CHANGELOG

* update sbatch_opts.pyx

* support [one|multiple]-tasks-per-sharing of gres-flags

* move old-api specific helper function into pyslurm.pyx
tazend authored Jan 27, 2024
1 parent b20ff49 commit 8489b36
Showing 24 changed files with 947 additions and 866 deletions.
92 changes: 16 additions & 76 deletions CHANGELOG.md
@@ -5,98 +5,38 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## Unreleased on the [23.2.x](https://github.com/PySlurm/pyslurm/tree/23.2.x) branch
+## Unreleased on the [23.11.x](https://github.com/PySlurm/pyslurm/tree/23.11.x) branch

- New Classes to interact with Database Associations (WIP)
- `pyslurm.db.Association`
- `pyslurm.db.Associations`
- New Classes to interact with Database QoS (WIP)
- `pyslurm.db.QualityOfService`
- `pyslurm.db.QualitiesOfService`

## [23.11.0](https://github.com/PySlurm/pyslurm/releases/tag/v23.11.0) - 2024-01-27

### Added

- Support for Slurm 23.11.x
- Add `truncate_time` option to `pyslurm.db.JobFilter`, which is the same as -T /
--truncate from sacct.
- Add new Attributes to `pyslurm.db.Jobs` that help gathering statistics for a
- Add new attributes to `pyslurm.db.Jobs` that help gathering statistics for a
collection of Jobs more convenient.
- Add new attribute `gres_tasks_per_sharing` to `pyslurm.Job` and
`pyslurm.JobSubmitDescription`
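
The new `gres_tasks_per_sharing` attribute reports one of two job bit flags as a string. The following is a minimal, pure-Python sketch of that decoding, mirroring the property added to `pyslurm/core/job/job.pyx` in this commit. The numeric flag values are placeholders chosen for illustration (an assumption); the real constants are defined in Slurm's headers.

```python
# Placeholder bit positions, for illustration only. The real values of
# GRES_ONE_TASK_PER_SHARING and GRES_MULT_TASKS_PER_SHARING come from slurm.h.
GRES_ONE_TASK_PER_SHARING = 1 << 0
GRES_MULT_TASKS_PER_SHARING = 1 << 1


def gres_tasks_per_sharing(bitflags):
    """Map a job's bitflags to "multiple", "one", or None.

    The "multiple" flag is checked first, matching the order used in job.pyx.
    """
    if bitflags & GRES_MULT_TASKS_PER_SHARING:
        return "multiple"
    elif bitflags & GRES_ONE_TASK_PER_SHARING:
        return "one"
    return None


if __name__ == "__main__":
    print(gres_tasks_per_sharing(GRES_ONE_TASK_PER_SHARING))
    print(gres_tasks_per_sharing(0))
```

When neither flag is set the function returns `None`, which matches the `else` branch of the property.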

### Fixed

- Fix `allocated_gres` attribute in the `pyslurm.Node` Class returning nothing.
- Add new `idle_memory` and `allocated_tres` attributes to `pyslurm.Node` class
- Fix Node State being displayed as `ALLOCATED` when it should actually be
`MIXED`.
- Fix crash for the `gres_per_node` attribute of the `pyslurm.Job` class when
the GRES String received from Slurm contains no count.

## [23.2.2](https://github.com/PySlurm/pyslurm/releases/tag/v23.2.2) - 2023-07-18

### Added

- Ability to modify Database Jobs
- New classes to interact with the Partition API
- [pyslurm.Partition][]
- [pyslurm.Partitions][]
- New attributes for a Database Job:
- `extra`
- `failed_node`
- Added a new Base Class [MultiClusterMap][pyslurm.xcollections.MultiClusterMap] that some Collections inherit from.
- Added `to_json` function to all Collections

### Fixed

- Fixes a problem that prevented loading specific Jobs from the Database if
the following two conditions were met:
- no start/end time was specified
- the Job was older than a day

### Changed

- Improved Docs
- Renamed `JobSearchFilter` to [pyslurm.db.JobFilter][]
- Renamed `as_dict` function of some classes to `to_dict`

## [23.2.1](https://github.com/PySlurm/pyslurm/releases/tag/v23.2.1) - 2023-05-18

### Added

- Classes to interact with the Job and Submission API
- [pyslurm.Job](https://pyslurm.github.io/23.2/reference/job/#pyslurm.Job)
- [pyslurm.Jobs](https://pyslurm.github.io/23.2/reference/job/#pyslurm.Jobs)
- [pyslurm.JobStep](https://pyslurm.github.io/23.2/reference/jobstep/#pyslurm.JobStep)
- [pyslurm.JobSteps](https://pyslurm.github.io/23.2/reference/jobstep/#pyslurm.JobSteps)
- [pyslurm.JobSubmitDescription](https://pyslurm.github.io/23.2/reference/jobsubmitdescription/#pyslurm.JobSubmitDescription)
- Classes to interact with the Database Job API
- [pyslurm.db.Job](https://pyslurm.github.io/23.2/reference/db/job/#pyslurm.db.Job)
- [pyslurm.db.Jobs](https://pyslurm.github.io/23.2/reference/db/job/#pyslurm.db.Jobs)
- [pyslurm.db.JobStep](https://pyslurm.github.io/23.2/reference/db/jobstep/#pyslurm.db.JobStep)
- [pyslurm.db.JobFilter](https://pyslurm.github.io/23.2/reference/db/jobsearchfilter/#pyslurm.db.JobFilter)
- Classes to interact with the Node API
- [pyslurm.Node](https://pyslurm.github.io/23.2/reference/node/#pyslurm.Node)
- [pyslurm.Nodes](https://pyslurm.github.io/23.2/reference/node/#pyslurm.Nodes)
- Exceptions added:
- [pyslurm.PyslurmError](https://pyslurm.github.io/23.2/reference/exceptions/#pyslurm.PyslurmError)
- [pyslurm.RPCError](https://pyslurm.github.io/23.2/reference/exceptions/#pyslurm.RPCError)
- [Utility Functions](https://pyslurm.github.io/23.2/reference/utilities/#pyslurm.utils)

### Changed

- Completely overhaul the documentation, switch to mkdocs
- Rework the tests: Split them into unit and integration tests

### Deprecated

- Following classes are superseded by new ones:
- [pyslurm.job](https://pyslurm.github.io/23.2/reference/old/job/#pyslurm.job)
- [pyslurm.node](https://pyslurm.github.io/23.2/reference/old/node/#pyslurm.node)
- [pyslurm.jobstep](https://pyslurm.github.io/23.2/reference/old/jobstep/#pyslurm.jobstep)
- [pyslurm.slurmdb_jobs](https://pyslurm.github.io/23.2/reference/old/db/job/#pyslurm.slurmdb_jobs)

## [23.2.0](https://github.com/PySlurm/pyslurm/releases/tag/v23.2.0) - 2023-04-07

### Added

- Support for Slurm 23.02.x ([f506d63](https://github.com/PySlurm/pyslurm/commit/f506d63634a9b20bfe475534589300beff4a8843))

### Removed

- `Elasticsearch` debug flag from `get_debug_flags`
- `launch_type`, `launch_params` and `slurmctld_plugstack` keys from the
`config.get()` output
- Some constants (mostly `ESLURM_*` constants that do not exist
anymore)
- `route_plugin`, `job_credential_private_key` and `job_credential_public_certificate`
keys are removed from the output of `pyslurm.config().get()`
- Some deprecated and unused Slurm constants
10 changes: 5 additions & 5 deletions README.md
@@ -8,16 +8,16 @@ pyslurm is the Python client library for the [Slurm Workload Manager](https://sl
* [Python](https://www.python.org) - >= 3.6
* [Cython](https://cython.org) - >= 0.29.36

-This Version is for Slurm 23.02.x
+This Version is for Slurm 23.11.x

## Versioning

In pyslurm, the versioning scheme follows the official Slurm versioning. The
first two numbers (`MAJOR.MINOR`) always correspond to Slurms Major-Release,
-for example `23.02`.
+for example `23.11`.
The last number (`MICRO`) is however not tied in any way to Slurms `MICRO`
version, but is instead PySlurm's internal Patch-Level. For example, any
-pyslurm 23.02.X version should work with any Slurm 23.02.X release.
+pyslurm 23.11.X version should work with any Slurm 23.11.X release.
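
The versioning rule above can be sketched as a small compatibility check. This is an illustrative helper, not part of pyslurm: the names `release_series` and `is_compatible` are hypothetical. It compares only `MAJOR.MINOR` and parses both as integers, so that a pyslurm version such as `23.2.2` matches a Slurm `23.02.x` release.

```python
from typing import Tuple


def release_series(version: str) -> Tuple[int, int]:
    """Return (MAJOR, MINOR) as integers, ignoring the MICRO part."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_compatible(pyslurm_version: str, slurm_version: str) -> bool:
    """True if both versions belong to the same Slurm major release."""
    return release_series(pyslurm_version) == release_series(slurm_version)


if __name__ == "__main__":
    print(is_compatible("23.11.0", "23.11.1"))
    print(is_compatible("23.2.2", "23.11.1"))
```

Comparing integers rather than strings is a deliberate choice here, because the pyslurm version string drops the leading zero of the minor number (`23.2` versus Slurm's `23.02`).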

## Installation

@@ -29,8 +29,8 @@ the corresponding paths to the necessary files.
You can specify those with environment variables (recommended), for example:

```shell
-export SLURM_INCLUDE_DIR=/opt/slurm/23.02/include
-export SLURM_LIB_DIR=/opt/slurm/23.02/lib
+export SLURM_INCLUDE_DIR=/opt/slurm/23.11/include
+export SLURM_LIB_DIR=/opt/slurm/23.11/lib
```

Then you can proceed to install pyslurm, for example by cloning the Repository:
2 changes: 1 addition & 1 deletion pyslurm/__version__.py
@@ -5,4 +5,4 @@
# The last Number "Z" is the current Pyslurm patch version, which should be
# incremented each time a new release is made (except when migrating to a new
# Slurm Major release, then set it back to 0)
-__version__ = "23.2.2"
+__version__ = "23.11.0"
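
The bump rule described in the comment above can be sketched as follows. The helper `next_version` is hypothetical and only illustrates the rule: increment the patch number for a normal release, and reset it to 0 when moving to a new Slurm major release.

```python
def next_version(current, new_slurm_series=None):
    """Compute the next pyslurm version string.

    current: the current version, e.g. "23.2.2".
    new_slurm_series: optional "MAJOR.MINOR" of a new Slurm major release,
        e.g. "23.11". When given, the patch number is reset to 0.
    """
    major, minor, patch = (int(part) for part in current.split("."))
    if new_slurm_series is not None:
        new_major, new_minor = (int(p) for p in new_slurm_series.split("."))
        return f"{new_major}.{new_minor}.0"
    return f"{major}.{minor}.{patch + 1}"


if __name__ == "__main__":
    print(next_version("23.2.1"))
    print(next_version("23.2.2", "23.11"))
```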
4 changes: 3 additions & 1 deletion pyslurm/core/job/job.pxd
@@ -347,9 +347,11 @@ cdef class Job:
gres_per_node (dict):
Generic Resources (e.g. GPU) this Job is using per Node.
profile_types (list):
Types for which detailed accounting data is collected.
Types for which detailed accounting data is collected.
gres_binding (str):
Binding Enforcement of a Generic Resource (e.g. GPU).
gres_tasks_per_sharing (str):
Task Sharing of a Generic Resource (e.g. GPU).
kill_on_invalid_dependency (bool):
Whether the Job should be killed on an invalid dependency.
spreads_over_nodes (bool):
59 changes: 34 additions & 25 deletions pyslurm/core/job/job.pyx
@@ -74,7 +74,7 @@ cdef class Jobs(MultiClusterMap):
"""Retrieve all Jobs from the Slurm controller
Args:
preload_passwd_info (bool, optional):
preload_passwd_info (bool, optional):
Decides whether to query passwd and groups information from
the system.
Could potentially speed up access to attributes of the Job
@@ -246,7 +246,7 @@ cdef class Job:
job_info_msg_t *info = NULL
Job wrap = None

try:
try:
verify_rpc(slurm_load_job(&info, job_id, slurm.SHOW_DETAIL))

if info and info.record_count:
@@ -282,7 +282,7 @@ cdef class Job:
cdef _swap_data(Job dst, Job src):
cdef slurm_job_info_t *tmp = NULL
if dst.ptr and src.ptr:
tmp = dst.ptr
tmp = dst.ptr
dst.ptr = src.ptr
src.ptr = tmp

@@ -305,7 +305,7 @@ cdef class Job:
Implements the slurm_signal_job RPC.
Args:
signal (Union[str, int]):
signal (Union[str, int]):
Any valid signal which will be sent to the Job. Can be either
a str like `SIGUSR1`, or simply an [int][].
steps (str):
@@ -315,7 +315,7 @@ cdef class Job:
signaled.
The value `batch` in contrast means, that only the batch-step
will be signaled. With `all` every step is signaled.
hurry (bool):
hurry (bool):
If True, no burst buffer data will be staged out. The default
value is False.
@@ -338,7 +338,7 @@ cdef class Job:
flags |= slurm.KILL_FULL_JOB
elif steps.casefold() == "batch":
flags |= slurm.KILL_JOB_BATCH

if hurry:
flags |= slurm.KILL_HURRY

@@ -417,7 +417,7 @@ cdef class Job:
Examples:
>>> import pyslurm
>>>
>>>
>>> # Setting the new time-limit to 20 days
>>> changes = pyslurm.JobSubmitDescription(time_limit="20-00:00:00")
>>> pyslurm.Job(9999).modify(changes)
@@ -442,10 +442,10 @@ cdef class Job:
Examples:
>>> import pyslurm
>>>
>>>
>>> # Holding a Job (in "admin" mode by default)
>>> pyslurm.Job(9999).hold()
>>>
>>>
>>> # Holding a Job in "user" mode
>>> pyslurm.Job(9999).hold(mode="user")
"""
@@ -483,11 +483,11 @@ cdef class Job:
Examples:
>>> import pyslurm
>>>
>>>
>>> # Requeing a Job while allowing it to be
>>> # scheduled again immediately
>>> pyslurm.Job(9999).requeue()
>>>
>>>
>>> # Requeing a Job while putting it in a held state
>>> pyslurm.Job(9999).requeue(hold=True)
"""
@@ -509,7 +509,7 @@ cdef class Job:
Raises:
RPCError: When sending the message to the Job was not successful.
Examples:
>>> import pyslurm
>>> pyslurm.Job(9999).notify("Hello Friends!")
@@ -539,7 +539,7 @@ cdef class Job:
#
# The copyright notices for the file this function was taken from is
# included below:
#
#
# Portions Copyright (C) 2010-2017 SchedMD LLC <https://www.schedmd.com>.
# Copyright (C) 2002-2007 The Regents of the University of California.
# Copyright (C) 2008-2010 Lawrence Livermore National Security.
@@ -621,7 +621,7 @@ cdef class Job:

@property
def nice(self):
if self.ptr.nice == slurm.NO_VAL:
if self.ptr.nice == slurm.NO_VAL:
return None

return self.ptr.nice - slurm.NICE_OFFSET
@@ -647,7 +647,7 @@ cdef class Job:

@property
def state_reason(self):
if self.ptr.state_desc:
if self.ptr.state_desc:
return cstr.to_unicode(self.ptr.state_desc)

return cstr.to_unicode(slurm_job_reason_string(self.ptr.state_reason))
@@ -808,7 +808,7 @@ cdef class Job:
def cpus_per_task(self):
if self.ptr.cpus_per_tres:
return None

return u16_parse(self.ptr.cpus_per_task, on_noval=1)

@property
@@ -1031,7 +1031,7 @@ cdef class Job:
task_str = cstr.to_unicode(self.ptr.array_task_str)
if not task_str:
return None

if "%" in task_str:
# We don't want this % character and everything after it
# in here, so remove it.
@@ -1042,7 +1042,7 @@ cdef class Job:
@property
def end_time(self):
return _raw_time(self.ptr.end_time)

# https://github.com/SchedMD/slurm/blob/d525b6872a106d32916b33a8738f12510ec7cf04/src/api/job_info.c#L480
cdef _calc_run_time(self):
cdef time_t rtime
@@ -1153,6 +1153,15 @@ cdef class Job:
else:
return None

@property
def gres_tasks_per_sharing(self):
if self.ptr.bitflags & slurm.GRES_MULT_TASKS_PER_SHARING:
return "multiple"
elif self.ptr.bitflags & slurm.GRES_ONE_TASK_PER_SHARING:
return "one"
else:
return None

@property
def kill_on_invalid_dependency(self):
return u64_parse_bool_flag(self.ptr.bitflags, slurm.KILL_INV_DEP)
@@ -1191,7 +1200,7 @@ cdef class Job:
"""Retrieve the resource layout of this Job on each node.
!!! warning
Return type may still be subject to change in the future
Returns:
@@ -1204,25 +1213,25 @@ cdef class Job:
#
# The copyright notices for the file that contains the original code
# is below:
#
#
# Portions Copyright (C) 2010-2017 SchedMD LLC <https://www.schedmd.com>.
# Copyright (C) 2002-2007 The Regents of the University of California.
# Copyright (C) 2008-2010 Lawrence Livermore National Security.
# Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
# Written by Morris Jette <[email protected]> et. al.
# CODE-OCEC-09-009. All rights reserved.
# CODE-OCEC-09-009. All rights reserved.
#
# Slurm is licensed under the GNU General Public License. For the full
# text of Slurm's License, please see here:
# pyslurm/slurm/SLURM_LICENSE
#
# Please, as mentioned above, also have a look at Slurm's DISCLAIMER
# under pyslurm/slurm/SLURM_DISCLAIMER
#
#
# TODO: Explain the structure of the return value a bit more.
cdef:
slurm.job_resources *resources = <slurm.job_resources*>self.ptr.job_resrcs
-slurm.hostlist_t hl
+slurm.hostlist_t *hl
uint32_t rel_node_inx
int bit_inx = 0
int bit_reps = 0
@@ -1299,9 +1308,9 @@ cdef class Job:
free(host)

slurm.slurm_hostlist_destroy(hl)
return output
return output



# https://github.com/SchedMD/slurm/blob/d525b6872a106d32916b33a8738f12510ec7cf04/src/api/job_info.c#L99
cdef _threads_per_core(char *host):
# TODO