Alternative proposal to BEP038 #1856

Open
wants to merge 48 commits into
base: bep038
Changes from 18 commits
4fa15e5
first iteration of the alternative proposal
oesteban Jun 12, 2024
0180e3b
STY: codespell
effigies Jun 12, 2024
bba4159
Replace links to the schema with links to the glossary
effigies Jun 12, 2024
f159e61
enh: generalization of atlas metadata cc/ @jdkent
oesteban Jun 17, 2024
9390602
enh: address some of @effigies' comments
oesteban Jun 17, 2024
3f8c210
enh: reference entities from glossary
oesteban Jun 17, 2024
927335c
fix: folder => directory
oesteban Jun 17, 2024
6a3dcc7
fix: resolving issues with/within entities
oesteban Jun 17, 2024
f7e906d
enh: add mention to transforms files cc/ @peerherholz
oesteban Jun 17, 2024
57160cd
fix: revise more glossary links
oesteban Jun 17, 2024
1db3fd4
enh: move all filetree examples to macros
oesteban Jun 17, 2024
48e77ae
enh: miscellaneous improvements (sort entries in examples, etc.)
oesteban Jun 17, 2024
d6cf6a4
enh: improve intro
oesteban Jun 18, 2024
8bfc7ac
enh: add datatype comment to make it recommended cc/ @effigies
oesteban Jun 18, 2024
f2be20a
enh: start drafting cohort
oesteban Jun 18, 2024
d15677a
enh: add multi-cohort example
oesteban Jun 18, 2024
b275ca3
fix: pacify pre-commit build
oesteban Jun 18, 2024
f9ff346
enh: add the PS13 example
oesteban Jun 18, 2024
f0e8e0e
test
jdkent Jun 21, 2024
b00bdf8
Update src/derivatives/atlas.md
jdkent Jun 21, 2024
abe37b4
Update src/derivatives/atlas.md
jdkent Jun 21, 2024
d812b08
incorporate meeting feedback in schema
jdkent Jun 21, 2024
3accb28
enh: add note about 'MNI Space' below template identifiers @CPernet
oesteban Oct 7, 2024
e5e27f1
enh: clarify the creation of BOTH a new template AND atlas
oesteban Oct 7, 2024
8ffda51
fix: avoid using an atlas name for a segmentation label
oesteban Oct 7, 2024
d62f0d9
enh: rework the text to clarify notions: atlas, template, space
oesteban Oct 7, 2024
161db11
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 7, 2024
44cb808
fix: addressing some inconsistencies about segmentations
oesteban Oct 11, 2024
c77df10
fix: add clarification about cohort
oesteban Oct 11, 2024
947f279
fix: replace ``_mimap`` with ``_pet``
oesteban Oct 11, 2024
a5111aa
fix: replace ``label-`` with ``seg-`` where applies
oesteban Oct 12, 2024
c05146c
enh: update schema according to this proposal
oesteban Oct 12, 2024
f221ad0
fix: folder -> directory (pacify pre-commit)
oesteban Oct 12, 2024
9ec94cd
fix: order of ``scale-`` in the schema
oesteban Oct 12, 2024
1a258b5
enh: refine atlas' definition in common principles
oesteban Oct 12, 2024
79f1ea9
enh: add "authoritative definition of spaces" to template
oesteban Oct 12, 2024
3f2bde5
enh: update schema to consider ``tpl-`` and ``cohort-``
oesteban Oct 12, 2024
95a85a1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 12, 2024
0d7f965
enh(schema): add ``atlas`` to entities
oesteban Oct 12, 2024
a9fa24a
fix: delete photo from atlas' file rules
oesteban Oct 12, 2024
44c3ad8
Update src/schema/objects/columns.yaml
oesteban Oct 12, 2024
b66c3ee
fix: dereference twice schema files
oesteban Oct 14, 2024
c3b6f41
fix: add missing properties to atlas entity definition
oesteban Oct 14, 2024
891bcf0
FIX: update PET examples
mnoergaard Oct 23, 2024
ef1817b
Merge remote-tracking branch 'upstream/bep038' into bep038-review
effigies Nov 5, 2024
bc6b064
schema: Move templates out of rules.files
effigies Nov 5, 2024
226f134
test: Fix check to avoid overlap between common_principles and common
effigies Nov 5, 2024
ac745e6
schema: Update metaschema, fix errors caught by metaschema
effigies Nov 5, 2024
2 changes: 1 addition & 1 deletion mkdocs.yml
Expand Up @@ -23,8 +23,8 @@ nav:
- BIDS Derivatives: derivatives/introduction.md
- Common data types and metadata: derivatives/common-data-types.md
- Imaging data types: derivatives/imaging.md
- Templates and atlases: derivatives/atlas.md
- Longitudinal and multi-site studies: longitudinal-and-multi-site-studies.md
- Atlases: atlas.md
- Glossary: glossary.md
- BIDS Extension Proposals: extensions.md
- Appendix:
672 changes: 0 additions & 672 deletions src/atlas.md

This file was deleted.

34 changes: 31 additions & 3 deletions src/common-principles.md
Expand Up @@ -122,6 +122,12 @@ data type as defined above.
A data type directory SHOULD NOT be defined if there are no files to be placed
in that directory.

**Specific structure of derived data**.
In the case of [storing derived data (see below)](#source-vs-raw-vs-derived-data),
the subject (`sub-<label>`) and session (`ses-<label>`) entities MAY map onto
the template (`tpl-<label>`) and cohort (`cohort-<label>`) entities
as described in the [corresponding section](derivatives/atlas.md) of this specification.

### Other top level directories

In addition to the subject directories, the root directory of a BIDS dataset
Expand Down Expand Up @@ -305,6 +311,16 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
}
```

**Templates and atlases as derived data.**
Templates and atlases are key neuroscientific tools for carrying out group-level inferences,
and are also employed in many atlas-based methodologies (such as atlas-based segmentation).
Original templates and atlases employed as primary data for the analysis MAY be stored
within the `sourcedata/atlases/` directory.
Any artifacts derived from atlases, as well as newly created templates and atlases, MUST
follow the [corresponding specification](derivatives/atlas.md), MUST be stored under the
`derivatives/` directory, and MUST follow the general specifications for derivatives regarding
storage and distribution, as described in the next section.

### Storage of derived datasets

Derivatives can be stored/distributed in two ways:
Expand Down Expand Up @@ -340,6 +356,15 @@ Derivatives can be stored/distributed in two ways:
<dataset>/derivatives/spm-stats/sub-0001
```

Example of an atlas-generating pipeline with outputs for individual subjects
and the aggregation in an atlas defined with respect to the widely-used
[`MNI152NLin2009cAsym` standard space](appendices/coordinate-systems.md):

```Plain
<dataset>/derivatives/atlas-generator/sub-0001
<dataset>/derivatives/atlas-generator/tpl-MNI152NLin2009cAsym
```

Example of a pipeline with nested derivative directories:

```Plain
Expand Down Expand Up @@ -391,11 +416,14 @@ Case 2.
In both cases, every derivatives dataset is considered a BIDS dataset and must
include a `dataset_description.json` file at the root level (see
[Dataset description][dataset-description]).
Consequently, files should be organized to comply with BIDS to the full extent
Consequently, files SHOULD be organized to comply with BIDS to the full extent
possible (that is, unless explicitly contradicted for derivatives).
Any subject-specific derivatives should be housed within each subject's directory;
if session-specific derivatives are generated, they should be deposited under a
Any subject-specific derivatives SHOULD be housed within each subject's directory;
if session-specific derivatives are generated, they SHOULD be deposited under a
session subdirectory within the corresponding subject directory; and so on.
Likewise, any template-specific derivatives SHOULD be housed within each template's directory;
if cohort-specific derivatives are generated, they SHOULD be deposited under a
cohort subdirectory within the corresponding template directory; and so on.
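
The two parallel hierarchies described in this hunk (subject/session and template/cohort) can be sketched as a small path builder. This is only an illustration of the proposed layout: the helper `derivative_dir` is hypothetical and not part of BIDS or any library, and it assumes that `ses` only accompanies `sub` and `cohort` only accompanies `tpl`.

```python
from pathlib import PurePosixPath


def derivative_dir(pipeline, *, sub=None, ses=None, tpl=None, cohort=None):
    """Return the directory that houses subject- or template-specific outputs.

    Hypothetical helper: exactly one of ``sub`` or ``tpl`` must be given.
    """
    if (sub is None) == (tpl is None):
        raise ValueError("specify exactly one of 'sub' or 'tpl'")

    root = PurePosixPath("derivatives") / pipeline
    if sub is not None:
        # Subject-specific derivatives, optionally nested under a session.
        path = root / f"sub-{sub}"
        return path / f"ses-{ses}" if ses is not None else path

    # Template-specific derivatives, optionally nested under a cohort.
    path = root / f"tpl-{tpl}"
    return path / f"cohort-{cohort}" if cohort is not None else path


print(derivative_dir("atlas-generator", sub="0001"))
print(derivative_dir("atlas-generator", tpl="MNI152NLin2009cAsym"))
```

The cohort label in any real dataset would come from the template's own definition; none is assumed here.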

### Non-compliant derivatives

961 changes: 961 additions & 0 deletions src/derivatives/atlas.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion src/derivatives/common-data-types.md
Expand Up @@ -181,7 +181,7 @@ Template:
<pipeline_name>/
sub-<label>/
<datatype>/
<source_entities>[_space-<space>][_desc-<label>]_<suffix>.<extension>
<source_entities>[_space-<space>][_atlas-<label>][_desc-<label>]_<suffix>.<extension>
```

Data is considered to be *preprocessed* or *cleaned* if the data type of the input,
2 changes: 1 addition & 1 deletion src/derivatives/imaging.md
Expand Up @@ -11,7 +11,7 @@ Template:
<pipeline_name>/
sub-<label>/
<datatype>/
<source_entities>[_space-<space>][_res-<label>][_den-<label>][_desc-<label>]_<suffix>.<extension>
<source_entities>[_space-<space>][_atlas-<label>][_res-<label>][_den-<label>][_desc-<label>]_<suffix>.<extension>
```
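
To make the filename template concrete, the sketch below splits a derivative filename into its entities, suffix, and extension. The helper `parse_name` and the example filename (including the atlas label) are made up for illustration; the sketch assumes labels contain neither `_` nor `.`.

```python
def parse_name(filename):
    """Split a BIDS-style filename into (entities, suffix, extension)."""
    # Everything after the first dot is treated as the extension (e.g. "nii.gz").
    stem, _, extension = filename.partition(".")
    # The last underscore-separated chunk is the suffix; the rest are key-value pairs.
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, "." + extension


entities, suffix, extension = parse_name(
    "sub-01_space-MNI152NLin2009cAsym_atlas-Example_res-2_dseg.nii.gz"
)
print(entities)
print(suffix, extension)
```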

Volumetric preprocessing does not modify the number of dimensions, and so
32 changes: 32 additions & 0 deletions src/schema/objects/common_principles.yaml
Expand Up @@ -3,6 +3,22 @@
# WARNING: The terms are presented here in alphabetical order!
# The order in which these terms are presented in the specification is defined in `rules/common_principles.yaml`,
# rather than this file (`objects/common_principles.yaml`).
atlas:
display_name: Atlas
description: |
Knowledge about the brain, generally formalized with reference to a standard space (see the *Template* definition)
by means of spatiotemporal annotations such as landmarks, segmentations, parcellations, or probability maps.

The definition of atlas per Merriam-Webster is ‘a bound collection of maps (i.e. labeled brain regions
or quantitative aspects) and metadata (tables, or textual matter)’.
Within BIDS, atlases are broadly defined as a mapping between locations in a spatial coordinate system
and descriptions associated with those locations.
Atlases are often built after *registering many subjects or maps into a space defined by a template*.
By analogy with geographical atlases, brain atlases can map brain locations to either discrete labels like a map
of countries does, or to continuous quantities like a topographic map does.

One prominent manuscript on specific aspects of atlases, such as their *regional resolution*,
is [Bijsterbosch et al., 2020](https://doi.org/10.1038/s41593-020-00726-z).
data_acquisition:
display_name: Data acquisition
description: |
Expand Down Expand Up @@ -133,6 +149,12 @@ session:
often in the case of some intervention between sessions (for example, training).
In the [PET](SPEC_ROOT/modality-specific-files/positron-emission-tomography.md) context,
a session may also indicate a group of related scans, taken in one or more visits.
space:
display_name: Space
description: |
A reference [coordinate system](appendices/coordinate-systems.md) of analysis
engendered by the spatiotemporal distribution of neuroimaging features such as
those given by subjects' and templates' data.
suffix:
display_name: suffix
description: |
Expand All @@ -154,3 +176,13 @@ task:
In the context of brain scanning, a task is always tied to one data acquisition.
Therefore, even if during one acquisition the subject performed multiple conceptually different behaviors
(with different sets of instructions) they will be considered one (combined) task.
template:
display_name: Template
description: |
An average feature map obtained by aggregation of subjects and/or sessions that allows the
spatial localization of brain anatomy and function in the templated cohort.
Templates operationalize the concept of *standardized spatial frame of analysis*,
a common *Space* into which subjects' data can be spatially normalized for group inference.
Just as subjects' feature maps generate a *native* spatial frame of reference for analyses,
templates engender a *generic* or *standard* space of analysis into which subjects can be
spatiotemporally aligned.
55 changes: 46 additions & 9 deletions src/schema/objects/entities.yaml
Expand Up @@ -29,14 +29,7 @@ atlas:
name: atlas
display_name: Atlas
description: |
The definition of atlas per Merriam-Webster is ‘a bound collection of maps (i.e. labeled brain regions
or quantitative aspects) and metadata (tables, or textual matter). Within BIDS, atlases are broadly
defined as a mapping between locations in a spatial coordinate systems and descriptions associated with
those locations. Atlases are often build from registering many subjects or maps to a template. By analogy
with geographical atlases, brain atlases can map brain locations to either discrete labels like a map
of countries does, or to continuous quantities like a topographic map does.

This comprises all possible types of atlases, specifically deterministic, probabilistic, and mask/voxel-based
Atlas comprises all possible types of atlases, specifically deterministic, probabilistic, and mask/voxel-based
ones, and quantitative maps from various modalities including but not limited to structural features (e.g.
myelination, cytoarchitecture), functional features (e.g. resting-state networks, localizers) and such based on
multimodal data integration (e.g. gene expression, receptors). Furthermore, it covers both volume/voxel and
Expand All @@ -53,6 +46,14 @@ ceagent:
`"ContrastBolusIngredient"` MAY also be added in the JSON file, with the same label.
type: string
format: label
cohort:
name: cohort
display_name: Cohort
description: |
A subset of a defined template space, for instance, for a longitudinal template of brain development
where infants were participants were averaged at three, six, and twelve months old.
Comment on lines +55 to +56
Collaborator Author

@pwighton can you make a code suggestion here to replace/edit/improve the definition of cohorts? Thanks!

Comment on lines +55 to +56

Suggested change
- A subset of a defined template space, for instance, for a longitudinal template of brain development
- where infants were participants were averaged at three, six, and twelve months old.
+ A sub-population over which an atlas or template is derived.

I've removed the example, because that didn't seem like a cohort to me. In my mental model, a longitudinal study is one or more cohorts with observations at multiple timepoints.

Collaborator Author

Well, then, maybe we need to find an alternate entity name because the term 'cohort' comes from TemplateFlow, and there, we do use it to separate different ages within a single template/atlas. It's also the intention here: to have a single standard space name that, in reality, has several standard spaces within.

Collaborator Author

As per merriam webster it doesn't seem like a bad choice: https://www.merriam-webster.com/dictionary/cohort

Oh sorry, I misunderstood. I agree with the Merriam-Webster definition and now understand the longitudinal example. I had originally thought the example was referring to multiple timepoints at 3, 6 and 12 months old.

Collaborator Author

Maybe the definition could be more clear, please send suggestions if you think we can make it better :)

My suggested change is above. I suggest changing the definition to "A sub-population over which an atlas or template is derived." and removing the example.

type: string
format: label
chunk:
name: chunk
display_name: Chunk
Expand Down Expand Up @@ -300,7 +301,33 @@ segmentation:
display_name: Segmentation
description: |
The `seg-<label>` key/value pair corresponds to a custom label the user
MAY use to distinguish different segmentations.
MAY use to distinguish different segmentations or parcellations.

For atlases, `seg-<label>` distinguishes different realizations of a given
segmentation or parcellation.
For example, the [Yeo 2011 atlas](https://doi.org/10.1152/jn.00338.2011)
is distributed within *FreeSurfer* with two different parcellations
(*7 networks* and *17 networks*).

This entity is only applicable to derivative data.
type: string
format: label
scale:
name: scale
display_name: Scale
description: |
The `scale-<label>` key/value pair corresponds to a custom label the user
MAY use to distinguish segmentations that entail different levels of
*regional resolution* (scales), often indicated by the number of ROIs.
See ([Bijsterbosch et al., 2020](https://doi.org/10.1038/s41593-020-00726-z))
for further details on *regional resolution* or, as defined by the authors of
the manuscript, the *brain units*.

For example, the [Schaefer 2018 atlas](https://doi.org/10.1093/cercor/bhx179)
is distributed within *FreeSurfer* with ten different scales
(100, 200, 300, 400, 500, 600, 700, 800, 900, and 1000 regions) for each of
its three different parcellations (*7 networks*, *17 networks*, and
*Kong's variation of 17 networks*).

This entity is only applicable to derivative data.
type: string
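
The Schaefer 2018 example in the `scale` description implies 3 parcellations × 10 scales = 30 variants, each addressable by one `seg-`/`scale-` pair. The sketch below simply enumerates those pairs. The `seg` labels used here (`7n`, `17n`, `kong17n`) are hypothetical placeholders chosen for illustration; the proposal text shown here does not prescribe them.

```python
from itertools import product

# Hypothetical labels for the three parcellations named in the description.
SEG_LABELS = ["7n", "17n", "kong17n"]
# Ten scales: 100, 200, ..., 1000 regions.
SCALES = range(100, 1001, 100)

# One entity fragment per (parcellation, scale) combination.
variants = [f"seg-{seg}_scale-{scale}" for seg, scale in product(SEG_LABELS, SCALES)]

print(len(variants))
print(variants[0], variants[-1])
```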
Expand Down Expand Up @@ -412,6 +439,16 @@ task:
the `task` label for resting state files (for example, `task-rest`).
type: string
format: label
template:
name: tpl
display_name: Template
description: |
A standardized space of analysis specified or engendered by one or more averages of features,
with respect to which group-level and atlas-derived results are provided.
The `<label>` MAY be taken from one of the modality specific lists in the
[Coordinate Systems Appendix](SPEC_ROOT/appendices/coordinate-systems.md).
type: string
format: label
tracer:
name: trc
display_name: Tracer
3 changes: 3 additions & 0 deletions src/schema/rules/common_principles.yaml
Expand Up @@ -11,6 +11,9 @@
- task
- event
- run
- template
- atlas
- space
- index
- label
- suffix
13 changes: 13 additions & 0 deletions src/schema/rules/directories.yaml
Expand Up @@ -77,6 +77,12 @@ derivative:
name: code
level: optional
opaque: true
cohort:
entity: cohort
level: optional
opaque: false
subdirs:
- datatype
derivatives:
name: derivatives
level: optional
Expand Down Expand Up @@ -106,6 +112,13 @@ derivative:
opaque: false
subdirs:
- datatype
template:
entity: template
level: optional
opaque: false
subdirs:
- cohort
- datatype
datatype:
value: datatype
level: optional
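
The nesting that the new `template` and `cohort` directory rules allow can be checked with a few lines of code. The sketch below hard-codes only the `subdirs` relations visible in this hunk; the mapping `SUBDIRS` and the function `is_valid_nesting` are hypothetical names, not part of the BIDS schema tooling.

```python
# Allowed child directory kinds, copied from the rules shown in this hunk.
SUBDIRS = {
    "template": ("cohort", "datatype"),
    "cohort": ("datatype",),
    "datatype": (),
}


def is_valid_nesting(levels):
    """Return True if every level is an allowed subdirectory of the one before it."""
    return all(
        child in SUBDIRS.get(parent, ())
        for parent, child in zip(levels, levels[1:])
    )


print(is_valid_nesting(["template", "cohort", "datatype"]))
print(is_valid_nesting(["template", "datatype"]))
print(is_valid_nesting(["cohort", "template"]))
```

A cohort directory is optional in this reading, so a datatype directory may sit directly under the template directory.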
3 changes: 3 additions & 0 deletions src/schema/rules/entities.yaml
Expand Up @@ -2,7 +2,9 @@
# This file simply defines the order in which entities should appear within filenames.
# Entity definitions appear in the `objects/entities.yaml` file.
- subject
- template
- session
- cohort
- sample
- task
- tracksys
Expand All @@ -26,6 +28,7 @@
- recording
- chunk
- segmentation
- scale
- resolution
- density
- label
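
Because `rules/entities.yaml` only fixes the order of entities in filenames, its effect can be illustrated by assembling a name from an unordered mapping. The sketch below uses an excerpt of the order visible in this hunk, written with filename keys (`sub`, `tpl`, `ses`, `cohort`, `seg`, `scale`, `res`, `den`) and omitting every entity in between. `ENTITY_ORDER`, `build_name`, and the labels are assumptions for illustration only.

```python
# Excerpt of the entity order from this hunk; intermediate entities are omitted.
ENTITY_ORDER = ["sub", "tpl", "ses", "cohort", "seg", "scale", "res", "den"]


def build_name(entities, suffix, extension):
    """Join key-value pairs in schema order, then append suffix and extension."""
    unknown = set(entities) - set(ENTITY_ORDER)
    if unknown:
        raise ValueError(f"entities not covered by this excerpt: {sorted(unknown)}")
    parts = [f"{key}-{entities[key]}" for key in ENTITY_ORDER if key in entities]
    return "_".join(parts + [suffix]) + extension


# The input order does not matter; the output follows ENTITY_ORDER.
print(build_name({"scale": "400", "res": "2", "tpl": "MNI152NLin2009cAsym", "seg": "17n"},
                 "dseg", ".nii.gz"))
```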
20 changes: 9 additions & 11 deletions tools/print_contributors.py
oesteban marked this conversation as resolved.
Expand Up @@ -15,25 +15,23 @@


 def contributor_table_header(max_name_length, max_contrib_length):
-    return f"""| name{" " * (max_name_length-4)} | contributions{" " * (max_contrib_length-13)} |
-| {"-" * max_name_length} | {"-"*max_contrib_length} |
+    return f"""\
+| {"name":<{max_name_length}} | {"contributions":<{max_contrib_length}} |
+| {"":-<{max_name_length}} | {"":-<{max_contrib_length}} |
 """


 def create_line_contributor(
     contributor: dict[str, str], max_name_length: int, max_contrib_length: int
 ):
     name = contributor["name"]
+    emap = emoji_map()
+    contributions = "".join(
+        emoji.emojize(emap[cont]) for cont in contributor["contributions"]
+    )

-    line = f"| {name}{' '*(max_name_length-len(name))} | "
-
-    nb_contrib = len(contributor["contributions"]) * 2
-    for contrib in contributor["contributions"]:
-        line += emoji.emojize(emoji_map()[contrib])
-
-    line += f"{' '*(max_contrib_length-nb_contrib)} |\n"
-
-    return line
+    pad = max_contrib_length - len(contributor["contributions"]) * 2
+    return f"| {name:<{max_name_length}} | {contributions}{'':<{pad}} |\n"


def main():
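
The refactor in `tools/print_contributors.py` replaces manual space and dash arithmetic with Python's format-specification mini-language. The sketch below isolates the two idioms so their equivalence can be checked. The function names are hypothetical and the `emoji` dependency is left out on purpose.

```python
def pad_old(text, width):
    # Old style: append the missing number of spaces by hand.
    return f"{text}{' ' * (width - len(text))}"


def pad_new(text, width):
    # New style: '<' left-aligns the value within a field of the given width.
    return f"{text:<{width}}"


def rule_old(width):
    return "-" * width


def rule_new(width):
    # An empty string padded with the fill character '-' up to the given width.
    return f"{'':-<{width}}"


print(repr(pad_new("name", 10)))
print(rule_new(5))
```

Both padding variants leave the text untouched when it is already longer than the requested width, so the table output should be unchanged by the refactor for non-negative widths.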