add method to "bin" continuous metadata and generate a new metadata column #8

nbokulich · 2017-09-08T15:09:29Z

Proposed Behavior
The question is: how to bin? The user could define:

An explicit number of bins to create, with the range of values sliced at even intervals.
A "step" size that explicitly defines the width of each bin instead of the number of bins. To follow the day-of-life and elevation examples: 1) if the unit is 1 day, a step of 30 would be roughly 1 month; 2) if the unit is 1 meter, a step of 100 would be 100 meters.
A list of bin cutoffs. E.g., [100, 1000, 10000] would generate 4 bins: all samples with x < 100, 100 ≤ x < 1000, 1000 ≤ x < 10000, and x ≥ 10000. This would be useful for explicitly defining uneven bin sizes. A tangible example of where this would be used is if samples were collected from patients at many different ages, and an investigator wants to compare the microbiome at [3, 12, 24, 72, 144] months of age.
A very cool "some day in the future" enhancement would be to add a function for auto-binning, by looking at the distribution and finding sensible divisions for creating bins.
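The first three options above can be sketched in plain Python. This is only an illustration under stated assumptions, not a proposed QIIME 2 API: the function names (`edges_from_count`, `edges_from_step`, `assign_bins`) are hypothetical, bins are treated as half-open intervals (lower edge included, upper edge excluded), and step-based bins are assumed to start at the smallest observed value.

```python
from bisect import bisect_right


def edges_from_count(values, n_bins):
    """Interior cutoffs that slice [min, max] into n_bins equal-width bins."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def edges_from_step(values, step):
    """Interior cutoffs spaced `step` apart, starting at min(values) (assumption)."""
    if step <= 0:
        raise ValueError("step must be positive")
    lo, hi = min(values), max(values)
    n_bins = int((hi - lo) // step) + 1
    return [lo + i * step for i in range(1, n_bins)]


def assign_bins(values, cutoffs):
    """Map each value to a bin index.

    Bin 0 holds x < cutoffs[0]; bin i holds cutoffs[i-1] <= x < cutoffs[i];
    the last bin holds x >= cutoffs[-1].
    """
    cutoffs = sorted(cutoffs)
    return [bisect_right(cutoffs, x) for x in values]


# Days of life binned with a step of 30 (roughly months).
days = [1, 15, 31, 45, 61, 90]
month_bins = assign_bins(days, edges_from_step(days, 30))

# Explicit, uneven cutoffs: ages in months.
age_bins = assign_bins([1, 3, 30, 200], [3, 12, 24, 72, 144])
```

All three modes reduce to a list of cutoffs, so a single assignment routine can serve them; only the way the cutoffs are derived differs.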

This method should require a user-defined name to give the new column.
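As a rough sketch of that requirement, the snippet below adds a binned column under a caller-supplied name to metadata held as a list of dictionaries. Everything here is hypothetical and for illustration only: `add_binned_column`, `bin_label`, the label format, and the choice to refuse overwriting an existing column are assumptions, and missing or non-numeric values are not handled.

```python
from bisect import bisect_right


def bin_label(index, cutoffs):
    """Build a readable label for bin `index` given sorted cutoffs."""
    if index == 0:
        return f"<{cutoffs[0]}"
    if index == len(cutoffs):
        return f">={cutoffs[-1]}"
    return f"[{cutoffs[index - 1]}, {cutoffs[index]})"


def add_binned_column(rows, source, new_name, cutoffs):
    """Return copies of `rows` with a categorical column `new_name` added."""
    if not new_name:
        raise ValueError("a name for the new column is required")
    if not cutoffs:
        raise ValueError("at least one cutoff is required")
    cutoffs = sorted(cutoffs)
    out = []
    for row in rows:
        if new_name in row:
            # Assumption: never silently overwrite an existing column.
            raise ValueError(f"column {new_name!r} already exists")
        value = float(row[source])
        new_row = dict(row)  # leave the input rows untouched
        new_row[new_name] = bin_label(bisect_right(cutoffs, value), cutoffs)
        out.append(new_row)
    return out


# Soil samples binned by elevation into 100 m bins.
samples = [
    {"sample-id": "s1", "elevation_m": "42"},
    {"sample-id": "s2", "elevation_m": "150"},
    {"sample-id": "s3", "elevation_m": "250"},
]
binned = add_binned_column(samples, "elevation_m", "elevation_bin", [100, 200])
```

Returning new rows rather than editing in place keeps the original continuous column available alongside the new categorical one.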

Comments

This would be useful for using a continuous metadata column as a pseudo-categorical grouping when performing statistical tests.
For example, a researcher might collect samples from infants at different days of life, and choose to bin those samples into months of life to aggregate them into larger groups for statistical comparison. Or collect soil samples at different elevations (in meters) and put them into 100 m bins for comparison. I could give many other examples.
You are probably asking: "why don't users just create these categories manually from the start?" Sometimes this is not easy to do, and sometimes the need will only come up later during analysis.

jairideout · 2017-09-08T18:36:53Z

This is an interesting idea and definitely worth exploring (though honestly not a high priority for us in 2017 unless someone wants to implement it!). Right now QIIME 2 can't output/create metadata files, so we'd need to add support for that in the framework. It sounds like there are a few cases where allowing QIIME 2 to write out metadata would be useful.

jairideout added the enhancement label Sep 8, 2017
