Add MultiValueColumn and both loading and saving logic #46

Open
lisad opened this issue Feb 10, 2024 · 1 comment

lisad commented Feb 10, 2024

A MultiValueColumn needs some configuration:

  • The delimiter
  • Sub-delimiters?
  • The expected type of the values between the delimiters (cast them?)
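
To make the configuration options above concrete, here is a minimal sketch of how they might be grouped. The class name `MultiValueConfig`, its field names, and the `parse` method are hypothetical and are not taken from the project's code; they only illustrate a delimiter, an optional sub-delimiter, and an optional cast.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class MultiValueConfig:
    """Hypothetical container for the settings listed above."""
    delimiter: str = ","
    sub_delimiter: Optional[str] = None            # e.g. ":" for nested values
    cast: Optional[Callable[[str], Any]] = None    # e.g. float or int

    def _convert(self, text: str) -> Any:
        # Strip surrounding whitespace, then cast if a cast was configured.
        text = text.strip()
        return self.cast(text) if self.cast else text

    def parse(self, raw: str) -> List[Any]:
        items = raw.split(self.delimiter)
        if self.sub_delimiter is None:
            return [self._convert(item) for item in items]
        # With a sub-delimiter, each item becomes its own list of values.
        return [
            [self._convert(sub) for sub in item.split(self.sub_delimiter)]
            for item in items
        ]
```

For instance, `MultiValueConfig(cast=float).parse("59.4, 24.7")` would yield a list of two floats, while a config with `delimiter=";"` and `sub_delimiter=":"` would yield a list of lists.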

During load, parse the delimited string into its individual values,
e.g. df['languages'] = df['languages'].str.split(',')
That call does not strip whitespace from the values after the split, so we should do that as well.
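
A possible way to split and strip in one pass is sketched below. The sample data and the helper name `split_and_strip` are made up for illustration; only `df['languages']` and the comma delimiter come from this issue. Missing cells are passed through unchanged, which is an assumption about the desired behaviour.

```python
import pandas as pd

# Made-up sample data standing in for a real file.
df = pd.DataFrame({"languages": ["English, French", "Estonian", None]})


def split_and_strip(cell):
    """Split a comma-separated cell into a list of whitespace-stripped values."""
    if not isinstance(cell, str):
        return cell  # leave missing values untouched (assumption)
    return [value.strip() for value in cell.split(",")]


df["languages"] = df["languages"].apply(split_and_strip)
```

After this, the first row holds a two-element list with no leading space on the second value, and the second row holds a one-element list.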

During save, rejoin the values with commas and enclose the result in double quotes (only double quotes work).
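
As a rough sketch of the save step, Python's standard `csv` module already encloses any field containing the delimiter in double quotes by default, so rejoining the list is most of the work. The row data and the helper name `join_values` are hypothetical.

```python
import csv
import io


def join_values(values, delimiter=","):
    """Rejoin a list of values into a single delimited string."""
    return delimiter.join(str(value) for value in values)


# Made-up rows for illustration.
rows = [
    {"name": "Ana", "languages": ["English", "French"]},
    {"name": "Bo", "languages": ["Estonian"]},
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")  # default quotechar is '"'
writer.writerow(["name", "languages"])
for row in rows:
    writer.writerow([row["name"], join_values(row["languages"])])

csv_text = buffer.getvalue()
```

With the default minimal quoting, only the field that contains a comma is quoted; a single-value field is written bare.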

Use case: a column holding lat,long is a list of two values in one column, and both should be validated as floats.
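
The lat,long case might look something like the following. The function name `parse_lat_long` and the error messages are hypothetical; the sketch only checks that there are exactly two values and that both parse as floats.

```python
from typing import List


def parse_lat_long(raw: str) -> List[float]:
    """Split a 'lat,long' string and validate that both parts are floats."""
    parts = [part.strip() for part in raw.split(",")]
    if len(parts) != 2:
        raise ValueError(f"expected 2 values in {raw!r}, got {len(parts)}")
    try:
        return [float(part) for part in parts]
    except ValueError as exc:
        raise ValueError(f"non-float value in {raw!r}") from exc
```

A value such as `"59.437, 24.7536"` parses cleanly, while `"59.437, north"` raises a `ValueError`.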


lisad commented Jun 2, 2024

This would be a good one, @YuliaS.

To make testing a little easier: test_reshape.py has an example that uses the data file 'languages.csv', which contains a multi-value column 'languages'. A test for the "MultiValueColumn" feature could load the same file, declare the languages column to be a "MultiValueColumn", and check that the value of that column is a list in every row that has it, and also that the column is saved out again in a consistent way.
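
A round-trip test along those lines could be shaped roughly like this. Everything here is a stand-in: `SAMPLE` is invented data, not the contents of the real 'languages.csv', and `load` and `save` are hypothetical helpers built on the standard `csv` module rather than the project's actual loading and saving code.

```python
import csv
import io

# Invented stand-in data; the real languages.csv may look different.
SAMPLE = 'country,languages\nCanada,"English, French"\nEstonia,Estonian\n'


def load(text, multi_value_column):
    """Read CSV text and turn the named column into a list per row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row[multi_value_column] = [
            value.strip() for value in row[multi_value_column].split(",")
        ]
    return rows


def save(rows, multi_value_column):
    """Write rows back to CSV text, rejoining the list column with commas."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({**row, multi_value_column: ",".join(row[multi_value_column])})
    return out.getvalue()


def test_multi_value_column_round_trip():
    rows = load(SAMPLE, "languages")
    # Every row's value for the column is a list.
    assert all(isinstance(row["languages"], list) for row in rows)
    assert rows[0]["languages"] == ["English", "French"]
    # Saving and reloading gives the same rows and the same text.
    saved = save(rows, "languages")
    assert load(saved, "languages") == rows
    assert save(load(saved, "languages"), "languages") == saved
```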

YuliaS self-assigned this Jun 7, 2024