Space after delimiter messes with quoting #337
Comments
The behavior you see is occurring because you don't have CSV data. You only have something that looks like CSV. Spaces after commas with quoted values are invalid. I'm on mobile, but I believe most other CSV parsers (including Python's) either will behave similarly or will error. The trim option doesn't apply here because your CSV data is mangled long before the trim option comes into effect. You need to either fix your data to be valid CSV or do some kind of ad hoc post-processing step.
I was actually coming from Python's csv parser, which has this functionality. That's why it caught me out. https://docs.python.org/3/library/csv.html#csv.Dialect.skipinitialspace
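To make the difference concrete, here is a minimal, standard-library-only sketch of a single-line field splitter. It is not the csv crate's implementation: `parse_line` and its `skip_initial_space` flag are hypothetical names, loosely modelled on the Python option linked above. It treats a quote as special only when it is the first character of a field, so a space after the delimiter turns the quote into ordinary data unless leading spaces are skipped.

```rust
/// Hypothetical helper, NOT part of the csv crate: splits one line into fields.
/// A `"` opens a quoted field only if it is the first character of that field.
fn parse_line(line: &str, delimiter: char, skip_initial_space: bool) -> Vec<String> {
    let mut fields = Vec::new();
    let mut field = String::new();
    let mut chars = line.chars().peekable();
    let mut at_field_start = true;
    let mut in_quotes = false;

    while let Some(c) = chars.next() {
        if in_quotes {
            if c == '"' {
                if chars.peek() == Some(&'"') {
                    // A doubled quote inside a quoted field is a literal quote.
                    chars.next();
                    field.push('"');
                } else {
                    // Closing quote.
                    in_quotes = false;
                }
            } else {
                field.push(c);
            }
            continue;
        }
        if at_field_start && skip_initial_space && c == ' ' {
            // Drop spaces directly after a delimiter; we are still at field start.
            continue;
        }
        if c == delimiter {
            fields.push(std::mem::take(&mut field));
            at_field_start = true;
            continue;
        }
        if c == '"' && at_field_start {
            in_quotes = true;
        } else {
            // Includes a quote that is NOT at field start: kept as plain data.
            field.push(c);
        }
        at_field_start = false;
    }
    fields.push(field);
    fields
}
```

With the flag off, `parse_line("a, \"b, c\"", ',', false)` splits into three fields and keeps the quote characters. With the flag on, the same input yields two fields, `a` and `b, c`. The sketch ignores multi-line records and custom quote characters.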
Yeah, that's in the dialect configuration itself. I had forgotten about that. I'm open to adding an option for this, but I don't know when or if I'll work on it personally.
Cool, ok. If I find time I'll look into contributing to csv-core.
Hi! Would it be acceptable to do something like the below? Of course, we would have to supplement it with a modified csv-core/src/reader.rs
@@ -672,6 +672,9 @@ impl Reader {
output[nout] = input[nin];
nout += 1;
}
+ else if input[nin] == self.delimiter && input[nin+1] == b' ' {
+ nin += 1;
+ }
nin += 1;
if state >= self.dfa.final_field {
ends[nend] = self.output_pos + nout;
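One thing to watch in a change like this: indexing `input[nin+1]` panics when the delimiter is the last byte of the buffer. A bounds-checked version of the lookahead might look like the sketch below. `advance_after` is a hypothetical standalone helper, not a function in csv-core, and it only illustrates the check itself.

```rust
/// Hypothetical helper, NOT part of csv-core: how many bytes to advance from
/// position `nin` so that a single space directly after a delimiter is skipped.
/// Uses `get` instead of indexing, so a delimiter at the end of the buffer
/// does not cause an out-of-bounds panic.
fn advance_after(input: &[u8], nin: usize, delimiter: u8) -> usize {
    let at_delimiter = input.get(nin) == Some(&delimiter);
    let space_follows = input.get(nin + 1) == Some(&b' ');
    if at_delimiter && space_follows {
        2 // step over the delimiter and the space
    } else {
        1 // normal single-byte advance
    }
}
```

Even with the bounds check, a lookahead of this kind cannot see a space that arrives at the start of the next input chunk, so an incremental parser would probably need to track "just saw a delimiter" as state instead.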
What version of the csv crate are you using?
1.2.2
Briefly describe the question, bug or feature request.
When parsing a CSV file which has a space following each comma, whether or not I enable trimming, the presence of the space seems to override the default quoting behaviour and cause " to be included in the output rather than function as a quote.
Quoting is set to true by default, and explicitly setting it to true has no effect.
Changing the quote character and leaving in the space after the comma has the same effect.
Include a complete program demonstrating a problem.
What is the observed behavior of the code above?
What is the expected or desired behavior of the code above?
The CSV should be parsed correctly, with the quoted sentence all appearing as one value.
Or at least there should be a setting to enable handling this scenario.
Currently the only setting that sounds related is trim, but it doesn't have any impact.