Commit
[SPARK-39689][SQL] Support 2-chars `lineSep` in CSV datasource
### What changes were proposed in this pull request?

The Univocity parser allows the line separator to be set to 1 or 2 characters ([code](https://github.com/uniVocity/univocity-parsers/blob/master/src/main/java/com/univocity/parsers/common/Format.java#L103)), so the CSV options should not block this usage ([code](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala#L218)). This PR updates the `lineSep` requirement in `CSVOptions` to accept 1 or 2 characters.

Because support for a multi-character `lineSep` inside quoted values is limited, I propose to leave this feature undocumented and to log a WARN message for users.

### Why are the changes needed?

To unblock the use of a 2-character `lineSep`.

### Does this PR introduce _any_ user-facing change?

No. This is an undocumented feature.

### How was this patch tested?

New UT.

Closes apache#37107 from Yaohua628/spark-39689.

Lead-authored-by: yaohua <[email protected]>
Co-authored-by: Yaohua Zhao <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
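
For readers who want to see the shape of the relaxed check, here is a minimal standalone sketch in Python. It is not the Scala code in `CSVOptions`. The function name `validate_line_sep`, the logger name, and the message texts are all hypothetical and chosen only for illustration. The only behavior taken from the description above is that `lineSep` may be 1 or 2 characters and that the 2-character case produces a warning.

```python
import logging

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("csv_options_sketch")

# Per the description above, the separator may be 1 or 2 characters long.
MAX_LINE_SEP_LENGTH = 2


def validate_line_sep(line_sep: str) -> str:
    """Return line_sep unchanged if it is 1 or 2 characters long.

    Illustrative only: the error and warning texts are assumptions,
    not the messages Spark emits.
    """
    if line_sep == "":
        raise ValueError("'lineSep' cannot be an empty string.")
    if len(line_sep) > MAX_LINE_SEP_LENGTH:
        raise ValueError("'lineSep' can contain only 1 or 2 characters.")
    if len(line_sep) == 2:
        # The 2-character case is allowed but flagged, because such a
        # separator inside quoted values is only partially supported.
        logger.warning(
            "A 2-character 'lineSep' may not be handled correctly "
            "inside quoted values."
        )
    return line_sep
```

In this sketch, `validate_line_sep("\r\n")` is accepted with a warning, while a 3-character separator such as `"abc"` is rejected.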