Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible implementation of limited comments #101

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

xloem
Copy link

@xloem xloem commented Oct 23, 2020

Provides for # characters in configuration file values. See #100

@codecov
Copy link

codecov bot commented Oct 24, 2020

Codecov Report

Merging #101 into develop will increase coverage by 0.00%.
The diff coverage is 25.00%.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop     #101   +/-   ##
========================================
  Coverage    49.89%   49.89%           
========================================
  Files           23       23           
  Lines         1385     1387    +2     
  Branches       707      709    +2     
========================================
+ Hits           691      692    +1     
  Misses         111      111           
- Partials       583      584    +1     
Impacted Files Coverage Δ
...clude/boost/program_options/detail/config_file.hpp 38.88% <0.00%> (ø)
include/boost/program_options/parsers.hpp 40.00% <ø> (ø)
src/parsers.cpp 45.61% <0.00%> (ø)
src/config_file.cpp 38.02% <33.33%> (+0.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0414abe...e47c6bd. Read the comment docs.

@xloem
Copy link
Author

xloem commented Nov 15, 2020

I got an email mentioning that it might be possible to use std::quoted to resolve this, too. I see the comment is deleted so I infer that they discerned it might not be the right choice, but I'm mentioning it here because I was unaware of std::quoted.

If somebody else wants to try adding some more options to this for maintainers to choose from in case they have concerns, using escape characters is another option.

There might have also been a concern around the interface for enabling or disabling it.

@FloopCZ
Copy link

FloopCZ commented Nov 15, 2020

Hi, yes, sorry about that, that was my fault. As I have written previously, I believe the proposed technique calls for error-prone config files as I am not aware of any interpreter that behaves like this. If you also consider escape character instead (e.g. similarly to \ in Bash), std::quoted will probably not be suited for that case (as I wrongly believed before). However, you can use a regular expression to parse the line, e.g., ([^=]+)=((?:\\.|[^\\#])*)#{0,1}(.*) with whitespace trimming should work by a quick test:

#!/usr/bin/env python3
import re
cases = [
    (r"key=abcd",                    (r"key", r"abcd", r"")),
    (r" key = abcd # com",           (r" key ", r" abcd ", r" com")),
    (r" key =abcd#comment",          (r" key ", r"abcd", r"comment")),
    (r"key =abcd #comment",          (r"key ", r"abcd ", r"comment")),
    (r" key=abcd\#nocomment",        (r" key", r"abcd\#nocomment", r"")),
    (r" key = abcd\#nocomment#com",  (r" key ", r" abcd\#nocomment", r"com")),
    (r" key = abcd\#nocomment###",   (r" key ", r" abcd\#nocomment", r"##")),
    (r" key = abcd\\#comment",       (r" key ", r" abcd\\", r"comment")),
    (r" key = abcd\\\#nocomment",    (r" key ", r" abcd\\\#nocomment", r"")),
    (r" key = abcd\nocomment",       (r" key ", r" abcd\nocomment", r"")),
]
rex=r"([^=]+)=((?:\\.|[^\\#])*)#{0,1}(.*)"
for s, v in cases:
    assert re.fullmatch(rex, s).groups() == v
print("OK")

That could also be the default and only behavior so you would not need to pass an extra argument.
Explanation of the regex:

([^=]+)=            key and `=` character
(?:                 value, i.e., either
    \\.                any escaped character (including the escape character itself)
    [^\\#]             or a normal non-escaped non-comment character
)
#{0,1}(.*)          comment

@xloem
Copy link
Author

xloem commented Nov 15, 2020

Thanks so much for your feedback.

I'm worried adding it as the default behavior could break existing programs when distributions and developers upgrade. There are a lot of users of programs linked to boost out there, a huge config file space.

Do you think there'd be a solution that preserves the behavior for people who happen to have the edge case of 'a=b\#comment in a pre-existing config file?

Developers wouldn't usually be able to predict if their users were affected by this change, to know whether to require an earlier version or not when upgrading, and it's a small change that would be hard to get everyone to spread preparation for.

@FloopCZ
Copy link

FloopCZ commented Nov 15, 2020

You have a good point. However, in my opinion, the edge case a=b\#comment (i.e., "escaping" a comment sign while still treating the comment as a comment) is against widely known practices and therefore it will be so very rare that this breaking change will probably not affect anyone.

If it bothers you anyway, the next version of program_arguments could also print a warning that the behaviour regarding this edge case has changed (or will change in the future) if encountered, as some other programs do (e.g., git, gcc, etc.), but I would not bother.

I guess @vprus will have to share his opinion with us. 🙂

@xloem
Copy link
Author

xloem commented Nov 15, 2020

I like the warning solution, although I'm thinking of GUI applications where the config files could even be autogenerated.

I won't likely be working further on this PR for quite some time, so it's mostly just for when the work hours are available from someone.

It seems to me a separate options class, or a separate member function to set the parsing options, might be a more robust solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants