-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix COPY REPLACING and REPLACE #75
Fix COPY REPLACING and REPLACE #75
Conversation
f954957
to
fb3f529
Compare
Note: I guess WPI is "WIP", depending on if you see it ready for GC yourself you may request a review; if you see it as a draft in general, please set it as draft (on the right side on the Conversation page of the PR). In any case: please keep the ci definitions as extra PR :-) |
9e04313
to
93d4799
Compare
Yes, it was WIP. I don't know why there is no "draft" button on my interface. I am playing with the CI at this moment, but yet, I will definitively put the git related stuff in another PR. |
cff40b8
to
a53bdd0
Compare
I used the CI to check that everything was running correctly. It passes all the tests and I removed the CI commit. |
a53bdd0
to
169080d
Compare
Codecov Report
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more. @@ Coverage Diff @@
## gcos4gnucobol-3.x #75 +/- ##
=====================================================
+ Coverage 65.17% 65.35% +0.17%
=====================================================
Files 31 32 +1
Lines 58390 58499 +109
Branches 15389 15394 +5
=====================================================
+ Hits 38058 38233 +175
+ Misses 14393 14327 -66
Partials 5939 5939
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
When checking the root issue it presented itself as follows:
|
I tried to understand how to use this work with the listings, but my main problem is that I don't understand what is expected as the result of listing. Indeed, COPY-REPLACING and REPLACE can completely change the source, including removing complete parts of the code, so the current line by line approach, or even displaying the content of a copybook after replacing, do not make sense. If the goal is to have an equivalent source to the original one, but with COPY/REPLACE applied, and then well-formatted, then I would push to use the generated <file.i> and format it back in a fixed format, while re-including comments from the original file when possible. If the goal is different, then I don't really understand what it can be, and so how to reach it. (note that I am not familiar with IBM/MF tooling, so maybe the goal is obvious for people used to them if they have a listing mode...) |
The goal is to have a complete code as it is processed, in the "original style", with COPY and REPLACE applied and all comments, intermixed with diagnostics, applied "print formatting" (pages + headers) and possible suppression (+extra stuff like symbol listing, xref, diagnostic summary, ...). The goal is to drop nearly all of the listing code in cobc.c, moving some parts (possibly to a new file, possibly to the existing ones) and to do everything during the "pre-processing" itself. Note that we do want to have that available there later also for display errors in their context (instead of "position only, add the context in the text"). ... there's just not someone that took on the work to do so... |
That PR is not in draft state, so (while I don't expect this to be available upstream soon) would you mind rebasing it to the current version? |
17e40ca
to
01db7f9
Compare
01db7f9
to
145c685
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently I'm finding it hard to follow the changes
Could you please split it into two commits?
First: moves the relevant code from pplex.l to replace.c without any logic change (so the code will also be changed only minimal) and include a new test to listings.at which has a result that will be changed by the second one;
Then have a second commit which does the "actual" change in the already existing files and listing test from the first commit.
cobc/cobc.h
Outdated
@@ -506,6 +508,9 @@ extern void cobc_parse_free (void *); | |||
|
|||
extern void *cobc_plex_malloc (const size_t); | |||
extern void *cobc_plex_strdup (const char *); | |||
extern void *cobc_plex_strsub (const char *, const int); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`Those and the on above should go to tree.h
cobc/cobc.c
Outdated
@@ -985,8 +985,8 @@ cobc_strdup (const char *dupstr) | |||
return p; | |||
} | |||
|
|||
#if defined (_WIN32) || defined (__CYGWIN__) | |||
static char * | |||
#if defined (_WIN32) || defined (__CYGWIN__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is currently just defined for WIN32 so should not be unconditionally declared in the header (not sure if we need that anywhere outside of this file...)
memcpy (p, s, len); | ||
return p; | ||
} | ||
|
||
/* variant of strcpy which copies max 'max_size' bytes from 'src' to 'dest', | ||
if the size of 'src' is too long only its last/last bytes are copied and an | ||
eliding "..." is placed in front or at end depending on 'elide_at_end' */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move out all those memory-only related functions to a separate file?
cobc/pplex.l
Outdated
@@ -106,6 +107,9 @@ static int ppwrap (void) { | |||
|
|||
#define PLEX_COND_DEPTH 16 | |||
|
|||
// replace yytext by some simplified text in ppecho() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please change that to c89 comment and explain why we do that - it replaces all texts below ?!?
cobc/pplex.l
Outdated
return (save_ptr ? 1 : 0); | ||
/* This new versoin does not use `alt_space`, so the output | ||
will be different from the previous one. */ | ||
ppecho_copy( text ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has unused variable warning because now only text is used?
As noted above: whenever this is tackled it should be split into two commits (those may both be in this single PR). |
145c685
to
15b943a
Compare
I modified this PR to have 3 commits:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Changelog entries for the last two commit are missing. I guess this will also contain a reference to at least one upstream bug issue (Bug 868).
There were some minor change requests, but that's "far enough" to be finished soon and then integrated into RC3.
cobc/pplex.l
Outdated
fwrite (text, (size_t)textlen, (size_t)1, ppout); | ||
if (cb_listing_file) { | ||
check_listing (s, 0); | ||
ppecho_direct(text_queue->text, text_queue->token); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replacing that looks correct; but shockingly we seem to have no single testcase for this - can you please duplicate the test below in syn_copy.at for "preprocess listing" (different expected result), so that we see it is not changed unexpected?
Just to check: do you have an ETA? |
Don't mind, I'll do that just now along with fixing the code that code was broken by @nberth's change in r4644 for "pplex.l (check_listing): do not output sequence number of short lines" Done in r5101. |
d627bda
to
be4ee69
Compare
a2ec0fb
to
9b1ab30
Compare
Is there anything more to do on this PR ? The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor things to consider, but overall LGTM
cobc/ChangeLog
Outdated
algorithm. The first phase performs COPY REPLACING on the stream | ||
of tokens, while the second phase perform REPLACE on the resulting | ||
stream of tokens. This rewriting is closer to the COBOL standard | ||
and fixes bug #831 partially. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, why only partially? []
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#831 actually describes several bugs. One is in listing mode (and is addressed in PR #92 ) and another is in source mode (the one addressed in this PR). I was more concerned by the second part, as it prevents correct code from being compiled, while the first part is only a problem when using the listing mode (we advised our devs to look at the .i files instead of the listing files to exactly know the result of preprocessing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would #92 solve the listing part already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't found the time to check yet...
fbcbf05
to
7b69eaf
Compare
Apart from the missing Changelog entry for -P (this may even be NEWS worthy) and a NEWS entry about "cobc now uses a two-pass algorithm..." that is ready for SVN. |
7b69eaf
to
f1d0612
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good to go, listing may also be fixed after RC3
texts, src, | ||
replace_list); | ||
} else { | ||
if (!strcasecmp(src_text,text)){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lefessan We do have the text len(yylen) in the scanner, wouldn't it make sense to pass this and include it in the struct?
Then we can compare the length itself and don't need to use strcasecmp for all content (which is done as soon as at least one REPLACE[ING] is active, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious if the speedup would be worth the work. It's likely that, for most words, strcasecmp(...)
immediately returns after checking only one or two chars. So, on one hand, testing the size would save some tests when the sizes are different, but on the other hand, if the sizes are equal, it adds one test that was not done before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
Merged, thanks ! |
Would you mind fixing the vs builds? The new file has to be added to all files under build_windows that reference help.c. |
Does this also fix https://sourceforge.net/p/gnucobol/bugs/23? If yes then it is likely good to check and possibly reference it. Only open: MSVC (and OrangeC) project definitions for cobc (new replace.h). |
Done upstream. |
Primarily a note: it seems that this PR has increased memory usage of cobc during preparing a lot (testing with sources that have ~500000 LOC and no REPLACE or REPLACING at all). The following is about The main cpu instructions and memory are to be seen in replace.c (which is logically, we also moved old memory allocations there). Question @lefessan: Could you recheck for possible optimizations, especially reducing memory when no REPLACE is active? Note: cpu-time wise that's no problem, with this big source there is a cpu time of 1.2 seconds for the preparsing, which is ok. But memory-wise - I'm not sure. |
Friendly ping @lefessan on the memory usage/performance, especially when there's no REPLACE active in the whole file. |
@lefessan you may have missed that: this PR has increased memory usage of cobc during preparing a lot; can you please reinspectif you find a better way to handle that? |
Here some numbers from a huge (generated) COBOL file, from valgrinds recorded heap usage
Note that also this file has no Of course, this makes anything that inspects the memory (for example valgrind or cobc if built with hardening options) take much longer. |
Current implementation does not conform to standard (COPY REPLACING and REPLACE are supposed to be in two successive passes instead of applied together), and has various bugs.
This version is probably less efficient, but better conforms to the standard.
Limitations:
This first version only modify the replacements during preprocessing, not the ones during the listing printing.
Since REPLACE are interpreted before the COPY REPLACING, they cannot be modified by them. A conforming implementation would interpret REPLACE strictly after COPY REPLACING.