Skip to content

add #line tags to AdditionalFiles when copying to build directory, to… #707

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

oschonrock
Copy link
Contributor

@oschonrock oschonrock commented May 14, 2020

… improve error messages

Please check if the PR fulfills these requirements

  • The PR has no duplicates (please search among the Pull Requests before creating one)
  • The PR follows our contributing guidelines
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • What kind of change does this PR introduce?
    Feature. Usable file:line locations in compiler error messages for Additional files

  • What is the current behavior?
    Additional files show error message location in build directory, which is not useful

  • What is the new behavior?
    Compiler Error messages in additional files now reference the original file. Same behaviour as for main and extra sketch files

  • Does this PR introduce a breaking change?
    No

  • Other information:
    It's a trivial one liner. Please change it there is a better way.


See how to contribute

@CLAassistant
Copy link

CLAassistant commented May 14, 2020

CLA assistant check
All committers have signed the CLA.

Copy link
Collaborator

@matthijskooijman matthijskooijman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR, would be a nice change to see. I have three remarks, one-and-a-half of which I had already posted on the original PR:

  • For what kinds of files is this #line directive now added? .cpp, .c and .h files are of course fine, but how about assembly files? Can there be any other types of files in the list of "Additional files" that do not like #line directives?
  • When files to be modified are really big, then the in-memory prepending of the line directive might be ineffecient (compared to writing the line directive first and then the original contents). I can see why we need this to compare the file contents, though. Maybe that could also be done in two parts, though. For big files, completely loading into memory might be a bit problematic by itself, but creating a complete copy of the contents might double that problem.
  • This now happens in a function called writeIfDifferent, which is a nice and generic function that now hardcodes this C-preprocessor-specific bit of content. Maybe it would be cleaner if this function would accept an arbitrary "prefix" argument, and the calller (maybe a few levels up) that knows more about the types of files being processed could compones the #line directive?

@oschonrock
Copy link
Contributor Author

My answers:

  1. I don't know what the exhaustive list of file types is, so can't answer that.

  2. In principle I agree, however, in practice this is a red herring in my opinion. YAGNI Any source code that doesn't easily fit the host machine's memory is going to blow the MCU. It's only there for the duration of the function call, so there is no wider effect. And this is not a result of the proposed change. It is how that code is written, for better or for worse. Separate concern, needs separate refactor / pull request.

  3. This also occurred to me. I did grep all of arduino-cli and that function is only called from exactly that one place. It's arguably not worth a function. IMO. I don't see it being very generic until it has a second use TBH. But I am happy to add the "file_prefix" as an optional param, if that's preferred.

@oschonrock
Copy link
Contributor Author

oschonrock commented May 14, 2020

Re 1.: Is this the authoritative list?

These should all be fine? No ASM?

@rsora rsora added component/core type: enhancement Proposed improvement labels May 15, 2020
@matthijskooijman
Copy link
Collaborator

Re 1.: Is this the authoritative list?

Yeah, looks like. Looking at those, I think all of them use the preprocessor. .s are assembly files, but thinking about this again, I think assembler files are also just preprocessed (since you can #include files with constents in assembler as well).

  1. In principle I agree, however, in practice this is a red herring in my opinion. YAGNI Any source code that doesn't easily fit the host machine's memory is going to blow the MCU. It's only there for the duration of the function call, so there is no wider effect.

Yeah, this is probably true. Especially since this considers just sketch files, not all compiled code (e.g. it excludes all libraries and system headers), all of which also have to be loaded into memory.

And this is not a result of the proposed change. It is how that code is written, for better or for worse. Separate concern, needs separate refactor / pull request.

I don't think that is true: Currently the file is loaded into memory once, with this change it is loaded twice (one []byte with the original, one []byte with the prefixed version). And this could be fixed by never appending the two in memory, but comparing and writing the parts separately. But that would make the code more complex for probably minimal gain.

  1. This also occurred to me. I did grep all of arduino-cli and that function is only called from exactly that one place. It's arguably not worth a function. IMO. I don't see it being very generic until it has a second use TBH. But I am happy to add the "file_prefix" as an optional param, if that's preferred.

Ah, I see it is actually in sketch.go rather than in some more generic place, so there's probably little room for confusion unless the function is actually used elsewhere too. It might be slightly cleaner with a parameter, but if we ever end up calling the function from two places that each need the #line directive, that would again lead to duplication, so I guess it's fine as it is now.

So with that, I think the code is fine as it is now, as far as I'm concerned (though I'm not in a position to decide on this).

One addition that might be useful: Add a testcase that verifies this works (mostly to prevent later changes from breaking it again). I can imagine a small unittest that just tests the sketch copying and verifies the #line directives are present, but also a full-stack test that has a simple sketch with all different filetypes, each defining some symbol that will end up in the final .elf file, and then checking the debug symbols to see if the filenames are correct. I'm not too familiar with the current test suite, but it seems that https://github.com/arduino/arduino-cli/blob/master/test/test_compile.py might be a good place for this (though it does not contain any other test yet that compiles different filetypes).

@oschonrock
Copy link
Contributor Author

I don't think that is true: Currently the file is loaded into memory once, with this change it is loaded twice (one []byte with the original, one []byte with the prefixed version).

That's true, and I did realise that when I wrote it. Coming from C++ you do tend to think about what you're doing with the machine's resources. However, IMHO it is just not relevant here. This is the thought process I followed when writing my one line of code:

find ~/arduino-1.8.12/ -name '*.cpp' -or -name '*.h' -print0 | xargs -0 ls -lS  | head -n1

1021807 hardware/arduino/avr/firmwares/wifishield/wifi_dnld/src/wl_fw.h

The largest possible file (which would never, ever be included as "addtiional files" in the sketch folder) is 1MB. If we hold that in memory twice it's 2MB. A raspbery PIe 3, perhaps the smallest possible host, has 1GB RAM, so we are golden by 2-3 orders of magnitude here. That's why I said "YAGNI", and I agree that it's not worth making the code more complicated to "prematurely optimise for memory usage".

So with that, I think the code is fine as it is now, as far as I'm concerned (though I'm not in a position to decide on this).

Great, thank you.

One addition that might be useful: Add a testcase that verifies this works (mostly to prevent later changes from breaking it again). I can imagine a small unittest that just tests the sketch copying and verifies the #line directives are present, but also a full-stack test that has a simple sketch with all different filetypes, each defining some symbol that will end up in the final .elf file, and then checking the debug symbols to see if the filenames are correct. I'm not too familiar with the current test suite, but it seems that https://github.com/arduino/arduino-cli/blob/master/test/test_compile.py might be a good place for this (though it does not contain any other test yet that compiles different filetypes).

OK, That testing framework might be a big learning curve. I can't promise I will manage the time for that, but I will try.

@oschonrock
Copy link
Contributor Author

oschonrock commented May 15, 2020

Are we sure this is right?

I think all of them use the preprocessor. .s are assembly files, but thinking about this again, I think assembler files are also just preprocessed (since you can #include files with constents in assembler as well).

https://stackoverflow.com/a/7190511/1087626

  • If the file name ends with ".s" (lowercase 's'), then gcc calls the assembler.

  • If the file name ends with ".S" (uppercase 'S'), then gcc applies the C preprocessor on the source file (i.e. it recognizes directives such as #if and replaces macros), and then calls the assembler on the result.

Maybe I need to special case the ".s' and skip the '#line'?

@oschonrock
Copy link
Contributor Author

oschonrock commented May 15, 2020

Maybe I need to special case the ".s' and skip the '#line'?

I did a quick proof of concept test using the attached "Hello Wolrd" assembler test:

AsmTest.zip

Result. If I name the file asm.S (with capital S) then it works just fine, compiles, with added '#line ..' What I don't understand is why it is compiled at all. '*.S' is not one of the patterns here:

If I rename asm.S => asm.s then it doesn't get compiled. It gets copied to build directory, but not compiled. Verbose output shows that it gets "compiled to /dev/null during":

gcc -c .... sketch/asm.s -o /dev/null -DARDUINO_LIB_DISCOVERY_PHASE

but not actually compiled to a asm.o. The 'line# ...' gets added, but gcc doesn't appear to complain. I also commented out the proposed change to sketch.go which adds the '#line' and that makes no difference.

Not quite sure I understand:

  1. why *.S gets compiled (it's not one of the extensions listed in what we though was the authoritative list)...- unless it's a case insensitive match

  2. why *.s doesn't get compiled...it is on the list.

  3. why gcc doesn't complain when "dry-run" compiling *.s (during discovery phase) when we have added '#line...' and it is supposed to not run the c-preprocessor (assuming that stackoverflow answer is correct)

My tentative conclusion: That list in globals.go is not the authoritative list, and/or there are other places where decisions are made on what to do with these "additional files"?

@matthijskooijman
Copy link
Collaborator

Hah, good catch about .s vs .S had not realized that. Thanks for digging :-)

  1. why *.S gets compiled (it's not one of the extensions listed in what we though was the authoritative list)...- unless it's a case insensitive match

Yup, case insensitive indeed, at least for including files in the sketch and thus copying them:

ext := strings.ToLower(filepath.Ext(p))

ext := strings.ToLower(filepath.Ext(path))

  1. why *.s doesn't get compiled...it is on the list.

Seems there is a different list for compilation, probably because this code was imported from arduino-builder and still needs to be more integrated, and because that code also needs to decide what recipe to use to compile each file:

sSources, err := findFilesInFolder(sourcePath, ".S", recurse)
if err != nil {
return nil, errors.WithStack(err)
}
cSources, err := findFilesInFolder(sourcePath, ".c", recurse)
if err != nil {
return nil, errors.WithStack(err)
}
cppSources, err := findFilesInFolder(sourcePath, ".cpp", recurse)
if err != nil {
return nil, errors.WithStack(err)
}

So, that check is not case sensitive (probably should be, be that's out of scope for this PR. Maybe something for @cmaglie to look at, I think he was working on removing some of the legacy).

  1. why gcc doesn't complain when "dry-run" compiling *.s (during discovery phase) when we have added '#line...' and it is supposed to not run the c-preprocessor (assuming that stackoverflow answer is correct)

The reason it is processed at all, is that the "include detection" uses a different code path, that just runs commands on all source files, using yet another list of extensions:

extensions := func(ext string) bool { return ADDITIONAL_FILE_VALID_EXTENSIONS_NO_HEADERS[ext] }

var ADDITIONAL_FILE_VALID_EXTENSIONS_NO_HEADERS = map[string]bool{".c": true, ".cpp": true, ".s": true}

For completeness, I checked that that list is again checked case insensitively:

if extensions != nil && !extensions(strings.ToLower(filepath.Ext(path))) {

As for why this does not barf on .s files, this is probably because it uses a different commandline that just runs the preprocessor with -E (and it also overrides the filetype with -x c++, might be that that is also needed): https://github.com/arduino/ArduinoCore-avr/blob/58081c05e548560d3a60050bbf260ebec5d1e867/platform.txt#L88-L89

Now the real question is: If .s files would actually be compiled (which I think they should), is the #line problematic? You would think it is, but I also know that the preprocessor is sometimes somewhat integrated into gcc rather than a separate process, and I recall that there are some modes where the full preprocessor is not ran, but it still interprets #line directives. The easiest way to test is probably to put .s here and try.

A related question would be: Are these #line directives useful at all for assembler files? Do those generate debugging symbols at all? It might still be useful for error messages, though I think those are already somewhat preprocessed by the builder or the IDE.

@oschonrock
Copy link
Contributor Author

oschonrock commented May 15, 2020

Wow, OK, that's pandorra's box of legacy issues.... :-) Can Windows even tell if a file is upper/lowercase with their case in-sentitive filesystem? I wouldn't know, haven't coded for Windows for two decades.


I think your last question is the real one. Should #line be used for .s or .S ? The answer is no I believe.

All of those other things should probably be addressed as you point out, but as you say, probably out of scope and should be separate.

so would a (pseudo-code):

if (!toUpper(filename).endsWidth('.S')) prependLineNo() 

be safer, and solve the issue without scope creep?


As part of the legacy cleanup "what do to with each kind of file", I would aim for some kind of central struct/config which stores all the filters/actions for each file type in each situation - and the above "insert #line tags" should be one of those. But I don't know the code, so I can't say if that's realistic.

@matthijskooijman
Copy link
Collaborator

Did you confirm whether .S files are indeed problematic with #line directives?

Can Windows even tell if a file is upper/lowercase with their case in-sentitive filesystem? I wouldn't know, haven't coded for Windows for two decades.

Yeah, it's just that you cannot have both cases in the same directory at the same time and file opens etc. match insensitively, but files do have a case sensitive canonical name.

As part of the legacy cleanup "what do to with each kind of file", I would aim for some kind of central struct/config which stores all the filters/actions for each file type in each situation - and the above "insert #line tags" should be one of those. But I don't know the code, so I can't say if that's realistic.

Yeah, I think that would indeed make sense. Cleaning up existing code would be out of scope, but for new checks, maybe we could do something nicer. Such as adding a 'doesNotSupportLineDirective' map to https://github.com/arduino/arduino-cli/blob/c3871677ab82229181975f6aba1170bd721a58d5/arduino/globals/globals.go. Might not be the perfect solution, but then at least this new check is anchored to globals.go and will be more visible for future refactors.

One super-tricky thing here is that this particular check must be case sensitive, while all other extension checks are (or should be) case insensitive, so that requires a clear comment to prevent future confusion (a future refactor could maybe store regexes or otherwise make the case (in)sensitivity more explicit, but again, out of scope).

@oschonrock
Copy link
Contributor Author

Did you confirm whether .S files are indeed problematic with #line directives?

Neither causes an error but the lowercase version doesn't get compiled for unrelated reasons.


Re the other stuff: None of these issues really relate to the #line issue/feature. I am happy to make any appropriate change to if/switch/case/match in whatever source code file. But really this needs a decision from someone who can take responsibility for the dev path of that whole situation. The actual coding bit is fairly trivial.

So if we want this feature, and there is such a person, please speak up now.

@matthijskooijman
Copy link
Collaborator

Oh wait, I misread your earlier comment: You proposed to disable adding the line directive for both .s and .S, since it does not really help there anyway. So I think a lot of my previous comment is not applicable then.

To implement that, I would still suggest adding something to globals.go, but since there is no case sensitive matching needed anymore, that could maybe be a positive one then, e.g. supportsLineDirectiveExtensions or something like that?

But really this needs a decision from someone who can take responsibility for the dev path of that whole situation. The actual coding bit is fairly trivial.

Yup. I would think @cmaglie or @facchinm, though there are some more devs in this repo I do not really know yet :-)

@slady
Copy link

slady commented May 29, 2020

@oschonrock , @matthijskooijman , you guys have considered so many features not relating directly to this pull request, including but not limited to:

  • what file types or extensions to cover in this inclusion process
  • possible excessive memory consumption (which had already been there before our changes even took place)
  • unit testing and end to end tests for this change

All of those are very interesting questions indeed, but what bothers me more is if this change branch ever gets merged to the main trunk.

I wrote my original pull request with the same purpose for arduino-builder almost a year ago, though it remained ignored by the arduion team. And then the branches diverged during a huge refactoring making the changes incompatible.

A friend of mine wrote a new patch for the current version of arduino-cli roughly half a year ago, reportedly even on the very same line as @oschonrock , but my friend did not even bother to publish his change and I can only wonder why...

Studying source codes and proposing changes is really hip! However more important matter is making the arudion team notice our changes and embrace them instead of disregarding them. Up until now they seem a little oblivious to the open source community.

Any ideas how to change this situation?

@oschonrock
Copy link
Contributor Author

oschonrock commented May 29, 2020

All of those are very interesting questions indeed, but what bothers me more is if this change branch ever gets merged to the main trunk.

I completely agree. My proposed changeset is precisely 1 line, in order to achieve a purpose with minimum fuss. All the complicated discussions came from the other guy... ;-) And they ultimately came to nothing.

In the end, I said exactly the same thing as you: Unless we get someone with main branch commit rights to approve this, this is a waste of time. That's where the discussion stopped.

I am running a patched version on my machine, so I am happy. I tried to share. It came to nothing so far. You can't force open source project to prioritise your preferred changes.

TBH, I get the impression that this project is riddled with many complication due to the long legacy of the arduino IDE. This often happens in older projects, and it can be very difficult to make improvements, because every change breaks something, so you tend to "leave it alone, unless Rome is burning".

@matthijskooijman
Copy link
Collaborator

Studying source codes and proposing changes is really hip! However more important matter is making the arudion team notice our changes and embrace them instead of disregarding them.

In my experience, one of the most effective ways to get something merged, is to make sure it is easy to review, and the code has good quality (so minimal improvement cycles are needed). Both reduce the amount of time a core dev has to spend on a PR. All of my questions and comments on this PR have been with these goals in mind, not for "studying source codes".

I have indeed posed some questions which are not directly related to the change in this PR, but are related to the way this PR fits into the broader context. One can disregard those as irrelevant, but failing to see the broader context (and then deciding whether a change might need some refactoring to better fit) will likely result in code that is harder to maintain, eventually slowing down development, rather than speeding it up.

Also, a lot of the questions I've posed are likely things that the core devs will also be wondering when they review this PR. Having those questions considered, and answered will again help speed up review (even if the answer is "the PR does not need to be changed", it helps if this is made explicit). In that sense, it might be helpful to summarize some of the questions considered, observations made and conclusions resulting from those.

Up until now they seem a little oblivious to the open source community.

AFAICS, this is more a problem of workload: There seem to be a lot more issues and contributions than the core devs can currently process. I'm pretty sure they would want for some extra hours in the day to fix this, but that is not the reality. It may seem that your contribution, or the contributions that you care about are being ignored, but that probably means that time is spent getting other contributions merged. So, as a community, we can probably try to help things by doing our best to make things easy to review, but also review each other's contributions and getting them into a better shape before the core devs get a chance to look at them in detail. The risk is that you spend time getting something perfect that does not end up being merged, or ends up broken in a big refactor, but I'm not sure how to better handle this with the resources available...

TBH, I get the impression that this project is riddled with many complication due to the long legacy of the arduino IDE. This often happens in older projects, and it can be very difficult to make improvements, because every change breaks something, so you tend to "leave it alone, unless Rome is burning".

I certainly think this is a problem for Arduino, but I do not have the impression that it much applies here.

@oschonrock
Copy link
Contributor Author

oschonrock commented May 31, 2020

No offence, but I think the far reaching, sweeping discussion about out of scope refactorings of other parts of the code is quite disproportionate for a one line change.

IMHO it makes this pull request totally unusable if someone with commit right actually looked at. They would have to read for 30mins in order to approve 1 line of change.

This should be clean and straightforward. 1 line changeset. Approve or reject, by someone with the permissions and rights to do so.

It is now getting even worse, because it is turning into a philosophical discussion about how open source works. I have been doing open source for over 20 years, and I can tell you, this is not how. So since this pull request is now unusable for anyone with the permissions, I am going to close it and abort. If you can find someone with decision making power who cares, we can make a new one.

Enjoy.

@oschonrock oschonrock closed this May 31, 2020
@cmaglie
Copy link
Member

cmaglie commented Jun 1, 2020

@oschonrock if you don't mind, I've had this one in my backlog, so I'll reopen it.

Apologies for the slow responsiveness, but we are really under a heavy workload, I understand that this could be frustrating sometimes...

@cmaglie cmaglie reopened this Jun 1, 2020
@cmaglie
Copy link
Member

cmaglie commented Jun 1, 2020

Related to: arduino/arduino-builder#325
Fixes: arduino/arduino-builder#323

@cmaglie cmaglie linked an issue Jun 1, 2020 that may be closed by this pull request
@oschonrock
Copy link
Contributor Author

if you don't mind, I've had this one in my backlog, so I'll reopen it.

Sure. If you've read this far, you must be really keen ;-)

@cmaglie cmaglie added this to the 0.12.0 milestone Jun 26, 2020
@cmaglie cmaglie self-assigned this Jun 26, 2020
@cmaglie cmaglie removed this from the 0.12.0 milestone Sep 14, 2020
@cmaglie cmaglie added this to the 0.14.0 milestone Sep 14, 2020
@per1234
Copy link
Contributor

per1234 commented Jan 2, 2021

Fixes #965

@cmaglie
Copy link
Member

cmaglie commented Mar 17, 2021

Merged in #1224

@cmaglie cmaglie closed this Mar 17, 2021
@per1234 per1234 mentioned this pull request Mar 23, 2021
5 tasks
@per1234 per1234 added conclusion: duplicate Has already been submitted topic: code Related to content of the project itself labels Sep 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
conclusion: duplicate Has already been submitted topic: code Related to content of the project itself type: enhancement Proposed improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[debug] Support breakpoints on sketch libraries
7 participants