ERROR: IDE compiles incorrect under Windows but compiles correct under Linux #9659

Closed
Byterra opened this issue Jan 24, 2020 · 14 comments

@Byterra

Byterra commented Jan 24, 2020

Compiling under Linux results in a fine working program.

When I compile the same source under Windows 10, the program shows many faults, such as:

  • resetting spontaneously
  • changing pin values without the program telling it to
  • exiting before it has finished
  • reading a pin as LOW when it is not LOW

To investigate this, I used two different Windows 10 laptops and did a clean reinstall on one of them, with IDE version 1.8.10.
For Linux I use an older laptop with Ubuntu 18.04.3 LTS and IDE 2:1.0.5+dfsg2-4.1.

Source:
AllIn_2020_Base_Keep.TXT

@Byterra
Author

Byterra commented Feb 12, 2020

A late comment: I compile for the Arduino Mega 2560.

@ElectricRCAircraftGuy
Contributor

ElectricRCAircraftGuy commented May 2, 2020

@matthijskooijman
Collaborator

matthijskooijman commented May 3, 2020

I suspect this is the same problem as arduino/ArduinoCore-avr#339. Could you try running with -fno-jump-tables to see if that solves any or all of your problems?

To do so, find the platform.txt file for the Arduino AVR core in use (the path should be printed somewhere near the top when compiling with verbose output) and add -fno-jump-tables to the compiler.c.elf.flags line (do not forget to separate the existing content from the new option with a space). If you need more help, just shout.

@Byterra
Author

Byterra commented May 3, 2020 via email

@Byterra
Author

Byterra commented May 22, 2020

platform.txt
arduinoCompileTest.txt
AllIn_20200108_FF.txt
duration_20200503.txt

First of all my apologies for responding so late.

I did what you asked, and an extremely funny thing now happens. The program I use as a sort of baseline works fine after compiling under Windows 10. Problem solved, you'd say.

But nothing of the kind! Another program, derived from the baseline, still has the same problem. Reed contacts read as "closed" when they are "open", and pins configured with pinMode(nn, OUTPUT) are set LOW when they should be, or should remain, HIGH.

I attach:

  1. my adapted 'platform.txt'
  2. the result log of the program compile
  3. the baseline prgm AllIn_20200108
  4. the derived prgm duration_20200503

To be absolutely clear: the base program that compiles fine now is AllIn_20200108
The program that still does NOT compile well: duration_20200503

Finally: 'Serial.print' and 'Serial.println' now also print a time in front of the text I want to print!

Awaiting your reply,

Regards,

Henk van der Heijden.

@Byterra Byterra closed this as completed May 22, 2020
@matthijskooijman
Collaborator

Hey Henk, your previous reply got lost in my big TODO-list, sorry for not following up. Also note that in my previous reply, I messed up the link to the other issue, I edited that now.

Your observations seem weird: The problem seems solved, but then reoccurs after running for some time (or at least, some problem occurs, possibly a different problem?).

I'm wondering if there might be multiple problems here. In particular:

  • Maybe the initial problem was the jump-table problem, but there is a second problem that occurs later. That second problem (or maybe even both) could still be caused by errors in your code: things like null-pointer dereferences, out-of-bounds memory writes, or use-after-free errors can produce pretty random behaviour, including toggling pins.
  • Maybe you do not have exactly the same AVR core and/or compiler versions on your Windows and Linux machines, which could explain the difference in behaviour? Or maybe your code does something that is undefined behaviour; then I think it might even be possible for the same compiler running on different systems to produce different results (still unlikely, but not unthinkable).

I had a quick look at your failing sketch, but could not find anything obviously wrong (but the code is complex, so I will certainly have missed things. I also see a lot of array indexing, which could be sensitive to overflows). In your compile log, I see some warnings, but nothing that should cause any actual problems.
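
To illustrate the array-indexing concern, here is a minimal sketch of a bounds-checked accessor. All names in it (kNumContacts, contactState, setContactState, getContactState) are hypothetical and are not taken from the attached sketches; it only shows, under that assumption, how an out-of-range index can be rejected instead of silently overwriting unrelated memory.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical state array; the size and name are assumptions for illustration.
const size_t kNumContacts = 8;
static uint8_t contactState[kNumContacts];

// Write a value only if the index is inside the array.
// Returns false (and writes nothing) for an out-of-range index.
bool setContactState(size_t index, uint8_t value) {
    if (index >= kNumContacts) {
        return false;  // would have been an out-of-bounds write
    }
    contactState[index] = value;
    return true;
}

// Read a value; an out-of-range index yields 0 instead of reading past the array.
uint8_t getContactState(size_t index) {
    return (index < kNumContacts) ? contactState[index] : 0;
}
```

Routing every array access through helpers like these, and printing a message over Serial whenever setContactState returns false, might help narrow down whether an index overflow is involved.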

I'm not sure where to go from here, unfortunately. Maybe you could somehow more closely pinpoint the problem in your code, though problems like these tend to disappear when you try to shrink the code, I'm afraid...

@Byterra
Author

Byterra commented May 22, 2020 via email

@matthijskooijman
Collaborator

I thought of one more thing to try. The problem of arduino/ArduinoCore-avr#339 actually seems to be caused by the --relax linker option. So far we've seen problems with jump tables, but maybe it causes problems in other situations too.

So, you could try to disable --relax. It is a bit tricky, since it is not present in platform.txt explicitly, but automatically added to compiler.c.elf.flags (which is planned to be removed, see arduino/arduino-cli#639, but that won't help you now of course).

So, to disable it, you would need to add -Wl,--no-relax after {compiler.c.elf.flags} in the link recipe, so that it turns the automatically added option off again. In other words, you could try:

recipe.c.combine.pattern="{compiler.path}{compiler.c.elf.cmd}" {compiler.c.elf.flags} -Wl,--no-relax -mmcu={build.mcu} {compiler.c.elf.extra_flags} -o "{build.path}/{build.project_name}.elf" {object_files} "{build.path}/{archive_file}" "-L{build.path}" -lm

(I removed the -fno-jump-tables again, since that should no longer be needed)

@Byterra
Author

Byterra commented May 24, 2020 via email

@Byterra
Author

Byterra commented Jun 14, 2020 via email

@Byterra
Author

Byterra commented Jul 2, 2020 via email

@matthijskooijman
Collaborator

Recently, 1.8.13 was released which automatically disables --relax, fixing arduino/ArduinoCore-avr#339. You could try upgrading and see if that helps, no further changes needed (you do need to revert your changes to platform.txt of course).

> As I mailed to you earlier, I did resolve the problem in another way, by adding extra variables that read data from one array and then using these to read another one. But I was quite enthusiastic about that indirect addressing, which you call jump tables, as I understand now.

I think you misunderstood: Jump tables are internally generated by the compiler when you use a switch statement. Arrays used in your own code are not those and should be unaffected by this problem.
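
As a rough illustration of that distinction, the sketch below contrasts a dense switch statement, which is the kind of code a compiler may translate into a jump table, with an ordinary lookup array written by the programmer. The function names, the table, and the pin numbers are all hypothetical and are not taken from the attached sketches; whether a jump table is actually emitted depends on the compiler version and flags.

```cpp
#include <stdint.h>

// A dense switch like this is a typical candidate for a compiler-generated
// jump table (unless -fno-jump-tables is given). Pin numbers are made up.
uint8_t pinForCommand(uint8_t command) {
    switch (command) {
        case 0: return 22;
        case 1: return 23;
        case 2: return 24;
        case 3: return 25;
        case 4: return 26;
        case 5: return 27;
        default: return 255;  // no pin for unknown commands
    }
}

// A plain data array in user code. Indexing it is ordinary memory access
// and does not involve a jump table.
static const uint8_t kPinTable[] = {22, 23, 24, 25, 26, 27};

uint8_t pinFromTable(uint8_t command) {
    const uint8_t count = sizeof(kPinTable) / sizeof(kPinTable[0]);
    return (command < count) ? kPinTable[command] : 255;
}
```

Both functions return the same values for every input; only the first depends on how the compiler chooses to implement the switch.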

@Byterra
Author

Byterra commented Jul 3, 2020 via email

@matthijskooijman
Collaborator

> Removed the old version, cleared the registry as far as possible, and installed 1.8.13. The same problem occurred even faster than before: pins were set LOW while they should have remained HIGH.

Hm, that suggests that your problem is really a different one than I had thought it was. Maybe another compiler bug then, or maybe some problem in your code after all: maybe you have a bug that triggers undefined behaviour, and different compiler versions behave differently with that (as they're allowed to). The code is too complex for me to tell. I'm not sure if I have any further suggestions for debugging this, though. Or maybe one: did you try enabling warnings in the preferences? If there is indeed undefined behaviour, the compiler might be raising a warning (warnings are hidden by default, unfortunately).

> Thanks for all your effort. Maybe you know whether there will be an entirely new version sometime this year?

Not sure, I think that gcc for AVR has not always been quick to update, and I think development has nearly halted in most recent versions (there was a bountysource about migrating AVR to a new backend in gcc, otherwise it might even be removed in the near future...).

> It would be interesting to compare the compiled Linux version with the compiled Windows version. On Linux I use (Arduino 2:1.0.5+dfsg2-4.1), and at the moment I do not dare to upgrade. The Windows version is able to export the machine code. The Linux version I use does not.

Oh, that's an ancient version. I hadn't realized that when you first posted this, but that probably means the problem is not Linux vs Windows, but just old compiler vs new compiler. Normally, I would recommend people upgrade from that ancient version, but given it works for you now, I'll refrain from that.
