Skip to content

Fix for invalid elf file when converting to bin #8356

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

anilsoni85
Copy link

@anilsoni85 anilsoni85 commented Oct 31, 2021

On windows 10 xltensa-lx106-elf-objdump sporadically returns no output and results into build errors. Adding logic to retry to read segment from elf.

Ref: #7253

On windows 10 xltensa-lx106-elf-objdump sporadically returns no output and results into build errors. Adding logic to retry to read segment from elf.
@d-a-v
Copy link
Collaborator

d-a-v commented Oct 31, 2021

Would it be possible to print the exception details so this weird behavior is understood ?

@d-a-v d-a-v changed the title Fix for Issue #7253 Fix for invalid elf file when converting to bin Oct 31, 2021
@earlephilhower
Copy link
Collaborator

Would it be possible to print the exception details so this weird behavior is understood ?

I'm not sure that would be helpful. The exception now manually raised when the output is empty.

Python's subprocess.popen() will throw an OSError exception (which is not caught and will dump to console/IDE) if the app can't start for some reason. But, it's not in the case they're reporting because they're just showing the manually created "no start found" exception we make.

So we really don't have anything to go on here. I would maybe add a note to the top saying # This is a hack to fix intermittent Windows silent failures, see #7253 or something because it's the only spot in a whole bunch of calls we do this.

I'd also like to get a report or two that this fixes things from someone who has been having failures before merging.

My gut says there's something in these users' Windows installs that's blocking the app from running (malware scanner? antivirus? race condition w/OneDrive online? no idea) every now and then. I, too, have never had it fail in my (admittedly limited) Windows testing, nor has it crapped out ever in Windows CI.

@d-a-v
Copy link
Collaborator

d-a-v commented Nov 1, 2021

I was indeed thinking of a race condition without further clue but

malware scanner? antivirus? race condition w/OneDrive online?

Considering that we are dealing with executable files, I think you put your finger on it @earlephilhower .

@anilsoni85
Copy link
Author

anilsoni85 commented Nov 1, 2021

Would it be possible to print the exception details so this weird behavior is understood ?
Python's subprocess.popen() will throw an OSError exception

The process doesn't print any output to STDOUT, while debugging the issue I dumped each line inside the for-loop and noticed that when it fails there is no output. Perhaps the process writes on STDERR but returns with exit code 0 because there was no OSError exception on console. The process silently goes away sporadically without printing a single line. I will revert my local script change and put screenshot later when I will get chance. The error was pretty consistent to reproduce but it was failing randomly for one of the segment of elf.

My gut says there's something in these users' Windows installs that's blocking the app from running (malware scanner? antivirus? race condition w/OneDrive online? no idea)

I was running it on local isolated development environment which is excluded from OneDrive sync therefore I don't think its related to one drive/Antivirus/Malware.

@anilsoni85
Copy link
Author

anilsoni85 commented Nov 2, 2021

Look at the attached log lines from build output I generated this log file locally after adding print in elf2bin.py. Notice how there were 2 attempts to load .text segment at line#6 and line#7.

        with subprocess.Popen([path + '/xtensa-lx106-elf-objdump', '-h', '-j', segment,  elf], stdout=subprocess.PIPE, universal_newlines=True ) as p:
            lines = p.stdout.readlines()
            print("Segment: {}. Attempt: {}/{} STDOUT Lines: {}".format(segment, attempts, maxAttempts, lines))
            if (len(lines) > 0): 
                for line in lines:

elf2bin.txt

Copy link
Author

@anilsoni85 anilsoni85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed raise exception level

@mcspr
Copy link
Collaborator

mcspr commented Nov 3, 2021

The process silently goes away sporadically without printing a single line.

Have you tried wait() / communicate() after Popen() call, and then read the stdout?
What is the stderr=subprocess.PIPE result though, is there anything there? (unlikely, but nonetheless... and I assume arduino ide would've printed it as we don't capture it?)

@earlephilhower
Copy link
Collaborator

Have you tried wait() / communicate() after Popen() call, and then read the stdout?

The readlines() call should be blocking until the process exits, as I understand the Python libs. So, it's already effectively doing this.

Seeing if STDERR has anything would be useful, though, like you suggested!

The logs seem to show it failing after succeeding several times, so the race condition thing seems harder to believe now (unless AV/etc. is non-deterministic and only checks things after a few 10s of ms). And, still, no Windows CI failures (running on a Win2K21 Server VM I believe).

Print STDERR and STDOUT on console when segment read fails.
@anilsoni85
Copy link
Author

I cannot reproduce it anymore . I have modified the reattempt logic and added print statement to dump STDOUT and STDERR on console may be it will help in future.

@panakos
Copy link

panakos commented Dec 24, 2022

Citrix Workspace was causing the issue. I updated Citrix and problem gone!

@anilsoni85 anilsoni85 closed this May 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants