Olimex boards ESP32-EVB/Gateway ethernet fix #6188
Added short delay (350 ms) for ESP32-EVB at the start of the ethernet begin function (in ETH.cpp) to solve the issue with the inability to initialize the phy immediately after reset. Added values of the revision macros (in boards.txt) for ESP32-Gateway to match the #if conditions in the variant file and applying the changes for the respective revision.
This is due to the reset-/supervisor-chip
@@ -228,6 +228,9 @@ ETHClass::~ETHClass()

bool ETHClass::begin(uint8_t phy_addr, int power, int mdc, int mdio, eth_phy_type_t type, eth_clock_mode_t clock_mode)
{
#if defined ARDUINO_ESP32_EVB
delay (350); // Olimex board ESP32-EVB requires short delay before the phy initialization after reset
Doesn't it make more sense to check the clock mode?
If the clock mode is external crystal on GPIO-0, then this should apply.
No, this is just the very long 470 ms delay on this particular board, which comes from how the circuit is implemented to save a GPIO pin.
470 ms?
The change mentions 350 ms
470 ms is the RC time constant of the supervisor chip that enables the PHY clock.
The ESP32 also has an RC reset circuit with a 100 ms delay.
So 470 ms - 100 ms = 370 ms, minus the startup time the ESP32 takes until it reaches the code where the Ethernet is being configured.
Hmm, that sounds awfully timing-critical and extremely board specific.
Tolerances of capacitors are quite big (tens of percent), and the capacitance of a capacitor may decrease over time as the component ages.
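To make the timing concern concrete, here is a minimal sketch of the arithmetic from the comments above. The 470 ms and 100 ms figures are the nominal values quoted in this thread; the ±20 % spread used in the usage note is purely an assumption standing in for "tens of percent", and the function name is hypothetical.

```cpp
// Nominal RC delays quoted in the discussion (milliseconds).
constexpr int kSupervisorDelayMs = 470; // supervisor chip enables the PHY clock
constexpr int kEspResetDelayMs   = 100; // ESP32's own RC reset circuit

// Time still to wait after the ESP32 leaves reset, if the supervisor delay
// deviates by supervisorPct percent and the ESP32 reset delay by espPct percent.
// The ESP32 boot time is not subtracted here.
constexpr int remainingWaitMs(int supervisorPct, int espPct) {
    return kSupervisorDelayMs * (100 + supervisorPct) / 100
         - kEspResetDelayMs * (100 + espPct) / 100;
}
```

With no deviation, `remainingWaitMs(0, 0)` gives the 370 ms mentioned above. Under the assumed ±20 % spread, a slow supervisor circuit combined with a fast ESP32 reset, `remainingWaitMs(20, -20)`, gives 484 ms, which would exceed a fixed 350 ms delay unless the ESP32 boot time covers the difference.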
Hey, and sorry for the late reply. I didn't expect this much interest: I posted an issue about a week ago and it got only one suggestion, so I thought it would take days (if at all) before this pull request was discussed.
Anyway, regarding the issue: I thought it was something specific to the ESP32-EVB board, and after some testing with colleagues we figured out that this solves it, although it's more of a workaround than an actual fix. We are not aware whether other boards will need it or not.
The value for the delay was derived empirically by testing about 20 of our boards. Some of them behaved properly, and the sketch worked as intended with or without the delay, while others needed between 100 and 250 ms to work at all. The most demanding boards required ~275 ms, at which point they sometimes worked and sometimes failed. At 300 ms I didn't find any that failed. The extra 50 ms on top of that is more of an "insurance", although you might be right that it will need more.
As for the suggestion with the clock mode check: what exactly do you mean? I knew my solution (or should I say workaround) is lame, but considering it gets the job done I decided it's better than not having it at all. I am open to suggestions in that regard. It's just that I am uncertain how to implement a more elegant solution.
Regarding the clock mode.
There are 2 configuration related issues here:
- The need to power manage (or reset) the LAN controller
- Whether or not there is an actual need for blocking a clock signal to GPIO-0
Resetting the LAN controller via the reset pin needs roughly 100 msec for the LAN controller to properly work.
Power cycling the LAN controller can be useful for saving power (it needs 40 - 100 mA, depending on whether it is connected to a switch). This can also be used to suppress any clock signal to GPIO-0 when using an external crystal.
Depending on the power supply and R/C timings, this may take a few hundred msec to become stable, plus the 100 msec needed when performing a reset.
Blocking a clock signal to GPIO-0 can be handled with an analog switch chip or holding down the EN pin of the crystal.
Switching this takes a few msec at most (when toggling the EN pin of the crystal).
If the clock mode is set to have an external crystal, then it is likely the power pin is either used for power cycling the LAN controller and/or switching the clock signal to GPIO-0. I assume most boards will power cycle the LAN controller as this also can be used as a reset to clear any unrecoverable error state on the LAN controller (which does happen every now and then)
Thus having either the PWR or RST pin set, or the clock mode set to GPIO-0 external crystal, can be used to add some extra delay.
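The rule proposed above could be sketched roughly as follows. This is not code from the core: `ClockMode`, `EthPins` and `needsStartupDelay` are hypothetical stand-ins, and -1 is assumed to mean "pin not connected".

```cpp
// Hypothetical stand-in for the core's clock mode setting; only the
// "external clock fed into GPIO-0" case matters for this check.
enum class ClockMode { Gpio0In, Gpio0Out, Gpio16Out, Gpio17Out };

// Hypothetical bundle of the board's Ethernet pin configuration.
struct EthPins {
    int power;       // PHY power pin, -1 if not connected (assumption)
    int reset;       // PHY reset pin, -1 if not connected (assumption)
    ClockMode clock; // how the 50 MHz reference clock is routed
};

// True if the configuration suggests the PHY may need extra settling time:
// a PWR or RST pin is in use, or the clock comes from an external source on GPIO-0.
constexpr bool needsStartupDelay(const EthPins& pins) {
    return pins.power >= 0 || pins.reset >= 0 || pins.clock == ClockMode::Gpio0In;
}
```

A check like this keys the delay to the wiring rather than to a board name, which is the point of the suggestion. How long to wait in each case is a separate question that the sketch does not answer.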
Is there a chance to do a more general fix? Other boards will fail too when they change chips.
Is it possible to 'catch' a crash and then set a GPIO pin right before the actual reboot?
@VojtechBartoska could you please poke the ETH team and see what they think about it and if there is a more proper way to fix this for everyone?
@me-no-dev yes, assigning to myself.
In my opinion, the most appropriate way to handle this kind of scenario is to have a separate GPIO reserved for enabling/disabling REFCLK, and to enable the clock only when it is safe (e.g. from the user program after boot is done). Typically, this can be achieved by pulling CLK EN low with a resistor so the clock is disabled during booting, and then driving the GPIO high to enable it. However, that is a matter of HW design, and it is not applicable to this ESP32-EVB board, where the design goal was to keep as many GPIOs available as possible.
This is very much board specific, and I unfortunately don't see any better solution than waiting until the PHY has properly started. The question is where to wait, though... Frankly speaking, I am not very familiar with the Arduino project, so I cannot provide any erudite help. However, from a philosophical point of view, it should not be done at a level close to the driver but in some specific board initialization function.
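One way to wait until the PHY has started without hard-coding a single delay is to poll a readiness check with an upper time limit. The helper below is a host-side sketch under that assumption. `waitUntilReady` is a hypothetical name, and how readiness would actually be probed on the board (for example, a PHY register read that succeeds) is not specified in this thread.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Polls `ready` until it returns true or `timeout` has elapsed.
// Returns true if the check succeeded in time, false on timeout.
bool waitUntilReady(const std::function<bool()>& ready,
                    std::chrono::milliseconds timeout,
                    std::chrono::milliseconds pollInterval = std::chrono::milliseconds(10)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (ready()) {
            return true;  // the device answered, stop waiting
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false; // give up so the caller can report an error
        }
        std::this_thread::sleep_for(pollInterval);
    }
}
```

Boards whose PHY comes up quickly return almost immediately, while slow boards get up to the full timeout. Such a helper could live in a board initialization function rather than in the driver, in line with the comment above.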
@Stanimir-Petev, @sauttefk I also noticed that the PHY reset is de-asserted (driving the input high) right after power is applied to the LAN8710A. Am I right? Didn't you observe any issues? The reason why I am asking is that the LAN8710A datasheet states: "A hardware reset (nRST assertion) is required following power-up".
@kostaond The Olimex ESP32-EVB is broken by design! Warm-starts have a 50/50 chance of waiting forever in the bootloader instead of starting the application. As far as I can see there is an R/C reset circuit that delays the NRST input of the LAN8710 after a cold-start.
I see. So shouldn't it be stated in some Olimex documentation/errata, along with the workaround suggested in this PR, rather than creating a board-specific update? However, as I said, I am not an Arduino guy, so maybe the philosophy is different here...
Do you mean C18?
IMHO a fix for a specific faulty device should not be placed in the general driver.
Sorry to hijack this, but since the ethernet team is here: can somebody confirm that during reset the LAN8720 does not output REFCLKO? I would like to use the LAN8720 with a 25MHz crystal, outputting REFCLKO (50MHz) to GPIO0. But I need to make sure the LAN8720 does shut down the clock during reset so that the ESP32 will boot ok. The datasheet does not say it explicitly.
I remember we did such a test years ago. The result showed that the LAN8720 keeps the REF_CLK output even in the reset state.
No, unfortunately not. I also wish this were so :-(
@gonzabrusco I'd also wish there were a possibility to make most of the ESP32 bootstrap pins obsolete by programming the efuse, just as is possible with GPIO12 for selecting the flash voltage.
Thanks @sauttefk @suda-morris. So with the LAN8720 there's no way to use it with GPIO0 as a clock input (except with that hack). Or maybe fully power it down.
Well you could use a 50MHz crystal oscillator with an enable pin. Pull this enable pin low by default and connect it to the same GPIO as your LAN8720 reset.
So just keep this Pull Request open to remind us of further investigation on Ethernet in general.
A clean reinitialization of all GPIOs involved in RMII before starting up the ETH module seems to fix the restart issue. Done by @s-hadinger for Tasmota, fixing the issue with Tube ZB (based on the POE Olimex module).
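The idea can be illustrated with a small host-side sketch. It is not the Tasmota patch. The reset action is injected as a callback so the logic can be exercised off-target, and `resetEthernetPins` is a hypothetical name. The six data pins listed are the fixed RMII pins of the original ESP32 as far as I know. On hardware, the callback would presumably wrap ESP-IDF's `gpio_reset_pin()`.

```cpp
#include <array>
#include <functional>
#include <initializer_list>

// RMII data pins of the original ESP32: TXD0, TX_EN, TXD1, RXD0, RXD1, CRS_DV.
constexpr std::array<int, 6> kRmiiPins = {19, 21, 22, 25, 26, 27};

// Calls `resetPin` for every RMII pin and for each board-specific pin that is
// connected (>= 0; -1 is assumed to mean "not used"). Returns how many pins were reset.
int resetEthernetPins(const std::function<void(int)>& resetPin,
                      int mdc, int mdio, int power) {
    int count = 0;
    for (int pin : kRmiiPins) {
        resetPin(pin);
        ++count;
    }
    for (int pin : {mdc, mdio, power}) {
        if (pin >= 0) {
            resetPin(pin);
            ++count;
        }
    }
    return count;
}
```

Called right before the Ethernet driver is started, this puts every involved pin back into a known state regardless of what the previous boot left behind.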
@Jason2866 That's great and would explain a lot of issues I'm seeing on my own boards, as I did base some of my own board designs on the Olimex boards. Where/when do you reset these pins? I assume right before calling ETH.begin() obviously, but maybe also before an intended reboot?
Oh, I didn't know about this issue. I don't know what the root cause is, but I've seen this issue with the latest Olimex POE. The forced reset of the GPIOs is made just before calling However I suspect it is more of a problem in the ESP32 IO matrix between reboots. It always works fine after power up, but fails often after a reboot.
Yep, I've seen that too, also for other pins. (e.g. I2C needing some tricks to get the bus unstuck after reboot)
I will be lazy on this issue. This small fix above makes it work 100% of the time for Olimex POE, that will do it for now. I hope you will find the root cause.
Hmmm. Unfortunately the problem came back. My patch above is not enough.
What exactly is the problem you're experiencing?
After a first start Ethernet works well. When I restart (no reset button, no power off), the Ethernet seems to connect and after 2 seconds goes off (the green LED lights for 1-2 seconds). Then it tries to reconnect and goes off again. Surprisingly after some time, restarting the device does work.
Does this happen on all switches/routers, or only on some?
I have only tried on a Unifi switch with auto-negotiation enabled. This could also come from the auto-nego failing. I'm sorry that I couldn't spend more time on it, nor enable more logs. I will try to gather more information in the following days.
Summary
Added short delay (350 ms) for ESP32-EVB at the start of the ethernet begin function (in ETH.cpp) to solve the issue with the inability to initialize the phy immediately after reset.
Added values of the revision macros (in boards.txt) for ESP32-Gateway to match the #if conditions in the variant file and applying the changes for the respective revision.
Impact
I have described the ESP32-EVB delay in more detail here: #6142
The values for the ESP32-Gateway macros are needed so that the changes apply not only to one specific revision but also to all later revisions. This is achieved by comparing the revision value to a constant inside the "pins_arduino.h" file in the variants folder. For example:
The hardware changes were not only for revision D, but also for E, F etc. Without the values, those comparisons were meaningless and the code inside was ignored. As a result, the default ethernet example wasn't working. With these changes implemented, the ethernet clock and power pins are defined and it works properly now.
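The effect of a missing macro value can be reproduced with a small preprocessor sketch. The macro names below are hypothetical and are not the ones used in boards.txt or pins_arduino.h. The sketch assumes the revision is encoded as a character constant, and relies on a macro defined by a bare `-DNAME` flag getting the value 1.

```cpp
// Hypothetical revision macro carrying a value, as it would after a change like this one.
#define HYPOTHETICAL_BOARD_REV 'E'

// Hypothetical macro as it would look when defined by a bare -DNAME flag:
// the compiler gives it the value 1.
#define HYPOTHETICAL_FLAG_ONLY_REV 1

// With a real revision value, "revision D or later" is true for D, E, F, ...
#if HYPOTHETICAL_BOARD_REV >= 'D'
constexpr bool kRevDOrLaterPins = true;
#else
constexpr bool kRevDOrLaterPins = false;
#endif

// Without a meaningful value, the same comparison is false, so the guarded
// pin definitions would be skipped without any error.
#if HYPOTHETICAL_FLAG_ONLY_REV >= 'D'
constexpr bool kFlagOnlyBranchTaken = true;
#else
constexpr bool kFlagOnlyBranchTaken = false;
#endif
```

An identifier that is not defined at all evaluates to 0 inside `#if`, so that case fails the comparison in the same silent way.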