Difference in esp-01 and esp-12e while implementing serial to wifi bridge #4428
Comments
I can't say whether some physical difference causes this issue, but there's a reasonably simple workaround: have the sender transmit the number of bytes it is about to send to the ESP, then the data, then a CRC. The ESP checks the CRC and replies with either "CRC OK" or "CRC not OK"; on "CRC not OK" the app resends the payload. |
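A minimal sketch of that length + data + CRC framing, in plain C++ so it can be tried off-device. The frame layout, the function names, and the choice of CRC-16/CCITT-FALSE are assumptions made for illustration; the comment above does not specify any of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF), bit by bit.
// Assumption: any CRC both sides agree on would do; this one is just common.
uint16_t crc16_ccitt(const uint8_t* data, size_t len) {
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; ++i) {
        crc ^= static_cast<uint16_t>(static_cast<uint16_t>(data[i]) << 8);
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 0x8000) ? static_cast<uint16_t>((crc << 1) ^ 0x1021)
                                 : static_cast<uint16_t>(crc << 1);
        }
    }
    return crc;
}

// Hypothetical frame layout: [len_hi][len_lo][payload ...][crc_hi][crc_lo]
std::vector<uint8_t> buildFrame(const std::vector<uint8_t>& payload) {
    assert(payload.size() <= 0xFFFF);  // length field is 16 bits
    std::vector<uint8_t> frame;
    frame.push_back(static_cast<uint8_t>(payload.size() >> 8));
    frame.push_back(static_cast<uint8_t>(payload.size() & 0xFF));
    frame.insert(frame.end(), payload.begin(), payload.end());
    const uint16_t crc = crc16_ccitt(payload.data(), payload.size());
    frame.push_back(static_cast<uint8_t>(crc >> 8));
    frame.push_back(static_cast<uint8_t>(crc & 0xFF));
    return frame;
}

// Receiver side: true means reply "CRC OK", false means reply "CRC not OK"
// so that the sender resends the payload.
bool frameIsValid(const std::vector<uint8_t>& frame) {
    if (frame.size() < 4) return false;  // too short for length + CRC
    const size_t len = (static_cast<size_t>(frame[0]) << 8) | frame[1];
    if (frame.size() != len + 4) return false;  // byte lost or gained
    const uint16_t received =
        static_cast<uint16_t>((frame[len + 2] << 8) | frame[len + 3]);
    return crc16_ccitt(frame.data() + 2, len) == received;
}
```

Because the length is checked as well as the CRC, a dropped byte and a corrupted byte both end in the same "CRC not OK" reply, which keeps the resend logic on the sender simple.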
In your serial-wifi transfer, don't read more serial data bytes than client.availableForWrite() before writing them to the client. |
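As a rough illustration of that advice, the number of bytes to move in one pass can be capped by three things: what the serial port has buffered, what the client can accept, and the size of the local buffer. The helper name `bridgeChunkSize` is hypothetical, and the Arduino usage in the trailing comment is an untested sketch rather than code from this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Returns how many bytes may safely be read from serial and written to the
// WiFi client in one pass: never more than the client reports it can take.
size_t bridgeChunkSize(size_t serialAvailable, size_t clientWritable,
                       size_t bufferSize) {
    assert(bufferSize > 0);
    return std::min({serialAvailable, clientWritable, bufferSize});
}

// Possible use inside an Arduino loop() (assumes a connected WiFiClient
// named client and a local byte array named buf):
//
//   size_t n = bridgeChunkSize(Serial.available(),
//                              client.availableForWrite(), sizeof(buf));
//   if (n > 0) {
//       n = Serial.readBytes(buf, n);
//       client.write(buf, n);
//   }
```

Bytes that are not read in this pass stay in the serial receive buffer, so this only helps if the sender is also throttled before that buffer fills.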
Try pr #4328 . |
Before I add to this thread with my similar issue, it looks like that PR has now been merged, and since I downloaded the latest GIT yesterday, AFAIK I've got the new code. I'm also doing a similar thing, using an ESP-01 (80MHz CPU, 40MHz Flash) and implementing an FTP server using another (much faster) MCU controller that talks to the 8266 via Serial. The master MCU claims to have zero serial-error/drift @ 250KHz, 1MHz & 2MHz. The whole project works perfectly @ 500KHz but quickly fails at anything greater (the next step I tried being 750KHz, but 1MHz does too, and 2MHz, which would be my ideal speed if it worked). I'm fairly [very] sure I lose a byte due to some long WiFi interrupt (the only library I use) that puts the two units out of sync. CRC may well be a solution, if sync. can be re-established, but that's easier to say than it is to implement. I also tried increasing the 8266 HW/Serial receive buffer size from 256 bytes to 2048. It didn't make a difference. Given there are 40 clock cycles (80MHz/2MHz) for an interrupt to complete - given the buffer change I mentioned didn't work so I expect it's being blocked, I kind of understand that running the devices at this speed isn't going to work without some ESP experts (not I!) doing something special here... But that's where I'm at... Left with a device that processes files at 40KB/s when it could be going close to 160KB/s. I feel that's the difference between a nice solution and one that would be rarely used. |
Can you try reading Serial then sending to WiFi client not more than client.availableForWrite() |
A Serial to WiFi bridge is a common function, |
Pr #4328 has been merged (post 2.4.1 release), it fixes an issue with the HardwareSerial rx buffering, and is reportedly working up to 2MHz. |
My solution has serial handshaking (data lengths and wait signals) between the two MCUs and the buffer is in global memory.. availableForWrite() should not be needed because Serial comms is fully completed and the other side set to "wait" before WiFi is used (both transmit and receive). Having said that, I expect WiFi interrupts are firing fairly constantly no matter what state the system is in. I also have PR #4328. My calculation of 40 clocks was wrong, since there's 10 bits in each byte transferred - it's 400 clocks before Serial will possibly lose a byte @ 2MHz. Questions I might ask the ESP people in the know are, can the Serial interrupt fire when the WiFi Interrupt is running (priority and nesting), if not then what is the max. latency the WiFi interrupt needs (this can be used to calculate the appropriate serial baud setting) and also, if the WiFi interrupt needs to loop and consume >400 clocks, can it service the Serial hardware as a side task while/if interrupts are blocked (so my larger serial buffer is effective). Thanks everyone. |
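The corrected figure can be reproduced with a small calculation: at 10 bits per byte (start bit, 8 data bits, stop bit), an 80MHz CPU has 400 clocks per received byte at 2000000 baud. The function names below are made up for illustration, and the FIFO helper simply multiplies the spare FIFO slots by that per-byte budget.

```cpp
#include <cassert>
#include <cstdint>

// CPU clock cycles that elapse while one serial byte is on the wire.
// bitsPerFrame = 10 assumes 8N1 framing (1 start + 8 data + 1 stop).
uint32_t cpuClocksPerByte(uint32_t cpuHz, uint32_t baud,
                          uint32_t bitsPerFrame = 10) {
    assert(baud > 0);
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(cpuHz) * bitsPerFrame) / baud);
}

// Clock budget between a "FIFO full" interrupt at fullThreshold bytes and the
// hardware FIFO actually overflowing at fifoSize bytes.
uint32_t fifoHeadroomClocks(uint32_t fifoSize, uint32_t fullThreshold,
                            uint32_t clocksPerByte) {
    assert(fullThreshold <= fifoSize);
    return (fifoSize - fullThreshold) * clocksPerByte;
}
```

For example, `cpuClocksPerByte(80000000, 2000000)` gives 400 and `cpuClocksPerByte(80000000, 500000)` gives 1600, so dropping from 2000000 to 500000 baud quadruples the time any interrupt may block before a byte is at risk.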
I downloaded the tech. specs. for the ESP8266 and I see the hardware serial has 128 byte receive and transmit FIFOs. The uart_start_isr function configures RX for "full" as 100 (giving 28 characters extra to clear the FIFO - 11200 clocks) but I changed this so "full" was just 25 and it did seem to work a bit longer - but still failed. Sadly, the specs. say little to nothing about how WiFi works. I did read the doco for this project and they say WiFi is serviced at the end of loop() and when you call delay()/yield(). My project has no delays but does exit loop() regularly and I think I saw code in Serial.available() that "optimistically" yields when empty, which I use. Since I was able to transfer files at a higher speed for a bit longer, I ran a diff on the original file each time I uploaded and downloaded it. The files were the same size, but in one case there were two characters different, one about 100KB in to the file and another at about 1.2MB in - the file was 2MB in size. I had not lost a character, but there is a signal problem for me at high speed. That makes me wonder if the upload speed option of 921600 is because the ESP-01 can't reliably do 1000000? On another attempt, the file transfer didn't complete and I saw this at the end of the short file: (the file is basically 2MB of c++ source code full of text hex values)
Notice the ESP8266 write exception occurred in the middle of my receive from WiFi function (WiFiClient.read() already completed because I'm in the loop that sends back the result over Serial). And, more confusing, my loop has then continued to send 1725 bytes from my buffer (I've not shown it all here) after the exception happened. My 16KB buffer accesses are protected by a limit check:
My FTP code doesn't request anywhere near 16KB in these calls as I've limited it in trying to fix this issue. The two MCUs are completely out of sync at this point too, with the master getting back the exception data in the file payload. The FTP code is hammering this function at this point (all it's doing is asking for data and checking if still connected in a loop - as a disconnect on the data socket signals the FTP file transfer is complete). Edit: Should say read exception. |
|
The one above occurred in the control thread while waiting for a new command from the FTP client. This one is the data in the file that is mentioned above:
The FTPd3f (firmware) has this on line 113:
The firmware has this on line 139:
For line 139, I expect a byte corruption in sockindex being sent has made it go out of bounds? |
It is hard to debug the esp-arduino core without a sketch and setup we don't have access to, this is why we need MCVE to reproduce locally and track the bug down. |
I understand. I'm not sure what problem there would be with line 113 but it has only happened once (unlike many other exceptions I've now seen, now that I know where to look and have modified my master sketch to dump serial when something odd happens) but I wasn't expecting it to restart the server listener while I was connected with the FTP client. I suspect the control byte was corrupt and it entered the wrong function in that instance, even though it shouldn't have crashed. I am monitoring heap and it has always been above 25KB too. What I can now say is, as I increase the Serial baud rate between the two devices, I see more and more serial communication corruptions. I've changed the code to only make WiFi function calls on array members that are instantiated and the exceptions are now all gone. Any bogus sockindex values fail-safe to return values that keep the transfer going. The file transfers however, although always the correct number of bytes transmitted/received in either direction, have more and more data corruptions (detected by running diff) no matter which direction the file is sent. I'm now confident the software is rock solid. The hardware however isn't. The thing is, I've rewired it 3 times now and there's no difference. Currently the RX on the 8266 is a 5mm long solder trace to master, and the TX is a 12mm wire. Neither are close to any other signal - and I'm not sure I can improve this any further. I had thought maybe the 3.3v regulator isn't up to it, but Serial is the only thing not working while WiFi seems fine. I don't think that is the problem, and it's certainly spec'd for the job. Adding CRC now seems like the best thing to do, but at 2MHz, the number of corruptions I can see is huge. It also makes the project twice as complicated and not really guaranteed to be any faster with repeat packet sends. I might opt for a couple of test-sketches that test the full range of baud rates to see if there's a fast one both MCUs like. 
I just can't find any details on this for the 8266. But, thanks for helping! |
At this point, we need a MCVE sketch. |
Sure.. The following sketch is a minimal example, use whatever other MCU with two serial ports you have for master. With twifi=0 you see occasional output like "EL~~-" (E means error but a single byte wasn't lost, L means a single byte was, but EL together just means it's resynchronising after an error followed by two more characters and a minus in the resync). With twifi=1 you get much more output, but I'm not sure if attempting to connect with an FTP client makes it worse - but it does output "Disconnect" and then resyncs with the usual 3 bytes ~~-.
|
Some observations: 1/ Increasing the RX software buffer (to 2KB) does nothing to help the problem in this sketch. 2/ Since RX and TX Serial hardware FIFOs are the same size, I doubt very much you could reproduce this problem with only one module with RX bound to TX. 3/ Even with WiFi off, which reduces errors considerably, I'm seeing one error every ~15MBs transferred on average. I added a byte match success counter to the sketch for that estimate. With WiFi enabled, it's pretty much like I wrote above about my file transfer, one error for every MB transferred. Sometimes you get 3-5MBs or so without errors, but that's rate. E.G. Master sketch now uses a global unsigned long "checks" variable and says: if (b != nextr++) { ... } else checks++; if (checks >= 1000000) {Serial.write ("MB!\n"); checks = 0;} |
@DeshmukhAkshay : I agree with your initial findings in the OP, my test sketch above still shows regular errors at 500000 baud when WiFi is on, but they almost all go away at 250000 baud on an ESP-01. I don't have a '12e to test with and I'm unlikely to buy one in the future (or any other 8266 based devices now) to test given I can still see errors at 250000 baud. I also agree with you, anything below 250000 makes file transferring via WiFi too slow to be useful for anything but tiny files. At 250000 baud, I see one error every ~30MBs transferred on average when WiFi is on. I also did a microsecond-difference test on the two MCUs I'm using, the result (where 1000000 is a perfect match of the two clocks) was 999993. That is about 7us of drift per second (7 ppm); since Serial is meant to handle up to 2.5% difference in baud rates, I don't think this difference in clocks used to generate the baud rate is related to the errors. (Oh, I also meant to write "rare" in my post just above, not "rate".) EDIT: I tried turning off all WiFi devices in my home except the ESP-01 and the WiFi-Router, it's done 80MB without an error so far on Serial, but that's not useful except to demonstrate where the problem lies. |
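The clock comparison above reduces to a parts-per-million figure: a count of 999993 against a nominal 1000000 is 7 ppm, several thousand times smaller than the 2.5% (25000 ppm) tolerance quoted. A small sketch of that arithmetic follows; the function names are illustrative, not from any library.

```cpp
#include <cassert>
#include <cmath>

// Relative clock error in parts per million, given the count one MCU measured
// over an interval that should have been `nominal` counts long.
double clockErrorPpm(double measured, double nominal) {
    assert(nominal > 0);
    return (nominal - measured) / nominal * 1e6;
}

// True if the clock error is inside a tolerance expressed in percent.
bool withinTolerance(double errorPpm, double tolerancePercent) {
    return std::fabs(errorPpm) <= tolerancePercent * 1e4;  // 1% = 10000 ppm
}
```

With the numbers from the post, `clockErrorPpm(999993, 1000000)` is about 7 and `withinTolerance(7.0, 2.5)` is true, which supports the conclusion that baud-rate mismatch is not the cause of the errors.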
@DeshmukhAkshay : I believe there's a bug in the ESP8266 uart.c file that wrongly calculates there are no characters available if the uart isr runs at precisely the "wrong" time. The wrong time being, when moving the FIFO data due to being full or by time-out, and if the software buffer is already completely empty then the two components of "available()" can be 0 if the isr fires in the middle of the addition operation it does. If this is the case while read() has been called (because available() had already told you there was data in the FIFO) it will return -1 as the next character and corrupt your file. Can you modify your esp library files in the Arduino directory and add this extra line of code and re-try your tests? You may want to get the latest files from here, they have some other fixes to the uart library. You may also want to change your tests to include suggestions made by others here... In file: uart.c
Also, if you are going to use faster baud rates, increase the rx buffer size to 512 or 1024: |
Hey, I also ran the same tests on a NodeMCU 1.0 board, just to confirm the previous hardware design was correct. It shows the same behaviour. |
Download the latest files from here. They have fixed some issues with the uart. Other than that, I would say your original analysis was correct, the uart is overflowing (there's now a new Serial method you can use to find out if it has: bool hasOverrun() so you can verify it for sure with that). If that's true, only send from the PC RxBufferSize (1024) bytes and then make the PC wait for a continue control byte to come back, create a counter in your sketch and when it reaches RxBufferSize and you have passed all the data to the WiFi methods, send back the control byte to tell the PC to resume now you know the Rx buffer is empty. That's basically what my solution does, and it works now. |
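The block-and-acknowledge scheme described above can be sketched as a small counter on the ESP side. This is an illustration under stated assumptions, not the poster's code: the class name, the `CONTINUE_BYTE` mentioned in the comment, and the Arduino usage are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Counts bytes that have been read from serial and handed to WiFi. Once a
// full block has been forwarded, the RX buffer is known to be empty and the
// sender can be told to resume.
class BlockFlowControl {
public:
    explicit BlockFlowControl(size_t blockSize)
        : blockSize_(blockSize), forwarded_(0) {
        assert(blockSize > 0);
    }

    // Call after n bytes were passed to the WiFi client. Returns true when
    // the "continue" control byte should be sent back to the sender.
    bool onBytesForwarded(size_t n) {
        forwarded_ += n;
        if (forwarded_ >= blockSize_) {
            forwarded_ = 0;  // start counting the next block
            return true;
        }
        return false;
    }

private:
    size_t blockSize_;
    size_t forwarded_;
};

// Possible use in a sketch (CONTINUE_BYTE is a made-up protocol constant):
//
//   static BlockFlowControl flow(1024);  // match the RX buffer size
//   if (flow.onBytesForwarded(n)) Serial.write(CONTINUE_BYTE);
```

The sender must stop after each block of exactly the same size and wait for the control byte, so the block size has to be agreed on by both ends.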
@jasoroony can you briefly explain how you used the hasOverrun() method in your sketch, or point me to a URL/reference? |
It resets the internal flag every time you call the hasOverrun method, so you only get a true result once until it overruns again. If you do turn an LED on, you could turn it off after some short time period...
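The read-and-clear behaviour and the LED time-out can be modelled off-device as below. This is not the poster's sketch and not the ESP8266 core's implementation; `OverrunLatch` and `LedPulse` are hypothetical stand-ins, with `OverrunLatch::hasOverrun()` mimicking the semantics described for `Serial.hasOverrun()`.

```cpp
#include <cassert>
#include <cstdint>

// Model of a flag that reports true once per overrun: reading it clears it.
class OverrunLatch {
public:
    void setOverrun() { flag_ = true; }  // what a UART driver would do
    bool hasOverrun() {
        const bool was = flag_;
        flag_ = false;  // cleared by the read itself
        return was;
    }

private:
    bool flag_ = false;
};

// Keeps an indicator on for holdMs after the last trigger.
class LedPulse {
public:
    explicit LedPulse(uint32_t holdMs) : holdMs_(holdMs) {}
    void trigger(uint32_t nowMs) {
        on_ = true;
        since_ = nowMs;
    }
    bool isOn(uint32_t nowMs) {
        if (on_ && nowMs - since_ >= holdMs_) on_ = false;
        return on_;
    }

private:
    uint32_t holdMs_;
    uint32_t since_ = 0;
    bool on_ = false;
};

// Possible use in loop() (LED_PIN and the active level are assumptions):
//
//   if (Serial.hasOverrun()) led.trigger(millis());
//   digitalWrite(LED_PIN, led.isOn(millis()) ? LOW : HIGH);
```

Because the flag is cleared on every read, it should be polled from exactly one place; a second call elsewhere in the sketch could consume the event before the LED logic sees it.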
|
Looks like a solution to the initial problem was found. The UART code has also been significantly upgraded (esp. relating to overflow) since the opening. Closing for now, but if there is still something please do open a new issue and fill out the requested fields. |
I have a project where I use an ESP8266 as a serial-to-WiFi bridge. I send data serially from a Python script running on Windows 7 to the esp8266 (esp-01), which forwards it to a WiFi client that is also running on Windows 7. While sending a PDF file (79 KB) this way, I opted (because of limited RAM) for a method where I receive chunks of the file serially and send each one to WiFi immediately after receiving it. This repeats inside the loop until all the bytes of the file have been transferred from serial to WiFi. Doing this, the esp8266 (esp-01) initially lost some bytes from time to time. I figured there is a mismatch between the WiFi transfer speed and the serial baudrate, which overflows the serial buffer and loses a few bytes, corrupting the final output file. So I reduced the baudrate from 500000 to 250000 and things worked fine, which seemed to confirm my theory. The issue is that when I run the same code on the esp8266 (esp-12e), a baudrate of 250000 is not low enough and it still loses bytes. Reducing the baudrate further to 115200 makes things work almost fine (9 successes and 1 transfer with lost bytes), while at 9600 baud the file transfer succeeds 100% of the time. But any baudrate below 250000 is already too slow for my requirement.
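To put rough numbers on the overflow theory: if the sketch stops draining the serial buffer while it is busy with WiFi, the buffer fills in a time set by its size and the baudrate. The sketch below assumes 8N1 framing (10 bits per byte) and uses the 256-byte receive buffer size mentioned elsewhere in this thread; the function name is illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Microseconds of uninterrupted incoming data needed to fill a receive
// buffer of bufferBytes when nothing is reading from it.
uint32_t microsToFillBuffer(uint32_t bufferBytes, uint32_t baud,
                            uint32_t bitsPerFrame = 10) {
    assert(baud > 0);
    const uint64_t bits = static_cast<uint64_t>(bufferBytes) * bitsPerFrame;
    return static_cast<uint32_t>(bits * 1000000ULL / baud);
}
```

For a 256-byte buffer this gives 5120 microseconds at 500000 baud and 10240 microseconds at 250000 baud, so halving the baudrate doubles the time a WiFi write may take before bytes are dropped, which is consistent with the behaviour reported above.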
My questions are:
1- Is my theory correct? The theory is that serial data arrives too fast while forwarding each chunk to WiFi takes some time; combined with a slightly slower WiFi speed, this sometimes overflows the serial buffer and loses bytes. (Decreasing the baudrate did work on both boards, which sort of proves the point, but I am not sure.)
2- Why does the same code not work fine on the esp-12e? Is there any difference in their WiFi controller, its handling, or something else? I created a hotspot on both boards and checked their speed, which is the same (54 Mbps), so it may not be WiFi speed, but then what?
3- How to overcome this?
A simple version of the code I used for testing is as follows:
The circuit diagram of connection for esp-01 and esp-12e:
