MODBUS communication randomly broken due to misbehaving Serial.flush() function #5877

Closed
garageeks opened this issue Nov 11, 2021 · 17 comments · Fixed by #6026

@garageeks

garageeks commented Nov 11, 2021

Using arduino-esp32 1.0.6, the library ModbusMaster (https://github.com/4-20ma/ModbusMaster) works as intended.
Using arduino-esp32 2.0.0 or 2.0.1 it randomly stops communicating with the slave Modbus device.

I investigated with a scope and these are my findings:
RED LINE = TX signal, BLUE LINE = TX enable signal (tied to DE/NOT RE pins)

  1. Successful transmission - the device replies as expected (not visible here, as I only have two channels)
    picoscope-success

  2. First failed transmission - enable signal stays high for a long time, preventing reception of the device answer
    picoscope-firstfail

  3. Subsequent failed transmissions - the enable signal goes LOW prematurely. The time it stays HIGH varies.
    picoscope-fail1
    picoscope-fail2
    picoscope-fail3
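
For reference, the time the enable signal should stay HIGH can be estimated from the frame length and the line settings. The sketch below is a minimal, hypothetical helper (the function names are not part of ModbusMaster or arduino-esp32). It assumes an 8-byte Modbus RTU request and 8N1 framing, i.e. 10 bits per character.

```cpp
#include <cassert>
#include <cstdint>

// Bits on the wire per character: 1 start bit + data bits + optional parity bit + stop bits.
inline uint32_t bitsPerChar(uint32_t dataBits, bool parity, uint32_t stopBits) {
    return 1 + dataBits + (parity ? 1 : 0) + stopBits;
}

// Time in microseconds needed to shift `frameBytes` characters out at `baud`,
// rounded up to the next whole microsecond.
inline uint64_t frameTimeMicros(uint32_t frameBytes, uint32_t baud, uint32_t charBits) {
    assert(baud > 0);
    const uint64_t totalBits = static_cast<uint64_t>(frameBytes) * charBits;
    return (totalBits * 1000000ULL + baud - 1) / baud;
}
```

Under these assumptions an 8-byte request takes about 695 µs at 115200 baud and about 4167 µs at 19200 baud, so an enable pulse noticeably shorter than that would cut the frame off.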

When it transmits a packet, the library sets the enable signal HIGH in the preTransmission callback, writes the data to the serial port, flushes the serial port, and then sets the enable signal LOW in the postTransmission callback. Pretty straightforward. I assume Serial TX is buffered and Serial.flush() blocks until the buffer is empty.
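
The sequence described above can be sketched roughly as follows. This is a plain C++ model, not the ModbusMaster source: `MockRs485Port` and `transmitFrame` are hypothetical names that only record the order of operations, so the sketch runs without any hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a UART plus the DE/!RE pin; it records every action in order.
struct MockRs485Port {
    std::vector<std::string> log;

    void setDriverEnable(bool high) { log.push_back(high ? "DE=HIGH" : "DE=LOW"); }

    void write(const uint8_t* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        log.push_back("write " + std::to_string(len) + " bytes");
    }

    // On real hardware this is the call that has to block until the last stop bit is sent.
    void flush() { log.push_back("flush"); }
};

// Order of operations for one half-duplex transmission.
inline void transmitFrame(MockRs485Port& port, const std::vector<uint8_t>& frame) {
    port.setDriverEnable(true);              // role of the preTransmission callback
    port.write(frame.data(), frame.size());  // queue the request bytes
    port.flush();                            // wait for the transmitter to finish
    port.setDriverEnable(false);             // role of the postTransmission callback
}
```

The whole scheme depends on flush() returning only after the transmitter is idle. If it returns earlier, the driver is disabled while bytes are still going out.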

modbus-fail

Given the recent big overhaul of HardwareSerial and some other reported issues, I think there is a bug in the new implementation in the 2.0.0 and 2.0.1 releases.

This is similar to #4603

@SuGlider
Collaborator

@garageeks
So that I can assist you, could you please run this code to get more information about your board and development environment? If possible, run it with cores 1.0.6, 2.0.0 and 2.0.1 from your main development environment and report the results back here.

  Serial.printf("Internal Total heap %d, internal Free Heap %d\n", ESP.getHeapSize(), ESP.getFreeHeap());
  Serial.printf("SPIRam Total heap %d, SPIRam Free Heap %d\n", ESP.getPsramSize(), ESP.getFreePsram());
  Serial.printf("ChipRevision %d, Cpu Freq %d, SDK Version %s\n", ESP.getChipRevision(), ESP.getCpuFreqMHz(), ESP.getSdkVersion());
  Serial.printf("Flash Size %d, Flash Speed %d\n", ESP.getFlashChipSize(), ESP.getFlashChipSpeed());

@SuGlider SuGlider self-assigned this Nov 12, 2021
@garageeks
Author

Hi @SuGlider , shall I run the code standalone or while running my MODBUS application?

I always print the free heap to detect potential memory leaks, so I have already some information from my application:
on 1.0.6 I get Free heap: 247744 - Max block: 113792
on 2.0.0 I get Free heap: 181063 - Max block: 65524

The ESP32 module I'm using is ESP32-WROOM-32E

PS: on 2.0.0 the free heap and max block values are basically the same when MODBUS is working and when it stops working
PS2: the code is not exactly identical, but the code compiled on 1.0.6 has more features compared to 2.0.0. I can compare same code and report.

@SuGlider
Collaborator

@garageeks
You can run it stand alone with 2.0.x.

@garageeks
Author

Sorry, I forgot my programmer at home, so I could only flash new firmware by OTA. Here are the results from within my application.

1.0.6 - MODBUS works non-stop for 24+hrs
Internal Total heap 347228, internal Free Heap 249100
SPIRam Total heap 0, SPIRam Free Heap 0
ChipRevision 3, Cpu Freq 240, SDK Version v3.3.5-1-g85c43024c
Flash Size 16777216, Flash Speed 40000000

2.0.0 - MODBUS works but fails after a while
Internal Total heap 347091, internal Free Heap 283303
SPIRam Total heap 0, SPIRam Free Heap 0
ChipRevision 3, Cpu Freq 240, SDK Version v4.4-dev-2313-gc69f0ec32
Flash Size 16777216, Flash Speed 40000000

2.0.1 - MODBUS doesn't work at all from the start
Internal Total heap 310792, internal Free Heap 225908
SPIRam Total heap 0, SPIRam Free Heap 0
ChipRevision 3, Cpu Freq 240, SDK Version v4.4-dev-3569-g6a7d83af19-dirty
Flash Size 16777216, Flash Speed 40000000

@SuGlider
Collaborator

@garageeks
Please try this example with Core 2.0.0+
You can change UART port or Modbus slave ID as needed.

Please let me know if uart_set_mode(UART_PORT_NUM, UART_MODE_RS485_HALF_DUPLEX) solves this issue.

#include <ModbusMaster.h>

// necessary to define UART_MODE_RS485_HALF_DUPLEX
#include <driver/uart.h>


// instantiate ModbusMaster object
ModbusMaster node;

void setup()
{
  // use Serial (port 0); initialize Modbus communication baud rate
  Serial.begin(19200);

  // communicate with Modbus slave ID 2 over Serial (port 0)
  node.begin(2, Serial);

  // Set RS485 half duplex mode on UART_0. This should make flush() wait until all bits have been sent
  ESP_ERROR_CHECK(uart_set_mode(0, UART_MODE_RS485_HALF_DUPLEX));
}

@garageeks
Author

Dear @SuGlider ,
In my application I added your suggestion as follows

node.begin(1, Serial2);
ESP_ERROR_CHECK(uart_set_mode(UART_NUM_2,UART_MODE_RS485_HALF_DUPLEX));

It didn't accept an integer as first parameter of the function. I assumed UART_NUM_2 is the valid one for Serial2.

However on 2.0.1 it still doesn't communicate with the MODBUS device.

@VojtechBartoska VojtechBartoska added the Status: Test needed Issue needs testing label Nov 18, 2021
@ignasurba

I can confirm this is an issue.

image

Part of the packet is sent after the flush returns and DE is disabled.
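
One way such a symptom can arise in general is a flush that only waits for a software buffer to drain while bytes are still sitting in the hardware FIFO. The model below is purely illustrative and is not a description of the actual arduino-esp32 implementation. `MockUartTx`, its two flush variants and the FIFO depth of 4 are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical two-stage transmitter: a software ring buffer feeding a hardware FIFO.
struct MockUartTx {
    std::deque<uint8_t> ringBuffer;   // bytes queued by write()
    std::deque<uint8_t> hwFifo;       // bytes waiting to be shifted out
    std::size_t fifoDepth;
    std::size_t bytesOnWire = 0;      // bytes that have fully left the transmitter

    explicit MockUartTx(std::size_t depth) : fifoDepth(depth) { assert(depth > 0); }

    void write(const uint8_t* data, std::size_t len) {
        ringBuffer.insert(ringBuffer.end(), data, data + len);
    }

    // One character time: shift one byte out, then refill the FIFO from the ring buffer.
    void tick() {
        if (!hwFifo.empty()) {
            hwFifo.pop_front();
            ++bytesOnWire;
        }
        while (hwFifo.size() < fifoDepth && !ringBuffer.empty()) {
            hwFifo.push_back(ringBuffer.front());
            ringBuffer.pop_front();
        }
    }

    // Returns as soon as the software buffer is empty: too early for RS-485.
    void flushRingBufferOnly() {
        while (!ringBuffer.empty()) tick();
    }

    // Returns only when nothing is left anywhere in the transmit path.
    void flushUntilIdle() {
        while (!ringBuffer.empty() || !hwFifo.empty()) tick();
    }
};
```

In this model, writing 8 bytes and calling `flushRingBufferOnly()` returns with 4 bytes still in the FIFO. Disabling the driver at that point would truncate the frame, whereas `flushUntilIdle()` returns only after all 8 bytes are on the wire.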

This worked for me:
ESP_ERROR_CHECK(uart_set_mode(UART_NUM_1, UART_MODE_RS485_HALF_DUPLEX));
(using Serial1)

@garageeks
Author

garageeks commented Dec 3, 2021

Hi @ignasurba, thank you for your feedback. I tried it based on @SuGlider's suggestion, but it didn't work. I can give it another try.
Where did you place this instruction? After the node.begin instruction?

@ignasurba

It looks like this:

  Serial1.begin(MODBUS_BAUDRATE, SERIAL_8N1, MODBUS_RX, MODBUS_TX);
  node.begin(1, Serial1);
  node.preTransmission(modbus_pre);
  node.postTransmission(modbus_post);
  ESP_ERROR_CHECK(uart_set_mode(UART_NUM_1, UART_MODE_RS485_HALF_DUPLEX));

Seems to work so far without issues.

@garageeks
Author

I replaced Serial2 with Serial1 and UART_NUM_2 with UART_NUM_1 and it works!

So apparently there is some other bug with the Serial2 / UART_NUM_2 combination.

The previous peripheral on Serial1 is happy to work on Serial2, so I can leave my test bench running for a while and see its stability.

This code doesn't work:

	Serial2.begin(115200, SERIAL_8N1, 16, 17);		//Serial2 MODBUS interface
	node.begin(1, Serial2);
	// Callbacks 
	node.preTransmission(preTransmission);
	node.postTransmission(postTransmission);
	ESP_ERROR_CHECK(uart_set_mode(UART_NUM_2,UART_MODE_RS485_HALF_DUPLEX));

@mfriedlvaricon

Hello!

We have the same issue: Serial2.flush randomly returns before all the data has been sent. This results in a massive loss of data and a non-working product.
In the first picture you can see a valid output (the red line marks roughly when Serial.flush completes).
In the second picture flush returns way too early, which actually destroys the message.
CONS3-Busfehler - kein Abbruch_modifiziertMF

CONS3-Busfehler Abbruch nach 2,5ms mit DE vom ESP32

All of this happens totally randomly. Sometimes it works, sometimes it keeps failing.
(EO = EnableOutput, DO = DisableOutput)
grafik

Hope this gets fixed soon, as our products rely on this...

Thanks!
Markus

@TD-er
Contributor

TD-er commented Dec 15, 2021

Not sure if it will get noticed, as both this issue and the PR are now closed.
But I really wonder whether this is a good fix.
Sure, it can be a good fix when using Modbus, but I think it may make it impossible to use the serial port for bi-directional communication such as showing a log and receiving commands, e.g. typical terminal use cases.

See: #6026 (comment)

@VojtechBartoska
Contributor

@TD-er Thanks for your comment. We will take a look at your note and reconsider whether this fix is correct.

@SuGlider
Collaborator

> Not sure if it will get noticed, as both this issue and the PR are now closed. But I really wonder whether this is a good fix. Sure, it can be a good fix when using Modbus, but I think it may make it impossible to use the serial port for bi-directional communication such as showing a log and receiving commands, e.g. typical terminal use cases.
>
> See: #6026 (comment)

@TD-er

This fix neither prevents the UART from working in full duplex nor forces it into half duplex. It's just the name of a feature that IDF uses in this driver, with its own internal RS-485 functionality, which is not used in the ESP32 Arduino implementation.

Please read more in #6026 (comment)

@TD-er
Contributor

TD-er commented Dec 18, 2021

Hmm, that could benefit from a comment line explaining what is intended, as the name of that setting does not really describe what it apparently does.
I am now browsing through various RS485 schematics to see if there are use cases where the RX will receive whatever is sent via TX, but so far I have found none.
The "half duplex" in the name suggests that the RX is not listening while sending.
Typical RS485 communication, such as with a MAX485, prevents the RX from receiving what is sent via TX, so it doesn't matter if the RX data is still being read. But if there is another common setup, then any incoming data on the RX line has to be ignored while sending (which is what I expected when reading the "Half Duplex" part of the setting).

Still, I do think it would be a good idea to make the mode configurable, for example in the uartBegin function, where it is now set to UART_MODE_RS485_HALF_DUPLEX.

@SuGlider
Collaborator

@TD-er

Thanks for your suggestions and care.

Maybe, as you said, a comment before that line of code in the UART HAL layer would avoid confusion.

Thank you!

@SuGlider
Collaborator

SuGlider commented Jan 13, 2022

@garageeks @ignasurba @TD-er
The fix has been revised and no longer has anything to do with RS-485.
Please look at #6133

Thanks for your support!
