Skip to content

String clear() does not free memory, only set length(0) #8485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
4 of 6 tasks
TD-er opened this issue Feb 10, 2022 · 22 comments
Open
4 of 6 tasks

String clear() does not free memory, only set length(0) #8485

TD-er opened this issue Feb 10, 2022 · 22 comments

Comments

@TD-er
Copy link
Contributor

TD-er commented Feb 10, 2022

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it.
  • I have filled out all fields below.

Platform

  • Hardware: [ESP-12|ESP-01|ESP-07|ESP8285 device|other]
  • Core Version: [latest git]

Problem Description

The String::clear() function does only set the length to 0, it does not free the memory.
The same for assigning an empty string, as this only checks if the reserved capacity is enough.

This means it is quite tricky when clearing a string that's no longer needed.
For example a buffer I'm using, which is an array of String.

The only way to really free the memory is by assigning a new string which will then be copied using the move operator, like this:

Message[read_idx] = String();

It does not really make sense to only set the length of the string to 0, as that's not what clear() suggests.

Or at least have some function like free() to actually free the allocated memory.

@mcspr
Copy link
Collaborator

mcspr commented Feb 10, 2022

This means it is quite tricky when clearing a string that's no longer needed.
For example a buffer I'm using, which is an array of String.

The only way to really free the memory is by assigning a new string which will then be copied using the move operator, like this:

Message[read_idx] = String();

Maybe I am missing some other use case... but what you are describing is destructor of the String object. In case of the container, we remove the element to free the resource. In case that you describe above, operation does make sense though? It discards the original object, replacing it with something completely new.

@TD-er
Copy link
Contributor Author

TD-er commented Feb 10, 2022

Yep, that's exactly my point why it makes no sense this seems to be the only way to actually free the allocated memory.

String foo;
String emptyString;
foo.reserve(100);
foo.clear(); // Still allocates the 100 bytes memory
foo = "";  // Still allocates the 100 bytes memory
foo = F("");  // Still allocates the 100 bytes memory
foo = emptyString;  // Still allocates the 100 bytes memory
foo = String(); // uses the move copy constructor String::String(String&& other) to free the heap memory.

My use case is a buffer object, which keeps log strings.

    String Message[LOG_STRUCT_MESSAGE_LINES];
    unsigned long timeStamp[LOG_STRUCT_MESSAGE_LINES] = {0};

After I sent the log messages to the webserver, I want to free them.
Or if the messages have not been read after a while.
But calling .clear() does not free the memory, and thus these String objects keep growing to whatever was once the largest string it had to keep.

@mcspr
Copy link
Collaborator

mcspr commented Feb 10, 2022

If these are line entries, have you considered std::vector<String> with something that checks upper boundary of size() < LOG_STRUCT_MESSAGE_LINES?
What would be the alternative of clear() then, when we do want to persist the buffer?

@TD-er
Copy link
Contributor Author

TD-er commented Feb 10, 2022

Maybe .clear() is still a valid option, instead of = "";
But then you should need a .free() to make clear what is intended.

And using std::vector<String> is not really an option for a cyclic buffer.
You still need to assign a new String object to it, as destructing a member of a std::vector container may cause strange behavior.
Then it is better to consider to make a double ended queue or a list of String.

@mcspr
Copy link
Collaborator

mcspr commented Feb 10, 2022

struct Blah : public String {
    using String::String;
    using String::invalidate;
};

?

In case of the queue I'd still think about actually destroying the String object.

@TD-er
Copy link
Contributor Author

TD-er commented Feb 10, 2022

In case of the queue I'd still think about actually destroying the String object.

Yep, so that could be an alternative to overcome the shortcoming of the String object.
Not sure if I like to inherit from String as that also isn't free regarding resources.
Have to check to see if String has a virtual destructor too, or else we're getting lots of other issues :)

@mcspr
Copy link
Collaborator

mcspr commented Feb 10, 2022

Not sure if I like to inherit from String as that also isn't free regarding resources.

Should not be a problem? We are not introducing any additional code, just add an alias to call the (sort-of internal) method.

Have to check to see if String has a virtual destructor too, or else we're getting lots of other issues :)

No virtual dtor, but it also should not matter in plain inheritance. And note that it also could be solved by having String member in the class that does have such dtor

@TD-er
Copy link
Contributor Author

TD-er commented Feb 10, 2022

Either way, wouldn't it be better if the actual String class has a function to properly free the allocated heap memory? :)

@earlephilhower
Copy link
Collaborator

I think that @TD-er has a point. Right now a String is always monotonically increasing in size. While in most cases they're short lived, in his code they're obviously not.

Having the String free non-SSO memory and revert to a SSO-based empty string seems like it follows the "principle of least surprise".

@mcspr
Copy link
Collaborator

mcspr commented Feb 10, 2022

So just make invalidate() public?
I'd rather String did not do the opposite and magically shrinked when changing internal length

@earlephilhower
Copy link
Collaborator

I would not expose invalidate, just make the constant c-str assignment and String assignment smart enough to detect strlen()==0 and basically invalidate automatically.

@mcspr
Copy link
Collaborator

mcspr commented Feb 10, 2022

Not to go in a loop, but = String(); does exactly this without any additional code and / or logic in any of the methods :/ Move is really cheap, I really don't understand the concern
Maybe I need to sleep on this, though.

@earlephilhower
Copy link
Collaborator

While I agree x=String() will free the memory, to me it seems a little obscure and my first instinct would be to do x="" or x.clear().

To make it more concrete, I've pushed #8486 about what I was talking about. Maybe @TD-er can see if this is what he is describing?

earlephilhower added a commit to earlephilhower/Arduino that referenced this issue Feb 10, 2022
Fixes esp8266#8485

Whenever the empty C string is assigned or the .clear() method
called, free any heap allocated with the String.
@TD-er
Copy link
Contributor Author

TD-er commented Feb 10, 2022

I've done some thinking too and maybe the clear() can be as it was, to clear the internal string.
But assigning a "" should release the memory as that's also found in lots of library code, so others seem to think so too.

Can you also add a free() function, preferrably next to the clear() so it will be clear to the user what will be cleared when they call clear :)

invalidate may not be the most obvious function name for a public function.

I think myString.free(); will result in smaller compiled code compared to myString = ""; (especially when implemented in the .cpp, not in the .h)

@TD-er
Copy link
Contributor Author

TD-er commented Feb 10, 2022

Another thing I did realize today, when going through my code.

Since = String() does call the String::String(String&& other) constructor, it also makes a big difference in these examples:

String getText() { return F(""); }

String myString;
myString.reserve(100);

myString = F("");  // still allocated 100 bytes
myString = getText(); // No heap allocated anymore

Lots of code does something like this:

String myString;
myString.reserve(100);
myString = F("MyLog: ");
myString += value;
...

But then it really makes a difference when the first assignment is the return String object of a function call or something else like a String object, or whatever you can assign to a String.

@d-a-v
Copy link
Collaborator

d-a-v commented Feb 10, 2022

Current API (.clear(), = "") may already be known as harmless regarding pre-reservation and used like this:

String str;
str.reserve(100);
while (some_condition) {
    str.clear();  // or str = "";
    for (int i = 0; i < N; i++) {
        str += 'x';
    }
    ....
}

Rather than String::free(), what about a String::reset() ?

@d-a-v
Copy link
Collaborator

d-a-v commented Feb 10, 2022

But then it really makes a difference when the first assignment is the return String object of a function call or something else like a String object, or whatever you can assign to a String.

myString.reserve(100);
myString = F("MyLog: ");

it does not invalidate the pre-reservation.

myString = nullptr; currently does.

@TD-er
Copy link
Contributor Author

TD-er commented Feb 10, 2022

But then it really makes a difference when the first assignment is the return String object of a function call or something else like a String object, or whatever you can assign to a String.

myString.reserve(100);
myString = F("MyLog: ");

it does not invalidate the pre-reservation.

myString = nullptr; currently does.

Yep, that's my point.

const __FlashStringHelper * myFlash() { return F("MyFlash"); }
String myString() { return F("MyString"); }

String str;
str.reserve(100);

str = myFlash(); // Still 100 bytes allocated;
str += myString(); // Still 100 bytes allocated;
str = myString(); // Work done by reserve is undone.

@TD-er
Copy link
Contributor Author

TD-er commented Feb 10, 2022

Rather than String::free(), what about a String::reset() ?

I'm in for the best name.
If similar classes use reset() then that should be it.

std::string does have a clear() function which does also not explicitly require to change the capacity.

Note from that page:

Unlike for std::vector::clear, the C++ standard does not explicitly require that capacity is unchanged by this function, but existing implementations do not change capacity. This means that they do not release the allocated memory (see also shrink_to_fit).

But it also seems to have a shrink_to_fit function since C++11.
However, calling clear() followed by shrink_to_fit() is not really an improvement.

So then a reset() does also make sense.

@TD-er
Copy link
Contributor Author

TD-er commented Feb 11, 2022

Just ran into an issue on ESP32 running the latest IDF4.4 Arduino v2.0.2 code
I accidentally assigned a string to 2 other strings using std::move like this:

String foo, bar;

void myFunction(String&& str) {
  foo = std::move(str);
  bar = std::move(str);
}

N.B. these strings may already contain some data

This leads to crashes on ESP32 and not on ESP8266, but will show unintended behavoir (aka: bug)

I looked at the code on both ends and the String::move differs quite a lot on both platforms.
So maybe it is (still?) a bug on ESP32 in the move code?

ESP32:
https://github.com/espressif/arduino-esp32/blob/c4954dd582f5bde478316ddf184ded6979e812b3/cores/esp32/WString.cpp#L245-L272

ESP8266:

void String::move(String &rhs) noexcept {
invalidate();
sso = rhs.sso;
rhs.init();
}

IMHO the code on the ESP32 is not really a move, as it also may keep the allocated space in the string and thus probably still keeps the memory allocated when you assign as discussed here as work-around ( = String() )
The check for buffer() does not take into account to match the SSO flag of the rhs and thus it tries to copy its content.
But the rhs was already "invalidated" and thus rhs.len() = 0, so it tries to memmove on the rhs.buffer().

TL;DR
The move operator is for sure broken on the ESP32 side, if you assign an already moved String (or invalidated String) to a String object that already contains some string which does not fit in the SSO buffer.

@everslick
Copy link
Contributor

everslick commented Sep 24, 2023

I'm late to the party as it seems, but I want to share my 2 cents as well. I believe assigning "" to a string should NOT release its internal buffer, because then there is no way to reset it's content without going through a complete free() -> malloc() cycle with all it's CPU overhead. Additionally it IS quite surprising that assignment suddenly shrinks the capacity, when it is only supposed to increase on assignment. So I'm fine with clear() and = "" having different semantics, even though I prefer a new method like reset() because it does not change behavior of existing code.

@erkr
Copy link

erkr commented Nov 24, 2024

I don’t now why but for me use case (construct web pages) String.clear() does exactly what I need. It clears the string without losing the reserved space.
Assigning an empty string doesn’t work the same! That seems to return before the cleanup is completed. So new values assigned directly after setting the empty string get lost ?!
Please don’t make clear() a free() without fixed the empty string issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants