Skip to content

WiFiClientSecure: Memory leak when an SSL Handshake fails #5781

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Phsacar opened this issue Oct 19, 2021 · 9 comments
Closed

WiFiClientSecure: Memory leak when an SSL Handshake fails #5781

Phsacar opened this issue Oct 19, 2021 · 9 comments
Labels
Area: BT&Wifi BT & Wifi related issues Status: Solved

Comments

@Phsacar
Copy link

Phsacar commented Oct 19, 2021

Hardware:

Board: ESP32 Dev Module
Core Installation version: x
IDE name: Platform.io
Flash Frequency: 40Mhz
PSRAM enabled: no
Upload Speed: 115200
Computer OS: Ubuntu

Description:

Hi,
I found a leak in ssl_client.cpp:
If the Handshake fails in start_ssl_client(), and an error is returned, the following section is never reached:

  if (rootCABuff != NULL) {
        mbedtls_x509_crt_free(&ssl_client->ca_cert);
    }

    if (cli_cert != NULL) {
        mbedtls_x509_crt_free(&ssl_client->client_cert);
    }

    if (cli_key != NULL) {
        mbedtls_pk_free(&ssl_client->client_key);
    }   

There does not seem to be a deconstructor in WiFiClientSecure that handles freeing the memory either.

Background: I am using a CA to connect to an encrypted mqtt server; only after freeing ssl_client->ca_cert myself I stopped having memory leaks.

@VojtechBartoska VojtechBartoska added the Area: BT&Wifi BT & Wifi related issues label Oct 20, 2021
@TD-er
Copy link
Contributor

TD-er commented Nov 4, 2021

Are you absolutely sure the leak stopped, after manually freeing ca_cert?

I'm looking into this too and I even copied WiFiClientSecure + ssl_client into new classes (with a prefix in the name) to be able to do simple tests by freeing everything that's being accessed via _init functions.
But on average I keep loosing ~1880 bytes per iteration.

Simple test here is to use a CA root certificate (I used the let's encrypt one, but it should not matter) to verify against a self-signed certificate.
I do get TLS error code: -9984 X509 - Certificate verification failed, e.g. CRL, CA or signature check failed , which is to be expected. But on each new attempt I loose about 1880 bytes of free heap memory.

I use it for accessing MQTT.
I don't call loadCACert as that's obviously leaking memory (see: #5826 ) but just call setCACert.
Also tested by deleting the WiFiClientSecure object (and implementing the calls to free in the destructor, just to be sure) but does not fix it.

@TD-er
Copy link
Contributor

TD-er commented Nov 4, 2021

OK, scrap that... Apparently I already fixed it myself, but have changed multiple things at once and did not notice it.

I had switched to no longer deleting the WiFiClientSecure, but still did create it and at the same time I had added to free the certificate data.
So now I also have the memory usage stable too.

I can confirm that ssl_client.h should have a destructor which needs to call the free functions of everything in the struct.

sslclient_context::~sslclient_context()
{
    mbedtls_x509_crt_free(&ca_cert);
    mbedtls_x509_crt_free(&client_cert);
    mbedtls_pk_free(&client_key);
}

The rest is already freed in the call to stop() in WiFiClientSecure, but maybe those should also be part of this destructor too as there might be other paths possible, which I don't see right now.

N.B. I also added this, but not sure if it is needed:

void ssl_init(sslclient_context *ssl_client)
{
    mbedtls_ssl_free(&ssl_client->ssl_ctx);
    mbedtls_ssl_config_free(&ssl_client->ssl_conf);
    mbedtls_ctr_drbg_free(&ssl_client->drbg_ctx);

    mbedtls_ssl_init(&ssl_client->ssl_ctx);
    mbedtls_ssl_config_init(&ssl_client->ssl_conf);
    mbedtls_ctr_drbg_init(&ssl_client->drbg_ctx);
}

Edit:
I also had made a constructor to make sure the structs were initialized correctly, or at least make sure the pointers are NULL.

sslclient_context::sslclient_context()
{
    memset(&ssl_ctx, 0, sizeof(ssl_ctx));
    memset(&ssl_conf, 0, sizeof(ssl_conf));
    memset(&drbg_ctx, 0, sizeof(drbg_ctx));
    memset(&entropy_ctx, 0, sizeof(entropy_ctx));
    memset(&ca_cert, 0, sizeof(ca_cert));
    memset(&client_cert, 0, sizeof(client_cert));
    memset(&client_key, 0, sizeof(client_key));
}

When I removed this in my cleanup process, the ssl_init call caused crashes.
So if you call the _free functions while there have been no calls to _init first, it may crash.

@ppescher
Copy link
Contributor

I found the same memory leak happens when verification fails, but possibly for many other reasons after review of the source code.
I'm currently testing a fix, I will be back with my findings later.

@sguarin
Copy link

sguarin commented Nov 27, 2021

The problem I was having it's related to call malloc again on WiFiClientSecure without freeing the previosly stored certificate.
Related issue with temporary fix here:
#5826

@TD-er
Copy link
Contributor

TD-er commented Nov 27, 2021

The issue you're trying to fix is just when you stream the certificate.
The issue I referred to is related to the client context not being initialized and cleared well when verification fails and you try again to connect.

@ppescher
Copy link
Contributor

ppescher commented Nov 29, 2021

Disregard the first PR, I messed up my forked repo (by the way, I did not find a way to delete the PR completely).

In my tests the proposed fix gets rid of memory leaks when the secure connection attempt fails. I simulated troubles in the connection by randomly disconnecting the server side: if this happened during handshake I got memory leaks up to a point where the firmware just fails because of memory allocation errors. With this fix it does not happen anymore.

@stale stale bot added the wontfix label Apr 17, 2022
@espressif espressif deleted a comment from stale bot Apr 20, 2022
@VojtechBartoska
Copy link
Contributor

Can this be closed as mentioned PR was merged already?

@VojtechBartoska VojtechBartoska added Resolution: Awaiting response Waiting for response of author and removed wontfix labels Apr 20, 2022
@ppescher
Copy link
Contributor

@VojtechBartoska I'm not the op, but as far as I'm concerned this is fixed and you can close the issue.

@VojtechBartoska
Copy link
Contributor

Closing as solved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: BT&Wifi BT & Wifi related issues Status: Solved
Projects
None yet
Development

No branches or pull requests

5 participants