Skip to content

FS: Performance improvements in vfs_api #7541

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 19, 2022
Merged

Conversation

BlueAndi
Copy link
Contributor

Performance improvements by replacing sprintf with strcpy/strcat. Additional avoid creating temporary String objects.

@BlueAndi BlueAndi mentioned this pull request Nov 30, 2022
1 task
@VojtechBartoska VojtechBartoska added the Area: Performance Issue related to performance problems and improvements label Dec 1, 2022
Copy link
Collaborator

@mrengineer7777 mrengineer7777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is functionally equivalent but better performance. No obvious bugs introduced. LGTM.

@P-R-O-C-H-Y P-R-O-C-H-Y self-assigned this Dec 2, 2022
@P-R-O-C-H-Y P-R-O-C-H-Y added the Area: Libraries Issue is related to Library support. label Dec 2, 2022
@P-R-O-C-H-Y P-R-O-C-H-Y added this to the 2.0.6 milestone Dec 2, 2022
@P-R-O-C-H-Y
Copy link
Member

Hi @BlueAndi @mrengineer7777, I did a quick test and creating 24 dirs with this PR take around 5400ms (was around 2000 ms before).

Do yo have any data that the speed has improved? Or some sketch to test it myself? Thanks

@mrengineer7777
Copy link
Collaborator

@P-R-O-C-H-Y I have not tested the code, only reviewed proposed changes. Swapping sprintf for strcpy + strcat should be faster. That said, I did read this weekend that strcat rescans the target string before appending. The fastest implementation here would be to use pointers.

e.g. for the mkdir function.
Original:

    char * temp = (char *)malloc(strlen(fpath)+strlen(_mountpoint)+1);
    if(!temp) {
        log_e("malloc failed");
        return false;
    }
    sprintf(temp,"%s%s", _mountpoint, fpath);

PR:

    char * temp = (char *)malloc(strlen(fpath)+strlen(_mountpoint)+1);
    if(!temp) {
        log_e("malloc failed");
        return false;
    }
    strcpy(temp, _mountpoint);
    strcat(temp, fpath);

Faster (theoretical, untested):

    size_t mountlen = strlen(_mountpoint);
    size_t fpathlen = strlen(fpath);
    char *temp = (char *)malloc(mountlen+fpathlen+1);
    if(!temp) {
        log_e("malloc failed");
        return false;
    }
    char *ptr = temp;
    memcpy(ptr, _mountpoint, mountlen);
    ptr += mountlen;
    memcpy(ptr, _fpath, fpathlen);
    ptr += fpathlen;
    *ptr = '\0'; //Null terminate (but don't advance pointer)

Result will be in "temp". As a bonus you can calculate the string length as temp-ptr.
Low level code like this is fast but easy to mess it. It requires careful testing.

@BlueAndi
Copy link
Contributor Author

BlueAndi commented Dec 6, 2022

@P-R-O-C-H-Y I measured by walking recursively through the directories and files (LittleFS) by using open() and openNextFile() in a existing project. I will move it to a separate sketch and measure creating directories and files too.

@BlueAndi
Copy link
Contributor Author

BlueAndi commented Dec 8, 2022

@P-R-O-C-H-Y Can you please test with it on your side? https://github.com/BlueAndi/fsTest

Original:

[  2541][I][main.cpp:101] setup(): [main] Format filesystem.
[  4689][I][main.cpp:122] setup(): [main] Setup finished.
[  4689][I][main.cpp:138] loop(): [main] Creating directories and files ...
[  4689][I][main.cpp:139] loop(): [main] Directories per level: 5
[  4695][I][main.cpp:140] loop(): [main] Directory depth      : 5
[  4701][I][main.cpp:141] loop(): [main] Files per directory  : 5
[ 10825][I][main.cpp:152] loop(): [main] --> Duration: 6 s 118 ms
[ 10825][I][main.cpp:156] loop(): [main] Walking through directories recursively ...
[ 14471][I][main.cpp:161] loop(): [main] --> Duration: 3 s 644 ms
[ 14471][I][main.cpp:169] loop(): [main] sprintf test ...
[ 14540][I][main.cpp:177] loop(): [main] --> Duration: 0 s 69 ms
[ 14540][I][main.cpp:181] loop(): [main] strcpy/strcat test ...
[ 14558][I][main.cpp:190] loop(): [main] --> Duration: 0 s 18 ms
[ 14558][I][main.cpp:194] loop(): [main] by hand test ...
[ 14575][I][main.cpp:220] loop(): [main] --> Duration: 0 s 17 ms
[ 14576][I][main.cpp:224] loop(): [main] Reset to restart.

Improved:

[  2552][I][main.cpp:101] setup(): [main] Format filesystem.
[  4701][I][main.cpp:122] setup(): [main] Setup finished.
[  4701][I][main.cpp:138] loop(): [main] Creating directories and files ...
[  4701][I][main.cpp:139] loop(): [main] Directories per level: 5
[  4707][I][main.cpp:140] loop(): [main] Directory depth      : 5
[  4713][I][main.cpp:141] loop(): [main] Files per directory  : 5
[ 10268][I][main.cpp:152] loop(): [main] --> Duration: 5 s 549 ms
[ 10268][I][main.cpp:156] loop(): [main] Walking through directories recursively ...
[ 13450][I][main.cpp:161] loop(): [main] --> Duration: 3 s 180 ms
[ 13450][I][main.cpp:169] loop(): [main] sprintf test ...
[ 13519][I][main.cpp:177] loop(): [main] --> Duration: 0 s 69 ms
[ 13519][I][main.cpp:181] loop(): [main] strcpy/strcat test ...
[ 13537][I][main.cpp:190] loop(): [main] --> Duration: 0 s 18 ms
[ 13537][I][main.cpp:194] loop(): [main] by hand test ...
[ 13555][I][main.cpp:220] loop(): [main] --> Duration: 0 s 18 ms
[ 13555][I][main.cpp:224] loop(): [main] Reset to restart.

I hope I did not any mistake!? :-)

@P-R-O-C-H-Y
Copy link
Member

P-R-O-C-H-Y commented Dec 13, 2022

Hi @BlueAndi, I have tested your sketch with ESP32-DevKitC for LittleFS lib and edited sketch for testing on ESP32-WroverKit with SD Library. These are the results.

LITTLEFS library

ORIGINAL:

[  5391][I][fs_test.ino:101] setup(): [main] Format filesystem.
[ 10787][I][fs_test.ino:122] setup(): [main] Setup finished.
[ 10787][I][fs_test.ino:138] loop(): [main] Creating directories and files ...
[ 10788][I][fs_test.ino:139] loop(): [main] Directories per level: 5
[ 10794][I][fs_test.ino:140] loop(): [main] Directory depth      : 5
[ 10800][I][fs_test.ino:141] loop(): [main] Files per directory  : 5
[ 16480][I][fs_test.ino:152] loop(): [main] --> Duration: 5 s 674 ms
[ 16480][I][fs_test.ino:156] loop(): [main] Walking through directories recursively ...
[ 19387][I][fs_test.ino:161] loop(): [main] --> Duration: 2 s 904 ms
[ 19387][I][fs_test.ino:169] loop(): [main] sprintf test ...
[ 19456][I][fs_test.ino:177] loop(): [main] --> Duration: 0 s 69 ms
[ 19456][I][fs_test.ino:181] loop(): [main] strcpy/strcat test ...
[ 19474][I][fs_test.ino:190] loop(): [main] --> Duration: 0 s 18 ms
[ 19474][I][fs_test.ino:194] loop(): [main] by hand test ...
[ 19492][I][fs_test.ino:220] loop(): [main] --> Duration: 0 s 18 ms
[ 19492][I][fs_test.ino:224] loop(): [main] Reset to restart.

With changes from PR:

[  5379][I][fs_test.ino:101] setup(): [main] Format filesystem.
[ 10736][I][fs_test.ino:122] setup(): [main] Setup finished.
[ 10736][I][fs_test.ino:138] loop(): [main] Creating directories and files ...
[ 10737][I][fs_test.ino:139] loop(): [main] Directories per level: 5
[ 10743][I][fs_test.ino:140] loop(): [main] Directory depth      : 5
[ 10749][I][fs_test.ino:141] loop(): [main] Files per directory  : 5
[ 16428][I][fs_test.ino:152] loop(): [main] --> Duration: 5 s 673 ms
[ 16428][I][fs_test.ino:156] loop(): [main] Walking through directories recursively ...
[ 19346][I][fs_test.ino:161] loop(): [main] --> Duration: 2 s 915 ms
[ 19346][I][fs_test.ino:169] loop(): [main] sprintf test ...
[ 19415][I][fs_test.ino:177] loop(): [main] --> Duration: 0 s 68 ms
[ 19415][I][fs_test.ino:181] loop(): [main] strcpy/strcat test ...
[ 19434][I][fs_test.ino:190] loop(): [main] --> Duration: 0 s 18 ms
[ 19434][I][fs_test.ino:194] loop(): [main] by hand test ...
[ 19451][I][fs_test.ino:220] loop(): [main] --> Duration: 0 s 17 ms

SD LIBRARY

ORIGINAL:

[    74][I][fs_test.ino:161] loop(): [main] Creating directories and files ...
[    74][I][fs_test.ino:162] loop(): [main] Directories per level: 5
[    75][I][fs_test.ino:163] loop(): [main] Directory depth      : 5
[    82][I][fs_test.ino:164] loop(): [main] Files per directory  : 5
[  4647][I][fs_test.ino:175] loop(): [main] --> Duration: 4 s 559 ms
[  4647][I][fs_test.ino:179] loop(): [main] Walking through directories recursively ...
[  7941][I][fs_test.ino:184] loop(): [main] --> Duration: 3 s 291 ms
[  7941][I][fs_test.ino:192] loop(): [main] sprintf test ...
[  8010][I][fs_test.ino:200] loop(): [main] --> Duration: 0 s 69 ms
[  8010][I][fs_test.ino:204] loop(): [main] strcpy/strcat test ...
[  8028][I][fs_test.ino:213] loop(): [main] --> Duration: 0 s 18 ms
[  8028][I][fs_test.ino:217] loop(): [main] by hand test ...
[  8046][I][fs_test.ino:243] loop(): [main] --> Duration: 0 s 18 ms
[  8046][I][fs_test.ino:247] loop(): [main] Reset to restart

With changes from PR:

[   100][I][fs_test.ino:161] loop(): [main] Creating directories and files ...
[   100][I][fs_test.ino:162] loop(): [main] Directories per level: 5
[   102][I][fs_test.ino:163] loop(): [main] Directory depth      : 5
[   108][I][fs_test.ino:164] loop(): [main] Files per directory  : 5
[  7214][I][fs_test.ino:175] loop(): [main] --> Duration: 7 s 100 ms
[  7214][I][fs_test.ino:179] loop(): [main] Walking through directories recursively ...
[ 10375][I][fs_test.ino:184] loop(): [main] --> Duration: 3 s 158 ms
[ 10375][I][fs_test.ino:192] loop(): [main] sprintf test ...
[ 10444][I][fs_test.ino:200] loop(): [main] --> Duration: 0 s 69 ms
[ 10444][I][fs_test.ino:204] loop(): [main] strcpy/strcat test ...
[ 10462][I][fs_test.ino:213] loop(): [main] --> Duration: 0 s 18 ms
[ 10463][I][fs_test.ino:217] loop(): [main] by hand test ...
[ 10480][I][fs_test.ino:243] loop(): [main] --> Duration: 0 s 17 ms
[ 10480][I][fs_test.ino:247] loop(): [main] Reset to restart.

From my side, I did not get the result as you did. On what chip did you test? I will try it on different chips, but for sure I can say, that for SD lib, the files + directories creation is a lot worse than it was before. But I don't get why is behaving differently.

@BlueAndi
Copy link
Contributor Author

I tested it with a esp32doit-devkit-v1, further detail:

[  4608][I][main.cpp:140] loop(): [main] ESP chip id    : 2C1559BF713C
[  4608][I][main.cpp:141] loop(): [main] ESP type       : ESP32
[  4614][I][main.cpp:142] loop(): [main] ESP chip rev.  : 1
[  4619][I][main.cpp:143] loop(): [main] ESP cpu freq.  : 240 MHz
[  4625][I][main.cpp:144] loop(): [main] Flash chip mode: DIO
[  4630][I][main.cpp:145] loop(): [main] Flash chip size: 0x00400000 byte
[  4637][I][main.cpp:146] loop(): [main] Flash freq.    : 40 MHz
[  4642][I][main.cpp:147] loop(): [main] ESP SDK version: v4.4.2

In our first two measurements with LittleFS I can't see really a difference. Hmm ...
What we can see is, that the simple tests with sprintf and strcpy/strcat are always equal as expected. And the sprintf variant is the slowest.

I can't see the difference to the SD library. Will try on another board with a esp32-s3.

@BlueAndi
Copy link
Contributor Author

With the Lilygo T-Display S3 the difference is not so high, but still there.

Original:

[ 12743][I][main.cpp:104] setup(): [main] Format filesystem.
[ 25312][I][main.cpp:125] setup(): [main] Setup finished.
[ 25313][I][main.cpp:139] loop(): [main] ESP chip id    : 7011CEFA12F4
[ 25313][I][main.cpp:140] loop(): [main] ESP type       : esp32s3
[ 25319][I][main.cpp:141] loop(): [main] ESP chip rev.  : 0
[ 25324][I][main.cpp:142] loop(): [main] ESP cpu freq.  : 240 MHz
[ 25330][I][main.cpp:143] loop(): [main] Flash chip mode: QUIO
[ 25335][I][main.cpp:144] loop(): [main] Flash chip size: 0x01000000 byte
[ 25342][I][main.cpp:145] loop(): [main] Flash freq.    : 80 MHz
[ 25347][I][main.cpp:146] loop(): [main] ESP SDK version: v4.4.2
[ 25353][I][main.cpp:150] loop(): [main] Creating directories and files ...
[ 25360][I][main.cpp:151] loop(): [main] Directories per level: 5
[ 25366][I][main.cpp:152] loop(): [main] Directory depth      : 5
[ 25371][I][main.cpp:153] loop(): [main] Files per directory  : 5
[ 30323][I][main.cpp:164] loop(): [main] --> Duration: 4 s 946 ms
[ 30323][I][main.cpp:168] loop(): [main] Walking through directories recursively ...
[ 32693][I][main.cpp:173] loop(): [main] --> Duration: 2 s 368 ms
[ 32693][I][main.cpp:181] loop(): [main] sprintf test ...
[ 32752][I][main.cpp:189] loop(): [main] --> Duration: 0 s 59 ms
[ 32753][I][main.cpp:193] loop(): [main] strcpy/strcat test ...
[ 32770][I][main.cpp:202] loop(): [main] --> Duration: 0 s 16 ms
[ 32770][I][main.cpp:206] loop(): [main] by hand test ...
[ 32784][I][main.cpp:232] loop(): [main] --> Duration: 0 s 14 ms
[ 32785][I][main.cpp:236] loop(): [main] Reset to restart.

Improved:

[ 12606][I][main.cpp:104] setup(): [main] Format filesystem.
[ 25193][I][main.cpp:125] setup(): [main] Setup finished.
[ 25194][I][main.cpp:139] loop(): [main] ESP chip id    : 7011CEFA12F4
[ 25194][I][main.cpp:140] loop(): [main] ESP type       : esp32s3
[ 25199][I][main.cpp:141] loop(): [main] ESP chip rev.  : 0
[ 25205][I][main.cpp:142] loop(): [main] ESP cpu freq.  : 240 MHz
[ 25211][I][main.cpp:143] loop(): [main] Flash chip mode: QUIO
[ 25216][I][main.cpp:144] loop(): [main] Flash chip size: 0x01000000 byte
[ 25223][I][main.cpp:145] loop(): [main] Flash freq.    : 80 MHz
[ 25228][I][main.cpp:146] loop(): [main] ESP SDK version: v4.4.2
[ 25234][I][main.cpp:150] loop(): [main] Creating directories and files ...
[ 25241][I][main.cpp:151] loop(): [main] Directories per level: 5
[ 25247][I][main.cpp:152] loop(): [main] Directory depth      : 5
[ 25252][I][main.cpp:153] loop(): [main] Files per directory  : 5
[ 30200][I][main.cpp:164] loop(): [main] --> Duration: 4 s 942 ms
[ 30200][I][main.cpp:168] loop(): [main] Walking through directories recursively ...
[ 32554][I][main.cpp:173] loop(): [main] --> Duration: 2 s 352 ms
[ 32555][I][main.cpp:181] loop(): [main] sprintf test ...
[ 32614][I][main.cpp:189] loop(): [main] --> Duration: 0 s 59 ms
[ 32615][I][main.cpp:193] loop(): [main] strcpy/strcat test ...
[ 32631][I][main.cpp:202] loop(): [main] --> Duration: 0 s 16 ms
[ 32632][I][main.cpp:206] loop(): [main] by hand test ...
[ 32646][I][main.cpp:232] loop(): [main] --> Duration: 0 s 14 ms
[ 32646][I][main.cpp:236] loop(): [main] Reset to restart.

…itional avoid creating temporary String objects.
Copy link
Member

@P-R-O-C-H-Y P-R-O-C-H-Y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did another test with SD card running on 40 MHz, and got better performance with this PR.

ORIGINAL CODE - SD card 40 MHz:

[    53][I][fs_test.ino:161] loop(): [main] Creating directories and files ...
[    53][I][fs_test.ino:162] loop(): [main] Directories per level: 5
[    55][I][fs_test.ino:163] loop(): [main] Directory depth      : 5
[    61][I][fs_test.ino:164] loop(): [main] Files per directory  : 5
[  4531][I][fs_test.ino:175] loop(): [main] --> Duration: 4 s 464 ms
[  4532][I][fs_test.ino:179] loop(): [main] Walking through directories recursively ...
[  7654][I][fs_test.ino:184] loop(): [main] --> Duration: 3 s 120 ms

WITH CHANGES FROM PR - SD card 40 MHz:

[    93][I][fs_test.ino:161] loop(): [main] Creating directories and files ...
[    93][I][fs_test.ino:162] loop(): [main] Directories per level: 5
[    94][I][fs_test.ino:163] loop(): [main] Directory depth      : 5
[   101][I][fs_test.ino:164] loop(): [main] Files per directory  : 5
[  4451][I][fs_test.ino:175] loop(): [main] --> Duration: 4 s 344 ms
[  4451][I][fs_test.ino:179] loop(): [main] Walking through directories recursively ...
[  7210][I][fs_test.ino:184] loop(): [main] --> Duration: 2 s 756 ms

But I still cannot understand, why it was that much slower with SD card running on 20 MHz as I posted before. The changes should improve the speed.

Lets merge this PR, thanks @BlueAndi for the PR and for providing the test sketch 👍

@me-no-dev me-no-dev merged commit f487d89 into espressif:master Dec 19, 2022
@BlueAndi
Copy link
Contributor Author

@P-R-O-C-H-Y You are welcome and its always good that someone independend review the stuff. Thanks for your effort as well!

@P-R-O-C-H-Y
Copy link
Member

@P-R-O-C-H-Y You are welcome and its always good that someone independend review the stuff. Thanks for your effort as well!

@BlueAndi Btw. I work for Espressif. My part is in Arduino project and mostly taking care of FS libs + some peripherals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Libraries Issue is related to Library support. Area: Performance Issue related to performance problems and improvements
Projects
Development

Successfully merging this pull request may close these issues.

5 participants