Skip to content

Reworked SPI class with direct register access to fix #28 for R_SPI #45

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jul 24, 2023

Conversation

RudolphRiedel
Copy link
Contributor

The discussion for issue #28 has a plethora of information what this is.
In essence, a buffer of 231 bytes with a SPI clock of 8MHz is transferred in 252µs with this new version of SPI.cpp while the FSP based class needs 652µs to do the same, so this new version needs 21µs overhead while the FSP based class needs 421µs overhead.
For single byte transfers this new version has a delay between two bytes of 1.22µs when using 8MHz while the FSP based class has a delay of 10.9µs.

Another bug that I just discovered and that is fixed with this new version of SPI.cpp is that the FSP based version needs 652µs to transfer a buffer of 231 bytes with 6, 8, 12 and 24 MHz, somehow changing the SPI clock works but the time needed overall is the same.

@CLAassistant
Copy link

CLAassistant commented Jul 12, 2023

CLA assistant check
All committers have signed the CLA.

@aentinger
Copy link
Contributor

Hi @RudolphRiedel ☕ 👋

Thank you for this PR. Let's start the work of getting it merged in ...

@RudolphRiedel
Copy link
Contributor Author

I was not aware that this is for the Portenta C33 as well which uses a RA6M5 and I do not have one to test this.
I just compared the datasheets though and I am quite positive that this works for the Portenta C33.

@aentinger
Copy link
Contributor

I was not aware that this is for the Portenta C33 as well which uses a RA6M5 and I do not have one to test this.

I'll handle the tests for C33 👍

…t that no new functions shall be added that are not defined in the standard Arduino API
@RudolphRiedel
Copy link
Contributor Author

RudolphRiedel commented Jul 12, 2023

And I just found a "goto" in the code I did not touch.
Well, checking the repository, I guess one more does not hurt - but are there really no coding guidelines for Arduino?

@aentinger
Copy link
Contributor

aentinger commented Jul 13, 2023

Testing was performed at 1MHz, 10MHz and 24MHz SPI clock rate using BMP388-Basic from 107-Arduino-BMP388. The example exercises SPI both from interrupt and normal execution context.

  • Uno R4 Minima: SPI ✔️
  • Uno R4 WiFi: Only prototype available w/ me, putting in order for production version.
  • Portenta C33: SPI ✔️, SPI1 ✔️

@aentinger
Copy link
Contributor

@RudolphRiedel ☕ 👋

I did perform tests for Uno R4 Minima and Portenta C33, see here.

My pre-production Uno R4 WiFi refuses to be pressed into service. With what board have you performed your tests?

Overall changes should work for both WiFi and Minima, given that they share the same microcontroller. The used pins for SPI are different but we did not touch the routine performing the IO muxing/lookup.

@RudolphRiedel
Copy link
Contributor Author

I tested with the Minima as this is the one I could get.
I checked the schematics and the user manual, so the Minima is using R_SPI1 and the WiFi is using R_SPI0.
It compiles fine for the WiFi variant.

As a hardware engineer I am inclined to add more comments, but this would be off-topic.

@aentinger
Copy link
Contributor

Well, I've put in the request for the Uno WiFi R4. I'll test once it arrives on my table and then we can proceed merging this PR.

@greiman
Copy link

greiman commented Jul 15, 2023

I just tried it to see what it would do now. At first H7 STM32Cube SPI would not run over a few MHz if at all.

Arduino on top of mbed with STM32Cube is more than I can put up with. mbed depends on ST to setup STM32Cube. so you must deal with three big companies. Also Portenta H7 has lots of other problems.

mbed is an RTOS I really dislike so I probably won't fight with SPI.

I would really want SDIO on STMH7 but the STM32 SDIO controller is too archaic.

I wrote a SDIO driver for teensy 3.x and 4.x. It does over 22 MB/sec SD read/write. That's about the limit since Teensy only supports 50 MHz SDIO.

@mjs513
Copy link
Contributor

mjs513 commented Jul 16, 2023

@greiman
just incorporated the micros fix in the latest PR and reran you bench - just as a fyi

Serial number: 0X882B3DA3
Manufacturing date: 7/2017

FILE_SIZE_MB = 10
BUF_SIZE = 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1664.72,30693,299,303
1664.72,30681,299,303

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1769.91,288,284,285
1770.22,288,284,285

@greiman
Copy link

greiman commented Jul 16, 2023

@mjs513
The write transfer max latency is interesting, over 30 ms. What card are you using?

I need to investigate.

Could be the card since the spec allows cards to have occasional long write latency. Most cards have shorter max latency since they multi-task erase and wear leveling.

Could be a problem still in micros().

I suspect SPI is not the problem. Could be another device like USB.

@mjs513
Copy link
Contributor

mjs513 commented Jul 16, 2023

@greiman

What card are you using?
Using a SanDisk Ultra 32GB has HC I marking on it.

@greiman
Copy link

greiman commented Jul 16, 2023

Guess it is the card. I tried several Ultra 32GB cards on an Arduino Due and got longer write latency. Most were 15-20 ms but in one test I got 102ms.

My memory was that Ultra cards had shorter max latency. Guess I was wrong.

Here is a 32GB Samsung Pro on a Due board. This is outstanding.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
4537.21,128,110,111
4537.21,127,110,111

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
4516.71,113,110,111
4528.99,113,110,111

Here is a older 32GB Samsung Evo, a lower end card.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
4524.89,3653,110,111
4524.89,3667,110,111

@greiman
Copy link

greiman commented Jul 16, 2023

The R4MA1 processor is not bad at $3.2 in volume on DigiKey. As for the software, too bad about FSP.

I looked at the RA6M5 in the C33 board and decided to order one. It has a nice SD/MMC Host Interface (SDHI). too bad the pins are not more accessible.

The SPI controller has more buffering and probably goes faster than 24 MHz. I will play with mods to use this structure.
RA6M5

@aentinger
Copy link
Contributor

aentinger commented Jul 17, 2023

Good Morning @RudolphRiedel and @greiman ☕ 👋

I see that you've been enjoying yourselves 😉 As for my part I've not yet received yet my Uno R4 WiFi. I'll try and get someone else to test this PR against Uno R4 WiFi.

Edit: @Rocketct will test for Uno R4 WiFi and update his findings here.

@RudolphRiedel
Copy link
Contributor Author

Please consider activating the Github discussions feature for at least this repository.

@aentinger
Copy link
Contributor

Please consider activating the Github discussions feature for at least this repository.

Sorry, that's not my decision to make. Please consider that :octocat: discussions are open both for arduino/language and arduino/ArduinoCore-API.

@facchinm
Copy link
Member

Discussions opened 😉

@greiman
Copy link

greiman commented Jul 17, 2023

Discussions opened

Here is a discussion about the SPI API I opened in May.

@greiman
Copy link

greiman commented Jul 17, 2023

Sorry, that's not my decision to make. Please consider that :octocat: discussions are open both for arduino/language and arduino/ArduinoCore-API.

These two topics are not really not appropriate for this pull request. @RudolphRiedel didn't extend the SPI API.

What is wrong with this pull request?

Is the issue the Arduino policy for Renesas FSP? Is the way this implementation interacts with FSP unacceptable?

The AVR Core only depends on vendor software at the lowest level. It is amazing how many new boards that depend on vendor software have lower SPI performance than Uno R3.

Single byte SPI transfer on Portenta H7 is about half as fast as Uno R3.

@aentinger
Copy link
Contributor

These two topics are not really not appropriate for this pull request. @RudolphRiedel didn't extend the SPI API.

These two repositories are highly relevant for discussing extension of the Arduino API 🤷. At any rate @facchinm - who has in fact decision making power to open GitHub discussions for this repository - has already decided to do so.

Is the issue the Arduino policy for Renesas FSP? Is the way this implementation interacts with FSP unacceptable?

There's no issue with this pull request. As soon as I have a greenlight from @Rocketct or complete testing with my own R4 WiFi (delivery pending) I'll merge this PR.

@greiman
Copy link

greiman commented Jul 17, 2023

These two repositories are highly relevant for discussing extension of the Arduino API 🤷. At any rate @facchinm - who has in fact decision making power to open GitHub discussions for this repository - has already decided to do so.

You mean I may actually get a response to my post of May 16?

I didn't pickup on your request to post since I didn't get a single reply last time I was asked to post there.

@aentinger
Copy link
Contributor

You mean I may actually get a response to my post of May 16?

I hope you will 👍 . Please understand that there are many competing priorities (you would not believe the amount of Emails I and our team get from GitHub alone each day) and although I am inclined to help you I have no decision making role. I can understand your frustration but you've got to understand that my role is limited to pointing out the previous designated areas for discussions concerning API extensions 🤷 .

You may continue venting your frustration at the post you linked but please limit discussion in this PR thread to the immediate content of this PR 🙏 🙇 .

@greiman
Copy link

greiman commented Jul 17, 2023

Note: About this pull request.

I suspected not being pro Renesas FSP was a bad thing. I had a friend check Arduino with DNB. I bet tossing FSP is not an option.

June 2022, Arduino raised $32.0M in Series B funding led by Robert Bosch Venture Capital, joined by Renesas, Anzu Partners, and Arm.

As part of the round, Renesas is investing $10 million. This investment is part of Renesas’ strategy to grow in the mass market and reach new customers within Arduino’s community of 30 million developers worldwide.

Pull requests for R4 may be a Sisyphean Task.

@RudolphRiedel
Copy link
Contributor Author

RudolphRiedel commented Jul 18, 2023

I suspected not being pro Renesas FSP was a bad thing.

Well, I do hope that someone from Renesas is following this.
If the FSP would actually be useable, in regards of performance, there would be no reason to toss it.
There are R_SPI_Open() and R_SPI_Close() functions and yet the transfer functions configure the SPI unit for the bit-width.
And while there are R_SPI_Read(), R_SPI_Write() and R_SPI_WriteRead() functions, these are just calling r_spi_write_read_common().
This is what happens when there is an Arduino call to SPI.transfer(data):
R_SPI_WriteRead() -> r_spi_write_read_common() -> r_spi_bit_width_config() / r_spi_start_transfer() -> r_spi_transmit() / r_spi_transmit() / R_BSP_IrqEnable() -> some lengthy ISR function, not entirely sure which
And somehow in between transfers the SPI unit is disabled.

This is slow because of the amount of code executed over and over again as the functions are generic and because of the number of context-switches for all the function calls.
And to put numbers behind this, when running the SPI at 8MHz, there is a 11µs delay between two SPI.transfer(data) calls, this is measured with the logic analyzer from the end of one transfer untill a new one begins, this is over 520 clock cycles for the core.
My version with direct register access reduces that to about 1.1µs delay between two SPI.transfer(data) calls.

@greiman
Copy link

greiman commented Jul 18, 2023

@RudolphRiedel

I am fine with Renesas fixing this. I now understand the business relationship between Arduino and Renesas.

I suspect that as part of the this relationship Renesas is to provide fixes for FSP problems.

I say toss, since soon Arduino wouldn't be able to move to new versions of FSP if pull requests like this one are accepted. You can't have device register access at the SPI.h/SPI.cpp level.

The correct place for the transfer fix is in FSP and there is no way Arduino could accept a pull request for FSP.

Probably the old Arduino open source policy no longer makes sense for core software at the FSP level.

We will probably will see Renesas processors in new Arduino products. FSP support will be critical for Arduino.

@RudolphRiedel
Copy link
Contributor Author

Nah, you are overthinking this, Renesas is selling micro-controllers, not software.
And it is not like they are going to sell less of it because the software runs faster.

This is probably pretty normal for a project, make it work first, identify a bottleneck, remove that bottleneck.
It's not like the FSP implementation is not working.
And in order to make it run faster they would need to extend their API by additional functions that actually do nothing else and rely on that R_SPI_Open() was called earlier with the correct parameters.

@aentinger
Copy link
Contributor

Gentleman, I'll be away for a couple of days on personal business.

@pennam will do the test for the Uno R4 WiFi and update this thread accordingly / merge in case of a positive test result.

@greiman I'll ask you one last time to take your non-PR related speculation elsewhere, i.e. the forum because this is simply not the right place for it. A PR thread exists to augment the process of getting a PR either merged or denied, and nothing else. Based on its technical merits I very much intent to merge this contribution, once a positive test result for Uno R4 WiFi is in.

@mjs513
Copy link
Contributor

mjs513 commented Jul 19, 2023

Out of curiosity I tried an adafruit product: https://www.adafruit.com/product/4899. Basically a NAND chip that acts like an SD card with @greiman bench sketch - it worked using a wifi board:

FreeStack: 27756
Type is FAT16
Card size: 0.52 GB (GB = 1E9 bytes)

Manufacturer ID: 0XB
OEM ID: XT
Product: XTSDA
Revision: 1.1
Serial number: 0X4FAE0070
Manufacturing date: 9/2020

FILE_SIZE_MB = 10
BUF_SIZE = 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1511.49,95620,299,334
1522.07,95025,299,332

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1769.29,292,284,285
1769.91,291,284,285

@pennam
Copy link
Contributor

pennam commented Jul 21, 2023

Tested this on my UNO R4 Wifi @ 1MHz 12MHz and 24MHz SPI clock using BMP388-Basic without interrupt. Works fine.

I've also made a spin on my Portenta C33 and find out the clock is capped @ 2MHz independently from the configuration.

I will doubleckeck with @aentinger to understand how and if he can get 24MHz out of his C33.

@greiman
Copy link

greiman commented Jul 21, 2023

@pennam

It appears that the Portenta C33 SPI on the board pins are Serial Communications Interface (SCI). SCI provides "simple SPI".

See the pinout for SPI on D7, D8, D9, D10.

At first glance I don't see the max rate for SCI Simple SPI.

This mod is only for this interface.

The Serial Peripheral Interface (SPI) has 2 channels. The SPI provides high-speed full-duplex synchronous serial
communications with multiple processors and peripheral devices.

Looks like the Portenta Breakout has an SPI port on one of the full featured ports. I have a C33 but no breakout so I can't test it.

Edit: Here are definitions from pin_arduino.h for C33:

/****** SPI CORE DEFINES ******/

#define SPI_HOWMANY       2

#define PIN_SPI_MOSI      8
#define PIN_SPI_MISO      10
#define PIN_SPI_SCK       9
#define PIN_SPI_CS        7
#define FORCE_SPI_MODE    (MODE_SCI)

#define PIN_SPI1_MOSI     46
#define PIN_SPI1_MISO     45
#define PIN_SPI1_SCK      47
#define PIN_SPI1_CS       48
#define FORCE_SPI1_MODE   (MODE_SPI)

@aentinger
Copy link
Contributor

I've also made a spin on my Portenta C33 and find out the clock is capped @ 2MHz independently from the configuration.

SCI SPI (7-10) is capping at around 12.5 MHz for me. There is definitely some limitation, but its not 2 MHz.

@aentinger
Copy link
Contributor

Testing was performed at 1MHz and 10MHz SPI clock rate using BMP388-Basic from 107-Arduino-BMP388. The example exercises SPI both from interrupt and normal execution context.

Uno R4 WiFi: SPI ✔️ .
Portenta C33: SPI ✔️, SPI1 ✔️ .

My key take-away is that the FSP code that calculates the bitrate settings warrants a closer look.

Copy link
Contributor

@aentinger aentinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 Thank you for your contribution @RudolphRiedel ☕ 👋 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants