Skip to content

Improve SPI interface (GIT8266O-397) #723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
zub2 opened this issue Oct 7, 2019 · 5 comments
Open

Improve SPI interface (GIT8266O-397) #723

zub2 opened this issue Oct 7, 2019 · 5 comments

Comments

@zub2
Copy link

zub2 commented Oct 7, 2019

Environment

  • Development Kit: esp8266ex
  • IDF version: d6ec931
  • Development Env: Make
  • Operating System: Linux
  • Power Supply: external 3.3V

Problem Description

There are several issues that make the SPI driver difficult to use:

spi_trans (drivers/spi.c) requires 32bit-aligned buffers for miso and mosi data

This is not a big deal when the buffer is allocated for SPI, but when a longer buffer is sent piecewise, the necessary alignment becomes a burden. Consider the following scenario (SPI master):

uint16_t cmd = SOME_CMD;
uint32_t addr = SOME_ADDR;

spi_trans_t t = {};
t.cmd = &cmd;
t.bits.cmd = 16;
t.addr = &addr;
t.bits.addr = 32;

unsigned size;
uint8_t * data = getData(&size);
while (size > 0)
{
    waitForDeviceReady();
    const unsigned transferSize = min(size, getDeviceFreeBufferSize());

    t.mosi = (uint32_t*)data;
    t.bits.mosi = transferSize * 8;
    spi_trans(HSPI_HOST, t);

    data += transferSize;
    size -= transferSize;
}

... unless each time a multiple of 4 bytes is transferred, this explodes due to wrong alignment in spi_master_trans(). So the only solution currently is for user code to keep copying the data to a properly aligned buffer before it's passed to spi_trans(). So it's 1 copy before spi_trans() is called and another copy of the same data inside spi_master_trans().

The same issue happens with miso data.

Furthermore, making the type uint32_t* makes life difficult in all situations when the user doesn't want to transfer an array of uint32_ts. An uint8_t* or just void* would be more obvious. There is no data structure on this level - it's just bytes, or in fact just bits, as bits.mosi is actually the number of bits to transfer. So why force multiples of 4 bytes while the actual transfer size is bits?

I see that the actual SPI data buffer is defined as uint32_t (in spi_struct.h). I was not successful in finding any details on how should the HW buffer be accessed. But looking at how esp-open-rtos handles this (esp_spi.c in esp-open-rtos), it seems that a simple memcpy() (from a source address w/o any necessary alignment) is sufficient. It just seems that the last write has to make sure it ends on a multiple of 4 bytes. Alternatively, even if the SPI data buffer had to be accessed as uint32_t, it could still be handled inside spi_master_trans().

The size of spi_trans_t is 20 bytes yet spi_trans() accepts it by value

This means it's copied to stack. And then spi_trans() calls spi_trans_static() which calls one of spi_master_trans() or spi_slave_trans(). If the calls are not inlined, that's 3 unneeded copies of spi_trans_t. I think spi_trans(), spi_trans_static(), spi_master_trans() and spi_slave_trans() should accept a pointer to the spi_trans_t to spare stack space.

Making spi_trans_t cmd and addr pointers makes the usage for SPI master more difficult

When cmd or addr are an expression, a local variable has to be created. I understand why these were made pointers - for SPI slave. But why not just store the received values in the SPI slave scenario inside the spi_trans_t and let the caller then take the values and process them as needed? Of course this would only work if the spi_trans_t is passed via pointer, so that spi_slave_trans() can modify these.

@zub2
Copy link
Author

zub2 commented Oct 7, 2019

For the alignment issue, I realized I am not using addr. So a work around for me is putting the 1 to 3 bytes into addr and using the rounded-up pointer as mosi. But this won't work when one also uses addr.

@donghengqaz
Copy link
Collaborator

I haved noticed problems what you meet, so we are refactoring the SPI driver, main work is about what your mentioned. And the other work is to increase the throughput of the SPI. So when version v3.3 is released, we may solve these problems..

@zub2
Copy link
Author

zub2 commented Feb 9, 2020

I see you made some changes in v3.3-rc1. Specifically I see the following:

  1. spi_trans_t is now passed via pointer - great, thanks 👍
  2. spi_trans_t still defines mosi and miso as uint32_t* but a comment implying 32-bit alignment is not strictly needed was added, and in code I see that when unaligned buffer is passed to spi_master_trans the following happens:
ESP_LOGW(TAG,"Using unaligned data may reduce transmission efficiency");
memset(spi_object[host]->buf, 0, sizeof(uint32_t) * 16);
memcpy(spi_object[host]->buf, trans->mosi, trans->bits.mosi / 8 + (trans->bits.mosi % 8) ? 1 : 0);
for (x = 0; x < trans->bits.mosi; x += 32) {
    y = x / 32;
    SPI[host]->data_buf[y] = spi_object[host]->buf[y];
}

... does it mean the SPI[host]->data_buf can't be accessed in smaller units than uint32_t? I.e. would memcpy((uint8_t*)SPI[host]->data_buf, trans->mosi, (trans->bits.mosi+7) / 8) not be possible? I'm asking especially as esp-open-rtos does not seem to have a problem with just a memcpy (see here) and having used the code, I can confirm it worked well. Although it is possible that it was still outside the spec for the ESP8266 SPI.
Also, the current code for unaligned handling is in fact quite sub-optimal, even if a simple memcpy into SPI[host]->data_buf can't be used: First it does the warning trace. I'd expect this to already be a performance hit in itself. Then it does a memset to fill spi_object[host]->buf with zeros. And then it does a memcpy of the user data to the same spi_object[host]->buf - so why the memset then? And then it does the loop copy.

@github-actions github-actions bot changed the title Improve SPI interface Improve SPI interface (GIT8266O-397) Feb 9, 2020
@vargavik
Copy link

I'd like to point out an operator precedence problem with the trans->bits.mosi / 8 + (trans->bits.mosi % 8) ? 1 : 0 part too.
That should be trans->bits.mosi / 8 + ((trans->bits.mosi % 8) ? 1 : 0)

The current code will copy only exactly 1 byte no matter what is the content of trans->bits.mosi.
I had a non working code and had to drill down to logic analizer level to recognize only the first byte was correctly sent ot by the hardware. The same issue exist in the miso part.

@vargavik
Copy link

I found another bit more serious issue at the miso part.
If the data is aligned but the miso transmission byte length cannot be divided by 4 the following line will cause memory corruption for last incomplete 4 byte chunk of trans->mosi array:
trans->miso[y] = SPI[host]->data_buf[y];

The alignment check should also check the transfer length like this:
if ((uint32_t)(trans->miso) % 4 == 0 && trans->bits.miso % 32 == 0) {
for (x = 0; x < trans->bits.miso; x += 32) {
y = x / 32;
trans->miso[y] = SPI[host]->data_buf[y];
}
} else {

vargavik pushed a commit to vargavik/ESP8266_RTOS_SDK that referenced this issue Aug 27, 2021
vargavik added a commit to vargavik/ESP8266_RTOS_SDK that referenced this issue Aug 27, 2021
vargavik added a commit to vargavik/ESP8266_RTOS_SDK that referenced this issue Aug 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants