Account for `retry-after-ms` header #14

simonkurtz-MSFT · 2024-05-16T21:20:43Z

Presently, the module only accounts for the `retry-after` header, whose value is in seconds. Azure OpenAI Provisioned Throughput (PTU) deployments also return the `retry-after-ms` header, whose value is in milliseconds. The millisecond value may be preferable to the seconds value because it is more granular.

From the PTU documentation:

*A 429 response indicates that the allocated PTUs are fully consumed at the time of the call. The response includes the retry-after-ms and retry-after headers that tell you the time to wait before the next call will be accepted.*

As such, ignoring `retry-after-ms` only causes a longer delay when there is simply no PTU capacity left to consume (overall and/or tokens-per-minute?). That may be acceptable if it does not surface too often.

One approach may be to convert everything internal to the module to milliseconds. These resources provide the starting point for the enhancement:

cc @kristapratico
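The millisecond-based approach suggested above could look roughly like the sketch below. It is only an illustration, not the module's actual code: the function name `retry_delay_ms` and the plain header mapping are hypothetical. It assumes both headers carry numeric values, prefers `retry-after-ms` when present, and falls back to `retry-after` converted to milliseconds. The HTTP-date form of `retry-after` is not handled here.

```python
from typing import Mapping, Optional


def retry_delay_ms(headers: Mapping[str, str]) -> Optional[int]:
    """Return the server-requested delay in milliseconds, or None if unavailable.

    Hypothetical helper: prefers `retry-after-ms`, falls back to `retry-after`.
    """
    # HTTP header names are case-insensitive, so normalize the keys first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Prefer the millisecond header when it is present and numeric.
    raw_ms = normalized.get("retry-after-ms")
    if raw_ms is not None:
        try:
            return max(0, round(float(raw_ms)))
        except (ValueError, OverflowError):
            pass  # Unparseable value: fall through to `retry-after`.

    # Fall back to the seconds header, converted to milliseconds.
    raw_s = normalized.get("retry-after")
    if raw_s is not None:
        try:
            return max(0, round(float(raw_s) * 1000))
        except (ValueError, OverflowError):
            pass  # e.g. an HTTP-date value, which this sketch does not parse.

    return None
```

Keeping a single internal unit (milliseconds) means the rest of the retry logic would not need to know which header supplied the value.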


simonkurtz-MSFT added the enhancement New feature or request label May 16, 2024
