Account for retry-after-ms header #14

Open
simonkurtz-MSFT opened this issue May 16, 2024 · 0 comments

Labels: enhancement (New feature or request)

simonkurtz-MSFT commented May 16, 2024

Presently, the module only accounts for the retry-after header, whose value is in seconds. Azure OpenAI Provisioned Throughput (PTU) deployments also return the retry-after-ms header, whose value is in milliseconds. The millisecond value may be preferable to the seconds value.

From the PTU documentation:

*A 429 response indicates that the allocated PTUs are fully consumed at the time of the call. The response includes the retry-after-ms and retry-after headers that tell you the time to wait before the next call will be accepted.*

As such, ignoring retry-after-ms only results in a longer delay when there is simply no PTU capacity left to consume (overall and/or tokens-per-minute?). This may be acceptable, as the situation may not surface too often.
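
A minimal sketch of how both headers could be read, assuming the response headers are available as a string mapping. The function name `retry_delay_ms` is hypothetical and not part of the module. It prefers retry-after-ms when present and valid, falls back to retry-after, and returns milliseconds in both cases:

```python
from typing import Mapping, Optional


def retry_delay_ms(headers: Mapping[str, str]) -> Optional[int]:
    """Return the suggested wait in milliseconds, or None if no usable header exists.

    Hypothetical helper for illustration; not the module's actual API.
    """
    # HTTP header names are case-insensitive, so normalize the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Prefer the millisecond header when it is present and numeric.
    raw_ms = lowered.get("retry-after-ms")
    if raw_ms is not None:
        try:
            return max(0, round(float(raw_ms)))
        except (ValueError, OverflowError):
            pass  # Malformed value: fall through to retry-after.

    # Fall back to the seconds header, converted to milliseconds.
    raw_s = lowered.get("retry-after")
    if raw_s is not None:
        try:
            return max(0, round(float(raw_s) * 1000))
        except (ValueError, OverflowError):
            pass  # The HTTP-date form of retry-after is not handled in this sketch.

    return None
```

A caller could then wait with `time.sleep(delay_ms / 1000)` whenever the result is not `None`. Returning a single unit keeps the choice between the two headers in one place.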


One approach may be to convert everything internal to the module to milliseconds. These resources provide the starting point for the enhancement:

cc @kristapratico

simonkurtz-MSFT added the enhancement label May 16, 2024