@@ -503,6 +503,124 @@ def create(
         extra_body: Body | None = None,
         timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
     ) -> Completion | Stream[Completion]:
+        """
+        Creates a completion for the provided prompt and parameters.
+
+        Args:
+          model: ID of the model to use. You can use the
+              [List models](https://platform.openai.com/docs/api-reference/models/list) API to
+              see all of your available models, or see our
+              [Model overview](https://platform.openai.com/docs/models/overview) for
+              descriptions of them.
+
+          prompt: The prompt(s) to generate completions for, encoded as a string, array of
+              strings, array of tokens, or array of token arrays.
+
+              Note that <|endoftext|> is the document separator that the model sees during
+              training, so if a prompt is not specified the model will generate as if from the
+              beginning of a new document.
+
+          best_of: Generates `best_of` completions server-side and returns the "best" (the one with
+              the highest log probability per token). Results cannot be streamed.
+
+              When used with `n`, `best_of` controls the number of candidate completions and
+              `n` specifies how many to return – `best_of` must be greater than `n`.
+
+              **Note:** Because this parameter generates many completions, it can quickly
+              consume your token quota. Use carefully and ensure that you have reasonable
+              settings for `max_tokens` and `stop`.
+
+          echo: Echo back the prompt in addition to the completion
+
+          frequency_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on their
+              existing frequency in the text so far, decreasing the model's likelihood to
+              repeat the same line verbatim.
+
+              [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/text-generation/parameter-details)
+
+          logit_bias: Modify the likelihood of specified tokens appearing in the completion.
+
+              Accepts a JSON object that maps tokens (specified by their token ID in the GPT
+              tokenizer) to an associated bias value from -100 to 100. You can use this
+              [tokenizer tool](/tokenizer?view=bpe) to convert text to token IDs.
+              Mathematically, the bias is added to the logits generated by the model prior to
+              sampling. The exact effect will vary per model, but values between -1 and 1
+              should decrease or increase likelihood of selection; values like -100 or 100
+              should result in a ban or exclusive selection of the relevant token.
+
+              As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
+              from being generated.
+
+          logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
+              well as the chosen tokens. For example, if `logprobs` is 5, the API will return a
+              list of the 5 most likely tokens. The API will always return the `logprob` of
+              the sampled token, so there may be up to `logprobs+1` elements in the response.
+
+              The maximum value for `logprobs` is 5.
+
+          max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
+              completion.
+
+              The token count of your prompt plus `max_tokens` cannot exceed the model's
+              context length.
+              [Example Python code](https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken)
+              for counting tokens.
+
+          n: How many completions to generate for each prompt.
+
+              **Note:** Because this parameter generates many completions, it can quickly
+              consume your token quota. Use carefully and ensure that you have reasonable
+              settings for `max_tokens` and `stop`.
+
+          presence_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on
+              whether they appear in the text so far, increasing the model's likelihood to
+              talk about new topics.
+
+              [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/text-generation/parameter-details)
+
+          seed: If specified, our system will make a best effort to sample deterministically,
+              such that repeated requests with the same `seed` and parameters should return
+              the same result.
+
+              Determinism is not guaranteed, and you should refer to the `system_fingerprint`
+              response parameter to monitor changes in the backend.
+
+          stop: Up to 4 sequences where the API will stop generating further tokens. The
+              returned text will not contain the stop sequence.
+
+          stream: Whether to stream back partial progress. If set, tokens will be sent as
+              data-only
+              [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format)
+              as they become available, with the stream terminated by a `data: [DONE]`
+              message.
+              [Example Python code](https://cookbook.openai.com/examples/how_to_stream_completions).
+
+          suffix: The suffix that comes after a completion of inserted text.
+
+          temperature: What sampling temperature to use, between 0 and 2. Higher values like 0.8 will
+              make the output more random, while lower values like 0.2 will make it more
+              focused and deterministic.
+
+              We generally recommend altering this or `top_p` but not both.
+
+          top_p: An alternative to sampling with temperature, called nucleus sampling, where the
+              model considers the results of the tokens with top_p probability mass. So 0.1
+              means only the tokens comprising the top 10% probability mass are considered.
+
+              We generally recommend altering this or `temperature` but not both.
+
+          user: A unique identifier representing your end-user, which can help OpenAI to monitor
+              and detect abuse.
+              [Learn more](https://platform.openai.com/docs/guides/safety-best-practices/end-user-ids).
+
+          extra_headers: Send extra headers
+
+          extra_query: Add additional query parameters to the request
+
+          extra_body: Add additional JSON properties to the request
+
+          timeout: Override the client-level default timeout for this request, in seconds
+        """
         return self._post(
             "/completions",
             body=maybe_transform(
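The parameters documented in this hunk map directly onto the synchronous call. The sketch below is not part of the diff; it is a minimal usage example assuming the standard `OpenAI()` client configured via the `OPENAI_API_KEY` environment variable, and the model name is illustrative only:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Non-streaming request: returns a single Completion object.
completion = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # illustrative model name
    prompt="Say this is a test",
    max_tokens=16,
    temperature=0.2,                 # tune this or top_p, not both
    stop=["\n"],                     # up to 4 stop sequences
    seed=123,                        # best-effort determinism
    logit_bias={"50256": -100},      # discourage <|endoftext|>, per the docstring example
)

print(completion.choices[0].text)
print(completion.system_fingerprint)  # compare across requests when using `seed`
```

Passing `stream=True` instead returns a `Stream[Completion]` of partial chunks rather than a single object, which is what the `Completion | Stream[Completion]` return annotation above reflects.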
@@ -1017,6 +1135,124 @@ async def create(
         extra_body: Body | None = None,
         timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
     ) -> Completion | AsyncStream[Completion]:
+        """
+        Creates a completion for the provided prompt and parameters.
+
+        Args:
+          model: ID of the model to use. You can use the
+              [List models](https://platform.openai.com/docs/api-reference/models/list) API to
+              see all of your available models, or see our
+              [Model overview](https://platform.openai.com/docs/models/overview) for
+              descriptions of them.
+
+          prompt: The prompt(s) to generate completions for, encoded as a string, array of
+              strings, array of tokens, or array of token arrays.
+
+              Note that <|endoftext|> is the document separator that the model sees during
+              training, so if a prompt is not specified the model will generate as if from the
+              beginning of a new document.
+
+          best_of: Generates `best_of` completions server-side and returns the "best" (the one with
+              the highest log probability per token). Results cannot be streamed.
+
+              When used with `n`, `best_of` controls the number of candidate completions and
+              `n` specifies how many to return – `best_of` must be greater than `n`.
+
+              **Note:** Because this parameter generates many completions, it can quickly
+              consume your token quota. Use carefully and ensure that you have reasonable
+              settings for `max_tokens` and `stop`.
+
+          echo: Echo back the prompt in addition to the completion
+
+          frequency_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on their
+              existing frequency in the text so far, decreasing the model's likelihood to
+              repeat the same line verbatim.
+
+              [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/text-generation/parameter-details)
+
+          logit_bias: Modify the likelihood of specified tokens appearing in the completion.
+
+              Accepts a JSON object that maps tokens (specified by their token ID in the GPT
+              tokenizer) to an associated bias value from -100 to 100. You can use this
+              [tokenizer tool](/tokenizer?view=bpe) to convert text to token IDs.
+              Mathematically, the bias is added to the logits generated by the model prior to
+              sampling. The exact effect will vary per model, but values between -1 and 1
+              should decrease or increase likelihood of selection; values like -100 or 100
+              should result in a ban or exclusive selection of the relevant token.
+
+              As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token
+              from being generated.
+
+          logprobs: Include the log probabilities on the `logprobs` most likely output tokens, as
+              well as the chosen tokens. For example, if `logprobs` is 5, the API will return a
+              list of the 5 most likely tokens. The API will always return the `logprob` of
+              the sampled token, so there may be up to `logprobs+1` elements in the response.
+
+              The maximum value for `logprobs` is 5.
+
+          max_tokens: The maximum number of [tokens](/tokenizer) that can be generated in the
+              completion.
+
+              The token count of your prompt plus `max_tokens` cannot exceed the model's
+              context length.
+              [Example Python code](https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken)
+              for counting tokens.
+
+          n: How many completions to generate for each prompt.
+
+              **Note:** Because this parameter generates many completions, it can quickly
+              consume your token quota. Use carefully and ensure that you have reasonable
+              settings for `max_tokens` and `stop`.
+
+          presence_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on
+              whether they appear in the text so far, increasing the model's likelihood to
+              talk about new topics.
+
+              [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/text-generation/parameter-details)
+
+          seed: If specified, our system will make a best effort to sample deterministically,
+              such that repeated requests with the same `seed` and parameters should return
+              the same result.
+
+              Determinism is not guaranteed, and you should refer to the `system_fingerprint`
+              response parameter to monitor changes in the backend.
+
+          stop: Up to 4 sequences where the API will stop generating further tokens. The
+              returned text will not contain the stop sequence.
+
+          stream: Whether to stream back partial progress. If set, tokens will be sent as
+              data-only
+              [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format)
+              as they become available, with the stream terminated by a `data: [DONE]`
+              message.
+              [Example Python code](https://cookbook.openai.com/examples/how_to_stream_completions).
+
+          suffix: The suffix that comes after a completion of inserted text.
+
+          temperature: What sampling temperature to use, between 0 and 2. Higher values like 0.8 will
+              make the output more random, while lower values like 0.2 will make it more
+              focused and deterministic.
+
+              We generally recommend altering this or `top_p` but not both.
+
+          top_p: An alternative to sampling with temperature, called nucleus sampling, where the
+              model considers the results of the tokens with top_p probability mass. So 0.1
+              means only the tokens comprising the top 10% probability mass are considered.
+
+              We generally recommend altering this or `temperature` but not both.
+
+          user: A unique identifier representing your end-user, which can help OpenAI to monitor
+              and detect abuse.
+              [Learn more](https://platform.openai.com/docs/guides/safety-best-practices/end-user-ids).
+
+          extra_headers: Send extra headers
+
+          extra_query: Add additional query parameters to the request
+
+          extra_body: Add additional JSON properties to the request
+
+          timeout: Override the client-level default timeout for this request, in seconds
+        """
         return await self._post(
             "/completions",
             body=maybe_transform(
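The async overload documented in this hunk takes the same parameters; the difference is that `stream=True` yields an `AsyncStream[Completion]`, as the return annotation above shows. A hedged sketch of consuming that stream, again assuming a configured API key and an illustrative model name:

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def main() -> None:
    # Streaming request: tokens arrive as data-only server-sent events,
    # surfaced here as an AsyncStream of Completion chunks.
    stream = await client.completions.create(
        model="gpt-3.5-turbo-instruct",  # illustrative model name
        prompt="Write a one-line haiku about diffs",
        max_tokens=32,
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].text, end="", flush=True)
    print()

asyncio.run(main())
```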