build: recognize collection_model_binding_data for batch inputs #1655

Merged
hallvictoria merged 21 commits into dev from evanroman/cmbd on Mar 31, 2025

@EvanR-Dev EvanR-Dev commented Mar 18, 2025

This pull request improves the handling of collection_model_binding_data. The most important change is adding support for collection_model_binding_data in the datumdef.py and meta.py files.

Description

Addresses https://github.com/Azure/azure-functions-pyfx-planning/issues/393


Today, when a user uses a deferred binding, the host sends a model_binding_data object that is converted into its corresponding SDK type. This object carries information about the SDK type and, most importantly, the content.

Triggers that can handle batch inputs, such as EventHub, can set cardinality="many". In that case the host sends a collection_model_binding_data object, which is simply 1 to N model_binding_data objects. Here is a sample of each:

model_binding_data:

version: "1.0"
source: "AzureEventHubsEventData"
content_type: "application/octet-stream"
content: "\000Sr\301\216\010\243\033x-opt-sequence-number-epochT\377\243\025x-opt-sequence-numberU.\243\014x-opt-offset\201\000\000\000\001\000\000\030\020\243\023x-opt-enqueued-time\000\243\035com.microsoft:datetime-offset\201\010\335c/`\215c\220\000St\301I\002\241\rDiagnostic-Id\241700-2b9c74c8f8a49fcdee7743b26534811a-891cfa411389e551-00\000Su\240 {\"message\": \"Hello from local!\"}"

collection_model_binding_data:

model_binding_data {
  version: "1.0"
  source: "AzureEventHubsEventData"
  content_type: "application/octet-stream"
  content: "\000Sr\301\216\010\243\033x-opt-sequence-number-epochT\377\243\025x-opt-sequence-numberUH\243\014x-opt-offset\201\000\000\000\003\000\000\006\340\243\023x-opt-enqueued-time\000\243\035com.microsoft:datetime-offset\201\010\335e\266\360\233G\300\000St\301I\002\241\rDiagnostic-Id\241700-a6e861768b42bdbaa26e5e367c8b0ffa-d71a437aea587458-00\000Su\240 {\"message\": \"Hello from local!\"}"
}
model_binding_data {
  version: "1.0"
  source: "AzureEventHubsEventData"
  content_type: "application/octet-stream"
  content: "\000Sr\301\216\010\243\033x-opt-sequence-number-epochT\377\243\025x-opt-sequence-numberUI\243\014x-opt-offset\201\000\000\000\003\000\000\007\220\243\023x-opt-enqueued-time\000\243\035com.microsoft:datetime-offset\201\010\335e\266\360\233G\300\000St\301I\002\241\rDiagnostic-Id\241700-a6e861768b42bdbaa26e5e367c8b0ffa-d71a437aea587458-00\000Su\240 {\"message\": \"Hello from local!\"}"
}

Notice that model_binding_data exposes the fields needed for the conversion directly, whereas collection_model_binding_data nests them inside each model_binding_data entry.

From proto:

// Used to encapsulate collection model_binding_data
message CollectionModelBindingData {
  repeated ModelBindingData model_binding_data = 1;
}
// Message representing Microsoft.Azure.WebJobs.ParameterBindingData
// Used for hydrating SDK-type bindings in out-of-proc workers
message ModelBindingData
{
    // The version of the binding data content
    string version = 1;

    // The extension source of the binding data
    string source = 2;

    // The content type of the binding data content
    string content_type = 3;

    // The binding data content
    bytes content = 4;
}
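
As an illustration of the difference between the two shapes, here is a minimal Python sketch. It mirrors the proto messages with plain dataclasses and normalizes either shape into a list of items. The dataclasses and the as_list helper are hypothetical stand-ins for this example only; they are not the generated protobuf classes or the worker's actual code.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ModelBindingData:
    """Hypothetical stand-in for the ModelBindingData proto message."""
    version: str
    source: str
    content_type: str
    content: bytes


@dataclass
class CollectionModelBindingData:
    """Hypothetical stand-in for CollectionModelBindingData (repeated field)."""
    model_binding_data: List[ModelBindingData] = field(default_factory=list)


def as_list(
    value: Union[ModelBindingData, CollectionModelBindingData],
) -> List[ModelBindingData]:
    """Return the individual items, whichever shape the host sent."""
    if isinstance(value, CollectionModelBindingData):
        # Batch case: the fields live one level down, on each nested item.
        return list(value.model_binding_data)
    # Single case: the fields are directly on the object.
    return [value]


single = ModelBindingData(
    version="1.0",
    source="AzureEventHubsEventData",
    content_type="application/octet-stream",
    content=b'{"message": "Hello from local!"}',
)
batch = CollectionModelBindingData(model_binding_data=[single, single])

for item in as_list(batch):
    print(item.source, len(item.content))
```

Normalizing to a list first lets the same per-item conversion handle both the single and the batch case.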

PR information

  • The title of the PR is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For information on cleaning up the commits in your pull request, see this page.
  • If applicable, the PR references the bug/issue that it fixes in the description.
  • New Unit tests were added for the changes made and CI is passing.

Quality of Code and Contribution Guidelines

Evan Roman added 2 commits March 17, 2025 23:22
@EvanR-Dev EvanR-Dev changed the title Recognize collection_model_binding_data for batch inputs build: recognize collection_model_binding_data for batch inputs Mar 18, 2025
@EvanR-Dev EvanR-Dev self-assigned this Mar 27, 2025
@EvanR-Dev EvanR-Dev marked this pull request as ready for review March 28, 2025 20:13
@hallvictoria hallvictoria merged commit 1ae733c into dev Mar 31, 2025
26 of 28 checks passed
@hallvictoria hallvictoria deleted the evanroman/cmbd branch March 31, 2025 17:02
return binding.decode(datum,
                      trigger_metadata=metadata,
                      pytype=pytype)

NIT: Improvement suggestion:

valid_sources = {
    "AzureEventHubsEventData",
    "AzureServiceBusReceivedMessage",
}

if datum.type == "collection_model_binding_data" or datum.value.source in valid_sources:
    return binding.decode(datum, trigger_metadata=metadata, pytype=pytype)

Also, can type be an enum in your datum model? That would avoid the string comparisons and improve the codebase, since fewer string constants would flow through it.
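
One way the enum suggestion could look is sketched below. DatumType, VALID_SOURCES and should_decode are hypothetical names chosen for this example, not part of the worker. The sketch assumes the datum type arrives as a plain string, as in the comparison above.

```python
from enum import Enum
from typing import Optional


class DatumType(str, Enum):
    """Hypothetical enum for the datum type strings discussed above."""
    MODEL_BINDING_DATA = "model_binding_data"
    COLLECTION_MODEL_BINDING_DATA = "collection_model_binding_data"


# Sources named in the review suggestion.
VALID_SOURCES = frozenset({
    "AzureEventHubsEventData",
    "AzureServiceBusReceivedMessage",
})


def should_decode(datum_type: str, source: Optional[str]) -> bool:
    """Mirror the suggested condition with the string literal replaced by an enum."""
    # The str mix-in lets an enum member compare equal to its raw string value.
    return (datum_type == DatumType.COLLECTION_MODEL_BINDING_DATA
            or source in VALID_SOURCES)


print(should_decode("collection_model_binding_data", None))
print(should_decode("model_binding_data", "AzureEventHubsEventData"))
print(should_decode("model_binding_data", "SomeOtherSource"))
```

Because of the str mix-in, existing call sites that still pass raw strings keep working while new code refers to the enum members.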

hallvictoria added a commit that referenced this pull request Apr 11, 2025
* build: recognize collection_model_binding_data for batch inputs (#1655)

* add cmbd

* Add

* Add

* Rm newline

* Add tests

* Fix cmbd

* Fix test

* Lint

* Rm

* Rm

* Add back newline

* rm ws

* Rm list

* Rm cmbd from cache

* Avoid caching

* Keep cmbd check

* Add comment

* Lint

---------

Co-authored-by: Evan Roman <[email protected]>
Co-authored-by: hallvictoria <[email protected]>

* build: update Python Worker Version to 4.36.1 (#1660)

Co-authored-by: AzureFunctionsPython <[email protected]>

* initial changes

* Update Python SDK Version to 1.23.0 (#1663)

Co-authored-by: AzureFunctionsPython <[email protected]>

* merges from ADO

* merge fixes

* merge fixes

* merge fixes

* merge fixes

* don't run 313 unit tests yet

* changes for builds

---------

Co-authored-by: Evan <[email protected]>
Co-authored-by: Evan Roman <[email protected]>
Co-authored-by: AzureFunctionsPython <[email protected]>
Co-authored-by: AzureFunctionsPython <[email protected]>
gavin-aguiar added a commit that referenced this pull request Apr 25, 2025
* Proxy Worker: Initial Commit

* Updated worker config to include 3.13

* Updated test_setup

* Updated worker.py

* Updated dispatcher

* Updated syspath in worker.py

* Updated path in worker.py

* Updated worker.py

* Removed reload in dispatcher

* Updating v1 library worker name

* Added dispatcher logs

* Added dispatcher try/catch logs

* Updated sys path

* Dispatcher and dependency manager updates

* Updated dispatcher and pyproject

* Testing updates and refactoring

* Bug fixes and refactoring

* Added more unit tests

* Added tests and fixed test setup

* Updated test_setup

* Updated test setup to add grpc dir copy

* build: proxy worker build & test setup (#1664)

* build: recognize collection_model_binding_data for batch inputs (#1655)

* add cmbd

* Add

* Add

* Rm newline

* Add tests

* Fix cmbd

* Fix test

* Lint

* Rm

* Rm

* Add back newline

* rm ws

* Rm list

* Rm cmbd from cache

* Avoid caching

* Keep cmbd check

* Add comment

* Lint

---------

Co-authored-by: Evan Roman <[email protected]>
Co-authored-by: hallvictoria <[email protected]>

* build: update Python Worker Version to 4.36.1 (#1660)

Co-authored-by: AzureFunctionsPython <[email protected]>

* initial changes

* Update Python SDK Version to 1.23.0 (#1663)

Co-authored-by: AzureFunctionsPython <[email protected]>

* merges from ADO

* merge fixes

* merge fixes

* merge fixes

* merge fixes

* don't run 313 unit tests yet

* changes for builds

---------

Co-authored-by: Evan <[email protected]>
Co-authored-by: Evan Roman <[email protected]>
Co-authored-by: AzureFunctionsPython <[email protected]>
Co-authored-by: AzureFunctionsPython <[email protected]>

* Merging changes

* linting fixes

* Addressed comments

* Updated unit test and added missing protos files

* fix e2e test reference

* lint, mypy, add 3.13 to unittests

* correct version check

* syntax

* syntax

* fix unit tests, mypy

* oops

* format

* lint

* fix unittest dir for proxy

* set env variable

* update pyproject to use real deps

* bump to a2

* bump v2 to a3

* Import v2 by default for LC

* Refactoring and minor fixes

---------

Co-authored-by: hallvictoria <[email protected]>
Co-authored-by: Evan <[email protected]>
Co-authored-by: Evan Roman <[email protected]>
Co-authored-by: AzureFunctionsPython <[email protected]>
Co-authored-by: AzureFunctionsPython <[email protected]>
Co-authored-by: hallvictoria <[email protected]>
Co-authored-by: Victoria Hall <[email protected]>