[WIP] Add ujson as alternative JSON encoder #130

Closed
wants to merge 15 commits into from

Conversation

tonybaloney
Contributor

@tonybaloney tonybaloney commented May 25, 2022

The standard library json module is among the slowest of the commonly used Python JSON encoders.

ujson is 10-20x faster at encoding and decoding, especially for large datasets.

This PR moves the json imports into a shim module, which picks the standard library implementation or ujson depending on whether:

  • The user has installed ujson
  • The user hasn't disabled it via an environment variable
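
A minimal sketch of how such a shim could look. This is not the PR's actual `_json.py`; the environment variable name `EXAMPLE_DISABLE_UJSON` and the helper `select_json_module` are hypothetical and chosen only for illustration.

```python
import importlib
import json as _stdlib_json
import os

# Hypothetical variable name; the PR text does not state the real one.
DISABLE_ENV_VAR = "EXAMPLE_DISABLE_UJSON"


def select_json_module(environ=None):
    """Return ujson when it is installed and not disabled, else stdlib json."""
    environ = os.environ if environ is None else environ
    # Condition 2: the user can opt out via an environment variable.
    if environ.get(DISABLE_ENV_VAR, "").lower() in ("1", "true", "yes"):
        return _stdlib_json
    # Condition 1: ujson is only used when it is importable.
    try:
        return importlib.import_module("ujson")
    except ImportError:
        return _stdlib_json


# The rest of the package would import dumps/loads from this module
# instead of importing json directly.
_impl = select_json_module()
dumps = _impl.dumps
loads = _impl.loads
```

Resolving the implementation once at import time keeps the per-call overhead at zero, at the cost of ignoring changes to the environment variable after the module is loaded.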

@tonybaloney tonybaloney changed the title [WIP] Add orjson as alternative JSON encoder [WIP] Add ujson as alternative JSON encoder May 25, 2022
@vrdmr
Member

vrdmr commented May 25, 2022

Any specific reason to choose ujson over orjson?

@tonybaloney
Contributor Author

Any specific reason to choose ujson over orjson?

Supporting StringifyEnum was impossible without using a fork of orjson. I tried the fork, but it was using old bindings for Python.

ujson supports custom type serialisation via a __json__ method in the class, which is going to be more performant. It's also more compatible with the standard library json module.
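
To illustrate the `__json__` hook mentioned above, here is a small sketch. The `Topping` class and the `encode` helper are hypothetical. The sketch assumes that ujson calls `__json__` on otherwise unsupported objects and embeds the returned string as raw JSON. When ujson is not installed, it falls back to emulating the hook through the standard library's `default` parameter.

```python
import json

try:
    import ujson  # optional dependency
except ImportError:
    ujson = None


class Topping:
    """Hypothetical custom type, used only for illustration."""

    def __init__(self, name):
        self.name = name

    def __json__(self):
        # Must return a string that is already valid JSON.
        return json.dumps({"topping": self.name})


def _stdlib_default(obj):
    # Stdlib json has no __json__ protocol, so emulate it via the default hook.
    if hasattr(obj, "__json__"):
        return json.loads(obj.__json__())
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


def encode(obj):
    """Serialise obj, honouring __json__ with either encoder."""
    if ujson is not None:
        return ujson.dumps(obj)
    return json.dumps(obj, default=_stdlib_default)


print(encode({"order": [Topping("Glazed")]}))
```

The exact whitespace of the output can differ between the two encoders, so comparisons are safer on the decoded value than on the raw string.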

@codecov

codecov bot commented May 25, 2022

Codecov Report

Attention: Patch coverage is 81.81818% with 10 lines in your changes missing coverage. Please review.

Project coverage is 85.79%. Comparing base (284c15d) to head (c090330).
Report is 141 commits behind head on dev.

Files with missing lines     Patch %   Lines
azure/functions/_json.py     72.97%    7 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #130      +/-   ##
==========================================
- Coverage   86.04%   85.79%   -0.26%     
==========================================
  Files          50       51       +1     
  Lines        2903     2922      +19     
  Branches      391      396       +5     
==========================================
+ Hits         2498     2507       +9     
- Misses        329      336       +7     
- Partials       76       79       +3     
Flag Coverage Δ
unittests 85.79% <81.81%> (-0.22%) ⬇️

Another variation

Move expression

remove braces

Change to underscore
@tonybaloney
Contributor Author

[Image: benchmark]

This is the benchmark between ujson (left) and json (right) for HttpRequest.get_json()
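
A rough way to run a similar comparison locally is sketched below. It assumes that the cost of `get_json()` is dominated by decoding the body bytes and parsing them. That is an assumption about the implementation, not something stated in this thread. The payload and the `time_decoder` helper are made up for illustration, and the timings will vary by machine.

```python
import json
import timeit

try:
    import ujson  # optional; only timed when installed
except ImportError:
    ujson = None

# Synthetic request body, loosely modelled on a small JSON document.
body = json.dumps(
    {
        "id": "0001",
        "type": "donut",
        "topping": [{"id": str(5000 + i), "type": "Glazed"} for i in range(100)],
    }
).encode("utf-8")


def time_decoder(loads, number=1000):
    """Total seconds to decode and parse the body `number` times."""
    return timeit.timeit(lambda: loads(body.decode("utf-8")), number=number)


results = {"json": time_decoder(json.loads)}
if ujson is not None:
    results["ujson"] = time_decoder(ujson.loads)

for name, seconds in results.items():
    print(f"{name}: {seconds:.4f}s for 1000 decodes")
```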

@tonybaloney
Contributor Author

I've deployed two Azure Functions in Australia East: one with this patch applied and one without.

The sample POST request is:

{
	"id": "0001",
	"type": "donut",
	"name": "Cake",
	"ppu": 0.55,
	"batters":
		{
			"batter":
				[
					{ "id": "1001", "type": "Regular" },
					{ "id": "1002", "type": "Chocolate" },
					{ "id": "1003", "type": "Blueberry" },
					{ "id": "1004", "type": "Devil's Food" }
				]
		},
	"topping":
		[
			{ "id": "5001", "type": "None" },
			{ "id": "5002", "type": "Glazed" },
			{ "id": "5005", "type": "Sugar" },
			{ "id": "5007", "type": "Powdered Sugar" },
			{ "id": "5006", "type": "Chocolate with Sprinkles" },
			{ "id": "5003", "type": "Chocolate" },
			{ "id": "5004", "type": "Maple" }
		]
}

The function source code is:

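import azure.functions as func
import json

def main(req: func.HttpRequest) -> func.HttpResponse:
    try:
        req_body = req.get_json()
    except ValueError:
        # Body was not valid JSON; returning here avoids using an unbound req_body below.
        return func.HttpResponse("Invalid JSON body", status_code=400)

    # Echo the parsed body back so both decoding and encoding are exercised.
    return func.HttpResponse(
        json.dumps(req_body),
        status_code=200
    )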

The script to test the two deployments:

$ ab -p test_data.json -T application/json -n 1000 -c 10 https://ant-functions-load-testing.azurewebsites.net/api/httptriggertest
$ ab -p test_data.json -T application/json -n 1000 -c 10 https://ant-functions-load-testing-og.azurewebsites.net/api/httptriggertest

The results are:

Response time (ms) by percentile:

Percentile           50%   66%   75%   80%   90%   95%   98%   99%
JSON                 114   119   125   128   150   217   345  2113
UJSON                111   116   118   121   126   131   145   175
Normalised JSON       44    49    55    58    80   147   275  2043
Normalised UJSON      41    46    48    51    56    61    75   105

I've subtracted 70ms as this was the mean connect time, so you can more clearly see the difference between the two branches.

Roughly 7% faster at the 50th percentile (44 ms vs 41 ms normalised), but importantly about 2.4x faster at the 95th percentile (147 ms vs 61 ms normalised).
(Ignore the 99th percentile, as it will include cold-start times.)
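
The normalisation and the per-percentile ratios can be recomputed from the raw figures reported above. The sketch below uses only those figures and the 70 ms mean connect time. The variable and function names are illustrative.

```python
PERCENTILES = [50, 66, 75, 80, 90, 95, 98, 99]
JSON_MS = [114, 119, 125, 128, 150, 217, 345, 2113]
UJSON_MS = [111, 116, 118, 121, 126, 131, 145, 175]
MEAN_CONNECT_MS = 70  # mean connect time subtracted from every measurement


def normalise(latencies, offset=MEAN_CONNECT_MS):
    """Subtract the fixed connect time from each latency."""
    return [value - offset for value in latencies]


norm_json = normalise(JSON_MS)
norm_ujson = normalise(UJSON_MS)

# Ratio > 1 means the ujson deployment was faster at that percentile.
ratios = {p: j / u for p, j, u in zip(PERCENTILES, norm_json, norm_ujson)}

for p in PERCENTILES:
    print(f"p{p}: json/ujson = {ratios[p]:.2f}")
```

Because the same constant is subtracted from both deployments, the ratios are larger than they would be on the raw latencies, so they are best read as an estimate of server-side time only.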

[Image: screenshot 2022-05-25 at 18 53 12]

@YunchuWang YunchuWang requested a review from pdthummar as a code owner October 18, 2022 15:18
@hallvictoria
Contributor

Closing in favor of #285
