-
Notifications
You must be signed in to change notification settings - Fork 3.3k
Adding utils.deserialize #989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Allows loading objects from json and yaml files. See for feature request kubernetes-client#977
Welcome @chekolyn! |
Also, is it necessary to add several thousand lines of test cases? It’s very difficult to review like that... And wait, what is this PR for? I think it was for deserializing only? |
Sorry for all the lines, i just wanted to add some json files generated via kubectl and exported to json via command line. |
Adding klass parameter to explicitly set object class name.
/assign |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: chekolyn The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hey guys, please let me know what else is necessary for this PR. I've reduce the number of json tests and the number of json files. Thank you. |
/assign @roycaihw |
@roycaihw Sorry to bug, but I was wondering how can we move this PR forward. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having methods that take a single object data (in json/yaml/dict) and the kind (klass), and deserialize the data into the API object is a good start.
In future if we see needs for 1. batch deserialization 2. automatically inferring the kind, we can expand the functionality.
resp_mock.data = json.dumps(data) | ||
|
||
# Infer response_type from json data when not provided | ||
if not klass: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having the caller to pass in the klass is the simplest and satisfies most use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll trying to target both use-cases. The one we pass klass and the auto-inferring. I will prioritize the first one. For auto-inferring I am taking inspiration from the watch implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the klass parameter as an optional one; if one is not given it will try to find the proper klass with the help of the response_type_from_dict function.
kubernetes/utils/deserialize.py
Outdated
if "List" in data["kind"]: | ||
obj_list = [] | ||
# Could be "List" or "Pod/Service/...List" | ||
# This is a list type. iterate within its items |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think api_client.deserialize can handle list types, e.g. V1DeploymentList
. Why do we need to break the list down?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this an excellent observation; I will explore this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think returning ObjectsList will work however there is one edge case where we need to account for multiple types of objects.
The following json can have multiple objects with different kinds.
kubectl get all --all-namespaces -o json > all.objs.json
I guess in this case we need to traverse the list items and extract the different kinds of objects then we can output a list of ObjectsList. I think that should be the only time we return a python list with multiple K8s ObjectsList.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubectl get all
sends multiple requests to the apiserver (you can see them with kubectl get all -v=6
). This python client doesn't have a get all
utility. If it does, for each request the response will be a single ObjectsList.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in this situation I was thinking more of the json file output from kubectl "all.objs.json" more than the way kubectl interacts with the API; in this context I was trying to support loading such files (with multiple object kinds) via the load_from_json() function.
Please see this #977 (comment) for some additional info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we can return ObjectsList when the input is indeed a single ObjectsList. ObjectsList's are standard API classes, and a user should be able to give a "ObjectsList" as the klass hint.
For the edge case with kubectl get all
, I see it as a lower priority because:
kubectl get all
is a broken command. The termall
is misleading and what people think it does is usually wrong: kubectl get all does not list all resources in a namespace kubernetes/kubectl#151. Currently it is suggested to not usekubectl get all
and usekubectl api-resources
instead.- I haven't seen lots of requests for deserializing a list of objects with different kinds. And I think it can be broken down by iterating through the list.
I'm okay with adding a utility for deserializing all.objs.json
, as long as we support the standard ObjectList kinds e.g. V1DeploymentList
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sound good @roycaihw I agree with the priorities you have outlined. Thank you for your insight.
if isinstance(obj, list): | ||
objs.extend(obj) | ||
else: | ||
objs.append(obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you justify the use case for deserializing a yaml list into a list of objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I can justify it if we can handle list type as you suggested above. I will explore that option first and this shall be removed if that is implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reviewing this again, I think this is necessary to support yaml multi documents in one yaml file.
fcn_to_call = "{0}{1}Api".format(group, version.capitalize()) | ||
k8s_api = getattr(client, fcn_to_call) | ||
# Replace CamelCased action_type into snake_case | ||
kind = data["kind"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dict can be either a list of objects or a object, e.g. the following two types of responses
$ kubectl get po -ojson
{
"apiVersion": "v1",
"items": [
{
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
...
v.s.
$ kubectl get po nginx-deployment-54f57cf6bf-cs6t7 -ojson
{
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
...
in the first case, the lookup function should be list_namespaced_pod_with_http_info
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation has been changed to return k8s list objects as requested.
kubernetes/utils/deserialize.py
Outdated
fnc_lookup = "read_namespaced_{0}_with_http_info".format(kind) | ||
elif hasattr(k8s_api, "read_{0}_with_http_info".format(kind)): | ||
fnc_lookup = "read_{0}_with_http_info".format(kind) | ||
# Try with the create if no read fnc found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any API that supports CREATE but doesn't support GET?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is for objects that don't have read_xxx functions; I believe the get_{0}_with_http_info would be more appropriate to avoid confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't think of an API that has create_{0}_with_http_info
but doesn't have read_{0}_with_http_info
/read_namespaced_{0}_with_http_info
. Could you give an example?
Hello @bpereto, I was doing some tests on my end, but I've only seen some marginal variances. These are some of my results:
I'm using this script to test it and using a json file with about 5K items. #!/bin/bin/env python3
#
#
import json
import kubernetes.utils as utils
import time
file_path = "/home/sa/Documents/all.pods.json"
def timeit(method):
def timed(*args, **kw):
ts = time.time()
result = method(*args, **kw)
te = time.time()
if 'log_time' in kw:
name = kw.get('log_name', method.__name__.upper())
kw['log_time'][name] = int((te - ts) * 1000)
else:
print('%r %2.2f ms'% \
(method.__name__, (te - ts) * 1000)
)
return result
return timed
@timeit
def read_file(file):
with open(file) as myfile:
data = myfile.read()
data = json.loads(data)
return data
@timeit
def get_items_count(file):
count=0
data = read_file(file_path)
for i in data.get("items"):
count +=1
return count
@timeit
def get_objs(file):
data = read_file(file)
obj = utils.load_from_dict(data, verbose=True)
return obj
@timeit
def get_objs_from_utils(file):
return utils.load_from_json(file_path, verbose=True)
print("Just read file")
read_file(file_path)
print("Read file and count items")
print("Items count: {}".format(get_items_count(file_path)))
print("Read file and use utils.load_from_dict")
obj = get_objs(file_path)
print("Use k8s utils.load_from_json")
obj = get_objs_from_utils(file_path) |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
This PR is still awaiting to be reviewed, thank you. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle rotten |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-contributor-experience at kubernetes/community. |
/remove-lifecycle stale |
Still interested in this feature. Thanks for your contribution |
I'll echo the interest in this feature. It would be great to be able to move from a yaml file to k8s object smoothly. Thanks for all of your work @chekolyn! |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close |
@k8s-triage-robot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Any plans to merge this? This is a really useful feature, thanks. |
This is pretty decent implementation, I hope PR will be re-opened and merged as this feature is quite needed. For instance I want to define and provide a yaml template in file, defining all the Ops specific nits and bits like labels, namespace, service accounts, etc. |
Also interested in this functionality, to avoid needless custom parsing of yaml snippets into Python classes in one of our projects. |
Allows loading objects from json and yaml files.
See for feature request #977
Please review, this is my first pr.