Skip to content

Adding utils.deserialize #989

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 12 commits into from

Conversation

chekolyn
Copy link

Allows loading objects from json and yaml files.
See for feature request #977

Please review, this is my first pr.

Allows loading objects from json and yaml files.
See for feature request kubernetes-client#977
@k8s-ci-robot
Copy link
Contributor

Welcome @chekolyn!

It looks like this is your first PR to kubernetes-client/python 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-client/python has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 19, 2019
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 19, 2019
@micw523
Copy link
Contributor

micw523 commented Oct 19, 2019

Also, is it necessary to add several thousand lines of test cases? It’s very difficult to review like that...

And wait, what is this PR for? I think it was for deserializing only?

@chekolyn
Copy link
Author

Sorry for all the lines, i just wanted to add some json files generated via kubectl and exported to json via command line.
It seems that it did catch an issue with python2.7.
I looked into it and it seems the json.load is having issues with some json files; I have not found the root cause, but I have a hunch it might be unicode related.

@roycaihw
Copy link
Member

/assign

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: chekolyn
To complete the pull request process, please assign roycaihw
You can assign the PR to them by writing /assign @roycaihw in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@chekolyn
Copy link
Author

chekolyn commented Nov 2, 2019

Hey guys, please let me know what else is necessary for this PR. I've reduce the number of json tests and the number of json files. Thank you.

@chekolyn
Copy link
Author

chekolyn commented Nov 4, 2019

/assign @roycaihw

@chekolyn
Copy link
Author

@roycaihw Sorry to bug, but I was wondering how can we move this PR forward.
If size is of the PR is an issue; I can continue to delete some of the test json files, but that would reduce some of the test cases.

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 13, 2020
Copy link
Member

@roycaihw roycaihw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having methods that take a single object data (in json/yaml/dict) and the kind (klass), and deserialize the data into the API object is a good start.

In future if we see needs for 1. batch deserialization 2. automatically inferring the kind, we can expand the functionality.

resp_mock.data = json.dumps(data)

# Infer response_type from json data when not provided
if not klass:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having the caller to pass in the klass is the simplest and satisfies most use cases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll trying to target both use-cases. The one we pass klass and the auto-inferring. I will prioritize the first one. For auto-inferring I am taking inspiration from the watch implementation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the klass parameter as an optional one; if one is not given it will try to find the proper klass with the help of the response_type_from_dict function.

if "List" in data["kind"]:
obj_list = []
# Could be "List" or "Pod/Service/...List"
# This is a list type. iterate within its items
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think api_client.deserialize can handle list types, e.g. V1DeploymentList. Why do we need to break the list down?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this an excellent observation; I will explore this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think returning ObjectsList will work however there is one edge case where we need to account for multiple types of objects.
The following json can have multiple objects with different kinds.

kubectl get all --all-namespaces -o json > all.objs.json

I guess in this case we need to traverse the list items and extract the different kinds of objects then we can output a list of ObjectsList. I think that should be the only time we return a python list with multiple K8s ObjectsList.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kubectl get all sends multiple requests to the apiserver (you can see them with kubectl get all -v=6). This python client doesn't have a get all utility. If it does, for each request the response will be a single ObjectsList.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this situation I was thinking more of the json file output from kubectl "all.objs.json" more than the way kubectl interacts with the API; in this context I was trying to support loading such files (with multiple object kinds) via the load_from_json() function.

Please see this #977 (comment) for some additional info.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we can return ObjectsList when the input is indeed a single ObjectsList. ObjectsList's are standard API classes, and a user should be able to give a "ObjectsList" as the klass hint.

For the edge case with kubectl get all, I see it as a lower priority because:

  • kubectl get all is a broken command. The term all is misleading and what people think it does is usually wrong: kubectl get all does not list all resources in a namespace kubernetes/kubectl#151. Currently it is suggested to not use kubectl get all and use kubectl api-resources instead.
  • I haven't seen lots of requests for deserializing a list of objects with different kinds. And I think it can be broken down by iterating through the list.

I'm okay with adding a utility for deserializing all.objs.json, as long as we support the standard ObjectList kinds e.g. V1DeploymentList

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sound good @roycaihw I agree with the priorities you have outlined. Thank you for your insight.

if isinstance(obj, list):
objs.extend(obj)
else:
objs.append(obj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you justify the use case for deserializing a yaml list into a list of objects?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I can justify it if we can handle list type as you suggested above. I will explore that option first and this shall be removed if that is implemented.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reviewing this again, I think this is necessary to support yaml multi documents in one yaml file.

@roycaihw roycaihw added kind/feature Categorizes issue or PR as related to a new feature. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 14, 2020
fcn_to_call = "{0}{1}Api".format(group, version.capitalize())
k8s_api = getattr(client, fcn_to_call)
# Replace CamelCased action_type into snake_case
kind = data["kind"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dict can be either a list of objects or a object, e.g. the following two types of responses

$ kubectl get po -ojson
{
    "apiVersion": "v1",
    "items": [
        {
            "apiVersion": "v1",
            "kind": "Pod",
            "metadata": {
...

v.s.

$ kubectl get po nginx-deployment-54f57cf6bf-cs6t7 -ojson
{
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
...

in the first case, the lookup function should be list_namespaced_pod_with_http_info

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation has been changed to return k8s list objects as requested.

fnc_lookup = "read_namespaced_{0}_with_http_info".format(kind)
elif hasattr(k8s_api, "read_{0}_with_http_info".format(kind)):
fnc_lookup = "read_{0}_with_http_info".format(kind)
# Try with the create if no read fnc found
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any API that supports CREATE but doesn't support GET?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is for objects that don't have read_xxx functions; I believe the get_{0}_with_http_info would be more appropriate to avoid confusion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of an API that has create_{0}_with_http_info but doesn't have read_{0}_with_http_info/read_namespaced_{0}_with_http_info. Could you give an example?

@chekolyn
Copy link
Author

chekolyn commented Oct 8, 2020

Hello @bpereto,

I was doing some tests on my end, but I've only seen some marginal variances.

These are some of my results:

(k8s_python) sa@leela:~/bin$ time python k8s_deserialize.py 
Just read file
'read_file'  2007.61 ms
Read file and count items
'read_file'  2040.86 ms
'get_items_count'  2177.36 ms
Items count: 5775
Read file and use utils.load_from_dict
'read_file'  1991.78 ms
Lookup info: {'group': 'Core', 'version': 'v1', 'kind': 'list', 'items_kind': 'pod', 'fcn_to_call': 'CoreV1Api', 'fnc_lookup': 'list_namespaced_pod_with_http_info'}
Lookup function found: list_namespaced_pod_with_http_info in k8s_api: <class 'kubernetes.client.apis.core_v1_api.CoreV1Api'> response_type: V1PodList lookup info: {'group': 'Core', 'version': 'v1', 'kind': 'list', 'items_kind': 'pod', 'fcn_to_call': 'CoreV1Api', 'fnc_lookup': 'list_namespaced_pod_with_http_info'}
'get_objs'  17893.79 ms
Use k8s utils.load_from_json
Lookup info: {'group': 'Core', 'version': 'v1', 'kind': 'list', 'items_kind': 'pod', 'fcn_to_call': 'CoreV1Api', 'fnc_lookup': 'list_namespaced_pod_with_http_info'}
Lookup function found: list_namespaced_pod_with_http_info in k8s_api: <class 'kubernetes.client.apis.core_v1_api.CoreV1Api'> response_type: V1PodList lookup info: {'group': 'Core', 'version': 'v1', 'kind': 'list', 'items_kind': 'pod', 'fcn_to_call': 'CoreV1Api', 'fnc_lookup': 'list_namespaced_pod_with_http_info'}
'get_objs_from_utils'  18325.33 ms

real	0m42.901s
user	0m41.048s
sys	0m1.853s

I'm using this script to test it and using a json file with about 5K items.

#!/bin/bin/env python3
#
#
import json
import kubernetes.utils as utils
import time

file_path = "/home/sa/Documents/all.pods.json"


def timeit(method):
    def timed(*args, **kw):
        ts = time.time()
        result = method(*args, **kw)
        te = time.time()
        if 'log_time' in kw:
            name = kw.get('log_name', method.__name__.upper())
            kw['log_time'][name] = int((te - ts) * 1000)
        else:
            print('%r  %2.2f ms'% \
                  (method.__name__, (te - ts) * 1000)
                  )
        return result
    return timed

@timeit
def read_file(file):
    with open(file) as myfile:
        data = myfile.read()
        data = json.loads(data)

        return data

@timeit
def get_items_count(file):
    count=0

    data = read_file(file_path)
    for i in data.get("items"):
        count +=1

    return count

@timeit
def get_objs(file):
    data = read_file(file)

    obj = utils.load_from_dict(data, verbose=True)

    return obj

@timeit
def get_objs_from_utils(file):
    return utils.load_from_json(file_path, verbose=True)


print("Just read file")
read_file(file_path)

print("Read file and count items")
print("Items count: {}".format(get_items_count(file_path)))

print("Read file and use utils.load_from_dict")
obj = get_objs(file_path)

print("Use k8s utils.load_from_json")
obj = get_objs_from_utils(file_path)

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 6, 2021
@chekolyn chekolyn requested a review from roycaihw January 6, 2021 05:26
@chekolyn
Copy link
Author

chekolyn commented Jan 6, 2021

This PR is still awaiting to be reviewed, thank you.

@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 5, 2021
@bpereto
Copy link

bpereto commented Feb 10, 2021

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 10, 2021
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 11, 2021
@roycaihw
Copy link
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 11, 2021
@VoVAllen
Copy link

Still interested in this feature. Thanks for your contribution

@vnredmon
Copy link

vnredmon commented Sep 1, 2021

I'll echo the interest in this feature. It would be great to be able to move from a yaml file to k8s object smoothly. Thanks for all of your work @chekolyn!

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 30, 2021
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 30, 2021
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Copy link
Contributor

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jovan-absci
Copy link

Any plans to merge this? This is a really useful feature, thanks.

@smoke
Copy link

smoke commented Mar 8, 2024

This is pretty decent implementation, I hope PR will be re-opened and merged as this feature is quite needed.

For instance I want to define and provide a yaml template in file, defining all the Ops specific nits and bits like labels, namespace, service accounts, etc.
and from the python code to load the template, change few parts of it - e.g. some argument / command to be run and create / apply the final declaration.

@dboreham
Copy link

Also interested in this functionality, to avoid needless custom parsing of yaml snippets into Python classes in one of our projects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.