Skip to content

QST: #58930

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
2 tasks done
phan480 opened this issue Jun 5, 2024 · 3 comments
Open
2 tasks done

QST: #58930

phan480 opened this issue Jun 5, 2024 · 3 comments
Labels
Needs Triage Issue that has not been reviewed by a pandas team member Usage Question

Comments

@phan480
Copy link

phan480 commented Jun 5, 2024

Research

  • I have searched the [pandas] tag on StackOverflow for similar questions.

  • I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

Pandas json_normalize question

Question about pandas

I am trying to understand an issue I encounter with json_normalize. This may be a bug?
The following code:

import pandas
data = [{
        "id": 1039835,
        "status": "on_track",
        "development_plan": {
            "id": 23858,
            "items": [{
                    "id": 176498,
                    "item_name": "itemname 1"
                }, {
                    "id": 172450,
                    "item_name": "itemname 2"
                }, {
                    "id": 172451,
                    "item_name": "itemname 3"
                }
            ]
        },
        "user": {
            "id": 714067
        }
    }
]
print(
    pandas.json_normalize(data,
                          ['development_plan', 'items'],
                          [['development_plan', 'id'],['status'],['user', 'id'],['user']],
                          errors = 'ignore'
    )
)

Returns this dataframe:

   id   item_name development_plan.id    status user.id            user

0 176498 itemname 1 23858 on_track 23858 {'id': 714067}
1 172450 itemname 2 23858 on_track 23858 {'id': 714067}
2 172451 itemname 3 23858 on_track 23858 {'id': 714067}

The question is why the user.id is wrong? It returns the development_plan.id. Adding the complete user dictionary does return the correct user.id though.

Is this a bug or am I missing something?

@phan480 phan480 added Needs Triage Issue that has not been reviewed by a pandas team member Usage Question labels Jun 5, 2024
@RamMarthi9
Copy link

RamMarthi9 commented Jun 5, 2024

Hi phan480,

Use this piece of code: I think this should help u. Ping me if this is working.
import pandas as pd

data = [{
"id": 1039835,
"status": "on_track",
"development_plan": {
"id": 23858,
"items": [{
"id": 176498,
"item_name": "itemname 1"
}, {
"id": 172450,
"item_name": "itemname 2"
}, {
"id": 172451,
"item_name": "itemname 3"
}
]
},
"user": {
"id": 714067
}
}
]

Extract data

records = []
for entry in data:
status = entry['status']
user_id = entry['user']['id']
development_plan_id = entry['development_plan']['id']
for item in entry['development_plan']['items']:
record = {
'id': item['id'],
'item_name': item['item_name'],
'development_plan_id': development_plan_id,
'status': status,
'user_id': user_id
}
records.append(record)

Create DataFrame

df = pd.DataFrame(records)

print(df)

I got the following result:
id item_name development_plan_id status user_id
0 176498 itemname 1 23858 on_track 714067
1 172450 itemname 2 23858 on_track 714067
2 172451 itemname 3 23858 on_track 714067

@phan480
Copy link
Author

phan480 commented Jun 5, 2024

Thank you for your response, yes it works. But doesn't use the json_normalize functionality? I am using this function to parse a multitude of api's and their responses. I configure their record_path and meta data columns in configuration data.

I would like to avoid creating a unique way to parse every API.

@yuanx749
Copy link
Contributor

I believe there is a bug, and seems this pr #59225 can fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Triage Issue that has not been reviewed by a pandas team member Usage Question
Projects
None yet
Development

No branches or pull requests

3 participants