[CTIS QSF tools] Generate changelog from diffing codebook #1630
I trust you've run this and compared the output, e.g. by diffing the old changelog against the new one?
Yep. Differences are minimal. A few buggy lines in the changelog have been removed (e.g. the old version showed display logic as having changed if just the QID of the conditional question changed, even if all text, logic, and answer choices were the same). A few additional observations (on the order of 10) are included, since this version diffs a few extra fields. A few changelog lines have also changed type; "Matrix subquestion changed" specifically has been broken down into more specific categories. But most lines are the same.
Code looks good, and I'll trust that checking the generated codebook has covered most possible problems.
@krivard This is ready to merge.
Description
Currently, the survey changelog is generated by joining descriptive fields from the codebook onto QSF diffs. Since the codebook is meant to be a parsed version of the QSFs that matches the microdata, while the diffs are meant to report raw fields from the QSFs, a lot of processing is required to make the two compatible (e.g. some survey items are renamed in specific waves). As a result, the code that creates the survey changelog is complex, which makes it both hard to maintain and likely to contain bugs.
Instead, this PR generates the changelog by joining the codebook with itself and diffing pairs of columns (the old and new versions of a characteristic) for each observation.
Fix a few issues with the codebook, mostly related to display logic edge cases.
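The self-join approach described above can be sketched roughly as follows. This is only an illustration in Python, not the R code in this PR: the column names (`wave`, `variable`, `question`, `display_logic`, `response_options`), the change-type labels, and the `diff_codebook` helper are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical characteristics to compare between two waves of the codebook.
FIELDS = ["question", "display_logic", "response_options"]


def diff_codebook(
    codebook: List[Dict[str, Any]], old_wave: int, new_wave: int
) -> List[Dict[str, Any]]:
    """Join the codebook with itself on `variable` and diff old/new field pairs."""
    old = {row["variable"]: row for row in codebook if row["wave"] == old_wave}
    new = {row["variable"]: row for row in codebook if row["wave"] == new_wave}

    changes: List[Dict[str, Any]] = []
    # A full outer join: every variable present in either wave is considered.
    for variable in sorted(old.keys() | new.keys()):
        if variable not in old:
            changes.append({"variable": variable, "change_type": "item added"})
        elif variable not in new:
            changes.append({"variable": variable, "change_type": "item removed"})
        else:
            # Compare each (old, new) pair of the same characteristic.
            for field in FIELDS:
                if old[variable][field] != new[variable][field]:
                    changes.append({
                        "variable": variable,
                        "change_type": f"{field} changed",
                        "old": old[variable][field],
                        "new": new[variable][field],
                    })
    return changes


# Made-up example rows; not real survey content.
codebook = [
    {"wave": 1, "variable": "A1", "question": "How are you?",
     "display_logic": "", "response_options": "Yes;No"},
    {"wave": 1, "variable": "B2", "question": "Old item",
     "display_logic": "", "response_options": "Yes;No"},
    {"wave": 2, "variable": "A1", "question": "How are you today?",
     "display_logic": "", "response_options": "Yes;No"},
    {"wave": 2, "variable": "C3", "question": "New item",
     "display_logic": "", "response_options": "Yes;No"},
]

changes = diff_codebook(codebook, old_wave=1, new_wave=2)
for change in changes:
    print(change["variable"], change["change_type"])
```

Because both sides of the join come from the same parsed codebook, no renaming or reconciliation step is needed before comparing fields, which is the simplification this PR is after.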
Changelog
generate-codebook.R
generate-changelog.R