Skip to content

Consider storing vignette input data in package #72

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
brookslogan opened this issue May 9, 2022 · 4 comments · Fixed by #77
Closed

Consider storing vignette input data in package #72

brookslogan opened this issue May 9, 2022 · 4 comments · Fixed by #77
Assignees

Comments

@brookslogan
Copy link
Contributor

@dajmcdon raised the possibility of replacing API queries in the vignettes and examples to internal data, in order to speed up vignette building and help with CI; the fetching code could be left in with an eval=FALSE. The drawback is potential license-related questions: what LICENSE do we assign the package if it mixes MIT-licensed code and CC/other-licensed data?

@brookslogan
Copy link
Contributor Author

Another factor to consider here: repository and package size. We shouldn't store massive data objects here, to keep git operations responsive and fully-featureful (merge operations can degrade in the presence of massive data objects), and to keep clones and installs quick. (Already the repository is 35MB, not instant to clone.)

@dajmcdon
Copy link
Contributor

dajmcdon commented May 9, 2022

I suspect that must be largely the docs/ right? I'm going to try in #71 to move that all to it's own branch. Though that won't help unless you do a shallow clone. Typical remotes::install_github() behaviour seems to be to actually download the repo (rather than clone), so this could be a problem.

All that said, I'm going to try to enforce the "no data bigger than 1MB" rule.

@brookslogan
Copy link
Contributor Author

brookslogan commented May 10, 2022

I would also guess that's the case. With the current main checked out, docs looks like it's 97.5% of the checked-out content.

I'm just skimming here [... I guess that was the wrong place, maybe here is more relevant, but it looks like remote_download.github_remote still attaches the ref so maybe the conclusion is the same?] and to me it looks like it should just download a HEAD snapshot from the GitHub API, so I would think that moving docs/ to its own branch would avoid slowdowns for remotes::install_github() (and maybe also merge slowness, but large default clones would remain a potential issue).

@brookslogan
Copy link
Contributor Author

I think we may need to keep some of the docs for basic help to work with install_github. Looks like 96% of the current (checked out snapshot) docs space is from pngs for the articles.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants