Update citation webpage #33311
I'm not in academia, so not an expert in citations. But I'd say users arriving at this page would like to see how to cite pandas (a single way), or, if there is more than one, when to use each. @wesm I guess you're the right person to ask. How should pandas be cited? |
There's a failing check on the docs that I don't fully understand. If someone can help me figure it out, I can go in and fix it (or feel free to take that over). |
The error is unrelated to this PR; it's being addressed in #33309 |
Here's the canonical citation: https://conference.scipy.org/proceedings/scipy2010/mckinney.html DOI 10.25080/Majora-92bf1922-00a BibTeX
|
Thanks @wesm. @ivalaginja, can you please update the page to contain only the citation mentioned by Wes? Thanks! |
I'm not a researcher, so I may be missing something obvious. But are users expected to cite both the paper and the Zenodo reference? Or will they cite one or the other depending on the context? It feels like Wes's paper should be enough; not sure what I'm missing. |
The whole point of this PR and the issue it addresses is that the correct way to cite software is to cite the software directly. From the perspective of "proper" citation, this is enough. However, the most merit in the academic world is gained by citations to papers, so it is usual to cite both if both are available, if the authors voice that wish. So the bare minimum is to cite the software alone; the usual case that satisfies this minimum is to cite both, in order to also provide academic credit. The case where only a paper is cited and the software product itself is not, is a faulty citation, unless the software never got published properly and hence does not have an archived identifier to be cited with. |
I don't have a strong position. I have seen academic papers that have failed to cite pandas beyond a link to the project website so it would be good for it to be clear when people go looking for a citation (they may not, though) |
I have seen that a lot too and, unfortunately, it keeps happening. Part of it comes from the fact that writing software is not valued as highly in academia as writing and publishing a paper. For licensed tools that users pay for, that is usually not an issue, as the license holders get money and mostly don't care about academic credit. In recent years, though, there has been a growing shift to free and open-source tools, especially with Python. Most people simply don't know about the difference between making software publicly available (e.g. putting it on GitHub) and publishing their software in an archive that creates a permanent record (e.g. Zenodo). The interplay between people not knowing how to cite software because they haven't seen it done enough yet, and not knowing how to publish their own software products in order to make them citable, leads to an embarrassingly high number of papers that "cite" software by providing URLs (non-permanent by definition). It's a very interesting topic if you're into software publication and a great showcase of the rigidity of the academic world :) |
This is why it certainly helps to make citation instructions clearer. I didn't change the formulation on the pandas citation page because my intent was to get the software citation in there, not to change what the pandas team expects from its users, even if there is a strong recommendation. If you want, I can change the part "we would appreciate citations to the published software and the following paper" to "we would kindly ask you to cite both the software and the following paper". |
Not in academia, so not an expert, but I see that almost every project in the ecosystem presents a paper to cite. A couple of cases offer a citation to the website, and only matplotlib has some Zenodo badges.
- https://docs.dask.org/en/latest/cite.html
- https://scipy.org/citing.html
- https://www.astropy.org/acknowledging.html
- https://scikit-learn.org/stable/about.html#citing-scikit-learn
- https://matplotlib.org/3.2.1/citing.html
- https://docs.bokeh.org/en/1.0.1/docs/citation.html
@TomAugspurger I saw your comments in the issue, are we happy to have a Zenodo badge that has to be updated at every release? Is it worth the effort? If we want it, do we want it in the README, in the citing page, or in both? @jorisvandenbossche I think you've been in academia, any opinion here? |
It's been a while since I read through the issue, but I thought there was a way to get a generic Zenodo link for pandas, regardless of the version.
|
Yes there is. To quote from the link in the PR description: Do you want me to drop in the general Zenodo citation instead? That would mean that you don't need to update this with every release. |
I would have the version-independent Zenodo badge in the README, and also a link to the version-independent one after the BibTeX entry that Wes sent. Does that make sense? |
It does! And I will also leave a note in there to encourage people to go find the version-specific BibTeX entry, but if they use the general one, nothing is lost. Sounds like a good plan! |
Sorry, I think there has been some misunderstanding. What I understood was that we were going to provide only the BibTeX reference of the pandas paper, like all the other projects in the community, and then add the SVG badge both in the README and after the paper's BibTeX. The current proposal in this PR asks people to add two references to pandas, which seems unnecessary, and I don't expect people to do it. I wouldn't be creative, and I'd follow what the rest of the community does: just give the paper to cite. And if some people find the Zenodo thing useful, we can surely have the badge. But I'd avoid confusion about the way we encourage people to cite pandas. |
In that case, I will retract my PR, as this is completely missing the point of it, and let you take over on how to deal with your citations. Just to reiterate, my suggestion was to move away from the poor standard practice of never mentioning the software directly. I am confident that people who are capable of writing a paper are also capable of following simple instructions on what they need to cite if they find the authors' request. The submitted PR is not about being creative - I did not come up with this myself (see sources below). And the "zenodo thing" does have a purpose beyond providing an additional pretty badge. I don't want to waste your time further if you don't see the benefit in this. Please do reach out in the future if you'd like to continue the discussion. Note that I was trying to motivate the team to consider following the recommendations put forward by the FORCE11 Software Citation Working Group, which they published, among other resources, in their paper presenting software citation principles. The particular section I am referring to is about citing the software product directly, in addition to software papers (see section 6.2):
I will point out that I am aware that these are a set of recommendations, but I also highly recommend diving a little deeper into the matter to see how all of these things work together. In the end, it is you (the team) who decide whether you would like to be cited properly or not, and whatever you put forward, an author should adhere to. From my personal experience, the confusion usually arises when people try to follow these principles but packages do not provide a native software citation they could use. |
Merging this. Thanks @ivalaginja, this is a net improvement, especially if it reduces friction for citers. |
Follow-up of #32388, addressing #24036
I will leave it to the pandas team to decide whether to put in a BibTeX entry with the concept DOI or with a specific version; some options for dealing with this are described in this comment.
Note that this comment thread on the previous PR asked to replace the author list with "The pandas development team". However, if users go to Zenodo to get the correct BibTeX entry for the version they're actually using, their citation will contain the full author list provided in Zenodo.