Create CITATION.md #32388


Closed
wants to merge 3 commits into from

ivalaginja
Contributor

Addresses #24036 by adding a CITATION.md file to the repository.

I copied the citation instructions from here: https://pandas.io/about/citing.html
and added a section for the published software on Zenodo. Following this recommendation, I did this for the latest released version (v1.0.1) with a note for the user to go fetch the citation from Zenodo for the version they are actually using.

Tagging @TomAugspurger and @jreback since you were active in the linked issue.

I still recommend updating the citation request on the pandas website directly (https://pandas.io/about/citing.html), as well as on the SciPy website (https://www.scipy.org/citing.html#pandas) and on https://pandas.pydata.org/, and maybe providing a <package>.__citation__ variable as suggested here.
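
For illustration, here is a minimal sketch of what such a `__citation__` variable might look like. It assumes a hypothetical package named `mypkg` with a placeholder version. Neither the name nor the attribute exists in pandas, and the BibTeX fields are placeholders rather than the real Zenodo record.

```python
import types

# Hypothetical stand-in for a package's top-level module. "mypkg" and its
# metadata are invented for illustration and are not part of pandas.
mypkg = types.ModuleType("mypkg")
mypkg.__version__ = "0.0.0"

# Build the BibTeX entry from the installed version so the citation
# always matches the release the user actually has.
mypkg.__citation__ = (
    "@software{mypkg,\n"
    "  author  = {The mypkg development team},\n"
    "  title   = {mypkg},\n"
    f"  version = {{{mypkg.__version__}}},\n"
    "}"
)

print(mypkg.__citation__)
```

Deriving the entry from `__version__` would avoid hard-coding a release number that goes stale with every new version.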

Contributor

TomAugspurger left a comment

Is CITATION.md a standard place for this information?

CITATION.md Outdated
- *pandas* version 1.0.1 published on Zenodo (please find us on Zenodo and replace with the citation for the version you are using)
```
@software{reback2020pandas,
author = {Jeff Reback and
Contributor

How was this list determined? I'd say perhaps keep it as "Wes McKinney and the pandas development team" or just "The pandas development team". Curious what others think though.

Contributor

yes this should be just the pandas development team

Contributor Author

This is how you guys released it on Zenodo. I will update it in the PR and if you wish, I think you can even go into Zenodo and adjust that so that people fetch it correctly when they get their BibTeX entry from there.

@ivalaginja
Contributor Author

> Is CITATION.md a standard place for this information?

Yes. Authors will look for citation instructions either on a website or in a CITATION file in the repository. The format doesn't matter (I just find Markdown neater than a plain text file). STScI has released a recommended style guide for software releases based on what the community usually uses, and it recommends including a CITATION file.

Member

datapythonista left a comment

Thanks for the work on this @ivalaginja

@pandas-dev/pandas-core a discussion we probably should have had earlier... What is our preferred way of citing pandas? The old website listed the two papers, and I just copied them to the new one, but instead of making users pick one at random, we should probably decide ourselves what the proper way to cite pandas is, and give them just one option.

Is this Zenodo thing the preferred way? Should we leave just that and drop the papers?

@@ -0,0 +1,49 @@
# Citing and logo
Member

We probably don't need the logo stuff in this file, just the citing part


## Citing pandas

If you use *pandas* for a scientific publication, we would appreciate citations to the published software and one of the two given papers:
Member

There are not two papers anymore; I think this comment is outdated.


If you use *pandas* for a scientific publication, we would appreciate citations to the published software and one of the two given papers:

- *pandas* version 1.0.1 published on Zenodo (please find us on Zenodo and replace with the citation for the version you are using)
Member

Probably worth having the link to Zenodo, so users can click and go there.

month = feb,
year = 2020,
publisher = {Zenodo},
version = {v1.0.1},
Member

Not sure, but maybe instead of 1.0.1 we could have something like REPLACE_BY_USED_VERSION? Or maybe we can simply remove the version and make things easier?

}
```

## Brand and logo
Member

I'd remove this section.

@datapythonista
Member

I don't see numpy, scipy or matplotlib implementing a CITATION.md file, and they are much more academic than us. I'd update the website and avoid duplication if this is not a standard in the scientific Python world.

@JoKeyser

JoKeyser commented Mar 6, 2020

The point is to make it easy for anyone searching for information on how to cite it properly. There doesn't seem to be a standard yet; adding this information to the README would also do the trick.

@ivalaginja
Contributor Author

ivalaginja commented Mar 6, 2020

I fully agree with @JoKeyser. This is not a standard yet, and yes, adding it to the README would do the same trick. The fact that numpy, SciPy et al. don't have something like this ranges from annoying to problematic; it is not an example to follow.

@datapythonista
Member

I don't think adding a CITATION.md file in pandas is going to create a standard. And I don't think it's great having to maintain two copies of the same information.

There is a "Citing pandas" link in the navigation bar of the website. If you want to add a note in the README linking to that page, since people may go to GitHub rather than the website to look for it, that sounds good. And if you want to improve the citation page, that's also welcome.

We can surely discuss it again if other projects are happy to adopt this as a standard, but I think at this stage this adds little value, and gives us extra work of maintaining duplicate information, and adds even more noise to our root directory.

@ivalaginja
Contributor Author

The whole point of this is not actually to create a CITATION file but to update the citation instructions in general, because there is no mention at all of the actual software citation. Adding that information to the README works just as well, but the website definitely needs to be updated.

And you're right, pandas adding a CITATION file wouldn't set a standard; it would make pandas follow one. And yes, it would not be necessary if the existing places were up to date.

@datapythonista
Member

Ok, that sounds good. Closing this, feel free to open a PR to update the existing citing page https://pandas.pydata.org/about/citing.html

Thanks!
