BUG: Allow empty chunksize in stata reader when using iterator #37302

Merged
merged 1 commit into from
Oct 23, 2020

Conversation

bashtage
Contributor

@bashtage bashtage commented Oct 21, 2020

@bashtage bashtage added the IO Stata (read_stata, to_stata) and Regression (functionality that used to work in a prior pandas version) labels Oct 21, 2020
"chunksize must be set to a positive integer to use as an iterator."
)
return self.read(nrows=self._chunksize or 1)
self._chunksize = 1 if self._chunksize is None else self._chunksize
Contributor

Is there validation when chunksize is passed in?

Contributor Author

It is not checked when set. I've refactored the code to not rely on chunksize being set to determine if an iterator is being used.
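
A minimal sketch of the idea described in this comment, not the actual pandas implementation. The `ToyReader` class and the `_using_iterator` flag name are hypothetical; the point is only that iterator use is tracked by its own flag instead of being inferred from whether `chunksize` was set.

```python
from typing import List, Optional


class ToyReader:
    """Hypothetical stand-in for a chunked file reader (not pandas' StataReader)."""

    def __init__(self, rows: List[int], chunksize: Optional[int] = None) -> None:
        self._rows = rows
        self._pos = 0
        # Assumption: an unset chunksize falls back to one row per step.
        self._chunksize = 1 if chunksize is None else chunksize
        # Iterator use is tracked explicitly, independent of chunksize.
        self._using_iterator = False

    def read(self, nrows: Optional[int] = None) -> List[int]:
        if self._pos >= len(self._rows):
            # Exhausted: end iteration if iterating, otherwise return nothing.
            if self._using_iterator:
                raise StopIteration
            return []
        stop = len(self._rows) if nrows is None else self._pos + nrows
        chunk = self._rows[self._pos:stop]
        self._pos += len(chunk)
        return chunk

    def __iter__(self) -> "ToyReader":
        return self

    def __next__(self) -> List[int]:
        # Iterating sets the flag; no chunksize check is needed here.
        self._using_iterator = True
        return self.read(nrows=self._chunksize)
```

With this sketch, `list(ToyReader([1, 2, 3]))` yields one row per step even though no chunksize was given, while a plain `read()` call still returns all remaining rows.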

Contributor

What I mean is: what if someone passes chunksize <= 0? Do we raise appropriately?

Contributor Author

Yes, there is explicit validation to ensure that it is an int that is >0

pandas/pandas/io/stata.py, lines 1049 to 1052 in b2ea05f:

    if self._chunksize is None:
        self._chunksize = 1
    elif not isinstance(chunksize, int) or chunksize <= 0:
        raise ValueError("chunksize must be a positive integer when set.")
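
The check quoted above can be exercised in isolation. The following is a sketch only: `normalize_chunksize` is a hypothetical standalone helper, not a pandas function, and it simply mirrors the quoted condition and error message.

```python
from typing import Optional


def normalize_chunksize(chunksize: Optional[int]) -> int:
    """Hypothetical helper mirroring the quoted check (not part of pandas)."""
    if chunksize is None:
        # An unset chunksize defaults to one row per step.
        return 1
    if not isinstance(chunksize, int) or chunksize <= 0:
        raise ValueError("chunksize must be a positive integer when set.")
    return chunksize


# None falls back to 1, a positive int passes through, anything else raises.
for candidate in (None, 5, 0, -1, 2.5):
    try:
        print(candidate, "->", normalize_chunksize(candidate))
    except ValueError as err:
        print(candidate, "-> ValueError:", err)
```

Because the `isinstance` test comes first in the `or`, non-integer values such as `2.5` are rejected before the `<= 0` comparison is evaluated.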

@jreback
Contributor

jreback commented Oct 21, 2020

Can you add a what's new note for the precision loss change (1.1.4)?

@jreback jreback added this to the 1.1.4 milestone Oct 21, 2020
@jreback
Contributor

jreback commented Oct 21, 2020

also pls merge master

@bashtage
Contributor Author

I rebased on master and added a what's new.

@pep8speaks

pep8speaks commented Oct 21, 2020

Hello @bashtage! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-10-23 06:22:14 UTC

@jreback
Contributor

jreback commented Oct 22, 2020

small conflict and pls rebase

Remove error message incorrectly added
Fixed new issues identified by mypy
Add test to ensure conversion of large ints is correct

closes pandas-dev#37280
@jreback jreback merged commit 901b1a7 into pandas-dev:master Oct 23, 2020
@jreback
Contributor

jreback commented Oct 23, 2020

thanks @bashtage

@simonjayhawkins
Member

@meeseeksdev backport 1.1.x

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Oct 23, 2020
simonjayhawkins added a commit that referenced this pull request Oct 24, 2020
…ta reader when using iterator) (#37364)

* Backport PR #37302: BUG: Allow empty chunksize in stata reader when using iterator

* remove match argument to assert_produces_warning

Co-authored-by: Kevin Sheppard
Co-authored-by: Simon Hawkins
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Oct 26, 2020
kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request Nov 2, 2020
Labels
IO Stata (read_stata, to_stata); Regression (functionality that used to work in a prior pandas version)
Development

Successfully merging this pull request may close these issues.

BUG: regression: pandas.read_stata(filename, iterator=True) raises ValueError
4 participants