-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Add dtype argument to read_sql_query (GH10285) #37546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@jorisvandenbossche tagging you since you were involved in the original issue. Could you have a look at this imp and lmk whether this is what you had in mind? |
@jorisvandenbossche could you (or maybe someone else) have a look at this PR? Thanks |
@jreback could you (or maybe someone else) have a look at this PR? Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls always add tests. w/o them its impossible to tell if this does anything
5258da9
to
bcef60e
Compare
Updated PR. Only Linux py38_np_dev fails, but due to other reasons. |
Merged master, everything succeeds. I also saw a similar enhancement request for dtypes for |
@jreback could you have another look? Will fix the merge conflict in the docs later today |
pandas/io/sql.py
Outdated
@@ -295,6 +301,7 @@ def read_sql_query( | |||
params=None, | |||
parse_dates=None, | |||
chunksize: None = None, | |||
dtype: Optional[Union[Dtype, Dict[str, Dtype]]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this type actually is pretty usefule, can you define in _typing, call it DtypeOrDictDtype / DtypeTable and add a comment about it. cc @simonjayhawkins @WillAyd @jorisvandenbossche for the name here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dtype: Optional[Union[Dtype, Dict[str, Dtype]]] = None, | |
dtype: Optional[Union[Dtype, Dict[Label, Dtype]]] = None, |
Maybe DtypeArg
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, added DtypeArg
to _typing
pandas/io/sql.py
Outdated
index_col=None, | ||
coerce_float=True, | ||
parse_dates=None, | ||
dtype=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you type anywhere you are adding type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
doc/source/whatsnew/v1.2.0.rst
Outdated
@@ -307,6 +307,7 @@ Other enhancements | |||
- Improve numerical stability for :meth:`.Rolling.skew`, :meth:`.Rolling.kurt`, :meth:`Expanding.skew` and :meth:`Expanding.kurt` through implementation of Kahan summation (:issue:`6929`) | |||
- Improved error reporting for subsetting columns of a :class:`.DataFrameGroupBy` with ``axis=1`` (:issue:`37725`) | |||
- Implement method ``cross`` for :meth:`DataFrame.merge` and :meth:`DataFrame.join` (:issue:`5401`) | |||
- :func:`pandas.read_sql_query` now accepts a ``dtype`` argument to cast the columnar data from the SQL database based on user input (:issue:`10285`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to 1.3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
] | ||
DtypeArg = Optional[Union[Dtype, Dict[Label, Dtype]]] | ||
DtypeObj = Union[np.dtype, "ExtensionDtype"] | ||
|
||
# For functions like rename that convert one label to another |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the dtype block since Label
was defined later in the file
In #13049 it is proposed to add the |
pandas/_typing.py
Outdated
Dtype = Union[ | ||
"ExtensionDtype", str, np.dtype, Type[Union[str, float, int, complex, bool, object]] | ||
] | ||
DtypeArg = Optional[Union[Dtype, Dict[Label, Dtype]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use Optional in the spec itself
pandas/io/sql.py
Outdated
@@ -132,10 +133,14 @@ def _wrap_result( | |||
index_col=None, | |||
coerce_float: bool = True, | |||
parse_dates=None, | |||
dtype: DtypeArg = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional[DtypeArg] for all of these
@@ -100,6 +93,14 @@ | |||
JSONSerializable = Optional[Union[PythonScalar, List, Dict]] | |||
Axes = Collection | |||
|
|||
# dtypes | |||
|
|||
Dtype = Union[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment on what DtypeArg is / supposed to be used
@@ -1361,6 +1376,9 @@ def read_query( | |||
chunksize : int, default None | |||
If specified, return an iterator where `chunksize` is the number | |||
of rows to include in each chunk. | |||
dtype : Type name or dict of columns | |||
Data type for data or columns. E.g. np.float64 or | |||
{‘a’: np.float64, ‘b’: np.int32, ‘c’: ‘Int64’} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
versionadded 1.3
pandas/tests/io/test_sql.py
Outdated
result = sql.read_sql_query( | ||
"SELECT SepalLength, SepalWidth FROM iris", self.conn, dtype=dtype | ||
) | ||
assert result.dtypes.to_dict() == { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you constructed an expected frame and use tm.assert_frame_equal
lgtm ping on greenish |
as a followup :-> can type any |
@jreback CI is greenish,Travis failed due to other reasons Will work on dtype follow up tomorrow. |
@@ -371,6 +379,9 @@ def read_sql_query( | |||
chunksize : int, default None | |||
If specified, return an iterator where `chunksize` is the number of | |||
rows to include in each chunk. | |||
dtype : Type name or dict of columns | |||
Data type for data or columns. E.g. np.float64 or | |||
{‘a’: np.float64, ‘b’: np.int32, ‘c’: ‘Int64’} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a versionadded 1.3 here. ok to add in next PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I see i didn't commit that change. But will indeed add it to the follow on
thanks @avinashpancham small comment for followon |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff