Commit 16cc77e

Merge branch 'master' into json_improve
2 parents 1722437 + a4b0132 commit 16cc77e

21 files changed: +259 -113 lines changed

doc/source/user_guide/io.rst

+2
@@ -28,6 +28,7 @@ The pandas I/O API is a set of top level ``reader`` functions accessed like
     :delim: ;
 
     text;`CSV <https://en.wikipedia.org/wiki/Comma-separated_values>`__;:ref:`read_csv<io.read_csv_table>`;:ref:`to_csv<io.store_in_csv>`
+    text;`TXT <https://www.oracle.com/webfolder/technetwork/data-quality/edqhelp/Content/introduction/getting_started/configuring_fixed_width_text_file_formats.htm>`__;:ref:`read_fwf<io.fwf_reader>`
     text;`JSON <https://www.json.org/>`__;:ref:`read_json<io.json_reader>`;:ref:`to_json<io.json_writer>`
     text;`HTML <https://en.wikipedia.org/wiki/HTML>`__;:ref:`read_html<io.read_html>`;:ref:`to_html<io.html>`
     text; Local clipboard;:ref:`read_clipboard<io.clipboard>`;:ref:`to_clipboard<io.clipboard>`
@@ -1372,6 +1373,7 @@ should pass the ``escapechar`` option:
    print(data)
    pd.read_csv(StringIO(data), escapechar='\\')
 
+.. _io.fwf_reader:
 .. _io.fwf:
 
 Files with fixed width columns

doc/source/whatsnew/v0.25.1.rst

+10 -3
@@ -53,7 +53,7 @@ Numeric
 ^^^^^^^
 - Bug in :meth:`Series.interpolate` when using a timezone aware :class:`DatetimeIndex` (:issue:`27548`)
 - Bug when printing negative floating point complex numbers would raise an ``IndexError`` (:issue:`27484`)
--
+- Bug where :class:`DataFrame` arithmetic operators such as :meth:`DataFrame.mul` with a :class:`Series` with axis=1 would raise an ``AttributeError`` on :class:`DataFrame` larger than the minimum threshold to invoke numexpr (:issue:`27636`)
 -
 
 Conversion
@@ -103,10 +103,9 @@ MultiIndex
 I/O
 ^^^
 
-- Fix bug in :meth:`io.json.json_normalize` when nested meta paths with a nested record path. (:issue:`27220`)
 - Avoid calling ``S3File.s3`` when reading parquet, as this was removed in s3fs version 0.3.0 (:issue:`27756`)
 - Better error message when a negative header is passed in :func:`pandas.read_csv` (:issue:`27779`)
--
+- Fix bug in :meth:`io.json.json_normalize` when nested meta paths with a nested record path. (:issue:`27220`)
 
 Plotting
 ^^^^^^^^
@@ -160,6 +159,14 @@ Other
 -
 -
 
+I/O and LZMA
+~~~~~~~~~~~~
+
+Some users may unknowingly have an incomplete Python installation, which lacks the `lzma` module from the standard library. In this case, `import pandas` failed due to an `ImportError` (:issue: `27575`).
+Pandas will now warn, rather than raising an `ImportError` if the `lzma` module is not present. Any subsequent attempt to use `lzma` methods will raise a `RuntimeError`.
+A possible fix for the lack of the `lzma` module is to ensure you have the necessary libraries and then re-install Python.
+For example, on MacOS installing Python with `pyenv` may lead to an incomplete Python installation due to unmet system dependencies at compilation time (like `xz`). Compilation will succeed, but Python might fail at run time. The issue can be solved by installing the necessary dependencies and then re-installing Python.
+
 .. _whatsnew_0.251.contributors:
 
 Contributors

doc/source/whatsnew/v0.7.3.rst

-6
@@ -25,8 +25,6 @@ New features
    from pandas.tools.plotting import scatter_matrix
    scatter_matrix(df, alpha=0.2) # noqa F821
 
-.. image:: ../savefig/scatter_matrix_kde.png
-   :width: 5in
 
 - Add ``stacked`` argument to Series and DataFrame's ``plot`` method for
   :ref:`stacked bar plots <visualization.barplot>`.
@@ -35,15 +33,11 @@ New features
 
    df.plot(kind='bar', stacked=True) # noqa F821
 
-.. image:: ../savefig/bar_plot_stacked_ex.png
-   :width: 4in
 
 .. code-block:: python
 
    df.plot(kind='barh', stacked=True) # noqa F821
 
-.. image:: ../savefig/barh_plot_stacked_ex.png
-   :width: 4in
 
 - Add log x and y :ref:`scaling options <visualization.basic>` to
   ``DataFrame.plot`` and ``Series.plot``

doc/source/whatsnew/v1.0.0.rst

+1 -1
@@ -158,7 +158,7 @@ MultiIndex
 I/O
 ^^^
 
--
+- :meth:`read_csv` now accepts binary mode file buffers when using the Python csv engine (:issue:`23779`)
 -
 
 Plotting

pandas/_libs/parsers.pyx

+5 -3
@@ -2,7 +2,6 @@
 # See LICENSE for the license
 import bz2
 import gzip
-import lzma
 import os
 import sys
 import time
@@ -59,9 +58,12 @@ from pandas.core.arrays import Categorical
 from pandas.core.dtypes.concat import union_categoricals
 import pandas.io.common as icom
 
+from pandas.compat import _import_lzma, _get_lzma_file
 from pandas.errors import (ParserError, DtypeWarning,
                            EmptyDataError, ParserWarning)
 
+lzma = _import_lzma()
+
 # Import CParserError as alias of ParserError for backwards compatibility.
 # Ultimately, we want to remove this import. See gh-12665 and gh-14479.
 CParserError = ParserError
@@ -645,9 +647,9 @@ cdef class TextReader:
                                      'zip file %s', str(zip_names))
             elif self.compression == 'xz':
                 if isinstance(source, str):
-                    source = lzma.LZMAFile(source, 'rb')
+                    source = _get_lzma_file(lzma)(source, 'rb')
                 else:
-                    source = lzma.LZMAFile(filename=source)
+                    source = _get_lzma_file(lzma)(filename=source)
             else:
                 raise ValueError('Unrecognized compression type: %s' %
                                  self.compression)

pandas/compat/__init__.py

+30
@@ -10,6 +10,7 @@
 import platform
 import struct
 import sys
+import warnings
 
 PY35 = sys.version_info[:2] == (3, 5)
 PY36 = sys.version_info >= (3, 6)
@@ -65,3 +66,32 @@ def is_platform_mac():
 
 def is_platform_32bit():
     return struct.calcsize("P") * 8 < 64
+
+
+def _import_lzma():
+    """Attempts to import lzma, warning the user when lzma is not available.
+    """
+    try:
+        import lzma
+
+        return lzma
+    except ImportError:
+        msg = (
+            "Could not import the lzma module. "
+            "Your installed Python is incomplete. "
+            "Attempting to use lzma compression will result in a RuntimeError."
+        )
+        warnings.warn(msg)
+
+
+def _get_lzma_file(lzma):
+    """Returns the lzma method LZMAFile when the module was correctly imported.
+    Otherwise, raises a RuntimeError.
+    """
+    if lzma is None:
+        raise RuntimeError(
+            "lzma module not available. "
+            "A Python re-install with the proper "
+            "dependencies might be required to solve this issue."
+        )
+    return lzma.LZMAFile
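
Note on the two helpers above: together they turn a hard import failure into a warning at import time and a `RuntimeError` at use time. The following is a minimal standalone sketch of that pattern, not the pandas code itself. The names `import_optional` and `get_lzma_file` and the module name `no_such_module_for_demo` are hypothetical, chosen so the missing-module path can be exercised on a complete Python installation.

```python
import importlib
import warnings


def import_optional(name):
    """Return the named module, or None (after a warning) if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError:
        warnings.warn(f"Could not import the {name} module.")
        return None


def get_lzma_file(lzma_module):
    """Return LZMAFile, or raise RuntimeError if the module was not imported."""
    if lzma_module is None:
        raise RuntimeError("lzma module not available.")
    return lzma_module.LZMAFile


# Import time: a missing module produces a warning instead of an ImportError.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    missing = import_optional("no_such_module_for_demo")  # hypothetical name

# Use time: the failure only surfaces when compression is actually requested.
try:
    get_lzma_file(missing)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
```

In this sketch the missing-module case is made explicit with `return None`; in the diff above, `_import_lzma` reaches the same result by falling off the end of the `except` branch.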

pandas/core/arrays/sparse.py

+6 -3
@@ -39,6 +39,7 @@
 )
 from pandas.core.dtypes.dtypes import register_extension_dtype
 from pandas.core.dtypes.generic import (
+    ABCDataFrame,
     ABCIndexClass,
     ABCSeries,
     ABCSparseArray,
@@ -1735,13 +1736,15 @@ def sparse_unary_method(self):
 
     @classmethod
     def _create_arithmetic_method(cls, op):
-        def sparse_arithmetic_method(self, other):
-            op_name = op.__name__
+        op_name = op.__name__
 
-            if isinstance(other, (ABCSeries, ABCIndexClass)):
+        def sparse_arithmetic_method(self, other):
+            if isinstance(other, (ABCDataFrame, ABCSeries, ABCIndexClass)):
                 # Rely on pandas to dispatch to us.
                 return NotImplemented
 
+            other = lib.item_from_zerodim(other)
+
             if isinstance(other, SparseArray):
                 return _sparse_array_op(self, other, op, op_name)
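
Note on the `return NotImplemented` branch above: it relies on Python's reflected-operand protocol, where the right-hand operand gets a chance to handle the operation once the left-hand one declines. The sketch below illustrates only that mechanism with two hypothetical stand-in classes, `Frame` and `Cell`. It does not reproduce how pandas dispatches internally.

```python
class Frame:
    """Stand-in for a container type that knows how to handle Cell operands."""

    def __init__(self, values):
        self.values = list(values)

    def __radd__(self, other):
        # Reached only after the left operand returned NotImplemented.
        return Frame(other.value + v for v in self.values)


class Cell:
    """Stand-in for an array-like type that defers to Frame."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Frame):
            # Defer: let Python try Frame.__radd__ instead.
            return NotImplemented
        if isinstance(other, Cell):
            return Cell(self.value + other.value)
        return Cell(self.value + other)


result = Cell(1) + Frame([10, 20])
```

Here `Cell.__add__` declines the `Frame` operand, so Python calls `Frame.__radd__` and the container decides how the operation is applied.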

pandas/core/computation/expressions.py

+4 -3
@@ -76,16 +76,17 @@ def _can_use_numexpr(op, op_str, a, b, dtype_check):
 
         # required min elements (otherwise we are adding overhead)
         if np.prod(a.shape) > _MIN_ELEMENTS:
-
             # check for dtype compatibility
             dtypes = set()
             for o in [a, b]:
-                if hasattr(o, "dtypes"):
+                # Series implements dtypes, check for dimension count as well
+                if hasattr(o, "dtypes") and o.ndim > 1:
                     s = o.dtypes.value_counts()
                     if len(s) > 1:
                         return False
                     dtypes |= set(s.index.astype(str))
-                elif isinstance(o, np.ndarray):
+                # ndarray and Series Case
+                elif hasattr(o, "dtype"):
                     dtypes |= {o.dtype.name}
 
             # allowed are a superset
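
Note on the changed branch conditions above: a `Series` exposes `dtypes` as a single dtype object, which has no `value_counts` method, so it has to take the `dtype` branch rather than the `dtypes` branch. The sketch below isolates that dtype-collection loop so it can be run on its own. The function name `collect_dtype_names` and the sample data are hypothetical, and returning `None` for mixed dtypes stands in for the `return False` in the original.

```python
import numpy as np
import pandas as pd


def collect_dtype_names(operands):
    """Return the set of dtype names, or None if a 2-D operand mixes dtypes."""
    names = set()
    for o in operands:
        # Only 2-D objects (DataFrame) have a per-column ``dtypes`` Series.
        if hasattr(o, "dtypes") and o.ndim > 1:
            counts = o.dtypes.value_counts()
            if len(counts) > 1:
                return None
            names |= set(counts.index.astype(str))
        # ndarray and Series both expose a single ``dtype``.
        elif hasattr(o, "dtype"):
            names |= {o.dtype.name}
    return names


frame = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
series = pd.Series([1.0, 2.0])
array = np.array([1, 2], dtype="int64")

# A Series' ``dtypes`` is one dtype object, not a Series of dtypes.
series_dtypes_is_single_dtype = not hasattr(series.dtypes, "value_counts")

homogeneous = collect_dtype_names([frame, series])
with_array = collect_dtype_names([frame, array])
mixed = collect_dtype_names([pd.DataFrame({"a": [1.0], "b": [1]}), series])
```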

pandas/core/frame.py

+1 -1
@@ -1190,7 +1190,7 @@ def to_numpy(self, dtype=None, copy=False):
         Parameters
         ----------
         dtype : str or numpy.dtype, optional
-            The dtype to pass to :meth:`numpy.asarray`
+            The dtype to pass to :meth:`numpy.asarray`.
         copy : bool, default False
             Whether to ensure that the returned value is a not a view on
             another array. Note that ``copy=False`` does not *ensure* that
