Commit 48dab82

PERF: Avoid fragmentation of DataFrame in read_sas (pandas-dev#48603)
* PERF: Avoid fragmentation of DataFrame in read_sas
* Add whatsnew
* Add warning
1 parent c3571e6 commit 48dab82

2 files changed: +4 -3 lines changed

doc/source/whatsnew/v1.6.0.rst

+1 -1
@@ -219,7 +219,7 @@ MultiIndex
 
 I/O
 ^^^
--
+- Bug in :func:`read_sas` caused fragmentation of :class:`DataFrame` and raised :class:`.errors.PerformanceWarning` (:issue:`48595`)
 -
 
 Period

pandas/io/sas/sas_xport.py

+3 -2
@@ -481,7 +481,7 @@ def read(self, nrows: int | None = None) -> pd.DataFrame:
         raw = self.filepath_or_buffer.read(read_len)
         data = np.frombuffer(raw, dtype=self._dtype, count=read_lines)
 
-        df = pd.DataFrame(index=range(read_lines))
+        df_data = {}
         for j, x in enumerate(self.columns):
             vec = data["s" + str(j)]
             ntype = self.fields[j]["ntype"]
@@ -496,7 +496,8 @@ def read(self, nrows: int | None = None) -> pd.DataFrame:
                 if self._encoding is not None:
                     v = [y.decode(self._encoding) for y in v]
 
-            df[x] = v
+            df_data.update({x: v})
+        df = pd.DataFrame(df_data)
 
         if self._index is None:
             df.index = pd.Index(range(self._lines_read, self._lines_read + read_lines))
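
The change in `read` replaces repeated column assignment on an existing frame with a plain dict that is converted to a DataFrame once. Below is a minimal sketch of the two patterns, assuming synthetic float columns named like the `"s" + str(j)` fields in the diff. The helper names `build_columnwise` and `build_from_dict` are hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd


def build_columnwise(n_rows: int, n_cols: int) -> pd.DataFrame:
    """Old pattern: grow an existing frame one column at a time.

    Each assignment inserts a new column into the frame, which can leave
    it fragmented when the number of columns is large.
    """
    df = pd.DataFrame(index=range(n_rows))
    for j in range(n_cols):
        df["s" + str(j)] = np.arange(n_rows, dtype="float64")
    return df


def build_from_dict(n_rows: int, n_cols: int) -> pd.DataFrame:
    """New pattern: collect columns in a dict, then build the frame once."""
    df_data = {}
    for j in range(n_cols):
        df_data["s" + str(j)] = np.arange(n_rows, dtype="float64")
    return pd.DataFrame(df_data)


if __name__ == "__main__":
    old = build_columnwise(4, 3)
    new = build_from_dict(4, 3)
    # Both approaches are expected to hold the same labels and values.
    print(list(old.columns) == list(new.columns))
    print(np.array_equal(old.to_numpy(), new.to_numpy()))
```

Both helpers should produce the same column labels and values. The difference is only in how the frame is assembled, which is why the diff can swap one for the other without changing what `read` returns.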
