When read_fwf is called with both iterator=True and skiprows=[list], it does not skip all of the rows in the skiprows list. Each argument works properly when used on its own.
Here is a simple bit of code to reproduce:
import pandas as pd

# Create a fixed width file to test with.
df = pd.DataFrame({'a': range(10)})
with open('testfwf.txt', 'w') as f:
    f.write(df.to_string(index=False, header=False))

rows_to_skip = [0, 1, 2, 6, 9]
df_iter = pd.read_fwf('testfwf.txt', colspecs=[(0, 2)], names=['a'], iterator=True,
                      chunksize=2, skiprows=rows_to_skip)
print('The fixed width file in chunks with rows [0,1,2,6,9] skipped: ')
for df in df_iter:
    print(df)
print('Notice how row 6 of the fixed width file has not been skipped even though it should')
print('have been.')
It seems that rows are skipped only until the first row that isn't in the list. For example, the leading rows 0, 1, 2 are skipped. But once a row that should be kept is reached, skipping stops for all remaining rows until the end, when row 9 IS skipped.
This looks correct to me. I suppose it's a problem with the iterator. I am pretty sure that skiprows should not be allowed with an iterator in general. The rows get renumbered each time the iterator runs, so it's pretty useless. I suppose it could be fixed, but it would take some effort. You are welcome to look at this in detail.
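In the meantime, one possible workaround is to leave skiprows out of the read_fwf call and drop the unwanted rows from each chunk by their absolute position in the file. The following is only a sketch: read_fwf_chunks_skipping is a hypothetical helper, not part of pandas, and it assumes the file has no header line and no blank lines, so that the Nth parsed row is the Nth line of the file.

```python
import io

import pandas as pd


def read_fwf_chunks_skipping(source, skip, chunksize, **kwargs):
    """Yield chunks from pd.read_fwf, dropping rows by absolute row position.

    Hypothetical helper (not a pandas API). Assumes no header line and no
    blank lines, so parsed row N corresponds to file line N.
    """
    skip = set(skip)
    offset = 0  # absolute position of the first row in the current chunk
    for chunk in pd.read_fwf(source, chunksize=chunksize, **kwargs):
        # Positions within this chunk whose absolute row number is kept.
        keep = [i for i in range(len(chunk)) if offset + i not in skip]
        offset += len(chunk)
        if keep:  # chunks that end up empty are not yielded
            yield chunk.iloc[keep]


# Same data as the reproduction above, built in memory instead of on disk.
text = "\n".join(f"{i:>2}" for i in range(10))
chunks = read_fwf_chunks_skipping(
    io.StringIO(text), skip=[0, 1, 2, 6, 9], chunksize=2,
    colspecs=[(0, 2)], names=['a'],
)
values = [v for chunk in chunks for v in chunk['a'].tolist()]
print(values)
```

With the sample data this should leave rows 3, 4, 5, 7 and 8. Note that the filtering happens after parsing, so the yielded chunks can be smaller than chunksize.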