Closed
Description
As already described in issue #3690, Panel.to_frame discards every (major, minor) row that contains a NaN in any item, which can be very confusing. I believe there should be a warning the first time data is dropped, or the opposite should be the default behavior.
The warning could behave like numpy's divide-by-zero RuntimeWarning and be shown only on the first occurrence.
See below for an example:
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.randn(2, 3), columns=['A', 'B', 'C'],
                   index=['foo', 'bar'])
df2 = pd.DataFrame(np.random.randn(2, 3), columns=['A', 'B', 'C'],
                   index=['foo', 'bar'])
# Introduce a single missing value in df2
df2.loc['foo', 'B'] = np.nan
mydict = {'df1': df1, 'df2': df2}
pd.Panel(mydict).to_frame()
Output:
major | minor | df1 | df2 |
---|---|---|---|
foo | A | 1.9097545931480682 | -0.6710202447941566 |
foo | C | 1.3335254610685865 | 1.53372538551507 |
bar | A | 0.3145550744497975 | -1.7221352144306152 |
bar | B | -0.15681197178861878 | -1.2308510354641322 |
bar | C | -0.09598971674309852 | -0.1268630728124487 |
With filter_observations=False, rows containing NaN are not dropped:
pd.Panel(mydict).to_frame(filter_observations=False)
Output:
major | minor | df1 | df2 |
---|---|---|---|
foo | A | 1.9097545931480682 | -0.6710202447941566 |
foo | B | 2.092552358833253 | NaN |
foo | C | 1.333525461068586 | 1.53372538551507 |
bar | A | 0.3145550744497975 | -1.7221352144306152 |
bar | B | -0.15681197178861878 | -1.2308510354641322 |
bar | C | -0.09598971674309852 | -0.1268630728124487 |
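To make the proposal concrete, here is a rough sketch of the warning I have in mind. It does not touch pandas internals: frames_to_long is a hypothetical stand-in for Panel.to_frame that I wrote for illustration, and it assumes all frames share the same index and columns. Only the warning logic matters here.

```python
import warnings

import numpy as np
import pandas as pd


def frames_to_long(frames, filter_observations=True):
    """Hypothetical stand-in for Panel.to_frame (not a pandas API).

    Assumes every frame in `frames` has the same index and columns.
    Returns one row per (major, minor) pair and one column per frame.
    """
    first = next(iter(frames.values()))
    index = pd.MultiIndex.from_product(
        [first.index, first.columns], names=["major", "minor"]
    )
    # Row-major ravel matches the (major, minor) product order of the index.
    long = pd.DataFrame(
        {name: df.to_numpy().ravel() for name, df in frames.items()},
        index=index,
    )
    if not filter_observations:
        return long

    incomplete = long.isna().any(axis=1)
    if incomplete.any():
        # stacklevel=2 attributes the warning to the caller's line.
        warnings.warn(
            f"dropping {int(incomplete.sum())} row(s) containing NaN; "
            "pass filter_observations=False to keep them",
            RuntimeWarning,
            stacklevel=2,
        )
    return long[~incomplete]


df1 = pd.DataFrame(np.random.randn(2, 3), columns=["A", "B", "C"],
                   index=["foo", "bar"])
df2 = df1.copy()
df2.loc["foo", "B"] = np.nan

dropped = frames_to_long({"df1": df1, "df2": df2})  # warns, drops (foo, B)
kept = frames_to_long({"df1": df1, "df2": df2}, filter_observations=False)
```

Under Python's default warning filter a given warning is shown once per call site, so repeated calls from the same line in a loop would not flood the output.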