BUG: to_latex outputs string with missing second index level values #14484

Closed
the-alleged-car opened this issue Oct 24, 2016 · 3 comments

the-alleged-car commented Oct 24, 2016

I am using pandas to generate a LaTeX string by calling to_latex() on a DataFrame that is indexed by a MultiIndex. Running the code snippet below produces incorrect output: two second-level index values are missing from the LaTeX table.

Code Snippet

import pandas as pd

outliers_lst = [(23240, 0),
                 (23240, 15),
                 (23240, 23),
                 (23240, 31),
                 (23240, 85),
                 (38661, 85),
                 (41231, 85),
                 (41231, 92),
                 (46371, 0)]

headers = (['max', 'EC 1', 'S'],
             ['max', 'EC 1', 'A'],
             ['max', 'EC 2', 'S'])

table = pd.DataFrame("",
                     index=pd.MultiIndex.from_tuples(sorted(outliers_lst)),
                     columns=pd.MultiIndex.from_tuples(headers))
table.to_latex(index=True, longtable=True, column_format='c'*5).split('\n')

Incorrect Output

[u'\\begin{longtable}{cccccccccccccccccccccccccc}',
 u'\\toprule',
 u'      &    &  max &   &      \\\\',
 u'      &    & EC 1 &   & EC 2 \\\\',
 u'      &    &    S & A &    S \\\\',
 u'\\midrule',
 u'\\endhead',
 u'\\midrule',
 u'\\multicolumn{3}{r}{{Continued on next page}} \\\\',
 u'\\midrule',
 u'\\endfoot',
 u'',
 u'\\bottomrule',
 u'\\endlastfoot',
 u'23240 & 0  &      &   &      \\\\',
 u'      & 15 &      &   &      \\\\',
 u'      & 23 &      &   &      \\\\',
 u'      & 31 &      &   &      \\\\',
 u'      & 85 &      &   &      \\\\',
 u'38661 &    &      &   &      \\\\',
 u'41231 &    &      &   &      \\\\',
 u'      & 92 &      &   &      \\\\',
 u'46371 & 0  &      &   &      \\\\',
 u'\\end{longtable}',
 u'']

Correct Output

[u'\\begin{longtable}{cccccccccccccccccccccccccc}',
 u'\\toprule',
 u'      &    &  max &   &      \\\\',
 u'      &    & EC 1 &   & EC 2 \\\\',
 u'      &    &    S & A &    S \\\\',
 u'\\midrule',
 u'\\endhead',
 u'\\midrule',
 u'\\multicolumn{3}{r}{{Continued on next page}} \\\\',
 u'\\midrule',
 u'\\endfoot',
 u'',
 u'\\bottomrule',
 u'\\endlastfoot',
 u'23240 & 0  &      &   &      \\\\',
 u'      & 15 &      &   &      \\\\',
 u'      & 23 &      &   &      \\\\',
 u'      & 31 &      &   &      \\\\',
 u'      & 85 &      &   &      \\\\',
 u'38661 & 85 &      &   &      \\\\',
 u'41231 & 85 &      &   &      \\\\',
 u'      & 92 &      &   &      \\\\',
 u'46371 & 0  &      &   &      \\\\',
 u'\\end{longtable}',
 u'']

Note that in the correct output, the rows with indices (38661, 85) and (41231, 85) include the second-level index value 85, whereas in the incorrect output those two rows omit it.

Could this be because the row (23240, 85) above (38661, 85) includes 85 in its second index?

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 23.0.0
Cython: 0.24
numpy: 1.11.1
scipy: 0.17.1
statsmodels: 0.6.1
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.0
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: 0.7.6.None
psycopg2: None
jinja2: 2.8
boto: 2.40.0
pandas_datareader: None

@jorisvandenbossche
Member

@the-alleged-car That indeed looks like a bug in the multi-index handling: a repeated value should be left blank only when the values of the preceding levels are also the same as in the previous row. Thanks for the report!

Smaller reproducible example:

In [18]: df = pd.DataFrame(index=pd.MultiIndex.from_tuples([('A', 'c'), ('B', 'c')]), columns=['col'])

In [19]: print(df.to_latex())
\begin{tabular}{lll}
\toprule
  &   &  col \\
\midrule
A & c &  NaN \\
B &   &  NaN \\
\bottomrule
\end{tabular}
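
The blanking rule described above can be sketched in plain Python. This is only an illustration of the intended behaviour, not the pandas implementation; the function name `sparsify_rows` and its `blank` parameter are hypothetical. It also does not special-case the innermost level, so a fully duplicated key would be blanked entirely.

```python
def sparsify_rows(keys, blank=""):
    """Blank a level's value only if it and all higher levels repeat the previous row.

    `keys` is a sequence of index tuples, e.g. [('A', 'c'), ('B', 'c')].
    Hypothetical helper for illustration; not part of pandas.
    """
    out = []
    prev = None
    for key in keys:
        row = []
        # True while every level seen so far matches the previous row.
        same_prefix = prev is not None
        for level, value in enumerate(key):
            same_prefix = same_prefix and prev[level] == value
            row.append(blank if same_prefix else value)
        out.append(tuple(row))
        prev = key
    return out


rows = sparsify_rows([("A", "c"), ("B", "c")])
```

With this rule, `c` is kept on the second row because the first level changed from `A` to `B`, which is the behaviour the report expects.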

jorisvandenbossche changed the title from "DataFrame.to_latex() outputs string with missing second index values" to "BUG: to_latex outputs string with missing second index level values" on Oct 24, 2016
@jorisvandenbossche
Member

@the-alleged-car If you want to take a look at how to fix it, that is always welcome!

@enriquefernandez

This just bit me as well.
Any known workarounds for the moment?
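
One possible workaround, not confirmed anywhere in this thread, is to move the index levels into ordinary columns before rendering, so that the MultiIndex blanking logic is never applied. The helper names `flatten_index` and `to_latex_flat` below are hypothetical; `reset_index()` and `to_latex(index=False)` are existing DataFrame methods.

```python
import pandas as pd


def flatten_index(df):
    """Move every index level into an ordinary column."""
    return df.reset_index()


def to_latex_flat(df, **kwargs):
    """Render df with its index levels as plain columns (hypothetical helper)."""
    return flatten_index(df).to_latex(index=False, **kwargs)


# The small example from earlier in this thread.
df = pd.DataFrame(index=pd.MultiIndex.from_tuples([('A', 'c'), ('B', 'c')]),
                  columns=['col'])
flat = flatten_index(df)  # unnamed levels become columns level_0 and level_1
```

Calling `print(to_latex_flat(df))` should then write the second-level value on every row. The trade-off is that repeated first-level values are no longer blanked either, and the index columns get header names such as `level_0` unless the index levels are named.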

gfyoung added a commit to forking-repos/pandas that referenced this issue Dec 8, 2017
@jreback jreback added this to the 0.21.1 milestone Dec 8, 2017
jreback pushed a commit that referenced this issue Dec 8, 2017
* BUG: LatexFormatter.write_result multi-index

Fixed GH issue 14484:
`LatexFormatter.write_result` now does not print blanks if a
higher-order index differs from the previous row.
Also added testcase for this.

* MAINT: Address reviewer comments

Closes gh-14484
Closes gh-17499
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this issue Dec 8, 2017
TomAugspurger pushed a commit that referenced this issue Dec 11, 2017