.concat crashes Python #16111

Closed
topper-123 opened this issue Apr 24, 2017 · 2 comments · Fixed by #16133
Labels
Bug Categorical Categorical Data Type
Comments

@topper-123
Contributor

pandas.concat crashes the python interpreter

The program snippet below crashes the python interpreter. I have run the snippet with:

  • Pandas 0.19.2
  • Python 3.6.1 and Python 3.5.2
  • Windows 10 and Windows 8.1
  • in a plain Python interpreter and in IPython

All give the same result (sometimes it only crashes after I have run it more than once; I don't know why).

import pandas as pd

categories = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola', 'Antigua & Deps', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Azerbaijan', 'Bahamas', 'Bahrain', 'Bangladesh', 'Barbados', 'Belarus', 'Belgium', 'Benin', 'Bhutan', 'Bolivia', 'Bosnia Herzegovina', 'Botswana', 'Brazil', 'Brunei', 'Bulgaria', 'Burkina', 'Cambodia', 'Cameroon', 'Canada', 'Chile', 'China', 'Colombia', 'Congo {Democratic Rep}', 'Costa Rica', 'Croatia', 'Cuba', 'Cyprus', 'Czech Republic', 'Denmark', 'Djibouti', 'Dominican Republic', 'Ecuador', 'Egypt', 'El Salvador', 'Estonia', 'Ethiopia', 'Finland', 'France', 'Gabon', 'Georgia', 'Germany', 'Ghana', 'Greece', 'Guatemala', 'Haiti', 'Honduras', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran', 'Iraq', 'Ireland {Republic}', 'Israel', 'Italy', 'Jamaica', 'Japan', 'Jordan', 'Kazakhstan', 'Kenya', 'Korea North', 'Korea South', 'Kosovo', 'Kuwait', 'Kyrgyzstan', 'Laos', 'Latvia', 'Lebanon', 'Liechtenstein', 'Lithuania', 'Luxembourg', 'Macedonia', 'Madagascar', 'Malaysia', 'Maldives', 'Malta', 'Mauritania', 'Mauritius', 'Mexico', 'Moldova', 'Mongolia', 'Montenegro', 'Morocco', 'Mozambique', 'Myanmar, {Burma}', 'Namibia', 'Nepal', 'Netherlands', 'New Zealand', 'Nicaragua', 'Nigeria', 'Norway', 'Oman', 'Other', 'Pakistan', 'Panama', 'Paraguay', 'Peru', 'Philippines', 'Poland', 'Portugal', 'Qatar', 'Romania', 'Russian Federation', 'Rwanda', 'San Marino', 'Saudi Arabia', 'Senegal', 'Serbia', 'Sierra Leone', 'Singapore', 'Slovakia', 'Slovenia', 'Solomon Islands', 'Somalia', 'South Africa', 'South Sudan', 'Spain', 'Sri Lanka', 'Sudan', 'Swaziland', 'Sweden', 'Switzerland', 'Syria', 'Taiwan', 'Tajikistan', 'Tanzania', 'Thailand', 'Togo', 'Trinidad & Tobago', 'Tunisia', 'Turkey', 'Turkmenistan', 'Uganda', 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States', 'Uruguay', 'Uzbekistan', 'Vanuatu', 'Vatican City', 'Venezuela', 'Vietnam', 'Zambia', 'Zimbabwe', "Didn't answer"]

series0_values = [9, 1, 8, 18, 8, 85, 49, 3, 1, 2, 14, 1, 7, 40, 2, 5, 64, 1, 15, 5, 1, 116, 7, 43, 8, 6, 13, 2, 2, 32, 35, 2, 1, 23, 1, 5, 1, 43, 112, 1, 9, 319, 1, 25, 3, 2, 30, 4, 455, 23, 60, 1, 37, 37, 106, 1, 13, 6, 5, 5, 1, 32, 1, 10, 8, 14, 5, 1, 11, 1, 6, 1, 39, 4, 1, 4, 2, 7, 124, 30, 1, 4, 28, 1, 31, 2, 20, 130, 39, 29, 97, 7, 17, 1, 17, 11, 15, 23, 1, 146, 11, 1, 75, 55, 4, 9, 10, 1, 1, 4, 54, 1, 3, 34, 3, 299, 587, 7, 1, 10, 17, 2, 65]
series0_index = ['Afghanistan', 'Albania', 'Algeria', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Azerbaijan', 'Bahamas', 'Bahrain', 'Bangladesh', 'Barbados', 'Belarus', 'Belgium', 'Bolivia', 'Bosnia Herzegovina', 'Brazil', 'Brunei', 'Bulgaria', 'Cambodia', 'Cameroon', 'Canada', 'Chile', 'China', 'Colombia', 'Costa Rica', 'Croatia', 'Cuba', 'Cyprus', 'Czech Republic', 'Denmark', 'Dominican Republic', 'Ecuador', 'Egypt', 'El Salvador', 'Estonia', 'Ethiopia', 'Finland', 'France', 'Gabon', 'Georgia', 'Germany', 'Ghana', 'Greece', 'Guatemala', 'Honduras', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran', 'Iraq', 'Ireland {Republic}', 'Israel', 'Italy', 'Jamaica', 'Japan', 'Jordan', 'Kazakhstan', 'Kenya', 'Korea North', 'Korea South', 'Kyrgyzstan', 'Latvia', 'Lebanon', 'Lithuania', 'Macedonia', 'Madagascar', 'Malaysia', 'Maldives', 'Malta', 'Mauritius', 'Mexico', 'Moldova', 'Mongolia', 'Morocco', 'Myanmar, {Burma}', 'Nepal', 'Netherlands', 'New Zealand', 'Nicaragua', 'Nigeria', 'Norway', 'Oman', 'Pakistan', 'Peru', 'Philippines', 'Poland', 'Portugal', 'Romania', 'Russian Federation', 'Saudi Arabia', 'Serbia', 'Sierra Leone', 'Singapore', 'Slovakia', 'Slovenia', 'South Africa', 'South Sudan', 'Spain', 'Sri Lanka', 'Sudan', 'Sweden', 'Switzerland', 'Syria', 'Taiwan', 'Thailand', 'Togo', 'Trinidad & Tobago', 'Tunisia', 'Turkey', 'Turkmenistan', 'Uganda', 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States', 'Uruguay', 'Uzbekistan', 'Venezuela', 'Vietnam', 'Zimbabwe', "Didn't answer"]

i0 = pd.CategoricalIndex(series0_index, ordered=True, categories=categories)
s0 = pd.Series(series0_values, index=i0)

series1_values = [4, 18, 16, 65, 33, 2, 8, 1, 12, 35, 3, 3, 46, 6, 1, 89, 4, 17, 8, 6, 3, 2, 1, 22, 30, 3, 13, 1, 9, 1, 35, 109, 7, 197, 1, 14, 2, 17, 233, 9, 28, 16, 36, 56, 5, 1, 3, 4, 6, 14, 3, 8, 3, 3, 3, 14, 2, 1, 3, 1, 3, 5, 86, 11, 1, 3, 20, 12, 2, 1, 1, 7, 93, 20, 18, 61, 3, 14, 1, 5, 9, 5, 1, 19, 80, 4, 1, 75, 29, 2, 6, 11, 1, 2, 34, 45, 6, 281, 580, 6, 1, 5, 10, 1, 36]
series1_index = ['Afghanistan', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Bahrain', 'Bangladesh', 'Barbados', 'Belarus', 'Belgium', 'Bolivia', 'Bosnia Herzegovina', 'Brazil', 'Bulgaria', 'Cambodia', 'Canada', 'Chile', 'China', 'Colombia', 'Costa Rica', 'Croatia', 'Cuba', 'Cyprus', 'Czech Republic', 'Denmark', 'Ecuador', 'Egypt', 'El Salvador', 'Estonia', 'Ethiopia', 'Finland', 'France', 'Georgia', 'Germany', 'Ghana', 'Greece', 'Guatemala', 'Hungary', 'India', 'Indonesia', 'Iran', 'Ireland {Republic}', 'Israel', 'Italy', 'Japan', 'Jordan', 'Kazakhstan', 'Kenya', 'Korea South', 'Latvia', 'Lebanon', 'Lithuania', 'Macedonia', 'Malaysia', 'Malta', 'Mexico', 'Moldova', 'Mongolia', 'Morocco', 'Mozambique', 'Myanmar, {Burma}', 'Nepal', 'Netherlands', 'New Zealand', 'Nicaragua', 'Nigeria', 'Norway', 'Pakistan', 'Panama', 'Paraguay', 'Peru', 'Philippines', 'Poland', 'Portugal', 'Romania', 'Russian Federation', 'Saudi Arabia', 'Serbia', 'Sierra Leone', 'Singapore', 'Slovakia', 'Slovenia', 'Solomon Islands', 'South Africa', 'Spain', 'Sri Lanka', 'Swaziland', 'Sweden', 'Switzerland', 'Syria', 'Taiwan', 'Thailand', 'Trinidad & Tobago', 'Tunisia', 'Turkey', 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States', 'Uruguay', 'Uzbekistan', 'Venezuela', 'Vietnam', 'Zimbabwe', "Didn't answer"]

i1 = pd.CategoricalIndex(series1_index, ordered=True, categories=categories)
s1 = pd.Series(series1_values, index=i1)

series2_values = [7, 1, 6, 2, 49, 26, 1, 1, 1, 25, 3, 18, 9, 63, 1, 2, 5, 1, 3, 2, 17, 23, 4, 3, 15, 55, 2, 2, 172, 7, 1, 17, 1, 41, 4, 6, 13, 8, 61, 2, 4, 1, 4, 2, 3, 1, 3, 2, 9, 1, 1, 63, 13, 16, 3, 1, 4, 59, 11, 3, 22, 2, 5, 1, 2, 2, 7, 8, 50, 2, 39, 35, 2, 1, 14, 6, 1, 191, 312, 5, 1, 1, 1, 32]
series2_index = ['Afghanistan', 'Angola', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Azerbaijan', 'Bangladesh', 'Belarus', 'Belgium', 'Bosnia Herzegovina', 'Brazil', 'Bulgaria', 'Canada', 'Chile', 'China', 'Colombia', 'Costa Rica', 'Croatia', 'Cyprus', 'Czech Republic', 'Denmark', 'Egypt', 'Estonia', 'Finland', 'France', 'Gabon', 'Georgia', 'Germany', 'Greece', 'Honduras', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran', 'Ireland {Republic}', 'Israel', 'Italy', 'Kazakhstan', 'Korea South', 'Laos', 'Latvia', 'Lebanon', 'Lithuania', 'Luxembourg', 'Malaysia', 'Malta', 'Mexico', 'Mongolia', 'Nepal', 'Netherlands', 'New Zealand', 'Norway', 'Pakistan', 'Paraguay', 'Philippines', 'Poland', 'Portugal', 'Romania', 'Russian Federation', 'Saudi Arabia', 'Serbia', 'Sierra Leone', 'Singapore', 'Slovakia', 'Slovenia', 'South Africa', 'Spain', 'Sri Lanka', 'Sweden', 'Switzerland', 'Taiwan', 'Thailand', 'Turkey', 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States', 'Uruguay', 'Venezuela', 'Vietnam', 'Zimbabwe', "Didn't answer"]

i2 = pd.CategoricalIndex(series2_index, ordered=True, categories=categories)
s2 = pd.Series(series2_values, index=i2)

print("Before")
x = pd.concat([s0, s1, s2], axis=1)  # crash!
print("After")

pd.concat([s0, s1, s2], axis=1) crashes the interpreter for me. If only one or two of the series are concatenated, no crash occurs.

Problem description

The interpreter just exits with a message box (in Danish; translated: "An error caused Python to exit") and the program shuts down. There is no traceback or other information about the cause, at least none that I know how to get at. If someone tells me how to collect it, I may be able to provide it.
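One possible way to get at more information is the standard-library faulthandler module, which dumps the Python-level stack when the process dies from a fatal error. This is only a sketch: it assumes the crash is a hard fault (for example an access violation) that faulthandler can catch, and it has not been checked against this particular crash.

```python
import faulthandler

# Ask the interpreter to write the Python stack of all threads to stderr
# if the process hits a fatal error such as a segmentation fault.
faulthandler.enable()

# ... build s0, s1 and s2 exactly as in the snippet above, then run:
# x = pd.concat([s0, s1, s2], axis=1)
```

The same handler can be switched on without editing the script by starting it with `python -X faulthandler`.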

Expected Output

pd.concat should return a three-column DataFrame. Instead the interpreter crashes.
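To illustrate the expected shape, here is a miniature version with made-up data (the labels `a` to `d` and the values are hypothetical, not from the report). It converts each CategoricalIndex to a plain object Index before concatenating. That conversion is an assumed workaround, not the fix from #16133, and it has not been verified against 0.19.2.

```python
import pandas as pd

# Hypothetical stand-in for the country list: one shared, ordered category list.
categories = ["a", "b", "c", "d"]


def make_series(labels, values):
    """Build a Series indexed by an ordered CategoricalIndex, as in the report."""
    index = pd.CategoricalIndex(labels, categories=categories, ordered=True)
    return pd.Series(values, index=index)


s0 = make_series(["a", "b", "d"], [1, 2, 3])
s1 = make_series(["a", "c"], [4, 5])
s2 = make_series(["b", "c", "d"], [6, 7, 8])

# Assumed workaround: replace each CategoricalIndex with a plain object Index
# so the concatenation aligns on ordinary labels.
plain = [pd.Series(s.values, index=s.index.astype(object)) for s in (s0, s1, s2)]
result = pd.concat(plain, axis=1)

# One row per label that occurs in any input, one column per input Series;
# labels missing from a Series show up as NaN in that column.
print(result.shape)
```

Note that the categorical ordering is lost with this approach, so the rows may need to be reordered afterwards.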

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 35.0.1
Cython: None
numpy: 1.11.2
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 6.0.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.0
openpyxl: 2.4.5
xlrd: None
xlwt: None
xlsxwriter: None
lxml: 3.7.3
bs4: 4.5.3
html5lib: 0.999999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
boto: None
pandas_datareader: None

@chris-b1 chris-b1 added Bug Categorical Categorical Data Type labels Apr 24, 2017
@chris-b1
Contributor

Thanks for the report. This also happens on 0.20.0rc1.

@gfyoung
Member

gfyoung commented Apr 25, 2017

For reference, this isn't a Windows bug. It also crashes on Linux.
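For anyone comparing platforms, one way to check for the crash without losing the current session is to run the snippet in a child interpreter and look at its exit status. This is a rough sketch using only the standard library; the helper names `run_isolated` and `describe` are made up for illustration.

```python
import subprocess
import sys


def run_isolated(code):
    """Run `code` in a fresh interpreter and return the child's exit status."""
    completed = subprocess.run([sys.executable, "-c", code])
    return completed.returncode


def describe(returncode):
    """Turn an exit status into a short human-readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative status means the child was killed by that signal.
        return "killed by signal {}".format(-returncode)
    # On Windows, a hard crash typically shows up as a large positive status.
    return "exited with status {}".format(returncode)


# Usage: pass the reproduction script's source as a string, e.g.
# print(describe(run_isolated(snippet_source)))
```

A status of 0 means the concatenation completed; anything else suggests the child interpreter died or raised.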
