ENH/DOC: wide_to_long performance and docstring clarification #14778
Labels
Docs
Performance
Memory or execution speed performance
Reshaping
Concat, Merge/Join, Stack/Unstack, Explode
Milestone
I had to massage some messy data recently, and a big bottleneck turned out to be wide_to_long
Some test data with many id variables and time variables
Reshaping with
wide_to_long
takes around 2 secs.I modified
wide_to_long
slightly (regex on categorical column / avoid copying many "idvariables", postpone type coercion) and the runtime is nowThe result is the same
Docstring clarification
The
wide_to_long
docstring also contains an unused last parameterstubend : str
, which should be removed.A docstring
Note
addtion about escaping special characters (with for examplere.escape
) instubnames
could also perhaps be informative, since if the user passes a dataframe with messy stubnames, the function fails with a pretty uninformative error message.I can send a PR for this if that is wanted?
Output of
pd.show_versions()
pandas: 0.19.0+124.g06f26b5.dirty
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.42.0
pandas_datareader: None
The text was updated successfully, but these errors were encountered: