Like a lot of people, I wrote some code that attempted to call DataFrame.append() a few million times, and it was slow. So I read the pandas docs ("don't do that, use pd.concat()") and StackOverflow (blah blah blah Ginger) and decided that if I was going to write a function that built chunks of a dataframe and then concatenated them, the least I could do would be to make it general purpose, and a package others could use. Thus I created pandas-appender.
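The chunk-then-concatenate idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `ChunkedAppender`, its `chunk_size` parameter, and the `append()`/`finalize()` methods are hypothetical and are not the pandas-appender API.

```python
import pandas as pd


class ChunkedAppender:
    """Hypothetical helper (not the pandas-appender API).

    Buffers rows as plain dicts, turns each full buffer into a chunk
    DataFrame, and concatenates all chunks once at the end.
    """

    def __init__(self, chunk_size=10_000):
        self.chunk_size = chunk_size
        self._rows = []    # pending rows, stored as dicts
        self._chunks = []  # completed chunk DataFrames

    def append(self, row):
        # Appending to a Python list is cheap; no DataFrame is copied here.
        self._rows.append(row)
        if len(self._rows) >= self.chunk_size:
            self._flush()

    def _flush(self):
        # Convert the pending rows into one chunk DataFrame.
        if self._rows:
            self._chunks.append(pd.DataFrame(self._rows))
            self._rows = []

    def finalize(self):
        # Flush any leftover rows, then concatenate all chunks once.
        self._flush()
        if not self._chunks:
            return pd.DataFrame()
        return pd.concat(self._chunks, ignore_index=True)


appender = ChunkedAppender(chunk_size=3)
for i in range(10):
    appender.append({"a": i, "b": i * 2})
df = appender.finalize()
```

The point of the design is that each row is added to a list rather than to a DataFrame, so the expensive copy happens once per chunk and once at the end instead of once per row.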
It has some features that I care about: scalability to hundreds of millions of rows, and features that help specify or identify categorical columns. But of course, it's not quite compatible with DataFrame.append(), and I'm new enough to pandas to be unsure if I've made it follow pandas conventions as far as possible.
So my question is, can someone who is better versed in pandas conventions than I am, preferably someone who would love to use a package that let them easily build up a dataframe with .append(), take a look at this code and give me some advice?
Thanks in advance!
As you mentioned, pandas encourages the use of pd.concat over append (#35407). Generally, collecting all frames in a list and calling pd.concat once will be the most performant approach, even if written within a wrapper function. Going to close this issue. For advertising the package or asking for further advice, we recommend using Gitter or StackOverflow.
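For readers landing here, a minimal sketch of the list-then-concat pattern mentioned above. The column names and the loop are made up for illustration.

```python
import pandas as pd

frames = []
for i in range(5):
    # Build each piece independently; nothing is copied into a growing frame.
    piece = pd.DataFrame({"batch": [i, i], "value": [i * 10, i * 10 + 1]})
    frames.append(piece)  # list.append, not DataFrame.append

# One concatenation at the end; ignore_index=True renumbers rows from 0.
result = pd.concat(frames, ignore_index=True)
```

Calling a DataFrame-level append inside the loop would copy all accumulated rows on every iteration, which is why the single pd.concat at the end tends to be much faster for many pieces.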