QST: #48091

cloud-rocket · 2022-08-15T17:58:52Z

Research

I have searched the [pandas] tag on StackOverflow for similar questions.
I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

https://stackoverflow.com/questions/73324844/efficient-way-to-create-dataframe-with-different-column-types

Question about pandas

I'd like to create a DataFrame in which most of the columns are of type np.float32.

The data (from a Postgres table) comes in the form of records:

data = [(0.16275345863180396, 0.16275346), (0.6356328878675244, 0.6356329)...]

The only way I have found to do this is:

columns = [('a', 'float64'), ('b', 'float32')]
df = DataFrame.from_records(np.array(data, dtype=columns),
                            coerce_float=coerce_float)

This approach is extremely slow (which is highly noticeable with large datasets), compared to the default one (used by pd.read_sql_query):

# from_records expects column names here, not (name, dtype) pairs
df = DataFrame.from_records(data,
                            columns=[name for name, _ in columns],
                            coerce_float=coerce_float)

But the latter creates every column as np.float64, regardless of the intended data type, because there is nowhere to specify a per-column dtype.
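
For reference, the two constructions can be compared side by side. This is a minimal sketch that assumes two columns named a and b and uses only the two sample rows shown above; the variable names are illustrative, and it demonstrates the resulting dtypes only, not the performance difference.

```python
import numpy as np
import pandas as pd

# Two sample records, as in the question (the real data comes from Postgres).
data = [(0.16275345863180396, 0.16275346), (0.6356328878675244, 0.6356329)]

# Assumed column names and target dtypes, for illustration only.
dtypes = [("a", "float64"), ("b", "float32")]
names = [name for name, _ in dtypes]

# Approach 1: convert to a NumPy structured array first.
# The per-field dtypes carry over to the DataFrame columns.
df_structured = pd.DataFrame.from_records(np.array(data, dtype=dtypes))

# Approach 2: pass the raw tuples with column names only.
# No dtype can be given, so Python floats become float64.
df_plain = pd.DataFrame.from_records(data, columns=names)

print(df_structured.dtypes)
print(df_plain.dtypes)
```

In this sketch only the structured-array route keeps b as float32; the plain route produces float64 for both columns.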

What is the best way to construct such a DataFrame?

AKuederle · 2022-08-17T13:26:53Z

Is it feasible for you to change the column to the correct dtype after creation?

cloud-rocket · 2022-08-17T18:28:34Z

> Is it feasible for you to change the column to the correct dtype after creation?

@AKuederle - that's what I eventually ended up doing, by calling astype on the created DataFrame.

I don't think that should be the proper way to do it, though.
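
A minimal sketch of that workaround, assuming two columns named a and b and the sample rows from the question (the names are illustrative): build the frame the default way, then downcast only the columns that should be float32.

```python
import numpy as np
import pandas as pd

# Sample records from the question; column names are assumed for illustration.
data = [(0.16275345863180396, 0.16275346), (0.6356328878675244, 0.6356329)]

# Default construction: every column comes out as float64.
df = pd.DataFrame.from_records(data, columns=["a", "b"])

# Downcast selected columns afterwards; astype returns a new DataFrame.
df = df.astype({"b": np.float32})

print(df.dtypes)
```

Passing a dict to astype leaves the columns that are not listed (here a) untouched.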

phofl · 2022-08-19T14:35:22Z

duplicate of #4464

cloud-rocket added the Needs Triage and Usage Question labels Aug 15, 2022

phofl closed this as completed Aug 19, 2022
