I have searched the [pandas] tag on StackOverflow for similar questions.
I have asked my usage related question on StackOverflow.
https://stackoverflow.com/questions/73324844/efficient-way-to-create-dataframe-with-different-column-types
I'd like to create a DataFrame with most of the columns of type np.float32.
The data (from a Postgres table) comes in the form of records:
data = [(0.16275345863180396, 0.16275346), (0.6356328878675244, 0.6356329)...]
The only way I found to do it is by:
columns = [('a', 'float64'), ('b', 'float32')]
df = DataFrame.from_records(np.array(data, dtype=columns), coerce_float=coerce_float)
This approach is extremely slow (which is highly noticeable with large datasets) compared to the default one (used by pd.read_sql_query):
df = DataFrame.from_records(data, columns=columns, coerce_float=coerce_float)
But the latter creates all columns as np.float64, regardless of the real data type, which is not specified.
What is the best way to construct such a DataFrame?
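To make the difference between the two calls concrete, here is a minimal sketch. It assumes only the two sample records shown above (a real result set would be far larger), and `df_typed` and `df_default` are names chosen for illustration. It only demonstrates the resulting dtypes, not the timing gap.

```python
import numpy as np
import pandas as pd

# Two sample records copied from the question; real data would be much larger.
data = [(0.16275345863180396, 0.16275346), (0.6356328878675244, 0.6356329)]

# Structured-array route: each column gets the dtype declared for it.
dtypes = [("a", "float64"), ("b", "float32")]
df_typed = pd.DataFrame.from_records(np.array(data, dtype=dtypes))

# Default route: only column names are passed, so every float column
# is inferred as float64.
df_default = pd.DataFrame.from_records(data, columns=["a", "b"])

print(df_typed.dtypes)
print(df_default.dtypes)
```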
Is it feasible for you to change the column to the correct dtype after creation?
@AKuederle - that's what I found myself doing eventually, by calling astype on the created DataFrame.
I don't think that should be the proper way to do it.
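For reference, the workaround described here can be sketched roughly as follows. The sample records and the column names `a` and `b` are taken from the question; the mapping passed to `astype` is an assumption about which columns should end up as np.float32.

```python
import numpy as np
import pandas as pd

# Two sample records copied from the question.
data = [(0.16275345863180396, 0.16275346), (0.6356328878675244, 0.6356329)]

# Build the frame the fast way; every column starts out as float64.
df = pd.DataFrame.from_records(data, columns=["a", "b"])

# Downcast only the columns that should be float32 (assumed here to be "b").
df = df.astype({"b": np.float32})

print(df.dtypes)
```

Note that `astype` returns a new DataFrame by default, so the float64 data and its converted copy briefly coexist in memory, which may matter for large result sets.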
duplicate of #4464