Skip to content

to_sql with postgres fails silently when given uppercase/capitalized table names #26030

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dbrakman opened this issue Apr 9, 2019 · 4 comments
Labels
IO SQL to_sql, read_sql, read_sql_query Needs Info Clarification about behavior needed to assess issue

Comments

@dbrakman
Copy link

dbrakman commented Apr 9, 2019

I attempted to use to_sql to insert into a Postgres table with a non-lowercase identifier. No error was raised in python, but the insert did not succeed:

import pandas as pd
import sqlalchemy

df = pd.DataFrame({
    u'id': [1, 2],
    u'name': [u'foo', u'bar'],
})

engine = sqlalchemy.create_engine('postgresql://{user}:{password}@{host}/{dbname}'.    format(
    user='myuser', password='password', host='localhost', dbname='mydatabase'))

df.to_sql('TMP1', engine, index=False, if_exists='replace')
df.to_sql('tmp2', engine, index=False, if_exists='replace')
df.to_sql('"TMP3"', engine, index=False, if_exists='replace')

When I check the psql console, the 'tmp2' table is populated, but there is no 'tmp1' or 'TMP3' table:

mydatabase=> select * from tmp1;
ERROR:  relation "tmp1" does not exist
LINE 1: select * from tmp1;
                      ^

mydatabase=> select * from tmp2;
 id | name
----+------
  1 | foo
  2 | bar

mydatabase=> select * from "TMP3";
ERROR:  relation "TMP3" does not exist
LINE 1: select * from "TMP3";
                      ^

Problem description

This MWE was tested with Postgres 11.2 on OSX 10.14.4 and with Postgres 9.5.15 on Red Hat 4.8.3-9.

The most related issue I saw was #7815, but I didn't understand its resolution. It sounded to me as though a subsequent release would have to_sql check which table was created/updated, if any, and return a warning on mismatch with its name parameter, but I did not observe comparable behavior.

Expected Output

I would expect either a successful insert or an exception.

Output of pd.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 2.7.15.final.0 python-bits: 64 OS: Darwin OS-release: 18.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.None

pandas: 0.23.4
pytest: 4.0.2
pip: 18.1
setuptools: 40.6.3
Cython: 0.29.2
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 5.8.0
sphinx: 1.8.2
patsy: 0.5.1
dateutil: 2.6.1
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.12
xlrd: 1.2.0
xlwt: 1.2.0
xlsxwriter: 1.1.2
lxml: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.15
pymysql: None
psycopg2: 2.8.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@WillAyd
Copy link
Member

WillAyd commented Apr 9, 2019

Does this work just using SQLAlchemy?

@WillAyd WillAyd added IO SQL to_sql, read_sql, read_sql_query Needs Info Clarification about behavior needed to assess issue labels Apr 9, 2019
@dbrakman
Copy link
Author

dbrakman commented Apr 9, 2019

What direction would be most useful in SQLAlchemy? Do you recommend I have the engine execute a "plain" SQL statement, or should I create objects in a session?

@dbrakman
Copy link
Author

dbrakman commented Apr 9, 2019

But from what I can tell, SQLAlchemy behaves the same way. The following code completes successfully, but using the psql commands from the first post reveals that only a "tmp2" table was created.

import sqlalchemy
from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData

engine = create_engine('postgresql://{user}:{password}@{host}/{dbname}'.format(
    user='myuser', password='password', host='localhost', dbname='mydatabase'))

metadata = MetaData()
tmp1_table = Table('TMP1', metadata,
  Column('id', Integer, primary_key=True),
  Column('foo', String)
)
tmp2_table = Table('tmp2', metadata,
  Column('id', Integer, primary_key=True),
  Column('foo', String)
)
tmp3_table = Table('"TMP3"', metadata,
  Column('id', Integer, primary_key=True),
  Column('foo', String)
)
metadata.create_all(engine)

I'm closing this issue and creating a new one with SQLAlchemy at sqlalchemy/sqlalchemy#4605

@dbrakman dbrakman closed this as completed Apr 9, 2019
@dbrakman
Copy link
Author

To summarize the other ticket:

Tables were created successfully, but not with the names I expected: TMP1 and "TMP3", verifiable by select * from "TMP1"; and select * from """TMP3""";.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO SQL to_sql, read_sql, read_sql_query Needs Info Clarification about behavior needed to assess issue
Projects
None yet
Development

No branches or pull requests

2 participants