-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: object oriented interface for SQL functionality #7960
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
If we want to tackle this for 0.15, we should start with this. @hayd @mangecoeur @danielballan @JoergRittinger @artemyk @maxgrenderjones @aldanor @jreback |
@jorisvandenbossche Let me throw out another possibility that occurred to me after some hacking away on SQL code:
|
@artemyk unifying Additionally, it is more flexible if you want to add features or create subclasses. For instance you might want to manipulate more than one table at a time with the same engine. |
@jorisvandenbossche this is my input (i'm slowly getting back to dealing with this after long period of other work). Personally I'm happy with the architecture, especially with respect to being able to create subclasses to add custom functionality. Being able to provide custom MetaData and Table definitions I think is important, as well as to override the various type conversion routines. However it would be nice to avoid cramming in too many keyword arguments, so it would be better to let people do this by subclassing rather than passing optional parameters. We should make sure that the architecture is truely modular and you can swap in your own classes as needed. The exception may be for schema handling where you currently need to pass in MetaData - but from recent commits it seems this will be handled through a schema parameter (I personally use this and so far have had to create a custom metadata object). With respect to So for example I find that As to how we see Finally for naming - I'm not great with names, but I like simply |
@mangecoeur good to have your intput! (I wanted to ask it as you wrote part of the current code base (also @danielballan I think?), but you already did)
Do you have other use cases? I think we should try to add an 'advanced' section in the docs to give some examples of these use cases. And indeed, for solely specifying a |
@jorisvandenbossche
I'm sure there is more and we don't have to implement any of this, just make sure that you can swap out one or the other of Engine or Table without it blowing up in your face, and that overriding in sub-classes is reasonably straightforward. |
@mangecoeur +1 on all of those conversions, except the first/second. They user should simply cast the frame as appropriate. Many ways to do this and should not be specific to SQL. |
@jreback I think someone will find the need for both the first and second, never underestimate how weird someone's use case might be. The point however is we don't need to care whether they do or not, just that we make the system reasonably easy to subclass and customize. If people want to make something completely ridiculous with it afterwards its up to them :P |
@mangecoeur yes, people do have odd things they try to do! |
@mangecoeur The problem is, I think we have to care what people would do with it, as otherwise we could easily break things. So I think we will have to outline some typical advanced usage case, and test for them (and document them). Because along the way af adding features/modifying the internals of SQLTable, we can certainly break such usage cases if they rely on the specific implementation in the internals. |
@jorisvandenbossche true, so we need to make an internal api that will be modular and stable and easy to replace. Cant say what that might be without diving in, no time just now, hopefully soon... |
I think, for 0.15, I will try to do some of the renaming (SQLDatabase, SQLTable, ..), as we agree on this I think (and then the old names don't linger on longer than needed). The other things discussed here will then be for a next release. @artemyk I also think the distinction between Database and Table is good, but trying to abstract the differences between sqlalchemy and sqlite away I think would maybe be a good idea, as it is now sometimes difficult to add something without duplicating a lot of code. Using something like a backend class would possibly a way, you can always try something and see how things work out. It is however not yet fully clear what would be the difference with the existing PandasSQLAlchemy and PandasSQLLegacy (or it would split the current class in mode (sqlalchemy/sqlite) specific things and more general methods like to_sql/read_sql?) |
This sounds good to me, @jorisvandenbossche. Sorry to offering minimal input -- defending my thesis this fall! |
@danielballan good luck! knock 'em dead! |
@danielballan good luck! |
@jreback @jorisvandenbossche I made an initial stab at reorganizing the backend code in #8562 . It eliminates some redundancy and now |
Follow-up of #6300 to discuss the object-oriented API for the sql functions.
So there are two ways to interact with SQL databases:
read_sql_query
,read_sql_table
,to_sql
The high-level functions are already quite polished (but there are still some issues with NaN, datetime, schema, ..), but the OO API should still be discussed and is not yet really public.
Current design:
PandasSQLAlchemy
object (withexecute
,read_sql
,read_table
,to_sql
,has_table
,drop_table
methods)PandasSQLTable
object to do the actual table reading/writingSome questions remain:
MetaData
Table
description instead of generating automatically?HDF5Store
? (getitem/setitem support? egpandas_sql['table_name']
to read a table?)PandasSQLAlchemy
? Or alsoPandasSQLTable
?PandasSQLAlchemy
should be public?PandasSQLAlchemy
-> do we need a better name? ('SQLDatabase', 'SQLdb', 'SQLConnection', ...)read_sql
->read_query
?)PandasSQLAlchemy
?Engine
? or wrapper aroundEngine
+MetaData
The text was updated successfully, but these errors were encountered: