to_cassandra(table, session) #10765

Closed
rustyrazorblade opened this issue Aug 7, 2015 · 9 comments
Labels
IO Data: IO issues that don't fit into a more specific label

Comments

@rustyrazorblade

Adding Cassandra support should be very similar to to_sql(), but it is different enough that I think it needs its own call, since it wouldn't use SQLAlchemy or behave anything like it.

Python driver is here: https://github.com/datastax/python-driver

Willing to help on this if it'll get pulled in.
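
For illustration, here is a minimal sketch of what such a call might look like. It is not an existing pandas API: the function name follows the issue title, the `keyspace` parameter is hypothetical, and `session` is assumed to behave like the DataStax driver's `cassandra.cluster.Session` (only `prepare` and `execute` are used). The target table is assumed to exist already, with columns named after the DataFrame's columns.

```python
def to_cassandra(df, table, session, keyspace=None):
    """Insert every row of ``df`` into ``table`` with one prepared statement.

    Hypothetical helper, not part of pandas. ``session`` is assumed to
    expose ``prepare(query)`` and ``execute(statement, parameters)`` the
    way the DataStax Python driver's Session does.
    """
    columns = [str(c) for c in df.columns]
    target = f"{keyspace}.{table}" if keyspace else table
    # Prepared statements in the driver use "?" as the bind marker.
    placeholders = ", ".join("?" for _ in columns)
    cql = f"INSERT INTO {target} ({', '.join(columns)}) VALUES ({placeholders})"
    prepared = session.prepare(cql)

    written = 0
    # name=None yields plain tuples, index=False leaves the index out.
    for row in df.itertuples(index=False, name=None):
        session.execute(prepared, row)
        written += 1
    return written
```

A call would look like `to_cassandra(df, "users", session, keyspace="ks")`. The sketch writes rows one at a time and does no type conversion, so NumPy scalar values may need converting to native Python types before the driver accepts them.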

@cpcloud
Member

cpcloud commented Aug 7, 2015

This might be more suitable for odo. @rustyrazorblade thoughts on doing a PR there?

@jreback added the IO Data label Aug 7, 2015
@rustyrazorblade
Author

Is all future work on interacting with different databases going to the odo project? I don't mind doing the PR there; I just want to make sure Cassandra is a first-class citizen in the pandas world.

@cpcloud
Member

cpcloud commented Aug 7, 2015

> Is all future work on interacting with different Databases going to the odo project?

Not sure.

Pros:

  • Odo's entire job is converting things to other things, and it is built with extensibility in mind. Pandas' IO system isn't extensible without tacking on a new method.
  • As soon as you hook into the odo graph, you get conversion to nearly every other file format without having to write any additional code.

Cons:

  • Pandas has a lot of users; odo doesn't.
  • By extension, pandas has a familiar interface and odo doesn't.

@rustyrazorblade
Author

I suspect having that functionality in odo is a good thing no matter what. I'll take a look at it and see if I can get a PR in there for Cassandra.

@cpcloud
Member

cpcloud commented Aug 7, 2015

@rustyrazorblade awesome! great to have more contributors there

@jreback
Contributor

jreback commented Aug 7, 2015

@rustyrazorblade I think you should go ahead and contribute this to odo. If it makes sense, we can then port it back to pandas.

@jreback added this to the Someday milestone Aug 7, 2015
@jorisvandenbossche
Member

I think this is currently out of scope for core pandas, so I'm closing this.
But it would certainly make for a nice standalone package if someone is interested (and we may include some mechanism to plug such extension packages into pandas, see #9378).

@jorisvandenbossche modified the milestones: No action, Someday Nov 11, 2016
@kassett
Contributor

kassett commented Feb 17, 2024

@mroeschke I see that the odo repository mentioned above has not been maintained for the last 7 or 8 years. I have personally implemented the to_cql functionality in several projects and think it would be tremendously useful to have in pandas, especially given that Cassandra has become very popular in the last few years.
I have two questions:

  1. Can I implement this in Pandas?
  2. Writing to Cassandra is incredibly slow if you don't implement multiprocessing. Is this something that we are okay with doing in pandas?
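
To make the second question concrete, here is a rough sketch of one way the write could be fanned out: split the rows into chunks and hand each chunk to a worker via `concurrent.futures`. Everything named here is hypothetical (`chunked`, `write_in_parallel`, and the `write_chunk` callback are illustrative, not from any existing to_cql implementation). `write_chunk` is assumed to open its own Cassandra session inside the worker and return the number of rows it wrote.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most ``size`` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def write_in_parallel(rows, write_chunk, chunk_size=1000, executor=None):
    """Send chunks of ``rows`` to ``write_chunk`` on a pool of workers.

    ``write_chunk(chunk)`` is a caller-supplied (hypothetical) function
    that writes one chunk and returns the row count. With a process pool
    it must be a picklable, module-level function.
    """
    owns_pool = executor is None
    pool = ProcessPoolExecutor() if owns_pool else executor
    try:
        # map() preserves chunk order; sum() totals the per-chunk counts.
        return sum(pool.map(write_chunk, chunked(rows, chunk_size)))
    finally:
        if owns_pool:
            pool.shutdown()
```

Passing an executor explicitly lets the caller choose between processes and threads. Since the writes are network-bound, a `ThreadPoolExecutor` may be worth trying before paying the cost of separate processes.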

@mroeschke
Member

It's probably still too niche to belong in pandas, but if you create a Python package with it, we can add it to the ecosystem docs.


6 participants