Skip to content

Google App Engine and External Database #466

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pjebs opened this issue Jun 29, 2016 · 27 comments
Closed

Google App Engine and External Database #466

pjebs opened this issue Jun 29, 2016 · 27 comments
Labels

Comments

@pjebs
Copy link

pjebs commented Jun 29, 2016

I solved how to connect to an external database using Google App Engine (Standard Environment): See #457

"google.golang.org/appengine/socket"

dial := func(addr string) (net.Conn, error) {
        return socket.Dial(appengine.NewContext(r), "tcp", addr)
    }

mysql.RegisterDial("external", dial)

The new issue is:
In order to connect externally, I need to use socket API which requires a request to create the Google Context.

This leads me to believe that we can't allow this driver to use connection pooling due to a new request requiring a new context. Am I correct in this assumption?

If we can't use connection pooling, what's the best way to manage it?
Is there a setting to disable connection pooling (i.e. max connection 1, 0 second idle time etc) OR create a connection at start of request cycle and close it and end of request cycle.

The documentation says DO NOT CLOSE connection. Is this a good exception to the directive?

@pjebs
Copy link
Author

pjebs commented Jun 29, 2016

@benguild this may interest you since you use GAE and everyone knows Google Cloud SQL is prohibitively expensive.

@benguild
Copy link

@pjebs Are you connecting to an external SQL server, or using this with Google Cloud SQL? I use smaller Google Cloud SQL instances and have a "memcache" layer in between to reduce the load.

@pjebs
Copy link
Author

pjebs commented Jun 29, 2016

I just started using an external SQL server instead of Cloud SQL. Of course caching is important, but there are plenty of cheaper managed databases around or you can manage your own. I believe even AWS RDS is slightly cheaper that Cloud SQL.

@benguild
Copy link

Are you sure you can't use this context? https://github.com/benguild/GAEBridge/tree/master/context
The connection pool is definitely in your best interests, I think. The added latency of creating connections and stuff over and over is not something you want to add to your application. It makes more sense to "hand off" connections that are idle to other requests.

@pjebs
Copy link
Author

pjebs commented Jun 29, 2016

If you are trying to connect to something outside the Google App Engine ecosystem, you need to use URLFetch or SocketAPI. Both require the current request to create a google context. There is no other way to connect to anything externally.

The context will automatically expire after the request cycle is over. I believe (???) this will invalidate the database connection made during the request, after the request is over.

Your library is only useful during any specific request.

@benguild
Copy link

So this context is separate from the request's context?

@benguild
Copy link

And the socket connection only exists in that context, which expires?

@benguild
Copy link

Can you pass the context between request contexts?

@pjebs
Copy link
Author

pjebs commented Jun 29, 2016

A1: The context is the request's context. You can use your library to pass the context to the dial function if you want.

A3: You definitely can't pass contexts between individual requests (hence why I put the Cleanup in your library to be called after request cycle is over)

A2: Not sure what happens to the connection after request cycle expires. If it does invalidate, I'm asking the authorities if this is a good reason to close connection and depart from their directive.

@benguild
Copy link

benguild commented Jun 29, 2016

Gotcha. Well it definitely sounds like a limitation of GAE if you're losing access to an actual connection between requests. You definitely cannot "pool" them in this case.

I think you should use Google Cloud SQL (which avoids this problem due to internal networking), or just close the connection pool between requests on your external box. Or, consider other hosting... because the added latency of an external DB server doesn't seem worth it if you're also losing the pool. You may have outgrown the practicality of GAE based on your budget.

@benguild
Copy link

For the record, I use the GAE "memcache" service heavily for this reason.

@pjebs
Copy link
Author

pjebs commented Jun 29, 2016

In a super busy GAE application, you probably need to use memcache, read replicas and master Cloud SQL

@benguild
Copy link

Honestly, "memcache" is fine as a "read replica" in most instances IMO, as long as your application is designed to utilize it. The only time data is read from the DB is if the cache is lost or entries are purged without just being updated.

What's your exact issue at this time?

@pjebs
Copy link
Author

pjebs commented Jun 29, 2016

No issue. I just moved over to an external database so was wondering what to do regarding disabling connection pooling. I read some documentation regarding this driver (which I can't find anymore) that suggested that if you call close on connection too often you may get memory leaks.

@pjebs
Copy link
Author

pjebs commented Jun 30, 2016

@arnehormann @julienschmidt I have done some research and diagnosed the problem and found out some possible solutions. There will need to be a slight change at the driver level. The change was done for the postgres driver: https://github.com/lib/pq/pull/287/files

1 ) GAE's socket API ("google.golang.org/appengine/socket") requires a custom dialler for mysql Driver
The Conn object needs to be created with a context object which is linked to the current request. If a new context is not provided, then the connection to the database will be invalidated and probably closed. But definitely unusable.

  1. In order for the database connection to work for a new request, setContext from socket API must be called:
    https://github.com/golang/appengine/blob/master/socket/socket_classic.go#L161 (explanation)
    https://groups.google.com/forum/#!msg/google-appengine-go/FtWll9cYjmo/YoEZSJk_CgAJ (shows the current mysql driver doesn't work)

  2. More info: http://grokbase.com/t/gg/golang-nuts/148hpkvzn1/go-nuts-proposal-setcontext-inside-database-sql-to-support-the-app-engine-environment

  3. The only other solution is not using connection pooling and just opening/closing per request cycle. But that suffers from their connection limit of Socket Connect Count 864,000 (per day) 4,800 (per minute)

https://cloud.google.com/appengine/docs/quotas?hl=en#Sockets

@methane
Copy link
Member

methane commented Sep 26, 2016

The Conn object needs to be created with a context object which is linked to the current request.

Where is the limitation written in doc?
Can't you use context.Background() ?

@pjebs
Copy link
Author

pjebs commented Sep 26, 2016

@methane
In GAE, you need to create a context specifically via: appengine.NewContext(r)

That context must be passed to the Conn struct that mysql driver uses via the setContext() function found in google.golang.org/appengine/socket : https://github.com/golang/appengine/blob/master/socket/socket_classic.go#L161

appengine.NewContext must be called somewhere in the request-response cycle to obtain the Context.

@methane
Copy link
Member

methane commented Sep 26, 2016

In GAE, you need to create a context specifically via: appengine.NewContext(r)

Where is the limitation written in the doc??

That context must be passed to the Conn struct that mysql driver uses via the setContext() function found in google.golang.org/appengine/socket : https://github.com/golang/appengine/blob/master/socket/socket_classic.go#L161

The functions receives golang.org/x/net/context.Context. Why can't you use context.Background() to obtain context which not bound to request?

appengine.NewContext must be called somewhere in the request-response cycle to obtain the Context.

Where is the limitation written in the doc???

@pjebs
Copy link
Author

pjebs commented Sep 26, 2016

https://cloud.google.com/appengine/docs/go/reference#NewContext
https://cloud.google.com/appengine/docs/go/sockets/reference#Dial
https://github.com/golang/appengine/blob/master/socket/socket_classic.go#L161

Good Question regarding where is the limitation listed. GAE operates in a sandboxed environment. In order to escape, there are only 2 ways:
urlfetch (for outbound http requests): https://cloud.google.com/appengine/docs/go/issue-requests
socket api (for connecting to external mysql): https://cloud.google.com/appengine/docs/go/sockets/
and also: https://cloud.google.com/appengine/docs/go/sockets/reference#Dial (see how dial requires a context)

https://cloud.google.com/appengine/docs/go/sockets/reference#Conn.SetContext (It might be possible to infer it from reading Dial and SetContext - but good question- I can't find where it's listed EXACTLY, but everyone who uses the platform knows this as common knowledge).

If you don't attach a context to socket api via: appengine.NewContext(r)

you get this error:
error: dial tcp 173.XXX.XXX.444:3306: socket: operation not permitted

@benguild
Copy link

@pjebs Just to nudge this since I don't have your email (you can grab mine off my website if you want!) ... Have you been satisfied with Google Cloud SQL 2nd generation? It seems like their use of mysqlproxy solves the zombie connections issue and the instances are much more reasonably priced. (plus better performance...)

@pjebs
Copy link
Author

pjebs commented Sep 28, 2016

I've heard there are issues with using this driver with Gen2 CloudSql. I'm still not in production so just using a simple Gen 1 for now.

@benguild
Copy link

@pjebs Issues like what? I haven't had any problems and we're in production. Send me an email so that we're not hogging the thread :)

@pjebs
Copy link
Author

pjebs commented Sep 28, 2016

The issues with this driver: #484

@benguild
Copy link

I posted a follow-up to #484. No issues in production besides Google's documentation being wrong. Maybe it's been updated now, but like I said we're using it in production without problems.

@methane
Copy link
Member

methane commented Sep 28, 2016

This issue is completely different from #484.
#483 is relating.

@pjebs
Copy link
Author

pjebs commented Oct 7, 2016

I attempted to work around database/sql's connection pooling by creating a wrapper since otherwise it requires changes to the mysql driver. My attempt Q&D work-around can be found here: https://github.com/pjebs/GAE-Toolkit-Go/tree/master/sql

If anyone wants to help go ahead.

@methane
Copy link
Member

methane commented Apr 3, 2019

This is not an issue for new GAE/Go based on gVisor.
I don't think it's worth enough to make effort for only legacy environment.

@methane methane closed this as completed Apr 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants