
[HELP] How to do “for each” for 1,000,000 rows using node-mysql? #1370


Closed
Aminadav opened this issue Mar 15, 2016 · 13 comments

@Aminadav

http://stackoverflow.com/questions/36015279/how-to-do-for-each-for-1-000-000-rows-using-node-mysql

I'm using NodeJS. I want to do something with 1,000,000 rows without loading all the rows into memory.

Previously, when I used ASP Classic, I did:

do while not rec.eof
   ' do something
   rec.movenext
loop

In node-mysql I didn't find anything similar to a cursor, only:

connection.query('select * from bigdata', function(err, rows))

The problem is that I don't want to load all the rows at once.

@rm3nchaca

You can use LIMIT and implement some pagination, or if possible, filter the rows with WHERE. For example:

select * from bigdata where field='something' limit 0, 1000
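For illustration, the pagination idea above can be sketched as a loop that fetches one page at a time and stops when a short page comes back. The `fetchPage` helper here is hypothetical; in practice it would wrap something like `connection.query('select * from bigdata limit ?, ?', [offset, size], cb)`, but it is faked with an in-memory array so the control flow is self-contained:

```javascript
// Fake data set standing in for the bigdata table.
var data = [];
for (var i = 0; i < 25; i++) data.push({ id: i });

// Hypothetical page fetcher; a real one would issue a LIMIT offset, size query.
function fetchPage(offset, size, cb) {
  setImmediate(function () { cb(null, data.slice(offset, offset + size)); });
}

// Walk every page, calling onRow per row, then done when the last page is short.
function eachPage(offset, size, onRow, done) {
  fetchPage(offset, size, function (err, rows) {
    if (err) return done(err);
    rows.forEach(onRow);
    if (rows.length < size) return done(null);  // last page reached
    eachPage(offset + size, size, onRow, done); // fetch the next page
  });
}

var count = 0;
eachPage(0, 10, function (row) { count++; }, function (err) {
  console.log('rows seen:', count); // prints "rows seen: 25"
});
```

Note that on a busy table, plain OFFSET pagination can skip or repeat rows between pages; keyset pagination (`where id > lastSeenId limit N`) is the usual remedy.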

@Aminadav
Author

Thanks. I was hoping there was something built into the module.

@ZJONSSON
Contributor

You can use streams to process large datasets without blowing up memory. Here is an example with a transform stream:

Node 4+ example:

var stream = require('stream');

connection.query('select * from bigdata')
  .stream()
  .pipe(stream.Transform({
    objectMode: true,
    transform: function(data, encoding, callback) {
      // do something with data...
      callback();
    }
  }))
  .on('finish', function() { console.log('done'); });

Node 0.12 example:

In Node 4+ you can define the transform function in the options you pass to stream.Transform. In earlier versions you have to instantiate the object and define its _transform method directly, i.e.:

var Transform = require('stream').Transform,
    myTransform = new Transform({ objectMode: true });

myTransform._transform = function(data, encoding, callback) {
  // do something with data...
  callback();
};

connection.query('select * from bigdata')
  .stream()
  .pipe(myTransform)
  .on('finish', function() { console.log('done'); });

@Aminadav
Author

@ZJONSSON
Will data in this example contain an array of rows, or a single object? I want it to contain a single object, i.e. one row.
Also, you didn't set the highWaterMark. What is the default value? I need it to be 1.

Thanks.

@ZJONSSON
Contributor

I clarified the answer above a little. In a transform stream you handle only one record at a time (and call the callback when you are done). The highWaterMark defines how large the stream's internal buffer (i.e. how many records) can become before the source pauses; it resumes once the buffer has been drained by processing records.

Each record coming into the transform will be an object representing a single row. For this reason the transform (and any other stream processing the data) needs to be created with the option objectMode: true

@ZJONSSON
Contributor

Closing - please reopen if any further issues

@Aminadav
Author

Aminadav commented Mar 15, 2016

It's not working for me; only 'finish' is called.
Node Version: v5.6.0

connection.query('select id from stats limit 100')
  .stream()
  .pipe(stream.Transform({
    objectMode: true,
    transform: function(data, encoding, callback) {
      // this is never called
      res.write(util.inspect(data));
      callback();
    }
  }))
  .on('finish', function() { console.log('done'); res.end(); });

@Aminadav
Author

Now it's working. I had mistakenly used the node-mysql module instead of mysql.
Thanks.

@Aminadav
Author

Aminadav commented Mar 15, 2016

It's not working with a pool.

var conn = mysql.createPool(data);
conn
  .query('select * from stats limit 100')
  .stream()
  .pipe(require('stream').Transform({
    objectMode: true,
    transform: function(object, encoding, callback) {
      console.log(object);
      callback();
    }
  }));
conn.end();

@ZJONSSON
Contributor

At first glance it seems you are closing the connection immediately, before the stream has finished. Moving the call inside the finish event handler, i.e. .on('finish', () => conn.end()), should be the fix.

@ZJONSSON ZJONSSON reopened this Mar 16, 2016
@Aminadav
Author

Aminadav commented Mar 16, 2016

You're right. You can close the issue again. Thanks again.

@iSuperMan

Hi! I have a problem like @AminaG's, but I don't want to receive a single object/row in each transform callback. Instead I want to receive a small array of a certain size containing objects/rows. Is that possible?

var stream = require('stream');

connection.query('select * from bigdata')
  .stream()
  .pipe(stream.Transform({
    objectMode: true,
    transform: function(rows, encoding, callback) {
      rows.forEach(row => { /* do something with each row */ });
      callback();
    }
  }))
  .on('finish', () => connection.end());

@dougwilson
Member

Hi @iSuperMan. Streams are defined by Node.js itself, and object-mode streams will only ever give your transform function one object at a time. Since each object is a row, you only get one row at a time. If you want to receive multiple rows at once, you may need to ask how to do this elsewhere; it's a general Node.js stream question, not really anything to do with this module.

@mysqljs mysqljs deleted a comment from taylorgoolsby Aug 9, 2019
@mysqljs mysqljs deleted a comment from charutiwari04 Aug 9, 2019
@mysqljs mysqljs locked as resolved and limited conversation to collaborators Aug 9, 2019