Provide a legal way to compose result of query when handling rows manually #110

Closed
dimsmol opened this issue Mar 21, 2012 · 15 comments

Comments

@dimsmol

dimsmol commented Mar 21, 2012

When handling rows manually, I often end up with something like:

var processedRows = [];
query
    .on('row', function (row) {
        var processedRow = processRow(row); // create an object from row or whatever
        processedRows.push(processedRow);
    })
    .on('end', function () {
        doSomething(processedRows);
    });

The code could be clearer if we had legal access to the result object from the row event:

query
    .on('row', function (row, result) {
        var processedRow = processRow(row); // create an object from row or whatever
        result.addRow(processedRow);
    })
    .on('end', function (result) {
        doSomething(result.rows);
    });
@brianc
Owner

brianc commented Mar 21, 2012

Supplying rows to the end event would require buffering all the rows into memory. Doing that completely removes the evented model's ability to stream large result sets without buffering them.
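
For contrast, a minimal sketch of the streaming usage that unconditional buffering would defeat; handleRow here is a hypothetical per-row consumer, not part of the library:

query
    .on('row', function (row) {
        handleRow(row); // each row is consumed as it arrives and never retained
    })
    .on('end', function () {
        // nothing was buffered, so memory use stays flat regardless of result size
    });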

@dimsmol
Author

dimsmol commented Mar 21, 2012

Totally agree, but it's a very convenient way when the query is sure to return only a few rows (e.g. it has an explicit LIMIT clause).

Another example: processing that produces very few results even over a large number of rows. With control over what gets added, you can selectively store only the data you really want to buffer.

The code already has this kind of buffering mechanism for callback-style calls, so providing this feature would require only very minor changes.

@dimsmol
Author

dimsmol commented Mar 21, 2012

It would also be useful to reserve some property of the result object for storing non-array results of row processing. It could be data, for example. This would allow code like this:

query
    .on('row', function (row, result) {
        result.data = result.data || {};
        var kind = veryComplexFuncToDetermineKind(row);
        var count = result.data[kind];
        result.data[kind] = (count == null ? 1 : count + 1);
    })
    .on('end', function (result) {
        done(result.data); // calculated counts for different kinds
    });

The main idea is to allow composing the result object while staying encapsulated within a particular query's processing, without introducing additional variables that are not query-bound and that get mixed up when several queries are processed within a single scope.
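
For illustration, this is the kind of scope-level bookkeeping meant here (queryA and queryB are hypothetical queries sharing one scope); each query needs its own accumulator because neither is bound to its query:

var countsA = {};
var countsB = {};
queryA.on('row', function (row) {
    var kind = veryComplexFuncToDetermineKind(row);
    countsA[kind] = (countsA[kind] || 0) + 1;
});
queryB.on('row', function (row) {
    var kind = veryComplexFuncToDetermineKind(row);
    countsB[kind] = (countsB[kind] || 0) + 1;
});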

@dimsmol
Author

dimsmol commented Mar 21, 2012

Just to be clear.

I'm not suggesting that results of row processing should always be added to the result object.
I'm suggesting we provide that ability, with the programmer in full control of what to add (or not add).

Of course, nothing would be added by default.
If your code doesn't use the proposed second argument of the row event, it will behave exactly the same way as before, without any buffering.
Existing code will also behave exactly as before, without any buffering.

@brianc
Owner

brianc commented Mar 21, 2012

I think some of the use cases you've outlined could be good in certain scenarios; however, the value/complexity ratio to me is unjustifiable for a database client. A database client (or any low-level infrastructure library) should strive for the minimal interface possible to "get the job done" and work towards complete stability and database feature support. Sugar, as suggested by this issue and issue #109, can be added in your application code with very little to no monkey-patching. I'm not aiming to provide the jQuery of database interfaces. I'm aiming to provide a battle-tested, low level, bug free way to run queries against postgres. I know it's nice to have some extra features, but from a library standpoint what can be added in your own code should be added in your own code.

Speaking from a bit of experience and going off completely on a tangent, I want to give an example so you know "I feel your pain." .NET has probably the world's most verbose database client. I make no exaggeration when I provide the following example of the steps involved in running a simple, parameterized query against a database (postgres included):


using(var connection = new SqlConnection(/*connection string*/))
{
  connection.Open();
  using(var cmd = connection.CreateCommand())
  {
    cmd.CommandText = "SELECT name FROM people WHERE id = @id";
    var param = cmd.CreateParameter();
    param.ParameterName = "@id";
    param.Value = 1;
    cmd.Parameters.Add(param);
    using(var reader = cmd.ExecuteReader())
    {
      while(reader.Read())
      {
        return reader["name"];
      }
    }
  }
}

While that's an extreme and verbose amount of code, you can wrap most of it up in such a way as to provide a single-line interface for running queries. Does this mean the .NET guys should have made a one-line way to run queries? Probably not, because the low-level interface their client provides allows you to set types on parameters, stream row results, execute queries asynchronously, try/catch various parts of the query execution, use transactions, log steps of query execution, etc.

tl;dr: node-postgres strives to be sugar-free, bug-free, and overhead-free. Anything that can be handled in the consuming code should be handled there.

@dimsmol
Author

dimsmol commented Mar 21, 2012

I agree that things should be kept as simple as possible.
It can certainly be done in client code, but that requires subclassing Query, which relies on its internal organization, and rewriting the query method of Client (to use the new subclass), which relies on internal organization even more.

On the other hand, this issue requires changing only one line of code, replacing self.emit('row', row); with self.emit('row', row, _result);, which does not look like it adds much complexity.

And it requires only one additional line of code to reserve the data field of Result:

var Result = function() {
  this.rows = [];
  this.data = null; // reserved for client's code usage
};

However, data can be added to Result in client code even without this change; the change only guarantees that doing so is safe and helps a little with execution optimization. So this step is entirely optional if it looks too obscure.

These very small changes would greatly simplify extending the results handling mechanics while still keeping things simple.
They don't try to add any new features to the library, they just provide a simpler way to do it in client code.

@dimsmol
Author

dimsmol commented Mar 21, 2012

I created a little fetchAll helper function that looks like a better (and more JS-style) solution than trying to tune the library internals.
I think this issue can be closed with "wontfix" :)
Sorry for all the buzz.

@dimsmol
Author

dimsmol commented Mar 21, 2012

Here is the code, if someone is interested:

var Query = require('pg/lib/query');
var fetchAll = function (config, cb) {
    if (config instanceof Query) {
        config = {
            query: config
        };
    }

    var query = config.query;
    var data = config.initialData || [];
    var collect = config.collect;

    var doCollect;

    if (collect) {
        doCollect = function (row) {
            collect(row, data);
        };
    } else {
        doCollect = function (row) {
            data.push(row);
        };
    }

    query
        .on('row', doCollect)
        .on('error', cb)
        .on('end', function (result) {
            // NOTE: ensure we are not in the error case; the error has already been handled at this point
            if (result != null) {
                cb(null, data);
            }
        });
};

and usage:

// collect all query rows
fetchAll(query, cb);

// collect ids extracted from rows
fetchAll({
    query: query,
    collect: function (row, data) {
        data.push(row.id);
    }
}, cb);

// create dict of rows using id as a key
fetchAll({
    query: query,
    initialData: {},
    collect: function (row, data) {
        data[row.id] = row;
    }
}, cb);

@chowey
Contributor

chowey commented Mar 24, 2012

For the record, although it is bad form, you can do this:

query
    .on('row', function (row) {
        var processedRow = processRow(row); // create an object from row or whatever
        this._result.addRow(processedRow);
    })
    .on('end', function (result) {
        doSomething(result.rows);
    });

@dimsmol
Author

dimsmol commented Mar 26, 2012

Very bad form, I agree :)

@chowey
Contributor

chowey commented Mar 26, 2012

Having used this in code now, I see that there is a very strong case for providing the result object to the row event.

It allows you to buffer all the rows into memory, but it doesn't do it automatically.

The result object would just need to be documented.

So +1 from me for this addition.

@dimsmol
Author

dimsmol commented Mar 27, 2012

I think it's better to remove the rows collection code from Result, and maybe remove the class altogether, using a simple dictionary on end. But this cannot be done without breaking backwards compatibility, so first document it as deprecated along with the callback style and remove it completely in the future, keeping support for the event style only. That would simplify things and remove the confusing dualism from the library.
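
One possible reading of the "simple dictionary" idea, purely as a hypothetical sketch (the property names are illustrative, not an existing API):

query.on('end', function (info) {
    // info would be a plain object such as { command: 'SELECT', rowCount: 5 },
    // with no built-in rows array; all accumulation is left to the caller
    console.log(info.rowCount + ' rows were processed');
});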

@dimsmol
Author

dimsmol commented Mar 27, 2012

It's also very confusing that the readme mentions the callback style as "preferred" when the event style looks more mature and appropriate by now.

@chowey
Contributor

chowey commented Mar 27, 2012

There's nothing wrong with result.addRow() as an interface. It would help people who are just getting started. The callback style basically just does this (although it is implemented differently in code):

query.on('row', function (row, result) {
    result.addRow(row);
});

This could be documented so that a beginner knows how to accumulate rows.

Since result is just an object, you can add any old property you want on the fly. So you can already do custom dictionary stuff if you want:

query.on('row', function (row, result) {
    if (isInterestingRow(row))
        result.interestingRow = row;
});
query.on('end', function (result) {
    console.log('Here is some interesting data:');
    console.log(result.interestingRow);
});

Again, this just needs to be documented so that people understand its use.

@brianc
Owner

brianc commented Mar 27, 2012

Thanks for the pull request & good discussion.

brianc closed this as completed Mar 27, 2012