Feature request (dynamodb-util): Expose premarshall configuration #1533

Closed
@russell-dot-js

Is your feature request related to a problem? Please describe.
The problem, as detailed here, is that the current (v2) DocumentClient, and the upcoming (v3) DocumentClient, will serialize native JS Dates as {}.

We cannot make assumptions about how consumers want to serialize Dates: some want ISO strings, some want numbers, and some want just a date with no time component. We can, however, expose configuration that lets consumers instruct the client how to serialize certain attributes based on type or name, without overstepping the client's responsibility and becoming a full-blown ORM.

Describe the solution you'd like
I'd love to hammer out details of the contract here, and as an avid dynamo user, I'm also happy to implement this feature. The rough concept is something along these lines:

premarshall: (args: { TableName: string; ItemKey: KeyType; unmarshalledItem: any; attributeKey: string; attributeValue: any }) => any;

new DocumentClient({
  premarshall: ({attributeKey: key, attributeValue: value}) => {
    if(value instanceof Date) {
      return value.toISOString(); // or convert to number depending on use case
    }
    if(value instanceof moment) {
      return value.format();
    }

    if(key === 'someKeyIDontWantInDynamo') {
      return undefined;
    }
    return value;
  }
})

I included TableName in case you want to serialize items differently based on table (especially important if you are writing to multiple tables during a transaction, you might have different rules in place for each one).
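
To make the per-table idea concrete, here is a minimal sketch of how such a hook might be applied. Nothing here is an existing SDK API: `PremarshallArgs`, `applyPremarshall`, and the `Events` table name are hypothetical, and the sketch only walks top-level attributes.

```typescript
// Hypothetical argument shape for the proposed hook; not part of any released SDK.
interface PremarshallArgs {
  TableName: string;
  unmarshalledItem: Record<string, unknown>;
  attributeKey: string;
  attributeValue: unknown;
}

type Premarshall = (args: PremarshallArgs) => unknown;

// Runs the hook over each top-level attribute of an item.
// Attributes for which the hook returns undefined are dropped.
function applyPremarshall(
  TableName: string,
  item: Record<string, unknown>,
  premarshall: Premarshall
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [attributeKey, attributeValue] of Object.entries(item)) {
    const result = premarshall({
      TableName,
      unmarshalledItem: item,
      attributeKey,
      attributeValue,
    });
    if (result !== undefined) {
      out[attributeKey] = result;
    }
  }
  return out;
}

// Example rule set: the made-up "Events" table stores epoch milliseconds,
// every other table stores ISO 8601 strings.
const perTableDates: Premarshall = ({ TableName, attributeValue }) => {
  if (attributeValue instanceof Date) {
    return TableName === "Events"
      ? attributeValue.getTime()
      : attributeValue.toISOString();
  }
  return attributeValue;
};
```

With rules like these, a transaction that writes to two tables could reuse one hook and still get a different Date representation per table.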

Describe alternatives you've considered
I'm not sure if we should cross the line into allowing consumers to change the key name... it would make sense for Put, but is kind of confusing for UpdateItem, where your UpdateExpression should already have your keys clearly defined.

But this is something every dynamo client struggles with - creating a clean, usable client that handles all the unique (and kind of weird) ways Dynamo is used. For example, both DynamoDB Data Mapper and DynamoDB Toolbox provide ways to instruct the client on how to map attributes, but they also come at the cost of having to strictly define your types either as classes with decorators, or via configs, so it's easy to create disconnects between the client configuration and the types you are passing to the DB when using typescript.

Thus, pure DocumentClient tends to be more usable than either of these alternatives, and I want to be careful not to take on too much responsibility. However, it might be nice to allow consumers to do something like the following contrived example:

new DocumentClient({
  premarshall: ({unmarshalledItem: item, attributeKey, attributeValue}) => {
    if(['createdAt', 'updatedAt', 'favoriteColor'].includes(attributeKey)) {
      return undefined;
    }

    if(attributeKey === 'hometown') {
      return {
        attributeKey: 'GSI1_Sort',
        attributeValue: `${item.favoriteColor}_${item.hometown}_${item.updatedAt}_${item.createdAt}`
      };
    }

    return { attributeKey, attributeValue };
  }
})

In this example, the consumer is choosing to cram createdAt, updatedAt, favoriteColor, and hometown into a GSI's sort key, and remove those individual attributes from the object itself. But this is where it's easy for the client to start to box users in and take on too much responsibility, leading to complexity and, in some cases, becoming unusable. For example, if I wanted to store all 4 attributes separately AS WELL AS cram them into GSI1_Sort, it would be easy for 3 of the 4, but what about hometown, where I rewrite the attribute name?

Additional context
IMO it's more important to have this on the way in to dynamo than on the way out, because there's always some casting from DB values to runtime values anyway (e.g. converting an ISO string back to a Date, stripping database concerns, etc). But some way to inject casting logic that could be applied to Item, Attributes, Items, etc. based on endpoint is a secondary item on my wishlist.
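
For illustration, the "way out" casting that consumers typically write by hand might look like the sketch below. `reviveDates` and its `dateKeys` parameter are hypothetical names, and it assumes dates were stored either as ISO strings or as epoch milliseconds.

```typescript
// Returns a shallow copy of a stored item in which the named attributes
// are converted back to Date objects. Values that do not parse are left as-is.
function reviveDates(
  item: Record<string, unknown>,
  dateKeys: readonly string[]
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...item };
  for (const key of dateKeys) {
    const raw = out[key];
    if (typeof raw === "string" || typeof raw === "number") {
      const parsed = new Date(raw);
      // An unparseable value yields an invalid Date whose time is NaN.
      if (!Number.isNaN(parsed.getTime())) {
        out[key] = parsed;
      }
    }
  }
  return out;
}
```

A consumer would call something like `reviveDates(result.Item, ["createdAt", "updatedAt"])` after a read, which is the kind of per-call boilerplate a response-side hook could absorb.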

Obviously this is something that consumers can do themselves before calling UpdateItem / PutItem, but it creates additional complexity if, for example, every consumer has to wrap the documentClient with their own marshalling functionality. Or, even worse, have one-off code like this:

documentClient.put({
  TableName,
  Item: {
    ...value,
    // every Date attribute has to be converted by hand
    createdAt: value.createdAt.toISOString(),
  }
})

Code like the above becomes highly error-prone: even with typescript, adding a new Date field to T will not produce a type error, so nothing reminds the consumer to convert it. Consumers have to jump through additional hoops to keep their DB logic in line with their types, when in reality it should "just work" (without having to cross the line into a full-blown ORM).
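
As a point of comparison, the wrapper workaround that consumers end up writing today might look roughly like this. It is only a sketch: `serializeDates` and `isPlainObject` are made-up helper names, and ISO strings are just one of the possible representations.

```typescript
// True only for object literals and Object.create(null) objects, so that
// Sets, Buffers, and class instances are passed through untouched.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  if (value === null || typeof value !== "object") return false;
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

// Recursively copies a value, replacing every Date with its ISO 8601 string.
function serializeDates(value: unknown): unknown {
  if (value instanceof Date) return value.toISOString();
  if (Array.isArray(value)) return value.map(serializeDates);
  if (isPlainObject(value)) {
    const out: Record<string, unknown> = {};
    for (const [key, nested] of Object.entries(value)) {
      out[key] = serializeDates(nested);
    }
    return out;
  }
  return value;
}
```

The item would then be passed through `serializeDates(value)` before each `put` call, so a newly added Date field is converted without touching the call sites. Every consumer still has to remember to route writes through the helper, which is the friction the proposed hook is meant to remove.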

Also: DynamoDB is rad 👍

Metadata

Assignees

No one assigned

    Labels

    feature-request: New feature or enhancement. May require GitHub community feedback.
    wontfix: We have determined that we will not resolve the issue.
