This repository was archived by the owner on Jun 18, 2024. It is now read-only.

How to handle empty fields #240

Closed
gbinal opened this issue Jan 3, 2014 · 10 comments

Comments

@gbinal
Contributor

gbinal commented Jan 3, 2014

A question for folks: an issue that's come up a good bit as we begin harvesting agency public data listings is whether to dictate how to handle empty fields. Take, for example, the webService field. There is, of course, the scenario where it's populated:

image

But there are plenty of other times when it's convenient for the field to still be in the record even when it's unpopulated. So far, I've seen this happen two ways:

  1. with a null value:
    image

  2. or with empty quotation marks:
    image

Should either or both of these be acceptable from a validation and harvesting perspective?

@konklone, @benbalter, @waldoj, @kinlane, @philipashlock, @dwcaraway, @kvuppala, @FuhuXia - can you weigh in?

@konklone
Contributor

konklone commented Jan 3, 2014

Choosing between null and "", definitely null. A null affirmatively indicates there is no value there, whereas "" is actually a value of its own.

@waldoj

waldoj commented Jan 3, 2014

Ruh-roh—a null-versus-empty debate. With luck, we'll all favor null, and we can avoid any holy wars.

@konklone
Contributor

konklone commented Jan 3, 2014

This actually isn't null vs a missing field -- it's null versus an empty string -- I'd think that's an easier choice...

@kvuppala

kvuppala commented Jan 3, 2014

The current harvesting tools have to special-case either approach, whether it's null or an empty string (""). I would rather that empty fields not be included in the JSON at all.
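
A minimal sketch of the "leave it out" option, assuming the publisher builds each record as a Python dict before serializing. The helper name `drop_null_fields` and the sample record are hypothetical; only the webService key comes from this thread.

```python
import json


def drop_null_fields(record):
    """Return a copy of the record without keys whose value is None."""
    return {key: value for key, value in record.items() if value is not None}


# Hypothetical record with an unpopulated webService field.
record = {"title": "Example dataset", "webService": None}

# The unpopulated key is omitted entirely instead of being written as null.
compact = json.dumps(drop_null_fields(record), sort_keys=True)
print(compact)
```

With this approach a harvester never sees null for the field, but it does have to tolerate the key being absent.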

@philipashlock
Contributor

I think the full question is between three possibilities: null, empty string, or missing field. I think we can all agree that an empty string is the most ambiguous and least desirable thing to represent an unspecified field. After that I don't have too strong of feelings about null versus missing field. I'd be ok with parsers needing to be able to handle either but treat them as equivalent. There's the risk of serialization issues with this, but this is already standard for JSON and we've already said the schema can be extended in unexpected ways, so this is nothing new.
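
To make the three possibilities concrete, here is a small sketch of how each one looks after parsing with Python's standard json module. The sample records and the `describe` helper are made up for illustration; only the webService field name comes from this thread.

```python
import json

# Three hypothetical ways a record can carry an unpopulated webService field.
null_record = json.loads('{"title": "Example", "webService": null}')
empty_record = json.loads('{"title": "Example", "webService": ""}')
missing_record = json.loads('{"title": "Example"}')


def describe(record, key="webService"):
    """Report which shape the field takes in a parsed record."""
    if key not in record:
        return "missing"
    if record[key] is None:
        return "null"
    if record[key] == "":
        return "empty string"
    return "populated"


for rec in (null_record, empty_record, missing_record):
    print(describe(rec))
```

JSON null parses to None, the empty string stays a real str value, and a missing key only shows up through a membership test, so a parser that wants to treat null and missing as equivalent has to check for both.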

As for JSON Schema, which is currently the main basis for validation, we could add null as another possible type for any non-required field. This feels like a bit of a hack, though. I think it would be better if JSON Schema let you define the equivalency between null and empty fields on a schema-by-schema basis, and then successfully validated any field that came in as null when it was defined exclusively as another type (say, a string) but wasn't specified as required. I believe JSON Schema accepts empty strings as valid if the only constraint used to validate them is "type":"string" (which is often the case for the current POD JSON Schema), but if a format is specified (like "format":"uri" for webService) or if something like minLength is defined (and above 0), then an empty string wouldn't validate even for a non-required field.
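
The "add null as another possible type" idea can be sketched with a toy checker. This is not a JSON Schema implementation: `check_field` is a hypothetical helper that mimics only the "type" and "minLength" keywords, and the three rules are illustrative rather than copied from the POD schema.

```python
def check_field(value, rule):
    """Check one value against a tiny subset of JSON Schema keywords.

    Only "type" (a name or a list of names) and "minLength" are handled.
    """
    type_names = rule.get("type", [])
    if isinstance(type_names, str):
        type_names = [type_names]
    matches = {
        "string": isinstance(value, str),
        "null": value is None,
    }
    if type_names and not any(matches.get(name, False) for name in type_names):
        return False
    # minLength constrains strings only; other types are unaffected by it.
    if isinstance(value, str) and len(value) < rule.get("minLength", 0):
        return False
    return True


plain_rule = {"type": "string"}
strict_rule = {"type": "string", "minLength": 1}
nullable_rule = {"type": ["string", "null"], "minLength": 1}

print(check_field("", plain_rule))       # empty string passes a bare type check
print(check_field(None, strict_rule))    # null fails when only "string" is allowed
print(check_field(None, nullable_rule))  # null passes once "null" is listed
print(check_field("", nullable_rule))    # empty string still fails minLength
```

Listing "null" next to "string" lets publishers keep the key with a null value, while minLength continues to reject the empty string.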

I haven't dug into the current state of JSON Schema enough to see how much an option for specifying null and missing field equivalency would make sense, but this thread seems relevant - https://groups.google.com/forum/#!searchin/json-schema/null$20missing/json-schema/QCFLf-H9D50/FVRLnAjwZ30J

@philipashlock
Contributor

Sorry, didn't mean to click "Close & Comment" just "Close"

I've done that more than once. It seems like it'd be better if "Close" turned into just a checkbox to make it harder to accidentally close when you just wanted to comment - or maybe have the button be colored red to make it more obvious. It's surprisingly easy to quickly click a normal-looking button with the word "comment" in it. @benbalter where can I file a ticket for that? ;)

@dwcaraway
Contributor

agree with conclusions thus far.

  • empty string is a value
  • null or missing keys are treated equivalently

my opinion

  • for a required field, value must be non-null, if string, must contain at
    least 1 non-whitespace character
  • for optional or free fields, null, missing or empty string value is valid
    providing field has no format requirements
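
The rules above could be sketched roughly as follows. The function name `is_acceptable` and its `has_format` flag are hypothetical, and the sketch assumes the "no format requirements" condition applies to the empty string only, since null and missing are treated as "no value". Checking a populated value against its format is left out.

```python
def is_acceptable(record, key, required, has_format=False):
    """Apply the required/optional rules to one field of a parsed record."""
    value = record.get(key)
    # Null and missing keys are treated equivalently.
    has_value = key in record and value is not None

    if required:
        if not has_value:
            return False
        # A required string needs at least one non-whitespace character.
        if isinstance(value, str) and not value.strip():
            return False
        return True

    if not has_value:
        return True
    if value == "":
        # Assumption: an empty string is only tolerated without a format rule.
        return not has_format
    return True


print(is_acceptable({"title": "   "}, "title", required=True))
print(is_acceptable({"webService": None}, "webService", required=False, has_format=True))
print(is_acceptable({"webService": ""}, "webService", required=False, has_format=True))
```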

@pschweitzerusgsgov
Contributor

You're likely to see empty arrays as well. It's all well and good to say how things should be, but it's better to make your code not fail when it sees something that you wish were not there. So yes, encourage people to use null rather than an empty string (or an empty array), but treat all of those values the same. Have you seen some of the other problems that are in some data.json files? This is not a major problem!
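
A tolerant harvester along these lines might normalize records before doing anything else. This is a sketch under assumptions: `is_unpopulated` and `normalize` are hypothetical helpers, and the sample field names other than webService are only illustrative.

```python
def is_unpopulated(value):
    """Treat null, the empty string, and the empty array as 'no value'."""
    return value is None or value == "" or value == []


def normalize(record):
    """Drop unpopulated fields so they look the same as missing keys."""
    return {key: value for key, value in record.items() if not is_unpopulated(value)}


# Hypothetical record mixing all three unpopulated shapes.
messy = {"title": "Example", "webService": "", "keyword": [], "license": None}
print(normalize(messy))
```

After normalization the rest of the harvester only has to handle one case, a key that is absent, whichever shape the publisher chose.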

@gbinal
Contributor Author

gbinal commented Jan 28, 2014

It's all well and good to say how things should be, but it's better to make your code not fail when it sees something that you wish were not there. So yes, encourage people to use null rather than an empty string (or an empty array), but treat all of those values the same.

You're right - I think what's at issue here is what the guidance to agencies should be. There seems to be strong consensus here, so I've put forth pull request #260 for that.

gbinal closed this as completed Jan 29, 2014

@philipashlock
Contributor

Here's a commit to update the JSON schema with the flexibility to use null on non-required fields:
202fb58
