This repository was archived by the owner on Jun 18, 2024. It is now read-only.

Problem with Format Regex in Single Entry JSON schema file? #279

Closed
cew821 opened this issue Feb 21, 2014 · 5 comments

Comments

@cew821

cew821 commented Feb 21, 2014

I think the regex added in #269 for the format field is not matching the "new" Microsoft MIME types, e.g.

application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Could someone more fluent in regex double check this for me?

@philipashlock
Contributor

Seems to be working for me. Here's a test: http://bl.ocks.org/philipashlock/raw/9144111/
Here's the gist of that test: https://gist.github.com/philipashlock/9144111

What makes you think it's not working?

@cew821
Author

cew821 commented Feb 21, 2014

@gbinal recently tried to run the data.gov harvester on Energy's json file. It seemed to throw an error on all of the files with that mime type (see example screenshot below). My understanding is that data.gov's harvester is using this file as the source for its validation. It's possible this issue is with data.gov and not this file, however.

[image: screenshot of the harvester error]

@mhogeweg
Contributor

@cew821 do you have the link to the energy data.json file?

@philipashlock
Contributor

Yeah, it might not be updated on data.gov harvester just yet, but from what I can tell it's working fine on that data.json. The only obvious recurring problem I see on Energy's data.json is that references is not in an array.

The JSON Schema file also does not yet have a regex for date fields that supports the full range of possible syntax for ISO 8601 dates, so some valid dates might not validate yet, but that's an issue with the JSON Schema file.
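A minimal sketch of the `references` fix mentioned above, assuming each dataset entry is a plain dict and that a non-array `references` value is a single string. `normalize_references` is a hypothetical helper name, not part of the schema or the harvester.

```python
def normalize_references(dataset):
    """Return a copy of a dataset entry whose `references` value is a list.

    Assumption: a non-list value is one string holding a single reference.
    """
    fixed = dict(dataset)  # shallow copy so the caller's dict is not modified
    refs = fixed.get("references")
    if isinstance(refs, str):
        # Wrap the lone string so the field becomes an array.
        fixed["references"] = [refs]
    return fixed


# Hypothetical entry for illustration only.
entry = {"title": "Example dataset", "references": "http://example.com/doc"}
print(normalize_references(entry)["references"])
```

Entries that already hold a list, or that have no `references` key at all, pass through unchanged.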

@cew821
Author

cew821 commented Feb 21, 2014

Ahhh... upon closer inspection, the regex referenced by the data.gov harvester is:
^[-\\w]+/[-\\w]+(\\.[-\\w]+)?([+][-\\w]+)?$
vs. the regex in the schema:
^[-\\w]+/[-\\w]+(\\.[-\\w]+)*([+][-\\w]+)?$

Looks like the harvester needs to be updated to use the * (zero or more) instead of the ? (zero or one). Closing this issue.

Re Energy's data.json: I've got a "fixed" version of our file that addresses the references problem, but the harvester was still choking on a few date + format things. Should be addressed shortly.
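The difference between the two patterns quoted above can be checked directly. This is a sketch using Python's `re` module, which is an assumption about tooling: the thread does not say which regex engine the harvester uses. The JSON-escaped `\\w` from the schema file becomes `\w` in a Python raw string, and `compare` is a hypothetical helper name.

```python
import re

# Pattern quoted for the data.gov harvester: at most one ".segment" group.
HARVESTER_PATTERN = re.compile(r"^[-\w]+/[-\w]+(\.[-\w]+)?([+][-\w]+)?$")
# Pattern quoted from the schema: any number of ".segment" groups.
SCHEMA_PATTERN = re.compile(r"^[-\w]+/[-\w]+(\.[-\w]+)*([+][-\w]+)?$")


def compare(mime_type):
    """Return (harvester_ok, schema_ok) for one MIME type string."""
    return (
        HARVESTER_PATTERN.match(mime_type) is not None,
        SCHEMA_PATTERN.match(mime_type) is not None,
    )


samples = [
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.ms-excel",
    "text/csv",
]
for mime in samples:
    print(mime, compare(mime))
```

The spreadsheet type has three dot-separated groups after `vnd` (`.openxmlformats-officedocument`, `.spreadsheetml`, `.sheet`), so the `?` version rejects it while the `*` version accepts it. Types with zero or one such group pass both patterns.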
