Proposed Final Metadata Schema 1.0 #44

MarinaNitze · 2013-05-16T02:01:45Z

Compiles all proposed changes with general consensus into this commit, for final approval.

jpmckinney · 2013-05-16T03:34:26Z

Note: also fixes #18

jpmckinney · 2013-08-25T15:56:28Z

keyword was renamed to keywords in #113. However, the RDF term is dcat:keyword. In order to follow the DCAT specification, the singular dcat:keyword must be used. Will the new documentation point out this difference between the RDF serialization and other serializations? It seems simpler to just use the same term for all serializations.

MarinaNitze · 2013-08-25T17:42:30Z

@jpmckinney In RDF, each keyword is listed individually, but here, we have an array. Is it still appropriate to keep it singular? That seems confusing and was why consensus was to rename it to the plural.

jpmckinney · 2013-08-25T18:22:48Z

In RDF, terms for properties are usually singular, even though you can state a property multiple times (unless it has cardinality restrictions). In some RDF serializations, instead of making multiple statements (which you must do in, for example, RDF/XML) you can instead use an array-like syntax, like in Turtle, e.g.:

ex:dataset dcat:keyword "foo", "bar", "baz" .

which is equivalent to:

ex:dataset dcat:keyword "foo" .
ex:dataset dcat:keyword "bar" .
ex:dataset dcat:keyword "baz" .

If this project is implementing DCAT, shouldn't it use the same terms? Is the confusion really so severe that it makes sense to deviate from a standard adopted by multiple organizations?
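A rough sketch of the equivalence described above, in plain Python with no RDF library: a JSON-style array stored under a singular property name expands to one statement per value. The helper name `expand_to_statements` and the `ex:dataset` subject are illustrative assumptions, not part of DCAT or of this schema.

```python
def expand_to_statements(subject, prop, values):
    """Return one (subject, property, value) statement per value.

    Hypothetical helper for illustration only: it mimics how a single
    array-valued field maps onto repeated statements of one singular property.
    """
    return [(subject, prop, value) for value in values]


statements = expand_to_statements("ex:dataset", "dcat:keyword", ["foo", "bar", "baz"])
for s, p, o in statements:
    # Print each statement in a Turtle-like form, one per line.
    print(f'{s} {p} "{o}" .')
```

The property name stays singular in both forms; only the number of statements changes.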

mhogeweg · 2013-08-25T19:19:23Z

I agree with @jpmckinney that keeping the property name 'keyword' instead of 'keywords' is preferred. similar to the above, various XML-based metadata encodings also use a single keyword/subject/themekey/... that is repeated as needed.

MarinaNitze · 2013-08-26T02:02:30Z

I think good arguments were made to keep "keyword" singular. Taking these into account in #44.

seanherron · 2013-09-09T15:05:24Z

Pinging this for action - we should either merge these in soon or wait until after November.

haleyvandyck · 2013-09-09T20:06:32Z

Thanks for the ping. Because changes to the metadata schema have policy implications, they must go through an internal review process (see Project Open Data Governance for more detail on review processes).

We are expecting to have final changes merged to the metadata next week, which will formally be V1.0 of the schema. Following those changes, and consistent with what's outlined in the governance, additional changes to the metadata schema will be evaluated in 6-month increments going forward to enable stability and version control.

Thanks for everyone's help and patience improving the first round of the schema.

seanherron · 2013-09-09T20:52:09Z

To help with https://github.com/MarinaMartin/project-open-data.github.io/commit/bc0797055587c6b97655ab4c5591f518da9a0776 I've posted the listing of OMB Agency/Bureau and Treasury Codes as CSV and JSON at https://github.com/seanherron/OMB-Agency-Bureau-and-Treasury-Codes

seanherron · 2013-09-11T03:25:09Z

Hey @MarinaMartin, maybe I missed this, but why does the cardinality of accessURL change from 0,n to 0,1? According to #16, distribution should simply be a collection of accessURLs and some metadata like filetype. If that's the case, shouldn't there be an N cardinality for that field? If not, what purpose does distribution now have since we're associating a one-to-one relationship between dataset access files and entries in the schema?

Sorry if this is addressed elsewhere, haven't been able to find it.

MarinaNitze · 2013-09-11T17:21:02Z

@seanherron The accessURL field itself should only contain 1 URL. You could include multiple accessURL fields in a distribution array. So isn't that 0,1? Maybe I confused it.

seanherron · 2013-09-11T17:23:18Z

@MarinaMartin I guess I'm confused about the use case then - why would you include multiple accessURLs in distribution and only one in the accessURL field?

MarinaNitze · 2013-09-11T17:24:48Z

@seanherron Because you don't have to use distribution. If your dataset just has 1 download location, just use the accessURL field. But if you have multiple download URLs, then distribution should be a concatenation of multiple accessURL + format pairs.
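To make the two cases concrete, here is a minimal sketch of how a consumer might collect download locations under the rule described above. The helper name `download_locations`, the example URLs, and the exact shape of the `distribution` entries (objects with `accessURL` and `format` keys) are assumptions for illustration; the schema text in the pull request is the authority.

```python
def download_locations(dataset):
    """Return a list of (accessURL, format) pairs for a dataset record.

    Assumption: `distribution`, when present, is a list of objects that each
    carry an `accessURL` and a `format`; otherwise the top-level `accessURL`
    and `format` fields describe the single download location.
    """
    if dataset.get("distribution"):
        return [(d["accessURL"], d.get("format")) for d in dataset["distribution"]]
    if "accessURL" in dataset:
        return [(dataset["accessURL"], dataset.get("format"))]
    return []  # no public download, so no accessURL at all


# One download location: the top-level accessURL field is enough.
single = {"accessURL": "https://example.gov/data.csv", "format": "text/csv"}

# Several download locations: use distribution instead.
multiple = {
    "distribution": [
        {"accessURL": "https://example.gov/data.csv", "format": "text/csv"},
        {"accessURL": "https://example.gov/data.json", "format": "application/json"},
    ]
}
```

Read this way, each accessURL value is always a single URL, and the multiplicity lives in the distribution array.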

mhogeweg · 2013-09-12T05:40:09Z

perhaps a clarification of the cardinality is in order. accessURL has a cardinality of (1,n), but is only required when the file is available for public download. that means there will be situations when there is no accessURL. hence cardinality could be (0,n).

distribution has cardinality of (0,n) and is set to not be required. slightly different condition from accessURL. but why would distribution be occurring multiple times if it is represented as a (one) array containing multiple pairs of accessURL/format?

it appears the cardinality is used both to indicate whether the field is optional/mandatory and whether there's one occurrence or multiple. but this doesn't address a field like keyword that occurs once but is a comma-separated list of terms, arrays, etc. Perhaps include an object model (UML) of the dcat json structure?
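One way to read the (min,max) notation discussed here is as a count of values, where a minimum of 0 means optional and a maximum of n means unbounded. The sketch below encodes that reading. The function name is hypothetical, and treating a JSON array as multiple occurrences is an assumption; that second point is exactly what the comment above says the schema leaves unclear.

```python
def satisfies_cardinality(value, minimum, maximum):
    """Check a field value against a (minimum, maximum) cardinality.

    `maximum` may be the string "n" for "no upper bound". A missing field is
    passed as None; a list counts as len(list) occurrences (an assumption).
    """
    if value is None:
        count = 0
    elif isinstance(value, list):
        count = len(value)
    else:
        count = 1
    if count < minimum:
        return False
    return maximum == "n" or count <= maximum


# A missing accessURL is acceptable under (0,1) but not under (1,n).
print(satisfies_cardinality(None, 0, 1))
print(satisfies_cardinality(None, 1, "n"))
```

Under this reading, a single distribution array holding several accessURL/format pairs would fail (0,1) if the array length is what gets counted, which is why the thread asks what the cardinality is meant to count.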

seanherron · 2013-09-12T15:22:09Z

@mhogeweg the cardinality of accessURL changes in this commit from (1,n) to (0,1) https://github.com/project-open-data/project-open-data.github.io/pull/44/files

JoshData · 2013-09-14T20:03:53Z

Should programOffice have cardinality (1, n) instead of (0, n)?

Why is format an array? Typically a URL will respond consistently with a single MIME type. (Or is this to support HTTP Accept?)

JoshData · 2013-09-14T20:47:10Z

accrualPeriodicity's example should be in title case

language's example needs to be updated to be an array of strings rather than a comma-separated string.

references's example should be an array but it looks a bit messed up.

PrimaryITInvestmentUII is the only field with an uppercase initial letter for JSON. (Not objecting, just flagging in case it is unintentional.)

systemOfRecords lost its details table (the table with cardinality etc.)

mhogeweg · 2013-09-18T14:54:53Z

@JoshData yes, programOffice should have cardinality (1,n) if it's mandatory (as seems to be the intention per the changes).

suggest modifying the example for language to be an array: change {"language":"es-MX, wo, nv, en-US"} to: { "language": [ "es-MX", "wo", "nv", "en-US" ] }. In general: test JSON examples with http://jsonlint.com/.
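A small sketch of the suggested fix, using Python's standard `json` module in place of an online linter: it parses the old example, splits the comma-separated string into an array, and re-serializes the result. The helper name `language_to_array` is hypothetical.

```python
import json


def language_to_array(record):
    """Return a copy of record whose "language" string is split into a list."""
    value = record.get("language")
    if not isinstance(value, str):
        return dict(record)  # already an array, or the field is absent
    # Split on commas and trim the surrounding whitespace from each tag.
    tags = [tag.strip() for tag in value.split(",") if tag.strip()]
    return {**record, "language": tags}


# json.loads raises an error on invalid JSON, so parsing doubles as a lint step.
old = json.loads('{"language":"es-MX, wo, nv, en-US"}')
new = language_to_array(old)
print(json.dumps(new))
```

Parsing every documentation example this way before publishing would catch the malformed ones flagged earlier in the thread.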

seanherron · 2013-09-18T14:56:09Z

@mhogeweg agree re: language

haleyvandyck · 2013-09-20T18:02:48Z

Thank you all for your incredible contributions and discussion to help improve the first version of the metadata schema.

At long last, we have finally cleared the changes represented in #44 through the White House processes.

In addition to the changes represented in this request we will be making some additional updates on the treatment of BureauCode and the addition of a ProgramCode as well. Details coming soon.

This pull request will constitute v1.0 of the schema. Per the Project Open Data governance, we will continue to evaluate changes to the schema over time at regular 6-month intervals. These changes and future ones will be tracked at /metadata-changelog.

Thank you all for taking part in this exciting, precedent-setting project -- we look forward to continuing to work with you all.

jpmckinney · 2013-09-20T18:22:07Z

Great work! It's been a pleasure participating in the schema's development. Will the examples be updated to match the schema, or should an issue be opened for that?

waldoj · 2013-09-20T19:02:21Z

👍 x 💯

haleyvandyck · 2013-09-20T20:47:26Z

@jpmckinney yes--please feel free to open an issue or a pull request with specific changes

MarinaNitze added 2 commits May 15, 2013 22:00

Clarified "distribution" guidance

fa42040

Addressing Issue #16.

Fixed "theme" naming error

0f1e7d5

The Category field guidance incorrectly referred to the metadata name as "category" when it should be "theme."

MarinaNitze added 14 commits August 24, 2013 17:21

Mass changes to schema

802f0d8

Updated metadata rationale

08a6683

Format adheres to MIME types

e3891ad

All date fields now ISO 8601 compliant

733d67c

Update distribution cardinality

17960f6

Technically there should never be more than one distribution array for a dataset.

Clarified temporal guidance

65b04c0

Added Distribution to top table of optional fields, fixed ABC order

fdf0d2b

Clarified that URL values should be strings.

23b63f6

Updated format to take MIME Type

208d145

Closes #117

Clarified temporal usage notes.

b9263af

Taken from #116. Thanks Sean!

Fixed accrualPeriodicity example

fe8774b

Thanks Sean! Taken from #116

Updated language guidance

eb44f4f

Thanks Sean! This incorporates #100

Added PrimaryITInvestmentUII

5f7b3e9

Closes #91

Clarified date for issued/modified

1403ab0

This addresses #65.

MarinaNitze added 2 commits August 24, 2013 23:12

Alphabetized guidance section

bd6a8bb

Added bureauCode

bc07970

MarinaNitze mentioned this pull request Aug 25, 2013

Changing publisher cardinality to allow multiple values #96

Closed

Added programOffice

12244ea

This was referenced Aug 25, 2013

Feedback on common core schema #49

Closed

Add operatingUnit Field #89

Closed

MarinaNitze mentioned this pull request Aug 26, 2013

webService #37

Closed

Reverting keyword field to its original singular

ff83878

In response to discussion in #44

MarinaNitze added a commit that referenced this pull request Aug 26, 2013

No longer making "keyword" plural

4ddf05a

In response to discussion in #44

This was referenced Aug 26, 2013

Publish a vocabulary/ontology at a stable URL #136

Closed

Move to public domain #135

Merged

MarinaNitze added 2 commits September 9, 2013 19:21

Small verbiage changes, moved programOffice to required

dfbe8ce

Changed contactPoint from Last, First to free text.

e09f2fd

MarinaNitze mentioned this pull request Sep 19, 2013

Added two values to Periodicity #143

Closed

haleyvandyck merged commit e09f2fd into project-open-data:master Sep 20, 2013
