This repository was archived by the owner on Jun 18, 2024. It is now read-only.

Commit cd7a527

updating distribution guidance, part 2
In response to #217, #248
1 parent 5e20ba6 commit cd7a527

1 file changed (+81, -46 lines)

Diff for: v1.1/schema.md

@@ -41,7 +41,7 @@ The [Implementation Guidance](/implementation-guide/) available as a part of Pro
4141

4242
Where optional fields are included in a catalog file but are unpopulated, they may be represented by a `null` value. They should not be represented by an empty string (`""`).
4343

44-
When a record has an `accessURL` or `downloadURL`, they should be contained as objects within a `distribution`. Any object may be described by `title`, `description`, `format`, or `mediaType`, though when an object contains `downloadURL`, it must be accompanied by `mediatype`.
44+
When a record has an **accessURL** or **downloadURL**, they should be contained as objects within a **distribution**. Any object may be described by **title**, **description**, **format**, or **mediaType**, though when an object contains **downloadURL**, it must be accompanied by **mediaType**.
4545

4646
The Project Open Data schema is case sensitive. The schema uses a camel case convention where the first letter of some words within a field are capitalized (usually all words but the first one). While it may seem subtle which characters are uppercase and lowercase, it is necessary to follow the exact same casing as defined in the schema documented here. For example:
4747
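
To illustrate the casing rule in this hunk, here is a minimal Python sketch that flags keys whose spelling matches a schema field but whose casing does not. The `KNOWN_FIELDS` list is an assumption: it contains only the field names visible in this diff, not the full schema, and `find_casing_errors` is a hypothetical helper rather than part of any Project Open Data tooling.

```python
# Partial list of field names, taken only from the excerpts in this diff.
KNOWN_FIELDS = [
    "accessLevel", "accessURL", "accrualPeriodicity", "description",
    "distribution", "downloadURL", "format", "identifier", "mbox",
    "mediaType", "modified", "temporal", "title",
]

# Map each lowercased name to its exact schema spelling.
_BY_LOWER = {name.lower(): name for name in KNOWN_FIELDS}


def find_casing_errors(record):
    """Return (found, expected) pairs for keys whose casing differs from the schema."""
    errors = []
    for key, value in record.items():
        expected = _BY_LOWER.get(key.lower())
        if expected is not None and key != expected:
            errors.append((key, expected))
        # Recurse into lists of objects, such as "distribution".
        if isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    errors.extend(find_casing_errors(item))
    return errors
```

A record that writes `mediatype` inside a distribution object would be reported as `("mediatype", "mediaType")`.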

@@ -89,7 +89,7 @@ temporal | Temporal | The range of temporal applicability of a dataset (i.
8989

9090
"Common Core" Distribution Fields
9191
-------------------------------------------
92-
Within a record, `distribution` is used to aggregate the metadata specific to a dataset's resources (`accessURL` and `downloadURL`), which may be described using the following fields. Each distribution should contain one `accessURL` or `downloadURL`. `downloadURL` should always be accompanied by `mediaType`.
92+
Within a record, **distribution** is used to aggregate the metadata specific to a dataset's resources (**accessURL** and **downloadURL**), which may be described using the following fields. Each distribution should contain one **accessURL** or **downloadURL**. **downloadURL** should always be accompanied by **mediaType**.
9393

9494
{: .table .table-striped}
9595
Field | Label | Definition
@@ -137,15 +137,6 @@ Further Metadata Field Guidance (alphabetical by field)
137137
**Usage Notes** | This field refers to the degree to which this dataset *could be made available* to the public, regardless of whether it is currently available to the public. For example, if a member of the public can walk into your agency and obtain a dataset, that entry is **public** even if there are no files online. A *restricted public* dataset is one only available under certain conditions or to certain audiences (such as researchers who sign a waiver). A *non-public* dataset is one that could never be made available to the public for privacy, security, or other reasons as determined by your agency.
138138
**Example** | `{"accessLevel":"public"}`
139139

140-
{: .table .table-striped #accessURL}
141-
**Field [#](#accessURL){: .permalink}** | **accessURL**
142-
----- | -----
143-
**Cardinality** | (0,n)
144-
**Required** | Yes, if the file is accessible indirectly, through means other than direct download.
145-
**Accepted Values** | String (URL)
146-
**Usage Notes** | This should be the URL for an indirect means of accessing the data, such as API documentation, a 'wizard' or other graphical interface which is used to generate a download, feed, or a request form for the data. This should not be a **direct** download URL. It is usually assumed that accessURL is an HTML webpage.
147-
**Example** | `{"accessURL":"http://www.agency.gov/api/vegetables/"}`
148-
149140
{: .table .table-striped #accrualPeriodicity}
150141
**Field [#](#accrualPeriodicity){: .permalink}** | **accrualPeriodicity**
151142
----- | -----
@@ -204,44 +195,88 @@ Further Metadata Field Guidance (alphabetical by field)
204195
**Field [#](#distribution){: .permalink}** | **distribution**
205196
----- | -----
206197
**Cardinality** | (0,n)
207-
**Required** | No
208-
**Accepted Values** | See Usage Notes
209-
**Usage Notes** | Distribution is a concatenation, as appropriate, of the following elements: **accessURL** and **format**. If an entry has only one dataset, enter details for that one; if it has multiple datasets (such as a bulk download and an API), separate entries as seen below:
210-
211-
"distribution": [
212-
{
213-
"accessURL":"https://explore.data.gov/views/ykv5-fn9t/rows.csv?accessType=DOWNLOAD",
214-
"format":"text/csv"
215-
},
216-
{
217-
"accessURL":"https://explore.data.gov/views/ykv5-fn9t/rows.json?accessType=DOWNLOAD",
218-
"format":"application/json"
219-
},
220-
{
221-
"accessURL":"https://explore.data.gov/views/ykv5-fn9t/rows.xml?accessType=DOWNLOAD",
222-
"format":"text/xml"
223-
}
224-
]
225-
226-
227-
{: .table .table-striped #downloadURL}
228-
**Field [#](#downloadURL){: .permalink}** | **downloadURL**
198+
**Required** | Yes, if the dataset has an **accessURL** or **downloadURL**.
199+
**Accepted Values** | Array of Objects
200+
**Usage Notes** | Distribution is a concatenation, as appropriate, of the following elements: **accessURL**, **downloadURL**, **description**, **format**, **mediaType**, and **title**. If an entry has only one form, enter details for that one; if it has multiple forms (such as a bulk download and an API), separate entries as seen below:
201+
**Example** |
202+
"distribution": [
203+
{
204+
"description": "Vegetable data as a CSV file",
205+
"downloadURL": "http://www.agency.gov/vegetables/listofvegetables.csv",
206+
"format": "CSV",
207+
"mediaType": "text/csv",
208+
"title": "vegetables.csv"
209+
},
210+
{
211+
"description": "Vegetable data as a zipped CSV file with attached data dictionary",
212+
"downloadURL": "http://www.agency.gov/vegetables/vegetables-all.zip",
213+
"format": "Zipped CSV",
214+
"mediaType": "application/zip",
215+
"title": "vegetables-all.zip"
216+
},
217+
{
218+
"accessURL": "http://www.agency.gov/api/vegetables/",
219+
"description": "A fully queryable REST API with JSON and XML output",
220+
"format": "API",
221+
"title": "Vegetables REST API"
222+
}
223+
]
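
The distribution rules stated in this commit (each object carries an **accessURL** or **downloadURL**, **downloadURL** is accompanied by **mediaType**, and unpopulated fields use `null` rather than `""`) can be sketched as a small check. This is an illustrative sketch, not an official validator: `check_distribution` is a hypothetical name, and it reads "should contain one" as "at least one", which is an assumption.

```python
def check_distribution(distribution):
    """Return a list of human-readable problems found in a distribution array."""
    problems = []
    for index, entry in enumerate(distribution):
        has_access = entry.get("accessURL") is not None
        has_download = entry.get("downloadURL") is not None

        # Assumption: "one accessURL or downloadURL" is read as "at least one".
        if not (has_access or has_download):
            problems.append(f"distribution[{index}]: needs accessURL or downloadURL")

        # downloadURL must always be accompanied by mediaType.
        if has_download and entry.get("mediaType") is None:
            problems.append(f"distribution[{index}]: downloadURL requires mediaType")

        # Unpopulated fields should be null (None in Python), not "".
        for field, value in entry.items():
            if value == "":
                problems.append(
                    f"distribution[{index}].{field}: use null, not an empty string"
                )
    return problems
```

Applied to the three-object example above, the function returns an empty list; an object holding only a `downloadURL` would be reported as missing `mediaType`.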
224+
225+
{: .table .table-striped .child-field #distribution-accessURL}
226+
**Field [#](#distribution-accessURL){: .permalink}** | **distribution → accessURL**
227+
----- | -----
228+
**Cardinality** | (0,n)
229+
**Required** | Yes, if the file is accessible indirectly, through means other than direct download.
230+
**Accepted Values** | String (URL)
231+
**Usage Notes** | This should be the URL for an indirect means of accessing the data, such as API documentation, a 'wizard' or other graphical interface which is used to generate a download, feed, or a request form for the data. This should not be a **direct** download URL. It is usually assumed that accessURL is an HTML webpage.
232+
**Example** | `{"accessURL":"http://www.agency.gov/api/vegetables/"}`
233+
234+
{: .table .table-striped .child-field #distribution-downloadURL}
235+
**Field [#](#distribution-downloadURL){: .permalink}** | **distribution → downloadURL**
229236
----- | -----
230237
**Cardinality** | (0,n)
231238
**Required** | Yes, if the file is available for public download.
232239
**Accepted Values** | String (URL)
233-
**Usage Notes** | This must be the **direct** download URL. Other means of accessing the dataset should be expressed using **accessURL**.
240+
**Usage Notes** | This must be the **direct** download URL. Other means of accessing the dataset should be expressed using **accessURL**. This should always be accompanied by **mediaType**.
234241
**Example** | `{"downloadURL":"http://www.agency.gov/vegetables/listofvegetables.csv"}`
235242

236-
{: .table .table-striped #format}
237-
**Field [#](#format){: .permalink}** | **format**
243+
{: .table .table-striped .child-field #distribution-description}
244+
**Field [#](#distribution-description){: .permalink}** | **distribution → description**
245+
----- | -----
246+
**Cardinality** | (1,1)
247+
**Required** | Yes, always
248+
**Accepted Values** | String
249+
**Usage Notes** | This should be a human-readable description of the distribution.
250+
**Example** | `{"description":"Vegetable data as a zipped CSV file with attached data dictionary"}`
251+
252+
{: .table .table-striped .child-field #distribution-format}
253+
**Field [#](#distribution-format){: .permalink}** | **distribution → format**
238254
----- | -----
239255
**Cardinality** | (0,1)
240256
**Required** | No
241257
**Accepted Values** | String
242258
**Usage Notes** | This should be a human-readable description of the file format of the dataset, that provides useful information that might not be apparent from `mediaType`.
243259
**Example** | `{"format":"A CSV spreadsheet compressed in a ZIP file."}`
244260

261+
{: .table .table-striped .child-field #distribution-mediaType}
262+
**Field [#](#distribution-mediaType){: .permalink}** | **distribution → mediaType**
263+
----- | -----
264+
**Cardinality** | (0,1)
265+
**Required** | Yes, if the file is available for public download.
266+
**Accepted Values** | String
267+
**Usage Notes** | This must describe the exact files available at **downloadURL** using [MIME Types](http://en.wikipedia.org/wiki/Internet_media_type). _[Also note [Office Open XML MIME types](http://blogs.msdn.com/b/vsofficedeveloper/archive/2008/05/08/office-2007-open-xml-mime-types.aspx)]_
268+
**Example** | `{"mediaType":"text/csv"}`
269+
270+
{: .table .table-striped .child-field #distribution-title}
271+
**Field [#](#distribution-title){: .permalink}** | **distribution → title**
272+
----- | -----
273+
**Cardinality** | (1,1)
274+
**Required** | Yes, always
275+
**Accepted Values** | String
276+
**Usage Notes** | This should be a useful title for the distribution. Acronyms should be avoided.
277+
**Example** | `{"title":"Spreadsheet"}`
278+
279+
245280
{: .table .table-striped #identifier}
246281
**Field [#](#identifier){: .permalink}** | **identifier**
247282
----- | -----
@@ -314,15 +349,6 @@ Further Metadata Field Guidance (alphabetical by field)
314349
**Usage Notes** | -
315350
**Example** | `{"mbox":"[email protected]"}`
316351

317-
{: .table .table-striped #mediaType}
318-
**Field [#](#mediaType){: .permalink}** | **mediaType**
319-
----- | -----
320-
**Cardinality** | (0,1)
321-
**Required** | Yes, if the file is available for public download.
322-
**Accepted Values** | String
323-
**Usage Notes** | This must describe the exact files available at **downloadURL** using [MIME Types](http://en.wikipedia.org/wiki/Internet_media_type). _[Also note [Office Open XML MIME types](http://blogs.msdn.com/b/vsofficedeveloper/archive/2008/05/08/office-2007-open-xml-mime-types.aspx)]_
324-
**Example** | `{"mediaType":"application/json"}`
325-
326352
{: .table .table-striped #modified}
327353
**Field [#](#modified){: .permalink}** | **modified**
328354
----- | -----
@@ -406,7 +432,16 @@ If there is a need to reflect that the dataset is continually updated, ISO 8601
406432
**Required** | Yes, if applicable
407433
**Accepted Values** | ISO 8601 Date
408434
**Usage Notes** | This field should contain an interval of time defined by start and end dates. Dates should be formatted as pairs of {start datetime/end datetime} in the [ISO 8601](http://en.wikipedia.org/wiki/ISO_8601) format. ISO 8601 specifies that datetimes can be formatted in a number of ways, ranging from a simple four-digit year (e.g., 2013) to a much more specific YYYY-MM-DDTHH:MM:SSZ, where the T specifies a separator between the date and time, and time is expressed in 24-hour notation in the UTC (Zulu) time zone (e.g., 2011-02-14T12:00:00Z/2013-07-04T19:34:00Z). Use a solidus ("/") to separate start and end times.
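
As a rough illustration of the start/end pairing described in these usage notes, the following Python sketch splits a value on the solidus and parses both halves. It handles only the fully specific `YYYY-MM-DDTHH:MM:SSZ` form; the shorter ISO 8601 forms (such as a bare year) are deliberately out of scope, and `parse_temporal` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Only the fully specific UTC form is handled in this sketch.
_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_temporal(value):
    """Parse 'start/end' into a pair of timezone-aware UTC datetimes."""
    parts = value.split("/")
    if len(parts) != 2:
        raise ValueError("expected exactly one solidus between start and end")
    start, end = (
        datetime.strptime(part, _FORMAT).replace(tzinfo=timezone.utc)
        for part in parts
    )
    if end < start:
        raise ValueError("end datetime precedes start datetime")
    return start, end
```

For the example value in the usage notes, `parse_temporal("2011-02-14T12:00:00Z/2013-07-04T19:34:00Z")` yields a start in 2011 and an end in 2013, while a value without a solidus raises `ValueError`.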
409-
435+
436+
{: .table .table-striped #title}
437+
**Field [#](#title){: .permalink}** | **title**
438+
----- | -----
439+
**Cardinality** | (1,1)
440+
**Required** | Yes, always
441+
**Accepted Values** | String
442+
**Usage Notes** | Acronyms should be avoided.
443+
**Example** | `{"title":"Types of Vegetables"}`
444+
410445
If there is a need to reflect that the dataset is continually updated, ISO 8601 formatting can account for this [with repeating intervals](http://en.wikipedia.org/wiki/ISO_8601#Time_intervals). For instance, updated monthly starting in January 2010 and continuing through the present would be represented as: `R/2010-01/P1M`.
411446

412447
Updated every 5 minutes beginning on February 15, 2010 would be represented as: `R/2010-02-15/PT5M`.
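
The repeating-interval strings shown above follow the ISO 8601 shape `R[n]/start/duration`. A minimal Python sketch for splitting such a string into its parts, without interpreting the start date or duration further, might look like this (`split_repeating_interval` is a hypothetical helper, not part of the schema):

```python
import re

# R, optional repeat count, then start and a duration beginning with "P".
_REPEATING = re.compile(r"^R(\d*)/([^/]+)/(P[^/]+)$")


def split_repeating_interval(value):
    """Split 'R[n]/start/duration' into (count or None, start, duration)."""
    match = _REPEATING.match(value)
    if match is None:
        raise ValueError(f"not a repeating interval: {value!r}")
    count, start, duration = match.groups()
    # An empty count means the interval repeats without a stated limit.
    return (int(count) if count else None, start, duration)
```

With this sketch, `R/2010-01/P1M` splits into no repeat count, a start of `2010-01`, and a duration of `P1M`.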
