diff --git a/_config.yml b/_config.yml index 15512f46..de7da452 100644 --- a/_config.yml +++ b/_config.yml @@ -1,13 +1,13 @@ #document settings title: Project Open Data desc: "Open Data Policy — Managing Information as an Asset" -url: https://project-open-data.cio.gov/ -repo_name: project-open-data.github.io -branch: master +url: https://project-open-data.cio.gov #global settings, no need to change these root_url: https://project-open-data.cio.gov org_name: project-open-data +repo_name: project-open-data.github.io +branch: master # default build settings for running locally, no need to edit pygments: true @@ -16,4 +16,4 @@ markdown: kramdown relative_permalinks: false gems: - - jekyll-redirect-from \ No newline at end of file + - jekyll-redirect-from diff --git a/_includes/header.html b/_includes/header.html index 560d27dd..01ebed9b 100644 --- a/_includes/header.html +++ b/_includes/header.html @@ -15,8 +15,10 @@ + + {% capture edit_url %}https://github.com/{{ site.org_name }}/{{ site.repo_name }}/edit/{{ site.branch }}/{{ page.path }}{% endcapture %} - + diff --git a/federal-awards-faq.md b/federal-awards-faq.md index 1f4dc2f3..1978c3be 100644 --- a/federal-awards-faq.md +++ b/federal-awards-faq.md @@ -15,23 +15,23 @@ filename: federal-awards-faq.md * Q3: What does “open data” mean? * Q4: What does “platform independent” mean? * Q5: What does “machine readable” mean? -* Q6: Are all agencies approaching this data in the same way? What if my agency has a grant or contract with recipient A and we translate the data but another agency has a similar contract with recipient A and requires the recipient to provide the data in a platform-independent, machine-readable format? How does this help with consistency across the government? +* Q6: Are all agencies approaching this data in the same way?
What if my agency has a grant or contract with recipient A and we translate the data but another agency has a similar contract with recipient A and requires the recipient to provide the data in a platform-independent, machine-readable format? How does this help with consistency across the government? * Q7: Must all agency data be posted publicly? What sorts of considerations guide whether data should be posted publicly? * Q8: Some of the data I collect cannot be shared publicly. How is this data affected by the open data policy? * Q9: Who decides whether data should be posted publicly? * Q10: Once the decision to post data publicly is made, who is responsible for doing so? -* Q11: Do I need to change the way Federal award data are reported, for example, does my agency contract or grant writing system or interface with the Federal Procurement Data System, USAspending.gov, or agency financial system need to change? +* Q11: Do I need to change the way Federal award data are reported, for example, does my agency contract or grant writing system or interface with the Federal Procurement Data System, USAspending.gov, or agency financial system need to change? ###[Questions Applicable to Contracts:](#contracts) * Q12: Do I need to initiate modifications on my existing contracts to address the open data policy? -* Q13: For new contracts, what changes must I make to my deliverables or terms and conditions to address the open data policy? +* Q13: For new contracts, what changes must I make to my deliverables or terms and conditions to address the open data policy? * Q14: How difficult is it for agency personnel to transform machine-readable data into a platform-independent format? Is requiring data in a platform-independent format likely to drive up deliverable cost? Who can provide guidance on making the decision regarding who should convert the data? 
* Q15: Are there any Federal Acquisition Regulation (FAR) provisions that might affect an agency’s authority to publish data provided as a contract deliverable? ###[Questions Applicable to Grants and Other Financial Assistance:](#financial) -* Q16: Do I need to modify existing grants or financial assistance awards to address the open data policy? +* Q16: Do I need to modify existing grants or financial assistance awards to address the open data policy? * Q17: For new financial assistance awards, what changes must I make to terms and conditions to address the open data policy? -* Q18: What are the implications of the policy for new investigator-initiated scientific research grants? +* Q18: What are the implications of the policy for new investigator-initiated scientific research grants? * Q19: How difficult is it for agency personnel to transform machine-readable data into a platform-independent format? Is requiring data in a platform-independent format likely to drive up costs for recipients? Who can provide guidance on making the decision regarding who should convert the data? * Q20: Is there OMB guidance that might affect an agency’s authority to post data generated with Federal financial assistance? @@ -40,61 +40,61 @@ filename: federal-awards-faq.md ###General Questions ####Q1: What sort of data is covered by the open data policy? -A: The purpose of the Open Data Policy is to make open and machine readable the new default for government data collection and dissemination in order to enhance availability, access and interoperability, per May 9, 2013 Executive Order released by the White House. This includes all information generated and stored by the Federal Government or for data purchased or created through Federal funding such as data collected in conjunction with program administration, scientific research, public health surveillance. 
There is an expectation that agencies will work to prioritize ensuring that data is open when it would be likely to fuel entrepreneurship, innovation, accountability, and scientific discovery, and improve the lives of Americans in tangible ways. +A: The purpose of the Open Data Policy is to make open and machine readable the new default for government data collection and dissemination in order to enhance availability, access and interoperability, per May 9, 2013 Executive Order released by the White House. This includes all information generated and stored by the Federal Government or for data purchased or created through Federal funding such as data collected in conjunction with program administration, scientific research, public health surveillance. There is an expectation that agencies will work to prioritize ensuring that data is open when it would be likely to fuel entrepreneurship, innovation, accountability, and scientific discovery, and improve the lives of Americans in tangible ways. ####Q2: Who within my agency is responsible for Open Data Policy decision-making? A: The senior accountable official in your agency for making those decisions is the CIO, in partnership with the data owners in your agency. ####Q3: What does “open data” mean? -A: Open data is data that is publicly available data and structured in a way that enables these data to be fully discoverable and useable by end users inside and outside of government. In general, open data is consistent with the following principles: public, accessible, described, reusable, complete, timely, and managed post-release (attributes defined in the Office of Management and Budget (OMB) [Memorandum M-13-13 *Open Data Policy – Managing Information as an Asset*](http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf). 
+A: Open data is data that is publicly available data and structured in a way that enables these data to be fully discoverable and useable by end users inside and outside of government. In general, open data is consistent with the following principles: public, accessible, described, reusable, complete, timely, and managed post-release (attributes defined in the Office of Management and Budget (OMB) [Memorandum M-13-13 *Open Data Policy – Managing Information as an Asset*](http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf). ####Q4: What does “platform independent” mean? A: Platform independent data do not require a specific program to open (e.g., a comma-separated value (.csv) spreadsheet). They are in a format that can be read and processed by a variety of software tools – including free tools. ####Q5: What does “machine readable” mean? -A: Machine readable data is data structured in a format that can be understood and processed by a computer. Some file formats, such as a PDF or Word documents, are human readable and easier to read and edit. This differs significantly from machine readable formats that can be processed by a computer and parsed or organized around specific information. OMB defines “Machine Readable Format” and provides some examples in Circular A-11 Part 6, “Preparation and Submission of Strategic Plans, Annual Performance Plans, and Annual Program Performance Reports.” The circular describes machine readable format as: -> a standard computer language (not English text) that can be read automatically by a web browser or computer system. (e.g.; xml). Traditional word processing documents, hypertext markup language (HTML) and portable document format (PDF) files are easily read by humans but typically are difficult for machines to interpret. 
Other formats such as extensible markup language (XML), [JavaScript Object Notation] (JSON), or spreadsheets with header columns that can be exported as comma separated values (CSV) are machine readable formats. It is possible to make traditional word processing documents and other formats machine readable but the documents must include enhanced structural elements. +A: Machine readable data is data structured in a format that can be understood and processed by a computer. Some file formats, such as a PDF or Word documents, are human readable and easier to read and edit. This differs significantly from machine readable formats that can be processed by a computer and parsed or organized around specific information. OMB defines “Machine Readable Format” and provides some examples in Circular A-11 Part 6, “Preparation and Submission of Strategic Plans, Annual Performance Plans, and Annual Program Performance Reports.” The circular describes machine readable format as: +> a standard computer language (not English text) that can be read automatically by a web browser or computer system. (e.g.; xml). Traditional word processing documents, hypertext markup language (HTML) and portable document format (PDF) files are easily read by humans but typically are difficult for machines to interpret. Other formats such as extensible markup language (XML), [JavaScript Object Notation] (JSON), or spreadsheets with header columns that can be exported as comma separated values (CSV) are machine readable formats. It is possible to make traditional word processing documents and other formats machine readable but the documents must include enhanced structural elements. -In addition, M-13-13, “Open Data Policy – Managing Information as an Asset” encourages agencies to make data available in non-proprietary (i.e., platform independent) formats to the extent permitted by law. 
The preferred format for making data available is non-propriety (i.e., platform independent) because this platform allows the data to be accessed without the required use of certain proprietary software programs such as Excel. File formats that are preferred are identified in A-11 Part 6 and include JSON, XML, and CSV. -Your agency CIO will be able to provide more information and subject matter expertise to help contracting officer representatives (CORs) or other program officials in understanding requirements and how to assess deliverables. +In addition, M-13-13, “Open Data Policy – Managing Information as an Asset” encourages agencies to make data available in non-proprietary (i.e., platform independent) formats to the extent permitted by law. The preferred format for making data available is non-propriety (i.e., platform independent) because this platform allows the data to be accessed without the required use of certain proprietary software programs such as Excel. File formats that are preferred are identified in A-11 Part 6 and include JSON, XML, and CSV. +Your agency CIO will be able to provide more information and subject matter expertise to help contracting officer representatives (CORs) or other program officials in understanding requirements and how to assess deliverables. -####Q6: Are all agencies approaching this data in the same way? What if my agency has a grant or contract with recipient A and we translate the data but another agency has a similar contract with recipient A and requires the recipient to provide the data in a platform-independent, machine-readable format? How does this help with consistency across the government?** -A: OMB is working collaboratively across the Executive Office of the President to better understand how agencies are implementing open data and providing input and information to smooth the implementation process for grant recipients and contractors. 
As part of that process, we will be working across the Councils identified in Section 3 (b) of the [Executive Order (EO) “Making Open Data and Machine Readable the New Default for Government Information”](http://www.whitehouse.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government-) to better understand overall implementation efforts and further develop tools and information that provides consistency and avoids unnecessary duplication. +####Q6: Are all agencies approaching this data in the same way? What if my agency has a grant or contract with recipient A and we translate the data but another agency has a similar contract with recipient A and requires the recipient to provide the data in a platform-independent, machine-readable format? How does this help with consistency across the government?** +A: OMB is working collaboratively across the Executive Office of the President to better understand how agencies are implementing open data and providing input and information to smooth the implementation process for grant recipients and contractors. As part of that process, we will be working across the Councils identified in Section 3 (b) of the [Executive Order (EO) “Making Open Data and Machine Readable the New Default for Government Information”](http://www.whitehouse.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government-) to better understand overall implementation efforts and further develop tools and information that provides consistency and avoids unnecessary duplication. ####Q7: Must all agency data be posted publicly? What sorts of considerations guide whether data should be posted publicly? A: No. Neither the E.O. nor policy change Agency obligations under any law. 
Specifically, nothing in the EO or policy compels or authorizes the disclosure of privileged information, law enforcement information, national security information, personal information, or information the disclosure of which is prohibited by law. For example, for contracts, information subject to the Trade Secrets Act regarding proprietary company information would be an example of data that should not be made publicly available, as would detailed vendor pricing information in many cases. Note that when datasets are designated as inappropriate for public release due to, for instance, issues associated with personally identifiable information, agencies should consider whether the value of the data can be increased by either (1) making the data available to a set of qualified parties (e.g. researchers) with strong legal and privacy protections, or (2) providing access to other Executive branch agencies though through inter-agency data sharing agreements. However, M-13-13 still requires that these types of information be inventoried in an Enterprise Data Inventory regardless of whether they are considered “open data,” which may lead to valuable internal use. Your agency CIO as the responsible official for implementation of the policy will have more information about any agency or program specific considerations that should be applied to data in your agency. ####Q8: Some of the data I collect cannot be shared publicly. How is this data affected by the open data policy? -A: The open data policy is about more than just making data publicly available; it is also about managing information as a strategic asset within the enterprise. Even data that cannot be shared publicly should be received and stored in platform-independent, machine-readable formats whenever possible. 
In many cases, there are significant benefits to be derived from sharing data across the government, even when the data will not be publicly posted (See, for instance, [OMB Memorandum M-11-02: Sharing Data While Protecting Privacy]( http://www.whitehouse.gov/sites/default/files/omb/memoranda/2011/m11-02.pdf)). As an example, detailed contract price data that can be shared within the government can provide a valuable touch-point as other agencies plan their procurements, but in some cases that data cannot be publicly shared. Similarly, data collected in conjunction with administering public benefits can be valuable to statistical agencies. +A: The open data policy is about more than just making data publicly available; it is also about managing information as a strategic asset within the enterprise. Even data that cannot be shared publicly should be received and stored in platform-independent, machine-readable formats whenever possible. In many cases, there are significant benefits to be derived from sharing data across the government, even when the data will not be publicly posted (See, for instance, [OMB Memorandum M-11-02: Sharing Data While Protecting Privacy]( http://www.whitehouse.gov/sites/default/files/omb/memoranda/2011/m11-02.pdf)). As an example, detailed contract price data that can be shared within the government can provide a valuable touch-point as other agencies plan their procurements, but in some cases that data cannot be publicly shared. Similarly, data collected in conjunction with administering public benefits can be valuable to statistical agencies. ####Q9: Who decides whether data should be posted publicly? A: Program office personnel and grant officers will largely be responsible for recommending whether a given data set should be made publicly available. As the data owners, such personnel and officers are in the best position to determine whether privacy, national security, or other concerns would preclude public posting. 
While agency general counsel,grant and contracting officers, and security and privacy personnel, among others, may all be involved in reviewing whether or not a given dataset can be made public, the agency CIO is the responsible official in each agency for leading the open data effort, providing leadership and guidance on implementation. ####Q10: Once the decision to post data publicly is made, who is responsible for doing so? -A: The agency CIO has the primary responsibility for managing and creating an inventory of all agency datasets as part of an enterprise-wide data inventory. The agency CIO is also responsible for creating a public data listing, as a subset of the full inventory. The public data listing does not include the underlying databases, but is simply a list of all datasets (e.g. title, description, point of contact, etc.) in the agency that are public or could be made public. Public data listings, under M-13-13, are required to be posted at agency.gov/data pages. +A: The agency CIO has the primary responsibility for managing and creating an inventory of all agency datasets as part of an enterprise-wide data inventory. The agency CIO is also responsible for creating a public data listing, as a subset of the full inventory. The public data listing does not include the underlying databases, but is simply a list of all datasets (e.g. title, description, point of contact, etc.) in the agency that are public or could be made public. Public data listings, under M-13-13, are required to be posted at agency.gov/data pages. -#### Q11: Do I need to change the way Federal award data are reported, for example, does my agency contract or grant writing system or interface with the Federal Procurement Data System – Next Generation (FPDS-NG), USAspending.gov, or agency financial system need to change? 
+#### Q11: Do I need to change the way Federal award data are reported, for example, does my agency contract or grant writing system or interface with the Federal Procurement Data System – Next Generation (FPDS-NG), USAspending.gov, or agency financial system need to change? A: No, FPDS-NG and USAspending already provide public data access to their contents in open formats. Some system changes may need to be considered for other Federal agency platforms that house public data, and enable public data to be accessed or downloaded. ###Questions Applicable to Contracts ####Q12: Do I need to initiate modifications on my existing contracts to address the open data policy? -A: No. The open data policy is prospective. For new awards, Federal contracts must ensure that the Government treats data as a valuable national asset, and structures any data-related deliverables to collect such data in formats that can be shared, regardless of whether a determination has been made as to whether the data should be made available to the public. There is no requirement to modify existing procurement agreements. +A: No. The open data policy is prospective. For new awards, Federal contracts must ensure that the Government treats data as a valuable national asset, and structures any data-related deliverables to collect such data in formats that can be shared, regardless of whether a determination has been made as to whether the data should be made available to the public. There is no requirement to modify existing procurement agreements. -####Q13: For new contracts, what changes must I make to my deliverables or terms and conditions to address the open data policy? -A: That depends on what the Government intends to buy. If data are to be included as a deliverable, and the deliverable is written to require submission of those data in an appropriate format (See Section A or contact your agency CIO for more details), there are no additional requirements. 
The open data policy is not designed to change what the government is buying, but is focused instead on how that information is delivered, and ensuring it can be appropriately reused. +####Q13: For new contracts, what changes must I make to my deliverables or terms and conditions to address the open data policy? +A: That depends on what the Government intends to buy. If data are to be included as a deliverable, and the deliverable is written to require submission of those data in an appropriate format (See Section A or contact your agency CIO for more details), there are no additional requirements. The open data policy is not designed to change what the government is buying, but is focused instead on how that information is delivered, and ensuring it can be appropriately reused. -For example, if a deliverable consists of an assessment or end-state report and no underlying data are required, the open data policy will not force the agency to add a data deliverable or alter the other terms and conditions of the contract. If, however, a deliverable includes data, the deliverable should be structured to require the delivery or export of the data in a machine-readable format. The agency has discretion regarding whether deliverables must be provided in platform-independent formats, but if the deliverable is not provided in such a format, the agency will need to take the extra step of translating it into a platform-independent format. Either solution is acceptable to reach the objective of open data. +For example, if a deliverable consists of an assessment or end-state report and no underlying data are required, the open data policy will not force the agency to add a data deliverable or alter the other terms and conditions of the contract. If, however, a deliverable includes data, the deliverable should be structured to require the delivery or export of the data in a machine-readable format. 
The agency has discretion regarding whether deliverables must be provided in platform-independent formats, but if the deliverable is not provided in such a format, the agency will need to take the extra step of translating it into a platform-independent format. Either solution is acceptable to reach the objective of open data. -As agencies contemplate the purchase of systems that will process data, the requirements in Section 3 of M-13-13 must be considered. These requirements provide for a life-cycle view of effective and efficient information management by requiring that information is collected in a way that supports downstream processing, and that systems are built to support interoperability and information accessibility, including regular access or exporting of the data as a standard requirement of such systems. Addressing these considerations early in the acquisition process will protect against the costly retrofitting that is often involved in retrieving data from legacy platform-dependent systems. +As agencies contemplate the purchase of systems that will process data, the requirements in Section 3 of M-13-13 must be considered. These requirements provide for a life-cycle view of effective and efficient information management by requiring that information is collected in a way that supports downstream processing, and that systems are built to support interoperability and information accessibility, including regular access or exporting of the data as a standard requirement of such systems. Addressing these considerations early in the acquisition process will protect against the costly retrofitting that is often involved in retrieving data from legacy platform-dependent systems. #### Q14: How difficult is it for agency personnel to transform machine-readable data into a platform-independent format? Is requiring data in a platform-independent format likely to drive up deliverable cost? 
Who can provide guidance on making the decision regarding who should convert the data? -A: A number of tools are available to transform data from proprietary formats to platform independent formats. [Project Open Data](http://project-open-data.github.io/) has collected a number of such tools. Many of these tools are easy to operate and require very little investment of time or effort. As such, they can be used either by a contractor (if the contract specifies delivery in a platform independent format) or by Federal employees (if the contract specifies delivery in a proprietary format). -Your agency CIO is responsible for leading the open data initiative in your agency as well as working with the interagency working group on this effort. Your agency may determine that having the agency translate the delivered data is most cost effective so we encourage you to work with your CIO on this effort. +A: A number of tools are available to transform data from proprietary formats to platform independent formats. [Project Open Data](/) has collected a number of such tools. Many of these tools are easy to operate and require very little investment of time or effort. As such, they can be used either by a contractor (if the contract specifies delivery in a platform independent format) or by Federal employees (if the contract specifies delivery in a proprietary format). +Your agency CIO is responsible for leading the open data initiative in your agency as well as working with the interagency working group on this effort. Your agency may determine that having the agency translate the delivered data is most cost effective so we encourage you to work with your CIO on this effort. ####Q15: Are there any Federal Acquisition Regulation (FAR) provisions that might affect an agency’s authority to publish data provided as a contract deliverable? 
A: While the scope of work and other contractual language concerning the required deliverables will determine what data is delivered under a given contract, the rights that the government obtains to the data that is delivered are generally covered by one of a number of standard contract clauses. For civilian agencies, procedures for utilizing these clauses appear within FAR subpart 27.4 – Rights in Data and Copyrights ([48 C.F.R. 27.400 et. seq.,](https://acquisition.gov/far/current/html/FARTOCP27.html)). For the Department of Defense, contracting officers are instructed by the Defense Federal Acquisition Regulation Supplement (DFARS) to use the DFARS coverage in [subparts 227.71 and 227.72](http://www.acq.osd.mil/dpap/dars/dfarspgi/current/index.html) in lieu of the guidance in FAR subpart 27.4. Other agencies may have supplemental acquisitions regulations that apply. -The procedures contained in both the FAR and the DFARS are designed to implement the policy principles expressed at FAR 27.402. Specifically: +The procedures contained in both the FAR and the DFARS are designed to implement the policy principles expressed at FAR 27.402. Specifically: (a) To carry out their missions and programs, agencies acquire or obtain access to many kinds of data produced during or used in the performance of their contracts. Agencies require data to— (1) Obtain competition among suppliers; (2) Fulfill certain responsibilities for disseminating and publishing the results of their activities; @@ -103,30 +103,30 @@ The procedures contained in both the FAR and the DFARS are designed to implement (5) Meet specialized acquisition needs and ensure logistics support. (b) Contractors may have proprietary interests in data. In order to prevent the compromise of these interests, agencies shall protect proprietary data from unauthorized use and disclosure. 
The protection of such data is also necessary to encourage qualified contractors to participate in and apply innovative concepts to Government programs. In light of these considerations, agencies shall balance the Government’s needs and the contractor’s legitimate proprietary interests. -While these clauses address the data rights obtained by an agency, decisions regarding whether data will be published will be made in accordance with Q9 above, involving, as appropriate, consultation among the agency CIO, general counsel, contracting personnel, security personnel and privacy personnel. +While these clauses address the data rights obtained by an agency, decisions regarding whether data will be published will be made in accordance with Q9 above, involving, as appropriate, consultation among the agency CIO, general counsel, contracting personnel, security personnel and privacy personnel. ### Questions Applicable to Grants and Other Financial Assistance -####Q16: Do I need to modify existing grants or financial assistance awards to address the open data policy? +####Q16: Do I need to modify existing grants or financial assistance awards to address the open data policy? A: No. The open data policy is prospective and looks at grants made after the publication of the Executive Order. For these new awards, Federal financial assistance and grant agreements must ensure that the Government treats data as a valuable national asset, and structures any data-related deliverables to collect such data in formats that can be appropriately shared. There is no requirement to modify existing agreements. ####Q17: For new financial assistance awards, what changes must I make to terms and conditions to address the open data policy? -A: You are not required to make any changes to the terms and conditions of new awards, but depending on the situation, you may want to consider future changes per below. 
+A: You are not required to make any changes to the terms and conditions of new awards, but depending on the situation, you may want to consider future changes per below. -A1: If data are included or expected to be provided to the Federal government as an outcome of a Federal award and those data are already in the right format (See Section B or contact your agency CIO for more details), there are no additional requirements for the award. +A1: If data are included or expected to be provided to the Federal government as an outcome of a Federal award and those data are already in the right format (See Section B or contact your agency CIO for more details), there are no additional requirements for the award. A2: If data are expected to be provided to the Federal government but are not in the correct format, the agency must make a determination about whether to require the recipient to report the information in the new format (which may require a change to the terms and conditions), or whether to make the change to the new format themselves (which would not). -Your agency CIO is responsible for leading the open data initiative in your agency as well as working with the interagency working group on this effort. Your agency may determine that having the agency translate the delivered data is most cost effective so we encourage you to work with your CIO on this effort. -A3: If data are generated with Federal funding, but providing data to the Federal government would not otherwise be an expected outcome of the Federal award this policy does not include any new requirement to collect it. +Your agency CIO is responsible for leading the open data initiative in your agency as well as working with the interagency working group on this effort. Your agency may determine that having the agency translate the delivered data is most cost effective so we encourage you to work with your CIO on this effort. 
+A3: If data are generated with Federal funding, but providing data to the Federal government would not otherwise be an expected outcome of the Federal award this policy does not include any new requirement to collect it. -####Q18: What are the implications of the policy for new investigator-initiated scientific research grants? -A: Although most investigator-initiated research grants are expected to generate data, those that do not include a specific provision requiring that the resulting data be provided to the government are not subject to the requirements of this policy. However, to increase the value of the agency’s investment, some grants already require that data be provided to the agency. Furthermore, it is likely that some agencies will begin providing additional guidance with respect to managing digital scientific data created under investigator-initiated grants as part of their response to the Office of Science and Technology’s [Memorandum on Increasing Public Access to the Results of Federally Funded Research](http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf). Agencies, in conjunction with their CIOs, may determine that some requirements of the M-13-13 policy are appropriate for inclusion in future funding announcements (e.g., machine readable, platform independent). +####Q18: What are the implications of the policy for new investigator-initiated scientific research grants? +A: Although most investigator-initiated research grants are expected to generate data, those that do not include a specific provision requiring that the resulting data be provided to the government are not subject to the requirements of this policy. However, to increase the value of the agency’s investment, some grants already require that data be provided to the agency. 
Furthermore, it is likely that some agencies will begin providing additional guidance with respect to managing digital scientific data created under investigator-initiated grants as part of their response to the Office of Science and Technology’s [Memorandum on Increasing Public Access to the Results of Federally Funded Research](http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf). Agencies, in conjunction with their CIOs, may determine that some requirements of the M-13-13 policy are appropriate for inclusion in future funding announcements (e.g., machine readable, platform independent). ####Q19: How difficult is it for agency personnel to transform machine-readable data into a platform-independent format? Is requiring data in a platform-independent format likely to drive up costs for recipients? Who can provide guidance on making the decision regarding who should convert the data? -A: A number of tools are available to transform data from proprietary formats to platform independent formats. [Project Open Data](http://project-open-data.github.io) has collected a number of such tools. Many of these tools are easy to operate and require very little investment of time or effort. As such, they can be used either by a recipient (if the award specifies delivery in a platform independent format) or by Federal employees (if the award either does not specify or specifies delivery in a proprietary format). -Your agency CIO is responsible for leading the open data initiative in your agency as well as working with the interagency working group on this effort. Your agency may determine that having the agency translate the delivered data is most cost effective. +A: A number of tools are available to transform data from proprietary formats to platform independent formats. [Project Open Data](http://project-open-data.github.io) has collected a number of such tools. 
Many of these tools are easy to operate and require very little investment of time or effort. As such, they can be used either by a recipient (if the award specifies delivery in a platform independent format) or by Federal employees (if the award either does not specify or specifies delivery in a proprietary format). +Your agency CIO is responsible for leading the open data initiative in your agency as well as working with the interagency working group on this effort. Your agency may determine that having the agency translate the delivered data is most cost effective. ####Q20: Is there OMB guidance that might affect an agency’s authority to post data generated with Federal financial assistance? -A: For Federal financial assistance to state, local, and tribal governments please see the guidance from A-102 . For Federal financial assistance to nonprofit organizations including institutions of higher education, please see the guidance in [OMB Circular A-110](http://www.whitehouse.gov/omb/circulars_a102). +A: For Federal financial assistance to state, local, and tribal governments please see the guidance from A-102 . For Federal financial assistance to nonprofit organizations including institutions of higher education, please see the guidance in [OMB Circular A-110](http://www.whitehouse.gov/omb/circulars_a102). -While this guidance address the rights obtained by an agency, decisions regarding whether data will be published will be made in accordance with Q9 above, generally involving, as appropriate, consultation among the agency CIO, general counsel, contracting personnel, security personnel and privacy personnel. +While this guidance address the rights obtained by an agency, decisions regarding whether data will be published will be made in accordance with Q9 above, generally involving, as appropriate, consultation among the agency CIO, general counsel, contracting personnel, security personnel and privacy personnel. 
diff --git a/glossary.md b/glossary.md index ff259e9e..c13cb430 100644 --- a/glossary.md +++ b/glossary.md @@ -5,9 +5,9 @@ permalink: /glossary/ filename: glossary.md --- -This section contains explanations of common terms referenced in Project Open Data and the Open Data Policy. +This section contains explanations of common terms referenced in Project Open Data and the Open Data Policy. -### API +### API An application programming interface, which is a set of definitions of the ways one piece of computer software communicates with another. It is a method of achieving abstraction, usually (but not necessarily) between higher-level and lower-level software. —*[source](http://www.whitehouse.gov/sites/default/files/omb/assets/egov_docs/DRM_2_0_Final.pdf)* @@ -21,43 +21,43 @@ Quality API documentation is the gateway to a successful API. API documentation API documentation can be written by developers of the API, but additional edits should be made by developers who were not responsible for deploying the API. As a developer, it’s easy to overlook parameters and other details that developers have made assumptions about. —*[source](http://management.apievangelist.com/building-blocks.html)* -### Application Library +### Application Library Complete, functioning applications built on an API is the end goal of any API owner. Make sure and showcase all applications that are built on an API using an application showcase or directory. App showcases are a great way to showcase not just applications built by the API owner, but also showcase the successful integrations of ecosystem partners and individual developers. —*[source](http://management.apievangelist.com/building-blocks.html)* -### Basic Auth +### Basic Auth Basic Auth is a way for a web browser or application to provide credentials in the form of a username and password. Because Basic Auth is integrated into HTTP protocol it is the easiest way for users to authenticate with a RESTful API. 
Basic Auth is easily integrated, however if SSL is not used, the username and password are passed in plain text and can be easily intercepted on the open Internet. —*[source](http://management.apievangelist.com/building-blocks.html)* -### Catalog +### Catalog A catalog is a collection of datasets or web services. —*[source](http://www.data.gov/glossary)* -### Code Library +### Code Library Working code samples in all the top programming languages are common place in the most successful APIs. Documentation will describe in a general way, how to use an API, but code samples will speak in the specific language of developers. —*[source](http://management.apievangelist.com/building-blocks.html)* -### Content API +### Content API A web service that provides dynamic access to the page content of a website, includes the title, body, and body elements of individual pages. Such an API often but not always functions atop a Content Management System. —*[source](http://seabourneinc.com/2011/09/23/what-is-a-content-api/)* -### CSV +### CSV A comma separated values (CSV) file is a computer data file used for implementing the tried and true organizational tool, the Comma Separated List. The CSV file is used for the digital storage of data structured in a table of lists form. Each line in the CSV file corresponds to a row in the table. Within a line, fields are separated by commas, and each field belongs to one table column. CSV files are often used for moving tabular data between two different computer programs (like moving between a database program and a spreadsheet program). —*[source](http://www.data.gov/glossary)* -### Data +### Data A value or set of values representing a specific concept or concepts. Data become "information" when analyzed and possibly combined with other data in order to extract meaning, and to provide context. The meaning of data can vary depending on its context. Data includes *all* data. 
It includes, but is not limited to, 1) geospatial data 2) unstructured data, 3) structured data, etc. —*[source](http://www.data.gov/glossary)* ### /Data page -A hub for data discovery which provides a common location that lists and links to an organization’s datasets. Such a hub is often located at www.example.com/data. —*[source](http://project-open-data.github.io/policy-memo/)* +A hub for data discovery which provides a common location that lists and links to an organization’s datasets. Such a hub is often located at www.example.com/data. —*[source](/policy-memo/)* -### Data Asset - -A collection of data elements or datasets that make sense to group together. Each community of interest identifies the Data Assets specific to supporting the needs of their respective mission or business functions. Notably, a Data Asset is a deliberately abstract concept. A given Data Asset may represent an entire database consisting of multiple distinct entity classes, or may represent a single entity class. -*[source](http://project-open-data.github.io/implementation-guide/#footnote-1)* +### Data Asset + +A collection of data elements or datasets that make sense to group together. Each community of interest identifies the Data Assets specific to supporting the needs of their respective mission or business functions. Notably, a Data Asset is a deliberately abstract concept. A given Data Asset may represent an entire database consisting of multiple distinct entity classes, or may represent a single entity class. -*[source](/implementation-guide/#footnote-1)* ### Dataset @@ -67,25 +67,25 @@ A dataset is an organized collection of data. The most basic representation of a A hub for API discovery which provides a common location where an organization’s APIs and their associated documentation. Such a hub is often located at www.example.com/developer. 
—*[source](http://www.howto.gov/mobile/apis-in-government/api-developer-kit)* -### Database +### Database A collection of data stored according to a schema and manipulated according to the rules set out in one Data Modelling Facility. —*[source](http://ise.gov/building-blocks/glossary/)* -### Endpoint +### Endpoint An association between a binding and a network address, specified by a URI, that may be used to communicate with an instance of a service. An end point indicates a specific location for accessing a service using a specific protocol and data format. —*[source](http://www.w3.org/TR/2004/NOTE-ws-gloss-20040211/)* -### Error Response Code +### Error Response Code Errors are an inevitable part of API integration, and providing not only a robust set of clear and meaningful API error response codes, but a clear listing of these codes for developers to follow and learn from is essential. API errors are directly related to frustration during developer integration, the more friendlier and meaningful they are, the greater the chance a developer will move forward after encountering an error. Put a lot of consideration into your error responses and the documentation that educates developers. —*[source](http://management.apievangelist.com/building-blocks.html)* -### GitHub +### GitHub GitHub is a social coding platform allowing developers to publicly or privately build code repositories and interact with other developers around these repositories--providing the ability to download or fork a repository, as well as contribute back, resulting in a collaborative environment for software development. —*[source](http://management.apievangelist.com/building-blocks.html)* -### Hackathon +### Hackathon An event in which computer programmers and others in the field of software development, like graphic designers, interface designers, project managers and computational philologists, collaborate intensively on software projects. 
Occasionally, there is a hardware component as well. Hackathons typically last between a day and a week in length. Some hackathons are intended simply for educational or social purposes, although in many cases the goal is to create usable software. Hackathons tend to have a specific focus, which can include the programming language used, the operating system, an application, an API, the subject, or the demographic group of the programmers. In other cases, there is no restriction on the type of software being created. —*[source](http://en.wikipedia.org/wiki/Hackathon)* @@ -105,11 +105,11 @@ Information system, as defined in OMB Circular A-130, means a discrete set of in Information system life cycle, as defined in OMB Circular A-130, means the phases through which an information system passes, typically characterized as initiation, development, operation, and termination. —*[source](http://www.whitehouse.gov/omb/circulars_a130_a130trans4#6)* -### JSON +### JSON JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language. —*[source](http://www.epa.gov/waters/doc/glossary.html)* -### JSONP +### JSONP JSONP or "JSON with padding" is a JSON extension wherein the name of a callback function is specified as an input argument of the underlying JSON call itself. JSONP makes use of runtime script tag injection. 
—*[source](http://www.epa.gov/waters/doc/glossary.html)* @@ -117,25 +117,25 @@ JSONP or "JSON with padding" is a JSON extension wherein the name of a callback Refers to information or data that is in a format that can be easily processed by a computer without human intervention while ensuring no semantic meaning is lost. —*[source](https://www.niem.gov/glossary/Pages/Glossary.aspx?alpha=All)* -### Metadata +### Metadata To facilitate common understanding, a number of characteristics, or attributes, of data are defined. These characteristics of data are known as “metadata”, that is, “data that describes data.” For any particular datum, the metadata may describe how the datum is represented, ranges of acceptable values, its relationship to other data, and how it should be labeled. Metadata also may provide other relevant information, such as the responsible steward, associated laws and regulations, and access management policy. Each of the types of data described above has a corresponding set of metadata. Two of the many metadata standards are the Dublin Core Metadata Initiative (DCMI) and Department of Defense Discovery Metadata Standard (DDMS). The metadata for structured data objects describes the structure, data elements, interrelationships, and other characteristics of information, including its creation, disposition, access and handling controls, formats, content, and context, as well as related audit trails. Metadata includes data element names (such as Organization Name, Address, etc.), their definition, and their format (numeric, date, text, etc.). In contrast, data is the actual data values such as the “US Patent and Trade Office” or the “Social Security Administration” for the metadata called “Organization Name”. Metadata may include metrics about an organization’s data including its data quality (accuracy, completeness, etc.). 
—*[source](http://www.whitehouse.gov/sites/default/files/omb/assets/egov_docs/DRM_2_0_Final.pdf)* -### OAuth +### OAuth An open standard for authorization. It allows users to share their private resources stored on one site with another site without having to hand out their credentials, typically username and password. —*[source](http://management.apievangelist.com/building-blocks.html)* -### Open Source Software +### Open Source Software Computer software that is available in source code form: the source code and certain other rights normally reserved for copyright holders are provided under an open-source license that permits users to study, change, improve and at times also to distribute the software. Open source software is very often developed in a public, collaborative manner. Open source software is the most prominent example of open source development and often compared to (technically defined) user-generated content or (legally defined) open content movements. —*[source](http://en.wikipedia.org/wiki/Open-source_software)* -### Open Standard +### Open Standard A standard developed or adopted by voluntary consensus standards bodies, both domestic and international. These standards include provisions requiring that owners of relevant intellectual property have agreed to make that intellectual property available on a non-discriminatory, royalty-free or reasonable royalty basis to all interested parties. —*[source](http://ise.gov/building-blocks/glossary)* -### Parameter +### Parameter A special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. The semantics for how parameters can be declared and how the arguments get passed to the parameters of subroutines are defined by the language, but the details of how this is represented in any particular computer system depend on the calling conventions of that system. 
—*source* @@ -143,45 +143,45 @@ A special kind of variable, used in a subroutine to refer to one of the pieces o Resource Description Framework - A family of specifications for a metadata model. The RDF family of specifications is maintained by the World Wide Web Consortium (W3C). The RDF metadata model is based upon the idea of making statements about resources in the form of a subject-predicate-object expression...and is a major component in what is proposed by the W3C’s Semantic Web activity: an evolutionary stage of the World Wide Web in which automated software can store, exchange, and utilize metadata about the vast resources of the Web, in turn enabling users to deal with those resources with greater efficiency and certainty. RDF’s simple data model and ability to model disparate, abstract concepts has also led to its increasing use in knowledge management applications unrelated to Semantic Web activity. —*[source](http://www.whitehouse.gov/sites/default/files/omb/assets/egov_docs/DRM_2_0_Final.pdf)* -### REST +### REST A style of software architecture for distributed systems such as the World Wide Web. REST has emerged as a predominant Web service design model. REST facilitates the transaction between web servers by allowing loose coupling between different services. REST is less strongly typed than its counterpart, SOAP. The REST language is based on the use of nouns and verbs, and has an emphasis on readability. Unlike SOAP, REST does not require XML parsing and does not require a message header to and from a service provider. This ultimately uses less bandwidth. —*[source](http://en.wikipedia.org/wiki/REST)* -### RSS +### RSS A family of web feed formats (often dubbed Really Simple Syndication) used to publish frequently updated works — such as blog entries, news headlines, audio, and video — in a standardized format. 
An RSS document (which is called a "feed," "web feed," or "channel") includes full or summarized text, plus metadata such as publishing dates and authorship. —*[source](http://en.wikipedia.org/wiki/RSS)* -### Schema +### Schema An XML schema defines the structure of an XML document. An XML schema defines things such as which data elements and attributes can appear in a document; how the data elements relate to one another; whether an element is empty or can include text; which types of data are allowed for specific data elements and attributes; and what the default and fixed values are for elements and attributes. A schema is also a description of the data represented within a database. The format of the description varies but includes a table layout for a relational database or an entity-relationship diagram. It is method for specifying constraints on XML documents. —*[source](http://www.epa.gov/networkg/glossary.html)* -### SDK +### SDK Software Development Kits (SDK) are the next step in providing code for developers, after basic code samples. SDKs are more complete code libraries that usually include authentication and production ready objects, that developers can use after they are more familiar with an API and are ready for integration. Just like with code samples, SDKs should be provided in as many common programming languages as possible. Code samples will help developers understand an API, while SDKs will actually facilitate their integration of an API into their application. When providing SDKs, consider a software licensing that gives your developers as much flexibility as possible in their commercial products. —*[source](http://management.apievangelist.com/building-blocks.html)* -### Service-Oriented-Architecture +### Service-Oriented-Architecture Expresses a software architectural concept that defines the use of services to support the requirements of software users. 
In a SOA environment, nodes on a network make resources available to other participants in the network as independent services that the participants access in a standardized way. Most definitions of SOA identify the use of Web services (using SOAP and WSDL) in its implementation. However, one can implement SOA using any service-based technology with loose coupling among interacting software agents. —*[source](http://www.whitehouse.gov/sites/default/files/omb/assets/egov_docs/DRM_2_0_Final.pdf)* -### SOAP +### SOAP SOAP (Simple Object Access Protocol) is a message-based protocol based on XML for accessing services on the Web. It employs XML syntax to send text commands across the Internet using HTTP. SOAP is similar in purpose to the DCOM and CORBA distributed object systems, but is more lightweight and less programming-intensive. Because of its simple exchange mechanism, SOAP can also be used to implement a messaging system. —*[source](http://www.epa.gov/waters/doc/glossary.html)* -### Swagger +### Swagger A specification and complete framework implementation for describing, producing, consuming, and visualizing RESTful web services. The overarching goal of Swagger is to enable client and documentation systems to update at the same pace as the server. The documentation of methods, parameters and models are tightly integrated into the server code, allowing APIs to always stay in sync. —*[source](http://swagger.wordnik.com/)* -### Terms of Service +### Terms of Service Terms of Service provide a legal framework for developers to operate within. They set the stage for the business development relationships that will occur within an API ecosystem. Terms of Service should protect the API owner's company, assets and brand, but should also provide assurances for developers who are building businesses on top of an API. —*[source](http://management.apievangelist.com/building-blocks.html)* -### TSV +### TSV A simple text format for a database table. 
Each record in the table is one line of the text file. Each field value of a record is separated from the next by a tab stop character. It is a form of the more general delimiter-separated values format. —*[source](http://en.wikipedia.org/wiki/Tab-separated_values)* -### Unstructured Data +### Unstructured Data Data that is more free-form, such as multimedia files, images, sound files, or unstructured text. Unstructured data does not necessarily follow any format or hierarchical sequence, nor does it follow any relational rules. Unstructured data refers to masses of (usually) computerized information which do not have a data structure which is easily readable by a machine. Examples of unstructured data may include audio, video and unstructured text such as the body of an email or word processor document. Data mining techniques are used to find patterns in, or otherwise interpret, this information. Merrill Lynch estimates that more than 85 percent of all business information exists as unstructured data – commonly appearing in e-mails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations, and Web pages (“The Problem with Unstructured Data.”) —*[source](http://www.whitehouse.gov/sites/default/files/omb/assets/egov_docs/DRM_2_0_Final.pdf)* @@ -193,6 +193,6 @@ A Web service is a software system designed to support interoperable machine-to- An XML-based language (Web Services Description Language) used to describe the services a business offers and to provide a way for individuals and other businesses to access those services electronically. —*[source](http://www.epa.gov/networkg/glossary.html)* -### XML +### XML Extensible Markup Language (XML) is a flexible language for creating common information formats and sharing both the format and content of data over the Internet and elsewhere. 
XML is a formatting language recommended by the World Wide Web Consortium (W3C). —*[source](http://www.epa.gov/networkg/glossary.html)* diff --git a/implementation-guide.md b/implementation-guide.md index e7349d5f..df62f0d0 100644 --- a/implementation-guide.md +++ b/implementation-guide.md @@ -19,7 +19,7 @@ The purpose of this guidance is to provide additional clarification and detailed 4. Document if data cannot be released 5. Clarify roles and responsibilities for promoting efficient and effective data release -Agencies will establish an open data infrastructure by implementing this guidance and Memorandum [M-13-13](/policy-memo) and taking advantage of the resources provided on [Project Open Data](http://project-open-data.github.io). Once established, agencies will continue to evolve the infrastructure by identifying and adding new data assets[1](#footnote-1), enriching the description of those data assets through improved metadata, and increasing the amount of data shared with other agencies and the public. +Agencies will establish an open data infrastructure by implementing this guidance and Memorandum [M-13-13](/policy-memo) and taking advantage of the resources provided on [Project Open Data](/). Once established, agencies will continue to evolve the infrastructure by identifying and adding new data assets[1](#footnote-1), enriching the description of those data assets through improved metadata, and increasing the amount of data shared with other agencies and the public. At a minimum, a successful open data infrastructure must: @@ -29,7 +29,7 @@ At a minimum, a successful open data infrastructure must: The “access level” categories described in this document are intended to be used for organizational purposes within agencies and to reflect decisions already made in agencies about whether data assets can be made public; simply marking data assets “public” cannot substitute for the analysis necessary to ensure the data can be made public. 
Agencies are reminded that this underlying data from the inventory may only be released to the public after a full analysis of privacy, confidentiality, security, and other valid restrictions pertinent to law and policy. -This guidance seeks to balance the need to establish clear and meaningful expectations for agencies to meet, while allowing sufficient flexibility on the approach each agency may take to address their own unique needs. This guidance also includes references to other OMB memoranda that relate to the management of information. Agencies should refer to the definitions included in the attachment in [OMB Memorandum M-13-13](/policy-memo) *Open Data Policy-Managing Information as an Asset*. +This guidance seeks to balance the need to establish clear and meaningful expectations for agencies to meet, while allowing sufficient flexibility on the approach each agency may take to address their own unique needs. This guidance also includes references to other OMB memoranda that relate to the management of information. Agencies should refer to the definitions included in the attachment in [OMB Memorandum M-13-13](/policy-memo) *Open Data Policy-Managing Information as an Asset*. This guidance introduces an Enterprise Data Inventory framework to provide agencies with improved clarity on specific actions to be taken and minimum requirements to be met. It also provides OMB with a rubric by which to evaluate compliance and progress toward the objectives laid out in the Open Data Policy. Following the November 30, 2013 deadline, agencies shall report progress on a quarterly basis, and performance will be tracked through the Open Data Cross-Agency Priority (CAP) Goal. Meeting the requirements of this guidance will ensure agencies are putting in place a basic infrastructure for inventorying, managing, and opening up data to unlock the value created by opening up information resources. 
@@ -109,7 +109,7 @@ To improve the discoverability and usability of data assets, all federal agencie Agencies, at their discretion, may choose to include entries for non-public data assets in their Public Data Listings, taking into account guidance in section D. For example, an agency may choose to list data assets with an ‘accessLevel’ of ‘restricted public’ to make the public aware of their existence and the process by which these data may be obtained. -Agencies’ Public Data Listings will be used to dynamically populate the newly renovated Data.gov, the main website to find data assets generated and held by the U.S. Government. Data.gov allows anyone from the public to find, download, and use government data. The upcoming re-launch of Data.gov (currently in beta at next.data.gov) will automatically aggregate the agency-managed Public Data Listings into one centralized location, using the Project Open Data metadata standards and tagging to improve the user ability to find and use government data. +Agencies’ Public Data Listings will be used to dynamically populate the newly renovated Data.gov, the main website to find data assets generated and held by the U.S. Government. Data.gov allows anyone from the public to find, download, and use government data. The upcoming re-launch of Data.gov (currently in beta at next.data.gov) will automatically aggregate the agency-managed Public Data Listings into one centralized location, using the Project Open Data metadata standards and tagging to improve the user ability to find and use government data. The objectives of this activity are to: @@ -135,7 +135,7 @@ The objectives of this activity are to: ### C. Create a Process to Engage With Customers to Help Facilitate and Prioritize Data Release #### Purpose -Identifying and engaging with key data customers to help determine the value of federal data assets can help agencies prioritize those of highest value for quickest release. 
Data customers include public as well as government stakeholders[17](#footnote-17). All Federal Agencies will be required to engage public input and reflect on how to incorporate customer feedback into their data management practices. Agencies may develop criteria at their discretion for prioritizing the opening of data assets, accounting for a range of factors, such as the quantity and quality of user demand, internal management priorities, and agency mission relevance. As customer feedback mechanisms and internal prioritization criteria will likely evolve over time and vary across agencies, agencies should share successful innovations in incorporating customer feedback through interagency working groups and Project Open Data to disseminate best practices. Agencies should regularly review the evolving customer feedback and public engagement strategy. +Identifying and engaging with key data customers to help determine the value of federal data assets can help agencies prioritize those of highest value for quickest release. Data customers include public as well as government stakeholders[17](#footnote-17). All Federal Agencies will be required to engage public input and reflect on how to incorporate customer feedback into their data management practices. Agencies may develop criteria at their discretion for prioritizing the opening of data assets, accounting for a range of factors, such as the quantity and quality of user demand, internal management priorities, and agency mission relevance. As customer feedback mechanisms and internal prioritization criteria will likely evolve over time and vary across agencies, agencies should share successful innovations in incorporating customer feedback through interagency working groups and Project Open Data to disseminate best practices. Agencies should regularly review the evolving customer feedback and public engagement strategy. 
The objectives of this activity are to: @@ -163,7 +163,7 @@ The objectives of this activity are to: ### D. Document if Data Cannot be Released #### Purpose -The Open Data Policy requires agencies to strengthen and develop policies and processes to ensure that only the appropriate data are made available publicly. Agencies should work with their Senior Agency Official for Privacy and other relevant officials to ensure a complete analysis of issues that could preclude public disclosure of information collected or created. If the agency determines the data should not be made publicly available because of law, regulation, or policy or because the data are subject to privacy, confidentiality, security, trade secret, contractual, or other valid restrictions to release, agencies must document the determination in consultation with their Office of General Counsel or equivalent. The agency should designate one of three “access levels” for each data asset listed in the inventory: public, restricted public, and non-public. The descriptions of these categories can be found below and on Project Open Data. +The Open Data Policy requires agencies to strengthen and develop policies and processes to ensure that only the appropriate data are made available publicly. Agencies should work with their Senior Agency Official for Privacy and other relevant officials to ensure a complete analysis of issues that could preclude public disclosure of information collected or created. If the agency determines the data should not be made publicly available because of law, regulation, or policy or because the data are subject to privacy, confidentiality, security, trade secret, contractual, or other valid restrictions to release, agencies must document the determination in consultation with their Office of General Counsel or equivalent. The agency should designate one of three “access levels” for each data asset listed in the inventory: public, restricted public, and non-public. 
The descriptions of these categories can be found below and on Project Open Data. The objectives of this activity are to: diff --git a/licenses.md b/licenses.md index 0f3a2562..bc07337a 100644 --- a/licenses.md +++ b/licenses.md @@ -6,7 +6,7 @@ redirect_from: /license-examples/ filename: licenses.md --- -The [Federal Open Data Policy](https://project-open-data.cio.gov/policy-memo/#c-ensure-information-stewardship-through-the-use-of-open-licenses) states: *"Agencies must apply open licenses, in consultation with the best practices found in Project Open Data, to information as it is collected or created so that if data are made public there are no restrictions on copying, publishing, distributing, transmitting, adapting, or otherwise using the information for non-commercial or for commercial purposes."* +The [Federal Open Data Policy](https://project-open-data.cio.gov/policy-memo/#c-ensure-information-stewardship-through-the-use-of-open-licenses) states: *"Agencies must apply open licenses, in consultation with the best practices found in Project Open Data, to information as it is collected or created so that if data are made public there are no restrictions on copying, publishing, distributing, transmitting, adapting, or otherwise using the information for non-commercial or for commercial purposes."* As described below, works created by U.S. Government employees within the scope of their employment default to U.S. Public Domain. However, works produced by outside parties which are created or obtained for use by the U.S. Government may need open licenses applied to them: *"When information is acquired or accessed by an agency through performance of a contract, appropriate existing clauses [22](https://acquisition.gov/far/current/html/Subpart%2027_4.html) shall be utilized to meet these objectives"* @@ -18,7 +18,7 @@ For the purposes of Project Open Data, the term "Open License" is used to refer * *Reuse*. 
The license must allow for reproductions, modifications and derivative works and permit their distribution under the terms of the original work. The rights attached to the work must not depend on the work being part of a particular package. If the work is extracted from that package and used or distributed within the terms of the work’s license, all parties to whom the work is redistributed should have the same rights as those that are granted in conjunction with the original package. -* *Redistribution*. The license shall not restrict any party from selling or giving away the work either on its own or as part of a package made from works from many different sources. The license shall not require a royalty or other fee for such sale or distribution. The license may require as a condition for the work being distributed in modified form that the resulting work carry a different name or version number from the original work. The rights attached to the work must apply to all to whom it is redistributed without the need for execution of an additional license by those parties. The license must not place restrictions on other works that are distributed along with the licensed work. For example, the license must not insist that all other works distributed on the same medium are open. If adaptations of the work are made publicly available, these must be under the same license terms as the original work. +* *Redistribution*. The license shall not restrict any party from selling or giving away the work either on its own or as part of a package made from works from many different sources. The license shall not require a royalty or other fee for such sale or distribution. The license may require as a condition for the work being distributed in modified form that the resulting work carry a different name or version number from the original work. 
The rights attached to the work must apply to all to whom it is redistributed without the need for execution of an additional license by those parties. The license must not place restrictions on other works that are distributed along with the licensed work. For example, the license must not insist that all other works distributed on the same medium are open. If adaptations of the work are made publicly available, these must be under the same license terms as the original work. * *No Discrimination against Persons, Groups, or Fields of Endeavor*. The license must not discriminate against any person or group of persons. The license must not restrict anyone from making use of the work in a specific field of endeavor. For example, it may not restrict the work from being used in a business, or from being used for research. @@ -52,4 +52,3 @@ When agencies purchase data or content from third-party vendors, care must be ta * *[Copyright and Other Rights Pertaining to U.S. Government Works](http://www.usa.gov/copyright.shtml)* * *[Licensing Policies, Principles, and Resources: Examples of how government has addressed open licensing questions](https://project-open-data.cio.gov/licensing-resources/)* * *[Extended list of conformant licenses](http://opendefinition.org/licenses/)* - diff --git a/licensing-resources.md b/licensing-resources.md index 2654ee1d..745ad5b5 100644 --- a/licensing-resources.md +++ b/licensing-resources.md @@ -71,7 +71,7 @@ This Source Code Policy, forked from the DoD/CFPB policy, includes a new "Commun An educational resource for government employees and government contractors to understand the policies and legal issues relating to the use of open source software in the DoD. Much of the information collected there is applicable to other Federal agencies. 
The FAQ covers a range of issues, including: DoD policy on OSS, general information about OSS, OSS licenses, release of government software as OSS, and OSS-like approaches used within the Federal government. * [Working Version of the FAQs](http://risacher.github.io/DoD-OSS-FAQ/) -###[How to FOSS Your Government Project](http://bit.ly/HowToFOSS) +###[How to FOSS Your Government Project](https://bit.ly/HowToFOSS) *National Security Agency & DoD CIO* A checklist developed at NSA to document the internal processes required to release government-developed software as open source software. It provides a detailed example for other agencies to use as a starting point. The original document contained a number of NSA-specific processes, the linked document is a "template" version that removes the specifics, and leaves just the outline and advisory material. diff --git a/policy-docs.md b/policy-docs.md index a195e010..f1cc579e 100644 --- a/policy-docs.md +++ b/policy-docs.md @@ -5,23 +5,23 @@ permalink: /policy-docs/ filename: policy-docs.md --- -This section offers examples of policy documents (memos, guidance, manuals, etc) on open data and data management as inspiration for other agencies. +This section offers examples of policy documents (memos, guidance, manuals, etc) on open data and data management as inspiration for other agencies. ## Memos and Policies -* Department of Health and Human Services, "[The Health Data Initiative Strategy and Execution Plan](https://docs.google.com/document/d/1FyKD_JLmFNLKgw5wjJOSn-F84SZi3cdOUg-1LUYFGIA/pub?embedded=true)". This document, released on October 23, 2013 [on HealthData.gov](http://healthdata.gov/blog/health-data-initiative-strategy-execution-plan-released-and-ready-feedback) serves as the Department's data milestones for 2014-2016. -* Department of Health and Human Services, "[Health Data Initiative Next Steps](https://s3.amazonaws.com/file_hosting/HDI+Memo+KGS+8+1+11.pdf)". 
Memo from Secretary Kathleen Sebelius to agency heads, released on August 1, 2011. -* Department of Housing and Urban Development, "[Open Data Policy - Managing HUD's Data as a Strategic Information Asset](http://project-open-data.github.io/assets/docs/Memo_from_the_Acting_Deputy%20Secretary_et_alia_Open_Data_Policy.pdf)". This memo, released on April 30, 2014, describes HUD's efforts to manage and share data with the public. +* Department of Health and Human Services, "[The Health Data Initiative Strategy and Execution Plan](https://docs.google.com/document/d/1FyKD_JLmFNLKgw5wjJOSn-F84SZi3cdOUg-1LUYFGIA/pub?embedded=true)". This document, released on October 23, 2013 [on HealthData.gov](http://healthdata.gov/blog/health-data-initiative-strategy-execution-plan-released-and-ready-feedback) serves as the Department's data milestones for 2014-2016. +* Department of Health and Human Services, "[Health Data Initiative Next Steps](https://s3.amazonaws.com/file_hosting/HDI+Memo+KGS+8+1+11.pdf)". Memo from Secretary Kathleen Sebelius to agency heads, released on August 1, 2011. +* Department of Housing and Urban Development, "[Open Data Policy - Managing HUD's Data as a Strategic Information Asset](/assets/docs/Memo_from_the_Acting_Deputy%20Secretary_et_alia_Open_Data_Policy.pdf)". This memo, released on April 30, 2014, describes HUD's efforts to manage and share data with the public. * Department of the Interior, "[Memo Re: Implementation of DOI Open Data Policy"](/assets/docs/MEMO_RE_IMPLEMENTATION_OF_DOI_OPEN_DATA_POLICY.pdf)". This memo served as the official DOI announcement to all Department Bureaus and Offices of the near-term requirements of M-13-13, and also called for the creation of a Data Services Board to oversee data lifecycle management moving forward. -* Department of State, [Open Data Plan](http://www.state.gov/documents/organization/217997.pdf). This document describes how the Department of State will meet the requirements of m-13-13. 
-* Department of Veterans Affairs, "[Open Data - Quarterly Data Asset Collection Requirements](http://project-open-data.github.io/assets/docs/Data_Asset_Collection_Memo-02_21_14.pdf)". Memo from the Acting Assistant Secretary for Policy and Planning to Under Secretaries, Assistant Secretaries and Other Key Officials, released on February 26, 2014. +* Department of State, [Open Data Plan](http://www.state.gov/documents/organization/217997.pdf). This document describes how the Department of State will meet the requirements of m-13-13. +* Department of Veterans Affairs, "[Open Data - Quarterly Data Asset Collection Requirements](/assets/docs/Data_Asset_Collection_Memo-02_21_14.pdf)". Memo from the Acting Assistant Secretary for Policy and Planning to Under Secretaries, Assistant Secretaries and Other Key Officials, released on February 26, 2014. * Environmental Protection Agency, "[Open Data Policy Implementation Plan](http://www.epa.gov/digitalstrategy/pdf/EPA_OpenDataPolicy_ImplementationPlan_2013Nov26.pdf)". This report documents EPA's implementation plan and current progress in meeting the requirements set forth in the White House “Open Data Policy –Managing Information as an Asset” Memorandum M-13-13. -* General Services Administration, ["Increasing Data Sharing, Transparency and Reuse at GSA"](https://s3.amazonaws.com/file_hosting/263508-Data+sharing+memo-FINAL.pdf). This document, released on February 14, 2014, outlined to all agency staff the role of OMB Memorandum M-13-13 within the agency. -* National Aeronautics and Space Administration, [Data and Information Policy for NASA's Earth Science Program](http://science.nasa.gov/earth-science/earth-science-data/data-information-policy/). A set of data exchange and access principles adopted by NASA for its Earth Science program. +* General Services Administration, ["Increasing Data Sharing, Transparency and Reuse at GSA"](https://s3.amazonaws.com/file_hosting/263508-Data+sharing+memo-FINAL.pdf). 
This document, released on February 14, 2014, outlined to all agency staff the role of OMB Memorandum M-13-13 within the agency. +* National Aeronautics and Space Administration, [Data and Information Policy for NASA's Earth Science Program](http://science.nasa.gov/earth-science/earth-science-data/data-information-policy/). A set of data exchange and access principles adopted by NASA for its Earth Science program. * Department of Education, [Data Standardization Policies and Procedures](http://federalstudentaid.ed.gov/static/gw/docs/ciolibrary/ECONOPS_Docs/DataStandardizationPolicies&Procedures.pdf). The purpose of this document, released in 2007, is to describe policies and procedures for data standardization and data exchange in Federal Student Aid. * Department of Energy, [Atmospheric Radiation Measurement (ARM) Data Sharing and Distribution Policy](http://www.arm.gov/data/docs/policy). This document sets expectations and establishes procedures for sharing data acquired in through the operations of Atmospheric Radiation Measurement (ARM) Climate Research Facility. -* Los Alamos National Laboratory, [GISLab Data Policy](http://gislab.lanl.gov/policies/data_policy.html). GISLab has established a spatial data warehouse to provide for access to spatial data at LANL. The GISLab spatial data policy specifies provisions for both data providers and data users. -* United States Agency for International Development, [Open Data Policy](http://pdf.usaid.gov/pdf_docs/pbaab096.pdf). The document, released in October 2014, sets forth the agency's plan for data management. Read their one page [Fact Sheet](http://www.usaid.gov/sites/default/files/documents/1868/ADS579FactSheet.pdf) and access their [presentation] (https://github.com/bpushed/Misc/blob/master/files/USAID%20OpenData%20Policy-OSTP-2014-10-21a.pptx?raw=true) delivered to White House OSTP on October 21, 2014. +* Los Alamos National Laboratory, [GISLab Data Policy](http://gislab.lanl.gov/policies/data_policy.html). 
GISLab has established a spatial data warehouse to provide for access to spatial data at LANL. The GISLab spatial data policy specifies provisions for both data providers and data users. +* United States Agency for International Development, [Open Data Policy](http://pdf.usaid.gov/pdf_docs/pbaab096.pdf). The document, released in October 2014, sets forth the agency's plan for data management. Read their one page [Fact Sheet](http://www.usaid.gov/sites/default/files/documents/1868/ADS579FactSheet.pdf) and access their [presentation] (https://github.com/bpushed/Misc/blob/master/files/USAID%20OpenData%20Policy-OSTP-2014-10-21a.pptx?raw=true) delivered to White House OSTP on October 21, 2014. ## Guidance diff --git a/policy-memo.md b/policy-memo.md index 9b76870f..076b48f7 100644 --- a/policy-memo.md +++ b/policy-memo.md @@ -73,7 +73,7 @@ This attachment provides definitions and implementation guidance for M-13-13, *O * *Managed Post-Release.* A point of contact must be designated to assist with data use and to respond to complaints about adherence to these open data requirements. -* *Project Open Data:* "Project Open Data," a new OMB and OSTP resource, is an online repository of tools, best practices, and schema to help agencies adopt the framework presented in this guidance. Project Open Data can be accessed at [project-open-data.github.io](/). [^18] Project Open Data will evolve over time as a community resource to facilitate adoption of open data practices. The repository includes definitions, code, checklists, case studies, and more, and enables collaboration across the Federal Government, in partnership with public developers, as applicable. Agencies can visit Project Open Data for a more comprehensive glossary of terms related to open data. +* *Project Open Data:* "Project Open Data," a new OMB and OSTP resource, is an online repository of tools, best practices, and schema to help agencies adopt the framework presented in this guidance. 
Project Open Data can be accessed at [project-open-data.cio.gov](/). [^18] Project Open Data will evolve over time as a community resource to facilitate adoption of open data practices. The repository includes definitions, code, checklists, case studies, and more, and enables collaboration across the Federal Government, in partnership with public developers, as applicable. Agencies can visit Project Open Data for a more comprehensive glossary of terms related to open data. ### II. Scope diff --git a/schema.md b/schema.md index 5b68e135..909e9061 100644 --- a/schema.md +++ b/schema.md @@ -27,11 +27,11 @@ Updates to the metadata schema can be found in the [changelog](/metadata-changel Standard Metadata Vocabulary ---------------------------- -Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource (NISO 2004, ISBN: 1-880124-62-9). The challenge is to define and name standard metadata fields so that a data consumer has sufficient information to process and understand the described data. The more information that can be conveyed in a standardized regular format, the more valuable data becomes. +Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource (NISO 2004, ISBN: 1-880124-62-9). The challenge is to define and name standard metadata fields so that a data consumer has sufficient information to process and understand the described data. The more information that can be conveyed in a standardized regular format, the more valuable data becomes. Metadata can range from basic to advanced, from allowing one to discover the mere fact that a certain data asset exists and is about a general subject all the way to providing detailed information documenting the structure, processing history, quality, relationships, and other properties of a dataset. 
Making metadata machine readable greatly increases its utility, but requires more detailed standardization, defining not only field names, but also how information is encoded in the metadata fields. -Establishing a common vocabulary is the key to communication. The **common core metadata** specified in this memorandum is based on [DCAT](http://www.w3.org/TR/vocab-dcat/), a hierarchical vocabulary specific to datasets. This specification defines three levels of metadata elements: Required, Required-if (conditionally required), and Expanded fields. These elements were selected to represent information that is most often looked for on the web. To assist users of other metadata standards, [mappings](http://project-open-data.github.io/metadata-resources/#common_core_required_fields_equivalents) to equivalent elements in other standards are provided. +Establishing a common vocabulary is the key to communication. The **common core metadata** specified in this memorandum is based on [DCAT](http://www.w3.org/TR/vocab-dcat/), a hierarchical vocabulary specific to datasets. This specification defines three levels of metadata elements: Required, Required-if (conditionally required), and Expanded fields. These elements were selected to represent information that is most often looked for on the web. To assist users of other metadata standards, [mappings](/metadata-resources/#common_core_required_fields_equivalents) to equivalent elements in other standards are provided. What to Document -- Datasets and Web APIs ------------------------------------- @@ -48,16 +48,16 @@ Metadata File Format -- JSON The [Implementation Guidance](/implementation-guide/) available as a part of Project Open Data describes Agency requirements for the development of metadata as per the Open Data Policy. A quick primer on the file format involved: -[JSON](http://www.json.org) is a lightweight data-exchange format that is very easy to read, parse, and generate. 
Based on a subset of the JavaScript programming language, JSON is a text format that is optimized for data interchange. JSON is built on two structures: (1) a collection of name/value pairs; and (2) an ordered list of values. +[JSON](http://www.json.org) is a lightweight data-exchange format that is very easy to read, parse, and generate. Based on a subset of the JavaScript programming language, JSON is a text format that is optimized for data interchange. JSON is built on two structures: (1) a collection of name/value pairs; and (2) an ordered list of values. -Where optional fields are included in a catalog file but are unpopulated, they may be represented by a `null` value. They should not be represented by an empty string (`""`). +Where optional fields are included in a catalog file but are unpopulated, they may be represented by a `null` value. They should not be represented by an empty string (`""`). -The Project Open Data schema is case sensitive. The schema uses a camel case convention where the first letter of some words within a field are capitalized (usually all words but the first one). While it may seem subtle which characters are uppercase and lowercase, it is necessary to follow the exact same casing as defined in the schema documented here. For example: +The Project Open Data schema is case sensitive. The schema uses a camel case convention where the first letter of some words within a field are capitalized (usually all words but the first one). While it may seem subtle which characters are uppercase and lowercase, it is necessary to follow the exact same casing as defined in the schema documented here. 
For example: -> Correct: `contactPoint` -> Incorrect: `ContactPoint` -> Incorrect: `contactpoint` -> incorrect: `CONTACTPOINT` +> Correct: `contactPoint` +> Incorrect: `ContactPoint` +> Incorrect: `contactpoint` +> incorrect: `CONTACTPOINT` Links to downloadable examples of metadata files developed in this and other formats in [the metadata resources](/metadata-resources/). Tools to help agencies produce and maintain their data inventories are [available on GitHub](http://www.github.com/project-open-data) and hosted at [Labs.Data.gov](http://labs.data.gov). @@ -71,14 +71,14 @@ The following "common core" fields are required, to be used to describe each ent {: .table .table-striped} Field | Label | Definition -------------- | -------------- | -------------- -title | Title | Human-readable name of the asset. Should be in plain English and include sufficient detail to facilitate search and discovery. -description | Description | Human-readable description (e.g., an abstract) with sufficient detail to enable a user to quickly understand whether the asset is of interest. -keyword | Tags | Tags (or keywords) help users discover your dataset; please include terms that would be used by technical and non-technical users. -modified | Last Update | Most recent date on which the dataset was changed, updated or modified. -publisher | Publisher | The publishing entity. -contactPoint | Contact Name | Contact person's name for the asset. -mbox | Contact Email | Contact person's email address. -identifier | Unique Identifier | A unique identifier for the dataset or API as maintained within an Agency catalog or database. +title | Title | Human-readable name of the asset. Should be in plain English and include sufficient detail to facilitate search and discovery. +description | Description | Human-readable description (e.g., an abstract) with sufficient detail to enable a user to quickly understand whether the asset is of interest. 
+keyword | Tags | Tags (or keywords) help users discover your dataset; please include terms that would be used by technical and non-technical users. +modified | Last Update | Most recent date on which the dataset was changed, updated or modified. +publisher | Publisher | The publishing entity. +contactPoint | Contact Name | Contact person's name for the asset. +mbox | Contact Email | Contact person's email address. +identifier | Unique Identifier | A unique identifier for the dataset or API as maintained within an Agency catalog or database. accessLevel | Public Access Level | The degree to which this dataset **could** be made publicly-available, *regardless of whether it has been made available*. Choices: public (Data asset is or could be made publicly available to all without restrictions), restricted public (Data asset is available under certain use restrictions), or non-public (Data asset is not available to members of the public) "Common Core" Required-if-Applicable Fields @@ -87,16 +87,16 @@ The following fields must be used to describe each dataset if they are applicabl {: .table .table-striped} Field | Label | Definition --------------- | -------------- | -------------- -bureauCode | Bureau Code | Federal agencies, combined agency and bureau code from [OMB Circular A-11, Appendix C](http://www.whitehouse.gov/sites/default/files/omb/assets/a11_current_year/app_c.pdf) in the format of `015:11`. -programCode | Program Code | Federal agencies, list the primary program related to this data asset, from the [Federal Program Inventory](http://goals.performance.gov/sites/default/files/images/FederalProgramInventory_FY13_MachineReadable_091613.xls). Use the format of `015:001` -accessLevelComment | Access Level Comment | An explanation for the selected “accessLevel” including instructions for how to access a restricted file, if applicable, or explanation for why a “non-public” or “restricted public” data asset is not “public,” if applicable. Text, 255 characters. 
-accessURL | Download URL | URL providing direct access to the downloadable distribution of a dataset. -webService | Endpoint | Endpoint of web service to access dataset. -format | Format | The file format or API type of the distribution. -license | License | The license with which the dataset or API is published. See [Open Licenses](/open-licenses/) for more information. -spatial | Spatial | The range of spatial applicability of a dataset. Could include a spatial region like a bounding box or a named place. -temporal | Temporal | The range of temporal applicability of a dataset (i.e., a start and end date of applicability for the data). +-------------- | -------------- | -------------- +bureauCode | Bureau Code | Federal agencies, combined agency and bureau code from [OMB Circular A-11, Appendix C](http://www.whitehouse.gov/sites/default/files/omb/assets/a11_current_year/app_c.pdf) in the format of `015:11`. +programCode | Program Code | Federal agencies, list the primary program related to this data asset, from the [Federal Program Inventory](http://goals.performance.gov/sites/default/files/images/FederalProgramInventory_FY13_MachineReadable_091613.xls). Use the format of `015:001` +accessLevelComment | Access Level Comment | An explanation for the selected “accessLevel” including instructions for how to access a restricted file, if applicable, or explanation for why a “non-public” or “restricted public” data asset is not “public,” if applicable. Text, 255 characters. +accessURL | Download URL | URL providing direct access to the downloadable distribution of a dataset. +webService | Endpoint | Endpoint of web service to access dataset. +format | Format | The file format or API type of the distribution. +license | License | The license with which the dataset or API is published. See [Open Licenses](/open-licenses/) for more information. +spatial | Spatial | The range of spatial applicability of a dataset. 
Could include a spatial region like a bounding box or a named place. +temporal | Temporal | The range of temporal applicability of a dataset (i.e., a start and end date of applicability for the data). Beyond Common Core -- Extending the Schema ------------------------------------------ @@ -108,19 +108,18 @@ Agencies are encouraged to use the following expanded fields when appropriate. A {: .table .table-striped} Field | Label | Definition --------------- | -------------- | -------------- -theme | Category | Main thematic category of the dataset. -dataDictionary | Data Dictionary | URL to the data dictionary for the dataset or API. Note that documentation other than a data dictionary can be referenced using Related Documents as shown in the expanded fields. -dataQuality | Data Quality | Whether the dataset meets the agency's Information Quality Guidelines (true/false). -distribution | Distribution | Holds multiple download URLs for datasets composed of multiple files and/or file types -accrualPeriodicity | Frequency | Frequency with which dataset is published. +-------------- | -------------- | -------------- +theme | Category | Main thematic category of the dataset. +dataDictionary | Data Dictionary | URL to the data dictionary for the dataset or API. Note that documentation other than a data dictionary can be referenced using Related Documents as shown in the expanded fields. +dataQuality | Data Quality | Whether the dataset meets the agency's Information Quality Guidelines (true/false). +distribution | Distribution | Holds multiple download URLs for datasets composed of multiple files and/or file types +accrualPeriodicity | Frequency | Frequency with which dataset is published. landingPage | Homepage URL | Alternative landing page used to redirect user to a contextual, Agency-hosted "homepage" for the dataset or API when selecting this resource from the Data.gov user interface. -language | Language | The language of the dataset. 
-PrimaryITInvestmentUII | Primary IT Investment UII | For linking a dataset with an IT Unique Investment Identifier (UII) -references | Related Documents | Related documents such as technical information about a dataset, developer documentation, etc. -issued | Release Date | Date of formal issuance. -systemOfRecords | System of Records | If the system is designated as a system of records under the Privacy Act of 1974, provide the URL to the System of Records Notice related to this dataset. - +language | Language | The language of the dataset. +PrimaryITInvestmentUII | Primary IT Investment UII | For linking a dataset with an IT Unique Investment Identifier (UII) +references | Related Documents | Related documents such as technical information about a dataset, developer documentation, etc. +issued | Release Date | Date of formal issuance. +systemOfRecords | System of Records | If the system is designated as a system of records under the Privacy Act of 1974, provide the URL to the System of Records Notice related to this dataset. Further Metadata Field Guidance (alphabetical by field) ------------------------------- @@ -140,7 +139,7 @@ Further Metadata Field Guidance (alphabetical by field) **Cardinality** | (0,1) **Required** | Yes, if accessLevel is "restricted public" or "non-public" **Accepted Values** | String -**Usage Notes** | An explanation for the selected “accessLevel” including instructions for how to access a restricted file, if applicable, or explanation for why a “non-public” or “restricted public” data asset is not “public,” if applicable. +**Usage Notes** | An explanation for the selected “accessLevel” including instructions for how to access a restricted file, if applicable, or explanation for why a “non-public” or “restricted public” data asset is not “public,” if applicable. **Example** | `{"accessLevelComment":"This dataset contains Personally Identifiable Information and could not be released for public access.
A statistical analysis of the data contained herein, stripped of all personal identifiers, is available at http://another.website.gov/dataset."}` {: .table .table-striped} @@ -212,23 +211,23 @@ Further Metadata Field Guidance (alphabetical by field) **Cardinality** | (0,n) **Required** | No **Accepted Values** | See Usage Notes -**Usage Notes** | Distribution is a concatenation, as appropriate, of the following elements: **accessURL** and **format**. If an entry has only one dataset, enter details for that one; if it has multiple datasets (such as a bulk download and an API), separate entries as seen below: - +**Usage Notes** | Distribution is a concatenation, as appropriate, of the following elements: **accessURL** and **format**. If an entry has only one dataset, enter details for that one; if it has multiple datasets (such as a bulk download and an API), separate entries as seen below: + "distribution": [ { - "accessURL":"https://explore.data.gov/views/ykv5-fn9t/rows.csv?accessType=DOWNLOAD", + "accessURL":"https://explore.data.gov/views/ykv5-fn9t/rows.csv?accessType=DOWNLOAD", "format":"text/csv" - }, + }, { - "accessURL":"https://explore.data.gov/views/ykv5-fn9t/rows.json?accessType=DOWNLOAD", + "accessURL":"https://explore.data.gov/views/ykv5-fn9t/rows.json?accessType=DOWNLOAD", "format":"application/json" - }, + }, { - "accessURL":"https://explore.data.gov/views/ykv5-fn9t/rows.xml?accessType=DOWNLOAD", + "accessURL":"https://explore.data.gov/views/ykv5-fn9t/rows.xml?accessType=DOWNLOAD", "format":"text/xml" } ] - + {: .table .table-striped} **Field ** | **format** ----- | ----- @@ -253,7 +252,7 @@ Further Metadata Field Guidance (alphabetical by field) **Cardinality** | (0,1) **Required** | No **Accepted Values** | ISO 8601 Date -**Usage Notes** | Dates should be [ISO 8601](http://en.wikipedia.org/wiki/ISO_8601) of least resolution. In other words, as much of YYYY-MM-DDThh:mm:ss.sTZD as is relevant to this dataset. 
+**Usage Notes** | Dates should be [ISO 8601](http://en.wikipedia.org/wiki/ISO_8601) of least resolution. In other words, as much of YYYY-MM-DDThh:mm:ss.sTZD as is relevant to this dataset. **Example** | `{"issued":"2001-01-15"}` {: .table .table-striped} @@ -262,7 +261,7 @@ Further Metadata Field Guidance (alphabetical by field) **Cardinality** | (1,n) **Required** | Yes, always **Accepted Values** | Array of strings -**Usage Notes** | Surround each keyword with quotes. Separate keywords with commas. Avoid duplicate keywords in the same record. +**Usage Notes** | Surround each keyword with quotes. Separate keywords with commas. Avoid duplicate keywords in the same record. **Example** | `{"keyword":["vegetables","veggies","greens","leafy","spinach","kale","nutrition"]}` {: .table .table-striped} @@ -281,7 +280,7 @@ Further Metadata Field Guidance (alphabetical by field) **Required** | No **Accepted Values** | Array of strings **Usage Notes** | This should adhere to the [RFC 5646](http://tools.ietf.org/html/rfc5646) standard. This [language subtag lookup](http://rishida.net/utils/subtags/) provides a good tool for checking and verifying language codes. A language tag is comprised of either one or two parts, the language subtag (such as en for English, es for Spanish, wo for Wolof) and the regional subtag (such as US for United States, GB for Great Britain, MX for Mexico), separated by a hyphen. Regional subtags should only be provided when needed to distinguish a language tag from another one (such as American vs. British English).
-**Example** | `{"language":["en-US"]}` or if multiple languages, `{"language":["es-MX","wo","nv","en-US"]}` +**Example** | `{"language":["en-US"]}` or if multiple languages, `{"language":["es-MX","wo","nv","en-US"]}` {: .table .table-striped} **Field ** | **license** @@ -307,9 +306,9 @@ Further Metadata Field Guidance (alphabetical by field) **Cardinality** | (1,1) **Required** | Yes, always **Accepted Values** | ISO 8601 Date -**Usage Notes** | Dates should be [ISO 8601](http://en.wikipedia.org/wiki/ISO_8601) of least resolution. In other words, as much of YYYY-MM-DDThh:mm:ss.sTZD as is relevant to this dataset. If this file is brand-new, enter the **issued** date here as well. - -If there is a need to reflect that the dataset is continually updated, ISO 8601 formatting can account for this [with repeating intervals](http://en.wikipedia.org/wiki/ISO_8601#Time_intervals). For instance, `R/P1D` for daily, `R/P2W` for every two weeks, and `R/PT5M` for every five minutes. +**Usage Notes** | Dates should be [ISO 8601](http://en.wikipedia.org/wiki/ISO_8601) of least resolution. In other words, as much of YYYY-MM-DDThh:mm:ss.sTZD as is relevant to this dataset. If this file is brand-new, enter the **issued** date here as well. + +If there is a need to reflect that the dataset is continually updated, ISO 8601 formatting can account for this [with repeating intervals](http://en.wikipedia.org/wiki/ISO_8601#Time_intervals). For instance, `R/P1D` for daily, `R/P2W` for every two weeks, and `R/PT5M` for every five minutes. **Example** | `{"modified":"2012-01-15"}` or `{"modified":"R/P1D"}` {: .table .table-striped} @@ -363,12 +362,12 @@ If there is a need to reflect that the dataset is continually updated, ISO 8601 **Cardinality** | (0,1) **Required** | Yes, if applicable **Accepted Values** | ISO 8601 Date -**Usage Notes** | This field should contain an interval of time defined by start and end dates. 
Dates should be formatted as pairs of {start datetime/end datetime} in the [ISO 8601](http://en.wikipedia.org/wiki/ISO_8601) format. ISO 8601 specifies that datetimes can be formatted in a number of ways, ranging from a simple four-digit year (e.g., 2013) to a much more specific YYYY-MM-DDTHH:MM:SSZ, where the T specifies a separator between the date and time and time is expressed in 24-hour notation in the UTC (Zulu) time zone (e.g., 2011-02-14T12:00:00Z/2013-07-04T19:34:00Z). Use a solidus ("/") to separate start and end times. - +**Usage Notes** | This field should contain an interval of time defined by start and end dates. Dates should be formatted as pairs of {start datetime/end datetime} in the [ISO 8601](http://en.wikipedia.org/wiki/ISO_8601) format. ISO 8601 specifies that datetimes can be formatted in a number of ways, ranging from a simple four-digit year (e.g., 2013) to a much more specific YYYY-MM-DDTHH:MM:SSZ, where the T specifies a separator between the date and time and time is expressed in 24-hour notation in the UTC (Zulu) time zone (e.g., 2011-02-14T12:00:00Z/2013-07-04T19:34:00Z). Use a solidus ("/") to separate start and end times. + If there is a need to reflect that the dataset is continually updated, ISO 8601 formatting can account for this [with repeating intervals](http://en.wikipedia.org/wiki/ISO_8601#Time_intervals). For instance, updated monthly starting in January 2010 and continuing through the present would be represented as: `R/2010-01/P1M`. -Updated every 5 minutes beginning on February 15, 2010 would be represented as: `R/2010-02-15/PT5M`. -**Example** | `{"temporal":"2000-01-15T00:45:00Z/2010-01-15T00:06:00Z"}` or `{"temporal":"R/2000-01-15T00:45:00Z/P1W"}` +Updated every 5 minutes beginning on February 15, 2010 would be represented as: `R/2010-02-15/PT5M`.
+**Example** | `{"temporal":"2000-01-15T00:45:00Z/2010-01-15T00:06:00Z"}` or `{"temporal":"R/2000-01-15T00:45:00Z/P1W"}` {: .table .table-striped} **Field ** | **theme** @@ -376,7 +375,7 @@ Updated every 5 minutes beginning on February 15, 2010 would be represented as: **Cardinality** | (0,n) **Required** | No **Accepted Values** | Array of strings -**Usage Notes** | Separate multiple categories with a comma. Could include [ISO Topic Categories](http://www.isotopicmaps.org/). +**Usage Notes** | Separate multiple categories with a comma. Could include [ISO Topic Categories](http://www.isotopicmaps.org/). **Examples** | `{"theme":["vegetables"]}` or if multiple categories, `{"theme":["vegetables","produce"]}` {: .table .table-striped}