diff --git a/federal-awards-faq.md b/federal-awards-faq.md new file mode 100644 index 00000000..3c0dabad --- /dev/null +++ b/federal-awards-faq.md @@ -0,0 +1,122 @@ +# Applying the Open Data Policy to Federal Awards + +##Frequently Asked Questions + +###[General Questions:](#general) +* Q1: What sort of data is covered by the open data policy? +* Q2: Who within my agency is responsible for Open Data Policy decision-making? +* Q3: What does “open data” mean? +* Q4: What does “platform independent” mean? +* Q5: What does “machine readable” mean? +* Q6: Are all agencies approaching this data in the same way? What if my agency has a grant or contract with recipient A and we translate the data but another agency has a similar contract with recipient A and requires the recipient to provide the data in a platform-independent, machine-readable format? How does this help with consistency across the government? +* Q7: Must all agency data be posted publicly? What sorts of considerations guide whether data should be posted publicly? +* Q8: Some of the data I collect cannot be shared publicly. How is this data affected by the open data policy? +* Q9: Who decides whether data should be posted publicly? +* Q10: Once the decision to post data publicly is made, who is responsible for doing so? +* Q11: Do I need to change the way Federal award data are reported, for example, does my agency contract or grant writing system or interface with the Federal Procurement Data System, USAspending.gov, or agency financial system need to change? + +###[Questions Applicable to Contracts:](#contracts) +* Q12: Do I need to initiate modifications on my existing contracts to address the open data policy? +* Q13: For new contracts, what changes must I make to my deliverables or terms and conditions to address the open data policy? +* Q14: How difficult is it for agency personnel to transform machine-readable data into a platform-independent format? 
Is requiring data in a platform-independent format likely to drive up deliverable cost? Who can provide guidance on making the decision regarding who should convert the data? +* Q15: Are there any Federal Acquisition Regulation (FAR) provisions that might affect an agency’s authority to publish data provided as a contract deliverable? + +###[Questions Applicable to Grants and Other Financial Assistance:](#financial) +* Q16: Do I need to modify existing grants or financial assistance awards to address the open data policy? +* Q17: For new financial assistance awards, what changes must I make to terms and conditions to address the open data policy? +* Q18: What are the implications of the policy for new investigator-initiated scientific research grants? +* Q19: How difficult is it for agency personnel to transform machine-readable data into a platform-independent format? Is requiring data in a platform-independent format likely to drive up costs for recipients? Who can provide guidance on making the decision regarding who should convert the data? +* Q20: Is there OMB guidance that might affect an agency’s authority to post data generated with Federal financial assistance? + +## Responses + +###General Questions +####Q1: What sort of data is covered by the open data policy? +A: The purpose of the Open Data Policy is to make open and machine readable the new default for government data collection and dissemination in order to enhance availability, access, and interoperability, per the May 9, 2013 Executive Order released by the White House. The policy covers all information generated and stored by the Federal Government, as well as data purchased or created through Federal funding, such as data collected in conjunction with program administration, scientific research, or public health surveillance. 
Agencies are expected to prioritize opening data that is likely to fuel entrepreneurship, innovation, accountability, and scientific discovery, and to improve the lives of Americans in tangible ways. + +####Q2: Who within my agency is responsible for Open Data Policy decision-making? +A: The senior accountable official in your agency for making those decisions is the CIO, in partnership with the data owners in your agency. + +####Q3: What does “open data” mean? +A: Open data is data that is publicly available and structured in a way that enables the data to be fully discoverable and usable by end users inside and outside of government. In general, open data is consistent with the following principles: public, accessible, described, reusable, complete, timely, and managed post-release (attributes defined in the Office of Management and Budget (OMB) [Memorandum M-13-13 *Open Data Policy – Managing Information as an Asset*](http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf)). + +####Q4: What does “platform independent” mean? +A: Platform independent data do not require a specific program to open (e.g., a comma-separated values (.csv) spreadsheet). They are in a format that can be read and processed by a variety of software tools – including free tools. + +####Q5: What does “machine readable” mean? +A: Machine readable data is data structured in a format that can be understood and processed by a computer. Some file formats, such as PDF or Word documents, are human readable: they are easy for people to read but typically difficult for a computer to interpret. Machine readable formats, by contrast, can be processed by a computer and parsed or organized around specific information. 
OMB defines “Machine Readable Format” and provides some examples in Circular A-11 Part 6, “Preparation and Submission of Strategic Plans, Annual Performance Plans, and Annual Program Performance Reports.” The circular describes machine readable format as: +> a standard computer language (not English text) that can be read automatically by a web browser or computer system. (e.g.; xml). Traditional word processing documents, hypertext markup language (HTML) and portable document format (PDF) files are easily read by humans but typically are difficult for machines to interpret. Other formats such as extensible markup language (XML), [JavaScript Object Notation] (JSON), or spreadsheets with header columns that can be exported as comma separated values (CSV) are machine readable formats. It is possible to make traditional word processing documents and other formats machine readable but the documents must include enhanced structural elements. + +In addition, M-13-13, “Open Data Policy – Managing Information as an Asset,” encourages agencies to make data available in non-proprietary (i.e., platform independent) formats to the extent permitted by law. Non-proprietary formats are preferred because they allow the data to be accessed without requiring specific proprietary software programs such as Excel. Preferred file formats are identified in A-11 Part 6 and include JSON, XML, and CSV. +Your agency CIO will be able to provide more information and subject matter expertise to help contracting officer representatives (CORs) or other program officials understand the requirements and how to assess deliverables. + +####Q6: Are all agencies approaching this data in the same way? 
What if my agency has a grant or contract with recipient A and we translate the data but another agency has a similar contract with recipient A and requires the recipient to provide the data in a platform-independent, machine-readable format? How does this help with consistency across the government? +A: OMB is working collaboratively across the Executive Office of the President to better understand how agencies are implementing open data and is providing input and information to smooth the implementation process for grant recipients and contractors. As part of that process, we will be working across the Councils identified in Section 3(b) of the [Executive Order (EO) “Making Open and Machine Readable the New Default for Government Information”](http://www.whitehouse.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government-) to better understand overall implementation efforts and further develop tools and information that provide consistency and avoid unnecessary duplication. + +####Q7: Must all agency data be posted publicly? What sorts of considerations guide whether data should be posted publicly? +A: No. Neither the EO nor the policy changes agency obligations under any law. Specifically, nothing in the EO or policy compels or authorizes the disclosure of privileged information, law enforcement information, national security information, personal information, or information the disclosure of which is prohibited by law. For example, for contracts, proprietary company information subject to the Trade Secrets Act should not be made publicly available, nor, in many cases, should detailed vendor pricing information. 
Note that when datasets are designated as inappropriate for public release due to, for instance, issues associated with personally identifiable information, agencies should consider whether the value of the data can be increased by either (1) making the data available to a set of qualified parties (e.g., researchers) with strong legal and privacy protections, or (2) providing access to other Executive branch agencies through inter-agency data sharing agreements. However, M-13-13 still requires that these types of information be inventoried in an Enterprise Data Inventory regardless of whether they are considered “open data,” which may lead to valuable internal use. Your agency CIO, as the responsible official for implementation of the policy, will have more information about any agency- or program-specific considerations that should be applied to data in your agency. + +####Q8: Some of the data I collect cannot be shared publicly. How is this data affected by the open data policy? +A: The open data policy is about more than just making data publicly available; it is also about managing information as a strategic asset within the enterprise. Even data that cannot be shared publicly should be received and stored in platform-independent, machine-readable formats whenever possible. In many cases, there are significant benefits to be derived from sharing data across the government, even when the data will not be publicly posted (see, for instance, [OMB Memorandum M-11-02: Sharing Data While Protecting Privacy](http://www.whitehouse.gov/sites/default/files/omb/memoranda/2011/m11-02.pdf)). As an example, detailed contract price data that can be shared within the government can provide a valuable touch-point as other agencies plan their procurements, but in some cases that data cannot be publicly shared. Similarly, data collected in conjunction with administering public benefits can be valuable to statistical agencies. 
+ +####Q9: Who decides whether data should be posted publicly? +A: Program office personnel and grant officers will largely be responsible for recommending whether a given dataset should be made publicly available. As the data owners, such personnel and officers are in the best position to determine whether privacy, national security, or other concerns would preclude public posting. While agency general counsel, grant and contracting officers, and security and privacy personnel, among others, may all be involved in reviewing whether or not a given dataset can be made public, the agency CIO is the official responsible in each agency for leading the open data effort and providing guidance on implementation. + +####Q10: Once the decision to post data publicly is made, who is responsible for doing so? +A: The agency CIO has the primary responsibility for creating and managing an enterprise-wide inventory of all agency datasets. The agency CIO is also responsible for creating a public data listing, as a subset of the full inventory. The public data listing does not include the underlying databases, but is simply a list describing each dataset in the agency that is public or could be made public (e.g., title, description, point of contact). Public data listings, under M-13-13, are required to be posted at agency.gov/data pages. + +#### Q11: Do I need to change the way Federal award data are reported, for example, does my agency contract or grant writing system or interface with the Federal Procurement Data System – Next Generation (FPDS-NG), USAspending.gov, or agency financial system need to change? +A: No. FPDS-NG and USAspending.gov already provide public access to their contents in open formats. Some system changes may need to be considered for other Federal agency platforms that house public data and enable it to be accessed or downloaded. 
+ +###Questions Applicable to Contracts + +####Q12: Do I need to initiate modifications on my existing contracts to address the open data policy? +A: No. The open data policy is prospective. For new awards, Federal contracts must ensure that the Government treats data as a valuable national asset, and structures any data-related deliverables to collect such data in formats that can be shared, regardless of whether a determination has been made as to whether the data should be made available to the public. There is no requirement to modify existing procurement agreements. + +####Q13: For new contracts, what changes must I make to my deliverables or terms and conditions to address the open data policy? +A: That depends on what the Government intends to buy. If data are to be included as a deliverable, and the deliverable is written to require submission of those data in an appropriate format (See Section A or contact your agency CIO for more details), there are no additional requirements. The open data policy is not designed to change what the government is buying, but is focused instead on how that information is delivered, and ensuring it can be appropriately reused. + +For example, if a deliverable consists of an assessment or end-state report and no underlying data are required, the open data policy will not force the agency to add a data deliverable or alter the other terms and conditions of the contract. If, however, a deliverable includes data, the deliverable should be structured to require the delivery or export of the data in a machine-readable format. The agency has discretion regarding whether deliverables must be provided in platform-independent formats, but if the deliverable is not provided in such a format, the agency will need to take the extra step of translating it into a platform-independent format. Either solution is acceptable to reach the objective of open data. 
+ +As agencies contemplate the purchase of systems that will process data, the requirements in Section 3 of M-13-13 must be considered. These requirements provide for a life-cycle view of effective and efficient information management by requiring that information is collected in a way that supports downstream processing, and that systems are built to support interoperability and information accessibility, including regular access or exporting of the data as a standard requirement of such systems. Addressing these considerations early in the acquisition process will protect against the costly retrofitting that is often involved in retrieving data from legacy platform-dependent systems. + +#### Q14: How difficult is it for agency personnel to transform machine-readable data into a platform-independent format? Is requiring data in a platform-independent format likely to drive up deliverable cost? Who can provide guidance on making the decision regarding who should convert the data? +A: A number of tools are available to transform data from proprietary formats to platform independent formats. [Project Open Data](http://project-open-data.github.io/) has collected a number of such tools. Many of these tools are easy to operate and require very little investment of time or effort. As such, they can be used either by a contractor (if the contract specifies delivery in a platform independent format) or by Federal employees (if the contract specifies delivery in a proprietary format). +Your agency CIO is responsible for leading the open data initiative in your agency as well as working with the interagency working group on this effort. Your agency may determine that having the agency translate the delivered data is most cost effective so we encourage you to work with your CIO on this effort. + +####Q15: Are there any Federal Acquisition Regulation (FAR) provisions that might affect an agency’s authority to publish data provided as a contract deliverable? 
+A: While the scope of work and other contractual language concerning the required deliverables will determine what data is delivered under a given contract, the rights that the government obtains to the delivered data are generally covered by one of a number of standard contract clauses. For civilian agencies, procedures for utilizing these clauses appear within FAR subpart 27.4 – Rights in Data and Copyrights ([48 C.F.R. 27.400 et seq.](https://acquisition.gov/far/current/html/FARTOCP27.html)). For the Department of Defense, contracting officers are instructed by the Defense Federal Acquisition Regulation Supplement (DFARS) to use the DFARS coverage in [subparts 227.71 and 227.72](http://www.acq.osd.mil/dpap/dars/dfarspgi/current/index.html) in lieu of the guidance in FAR subpart 27.4. Other agencies may have supplemental acquisition regulations that apply. +The procedures contained in both the FAR and the DFARS are designed to implement the policy principles expressed at FAR 27.402. Specifically: +(a) To carry out their missions and programs, agencies acquire or obtain access to many kinds of data produced during or used in the performance of their contracts. Agencies require data to— +(1) Obtain competition among suppliers; +(2) Fulfill certain responsibilities for disseminating and publishing the results of their activities; +(3) Ensure appropriate utilization of the results of research, development, and demonstration activities including the dissemination of technical information to foster subsequent technological developments; +(4) Meet other programmatic and statutory requirements; and +(5) Meet specialized acquisition needs and ensure logistics support. +(b) Contractors may have proprietary interests in data. In order to prevent the compromise of these interests, agencies shall protect proprietary data from unauthorized use and disclosure. 
The protection of such data is also necessary to encourage qualified contractors to participate in and apply innovative concepts to Government programs. In light of these considerations, agencies shall balance the Government’s needs and the contractor’s legitimate proprietary interests. + +While these clauses address the data rights obtained by an agency, decisions regarding whether data will be published will be made in accordance with Q9 above, involving, as appropriate, consultation among the agency CIO, general counsel, contracting personnel, security personnel and privacy personnel. + +### Questions Applicable to Grants and Other Financial Assistance +####Q16: Do I need to modify existing grants or financial assistance awards to address the open data policy? +A: No. The open data policy is prospective and looks at grants made after the publication of the Executive Order. For these new awards, Federal financial assistance and grant agreements must ensure that the Government treats data as a valuable national asset, and structures any data-related deliverables to collect such data in formats that can be appropriately shared. There is no requirement to modify existing agreements. + +####Q17: For new financial assistance awards, what changes must I make to terms and conditions to address the open data policy? +A: You are not required to make any changes to the terms and conditions of new awards, but depending on the situation, you may want to consider future changes per below. + +A1: If data are included or expected to be provided to the Federal government as an outcome of a Federal award and those data are already in the right format (See Section B or contact your agency CIO for more details), there are no additional requirements for the award. 
+ +A2: If data are expected to be provided to the Federal government but are not in the correct format, the agency must determine whether to require the recipient to report the information in the new format (which may require a change to the terms and conditions), or whether to convert the data to the new format itself (which would not). +Your agency CIO is responsible for leading the open data initiative in your agency as well as working with the interagency working group on this effort. Your agency may determine that having the agency translate the delivered data is most cost effective, so we encourage you to work with your CIO on this effort. +A3: If data are generated with Federal funding, but providing data to the Federal government would not otherwise be an expected outcome of the Federal award, this policy does not include any new requirement to collect it. + +####Q18: What are the implications of the policy for new investigator-initiated scientific research grants? +A: Although most investigator-initiated research grants are expected to generate data, those that do not include a specific provision requiring that the resulting data be provided to the government are not subject to the requirements of this policy. However, to increase the value of the agency’s investment, some grants already require that data be provided to the agency. Furthermore, it is likely that some agencies will begin providing additional guidance with respect to managing digital scientific data created under investigator-initiated grants as part of their response to the Office of Science and Technology Policy’s [Memorandum on Increasing Public Access to the Results of Federally Funded Research](http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf). 
Agencies, in conjunction with their CIOs, may determine that some requirements of the M-13-13 policy are appropriate for inclusion in future funding announcements (e.g., machine readable, platform independent). + +####Q19: How difficult is it for agency personnel to transform machine-readable data into a platform-independent format? Is requiring data in a platform-independent format likely to drive up costs for recipients? Who can provide guidance on making the decision regarding who should convert the data? +A: A number of tools are available to transform data from proprietary formats to platform independent formats. [Project Open Data](http://project-open-data.github.io) has collected a number of such tools. Many of these tools are easy to operate and require very little investment of time or effort. As such, they can be used either by a recipient (if the award specifies delivery in a platform independent format) or by Federal employees (if the award either does not specify a format or specifies delivery in a proprietary format). +Your agency CIO is responsible for leading the open data initiative in your agency as well as working with the interagency working group on this effort. Your agency may determine that having the agency translate the delivered data is most cost effective. + +####Q20: Is there OMB guidance that might affect an agency’s authority to post data generated with Federal financial assistance? +A: For Federal financial assistance to state, local, and tribal governments, please see the guidance in OMB Circular A-102. For Federal financial assistance to nonprofit organizations, including institutions of higher education, please see the guidance in [OMB Circular A-110](http://www.whitehouse.gov/omb/circulars_a102). 
+While this guidance addresses the rights obtained by an agency, decisions regarding whether data will be published will be made in accordance with Q9 above, generally involving, as appropriate, consultation among the agency CIO, general counsel, contracting personnel, security personnel and privacy personnel. diff --git a/implementation-guide.md b/implementation-guide.md index 5f9c7964..766df1ad 100644 --- a/implementation-guide.md +++ b/implementation-guide.md @@ -6,207 +6,309 @@ permalink: "/implementation-guide/" filename: "implementation-guide.md" --- -## 1) Create and maintain an enterprise data inventory -*\[Due by 11/9/13\]* +#Supplemental Guidance on the Implementation of M-13-13 “Open Data Policy – Managing Information as an Asset” -Maintain a complete listing of all datasets owned, managed, collected, and/or created by your agency, described in a common format. +## I. Introduction -### A) Minimum Required for Compliance +The purpose of this guidance is to provide additional clarification and detailed requirements to assist agencies in carrying out the objectives of [Executive Order 13642](http://www.whitehouse.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government) of May 9, 2013, *Making Open and Machine Readable the New Default for Government Information* and [OMB Memorandum M-13-13](/policy-memo.md) *Open Data Policy-Managing Information as an Asset*. Specifically, this document focuses on near-term efforts agencies must take to meet the following five initial requirements of M-13-13, which are due November 1, 2013 (six months from publication of M-13-13): -Produce a single catalog or list of data managed in a single table, workspace, or other relevant location. Describe each dataset according to the [common core metadata](/schema/). +1. Create and maintain an Enterprise Data Inventory (Inventory) +2. Create and maintain a Public Data Listing +3. 
Create a process to engage with customers to help facilitate and prioritize data release +4. Document if data cannot be released +5. Clarify roles and responsibilities for promoting efficient and effective data release -This listing can be maintained in Data Management Systems (DMS) such as the open-source [CKAN](http://www.ckan.org) platform or Software as a Service offerings like [Socrata](http://www.socrata.com/open-data-portal); a single spreadsheet, with each metadata field as its own column; or a DMS of your choosing. +Agencies will establish an open data infrastructure by implementing this guidance and Memorandum [M-13-13](/policy-memo.md) and taking advantage of the resources provided on [Project Open Data](http://project-open-data.github.io). Once established, agencies will continue to evolve the infrastructure by identifying and adding new data assets [^1], enriching the description of those data assets through improved metadata, and increasing the amount of data shared with other agencies and the public. -### B) Best Practices and Examples +At a minimum, a successful open data infrastructure must: +* Provide a robust and usable Enterprise Data Inventory of an agency’s data assets, so that an agency can manage its data as strategic assets, +* Incorporate iterative and efficient processes for managing and opening data assets, and +* Create the Public Data Listing as a direct output or subset of the Enterprise Data Inventory. -* Conduct a zero-based review effort of all existing data. Give this effort a very short timeframe and the very specific goal of producing a simple list of all data assets within the agency. Stop at the due date rather than stopping at the 100 percent marker, which is very difficult to reach in a single pass. Repeat at regular intervals. -* Develop and communicate a clear path for listing newly created or acquired datasets into the enterprise data inventory. 
-* The more employees who can contribute to the enterprise data inventory, whether by submitting feedback or by actually being able to log in and update listings in the agency DMS, the more accurate and complete your metadata will be. -* While it may initially seem that maintaining your agency data inventory in a single spreadsheet is the simplest solution, this is often not the case. A central spreadsheet is difficult for more than one person to maintain, easily leading to errors and omissions. -* In addition to the required [common core metadata](/schema/), work with your agency and topical experts to develop an expanded set of metadata fields that make sense for your vertical. Many already exist; explore [Schema.org](http://www.schema.org) as a starting point. -* Your agency can and should use this central inventory listing as an internal search tool to increase awareness of data collections already in existence and to prevent duplicative research efforts. For example, a search of this inventory may reveal that the combination of two existing datasets could produce the results sought by a proposed new collection. +The “access level” categories described in this document are intended to be used for organizational purposes within agencies and to reflect decisions already made in agencies about whether data assets can be made public; simply marking data assets “public” cannot substitute for the analysis necessary to ensure the data can be made public. Agencies are reminded that this underlying data from the inventory may only be released to the public after a full analysis of privacy, confidentiality, security, and other valid restrictions pertinent to law and policy. - -## 2) Create and maintain a public data catalogue -*\[Due by 11/9/13\]* +This guidance seeks to balance the need to establish clear and meaningful expectations for agencies to meet, while allowing sufficient flexibility on the approach each agency may take to address their own unique needs. 
This guidance also includes references to other OMB memoranda that relate to the management of information. Agencies should refer to the definitions included in the attachment in [OMB Memorandum M-13-13](/policy-memo.md) *Open Data Policy-Managing Information as an Asset*. -Maintain a publicly accessible listing of all datasets maintained by your agency for harvesting by a central Data.gov search engine and the public at large. +This guidance introduces an Enterprise Data Inventory framework to provide agencies with improved clarity on specific actions to be taken and minimum requirements to be met. It also provides OMB with a rubric by which to evaluate compliance and progress toward the objectives laid out in the Open Data Policy. Following the November 1, 2013 deadline, agencies shall report progress on a quarterly basis, and performance will be tracked through the Open Data Cross-Agency Priority (CAP) Goal. Meeting the requirements of this guidance will ensure agencies are putting in place a basic infrastructure for inventorying, managing, and opening up data to unlock the value created by opening up information resources. -While agencies are only required to list datasets with an "Access Level" value of "public," agencies are free to include metadata for other datasets at their discretion. (For example, if the agency intends to also use the catalog as an internal search tool.) +## II. Policy Requirements -### A) Minimum Required for Compliance +### A. Create and Maintain an Enterprise Data Inventory -Document any datasets or metadata in your enterprise data inventory that your agency does not believe can be made publicly available, in consultation with your Office of General Counsel or its equivalent. +#### Purpose +To develop a clear and comprehensive understanding of what data assets they possess, Federal Agencies are required to create an Enterprise Data Inventory (Inventory) that accounts for all data assets created or collected by the agency. 
This includes, but is not limited to, data assets used in the agency’s information systems. The Inventory must be enterprise-wide, accounting for data assets across programs [^2] and bureaus [^3], and must use the required [common core metadata](/schema.md) available on Project Open Data. After creating the Inventory, agencies should continually improve the usefulness of the Inventory by expanding, enriching, and opening the Inventory (concepts described in the framework below). -Publish your agency’s enterprise data inventory, with the aforementioned information removed, to a file located at \[agency\].gov/data.json and described using (at minimum) the [common core metadata](/schema/). This file itself must be listed as a dataset within itself (see [an example of format](/examples/catalog-sample-extended.json) ); if you have multiple data.json files across your agency, include all of them in the top-level data.json at agency.gov/data.json. +The objectives of this activity are to: +* Build an internal inventory that accounts for data assets used in the agency’s information systems +* Include data assets produced through agency contracts and cooperative agreements, and in some cases agency-funded grants; include data assets associated with, but not limited to, research, program administration, statistical, and financial activities +* Indicate whether the data may be made publicly available and whether they are currently available +* Describe the data with [common core metadata](/schema.md) available on Project Open Data. -While you could manually create this file in a text editor, it is recommended that you use one of the tools provided to generate this file automatically from your existing DMS or enterprise inventory file. 
+#### Framework to Create and Maintain the Enterprise Data Inventory: Expand, Enrich, Open +Since agencies have varying levels of visibility into their data assets, the size and maturity of agencies’ Enterprise Data Inventories will differ across agencies. 
OMB will assess agency progress toward overall maturity of the Enterprise Data Inventory through the maturity areas of “Expand,” “Enrich,” and “Open.” -### B) Tools +**Expand**: Expanding the inventory refers to adding data assets to the Inventory. Agencies should develop their own strategy to expand the inventory and break down the work according to agency-defined classes of data [^4]. Agencies should communicate their plans for expanding the Inventory in the Inventory Schedule (described in the minimum requirements). As agencies develop an Inventory Schedule, they may find it helpful to group their data assets into classes of data. The following list provides examples of classes agencies may use as they schedule the expansion of the Inventory: +* [Agency operating units](http://www.whitehouse.gov/sites/default/files/omb/assets/a11_current_year/app_c.pdf) (for example, bureaus or offices) +* [Federal Program Inventory](http://goals.performance.gov/federalprograminventory) on Performance.gov +* Common business areas or segments, such as those described in the [Business Reference Model](http://www.whitehouse.gov/omb/e-gov/fea) or the [Budget Function Codes](http://www.whitehouse.gov/sites/default/files/omb/assets/a11_current_year/s79.pdf) of budget accounts +* Agency strategic objectives on Performance.gov and the [Performance Reference Model](http://www.whitehouse.gov/omb/e-gov/fea) +* Types of data from the [Data Reference Model](http://www.whitehouse.gov/omb/e-gov/fea) +* Existing listings of certain types of data assets, such as Information Collection Requests (ICR) submitted to OMB under the Paperwork Reduction Act (as listed on reginfo.gov [^5]) and/or files posted on the agency’s public website +* Data assets already prioritized by the agency in response to other Administration initiatives [^6] +* Primary related IT investments from the Federal IT Dashboard [^7] +* Agency-defined prioritizations of data assets +* Other classes or criteria -* **Don’t
have a DMS?** Use the hosted Catalog Generator to create your data.json file via basic data entry. -* **Is your data inventory stored in a CSV (Excel file)?** Use the [CSV-to-API generator](http://labs.data.gov/csv-to-api/) to automatically convert it into a compliant data.json file. -* **Is your data inventory stored in Socrata?** Socrata has native support for data.json, so any datasets stored are automatically exposed appropriately. In addition, any of the extended properties specified in the [common core metadata](http://project-open-data.github.io/schema/) can be set on a per-dataset basis using the built-in metadata facilities. -* **Is your data inventory stored in CKAN?** Use the Data.gov extension (coming soon). -* **Not sure if your data.json file meets the requirements?** Paste your file into the [JSON Validator](https://github.com/project-open-data/json-validator) to receive real-time feedback. +> Example ways to evaluate “Expand” maturity: How has the Inventory expanded over time to include additional data assets? What “classes” of data (for example, financial, performance, scientific, regulatory, etc.) have been added or are planned to be added? Are all bureaus and programs represented in the Inventory? If not, what percentage is?\* -### C) Best Practices and Examples +**Enrich**: To improve the discoverability, management, and re-usability of data assets, agencies should enrich the Inventory over time by improving the quality of metadata describing each data asset. For example, agencies may: -* Using the [common core metadata](/schema/) to describe your enterprise data inventory makes it very simple to use that inventory for your public inventory. +* increase the number of keyword tags, +* clarify descriptions of data, or +* add additional metadata fields consistent with existing communities of practice or use cases. 
-* A detailed and descriptive title, description, and set of keywords for each dataset is the difference between customers finding your data and no one finding your data. Since agency data catalogs are harvested and searchable on Data.gov, accurate and thorough metadata is the best way to connect customers with your data. -* Consider including restricted and non-public datasets in your public data inventory listing. Remember that this file contains metadata about the data and not the data themselves. -* When you include restricted datasets in your public data inventory, include specific information on how customers can request and qualify for access to those data. -* Integrate your public data inventory with a tool for soliciting feedback from customers to avoid duplicative effort. For example, the [Kickstart WordPress plugin](https://github.com/project-open-data/kickstart) can automatically generate a voting and commenting mechanism from your data.json file. -* Data consumers should not need to know an Agency's org chart in order to find data. While it is helpful to include metadata about which part of the organization is providing the data, consider that secondary users will likely be searching for data using topical and thematic keywords as opposed to agency structure. -* In describing data, avoid use of agency acronyms wherever possible. +Project Open Data provides metadata requirements, additional optional metadata fields, and examples of metadata areas (see Appendix for examples). To improve the management of IT systems through the Inventory, agencies are encouraged to include the Primary Related IT Investment Unique Investment Identifier (UII) as a metadata field. As they work to enrich data assets, agencies should carefully weigh the potential value of efforts to improve data description or increase the number of metadata fields against the potential associated burden. 
Agencies should work to avoid the risk of duplicative metadata and work toward adopting a uniform schema. To that end, agencies should draw on the expertise of existing communities of practice [^8], review standard taxonomies [^9], and coordinate across the government to harmonize definitions when adopting additional metadata fields. -### D) Resources +> Example ways to evaluate “Enrich” maturity: How has the agency improved the quality of metadata for each record? Are effective keywords and clear language used in data descriptions? Do additional metadata fields apply best practices from Project Open Data? Has the agency developed policies and procedures for populating these fields consistently? Has the agency linked the Inventory to federal IT management by including the Primary Related IT Investment Unique Investment Identifier (UII)?\* -* [Common Core Metadata](/schema/) +**Open**: Agencies should implement tools and processes that will accelerate the opening of additional valuable data assets by making them public and machine-readable, while ensuring adequate policy, process, and technical safeguards are in place to prevent the release of sensitive data. Agencies are required to increase the number of public data assets included in the Public Data Listing (described in the next section) over time. Agencies should work toward increasing the ratio of data that are public and machine-readable to data that can be made public as measured in the Inventory. - -## 3) Engage with customers to help facilitate and prioritize data release -*\[Due by 11/9/13\]* +> Example ways to evaluate “Open” maturity: How many releasable data assets have been released in the Public Data Listing?
How have more data assets been released in accordance with the [“open data” principles](/principles.md) over time?\* -### Minimum Required for Compliance +### Minimum Requirements to Create and Maintain an Enterprise Data Inventory -Create a process to solicit feedback from customers about existing and potential future dataset releases, including (but not limited to): -* Suggestions about additional formats in which to release a particular dataset, such as via an API -* Suggestions as to which datasets to release next +#### Develop and Submit to OMB an Inventory Schedule (by November 1, 2013) +* Describe how the agency will ensure that all data assets from each bureau and program in the agency have been identified and accounted for in the Inventory, to the extent practicable, no later than November 1, 2014. +* Describe how the agency plans to expand, enrich, and open their Inventory each quarter through November 1, 2014 at a minimum; include a summary and milestones in the schedule. [^10] +* Publish Inventory Schedule on the www.\[agency\].gov/digitalstrategy page by November 1, 2013. [^11] -### A) Tools +#### Create an Enterprise Data Inventory (by November 1, 2013) +* Include, at a minimum, all data assets which were posted on Data.gov before August 1, 2013 and additional representative data assets from programs and bureaus. +* Ensure the Inventory contains one metadata record for each data asset. A data asset can describe a collection of datasets (such as a CSV file for each state). +* Use common core “required” fields and “required-if-applicable” fields on Project Open Data (includes indicating whether data can be made publicly available). +* Submit to OMB via MAX Community [^12] the inventory as a single JSON file using the defined schema from Project Open Data. OMB invites agency input on the option of replacing future submission with an API via a discussion on Project Open Data. 
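
The requirements above (one metadata record per data asset, submitted as a single JSON file) can be sketched as follows. This is a minimal illustration, not the official schema: only `accessLevel` and its three values come from this guidance, while the `identifier` and `title` field names, the sample records, and the helper functions are assumptions made for the example. The authoritative field list is the common core metadata schema on Project Open Data.

```python
import json

# Hypothetical inventory records. Only "accessLevel" and its three values are
# taken from this guidance; "identifier" and "title" are illustrative assumptions.
inventory = [
    {"identifier": "asset-001", "title": "Example survey results", "accessLevel": "public"},
    {"identifier": "asset-002", "title": "Example case files", "accessLevel": "non-public"},
]

ALLOWED_ACCESS_LEVELS = {"public", "restricted public", "non-public"}


def check_inventory(records):
    """Return a list of problems: duplicate identifiers or unknown access levels."""
    problems = []
    seen = set()
    for record in records:
        identifier = record.get("identifier")
        if identifier in seen:
            problems.append(f"duplicate record for {identifier}")
        seen.add(identifier)
        if record.get("accessLevel") not in ALLOWED_ACCESS_LEVELS:
            problems.append(f"{identifier}: unknown accessLevel")
    return problems


def to_single_json(records):
    """Serialize the whole inventory as one JSON document."""
    return json.dumps(records, indent=2)
```

Running `check_inventory` before serializing is one way to catch a data asset that was entered twice; the validator tools referenced on Project Open Data remain the place to check a file against the real schema.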
-* **Have WordPress?** Use the [Data Kickstart plugin](https://github.com/project-open-data/kickstart) to provide an instant voting interface based on your existing data.json file, allowing customers to vote up or down datasets and to leave comments on specific datasets. -* **Using Socrata?** Ask to have the "Suggest a Dataset" feature turned on, and the opened data portal will allow the public to make dataset suggestions. -* **Using CKAN?** Ask to have the DISQUS plugin installed and integrated into appropriate section templates. +#### Maintain the Enterprise Data Inventory (ongoing after November 1, 2013) +* Continue to expand, enrich, and open the Inventory on an on-going basis. +* Update the Inventory Schedule submitted on November 1, 2013 on a quarterly basis on the www.\[agency].gov/digitalstrategy page. [^13] -### B) Best Practices and Examples +#### Tools and Resources on Project Open Data +* Out-of-the-box Inventory Tool: OMB and GSA have provided a data inventory tool (CKAN) that is customized to be compliant with the Open Data Policy out of the box. Customization includes the ability to generate the compliant Public Data Listing directly from the Inventory, as well as integration of the required common core metadata schema. Agencies may choose to install CKAN on their servers or use the centrally hosted tool. +* Definitions and schema of “common core metadata fields” and selected “extensible metadata fields” +* The JSON schema for each Inventory’s “JSON Snapshot” as well as a schema generator and validator tools to facilitate agency efforts to create metadata +* Additional best practices, case studies, and tools -* The required set of [common core metadata](/schema/) includes fields for a contact name (“person”) and an email address (“mbox”). Listing specific, accurate information in these fields for each dataset ensures that customers can give direct feedback on a dataset to the person who is most likely to be able to act on that feedback. 
-* If you enable customers to leave comments on datasets, ensure someone at your agency monitors these comments and responds in a timely manner. When new visitors see outdated, unanswered comments, they are less likely to provide feedback. -* Consider a feedback mechanism and structure whereby data quality issues identified by data consumers can be submitted and vetted and integrated back into source data systems. -* Consider use cases for how data is likely to be accessed by API, whether by specific queries and subsets, or by entire tables, and consider building API infrastructure that is optimized to meet those needs. +### B. Create and Maintain a Public Data Listing -## 4) Clarify Roles and Responsibilities -*\[Due by 11/9/13\]* +#### Purpose +To improve the discoverability and usability of data assets, all federal agencies must develop a Public Data Listing, which contains a list of all data assets that are or could be made available to the public. This Public Data Listing, posted at www.\[agency].gov/data.json, would typically be a subset of the agency’s Inventory. This will allow the public to view agencies’ open data assets and subsequent progress as additional data assets are published. -### A) Minimum Required for Compliance +Agencies, at their discretion, may choose to include entries for non-public data assets in their Public Data Listings, taking into account guidance in section D. For example, an agency may choose to list data assets with an ‘accessLevel’ of ‘restricted public’ to make the public aware of their existence and the process by which these data may be obtained. 
-Ensure your agency CIO is positioned and authorized to implement the requirements of this Memorandum, as per the Clinger-Cohen Act of 1996, in coordination with the agency's Chief Acquisition Officer, Chief Financial Officer, Chief Technology Officer, Senior Agency Official for Geospatial Information, Senior Agency Official for Privacy (SAOP), Chief Information Security Officer (CISO), Senior Agency Official for Records Management, and Chief Freedom of Information Act (FOIA) Officer. +Agencies’ Public Data Listings will be used to dynamically populate the newly renovated Data.gov, the main website to find data assets generated and held by the U.S. Government. Data.gov allows anyone from the public to find, download, and use government data. The upcoming re-launch of Data.gov (currently in beta at next.data.gov) will automatically aggregate the agency-managed Public Data Listings into one centralized location, using the common core metadata standards and tagging to improve the user ability to find and use government data. -Ensure there is also someone in your agency who is, more specifically, responsible for the promotion of efficient and effective data release practices across the agency. +The objectives of this activity are to: +* List any data assets in the agency's Enterprise Data Inventory that can be made publicly available +* Publish Public Data Listing at www.\[agency].gov/data.json +* Include data assets produced through agency-funded grants, contracts, and cooperative agreements -Ensure your privacy and security officials are positioned with the authority to identify information that may require additional protection and agency activities that may require additional safeguards. 
+#### Minimum Requirements to Create and Maintain a Public Data Listing -Update your Senior Agency Official for Privacy (SAOP) responsibilities to include incorporating a full analysis of privacy, confidentiality, and security issues into every step of the agency information system planning process. +**Publish a Public Data Listing (by November 1, 2013)** +* Include, at a minimum, all data assets where ‘accessLevel’ = ‘public’ [^14] in the Inventory. By design, an agency should be able to filter the Inventory to all entries where ‘accessLevel’ = ‘public’ to easily generate the Public Data Listing. +* Publish the Public Data Listing at www.\[agency].gov/data.json. +* Follow the schema available on Project Open Data. +* Include an **accessURL** [^15] link in the data asset’s metadata for all data assets in the Public Data Listing that are already publicly available [^16] (as opposed to those that *could be* publicly available). -If your Senior Agency Official for Privacy is not positioned within the office of the CIO, designate an official within the office of the CIO to liaise with the privacy office. +**Tools and Resources on Project Open Data** +* Schema Generator +* CKAN +* JSON Validator -### B) Resources +### C. Create a Process to Engage With Customers to Help Facilitate and Prioritize Data Release -* [The Clinger-Cohen Act of 1996](http://govinfo.library.unt.edu/npr/library/misc/itref.html) -* [OMB Memorandum M-05-08](http://www.whitehouse.gov/sites/default/files/omb/assets/omb/memoranda/fy2005/m05-08.pdf) +#### Purpose +Identifying and engaging with key data customers to help determine the value of federal data assets can help agencies prioritize those of highest value for quickest release. Data customers include public as well as government stakeholders [^17]. All Federal Agencies will be required to solicit public input and consider how to incorporate customer feedback into their data management practices.
Agencies may develop criteria at their discretion for prioritizing the opening of data assets, accounting for a range of factors, such as the quantity and quality of user demand, internal management priorities, and agency mission relevance. As customer feedback mechanisms and internal prioritization criteria will likely evolve over time and vary across agencies, agencies should share successful innovations in incorporating customer feedback through interagency working groups and Project Open Data to disseminate best practices. Agencies should regularly review the evolving customer feedback and public engagement strategy. - -## 5) Update IRM Strategic Plan +The objectives of this activity are to: +* Create a process to engage with customers through www.\[agency].gov/data pages and other appropriate channels +* Make data available in multiple formats according to customer needs +* Help agencies prioritize data release through the Public Data Listing and management efforts to improve data discoverability and usability -### A) Minimum Required for Compliance +#### Minimum Requirements to Create a Process to Engage With Customers to Help Facilitate and Prioritize Data Release -Review and update your existing IRM Strategic Plan to describe how your agency has institutionalized and operationalized the requirements of this Memorandum. In your IRM Strategic Plans under the *Managing Information as an Asset* section, you should describe your approach to managing information as an asset, including how your agency will promote interoperability and openness throughout the information life cycle and properly safeguard information that may require additional protection. Agencies should specifically address how information collection and creation efforts, information system design, and data management and release practices will support interoperability and openness. This may involve describing updates to policies and processes, and offering employee trainings. 
+**Establish Customer Feedback Mechanism (by November 1, 2013)** +* Through the common core metadata requirements, agencies are already required to include a point of contact within each data asset’s metadata listed. +* Agencies should create a process to engage with customers on the www.\[agency].gov/data page or other appropriate mechanism. If the feedback tool is in an external location, it must be linked to the www.\[agency].gov/data page. +* Agencies should consider utilizing tools available on Project Open Data, such as the “Kickstart” + plug-in, to organize feedback around individual data assets. +**Describe Customer Feedback Processes (by November 1, 2013)** +* Update www.\[agency].gov/digitalstrategy [^18] page to describe the agency’s process to engage with customers. +* Moving forward, agencies should consider updating their customer feedback strategy and reflecting changes on www.\[agency].gov/digitalstrategy beyond November 1, 2013. -Additionally, you should include information on: -* Use of open licenses -* Use of open standards -* Collecting data in a machine-readable, standards-compliant way -* Publishing data in open formats -* Privacy analysis, with a presumption of openness +**Tools and Resources on Project Open Data** +* Data “Kickstart” Plug-in +* [GSA’s Innovation Center](http://gsablogs.gsa.gov/dsic/) API Resources -### B) Resources +### D. Document if Data Cannot be Released -* [44 USC 3506 (b)(2)](http://www.gpo.gov/fdsys/granule/USCODE-2011-title44/USCODE-2011-title44-chap35-subchapI-sec3506/content-detail.html) -* [OMB Circular A-11](http://www.whitehouse.gov/omb/circulars_a11_current_year_a11_toc) -* [OMB FY 13 PortfolioStat Guidance](http://www.whitehouse.gov/blog/2013/03/27/portfoliostat-20-driving-better-management-and-efficiency-federal-it) +#### Purpose +The Open Data Policy requires agencies to strengthen and develop policies and processes to ensure that only the appropriate data are made available publicly. 
Agencies should work with their Senior Agency Official for Privacy and other relevant officials to ensure a complete analysis of issues that could preclude public disclosure of information collected or created. If an agency determines that data should not be made publicly available because of law, regulation, or policy, or because the data are subject to privacy, confidentiality, security, trade secret, contractual, or other valid restrictions to release, it must document the determination in consultation with its Office of General Counsel or equivalent. The agency should designate one of three “access levels” for each data asset listed in the inventory: public, restricted public, or non-public. The descriptions of these categories can be found below and on Project Open Data. -
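
The access level assigned to each data asset determines whether its record flows into the Public Data Listing. The sketch below assumes the Inventory has been loaded as a list of dictionaries, each carrying an `accessLevel` value; the function name, the `include_restricted` flag, and the sample records are hypothetical and only illustrate the filtering step this guidance describes.

```python
def public_data_listing(inventory, include_restricted=False):
    """Select the Inventory records that belong in the Public Data Listing.

    Records with accessLevel "public" are always included. Records marked
    "restricted public" are included only if the agency opts in, reflecting
    the discretion this guidance gives agencies. "non-public" records are
    never selected.
    """
    levels = {"public"}
    if include_restricted:
        levels.add("restricted public")
    return [record for record in inventory if record.get("accessLevel") in levels]


# Hypothetical sample records; titles are illustrative only.
sample_inventory = [
    {"title": "Example survey results", "accessLevel": "public"},
    {"title": "Example research microdata", "accessLevel": "restricted public"},
    {"title": "Example case files", "accessLevel": "non-public"},
]

listing = public_data_listing(sample_inventory)
```

Keeping the filter as a single function over the Inventory means the Public Data Listing is always derived from the Inventory rather than maintained by hand, which matches the "by design" filtering described in the minimum requirements.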