Metadata Standard for the Montana GIS Data List
Introduction
This is the standard for metadata documents submitted for publication on the Montana GIS Data List. The GIS Data List is a central location for the discovery of Geographic Information System data about Montana.
Metadata is data about data. In the context of the GIS Data List, a metadata document is an XML file that contains descriptive information about a GIS data set. The XML file must conform to a standard that allows the Data List software to load it into a searchable index, provide basic descriptive information about the data set, and provide information that allows users to retrieve the data set.
The State Library has a Metadata Writing Guide for ArcGIS version 10 with instructions on how to create a metadata file. If you follow the instructions, your metadata will comply with this standard.
Part 1: FGDC Compliance
Metadata documents submitted to the Data List must be valid XML documents that conform to the structure set out in the Federal Geographic Data Committee's (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM), version 2.
There is extensive information about the CSDGM at http://www.fgdc.gov/metadata. The CSDGM Workbook is available at http://ftp.geoinfo.msl.mt.gov/documents/Metadata_Tools/FGDCworkbook.pdf.
The Data List standard is a very relaxed version of the CSDGM. Data List publishers may ignore many of the metadata elements that the CSDGM requires. For the purposes of the GIS Data List, the CSDGM is a list of elements that are available to fill out. Some metadata software tools enforce the CSDGM element requirements to varying degrees; Data List publishers may use their own discretion about whether to follow the rules these tools impose.
Part 2: Required FGDC Elements
The following metadata elements are used by the Data List search functions or displayed by the Data List results page. Documents without proper entries in these fields will not be accepted by the Data List. In this and subsequent lists, the metadata element name is followed by the XML path to the element in the metadata file and a description of the required content.
- Title (idinfo/citation/citeinfo/title). Metadata must contain a title for the data set. The title should describe the region covered by the data set (such as "Montana", "Bozeman", or "Smith Watershed") and the subject (such as "Highways" or "Water Wells"). If the data set is not being actively maintained to be up-to-date, the title should include a date, for example "2006 Color Orthophoto of Helena, Montana" or "Yellowstone County School Districts, 1980". When users search for records in the Data List, records that have the search words in the title are promoted to the top of the search results.
- Abstract (idinfo/descript/abstract) is a concise summary of important information about the data set. In about three paragraphs, say everything you would want someone to know about the data, assuming this is all they were going to read.
- Originator (idinfo/citation/citeinfo/origin) is the agency or person primarily responsible for creating the data set. If, for example, this data was obtained from the Census Bureau and you made several corrections to the data, you might still list the Census Bureau as the originator. You may enter multiple originators if you feel this is necessary.
- Publisher (idinfo/citation/citeinfo/pubinfo/publish) is the organization that is making the data available. This should be your organization, unless you are putting this record in the Data List to direct people to some other source.
- Time Period of Content (idinfo/timeperd/timeinfo) is a date, dates, or range of dates when the data set was valid. Dates must be entered in the metadata in YYYYMMDD format, such as 20100330. If you do not know the exact date, you may just fill out a year or year and month, as in 2011 or 199708. The FGDC standard allows you to fill this section with a publication date, but this is strongly discouraged, unless you have no information about when the data was really collected.
- Bounding Coordinates (idinfo/spdom/bounding) are the latitude and longitude coordinates of a rectangle that encloses the region covered by the data set. The Data List does not currently use these coordinates, but Data List records may someday be forwarded to nationwide data catalogs that need them to enable map-based data searches.
- Theme Keywords (idinfo/keywords/theme). There must be a theme keyword section that contains a themekt element (Theme Keyword Thesaurus) whose value is "ISO 19115 Topic Category" and one or more themekey elements (Theme Keywords) whose values are chosen from the list in Appendix A. The Data List allows users to restrict their searches to records containing one of the ISO keywords. You may also include a second theme keywords section without a themekt element or with a different themekt value. When users search for records in the Data List, records that have the search words in the keyword sections are promoted above records that have the words in other parts of the metadata, but below records that have the search words in their titles.
- Distributor (distinfo/distrib/cntinfo) contains information on whom to contact about getting a copy of the data. This must contain a contact person or contact organization name (cntper and/or cntorg, under cntperp or cntorgp) and a contact phone number or email address (cntvoice and/or cntemail).
- Resource Description (distinfo/resdesc) must be filled out with a choice from a pre-set list of values. Its value is shown in the Content Type section of the search results and it helps control how the Data List creates a link to the data. The Data List's search page allows users to search for values in this field. See Appendix B for the rules for this element.
- Metadata Date (metainfo/metd) is the latest revision date of the metadata. Dates must be entered in the metadata in YYYYMMDD format, such as 20080131. If you do not know the exact date, you may just fill out the year or year and month, as in 1995 or 199708.
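The requirements above can be checked before a document is submitted. The following Python sketch is only an illustration and is not the Data List's own validation code: it assumes the file's root element is metadata, treats the paths in the list above as relative to that root, and uses the hypothetical helper names missing_elements and is_valid_date. The date check looks only at the digit pattern (YYYY, YYYYMM, or YYYYMMDD), not at whether the date exists on the calendar.

```python
import re
import xml.etree.ElementTree as ET

# Paths are relative to the <metadata> root, as given in the list above.
REQUIRED_PATHS = [
    "idinfo/citation/citeinfo/title",
    "idinfo/descript/abstract",
    "idinfo/citation/citeinfo/origin",
    "idinfo/citation/citeinfo/pubinfo/publish",
    "idinfo/timeperd/timeinfo",
    "idinfo/spdom/bounding",
    "idinfo/keywords/theme",
    "distinfo/distrib/cntinfo",
    "distinfo/resdesc",
    "metainfo/metd",
]

# YYYY, YYYYMM, or YYYYMMDD (digits only; calendar validity is not checked).
DATE_PATTERN = re.compile(r"^\d{4}(\d{2}(\d{2})?)?$")


def missing_elements(root):
    """Return the required paths that are absent, or present but empty."""
    missing = []
    for path in REQUIRED_PATHS:
        element = root.find(path)
        if element is None:
            missing.append(path)
        elif len(element) == 0 and not (element.text or "").strip():
            # A leaf element with no text counts as not filled out.
            missing.append(path)
    return missing


def is_valid_date(value):
    """True if value looks like YYYY, YYYYMM, or YYYYMMDD."""
    return bool(DATE_PATTERN.match(value.strip()))
```

A file on disk could be checked with missing_elements(ET.parse("my_metadata.xml").getroot()), where the file name is a placeholder. An empty list means every required path was found, not that the content of each element is acceptable.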
Part 3: FGDC Elements for Optional Data List Functions
The following elements will be used by the Data List if they are present, but they are not required.
- Online Linkage (idinfo/citation/citeinfo/onlink) is a link to a resource that allows immediate access to the data. See Appendix B for the rules for this element.
- Browse Graphic (idinfo/browse). If you have an online picture or map that features the data, put a link to it in this section. The Data List will show users a thumbnail of the picture, and clicking the thumbnail will bring up the full-size image.
- Global Unique ID (Esri/PublishedDocID). The Data List will use this ID as your record's unique identifier in the database, and the links to your metadata in the Data List will use this ID. When you upload revisions to your metadata, the Data List will check that the ID in the document you upload matches the ID of the document you are replacing before it does the replacement. If you do not include an ID in your metadata, the Data List will assign one. If you revise one of your metadata records by uploading a document without an ID, the Data List will automatically replace your old metadata with the file you upload, and there is no check to see whether you might have loaded the wrong document by mistake. Instructions for creating this ID are in Appendix C.
- Larger Work Citation (idinfo/citation/citeinfo/lworkcit). If this section contains a title and an onlink that points to another record in the Data List, the Data List will create a "Related Records" tab for this record and for the "parent" record that is referred to. The Related Records tab will list the titles of all the records that refer to the parent record, and clicking on a title will take you to that record. The format of the onlink is http://apps.msl.mt.gov/Geographic_Information/Data/DataList/datalist_Details.aspx?did={ab6754c0-95ad-47a2-8071-04d51ad41892}, where the characters between the curly braces are the Global Unique ID of the parent record.
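The onlink format for the Larger Work Citation can be assembled from the parent record's Global Unique ID. The Python sketch below is a minimal illustration; the function name parent_record_onlink is hypothetical, and the base URL is copied from the format shown above. It keeps the ID's letter case exactly as supplied, because this standard does not say whether the Data List treats the ID as case-sensitive.

```python
import uuid

# Base URL taken from the onlink format shown above.
DETAILS_URL = (
    "http://apps.msl.mt.gov/Geographic_Information/Data/DataList/"
    "datalist_Details.aspx"
)


def parent_record_onlink(parent_guid):
    """Build the Larger Work Citation onlink for a parent record's ID."""
    # Accept the ID with or without its curly braces.
    bare = parent_guid.strip().strip("{}")
    # uuid.UUID raises ValueError if the text is not a well-formed GUID.
    uuid.UUID(bare)
    # Doubled braces in an f-string produce literal curly braces.
    return f"{DETAILS_URL}?did={{{bare}}}"
```

For example, parent_record_onlink("{ab6754c0-95ad-47a2-8071-04d51ad41892}") reproduces the sample link given above.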
Appendix A
Theme Keywords
Metadata documents for the Montana GIS Data List must have a Theme Keyword section that contains a Theme Keyword Thesaurus whose value is "ISO 19115 Topic Category" and at least one Theme Keyword whose value is taken from the following list. The keywords are presented in the format in which they would appear in your XML file.
<theme>
<themekt>ISO 19115 Topic Category</themekt>
<themekey>farming</themekey>
<themekey>biota</themekey>
<themekey>boundaries</themekey>
<themekey>climatologyMeteorologyAtmosphere</themekey>
<themekey>economy</themekey>
<themekey>elevation</themekey>
<themekey>environment</themekey>
<themekey>geoscientificInformation</themekey>
<themekey>health</themekey>
<themekey>imageryBaseMapsEarthCover</themekey>
<themekey>intelligenceMilitary</themekey>
<themekey>inlandWaters</themekey>
<themekey>location</themekey>
<themekey>oceans</themekey>
<themekey>planningCadastre</themekey>
<themekey>society</themekey>
<themekey>structure</themekey>
<themekey>transportation</themekey>
<themekey>utilitiesCommunication</themekey>
</theme>
Here is a sample of what part of a metadata document with two theme keyword sections would look like. ISO Keywords and other keywords should not be mixed into the same section.
<keywords>
<theme>
<themekt>ISO 19115 Topic Category</themekt>
<themekey>boundaries</themekey>
<themekey>health</themekey>
</theme>
<theme>
<themekey>health department</themekey>
<themekey>service areas</themekey>
<themekey>health clinics</themekey>
</theme>
<place>
<placekey>Montana</placekey>
</place>
</keywords>
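The keyword rules in this appendix can be checked with a short script. The Python sketch below is an illustration under stated assumptions, not the Data List's own loader: it assumes the keywords section sits at idinfo/keywords under the root element, and the function name iso_keyword_problems is hypothetical. It reports a problem when no theme section carries the ISO thesaurus name, when such a section has no keywords, or when one of its keywords is not in the list above.

```python
import xml.etree.ElementTree as ET

ISO_THESAURUS = "ISO 19115 Topic Category"

# The keywords listed earlier in this appendix.
ISO_TOPICS = {
    "farming", "biota", "boundaries", "climatologyMeteorologyAtmosphere",
    "economy", "elevation", "environment", "geoscientificInformation",
    "health", "imageryBaseMapsEarthCover", "intelligenceMilitary",
    "inlandWaters", "location", "oceans", "planningCadastre", "society",
    "structure", "transportation", "utilitiesCommunication",
}


def iso_keyword_problems(root):
    """Return a list of problems found in the ISO theme keyword sections."""
    iso_sections = [
        theme
        for theme in root.findall("idinfo/keywords/theme")
        if (theme.findtext("themekt") or "").strip() == ISO_THESAURUS
    ]
    if not iso_sections:
        return ["no theme section with themekt 'ISO 19115 Topic Category'"]

    problems = []
    for theme in iso_sections:
        keys = [(key.text or "").strip() for key in theme.findall("themekey")]
        if not keys:
            problems.append("ISO theme section has no themekey elements")
        # Other keywords belong in a separate theme section.
        problems.extend(
            f"not an ISO topic: {key}" for key in keys if key not in ISO_TOPICS
        )
    return problems
```

Theme sections with no themekt, or with a different themekt value, are ignored by this check, which matches the rule that other keywords are allowed as long as they are kept in their own section.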
Appendix B
Resource Description and Online Linkage
The Data List uses the Resource Description as one of the search options. It uses the Resource Description and the Online Linkage to decide how to label the "get data" button on your record's Details page. Your record may have more than one of each of these elements if it describes more than one resource.
Resource Description may have the following values.
- Downloadable Data (Data files available for download)
- Offline Data (Data files that you have to order)
- Live Data and Maps (Web services that may be added to mapping applications)
- Applications (Web sites that allow visualization of, or other access to, GIS data)
Online Linkage is a link to a web service, a file containing GIS data, or a web site that has more information about the data or how to obtain it.
These are the rules for how the Data List acts on these elements:
- If the first Resource Description is Downloadable Data, the Data List will create a "Download Data" button that links to the first Online Linkage. If there is no Online Linkage, there will be no button.
- If the first Resource Description is Live Data and Maps or Applications, the Data List will create a "View" button that links to the first Online Linkage. If there is no Online Linkage, there will be no button.
- If the first Resource Description is Offline Data, the Data List will create a "Request Data" button that opens an email message to the email address of the distributor contact, with the title of the metadata record in the subject.
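The rules above amount to a small decision table. The Python sketch below restates them for illustration only; it is not the Data List's actual code, and the function name get_data_button and its parameters are hypothetical. Two behaviors are assumptions that this standard does not spell out: the sketch shows no button when an Offline Data record has no distributor email address, and no button when the first Resource Description is not one of the four listed values.

```python
from urllib.parse import quote


def get_data_button(resource_descriptions, online_linkages,
                    distributor_email, title):
    """Return (label, target) for the button, or None if there is no button."""
    if not resource_descriptions:
        return None
    # Only the first Resource Description and first Online Linkage are used.
    first = resource_descriptions[0].strip()
    link = online_linkages[0] if online_linkages else None

    if first == "Downloadable Data":
        return ("Download Data", link) if link else None
    if first in ("Live Data and Maps", "Applications"):
        return ("View", link) if link else None
    if first == "Offline Data":
        if not distributor_email:
            return None  # assumption: no address means no button
        # The record title goes in the subject of the email message.
        return ("Request Data",
                f"mailto:{distributor_email}?subject={quote(title)}")
    return None  # assumption: unrecognized values produce no button
```

Because only the first value of each element is examined, the order of the Resource Description and Online Linkage elements matters when a record describes more than one resource.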
Appendix C
Global Unique IDs
Each metadata document will have a global unique ID (GUID) associated with it. The Data List uses the ID as a part of all links to the document. If you tell the Data List you are uploading a new document, but it has the same GUID as an existing document, it will tell you that the ID is already in use. If you tell the Data List you are uploading a new file to replace an existing document, it will reject the new file if the ID does not match the existing document. If you upload a new document without an ID, the Data List will assign one. If you upload a replacement document without an ID, it will replace the existing document without question and use the previously assigned or uploaded ID for the replacement document.
The GUID section may be inserted in the metadata with ArcGIS 10 as described in the Metadata Writing Guide, or it may be inserted into the metadata with a plain-text editor immediately before the last line of the file. The last line of the file should be the closing tag "</metadata>". An example of a GUID section and the last line of a metadata file is shown below.
<Esri>
<PublishedDocID>
{13B2A163-4EE2-4204-B553-6309DD3434C2}
</PublishedDocID>
</Esri>
</metadata>
The GUID is the number between the curly braces. There are many free GUID generators on the Web that you can use to create unique GUIDs for your metadata files, such as http://www.guidgenerator.com/ and http://www.famkruithof.net/uuid/uuidgen.
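A GUID can also be generated locally instead of with a web site. The Python sketch below is one possible approach, not a required procedure: it uses the standard uuid module to create a random GUID and inserts a GUID section immediately before the closing metadata tag, following the layout of the example above. The function names new_guid_section and add_guid are hypothetical. The sketch leaves the text unchanged if it already contains a PublishedDocID element, and it assumes the file does not already have an Esri section; if it does, add the PublishedDocID inside that section by hand.

```python
import uuid


def new_guid_section():
    """Return a GUID section laid out like the example above."""
    # uuid4 produces a random GUID; upper case matches the example.
    guid = str(uuid.uuid4()).upper()
    return f"<Esri>\n<PublishedDocID>\n{{{guid}}}\n</PublishedDocID>\n</Esri>\n"


def add_guid(metadata_text):
    """Insert a GUID section immediately before the closing </metadata> tag."""
    if "<PublishedDocID>" in metadata_text:
        return metadata_text  # already has an ID; leave it alone
    # Split at the last occurrence of the closing tag.
    head, tag, tail = metadata_text.rpartition("</metadata>")
    if not tag:
        raise ValueError("no closing </metadata> tag found")
    return head + new_guid_section() + tag + tail
```

Keep a copy of the generated GUID with your records, since replacement uploads are checked against it.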