10.7 Adding new nodes

The Add button in the main page allows you to add new harvesting nodes. It will open the form shown in Figure 10.3, “Adding a new harvesting node”. When creating a new node, you have to choose the harvesting protocol supported by the remote server. The supported protocols are:

  1. GeoNetwork 2.1 remote node: This is the standard and most powerful harvesting protocol used in GeoNetwork. It is able to log in to the remote node, perform a standard search using the common query fields and import all matching metadata. Furthermore, the protocol will try to keep both the remote privileges and the categories of the harvested metadata if they exist locally. Please note that the harvesting protocol was improved in GeoNetwork 2.1, so it is not possible to use this protocol to harvest from version 2.0 or below.

  2. Web DAV server: This harvesting type uses the web DAV (Distributed Authoring and Versioning) protocol to harvest metadata from a DAV server. It can be useful to users who want to publish their metadata through a web server that offers a DAV interface. The protocol makes it possible to retrieve the contents of a web page (a list of files) together with their change dates.

  3. Catalogue Services for the Web 2.0: Catalogue Services for the Web (CSW) is a search interface for catalogues developed by the Open Geospatial Consortium. GeoNetwork implements version 2.0 of this protocol.

  4. GeoNetwork v2.0 remote node: GeoNetwork 2.1 introduced a new, more powerful harvesting engine which is not compatible with catalogues based on GeoNetwork version 2.0. Old 2.0 servers can still harvest from 2.1 servers, but harvesting metadata from a v2.0 server requires this harvesting type. This harvesting type is deprecated.

  5. Z3950 Remote search: Not implemented. This is a placeholder.

  6. OAI Protocol for Metadata Harvesting 2.0: This harvesting protocol is widely used among libraries. GeoNetwork implements version 2.0 of the protocol.

The drop down list shows all available protocols. Pressing the Add button takes you to an edit page whose content depends on the chosen protocol. The Back button returns to the main page.

Figure 10.3. Adding a new harvesting node

Adding a GeoNetwork node

This type of harvesting allows you to connect to a GeoNetwork node, perform a simple search as in the main page and retrieve all matched metadata. The search is useful because it allows you to focus only on metadata of interest. Once you add a node of this type, you will get a page like the one shown in Figure 10.4, “Adding a GeoNetwork node”. The meaning of the options is the following:

Figure 10.4. Adding a GeoNetwork node

Site: Here you put information about the GeoNetwork node you want to harvest from (host, port and servlet). If you want to search protected metadata you have to specify an account. The name parameter is just a short description that will be shown on the main page beside each node.

Search: In this section you can specify search parameters; they are the same as those on the main page. Before doing that, it is important to remember that GeoNetwork's harvesting can be hierarchical, so a remote node can contain both its own metadata and metadata harvested from other nodes and sources. At the beginning, the Source drop down is empty and you have to use the Retrieve sources button to fill it. The purpose of this button is to query the remote GeoNetwork node about all the sources it is currently harvesting from. Once the drop down is filled, you can choose a source name to constrain the search to that source only. If you leave the drop down blank, the search will cover all metadata (harvested and not). You can add several search criteria for each site through the Add button: several searches will be performed and their results merged. Each search box can be removed by pressing the small button on the left of the site's name. If no search criterion is added, a global unconstrained search will be performed.

Options: This is just a container for general options.

  Every: This is the harvesting period. The smallest value is 1 minute while the greatest value is 100 days.

  One run only: If this option is checked, the harvesting will do only one run, after which it will become inactive.

Privileges: Here you decide how to map the privileges of the remote groups. You can assign a copy policy to each group. The Intranet group is not considered because it does not make sense to copy its privileges. The All group has different policies from all the others:

  1. Copy: Privileges are copied.

  2. Copy to Intranet: Privileges are copied but to the Intranet group. This way public metadata can be made protected.

  3. Don’t copy: Privileges are not copied and harvested metadata will not be publicly visible.

For all other groups the policies are these:

  1. Copy: Privileges are copied only if there is a local group with the same (not localised) name as the remote group.

  2. Create and copy: Privileges are copied. If there is no local group with the same name as the remote group then it is created.

  3. Don’t copy: Privileges are not copied.
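The two sets of copy policies described above can be summarised as a small decision function. This is only an illustrative sketch, not GeoNetwork's implementation: the function name `resolve_target_group`, the policy strings and the lower-case group names are all hypothetical.

```python
from typing import Optional, Set


def resolve_target_group(remote_group: str, policy: str,
                         local_groups: Set[str]) -> Optional[str]:
    """Return the local group that receives the remote privileges, or None.

    All names and policy strings are hypothetical and chosen for illustration.
    """
    if remote_group == "intranet":
        # The Intranet group is never considered.
        return None

    if remote_group == "all":
        if policy == "copy":
            return "all"
        if policy == "copyToIntranet":
            # Public metadata becomes protected.
            return "intranet"
        if policy == "dontCopy":
            return None
        raise ValueError("unknown policy for the All group: " + policy)

    if policy == "copy":
        # Copied only if a local group with the same name exists.
        return remote_group if remote_group in local_groups else None
    if policy == "createAndCopy":
        # The local group is created when it is missing.
        local_groups.add(remote_group)
        return remote_group
    if policy == "dontCopy":
        return None
    raise ValueError("unknown policy: " + policy)
```

For example, with a single local group named `sample`, a remote group `editors` receives nothing under the plain copy policy, while the create and copy policy first adds `editors` to the local groups.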

At the bottom of the page there are the following buttons:

Back: Simply returns to the main harvesting page.

Save: Saves the current node information and returns to the main harvesting page. When creating a new node, the node is actually created only when you press this button.

Adding a Web DAV node

In this type of harvesting, metadata are retrieved from a remote web page. The available options are shown in Figure 10.5, “Adding a web DAV node” and have the following meaning:

Figure 10.5. Adding a web DAV node

Site: This is the connection information. The available options are:

  Name: This is a short description of the node. It will be shown on the harvesting main page.

  URL: The remote URL from which metadata will be harvested. Each file found whose name ends with .xml is taken to be a metadata record and will be retrieved, converted into XML and imported.

  Icon: Just an icon to assign to harvested metadata. The icon will be used when showing search results.

  Use account: Account credentials for basic HTTP authentication against the remote URL.

Options: General harvesting options:

  Every: This is the harvesting period. The smallest value is 1 minute while the greatest value is 100 days.

  One run only: If this option is checked, the harvesting will do only one run, after which it will become inactive.

  Validate: If checked, the metadata will be validated during import. If the validation does not pass, the metadata will be skipped.

  Recurse: When the harvesting engine finds folders, it will recursively descend into them.

Privileges: Here it is possible to assign privileges to imported metadata. The Groups area lists all available groups in GeoNetwork. Once one or more groups have been selected, they can be added through the Add button (each group can be added only once). For each added group, a row of privileges is created at the bottom of the list to allow privilege selection. To remove a row, simply press the associated Remove button on its right.

Categories: Here you can assign local categories to harvested metadata.
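The way the URL and Recurse options select files can be sketched as follows. The sketch works on an in-memory folder listing rather than a real DAV server, and the function name `select_metadata_files` and the nested-dictionary layout are assumptions made for illustration only.

```python
from typing import Any, Dict, List

# Hypothetical listing format: a folder maps each name either to a change
# date string (a file) or to another dictionary (a sub-folder).
Folder = Dict[str, Any]


def select_metadata_files(folder: Folder, recurse: bool,
                          prefix: str = "") -> List[str]:
    """Return the paths of all files ending with .xml.

    Sub-folders are visited only when `recurse` is True.
    """
    selected: List[str] = []
    for name, entry in sorted(folder.items()):
        path = prefix + name
        if isinstance(entry, dict):
            if recurse:
                selected.extend(
                    select_metadata_files(entry, recurse, path + "/"))
        elif name.endswith(".xml"):
            selected.append(path)
    return selected


listing = {
    "a.xml": "2008-01-10",
    "readme.txt": "2008-01-11",
    "sub": {"b.xml": "2008-02-01", "c.pdf": "2008-02-02"},
}
print(select_metadata_files(listing, recurse=False))
print(select_metadata_files(listing, recurse=True))
```

With Recurse unchecked only the top-level `a.xml` is selected. With Recurse checked, `sub/b.xml` is selected as well.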

At the bottom of the page there are the following buttons:

Back: Goes back to the main harvesting page. The harvesting node is not added.

Save: Saves the node's data, creating a new harvesting node, and then goes back to the main harvesting page.

Adding a CSW node

This type of harvesting is capable of connecting to a remote CSW server and retrieving all matching metadata. Please note that, in order to be harvested, metadata must be in one of the schema formats handled by GeoNetwork. Figure 10.6, “Adding a Catalogue Services for the Web harvesting node” shows the available options, whose meaning is the following:

Figure 10.6. Adding a Catalogue Services for the Web harvesting node

Site: Here you have to specify the connection parameters, which are similar to those of web DAV harvesting. In this case the URL points to the capabilities document of the CSW server. This document is used to discover the location of the services to call in order to query and retrieve metadata.

Search: Using the Add button, you can add several search criteria. You can query only the fields recognised by the CSW protocol.

Options: General harvesting options:

  Every: This is the harvesting period. The smallest value is 1 minute while the greatest value is 100 days.

  One run only: If this option is checked, the harvesting will do only one run, after which it will become inactive.

Privileges: Please see web DAV harvesting.

Categories: Please see web DAV harvesting.
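The role of the capabilities document mentioned under Site can be illustrated with a short sketch that reads the service locations out of such a document. The sample document below is hand-written and heavily reduced. The function name `operation_urls`, the example.org address and the use of the CSW 2.0.2 and OWS namespaces are assumptions, and the code does not show how GeoNetwork itself processes the document.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for the capabilities document.
NS = {"ows": "http://www.opengis.net/ows"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hand-written, reduced sample document (hypothetical server address).
SAMPLE = """<csw:Capabilities
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:ows="http://www.opengis.net/ows"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <ows:OperationsMetadata>
    <ows:Operation name="GetRecords">
      <ows:DCP><ows:HTTP>
        <ows:Post xlink:href="http://example.org/csw"/>
      </ows:HTTP></ows:DCP>
    </ows:Operation>
  </ows:OperationsMetadata>
</csw:Capabilities>"""


def operation_urls(capabilities_xml: str) -> dict:
    """Map each advertised operation name to its first endpoint URL."""
    root = ET.fromstring(capabilities_xml)
    urls = {}
    for operation in root.iterfind(".//ows:Operation", NS):
        for method in operation.iterfind("ows:DCP/ows:HTTP/*", NS):
            href = method.get(XLINK_HREF)
            if href:
                # Keep only the first endpoint found for each operation.
                urls.setdefault(operation.get("name"), href)
    return urls


print(operation_urls(SAMPLE))
```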

At the bottom of the page there are the following buttons:

Back: Goes back to the main harvesting page. The harvesting node is not added.

Save: Saves the node's data, creating a new harvesting node, and then goes back to the main harvesting page.

Adding an OAI-PMH node

An OAI-PMH server implements a harvesting protocol that GeoNetwork, acting as a client, can use to harvest metadata. If you request the oai_dc output format, GeoNetwork will convert it into its Dublin Core format. Other formats can be harvested only if GeoNetwork supports them and is able to autodetect the schema from the metadata. Figure 10.7, “Adding an OAI-PMH harvesting node” shows all available options, which are:

Figure 10.7. Adding an OAI-PMH harvesting node

Site: All options are the same as for web DAV harvesting. The only difference is that the URL parameter here points to an OAI-PMH server. This is the entry point that GeoNetwork will use to issue all PMH commands.

Search: This part allows you to restrict the harvesting to specific metadata subsets. You can specify several searches, each with different search criteria: GeoNetwork will execute them sequentially and merge the results so that the same metadata is not harvested twice. In each search, you can specify the following parameters:

  From: You can provide a start date here. All metadata whose last change date is equal to or greater than this date will be harvested. You cannot edit this field directly; you have to use the icon to pop up a calendar and choose the date. This field is optional, so if you don't provide it the start date constraint is dropped. Use the icon to clear the field.

  Until: Works exactly like the From parameter but adds an end constraint to the last change date. The until date is included in the date range: the check is less than or equal to.

  Set: An OAI-PMH server classifies its metadata into hierarchical sets. You can request only the metadata that belong to one set (and its subsets). This narrows the search result. Initially the drop down shows only a blank option that indicates no set. After specifying the connection URL, you can press the Retrieve Info button, whose purpose is to connect to the remote node, retrieve all supported sets and prefixes and fill the search drop downs. After you have pressed this button, you can select a remote set from the drop down.

  Prefix: Here prefix means metadata format. The oai_dc prefix is mandatory for any OAI-PMH compliant server, so this entry is always present in the prefix drop down. To have this drop down filled with all prefixes supported by the remote server, you have to enter a valid URL and press the Retrieve Info button.

You can use the Add button to add one more search to the list. A search can be removed by clicking the icon on its left.

Options: Most options are common to web DAV harvesting. The validate option, when checked, will validate each harvested metadata record against GeoNetwork's schemas. Only valid metadata will be harvested; invalid ones will be skipped.

Privileges: Please see web DAV harvesting.

Categories: Please see web DAV harvesting.
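The From, Until, Set and Prefix search parameters described above correspond to the `from`, `until`, `set` and `metadataPrefix` arguments of an OAI-PMH `ListRecords` request. The following sketch shows how one search could be turned into a request URL and how the results of several searches could be merged without duplicates. The function names and the example.org address are hypothetical, and the code is not taken from GeoNetwork.

```python
from typing import Iterable, List, Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, prefix: str = "oai_dc",
                     from_date: Optional[str] = None,
                     until_date: Optional[str] = None,
                     set_spec: Optional[str] = None) -> str:
    """Build a ListRecords request URL for one search."""
    params = {"verb": "ListRecords", "metadataPrefix": prefix}
    # Optional constraints are simply left out when not provided.
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def merge_results(*searches: Iterable[str]) -> List[str]:
    """Merge record identifiers from several searches, dropping repeats."""
    seen = set()
    merged: List[str] = []
    for identifiers in searches:
        for identifier in identifiers:
            if identifier not in seen:
                seen.add(identifier)
                merged.append(identifier)
    return merged


print(list_records_url("http://example.org/oai",
                       from_date="2008-01-01", set_spec="maps"))
print(merge_results(["id1", "id2"], ["id2", "id3"]))
```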

At the bottom of the page there are the following buttons:

Back: Goes back to the main harvesting page. The harvesting node is not added.

Save: Saves the node's data, creating a new harvesting node, and then goes back to the main harvesting page.

Please note that when you edit a previously created node, the set and prefix drop down lists will not be fully populated. They will contain only the previously selected entries, plus the default ones if they were not selected. Furthermore, the set name will not be localised; the internal code string will be displayed instead. You have to press the Retrieve Info button again to connect to the remote server and retrieve the localised names and all set and prefix information.

Adding an OGC Service (e.g. WMS, WFS, WCS)

An OGC service implements a GetCapabilities operation that GeoNetwork, acting as a client, can use to produce metadata. The GetCapabilities document provides information about the service and the layers/feature types/coverages served. GeoNetwork will convert it into ISO19139/119 format. Figure 10.8, “Adding an OGC service harvesting node” shows all available options, which are:

Figure 10.8. Adding an OGC service harvesting node

Site: Name is the name of the catalogue and will be one of the search criteria. The type of OGC service indicates which kind of service the harvester has to query. Supported types are WMS (1.0.0 and 1.1.1), WFS (1.0.0 and 1.1.0), WCS (1.0.0) and WPS (0.4.0 and 1.0.0). The service URL is the URL of the service to contact (without parameters like "REQUEST=GetCapabilities", "VERSION=", ...). It has to be a valid URL like http://your.preferred.ogcservice/type_wms. The metadata language has to be specified; it defines the language of the metadata and should be the language used by the web service administrator. The ISO topic category is used to populate the metadata. It is recommended to choose one, as the topic is mandatory in the ISO standard if the hierarchical level is "datasets".
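Because the service URL is entered without request parameters, the harvester has to add them itself. The sketch below shows one way to build a GetCapabilities request from the service URL, type and version. The function name `get_capabilities_url` and the `SUPPORTED` table are hypothetical, and the code only uses the common SERVICE, REQUEST and VERSION parameters; older protocol versions may expect different parameter names, which is not handled here.

```python
from urllib.parse import urlencode

# Service types and versions listed in the text above.
SUPPORTED = {
    "WMS": ("1.0.0", "1.1.1"),
    "WFS": ("1.0.0", "1.1.0"),
    "WCS": ("1.0.0",),
    "WPS": ("0.4.0", "1.0.0"),
}


def get_capabilities_url(service_url: str, service_type: str,
                         version: str) -> str:
    """Append the GetCapabilities parameters to a bare service URL."""
    if version not in SUPPORTED.get(service_type, ()):
        raise ValueError(
            "unsupported service/version: %s %s" % (service_type, version))
    query = urlencode({
        "SERVICE": service_type,
        "REQUEST": "GetCapabilities",
        "VERSION": version,
    })
    # Use "&" if the service URL already carries a query string.
    separator = "&" if "?" in service_url else "?"
    return service_url + separator + query


print(get_capabilities_url(
    "http://your.preferred.ogcservice/type_wms", "WMS", "1.1.1"))
```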

The type of import defines whether the harvester produces only one metadata record for the service or also loops over the datasets served by the service and produces a metadata record for each dataset. For each dataset, the second checkbox allows the metadata to be generated from an XML document referenced in the MetadataUrl attribute of the dataset in the GetCapabilities document. If this document is loaded but is not valid (e.g. unknown schema, bad XML format), the GetCapabilities document is used instead. For WMS, thumbnails can be created during harvesting.
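The fallback rule for dataset metadata described above can be sketched as a small check. The function name `choose_metadata_source`, the return values and the `KNOWN_ROOTS` set (which here contains only the ISO 19139 root element) are assumptions made for illustration; the real set of recognised schemas is whatever GeoNetwork supports.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical set of recognised root elements (ISO 19139 only).
KNOWN_ROOTS = {"{http://www.isotc211.org/2005/gmd}MD_Metadata"}


def choose_metadata_source(linked_document: Optional[str]) -> str:
    """Return "linked" if the referenced document is usable,
    otherwise "capabilities"."""
    if linked_document is None:
        return "capabilities"
    try:
        root = ET.fromstring(linked_document)
    except ET.ParseError:
        # Bad XML format: fall back to the capabilities document.
        return "capabilities"
    if root.tag not in KNOWN_ROOTS:
        # Unknown schema: fall back as well.
        return "capabilities"
    return "linked"
```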

Icons and privileges are defined as in the other harvester types.

The metadata for the harvested service is assigned the category selected for the service (usually "interactive resources" is the best choice). The metadata generated for each dataset is assigned the "category for datasets".

