20.3 Harvesting nodes

The second top-level hierarchy is harvesting. All nodes added through the web interface are stored here. Each child has node as its key, and its value is GeoNetwork, WebDAV, CSW or another name, depending on the node type.

All harvesting nodes share a common settings structure, which the harvesting engine uses to retrieve the common parameters. This implies that any new harvesting type must honour this structure, which is the following:

The privileges and categories nodes may or may not be present, depending on the harvesting type. The structures below omit this common structure and describe only the extra information specific to each harvesting type.

Nodes of type GeoNetwork

This is the native harvesting supported by GeoNetwork 2.1 and above.

  • site: Contains host and account information

    • host (string)

    • port (integer)

    • servlet (string)

  • search [0..n]: Contains the search parameters. If this element is missing, an unconstrained search will be performed.

    • freeText (string)

    • title (string)

    • abstract (string)

    • keywords (string)

    • digital (boolean)

    • hardcopy (boolean)

    • source (string)

  • groupsCopyPolicy [0..n]: Represents a copy policy for a remote group. It is used to maintain remote privileges on harvested metadata.

    • name (string): Internal name (not localised) of a remote group.

    • policy (string): Copy policy. For the group all, policies are: copy, copyToIntranet. For all other groups, policies are: copy, createAndCopy. The Intranet group is not considered.
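The copy policy rules above can be sketched as a small check. This is only an illustration, not GeoNetwork's own code: the function name is_valid_copy_policy is hypothetical, and the internal name "intranet" for the Intranet group is an assumption.

```python
# Policies permitted for the special group "all", as listed in the text.
POLICIES_FOR_ALL = {"copy", "copyToIntranet"}

# Policies permitted for every other remote group.
POLICIES_FOR_OTHERS = {"copy", "createAndCopy"}

# Assumption: the Intranet group's internal name is "intranet".
INTRANET_GROUP = "intranet"


def is_valid_copy_policy(group_name: str, policy: str) -> bool:
    """Return True if `policy` is permitted for the remote group `group_name`."""
    if group_name == INTRANET_GROUP:
        # The Intranet group is not considered, so no policy applies to it.
        return False
    if group_name == "all":
        return policy in POLICIES_FOR_ALL
    return policy in POLICIES_FOR_OTHERS
```

For instance, under these assumptions is_valid_copy_policy("all", "createAndCopy") is rejected, while the same policy is accepted for any ordinary group.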

Nodes of type geonetwork20

This type allows harvesting from older GeoNetwork 2.0.x nodes.

  • site: Contains host and account information

    • host (string)

    • port (integer)

    • servlet (string)

  • search [0..n]: Contains the search parameters. If this element is missing, no harvesting will be performed, but the host’s parameters will be used to connect to the remote node.

    • freeText (string)

    • title (string)

    • abstract (string)

    • keywords (string)

    • digital (boolean)

    • hardcopy (boolean)

    • siteId (string)
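The two GeoNetwork node types treat a missing search element differently: the native type runs one unconstrained search, while geonetwork20 performs no harvesting. A minimal sketch of that difference follows. The function plan_searches and the use of plain dictionaries for search criteria are assumptions made for illustration, not part of GeoNetwork.

```python
def plan_searches(node_type: str, searches: list) -> list:
    """Return the list of search criteria a harvester would run for a node.

    `searches` holds one dict per configured search element, for example
    {"freeText": "roads"}. An empty dict stands for an unconstrained search.
    """
    if searches:
        # One search is run per configured search element.
        return list(searches)
    if node_type == "GeoNetwork":
        # No search element: a single unconstrained search is performed.
        return [{}]
    if node_type == "geonetwork20":
        # No search element: nothing is harvested.
        return []
    raise ValueError(f"unsupported node type: {node_type}")
```

Returning an empty list rather than raising for geonetwork20 mirrors the text: the node is still valid and its host parameters are still used to connect, it simply yields no searches.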

Nodes of type WebDAV

This harvesting type can connect to a WebDAV-enabled web server.

  • site: Contains the URL to connect to and account information

    • url (string): URL to connect to. Must be well formed, starting with ’http://’, ’file://’ or another supported protocol.

    • icon (string): This is the icon that will be used as the metadata source’s logo. The image is taken from the images/harvesting folder and copied to the images/logos folder.

  • options

    • recurse (boolean): Indicates whether the remote folder must be scanned recursively for metadata.

    • validate (boolean): If set, the harvester validates the metadata against its schema and harvests it only if it is valid.
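The requirement that the URL be well formed can be approximated with a simple check before a node is saved. This is a sketch only: the set of supported schemes is an assumption (the text names http and file and leaves the rest open), and is_well_formed_url is a hypothetical helper, not a GeoNetwork function.

```python
from urllib.parse import urlparse

# Assumption: only the two schemes named in the text are treated as supported.
SUPPORTED_SCHEMES = {"http", "file"}


def is_well_formed_url(url: str) -> bool:
    """Return True if `url` starts with a supported scheme followed by '://'."""
    parsed = urlparse(url)
    if parsed.scheme not in SUPPORTED_SCHEMES:
        return False
    # urlparse lowercases the scheme, so compare against a lowercased URL.
    if not url.lower().startswith(parsed.scheme + "://"):
        return False
    # Require something after the scheme: a host or a path.
    return bool(parsed.netloc or parsed.path)
```

A real harvester would extend SUPPORTED_SCHEMES to whatever protocols its WebDAV client actually handles.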

Nodes of type CSW

This harvesting type can query a Catalogue Services for the Web (CSW) server and retrieve all the metadata it finds.

  • site

    • capabUrl (string): URL of the capabilities file that will be used to retrieve the operations address.

    • icon (string): This is the icon that will be used as the metadata source’s logo. The image is taken from the images/harvesting folder and copied to the images/logos folder.

  • search [0..n]: Contains search parameters. If this element is missing, an unconstrained search will be performed.

    • freeText (string)

    • title (string)

    • abstract (string)

    • subject (string)

