To add a new harvesting type, follow these steps:
Add the proper folder in web/scripts/harvesting, possibly by copying an existing one.
Edit the harvesting.js file to include the new type (edit both constructor and init methods).
Add the proper folder in web/xsl/harvesting (again, it is easiest to copy an existing one).
Edit the stylesheet web/xsl/harvesting/harvesting.xsl and add the new type.
Add the transformation stylesheet in web/xsl/xml/harvesting. Its name must match the string used for the harvesting type.
Add the Java code in a package inside org.fao.geonet.kernel.harvest.harvester.
Add proper strings in web/geonetwork/loc/XX/xml/harvesting.xml.
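As a quick checklist, the new artifacts named in the steps above can be listed for a given type string. This is only a sketch: the class HarvestingTypeChecklist is hypothetical and not part of GeoNetwork, and it assumes that the new folders and the Java package are named after the type string and that the stylesheet uses an .xsl extension. The steps above only require the name match for the transformation stylesheet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of GeoNetwork: lists the new artifacts
// a harvesting type is expected to provide, following the steps above.
final class HarvestingTypeChecklist {

    // Assumption: folders and the Java package are named after the type
    // string, and the stylesheet is <type>.xsl. Only the stylesheet name
    // match is stated in the steps; the rest is a naming convention
    // chosen for this sketch.
    static List<String> expectedArtifacts(String type) {
        List<String> artifacts = new ArrayList<>();
        artifacts.add("web/scripts/harvesting/" + type);
        artifacts.add("web/xsl/harvesting/" + type);
        artifacts.add("web/xsl/xml/harvesting/" + type + ".xsl");
        artifacts.add("org.fao.geonet.kernel.harvest.harvester." + type);
        return artifacts;
    }

    public static void main(String[] args) {
        // "mytype" is a placeholder type string used for illustration.
        String type = args.length > 0 ? args[0] : "mytype";
        for (String artifact : expectedArtifacts(type)) {
            System.out.println(artifact);
        }
    }
}
```

The existing files that must also be edited (harvesting.js, web/xsl/harvesting/harvesting.xsl and the localized harvesting.xml files) are not listed, since they do not depend on the type name.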
Keep the following general notes in mind when adding a new harvesting type:
Every harvesting node (not type) must generate its own UUID. This UUID is used to remove the node's metadata when the harvesting node is removed, and to check whether a metadata record (which has its own, different UUID) has already been harvested by another node.
If a harvesting type supports multiple searches on a remote site, these must be done sequentially and results merged.
Every harvesting type must save in the folder images/logos a GIF image whose name is the node’s UUID. This image must be deleted when the harvesting node is removed. This is necessary to propagate harvesting information to other GeoNetwork nodes.
When a harvesting node is removed, all collected metadata must be removed too.
During harvesting, keep in mind that a metadata record could have been removed from the remote site just after being added to the result list. In this case the record should be skipped and no exception raised.
The only settable privileges are: view, dynamic, featured. It does not make sense to use the others.
If a node raises an exception during harvesting, that node will be deactivated.
If a metadata record already exists (its UUID exists) but belongs to another node, it must not be updated even if it has been changed. This way the harvesting will not conflict with the other node's harvesting. As a side effect, this prevents locally created metadata from being changed.
The harvesting engine does not store results on disk, so they are lost when the server is restarted.
When some harvesting parameters are changed, the new harvesting type must use them during the next harvesting run without requiring a server restart.
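The ownership and skip rules in the notes above can be summarized as a small decision function. The following is a minimal sketch, not GeoNetwork's actual harvester code: the class HarvestDecision, the Action enum, and the in-memory owner map are all hypothetical names introduced for illustration. It assumes that every local metadata record is associated with an owner identifier, which is either the UUID of the harvesting node that collected it or some other value for locally created records.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the per-record decision a harvester makes.
final class HarvestDecision {

    enum Action { ADD, UPDATE, SKIP }

    // nodeUuid:        UUID of the harvesting node that is running.
    // metadataUuid:    UUID of the record found in the remote result list.
    // localOwners:     assumed map from local metadata UUID to its owner.
    // remoteAvailable: false if the record vanished after being listed.
    static Action decide(String nodeUuid, String metadataUuid,
                         Map<String, String> localOwners,
                         boolean remoteAvailable) {
        if (!remoteAvailable) {
            // Removed remotely after listing: skip, raise no exception.
            return Action.SKIP;
        }
        String owner = localOwners.get(metadataUuid);
        if (owner == null) {
            // Unknown UUID: this node may add the record.
            return Action.ADD;
        }
        // Only records harvested by this same node may be updated;
        // records of other nodes and local records are left untouched.
        return owner.equals(nodeUuid) ? Action.UPDATE : Action.SKIP;
    }

    public static void main(String[] args) {
        Map<String, String> owners = new HashMap<>();
        owners.put("md-1", "node-A");
        owners.put("md-2", "node-B");
        owners.put("md-3", "local"); // placeholder owner for a local record

        for (String md : new String[] {"md-1", "md-2", "md-3", "md-4"}) {
            System.out.println(md + " -> " + decide("node-A", md, owners, true));
        }
    }
}
```

Treating "owned by another node" and "created locally" the same way keeps the rule simple: a node only ever modifies records it collected itself.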