If the seedlink host crashes unexpectedly, the seedlink data archive may be left in a state where part of the data is missing. In such cases seedlink may fail to fill the local archive properly from the upstream data acquisition systems. The local data files on the data archive host should then be replaced with the intact data files present on the upstream systems. This has to be done outside the seedlink protocol transport mechanism, most appropriately through ssh and scp.
csbackresc.py supports the operator in this task.
An XML control file defines the upstream systems to be checked.
See online usage information provided by the program.
The tool is first run in configuration mode.
In this step status files are prepared in
In the subsequent replacement step, files with differing checksums are replaced on the data archive host.
Usually this step has to be confirmed for each individual file.
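The per-file confirmation described above can be pictured with the following minimal sketch. It is not the code of csbackresc; the function names `build_scp_command` and `replace_file` and the prompt text are hypothetical, and it only assumes that a file is fetched with a plain `scp` call.

```python
import subprocess


def build_scp_command(host, remote_path, local_path):
    """Return the scp argument list that fetches one file from an upstream host."""
    # -p asks scp to preserve modification times and modes of the original file
    return ["scp", "-p", f"{host}:{remote_path}", local_path]


def replace_file(host, remote_path, local_path, ask=input, run=subprocess.run):
    """Ask the operator for confirmation, then copy one file.

    Returns True if the copy was executed, False if the operator declined.
    `ask` and `run` are injectable so the logic can be tried without a network.
    """
    answer = ask(f"replace {local_path} from {host}? [y/N] ")
    if answer.strip().lower() != "y":
        return False
    run(build_scp_command(host, remote_path, local_path), check=True)
    return True
```

Passing `ask` and `run` as parameters is a design choice of this sketch only; it allows a dry run in which the scp command is recorded instead of executed.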
In case checksums have already been created in the checksumfile.cs files for files which have been replaced, these entries must be removed.
The shell script csclean.sh supports this task based on the csbackresc configuration.
Checksums will then be regenerated automatically in the next csback run.
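To illustrate what removing stale entries means, here is a small sketch. It is not what csclean.sh does internally. It assumes, without confirmation from the csback documentation, that each line of a checksum file consists of a hash followed by whitespace and a file name; the function name `drop_entries` is hypothetical.

```python
def drop_entries(lines, replaced):
    """Return the checksum lines whose file name is not in `replaced`.

    Assumed line layout (not confirmed for csback): '<hash> <file name>'.
    """
    kept = []
    for line in lines:
        parts = line.split(None, 1)  # hash first, rest of the line is the name
        name = parts[1].strip() if len(parts) == 2 else ""
        if name not in replaced:
            kept.append(line)
    return kept
```

Lines for replaced files disappear from the result, so a later checksum run would see those files as not yet hashed.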
~/.csback and edit file as appropriate for your installation.
Run configuration command:
sysop@pinatubo:~/.csback> csbackresc -l -v -a -c
Execution may take a while. It is recommended to run the process in a
screen environment so that the operator can easily detach from and reattach to the interactive shell. Execution is slow because csbackresc calculates the sha512 hash for each data file present on the upstream data acquisition host.
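For orientation, the following sketch shows how a sha512 hash of a single data file can be computed with the Python standard library. It illustrates why the step is costly (every byte of every file has to be read) and is not taken from csbackresc; the function name `sha512_of_file` and the chunk size are assumptions.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal sha512 digest of the file at `path`."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read in chunks so that large data files do not have to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```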
The configuration phase results in a set of files:
sysop@pinatubo:~/.csback> wc -l seedlink.tmp/*
      5 seedlink.tmp/GR.corrupt
      0 seedlink.tmp/GR.missing
  38885 seedlink.tmp/GR.stream
  38884 seedlink.tmp/GR.stream.pinatubo
     32 seedlink.tmp/XE.corrupt
      2 seedlink.tmp/XE.missing
  33761 seedlink.tmp/XE.stream
  33758 seedlink.tmp/XE.stream.pinatubo
      5 seedlink.tmp/XG.corrupt
   2196 seedlink.tmp/XG.missing
  39900 seedlink.tmp/XG.stream
  37703 seedlink.tmp/XG.stream.pinatubo
     53 seedlink.tmp/XM.corrupt
      3 seedlink.tmp/XM.missing
  23636 seedlink.tmp/XM.stream
  23632 seedlink.tmp/XM.stream.pinatubo
      4 seedlink.tmp/XS.corrupt
      0 seedlink.tmp/XS.missing
  35654 seedlink.tmp/XS.stream
  35653 seedlink.tmp/XS.stream.pinatubo
     85 seedlink.tmp/XW.corrupt
      2 seedlink.tmp/XW.missing
  77450 seedlink.tmp/XW.stream
  77447 seedlink.tmp/XW.stream.pinatubo
 498750 total
*.stream files contain sha512 hashes for data files present on the upstream data acquisition system
*.stream.pinatubo files contain the hashes for the corresponding files on the local system
*.corrupt files contain names of files for which upstream and local hash differ
*.missing files contain names of files which are present on the upstream host but not on the local system
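The relation between the four kinds of files can be sketched as follows. This is an illustration of the comparison, not the actual code of csbackresc. It assumes that each line of a hash list holds a hash followed by whitespace and a file name, which is not confirmed by the program documentation; `parse_hash_list` and `classify` are hypothetical names.

```python
def parse_hash_list(lines):
    """Map file name to hash, assuming one '<hash> <file name>' pair per line."""
    table = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2:
            table[parts[1].strip()] = parts[0]
    return table


def classify(upstream, local):
    """Return (corrupt, missing) as sorted lists of file names.

    corrupt: present on both sides, but the hashes differ
    missing: present upstream, absent on the local system
    """
    missing = sorted(name for name in upstream if name not in local)
    corrupt = sorted(name for name, digest in upstream.items()
                     if name in local and local[name] != digest)
    return corrupt, missing
```

Files that exist only on the local system are ignored in this sketch, since the upstream system is treated as the reference.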
Check the results of the configuration phase. For the recovery phase you may either use the interactive mode of
csbackresc or customize the files listed above. Useful commands can be
egrep '2015....$' seedlink.tmp/*.missing
to check for missing files from a given year. Or
egrep -v '2015.189$' seedlink.tmp/*.corrupt
to list corrupt files that are not from the day of a known incident (here day 189 of 2015).
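If the lists are to be customized from a script rather than with egrep, the same selection can be sketched in Python. The helper name `select` and the file names in the usage example are hypothetical; only the two patterns are taken from the commands above.

```python
import re


def select(names, pattern, invert=False):
    """Return names matching `pattern`, or the non-matching ones if `invert` is set.

    Mirrors `egrep PATTERN` and `egrep -v PATTERN` applied to a list of names.
    """
    regex = re.compile(pattern)
    return [name for name in names if bool(regex.search(name)) != invert]
```

For example, `select(names, r"2015....$")` keeps names ending in the year 2015 followed by four arbitrary characters, and `select(names, r"2015.189$", invert=True)` drops the entries of day 189.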
Recovery phase. Execute:
sysop@pinatubo:~/.csback> csbackresc -l -v -a -r -f
This phase restores files by copying them from the upstream hosts with
scp. Along with this, sed files are created, which can be used to purge the
Purge checksum files. Execute:
sysop@pinatubo:~/.csback> csclean -vdl ./seedlink.tmp
Your data archive should now be recovered.