The “disaster recovery” mode serves as a blunt instrument to retrieve data
from damaged backup sets.

In itself, the mechanism is not very sophisticated. It operates by ignoring
errors in the various validations performed on the individual layers, e.g. the
GCM tag, the gzip CRC32, and the tar header checksums. Recovery requires a
valid index file so that objects in the tarballs can be accessed directly by
their offsets instead of by iterating over the contents linearly. Should
extraction of an object fail, Deltatar will skip ahead to the next one. The
underlying assumption is that the offsets of objects in the corrupted archive
did not move.

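The offset-driven extraction described above can be sketched roughly as
follows. This is a simplified illustration, not Deltatar's actual code: it
assumes a toy archive made of one gzip member per object and a hypothetical
index of ``path``/``offset`` entries, and ``build_archive`` and ``recover`` are
names invented for this example. Each object is decompressed starting at its
recorded offset; if that fails, the path is noted as damaged and the loop moves
on to the next index entry.

```python
import gzip
import zlib


def build_archive(files):
    """Toy archive (assumption): one gzip member per object, plus an index."""
    blob = bytearray()
    index = []
    for path, data in files:
        # Record where this object's gzip member starts.
        index.append({"path": path, "offset": len(blob)})
        blob += gzip.compress(data)
    return bytes(blob), index


def recover(blob, index):
    """Extract every indexed object by offset; skip the ones that fail."""
    recovered = {}
    damaged = []
    for entry in index:
        # 16 + MAX_WBITS tells zlib to expect a gzip container.
        member = zlib.decompressobj(16 + zlib.MAX_WBITS)
        try:
            data = member.decompress(blob[entry["offset"]:])
            if not member.eof:
                raise zlib.error("truncated gzip member")
        except zlib.error:
            # Extraction failed: remember the path, go on to the next offset.
            damaged.append(entry["path"])
            continue
        recovered[entry["path"]] = data
    return recovered, damaged


if __name__ == "__main__":
    files = [("a.txt", b"alpha" * 50), ("b.txt", b"bravo" * 50),
             ("c.txt", b"charlie" * 50)]
    blob, index = build_archive(files)
    broken = bytearray(blob)
    # Damage the CRC32 trailer of the second member; offsets stay unchanged.
    broken[index[2]["offset"] - 8] ^= 0xFF
    recovered, damaged = recover(bytes(broken), index)
    print("recovered:", sorted(recovered))
    print("damaged:", damaged)
```
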
On the command line, recovery mode is entered by passing the parameter
``--recover``. Deltatar will then require that an index be specified as well.
Apart from that, the invocation is the same as with ``--restore``.

::

    $ python3 backup.py --recover --mode '#gz' --targetpath \
        /tmp/corrupt/out --indexes \
        /tmp/corrupt/in/bfull-2017-08-14-1441.index.gz

After extraction is complete, Deltatar will emit a list of the files in which
corruption was detected. Note that some corruption may go undetected without
GCM encryption. Plaintext, uncompressed tarballs are especially susceptible
since only the object headers are checksummed.

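The weakness of plain tarballs can be reproduced with Python's standard
``tarfile`` module alone. The sketch below does not use Deltatar; ``make_tar``
and ``read_member`` are helper names made up for this example, and the archive
holds a single small file in ustar format so that the header occupies the first
512 bytes and the payload starts right after it. Flipping a payload byte goes
unnoticed, while flipping a header byte trips the header checksum.

```python
import io
import tarfile


def make_tar(name, data):
    """Build an uncompressed ustar archive holding one file, in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return bytearray(buf.getvalue())


def read_member(blob, name):
    """Read one member back; "r:" means no transparent compression."""
    with tarfile.open(fileobj=io.BytesIO(bytes(blob)), mode="r:") as tar:
        return tar.extractfile(name).read()


if __name__ == "__main__":
    original = b"important configuration\n"

    # Corrupt the payload: the first data byte follows the 512-byte header.
    payload_hit = make_tar("conf.txt", original)
    payload_hit[512] ^= 0xFF
    print("payload corruption read back silently:",
          read_member(payload_hit, "conf.txt") != original)

    # Corrupt the header: the stored checksum no longer matches.
    header_hit = make_tar("conf.txt", original)
    header_hit[0] ^= 0x01
    try:
        read_member(header_hit, "conf.txt")
    except tarfile.TarError as exc:
        print("header corruption detected:", exc)
```
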
Depending on which part of an object the corruption affected, the results can
be quite damaging, especially if compression is involved. Garbled output may
look like this::

    6876REF,06875) VPNCONN_XAUTH_SER2ER_EN72:"72.1"ad
    6877REF,06875) VPNCONN_SECUR")
    C42"ESP"
    6878REF,06875) VPNCONN_RETRIES72.1"3d
    6879REF,06875) VPNCONN_REMOTE_TYPE72.1"CUSTOMd

Needless to say, care must be taken to inspect the files reported damaged.

Security considerations
-------------------------------------------------------------------------------

With encrypted backup sets, recovery mode skips verification of the GCM
authentication tag on the ciphertext. This introduces a severe flaw in that bad
initialization vectors will no longer cause decryption to fail. An attacker
controlling the backup sets (e.g. after subverting the file server) can exploit
this to recover plaintext from encrypted objects. Thus recovery mode should
only ever be used as a last resort when dealing with files known to be
corrupted. The decrypted result must be carefully inspected for manipulation
attempts.

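Why skipping the tag check matters can be illustrated with a toy construction.
The sketch below is deliberately not AES-GCM and not Deltatar's code: it is a
stand-in built from the standard library, with a SHA-256 based counter-mode
keystream playing the role of GCM's counter-mode encryption and an HMAC playing
the role of the authentication tag. All function names and the ``verify``
switch are assumptions made for this example. The point it shows is that
counter-mode decryption by itself never fails; only the tag check rejects a
wrong IV or a modified ciphertext.

```python
import hashlib
import hmac


def _subkey(key, label):
    """Derive separate encryption and MAC keys from one master key."""
    return hashlib.sha256(label + key).digest()


def _keystream(key, iv, length):
    """Toy counter-mode keystream (stand-in for AES in counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = key + iv + counter.to_bytes(8, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data, stream):
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt(key, iv, plaintext):
    """Return (ciphertext, tag); the tag covers both IV and ciphertext."""
    stream = _keystream(_subkey(key, b"enc"), iv, len(plaintext))
    ciphertext = _xor(plaintext, stream)
    tag = hmac.new(_subkey(key, b"mac"), iv + ciphertext,
                   hashlib.sha256).digest()
    return ciphertext, tag


def decrypt(key, iv, ciphertext, tag, verify=True):
    """Decrypt; verify=False mimics a mode that ignores the tag."""
    if verify:
        expected = hmac.new(_subkey(key, b"mac"), iv + ciphertext,
                            hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("authentication tag mismatch")
    stream = _keystream(_subkey(key, b"enc"), iv, len(ciphertext))
    return _xor(ciphertext, stream)


if __name__ == "__main__":
    key, iv = b"k" * 32, b"\x00" * 12
    message = b"VPNCONN_RETRIES=3"
    ciphertext, tag = encrypt(key, iv, message)

    wrong_iv = b"\x01" * 12
    try:
        decrypt(key, wrong_iv, ciphertext, tag)
    except ValueError as exc:
        print("with tag check:", exc)
    # Without the check, a wrong IV silently yields garbage.
    print("without tag check:",
          decrypt(key, wrong_iv, ciphertext, tag, verify=False))
```

With verification disabled, flipping a bit in the ciphertext flips the same
bit in the decrypted output without raising any error, which is why output
produced in recovery mode cannot be trusted without manual inspection.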