Thomas Jarosch [Mon, 27 Jan 2020 15:04:16 +0000]
Change unit tests to check for expected exception
It is an error if the expected exception is not thrown,
e.g. on re-use of IVs.
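A minimal sketch of the pattern, not the actual test code. `IVReuseError` and `Encryptor` are hypothetical stand-ins invented for illustration; only `unittest.assertRaises` is real API. The test fails unless the exception is actually raised.

```python
import unittest


class IVReuseError(Exception):
    """Hypothetical stand-in for the exception raised on IV re-use."""


class Encryptor:
    """Hypothetical stand-in that refuses to use the same IV twice."""

    def __init__(self):
        self.used_ivs = set()

    def start(self, iv):
        if iv in self.used_ivs:
            raise IVReuseError("IV already used: %r" % iv)
        self.used_ivs.add(iv)


class IVReuseTest(unittest.TestCase):
    def test_reuse_is_rejected(self):
        enc = Encryptor()
        enc.start(b"\x00" * 12)
        # assertRaises fails the test if no exception is thrown.
        with self.assertRaises(IVReuseError):
            enc.start(b"\x00" * 12)
```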
Thomas Jarosch [Sat, 25 Jan 2020 18:38:50 +0000]
Fix position of [-c] handler in usage
It won't work the other way around.
Thomas Jarosch [Sat, 25 Jan 2020 18:25:06 +0000]
Fix wrong script name in usage help
Thomas Jarosch [Sat, 25 Jan 2020 18:08:49 +0000]
Increase version to 1.6
Thomas Jarosch [Sat, 25 Jan 2020 17:48:29 +0000]
Fix use of wrong 'i' variable name
pylint complained:
tarfile.py:3797:38: E0602: Undefined variable 'i' (undefined-variable)
Code was introduced like this in
commit 27ee4dd4df48340541317123f5df348056c235ca
Author: Philipp Gesang <philipp.gesang@intra2net.com>
AuthorDate: Tue Aug 29 12:00:54 2017 +0200
implement volume handling for rescue mode
When reconstructing the index, traverse backup volumes and set
the “volume” member on the objects appropriately.
-> I guess 'i' was renamed to nvol for better readability.
Thomas Jarosch [Sat, 25 Jan 2020 17:40:53 +0000]
Fix errno.ESPIPE error handler
pylint complained:
crypto.py:1632:28: E1101: Module 'os' has no 'errno' member (no-member)
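For context: `errno` is a module of its own, and `os.errno` only ever worked as an import side effect that Python 3.7 removed. A minimal sketch of an ESPIPE handler follows; the helper name `is_seekable` is hypothetical and not taken from crypto.py.

```python
import errno
import os


def is_seekable(fd):
    """Return False if the descriptor is a pipe or socket, True otherwise."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)
        return True
    except OSError as exn:
        # Compare against the constant from the errno module, not os.errno.
        if exn.errno == errno.ESPIPE:
            return False
        raise
```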
Thomas Jarosch [Sat, 25 Jan 2020 17:37:59 +0000]
Add missing iv fixed part in error output
pylint complained:
crypto.py:1357:36: E1306: Not enough arguments for format string (too-few-format-args)
Thomas Jarosch [Sat, 25 Jan 2020 17:30:58 +0000]
Fix 'pw' variable name in error handling
pylint complained:
crypto.py:840:40: E0602: Undefined variable 'password' (undefined-variable)
Thomas Jarosch [Tue, 3 Apr 2018 05:50:48 +0000]
Documentation improvements
Philipp Gesang [Thu, 31 Aug 2017 12:30:03 +0000]
remove development tweak for test runner
This essentially reverts
commit 406e0fa86d97f912b50689d6b080c2aee69eef86
Author: Philipp Gesang <philipp.gesang@intra2net.com>
Date: Tue Apr 18 11:59:02 2017 +0200
allow selecting individual tests with runtests.py
Tests and files for all imported test classes may still take some
moments but we don’t need to skip them any longer.
Philipp Gesang [Thu, 31 Aug 2017 11:37:10 +0000]
handle problems with incomplete gzip headers
Throw the appropriate exception to signal EOF or malformed data
conditions when tentatively parsing GZip headers.
Philipp Gesang [Thu, 31 Aug 2017 09:41:50 +0000]
add tests for truncated files
Philipp Gesang [Thu, 31 Aug 2017 08:57:17 +0000]
describe corruption mechanisms and their function in testing
Philipp Gesang [Wed, 30 Aug 2017 07:47:16 +0000]
skip some unittests on older python versions
Philipp Gesang [Tue, 29 Aug 2017 15:19:04 +0000]
test multivol index reconstruct with hole and header corruption
Philipp Gesang [Tue, 29 Aug 2017 14:59:14 +0000]
extend index reconstruct tests for multivol backups
Philipp Gesang [Tue, 29 Aug 2017 12:45:31 +0000]
lift block alignment requirement for tar archive rescue
It is unlikely that damaged archives have correctly aligned tar
headers. Thus we need to check each header-like section whether
it contains the right magic and the checksum matches. Objects
without a correct checksum (which spans the better part of the
header) are discarded similar to what file(1) does.
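A rough sketch of such an unaligned scan, not the code in tarfile.py. It looks for the GNU magic at any byte offset and keeps only candidates whose header checksum matches. The function names are hypothetical; the magic offset (257) and the checksum rule (sum of all header bytes with the checksum field counted as eight spaces) follow the tar format.

```python
BLOCK = 512
MAGIC_OFFSET = 257            # position of the magic inside a tar header
GNU_MAGIC = b"ustar  \x00"    # GNU tar magic, including the version bytes


def header_checksum_ok(block):
    """Verify the checksum of one 512 byte tar header."""
    stored = block[148:156].rstrip(b"\x00 ")
    try:
        expected = int(stored, 8)   # the field holds an octal string
    except ValueError:
        return False
    # The checksum field itself counts as eight ASCII spaces.
    actual = sum(block[:148]) + 8 * 0x20 + sum(block[156:BLOCK])
    return actual == expected


def find_tar_headers(buf):
    """Return offsets of valid looking headers, ignoring block alignment."""
    offsets = []
    pos = buf.find(GNU_MAGIC)
    while pos != -1:
        start = pos - MAGIC_OFFSET
        if start >= 0:
            block = buf[start:start + BLOCK]
            if len(block) == BLOCK and header_checksum_ok(block):
                offsets.append(start)
        pos = buf.find(GNU_MAGIC, pos + 1)
    return offsets
```

Candidates that carry the magic but fail the checksum are dropped silently, which mirrors the behaviour described above.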
Philipp Gesang [Tue, 29 Aug 2017 10:00:54 +0000]
implement volume handling for rescue mode
When reconstructing the index, traverse backup volumes and set
the “volume” member on the objects appropriately.
Philipp Gesang [Tue, 29 Aug 2017 08:39:22 +0000]
implement leading garbage test
Implemented a rescue test since offsets won’t match in this
scenario.
Philipp Gesang [Tue, 29 Aug 2017 07:53:46 +0000]
include description of object validation with crypto.py scan mode
Example output for a second object with a corrupt byte in the
size field:
PDT: obj 1: read payload @64
PDT: · version = 1 : 0100
PDT: · paramversion = 1 : 0100
PDT: · nacl : 1dc1 154a 5405 ef5e df81 173f 2821 7a0c
PDT: · iv : 7cae 452a a05b 5182 0300 0000
PDT: · ctsize = 230 : e600 0000 0000 0000
PDT: · tag : 42c0 8774 3309 88eb 0e1a 71dc 8fd9 80c1
PDT: 0 → ✓ valid object 64–294
PDT: 294 → EOF inside object (358≤5312627≤
1095216701872); adjusting size to 5312269
PDT: obj 2: read payload @358
PDT: · version = 1 : 0100
PDT: · paramversion = 1 : 0100
PDT: · nacl : 1dc1 154a 5405 ef5e df81 173f 2821 7a0c
PDT: · iv : 7cae 452a a05b 5182 0400 0000
PDT: · ctsize = 5312269 : 0d0f 5100 0000 0000
PDT: · tag : 5946 dbcf 41b9 ac7e 4729 9e09 46c7 3388
PDT: GCM tag mismatch for object 358–5312627
PDT: 294 → × fishy object 358–5312627, corrupt header
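The field sizes in this output add up to a 64 byte header if an 8 byte magic is assumed in front. A sketch of a parser for that layout follows. The field order and little-endian encoding are read off the output above; the magic value `PDTCRYPT` and the function names are assumptions, not taken from crypto.py.

```python
import struct

# magic, version, paramversion, nacl, iv, ctsize, tag (little-endian)
HDR_FORMAT = "<8sHH16s12sQ16s"
HDR_SIZE = struct.calcsize(HDR_FORMAT)   # 64 bytes
HDR_MAGIC = b"PDTCRYPT"                  # assumed magic value


def parse_header(buf):
    """Decode one header from the start of buf into a dict."""
    magic, version, paramversion, nacl, iv, ctsize, tag = \
        struct.unpack(HDR_FORMAT, buf[:HDR_SIZE])
    if magic != HDR_MAGIC:
        raise ValueError("bad magic %r" % magic)
    return {"version": version, "paramversion": paramversion,
            "nacl": nacl, "iv": iv, "ctsize": ctsize, "tag": tag}


def payload_extent(header_offset, hdr):
    """Return (start, end) of the ciphertext following a header."""
    start = header_offset + HDR_SIZE
    return start, start + hdr["ctsize"]
```

For the first object above (header at offset 0, ctsize 230) this yields the extent 64–294 shown in the output.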
Philipp Gesang [Mon, 28 Aug 2017 15:59:42 +0000]
add unit test for borked ciphertext size
Philipp Gesang [Mon, 28 Aug 2017 15:56:33 +0000]
allow for detecting overlapping objects with tarfile
Philipp Gesang [Mon, 28 Aug 2017 13:29:41 +0000]
detect overlapping objects
The CLI will run one additional pass to determine whether objects
overlap one another. Overlap might indicate bad headers or gaps
in the file (object offsets shifted).
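One way such a pass could look, as a sketch only: sort the object extents and report every object that starts before an earlier one has ended. The name `find_overlaps` and the half-open `(start, end)` representation are assumptions for illustration.

```python
def find_overlaps(extents):
    """Return (earlier, later) pairs of overlapping (start, end) extents."""
    overlaps = []
    furthest = None   # extent reaching furthest into the file so far
    for cur in sorted(extents):
        if furthest is not None and cur[0] < furthest[1]:
            overlaps.append((furthest, cur))
        if furthest is None or cur[1] > furthest[1]:
            furthest = cur
    return overlaps
```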
Philipp Gesang [Mon, 28 Aug 2017 07:45:44 +0000]
adjust post-conditions for GZ[,AES]/rescue unit test
Philipp Gesang [Mon, 28 Aug 2017 07:17:41 +0000]
make crypto.py CLI accept hex-encoded keys again
Also handle decoding of those keys at the same level as base64
encoded ones.
Philipp Gesang [Fri, 25 Aug 2017 15:41:21 +0000]
use real new volume handler during rescue
With the dummy we end up with a nil object instead of a tarinfo
at the end of the volume. Reinstating the actual handler is
harmless and produces a valid info object again.
Philipp Gesang [Fri, 25 Aug 2017 15:03:37 +0000]
prevent tarobject iteration in disaster mode
The tarfile iterator relies on the header data to determine the
next object offset which may be wrong for corrupt files. Instead,
skip that iteration step and completely rely on the object
offsets determined during index rebuild.
Philipp Gesang [Fri, 25 Aug 2017 12:23:17 +0000]
implement tolerant GNU tar header parser
When skimming a file for tar objects, only consider the GNU
header magic and whether the blocks are aligned.
Philipp Gesang [Fri, 25 Aug 2017 09:12:39 +0000]
add restore helper handling for reconstructed indices
Philipp Gesang [Fri, 25 Aug 2017 08:27:39 +0000]
add iterator mode for reconstructed index
Philipp Gesang [Fri, 25 Aug 2017 07:33:49 +0000]
unify construction of secret values
Philipp Gesang [Thu, 24 Aug 2017 15:24:36 +0000]
implement tolerant gz header parser
Since they assume a stream object, we cannot rely on the original
tarfile GZ handling. Add a “tolerant” one according to the format
spec that notices malformed or unexpected (in Deltatar context)
values, but glosses over them if they do not necessarily impact
the readability of the object.
Also use the new symbolic constants in the existing GZ reader
instead of magic numbers.
Philipp Gesang [Thu, 24 Aug 2017 14:48:32 +0000]
ignore GCM tag mismatch in scan mode
Header info is assumed unreliable during rescue so a tag mismatch
must not result in a bad object.
Philipp Gesang [Thu, 24 Aug 2017 11:21:30 +0000]
convert TarInfo to index format
Philipp Gesang [Thu, 24 Aug 2017 09:57:53 +0000]
read tar objects at predetermined offsets for rescue index
Leverage the tarobj to read the object headers at the determined
offsets. Currently only implemented for encrypted archives whose
offsets are located with *crypto.py*.
Philipp Gesang [Thu, 24 Aug 2017 09:56:14 +0000]
add test skeleton for corrupt index reconstruction
Starting with an intact backup set.
Philipp Gesang [Wed, 23 Aug 2017 08:49:36 +0000]
draft rescue mode through all layers
The strategy is for rescue mode to reconstruct the relevant [*]
information from the index by inspecting the passed tar object,
then continue from there. On the crypto side, this boils down to
a streamlined (and silent) version of the “scan” mode. The
tarfile side is still WIP.
[*] Omitting the useless parts like inode number.
Philipp Gesang [Tue, 22 Aug 2017 15:17:04 +0000]
derive test skeleton for disaster rescue mode
Philipp Gesang [Tue, 22 Aug 2017 15:06:41 +0000]
implement dump mode for tolerant decryption
Utilize the safe dirfd based implementation from split mode to
write extracted objects to a target directory.
Philipp Gesang [Tue, 22 Aug 2017 13:30:15 +0000]
extend tarfile API for rescue mode
Philipp Gesang [Tue, 22 Aug 2017 11:29:45 +0000]
implement decryption for tolerant mode
Not possible to reuse the existing CLI decryption since we’re
operating with fds in scan mode.
Philipp Gesang [Tue, 22 Aug 2017 09:59:35 +0000]
attempt to process candidate objects in scan mode
Philipp Gesang [Tue, 22 Aug 2017 08:25:21 +0000]
print list of header candidates
Philipp Gesang [Tue, 15 Aug 2017 15:37:12 +0000]
implement PDTCRYPT header scanning
First phase: collect all possible header start locations.
Adds a CLI subcommand “scan” to crypto.py for analyzing files.
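The first phase can be pictured as a plain search for the header magic. This is a sketch under assumptions: the magic value `PDTCRYPT` and the function name are not confirmed by this log, and the real scanner may read the file in chunks rather than from one buffer.

```python
MAGIC = b"PDTCRYPT"   # assumed header magic


def header_candidates(buf, magic=MAGIC):
    """Return every offset in buf at which the magic occurs."""
    found = []
    pos = buf.find(magic)
    while pos != -1:
        found.append(pos)
        pos = buf.find(magic, pos + 1)
    return found
```

Every hit is only a candidate; whether a valid object follows has to be decided in a later pass.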
Philipp Gesang [Tue, 15 Aug 2017 14:54:30 +0000]
test corruption by tearing a hole in a volume
Philipp Gesang [Tue, 15 Aug 2017 13:38:17 +0000]
add test corrupting an entire volume
Zero out the first volume: None of the content can be restored.
This includes the file extending from the first into the second
volume.
Philipp Gesang [Tue, 15 Aug 2017 12:42:35 +0000]
use symbolic constant for errno
Philipp Gesang [Tue, 15 Aug 2017 09:14:01 +0000]
clarify index read failure
Instead of erroring out with an exception, make --restore emit an
error message indicating that something is wrong with the index.
Philipp Gesang [Tue, 15 Aug 2017 08:31:15 +0000]
do not discard valid data in buffers when in tolerant mode
Both decryption and decompression will fail on the first error
and ignore any results of earlier passes. In normal operation,
the hard failures are desirable to indicate a bad backup set.
However, in tolerant / recovery mode the error handling is closer
to the opposite extreme: we want to retrieve every last byte that
made it through the various layers and only skip over the parts
that cannot be interpreted at all.
Philipp Gesang [Mon, 14 Aug 2017 15:24:56 +0000]
catch bad parameter version in header
Philipp Gesang [Mon, 14 Aug 2017 14:04:53 +0000]
reject bad index files with a meaningful error
Philipp Gesang [Mon, 14 Aug 2017 13:10:44 +0000]
add brief description of disaster recovery
Philipp Gesang [Mon, 14 Aug 2017 12:14:16 +0000]
fail with info message if recovery is asked with source path
Philipp Gesang [Mon, 14 Aug 2017 10:09:28 +0000]
allow for numbers of missing and failed files to differ in recovery test
Philipp Gesang [Mon, 14 Aug 2017 09:54:17 +0000]
adjust the expectations about checksum mismatches with non-authenticated recover modes
Philipp Gesang [Mon, 14 Aug 2017 09:35:18 +0000]
use index iterator to accommodate multivol extraction
For reasons unknown, the “tar path iterator” always terminates
after the last element of the first volume. In fact, it does so
even for multi volume archives if the last object in the first
volume extends into the second volume. In this case, the object
is completely extracted but extraction terminates.
Philipp Gesang [Fri, 11 Aug 2017 14:41:51 +0000]
use random data in multivol tests
Brute force incompressibility to prevent gzip from invalidating
our multivolume tests.
Philipp Gesang [Fri, 11 Aug 2017 13:45:50 +0000]
give each recovery test a multivol companion
This derives single- and multivolume versions of the tests.
Multiple volumes are generated by stretching the input file count
and size.
Philipp Gesang [Fri, 11 Aug 2017 12:16:56 +0000]
work around false positives in deltatar fs checks during rpmbuild
These only happen when running in rpmbuild, otherwise the tests
are fine. Of course, on RHBT the choir resoundeth “thou shalt not
run thine rpm build as root” but that’s not really an option
here.
Philipp Gesang [Fri, 11 Aug 2017 09:50:55 +0000]
catch incomplete trailing header in tolerant recovery
This makes decryption in recovery mode resistant against
malformed trailing data which would otherwise error out for the
entire buffered chunk on account of a decryption failure.
Philipp Gesang [Fri, 11 Aug 2017 09:39:42 +0000]
test recovery behavior with trailing data
Philipp Gesang [Fri, 11 Aug 2017 09:16:33 +0000]
track successful recovery of corrupted payload in tests
Gzip does CRC32, GCM has a MAC, but ordinary Tar only checksums
the header part, not the content. Thus recovery of a damaged
object will appear to succeed provided the object header is
intact. In order to detect the corruption, an external integrity
check is necessary.
Philipp Gesang [Fri, 11 Aug 2017 08:53:09 +0000]
add recover tests for completely damaged headers
Philipp Gesang [Fri, 11 Aug 2017 08:25:12 +0000]
sync tarfile stream diligently when writing new objects
Turns out all the offsets written to the index when neither
encrypting nor compressing were, well, … off. In fact they would
only be updated at tar block boundaries due to buffering. Since
“last_block_offset” record keeping blatantly violates layering
boundaries, it would only work reliably with the concat
compression and encryption modes that do the same.
Sync when adding a new object so we get the accurate offset
value. Voilà, recovery now works with uncompressed and
unencrypted archives as well.
Philipp Gesang [Thu, 10 Aug 2017 15:01:42 +0000]
add header corruption tests
We hit them where it hurts:
* for compressed backups, flip a bit in the magic;
* for encrypted backups, flip a bit in the tag.
In either case, normal restore must fail, and disaster recovery
will be incomplete.
Philipp Gesang [Thu, 10 Aug 2017 13:32:16 +0000]
add test for corruption of encrypted files
Philipp Gesang [Thu, 10 Aug 2017 12:39:40 +0000]
track irrecoverable files in test_recover
Philipp Gesang [Thu, 10 Aug 2017 11:06:30 +0000]
prefer index iterator for recovery
Philipp Gesang [Thu, 10 Aug 2017 09:38:39 +0000]
properly damage gzip files for recover test
Ensure we are flipping bits in the compressed payload, not in the
mostly useless header. Requires some extra parsing to determine
the header length.
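A sketch of the extra parsing, following the optional fields of the gzip format (RFC 1952), together with a bit flip helper. Both function names are hypothetical and not taken from the test suite.

```python
FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10


def gzip_header_length(buf):
    """Return the offset at which the compressed payload starts."""
    if buf[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip member")
    flags = buf[3]
    pos = 10                                  # fixed part of the header
    if flags & FEXTRA:                        # 2 byte length, then data
        xlen = int.from_bytes(buf[pos:pos + 2], "little")
        pos += 2 + xlen
    if flags & FNAME:                         # zero terminated file name
        pos = buf.index(b"\x00", pos) + 1
    if flags & FCOMMENT:                      # zero terminated comment
        pos = buf.index(b"\x00", pos) + 1
    if flags & FHCRC:                         # 2 byte header CRC
        pos += 2
    return pos


def flip_bit(buf, byte_offset, bit=0):
    """Return a copy of buf with a single bit inverted."""
    damaged = bytearray(buf)
    damaged[byte_offset] ^= 1 << bit
    return bytes(damaged)
```

Flipping a bit at `gzip_header_length(buf)` or later then damages the compressed payload instead of the header.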
Philipp Gesang [Thu, 10 Aug 2017 08:34:15 +0000]
add bit flip helper for recover tests
Philipp Gesang [Thu, 10 Aug 2017 08:13:18 +0000]
fix misleading docstrings for index file hook
Philipp Gesang [Thu, 10 Aug 2017 07:37:08 +0000]
lay out skeleton for disaster recovery tests
New series of tests for corrupting backup sets and restoring them
incompletely (“tolerant” or “disaster recovery” mode).
Philipp Gesang [Tue, 8 Aug 2017 11:58:20 +0000]
draft disaster recovery mode for deltatar
The first stage recovery assumes the index is intact and all
objects are at their expected position. In this scenario, an
attempt is made to extract each object, keeping track of those
that weren’t readable and why.
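The idea in miniature, as a sketch that is much simpler than deltatar itself: each object is processed on its own, and a failure is recorded with its cause instead of aborting the run. The function name and the use of in-memory gzip members are assumptions for illustration.

```python
import gzip
import zlib


def recover_objects(objects):
    """Decompress (name, blob) pairs; return (recovered, failed) dicts."""
    recovered, failed = {}, {}
    for name, blob in objects:
        try:
            recovered[name] = gzip.decompress(blob)
        except (OSError, EOFError, zlib.error) as exn:
            # Remember why the object was unreadable, then carry on.
            failed[name] = "%s: %s" % (type(exn).__name__, exn)
    return recovered, failed
```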
Philipp Gesang [Tue, 8 Aug 2017 10:03:01 +0000]
return valid decrypted data on decryption failure
Philipp Gesang [Tue, 8 Aug 2017 08:48:31 +0000]
force tarfile reopen after bad read in deltatar
Closing the tarfile after an unreadable object was encountered
causes the stream to be reopened for the next read. Otherwise,
the corrupt object is already buffered and tarfile would continue
to seek inside the bad data.
Philipp Gesang [Tue, 8 Aug 2017 07:44:56 +0000]
distinguish invalid files from parse errors in restore
Especially with index files, the parse error is misleading.
Indicate the prevalent cause of the problem, i.e. that the
file is compressed but compression was not requested during
restore.
Philipp Gesang [Tue, 8 Aug 2017 07:14:12 +0000]
update help usage strings wrt. crypto in backup.py
Philipp Gesang [Mon, 7 Aug 2017 13:37:19 +0000]
extend crypto.py exception descriptions
Philipp Gesang [Tue, 27 Jun 2017 08:24:00 +0000]
actually default to i2n mode with crypto.py scrypt
And adapt the relevant unit test to explicitly request the full
parameters output.
Philipp Gesang [Fri, 23 Jun 2017 08:35:08 +0000]
add crypto.py option to output cnf-compatible scrypt object
Philipp Gesang [Wed, 31 May 2017 11:53:21 +0000]
support PDT encrypted archives with rescue_tar.py
Philipp Gesang [Tue, 30 May 2017 15:29:26 +0000]
adapt file_crypt.py for revised crypto
Philipp Gesang [Tue, 30 May 2017 15:10:59 +0000]
kill off old crypto implementation
The old aescrypto.py was only kept for reference but since
downstream integration is more or less complete wrt. encryption
we don’t need it any longer.
Good riddance.
Philipp Gesang [Tue, 30 May 2017 10:40:19 +0000]
allow passing salt to crypto.py on the command line
Nifty shortcut for hashing without a corresponding pdtcrypt file.
Philipp Gesang [Tue, 30 May 2017 09:23:57 +0000]
properly align usage message of crypto.py
Philipp Gesang [Tue, 23 May 2017 12:55:10 +0000]
improve bad CLI argument handling of crypto.py
Philipp Gesang [Mon, 22 May 2017 12:10:33 +0000]
include header version info in scrypt handler
Philipp Gesang [Fri, 19 May 2017 15:22:17 +0000]
accept crypto format version in deltatar ctor
Philipp Gesang [Fri, 19 May 2017 09:16:10 +0000]
add unit test for CLI scrypt hashing
Philipp Gesang [Thu, 18 May 2017 15:47:25 +0000]
allow passing keys directly to CLI crypto.py
Keys may now be passed as command line argument or environment
variable.
The only valid format is 16 bytes in hexadecimal.
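A sketch of how such a key could be validated; `parse_hex_key` is a hypothetical helper, not the function used by crypto.py.

```python
import binascii

KEY_SIZE = 16   # bytes, as stated above


def parse_hex_key(text):
    """Decode a hex string and insist on exactly 16 bytes."""
    try:
        key = binascii.unhexlify(text.strip())
    except (binascii.Error, ValueError) as exn:
        raise ValueError("key is not valid hexadecimal") from exn
    if len(key) != KEY_SIZE:
        raise ValueError("key must be %d bytes, got %d"
                         % (KEY_SIZE, len(key)))
    return key
```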
Philipp Gesang [Thu, 18 May 2017 11:44:07 +0000]
grab password from envp if not supplied on CLI
In order to avoid the password showing up in full in the process
table, pass it in the environment instead. Uses the environment
variable PDTCRYPT_PASSWORD with both crypto.py and backup.py.
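A minimal sketch of the lookup order. Only the variable name PDTCRYPT_PASSWORD comes from this commit; the function name and signature are hypothetical.

```python
import os

ENV_PASSWORD = "PDTCRYPT_PASSWORD"


def get_password(cli_value=None, environ=os.environ):
    """Prefer the CLI argument, fall back to the environment."""
    if cli_value is not None:
        return cli_value
    # None if the variable is unset; the caller decides how to react.
    return environ.get(ENV_PASSWORD)
```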
Philipp Gesang [Tue, 16 May 2017 11:37:43 +0000]
default to index mode of deltatar object when choosing extension
For external use.
Philipp Gesang [Tue, 16 May 2017 08:57:01 +0000]
handle bad randomness during IV creation
Since IVs must be unique we rely on /dev/urandom to yield a
different sequence of bytes when requesting a new fixed part.
In the unlikely event that a new fixed part has already been
used earlier, retry a number of times.
Abort if no unique IV could be generated this way since it
most likely indicates a faulty RNG.
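The retry logic might look roughly like this. It is a sketch: the 8 byte size of the fixed part, the retry limit, and all names are assumptions rather than values taken from crypto.py.

```python
import os

FIXED_PART_SIZE = 8   # assumed size of the fixed IV part in bytes
MAX_ATTEMPTS = 10     # assumed retry limit


class BadRandomness(Exception):
    """Hypothetical exception for an RNG that keeps repeating itself."""


def new_fixed_part(used, rng=os.urandom, attempts=MAX_ATTEMPTS):
    """Draw a fixed part that is not in `used`, recording it on success."""
    for _ in range(attempts):
        candidate = rng(FIXED_PART_SIZE)
        if candidate not in used:
            used.add(candidate)
            return candidate
    raise BadRandomness("no unused fixed part after %d attempts" % attempts)
```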
Philipp Gesang [Mon, 15 May 2017 15:44:48 +0000]
extend crypto.py documentation
Philipp Gesang [Thu, 11 May 2017 15:40:21 +0000]
distinguish auxiliary file errors
Auxiliary files that grow larger than the maximum defined
encrypted file size cause an irrecoverable error because their
fixed IV is being reused. Add a new exception to distinguish this
specific case. Encrypted auxiliary files thus never consist of
more than one object, no on-the-fly continuation is permitted
like with ordinary files.
Philipp Gesang [Thu, 11 May 2017 09:28:08 +0000]
adapt unit tests for crypto.py subcommands
Philipp Gesang [Thu, 11 May 2017 08:50:30 +0000]
export scrypt hashing functionality
Philipp Gesang [Thu, 11 May 2017 08:35:40 +0000]
add SCRYPT hashing mode to crypto.py
Add a subcommand “scrypt” to crypto.py in CLI mode. Example:
$ python3 ./deltatar/crypto.py scrypt foo -i - -o pwd \
<backup_dir/bfull-2017-05-11-0919-001.tar.pdtcrypt
{"scrypt_params": {"p": 1, "N": 65536, "dkLen": 16, "r": 8},
"salt": "b'fbdbaa9890ae243eb16391199c9243f6'",
"hash": "b'1e7d7a78b9300d461779e9c80e4a15ac'"}
The output “hash” is calculated from the salt in the first
header found in the given archive and the password specified.
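For comparison, the same parameters can be fed to the scrypt binding in the Python standard library. This is a sketch that assumes `hashlib.scrypt` is available (it needs OpenSSL support); it is not necessarily the scrypt implementation crypto.py uses, and the function name is hypothetical.

```python
import hashlib


def scrypt_hash(password, salt, N=65536, r=8, p=1, dklen=16):
    """Derive a key from password and salt (both bytes) with scrypt."""
    # scrypt needs roughly 128 * N * r bytes of memory; leave headroom
    # because the default limit is too low for N=65536, r=8.
    return hashlib.scrypt(password, salt=salt, n=N, r=r, p=p,
                          maxmem=256 * N * r, dklen=dklen)
```

The defaults mirror the `scrypt_params` in the example output above.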
Philipp Gesang [Tue, 9 May 2017 13:42:17 +0000]
gracefully handle GCM data length limit
Philipp Gesang [Tue, 9 May 2017 08:59:28 +0000]
unit test crypto file counter wraparound
After the file counter reaches UINT_MAX, it wraps around and a
new fixed part must be created.
The file counter is a 32-bit unsigned integer so it needs to be
lowered to make bounds testing feasible.
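A sketch of the mechanism with an adjustable limit. The IV layout (8 byte fixed part followed by a 4 byte little-endian counter) is inferred from example output elsewhere in this log; the class, its names, and the counter starting at zero are assumptions. Checking a new fixed part against earlier ones is left out for brevity.

```python
import os
import struct

UINT_MAX = 0xFFFFFFFF


class IVGenerator:
    """Hands out 12 byte IVs: fixed part plus running file counter."""

    def __init__(self, counter_max=UINT_MAX, rng=os.urandom):
        self.counter_max = counter_max   # lowered in tests
        self.rng = rng
        self.fixed = rng(8)
        self.counter = 0

    def next_iv(self):
        if self.counter > self.counter_max:
            # Counter exhausted: start over with a new fixed part.
            self.fixed = self.rng(8)
            self.counter = 0
        iv = self.fixed + struct.pack("<I", self.counter)
        self.counter += 1
        return iv
```

With `counter_max` lowered to a small value, a test reaches the wraparound after a handful of IVs instead of four billion.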