Philipp Gesang [Wed, 29 Jan 2020 10:35:56 +0000]
improve unit test SNR
Philipp Gesang [Wed, 29 Jan 2020 10:03:54 +0000]
turn API-mandated no-op into assertion
Make it explicit that there cannot actually be any remaining data when
finalizing an encrypted object. The Cryptography API mandates
that the caller handle the remainder of the data on finalization.
By virtue of being a stream cipher, the AES-GCM encoder always
returns the exact number of bytes that it was given so
technically the rest is meaningless.
Philipp Gesang [Wed, 29 Jan 2020 09:28:34 +0000]
account for one-tuple return
For some reason, feeding ``(0,)'' into os.read() as the length
argument fails with newer versions of python. Fix this by
unpacking the single element ``tuple'' before using it.
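A minimal sketch of the unpacking described above. The helper name and the assumption that the one-tuple comes from struct.unpack() are illustrative and not taken from the project code.

```python
import os
import struct


def read_len_prefixed(fd):
    """Read a 4-byte little-endian length, then that many payload bytes."""
    raw = os.read(fd, 4)
    # struct.unpack() returns a tuple even for a single field, so unpack
    # the one element instead of handing the tuple to os.read().
    (length,) = struct.unpack("<I", raw)
    return os.read(fd, length)
```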
Thomas Jarosch [Mon, 27 Jan 2020 17:21:50 +0000]
Document source of the AES GCM size limit
Also verified our bit left shift operations match the numbers.
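For reference, a sketch of the arithmetic behind such a limit. The figure used here is the per-invocation plaintext bound from NIST SP 800-38D (2^39 - 256 bits). The commit does not state whether the project's constant is exactly this value, so the names below are illustrative.

```python
# Maximum GCM plaintext length per invocation (NIST SP 800-38D), in bits.
GCM_MAX_BITS = (1 << 39) - 256

# The same bound in bytes, written once by division and once as a shift.
GCM_MAX_BYTES = GCM_MAX_BITS // 8
GCM_MAX_BYTES_SHIFT = (1 << 36) - 32

# Both spellings must agree, otherwise the shift is wrong.
assert GCM_MAX_BYTES == GCM_MAX_BYTES_SHIFT
print(GCM_MAX_BYTES)
```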
Thomas Jarosch [Mon, 27 Jan 2020 16:51:43 +0000]
Mention sibling UT for the test_crypto_aes_gcm_enc_length_cap() test
Thomas Jarosch [Mon, 27 Jan 2020 16:27:35 +0000]
Fix up docstring of test_create_backup_index_max_file_length()
Thomas Jarosch [Mon, 27 Jan 2020 16:05:18 +0000]
Extend unit test for larger than GCM encryption size
Add a few corner cases in addition to the existing test:
* MAX_SIZE-1
* MAX_SIZE
* 2*MAX_SIZE + 1
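The corner cases above can be sketched as follows, assuming that a payload larger than the cap is split into several objects of at most MAX_SIZE bytes each. The helper and the tiny MAX_SIZE are hypothetical and only serve the illustration.

```python
MAX_SIZE = 16  # illustrative stand-in, far smaller than any real cap


def objects_needed(size, max_size=MAX_SIZE):
    """Number of objects required if each holds at most max_size bytes."""
    return -(-size // max_size)  # ceiling division without floats


for size in (MAX_SIZE - 1, MAX_SIZE, 2 * MAX_SIZE + 1):
    print(size, objects_needed(size))
```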
Thomas Jarosch [Mon, 27 Jan 2020 15:56:53 +0000]
Fix indentation of 'else' block
Right now this affects the debug output only.
Thomas Jarosch [Mon, 27 Jan 2020 15:21:27 +0000]
Fix up check_consecutive_iv()
It was using the wrong variable to store the previously used IV.
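A sketch of what such a check might look like. The IV layout (8-byte fixed part followed by a 4-byte little-endian counter), the class name and the exception type are assumptions, not the project's implementation.

```python
import struct


class IVChecker:
    """Reject IVs whose counter does not follow the previously seen one."""

    def __init__(self):
        self.last_iv = None  # (fixed_part, counter) of the previous IV

    def check(self, iv):
        fixed, cnt = struct.unpack("<8sI", iv)
        if self.last_iv is not None:
            last_fixed, last_cnt = self.last_iv
            if fixed == last_fixed and cnt != last_cnt + 1:
                raise ValueError("non-consecutive IV counter %d after %d"
                                 % (cnt, last_cnt))
        # Remember the IV just checked; storing it in the wrong variable
        # would silently disable the comparison above.
        self.last_iv = (fixed, cnt)
```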
Thomas Jarosch [Mon, 27 Jan 2020 15:20:24 +0000]
Clarify two functions are meant to be used by disaster recovery
Thomas Jarosch [Mon, 27 Jan 2020 15:09:03 +0000]
Remove upstream GCM issue that was resolved
Thomas Jarosch [Mon, 27 Jan 2020 15:07:25 +0000]
Remove outdated author information
Thomas Jarosch [Mon, 27 Jan 2020 15:04:16 +0000]
Change unit tests to check for expected exception
It would be an error if the expected exception is not thrown
e.g. for re-use of IVs.
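A minimal sketch of this testing pattern with unittest. DuplicateIV and use_iv() are hypothetical stand-ins for the code under test.

```python
import unittest


class DuplicateIV(Exception):
    """Hypothetical exception raised when an IV is used twice."""


def use_iv(seen, iv):
    """Record an IV and refuse to accept it a second time."""
    if iv in seen:
        raise DuplicateIV(iv.hex())
    seen.add(iv)


class IVReuseTest(unittest.TestCase):
    def test_reuse_is_rejected(self):
        seen = set()
        use_iv(seen, b"\x00" * 12)
        # assertRaises fails the test if no exception is raised,
        # so a silently accepted duplicate counts as an error.
        with self.assertRaises(DuplicateIV):
            use_iv(seen, b"\x00" * 12)
```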
Thomas Jarosch [Sat, 25 Jan 2020 18:38:50 +0000]
Fix position of [-c] handler in usage
It won't work the other way around.
Thomas Jarosch [Sat, 25 Jan 2020 18:25:06 +0000]
Fix wrong script name in usage help
Thomas Jarosch [Sat, 25 Jan 2020 18:08:49 +0000]
Increase version to 1.6
Thomas Jarosch [Sat, 25 Jan 2020 17:48:29 +0000]
Fix use of wrong 'i' variable name
pylint complained:
tarfile.py:3797:38: E0602: Undefined variable 'i' (undefined-variable)
Code was introduced like this in
commit
27ee4dd4df48340541317123f5df348056c235ca
Author: Philipp Gesang <philipp.gesang@intra2net.com>
AuthorDate: Tue Aug 29 12:00:54 2017 +0200
implement volume handling for rescue mode
When reconstructing the index, traverse backup volumes and set
the “volume” member on the objects appropriately.
-> I guess 'i' was renamed to nvol for better readability.
Thomas Jarosch [Sat, 25 Jan 2020 17:40:53 +0000]
Fix errno.ESPIPE error handler
pylint complained:
crypto.py:1632:28: E1101: Module 'os' has no 'errno' member (no-member)
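A sketch of the corrected pattern: the constant is taken from the errno module rather than from os. The helper name is illustrative.

```python
import errno
import os


def is_seekable(fd):
    """Return False for pipes and similar descriptors, True otherwise."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)
    except OSError as exn:
        # ESPIPE lives in the errno module, not in os.
        if exn.errno == errno.ESPIPE:
            return False
        raise
    return True
```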
Thomas Jarosch [Sat, 25 Jan 2020 17:37:59 +0000]
Add missing iv fixed part in error output
pylint complained:
crypto.py:1357:36: E1306: Not enough arguments for format string (too-few-format-args)
Thomas Jarosch [Sat, 25 Jan 2020 17:30:58 +0000]
Fix 'pw' variable name in error handling
pylint complained:
crypto.py:840:40: E0602: Undefined variable 'password' (undefined-variable)
Thomas Jarosch [Tue, 3 Apr 2018 05:50:48 +0000]
Documentation improvements
Philipp Gesang [Thu, 31 Aug 2017 12:30:03 +0000]
remove development tweak for test runner
This essentially reverts
commit
406e0fa86d97f912b50689d6b080c2aee69eef86
Author: Philipp Gesang <philipp.gesang@intra2net.com>
Date: Tue Apr 18 11:59:02 2017 +0200
allow selecting individual tests with runtests.py
Tests and files for all imported test classes may still take some
moments but we don’t need to skip them any longer.
Philipp Gesang [Thu, 31 Aug 2017 11:37:10 +0000]
handle problems with incomplete gzip headers
Throw the appropriate exn to signal EOF or malformed data
conditions when tentatively parsing GZip headers.
Philipp Gesang [Thu, 31 Aug 2017 09:41:50 +0000]
add tests for truncated files
Philipp Gesang [Thu, 31 Aug 2017 08:57:17 +0000]
describe corruption mechanisms and their function in testing
Philipp Gesang [Wed, 30 Aug 2017 07:47:16 +0000]
skip some unittests on older python versions
Philipp Gesang [Tue, 29 Aug 2017 15:19:04 +0000]
test multivol index reconstruct with hole and header corruption
Philipp Gesang [Tue, 29 Aug 2017 14:59:14 +0000]
extend index reconstruct tests for multivol backups
Philipp Gesang [Tue, 29 Aug 2017 12:45:31 +0000]
lift block alignment requirement for tar archive rescue
It is unlikely that damaged archives have correctly aligned tar
headers. Thus we need to check each header-like section whether
it contains the right magic and the checksum matches. Objects
without a correct checksum (which spans the better part of the
header) are discarded similar to what file(1) does.
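The check described above can be sketched as follows. The field offsets follow the ustar/GNU header layout. The function names are illustrative and the project's scanner may differ in detail.

```python
GNU_MAGIC = b"ustar  \0"   # GNU tar magic plus version, at offset 257
BLOCKSIZE = 512


def looks_like_gnu_header(block):
    """True if block carries the GNU magic and a matching checksum."""
    if len(block) != BLOCKSIZE or block[257:265] != GNU_MAGIC:
        return False
    try:
        stored = int(block[148:156].strip(b"\0 "), 8)
    except ValueError:
        return False
    # The checksum covers the whole header, with the checksum field
    # itself counted as eight spaces.
    computed = sum(block[:148]) + 8 * 0x20 + sum(block[156:])
    return stored == computed


def find_headers(data):
    """Yield every offset, aligned or not, where a valid header starts."""
    pos = data.find(GNU_MAGIC)
    while pos != -1:
        off = pos - 257
        if off >= 0 and looks_like_gnu_header(data[off:off + BLOCKSIZE]):
            yield off
        pos = data.find(GNU_MAGIC, pos + 1)
```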
Philipp Gesang [Tue, 29 Aug 2017 10:00:54 +0000]
implement volume handling for rescue mode
When reconstructing the index, traverse backup volumes and set
the “volume” member on the objects appropriately.
Philipp Gesang [Tue, 29 Aug 2017 08:39:22 +0000]
implement leading garbage test
Implemented a rescue test since offsets won’t match in this
scenario.
Philipp Gesang [Tue, 29 Aug 2017 07:53:46 +0000]
include description of object validation with crypto.py scan mode
Example output for a second object with a corrupt byte in the
size field:
PDT: obj 1: read payload @64
PDT: · version = 1 : 0100
PDT: · paramversion = 1 : 0100
PDT: · nacl : 1dc1 154a 5405 ef5e df81 173f 2821 7a0c
PDT: · iv : 7cae 452a a05b 5182 0300 0000
PDT: · ctsize = 230 : e600 0000 0000 0000
PDT: · tag : 42c0 8774 3309 88eb 0e1a 71dc 8fd9 80c1
PDT: 0 → ✓ valid object 64–294
PDT: 294 → EOF inside object (358≤5312627≤
1095216701872); adjusting size to 5312269
PDT: obj 2: read payload @358
PDT: · version = 1 : 0100
PDT: · paramversion = 1 : 0100
PDT: · nacl : 1dc1 154a 5405 ef5e df81 173f 2821 7a0c
PDT: · iv : 7cae 452a a05b 5182 0400 0000
PDT: · ctsize = 5312269 : 0d0f 5100 0000 0000
PDT: · tag : 5946 dbcf 41b9 ac7e 4729 9e09 46c7 3388
PDT: GCM tag mismatch for object 358–5312627
PDT: 294 → × fishy object 358–5312627, corrupt header
Philipp Gesang [Mon, 28 Aug 2017 15:59:42 +0000]
add unit test for borked ciphertext size
Philipp Gesang [Mon, 28 Aug 2017 15:56:33 +0000]
allow for detecting overlapping objects with tarfile
Philipp Gesang [Mon, 28 Aug 2017 13:29:41 +0000]
detect overlapping objects
The CLI will run one additional pass to determine whether objects
overlap one another. Overlap might indicate bad headers or gaps
in the file (object offsets shifted).
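A sketch of such an overlap pass, assuming objects are available as (start, end) offset pairs with an exclusive end. This representation is an assumption for illustration only.

```python
def find_overlaps(objects):
    """Return pairs of (earlier, later) ranges that overlap."""
    overlaps = []
    prev = None
    for cur in sorted(objects):
        if prev is not None and cur[0] < prev[1]:
            overlaps.append((prev, cur))
        # Keep comparing against the range that reaches furthest.
        if prev is None or cur[1] > prev[1]:
            prev = cur
    return overlaps
```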
Philipp Gesang [Mon, 28 Aug 2017 07:45:44 +0000]
adjust post-conditions for GZ[,AES]/rescue unit test
Philipp Gesang [Mon, 28 Aug 2017 07:17:41 +0000]
make crypto.py CLI accept hex-encoded keys again
Also handle decoding of those keys at the same level as base64
encoded ones.
Philipp Gesang [Fri, 25 Aug 2017 15:41:21 +0000]
use real new volume handler during rescue
With the dummy we end up with a nil object instead of a tarinfo
at the end of the volume. Reinstating the actual handler is
harmless and produces a valid info object again.
Philipp Gesang [Fri, 25 Aug 2017 15:03:37 +0000]
prevent tarobject iteration in disaster mode
The tarfile iterator relies on the header data to determine the
next object offset which may be wrong for corrupt files. Instead,
skip that iteration step and completely rely on the object
offsets determined during index rebuild.
Philipp Gesang [Fri, 25 Aug 2017 12:23:17 +0000]
implement tolerant GNU tar header parser
When skimming a file for tar objects, only consider the GNU
header magic and whether the blocks are aligned.
Philipp Gesang [Fri, 25 Aug 2017 09:12:39 +0000]
add restore helper handling for reconstructed indices
Philipp Gesang [Fri, 25 Aug 2017 08:27:39 +0000]
add iterator mode for reconstructed index
Philipp Gesang [Fri, 25 Aug 2017 07:33:49 +0000]
unify construction of secret values
Philipp Gesang [Thu, 24 Aug 2017 15:24:36 +0000]
implement tolerant gz header parser
Since they assume a stream object, we cannot rely on the original
tarfile GZ handling. Add a “tolerant” one according to the format
spec that notices malformed or unexpected (in Deltatar context)
values, but glosses over them if they do not necessarily impact
the readability of the object.
Also use the new symbolic constants in the existing GZ reader
instead of magic numbers.
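A sketch of a tolerant header parser following the gzip format (RFC 1952). Names, return value and the choice of what counts as a warning versus an error are assumptions, not the project's implementation.

```python
import struct

GZ_MAGIC = b"\x1f\x8b"
GZ_METHOD_DEFLATE = 8
GZ_FLAG_FHCRC = 0x02
GZ_FLAG_FEXTRA = 0x04
GZ_FLAG_FNAME = 0x08
GZ_FLAG_FCOMMENT = 0x10
GZ_FLAGS_RESERVED = 0xe0


def parse_gz_header(buf):
    """Return (header_length, warnings) for a gzip member in buf."""
    if len(buf) < 10:
        raise EOFError("incomplete gzip header")
    magic, method, flags, _mtime, _xfl, _os = struct.unpack("<2sBBIBB", buf[:10])
    if magic != GZ_MAGIC:
        raise ValueError("bad gzip magic")
    if method != GZ_METHOD_DEFLATE:
        raise ValueError("unsupported compression method %d" % method)
    warnings = []
    if flags & GZ_FLAGS_RESERVED:
        # Unexpected, but it does not prevent reading the payload.
        warnings.append("reserved flag bits set")
    pos = 10
    if flags & GZ_FLAG_FEXTRA:
        if len(buf) < pos + 2:
            raise EOFError("truncated extra field")
        (xlen,) = struct.unpack_from("<H", buf, pos)
        pos += 2 + xlen
    for flag in (GZ_FLAG_FNAME, GZ_FLAG_FCOMMENT):
        if flags & flag:
            end = buf.find(b"\0", pos)  # zero-terminated string
            if end == -1:
                raise EOFError("unterminated string in header")
            pos = end + 1
    if flags & GZ_FLAG_FHCRC:
        pos += 2
    if pos > len(buf):
        raise EOFError("header extends past end of data")
    return pos, warnings
```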
Philipp Gesang [Thu, 24 Aug 2017 14:48:32 +0000]
ignore GCM tag mismatch in scan mode
Header info is assumed unreliable during rescue so a tag mismatch
must not result in a bad object.
Philipp Gesang [Thu, 24 Aug 2017 11:21:30 +0000]
convert TarInfo to index format
Philipp Gesang [Thu, 24 Aug 2017 09:57:53 +0000]
read tar objects at predetermined offsets for rescue index
Leverage the tarobj to read the object headers at the determined
offsets. Currently only implemented for encrypted archives whose
offsets are located with *crypto.py*.
Philipp Gesang [Thu, 24 Aug 2017 09:56:14 +0000]
add test skeleton for corrupt index reconstruction
Starting with an intact backup set.
Philipp Gesang [Wed, 23 Aug 2017 08:49:36 +0000]
draft rescue mode through all layers
The strategy is for rescue mode to reconstruct the relevant [*]
information from the index by inspecting the passed tar object,
then continue from there. On the crypto side, this boils down to
a streamlined (and silent) version of the “scan” mode. The
tarfile side is still WIP.
[*] Omitting the useless parts like inode number.
Philipp Gesang [Tue, 22 Aug 2017 15:17:04 +0000]
derive test skeleton for disaster rescue mode
Philipp Gesang [Tue, 22 Aug 2017 15:06:41 +0000]
implement dump mode for tolerant decryption
Utilize the safe dirfd based implementation from split mode to
write extracted objects to a target directory.
Philipp Gesang [Tue, 22 Aug 2017 13:30:15 +0000]
extend tarfile API for rescue mode
Philipp Gesang [Tue, 22 Aug 2017 11:29:45 +0000]
implement decryption for tolerant mode
Not possible to reuse the existing CLI decryption since we’re
operating with fds in scan mode.
Philipp Gesang [Tue, 22 Aug 2017 09:59:35 +0000]
attempt to process candidate objects in scan mode
Philipp Gesang [Tue, 22 Aug 2017 08:25:21 +0000]
print list of header candidates
Philipp Gesang [Tue, 15 Aug 2017 15:37:12 +0000]
implement PDTCRYPT header scanning
First phase: collect all possible header start locations.
Adds a CLI subcommand “scan” to crypto.py for analyzing files.
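The first phase can be sketched as a plain search for the header magic. The magic value b"PDTCRYPT" is an assumption based on the format name, and real files would likely be scanned via a file descriptor or mmap rather than a bytes object.

```python
def find_header_candidates(data, magic=b"PDTCRYPT"):
    """Yield every offset at which the magic occurs in data."""
    pos = data.find(magic)
    while pos != -1:
        yield pos
        # Continue one byte further so no position is skipped.
        pos = data.find(magic, pos + 1)
```

Every hit is only a candidate: the magic may also occur by chance inside a payload, which is why later phases have to validate each location.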
Philipp Gesang [Tue, 15 Aug 2017 14:54:30 +0000]
test corruption by tearing a hole in a volume
Philipp Gesang [Tue, 15 Aug 2017 13:38:17 +0000]
add test corrupting an entire volume
Zero out the first volume: None of the content can be restored.
This includes the file extending from the first into the second
volume.
Philipp Gesang [Tue, 15 Aug 2017 12:42:35 +0000]
use symbolic constant for errno
Philipp Gesang [Tue, 15 Aug 2017 09:14:01 +0000]
clarify index read failure
Instead of erroring out with an exception, make --restore emit an
error message indicating that something is wrong with the index.
Philipp Gesang [Tue, 15 Aug 2017 08:31:15 +0000]
do not discard valid data in buffers when in tolerant mode
Both decryption and decompression will fail on the first error
and ignore any results of earlier passes. In normal operation,
the hard failures are desirable to indicate a bad backup set.
However, in tolerant / recovery mode the error handling is closer
to the opposite extreme: we want to retrieve every last byte that
made it through the various layers and only skip over the parts
that cannot be interpreted at all.
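A sketch of the tolerant behavior for the decompression layer only, using zlib from the standard library. The function name and chunk size are illustrative. Bytes emitted between the point of corruption and the point where the error is detected may themselves be wrong, so the result needs a separate integrity check.

```python
import zlib


def decompress_tolerant(data, chunk_size=512):
    """Decompress as far as possible, keeping output produced before an error."""
    dec = zlib.decompressobj()
    out = bytearray()
    for off in range(0, len(data), chunk_size):
        try:
            out += dec.decompress(data[off:off + chunk_size])
        except zlib.error:
            # Normal mode would propagate this; tolerant mode keeps
            # whatever earlier chunks already yielded.
            break
    return bytes(out)
```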
Philipp Gesang [Mon, 14 Aug 2017 15:24:56 +0000]
catch bad parameter version in header
Philipp Gesang [Mon, 14 Aug 2017 14:04:53 +0000]
reject bad index files with a meaningful error
Philipp Gesang [Mon, 14 Aug 2017 13:10:44 +0000]
add brief description of disaster recovery
Philipp Gesang [Mon, 14 Aug 2017 12:14:16 +0000]
fail with info message if recovery is asked with source path
Philipp Gesang [Mon, 14 Aug 2017 10:09:28 +0000]
allow for numbers of missing and failed files to differ in recovery test
Philipp Gesang [Mon, 14 Aug 2017 09:54:17 +0000]
adjust the expectations about checksum mismatches with non-authenticated recover modes
Philipp Gesang [Mon, 14 Aug 2017 09:35:18 +0000]
use index iterator to accommodate multivol extraction
For reasons unknown, the “tar path iterator” always terminates
after the last element of the first volume. In fact, it does so
even for multi volume archives if the last object in the first
volume extends into the second volume. In this case, the object
is completely extracted but extraction terminates.
Philipp Gesang [Fri, 11 Aug 2017 14:41:51 +0000]
use random data in multivol tests
Brute force incompressibility to prevent gzip from invalidating
our multivolume tests.
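A small illustration of why random input helps, presumably because compressible test data could shrink below the volume size: gzip cannot shrink random bytes, while repetitive data collapses. Sizes are arbitrary.

```python
import gzip
import os

SIZE = 1 << 16

random_payload = os.urandom(SIZE)
repetitive_payload = b"A" * SIZE

# Random bytes stay at least as large after compression, so the
# archive size remains predictable for volume splitting.
random_len = len(gzip.compress(random_payload))
repetitive_len = len(gzip.compress(repetitive_payload))
print(random_len >= SIZE, repetitive_len < SIZE)
```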
Philipp Gesang [Fri, 11 Aug 2017 13:45:50 +0000]
give each recovery test a multivol companion
This derives single- and multivolume versions of the tests.
Multiple volumes are generated by stretching the input file count
and size.
Philipp Gesang [Fri, 11 Aug 2017 12:16:56 +0000]
work around false positives in deltatar fs checks during rpmbuild
These only happen when running in rpmbuild, otherwise the tests
are fine. Of course, on RHBT the choir resoundeth “thou shalt not
run thine rpm build as root” but that’s not really an option
here.
Philipp Gesang [Fri, 11 Aug 2017 09:50:55 +0000]
catch incomplete trailing header in tolerant recovery
This makes decryption in recovery mode resistant against
malformed trailing data which would otherwise error out for the
entire buffered chunk on account of a decryption failure.
Philipp Gesang [Fri, 11 Aug 2017 09:39:42 +0000]
test recovery behavior with trailing data
Philipp Gesang [Fri, 11 Aug 2017 09:16:33 +0000]
track successful recovery of corrupted payload in tests
Gzip does CRC32, GCM has a MAC, but ordinary Tar only checksums
the header part, not the content. Thus recovery of a damaged
object will appear to succeed provided the object header is
intact. In order to detect the corruption, an external integrity
check is necessary.
Philipp Gesang [Fri, 11 Aug 2017 08:53:09 +0000]
add recover tests for completely damaged headers
Philipp Gesang [Fri, 11 Aug 2017 08:25:12 +0000]
sync tarfile stream diligently when writing new objects
Turns out all the offsets written to the index when neither
encrypting nor compressing were, well, … off. In fact they would
only be updated at tar block boundaries due to buffering. Since
“last_block_offset” record keeping blatantly violates layering
boundaries, it would only work reliably with the concat
compression and encryption modes that do the same.
Sync when adding a new object so we get the accurate offset
value. Voilà, recovery now works with uncompressed and
unencrypted archives as well.
Philipp Gesang [Thu, 10 Aug 2017 15:01:42 +0000]
add header corruption tests
We hit them where it hurts:
* for compressed backups, flip a bit in the magic;
* for encrypted backups, flip a bit in the tag.
In either case, normal restore must fail, and disaster recovery
will be incomplete.
Philipp Gesang [Thu, 10 Aug 2017 13:32:16 +0000]
add test for corruption of encrypted files
Philipp Gesang [Thu, 10 Aug 2017 12:39:40 +0000]
track irrecoverable files in test_recover
Philipp Gesang [Thu, 10 Aug 2017 11:06:30 +0000]
prefer index iterator for recovery
Philipp Gesang [Thu, 10 Aug 2017 09:38:39 +0000]
properly damage gzip files for recover test
Ensure we are flipping bits in the compressed payload, not in the
mostly useless header. Requires some extra parsing to determine
the header length.
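A sketch of damaging the payload rather than the header. The helper names are hypothetical, and header_len is assumed to have been determined by parsing the gzip header beforehand.

```python
def flip_bit(data, byte_off, bit=0):
    """Return a copy of data with one bit inverted at byte_off."""
    buf = bytearray(data)
    buf[byte_off] ^= 1 << bit
    return bytes(buf)


def corrupt_payload(data, header_len, payload_off=0, bit=0):
    """Flip a bit at payload_off bytes past the end of the header."""
    return flip_bit(data, header_len + payload_off, bit)
```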
Philipp Gesang [Thu, 10 Aug 2017 08:34:15 +0000]
add bit flip helper for recover tests
Philipp Gesang [Thu, 10 Aug 2017 08:13:18 +0000]
fix misleading docstrings for index file hook
Philipp Gesang [Thu, 10 Aug 2017 07:37:08 +0000]
lay out skeleton for disaster recovery tests
New series of tests for corrupting backup sets and restoring them
incompletely (“tolerant” or “disaster recovery” mode).
Philipp Gesang [Tue, 8 Aug 2017 11:58:20 +0000]
draft disaster recovery mode for deltatar
The first stage recovery assumes the index is intact and all
objects are at their expected position. In this scenario, an
attempt is made to extract each object, keeping track of those
that weren’t readable and why.
Philipp Gesang [Tue, 8 Aug 2017 10:03:01 +0000]
return valid decrypted data on decryption failure
Philipp Gesang [Tue, 8 Aug 2017 08:48:31 +0000]
force tarfile reopen after bad read in deltatar
Closing the tarfile after an unreadable object was encountered
causes the stream to be reopened for the next read. Otherwise,
the corrupt object is already buffered and tarfile would continue
to seek inside the bad data.
Philipp Gesang [Tue, 8 Aug 2017 07:44:56 +0000]
distinguish invalid files from parse errors in restore
Especially with index files, the parse error is misleading.
Indicate the prevalent cause of the problem, i. e. that the
file is compressed but compression was not requested during
restore.
Philipp Gesang [Tue, 8 Aug 2017 07:14:12 +0000]
update help usage strings wrt. crypto in backup.py
Philipp Gesang [Mon, 7 Aug 2017 13:37:19 +0000]
extend crypto.py exception descriptions
Philipp Gesang [Tue, 27 Jun 2017 08:24:00 +0000]
actually default to i2n mode with crypto.py scrypt
And adapt the relevant unit test to explicitly request the full
parameters output.
Philipp Gesang [Fri, 23 Jun 2017 08:35:08 +0000]
add crypto.py option to output cnf-compatible scrypt object
Philipp Gesang [Wed, 31 May 2017 11:53:21 +0000]
support PDT encrypted archives with rescue_tar.py
Philipp Gesang [Tue, 30 May 2017 15:29:26 +0000]
adapt file_crypt.py for revised crypto
Philipp Gesang [Tue, 30 May 2017 15:10:59 +0000]
kill off old crypto implementation
The old aescrypto.py was only kept for reference but since
downstream integration is more or less complete wrt. encryption
we don’t need it any longer.
Good riddance.
Philipp Gesang [Tue, 30 May 2017 10:40:19 +0000]
allow passing salt to crypto.py on the command line
Nifty shortcut for hashing without a corresponding pdtcrypt file.
Philipp Gesang [Tue, 30 May 2017 09:23:57 +0000]
properly align usage message of crypto.py
Philipp Gesang [Tue, 23 May 2017 12:55:10 +0000]
improve bad CLI argument handling of crypto.py
Philipp Gesang [Mon, 22 May 2017 12:10:33 +0000]
include header version info in scrypt handler
Philipp Gesang [Fri, 19 May 2017 15:22:17 +0000]
accept crypto format version in deltatar ctor