Philipp Gesang [Thu, 10 Aug 2017 08:13:18 +0000]
fix misleading docstrings for index file hook
Philipp Gesang [Thu, 10 Aug 2017 07:37:08 +0000]
lay out skeleton for disaster recovery tests
New series of tests for corrupting backup sets and restoring them
incompletely (“tolerant” or “disaster recovery” mode).
Philipp Gesang [Tue, 8 Aug 2017 11:58:20 +0000]
draft disaster recovery mode for deltatar
The first stage recovery assumes the index is intact and all
objects are at their expected position. In this scenario, an
attempt is made to extract each object, keeping track of those
that weren’t readable and why.
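The first-stage loop described above can be sketched roughly as follows. This is a minimal illustration, not the deltatar code: `recover_all`, the `extract_one` callback and the shape of the index entries are hypothetical.

```python
def recover_all(index_entries, extract_one):
    """Try to extract every object; collect failures instead of aborting.

    index_entries -- iterable of object descriptors (hypothetical shape)
    extract_one   -- callable extracting a single object, raising on error
    """
    failed = []  # (entry, reason) for each object that wasn't readable
    for entry in index_entries:
        try:
            extract_one(entry)
        except Exception as exn:  # tolerant mode: record and carry on
            failed.append((entry, str(exn)))
    return failed
```

The caller gets the full list of unreadable objects at the end rather than stopping at the first corrupt one.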
Philipp Gesang [Tue, 8 Aug 2017 10:03:01 +0000]
return valid decrypted data on decryption failure
Philipp Gesang [Tue, 8 Aug 2017 08:48:31 +0000]
force tarfile reopen after bad read in deltatar
Closing the tarfile after an unreadable object was encountered
causes the stream to be reopened for the next read. Otherwise,
the corrupt object is already buffered and tarfile would continue
to seek inside the bad data.
Philipp Gesang [Tue, 8 Aug 2017 07:44:56 +0000]
distinguish invalid files from parse errors in restore
Especially with index files, the parse error is misleading.
Indicate the prevalent cause of the problem, i. e. that the
file is compressed but compression was not requested during
restore.
Philipp Gesang [Tue, 8 Aug 2017 07:14:12 +0000]
update help usage strings wrt. crypto in backup.py
Philipp Gesang [Mon, 7 Aug 2017 13:37:19 +0000]
extend crypto.py exception descriptions
Philipp Gesang [Tue, 27 Jun 2017 08:24:00 +0000]
actually default to i2n mode with crypto.py scrypt
And adapt the relevant unit test to explicitly request the full
parameters output.
Philipp Gesang [Fri, 23 Jun 2017 08:35:08 +0000]
add crypto.py option to output cnf-compatible scrypt object
Philipp Gesang [Wed, 31 May 2017 11:53:21 +0000]
support PDT encrypted archives with rescue_tar.py
Philipp Gesang [Tue, 30 May 2017 15:29:26 +0000]
adapt file_crypt.py for revised crypto
Philipp Gesang [Tue, 30 May 2017 15:10:59 +0000]
kill off old crypto implementation
The old aescrypto.py was only kept for reference but since
downstream integration is more or less complete wrt. encryption
we don’t need it any longer.
Good riddance.
Philipp Gesang [Tue, 30 May 2017 10:40:19 +0000]
allow passing salt to crypto.py on the command line
Nifty shortcut for hashing without a corresponding pdtcrypt file.
Philipp Gesang [Tue, 30 May 2017 09:23:57 +0000]
properly align usage message of crypto.py
Philipp Gesang [Tue, 23 May 2017 12:55:10 +0000]
improve bad CLI argument handling of crypto.py
Philipp Gesang [Mon, 22 May 2017 12:10:33 +0000]
include header version info in scrypt handler
Philipp Gesang [Fri, 19 May 2017 15:22:17 +0000]
accept crypto format version in deltatar ctor
Philipp Gesang [Fri, 19 May 2017 09:16:10 +0000]
add unit test for CLI scrypt hashing
Philipp Gesang [Thu, 18 May 2017 15:47:25 +0000]
allow passing keys directly to CLI crypto.py
Keys may now be passed as command line argument or environment
variable.
The only valid format is 16 bytes in hexadecimal.
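A sketch of the kind of check this implies, assuming the key arrives as a hex string. The helper name `parse_key` is hypothetical and not taken from crypto.py.

```python
KEY_SIZE = 16  # bytes, as stated in the commit message

def parse_key(hexstr):
    """Decode a hex string into a 16-byte key or raise ValueError."""
    try:
        key = bytes.fromhex(hexstr)
    except ValueError:
        raise ValueError("key is not valid hexadecimal") from None
    if len(key) != KEY_SIZE:
        raise ValueError("key must be %d bytes, got %d" % (KEY_SIZE, len(key)))
    return key
```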
Philipp Gesang [Thu, 18 May 2017 11:44:07 +0000]
grab password from envp if not supplied on CLI
In order to avoid the password showing up in full in the process
table, pass it in the environment instead. Uses the environment
variable PDTCRYPT_PASSWORD with both crypto.py and backup.py.
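The lookup order might look like the sketch below. Only the variable name PDTCRYPT_PASSWORD comes from the commit; `get_password` and its arguments are hypothetical.

```python
import os

ENV_PASSWORD = "PDTCRYPT_PASSWORD"  # name taken from the commit message

def get_password(cli_value=None, environ=os.environ):
    """Prefer an explicit CLI value, else fall back to the environment."""
    if cli_value is not None:
        return cli_value
    return environ.get(ENV_PASSWORD)  # None if the variable is unset
```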
Philipp Gesang [Tue, 16 May 2017 11:37:43 +0000]
default to index mode of deltatar object when choosing extension
For external use.
Philipp Gesang [Tue, 16 May 2017 08:57:01 +0000]
handle bad randomness during IV creation
Since IVs must be unique we rely on /dev/urandom to yield a
different sequence of bytes when requesting a new fixed part.
In the unlikely event that a new fixed part has already been
used earlier, repeat the attempt a number of times.
Abort if no unique IV could be generated this way since it
most likely indicates a faulty RNG.
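A minimal sketch of the retry logic, under assumptions: the size of the fixed part, the retry limit and the name `new_fixed_part` are chosen for illustration and may differ from crypto.py.

```python
import os

FIXED_PART_SIZE = 8  # bytes; assumed for illustration
MAX_ATTEMPTS = 10    # assumed retry limit

def new_fixed_part(used, rand=os.urandom):
    """Return a fixed part not in `used`, recording it; abort on bad RNG."""
    for _ in range(MAX_ATTEMPTS):
        candidate = rand(FIXED_PART_SIZE)
        if candidate not in used:
            used.add(candidate)
            return candidate
    # Repeated collisions are far more likely a broken RNG than bad luck.
    raise RuntimeError("no unique IV fixed part after %d attempts"
                       % MAX_ATTEMPTS)
```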
Philipp Gesang [Mon, 15 May 2017 15:44:48 +0000]
extend crypto.py documentation
Philipp Gesang [Thu, 11 May 2017 15:40:21 +0000]
distinguish auxiliary file errors
Auxiliary files that grow larger than the maximum defined
encrypted file size cause an irrecoverable error because their
fixed IV is being reused. Add a new exception to distinguish this
specific case. Encrypted auxiliary files thus never consist of
more than one object, no on-the-fly continuation is permitted
like with ordinary files.
Philipp Gesang [Thu, 11 May 2017 09:28:08 +0000]
adapt unit tests for crypto.py subcommands
Philipp Gesang [Thu, 11 May 2017 08:50:30 +0000]
export scrypt hashing functionality
Philipp Gesang [Thu, 11 May 2017 08:35:40 +0000]
add SCRYPT hashing mode to crypto.py
Add a subcommand “scrypt” to crypto.py in CLI mode. Example:
    $ python3 ./deltatar/crypto.py scrypt foo -i - -o pwd \
        <backup_dir/bfull-2017-05-11-0919-001.tar.pdtcrypt
    {"scrypt_params": {"p": 1, "N": 65536, "dkLen": 16, "r": 8},
     "salt": "b'fbdbaa9890ae243eb16391199c9243f6'",
     "hash": "b'1e7d7a78b9300d461779e9c80e4a15ac'"}
The output “hash” is calculated from the salt in the first
header found in the given archive and the password specified.
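For reference, the same parameters can be fed to the standard library's `hashlib.scrypt`. This is a sketch, not the crypto.py implementation; `scrypt_hash` is a hypothetical helper, and whether it reproduces the hash shown above depends on how crypto.py encodes the password, which is not verified here.

```python
import hashlib

def scrypt_hash(password, salt, n=65536, r=8, p=1, dklen=16):
    """Derive a key with scrypt; defaults mirror the parameters above."""
    # scrypt needs roughly 128 * n * r bytes; allow some headroom, since
    # OpenSSL's default memory limit is too low for N=65536, r=8.
    maxmem = 128 * n * r * 2
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p,
                          maxmem=maxmem, dklen=dklen)

salt = bytes.fromhex("fbdbaa9890ae243eb16391199c9243f6")
print(scrypt_hash(b"foo", salt).hex())
```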
Philipp Gesang [Tue, 9 May 2017 13:42:17 +0000]
gracefully handle GCM data length limit
Philipp Gesang [Tue, 9 May 2017 08:59:28 +0000]
unit test crypto file counter wraparound
After the file counter reaches UINT_MAX, it wraps around and a
new fixed part must be created.
The file counter is a 32-bit unsigned integer so it needs to be
lowered to make bounds testing feasible.
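The idea of lowering the limit for testing can be sketched like this. `FileCounter`, its start value and the return convention are hypothetical; only the wraparound behaviour is taken from the description above.

```python
class FileCounter:
    """Counter that wraps at `limit` and reports when it did so."""

    def __init__(self, limit=0xFFFFFFFF, start=1):
        self.limit = limit  # UINT_MAX by default; lower it in tests
        self.start = start  # assumed first counter value
        self.value = start

    def next(self):
        """Return (counter, rotated); rotated means a new fixed part
        is required because the counter wrapped around."""
        rotated = False
        if self.value > self.limit:
            self.value = self.start
            rotated = True
        current = self.value
        self.value += 1
        return current, rotated
```

With `limit=3` the wraparound is hit on the fourth call instead of after four billion objects.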
Philipp Gesang [Tue, 9 May 2017 08:22:43 +0000]
extend strict iv tracking to encryption
This is just an extra soundness check to prevent accidental reuse
of IVs when handled incorrectly (same initial counters passed
twice to the same context). In normal usage this case cannot
happen.
Philipp Gesang [Mon, 8 May 2017 14:27:13 +0000]
expand crypto api to accept precomputed key
Philipp Gesang [Mon, 8 May 2017 15:13:26 +0000]
reduce noise in test_multivol_compression_sizes.py
Philipp Gesang [Mon, 8 May 2017 13:33:29 +0000]
improve iv diagnostics when decrypting
Philipp Gesang [Mon, 8 May 2017 09:26:54 +0000]
test that seeking backwards is disallowed by _Stream
Re-extracting an already decrypted file will fail on account of
IV reuse. Currently, tarfile._Stream is not capable of performing
backward seeks, so we’re good. Should this limitation be removed
in a future version, this unit test will fail.
Philipp Gesang [Mon, 8 May 2017 07:58:33 +0000]
remove pytest dependency from test_crypto.py
Philipp Gesang [Fri, 5 May 2017 15:52:51 +0000]
add unit test for IV reuse
Philipp Gesang [Fri, 5 May 2017 15:18:11 +0000]
adapt crypto unit tests to run in main suite
Philipp Gesang [Fri, 5 May 2017 14:52:45 +0000]
remove IV validation step from RestoreHelper
Since the same decryption context is carried over between the Tar
volumes of one backup set, the built-in IV uniqueness checks
suffice. Between multiple backup sets, the salt and IV fixed
parts change, so there is no occasion for conflict. The IVs of
auxiliary files are unique anyways.
Philipp Gesang [Fri, 5 May 2017 12:27:29 +0000]
adjust acceptable size window for compressed unit test data
A low bound of 330 causes the test to fail with version 1.2.3 of
zlib.
Earlier this did not occur because in concat mode, tarfile would
always write an empty zlib compressed chunk right at the
beginning of the archive and then immediately create a new one as
soon as actual input arrived. For this reason, the resulting
archive size remained within the bounds chosen in
test_multivol.py. Due to the removal of the redundancy, this is
no longer the case. The problem is masked on newer versions of
zlib (tested: 1.2.8 of fc25) which create larger compressed files
in general for the same inputs.
For the “test_compress_single” unit test, the input consists of
an archive of 61440 bytes. Compressed with level 9, window bits 31,
and a memlevel of 9, the output length is:
    version   size (B)
    1.2.3     308
    1.2.8     324
Add to that the file name in our custom header and the latter
passes 330 B whereas the former doesn’t.
A lower bound of 315 is justified.
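The measurement can be repeated with the zlib module using the parameters quoted above. A sketch only: `gzip_compress` is a hypothetical helper, the input here is a placeholder rather than the actual test archive, and the resulting size depends on the zlib version in use.

```python
import zlib

def gzip_compress(data, level=9, wbits=31, memlevel=9):
    """Compress `data` in one go; wbits=31 selects the gzip container."""
    comp = zlib.compressobj(level, zlib.DEFLATED, wbits, memlevel)
    return comp.compress(data) + comp.flush()

data = bytes(61440)  # placeholder input, not the test archive itself
print(zlib.ZLIB_RUNTIME_VERSION, len(gzip_compress(data)))
```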
Philipp Gesang [Fri, 5 May 2017 09:00:37 +0000]
reuse existing crypto context for subsequent volumes
Philipp Gesang [Fri, 5 May 2017 08:20:50 +0000]
validate exceptions being thrown from invalid tarfile.open() params
Philipp Gesang [Thu, 4 May 2017 16:06:04 +0000]
move final IV checks out of crypto context
Collect IVs while decrypting but postpone the final check for
duplicates. Reused IVs still trigger an exception during
decryption but since multiple different contexts may be active
(e. g. when handling a diff backup), the IVs they retrieved from
the headers must be compared afterwards. This test has its place
in a new function “validate” of the ``RestoreHelper`` and must be
called when decryption has been completed.
Philipp Gesang [Thu, 4 May 2017 13:12:50 +0000]
write auxiliary files whilst processing the backup
Introduce a fixed value for the index file counter to allow
encryption on the fly.
Philipp Gesang [Thu, 4 May 2017 12:24:06 +0000]
use independent decryption contexts for backup files
When restoring individual files from a diff backup, Deltatar will
traverse both tarballs simultaneously. This leads to access
patterns where reads are interleaved between the two sources,
possibly corrupting the decryption state. Thus when restoring
from multiple “index files” (in practice only two are relevant),
use a separate decryptor context for each of them.
Philipp Gesang [Tue, 2 May 2017 14:14:44 +0000]
clean up multi-index handling
WIP
Philipp Gesang [Tue, 2 May 2017 15:59:19 +0000]
properly handle encryption and compression of empty archives
The old implementation always initialized in the ctor regardless
of whether contents would be written to the archive. For empty
archives this now has to be done in ``.close()`` if no data has
been added yet.
Philipp Gesang [Tue, 2 May 2017 14:44:43 +0000]
adapt test_multivol_compression_sizes.py to revised crypto
Philipp Gesang [Tue, 2 May 2017 13:06:24 +0000]
remove redundant test
The first part of the condition always evaluates to True since
it’s the precondition to entering that branch.
Philipp Gesang [Fri, 28 Apr 2017 16:06:27 +0000]
encode operation modes
Introduce the “arcmode” to comprehensively switch modes to
supplant the pervasive ad-hoc string parsing and attribute
queries. Encodes the triple encryption, compression, concat.
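One way to encode such a triple is a set of bit flags, sketched below. The names `ArcMode` and `describe` and the bit values are hypothetical and need not match the actual “arcmode” constants.

```python
import enum

class ArcMode(enum.IntFlag):
    """Bit flags for the three independent archive properties."""
    NONE = 0
    ENCRYPT = 1
    COMPRESS = 2
    CONCAT = 4

def describe(mode):
    """List the names of the properties enabled in `mode`."""
    flags = (ArcMode.ENCRYPT, ArcMode.COMPRESS, ArcMode.CONCAT)
    return [flag.name for flag in flags if mode & flag]
```

A single integer comparison then replaces string parsing such as checking a mode string for a “#gz” suffix.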
Philipp Gesang [Fri, 28 Apr 2017 12:04:42 +0000]
cleanly perform block transition in non-concat mode
Philipp Gesang [Fri, 28 Apr 2017 08:41:27 +0000]
clarify exception-driven control flow
Distinguish the actual EOF when hit at the beginning from other
IO errors in _init_read_gz() and only catch this one where it’s
expected. Well formed archives do not end inside a header.
Philipp Gesang [Fri, 28 Apr 2017 08:28:29 +0000]
remove unused state variable
“internal_pos” which is only written to and never read was
introduced with this commit:
    commit 85737f48c38a432f2429e9e3e4b81fed164c4b9a
    Author: Eduardo Robles Elvira <edulix@wadobo.com>
    Date:   Fri Jul 5 11:50:43 2013 +0200
        extracting files in r#gz mode now works too, includes unit tests
and lacks a raison d’être ever since.
Philipp Gesang [Thu, 27 Apr 2017 15:18:31 +0000]
use append mode in symlink unit test
These tests currently fail despite using the original Gzipfile
compression path. They appear to overwrite the passed archive
instead of writing new objects.
Philipp Gesang [Thu, 27 Apr 2017 14:03:46 +0000]
fix multivol compression handling
Philipp Gesang [Tue, 25 Apr 2017 14:38:12 +0000]
handle uncompressed, encrypted archives with tarfile
Internally, tarfile.py uses “tar” to refer to uncompressed
archives, so just handle this accordingly at the API level.
Philipp Gesang [Tue, 25 Apr 2017 13:28:09 +0000]
sync on .close() for unencrypted archives
Philipp Gesang [Tue, 25 Apr 2017 12:03:36 +0000]
properly (re-) initialize compressor at archive / volume bounds
For unencrypted streams, the compressor still must be reset in
concat mode. For encrypted streams, the decompressor can be
initialized right at the start of the archive since no further
inputs are needed.
Philipp Gesang [Tue, 25 Apr 2017 09:13:17 +0000]
keep separate encryptor and decryptor contexts in deltatar.py
The same Deltatar object appears to function as handle for
reading and writing files simultaneously. To support this,
introduce two different crypto contexts that are created
on demand.
Philipp Gesang [Mon, 24 Apr 2017 13:04:38 +0000]
properly restart compression when encrypting
Separate finalization of a zlib block from creation of a new one.
Otherwise, we end up with trailing data from the last object that
lingers in the write buffer and gets flushed to the archive after
the next encrypted object has been initialized.
Also get rid of the “new_compression_block” wrapper which
needlessly complicated things.
Special precautions must be taken for the PAX format. Due to its
requirement of a global archive header, TarFile will write to the
stream prior to initialization that is performed in addfile().
Thus, initialize compression before the PAX header is being
written and properly restart compression for the first object
committed to the archive or volume.
Philipp Gesang [Mon, 24 Apr 2017 10:06:46 +0000]
use crypto.py to split test archive in test_encryption.py
This again verifies individual decryptability of objects in the
PDT archive.
Philipp Gesang [Mon, 24 Apr 2017 09:37:23 +0000]
implement passthrough mode in crypto.py
When invoked with --no-decrypt, write object headers and
ciphertext to output. Combined with --split this allows
extracting encrypted objects from the archive.
Philipp Gesang [Mon, 24 Apr 2017 09:23:53 +0000]
implement split mode for CLI encryption
Philipp Gesang [Fri, 21 Apr 2017 16:03:17 +0000]
explicitly disable gz initialization for _Streams used in aux files
The process of writing an auxiliary (index, info) file differs
drastically from that of tar archives: Since files are not added
individually, the encryption must be initialized externally and
the compression layer cannot rely on being enabled in the ctor
because, obviously, the latter is executed before the manual
encryption setup can be performed.
Extend the API of the _Stream ctor with a parameter “noinit”
to request that all initialization be postponed until the
encryption has been set up. This seems to do the trick but is
quite ugly.
Philipp Gesang [Fri, 21 Apr 2017 13:23:49 +0000]
fix decompression error handling
This seems to be a copy&paste oversight from
    commit be60ffd0598fec172eccb69f3c6a8433af6fb643
    Author: Eduardo Robles Elvira <edulix@wadobo.com>
    Date:   Mon Nov 4 08:50:55 2013 +0100
        initial port to python 3, not finished
which added the per-compression mode exceptions but not the
actual handling code.
Philipp Gesang [Fri, 21 Apr 2017 13:21:06 +0000]
fix index file encryption handling
Philipp Gesang [Fri, 21 Apr 2017 07:57:05 +0000]
convert test_deltatar to revised crypto
Philipp Gesang [Fri, 21 Apr 2017 07:39:02 +0000]
permit setting crypto.py parameter version via deltatar ctor
Introduce an optional argument to request a specific crypto
parameter version when invoking Deltatar. This isn’t of much
use ATM since only the one version is implemented, but it’s
handy for testing nonetheless.
Philipp Gesang [Fri, 21 Apr 2017 07:16:49 +0000]
eliminate the last traces of encryption “modes”
Since encryption handling has been moved outside of tarfile.py
these no longer apply. Thus remove all references so they don’t
obscure problems in the unit tests.
Philipp Gesang [Thu, 20 Apr 2017 15:53:49 +0000]
initialize compression globally for non-“concat” archives
Philipp Gesang [Thu, 20 Apr 2017 15:26:04 +0000]
remove obsolete unittests for 256 bit AES
Philipp Gesang [Thu, 20 Apr 2017 14:53:06 +0000]
pass encryption context to deltatar volume handlers
Philipp Gesang [Tue, 18 Apr 2017 13:04:07 +0000]
rework encryption unittests
Philipp Gesang [Thu, 20 Apr 2017 09:34:35 +0000]
fix compression handling on volume bounds
The old “concat compression” simply relied on the _Stream() ctor
to create a new zip block which is no longer possible since the
prerequisite encryption is only available when the first object
is committed to the archive.
Hence, reintroduce the new block initialization after
transitioning to the new volume.
Philipp Gesang [Tue, 18 Apr 2017 14:07:29 +0000]
add strict IV validation to decryption handler
Optionally (on CLI, with the “-s” flag) check for additional IV
properties:
- Accidental reuse: in GCM, the same IV used more than once
means that the plaintext is compromised.
- Unstructured archive: In the headers of a normal PDT
encrypted archive, the variable parts of the IVs are
consecutive unless the fixed part changes.
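Both checks can be sketched in a few lines. This is an illustration under assumptions: `IVChecker` is hypothetical, and the IV is modelled as a (fixed part, counter) pair rather than the raw bytes from the header.

```python
class IVChecker:
    """Track IVs seen so far and flag reuse or gaps in the counter."""

    def __init__(self):
        self.seen = set()
        self.last = None  # (fixed, counter) of the previous object

    def check(self, fixed, counter):
        if (fixed, counter) in self.seen:
            raise ValueError("IV reused: plaintext may be compromised")
        if self.last is not None:
            last_fixed, last_counter = self.last
            # Counters must be consecutive while the fixed part is stable.
            if fixed == last_fixed and counter != last_counter + 1:
                raise ValueError("non-consecutive IV counter")
        self.seen.add((fixed, counter))
        self.last = (fixed, counter)
```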
Philipp Gesang [Tue, 18 Apr 2017 09:59:02 +0000]
allow selecting individual tests with runtests.py
If arguments are passed on the command line, interpret them as
test names and attempt to compose a suite comprising only the
tests specified.
The behavior remains the same if invoked without argument.
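With the standard unittest loader this selection could look as follows. A sketch, not the contents of runtests.py: `build_suite` and the default directory name are assumptions.

```python
import sys
import unittest

def build_suite(names, default_dir="testing"):
    """Build a suite from dotted test names, or discover everything."""
    loader = unittest.defaultTestLoader
    if names:
        return loader.loadTestsFromNames(names)
    return loader.discover(default_dir)  # directory name is an assumption

if __name__ == "__main__":
    unittest.TextTestRunner().run(build_suite(sys.argv[1:]))
```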
Philipp Gesang [Mon, 10 Apr 2017 15:37:33 +0000]
improve error handling in crypto handler
Since invalid tags are some of the most important bits of
information to be passed down, make the corresponding error
message human-readable.
Philipp Gesang [Mon, 10 Apr 2017 14:32:50 +0000]
remove obsolete block size check
Philipp Gesang [Mon, 10 Apr 2017 13:08:31 +0000]
fix fallout from EOF changes in CLI decryptor
Philipp Gesang [Mon, 10 Apr 2017 13:01:56 +0000]
throw error on partial header when reading stream
Throw the EOF exception only if the stream ends exactly at an
object boundary. Otherwise, when fewer than sizeof(hdr) bytes
are returned from read(), throw InvalidHeader to indicate a
malformed file. This keeps EOF a “benign” exception.
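The distinction can be sketched as below. The header size and the `read_header` helper are hypothetical; InvalidHeader is named in the commit message, `EndOfFile` is a stand-in for the EOF exception. The sketch assumes a blocking stream whose read() only comes up short at end of input.

```python
HDR_SIZE = 48  # hypothetical header size in bytes

class EndOfFile(Exception):
    """Benign: the stream ended exactly at an object boundary."""

class InvalidHeader(Exception):
    """Malformed: the stream ended in the middle of a header."""

def read_header(stream):
    buf = stream.read(HDR_SIZE)
    if len(buf) == 0:
        raise EndOfFile()
    if len(buf) < HDR_SIZE:
        raise InvalidHeader("expected %d bytes, got %d"
                            % (HDR_SIZE, len(buf)))
    return buf
```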
Philipp Gesang [Mon, 10 Apr 2017 11:43:53 +0000]
communicate remainder to caller when hitting EOF from crypto
Philipp Gesang [Mon, 10 Apr 2017 09:43:06 +0000]
strip extraneous parameters from decryption handler ctor
Format and parameter version as well as the salt are supplied
from the headers. Decrypting should thus only require the
password and, depending on context, an explicit counter as well
as the list of valid IV fixed parts.
Philipp Gesang [Mon, 10 Apr 2017 09:15:09 +0000]
add input checks at API boundaries
Verify conformance of user-supplied inputs on a very basic level,
communicating violations via InvalidParameter exception.
Of course, due to the limitations of the type system, these can’t
be made exhaustive. E. g. no effort is made to inspect a list or
dict that passes the type test for well-formed contents.
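A check of this kind might look like the following sketch. InvalidParameter is named in the commit message; the helper `check_fixed_parts` and the parameter it validates are hypothetical.

```python
class InvalidParameter(Exception):
    """A user-supplied input failed a basic conformance check."""

def check_fixed_parts(parts):
    """Accept None or a list; the list contents are not inspected."""
    if parts is not None and not isinstance(parts, list):
        raise InvalidParameter("expected list of IV fixed parts, got %s"
                               % type(parts).__name__)
    return parts
```

As the message says, a list of garbage still passes: only the outer type is verified.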
Philipp Gesang [Mon, 10 Apr 2017 08:36:12 +0000]
document exceptions used in encryption handler
Prepare clear and rigorous communication of errors and other
unexpected conditions to the user. Eventually these will make
the foundation for messages propagating up the stack until they
reach the UI.
Philipp Gesang [Mon, 10 Apr 2017 08:27:47 +0000]
use exception to communicate tag mismatch
Philipp Gesang [Mon, 10 Apr 2017 08:13:02 +0000]
unify error and parameter handling in crypto.py
Three classes of errors:
- bad encryption (tag mismatch, bad IVs);
- bad user input (request info counter twice);
- internal error (state was reached that indicates a problem
with crypto.py).
Philipp Gesang [Mon, 10 Apr 2017 07:34:32 +0000]
remove obsolete tag handling functionality from crypto.py
The GCM tag no longer occurs independently of a PDT header so
these are no longer relevant.
Philipp Gesang [Fri, 7 Apr 2017 15:49:35 +0000]
fix search string in tar volume generation
Philipp Gesang [Fri, 7 Apr 2017 14:59:19 +0000]
use OSError instead of IOError
IOError is a synonym for OSError, so the latter should be used
everywhere to avoid confusion, especially when throwing the one
but catching the other.
Cf. PEP 3151.
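Since Python 3.3 the two names refer to the same class, which a short check confirms. The helper `read_or_none` is only there for illustration.

```python
def read_or_none(path):
    """Return the file contents, or None if the file can't be read."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError:  # also catches anything raised as IOError
        return None

print(IOError is OSError)
```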
Philipp Gesang [Fri, 7 Apr 2017 09:15:00 +0000]
allow test_compression_level.py as module
Philipp Gesang [Thu, 6 Apr 2017 15:56:09 +0000]
rework crypto.py unittests for revised encryption
Main changes:
- Adjust usage to revised encryption handler.
- Adapt to header format.
- Adjust to changes in error passing (above all ``hdr_read()``).
- Remove Scrypt and tag tests; these interfaces are no longer
available.
Philipp Gesang [Fri, 7 Apr 2017 07:29:27 +0000]
specify salt and version in ctor when encrypting
Simplify the signature of Encrypt.next() by removing the salt and
version arguments: This will make the encryptor reuse the values
it already has, which were either passed to or randomly generated
by the ctor. Currently there is no case where we’d need to change
the salt or version during encryption. When decrypting, the
values from the headers are used anyways so nothing changes over
there.
Philipp Gesang [Thu, 6 Apr 2017 15:54:35 +0000]
increment file counter after handling current object
Philipp Gesang [Thu, 6 Apr 2017 15:06:05 +0000]
fix IV fixed part validation on decryption
Philipp Gesang [Thu, 6 Apr 2017 14:54:33 +0000]
parse buffer as header if passed as arg to next()
Philipp Gesang [Thu, 6 Apr 2017 13:42:15 +0000]
adapt concat_compress unit tests to gzip block sequence
The unit tests assume that compression of three files requires
three distinct Gzip blocks. The first one of these is empty and
serves no purpose, differing from the others by containing the
more or less redundant archive name. This is no longer the case
after the revision of the header code: the first block will still
have the archive name in the metadata but also contain the first
file.
Thus, adapt the unit tests to no longer check for and then ignore
the empty initial gzip block.
Philipp Gesang [Thu, 6 Apr 2017 13:09:10 +0000]
prefer symbolic constants over literals referring to gzip header
The mixed use of hex and octal is pretty confusing to say the
least, use named constants instead that are defined only in
tarfile.
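To illustrate the point, here is a sketch using named constants for the gzip header fields. The values follow the gzip file format (RFC 1952); the constant names and `parse_gz_header` are hypothetical and not the ones defined in tarfile.

```python
GZ_MAGIC = b"\x1f\x8b"  # ID1, ID2
GZ_METHOD_DEFLATE = 8   # CM field
GZ_FLAG_FNAME = 0x08    # FLG bit: original file name present

def parse_gz_header(buf):
    """Inspect the first four bytes of a gzip member header."""
    if len(buf) < 4 or buf[:2] != GZ_MAGIC:
        raise ValueError("not a gzip header")
    return {
        "deflate": buf[2] == GZ_METHOD_DEFLATE,
        "has_name": bool(buf[3] & GZ_FLAG_FNAME),
    }
```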
Philipp Gesang [Thu, 6 Apr 2017 12:36:26 +0000]
fix tarfile crypto parameter passing
Remove obsolete parameters like “password” that are no longer
meaningful after moving the creation of the crypto context
outside of tarfile.py.
Also, test for the presence of encryption attributes before
accessing them to avoid conflicts with zlib streams. (Kludgy, but
not avoidable without larger changes due to the possibility of
“fileobj” being anything, including things that don’t satisfy all
the interfaces that “_Stream” provides.)
Philipp Gesang [Thu, 6 Apr 2017 12:34:20 +0000]
accept external counter in crypto.py
Required when encrypting an auxiliary file of type info.
Philipp Gesang [Wed, 5 Apr 2017 06:49:50 +0000]
unify constant naming I2N_→PDTCRYPT_