Philipp Gesang [Fri, 24 Feb 2017 10:18:18 +0000]
remove key length parameter wherever feasible
Since we’re using fixed AES-128 everywhere, the revised version
no longer offers adjustable key length.
Philipp Gesang [Fri, 24 Feb 2017 09:50:03 +0000]
make tarfile.py error out on invalid crypto modes and combos
The tarfile stream ctor will simply gloss over encryption
requested by the caller unless it happens to exactly match the
string (!) “aes”. Furthermore, with non-gzip compression the
encryption will be ignored altogether.
Instead of deceiving the user about the encryption being applied,
have the ctor fail immediately on invalid combinations.
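A minimal sketch of such a fail-fast check. The function name, the tuple of accepted values and the exception type are hypothetical and not taken from tarfile.py; only the literal "aes" and the gzip restriction come from the message above.

```python
# Hypothetical sketch; names and accepted values are assumptions,
# not the actual tarfile.py code.
VALID_ENCRYPTION = (None, "aes")   # "aes" is the only mode the message names


def check_stream_args(comptype, enctype):
    """Raise instead of silently dropping an encryption request."""
    if enctype not in VALID_ENCRYPTION:
        raise ValueError("invalid encryption mode: %r" % (enctype,))
    if enctype is not None and comptype != "gz":
        # Per the message, encryption used to be ignored without gzip.
        raise ValueError("encryption requires gzip compression, got %r"
                         % (comptype,))
```

With this check a misspelled mode such as "AES" or a combination like bz2 plus "aes" raises at construction time rather than producing an unencrypted archive.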
Philipp Gesang [Thu, 23 Feb 2017 15:34:19 +0000]
init crypto support v2
Implements header reading and writing as well as PoC encryption
wrappers.
WIP
Thomas Jarosch [Mon, 2 Apr 2018 11:31:18 +0000]
Merge branch 'skip-symlinks'
Thomas Jarosch [Mon, 2 Apr 2018 11:15:30 +0000]
RestoreHelper: Prevent endless loop if both indexes contain a list:// entry
This completes the control flow refactoring from Philipp
in commit
7273719cca856677d25102d805f6f96e36173731
Thomas Jarosch [Mon, 2 Apr 2018 10:48:59 +0000]
Document restore code index handling
Philipp Gesang [Mon, 7 Nov 2016 09:00:32 +0000]
ignore all symlinks
Don’t delay the creation of symlinks but suppress it entirely.
The rationale is that extraction with deltatar will only ever
operate on inputs whose symlinks are dereferenced upon archive
creation. Thus valid archives will not contain symlinks at all.
Also, it would appear that deltatar assumes paths of objects
inside a tarball are unique. If the tarball contains multiple
objects with the same path, it will extract only the first one it
encounters and ignore the rest. This means that it would take at
least two successive backups to perform a symlink attack, the
first one planting the link and the second writing over the
location. This is prevented by the current mitigation strategy
(and by the --unlink option of other tar utilities).
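A rough illustration of suppressing symlinks during extraction, written against the standard library tarfile module rather than deltatar's own copy. The generator name is hypothetical; TarInfo.issym() is the stdlib predicate for symbolic links.

```python
import tarfile


def non_symlink_members(tar):
    """Yield archive members, silently skipping symbolic links."""
    for member in tar:
        if member.issym():
            continue          # never create the link on disk
        yield member
```

A caller could pass the result to extractall(members=...), so that a planted link never reaches the filesystem.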
Philipp Gesang [Fri, 4 Nov 2016 16:00:59 +0000]
add unit test for overwriting symlinks
Currently, we implement the behavior of GNU Tar: Subsequent files
in an archive override previous ones, which is also true of
symlinks.
Philipp Gesang [Fri, 4 Nov 2016 14:32:13 +0000]
rectify delayed symlink restoration
Again, GNU tar serves as the model for safe behavior: We now
check whether the placeholder file exists and if it is indeed the
one we created earlier.
Since deltatar does not allow including symlinks in the backup,
the unit tests invoke tarfile functionality directly to add some
symlinks to an existing backup.
Philipp Gesang [Fri, 4 Nov 2016 10:59:34 +0000]
add unit test tracking behavior wrt symlinks
Philipp Gesang [Thu, 3 Nov 2016 16:02:54 +0000]
fix calls to deprecated function in deltatar.py
Fixes warning mandating “.warning()” over “.warn()”:
DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
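For illustration, the replacement looks like this with the standard logging module. The logger name and the helper function are made up for the example.

```python
import logging

logger = logging.getLogger("deltatar.example")  # illustrative logger name


def report_skipped(path):
    # Logger.warning() is the supported spelling; Logger.warn() is deprecated.
    logger.warning("skipping %s", path)
```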
Philipp Gesang [Thu, 3 Nov 2016 13:31:14 +0000]
delay only absolute symlinks and those pointing to parent dirs
Only apply the symlink hook on those with fishy targets. Internal
symlinks need not be contained so they can be applied as-is.
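One possible way to classify a link target as fishy, as a sketch only. The function name is hypothetical, and treating any ".." path component as suspicious is an assumption (it is stricter than resolving the path first); the actual check in deltatar may differ.

```python
import posixpath


def is_suspicious_link_target(target):
    """True for absolute targets or targets with a '..' path component."""
    if posixpath.isabs(target):
        return True
    return ".." in target.split("/")
```

Under this rule "dir/file" would be applied as-is, while "/etc/passwd" and "../../etc/passwd" would go through the delayed path.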
Philipp Gesang [Thu, 3 Nov 2016 11:02:15 +0000]
implement delayed symlink creation
Introduce a hook in ``extract()`` to invoke a callback if a
symlink is encountered in the archive. The implementation is
modeled after GNU Tar.
This is a v2 attempt at the symlink extraction problem. The
first version simply called ``unlink(2)`` on all files before
extraction, which is a less efficient albeit more robust strategy.
Philipp Gesang [Mon, 31 Oct 2016 16:44:42 +0000]
avoid crash in test helper due to fp division
As a matter of fact, ``randomint()`` accepts only int-ishly typed
values, not floats. Consequently, integer division is the way to
go.
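A small sketch of the difference, assuming the helper ends up in the standard library's random.randint(); the function and parameter names here are invented for the example.

```python
import random


def random_chunk_count(total_size, chunk_size):
    # "//" keeps the bound an int; "/" would yield a float in Python 3,
    # which random.randint() does not accept reliably.
    upper = max(1, total_size // chunk_size)
    return random.randint(1, upper)
```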
Philipp Gesang [Wed, 2 Nov 2016 16:42:33 +0000]
simplify control flow in RestoreHelper methods
Make the control flow more obvious. The code in question was
introduced with commit
ea6d3c3e… but did not make sense back then
either because cur_index which is the constant $1$ was compared
to the literal constant $0$:
+ cur_index = 1
+ while cur_index < len(self._data):
+ data = self._data[cur_index]
…
+ if cur_index == 0:
This bogus test was since removed but the convoluted ``while``
loop survived. Instead, access index 1 only once using an integer
literal.
Thomas Jarosch [Mon, 4 Jul 2016 10:13:39 +0000]
Increase version to 1.5
Thomas Jarosch [Mon, 4 Jul 2016 09:49:06 +0000]
Code review done, comment changes only
Thomas Jarosch [Mon, 4 Jul 2016 09:48:28 +0000]
Remove dead code
cur_index is always >= 1 in this code path
Thomas Jarosch [Mon, 4 Jul 2016 09:48:16 +0000]
Remove code duplication
Thomas Jarosch [Thu, 30 Jun 2016 08:03:40 +0000]
Don't use exception handling for normal control flow
-> Replace buf.index() with buf.find().
Unwinding the stack is expensive and we were
even doing it for the default code path.
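The pattern in isolation, as a sketch; the function below is illustrative and not the code that was changed. bytes.find() returns -1 where bytes.index() would raise ValueError.

```python
def split_at_newline(buf):
    """Return (line, rest); line is None when buf has no newline yet."""
    pos = buf.find(b"\n")      # -1 when absent, no exception involved
    if pos < 0:
        return None, buf
    return buf[:pos], buf[pos + 1:]
```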
Christian Herdtweck [Fri, 24 Jun 2016 07:15:56 +0000]
add file_crypt.py to scripts in setup.py
Christian Herdtweck [Thu, 23 Jun 2016 16:03:03 +0000]
created tool to encrypt/decrypt files using aes128 with compression
Thomas Jarosch [Thu, 23 Jun 2016 12:17:46 +0000]
Increase version to 1.4
Lots of little fixes and improvements.
Christian Herdtweck [Thu, 23 Jun 2016 12:33:40 +0000]
appease pylint
Christian Herdtweck [Wed, 22 Jun 2016 15:14:15 +0000]
fix error found by pylint
Thomas Jarosch [Thu, 23 Jun 2016 10:31:11 +0000]
Rename design document so pylint3 doesn't pick it up
Thomas Jarosch [Thu, 23 Jun 2016 08:08:16 +0000]
Implement cache for pwd.getpwuid() and grp.getgrgid()
Those functions always parse /etc/passwd and we
look up the owner for each file we backup.
This change is only relevant when creating full backups.
Speedup with ~1,000,000 emails is 11%.
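One way such a cache could look, sketched with functools.lru_cache. The message does not say how the cache is implemented, so the function names and the use of lru_cache are assumptions.

```python
import functools
import grp
import pwd


@functools.lru_cache(maxsize=None)
def cached_user_name(uid):
    """Resolve a uid once; later calls are served from the cache."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None   # uid without a passwd entry


@functools.lru_cache(maxsize=None)
def cached_group_name(gid):
    """Resolve a gid once; later calls are served from the cache."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return None   # gid without a group entry
```

Since most files in a backup share a handful of owners, nearly every lookup after the first few would be a cache hit.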
Thomas Jarosch [Tue, 21 Jun 2016 08:01:35 +0000]
Fix 'directory' type when iterating tar archives without index
'dir' is not used anywhere in the code base.
Christian Herdtweck [Mon, 20 Jun 2016 07:43:54 +0000]
use the "& 0xFFFFFFFF" after all crc32 calculations
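For reference, the Python documentation recommends masking zlib.crc32() with the 32-bit value 0xffffffff to get the same unsigned result on all versions and platforms. A minimal sketch with a hypothetical wrapper name:

```python
import zlib


def crc32_unsigned(data, previous=0):
    """CRC-32 of data as an unsigned 32-bit integer."""
    return zlib.crc32(data, previous) & 0xFFFFFFFF
```

The second argument allows the checksum to be continued across chunks.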
Thomas Jarosch [Fri, 17 Jun 2016 15:39:13 +0000]
Increase release to 1.3
Also switch group from Intranator to Intra2net
Christian Herdtweck [Fri, 17 Jun 2016 14:02:27 +0000]
improve one more unittest: raise proper assertion instead of failing with non-existent variable
Christian Herdtweck [Fri, 17 Jun 2016 14:01:37 +0000]
adjust unittests in test_deltatar
Christian Herdtweck [Fri, 17 Jun 2016 13:29:19 +0000]
correct a comment, add more info to log message
Christian Herdtweck [Fri, 17 Jun 2016 13:28:34 +0000]
adjust filter_path: also remove trailing os.sep
Christian Herdtweck [Fri, 17 Jun 2016 13:28:09 +0000]
fix strip_base_dir argument for DeltaTar._recursive_walk_dir: check for os.sep
Christian Herdtweck [Fri, 17 Jun 2016 09:59:27 +0000]
simplify DeltaTar._recursive_walk_dir
(had called os.path.isdir and filter_path twice on each file directly
after another)
Christian Herdtweck [Fri, 17 Jun 2016 09:56:24 +0000]
had forgotten a few tarobj.close and os.unlink(temp_file) in new tests
Christian Herdtweck [Fri, 17 Jun 2016 07:31:50 +0000]
change one output, make 2 variables to testing routine arguments
Christian Herdtweck [Fri, 17 Jun 2016 07:31:16 +0000]
use KiB, MiB (factor 1024) instead of KB, MB (factor 1000)
Christian Herdtweck [Wed, 15 Jun 2016 13:07:24 +0000]
fix search for file with impossible size (had forgotten that volume_size is in MB)
Thomas Jarosch [Wed, 15 Jun 2016 12:28:38 +0000]
Increase version to 1.2
Thomas Jarosch [Wed, 15 Jun 2016 12:18:33 +0000]
Merge branch 'fix-compression-size'
The new code will give final tar file sizes
close to the volume size even when using compression.
Christian Herdtweck [Wed, 15 Jun 2016 09:39:41 +0000]
ensure temp file is deleted; add some comments about results
Christian Herdtweck [Wed, 15 Jun 2016 09:19:09 +0000]
added performance test script
Christian Herdtweck [Wed, 15 Jun 2016 09:18:55 +0000]
added minimum file size arg to find_random_files
should make returned files more realistic
Christian Herdtweck [Wed, 15 Jun 2016 07:55:16 +0000]
reduce time wasted on _dbg output: format string only when it is actually printed
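The idea, sketched outside of tarfile.py: hand the template and its arguments to the debug helper and let it format only if the level check passes. The module-level DEBUG_LEVEL and the function signature are assumptions made for this example.

```python
import sys

DEBUG_LEVEL = 0   # hypothetical module-level verbosity


def dbg(level, template, *args):
    """Format and print only when the message will actually be shown."""
    if level > DEBUG_LEVEL:
        return False            # no string formatting happened
    print(template.format(*args), file=sys.stderr)
    return True
```

Calling dbg(3, "added {}", name) costs a comparison when debugging is off, whereas dbg(3, "added {}".format(name)) would always pay for the formatting.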
Christian Herdtweck [Wed, 15 Jun 2016 07:54:02 +0000]
remove _dbg(str.format(args)) from performance-sensitive loop in addfile
Christian Herdtweck [Wed, 15 Jun 2016 07:53:48 +0000]
added some more comments
Christian Herdtweck [Tue, 14 Jun 2016 10:28:32 +0000]
add unittest that runs one of the many multivolume compression size tests
Christian Herdtweck [Mon, 13 Jun 2016 11:07:08 +0000]
created another test for multivolume compression size
Christian Herdtweck [Tue, 14 Jun 2016 09:59:03 +0000]
changed debug output level of the debug output I added earlier
Christian Herdtweck [Mon, 13 Jun 2016 11:06:38 +0000]
removed some debug output
Christian Herdtweck [Mon, 13 Jun 2016 11:06:28 +0000]
fix ValueError message (otherwise '*' is interpreted as string repetition)
Christian Herdtweck [Mon, 13 Jun 2016 11:05:50 +0000]
ensure max_volume_size is int or None
Christian Herdtweck [Fri, 10 Jun 2016 12:34:05 +0000]
skip one test with known failure
Christian Herdtweck [Fri, 10 Jun 2016 12:32:38 +0000]
correct number of compressed backup volumes in tests
Christian Herdtweck [Fri, 10 Jun 2016 12:18:33 +0000]
added class variable MODE_COMPRESS to test_deltatar unittests
Christian Herdtweck [Fri, 10 Jun 2016 12:17:33 +0000]
do a proper unittest.skip if test is not run
Christian Herdtweck [Fri, 10 Jun 2016 12:15:30 +0000]
remove print()s in unittest test_multivol
Christian Herdtweck [Fri, 10 Jun 2016 08:09:33 +0000]
remove unused import
Christian Herdtweck [Fri, 10 Jun 2016 08:09:24 +0000]
extend valid range of sizes for sample.tar.gz file
Christian Herdtweck [Thu, 9 Jun 2016 16:01:37 +0000]
added new unittest test_multivol_compress with bigger random-data file
Christian Herdtweck [Thu, 9 Jun 2016 16:01:12 +0000]
renamed unittest test_multivol_compress to test_compress_single (compresses to single volume)
Christian Herdtweck [Thu, 9 Jun 2016 15:58:39 +0000]
added a few debug messages to addfile and open_volume
Christian Herdtweck [Thu, 9 Jun 2016 15:58:04 +0000]
changed TarFile.addfile to get better sized volumes if compressing
Christian Herdtweck [Thu, 9 Jun 2016 15:56:00 +0000]
created a 2nd TarFile._size_left: one for file and one for stream
Christian Herdtweck [Thu, 9 Jun 2016 15:54:51 +0000]
fix unittest's create_pseudo_random_file: do return file name as docu says
author of that module seems to never have heard of the tempfile module in
python stdlib!
Christian Herdtweck [Thu, 9 Jun 2016 15:53:54 +0000]
improved unittest's create file: do not gather GB in memory before writing
Christian Herdtweck [Thu, 9 Jun 2016 15:52:58 +0000]
clean up in unittest: remove size_test_* files
Christian Herdtweck [Thu, 9 Jun 2016 15:52:41 +0000]
clean up in unittest: remove unused imports
Christian Herdtweck [Thu, 9 Jun 2016 09:23:33 +0000]
make file size test faster by removing 16GB test file
Christian Herdtweck [Thu, 9 Jun 2016 09:23:09 +0000]
create unittest for warning
Christian Herdtweck [Thu, 9 Jun 2016 09:22:58 +0000]
warn if trying to compress/encrypt multivolume with mode w:...
Christian Herdtweck [Thu, 9 Jun 2016 09:09:41 +0000]
adjust file-size-estimation unittest; add an actual test to unittest
Christian Herdtweck [Wed, 8 Jun 2016 15:50:25 +0000]
started a unittest for multi-volume with compression
Christian Herdtweck [Wed, 8 Jun 2016 15:48:23 +0000]
created unittest for get_file_size
Christian Herdtweck [Wed, 8 Jun 2016 15:46:49 +0000]
added function get_file_size and var _file_size to _Stream
Christian Herdtweck [Wed, 8 Jun 2016 15:46:02 +0000]
changed some comments, corrected one indent
Thomas Jarosch [Wed, 15 Jun 2016 06:42:02 +0000]
delete(): Don't crash on removing symlinks to directories
shutil.rmtree() will refuse to follow symlinks.
-> just call os.unlink() for symlinks.
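A sketch of the resulting logic. It is not the actual delete() from deltatar, only an illustration of checking for a symlink before falling back to shutil.rmtree().

```python
import os
import shutil


def delete(path):
    """Remove a file, a directory tree, or a symlink (without following it)."""
    if os.path.islink(path):
        os.unlink(path)          # shutil.rmtree() raises OSError on symlinks
    elif os.path.isdir(path):
        shutil.rmtree(path)
    elif os.path.exists(path):
        os.unlink(path)
```

The islink() test has to come first, because isdir() follows links and reports True for a symlink to a directory.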
Christian Herdtweck [Thu, 7 Apr 2016 13:48:27 +0000]
Revert "added arguments for permissions of created dirs and files to deltatar.create_full/diff_backup"
This reverts commit
8db5dd46357def656b84019cded8aed8f0626a03.
(simpler and more important to have UI deal with unclean input)
Christian Herdtweck [Wed, 6 Apr 2016 16:04:46 +0000]
added arguments for permissions of created dirs and files to deltatar.create_full/diff_backup
Thomas Jarosch [Fri, 3 Jul 2015 19:34:05 +0000]
Increase version to 1.1
Reflect the filename changes and
a few other bells and whistles.
Thomas Jarosch [Fri, 3 Jul 2015 19:24:14 +0000]
Switch produced filenames from YY-mm-dd to YYYY-mm-dd
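In strftime terms this is the step from %y to %Y. The helper and the "prefix-date" layout below are hypothetical; the real filename pattern is not shown in this log.

```python
import datetime


def backup_name(prefix, when):
    """Build a name with a four-digit year, e.g. prefix-2015-07-03."""
    return "%s-%s" % (prefix, when.strftime("%Y-%m-%d"))
```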
Thomas Jarosch [Fri, 3 Jul 2015 19:20:24 +0000]
Ability to run the unit tests without a .git directory
Useful for executing the tests during every rpm build.
Thomas Jarosch [Fri, 3 Jul 2015 19:06:26 +0000]
Include unit test files in source tarball
Don't install them though. Just needed for
running the unit tests during the rpm binary build.
Thomas Jarosch [Fri, 3 Jul 2015 15:57:49 +0000]
Unit test: Skip multivolume split test for modes that don't support it.
Actual code borrowed from test_restore_from_index().
Same code in test_restore_multivol_from_index() and a few others.
All unit tests running again.
Thomas Jarosch [Fri, 3 Jul 2015 15:24:14 +0000]
Unit test: Fix _equal_stat_dicts() invocation
The jsonize_path_iterator() generator returns a tuple
with the wanted dictionary as the first element.
Thomas Jarosch [Fri, 3 Jul 2015 14:35:57 +0000]
Pass on 'compresslevel' for gzip / bzip2 compressed archives only
Otherwise we crash when opening plain tar files for writing:
TypeError: taropen() takes from 2 to 4 positional arguments but 5 were given
First part of the unit test fix. Verified with debug statements
in gzopen() that passing on the parameter still works.
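The shape of the fix, as a sketch: build the extra keyword arguments depending on the compression type, so that the plain tar opener never receives a parameter it does not accept. The helper name and the default level are assumptions.

```python
def open_kwargs(comptype, compresslevel=9):
    """Keyword arguments to forward to a per-format open function."""
    kwargs = {}
    if comptype in ("gz", "bz2"):
        # Only the compressing openers understand compresslevel.
        kwargs["compresslevel"] = compresslevel
    return kwargs
```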
Samir Aguiar [Wed, 22 Apr 2015 10:52:42 +0000]
Add comments about the callback param to the docs
Also, some other cosmetic changes.
Samir Aguiar [Wed, 22 Apr 2015 08:15:32 +0000]
Add a callback function to the restore process
By having a function called for every file
that's about to be restored, we can keep track
of the progress of the extraction.
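A generic sketch of such a hook. The function name, the callback signature (member, position, total) and the extract parameter are invented for illustration and do not reflect the actual deltatar API.

```python
def restore_all(members, extract, on_file=None):
    """Extract members in order, notifying on_file before each one."""
    total = len(members)
    for position, member in enumerate(members, start=1):
        if on_file is not None:
            on_file(member, position, total)   # progress hook
        extract(member)
```

A caller could use the hook to drive a progress bar, for instance by printing position / total for every file.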
Eduardo Robles Elvira [Thu, 7 Aug 2014 09:38:14 +0000]
adding an integration test that uses create_pseudo_random_files to create a deterministic file and directory tree and then uses delta-tar with the run_benchmark script.
Eduardo Robles Elvira [Fri, 1 Aug 2014 12:45:35 +0000]
adding zlib and zlib-blocks tests
Eduardo Robles Elvira [Tue, 22 Jul 2014 17:46:12 +0000]
improving benchmark tests and their documentation
Eduardo Robles Elvira [Tue, 22 Jul 2014 17:45:22 +0000]
adding support to set the gzip compression level in tarfile
Thomas Jarosch [Wed, 16 Jul 2014 14:39:58 +0000]
Move README.txt to the docs folder. Better name
Victor Ramirez [Wed, 16 Jul 2014 14:14:45 +0000]
Added profiler test for tarfile and zlib.
Daniel Garcia Moreno [Tue, 1 Jul 2014 18:16:52 +0000]
Added test to test multivolume corruption at restore
Daniel Garcia Moreno [Mon, 30 Jun 2014 07:31:03 +0000]
Fixed "random" file corruption in multivol restore
Files split across two volumes were not valid in the restored backup.
The first part was okay but the second part was garbage. The problem was
that we did not set the volume number correctly, so the second part was
always read from the first tar file.
We were checking the member variable, but this variable changes earlier, so
I've added a new bool variable "ismember" that doesn't change with member.
Eduardo Robles Elvira [Tue, 24 Jun 2014 09:35:42 +0000]
fix deletion of the current file when it doesn't match during restore; logging was not being done correctly (crash)
Eduardo Robles Elvira [Mon, 23 Jun 2014 12:11:35 +0000]
allow to set a new_volume_handler in iterate_tar_path, but make it optional