Thomas Jarosch [Wed, 15 Jun 2016 12:18:33 +0000]
Merge branch 'fix-compression-size'
The new code will give final tar file sizes
close to the volume size even when using compression.
Christian Herdtweck [Wed, 15 Jun 2016 09:39:41 +0000]
ensure temp file is deleted; add some comments about results
Christian Herdtweck [Wed, 15 Jun 2016 09:19:09 +0000]
added performance test script
Christian Herdtweck [Wed, 15 Jun 2016 09:18:55 +0000]
added minimum file size arg to find_random_files
should make returned files more realistic
Christian Herdtweck [Wed, 15 Jun 2016 07:55:16 +0000]
reduce time wasted on _dbg output: format string only when it is actually printed
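A minimal sketch of the idea, not the project's actual code: the debug helper receives the format string and its arguments separately and formats them only when the message will really be printed. The class name `Archive`, the `printed` list and the exact `_dbg` signature are assumptions made for illustration.

```python
class Archive:
    """Hypothetical stand-in for an object that carries a debug level."""

    def __init__(self, debug=0):
        self.debug = debug
        self.printed = []  # kept only so the behaviour can be inspected

    def _dbg(self, level, msg, *args):
        # Skip the (possibly expensive) formatting unless the message
        # is going to be emitted at the current debug level.
        if level <= self.debug:
            text = msg.format(*args) if args else msg
            self.printed.append(text)
            print(text)
```

Callers then write `self._dbg(3, "size left: {}", value)` instead of `self._dbg(3, "size left: {}".format(value))`, so a suppressed message costs little more than a comparison.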
Christian Herdtweck [Wed, 15 Jun 2016 07:54:02 +0000]
remove _dbg(str.format(args)) from performance-sensitive loop in addfile
Christian Herdtweck [Wed, 15 Jun 2016 07:53:48 +0000]
added some more comments
Christian Herdtweck [Tue, 14 Jun 2016 10:28:32 +0000]
add unittest that runs one of the many multivolume compression size tests
Christian Herdtweck [Mon, 13 Jun 2016 11:07:08 +0000]
created another test for multivolume compression size
Christian Herdtweck [Tue, 14 Jun 2016 09:59:03 +0000]
changed debug output level of the debug output I added earlier
Christian Herdtweck [Mon, 13 Jun 2016 11:06:38 +0000]
removed some debug output
Christian Herdtweck [Mon, 13 Jun 2016 11:06:28 +0000]
fix ValueError message (otherwise '*' is interpreted as string repetition)
Christian Herdtweck [Mon, 13 Jun 2016 11:05:50 +0000]
ensure max_volume_size is int or None
Christian Herdtweck [Fri, 10 Jun 2016 12:34:05 +0000]
skip one test with known failure
Christian Herdtweck [Fri, 10 Jun 2016 12:32:38 +0000]
correct number of compressed backup volumes in tests
Christian Herdtweck [Fri, 10 Jun 2016 12:18:33 +0000]
added class variable MODE_COMPRESS to test_deltatar unittests
Christian Herdtweck [Fri, 10 Jun 2016 12:17:33 +0000]
do a proper unittest.skip if test is not run
Christian Herdtweck [Fri, 10 Jun 2016 12:15:30 +0000]
remove print()s in unittest test_multivol
Christian Herdtweck [Fri, 10 Jun 2016 08:09:33 +0000]
remove unused import
Christian Herdtweck [Fri, 10 Jun 2016 08:09:24 +0000]
extend valid range of sizes for sample.tar.gz file
Christian Herdtweck [Thu, 9 Jun 2016 16:01:37 +0000]
added new unittest test_multivol_compress with bigger random-data file
Christian Herdtweck [Thu, 9 Jun 2016 16:01:12 +0000]
renamed unittest test_multivol_compress to test_compress_single (compresses to single volume)
Christian Herdtweck [Thu, 9 Jun 2016 15:58:39 +0000]
added a few debug messages to addfile and open_volume
Christian Herdtweck [Thu, 9 Jun 2016 15:58:04 +0000]
changed TarFile.addfile to get better sized volumes if compressing
Christian Herdtweck [Thu, 9 Jun 2016 15:56:00 +0000]
created a 2nd TarFile._size_left: one for file and one for stream
Christian Herdtweck [Thu, 9 Jun 2016 15:54:51 +0000]
fix unittest's create_pseudo_random_file: return the file name, as the documentation says
author of that module seems to never have heard of the tempfile module in
python stdlib!
Christian Herdtweck [Thu, 9 Jun 2016 15:53:54 +0000]
improved unittest's create file: do not gather GB in memory before writing
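A rough sketch of writing a large test file without holding all of it in memory. The helper name `create_random_file`, its parameters and the use of `os.urandom` are assumptions; the real test helper generates pseudo-random data in its own way.

```python
import os


def create_random_file(path, size, chunk_size=1024 * 1024):
    """Write `size` random bytes to `path` in chunks and return `path`."""
    remaining = size
    with open(path, "wb") as out:
        while remaining > 0:
            # Never hold more than one chunk in memory at a time.
            n = min(chunk_size, remaining)
            out.write(os.urandom(n))
            remaining -= n
    return path
```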
Christian Herdtweck [Thu, 9 Jun 2016 15:52:58 +0000]
clean up in unittest: remove size_test_* files
Christian Herdtweck [Thu, 9 Jun 2016 15:52:41 +0000]
clean up in unittest: remove unused imports
Christian Herdtweck [Thu, 9 Jun 2016 09:23:33 +0000]
make file size test faster by removing 16GB test file
Christian Herdtweck [Thu, 9 Jun 2016 09:23:09 +0000]
create unittest for warning
Christian Herdtweck [Thu, 9 Jun 2016 09:22:58 +0000]
warn if trying to compress/encrypt multivolume with mode w:...
Christian Herdtweck [Thu, 9 Jun 2016 09:09:41 +0000]
adjust file-size-estimation unittest; add an actual test to unittest
Christian Herdtweck [Wed, 8 Jun 2016 15:50:25 +0000]
started a unittest for multi-volume with compression
Christian Herdtweck [Wed, 8 Jun 2016 15:48:23 +0000]
created unittest for get_file_size
Christian Herdtweck [Wed, 8 Jun 2016 15:46:49 +0000]
added function get_file_size and var _file_size to _Stream
Christian Herdtweck [Wed, 8 Jun 2016 15:46:02 +0000]
changed some comments, corrected one indent
Thomas Jarosch [Wed, 15 Jun 2016 06:42:02 +0000]
delete(): Don't crash on removing symlinks to directories
shutil.rmtree() will refuse to follow symlinks.
-> just call os.unlink() for symlinks.
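The fix can be sketched roughly as follows. This is a simplified stand-in, not the actual delete() implementation; the function body and its handling of plain files are assumptions.

```python
import os
import shutil


def delete(path):
    """Remove a symlink, a directory tree, or a plain file (hypothetical helper)."""
    if os.path.islink(path):
        # shutil.rmtree() raises an error when given a symlink,
        # so remove only the link itself and leave its target alone.
        os.unlink(path)
    elif os.path.isdir(path):
        shutil.rmtree(path)
    elif os.path.exists(path):
        os.unlink(path)
```

The symlink check has to come first, because os.path.isdir() follows symlinks and would report a link to a directory as a directory.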
Christian Herdtweck [Thu, 7 Apr 2016 13:48:27 +0000]
Revert "added arguments for permissions of created dirs and files to deltatar.create_full/diff_backup"
This reverts commit
8db5dd46357def656b84019cded8aed8f0626a03.
(simpler and more important to have UI deal with unclean input)
Christian Herdtweck [Wed, 6 Apr 2016 16:04:46 +0000]
added arguments for permissions of created dirs and files to deltatar.create_full/diff_backup
Thomas Jarosch [Fri, 3 Jul 2015 19:34:05 +0000]
Increase version to 1.1
Reflect the filename changes and
a few other bells and whistles.
Thomas Jarosch [Fri, 3 Jul 2015 19:24:14 +0000]
Switch produced filenames from YY-mm-dd to YYYY-mm-dd
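For illustration only, the difference between the two date stamps expressed with strftime. The helper name `date_stamp` is hypothetical, and how the stamp is embedded in the produced filenames is not shown here.

```python
import datetime


def date_stamp(day, four_digit_year=True):
    """Format a date as YYYY-mm-dd (new style) or YY-mm-dd (old style)."""
    fmt = "%Y-%m-%d" if four_digit_year else "%y-%m-%d"
    return day.strftime(fmt)
```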
Thomas Jarosch [Fri, 3 Jul 2015 19:20:24 +0000]
Ability to run the unit tests without a .git directory
Useful for executing the tests during every rpm build.
Thomas Jarosch [Fri, 3 Jul 2015 19:06:26 +0000]
Include unit test files in source tarball
Don't install them though. Just needed for
running the unit tests during the rpm binary build.
Thomas Jarosch [Fri, 3 Jul 2015 15:57:49 +0000]
Unit test: Skip multivolume split test for modes that don't support it.
Actual code borrowed from test_restore_from_index().
Same code in test_restore_multivol_from_index() and a few others.
All unit tests running again.
Thomas Jarosch [Fri, 3 Jul 2015 15:24:14 +0000]
Unit test: Fix _equal_stat_dicts() invocation
The jsonize_path_iterator() generator returns a tuple
with the wanted dictionary as the first element.
Thomas Jarosch [Fri, 3 Jul 2015 14:35:57 +0000]
Pass on 'compresslevel' for gzip / bzip2 compressed archives only
Otherwise we crash when opening plain tar files for writing:
TypeError: taropen() takes from 2 to 4 positional arguments but 5 were given
First part of the unit test fix. Verified with debug statements
in gzopen() that passing on the parameter still works.
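A sketch of the same guard, using the standard library's tarfile module as a stand-in for the project's modified copy. The wrapper name `open_for_writing` and its parameters are assumptions.

```python
import tarfile


def open_for_writing(path, comptype="tar", compresslevel=9):
    """Open a tar archive for writing, passing compresslevel only where valid."""
    mode = "w" if comptype == "tar" else "w:" + comptype
    kwargs = {}
    # Only the gzip and bzip2 openers accept a compression level;
    # passing it for a plain tar file raises a TypeError.
    if comptype in ("gz", "bz2"):
        kwargs["compresslevel"] = compresslevel
    return tarfile.open(path, mode, **kwargs)
```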
Samir Aguiar [Wed, 22 Apr 2015 10:52:42 +0000]
Add comments about the callback param to the docs
Also, some other cosmetic changes.
Samir Aguiar [Wed, 22 Apr 2015 08:15:32 +0000]
Add a callback function to the restore process
By having a function called for every file
that's about to be restored, we can keep track
of the progress of the extraction.
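A minimal sketch of such a per-file callback. The function `restore`, its arguments and the placeholder extraction step are hypothetical; the real restore API and the arguments passed to the callback may differ.

```python
def restore(paths, callback=None):
    """Hypothetical restore loop that reports each file before restoring it."""
    restored = []
    for path in paths:
        if callback is not None:
            # Let the caller know which file is about to be restored.
            callback(path)
        restored.append(path)  # placeholder for the real extraction work
    return restored
```

A caller that knows the total number of files could, for example, pass a callback that increments a counter and updates a progress bar.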
Eduardo Robles Elvira [Thu, 7 Aug 2014 09:38:14 +0000]
adding an integration test that uses create_pseudo_random_files to create a deterministic tree of files and directories and then uses delta-tar with the run_benchmark script.
Eduardo Robles Elvira [Fri, 1 Aug 2014 12:45:35 +0000]
adding zlib and zlib-blocks tests
Eduardo Robles Elvira [Tue, 22 Jul 2014 17:46:12 +0000]
improving benchmark tests and their documentation
Eduardo Robles Elvira [Tue, 22 Jul 2014 17:45:22 +0000]
adding support to set the gzip compression level in tarfile
Thomas Jarosch [Wed, 16 Jul 2014 14:39:58 +0000]
Move README.txt to the docs folder. Better name
Victor Ramirez [Wed, 16 Jul 2014 14:14:45 +0000]
Added profiler test for tarfile and zlib.
Daniel Garcia Moreno [Tue, 1 Jul 2014 18:16:52 +0000]
Added test to test multivolume corruption at restore
Daniel Garcia Moreno [Mon, 30 Jun 2014 07:31:03 +0000]
Fixed "random" file corruption in multivol restore
Files split across two volumes were not valid in the restored backup.
The first part was okay but the second part was garbage. The problem was
that we did not set the volume number correctly, so the second part was always
read from the first tar file.
We were checking the member variable, but this variable changes beforehand, so I've
added a new bool variable "ismember" that doesn't change with member.
Eduardo Robles Elvira [Tue, 24 Jun 2014 09:35:42 +0000]
fix deleting the current file when it doesn't match during restore; logging was not being done correctly (crash)
Eduardo Robles Elvira [Mon, 23 Jun 2014 12:11:35 +0000]
allow to set a new_volume_handler in iterate_tar_path, but make it optional
Eduardo Robles Elvira [Sun, 22 Jun 2014 22:19:17 +0000]
small formatting change
Thomas Jarosch [Mon, 16 Jun 2014 08:57:32 +0000]
Fix typo in comment
Eduardo Robles Elvira [Wed, 19 Mar 2014 16:50:18 +0000]
fix typo error, variable should be named beginning_size
Eduardo Robles Elvira [Wed, 19 Mar 2014 12:16:29 +0000]
making testbackup.py work
Eduardo Robles Elvira [Tue, 11 Mar 2014 12:27:10 +0000]
adding easier-to-understand diff backup debug output
Eduardo Robles Elvira [Fri, 7 Mar 2014 13:15:57 +0000]
fix debug logging in diff mode
Eduardo Robles Elvira [Tue, 11 Feb 2014 10:51:37 +0000]
add function to easily get the extension used for indexes
Thomas Jarosch [Wed, 5 Feb 2014 16:22:08 +0000]
Ignore files created by setup.sh
Eduardo Robles Elvira [Wed, 5 Feb 2014 10:21:10 +0000]
adding some missing debug lines
Eduardo Robles Elvira [Mon, 20 Jan 2014 09:07:16 +0000]
adding setup files to create an rpm for python-delta-tar
Eduardo Robles Elvira [Mon, 20 Jan 2014 09:05:29 +0000]
removing unused vars
Eduardo Robles Elvira [Thu, 26 Dec 2013 13:49:07 +0000]
dereferencing backed up files for now
Eduardo Robles Elvira [Sat, 7 Dec 2013 10:29:27 +0000]
missing change from string to bytes join
Eduardo Robles Elvira [Thu, 14 Nov 2013 18:54:14 +0000]
adding support for adding extra data to deltatar indexes
Daniel Garcia Moreno [Wed, 13 Nov 2013 13:05:12 +0000]
Test compatible with tar 1.22 (intranator version)
Daniel Garcia Moreno [Wed, 13 Nov 2013 12:57:07 +0000]
Using python3 instead of python in tests system calls
Eduardo Robles Elvira [Mon, 11 Nov 2013 09:00:34 +0000]
adding header and move to python 3
Eduardo Robles Elvira [Mon, 11 Nov 2013 09:00:19 +0000]
adding python3 headers
Eduardo Robles Elvira [Wed, 6 Nov 2013 11:51:39 +0000]
fixing another file not closed warning on new vol handler and special case where the last file in a multivolume tar was being repeated in the tar index iterator
Eduardo Robles Elvira [Wed, 6 Nov 2013 11:51:00 +0000]
fixing some warnings in python3 from dereferenced but unclosed files
Eduardo Robles Elvira [Wed, 6 Nov 2013 11:49:36 +0000]
fixing open mode for updating the file to r+b
Eduardo Robles Elvira [Mon, 4 Nov 2013 08:08:56 +0000]
closing also volume files on cleanup
Eduardo Robles Elvira [Mon, 4 Nov 2013 07:50:55 +0000]
initial port to python 3, not finished
Eduardo Robles Elvira [Tue, 22 Oct 2013 10:38:38 +0000]
ignore the PowmInsecureWarning warning given by libgmp4 because it doesn't affect our code
Eduardo Robles Elvira [Fri, 18 Oct 2013 10:08:04 +0000]
adding support for filtering via whitelist with -inc
Eduardo Robles Elvira [Fri, 18 Oct 2013 09:07:56 +0000]
some unit tests were only failing when run as superuser, because only then can chown be called, and this had not been properly tested
Eduardo Robles Elvira [Thu, 17 Oct 2013 18:21:42 +0000]
fixing AES-only mode: it turns out we were not closing the AES stream correctly
Eduardo Robles Elvira [Sat, 12 Oct 2013 08:18:00 +0000]
removing some unnecessary over-optimizations like running the gc manually or deleting stat dict
optimizing memory usage during restore by saving dir perms in a specific class with only the needed data
Eduardo Robles Elvira [Wed, 9 Oct 2013 09:56:09 +0000]
start working on AES-only delta tar mode, with no compression. some tests still fail
Eduardo Robles Elvira [Wed, 9 Oct 2013 09:55:38 +0000]
removing references to dowser, commenting out pytracemalloc usage (a very useful, though painful to set up, tool to trace memory usage)
Eduardo Robles Elvira [Wed, 9 Oct 2013 09:54:42 +0000]
fixing unit test where only even lines were being processed
Eduardo Robles Elvira [Wed, 9 Oct 2013 09:52:02 +0000]
fixing memory leaks, now memory usage remains nearly constant when creating a full backup
Eduardo Robles Elvira [Thu, 3 Oct 2013 14:17:17 +0000]
reducing leaks in tarfile by allowing not to store files in self.members
Eduardo Robles Elvira [Mon, 30 Sep 2013 07:58:55 +0000]
fixing last two unit test bugs, related to multivol and tar iterator
* when a file was split in two volumes, the tarinfo from the second volume was read twice
* tar iterator failed when reading an empty volume because instead of returning None, it tried to read position 0 and when reading from _Stream that's not allowed
Eduardo Robles Elvira [Mon, 30 Sep 2013 07:58:42 +0000]
fixing bug in a deltatar that expects two vols for a backup of .git, which might not be needed
Eduardo Robles Elvira [Sat, 28 Sep 2013 10:29:30 +0000]
migrating tar restore to use the same code as index restore
previous commit was a mess because it mixed two commits, my fault.
it mixed both the mtime directories fix and the generalization of
tar restore with index restore.
this commit continues the work towards this generalization, fixing
the TarPathIterator (which was just not tested), doing the removed
os.chdir in restore_backup (which resulted in cwd data loss when
running the tests), fixing also the indentation at the end of the
restore_backup function, and setting correctly the
new_volume_handler in RestoreHelper when restoring from a tarball.
However, there are still some unit tests related to multivolume
handling that still fail.
Eduardo Robles Elvira [Sat, 28 Sep 2013 09:18:30 +0000]
fixing bug when restoring files, mtime of parent dir was not preserved/restored correctly
Eduardo Robles Elvira [Fri, 27 Sep 2013 14:06:58 +0000]
Fixing corner case where pad was not taken into account when decrypting the end of a file in a stream
__dec_read function reads directly from the file and returns the data
decrypted. This means that if the file is not encrypted, this function
is trivial.
If the data in the file is encrypted, then the process is different:
first we have to read the raw encrypted data, then decrypt it and
return. But the decryption process is not straightforward because the
self.fileobj stream contains multiple encrypted files one after the
other. We need to detect each separate file, which is possible because
they are separated by the "Salted__" keyword.
It gets more complicated, because we decrypt chunk by chunk, and to
correctly decrypt one chunk we need to set a "last" variable that
specifies if it's the last chunk of a file, because the end of a file is
handled differently, as it gets padded.
Knowing if the current chunk is the last part of a file is usually done
just by detecting if it's followed by a "Salted__" keyword or if we
cannot read more bytes from the stream. BUT there's a pretty particular
case, in which the current chunk ends exactly with one file, so that
the next chunk starts with "Salted__".
To fix that rare case, we just read N bytes from the stream, and check
if the last bytes correspond with the string "Salted__". Then we save
those last characters for next call to __dec_read. If the last bytes
were "Salted__", then we set "last" to True.
Well, actually we subtract not only the length of "Salted__", but 16/32
chars, because the file is decrypted in multiples of the key size.
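The boundary problem described above can be illustrated with a simplified sketch. It is not the __dec_read implementation: it does no decryption and, instead of holding back only the tail bytes between calls, it buffers each blob whole so that a "Salted__" marker is found wherever the chunk boundaries fall. The function name `iter_salted_blobs` and the chunk size are assumptions, and like the approach described in the commit it assumes the marker never occurs by chance inside the encrypted data.

```python
import io

MARKER = b"Salted__"


def iter_salted_blobs(stream, chunk_size=16):
    """Yield each concatenated blob, including its leading MARKER."""
    buf = b""
    while True:
        data = stream.read(chunk_size)
        buf += data
        # The buffer always starts at a blob's own marker, so any further
        # marker found after that prefix starts the next blob.
        pos = buf.find(MARKER, len(MARKER))
        while pos != -1:
            yield buf[:pos]
            buf = buf[pos:]
            pos = buf.find(MARKER, len(MARKER))
        if not data:
            # End of stream: whatever is left is the final blob.
            if buf:
                yield buf
            return
```

Because the search runs over the accumulated buffer, a marker that straddles two reads, or one that starts exactly at a chunk boundary, is detected the same way as one in the middle of a chunk.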
Eduardo Robles Elvira [Fri, 27 Sep 2013 14:06:37 +0000]
when a comparison of two json paths in test_deltatar fails, show the two of them in console, this is quite handy
Eduardo Robles Elvira [Fri, 27 Sep 2013 14:05:16 +0000]
adding again all the test classes to runtests, as we were only testing default tar mode
Eduardo Robles Elvira [Fri, 27 Sep 2013 14:04:28 +0000]
improving the filesplitter, allowing to split at specified offsets. this is useful for debugging or to use as a last resort technique to recover a backup, knowing the offsets from the index for example
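As an illustration of splitting at given offsets, here is a sketch that works on in-memory bytes. The function name `split_at_offsets` is hypothetical, and the real filesplitter operates on files and may behave differently.

```python
def split_at_offsets(data, offsets):
    """Cut `data` into pieces at the given byte offsets."""
    bounds = [0] + sorted(offsets) + [len(data)]
    # Each piece runs from one boundary up to (not including) the next.
    return [data[start:end] for start, end in zip(bounds, bounds[1:])]
```

With offsets taken from a backup index, each returned piece would correspond to one stored object in the archive.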