===============================================================================
crypto -- Encryption Layer for the Deltatar Backup
===============================================================================
- AES-GCM for the symmetric encryption;

- NIST Recommendation for Block Cipher Modes of Operation: Galois/Counter
  Mode (GCM) and GMAC
  http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf

- https://cryptome.org/2014/01/aes-gcm-v1.pdf

- Authentication weaknesses in GCM
  http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf
-------------------------------------------------------------------------------

Errors fall into roughly three categories:

- Cryptographic errors or invalid data.

    - ``InvalidGCMTag`` (decryption failed on account of an invalid GCM
      tag),
    - ``InvalidIVFixedPart`` (IV fixed part of object not found in list),
    - ``DuplicateIV`` (the IV of an encrypted object already occurred),
    - ``DecryptionError`` (used in CLI decryption for presenting error
      conditions to the user).

- Incorrect usage of the library.

    - ``InvalidParameter`` (non-conforming user supplied parameter),
    - ``InvalidHeader`` (data passed for reading not parsable into header),
    - ``FormatError`` (cannot handle header or parameter version),

- Bad internal state. If one of these is encountered it means that a state
  was reached that shouldn’t occur during normal processing.

Also, ``EndOfFile`` is used as a sentinel to communicate that a stream supplied
for reading is exhausted.

Initialization Vectors
-------------------------------------------------------------------------------

Initialization vectors are checked for reuse during the lifetime of a decryptor.
The fixed counters for metadata files cannot be reused and attempts to do so
will cause a ``DuplicateIV`` error. Because a metadata file therefore has to fit
into a single encrypted object, the length of objects encrypted with a metadata
counter is capped at 63 GB.

For ordinary, non-metadata payload, there is an optional mode with strict IV
checking that causes a crypto context to fail if an IV encountered or created
was already used for decrypting or encrypting, respectively, an earlier object.
Note that this mode can trigger false positives when decrypting non-linearly,
e.g. when traversing the same object multiple times. Since the crypto context
has no notion of a position in a PDT encrypted archive, this condition must be
sorted out downstream.
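
A minimal sketch of strict IV checking, assuming the 12 byte IV layout used by
this module (an 8 byte fixed part followed by a 32 bit little endian counter).
The names ``IVTracker`` and ``DuplicateIVSketch`` are hypothetical and only
serve the illustration; they are not part of ``crypto.py``.

```python
import os
import struct

FMT_IV = "<8sL"  # 8 byte fixed part, then a 32 bit little endian counter


class DuplicateIVSketch(Exception):
    # hypothetical stand-in for the module's DuplicateIV exception
    pass


class IVTracker:
    # hypothetical helper: remembers every IV it has been shown

    def __init__(self, strict=False):
        self.strict = strict
        self.used = set()

    def check(self, iv):
        # with strict checking, refuse an IV that was seen before
        if self.strict and iv in self.used:
            raise DuplicateIVSketch("iv %s was reused" % iv.hex())
        self.used.add(iv)


fixed = os.urandom(8)
iv_a = struct.pack(FMT_IV, fixed, 3)
iv_b = struct.pack(FMT_IV, fixed, 4)

tracker = IVTracker(strict=True)
tracker.check(iv_a)
tracker.check(iv_b)
try:
    tracker.check(iv_a)  # e.g. a second traversal of the same object
    reused = False
except DuplicateIVSketch:
    reused = True
```

The second look at ``iv_a`` trips the check even though nothing is wrong with
the archive, which is the false positive described above for non-linear
decryption.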

-------------------------------------------------------------------------------

``crypto.py`` may be invoked as a script for decrypting, validating, and
splitting PDT encrypted files. Consult the usage message for details.

Decrypt from stdin using the password ‘foo’: ::

    $ crypto.py process foo -i - -o - <some-file.tar.gz.pdtcrypt >some-file.tar.gz

Output verbose information about the encrypted objects in the archive: ::

    $ crypto.py process foo -v -i some-file.tar.gz.pdtcrypt -o /dev/null
    PDT: decrypt from some-file.tar.gz.pdtcrypt
    PDT: decrypt to /dev/null
    PDT: source: file some-file.tar.gz.pdtcrypt
    PDT: sink: file /dev/null
    PDT: · version = 1 : 0100
    PDT: · paramversion = 1 : 0100
    PDT: · nacl : d270 b031 00d1 87e2 c946 610d 7b7f 7e5f
    PDT: · iv : 02ee 3dd7 a963 1eb1 0100 0000
    PDT: · ctsize = 591 : 4f02 0000 0000 0000
    PDT: · tag : 5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b
    PDT: 64 decrypt obj no. 1, 591 B
    PDT: · [64] 0% done, read block (591 B of 591 B remaining)
    PDT: · decrypt ciphertext 591 B
    PDT: · decrypt plaintext 591 B
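
The raw columns in the verbose output above are little endian. As a small
sketch using only the standard ``struct`` module, the ``iv`` and ``ctsize``
dumps from that sample can be decoded like this (the variable names are made
up for the illustration):

```python
import struct

# hex dumps copied from the sample output above
iv_raw = bytes.fromhex("02ee 3dd7 a963 1eb1 0100 0000")
ctsize_raw = bytes.fromhex("4f02 0000 0000 0000")

# the IV is an 8 byte fixed part followed by a 32 bit counter
fixed, counter = struct.unpack("<8sL", iv_raw)
# the ciphertext size is a 64 bit unsigned integer
(ctsize,) = struct.unpack("<Q", ctsize_raw)

print("fixed part:", fixed.hex())
print("counter   :", counter)
print("ctsize    :", ctsize)
```

The decoded ciphertext size agrees with the ``ctsize = 591`` shown in the same
output line.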

Also, the mode *scrypt* allows deriving encryption keys. To calculate the
encryption key from the password ‘foo’ and the salt of the first object in a
PDT encrypted file: ::

    $ crypto.py scrypt foo -i some-file.pdtcrypt
    {"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}

The computed 16 byte key is given in base64 notation in the value to ``key``
and can be fed into Python’s ``base64.b64decode()`` to obtain the corresponding
binary representation.

Note that in Scrypt hashing mode, no data integrity checks are performed. If
the wrong password is given, a wrong key will be derived. Whether the password
was indeed correct can only be determined by decrypting. Note also that since
PDT archives essentially consist of a stream of independent objects, the salt
and other parameters may change. Thus a key derived using the above method from
the first object doesn’t necessarily apply to any of the subsequent objects.
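
The sample output above carries base64 text in its ``salt`` and ``key``
fields. A minimal sketch of turning that line back into binary values, using
only the standard library (the variable names are chosen for the illustration):

```python
import base64
import json

# the JSON line printed by the scrypt mode in the sample above
line = ('{"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", '
        '"key": "JH9EkMwaM4x9F5aim5gK/Q=="}')

info = json.loads(line)
key = base64.b64decode(info["key"])    # binary AES key
salt = base64.b64decode(info["salt"])  # binary salt of the first object

print(len(key), len(salt), info["paramversion"])
```

Both values decode to 16 bytes, matching the key and salt sizes used by the
header format.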
131 from functools import reduce, partial
142 except ImportError as exn:
145 if __name__ == "__main__": ## Work around the import mechanism lest Python’s
146 pwd = os.getcwd() ## preference for local imports causes a cyclical
147 ## import (crypto → pylibscrypt → […] → ./tarfile → crypto).
148 sys.path = [ p for p in sys.path if p.find ("deltatar") < 0 ]
151 from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
152 from cryptography.hazmat.backends import default_backend
156 __all__ = [ "hdr_make", "hdr_read", "hdr_fmt", "hdr_fmt_pretty"
158 , "PDTCRYPT_HDR_SIZE", "AES_GCM_IV_CNT_DATA"
159 , "AES_GCM_IV_CNT_INFOFILE", "AES_GCM_IV_CNT_INDEX"
163 ###############################################################################
165 ###############################################################################
167 class EndOfFile (Exception):
171 def __init__ (self, n=None, msg=None):
177 class InvalidParameter (Exception):
178 """Inputs not valid for PDT encryption."""
182 class InvalidHeader (Exception):
183 """Header not valid."""
187 class InvalidGCMTag (Exception):
189 The GCM tag calculated during decryption differs from that in the object
195 class InvalidIVFixedPart (Exception):
197 IV fixed part not in supplied list: either the backup is corrupt or the
198 current object does not belong to it.
203 class IVFixedPartError (Exception):
205 Error creating a unique IV fixed part: repeated calls to system RNG yielded
206 the same sequence of bytes as the last IV used.
211 class InvalidFileCounter (Exception):
213 When encrypting, an attempted reuse of a dedicated counter (info file,
214 index file) was caught.
219 class DuplicateIV (Exception):
    During encryption, the current IV is identical to an already used IV
    (same fixed part and file counter). This indicates tampering or
    programmer error and cannot be recovered from.
228 class NonConsecutiveIV (Exception):
230 IVs not numbered consecutively. This is a hard error with strict IV
231 checking. Precludes random access to the encrypted objects.
236 class CiphertextTooLong (Exception):
238 An attempt was made to decrypt more data than the ciphertext size declared
239 in the object header.
244 class FormatError (Exception):
245 """Unusable parameters in header."""
249 class DecryptionError (Exception):
250 """Error during decryption with ``crypto.py`` on the command line."""
254 class Unreachable (Exception):
256 Makeshift __builtin_unreachable(); always a programmer error if
262 class InternalError (Exception):
263 """Errors not ascribable to bad user inputs or cryptography."""
267 ###############################################################################
268 ## crypto layer version
269 ###############################################################################
271 ENCRYPTION_PARAMETERS = \
273 { "kdf": ("dummy", 16)
274 , "enc": "passthrough" }
282 , "enc": "aes-gcm" } }
# Mode zero is unencrypted and only provided for testing purposes. It is
# rejected unless the encryptor / decryptor are explicitly instructed to
# accept it.
286 MIN_SECURE_PARAMETERS = 1
288 ###############################################################################
290 ###############################################################################
292 PDTCRYPT_HDR_MAGIC = b"PDTCRYPT"
294 PDTCRYPT_HDR_SIZE_MAGIC = 8 # 8
295 PDTCRYPT_HDR_SIZE_VERSION = 2 # 10
296 PDTCRYPT_HDR_SIZE_PARAMVERSION = 2 # 12
297 PDTCRYPT_HDR_SIZE_NACL = 16 # 28
298 PDTCRYPT_HDR_SIZE_IV = 12 # 40
299 PDTCRYPT_HDR_SIZE_CTSIZE = 8 # 48
300 PDTCRYPT_HDR_SIZE_TAG = 16 # 64 GCM auth tag
302 PDTCRYPT_HDR_SIZE = PDTCRYPT_HDR_SIZE_MAGIC + PDTCRYPT_HDR_SIZE_VERSION \
303 + PDTCRYPT_HDR_SIZE_PARAMVERSION + PDTCRYPT_HDR_SIZE_NACL \
304 + PDTCRYPT_HDR_SIZE_IV + PDTCRYPT_HDR_SIZE_CTSIZE \
305 + PDTCRYPT_HDR_SIZE_TAG # = 64
307 # precalculate offsets since Python can’t do constant folding over names
308 HDR_OFF_VERSION = PDTCRYPT_HDR_SIZE_MAGIC
309 HDR_OFF_PARAMVERSION = HDR_OFF_VERSION + PDTCRYPT_HDR_SIZE_VERSION
310 HDR_OFF_NACL = HDR_OFF_PARAMVERSION + PDTCRYPT_HDR_SIZE_PARAMVERSION
311 HDR_OFF_IV = HDR_OFF_NACL + PDTCRYPT_HDR_SIZE_NACL
312 HDR_OFF_CTSIZE = HDR_OFF_IV + PDTCRYPT_HDR_SIZE_IV
313 HDR_OFF_TAG = HDR_OFF_CTSIZE + PDTCRYPT_HDR_SIZE_CTSIZE
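
# Illustration only: a minimal sketch of how the size constants above add up
# to the running offsets and to the 64 byte header. The names ``SIZES``,
# ``offsets`` and ``FMT_HDR_SKETCH`` are hypothetical, and the struct format
# string is an assumption reconstructed from the field sizes; it is not
# necessarily identical to the module's own ``FMT_I2N_HDR``.

```python
import struct

# field name and size in bytes, in header order
SIZES = [("magic", 8), ("version", 2), ("paramversion", 2),
         ("nacl", 16), ("iv", 12), ("ctsize", 8), ("tag", 16)]

# running offsets: each field starts where the previous one ends
offsets = {}
pos = 0
for name, size in SIZES:
    offsets[name] = pos
    pos += size
HDR_SIZE = pos

# assumed little endian layout matching the sizes above
FMT_HDR_SKETCH = "<8sHH16s12sQ16s"

# build a dummy header and read one field back through its offset
iv = struct.pack("<8sL", bytes(8), 3)
hdr = struct.pack(FMT_HDR_SKETCH, b"PDTCRYPT", 1, 1, bytes(16), iv, 591,
                  bytes(16))
(ctsize,) = struct.unpack_from("<Q", hdr, offsets["ctsize"])
```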
317 FMT_I2N_IV = "<8sL" # 8 random bytes ‖ 32 bit counter
FMT_I2N_HDR = ("<" # little endian, regardless of host byte order
322 "16s" # sodium chloride
328 AES_KEY_SIZE = 16 # b"0123456789abcdef"
329 AES_KEY_SIZE_B64 = 24 # b'MDEyMzQ1Njc4OWFiY2RlZg=='
AES_GCM_MAX_SIZE = (1 << 36) - (1 << 5) # in bytes; equals 2^39 - 2^8 bits ≅ 64 GiB.
# Source: NIST SP 800-38D section 5.2.1.1
# https://crypto.stackexchange.com/questions/31793/plain-text-size-limits-for-aes-gcm-mode-just-64gb
335 PDTCRYPT_MAX_OBJ_SIZE_DEFAULT = 63 * (1 << 30) # 63 GB
336 PDTCRYPT_MAX_OBJ_SIZE = PDTCRYPT_MAX_OBJ_SIZE_DEFAULT
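
# Illustration only: the arithmetic behind the two limits above, as a sketch.
# The names below are hypothetical; the values are copied from the constants.
# The GCM plaintext limit is stated in bits, the constants here in bytes.

```python
AES_GCM_MAX_BYTES = (1 << 36) - (1 << 5)  # value of AES_GCM_MAX_SIZE
AES_GCM_MAX_BITS = (1 << 39) - (1 << 8)   # the same limit expressed in bits
MAX_OBJ_BYTES = 63 * (1 << 30)            # default object size cap

# byte and bit forms describe the same limit
same_limit = AES_GCM_MAX_BYTES * 8 == AES_GCM_MAX_BITS

# the default cap stays below the GCM limit by a little under 1 GiB
headroom = AES_GCM_MAX_BYTES - MAX_OBJ_BYTES
print(same_limit, headroom)
```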
# index and info files are written on the fly while encrypting so their
# counters must be available in advance
340 AES_GCM_IV_CNT_INFOFILE = 1 # constant
341 AES_GCM_IV_CNT_INDEX = AES_GCM_IV_CNT_INFOFILE + 1
342 AES_GCM_IV_CNT_DATA = AES_GCM_IV_CNT_INDEX + 1 # also for multivolume
343 AES_GCM_IV_CNT_MAX_DEFAULT = 0xffFFffFF
344 AES_GCM_IV_CNT_MAX = AES_GCM_IV_CNT_MAX_DEFAULT
346 # IV structure and generation
347 PDTCRYPT_IV_GEN_MAX_RETRIES = 10 # ×
348 PDTCRYPT_IV_FIXEDPART_SIZE = 8 # B
349 PDTCRYPT_IV_COUNTER_SIZE = 4 # B
351 # secret type: PW of string | KEY of char [16]
352 PDTCRYPT_SECRET_PW = 0
353 PDTCRYPT_SECRET_KEY = 1
355 ###############################################################################
357 ###############################################################################
363 # , paramversion : u16
369 # fn hdr_read (f : handle) -> hdrinfo;
370 # fn hdr_make (f : handle, h : hdrinfo) -> IOResult<usize>;
371 # fn hdr_fmt (h : hdrinfo) -> String;
376 Read bytes as header structure.
378 If the input could not be interpreted as a header, fail with
383 mag, version, paramversion, nacl, iv, ctsize, tag = \
384 struct.unpack (FMT_I2N_HDR, data)
385 except Exception as exn:
386 raise InvalidHeader ("error unpacking header from [%r]: %s"
387 % (binascii.hexlify (data), str (exn)))
389 if mag != PDTCRYPT_HDR_MAGIC:
390 raise InvalidHeader ("bad magic in header: expected [%s], got [%s]"
391 % (PDTCRYPT_HDR_MAGIC, mag))
394 { "version" : version
395 , "paramversion" : paramversion
403 def hdr_read_stream (instr):
405 Read header from stream at the current position.
407 Fail with ``InvalidHeader`` if insufficient bytes were read from the
408 stream, or if the content could not be interpreted as a header.
410 data = instr.read(PDTCRYPT_HDR_SIZE)
414 elif ldata != PDTCRYPT_HDR_SIZE:
415 raise InvalidHeader ("hdr_read_stream: expected %d B, received %d B"
416 % (PDTCRYPT_HDR_SIZE, ldata))
417 return hdr_read (data)
420 def hdr_from_params (version, paramversion, nacl, iv, ctsize, tag):
422 Assemble the necessary values into a PDTCRYPT header.
424 :type version: int to fit uint16_t
425 :type paramversion: int to fit uint16_t
426 :type nacl: bytes to fit uint8_t[16]
427 :type iv: bytes to fit uint8_t[12]
    :type ctsize: int to fit uint64_t
429 :type tag: bytes to fit uint8_t[16]
431 buf = bytearray (PDTCRYPT_HDR_SIZE)
432 bufv = memoryview (buf)
435 struct.pack_into (FMT_I2N_HDR, bufv, 0,
437 version, paramversion, nacl, iv, ctsize, tag)
438 except Exception as exn:
439 return False, "error assembling header: %s" % str (exn)
441 return True, bytes (buf)
444 def hdr_make_dummy (s):
446 Create a header sized block of bytes initialized to a value derived from a
447 string. Used to verify we’ve jumped back correctly to the actual position
448 of the object header.
450 c = reduce (lambda a, c: a + ord(c), s, 0) % 0xFF
451 return bytes (bytearray (struct.pack ("B", c)) * PDTCRYPT_HDR_SIZE)
456 Assemble a header from the given header structure.
458 return hdr_from_params (version=hdr.get("version"),
459 paramversion=hdr.get("paramversion"),
460 nacl=hdr.get("nacl"), iv=hdr.get("iv"),
461 ctsize=hdr.get("ctsize"), tag=hdr.get("tag"))
464 HDR_FMT = "I2n_header { version: %d, paramversion: %d, nacl: %s[%d]," \
465 " iv: %s[%d], ctsize: %d, tag: %s[%d] }"
468 """Format a header structure into readable output."""
469 return HDR_FMT % (h["version"], h["paramversion"],
470 binascii.hexlify (h["nacl"]), len(h["nacl"]),
471 binascii.hexlify (h["iv"]), len(h["iv"]),
473 binascii.hexlify (h["tag"]), len(h["tag"]))
476 def hex_spaced_of_bytes (b):
477 """Format bytes object, hexdump style."""
478 return " ".join ([ "%.2x%.2x" % (c1, c2)
479 for c1, c2 in zip (b[0::2], b[1::2]) ]) \
480 + (len (b) | 1 == len (b) and " %.2x" % b[-1] or "") # odd lengths
483 def hdr_iv_counter (h):
484 """Extract the variable part of the IV of the given header."""
485 _fixed, cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
489 def hdr_iv_fixed (h):
490 """Extract the fixed part of the IV of the given header."""
491 fixed, _cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
495 hdr_dump = hex_spaced_of_bytes
499 """version = %-4d : %s
500 paramversion = %-4d : %s
507 def hdr_fmt_pretty (h):
509 Format header structure into multi-line representation of its contents and
510 their raw representation. (Omit the implicit “PDTCRYPT” magic bytes that
511 precede every header.)
513 return HDR_FMT_PRETTY \
515 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["version"])),
517 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["paramversion"])),
518 hex_spaced_of_bytes (h["nacl"]),
519 hex_spaced_of_bytes (h["iv"]),
521 hex_spaced_of_bytes (struct.pack (FMT_UINT64_LE, h["ctsize"])),
522 hex_spaced_of_bytes (h["tag"]))
524 IV_FMT = "((f %s) (c %d))"
527 """Format the two components of an IV in a readable fashion."""
528 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
529 return IV_FMT % (binascii.hexlify (fixed), cnt)
532 ###############################################################################
534 ###############################################################################
536 class Location (object):
540 def restore_loc_fmt (loc):
542 % (loc.n, loc.offset)
544 def locate_hdr_candidates (fd):
    Walk over instances of the magic string in the payload, collecting their
    positions. If the offset of the first found instance is not zero, the file
    begins with leading garbage. Used by disaster recovery.
550 :return: The list of offsets in the file.
554 mm = mmap.mmap(fd, 0, mmap.MAP_SHARED, mmap.PROT_READ)
557 pos = mm.find (PDTCRYPT_HDR_MAGIC, pos)
566 HDR_CAND_GOOD = 0 # header marks begin of valid object
567 HDR_CAND_FISHY = 1 # inconclusive (tag mismatch, obj overlap etc.)
568 HDR_CAND_JUNK = 2 # not a header / object unreadable
571 { HDR_CAND_GOOD : "valid"
572 , HDR_CAND_FISHY : "fishy"
573 , HDR_CAND_JUNK : "junk"
577 def verdict_fmt (vdt):
578 return HDR_VERDICT_NAME [vdt]
581 def inspect_hdr (fd, off):
583 Attempt to parse a header in *fd* at position *off*.
585 Returns a verdict about the quality of that header plus the parsed header
589 _ = os.lseek (fd, off, os.SEEK_SET)
591 if os.lseek (fd, 0, os.SEEK_CUR) != off:
592 if PDTCRYPT_VERBOSE is True:
593 noise ("PDT: %d → dismissed (lseek() past EOF)" % off)
594 return HDR_CAND_JUNK, None
596 raw = os.read (fd, PDTCRYPT_HDR_SIZE)
597 if len (raw) != PDTCRYPT_HDR_SIZE:
598 if PDTCRYPT_VERBOSE is True:
599 noise ("PDT: %d → dismissed (EOF inside header)" % off)
600 return HDR_CAND_JUNK, None
604 except InvalidHeader as exn:
605 if PDTCRYPT_VERBOSE is True:
606 noise ("PDT: %d → dismissed (invalid: [%s])" % (off, str (exn)))
607 return HDR_CAND_JUNK, None
609 obj0 = off + PDTCRYPT_HDR_SIZE
610 objX = obj0 + hdr ["ctsize"]
612 eof = os.lseek (fd, 0, os.SEEK_END)
614 if PDTCRYPT_VERBOSE is True:
615 noise ("PDT: %d → EOF inside object (%d≤%d≤%d); adjusting size to "
616 "%d" % (off, obj0, eof, objX, (eof - obj0)))
617 # try reading up to the end
618 hdr ["ctsize"] = eof - obj0
619 return HDR_CAND_FISHY, hdr
621 return HDR_CAND_GOOD, hdr
624 def try_decrypt (ifd, off, hdr, secret, ofd=-1):
626 Attempt to decrypt the object in the (seekable) descriptor *ifd* starting
627 at *off* using the metadata in *hdr* and *secret*. An output fd can be
628 specified with *ofd*; if it is *-1* – the default –, the decrypted payload
631 Always creates a fresh decryptor, so validation steps across objects don’t
    Errors during GCM tag validation are ignored. Used by disaster recovery.
636 ctleft = hdr ["ctsize"]
640 if ks == PDTCRYPT_SECRET_PW:
641 decr = Decrypt (password=secret [1])
642 elif ks == PDTCRYPT_SECRET_KEY:
644 decr = Decrypt (key=key)
651 os.lseek (ifd, pos, os.SEEK_SET)
654 cnksiz = min (ctleft, PDTCRYPT_BLOCKSIZE)
655 cnk = os.read (ifd, cnksiz)
658 pt = decr.process (cnk)
663 except InvalidGCMTag:
664 noise ("PDT: GCM tag mismatch for object %d–%d"
665 % (off, off + hdr ["ctsize"]))
666 if len (pt) > 0 and ofd != -1:
669 except Exception as exn:
670 noise ("PDT: error decrypting object %d–%d@%d, %d B remaining [%s]"
671 % (off, off + hdr ["ctsize"], pos, ctleft, exn))
677 def readable_objects_offsets (ifd, secret, cands):
679 From a list of candidates, locate the ones that mark the start of actual
680 readable PDTCRYPT objects.
684 for i, cand in enumerate (cands):
685 vdt, hdr = inspect_hdr (ifd, cand)
686 if vdt == HDR_CAND_JUNK:
687 pass # ignore unreadable ones
688 elif vdt in [HDR_CAND_GOOD, HDR_CAND_FISHY]:
689 ctsize = hdr ["ctsize"]
690 off0 = cand + PDTCRYPT_HDR_SIZE
691 ok = try_decrypt (ifd, off0, hdr, secret) == ctsize
693 good.append ((cand, off0 + ctsize))
695 overlap = find_overlaps (good)
697 return [ g [0] for g in good ]
700 def reconstruct_offsets (fname, secret):
701 ifd = os.open (fname, os.O_RDONLY)
704 cands = locate_hdr_candidates (ifd)
705 return readable_objects_offsets (ifd, secret, cands)
710 ###############################################################################
712 ###############################################################################
714 def make_secret (password=None, key=None):
    Safely create a “secret” value that consists either of a key or a password.
    Inputs are validated: the password is accepted as (UTF-8 encoded) bytes or
    string; for the key only a bytes object of the proper size or a hexadecimal
    or base64 encoded string thereof is accepted.
721 If both are provided, the key is preferred over the password; no checks are
722 performed whether the key is derived from the password.
724 :returns: secret value if inputs were acceptable | None otherwise.
727 if isinstance (key, str) is True:
728 key = key.encode ("utf-8")
729 if isinstance (key, bytes) is True:
730 if len (key) == AES_KEY_SIZE:
731 return (PDTCRYPT_SECRET_KEY, key)
732 if len (key) == AES_KEY_SIZE * 2:
734 key = binascii.unhexlify (key)
735 return (PDTCRYPT_SECRET_KEY, key)
736 except binascii.Error: # garbage in string
738 if len (key) == AES_KEY_SIZE_B64:
740 key = base64.b64decode (key)
741 # the base64 processor is very tolerant and allows for
742 # arbitrary trailing and leading data thus the data obtained
743 # must be checked for the proper length
744 if len (key) == AES_KEY_SIZE:
745 return (PDTCRYPT_SECRET_KEY, key)
746 except binascii.Error: # “incorrect padding”
748 elif password is not None:
749 if isinstance (password, str) is True:
750 return (PDTCRYPT_SECRET_PW, password)
751 elif isinstance (password, bytes) is True:
753 password = password.decode ("utf-8")
754 return (PDTCRYPT_SECRET_PW, password)
755 except UnicodeDecodeError:
761 ###############################################################################
762 ## passthrough / null encryption
763 ###############################################################################
765 class PassthroughCipher (object):
767 tag = struct.pack ("<QQ", 0, 0)
769 def __init__ (self) : pass
771 def update (self, b) : return b
773 def finalize (self) : return b""
775 def finalize_with_tag (self, _) : return b""
777 ###############################################################################
778 ## convenience wrapper
779 ###############################################################################
782 def kdf_dummy (klen, password, _nacl):
784 Fake KDF for testing purposes that is called when parameter version zero is
    if isinstance (password, bytes) is False:
        password = password.encode ()
    # measure the password only after encoding so that non-ASCII strings
    # still yield exactly ``klen`` bytes
    q, r = divmod (klen, len (password))
790 return password * q + password [:r], b""
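
# Illustration only: a standalone sketch of the repeat-and-truncate scheme of
# the dummy KDF. ``kdf_dummy_sketch`` is a hypothetical name; unlike the
# function above it drops the unused salt argument and returns only the key.
# It encodes the password before measuring its length.

```python
def kdf_dummy_sketch(klen, password):
    # accept str or bytes; work on bytes from here on
    if not isinstance(password, bytes):
        password = password.encode()
    # repeat the password q times, then append the first r bytes
    q, r = divmod(klen, len(password))
    return password * q + password[:r]


short = kdf_dummy_sketch(16, "foo")       # password shorter than the key
long_ = kdf_dummy_sketch(4, b"abcdefgh")  # password longer than the key
print(short, long_)
```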
793 SCRYPT_KEY_MEMO = { } # static because needed for both the info file and the archive
796 def kdf_scrypt (params, password, nacl):
798 Wrapper for the Scrypt KDF, corresponds to parameter version one. The
799 computation result is memoized based on the inputs to facilitate spawning
800 multiple encryption contexts.
805 dkLen = params["dkLen"]
808 nacl = os.urandom (params["NaCl_LEN"])
810 key_parms = (password, nacl, N, r, p, dkLen)
811 global SCRYPT_KEY_MEMO
812 if key_parms not in SCRYPT_KEY_MEMO:
813 SCRYPT_KEY_MEMO [key_parms] = \
814 pylibscrypt.scrypt (password, nacl, N, r, p, dkLen)
815 return SCRYPT_KEY_MEMO [key_parms], nacl
818 def kdf_by_version (paramversion=None, defs=None):
820 Pick the KDF handler corresponding to the parameter version or the
    :rtype: function (password : bytes, nacl : bytes) -> (key : bytes, nacl : bytes)
825 if paramversion is not None:
826 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
828 raise InvalidParameter ("no encryption parameters for version %r"
830 (kdf, params) = defs["kdf"]
832 if kdf == "scrypt" : fn = kdf_scrypt
833 elif kdf == "dummy" : fn = kdf_dummy
835 raise ValueError ("key derivation method %r unknown" % kdf)
836 return partial (fn, params)
839 ###############################################################################
841 ###############################################################################
843 def scrypt_hashsource (pw, ins):
845 Calculate the SCRYPT hash from the password and the information contained
846 in the first header found in ``ins``.
848 This does not validate whether the first object is encrypted correctly.
850 if isinstance (pw, str) is True:
852 elif isinstance (pw, bytes) is False:
853 raise InvalidParameter ("password must be a string, not %s"
855 if isinstance (ins, io.BufferedReader) is False and \
856 isinstance (ins, io.FileIO) is False:
857 raise InvalidParameter ("file to hash must be opened in “binary” mode")
860 hdr = hdr_read_stream (ins)
861 except EndOfFile as exn:
862 noise ("PDT: malformed input: end of file reading first object header")
867 pver = hdr ["paramversion"]
868 if PDTCRYPT_VERBOSE is True:
869 noise ("PDT: salt of first object : %s" % binascii.hexlify (nacl))
870 noise ("PDT: parameter version of archive : %d" % pver)
873 defs = ENCRYPTION_PARAMETERS.get(pver, None)
874 kdfname, params = defs ["kdf"]
875 if kdfname != "scrypt":
876 noise ("PDT: input is not an SCRYPT archive")
879 kdf = kdf_by_version (None, defs)
880 except ValueError as exn:
881 noise ("PDT: object has unknown parameter version %d" % pver)
883 hsh, _void = kdf (pw, nacl)
885 return hsh, nacl, hdr ["version"], pver
888 def scrypt_hashfile (pw, fname):
890 Calculate the SCRYPT hash from the password and the information contained
891 in the first header found in the given file. The header is read only at
894 with deptdcrypt_mk_stream (PDTCRYPT_SOURCE, fname or "-") as ins:
895 hsh, _void, _void, _void = scrypt_hashsource (pw, ins)
899 ###############################################################################
901 ###############################################################################
903 class Crypto (object):
905 Encryption context to remain alive throughout an entire tarfile pass.
910 cnt = None # file counter (uint32_t != 0)
911 iv = None # current IV
912 fixed = None # accu for 64 bit fixed parts of IV
913 used_ivs = None # tracks IVs
914 strict_ivs = False # if True, panic on duplicate object IV
917 insecure = False # allow plaintext parameters
924 info_counter_used = False
925 index_counter_used = False
927 def __init__ (self, *al, **akv):
928 self.used_ivs = set ()
929 self.set_parameters (*al, **akv)
932 def next_fixed (self):
937 def set_object_counter (self, cnt=None):
939 Safely set the internal counter of encrypted objects. Numerous
942 The same counter may not be reused in combination with one IV fixed
943 part. This is validated elsewhere in the IV handling.
945 Counter zero is invalid. The first two counters are reserved for
946 metadata. The implementation does not allow for splitting metadata
947 files over multiple encrypted objects. (This would be possible by
        assigning new fixed parts.) Thus in a Deltatar backup there is at most
        one object with a counter value of one and at most one with a counter
        value of two. On creation of a context, the initial counter may be
        chosen. The globals
951 ``AES_GCM_IV_CNT_INFOFILE`` and ``AES_GCM_IV_CNT_INDEX`` can be used to
952 request one of the reserved values. If one of these values has been
953 used, any further attempt of setting the counter to that value will
954 be rejected with an ``InvalidFileCounter`` exception.
        Out of bounds values (i. e. below one or above the maximum of 2³²)
        cause an ``InvalidParameter`` exception to be raised.
960 self.cnt = AES_GCM_IV_CNT_DATA
962 if cnt == 0 or cnt > AES_GCM_IV_CNT_MAX + 1:
963 raise InvalidParameter ("invalid counter value %d requested: "
964 "acceptable values are from 1 to %d"
965 % (cnt, AES_GCM_IV_CNT_MAX))
966 if cnt == AES_GCM_IV_CNT_INFOFILE:
967 if self.info_counter_used is True:
968 raise InvalidFileCounter ("attempted to reuse info file "
969 "counter %d: must be unique" % cnt)
970 self.info_counter_used = True
971 elif cnt == AES_GCM_IV_CNT_INDEX:
972 if self.index_counter_used is True:
                raise InvalidFileCounter ("attempted to reuse index file "
                                          "counter %d: must be unique" % cnt)
975 self.index_counter_used = True
976 if cnt <= AES_GCM_IV_CNT_MAX:
979 # cnt == AES_GCM_IV_CNT_MAX + 1 → wrap
980 self.cnt = AES_GCM_IV_CNT_DATA
984 def set_parameters (self, password=None, key=None, paramversion=None,
985 nacl=None, counter=None, strict_ivs=False,
988 Configure the internal state of a crypto context. Not intended for
991 A parameter version indicating passthrough (plaintext) mode is rejected
992 with an ``InvalidParameter`` unless ``insecure`` is set.
995 self.set_object_counter (counter)
996 self.strict_ivs = strict_ivs
998 self.insecure = insecure
1000 if paramversion is not None:
1001 if self.insecure is False \
1002 and paramversion < MIN_SECURE_PARAMETERS:
1003 raise InvalidParameter \
1004 ("set_parameters: requested parameter version %d but "
1005 "plaintext encryption disallowed in secure context!"
1007 self.paramversion = paramversion
1010 self.key, self.nacl = key, nacl
1013 if password is not None:
1014 if isinstance (password, bytes) is False:
1015 password = str.encode (password)
1016 self.password = password
1017 if paramversion is None and nacl is None:
1018 # postpone key setup until first header is available
1020 kdf = kdf_by_version (paramversion)
1022 self.key, self.nacl = kdf (password, nacl)
1025 def process (self, buf):
1027 Encrypt / decrypt a buffer. Invokes the ``.update()`` method on the
1028 wrapped encryptor or decryptor, respectively.
1030 The Cryptography exception ``AlreadyFinalized`` is translated to an
1031 ``InternalError`` at this point. It may occur in sound code when the GC
1032 closes an encrypting stream after an error. Everywhere else it must be
1035 if self.enc is None:
1036 raise RuntimeError ("process: context not initialized")
1037 self.stats ["in"] += len (buf)
1039 out = self.enc.update (buf)
1040 except cryptography.exceptions.AlreadyFinalized as exn:
1041 raise InternalError (exn)
1042 self.stats ["out"] += len (out)
1046 def next (self, password, paramversion, nacl, iv):
1048 Prepare for encrypting another object: Reset the data counters and
1049 change the configuration in case one of the variable parameters differs
1050 from the last object. Also check the IV for duplicates and error out
1051 if strict checking was requested.
1055 self.stats ["obj"] += 1
1057 self.check_duplicate_iv (iv)
1059 if ( self.paramversion != paramversion
1060 or self.password != password
1061 or self.nacl != nacl):
1062 self.set_parameters (password=password, paramversion=paramversion,
1063 nacl=nacl, strict_ivs=self.strict_ivs,
1064 insecure=self.insecure)
1067 def check_duplicate_iv (self, iv):
        Add an IV (the 12 byte representation as in the header) to the set of
        used IVs. With strict checking enabled, this raises ``DuplicateIV`` if
        the IV was seen before. Depending on the context, this may indicate a
        serious error (IV reuse).
1073 if self.strict_ivs is True and iv in self.used_ivs:
1074 raise DuplicateIV ("iv %s was reused" % iv_fmt (iv))
        # iv has not been used before; add to collection
1076 self.used_ivs.add (iv)
1079 def counters (self):
1081 Access the data counters.
1083 return self.stats ["obj"], self.stats ["in"], self.stats ["out"]
1088 Clear the current context regardless of its finalization state. The
1089 next operation must be ``.next()``.
1094 class Encrypt (Crypto):
1100 def __init__ (self, version, paramversion, password=None, key=None, nacl=None,
1101 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True, insecure=False):
1103 The ctor will throw immediately if one of the parameters does not conform
1104 to our expectations.
1107 :type version: int to fit uint16_t
1108 :type paramversion: int to fit uint16_t
1109 :param password: mutually exclusive with ``key``
        :type password: str
1111 :param key: mutually exclusive with ``password``
        :param counter: initial object counter; the values
                        ``AES_GCM_IV_CNT_INFOFILE`` and
                        ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
                        and cannot be reused even with different fixed parts.
1118 :type strict_ivs: bool
1119 :type insecure: bool, whether to permit passthrough mode
1121 if password is None and key is None \
1122 or password is not None and key is not None :
1123 raise InvalidParameter ("__init__: need either key or password")
1126 if isinstance (key, bytes) is False:
1127 raise InvalidParameter ("__init__: key must be provided as "
1128 "bytes, not %s" % type (key))
1130 raise InvalidParameter ("__init__: salt must be provided along "
1131 "with encryption key")
1132 else: # password, no key
1133 if isinstance (password, str) is False:
1134 raise InvalidParameter ("__init__: password must be a string, not %s"
1136 if len (password) == 0:
1137 raise InvalidParameter ("__init__: supplied empty password but not "
1138 "permitted for PDT encrypted files")
1140 if isinstance (version, int) is False:
1141 raise InvalidParameter ("__init__: version number must be an "
1142 "integer, not %s" % type (version))
1144 raise InvalidParameter ("__init__: version number must be a "
1145 "nonnegative integer, not %d" % version)
1147 if isinstance (paramversion, int) is False:
1148 raise InvalidParameter ("__init__: crypto parameter version number "
1149 "must be an integer, not %s"
1150 % type (paramversion))
1151 if paramversion < 0:
1152 raise InvalidParameter ("__init__: crypto parameter version number "
1153 "must be a nonnegative integer, not %d"
1156 if nacl is not None:
1157 if isinstance (nacl, bytes) is False:
1158 raise InvalidParameter ("__init__: salt given, but of type %s "
1159 "instead of bytes" % type (nacl))
1160 # salt length would depend on the actual encryption so it can’t be
1161 # validated at this point
1163 self.version = version
1164 self.paramenc = ENCRYPTION_PARAMETERS.get (paramversion) ["enc"]
1166 super().__init__ (password, key, paramversion, nacl, counter=counter,
1167 strict_ivs=strict_ivs, insecure=insecure)
1170 def next_fixed (self, retries=PDTCRYPT_IV_GEN_MAX_RETRIES):
1172 Generate the next IV fixed part by reading eight bytes from
1173 ``/dev/urandom``. The buffer so obtained is tested against the fixed
1174 parts used so far to prevent accidental reuse of IVs. After a
1175 configurable number of attempts to create a unique fixed part, it will
1176 refuse to continue with an ``IVFixedPartError``. This is unlikely to
1177 ever happen on a normal system but may detect an issue with the random
1180 The list of fixed parts that were used by the context at hand can be
1181 accessed through the ``.fixed`` list. Its last element is the fixed
1182 part currently in use.
1186 fp = os.urandom (PDTCRYPT_IV_FIXEDPART_SIZE)
1187 if fp not in self.fixed:
1188 self.fixed.append (fp)
1191 raise IVFixedPartError ("error obtaining a unique IV fixed part from "
1192 "/dev/urandom; giving up after %d tries" % i)
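# Illustration only, not part of the module: a minimal standalone sketch of the
# retry loop described in the ``next_fixed`` docstring. The names
# ``FIXEDPART_SIZE``, ``MAX_RETRIES`` and ``fresh_fixed_part`` are hypothetical
# stand-ins for the module's constants and method, and the random source is
# made injectable only so the failure path can be exercised.

```python
import os

FIXEDPART_SIZE = 8   # assumed: eight bytes, as stated in the docstring
MAX_RETRIES = 5      # hypothetical retry limit

def fresh_fixed_part(used, retries=MAX_RETRIES, rng=os.urandom):
    """Return a fixed part that is not yet in ``used`` and record it there."""
    for _ in range(retries):
        fp = rng(FIXEDPART_SIZE)
        if fp not in used:
            used.append(fp)   # the last element is the one currently in use
            return fp
    raise RuntimeError("no unique IV fixed part after %d tries" % retries)
```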
Construct a 12-byte IV from the current fixed part and the object
1200 return struct.pack(FMT_I2N_IV, self.fixed [-1], self.cnt)
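# Illustration only: a small sketch of how a 12-byte IV can be packed from an
# 8-byte fixed part and an object counter. The format string ``FMT_IV`` is an
# assumption; the actual definition of ``FMT_I2N_IV`` is not shown in this
# part of the file.

```python
import struct

# Assumed layout: 8-byte fixed part followed by a 32-bit big-endian counter.
FMT_IV = ">8sL"

def make_iv(fixed, counter):
    """Pack fixed part and counter into a 12-byte IV."""
    return struct.pack(FMT_IV, fixed, counter)

def split_iv(iv):
    """Inverse of make_iv: return (fixed, counter)."""
    return struct.unpack(FMT_IV, iv)
```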
1203 def next (self, filename=None, counter=None):
1205 Prepare for encrypting the next incoming object. Update the counter
1206 and put together the IV, possibly changing prefixes. Then create the
1209 The argument ``counter`` can be used to specify a file counter for this
1210 object. Unless it is one of the reserved values, the counter of
1211 subsequent objects will be computed from this one.
1213 If this is the first object in a series, ``filename`` is required,
1214 otherwise it is reused if not present. The value is used to derive a
1215 header sized placeholder to use until after encryption when all the
1216 inputs to construct the final header are available. This is then
1217 matched in ``.done()`` against the value found at the position of the
1218 header. The motivation for this extra check is primarily to assist
1219 format debugging: It makes stray headers easy to spot in malformed
1222 if filename is None:
1223 if self.lastinfo is None:
1224 raise InvalidParameter ("next: filename is mandatory for "
1226 filename, _dummy = self.lastinfo
1228 if isinstance (filename, str) is False:
raise InvalidParameter ("next: filename must be a string, not %s"
1231 if counter is not None:
1232 if isinstance (counter, int) is False:
1233 raise InvalidParameter ("next: the supplied counter is of "
1234 "invalid type %s; please pass an "
1235 "integer instead" % type (counter))
1236 self.set_object_counter (counter)
1238 self.iv = self.iv_make ()
1239 if self.paramenc == "aes-gcm":
1241 ( algorithms.AES (self.key)
1242 , modes.GCM (self.iv)
1243 , backend = default_backend ()) \
1245 elif self.paramenc == "passthrough":
1246 self.enc = PassthroughCipher ()
1248 raise InvalidParameter ("next: parameter version %d not known"
1249 % self.paramversion)
1250 hdrdum = hdr_make_dummy (filename)
1251 self.lastinfo = (filename, hdrdum)
1252 super().next (self.password, self.paramversion, self.nacl, self.iv)
1254 self.set_object_counter (self.cnt + 1)
1258 def done (self, cmpdata):
Complete encryption of an object. After this has been called, attempts
to encrypt further data will cause an error until ``.next()`` is
Returns a 64-byte buffer containing the object header with all
values, including the “late” ones, e.g. the ciphertext size and the
1268 if isinstance (cmpdata, bytes) is False:
1269 raise InvalidParameter ("done: comparison input expected as bytes, "
1270 "not %s" % type (cmpdata))
1271 if self.lastinfo is None:
1272 raise RuntimeError ("done: encryption context not initialized")
1273 filename, hdrdum = self.lastinfo
1274 if cmpdata != hdrdum:
1275 raise RuntimeError ("done: bad sync of header for object %d: "
1276 "preliminary data does not match; this likely "
1277 "indicates a wrongly repositioned stream"
1279 data = self.enc.finalize ()
1280 self.stats ["out"] += len (data)
1281 self.ctsize += len (data)
1282 ok, hdr = hdr_from_params (self.version, self.paramversion, self.nacl,
1283 self.iv, self.ctsize, self.enc.tag)
1285 raise InternalError ("error constructing header: %r" % hdr)
1286 return data, hdr, self.fixed
1289 def process (self, buf):
1291 Encrypt a chunk of plaintext with the active encryptor. Returns the
1292 size of the input consumed. This **must** be checked downstream. If the
1293 maximum possible object size has been reached, the current context must
1294 be finalized and a new one established before any further data can be
1295 encrypted. The second argument is the remainder of the plaintext that
1296 was not encrypted for the caller to use immediately after the new
1299 if isinstance (buf, bytes) is False:
1300 raise InvalidParameter ("process: expected byte buffer, not %s"
1303 newptsize = self.ptsize + bsize
1304 diff = newptsize - PDTCRYPT_MAX_OBJ_SIZE
1307 newptsize = PDTCRYPT_MAX_OBJ_SIZE
1308 self.ptsize = newptsize
1309 data = super().process (buf [:bsize])
1310 self.ctsize += len (data)
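# Illustration only: a sketch of the size cap described in the ``process``
# docstring above. A chunk is accepted only up to the object size limit, and
# the caller gets back whatever did not fit so it can be fed to the next
# object. ``MAX_OBJ_SIZE`` and ``consume`` are hypothetical names; the limit is
# deliberately tiny here and no encryption takes place.

```python
MAX_OBJ_SIZE = 16  # hypothetical limit, chosen small for demonstration

def consume(ptsize, buf, limit=MAX_OBJ_SIZE):
    """Return (new_ptsize, accepted, remainder) for one input chunk."""
    room = max(limit - ptsize, 0)      # bytes still permitted in this object
    accepted = buf[:room]
    remainder = buf[len(accepted):]    # must go into a fresh object
    return ptsize + len(accepted), accepted, remainder
```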
1314 class Decrypt (Crypto):
1316 tag = None # GCM tag, part of header
1317 last_iv = None # check consecutive ivs in strict mode
1320 def __init__ (self, password=None, key=None, counter=None, fixedparts=None,
1321 strict_ivs=False, insecure=False):
1323 Sanitizing ctor for the decryption context. ``fixedparts`` specifies a
1324 list of IV fixed parts accepted during decryption. If a fixed part is
1325 encountered that is not in the list, decryption will fail.
1327 :param password: mutually exclusive with ``key``
:type password: str
1329 :param key: mutually exclusive with ``password``
:type counter: initial object counter; the values
``AES_GCM_IV_CNT_INFOFILE`` and
``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
and cannot be reused even with different fixed parts.
1335 :type fixedparts: bytes list
1336 :type insecure: bool, whether to process objects encrypted in
1337 passthrough mode (*``paramversion`` < 1*)
1339 if password is None and key is None \
1340 or password is not None and key is not None :
1341 raise InvalidParameter ("__init__: need either key or password")
1344 if isinstance (key, bytes) is False:
1345 raise InvalidParameter ("__init__: key must be provided as "
1346 "bytes, not %s" % type (key))
1347 else: # password, no key
1348 if isinstance (password, str) is False:
1349 raise InvalidParameter ("__init__: password must be a string, not %s"
1351 if len (password) == 0:
raise InvalidParameter ("__init__: empty password supplied, which "
"is not permitted for PDT encrypted files")
1355 if fixedparts is not None:
1356 if isinstance (fixedparts, list) is False:
1357 raise InvalidParameter ("__init__: IV fixed parts must be "
1358 "supplied as list, not %s"
1359 % type (fixedparts))
1360 self.fixed = fixedparts
1363 super().__init__ (password=password, key=key, counter=counter,
1364 strict_ivs=strict_ivs, insecure=insecure)
1367 def valid_fixed_part (self, iv):
1369 Check if a fixed part was already seen.
1371 # check if fixed part is known
1372 fixed, _cnt = struct.unpack (FMT_I2N_IV, iv)
1373 i = bisect.bisect_left (self.fixed, fixed)
1374 return i != len (self.fixed) and self.fixed [i] == fixed
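# Illustration only: the membership test above relies on ``bisect``, which
# gives correct answers only when the list is sorted. Whether the caller
# guarantees a sorted ``fixedparts`` list is not visible in this part of the
# file, so the sketch below sorts explicitly. ``contains_sorted`` is a
# hypothetical helper name.

```python
import bisect

def contains_sorted(sorted_items, item):
    """Binary-search membership test; ``sorted_items`` must be sorted."""
    i = bisect.bisect_left(sorted_items, item)
    return i != len(sorted_items) and sorted_items[i] == item

# Sort once up front, then look up as often as needed.
parts = sorted([b"\x02" * 8, b"\x01" * 8])
```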
1377 def check_consecutive_iv (self, iv):
1379 Check whether the counter part of the given IV is indeed the successor
1380 of the currently present counter. This should always be the case for
1381 the objects in a well formed PDT archive but should not be enforced
1382 when decrypting out-of-order.
1384 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
1385 if self.strict_ivs is True \
1386 and self.last_iv is not None \
1387 and self.last_iv [0] == fixed \
1388 and self.last_iv [1] != cnt - 1:
raise NonConsecutiveIV ("iv %s counter not successor of "
"last object (expected %d, found %d)"
% (iv_fmt (iv), self.last_iv [1] + 1, cnt))
1392 self.last_iv = (fixed, cnt)
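# Illustration only: a standalone sketch of the strict IV check. Within one
# fixed part the counter must advance by exactly one; a change of fixed part
# starts a new sequence. ``FMT_IV``, ``NonConsecutive`` and ``IVTracker`` are
# hypothetical names, and the IV layout is an assumption.

```python
import struct

FMT_IV = ">8sL"  # assumed layout: 8-byte fixed part, 32-bit counter

class NonConsecutive(Exception):
    pass

class IVTracker:
    def __init__(self, strict=True):
        self.strict = strict
        self.last = None          # (fixed, counter) of the previous object

    def check(self, iv):
        fixed, cnt = struct.unpack(FMT_IV, iv)
        if self.strict and self.last is not None:
            last_fixed, last_cnt = self.last
            if last_fixed == fixed and cnt != last_cnt + 1:
                raise NonConsecutive("expected %d, found %d"
                                     % (last_cnt + 1, cnt))
        self.last = (fixed, cnt)
```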
1395 def next (self, hdr):
1397 Start decrypting the next object. The PDTCRYPT header for the object
1398 can be given either as already parsed object or as bytes.
1400 if isinstance (hdr, bytes) is True:
1401 hdr = hdr_read (hdr)
1402 elif isinstance (hdr, dict) is False:
1403 # this won’t catch malformed specs though
1404 raise InvalidParameter ("next: wrong type of parameter hdr: "
1405 "expected bytes or spec, got %s"
1408 paramversion = hdr ["paramversion"]
1412 ctsize = hdr ["ctsize"]
1414 raise InvalidHeader ("next: not a header %r" % hdr)
1416 if ctsize > PDTCRYPT_MAX_OBJ_SIZE:
1417 raise InvalidHeader ("next: ciphertext size %d exceeds maximum "
1419 % (ctsize, PDTCRYPT_MAX_OBJ_SIZE))
1421 self.hdr_ctsize = ctsize
1423 super().next (self.password, paramversion, nacl, iv)
1424 if self.fixed is not None and self.valid_fixed_part (iv) is False:
1425 raise InvalidIVFixedPart ("iv %s has invalid fixed part"
1427 self.check_consecutive_iv (iv)
1430 defs = ENCRYPTION_PARAMETERS.get (paramversion, None)
1432 raise FormatError ("header contains unknown parameter version %d; "
1433 "maybe the file was created by a more recent "
1434 "version of Deltatar" % paramversion)
1436 if enc == "aes-gcm":
1438 ( algorithms.AES (self.key)
1439 , modes.GCM (iv, tag=self.tag)
1440 , backend = default_backend ()) \
1442 elif enc == "passthrough":
1443 self.enc = PassthroughCipher ()
1445 raise InternalError ("encryption parameter set %d refers to unknown "
1446 "mode %r" % (paramversion, enc))
1447 self.set_object_counter (self.cnt + 1)
1450 def done (self, tag=None):
1452 Stop decryption of the current object and finalize it with the active
1453 context. This will throw an *InvalidGCMTag* exception to indicate that
1454 the authentication tag does not match the data. If the tag is correct,
1455 the rest of the plaintext is returned.
1460 data = self.enc.finalize ()
1462 if isinstance (tag, bytes) is False:
1463 raise InvalidParameter ("done: wrong type of parameter "
1464 "tag: expected bytes, got %s"
1466 data = self.enc.finalize_with_tag (self.tag)
1467 except cryptography.exceptions.InvalidTag:
1468 raise InvalidGCMTag ("done: tag mismatch of object %d: %s "
1469 "rejected by finalize ()"
1470 % (self.cnt, binascii.hexlify (self.tag)))
1471 self.ptsize += len (data)
1472 self.stats ["out"] += len (data)
1474 assert self.ctsize == self.ptsize == self.hdr_ctsize
1479 def process (self, buf):
1481 Decrypt the bytes object *buf* with the active decryptor.
1483 if isinstance (buf, bytes) is False:
1484 raise InvalidParameter ("process: expected byte buffer, not %s"
1486 self.ctsize += len (buf)
1487 if self.ctsize > self.hdr_ctsize:
1488 raise CiphertextTooLong ("process: object length exceeded: got "
"%d B but header specifies %d B"
1490 % (self.ctsize, self.hdr_ctsize))
1492 data = super().process (buf)
1493 self.ptsize += len (data)
1497 ###############################################################################
1499 ###############################################################################
1501 def _patch_global (glob, vow, n=None):
1503 Adapt upper file counter bound for testing IV logic. Completely unsafe.
1505 assert vow == "I am fully aware that this will void my warranty."
1506 r = globals () [glob]
1508 n = globals () [glob + "_DEFAULT"]
1509 globals () [glob] = n
1512 _testing_set_AES_GCM_IV_CNT_MAX = \
1513 partial (_patch_global, "AES_GCM_IV_CNT_MAX")
1515 _testing_set_PDTCRYPT_MAX_OBJ_SIZE = \
1516 partial (_patch_global, "PDTCRYPT_MAX_OBJ_SIZE")
1518 def open2_dump_file (fname, dir_fd, force=False):
1521 oflags = os.O_CREAT | os.O_WRONLY
1523 oflags |= os.O_TRUNC
1528 outfd = os.open (fname, oflags,
1529 stat.S_IRUSR | stat.S_IWUSR, dir_fd=dir_fd)
1530 except FileExistsError as exn:
1531 noise ("PDT: refusing to overwrite existing file %s" % fname)
1533 raise RuntimeError ("destination file %s already exists" % fname)
1534 if PDTCRYPT_VERBOSE is True:
1535 noise ("PDT: new output file %s (fd=%d)" % (fname, outfd))
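# Illustration only: a sketch of creating an output file relative to a
# directory descriptor, refusing to clobber existing files unless forced. The
# use of ``os.O_EXCL`` in the non-forced case is an assumption (the relevant
# lines are not shown here), and ``create_in_dir`` is a hypothetical name.
# ``dir_fd`` support depends on the platform.

```python
import os
import stat

def create_in_dir(fname, dir_fd, force=False):
    """Open ``fname`` for writing below ``dir_fd``; return the new fd."""
    flags = os.O_CREAT | os.O_WRONLY
    # Either truncate an existing file or insist that none exists yet.
    flags |= os.O_TRUNC if force else os.O_EXCL
    return os.open(fname, flags, stat.S_IRUSR | stat.S_IWUSR, dir_fd=dir_fd)
```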
1539 ###############################################################################
1540 ## freestanding invocation
1541 ###############################################################################
1543 PDTCRYPT_SUB_PROCESS = 0
1544 PDTCRYPT_SUB_SCRYPT = 1
1545 PDTCRYPT_SUB_SCAN = 2
1548 { "process" : PDTCRYPT_SUB_PROCESS
1549 , "scrypt" : PDTCRYPT_SUB_SCRYPT
1550 , "scan" : PDTCRYPT_SUB_SCAN }
1552 PDTCRYPT_DECRYPT = 1 << 0 # decrypt archive with password
1553 PDTCRYPT_SPLIT = 1 << 1 # split archive into individual objects
1554 PDTCRYPT_HASH = 1 << 2 # output scrypt hash for file and given password
1556 PDTCRYPT_SPLITNAME = "pdtcrypt-object-%d.bin"
1557 PDTCRYPT_RESCUENAME = "pdtcrypt-rescue-object-%0.5d.bin"
1559 PDTCRYPT_VERBOSE = False
1560 PDTCRYPT_STRICTIVS = False
1561 PDTCRYPT_OVERWRITE = False
1562 PDTCRYPT_BLOCKSIZE = 1 << 12
1567 PDTCRYPT_DEFAULT_VER = 1
1568 PDTCRYPT_DEFAULT_PVER = 1
1570 # scrypt hashing output control
1571 PDTCRYPT_SCRYPT_INTRANATOR = 0
1572 PDTCRYPT_SCRYPT_PARAMETERS = 1
1573 PDTCRYPT_SCRYPT_DEFAULT = PDTCRYPT_SCRYPT_INTRANATOR
1575 PDTCRYPT_SCRYPT_FORMAT = \
1576 { "i2n" : PDTCRYPT_SCRYPT_INTRANATOR
1577 , "params" : PDTCRYPT_SCRYPT_PARAMETERS }
1579 PDTCRYPT_TT_COLUMNS = 80 # assume standard terminal
1581 class PDTDecryptionError (Exception):
1582 """Decryption failed."""
1584 class PDTSplitError (Exception):
"""Splitting the archive failed."""
1588 def noise (*a, **b):
1589 print (file=sys.stderr, *a, **b)
1592 class PassthroughDecryptor (object):
1594 curhdr = None # write current header on first data write
1596 def __init__ (self):
1597 if PDTCRYPT_VERBOSE is True:
1598 noise ("PDT: no encryption; data passthrough")
1600 def next (self, hdr):
1601 ok, curhdr = hdr_make (hdr)
1603 raise PDTDecryptionError ("bad header %r" % hdr)
1604 self.curhdr = curhdr
1607 if self.curhdr is not None:
1611 def process (self, d):
1612 if self.curhdr is not None:
1618 def depdtcrypt (mode, secret, ins, outs):
1620 Remove PDTCRYPT layer from all objects encrypted with the secret. Used on a
1621 Deltatar backup this will yield a (possibly Gzip compressed) tarball.
1623 ctleft = -1 # length of ciphertext to consume
1624 ctcurrent = 0 # total ciphertext of current object
1625 total_obj = 0 # total number of objects read
1626 total_pt = 0 # total plaintext bytes
1627 total_ct = 0 # total ciphertext bytes
1628 total_read = 0 # total bytes read
1629 outfile = None # Python file object for output
1631 if mode & PDTCRYPT_DECRYPT: # decryptor
1633 if ks == PDTCRYPT_SECRET_PW:
1634 decr = Decrypt (password=secret [1], strict_ivs=PDTCRYPT_STRICTIVS)
1635 elif ks == PDTCRYPT_SECRET_KEY:
1637 decr = Decrypt (key=key, strict_ivs=PDTCRYPT_STRICTIVS)
1639 raise InternalError ("‘%d’ does not specify a valid kind of secret"
1642 decr = PassthroughDecryptor ()
1645 """Dummy for non-split mode: output file does not vary."""
1648 if mode & PDTCRYPT_SPLIT:
1649 def nextout (outfile):
1651 We were passed an fd as outs for accessing the destination
1652 directory where extracted archive components are supposed
1657 if PDTCRYPT_VERBOSE is True:
1658 noise ("PDT: no output file to close at this point")
1660 if PDTCRYPT_VERBOSE is True:
1661 noise ("PDT: release output file %r" % outfile)
1662 # cleanup happens automatically by the GC; the next
1663 # line will error out on account of an invalid fd
1666 assert total_obj > 0
1667 fname = PDTCRYPT_SPLITNAME % total_obj
1669 outfd = open2_dump_file (fname, outs, force=PDTCRYPT_OVERWRITE)
1670 except RuntimeError as exn:
1671 raise PDTSplitError (exn)
1672 return os.fdopen (outfd, "wb", closefd=True)
1676 """ESPIPE is normal on non-seekable stdio stream."""
1679 except OSError as exn:
1680 if exn.errno == errno.ESPIPE:
1683 def out (pt, outfile):
1687 if PDTCRYPT_VERBOSE is True:
1688 noise ("PDT:\t· decrypt plaintext %d B" % (npt))
1690 nn = outfile.write (pt)
1691 except OSError as exn: # probably ENOSPC
1692 raise DecryptionError ("error (%s)" % exn)
1694 raise DecryptionError ("write aborted after %d of %d B" % (nn, npt))
1698 # current object completed; in a valid archive this marks either
1699 # the start of a new header or the end of the input
1700 if ctleft == 0: # current object requires finalization
1701 if PDTCRYPT_VERBOSE is True:
1702 noise ("PDT: %d finalize" % tell (ins))
1705 except InvalidGCMTag as exn:
1706 raise DecryptionError ("error finalizing object %d (%d B): "
1707 "%r" % (total_obj, len (pt), exn)) \
1710 if PDTCRYPT_VERBOSE is True:
1711 noise ("PDT:\t· object validated")
1713 if PDTCRYPT_VERBOSE is True:
1714 noise ("PDT: %d hdr" % tell (ins))
1716 hdr = hdr_read_stream (ins)
1717 total_read += PDTCRYPT_HDR_SIZE
1718 except EndOfFile as exn:
1719 total_read += exn.remainder
1720 if total_ct + total_obj * PDTCRYPT_HDR_SIZE != total_read:
1721 raise PDTDecryptionError ("ciphertext processed (%d B) plus "
1722 "overhead (%d × %d B) does not match "
"the number of bytes read (%d B)"
1724 % (total_ct, total_obj, PDTCRYPT_HDR_SIZE,
1726 # the single good exit
1727 return total_read, total_obj, total_ct, total_pt
1728 except InvalidHeader as exn:
raise PDTDecryptionError ("invalid header at position %d in %r "
"(%s)" % (tell (ins), ins, exn))
1731 if PDTCRYPT_VERBOSE is True:
1732 pretty = hdr_fmt_pretty (hdr)
1733 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1734 pretty.splitlines (), ""))
1735 ctcurrent = ctleft = hdr ["ctsize"]
1739 total_obj += 1 # used in file counter with split mode
1741 # finalization complete or skipped in case of first object in
1742 # stream; create a new output file if necessary
1743 outfile = nextout (outfile)
1745 if PDTCRYPT_VERBOSE is True:
1746 noise ("PDT: %d decrypt obj no. %d, %d B"
1747 % (tell (ins), total_obj, ctleft))
1749 # always allocate a new buffer since python-cryptography doesn’t allow
1750 # passing a bytearray :/
1751 nexpect = min (ctleft, PDTCRYPT_BLOCKSIZE)
1752 if PDTCRYPT_VERBOSE is True:
1753 noise ("PDT:\t· [%d] %d%% done, read block (%d B of %d B remaining)"
1755 100 - ctleft * 100 / (ctcurrent > 0 and ctcurrent or 1),
1757 ct = ins.read (nexpect)
1761 raise EndOfFile (nct,
1762 "hit EOF after %d of %d B in block [%d:%d); "
1763 "%d B ciphertext remaining for object no %d"
1764 % (nct, nexpect, off, off + nexpect, ctleft,
1770 if PDTCRYPT_VERBOSE is True:
1771 noise ("PDT:\t· decrypt ciphertext %d B" % (nct))
1772 pt = decr.process (ct)
1776 def deptdcrypt_mk_stream (kind, path):
1777 """Create stream from file or stdio descriptor."""
1778 if kind == PDTCRYPT_SINK:
1780 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: stdout")
1781 return sys.stdout.buffer
1783 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: file %s" % path)
1784 return io.FileIO (path, "w")
1785 if kind == PDTCRYPT_SOURCE:
1787 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: stdin")
1788 return sys.stdin.buffer
1790 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: file %s" % path)
1791 return io.FileIO (path, "r")
1793 raise ValueError ("bogus stream “%s” / %s" % (kind, path))
1796 def mode_depdtcrypt (mode, secret, ins, outs):
1798 total_read, total_obj, total_ct, total_pt = \
1799 depdtcrypt (mode, secret, ins, outs)
1800 except DecryptionError as exn:
1801 noise ("PDT: Decryption failed:")
1803 noise ("PDT: “%s”" % exn)
1805 noise ("PDT: Did you specify the correct key / password?")
1808 except PDTSplitError as exn:
1809 noise ("PDT: Split operation failed:")
1811 noise ("PDT: “%s”" % exn)
1813 noise ("PDT: Hint: target directory should be empty.")
1817 if PDTCRYPT_VERBOSE is True:
1818 noise ("PDT: decryption successful" )
1819 noise ("PDT: %.10d bytes read" % total_read)
1820 noise ("PDT: %.10d objects decrypted" % total_obj )
1821 noise ("PDT: %.10d bytes ciphertext" % total_ct )
1822 noise ("PDT: %.10d bytes plaintext" % total_pt )
1828 def mode_scrypt (pw, ins=None, nacl=None, fmt=PDTCRYPT_SCRYPT_INTRANATOR):
1830 paramversion = PDTCRYPT_DEFAULT_PVER
1832 hsh, nacl, version, paramversion = scrypt_hashsource (pw, ins)
1833 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1835 nacl = binascii.unhexlify (nacl)
1836 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1837 version = PDTCRYPT_DEFAULT_VER
1839 kdfname, params = defs ["kdf"]
1841 kdf = kdf_by_version (None, defs)
1842 hsh, _void = kdf (pw, nacl)
1846 if fmt == PDTCRYPT_SCRYPT_INTRANATOR:
1847 out = json.dumps ({ "salt" : base64.b64encode (nacl).decode ()
1848 , "key" : base64.b64encode (hsh) .decode ()
1849 , "paramversion" : paramversion })
1850 elif fmt == PDTCRYPT_SCRYPT_PARAMETERS:
1851 out = json.dumps ({ "salt" : binascii.hexlify (nacl).decode ()
1852 , "key" : binascii.hexlify (hsh) .decode ()
1853 , "version" : version
1854 , "scrypt_params" : { "N" : params ["N"]
1855 , "r" : params ["r"]
1856 , "p" : params ["p"]
1857 , "dkLen" : params ["dkLen"] } })
1859 raise RuntimeError ("bad scrypt output scheme %r" % fmt)
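# Illustration only: a sketch of deriving a key with scrypt and emitting it as
# JSON with base64 fields, similar in spirit to the first output format above.
# It uses ``hashlib.scrypt`` from the standard library instead of the module's
# own KDF helpers; the cost parameters are hypothetical and are not taken from
# ``ENCRYPTION_PARAMETERS``.

```python
import base64
import hashlib
import json

def scrypt_json(password, salt, n=2**10, r=8, p=1, dklen=16):
    """Derive ``dklen`` bytes from ``password`` and ``salt``; return JSON."""
    key = hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=dklen)
    return json.dumps({ "salt": base64.b64encode(salt).decode()
                      , "key" : base64.b64encode(key).decode() })
```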
1864 def noise_output_candidates (cands, indent=8, cols=PDTCRYPT_TT_COLUMNS):
1866 Print a list of offsets without garbling the terminal too much.
1868 The indent is counted from column zero; if it is wide enough, the “PDT: ”
1869 marker will be prepended, considered part of the indentation.
1873 idt = " " * indent if indent < 5 else "PDT: " + " " * (indent - 5)
1878 init = True # prevent leading separator
1881 raise ValueError ("the requested indentation exceeds the line "
1882 "width by %d" % (indent - wd))
1892 if lpos > wd: # line break
SLICE_START = 1 # ordering is important to have ends of intervals
SLICE_END = 0 # sorted before equal starts
1911 def find_overlaps (slices):
1913 Find overlapping slices: iterate open/close points of intervals, tracking
1914 the ones open at any time.
1917 inside = set () # of indices into bounds
1918 ovrlp = set () # of indices into bounds
1920 for i, s in enumerate (slices):
1921 bounds.append ((s [0], SLICE_START, i))
1922 bounds.append ((s [1], SLICE_END , i))
1923 bounds = sorted (bounds)
1927 if val [1] == SLICE_START:
1930 if len (inside) > 1: # closing one that overlapped
1934 return [ slices [i] for i in ovrlp ]
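# Illustration only: a self-contained sketch of the sweep over open/close
# points that ``find_overlaps`` describes. It is not a copy of the function
# above and marks overlaps when an interval opens rather than when it closes.
# Slices are assumed to be half-open ``(lo, hi)`` pairs with ``lo < hi``;
# ``overlapping`` is a hypothetical name.

```python
START, END = 1, 0   # END < START: at equal positions, closings sort first

def overlapping(slices):
    """Return the slices that overlap at least one other slice."""
    bounds = []
    for i, (lo, hi) in enumerate(slices):
        bounds.append((lo, START, i))
        bounds.append((hi, END, i))
    inside, ovrlp = set(), set()    # indices currently open / found overlapping
    for _pos, kind, i in sorted(bounds):
        if kind == START:
            inside.add(i)
            if len(inside) > 1:     # opened while others are still open
                ovrlp |= inside
        else:
            inside.discard(i)
    return [slices[i] for i in sorted(ovrlp)]
```

# Because closings sort before openings at the same offset, back-to-back
# slices such as (0, 10) and (10, 20) are not reported.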
1937 def mode_scan (secret, fname, outs=None, nacl=None):
1939 Dissect a binary file, looking for PDTCRYPT headers and objects.
1941 If *outs* is supplied, recoverable data will be dumped into the specified
1945 ifd = os.open (fname, os.O_RDONLY)
1946 except FileNotFoundError:
1947 noise ("PDT: failed to open %s readonly" % fname)
1952 if PDTCRYPT_VERBOSE is True:
1953 noise ("PDT: scan for potential sync points")
1954 cands = locate_hdr_candidates (ifd)
1955 if len (cands) == 0:
1956 noise ("PDT: scan complete: input does not contain potential PDT "
1957 "headers; giving up.")
1959 if PDTCRYPT_VERBOSE is True:
1960 noise ("PDT: scan complete: found %d candidates:" % len (cands))
1961 noise_output_candidates (cands)
1966 junk, todo, slices = [], [], []
1971 vdt, hdr = inspect_hdr (ifd, cand)
1973 vdts = verdict_fmt (vdt)
1975 if vdt == HDR_CAND_JUNK:
noise ("PDT: obj %d: %s object: bad header, skipping" % (nobj, vdts))
1979 off0 = cand + PDTCRYPT_HDR_SIZE
1980 if PDTCRYPT_VERBOSE is True:
1981 noise ("PDT: obj %d: read payload @%d" % (nobj, off0))
1982 pretty = hdr_fmt_pretty (hdr)
1983 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1984 pretty.splitlines (), ""))
1987 if outs is not None:
1988 ofname = PDTCRYPT_RESCUENAME % nobj
1989 ofd = open2_dump_file (ofname, outs, force=PDTCRYPT_OVERWRITE)
1991 ctsize = hdr ["ctsize"]
1993 l = try_decrypt (ifd, off0, hdr, secret, ofd=ofd)
1995 slices.append ((off0, off0 + l))
1999 if vdt == HDR_CAND_GOOD and ok is True:
2000 noise ("PDT: %d → ✓ %s object %d–%d"
2001 % (cand, vdts, off0, off0 + ctsize))
2002 elif vdt == HDR_CAND_FISHY and ok is True:
2003 noise ("PDT: %d → × %s object %d–%d, corrupt header"
2004 % (cand, vdts, off0, off0 + ctsize))
2005 elif vdt == HDR_CAND_GOOD and ok is False:
2006 noise ("PDT: %d → × %s object %d–%d, problematic payload"
2007 % (cand, vdts, off0, off0 + ctsize))
2008 elif vdt == HDR_CAND_FISHY and ok is False:
2009 noise ("PDT: %d → × %s object %d–%d, corrupt header, problematic "
2010 "ciphertext" % (cand, vdts, off0, off0 + ctsize))
2017 noise ("PDT: all headers ok")
2019 noise ("PDT: %d candidates not parseable as headers:" % len (junk))
2020 noise_output_candidates (junk)
2022 overlap = find_overlaps (slices)
2023 if len (overlap) > 0:
2024 noise ("PDT: %d objects overlapping others" % len (overlap))
2025 for slice in overlap:
2026 noise ("PDT: × %d→%d" % (slice [0], slice [1]))
2028 def usage (err=False):
2032 indent = ' ' * len (SELF)
2033 out ("usage: %s SUBCOMMAND { --help" % SELF)
2034 out (" %s | [ -v ] { -p PASSWORD | -k KEY }" % indent)
2035 out (" %s [ { -i | --in } { - | SOURCE } ]" % indent)
2036 out (" %s [ { -n | --nacl } { SALT } ]" % indent)
2037 out (" %s [ { -o | --out } { - | DESTINATION } ]" % indent)
2038 out (" %s [ -D | --no-decrypt ] [ -S | --split ]" % indent)
2039 out (" %s [ -f | --format ]" % indent)
out ("\t\tSUBCOMMAND main mode: { process | scrypt | scan }")
2044 out ("\t\t process: extract objects from PDT archive")
2045 out ("\t\t scrypt: calculate hash from password and first object")
2046 out ("\t\t-p PASSWORD password to derive the encryption key from")
2047 out ("\t\t-k KEY encryption key as 16 bytes in hexadecimal notation")
2048 out ("\t\t-s enforce strict handling of initialization vectors")
2049 out ("\t\t-i SOURCE file name to read from")
2050 out ("\t\t-o DESTINATION file to write output to")
2051 out ("\t\t-n SALT provide salt for scrypt mode in hex encoding")
2052 out ("\t\t-v print extra info")
2053 out ("\t\t-S split into files at object boundaries; this")
2054 out ("\t\t requires DESTINATION to refer to directory")
2055 out ("\t\t-D PDT header and ciphertext passthrough")
out ("\t\t-f format of SCRYPT hash output (“i2n” or “params”)")
out ("\tinstead of filenames, “-” may be used to specify stdin / stdout")
2060 sys.exit ((err is True) and 42 or 0)
2070 def parse_argv (argv):
2071 global PDTCRYPT_OVERWRITE
2073 mode = PDTCRYPT_DECRYPT
2079 scrypt_format = PDTCRYPT_SCRYPT_DEFAULT
2082 SELF = os.path.basename (next (argvi))
2085 rawsubcmd = next (argvi)
2086 subcommand = PDTCRYPT_SUB [rawsubcmd]
2087 except StopIteration:
2088 bail ("ERROR: subcommand required")
2090 bail ("ERROR: invalid subcommand “%s” specified" % rawsubcmd)
2096 except StopIteration:
2097 bail ("ERROR: argument list incomplete")
2099 def checked_secret (s):
2104 bail ("ERROR: encountered “%s” but secret already given" % arg)
2107 if arg in [ "-h", "--help" ]:
2110 elif arg in [ "-v", "--verbose", "--wtf" ]:
2111 global PDTCRYPT_VERBOSE
2112 PDTCRYPT_VERBOSE = True
2113 elif arg in [ "-i", "--in", "--source" ]:
2114 insspec = checked_arg ()
2115 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt from %s" % insspec)
2116 elif arg in [ "-p", "--password" ]:
2117 arg = checked_arg ()
2118 checked_secret (make_secret (password=arg))
2119 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with password")
2121 if subcommand == PDTCRYPT_SUB_PROCESS:
2122 if arg in [ "-s", "--strict-ivs" ]:
2123 global PDTCRYPT_STRICTIVS
2124 PDTCRYPT_STRICTIVS = True
2125 elif arg in [ "-o", "--out", "--dest", "--sink" ]:
2126 outsspec = checked_arg ()
2127 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
2128 elif arg in [ "-f", "--force" ]:
2129 PDTCRYPT_OVERWRITE = True
2130 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
2131 elif arg in [ "-S", "--split" ]:
2132 mode |= PDTCRYPT_SPLIT
2133 if PDTCRYPT_VERBOSE is True: noise ("PDT: split files")
2134 elif arg in [ "-D", "--no-decrypt" ]:
2135 mode &= ~PDTCRYPT_DECRYPT
2136 if PDTCRYPT_VERBOSE is True: noise ("PDT: not decrypting")
2137 elif arg in [ "-k", "--key" ]:
2138 arg = checked_arg ()
2139 checked_secret (make_secret (key=arg))
2140 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with key")
2142 bail ("ERROR: unexpected positional argument “%s”" % arg)
2143 elif subcommand == PDTCRYPT_SUB_SCRYPT:
2144 if arg in [ "-n", "--nacl", "--salt" ]:
2145 nacl = checked_arg ()
2146 if PDTCRYPT_VERBOSE is True: noise ("PDT: salt key with %s" % nacl)
2147 elif arg in [ "-f", "--format" ]:
2148 arg = checked_arg ()
2150 scrypt_format = PDTCRYPT_SCRYPT_FORMAT [arg]
2152 bail ("ERROR: invalid scrypt output format %s" % arg)
2153 if PDTCRYPT_VERBOSE is True:
2154 noise ("PDT: scrypt output format “%s”" % scrypt_format)
2156 bail ("ERROR: unexpected positional argument “%s”" % arg)
2157 elif subcommand == PDTCRYPT_SUB_SCAN:
2158 if arg in [ "-o", "--out", "--dest", "--sink" ]:
2159 outsspec = checked_arg ()
2160 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
2161 elif arg in [ "-f", "--force" ]:
2162 PDTCRYPT_OVERWRITE = True
2163 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
2165 bail ("ERROR: unexpected positional argument “%s”" % arg)
2168 if PDTCRYPT_VERBOSE is True:
2169 noise ("ERROR: no password or key specified, trying $PDTCRYPT_PASSWORD")
2170 epw = os.getenv ("PDTCRYPT_PASSWORD")
2172 checked_secret (make_secret (password=epw.strip ()))
2175 if PDTCRYPT_VERBOSE is True:
2176 noise ("ERROR: no password or key specified, trying $PDTCRYPT_KEY")
2177 ek = os.getenv ("PDTCRYPT_KEY")
2179 checked_secret (make_secret (key=ek.strip ()))
2182 if subcommand == PDTCRYPT_SUB_SCRYPT:
2183 bail ("ERROR: scrypt hash mode requested but no password given")
2184 elif mode & PDTCRYPT_DECRYPT:
2185 bail ("ERROR: decryption requested but no password given")
2187 if mode & PDTCRYPT_SPLIT and outsspec is None:
2188 bail ("ERROR: split mode is incompatible with stdout sink "
2191 if subcommand == PDTCRYPT_SUB_SCAN and outsspec is None:
2192 pass # no output by default in scan mode
2193 elif mode & PDTCRYPT_SPLIT or subcommand == PDTCRYPT_SUB_SCAN:
2194 # destination must be directory
2196 bail ("ERROR: mode is incompatible with stdout sink")
2199 os.makedirs (outsspec, 0o700)
2200 except FileExistsError:
2201 # if it’s a directory with appropriate perms, everything is
2202 # good; otherwise, below invocation of open(2) will fail
2204 outs = os.open (outsspec, os.O_DIRECTORY, 0o600)
2205 except FileNotFoundError as exn:
2206 bail ("ERROR: cannot create target directory “%s”" % outsspec)
2207 except NotADirectoryError as exn:
2208 bail ("ERROR: target path “%s” is not a directory" % outsspec)
2210 outs = deptdcrypt_mk_stream (PDTCRYPT_SINK, outsspec or "-")
2212 if subcommand == PDTCRYPT_SUB_SCAN:
2214 bail ("ERROR: please supply an input file for scanning")
2216 bail ("ERROR: input must be seekable; please specify a file")
2217 return True, partial (mode_scan, secret, insspec, outs, nacl=nacl)
2219 if subcommand == PDTCRYPT_SUB_SCRYPT:
2220 if secret [0] == PDTCRYPT_SECRET_KEY:
2221 bail ("ERROR: scrypt mode requires a password")
2222 if insspec is not None and nacl is not None \
2223 or insspec is None and nacl is None :
2224 bail ("ERROR: please supply either an input file or "
2229 if insspec is not None or subcommand != PDTCRYPT_SUB_SCRYPT:
2230 ins = deptdcrypt_mk_stream (PDTCRYPT_SOURCE, insspec or "-")
2232 if subcommand == PDTCRYPT_SUB_SCRYPT:
2233 return True, partial (mode_scrypt, secret [1].encode (), ins, nacl,
2236 return True, partial (mode_depdtcrypt, mode, secret, ins, outs)
2240 ok, runner = parse_argv (argv)
2242 if ok is True: return runner ()
2247 if __name__ == "__main__":
2248 sys.exit (main (sys.argv))