6 ===============================================================================
7 crypto -- Encryption Layer for the Deltatar Backup
8 ===============================================================================
12 - AES-GCM for the symmetric encryption;
17 - NIST Recommendation for Block Cipher Modes of Operation: Galois/Counter
19 http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf
22 https://cryptome.org/2014/01/aes-gcm-v1.pdf
24 - Authentication weaknesses in GCM
25 http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf
27 Trouble with python-cryptography packages: authentication tags can only be
28 passed in advance: https://github.com/pyca/cryptography/pull/3421
31 -------------------------------------------------------------------------------
33 Errors fall into roughly three categories:
35 - Cryptographical errors or invalid data.
37 - ``InvalidGCMTag`` (decryption failed on account of an invalid GCM
39 - ``InvalidIVFixedPart`` (IV fixed part of object not found in list),
40 - ``DuplicateIV`` (the IV of an encrypted object already occurred),
41 - ``DecryptionError`` (used in CLI decryption for presenting error
42 conditions to the user).
44 - Incorrect usage of the library.
46 - ``InvalidParameter`` (non-conforming user supplied parameter),
47 - ``InvalidHeader`` (data passed for reading not parsable into header),
48 - ``FormatError`` (cannot handle header or parameter version),
51 - Bad internal state. If one of these is encountered it means that a state
52 was reached that shouldn’t occur during normal processing.
57 Also, ``EndOfFile`` is used as a sentinel to communicate that a stream supplied
58 for reading is exhausted.
60 Initialization Vectors
61 -------------------------------------------------------------------------------
Initialization vectors are checked for reuse during the lifetime of a decryptor.
The fixed counters for metadata files cannot be reused and attempts to do so
will cause a DuplicateIV error. This means the length of objects encrypted with
a metadata counter is capped at 63 GB.
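
As a rough sketch of where the 63 GB figure comes from, the following assumes
the per-invocation plaintext limit of 2^39 - 256 bits that NIST SP 800-38D
specifies for GCM; the names used are illustrative and not part of this module:

```python
# Hypothetical names, for illustration only.
GCM_MAX_BITS = (1 << 39) - (1 << 8)    # plaintext limit per GCM invocation, in bits
GCM_MAX_BYTES = GCM_MAX_BITS // 8      # 2^36 - 32 bytes, just under 64 GiB

# An object cap of 63 GiB leaves headroom below the hard limit.
OBJ_CAP_BYTES = 63 * (1 << 30)

print(GCM_MAX_BYTES)
print(OBJ_CAP_BYTES < GCM_MAX_BYTES)
```

Because a metadata counter may be used for one object only, such a file cannot
be continued in a second object and has to fit below the cap as a whole.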
68 For ordinary, non-metadata payload, there is an optional mode with strict IV
69 checking that causes a crypto context to fail if an IV encountered or created
70 was already used for decrypting or encrypting, respectively, an earlier object.
71 Note that this mode can trigger false positives when decrypting non-linearly,
72 e. g. when traversing the same object multiple times. Since the crypto context
73 has no notion of a position in a PDT encrypted archive, this condition must be
74 sorted out downstream.
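
The duplicate check can be pictured with the following minimal sketch. It
assumes an IV made of an eight byte fixed part followed by a 32 bit
little-endian counter; ``make_iv``, ``check_iv`` and ``DuplicateIVSketch`` are
hypothetical names chosen for illustration, not this module's API:

```python
import os
import struct

IV_FORMAT = "<8sL"   # 8 byte fixed part, then 32 bit counter: 12 bytes in total

class DuplicateIVSketch(Exception):
    pass             # hypothetical stand-in for a duplicate IV error

def make_iv(fixed, counter):
    # Concatenate fixed part and counter into the 12 byte IV.
    return struct.pack(IV_FORMAT, fixed, counter)

def check_iv(used, iv, strict=True):
    # With strict checking, refuse any IV that was recorded before.
    if strict and iv in used:
        raise DuplicateIVSketch("IV %s was reused" % iv.hex())
    used.add(iv)

used = set()
fixed = os.urandom(8)
iv = make_iv(fixed, 3)

check_iv(used, iv)         # first pass over the object: accepted
try:
    check_iv(used, iv)     # second pass over the same object
    reused = False
except DuplicateIVSketch:
    reused = True
```

Running the same IV through ``check_iv`` twice trips the check even though
nothing is wrong with the data, which is the false positive that non-linear
reads can provoke.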
77 -------------------------------------------------------------------------------
79 ``crypto.py`` may be invoked as a script for decrypting, validating, and
80 splitting PDT encrypted files. Consult the usage message for details.
84 Decrypt from stdin using the password ‘foo’: ::
86 $ crypto.py process foo -i - -o - <some-file.tar.gz.pdtcrypt >some-file.tar.gz
88 Output verbose information about the encrypted objects in the archive: ::
90 $ crypto.py process foo -v -i some-file.tar.gz.pdtcrypt -o /dev/null
91 PDT: decrypt from some-file.tar.gz.pdtcrypt
92 PDT: decrypt to /dev/null
93 PDT: source: file some-file.tar.gz.pdtcrypt
94 PDT: sink: file /dev/null
96 PDT: · version = 1 : 0100
97 PDT: · paramversion = 1 : 0100
98 PDT: · nacl : d270 b031 00d1 87e2 c946 610d 7b7f 7e5f
99 PDT: · iv : 02ee 3dd7 a963 1eb1 0100 0000
100 PDT: · ctsize = 591 : 4f02 0000 0000 0000
101 PDT: · tag : 5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b
102 PDT: 64 decrypt obj no. 1, 591 B
103 PDT: · [64] 0% done, read block (591 B of 591 B remaining)
104 PDT: · decrypt ciphertext 591 B
105 PDT: · decrypt plaintext 591 B
109 Also, the mode *scrypt* allows deriving encryption keys. To calculate the
110 encryption key from the password ‘foo’ and the salt of the first object in a
111 PDT encrypted file: ::
113 $ crypto.py scrypt foo -i some-file.pdtcrypt
114 {"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}
The computed 16 byte key is given in base64 notation in the value of
``key`` and can be fed into Python’s ``base64.b64decode()`` to obtain the
corresponding binary representation.
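
A minimal sketch of post-processing that output, assuming the ``salt`` and
``key`` values are base64 encoded as the sample line suggests (both are 24
characters long and end in ``==``):

```python
import base64
import json

# Sample line copied from the example output above.
line = '{"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}'

record = json.loads(line)
salt = base64.b64decode(record["salt"])   # raw salt bytes
key = base64.b64decode(record["key"])     # raw key bytes

print(len(salt), len(key))
```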
Note that in Scrypt hashing mode, no data integrity checks are performed.
If the wrong password is given, a wrong key will be derived. Whether the password
was indeed correct can only be determined by decrypting. Note that since PDT
archives essentially consist of a stream of independent objects, the salt and
other parameters may change. Thus a key derived using the above method from the
first object doesn’t necessarily apply to any of the subsequent objects.
134 from functools import reduce, partial
144 except ImportError as exn:
147 if __name__ == "__main__": ## Work around the import mechanism lest Python’s
148 pwd = os.getcwd() ## preference for local imports causes a cyclical
149 ## import (crypto → pylibscrypt → […] → ./tarfile → crypto).
150 sys.path = [ p for p in sys.path if p.find ("deltatar") < 0 ]
153 from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
154 from cryptography.hazmat.backends import default_backend
158 __all__ = [ "hdr_make", "hdr_read", "hdr_fmt", "hdr_fmt_pretty"
160 , "PDTCRYPT_HDR_SIZE", "AES_GCM_IV_CNT_DATA"
161 , "AES_GCM_IV_CNT_INFOFILE", "AES_GCM_IV_CNT_INDEX"
165 ###############################################################################
167 ###############################################################################
169 class EndOfFile (Exception):
173 def __init__ (self, n=None, msg=None):
179 class InvalidParameter (Exception):
180 """Inputs not valid for PDT encryption."""
184 class InvalidHeader (Exception):
185 """Header not valid."""
189 class InvalidGCMTag (Exception):
191 The GCM tag calculated during decryption differs from that in the object
197 class InvalidIVFixedPart (Exception):
199 IV fixed part not in supplied list: either the backup is corrupt or the
200 current object does not belong to it.
205 class IVFixedPartError (Exception):
207 Error creating a unique IV fixed part: repeated calls to system RNG yielded
208 the same sequence of bytes as the last IV used.
213 class InvalidFileCounter (Exception):
215 When encrypting, an attempted reuse of a dedicated counter (info file,
216 index file) was caught.
221 class DuplicateIV (Exception):
223 During encryption, the current IV fixed part is identical to an already
224 existing IV (same prefix and file counter). This indicates tampering or
225 programmer error and cannot be recovered from.
230 class NonConsecutiveIV (Exception):
232 IVs not numbered consecutively. This is a hard error with strict IV
233 checking. Precludes random access to the encrypted objects.
238 class FormatError (Exception):
239 """Unusable parameters in header."""
243 class DecryptionError (Exception):
244 """Error during decryption with ``crypto.py`` on the command line."""
248 class Unreachable (Exception):
250 Makeshift __builtin_unreachable(); always a programmer error if
256 class InternalError (Exception):
257 """Errors not ascribable to bad user inputs or cryptography."""
261 ###############################################################################
262 ## crypto layer version
263 ###############################################################################
265 ENCRYPTION_PARAMETERS = \
267 { "kdf": ("dummy", 16)
268 , "enc": "passthrough" }
276 , "enc": "aes-gcm" } }
278 ###############################################################################
280 ###############################################################################
282 PDTCRYPT_HDR_MAGIC = b"PDTCRYPT"
284 PDTCRYPT_HDR_SIZE_MAGIC = 8 # 8
285 PDTCRYPT_HDR_SIZE_VERSION = 2 # 10
286 PDTCRYPT_HDR_SIZE_PARAMVERSION = 2 # 12
287 PDTCRYPT_HDR_SIZE_NACL = 16 # 28
288 PDTCRYPT_HDR_SIZE_IV = 12 # 40
289 PDTCRYPT_HDR_SIZE_CTSIZE = 8 # 48
290 PDTCRYPT_HDR_SIZE_TAG = 16 # 64 GCM auth tag
292 PDTCRYPT_HDR_SIZE = PDTCRYPT_HDR_SIZE_MAGIC + PDTCRYPT_HDR_SIZE_VERSION \
293 + PDTCRYPT_HDR_SIZE_PARAMVERSION + PDTCRYPT_HDR_SIZE_NACL \
294 + PDTCRYPT_HDR_SIZE_IV + PDTCRYPT_HDR_SIZE_CTSIZE \
295 + PDTCRYPT_HDR_SIZE_TAG # = 64
297 # precalculate offsets since Python can’t do constant folding over names
298 HDR_OFF_VERSION = PDTCRYPT_HDR_SIZE_MAGIC
299 HDR_OFF_PARAMVERSION = HDR_OFF_VERSION + PDTCRYPT_HDR_SIZE_VERSION
300 HDR_OFF_NACL = HDR_OFF_PARAMVERSION + PDTCRYPT_HDR_SIZE_PARAMVERSION
301 HDR_OFF_IV = HDR_OFF_NACL + PDTCRYPT_HDR_SIZE_NACL
302 HDR_OFF_CTSIZE = HDR_OFF_IV + PDTCRYPT_HDR_SIZE_IV
303 HDR_OFF_TAG = HDR_OFF_CTSIZE + PDTCRYPT_HDR_SIZE_CTSIZE
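
# The field sizes and offsets above describe a fixed 64 byte header. A minimal
# sketch of packing and unpacking such a header follows. The format string
# ``HDR_FORMAT_SKETCH`` is an assumption pieced together from the sizes listed
# above and is not necessarily identical to ``FMT_I2N_HDR``; the field values
# are made up for illustration.

```python
import struct

# magic, version, paramversion, salt, IV, ciphertext size, GCM tag
HDR_FORMAT_SKETCH = "<8sHH16s12sQ16s"

raw = struct.pack(HDR_FORMAT_SKETCH,
                  b"PDTCRYPT",   # magic
                  1, 1,          # version, paramversion
                  bytes(16),     # salt (all zero placeholder)
                  bytes(12),     # IV (all zero placeholder)
                  591,           # ciphertext size
                  bytes(16))     # tag (all zero placeholder)

magic, version, paramversion, nacl, iv, ctsize, tag = \
    struct.unpack(HDR_FORMAT_SKETCH, raw)

# The ciphertext size sits at offset 8 + 2 + 2 + 16 + 12 = 40.
ctsize_at_offset = struct.unpack_from("<Q", raw, 40)[0]
```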
307 FMT_I2N_IV = "<8sL" # 8 random bytes ‖ 32 bit counter
FMT_I2N_HDR   = ("<"     # little endian byte order
312 "16s" # sodium chloride
318 AES_KEY_SIZE = 16 # b"0123456789abcdef"
319 AES_KEY_SIZE_B64 = 24 # b'MDEyMzQ1Njc4OWFiY2RlZg=='
AES_GCM_MAX_SIZE              = (1 << 36) - (1 << 5) # 2^39 - 2^8 bits ≅ 64 GB
321 PDTCRYPT_MAX_OBJ_SIZE_DEFAULT = 63 * (1 << 30) # 63 GB
322 PDTCRYPT_MAX_OBJ_SIZE = PDTCRYPT_MAX_OBJ_SIZE_DEFAULT
# index and info files are written on the fly while encrypting so their
# counters must be available in advance
326 AES_GCM_IV_CNT_INFOFILE = 1 # constant
327 AES_GCM_IV_CNT_INDEX = AES_GCM_IV_CNT_INFOFILE + 1
328 AES_GCM_IV_CNT_DATA = AES_GCM_IV_CNT_INDEX + 1 # also for multivolume
329 AES_GCM_IV_CNT_MAX_DEFAULT = 0xffFFffFF
330 AES_GCM_IV_CNT_MAX = AES_GCM_IV_CNT_MAX_DEFAULT
332 # IV structure and generation
333 PDTCRYPT_IV_GEN_MAX_RETRIES = 10 # ×
334 PDTCRYPT_IV_FIXEDPART_SIZE = 8 # B
335 PDTCRYPT_IV_COUNTER_SIZE = 4 # B
337 # secret type: PW of string | KEY of char [16]
338 PDTCRYPT_SECRET_PW = 0
339 PDTCRYPT_SECRET_KEY = 1
341 ###############################################################################
343 ###############################################################################
349 # , paramversion : u16
355 # fn hdr_read (f : handle) -> hdrinfo;
356 # fn hdr_make (f : handle, h : hdrinfo) -> IOResult<usize>;
357 # fn hdr_fmt (h : hdrinfo) -> String;
362 Read bytes as header structure.
364 If the input could not be interpreted as a header, fail with
369 mag, version, paramversion, nacl, iv, ctsize, tag = \
370 struct.unpack (FMT_I2N_HDR, data)
371 except Exception as exn:
372 raise InvalidHeader ("error unpacking header from [%r]: %s"
373 % (binascii.hexlify (data), str (exn)))
375 if mag != PDTCRYPT_HDR_MAGIC:
376 raise InvalidHeader ("bad magic in header: expected [%s], got [%s]"
377 % (PDTCRYPT_HDR_MAGIC, mag))
380 { "version" : version
381 , "paramversion" : paramversion
389 def hdr_read_stream (instr):
391 Read header from stream at the current position.
393 Fail with ``InvalidHeader`` if insufficient bytes were read from the
394 stream, or if the content could not be interpreted as a header.
396 data = instr.read(PDTCRYPT_HDR_SIZE)
400 elif ldata != PDTCRYPT_HDR_SIZE:
401 raise InvalidHeader ("hdr_read_stream: expected %d B, received %d B"
402 % (PDTCRYPT_HDR_SIZE, ldata))
403 return hdr_read (data)
406 def hdr_from_params (version, paramversion, nacl, iv, ctsize, tag):
408 Assemble the necessary values into a PDTCRYPT header.
410 :type version: int to fit uint16_t
411 :type paramversion: int to fit uint16_t
412 :type nacl: bytes to fit uint8_t[16]
413 :type iv: bytes to fit uint8_t[12]
414 :type size: int to fit uint64_t
415 :type tag: bytes to fit uint8_t[16]
417 buf = bytearray (PDTCRYPT_HDR_SIZE)
418 bufv = memoryview (buf)
421 struct.pack_into (FMT_I2N_HDR, bufv, 0,
423 version, paramversion, nacl, iv, ctsize, tag)
424 except Exception as exn:
425 return False, "error assembling header: %s" % str (exn)
427 return True, bytes (buf)
430 def hdr_make_dummy (s):
432 Create a header sized block of bytes initialized to a value derived from a
433 string. Used to verify we’ve jumped back correctly to the actual position
434 of the object header.
436 c = reduce (lambda a, c: a + ord(c), s, 0) % 0xFF
437 return bytes (bytearray (struct.pack ("B", c)) * PDTCRYPT_HDR_SIZE)
442 Assemble a header from the given header structure.
444 return hdr_from_params (version=hdr.get("version"),
445 paramversion=hdr.get("paramversion"),
446 nacl=hdr.get("nacl"), iv=hdr.get("iv"),
447 ctsize=hdr.get("ctsize"), tag=hdr.get("tag"))
450 HDR_FMT = "I2n_header { version: %d, paramversion: %d, nacl: %s[%d]," \
451 " iv: %s[%d], ctsize: %d, tag: %s[%d] }"
454 """Format a header structure into readable output."""
455 return HDR_FMT % (h["version"], h["paramversion"],
456 binascii.hexlify (h["nacl"]), len(h["nacl"]),
457 binascii.hexlify (h["iv"]), len(h["iv"]),
459 binascii.hexlify (h["tag"]), len(h["tag"]))
462 def hex_spaced_of_bytes (b):
463 """Format bytes object, hexdump style."""
464 return " ".join ([ "%.2x%.2x" % (c1, c2)
465 for c1, c2 in zip (b[0::2], b[1::2]) ]) \
466 + (len (b) | 1 == len (b) and " %.2x" % b[-1] or "") # odd lengths
469 def hdr_iv_counter (h):
470 """Extract the variable part of the IV of the given header."""
471 _fixed, cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
475 def hdr_iv_fixed (h):
476 """Extract the fixed part of the IV of the given header."""
477 fixed, _cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
481 hdr_dump = hex_spaced_of_bytes
485 """version = %-4d : %s
486 paramversion = %-4d : %s
493 def hdr_fmt_pretty (h):
495 Format header structure into multi-line representation of its contents and
496 their raw representation. (Omit the implicit “PDTCRYPT” magic bytes that
497 precede every header.)
499 return HDR_FMT_PRETTY \
501 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["version"])),
503 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["paramversion"])),
504 hex_spaced_of_bytes (h["nacl"]),
505 hex_spaced_of_bytes (h["iv"]),
507 hex_spaced_of_bytes (struct.pack (FMT_UINT64_LE, h["ctsize"])),
508 hex_spaced_of_bytes (h["tag"]))
510 IV_FMT = "((f %s) (c %d))"
513 """Format the two components of an IV in a readable fashion."""
514 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
515 return IV_FMT % (binascii.hexlify (fixed), cnt)
518 ###############################################################################
520 ###############################################################################
522 class Location (object):
526 def restore_loc_fmt (loc):
528 % (loc.n, loc.offset)
530 def locate_hdr_candidates (fd):
532 Walk over instances of the magic string in the payload, collecting their
533 positions. If the offset of the first found instance is not zero, the file
534 begins with leading garbage.
536 :return: The list of offsets in the file.
540 mm = mmap.mmap(fd, 0, mmap.MAP_SHARED, mmap.PROT_READ)
543 pos = mm.find (PDTCRYPT_HDR_MAGIC, pos)
552 HDR_CAND_GOOD = 0 # header marks begin of valid object
553 HDR_CAND_FISHY = 1 # inconclusive (tag mismatch, obj overlap etc.)
554 HDR_CAND_JUNK = 2 # not a header / object unreadable
557 def inspect_hdr (fd, off):
559 Attempt to parse a header in *fd* at position *off*.
561 Returns a verdict about the quality of that header plus the parsed header
565 _ = os.lseek (fd, off, os.SEEK_SET)
567 if os.lseek (fd, 0, os.SEEK_CUR) != off:
568 if PDTCRYPT_VERBOSE is True:
569 noise ("PDT: %d → dismissed (lseek() past EOF)" % off)
570 return HDR_CAND_JUNK, None
572 raw = os.read (fd, PDTCRYPT_HDR_SIZE)
573 if len (raw) != PDTCRYPT_HDR_SIZE:
574 if PDTCRYPT_VERBOSE is True:
575 noise ("PDT: %d → dismissed (EOF inside header)" % off)
576 return HDR_CAND_JUNK, None
580 except InvalidHeader as exn:
581 if PDTCRYPT_VERBOSE is True:
582 noise ("PDT: %d → dismissed (invalid: [%s])" % (off, str (exn)))
583 return HDR_CAND_JUNK, None
585 obj0 = off + PDTCRYPT_HDR_SIZE
586 objX = obj0 + hdr ["ctsize"]
588 eof = os.lseek (fd, 0, os.SEEK_END)
590 if PDTCRYPT_VERBOSE is True:
591 noise ("PDT: %d → EOF inside object (%d≤%d≤%d); adjusting size to "
592 "%d" % (off, obj0, eof, objX, (eof - obj0)))
593 # try reading up to the end
594 hdr ["ctsize"] = eof - obj0
595 return HDR_CAND_FISHY, hdr
597 return HDR_CAND_GOOD, hdr
600 def try_decrypt (ifd, off, hdr, secret, ofd=-1):
602 Attempt to decrypt the object in the (seekable) descriptor *ifd* starting
603 at *off* using the metadata in *hdr* and *secret*. An output fd can be
604 specified with *ofd*; if it is *-1* – the default –, the decrypted payload
607 Always creates a fresh decryptor, so validation steps across objects don’t
610 Errors during GCM tag validation are ignored.
612 ctleft = hdr ["ctsize"]
616 if ks == PDTCRYPT_SECRET_PW:
617 decr = Decrypt (password=secret [1])
618 elif ks == PDTCRYPT_SECRET_KEY:
620 decr = Decrypt (key=key)
627 os.lseek (ifd, pos, os.SEEK_SET)
629 cnksiz = min (ctleft, PDTCRYPT_BLOCKSIZE)
630 cnk = os.read (ifd, cnksiz)
633 pt = decr.process (cnk)
638 except InvalidGCMTag:
639 noise ("PDT: GCM tag mismatch for object %d–%d"
640 % (off, off + hdr ["ctsize"]))
641 if len (pt) > 0 and ofd != -1:
644 except Exception as exn:
645 noise ("PDT: error decrypting object %d–%d@%d, %d B remaining [%s]"
646 % (off, off + hdr ["ctsize"], pos, ctleft, exn))
652 def readable_objects_offsets (ifd, secret, cands):
654 From a list of candidates, locate the ones that mark the start of actual
655 readable PDTCRYPT objects.
661 vdt, hdr = inspect_hdr (ifd, cand)
662 if vdt == HDR_CAND_JUNK:
663 pass # ignore unreadable ones
664 elif vdt in [HDR_CAND_GOOD, HDR_CAND_FISHY]:
665 off0 = cand + PDTCRYPT_HDR_SIZE
666 ok = try_decrypt (ifd, off0, hdr, secret) == hdr ["ctsize"]
672 def reconstruct_offsets (fname, secret):
673 ifd = os.open (fname, os.O_RDONLY)
676 cands = locate_hdr_candidates (ifd)
677 return readable_objects_offsets (ifd, secret, cands)
682 ###############################################################################
684 ###############################################################################
686 def make_secret (password=None, key=None):
688 Safely create a “secret” value that consists either of a key or a password.
689 Inputs are validated: the password is accepted as (UTF-8 encoded) bytes or
690 string; for the key only a bytes object of the proper size or a base64
691 encoded string thereof is accepted.
693 If both are provided, the key is preferred over the password; no checks are
694 performed whether the key is derived from the password.
696 :returns: secret value if inputs were acceptable | None otherwise.
699 if isinstance (key, str) is True:
700 key = key.encode ("utf-8")
701 if isinstance (key, bytes) is True:
702 if len (key) == AES_KEY_SIZE:
703 return (PDTCRYPT_SECRET_KEY, key)
704 if len (key) == AES_KEY_SIZE * 2:
706 key = binascii.unhexlify (key)
707 return (PDTCRYPT_SECRET_KEY, key)
708 except binascii.Error: # garbage in string
710 if len (key) == AES_KEY_SIZE_B64:
712 key = base64.b64decode (key)
713 # the base64 processor is very tolerant and allows for
714 # arbitrary trailing and leading data thus the data obtained
715 # must be checked for the proper length
716 if len (key) == AES_KEY_SIZE:
717 return (PDTCRYPT_SECRET_KEY, key)
718 except binascii.Error: # “incorrect padding”
720 elif password is not None:
721 if isinstance (password, str) is True:
722 return (PDTCRYPT_SECRET_PW, password)
723 elif isinstance (password, bytes) is True:
725 password = password.decode ("utf-8")
726 return (PDTCRYPT_SECRET_PW, password)
727 except UnicodeDecodeError:
733 ###############################################################################
734 ## passthrough / null encryption
735 ###############################################################################
737 class PassthroughCipher (object):
739 tag = struct.pack ("<QQ", 0, 0)
741 def __init__ (self) : pass
743 def update (self, b) : return b
745 def finalize (self) : return b""
747 def finalize_with_tag (self, _) : return b""
749 ###############################################################################
750 ## convenience wrapper
751 ###############################################################################
754 def kdf_dummy (klen, password, _nacl):
756 Fake KDF for testing purposes that is called when parameter version zero is
    # encode first so that the length is counted in bytes, not characters
    if isinstance (password, bytes) is False:
        password = password.encode ()
    q, r = divmod (klen, len (password))
    return password * q + password [:r], b""
765 SCRYPT_KEY_MEMO = { } # static because needed for both the info file and the archive
768 def kdf_scrypt (params, password, nacl):
770 Wrapper for the Scrypt KDF, corresponds to parameter version one. The
771 computation result is memoized based on the inputs to facilitate spawning
772 multiple encryption contexts.
777 dkLen = params["dkLen"]
780 nacl = os.urandom (params["NaCl_LEN"])
782 key_parms = (password, nacl, N, r, p, dkLen)
783 global SCRYPT_KEY_MEMO
784 if key_parms not in SCRYPT_KEY_MEMO:
785 SCRYPT_KEY_MEMO [key_parms] = \
786 pylibscrypt.scrypt (password, nacl, N, r, p, dkLen)
787 return SCRYPT_KEY_MEMO [key_parms], nacl
790 def kdf_by_version (paramversion=None, defs=None):
792 Pick the KDF handler corresponding to the parameter version or the
795 :rtype: function (password : str, nacl : str) -> str
797 if paramversion is not None:
798 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
800 raise InvalidParameter ("no encryption parameters for version %r"
802 (kdf, params) = defs["kdf"]
804 if kdf == "scrypt" : fn = kdf_scrypt
805 if kdf == "dummy" : fn = kdf_dummy
807 raise ValueError ("key derivation method %r unknown" % kdf)
808 return partial (fn, params)
811 ###############################################################################
813 ###############################################################################
815 def scrypt_hashsource (pw, ins):
817 Calculate the SCRYPT hash from the password and the information contained
818 in the first header found in ``ins``.
820 This does not validate whether the first object is encrypted correctly.
822 if isinstance (pw, str) is True:
824 elif isinstance (pw, bytes) is False:
825 raise InvalidParameter ("password must be a string, not %s"
827 if isinstance (ins, io.BufferedReader) is False and \
828 isinstance (ins, io.FileIO) is False:
829 raise InvalidParameter ("file to hash must be opened in “binary” mode")
832 hdr = hdr_read_stream (ins)
833 except EndOfFile as exn:
834 noise ("PDT: malformed input: end of file reading first object header")
839 pver = hdr ["paramversion"]
840 if PDTCRYPT_VERBOSE is True:
841 noise ("PDT: salt of first object : %s" % binascii.hexlify (nacl))
842 noise ("PDT: parameter version of archive : %d" % pver)
845 defs = ENCRYPTION_PARAMETERS.get(pver, None)
846 kdfname, params = defs ["kdf"]
847 if kdfname != "scrypt":
848 noise ("PDT: input is not an SCRYPT archive")
851 kdf = kdf_by_version (None, defs)
852 except ValueError as exn:
853 noise ("PDT: object has unknown parameter version %d" % pver)
855 hsh, _void = kdf (pw, nacl)
857 return hsh, nacl, hdr ["version"], pver
860 def scrypt_hashfile (pw, fname):
862 Calculate the SCRYPT hash from the password and the information contained
863 in the first header found in the given file. The header is read only at
866 with deptdcrypt_mk_stream (PDTCRYPT_SOURCE, fname or "-") as ins:
867 hsh, _void, _void, _void = scrypt_hashsource (pw, ins)
871 ###############################################################################
873 ###############################################################################
875 class Crypto (object):
877 Encryption context to remain alive throughout an entire tarfile pass.
882 cnt = None # file counter (uint32_t != 0)
883 iv = None # current IV
884 fixed = None # accu for 64 bit fixed parts of IV
885 used_ivs = None # tracks IVs
886 strict_ivs = False # if True, panic on duplicate object IV
895 info_counter_used = False
896 index_counter_used = False
898 def __init__ (self, *al, **akv):
899 self.used_ivs = set ()
900 self.set_parameters (*al, **akv)
903 def next_fixed (self):
908 def set_object_counter (self, cnt=None):
910 Safely set the internal counter of encrypted objects. Numerous
913 The same counter may not be reused in combination with one IV fixed
914 part. This is validated elsewhere in the IV handling.
916 Counter zero is invalid. The first two counters are reserved for
917 metadata. The implementation does not allow for splitting metadata
918 files over multiple encrypted objects. (This would be possible by
919 assigning new fixed parts.) Thus in a Deltatar backup there is at most
920 one object with a counter value of one and two. On creation of a
921 context, the initial counter may be chosen. The globals
922 ``AES_GCM_IV_CNT_INFOFILE`` and ``AES_GCM_IV_CNT_INDEX`` can be used to
923 request one of the reserved values. If one of these values has been
924 used, any further attempt of setting the counter to that value will
925 be rejected with an ``InvalidFileCounter`` exception.
        Out of bounds values (i. e. below one or more than the maximum of 2³²)
        cause an ``InvalidParameter`` exception to be thrown.
931 self.cnt = AES_GCM_IV_CNT_DATA
933 if cnt == 0 or cnt > AES_GCM_IV_CNT_MAX + 1:
934 raise InvalidParameter ("invalid counter value %d requested: "
935 "acceptable values are from 1 to %d"
936 % (cnt, AES_GCM_IV_CNT_MAX))
937 if cnt == AES_GCM_IV_CNT_INFOFILE:
938 if self.info_counter_used is True:
939 raise InvalidFileCounter ("attempted to reuse info file "
940 "counter %d: must be unique" % cnt)
941 self.info_counter_used = True
942 elif cnt == AES_GCM_IV_CNT_INDEX:
943 if self.index_counter_used is True:
944 raise InvalidFileCounter ("attempted to reuse index file "
                                          "counter %d: must be unique" % cnt)
946 self.index_counter_used = True
947 if cnt <= AES_GCM_IV_CNT_MAX:
950 # cnt == AES_GCM_IV_CNT_MAX + 1 → wrap
951 self.cnt = AES_GCM_IV_CNT_DATA
955 def set_parameters (self, password=None, key=None, paramversion=None,
956 nacl=None, counter=None, strict_ivs=False):
958 Configure the internal state of a crypto context. Not intended for
962 self.set_object_counter (counter)
963 self.strict_ivs = strict_ivs
965 if paramversion is not None:
966 self.paramversion = paramversion
969 self.key, self.nacl = key, nacl
972 if password is not None:
973 if isinstance (password, bytes) is False:
974 password = str.encode (password)
975 self.password = password
976 if paramversion is None and nacl is None:
977 # postpone key setup until first header is available
979 kdf = kdf_by_version (paramversion)
981 self.key, self.nacl = kdf (password, nacl)
984 def process (self, buf):
986 Encrypt / decrypt a buffer. Invokes the ``.update()`` method on the
987 wrapped encryptor or decryptor, respectively.
989 The Cryptography exception ``AlreadyFinalized`` is translated to an
990 ``InternalError`` at this point. It may occur in sound code when the GC
991 closes an encrypting stream after an error. Everywhere else it must be
995 raise RuntimeError ("process: context not initialized")
996 self.stats ["in"] += len (buf)
998 out = self.enc.update (buf)
999 except cryptography.exceptions.AlreadyFinalized as exn:
1000 raise InternalError (exn)
1001 self.stats ["out"] += len (out)
1005 def next (self, password, paramversion, nacl, iv):
1007 Prepare for encrypting another object: Reset the data counters and
1008 change the configuration in case one of the variable parameters differs
1009 from the last object. Also check the IV for duplicates and error out
1010 if strict checking was requested.
1014 self.stats ["obj"] += 1
1016 self.check_duplicate_iv (iv)
1018 if ( self.paramversion != paramversion
1019 or self.password != password
1020 or self.nacl != nacl):
1021 self.set_parameters (password=password, paramversion=paramversion,
1022 nacl=nacl, strict_ivs=self.strict_ivs)
1025 def check_duplicate_iv (self, iv):
        Add an IV (the 12 byte representation as in the header) to the set of
        used IVs. With strict checking enabled, this will throw a
        ``DuplicateIV`` if the IV was seen before. Depending on the context,
        this may indicate a serious error (IV reuse).
1031 if self.strict_ivs is True and iv in self.used_ivs:
1032 raise DuplicateIV ("iv %s was reused" % iv_fmt (iv))
        # iv has not been used before; add to collection
1034 self.used_ivs.add (iv)
1037 def counters (self):
1039 Access the data counters.
1041 return self.stats ["obj"], self.stats ["in"], self.stats ["out"]
1046 Clear the current context regardless of its finalization state. The
1047 next operation must be ``.next()``.
1052 class Encrypt (Crypto):
1058 def __init__ (self, version, paramversion, password=None, key=None, nacl=None,
1059 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
1061 The ctor will throw immediately if one of the parameters does not conform
1062 to our expectations.
1065 :type version: int to fit uint16_t
1066 :type paramversion: int to fit uint16_t
1067 :param password: mutually exclusive with ``key``
1068 :type password: bytes
1069 :param key: mutually exclusive with ``password``
        :type      counter: initial object counter; the values
                            ``AES_GCM_IV_CNT_INFOFILE`` and
                            ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
                            and cannot be reused even with different fixed parts.
1076 :type strict_ivs: bool
1078 if password is None and key is None \
1079 or password is not None and key is not None :
1080 raise InvalidParameter ("__init__: need either key or password")
1083 if isinstance (key, bytes) is False:
1084 raise InvalidParameter ("__init__: key must be provided as "
1085 "bytes, not %s" % type (key))
1087 raise InvalidParameter ("__init__: salt must be provided along "
1088 "with encryption key")
1089 else: # password, no key
1090 if isinstance (password, str) is False:
1091 raise InvalidParameter ("__init__: password must be a string, not %s"
1093 if len (password) == 0:
                raise InvalidParameter ("__init__: empty passwords are not "
                                        "permitted for PDT encrypted files")
1097 if isinstance (version, int) is False:
1098 raise InvalidParameter ("__init__: version number must be an "
1099 "integer, not %s" % type (version))
1101 raise InvalidParameter ("__init__: version number must be a "
1102 "nonnegative integer, not %d" % version)
1104 if isinstance (paramversion, int) is False:
1105 raise InvalidParameter ("__init__: crypto parameter version number "
1106 "must be an integer, not %s"
1107 % type (paramversion))
1108 if paramversion < 0:
1109 raise InvalidParameter ("__init__: crypto parameter version number "
1110 "must be a nonnegative integer, not %d"
1113 if nacl is not None:
1114 if isinstance (nacl, bytes) is False:
1115 raise InvalidParameter ("__init__: salt given, but of type %s "
1116 "instead of bytes" % type (nacl))
1117 # salt length would depend on the actual encryption so it can’t be
1118 # validated at this point
1120 self.version = version
1121 self.paramenc = ENCRYPTION_PARAMETERS.get (paramversion) ["enc"]
1123 super().__init__ (password, key, paramversion, nacl, counter=counter,
1124 strict_ivs=strict_ivs)
1127 def next_fixed (self, retries=PDTCRYPT_IV_GEN_MAX_RETRIES):
1129 Generate the next IV fixed part by reading eight bytes from
1130 ``/dev/urandom``. The buffer so obtained is tested against the fixed
1131 parts used so far to prevent accidental reuse of IVs. After a
1132 configurable number of attempts to create a unique fixed part, it will
1133 refuse to continue with an ``IVFixedPartError``. This is unlikely to
1134 ever happen on a normal system but may detect an issue with the random
1137 The list of fixed parts that were used by the context at hand can be
1138 accessed through the ``.fixed`` list. Its last element is the fixed
1139 part currently in use.
1143 fp = os.urandom (PDTCRYPT_IV_FIXEDPART_SIZE)
1144 if fp not in self.fixed:
1145 self.fixed.append (fp)
1148 raise IVFixedPartError ("error obtaining a unique IV fixed part from "
1149 "/dev/urandom; giving up after %d tries" % i)
        Construct a 12-byte IV from the current fixed part and the object
1157 return struct.pack(FMT_I2N_IV, self.fixed [-1], self.cnt)
1160 def next (self, filename=None, counter=None):
1162 Prepare for encrypting the next incoming object. Update the counter
1163 and put together the IV, possibly changing prefixes. Then create the
1166 The argument ``counter`` can be used to specify a file counter for this
1167 object. Unless it is one of the reserved values, the counter of
1168 subsequent objects will be computed from this one.
1170 If this is the first object in a series, ``filename`` is required,
1171 otherwise it is reused if not present. The value is used to derive a
1172 header sized placeholder to use until after encryption when all the
1173 inputs to construct the final header are available. This is then
1174 matched in ``.done()`` against the value found at the position of the
1175 header. The motivation for this extra check is primarily to assist
1176 format debugging: It makes stray headers easy to spot in malformed
1179 if filename is None:
1180 if self.lastinfo is None:
1181 raise InvalidParameter ("next: filename is mandatory for "
1183 filename, _dummy = self.lastinfo
1185 if isinstance (filename, str) is False:
            raise InvalidParameter ("next: filename must be a string, not %s"
1188 if counter is not None:
1189 if isinstance (counter, int) is False:
1190 raise InvalidParameter ("next: the supplied counter is of "
1191 "invalid type %s; please pass an "
1192 "integer instead" % type (counter))
1193 self.set_object_counter (counter)
1195 self.iv = self.iv_make ()
1196 if self.paramenc == "aes-gcm":
1198 ( algorithms.AES (self.key)
1199 , modes.GCM (self.iv)
1200 , backend = default_backend ()) \
1202 elif self.paramenc == "passthrough":
1203 self.enc = PassthroughCipher ()
1205 raise InvalidParameter ("next: parameter version %d not known"
1206 % self.paramversion)
1207 hdrdum = hdr_make_dummy (filename)
1208 self.lastinfo = (filename, hdrdum)
1209 super().next (self.password, self.paramversion, self.nacl, self.iv)
1211 self.set_object_counter (self.cnt + 1)
1215 def done (self, cmpdata):
        Complete encryption of an object. After this has been called, attempts
        to encrypt further data will cause an error until ``.next()`` is
        Returns the remaining ciphertext, a 64-byte buffer containing the object
        header, and the list of IV fixed parts. The header includes all values,
        also the “late” ones, e.g. the ciphertext size and the
1225 if isinstance (cmpdata, bytes) is False:
1226 raise InvalidParameter ("done: comparison input expected as bytes, "
1227 "not %s" % type (cmpdata))
1228 if self.lastinfo is None:
1229 raise RuntimeError ("done: encryption context not initialized")
1230 filename, hdrdum = self.lastinfo
1231 if cmpdata != hdrdum:
1232 raise RuntimeError ("done: bad sync of header for object %d: "
1233 "preliminary data does not match; this likely "
1234 "indicates a wrongly repositioned stream"
1236 data = self.enc.finalize ()
1237 self.stats ["out"] += len (data)
1238 self.ctsize += len (data)
1239 ok, hdr = hdr_from_params (self.version, self.paramversion, self.nacl,
1240 self.iv, self.ctsize, self.enc.tag)
1242 raise InternalError ("error constructing header: %r" % hdr)
1243 return data, hdr, self.fixed
1246 def process (self, buf):
1248 Encrypt a chunk of plaintext with the active encryptor. Returns the
1249 size of the input consumed. This **must** be checked downstream. If the
1250 maximum possible object size has been reached, the current context must
1251 be finalized and a new one established before any further data can be
1252 encrypted. The second argument is the remainder of the plaintext that
1253 was not encrypted for the caller to use immediately after the new
1256 if isinstance (buf, bytes) is False:
1257 raise InvalidParameter ("process: expected byte buffer, not %s"
1260 newptsize = self.ptsize + bsize
1261 diff = newptsize - PDTCRYPT_MAX_OBJ_SIZE
1264 newptsize = PDTCRYPT_MAX_OBJ_SIZE
1265 self.ptsize = newptsize
1266 data = super().process (buf [:bsize])
1267 self.ctsize += len (data)
1271 class Decrypt (Crypto):
1273 tag = None # GCM tag, part of header
1274 last_iv = None # check consecutive ivs in strict mode
1276 def __init__ (self, password=None, key=None, counter=None, fixedparts=None,
1279 Sanitizing ctor for the decryption context. ``fixedparts`` specifies a
1280 list of IV fixed parts accepted during decryption. If a fixed part is
1281 encountered that is not in the list, decryption will fail.
1283 :param password: mutually exclusive with ``key``
1284 :type password: bytes
1285 :param key: mutually exclusive with ``password``
        :param counter: initial object counter; the values
                        ``AES_GCM_IV_CNT_INFOFILE`` and
                        ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
                        and cannot be reused even with different fixed parts.
1291 :type fixedparts: bytes list
1293 if password is None and key is None \
1294 or password is not None and key is not None :
1295 raise InvalidParameter ("__init__: need either key or password")
1298 if isinstance (key, bytes) is False:
1299 raise InvalidParameter ("__init__: key must be provided as "
1300 "bytes, not %s" % type (key))
1301 else: # password, no key
1302 if isinstance (password, str) is False:
1303 raise InvalidParameter ("__init__: password must be a string, not %s"
1305 if len (password) == 0:
                raise InvalidParameter ("__init__: empty password supplied; this is "
                                        "not permitted for PDT encrypted files")
1309 if fixedparts is not None:
1310 if isinstance (fixedparts, list) is False:
1311 raise InvalidParameter ("__init__: IV fixed parts must be "
1312 "supplied as list, not %s"
1313 % type (fixedparts))
1314 self.fixed = fixedparts
1317 super().__init__ (password=password, key=key, counter=counter,
1318 strict_ivs=strict_ivs)
1321 def valid_fixed_part (self, iv):
1323 Check if a fixed part was already seen.
1325 # check if fixed part is known
1326 fixed, _cnt = struct.unpack (FMT_I2N_IV, iv)
1327 i = bisect.bisect_left (self.fixed, fixed)
1328 return i != len (self.fixed) and self.fixed [i] == fixed
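# Illustration only: the membership test used by ``valid_fixed_part`` is the
# usual ``bisect`` idiom for looking up an item in a sorted list. The sketch
# below is self-contained; ``contains_sorted`` is a hypothetical name. Note
# that the idiom is only reliable if the list is sorted, otherwise present
# items may be reported as missing.

```python
import bisect

def contains_sorted (sorted_items, item):
    """Binary search for *item*; *sorted_items* must be in ascending order."""
    i = bisect.bisect_left (sorted_items, item)
    return i != len (sorted_items) and sorted_items [i] == item

known = sorted ([ b"\x03" * 8, b"\x01" * 8, b"\x02" * 8 ])

hit  = contains_sorted (known, b"\x02" * 8)   # fixed part in the list
miss = contains_sorted (known, b"\x04" * 8)   # fixed part not in the list
```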
1331 def check_consecutive_iv (self, iv):
1333 Check whether the counter part of the given IV is indeed the successor
1334 of the currently present counter. This should always be the case for
1335 the objects in a well formed PDT archive but should not be enforced
1336 when decrypting out-of-order.
1338 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
1339 if self.strict_ivs is True \
1340 and self.last_iv is not None \
1341 and self.last_iv [0] == fixed \
1342 and self.last_iv [1] != cnt - 1:
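            # three placeholders, three values: the IV, the expected counter
            # and the counter actually found
            raise NonConsecutiveIV ("iv %s counter not successor of "
                                    "last object (expected %d, found %d)"
                                    % (iv_fmt (iv), self.last_iv [1] + 1, cnt))
        # remember the fixed part, not the whole IV, so the comparison against
        # ``fixed`` above can match on the next call
        self.last_iv = (fixed, cnt)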
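# Illustration only: a stand-alone sketch of the strict counter check
# described in the docstring of ``check_consecutive_iv``. ``CounterTracker``
# and ``FMT_IV_SKETCH`` are hypothetical; the IV layout (eight-byte fixed part
# plus big-endian 32-bit counter) is an assumption, and ``ValueError`` stands
# in for the module's own exception type.

```python
import struct

FMT_IV_SKETCH = ">8sL"  # assumption: 8 B fixed part + 4 B big-endian counter

class CounterTracker (object):
    """Remember the previous IV and reject counters that skip ahead."""

    def __init__ (self, strict=True):
        self.strict = strict
        self.last   = None      # (fixed part, counter) of the previous IV

    def check (self, iv):
        fixed, cnt = struct.unpack (FMT_IV_SKETCH, iv)
        if self.strict is True \
                and self.last is not None \
                and self.last [0] == fixed \
                and self.last [1] + 1 != cnt:
            raise ValueError ("expected counter %d, found %d"
                              % (self.last [1] + 1, cnt))
        self.last = (fixed, cnt)
```

# A change of the fixed part starts a new sequence in this sketch, so only
# counters under the same fixed part are compared with each other.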
1349 def next (self, hdr):
        Start decrypting the next object. The PDTCRYPT header for the object
        can be given either as an already parsed object or as bytes.
1354 if isinstance (hdr, bytes) is True:
1355 hdr = hdr_read (hdr)
1356 elif isinstance (hdr, dict) is False:
1357 # this won’t catch malformed specs though
1358 raise InvalidParameter ("next: wrong type of parameter hdr: "
1359 "expected bytes or spec, got %s"
1362 paramversion = hdr ["paramversion"]
1367 raise InvalidHeader ("next: not a header %r" % hdr)
1369 super().next (self.password, paramversion, nacl, iv)
1370 if self.fixed is not None and self.valid_fixed_part (iv) is False:
1371 raise InvalidIVFixedPart ("iv %s has invalid fixed part"
1373 self.check_consecutive_iv (iv)
1376 defs = ENCRYPTION_PARAMETERS.get (paramversion, None)
1378 raise FormatError ("header contains unknown parameter version %d; "
1379 "maybe the file was created by a more recent "
1380 "version of Deltatar" % paramversion)
1382 if enc == "aes-gcm":
1384 ( algorithms.AES (self.key)
1385 , modes.GCM (iv, tag=self.tag)
1386 , backend = default_backend ()) \
1388 elif enc == "passthrough":
1389 self.enc = PassthroughCipher ()
1391 raise InternalError ("encryption parameter set %d refers to unknown "
1392 "mode %r" % (paramversion, enc))
1393 self.set_object_counter (self.cnt + 1)
1396 def done (self, tag=None):
1398 Stop decryption of the current object and finalize it with the active
1399 context. This will throw an *InvalidGCMTag* exception to indicate that
1400 the authentication tag does not match the data. If the tag is correct,
1401 the rest of the plaintext is returned.
1406 data = self.enc.finalize ()
1408 if isinstance (tag, bytes) is False:
1409 raise InvalidParameter ("done: wrong type of parameter "
1410 "tag: expected bytes, got %s"
1412 data = self.enc.finalize_with_tag (self.tag)
1413 except cryptography.exceptions.InvalidTag:
1414 raise InvalidGCMTag ("done: tag mismatch of object %d: %s "
1415 "rejected by finalize ()"
1416 % (self.cnt, binascii.hexlify (self.tag)))
1417 self.ctsize += len (data)
1418 self.stats ["out"] += len (data)
1422 def process (self, buf):
1424 Decrypt the bytes object *buf* with the active decryptor.
1426 if isinstance (buf, bytes) is False:
1427 raise InvalidParameter ("process: expected byte buffer, not %s"
1429 self.ctsize += len (buf)
1430 data = super().process (buf)
1431 self.ptsize += len (data)
1435 ###############################################################################
1437 ###############################################################################
1439 def _patch_global (glob, vow, n=None):
1441 Adapt upper file counter bound for testing IV logic. Completely unsafe.
1443 assert vow == "I am fully aware that this will void my warranty."
1444 r = globals () [glob]
1446 n = globals () [glob + "_DEFAULT"]
1447 globals () [glob] = n
1450 _testing_set_AES_GCM_IV_CNT_MAX = \
1451 partial (_patch_global, "AES_GCM_IV_CNT_MAX")
1453 _testing_set_PDTCRYPT_MAX_OBJ_SIZE = \
1454 partial (_patch_global, "PDTCRYPT_MAX_OBJ_SIZE")
1456 def open2_dump_file (fname, dir_fd, force=False):
1459 oflags = os.O_CREAT | os.O_WRONLY
1461 oflags |= os.O_TRUNC
1466 outfd = os.open (fname, oflags,
1467 stat.S_IRUSR | stat.S_IWUSR, dir_fd=dir_fd)
1468 except FileExistsError as exn:
1469 noise ("PDT: refusing to overwrite existing file %s" % fname)
1471 raise RuntimeError ("destination file %s already exists" % fname)
1472 if PDTCRYPT_VERBOSE is True:
1473 noise ("PDT: new output file %s (fd=%d)" % (fname, outfd))
1477 ###############################################################################
1478 ## freestanding invocation
1479 ###############################################################################
1481 PDTCRYPT_SUB_PROCESS = 0
1482 PDTCRYPT_SUB_SCRYPT = 1
1483 PDTCRYPT_SUB_SCAN = 2
1486 { "process" : PDTCRYPT_SUB_PROCESS
1487 , "scrypt" : PDTCRYPT_SUB_SCRYPT
1488 , "scan" : PDTCRYPT_SUB_SCAN }
1490 PDTCRYPT_DECRYPT = 1 << 0 # decrypt archive with password
1491 PDTCRYPT_SPLIT = 1 << 1 # split archive into individual objects
1492 PDTCRYPT_HASH = 1 << 2 # output scrypt hash for file and given password
1494 PDTCRYPT_SPLITNAME = "pdtcrypt-object-%d.bin"
1495 PDTCRYPT_RESCUENAME = "pdtcrypt-rescue-object-%0.5d.bin"
1497 PDTCRYPT_VERBOSE = False
1498 PDTCRYPT_STRICTIVS = False
1499 PDTCRYPT_OVERWRITE = False
1500 PDTCRYPT_BLOCKSIZE = 1 << 12
1505 PDTCRYPT_DEFAULT_VER = 1
1506 PDTCRYPT_DEFAULT_PVER = 1
1508 # scrypt hashing output control
1509 PDTCRYPT_SCRYPT_INTRANATOR = 0
1510 PDTCRYPT_SCRYPT_PARAMETERS = 1
1511 PDTCRYPT_SCRYPT_DEFAULT = PDTCRYPT_SCRYPT_INTRANATOR
1513 PDTCRYPT_SCRYPT_FORMAT = \
1514 { "i2n" : PDTCRYPT_SCRYPT_INTRANATOR
1515 , "params" : PDTCRYPT_SCRYPT_PARAMETERS }
1517 PDTCRYPT_TT_COLUMNS = 80 # assume standard terminal
1519 class PDTDecryptionError (Exception):
1520 """Decryption failed."""
1522 class PDTSplitError (Exception):
    """Splitting the archive into individual objects failed."""
1526 def noise (*a, **b):
1527 print (file=sys.stderr, *a, **b)
1530 class PassthroughDecryptor (object):
1532 curhdr = None # write current header on first data write
1534 def __init__ (self):
1535 if PDTCRYPT_VERBOSE is True:
1536 noise ("PDT: no encryption; data passthrough")
1538 def next (self, hdr):
1539 ok, curhdr = hdr_make (hdr)
1541 raise PDTDecryptionError ("bad header %r" % hdr)
1542 self.curhdr = curhdr
1545 if self.curhdr is not None:
1549 def process (self, d):
1550 if self.curhdr is not None:
1556 def depdtcrypt (mode, secret, ins, outs):
1558 Remove PDTCRYPT layer from all objects encrypted with the secret. Used on a
1559 Deltatar backup this will yield a (possibly Gzip compressed) tarball.
1561 ctleft = -1 # length of ciphertext to consume
1562 ctcurrent = 0 # total ciphertext of current object
1563 total_obj = 0 # total number of objects read
1564 total_pt = 0 # total plaintext bytes
1565 total_ct = 0 # total ciphertext bytes
1566 total_read = 0 # total bytes read
1567 outfile = None # Python file object for output
1569 if mode & PDTCRYPT_DECRYPT: # decryptor
1571 if ks == PDTCRYPT_SECRET_PW:
1572 decr = Decrypt (password=secret [1], strict_ivs=PDTCRYPT_STRICTIVS)
1573 elif ks == PDTCRYPT_SECRET_KEY:
1575 decr = Decrypt (key=key, strict_ivs=PDTCRYPT_STRICTIVS)
1577 raise InternalError ("‘%d’ does not specify a valid kind of secret"
1580 decr = PassthroughDecryptor ()
1583 """Dummy for non-split mode: output file does not vary."""
1586 if mode & PDTCRYPT_SPLIT:
1587 def nextout (outfile):
1589 We were passed an fd as outs for accessing the destination
1590 directory where extracted archive components are supposed
1595 if PDTCRYPT_VERBOSE is True:
1596 noise ("PDT: no output file to close at this point")
1598 if PDTCRYPT_VERBOSE is True:
1599 noise ("PDT: release output file %r" % outfile)
1600 # cleanup happens automatically by the GC; the next
1601 # line will error out on account of an invalid fd
1604 assert total_obj > 0
1605 fname = PDTCRYPT_SPLITNAME % total_obj
1607 outfd = open2_dump_file (fname, outs, force=PDTCRYPT_OVERWRITE)
1608 except RuntimeError as exn:
1609 raise PDTSplitError (exn)
1610 return os.fdopen (outfd, "wb", closefd=True)
1614 """ESPIPE is normal on non-seekable stdio stream."""
        except OSError as exn:
            import errno # ``os.errno`` was removed in Python 3.7
            if exn.errno == errno.ESPIPE:
1621 def out (pt, outfile):
1625 if PDTCRYPT_VERBOSE is True:
1626 noise ("PDT:\t· decrypt plaintext %d B" % (npt))
1628 nn = outfile.write (pt)
1629 except OSError as exn: # probably ENOSPC
1630 raise DecryptionError ("error (%s)" % exn)
1632 raise DecryptionError ("write aborted after %d of %d B" % (nn, npt))
1636 # current object completed; in a valid archive this marks either
1637 # the start of a new header or the end of the input
1638 if ctleft == 0: # current object requires finalization
1639 if PDTCRYPT_VERBOSE is True:
1640 noise ("PDT: %d finalize" % tell (ins))
1643 except InvalidGCMTag as exn:
1644 raise DecryptionError ("error finalizing object %d (%d B): "
1645 "%r" % (total_obj, len (pt), exn)) \
1648 if PDTCRYPT_VERBOSE is True:
1649 noise ("PDT:\t· object validated")
1651 if PDTCRYPT_VERBOSE is True:
1652 noise ("PDT: %d hdr" % tell (ins))
1654 hdr = hdr_read_stream (ins)
1655 total_read += PDTCRYPT_HDR_SIZE
1656 except EndOfFile as exn:
1657 total_read += exn.remainder
1658 if total_ct + total_obj * PDTCRYPT_HDR_SIZE != total_read:
1659 raise PDTDecryptionError ("ciphertext processed (%d B) plus "
1660 "overhead (%d × %d B) does not match "
                                          "the number of bytes read (%d)"
1662 % (total_ct, total_obj, PDTCRYPT_HDR_SIZE,
1664 # the single good exit
1665 return total_read, total_obj, total_ct, total_pt
1666 except InvalidHeader as exn:
                raise PDTDecryptionError ("invalid header at position %d in %r "
                                          "(%s)" % (tell (ins), ins, exn))
1669 if PDTCRYPT_VERBOSE is True:
1670 pretty = hdr_fmt_pretty (hdr)
1671 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1672 pretty.splitlines (), ""))
1673 ctcurrent = ctleft = hdr ["ctsize"]
1677 total_obj += 1 # used in file counter with split mode
1679 # finalization complete or skipped in case of first object in
1680 # stream; create a new output file if necessary
1681 outfile = nextout (outfile)
1683 if PDTCRYPT_VERBOSE is True:
1684 noise ("PDT: %d decrypt obj no. %d, %d B"
1685 % (tell (ins), total_obj, ctleft))
1687 # always allocate a new buffer since python-cryptography doesn’t allow
1688 # passing a bytearray :/
1689 nexpect = min (ctleft, PDTCRYPT_BLOCKSIZE)
1690 if PDTCRYPT_VERBOSE is True:
1691 noise ("PDT:\t· [%d] %d%% done, read block (%d B of %d B remaining)"
1693 100 - ctleft * 100 / (ctcurrent > 0 and ctcurrent or 1),
1695 ct = ins.read (nexpect)
1699 raise EndOfFile (nct,
1700 "hit EOF after %d of %d B in block [%d:%d); "
1701 "%d B ciphertext remaining for object no %d"
1702 % (nct, nexpect, off, off + nexpect, ctleft,
1708 if PDTCRYPT_VERBOSE is True:
1709 noise ("PDT:\t· decrypt ciphertext %d B" % (nct))
1710 pt = decr.process (ct)
1714 def deptdcrypt_mk_stream (kind, path):
1715 """Create stream from file or stdio descriptor."""
1716 if kind == PDTCRYPT_SINK:
1718 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: stdout")
1719 return sys.stdout.buffer
1721 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: file %s" % path)
1722 return io.FileIO (path, "w")
1723 if kind == PDTCRYPT_SOURCE:
1725 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: stdin")
1726 return sys.stdin.buffer
1728 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: file %s" % path)
1729 return io.FileIO (path, "r")
1731 raise ValueError ("bogus stream “%s” / %s" % (kind, path))
1734 def mode_depdtcrypt (mode, secret, ins, outs):
1736 total_read, total_obj, total_ct, total_pt = \
1737 depdtcrypt (mode, secret, ins, outs)
1738 except DecryptionError as exn:
1739 noise ("PDT: Decryption failed:")
1741 noise ("PDT: “%s”" % exn)
1743 noise ("PDT: Did you specify the correct key / password?")
1746 except PDTSplitError as exn:
1747 noise ("PDT: Split operation failed:")
1749 noise ("PDT: “%s”" % exn)
1751 noise ("PDT: Hint: target directory should be empty.")
1755 if PDTCRYPT_VERBOSE is True:
1756 noise ("PDT: decryption successful" )
1757 noise ("PDT: %.10d bytes read" % total_read)
1758 noise ("PDT: %.10d objects decrypted" % total_obj )
1759 noise ("PDT: %.10d bytes ciphertext" % total_ct )
1760 noise ("PDT: %.10d bytes plaintext" % total_pt )
1766 def mode_scrypt (pw, ins=None, nacl=None, fmt=PDTCRYPT_SCRYPT_INTRANATOR):
1768 paramversion = PDTCRYPT_DEFAULT_PVER
1770 hsh, nacl, version, paramversion = scrypt_hashsource (pw, ins)
1771 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1773 nacl = binascii.unhexlify (nacl)
1774 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1775 version = PDTCRYPT_DEFAULT_VER
1777 kdfname, params = defs ["kdf"]
1779 kdf = kdf_by_version (None, defs)
1780 hsh, _void = kdf (pw, nacl)
1784 if fmt == PDTCRYPT_SCRYPT_INTRANATOR:
1785 out = json.dumps ({ "salt" : base64.b64encode (nacl).decode ()
1786 , "key" : base64.b64encode (hsh) .decode ()
1787 , "paramversion" : paramversion })
1788 elif fmt == PDTCRYPT_SCRYPT_PARAMETERS:
1789 out = json.dumps ({ "salt" : binascii.hexlify (nacl).decode ()
1790 , "key" : binascii.hexlify (hsh) .decode ()
1791 , "version" : version
1792 , "scrypt_params" : { "N" : params ["N"]
1793 , "r" : params ["r"]
1794 , "p" : params ["p"]
1795 , "dkLen" : params ["dkLen"] } })
1797 raise RuntimeError ("bad scrypt output scheme %r" % fmt)
1802 def noise_output_candidates (cands, indent=8, cols=PDTCRYPT_TT_COLUMNS):
1804 Print a list of offsets without garbling the terminal too much.
1806 The indent is counted from column zero; if it is wide enough, the “PDT: ”
1807 marker will be prepended, considered part of the indentation.
1811 idt = " " * indent if indent < 5 else "PDT: " + " " * (indent - 5)
1816 init = True # prevent leading separator
1819 raise ValueError ("the requested indentation exceeds the line "
1820 "width by %d" % (indent - wd))
1830 if lpos > wd: # line break
1846 def mode_scan (secret, fname, outs=None, nacl=None):
1848 Dissect a binary file, looking for PDTCRYPT headers and objects.
1850 If *outs* is supplied, recoverable data will be dumped into the specified
1854 ifd = os.open (fname, os.O_RDONLY)
1855 except FileNotFoundError:
1856 noise ("PDT: failed to open %s readonly" % fname)
1861 if PDTCRYPT_VERBOSE is True:
1862 noise ("PDT: scan for potential sync points")
1863 cands = locate_hdr_candidates (ifd)
1864 if len (cands) == 0:
1865 noise ("PDT: scan complete: input does not contain potential PDT "
1866 "headers; giving up.")
1868 if PDTCRYPT_VERBOSE is True:
1869 noise ("PDT: scan complete: found %d candidates:" % len (cands))
1870 noise_output_candidates (cands)
1880 vdt, hdr = inspect_hdr (ifd, cand)
1881 if vdt == HDR_CAND_JUNK:
1884 off0 = cand + PDTCRYPT_HDR_SIZE
1885 if PDTCRYPT_VERBOSE is True:
1886 noise ("PDT: obj %d: read payload @%d" % (nobj, off0))
1887 pretty = hdr_fmt_pretty (hdr)
1888 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1889 pretty.splitlines (), ""))
1892 if outs is not None:
1893 ofname = PDTCRYPT_RESCUENAME % nobj
1894 ofd = open2_dump_file (ofname, outs, force=PDTCRYPT_OVERWRITE)
1897 ok = try_decrypt (ifd, off0, hdr, secret, ofd=ofd) == hdr ["ctsize"]
1901 if vdt == HDR_CAND_GOOD and ok is True:
1902 noise ("PDT: %d → ✓ valid object %d–%d"
1903 % (cand, off0, off0 + hdr ["ctsize"]))
1904 elif vdt == HDR_CAND_FISHY and ok is True:
1905 noise ("PDT: %d → × object %d–%d, corrupt header"
1906 % (cand, off0, off0 + hdr ["ctsize"]))
1907 elif vdt == HDR_CAND_GOOD and ok is False:
1908 noise ("PDT: %d → × object %d–%d, problematic payload"
1909 % (cand, off0, off0 + hdr ["ctsize"]))
1910 elif vdt == HDR_CAND_FISHY and ok is False:
1911 noise ("PDT: %d → × object %d–%d, corrupt header, problematic "
1912 "ciphertext" % (cand, off0, off0 + hdr ["ctsize"]))
1919 noise ("PDT: all headers ok")
1921 noise ("PDT: %d candidates not parseable as headers:" % len (junk))
1922 noise_output_candidates (junk)
1924 def usage (err=False):
1928 indent = ' ' * len (SELF)
1929 out ("usage: %s SUBCOMMAND { --help" % SELF)
1930 out (" %s | [ -v ] { -p PASSWORD | -k KEY }" % indent)
1931 out (" %s [ { -i | --in } { - | SOURCE } ]" % indent)
1932 out (" %s [ { -n | --nacl } { SALT } ]" % indent)
1933 out (" %s [ { -o | --out } { - | DESTINATION } ]" % indent)
1934 out (" %s [ -D | --no-decrypt ] [ -S | --split ]" % indent)
1935 out (" %s [ -f | --format ]" % indent)
    out ("\t\tSUBCOMMAND main mode: { process | scrypt | scan }")
    out ("\t\t process: extract objects from PDT archive")
    out ("\t\t scrypt: calculate hash from password and first object")
    out ("\t\t scan: dissect a file, looking for PDT headers and objects")
1942 out ("\t\t-p PASSWORD password to derive the encryption key from")
1943 out ("\t\t-k KEY encryption key as 16 bytes in hexadecimal notation")
1944 out ("\t\t-s enforce strict handling of initialization vectors")
1945 out ("\t\t-i SOURCE file name to read from")
1946 out ("\t\t-o DESTINATION file to write output to")
1947 out ("\t\t-n SALT provide salt for scrypt mode in hex encoding")
1948 out ("\t\t-v print extra info")
1949 out ("\t\t-S split into files at object boundaries; this")
1950 out ("\t\t requires DESTINATION to refer to directory")
1951 out ("\t\t-D PDT header and ciphertext passthrough")
    out ("\t\t-f format of SCRYPT hash output (“i2n” or “params”)")
    out ("\tinstead of filenames, “-” may be used to specify stdin / stdout")
1956 sys.exit ((err is True) and 42 or 0)
1966 def parse_argv (argv):
1967 global PDTCRYPT_OVERWRITE
1969 mode = PDTCRYPT_DECRYPT
1975 scrypt_format = PDTCRYPT_SCRYPT_DEFAULT
1978 SELF = os.path.basename (next (argvi))
1981 rawsubcmd = next (argvi)
1982 subcommand = PDTCRYPT_SUB [rawsubcmd]
1983 except StopIteration:
1984 bail ("ERROR: subcommand required")
1986 bail ("ERROR: invalid subcommand “%s” specified" % rawsubcmd)
1992 except StopIteration:
1993 bail ("ERROR: argument list incomplete")
1995 def checked_secret (s):
2000 bail ("ERROR: encountered “%s” but secret already given" % arg)
2003 if arg in [ "-h", "--help" ]:
2006 elif arg in [ "-v", "--verbose", "--wtf" ]:
2007 global PDTCRYPT_VERBOSE
2008 PDTCRYPT_VERBOSE = True
2009 elif arg in [ "-i", "--in", "--source" ]:
2010 insspec = checked_arg ()
2011 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt from %s" % insspec)
2012 elif arg in [ "-p", "--password" ]:
2013 arg = checked_arg ()
2014 checked_secret (make_secret (password=arg))
2015 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with password")
2017 if subcommand == PDTCRYPT_SUB_PROCESS:
2018 if arg in [ "-s", "--strict-ivs" ]:
2019 global PDTCRYPT_STRICTIVS
2020 PDTCRYPT_STRICTIVS = True
2021 elif arg in [ "-o", "--out", "--dest", "--sink" ]:
2022 outsspec = checked_arg ()
2023 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
2024 elif arg in [ "-f", "--force" ]:
2025 PDTCRYPT_OVERWRITE = True
2026 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
2027 elif arg in [ "-S", "--split" ]:
2028 mode |= PDTCRYPT_SPLIT
2029 if PDTCRYPT_VERBOSE is True: noise ("PDT: split files")
2030 elif arg in [ "-D", "--no-decrypt" ]:
2031 mode &= ~PDTCRYPT_DECRYPT
2032 if PDTCRYPT_VERBOSE is True: noise ("PDT: not decrypting")
2033 elif arg in [ "-k", "--key" ]:
2034 arg = checked_arg ()
2035 checked_secret (make_secret (key=arg))
2036 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with key")
2038 bail ("ERROR: unexpected positional argument “%s”" % arg)
2039 elif subcommand == PDTCRYPT_SUB_SCRYPT:
2040 if arg in [ "-n", "--nacl", "--salt" ]:
2041 nacl = checked_arg ()
2042 if PDTCRYPT_VERBOSE is True: noise ("PDT: salt key with %s" % nacl)
2043 elif arg in [ "-f", "--format" ]:
2044 arg = checked_arg ()
2046 scrypt_format = PDTCRYPT_SCRYPT_FORMAT [arg]
2048 bail ("ERROR: invalid scrypt output format %s" % arg)
2049 if PDTCRYPT_VERBOSE is True:
2050 noise ("PDT: scrypt output format “%s”" % scrypt_format)
2052 bail ("ERROR: unexpected positional argument “%s”" % arg)
2053 elif subcommand == PDTCRYPT_SUB_SCAN:
2054 if arg in [ "-o", "--out", "--dest", "--sink" ]:
2055 outsspec = checked_arg ()
2056 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
2057 elif arg in [ "-f", "--force" ]:
2058 PDTCRYPT_OVERWRITE = True
2059 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
2061 bail ("ERROR: unexpected positional argument “%s”" % arg)
2064 if PDTCRYPT_VERBOSE is True:
            noise ("PDT: no password or key specified, trying $PDTCRYPT_PASSWORD")
2066 epw = os.getenv ("PDTCRYPT_PASSWORD")
2068 checked_secret (make_secret (password=epw.strip ()))
2071 if PDTCRYPT_VERBOSE is True:
            noise ("PDT: no password or key specified, trying $PDTCRYPT_KEY")
2073 ek = os.getenv ("PDTCRYPT_KEY")
2075 checked_secret (make_secret (key=ek.strip ()))
2078 if subcommand == PDTCRYPT_SUB_SCRYPT:
2079 bail ("ERROR: scrypt hash mode requested but no password given")
2080 elif mode & PDTCRYPT_DECRYPT:
            bail ("ERROR: decryption requested but no password or key given")
2083 if mode & PDTCRYPT_SPLIT and outsspec is None:
2084 bail ("ERROR: split mode is incompatible with stdout sink "
2087 if subcommand == PDTCRYPT_SUB_SCAN and outsspec is None:
2088 pass # no output by default in scan mode
2089 elif mode & PDTCRYPT_SPLIT or subcommand == PDTCRYPT_SUB_SCAN:
2090 # destination must be directory
2092 bail ("ERROR: mode is incompatible with stdout sink")
2095 os.makedirs (outsspec, 0o700)
2096 except FileExistsError:
2097 # if it’s a directory with appropriate perms, everything is
2098 # good; otherwise, below invocation of open(2) will fail
2100 outs = os.open (outsspec, os.O_DIRECTORY, 0o600)
2101 except FileNotFoundError as exn:
2102 bail ("ERROR: cannot create target directory “%s”" % outsspec)
2103 except NotADirectoryError as exn:
2104 bail ("ERROR: target path “%s” is not a directory" % outsspec)
2106 outs = deptdcrypt_mk_stream (PDTCRYPT_SINK, outsspec or "-")
2108 if subcommand == PDTCRYPT_SUB_SCAN:
2110 bail ("ERROR: please supply an input file for scanning")
2112 bail ("ERROR: input must be seekable; please specify a file")
2113 return True, partial (mode_scan, secret, insspec, outs, nacl=nacl)
2115 if subcommand == PDTCRYPT_SUB_SCRYPT:
2116 if secret [0] == PDTCRYPT_SECRET_KEY:
2117 bail ("ERROR: scrypt mode requires a password")
2118 if insspec is not None and nacl is not None \
2119 or insspec is None and nacl is None :
2120 bail ("ERROR: please supply either an input file or "
2125 if insspec is not None or subcommand != PDTCRYPT_SUB_SCRYPT:
2126 ins = deptdcrypt_mk_stream (PDTCRYPT_SOURCE, insspec or "-")
2128 if subcommand == PDTCRYPT_SUB_SCRYPT:
2129 return True, partial (mode_scrypt, secret [1].encode (), ins, nacl,
2132 return True, partial (mode_depdtcrypt, mode, secret, ins, outs)
2136 ok, runner = parse_argv (argv)
2138 if ok is True: return runner ()
2143 if __name__ == "__main__":
2144 sys.exit (main (sys.argv))