===============================================================================
crypto -- Encryption Layer for the Deltatar Backup
===============================================================================

- AES-GCM for the symmetric encryption;

- NIST Recommendation for Block Cipher Modes of Operation: Galois/Counter
  http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf

  https://cryptome.org/2014/01/aes-gcm-v1.pdf

- Authentication weaknesses in GCM
  http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf

Trouble with python-cryptography packages: authentication tags can only be
passed in advance: https://github.com/pyca/cryptography/pull/3421
-------------------------------------------------------------------------------

Errors fall into roughly three categories:

- Cryptographic errors or invalid data.

    - ``InvalidGCMTag`` (decryption failed on account of an invalid GCM
      tag),
    - ``InvalidIVFixedPart`` (IV fixed part of object not found in list),
    - ``DuplicateIV`` (the IV of an encrypted object already occurred),
    - ``DecryptionError`` (used in CLI decryption for presenting error
      conditions to the user).

- Incorrect usage of the library.

    - ``InvalidParameter`` (non-conforming user supplied parameter),
    - ``InvalidHeader`` (data passed for reading not parsable into header),
    - ``FormatError`` (cannot handle header or parameter version),

- Bad internal state. If one of these is encountered it means that a state
  was reached that shouldn’t occur during normal processing.

Also, ``EndOfFile`` is used as a sentinel to communicate that a stream supplied
for reading is exhausted.
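A caller might dispatch on these categories roughly as sketched below. This is
only an illustration: the exception classes are local stand-ins that mirror
the names listed above so that the snippet is self-contained, and
``classify()`` is a hypothetical helper that does not exist in this module.

```python
# Stand-ins mirroring the exception names listed above (hypothetical; the
# real classes are defined further down in this module).
class InvalidGCMTag (Exception): pass
class InvalidIVFixedPart (Exception): pass
class DuplicateIV (Exception): pass
class DecryptionError (Exception): pass
class InvalidParameter (Exception): pass
class InvalidHeader (Exception): pass
class FormatError (Exception): pass
class EndOfFile (Exception): pass

CRYPTO_ERRORS = (InvalidGCMTag, InvalidIVFixedPart, DuplicateIV,
                 DecryptionError)
USAGE_ERRORS  = (InvalidParameter, InvalidHeader, FormatError)

def classify (exn):
    # Map an exception instance to one of the categories named above.
    if isinstance (exn, EndOfFile):
        return "end-of-stream"      # sentinel, not a failure
    if isinstance (exn, CRYPTO_ERRORS):
        return "crypto"
    if isinstance (exn, USAGE_ERRORS):
        return "usage"
    return "internal"               # anything else: bad internal state
```

Treating ``EndOfFile`` first keeps the sentinel from being reported as an
error; everything that is neither a crypto nor a usage error is assumed here
to indicate bad internal state.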
Initialization Vectors
-------------------------------------------------------------------------------

Initialization vectors are checked for reuse during the lifetime of a
decryptor. The fixed counters for metadata files cannot be reused and attempts
to do so will cause a ``DuplicateIV`` error. This means the length of objects
encrypted with a metadata counter is capped at 63 GB.
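The IV layout behind these checks can be sketched as follows, assuming the
format this module declares as ``FMT_I2N_IV`` (eight random bytes followed by
a 32 bit little endian counter). The helper names ``iv_split()`` and
``iv_join()`` are made up for this example; the IV bytes are the ones from the
verbose sample output in this docstring.

```python
import struct

FMT_IV = "<8sL"   # 8 byte fixed part, then 32 bit little endian counter

# Same values as the module constants of these names.
AES_GCM_MAX_SIZE      = (1 << 36) - (1 << 5)   # bytes per GCM invocation
PDTCRYPT_MAX_OBJ_SIZE = 63 * (1 << 30)         # 63 GB cap per object

def iv_split (iv):
    # Return (fixed part, counter) of a 12 byte IV.
    return struct.unpack (FMT_IV, iv)

def iv_join (fixed, cnt):
    # Inverse of iv_split().
    return struct.pack (FMT_IV, fixed, cnt)

iv = bytes.fromhex ("02ee3dd7a9631eb101000000")
fixed, cnt = iv_split (iv)
print (fixed.hex (), cnt)                      # → 02ee3dd7a9631eb1 1

# The per-object cap stays below what a single IV may protect.
print (PDTCRYPT_MAX_OBJ_SIZE < AES_GCM_MAX_SIZE)   # → True
```

Since a reserved counter cannot be handed out twice, an object that uses one
cannot be continued under a second IV, which is why its size is bounded by the
per-object cap.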
For ordinary, non-metadata payload, there is an optional mode with strict IV
checking that causes a crypto context to fail if an IV encountered or created
was already used for decrypting or encrypting, respectively, an earlier object.
Note that this mode can trigger false positives when decrypting non-linearly,
e. g. when traversing the same object multiple times. Since the crypto context
has no notion of a position in a PDT encrypted archive, this condition must be
sorted out downstream.
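The false positive described above can be reproduced with a minimal sketch of
the bookkeeping. ``IVTracker`` is a hypothetical helper written for this
example; it only imitates the set based check that a crypto context performs,
and ``DuplicateIV`` is a local stand-in for the module's exception.

```python
class DuplicateIV (Exception):
    pass    # stand-in for the module's exception of the same name

class IVTracker (object):
    # Hypothetical helper: remembers every IV seen by one context.
    def __init__ (self, strict=False):
        self.strict = strict
        self.used   = set ()

    def check (self, iv):
        # In strict mode a repeated IV is fatal; otherwise it is only recorded.
        if self.strict is True and iv in self.used:
            raise DuplicateIV ("iv %s was reused" % iv.hex ())
        self.used.add (iv)

tracker = IVTracker (strict=True)
iv = bytes.fromhex ("02ee3dd7a9631eb103000000")
tracker.check (iv)          # first pass over the object: accepted
try:
    tracker.check (iv)      # reading the same object again: rejected
    rejected = False
except DuplicateIV:
    rejected = True
```

The tracker cannot tell a second read of the same object from a genuinely
reused IV, so a caller that seeks around in an archive has to use a fresh
context or disable strict checking.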
-------------------------------------------------------------------------------

``crypto.py`` may be invoked as a script for decrypting, validating, and
splitting PDT encrypted files. Consult the usage message for details.

Decrypt from stdin using the password ‘foo’: ::

    $ crypto.py process foo -i - -o - <some-file.tar.gz.pdtcrypt >some-file.tar.gz

Output verbose information about the encrypted objects in the archive: ::

    $ crypto.py process foo -v -i some-file.tar.gz.pdtcrypt -o /dev/null
    PDT: decrypt from some-file.tar.gz.pdtcrypt
    PDT: decrypt to /dev/null
    PDT: source: file some-file.tar.gz.pdtcrypt
    PDT: sink: file /dev/null
    PDT: · version = 1 : 0100
    PDT: · paramversion = 1 : 0100
    PDT: · nacl : d270 b031 00d1 87e2 c946 610d 7b7f 7e5f
    PDT: · iv : 02ee 3dd7 a963 1eb1 0100 0000
    PDT: · ctsize = 591 : 4f02 0000 0000 0000
    PDT: · tag : 5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b
    PDT: 64 decrypt obj no. 1, 591 B
    PDT: · [64] 0% done, read block (591 B of 591 B remaining)
    PDT: · decrypt ciphertext 591 B
    PDT: · decrypt plaintext 591 B
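The raw fields in this dump can be read back with ``struct``. The sketch below
rebuilds the 64 byte header from the values printed above and unpacks it
again. The format string is an assumption derived from the field sizes
declared in this module (magic 8, version 2, paramversion 2, salt 16, IV 12,
ciphertext size 8, tag 16 bytes, all little endian); the name ``HDR_FMT`` as
used here is local to the example.

```python
import struct

HDR_FMT = "<8sHH16s12sQ16s"    # assumed layout; 64 bytes in total

raw = b"PDTCRYPT" + bytes.fromhex (
      "0100"                                # version
      "0100"                                # paramversion
      "d270b03100d187e2c946610d7b7f7e5f"    # nacl (salt)
      "02ee3dd7a9631eb101000000"            # iv
      "4f02000000000000"                    # ctsize
      "5b2d6d8b8f82484212fd0b10b6e3369b")   # tag

magic, version, paramversion, nacl, iv, ctsize, tag = \
    struct.unpack (HDR_FMT, raw)
print (len (raw), version, paramversion, ctsize)   # → 64 1 1 591
```

The ciphertext of the object starts right after these 64 bytes, which matches
the offset ``64`` shown in the sample output.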
Also, the mode *scrypt* allows deriving encryption keys. To calculate the
encryption key from the password ‘foo’ and the salt of the first object in a
PDT encrypted file: ::

    $ crypto.py scrypt foo -i some-file.pdtcrypt
    {"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}

In this sample output the computed 16 byte key is the value of ``key``. Like
the salt it is Base64 encoded and can be fed into Python’s
``base64.b64decode()`` to obtain the corresponding binary representation.
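The values in the sample output above contain characters outside the
hexadecimal alphabet and decode cleanly as Base64. Assuming that encoding, the
JSON line can be turned back into binary key material like this:

```python
import base64
import json

line = '{"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", ' \
       '"key": "JH9EkMwaM4x9F5aim5gK/Q=="}'

info = json.loads (line)
key  = base64.b64decode (info ["key"])     # raw key bytes
salt = base64.b64decode (info ["salt"])    # raw salt bytes
print (len (key), len (salt))              # → 16 16
```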
Note that in Scrypt hashing mode, no data integrity checks are performed.
If the wrong password is given, a wrong key will be derived. Whether the password
was indeed correct can only be determined by decrypting. Note that since PDT
archives essentially consist of a stream of independent objects, the salt and
other parameters may change. Thus a key derived using the above method from the
first object doesn’t necessarily apply to any of the subsequent objects.
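Why a key is tied to one object's salt can be seen in a small sketch. It uses
``hashlib.scrypt()`` from the standard library as a stand-in for the Scrypt
binding this module wraps, and the cost parameters are illustrative values
chosen for this example, not those of any PDT parameter version. The second
salt is made up.

```python
import hashlib

def derive_key (password, salt, n=1 << 10, r=8, p=1, dklen=16):
    # Illustrative cost parameters only; a real archive dictates its own.
    return hashlib.scrypt (password, salt=salt, n=n, r=r, p=p, dklen=dklen)

pw     = b"foo"
salt_a = bytes.fromhex ("d270b03100d187e2c946610d7b7f7e5f")  # first object
salt_b = bytes (16)                                          # another object

key_a = derive_key (pw, salt_a)
key_b = derive_key (pw, salt_b)
print (key_a != key_b)      # same password, different salt, different key
```

The derivation itself never fails on a wrong password; only the GCM tag check
during decryption reveals whether the resulting key was the right one.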
134 from functools import reduce, partial
144 except ImportError as exn:
if __name__ == "__main__": ## Work around the import mechanism lest Python’s
    pwd = os.getcwd()      ## preference for local imports cause a cyclical
    ## import (crypto → pylibscrypt → […] → ./tarfile → crypto).
    sys.path = [ p for p in sys.path if p.find ("deltatar") < 0 ]
153 from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
154 from cryptography.hazmat.backends import default_backend
158 __all__ = [ "hdr_make", "hdr_read", "hdr_fmt", "hdr_fmt_pretty"
160 , "PDTCRYPT_HDR_SIZE", "AES_GCM_IV_CNT_DATA"
161 , "AES_GCM_IV_CNT_INFOFILE", "AES_GCM_IV_CNT_INDEX"
165 ###############################################################################
167 ###############################################################################
169 class EndOfFile (Exception):
173 def __init__ (self, n=None, msg=None):
179 class InvalidParameter (Exception):
180 """Inputs not valid for PDT encryption."""
184 class InvalidHeader (Exception):
185 """Header not valid."""
189 class InvalidGCMTag (Exception):
191 The GCM tag calculated during decryption differs from that in the object
197 class InvalidIVFixedPart (Exception):
199 IV fixed part not in supplied list: either the backup is corrupt or the
200 current object does not belong to it.
205 class IVFixedPartError (Exception):
207 Error creating a unique IV fixed part: repeated calls to system RNG yielded
208 the same sequence of bytes as the last IV used.
213 class InvalidFileCounter (Exception):
215 When encrypting, an attempted reuse of a dedicated counter (info file,
216 index file) was caught.
221 class DuplicateIV (Exception):
223 During encryption, the current IV fixed part is identical to an already
224 existing IV (same prefix and file counter). This indicates tampering or
225 programmer error and cannot be recovered from.
230 class NonConsecutiveIV (Exception):
232 IVs not numbered consecutively. This is a hard error with strict IV
233 checking. Precludes random access to the encrypted objects.
238 class FormatError (Exception):
239 """Unusable parameters in header."""
243 class DecryptionError (Exception):
244 """Error during decryption with ``crypto.py`` on the command line."""
248 class Unreachable (Exception):
250 Makeshift __builtin_unreachable(); always a programmer error if
256 class InternalError (Exception):
257 """Errors not ascribable to bad user inputs or cryptography."""
261 ###############################################################################
262 ## crypto layer version
263 ###############################################################################
265 ENCRYPTION_PARAMETERS = \
267 { "kdf": ("dummy", 16)
268 , "enc": "passthrough" }
276 , "enc": "aes-gcm" } }
278 ###############################################################################
280 ###############################################################################
282 PDTCRYPT_HDR_MAGIC = b"PDTCRYPT"
284 PDTCRYPT_HDR_SIZE_MAGIC = 8 # 8
285 PDTCRYPT_HDR_SIZE_VERSION = 2 # 10
286 PDTCRYPT_HDR_SIZE_PARAMVERSION = 2 # 12
287 PDTCRYPT_HDR_SIZE_NACL = 16 # 28
288 PDTCRYPT_HDR_SIZE_IV = 12 # 40
289 PDTCRYPT_HDR_SIZE_CTSIZE = 8 # 48
290 PDTCRYPT_HDR_SIZE_TAG = 16 # 64 GCM auth tag
292 PDTCRYPT_HDR_SIZE = PDTCRYPT_HDR_SIZE_MAGIC + PDTCRYPT_HDR_SIZE_VERSION \
293 + PDTCRYPT_HDR_SIZE_PARAMVERSION + PDTCRYPT_HDR_SIZE_NACL \
294 + PDTCRYPT_HDR_SIZE_IV + PDTCRYPT_HDR_SIZE_CTSIZE \
295 + PDTCRYPT_HDR_SIZE_TAG # = 64
297 # precalculate offsets since Python can’t do constant folding over names
298 HDR_OFF_VERSION = PDTCRYPT_HDR_SIZE_MAGIC
299 HDR_OFF_PARAMVERSION = HDR_OFF_VERSION + PDTCRYPT_HDR_SIZE_VERSION
300 HDR_OFF_NACL = HDR_OFF_PARAMVERSION + PDTCRYPT_HDR_SIZE_PARAMVERSION
301 HDR_OFF_IV = HDR_OFF_NACL + PDTCRYPT_HDR_SIZE_NACL
302 HDR_OFF_CTSIZE = HDR_OFF_IV + PDTCRYPT_HDR_SIZE_IV
303 HDR_OFF_TAG = HDR_OFF_CTSIZE + PDTCRYPT_HDR_SIZE_CTSIZE
307 FMT_I2N_IV = "<8sL" # 8 random bytes ‖ 32 bit counter
FMT_I2N_HDR = ("<" # little endian, standard sizes, no padding
312 "16s" # sodium chloride
318 AES_GCM_MAX_SIZE = (1 << 36) - (1 << 5) # 2^39 - 2^8 b ≅ 64 GB
319 PDTCRYPT_MAX_OBJ_SIZE_DEFAULT = 63 * (1 << 30) # 63 GB
320 PDTCRYPT_MAX_OBJ_SIZE = PDTCRYPT_MAX_OBJ_SIZE_DEFAULT
# index and info files are written on the fly while encrypting so their
# counters must be available in advance
324 AES_GCM_IV_CNT_INFOFILE = 1 # constant
325 AES_GCM_IV_CNT_INDEX = AES_GCM_IV_CNT_INFOFILE + 1
326 AES_GCM_IV_CNT_DATA = AES_GCM_IV_CNT_INDEX + 1 # also for multivolume
327 AES_GCM_IV_CNT_MAX_DEFAULT = 0xffFFffFF
328 AES_GCM_IV_CNT_MAX = AES_GCM_IV_CNT_MAX_DEFAULT
330 # IV structure and generation
331 PDTCRYPT_IV_GEN_MAX_RETRIES = 10 # ×
332 PDTCRYPT_IV_FIXEDPART_SIZE = 8 # B
333 PDTCRYPT_IV_COUNTER_SIZE = 4 # B
335 ###############################################################################
337 ###############################################################################
343 # , paramversion : u16
349 # fn hdr_read (f : handle) -> hdrinfo;
350 # fn hdr_make (f : handle, h : hdrinfo) -> IOResult<usize>;
351 # fn hdr_fmt (h : hdrinfo) -> String;
356 Read bytes as header structure.
358 If the input could not be interpreted as a header, fail with
363 mag, version, paramversion, nacl, iv, ctsize, tag = \
364 struct.unpack (FMT_I2N_HDR, data)
365 except Exception as exn:
366 raise InvalidHeader ("error unpacking header from [%r]: %s"
367 % (binascii.hexlify (data), str (exn)))
369 if mag != PDTCRYPT_HDR_MAGIC:
370 raise InvalidHeader ("bad magic in header: expected [%s], got [%s]"
371 % (PDTCRYPT_HDR_MAGIC, mag))
374 { "version" : version
375 , "paramversion" : paramversion
383 def hdr_read_stream (instr):
385 Read header from stream at the current position.
387 Fail with ``InvalidHeader`` if insufficient bytes were read from the
388 stream, or if the content could not be interpreted as a header.
390 data = instr.read(PDTCRYPT_HDR_SIZE)
394 elif ldata != PDTCRYPT_HDR_SIZE:
395 raise InvalidHeader ("hdr_read_stream: expected %d B, received %d B"
396 % (PDTCRYPT_HDR_SIZE, ldata))
397 return hdr_read (data)
400 def hdr_from_params (version, paramversion, nacl, iv, ctsize, tag):
402 Assemble the necessary values into a PDTCRYPT header.
404 :type version: int to fit uint16_t
405 :type paramversion: int to fit uint16_t
406 :type nacl: bytes to fit uint8_t[16]
407 :type iv: bytes to fit uint8_t[12]
    :type ctsize: int to fit uint64_t
409 :type tag: bytes to fit uint8_t[16]
411 buf = bytearray (PDTCRYPT_HDR_SIZE)
412 bufv = memoryview (buf)
415 struct.pack_into (FMT_I2N_HDR, bufv, 0,
417 version, paramversion, nacl, iv, ctsize, tag)
418 except Exception as exn:
419 return False, "error assembling header: %s" % str (exn)
421 return True, bytes (buf)
424 def hdr_make_dummy (s):
426 Create a header sized block of bytes initialized to a value derived from a
427 string. Used to verify we’ve jumped back correctly to the actual position
428 of the object header.
430 c = reduce (lambda a, c: a + ord(c), s, 0) % 0xFF
431 return bytes (bytearray (struct.pack ("B", c)) * PDTCRYPT_HDR_SIZE)
436 Assemble a header from the given header structure.
438 return hdr_from_params (version=hdr.get("version"),
439 paramversion=hdr.get("paramversion"),
440 nacl=hdr.get("nacl"), iv=hdr.get("iv"),
441 ctsize=hdr.get("ctsize"), tag=hdr.get("tag"))
444 HDR_FMT = "I2n_header { version: %d, paramversion: %d, nacl: %s[%d]," \
445 " iv: %s[%d], ctsize: %d, tag: %s[%d] }"
448 """Format a header structure into readable output."""
449 return HDR_FMT % (h["version"], h["paramversion"],
450 binascii.hexlify (h["nacl"]), len(h["nacl"]),
451 binascii.hexlify (h["iv"]), len(h["iv"]),
453 binascii.hexlify (h["tag"]), len(h["tag"]))
456 def hex_spaced_of_bytes (b):
457 """Format bytes object, hexdump style."""
458 return " ".join ([ "%.2x%.2x" % (c1, c2)
459 for c1, c2 in zip (b[0::2], b[1::2]) ]) \
460 + (len (b) | 1 == len (b) and " %.2x" % b[-1] or "") # odd lengths
463 def hdr_iv_counter (h):
464 """Extract the variable part of the IV of the given header."""
465 _fixed, cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
469 def hdr_iv_fixed (h):
470 """Extract the fixed part of the IV of the given header."""
471 fixed, _cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
475 hdr_dump = hex_spaced_of_bytes
479 """version = %-4d : %s
480 paramversion = %-4d : %s
487 def hdr_fmt_pretty (h):
489 Format header structure into multi-line representation of its contents and
490 their raw representation. (Omit the implicit “PDTCRYPT” magic bytes that
491 precede every header.)
493 return HDR_FMT_PRETTY \
495 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["version"])),
497 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["paramversion"])),
498 hex_spaced_of_bytes (h["nacl"]),
499 hex_spaced_of_bytes (h["iv"]),
501 hex_spaced_of_bytes (struct.pack (FMT_UINT64_LE, h["ctsize"])),
502 hex_spaced_of_bytes (h["tag"]))
504 IV_FMT = "((f %s) (c %d))"
507 """Format the two components of an IV in a readable fashion."""
508 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
509 return IV_FMT % (binascii.hexlify (fixed), cnt)
512 ###############################################################################
514 ###############################################################################
516 class Location (object):
520 def restore_loc_fmt (loc):
522 % (loc.n, loc.offset)
524 def locate_hdr_candidates (fd):
526 Walk over instances of the magic string in the payload, collecting their
527 positions. If the offset of the first found instance is not zero, the file
528 begins with leading garbage.
530 :return: The list of offsets in the file.
534 mm = mmap.mmap(fd, 0, mmap.MAP_SHARED, mmap.PROT_READ)
537 pos = mm.find (PDTCRYPT_HDR_MAGIC, pos)
546 HDR_CAND_GOOD = 0 # header marks begin of valid object
547 HDR_CAND_FISHY = 1 # inconclusive (tag mismatch, obj overlap etc.)
548 HDR_CAND_JUNK = 2 # not a header / object unreadable
551 def inspect_hdr (fd, off):
553 Attempt to parse a header in *fd* at position *off*.
555 Returns a verdict about the quality of that header plus the parsed header
559 _ = os.lseek (fd, off, os.SEEK_SET)
561 if os.lseek (fd, 0, os.SEEK_CUR) != off:
562 if PDTCRYPT_VERBOSE is True:
563 noise ("PDT: %d → dismissed (lseek() past EOF)" % off)
564 return HDR_CAND_JUNK, None
566 raw = os.read (fd, PDTCRYPT_HDR_SIZE)
567 if len (raw) != PDTCRYPT_HDR_SIZE:
568 if PDTCRYPT_VERBOSE is True:
569 noise ("PDT: %d → dismissed (EOF inside header)" % off)
570 return HDR_CAND_JUNK, None
574 except InvalidHeader as exn:
575 if PDTCRYPT_VERBOSE is True:
576 noise ("PDT: %d → dismissed (invalid: [%s])" % (off, str (exn)))
577 return HDR_CAND_JUNK, None
579 obj0 = off + PDTCRYPT_HDR_SIZE
580 objX = obj0 + hdr ["ctsize"]
582 eof = os.lseek (fd, 0, os.SEEK_END)
584 if PDTCRYPT_VERBOSE is True:
585 noise ("PDT: %d → EOF inside object (%d≤%d≤%d); adjusting size to "
586 "%d" % (off, obj0, eof, objX, (eof - obj0)))
587 # try reading up to the end
588 hdr ["ctsize"] = eof - obj0
589 return HDR_CAND_FISHY, hdr
591 return HDR_CAND_GOOD, hdr
594 def try_decrypt (ifd, off, hdr, secret, ofd=-1):
596 Attempt to decrypt the object in the (seekable) descriptor *ifd* starting
597 at *off* using the metadata in *hdr* and *secret*. An output fd can be
598 specified with *ofd*; if it is *-1* – the default –, the decrypted payload
601 Always creates a fresh decryptor, so validation steps across objects don’t
604 ctleft = hdr ["ctsize"]
608 if ks == PDTCRYPT_SECRET_PW:
609 decr = Decrypt (password=secret [1])
610 elif ks == PDTCRYPT_SECRET_KEY:
611 key = binascii.unhexlify (secret [1])
612 decr = Decrypt (key=key)
619 os.lseek (ifd, pos, os.SEEK_SET)
621 cnksiz = min (ctleft, PDTCRYPT_BLOCKSIZE)
622 cnk = os.read (ifd, cnksiz)
625 pt = decr.process (cnk)
629 if len (pt) > 0 and ofd != -1:
632 except Exception as exn:
633 noise ("PDT: error decrypting object %d–%d@%d, %d B remaining [%s]"
634 % (off, off + hdr ["ctsize"], pos, ctleft, exn))
640 def readable_objects_offsets (ifd, secret, cands):
642 From a list of candidates, locate the ones that mark the start of actual
643 readable PDTCRYPT objects.
649 vdt, hdr = inspect_hdr (ifd, cand)
650 if vdt == HDR_CAND_JUNK:
651 pass # ignore unreadable ones
652 elif vdt in [HDR_CAND_GOOD, HDR_CAND_FISHY]:
653 off0 = cand + PDTCRYPT_HDR_SIZE
654 ok = try_decrypt (ifd, off0, hdr, secret) == hdr ["ctsize"]
660 def reconstruct_offsets (fname, secret):
661 ifd = os.open (fname, os.O_RDONLY)
664 cands = locate_hdr_candidates (ifd)
665 return readable_objects_offsets (ifd, secret, cands)
670 ###############################################################################
671 ## passthrough / null encryption
672 ###############################################################################
674 class PassthroughCipher (object):
676 tag = struct.pack ("<QQ", 0, 0)
678 def __init__ (self) : pass
680 def update (self, b) : return b
682 def finalize (self) : return b""
684 def finalize_with_tag (self, _) : return b""
686 ###############################################################################
687 ## convenience wrapper
688 ###############################################################################
691 def kdf_dummy (klen, password, _nacl):
693 Fake KDF for testing purposes that is called when parameter version zero is
    # encode first so that the length is counted in bytes, not characters
    if isinstance (password, bytes) is False:
        password = password.encode ()
    q, r = divmod (klen, len (password))
    return password * q + password [:r], b""
702 SCRYPT_KEY_MEMO = { } # static because needed for both the info file and the archive
705 def kdf_scrypt (params, password, nacl):
707 Wrapper for the Scrypt KDF, corresponds to parameter version one. The
708 computation result is memoized based on the inputs to facilitate spawning
709 multiple encryption contexts.
714 dkLen = params["dkLen"]
717 nacl = os.urandom (params["NaCl_LEN"])
719 key_parms = (password, nacl, N, r, p, dkLen)
720 global SCRYPT_KEY_MEMO
721 if key_parms not in SCRYPT_KEY_MEMO:
722 SCRYPT_KEY_MEMO [key_parms] = \
723 pylibscrypt.scrypt (password, nacl, N, r, p, dkLen)
724 return SCRYPT_KEY_MEMO [key_parms], nacl
727 def kdf_by_version (paramversion=None, defs=None):
729 Pick the KDF handler corresponding to the parameter version or the
732 :rtype: function (password : str, nacl : str) -> str
734 if paramversion is not None:
735 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
737 raise InvalidParameter ("no encryption parameters for version %r"
739 (kdf, params) = defs["kdf"]
741 if kdf == "scrypt" : fn = kdf_scrypt
742 if kdf == "dummy" : fn = kdf_dummy
744 raise ValueError ("key derivation method %r unknown" % kdf)
745 return partial (fn, params)
748 ###############################################################################
750 ###############################################################################
752 def scrypt_hashsource (pw, ins):
754 Calculate the SCRYPT hash from the password and the information contained
755 in the first header found in ``ins``.
757 This does not validate whether the first object is encrypted correctly.
759 if isinstance (pw, str) is True:
761 elif isinstance (pw, bytes) is False:
762 raise InvalidParameter ("password must be a string, not %s"
764 if isinstance (ins, io.BufferedReader) is False and \
765 isinstance (ins, io.FileIO) is False:
766 raise InvalidParameter ("file to hash must be opened in “binary” mode")
769 hdr = hdr_read_stream (ins)
770 except EndOfFile as exn:
771 noise ("PDT: malformed input: end of file reading first object header")
776 pver = hdr ["paramversion"]
777 if PDTCRYPT_VERBOSE is True:
778 noise ("PDT: salt of first object : %s" % binascii.hexlify (nacl))
779 noise ("PDT: parameter version of archive : %d" % pver)
782 defs = ENCRYPTION_PARAMETERS.get(pver, None)
783 kdfname, params = defs ["kdf"]
784 if kdfname != "scrypt":
785 noise ("PDT: input is not an SCRYPT archive")
788 kdf = kdf_by_version (None, defs)
789 except ValueError as exn:
790 noise ("PDT: object has unknown parameter version %d" % pver)
792 hsh, _void = kdf (pw, nacl)
794 return hsh, nacl, hdr ["version"], pver
797 def scrypt_hashfile (pw, fname):
799 Calculate the SCRYPT hash from the password and the information contained
800 in the first header found in the given file. The header is read only at
803 with deptdcrypt_mk_stream (PDTCRYPT_SOURCE, fname or "-") as ins:
804 hsh, _void, _void, _void = scrypt_hashsource (pw, ins)
808 ###############################################################################
810 ###############################################################################
812 class Crypto (object):
814 Encryption context to remain alive throughout an entire tarfile pass.
819 cnt = None # file counter (uint32_t != 0)
820 iv = None # current IV
821 fixed = None # accu for 64 bit fixed parts of IV
822 used_ivs = None # tracks IVs
823 strict_ivs = False # if True, panic on duplicate object IV
832 info_counter_used = False
833 index_counter_used = False
835 def __init__ (self, *al, **akv):
836 self.used_ivs = set ()
837 self.set_parameters (*al, **akv)
840 def next_fixed (self):
845 def set_object_counter (self, cnt=None):
847 Safely set the internal counter of encrypted objects. Numerous
850 The same counter may not be reused in combination with one IV fixed
851 part. This is validated elsewhere in the IV handling.
853 Counter zero is invalid. The first two counters are reserved for
854 metadata. The implementation does not allow for splitting metadata
855 files over multiple encrypted objects. (This would be possible by
        assigning new fixed parts.) Thus in a Deltatar backup there is at most
        one object with a counter value of one and at most one with a counter
        value of two. On creation of a context, the initial counter may be
        chosen. The globals
859 ``AES_GCM_IV_CNT_INFOFILE`` and ``AES_GCM_IV_CNT_INDEX`` can be used to
860 request one of the reserved values. If one of these values has been
861 used, any further attempt of setting the counter to that value will
862 be rejected with an ``InvalidFileCounter`` exception.
864 Out of bounds values (i. e. below one and more than the maximum of 2³²)
865 cause an ``InvalidParameter`` exception to be thrown.
868 self.cnt = AES_GCM_IV_CNT_DATA
870 if cnt == 0 or cnt > AES_GCM_IV_CNT_MAX + 1:
871 raise InvalidParameter ("invalid counter value %d requested: "
872 "acceptable values are from 1 to %d"
873 % (cnt, AES_GCM_IV_CNT_MAX))
874 if cnt == AES_GCM_IV_CNT_INFOFILE:
875 if self.info_counter_used is True:
876 raise InvalidFileCounter ("attempted to reuse info file "
877 "counter %d: must be unique" % cnt)
878 self.info_counter_used = True
879 elif cnt == AES_GCM_IV_CNT_INDEX:
880 if self.index_counter_used is True:
                    raise InvalidFileCounter ("attempted to reuse index file "
                                              "counter %d: must be unique" % cnt)
883 self.index_counter_used = True
884 if cnt <= AES_GCM_IV_CNT_MAX:
887 # cnt == AES_GCM_IV_CNT_MAX + 1 → wrap
888 self.cnt = AES_GCM_IV_CNT_DATA
892 def set_parameters (self, password=None, key=None, paramversion=None,
893 nacl=None, counter=None, strict_ivs=False):
895 Configure the internal state of a crypto context. Not intended for
899 self.set_object_counter (counter)
900 self.strict_ivs = strict_ivs
902 if paramversion is not None:
903 self.paramversion = paramversion
906 self.key, self.nacl = key, nacl
909 if password is not None:
910 if isinstance (password, bytes) is False:
911 password = str.encode (password)
912 self.password = password
913 if paramversion is None and nacl is None:
914 # postpone key setup until first header is available
916 kdf = kdf_by_version (paramversion)
918 self.key, self.nacl = kdf (password, nacl)
921 def process (self, buf):
923 Encrypt / decrypt a buffer. Invokes the ``.update()`` method on the
924 wrapped encryptor or decryptor, respectively.
926 The Cryptography exception ``AlreadyFinalized`` is translated to an
927 ``InternalError`` at this point. It may occur in sound code when the GC
928 closes an encrypting stream after an error. Everywhere else it must be
932 raise RuntimeError ("process: context not initialized")
933 self.stats ["in"] += len (buf)
935 out = self.enc.update (buf)
936 except cryptography.exceptions.AlreadyFinalized as exn:
937 raise InternalError (exn)
938 self.stats ["out"] += len (out)
942 def next (self, password, paramversion, nacl, iv):
944 Prepare for encrypting another object: Reset the data counters and
945 change the configuration in case one of the variable parameters differs
946 from the last object. Also check the IV for duplicates and error out
947 if strict checking was requested.
951 self.stats ["obj"] += 1
953 self.check_duplicate_iv (iv)
955 if ( self.paramversion != paramversion
956 or self.password != password
957 or self.nacl != nacl):
958 self.set_parameters (password=password, paramversion=paramversion,
959 nacl=nacl, strict_ivs=self.strict_ivs)
962 def check_duplicate_iv (self, iv):
964 Add an IV (the 12 byte representation as in the header) to the list. With
965 strict checking enabled, this will throw a ``DuplicateIV``. Depending on
966 the context, this may indicate a serious error (IV reuse).
968 if self.strict_ivs is True and iv in self.used_ivs:
969 raise DuplicateIV ("iv %s was reused" % iv_fmt (iv))
        # iv has not been used before; add to collection
971 self.used_ivs.add (iv)
976 Access the data counters.
978 return self.stats ["obj"], self.stats ["in"], self.stats ["out"]
983 Clear the current context regardless of its finalization state. The
984 next operation must be ``.next()``.
989 class Encrypt (Crypto):
995 def __init__ (self, version, paramversion, password=None, key=None, nacl=None,
996 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
998 The ctor will throw immediately if one of the parameters does not conform
1002 :type version: int to fit uint16_t
1003 :type paramversion: int to fit uint16_t
1004 :param password: mutually exclusive with ``key``
1005 :type password: bytes
1006 :param key: mutually exclusive with ``password``
        :type counter: initial object counter; the values
1010 ``AES_GCM_IV_CNT_INFOFILE`` and
1011 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
1012 and cannot be reused even with different fixed parts.
1013 :type strict_ivs: bool
1015 if password is None and key is None \
1016 or password is not None and key is not None :
1017 raise InvalidParameter ("__init__: need either key or password")
1020 if isinstance (key, bytes) is False:
1021 raise InvalidParameter ("__init__: key must be provided as "
1022 "bytes, not %s" % type (key))
1024 raise InvalidParameter ("__init__: salt must be provided along "
1025 "with encryption key")
1026 else: # password, no key
1027 if isinstance (password, str) is False:
1028 raise InvalidParameter ("__init__: password must be a string, not %s"
1030 if len (password) == 0:
1031 raise InvalidParameter ("__init__: supplied empty password but not "
1032 "permitted for PDT encrypted files")
1034 if isinstance (version, int) is False:
1035 raise InvalidParameter ("__init__: version number must be an "
1036 "integer, not %s" % type (version))
1038 raise InvalidParameter ("__init__: version number must be a "
1039 "nonnegative integer, not %d" % version)
1041 if isinstance (paramversion, int) is False:
1042 raise InvalidParameter ("__init__: crypto parameter version number "
1043 "must be an integer, not %s"
1044 % type (paramversion))
1045 if paramversion < 0:
1046 raise InvalidParameter ("__init__: crypto parameter version number "
1047 "must be a nonnegative integer, not %d"
1050 if nacl is not None:
1051 if isinstance (nacl, bytes) is False:
1052 raise InvalidParameter ("__init__: salt given, but of type %s "
1053 "instead of bytes" % type (nacl))
1054 # salt length would depend on the actual encryption so it can’t be
1055 # validated at this point
1057 self.version = version
1058 self.paramenc = ENCRYPTION_PARAMETERS.get (paramversion) ["enc"]
1060 super().__init__ (password, key, paramversion, nacl, counter=counter,
1061 strict_ivs=strict_ivs)
1064 def next_fixed (self, retries=PDTCRYPT_IV_GEN_MAX_RETRIES):
1066 Generate the next IV fixed part by reading eight bytes from
1067 ``/dev/urandom``. The buffer so obtained is tested against the fixed
1068 parts used so far to prevent accidental reuse of IVs. After a
1069 configurable number of attempts to create a unique fixed part, it will
1070 refuse to continue with an ``IVFixedPartError``. This is unlikely to
1071 ever happen on a normal system but may detect an issue with the random
1074 The list of fixed parts that were used by the context at hand can be
1075 accessed through the ``.fixed`` list. Its last element is the fixed
1076 part currently in use.
1080 fp = os.urandom (PDTCRYPT_IV_FIXEDPART_SIZE)
1081 if fp not in self.fixed:
1082 self.fixed.append (fp)
1085 raise IVFixedPartError ("error obtaining a unique IV fixed part from "
1086 "/dev/urandom; giving up after %d tries" % i)
        Construct a 12-byte IV from the current fixed part and the object
1094 return struct.pack(FMT_I2N_IV, self.fixed [-1], self.cnt)
1097 def next (self, filename=None, counter=None):
1099 Prepare for encrypting the next incoming object. Update the counter
1100 and put together the IV, possibly changing prefixes. Then create the
1103 The argument ``counter`` can be used to specify a file counter for this
1104 object. Unless it is one of the reserved values, the counter of
1105 subsequent objects will be computed from this one.
1107 If this is the first object in a series, ``filename`` is required,
1108 otherwise it is reused if not present. The value is used to derive a
1109 header sized placeholder to use until after encryption when all the
1110 inputs to construct the final header are available. This is then
1111 matched in ``.done()`` against the value found at the position of the
1112 header. The motivation for this extra check is primarily to assist
1113 format debugging: It makes stray headers easy to spot in malformed
1116 if filename is None:
1117 if self.lastinfo is None:
1118 raise InvalidParameter ("next: filename is mandatory for "
1120 filename, _dummy = self.lastinfo
1122 if isinstance (filename, str) is False:
                raise InvalidParameter ("next: filename must be a string, not %s"
1125 if counter is not None:
1126 if isinstance (counter, int) is False:
1127 raise InvalidParameter ("next: the supplied counter is of "
1128 "invalid type %s; please pass an "
1129 "integer instead" % type (counter))
1130 self.set_object_counter (counter)
1132 self.iv = self.iv_make ()
1133 if self.paramenc == "aes-gcm":
1135 ( algorithms.AES (self.key)
1136 , modes.GCM (self.iv)
1137 , backend = default_backend ()) \
1139 elif self.paramenc == "passthrough":
1140 self.enc = PassthroughCipher ()
1142 raise InvalidParameter ("next: parameter version %d not known"
1143 % self.paramversion)
1144 hdrdum = hdr_make_dummy (filename)
1145 self.lastinfo = (filename, hdrdum)
1146 super().next (self.password, self.paramversion, self.nacl, self.iv)
1148 self.set_object_counter (self.cnt + 1)
1152 def done (self, cmpdata):
1154 Complete encryption of an object. After this has been called, attempts
1155 of encrypting further data will cause an error until ``.next()`` is
        Returns a 64-byte buffer containing the object header with all of its
        values, including the “late” ones, e. g. the ciphertext size and the
1162 if isinstance (cmpdata, bytes) is False:
1163 raise InvalidParameter ("done: comparison input expected as bytes, "
1164 "not %s" % type (cmpdata))
1165 if self.lastinfo is None:
1166 raise RuntimeError ("done: encryption context not initialized")
1167 filename, hdrdum = self.lastinfo
1168 if cmpdata != hdrdum:
1169 raise RuntimeError ("done: bad sync of header for object %d: "
1170 "preliminary data does not match; this likely "
1171 "indicates a wrongly repositioned stream"
1173 data = self.enc.finalize ()
1174 self.stats ["out"] += len (data)
1175 self.ctsize += len (data)
1176 ok, hdr = hdr_from_params (self.version, self.paramversion, self.nacl,
1177 self.iv, self.ctsize, self.enc.tag)
1179 raise InternalError ("error constructing header: %r" % hdr)
1180 return data, hdr, self.fixed
1183 def process (self, buf):
1185 Encrypt a chunk of plaintext with the active encryptor. Returns the
1186 size of the input consumed. This **must** be checked downstream. If the
1187 maximum possible object size has been reached, the current context must
1188 be finalized and a new one established before any further data can be
1189 encrypted. The second argument is the remainder of the plaintext that
1190 was not encrypted for the caller to use immediately after the new
1193 if isinstance (buf, bytes) is False:
1194 raise InvalidParameter ("process: expected byte buffer, not %s"
1197 newptsize = self.ptsize + bsize
1198 diff = newptsize - PDTCRYPT_MAX_OBJ_SIZE
1201 newptsize = PDTCRYPT_MAX_OBJ_SIZE
1202 self.ptsize = newptsize
1203 data = super().process (buf [:bsize])
1204 self.ctsize += len (data)
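# Illustration only, not part of the module: a minimal sketch of the size
# accounting described in the ``process`` docstring above. The names
# ``MAX_OBJ_SIZE``, ``clamp_chunk`` and ``split_into_objects`` are hypothetical;
# the real limit is ``PDTCRYPT_MAX_OBJ_SIZE``, and a real caller would finalize
# the object with ``.done()`` and open a new one with ``.next()`` rather than
# collect byte strings in a list.

```python
MAX_OBJ_SIZE = 16  # hypothetical stand-in for PDTCRYPT_MAX_OBJ_SIZE


def clamp_chunk(ptsize, buf, max_size=MAX_OBJ_SIZE):
    """Return (consumed, new_ptsize) for appending buf to the current object."""
    bsize = len(buf)
    newptsize = ptsize + bsize
    diff = newptsize - max_size
    if diff > 0:
        # Only the part that still fits into the object is consumed.
        bsize -= diff
        newptsize = max_size
    return bsize, newptsize


def split_into_objects(data, max_size=MAX_OBJ_SIZE):
    """Cut data into pieces of at most max_size bytes, checking the consumed size."""
    objects, ptsize, current = [], 0, b""
    while data:
        consumed, ptsize = clamp_chunk(ptsize, data, max_size)
        current += data[:consumed]
        data = data[consumed:]          # remainder goes into the next object
        if ptsize == max_size:          # object full: start a new one
            objects.append(current)
            ptsize, current = 0, b""
    if current:
        objects.append(current)
    return objects
```

# The caller must always compare the consumed size with the input length;
# ignoring it would silently drop the remainder.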
1208 class Decrypt (Crypto):
1210 tag = None # GCM tag, part of header
1211 last_iv = None # check consecutive ivs in strict mode
1213 def __init__ (self, password=None, key=None, counter=None, fixedparts=None,
1216 Sanitizing ctor for the decryption context. ``fixedparts`` specifies a
1217 list of IV fixed parts accepted during decryption. If a fixed part is
1218 encountered that is not in the list, decryption will fail.
1220 :param password: mutually exclusive with ``key``
1221 :type password: bytes
1222 :param key: mutually exclusive with ``password``
1224 :param counter: initial object counter; the values
1225 ``AES_GCM_IV_CNT_INFOFILE`` and
1226 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
1227 and cannot be reused even with different fixed parts.
1228 :type fixedparts: bytes list
1230 if password is None and key is None \
1231 or password is not None and key is not None :
1232 raise InvalidParameter ("__init__: need either key or password")
1235 if isinstance (key, bytes) is False:
1236 raise InvalidParameter ("__init__: key must be provided as "
1237 "bytes, not %s" % type (key))
1238 else: # password, no key
1239 if isinstance (password, str) is False:
1240 raise InvalidParameter ("__init__: password must be a string, not %s"
1242 if len (password) == 0:
1243 raise InvalidParameter ("__init__: empty passwords are not "
1244 "permitted for PDT encrypted files")
1246 if fixedparts is not None:
1247 if isinstance (fixedparts, list) is False:
1248 raise InvalidParameter ("__init__: IV fixed parts must be "
1249 "supplied as list, not %s"
1250 % type (fixedparts))
1251 self.fixed = fixedparts
1254 super().__init__ (password=password, key=key, counter=counter,
1255 strict_ivs=strict_ivs)
1258 def valid_fixed_part (self, iv):
1260 Check if a fixed part was already seen.
1262 # check if fixed part is known
1263 fixed, _cnt = struct.unpack (FMT_I2N_IV, iv)
1264 i = bisect.bisect_left (self.fixed, fixed)
1265 return i != len (self.fixed) and self.fixed [i] == fixed
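# Illustration only, not part of the module: a minimal sketch of the
# ``bisect``-based membership test used by ``valid_fixed_part``. The helper
# ``contains_sorted`` is a hypothetical name. The lookup is only correct if
# the list is sorted; this sketch sorts explicitly and makes no claim about
# where the class itself does so.

```python
import bisect


def contains_sorted(sorted_items, item):
    """Binary search: True if item occurs in the sorted list."""
    i = bisect.bisect_left(sorted_items, item)
    return i != len(sorted_items) and sorted_items[i] == item


# Fixed parts are byte strings; byte strings order lexicographically.
known = sorted([b"\x00" * 7 + b"\x02", b"\x00" * 7 + b"\x01"])
```

# ``contains_sorted (known, fixed)`` then runs in O(log n) instead of the
# O(n) of a plain ``in`` test on a list.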
1268 def check_consecutive_iv (self, iv):
1270 Check whether the counter part of the given IV is indeed the successor
1271 of the currently present counter. This should always be the case for
1272 the objects in a well formed PDT archive but should not be enforced
1273 when decrypting out-of-order.
1275 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
1276 if self.strict_ivs is True \
1277 and self.last_iv is not None \
1278 and self.last_iv [0] == fixed \
1279 and self.last_iv [1] != cnt - 1:
1280 raise NonConsecutiveIV ("iv %s counter not successor of "
1281 "last object (expected %d, found %d)"
1282 % (iv_fmt (iv), self.last_iv [1] + 1, cnt))
1283 self.last_iv = (fixed, cnt)
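# Illustration only, not part of the module: a minimal sketch of a successor
# check on the counter part of an IV. The layout ``IV_FMT`` (8-byte fixed part
# followed by a 32-bit big-endian counter) is an assumption made for this
# example; the module's own layout is whatever ``FMT_I2N_IV`` defines.
# ``iv_pack`` and ``is_successor`` are hypothetical names, and the sketch keeps
# the previous object as a ``(fixed, cnt)`` pair.

```python
import struct

IV_FMT = ">8sL"  # assumption: 8-byte fixed part + 32-bit big-endian counter


def iv_pack(fixed, cnt):
    """Build a 12-byte IV from its fixed part and counter."""
    return struct.pack(IV_FMT, fixed, cnt)


def is_successor(last, iv):
    """last is None or the (fixed, cnt) pair of the previous object."""
    fixed, cnt = struct.unpack(IV_FMT, iv)
    if last is None or last[0] != fixed:
        return True  # nothing comparable seen yet
    return last[1] == cnt - 1
```

# Counters are only compared between objects sharing a fixed part, which
# mirrors the condition chain in ``check_consecutive_iv``.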
1286 def next (self, hdr):
1288 Start decrypting the next object. The PDTCRYPT header for the object
1289 can be given either as an already parsed object or as bytes.
1291 if isinstance (hdr, bytes) is True:
1292 hdr = hdr_read (hdr)
1293 elif isinstance (hdr, dict) is False:
1294 # this won’t catch malformed specs though
1295 raise InvalidParameter ("next: wrong type of parameter hdr: "
1296 "expected bytes or spec, got %s"
1299 paramversion = hdr ["paramversion"]
1304 raise InvalidHeader ("next: not a header %r" % hdr)
1306 super().next (self.password, paramversion, nacl, iv)
1307 if self.fixed is not None and self.valid_fixed_part (iv) is False:
1308 raise InvalidIVFixedPart ("iv %s has invalid fixed part"
1310 self.check_consecutive_iv (iv)
1313 defs = ENCRYPTION_PARAMETERS.get (paramversion, None)
1315 raise FormatError ("header contains unknown parameter version %d; "
1316 "maybe the file was created by a more recent "
1317 "version of Deltatar" % paramversion)
1319 if enc == "aes-gcm":
1321 ( algorithms.AES (self.key)
1322 , modes.GCM (iv, tag=self.tag)
1323 , backend = default_backend ()) \
1325 elif enc == "passthrough":
1326 self.enc = PassthroughCipher ()
1328 raise InternalError ("encryption parameter set %d refers to unknown "
1329 "mode %r" % (paramversion, enc))
1330 self.set_object_counter (self.cnt + 1)
1333 def done (self, tag=None):
1335 Stop decryption of the current object and finalize it with the active
1336 context. This will throw an *InvalidGCMTag* exception to indicate that
1337 the authentication tag does not match the data. If the tag is correct,
1338 the rest of the plaintext is returned.
1343 data = self.enc.finalize ()
1345 if isinstance (tag, bytes) is False:
1346 raise InvalidParameter ("done: wrong type of parameter "
1347 "tag: expected bytes, got %s"
1349 data = self.enc.finalize_with_tag (self.tag)
1350 except cryptography.exceptions.InvalidTag:
1351 raise InvalidGCMTag ("done: tag mismatch of object %d: %s "
1352 "rejected by finalize ()"
1353 % (self.cnt, binascii.hexlify (self.tag)))
1354 self.ctsize += len (data)
1355 self.stats ["out"] += len (data)
1359 def process (self, buf):
1361 Decrypt the bytes object *buf* with the active decryptor.
1363 if isinstance (buf, bytes) is False:
1364 raise InvalidParameter ("process: expected byte buffer, not %s"
1366 self.ctsize += len (buf)
1367 data = super().process (buf)
1368 self.ptsize += len (data)
1372 ###############################################################################
1374 ###############################################################################
1376 def _patch_global (glob, vow, n=None):
1378 Adapt upper file counter bound for testing IV logic. Completely unsafe.
1380 assert vow == "I am fully aware that this will void my warranty."
1381 r = globals () [glob]
1383 n = globals () [glob + "_DEFAULT"]
1384 globals () [glob] = n
1387 _testing_set_AES_GCM_IV_CNT_MAX = \
1388 partial (_patch_global, "AES_GCM_IV_CNT_MAX")
1390 _testing_set_PDTCRYPT_MAX_OBJ_SIZE = \
1391 partial (_patch_global, "PDTCRYPT_MAX_OBJ_SIZE")
1393 def open2_dump_file (fname, dir_fd, force=False):
1396 oflags = os.O_CREAT | os.O_WRONLY
1398 oflags |= os.O_TRUNC
1403 outfd = os.open (fname, oflags,
1404 stat.S_IRUSR | stat.S_IWUSR, dir_fd=dir_fd)
1405 except FileExistsError as exn:
1406 noise ("PDT: refusing to overwrite existing file %s" % fname)
1408 raise RuntimeError ("destination file %s already exists" % fname)
1409 if PDTCRYPT_VERBOSE is True:
1410 noise ("PDT: new output file %s (fd=%d)" % (fname, outfd))
1414 ###############################################################################
1415 ## freestanding invocation
1416 ###############################################################################
1418 PDTCRYPT_SUB_PROCESS = 0
1419 PDTCRYPT_SUB_SCRYPT = 1
1420 PDTCRYPT_SUB_SCAN = 2
1423 { "process" : PDTCRYPT_SUB_PROCESS
1424 , "scrypt" : PDTCRYPT_SUB_SCRYPT
1425 , "scan" : PDTCRYPT_SUB_SCAN }
1427 PDTCRYPT_SECRET_PW = 0
1428 PDTCRYPT_SECRET_KEY = 1
1430 PDTCRYPT_DECRYPT = 1 << 0 # decrypt archive with password
1431 PDTCRYPT_SPLIT = 1 << 1 # split archive into individual objects
1432 PDTCRYPT_HASH = 1 << 2 # output scrypt hash for file and given password
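# Illustration only, not part of the module: a minimal sketch of how bit
# flags like the three above combine into one mode value. The short names
# and ``describe`` are hypothetical; the values copy the definitions above.

```python
DECRYPT = 1 << 0
SPLIT = 1 << 1
HASH = 1 << 2


def describe(mode):
    """List the names of all flags set in mode."""
    names = (("decrypt", DECRYPT), ("split", SPLIT), ("hash", HASH))
    return [name for name, bit in names if mode & bit]


mode = DECRYPT          # default: decrypt
mode |= SPLIT           # set a flag, as done for --split
mode &= ~DECRYPT        # clear a flag, as done for --no-decrypt
```

# After these three steps only the split flag remains set.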
1434 PDTCRYPT_SPLITNAME = "pdtcrypt-object-%d.bin"
1435 PDTCRYPT_RESCUENAME = "pdtcrypt-rescue-object-%0.5d.bin"
1437 PDTCRYPT_VERBOSE = False
1438 PDTCRYPT_STRICTIVS = False
1439 PDTCRYPT_OVERWRITE = False
1440 PDTCRYPT_BLOCKSIZE = 1 << 12
1445 PDTCRYPT_DEFAULT_VER = 1
1446 PDTCRYPT_DEFAULT_PVER = 1
1448 # scrypt hashing output control
1449 PDTCRYPT_SCRYPT_INTRANATOR = 0
1450 PDTCRYPT_SCRYPT_PARAMETERS = 1
1451 PDTCRYPT_SCRYPT_DEFAULT = PDTCRYPT_SCRYPT_INTRANATOR
1453 PDTCRYPT_SCRYPT_FORMAT = \
1454 { "i2n" : PDTCRYPT_SCRYPT_INTRANATOR
1455 , "params" : PDTCRYPT_SCRYPT_PARAMETERS }
1457 PDTCRYPT_TT_COLUMNS = 80 # assume standard terminal
1459 class PDTDecryptionError (Exception):
1460 """Decryption failed."""
1462 class PDTSplitError (Exception):
1463 """Splitting failed."""
1466 def noise (*a, **b):
1467 print (file=sys.stderr, *a, **b)
1470 class PassthroughDecryptor (object):
1472 curhdr = None # write current header on first data write
1474 def __init__ (self):
1475 if PDTCRYPT_VERBOSE is True:
1476 noise ("PDT: no encryption; data passthrough")
1478 def next (self, hdr):
1479 ok, curhdr = hdr_make (hdr)
1481 raise PDTDecryptionError ("bad header %r" % hdr)
1482 self.curhdr = curhdr
1485 if self.curhdr is not None:
1489 def process (self, d):
1490 if self.curhdr is not None:
1496 def depdtcrypt (mode, secret, ins, outs):
1498 Remove PDTCRYPT layer from all objects encrypted with the secret. Used on a
1499 Deltatar backup this will yield a (possibly Gzip compressed) tarball.
1501 ctleft = -1 # length of ciphertext to consume
1502 ctcurrent = 0 # total ciphertext of current object
1503 total_obj = 0 # total number of objects read
1504 total_pt = 0 # total plaintext bytes
1505 total_ct = 0 # total ciphertext bytes
1506 total_read = 0 # total bytes read
1507 outfile = None # Python file object for output
1509 if mode & PDTCRYPT_DECRYPT: # decryptor
1511 if ks == PDTCRYPT_SECRET_PW:
1512 decr = Decrypt (password=secret [1], strict_ivs=PDTCRYPT_STRICTIVS)
1513 elif ks == PDTCRYPT_SECRET_KEY:
1514 key = binascii.unhexlify (secret [1])
1515 decr = Decrypt (key=key, strict_ivs=PDTCRYPT_STRICTIVS)
1517 raise InternalError ("‘%d’ does not specify a valid kind of secret"
1520 decr = PassthroughDecryptor ()
1523 """Dummy for non-split mode: output file does not vary."""
1526 if mode & PDTCRYPT_SPLIT:
1527 def nextout (outfile):
1529 We were passed an fd as outs for accessing the destination
1530 directory where extracted archive components are supposed
1535 if PDTCRYPT_VERBOSE is True:
1536 noise ("PDT: no output file to close at this point")
1538 if PDTCRYPT_VERBOSE is True:
1539 noise ("PDT: release output file %r" % outfile)
1540 # cleanup happens automatically by the GC; the next
1541 # line will error out on account of an invalid fd
1544 assert total_obj > 0
1545 fname = PDTCRYPT_SPLITNAME % total_obj
1547 outfd = open2_dump_file (fname, outs, force=PDTCRYPT_OVERWRITE)
1548 except RuntimeError as exn:
1549 raise PDTSplitError (exn)
1550 return os.fdopen (outfd, "wb", closefd=True)
1554 """ESPIPE is normal on non-seekable stdio stream."""
1557 except OSError as exn:
1558 if exn.errno == os.errno.ESPIPE:
1561 def out (pt, outfile):
1565 if PDTCRYPT_VERBOSE is True:
1566 noise ("PDT:\t· decrypt plaintext %d B" % (npt))
1568 nn = outfile.write (pt)
1569 except OSError as exn: # probably ENOSPC
1570 raise DecryptionError ("error (%s)" % exn)
1572 raise DecryptionError ("write aborted after %d of %d B" % (nn, npt))
1576 # current object completed; in a valid archive this marks either
1577 # the start of a new header or the end of the input
1578 if ctleft == 0: # current object requires finalization
1579 if PDTCRYPT_VERBOSE is True:
1580 noise ("PDT: %d finalize" % tell (ins))
1583 except InvalidGCMTag as exn:
1584 raise DecryptionError ("error finalizing object %d (%d B): "
1585 "%r" % (total_obj, len (pt), exn)) \
1588 if PDTCRYPT_VERBOSE is True:
1589 noise ("PDT:\t· object validated")
1591 if PDTCRYPT_VERBOSE is True:
1592 noise ("PDT: %d hdr" % tell (ins))
1594 hdr = hdr_read_stream (ins)
1595 total_read += PDTCRYPT_HDR_SIZE
1596 except EndOfFile as exn:
1597 total_read += exn.remainder
1598 if total_ct + total_obj * PDTCRYPT_HDR_SIZE != total_read:
1599 raise PDTDecryptionError ("ciphertext processed (%d B) plus "
1600 "overhead (%d × %d B) does not match "
1601 "the number of bytes read (%d)"
1602 % (total_ct, total_obj, PDTCRYPT_HDR_SIZE,
1604 # the single good exit
1605 return total_read, total_obj, total_ct, total_pt
1606 except InvalidHeader as exn:
1607 raise PDTDecryptionError ("invalid header at position %d in %r "
1608 "(%s)" % (tell (ins), ins, exn))
1609 if PDTCRYPT_VERBOSE is True:
1610 pretty = hdr_fmt_pretty (hdr)
1611 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1612 pretty.splitlines (), ""))
1613 ctcurrent = ctleft = hdr ["ctsize"]
1617 total_obj += 1 # used in file counter with split mode
1619 # finalization complete or skipped in case of first object in
1620 # stream; create a new output file if necessary
1621 outfile = nextout (outfile)
1623 if PDTCRYPT_VERBOSE is True:
1624 noise ("PDT: %d decrypt obj no. %d, %d B"
1625 % (tell (ins), total_obj, ctleft))
1627 # always allocate a new buffer since python-cryptography doesn’t allow
1628 # passing a bytearray :/
1629 nexpect = min (ctleft, PDTCRYPT_BLOCKSIZE)
1630 if PDTCRYPT_VERBOSE is True:
1631 noise ("PDT:\t· [%d] %d%% done, read block (%d B of %d B remaining)"
1633 100 - ctleft * 100 / (ctcurrent > 0 and ctcurrent or 1),
1635 ct = ins.read (nexpect)
1639 raise EndOfFile (nct,
1640 "hit EOF after %d of %d B in block [%d:%d); "
1641 "%d B ciphertext remaining for object no %d"
1642 % (nct, nexpect, off, off + nexpect, ctleft,
1648 if PDTCRYPT_VERBOSE is True:
1649 noise ("PDT:\t· decrypt ciphertext %d B" % (nct))
1650 pt = decr.process (ct)
1654 def deptdcrypt_mk_stream (kind, path):
1655 """Create stream from file or stdio descriptor."""
1656 if kind == PDTCRYPT_SINK:
1658 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: stdout")
1659 return sys.stdout.buffer
1661 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: file %s" % path)
1662 return io.FileIO (path, "w")
1663 if kind == PDTCRYPT_SOURCE:
1665 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: stdin")
1666 return sys.stdin.buffer
1668 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: file %s" % path)
1669 return io.FileIO (path, "r")
1671 raise ValueError ("bogus stream “%s” / %s" % (kind, path))
1674 def mode_depdtcrypt (mode, secret, ins, outs):
1676 total_read, total_obj, total_ct, total_pt = \
1677 depdtcrypt (mode, secret, ins, outs)
1678 except DecryptionError as exn:
1679 noise ("PDT: Decryption failed:")
1681 noise ("PDT: “%s”" % exn)
1683 noise ("PDT: Did you specify the correct key / password?")
1686 except PDTSplitError as exn:
1687 noise ("PDT: Split operation failed:")
1689 noise ("PDT: “%s”" % exn)
1691 noise ("PDT: Hint: target directory should be empty.")
1695 if PDTCRYPT_VERBOSE is True:
1696 noise ("PDT: decryption successful" )
1697 noise ("PDT: %.10d bytes read" % total_read)
1698 noise ("PDT: %.10d objects decrypted" % total_obj )
1699 noise ("PDT: %.10d bytes ciphertext" % total_ct )
1700 noise ("PDT: %.10d bytes plaintext" % total_pt )
1706 def mode_scrypt (pw, ins=None, nacl=None, fmt=PDTCRYPT_SCRYPT_INTRANATOR):
1708 paramversion = PDTCRYPT_DEFAULT_PVER
1710 hsh, nacl, version, paramversion = scrypt_hashsource (pw, ins)
1711 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1713 nacl = binascii.unhexlify (nacl)
1714 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1715 version = PDTCRYPT_DEFAULT_VER
1717 kdfname, params = defs ["kdf"]
1719 kdf = kdf_by_version (None, defs)
1720 hsh, _void = kdf (pw, nacl)
1724 if fmt == PDTCRYPT_SCRYPT_INTRANATOR:
1725 out = json.dumps ({ "salt" : base64.b64encode (nacl).decode ()
1726 , "key" : base64.b64encode (hsh) .decode ()
1727 , "paramversion" : paramversion })
1728 elif fmt == PDTCRYPT_SCRYPT_PARAMETERS:
1729 out = json.dumps ({ "salt" : binascii.hexlify (nacl).decode ()
1730 , "key" : binascii.hexlify (hsh) .decode ()
1731 , "version" : version
1732 , "scrypt_params" : { "N" : params ["N"]
1733 , "r" : params ["r"]
1734 , "p" : params ["p"]
1735 , "dkLen" : params ["dkLen"] } })
1737 raise RuntimeError ("bad scrypt output scheme %r" % fmt)
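# Illustration only, not part of the module: a minimal sketch of the two
# JSON output shapes built above. It uses ``hashlib.scrypt`` from the
# standard library as a stand-in; the module derives its key through
# ``kdf_by_version``, which is not shown here. The parameter values
# (N=2**14, r=8, p=1, dkLen=16) and the name ``scrypt_json`` are assumptions
# for this example, not the values of any PDTCRYPT parameter version.

```python
import base64
import binascii
import hashlib
import json


def scrypt_json(pw, salt, fmt="i2n", n=2 ** 14, r=8, p=1, dklen=16):
    """Hash pw with scrypt and serialize salt and key as JSON."""
    key = hashlib.scrypt(pw, salt=salt, n=n, r=r, p=p, dklen=dklen)
    if fmt == "i2n":
        # base64 encoded salt and key
        return json.dumps({"salt": base64.b64encode(salt).decode(),
                           "key": base64.b64encode(key).decode()})
    if fmt == "params":
        # hex encoded salt and key plus the scrypt parameters
        return json.dumps({"salt": binascii.hexlify(salt).decode(),
                           "key": binascii.hexlify(key).decode(),
                           "scrypt_params": {"N": n, "r": r, "p": p,
                                             "dkLen": dklen}})
    raise ValueError("bad scrypt output scheme %r" % fmt)
```

# Both formats carry the same derived key; they differ only in encoding and
# in whether the scrypt parameters are included.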
1742 def noise_output_candidates (cands, indent=8, cols=PDTCRYPT_TT_COLUMNS):
1744 Print a list of offsets without garbling the terminal too much.
1746 The indent is counted from column zero; if it is wide enough, the “PDT: ”
1747 marker will be prepended, considered part of the indentation.
1751 idt = " " * indent if indent < 5 else "PDT: " + " " * (indent - 5)
1756 init = True # prevent leading separator
1759 raise ValueError ("the requested indentation exceeds the line "
1760 "width by %d" % (indent - wd))
1770 if lpos > wd: # line break
1786 def mode_scan (secret, fname, outs=None, nacl=None):
1788 Dissect a binary file, looking for PDTCRYPT headers and objects.
1790 If *outs* is supplied, recoverable data will be dumped into the specified
1794 ifd = os.open (fname, os.O_RDONLY)
1795 except FileNotFoundError:
1796 noise ("PDT: failed to open %s readonly" % fname)
1801 if PDTCRYPT_VERBOSE is True:
1802 noise ("PDT: scan for potential sync points")
1803 cands = locate_hdr_candidates (ifd)
1804 if len (cands) == 0:
1805 noise ("PDT: scan complete: input does not contain potential PDT "
1806 "headers; giving up.")
1808 if PDTCRYPT_VERBOSE is True:
1809 noise ("PDT: scan complete: found %d candidates:" % len (cands))
1810 noise_output_candidates (cands)
1820 vdt, hdr = inspect_hdr (ifd, cand)
1821 if vdt == HDR_CAND_JUNK:
1824 off0 = cand + PDTCRYPT_HDR_SIZE
1825 if PDTCRYPT_VERBOSE is True:
1826 noise ("PDT: obj %d: read payload @%d" % (nobj, off0))
1827 pretty = hdr_fmt_pretty (hdr)
1828 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1829 pretty.splitlines (), ""))
1832 if outs is not None:
1833 ofname = PDTCRYPT_RESCUENAME % nobj
1834 ofd = open2_dump_file (ofname, outs, force=PDTCRYPT_OVERWRITE)
1837 ok = try_decrypt (ifd, off0, hdr, secret, ofd=ofd) == hdr ["ctsize"]
1841 if vdt == HDR_CAND_GOOD and ok is True:
1842 noise ("PDT: %d → ✓ valid object %d–%d"
1843 % (cand, off0, off0 + hdr ["ctsize"]))
1844 elif vdt == HDR_CAND_FISHY and ok is True:
1845 noise ("PDT: %d → × object %d–%d, corrupt header"
1846 % (cand, off0, off0 + hdr ["ctsize"]))
1847 elif vdt == HDR_CAND_GOOD and ok is False:
1848 noise ("PDT: %d → × object %d–%d, problematic payload"
1849 % (cand, off0, off0 + hdr ["ctsize"]))
1850 elif vdt == HDR_CAND_FISHY and ok is False:
1851 noise ("PDT: %d → × object %d–%d, corrupt header, problematic "
1852 "ciphertext" % (cand, off0, off0 + hdr ["ctsize"]))
1859 noise ("PDT: all headers ok")
1861 noise ("PDT: %d candidates not parseable as headers:" % len (junk))
1862 noise_output_candidates (junk)
1864 def usage (err=False):
1868 indent = ' ' * len (SELF)
1869 out ("usage: %s SUBCOMMAND { --help" % SELF)
1870 out (" %s | [ -v ] { -p PASSWORD | -k KEY }" % indent)
1871 out (" %s [ { -i | --in } { - | SOURCE } ]" % indent)
1872 out (" %s [ { -n | --nacl } { SALT } ]" % indent)
1873 out (" %s [ { -o | --out } { - | DESTINATION } ]" % indent)
1874 out (" %s [ -D | --no-decrypt ] [ -S | --split ]" % indent)
1875 out (" %s [ -f | --format ]" % indent)
1878 out ("\t\tSUBCOMMAND main mode: { process | scrypt | scan }")
1880 out ("\t\t process: extract objects from PDT archive")
1881 out ("\t\t scrypt: calculate hash from password and first object")
1882 out ("\t\t-p PASSWORD password to derive the encryption key from")
1883 out ("\t\t-k KEY encryption key as 16 bytes in hexadecimal notation")
1884 out ("\t\t-s enforce strict handling of initialization vectors")
1885 out ("\t\t-i SOURCE file name to read from")
1886 out ("\t\t-o DESTINATION file to write output to")
1887 out ("\t\t-n SALT provide salt for scrypt mode in hex encoding")
1888 out ("\t\t-v print extra info")
1889 out ("\t\t-S split into files at object boundaries; this")
1890 out ("\t\t requires DESTINATION to refer to directory")
1891 out ("\t\t-D PDT header and ciphertext passthrough")
1892 out ("\t\t-f format of SCRYPT hash output (“i2n” or “params”)")
1894 out ("\tinstead of filenames, “-” may be used to specify stdin / stdout")
1896 sys.exit ((err is True) and 42 or 0)
1906 def parse_argv (argv):
1907 global PDTCRYPT_OVERWRITE
1909 mode = PDTCRYPT_DECRYPT
1915 scrypt_format = PDTCRYPT_SCRYPT_DEFAULT
1918 SELF = os.path.basename (next (argvi))
1921 rawsubcmd = next (argvi)
1922 subcommand = PDTCRYPT_SUB [rawsubcmd]
1923 except StopIteration:
1924 bail ("ERROR: subcommand required")
1926 bail ("ERROR: invalid subcommand “%s” specified" % rawsubcmd)
1932 except StopIteration:
1933 bail ("ERROR: argument list incomplete")
1935 def checked_secret (t, arg):
1940 bail ("ERROR: encountered “%s” but secret already given" % arg)
1943 if arg in [ "-h", "--help" ]:
1946 elif arg in [ "-v", "--verbose", "--wtf" ]:
1947 global PDTCRYPT_VERBOSE
1948 PDTCRYPT_VERBOSE = True
1949 elif arg in [ "-i", "--in", "--source" ]:
1950 insspec = checked_arg ()
1951 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt from %s" % insspec)
1952 elif arg in [ "-p", "--password" ]:
1953 arg = checked_arg ()
1954 checked_secret (PDTCRYPT_SECRET_PW, arg)
1955 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with password")
1957 if subcommand == PDTCRYPT_SUB_PROCESS:
1958 if arg in [ "-s", "--strict-ivs" ]:
1959 global PDTCRYPT_STRICTIVS
1960 PDTCRYPT_STRICTIVS = True
1961 elif arg in [ "-o", "--out", "--dest", "--sink" ]:
1962 outsspec = checked_arg ()
1963 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
1964 elif arg in [ "-f", "--force" ]:
1965 PDTCRYPT_OVERWRITE = True
1966 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
1967 elif arg in [ "-S", "--split" ]:
1968 mode |= PDTCRYPT_SPLIT
1969 if PDTCRYPT_VERBOSE is True: noise ("PDT: split files")
1970 elif arg in [ "-D", "--no-decrypt" ]:
1971 mode &= ~PDTCRYPT_DECRYPT
1972 if PDTCRYPT_VERBOSE is True: noise ("PDT: not decrypting")
1973 elif arg in [ "-k", "--key" ]:
1974 arg = checked_arg ()
1975 checked_secret (PDTCRYPT_SECRET_KEY, arg)
1976 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with key")
1978 bail ("ERROR: unexpected positional argument “%s”" % arg)
1979 elif subcommand == PDTCRYPT_SUB_SCRYPT:
1980 if arg in [ "-n", "--nacl", "--salt" ]:
1981 nacl = checked_arg ()
1982 if PDTCRYPT_VERBOSE is True: noise ("PDT: salt key with %s" % nacl)
1983 elif arg in [ "-f", "--format" ]:
1984 arg = checked_arg ()
1986 scrypt_format = PDTCRYPT_SCRYPT_FORMAT [arg]
1988 bail ("ERROR: invalid scrypt output format %s" % arg)
1989 if PDTCRYPT_VERBOSE is True:
1990 noise ("PDT: scrypt output format “%s”" % scrypt_format)
1992 bail ("ERROR: unexpected positional argument “%s”" % arg)
1993 elif subcommand == PDTCRYPT_SUB_SCAN:
1994 if arg in [ "-o", "--out", "--dest", "--sink" ]:
1995 outsspec = checked_arg ()
1996 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
1997 elif arg in [ "-f", "--force" ]:
1998 PDTCRYPT_OVERWRITE = True
1999 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
2001 bail ("ERROR: unexpected positional argument “%s”" % arg)
2004 if PDTCRYPT_VERBOSE is True:
2005 noise ("PDT: no password or key specified, trying $PDTCRYPT_PASSWORD")
2006 epw = os.getenv ("PDTCRYPT_PASSWORD")
2008 checked_secret (PDTCRYPT_SECRET_PW, epw.strip ())
2011 if PDTCRYPT_VERBOSE is True:
2012 noise ("PDT: no password or key specified, trying $PDTCRYPT_KEY")
2013 ek = os.getenv ("PDTCRYPT_KEY")
2015 checked_secret (PDTCRYPT_SECRET_KEY, ek.strip ())
2018 if subcommand == PDTCRYPT_SUB_SCRYPT:
2019 bail ("ERROR: scrypt hash mode requested but no password given")
2020 elif mode & PDTCRYPT_DECRYPT:
2021 bail ("ERROR: decryption requested but no password or key given")
2023 if mode & PDTCRYPT_SPLIT and outsspec is None:
2024 bail ("ERROR: split mode is incompatible with stdout sink "
2027 if subcommand == PDTCRYPT_SUB_SCAN and outsspec is None:
2028 pass # no output by default in scan mode
2029 elif mode & PDTCRYPT_SPLIT or subcommand == PDTCRYPT_SUB_SCAN:
2030 # destination must be directory
2032 bail ("ERROR: mode is incompatible with stdout sink")
2035 os.makedirs (outsspec, 0o700)
2036 except FileExistsError:
2037 # if it’s a directory with appropriate perms, everything is
2038 # good; otherwise, below invocation of open(2) will fail
2040 outs = os.open (outsspec, os.O_DIRECTORY, 0o600)
2041 except FileNotFoundError as exn:
2042 bail ("ERROR: cannot create target directory “%s”" % outsspec)
2043 except NotADirectoryError as exn:
2044 bail ("ERROR: target path “%s” is not a directory" % outsspec)
2046 outs = deptdcrypt_mk_stream (PDTCRYPT_SINK, outsspec or "-")
2048 if subcommand == PDTCRYPT_SUB_SCAN:
2050 bail ("ERROR: please supply an input file for scanning")
2052 bail ("ERROR: input must be seekable; please specify a file")
2053 return True, partial (mode_scan, secret, insspec, outs, nacl=nacl)
2055 if subcommand == PDTCRYPT_SUB_SCRYPT:
2056 if secret [0] == PDTCRYPT_SECRET_KEY:
2057 bail ("ERROR: scrypt mode requires a password")
2058 if insspec is not None and nacl is not None \
2059 or insspec is None and nacl is None :
2060 bail ("ERROR: please supply either an input file or "
2065 if insspec is not None or subcommand != PDTCRYPT_SUB_SCRYPT:
2066 ins = deptdcrypt_mk_stream (PDTCRYPT_SOURCE, insspec or "-")
2068 if subcommand == PDTCRYPT_SUB_SCRYPT:
2069 return True, partial (mode_scrypt, secret [1].encode (), ins, nacl,
2072 return True, partial (mode_depdtcrypt, mode, secret, ins, outs)
2076 ok, runner = parse_argv (argv)
2078 if ok is True: return runner ()
2083 if __name__ == "__main__":
2084 sys.exit (main (sys.argv))