6 ===============================================================================
7 crypto -- Encryption Layer for the Deltatar Backup
8 ===============================================================================
12 - AES-GCM for the symmetric encryption;
17 - NIST Recommendation for Block Cipher Modes of Operation: Galois/Counter
19 http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf
22 https://cryptome.org/2014/01/aes-gcm-v1.pdf
24 - Authentication weaknesses in GCM
25 http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf
27 Trouble with python-cryptography packages: authentication tags can only be
28 passed in advance: https://github.com/pyca/cryptography/pull/3421
31 -------------------------------------------------------------------------------
33 Errors fall into roughly three categories:
35 - Cryptographical errors or invalid data.
37 - ``InvalidGCMTag`` (decryption failed on account of an invalid GCM
39 - ``InvalidIVFixedPart`` (IV fixed part of object not found in list),
40 - ``DuplicateIV`` (the IV of an encrypted object already occurred),
41 - ``DecryptionError`` (used in CLI decryption for presenting error
42 conditions to the user).
44 - Incorrect usage of the library.
46 - ``InvalidParameter`` (non-conforming user supplied parameter),
47 - ``InvalidHeader`` (data passed for reading not parsable into header),
48 - ``FormatError`` (cannot handle header or parameter version),
51 - Bad internal state. If one of these is encountered it means that a state
52 was reached that shouldn’t occur during normal processing.
57 Also, ``EndOfFile`` is used as a sentinel to communicate that a stream supplied
58 for reading is exhausted.
60 Initialization Vectors
61 -------------------------------------------------------------------------------
63 Initialization vectors are checked reuse during the lifetime of a decryptor.
64 The fixed counters for metadata files cannot be reused and attempts to do so
65 will cause a DuplicateIV error. This means the length of objects encrypted with
66 a metadata counter is capped at 63 GB.
68 For ordinary, non-metadata payload, there is an optional mode with strict IV
69 checking that causes a crypto context to fail if an IV encountered or created
70 was already used for decrypting or encrypting, respectively, an earlier object.
71 Note that this mode can trigger false positives when decrypting non-linearly,
72 e. g. when traversing the same object multiple times. Since the crypto context
73 has no notion of a position in a PDT encrypted archive, this condition must be
74 sorted out downstream.
77 -------------------------------------------------------------------------------
79 ``crypto.py`` may be invoked as a script for decrypting, validating, and
80 splitting PDT encrypted files. Consult the usage message for details.
84 Decrypt from stdin using the password ‘foo’: ::
86 $ crypto.py process foo -i - -o - <some-file.tar.gz.pdtcrypt >some-file.tar.gz
88 Output verbose information about the encrypted objects in the archive: ::
90 $ crypto.py process foo -v -i some-file.tar.gz.pdtcrypt -o /dev/null
91 PDT: decrypt from some-file.tar.gz.pdtcrypt
92 PDT: decrypt to /dev/null
93 PDT: source: file some-file.tar.gz.pdtcrypt
94 PDT: sink: file /dev/null
96 PDT: · version = 1 : 0100
97 PDT: · paramversion = 1 : 0100
98 PDT: · nacl : d270 b031 00d1 87e2 c946 610d 7b7f 7e5f
99 PDT: · iv : 02ee 3dd7 a963 1eb1 0100 0000
100 PDT: · ctsize = 591 : 4f02 0000 0000 0000
101 PDT: · tag : 5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b
102 PDT: 64 decrypt obj no. 1, 591 B
103 PDT: · [64] 0% done, read block (591 B of 591 B remaining)
104 PDT: · decrypt ciphertext 591 B
105 PDT: · decrypt plaintext 591 B
109 Also, the mode *scrypt* allows deriving encryption keys. To calculate the
110 encryption key from the password ‘foo’ and the salt of the first object in a
111 PDT encrypted file: ::
113 $ crypto.py scrypt foo -i some-file.pdtcrypt
114 {"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}
116 The computed 16 byte key is given in hexadecimal notation in the value to
117 ``hash`` and can be fed into Python’s ``binascii.unhexlify()`` to obtain the
118 corresponding binary representation.
120 Note that in Scrypt hashing mode, no data integrity checks are being performed.
121 If the wrong password is given, a wrong key will be derived. Whether the password
122 was indeed correct can only be determined by decrypting. Note that since PDT
123 archives essentially consist of a stream of independent objects, the salt and
124 other parameters may change. Thus a key derived using above method from the
125 first object doesn’t necessarily apply to any of the subsequent objects.
134 from functools import reduce, partial
144 except ImportError as exn:
147 if __name__ == "__main__": ## Work around the import mechanism’s lest Python’s
148 pwd = os.getcwd() ## preference for local imports causes a cyclical
149 ## import (crypto → pylibscrypt → […] → ./tarfile → crypto).
150 sys.path = [ p for p in sys.path if p.find ("deltatar") < 0 ]
153 from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
154 from cryptography.hazmat.backends import default_backend
158 __all__ = [ "hdr_make", "hdr_read", "hdr_fmt", "hdr_fmt_pretty"
160 , "PDTCRYPT_HDR_SIZE", "AES_GCM_IV_CNT_DATA"
161 , "AES_GCM_IV_CNT_INFOFILE", "AES_GCM_IV_CNT_INDEX"
165 ###############################################################################
167 ###############################################################################
169 class EndOfFile (Exception):
173 def __init__ (self, n=None, msg=None):
179 class InvalidParameter (Exception):
180 """Inputs not valid for PDT encryption."""
184 class InvalidHeader (Exception):
185 """Header not valid."""
189 class InvalidGCMTag (Exception):
191 The GCM tag calculated during decryption differs from that in the object
197 class InvalidIVFixedPart (Exception):
199 IV fixed part not in supplied list: either the backup is corrupt or the
200 current object does not belong to it.
205 class IVFixedPartError (Exception):
207 Error creating a unique IV fixed part: repeated calls to system RNG yielded
208 the same sequence of bytes as the last IV used.
213 class InvalidFileCounter (Exception):
215 When encrypting, an attempted reuse of a dedicated counter (info file,
216 index file) was caught.
221 class DuplicateIV (Exception):
223 During encryption, the current IV fixed part is identical to an already
224 existing IV (same prefix and file counter). This indicates tampering or
225 programmer error and cannot be recovered from.
230 class NonConsecutiveIV (Exception):
232 IVs not numbered consecutively. This is a hard error with strict IV
233 checking. Precludes random access to the encrypted objects.
238 class FormatError (Exception):
239 """Unusable parameters in header."""
243 class DecryptionError (Exception):
244 """Error during decryption with ``crypto.py`` on the command line."""
248 class Unreachable (Exception):
250 Makeshift __builtin_unreachable(); always a programmer error if
256 class InternalError (Exception):
257 """Errors not ascribable to bad user inputs or cryptography."""
261 ###############################################################################
262 ## crypto layer version
263 ###############################################################################
265 ENCRYPTION_PARAMETERS = \
267 { "kdf": ("dummy", 16)
268 , "enc": "passthrough" }
276 , "enc": "aes-gcm" } }
278 ###############################################################################
280 ###############################################################################
282 PDTCRYPT_HDR_MAGIC = b"PDTCRYPT"
284 PDTCRYPT_HDR_SIZE_MAGIC = 8 # 8
285 PDTCRYPT_HDR_SIZE_VERSION = 2 # 10
286 PDTCRYPT_HDR_SIZE_PARAMVERSION = 2 # 12
287 PDTCRYPT_HDR_SIZE_NACL = 16 # 28
288 PDTCRYPT_HDR_SIZE_IV = 12 # 40
289 PDTCRYPT_HDR_SIZE_CTSIZE = 8 # 48
290 PDTCRYPT_HDR_SIZE_TAG = 16 # 64 GCM auth tag
292 PDTCRYPT_HDR_SIZE = PDTCRYPT_HDR_SIZE_MAGIC + PDTCRYPT_HDR_SIZE_VERSION \
293 + PDTCRYPT_HDR_SIZE_PARAMVERSION + PDTCRYPT_HDR_SIZE_NACL \
294 + PDTCRYPT_HDR_SIZE_IV + PDTCRYPT_HDR_SIZE_CTSIZE \
295 + PDTCRYPT_HDR_SIZE_TAG # = 64
297 # precalculate offsets since Python can’t do constant folding over names
298 HDR_OFF_VERSION = PDTCRYPT_HDR_SIZE_MAGIC
299 HDR_OFF_PARAMVERSION = HDR_OFF_VERSION + PDTCRYPT_HDR_SIZE_VERSION
300 HDR_OFF_NACL = HDR_OFF_PARAMVERSION + PDTCRYPT_HDR_SIZE_PARAMVERSION
301 HDR_OFF_IV = HDR_OFF_NACL + PDTCRYPT_HDR_SIZE_NACL
302 HDR_OFF_CTSIZE = HDR_OFF_IV + PDTCRYPT_HDR_SIZE_IV
303 HDR_OFF_TAG = HDR_OFF_CTSIZE + PDTCRYPT_HDR_SIZE_CTSIZE
307 FMT_I2N_IV = "<8sL" # 8 random bytes ‖ 32 bit counter
308 FMT_I2N_HDR = ("<" # host byte order
312 "16s" # sodium chloride
318 AES_GCM_MAX_SIZE = (1 << 36) - (1 << 5) # 2^39 - 2^8 b ≅ 64 GB
319 PDTCRYPT_MAX_OBJ_SIZE_DEFAULT = 63 * (1 << 30) # 63 GB
320 PDTCRYPT_MAX_OBJ_SIZE = PDTCRYPT_MAX_OBJ_SIZE_DEFAULT
322 # index and info files are written on-the fly while encrypting so their
323 # counters must be available inadvance
324 AES_GCM_IV_CNT_INFOFILE = 1 # constant
325 AES_GCM_IV_CNT_INDEX = AES_GCM_IV_CNT_INFOFILE + 1
326 AES_GCM_IV_CNT_DATA = AES_GCM_IV_CNT_INDEX + 1 # also for multivolume
327 AES_GCM_IV_CNT_MAX_DEFAULT = 0xffFFffFF
328 AES_GCM_IV_CNT_MAX = AES_GCM_IV_CNT_MAX_DEFAULT
330 # IV structure and generation
331 PDTCRYPT_IV_GEN_MAX_RETRIES = 10 # ×
332 PDTCRYPT_IV_FIXEDPART_SIZE = 8 # B
333 PDTCRYPT_IV_COUNTER_SIZE = 4 # B
335 ###############################################################################
337 ###############################################################################
343 # , paramversion : u16
349 # fn hdr_read (f : handle) -> hdrinfo;
350 # fn hdr_make (f : handle, h : hdrinfo) -> IOResult<usize>;
351 # fn hdr_fmt (h : hdrinfo) -> String;
356 Read bytes as header structure.
358 If the input could not be interpreted as a header, fail with
363 mag, version, paramversion, nacl, iv, ctsize, tag = \
364 struct.unpack (FMT_I2N_HDR, data)
365 except Exception as exn:
366 raise InvalidHeader ("error unpacking header from [%r]: %s"
367 % (binascii.hexlify (data), str (exn)))
369 if mag != PDTCRYPT_HDR_MAGIC:
370 raise InvalidHeader ("bad magic in header: expected [%s], got [%s]"
371 % (PDTCRYPT_HDR_MAGIC, mag))
374 { "version" : version
375 , "paramversion" : paramversion
383 def hdr_read_stream (instr):
385 Read header from stream at the current position.
387 Fail with ``InvalidHeader`` if insufficient bytes were read from the
388 stream, or if the content could not be interpreted as a header.
390 data = instr.read(PDTCRYPT_HDR_SIZE)
394 elif ldata != PDTCRYPT_HDR_SIZE:
395 raise InvalidHeader ("hdr_read_stream: expected %d B, received %d B"
396 % (PDTCRYPT_HDR_SIZE, ldata))
397 return hdr_read (data)
400 def hdr_from_params (version, paramversion, nacl, iv, ctsize, tag):
402 Assemble the necessary values into a PDTCRYPT header.
404 :type version: int to fit uint16_t
405 :type paramversion: int to fit uint16_t
406 :type nacl: bytes to fit uint8_t[16]
407 :type iv: bytes to fit uint8_t[12]
408 :type size: int to fit uint64_t
409 :type tag: bytes to fit uint8_t[16]
411 buf = bytearray (PDTCRYPT_HDR_SIZE)
412 bufv = memoryview (buf)
415 struct.pack_into (FMT_I2N_HDR, bufv, 0,
417 version, paramversion, nacl, iv, ctsize, tag)
418 except Exception as exn:
419 return False, "error assembling header: %s" % str (exn)
421 return True, bytes (buf)
424 def hdr_make_dummy (s):
426 Create a header sized block of bytes initialized to a value derived from a
427 string. Used to verify we’ve jumped back correctly to the actual position
428 of the object header.
430 c = reduce (lambda a, c: a + ord(c), s, 0) % 0xFF
431 return bytes (bytearray (struct.pack ("B", c)) * PDTCRYPT_HDR_SIZE)
436 Assemble a header from the given header structure.
438 return hdr_from_params (version=hdr.get("version"),
439 paramversion=hdr.get("paramversion"),
440 nacl=hdr.get("nacl"), iv=hdr.get("iv"),
441 ctsize=hdr.get("ctsize"), tag=hdr.get("tag"))
444 HDR_FMT = "I2n_header { version: %d, paramversion: %d, nacl: %s[%d]," \
445 " iv: %s[%d], ctsize: %d, tag: %s[%d] }"
448 """Format a header structure into readable output."""
449 return HDR_FMT % (h["version"], h["paramversion"],
450 binascii.hexlify (h["nacl"]), len(h["nacl"]),
451 binascii.hexlify (h["iv"]), len(h["iv"]),
453 binascii.hexlify (h["tag"]), len(h["tag"]))
456 def hex_spaced_of_bytes (b):
457 """Format bytes object, hexdump style."""
458 return " ".join ([ "%.2x%.2x" % (c1, c2)
459 for c1, c2 in zip (b[0::2], b[1::2]) ]) \
460 + (len (b) | 1 == len (b) and " %.2x" % b[-1] or "") # odd lengths
463 def hdr_iv_counter (h):
464 """Extract the variable part of the IV of the given header."""
465 _fixed, cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
469 def hdr_iv_fixed (h):
470 """Extract the fixed part of the IV of the given header."""
471 fixed, _cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
475 hdr_dump = hex_spaced_of_bytes
479 """version = %-4d : %s
480 paramversion = %-4d : %s
487 def hdr_fmt_pretty (h):
489 Format header structure into multi-line representation of its contents and
490 their raw representation. (Omit the implicit “PDTCRYPT” magic bytes that
491 precede every header.)
493 return HDR_FMT_PRETTY \
495 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["version"])),
497 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["paramversion"])),
498 hex_spaced_of_bytes (h["nacl"]),
499 hex_spaced_of_bytes (h["iv"]),
501 hex_spaced_of_bytes (struct.pack (FMT_UINT64_LE, h["ctsize"])),
502 hex_spaced_of_bytes (h["tag"]))
504 IV_FMT = "((f %s) (c %d))"
507 """Format the two components of an IV in a readable fashion."""
508 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
509 return IV_FMT % (binascii.hexlify (fixed), cnt)
512 ###############################################################################
514 ###############################################################################
516 class Location (object):
520 def restore_loc_fmt (loc):
522 % (loc.n, loc.offset)
524 def locate_hdr_candidates (fd):
526 Walk over instances of the magic string in the payload, collecting their
527 positions. If the offset of the first found instance is not zero, the file
528 begins with leading garbage.
530 :return: The list of offsets in the file.
534 mm = mmap.mmap(fd, 0, mmap.MAP_SHARED, mmap.PROT_READ)
537 pos = mm.find (PDTCRYPT_HDR_MAGIC, pos)
546 HDR_CAND_GOOD = 0 # header marks begin of valid object
547 HDR_CAND_FISHY = 1 # inconclusive (tag mismatch, obj overlap etc.)
548 HDR_CAND_JUNK = 2 # not a header / object unreadable
551 def inspect_hdr (fd, off):
553 Attempt to parse a header in *fd* at position *off*.
555 Returns a verdict about the quality of that header plus the parsed header
559 _ = os.lseek (fd, off, os.SEEK_SET)
561 if os.lseek (fd, 0, os.SEEK_CUR) != off:
562 if PDTCRYPT_VERBOSE is True:
563 noise ("PDT: %d → dismissed (lseek() past EOF)" % off)
564 return HDR_CAND_JUNK, None
566 raw = os.read (fd, PDTCRYPT_HDR_SIZE)
567 if len (raw) != PDTCRYPT_HDR_SIZE:
568 if PDTCRYPT_VERBOSE is True:
569 noise ("PDT: %d → dismissed (EOF inside header)" % off)
570 return HDR_CAND_JUNK, None
574 except InvalidHeader as exn:
575 if PDTCRYPT_VERBOSE is True:
576 noise ("PDT: %d → dismissed (invalid: [%s])" % (off, str (exn)))
577 return HDR_CAND_JUNK, None
579 obj0 = off + PDTCRYPT_HDR_SIZE
580 objX = obj0 + hdr ["ctsize"]
582 eof = os.lseek (fd, 0, os.SEEK_END)
584 if PDTCRYPT_VERBOSE is True:
585 noise ("PDT: %d → EOF inside object (%d≤%d≤%d); adjusting size to "
586 "%d" % (off, obj0, eof, objX, (eof - obj0)))
587 # try reading up to the end
588 hdr ["ctsize"] = eof - obj0
589 return HDR_CAND_FISHY, hdr
591 return HDR_CAND_GOOD, hdr
594 def try_decrypt (ifd, off, hdr, secret, ofd=-1):
596 Attempt to decrypt the object in the (seekable) descriptor *ifd* starting
597 at *off* using the metadata in *hdr* and *secret*. An output fd can be
598 specified with *ofd*; if it is *-1* – the default –, the decrypted payload
601 Always creates a fresh decryptor, so validation steps across objects don’t
604 Errors during GCM tag validation are ignored.
606 ctleft = hdr ["ctsize"]
610 if ks == PDTCRYPT_SECRET_PW:
611 decr = Decrypt (password=secret [1])
612 elif ks == PDTCRYPT_SECRET_KEY:
613 key = binascii.unhexlify (secret [1])
614 decr = Decrypt (key=key)
621 os.lseek (ifd, pos, os.SEEK_SET)
623 cnksiz = min (ctleft, PDTCRYPT_BLOCKSIZE)
624 cnk = os.read (ifd, cnksiz)
627 pt = decr.process (cnk)
632 except InvalidGCMTag:
633 noise ("PDT: GCM tag mismatch for object %d–%d"
634 % (off, off + hdr ["ctsize"]))
635 if len (pt) > 0 and ofd != -1:
638 except Exception as exn:
639 noise ("PDT: error decrypting object %d–%d@%d, %d B remaining [%s]"
640 % (off, off + hdr ["ctsize"], pos, ctleft, exn))
646 def readable_objects_offsets (ifd, secret, cands):
648 From a list of candidates, locate the ones that mark the start of actual
649 readable PDTCRYPT objects.
655 vdt, hdr = inspect_hdr (ifd, cand)
656 if vdt == HDR_CAND_JUNK:
657 pass # ignore unreadable ones
658 elif vdt in [HDR_CAND_GOOD, HDR_CAND_FISHY]:
659 off0 = cand + PDTCRYPT_HDR_SIZE
660 ok = try_decrypt (ifd, off0, hdr, secret) == hdr ["ctsize"]
666 def reconstruct_offsets (fname, secret):
667 ifd = os.open (fname, os.O_RDONLY)
670 cands = locate_hdr_candidates (ifd)
671 return readable_objects_offsets (ifd, secret, cands)
676 ###############################################################################
677 ## passthrough / null encryption
678 ###############################################################################
680 class PassthroughCipher (object):
682 tag = struct.pack ("<QQ", 0, 0)
684 def __init__ (self) : pass
686 def update (self, b) : return b
688 def finalize (self) : return b""
690 def finalize_with_tag (self, _) : return b""
692 ###############################################################################
693 ## convenience wrapper
694 ###############################################################################
697 def kdf_dummy (klen, password, _nacl):
699 Fake KDF for testing purposes that is called when parameter version zero is
702 q, r = divmod (klen, len (password))
703 if isinstance (password, bytes) is False:
704 password = password.encode ()
705 return password * q + password [:r], b""
708 SCRYPT_KEY_MEMO = { } # static because needed for both the info file and the archive
711 def kdf_scrypt (params, password, nacl):
713 Wrapper for the Scrypt KDF, corresponds to parameter version one. The
714 computation result is memoized based on the inputs to facilitate spawning
715 multiple encryption contexts.
720 dkLen = params["dkLen"]
723 nacl = os.urandom (params["NaCl_LEN"])
725 key_parms = (password, nacl, N, r, p, dkLen)
726 global SCRYPT_KEY_MEMO
727 if key_parms not in SCRYPT_KEY_MEMO:
728 SCRYPT_KEY_MEMO [key_parms] = \
729 pylibscrypt.scrypt (password, nacl, N, r, p, dkLen)
730 return SCRYPT_KEY_MEMO [key_parms], nacl
733 def kdf_by_version (paramversion=None, defs=None):
735 Pick the KDF handler corresponding to the parameter version or the
738 :rtype: function (password : str, nacl : str) -> str
740 if paramversion is not None:
741 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
743 raise InvalidParameter ("no encryption parameters for version %r"
745 (kdf, params) = defs["kdf"]
747 if kdf == "scrypt" : fn = kdf_scrypt
748 if kdf == "dummy" : fn = kdf_dummy
750 raise ValueError ("key derivation method %r unknown" % kdf)
751 return partial (fn, params)
754 ###############################################################################
756 ###############################################################################
758 def scrypt_hashsource (pw, ins):
760 Calculate the SCRYPT hash from the password and the information contained
761 in the first header found in ``ins``.
763 This does not validate whether the first object is encrypted correctly.
765 if isinstance (pw, str) is True:
767 elif isinstance (pw, bytes) is False:
768 raise InvalidParameter ("password must be a string, not %s"
770 if isinstance (ins, io.BufferedReader) is False and \
771 isinstance (ins, io.FileIO) is False:
772 raise InvalidParameter ("file to hash must be opened in “binary” mode")
775 hdr = hdr_read_stream (ins)
776 except EndOfFile as exn:
777 noise ("PDT: malformed input: end of file reading first object header")
782 pver = hdr ["paramversion"]
783 if PDTCRYPT_VERBOSE is True:
784 noise ("PDT: salt of first object : %s" % binascii.hexlify (nacl))
785 noise ("PDT: parameter version of archive : %d" % pver)
788 defs = ENCRYPTION_PARAMETERS.get(pver, None)
789 kdfname, params = defs ["kdf"]
790 if kdfname != "scrypt":
791 noise ("PDT: input is not an SCRYPT archive")
794 kdf = kdf_by_version (None, defs)
795 except ValueError as exn:
796 noise ("PDT: object has unknown parameter version %d" % pver)
798 hsh, _void = kdf (pw, nacl)
800 return hsh, nacl, hdr ["version"], pver
803 def scrypt_hashfile (pw, fname):
805 Calculate the SCRYPT hash from the password and the information contained
806 in the first header found in the given file. The header is read only at
809 with deptdcrypt_mk_stream (PDTCRYPT_SOURCE, fname or "-") as ins:
810 hsh, _void, _void, _void = scrypt_hashsource (pw, ins)
814 ###############################################################################
816 ###############################################################################
818 class Crypto (object):
820 Encryption context to remain alive throughout an entire tarfile pass.
825 cnt = None # file counter (uint32_t != 0)
826 iv = None # current IV
827 fixed = None # accu for 64 bit fixed parts of IV
828 used_ivs = None # tracks IVs
829 strict_ivs = False # if True, panic on duplicate object IV
838 info_counter_used = False
839 index_counter_used = False
841 def __init__ (self, *al, **akv):
842 self.used_ivs = set ()
843 self.set_parameters (*al, **akv)
846 def next_fixed (self):
851 def set_object_counter (self, cnt=None):
853 Safely set the internal counter of encrypted objects. Numerous
856 The same counter may not be reused in combination with one IV fixed
857 part. This is validated elsewhere in the IV handling.
859 Counter zero is invalid. The first two counters are reserved for
860 metadata. The implementation does not allow for splitting metadata
861 files over multiple encrypted objects. (This would be possible by
862 assigning new fixed parts.) Thus in a Deltatar backup there is at most
863 one object with a counter value of one and two. On creation of a
864 context, the initial counter may be chosen. The globals
865 ``AES_GCM_IV_CNT_INFOFILE`` and ``AES_GCM_IV_CNT_INDEX`` can be used to
866 request one of the reserved values. If one of these values has been
867 used, any further attempt of setting the counter to that value will
868 be rejected with an ``InvalidFileCounter`` exception.
870 Out of bounds values (i. e. below one and more than the maximum of 2³²)
871 cause an ``InvalidParameter`` exception to be thrown.
874 self.cnt = AES_GCM_IV_CNT_DATA
876 if cnt == 0 or cnt > AES_GCM_IV_CNT_MAX + 1:
877 raise InvalidParameter ("invalid counter value %d requested: "
878 "acceptable values are from 1 to %d"
879 % (cnt, AES_GCM_IV_CNT_MAX))
880 if cnt == AES_GCM_IV_CNT_INFOFILE:
881 if self.info_counter_used is True:
882 raise InvalidFileCounter ("attempted to reuse info file "
883 "counter %d: must be unique" % cnt)
884 self.info_counter_used = True
885 elif cnt == AES_GCM_IV_CNT_INDEX:
886 if self.index_counter_used is True:
887 raise InvalidFileCounter ("attempted to reuse index file "
888 " counter %d: must be unique" % cnt)
889 self.index_counter_used = True
890 if cnt <= AES_GCM_IV_CNT_MAX:
893 # cnt == AES_GCM_IV_CNT_MAX + 1 → wrap
894 self.cnt = AES_GCM_IV_CNT_DATA
898 def set_parameters (self, password=None, key=None, paramversion=None,
899 nacl=None, counter=None, strict_ivs=False):
901 Configure the internal state of a crypto context. Not intended for
905 self.set_object_counter (counter)
906 self.strict_ivs = strict_ivs
908 if paramversion is not None:
909 self.paramversion = paramversion
912 self.key, self.nacl = key, nacl
915 if password is not None:
916 if isinstance (password, bytes) is False:
917 password = str.encode (password)
918 self.password = password
919 if paramversion is None and nacl is None:
920 # postpone key setup until first header is available
922 kdf = kdf_by_version (paramversion)
924 self.key, self.nacl = kdf (password, nacl)
927 def process (self, buf):
929 Encrypt / decrypt a buffer. Invokes the ``.update()`` method on the
930 wrapped encryptor or decryptor, respectively.
932 The Cryptography exception ``AlreadyFinalized`` is translated to an
933 ``InternalError`` at this point. It may occur in sound code when the GC
934 closes an encrypting stream after an error. Everywhere else it must be
938 raise RuntimeError ("process: context not initialized")
939 self.stats ["in"] += len (buf)
941 out = self.enc.update (buf)
942 except cryptography.exceptions.AlreadyFinalized as exn:
943 raise InternalError (exn)
944 self.stats ["out"] += len (out)
948 def next (self, password, paramversion, nacl, iv):
950 Prepare for encrypting another object: Reset the data counters and
951 change the configuration in case one of the variable parameters differs
952 from the last object. Also check the IV for duplicates and error out
953 if strict checking was requested.
957 self.stats ["obj"] += 1
959 self.check_duplicate_iv (iv)
961 if ( self.paramversion != paramversion
962 or self.password != password
963 or self.nacl != nacl):
964 self.set_parameters (password=password, paramversion=paramversion,
965 nacl=nacl, strict_ivs=self.strict_ivs)
968 def check_duplicate_iv (self, iv):
970 Add an IV (the 12 byte representation as in the header) to the list. With
971 strict checking enabled, this will throw a ``DuplicateIV``. Depending on
972 the context, this may indicate a serious error (IV reuse).
974 if self.strict_ivs is True and iv in self.used_ivs:
975 raise DuplicateIV ("iv %s was reused" % iv_fmt (iv))
976 # vi has not been used before; add to collection
977 self.used_ivs.add (iv)
982 Access the data counters.
984 return self.stats ["obj"], self.stats ["in"], self.stats ["out"]
989 Clear the current context regardless of its finalization state. The
990 next operation must be ``.next()``.
995 class Encrypt (Crypto):
1001 def __init__ (self, version, paramversion, password=None, key=None, nacl=None,
1002 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
1004 The ctor will throw immediately if one of the parameters does not conform
1005 to our expectations.
1007 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
1008 :type version: int to fit uint16_t
1009 :type paramversion: int to fit uint16_t
1010 :param password: mutually exclusive with ``key``
1011 :type password: bytes
1012 :param key: mutually exclusive with ``password``
1015 :type counter: initial object counter the values
1016 ``AES_GCM_IV_CNT_INFOFILE`` and
1017 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
1018 and cannot be reused even with different fixed parts.
1019 :type strict_ivs: bool
1021 if password is None and key is None \
1022 or password is not None and key is not None :
1023 raise InvalidParameter ("__init__: need either key or password")
1026 if isinstance (key, bytes) is False:
1027 raise InvalidParameter ("__init__: key must be provided as "
1028 "bytes, not %s" % type (key))
1030 raise InvalidParameter ("__init__: salt must be provided along "
1031 "with encryption key")
1032 else: # password, no key
1033 if isinstance (password, str) is False:
1034 raise InvalidParameter ("__init__: password must be a string, not %s"
1036 if len (password) == 0:
1037 raise InvalidParameter ("__init__: supplied empty password but not "
1038 "permitted for PDT encrypted files")
1040 if isinstance (version, int) is False:
1041 raise InvalidParameter ("__init__: version number must be an "
1042 "integer, not %s" % type (version))
1044 raise InvalidParameter ("__init__: version number must be a "
1045 "nonnegative integer, not %d" % version)
1047 if isinstance (paramversion, int) is False:
1048 raise InvalidParameter ("__init__: crypto parameter version number "
1049 "must be an integer, not %s"
1050 % type (paramversion))
1051 if paramversion < 0:
1052 raise InvalidParameter ("__init__: crypto parameter version number "
1053 "must be a nonnegative integer, not %d"
1056 if nacl is not None:
1057 if isinstance (nacl, bytes) is False:
1058 raise InvalidParameter ("__init__: salt given, but of type %s "
1059 "instead of bytes" % type (nacl))
1060 # salt length would depend on the actual encryption so it can’t be
1061 # validated at this point
1063 self.version = version
1064 self.paramenc = ENCRYPTION_PARAMETERS.get (paramversion) ["enc"]
1066 super().__init__ (password, key, paramversion, nacl, counter=counter,
1067 strict_ivs=strict_ivs)
1070 def next_fixed (self, retries=PDTCRYPT_IV_GEN_MAX_RETRIES):
1072 Generate the next IV fixed part by reading eight bytes from
1073 ``/dev/urandom``. The buffer so obtained is tested against the fixed
1074 parts used so far to prevent accidental reuse of IVs. After a
1075 configurable number of attempts to create a unique fixed part, it will
1076 refuse to continue with an ``IVFixedPartError``. This is unlikely to
1077 ever happen on a normal system but may detect an issue with the random
1080 The list of fixed parts that were used by the context at hand can be
1081 accessed through the ``.fixed`` list. Its last element is the fixed
1082 part currently in use.
1086 fp = os.urandom (PDTCRYPT_IV_FIXEDPART_SIZE)
1087 if fp not in self.fixed:
1088 self.fixed.append (fp)
1091 raise IVFixedPartError ("error obtaining a unique IV fixed part from "
1092 "/dev/urandom; giving up after %d tries" % i)
1097 Construct a 12-bytes IV from the current fixed part and the object
1100 return struct.pack(FMT_I2N_IV, self.fixed [-1], self.cnt)
1103 def next (self, filename=None, counter=None):
1105 Prepare for encrypting the next incoming object. Update the counter
1106 and put together the IV, possibly changing prefixes. Then create the
1109 The argument ``counter`` can be used to specify a file counter for this
1110 object. Unless it is one of the reserved values, the counter of
1111 subsequent objects will be computed from this one.
1113 If this is the first object in a series, ``filename`` is required,
1114 otherwise it is reused if not present. The value is used to derive a
1115 header sized placeholder to use until after encryption when all the
1116 inputs to construct the final header are available. This is then
1117 matched in ``.done()`` against the value found at the position of the
1118 header. The motivation for this extra check is primarily to assist
1119 format debugging: It makes stray headers easy to spot in malformed
1122 if filename is None:
1123 if self.lastinfo is None:
1124 raise InvalidParameter ("next: filename is mandatory for "
1126 filename, _dummy = self.lastinfo
1128 if isinstance (filename, str) is False:
1129 raise InvalidParameter ("next: filename must be a string, no %s"
1131 if counter is not None:
1132 if isinstance (counter, int) is False:
1133 raise InvalidParameter ("next: the supplied counter is of "
1134 "invalid type %s; please pass an "
1135 "integer instead" % type (counter))
1136 self.set_object_counter (counter)
1138 self.iv = self.iv_make ()
1139 if self.paramenc == "aes-gcm":
1141 ( algorithms.AES (self.key)
1142 , modes.GCM (self.iv)
1143 , backend = default_backend ()) \
1145 elif self.paramenc == "passthrough":
1146 self.enc = PassthroughCipher ()
1148 raise InvalidParameter ("next: parameter version %d not known"
1149 % self.paramversion)
1150 hdrdum = hdr_make_dummy (filename)
1151 self.lastinfo = (filename, hdrdum)
1152 super().next (self.password, self.paramversion, self.nacl, self.iv)
1154 self.set_object_counter (self.cnt + 1)
1158 def done (self, cmpdata):
1160 Complete encryption of an object. After this has been called, attempts
1161 of encrypting further data will cause an error until ``.next()`` is
1164 Returns a 64 bytes buffer containing the object header including all
1165 values including the “late” ones e. g. the ciphertext size and the
1168 if isinstance (cmpdata, bytes) is False:
1169 raise InvalidParameter ("done: comparison input expected as bytes, "
1170 "not %s" % type (cmpdata))
1171 if self.lastinfo is None:
1172 raise RuntimeError ("done: encryption context not initialized")
1173 filename, hdrdum = self.lastinfo
1174 if cmpdata != hdrdum:
1175 raise RuntimeError ("done: bad sync of header for object %d: "
1176 "preliminary data does not match; this likely "
1177 "indicates a wrongly repositioned stream"
1179 data = self.enc.finalize ()
1180 self.stats ["out"] += len (data)
1181 self.ctsize += len (data)
1182 ok, hdr = hdr_from_params (self.version, self.paramversion, self.nacl,
1183 self.iv, self.ctsize, self.enc.tag)
1185 raise InternalError ("error constructing header: %r" % hdr)
1186 return data, hdr, self.fixed
1189 def process (self, buf):
1191 Encrypt a chunk of plaintext with the active encryptor. Returns the
1192 size of the input consumed. This **must** be checked downstream. If the
1193 maximum possible object size has been reached, the current context must
1194 be finalized and a new one established before any further data can be
1195 encrypted. The second argument is the remainder of the plaintext that
1196 was not encrypted for the caller to use immediately after the new
1199 if isinstance (buf, bytes) is False:
1200 raise InvalidParameter ("process: expected byte buffer, not %s"
1203 newptsize = self.ptsize + bsize
1204 diff = newptsize - PDTCRYPT_MAX_OBJ_SIZE
1207 newptsize = PDTCRYPT_MAX_OBJ_SIZE
1208 self.ptsize = newptsize
1209 data = super().process (buf [:bsize])
1210 self.ctsize += len (data)
1214 class Decrypt (Crypto):
1216 tag = None # GCM tag, part of header
1217 last_iv = None # check consecutive ivs in strict mode
1219 def __init__ (self, password=None, key=None, counter=None, fixedparts=None,
1222 Sanitizing ctor for the decryption context. ``fixedparts`` specifies a
1223 list of IV fixed parts accepted during decryption. If a fixed part is
1224 encountered that is not in the list, decryption will fail.
1226 :param password: mutually exclusive with ``key``
1227 :type password: bytes
1228 :param key: mutually exclusive with ``password``
1230 :type counter: initial object counter the values
1231 ``AES_GCM_IV_CNT_INFOFILE`` and
1232 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
1233 and cannot be reused even with different fixed parts.
1234 :type fixedparts: bytes list
1236 if password is None and key is None \
1237 or password is not None and key is not None :
1238 raise InvalidParameter ("__init__: need either key or password")
1241 if isinstance (key, bytes) is False:
1242 raise InvalidParameter ("__init__: key must be provided as "
1243 "bytes, not %s" % type (key))
1244 else: # password, no key
1245 if isinstance (password, str) is False:
1246 raise InvalidParameter ("__init__: password must be a string, not %s"
1248 if len (password) == 0:
1249 raise InvalidParameter ("__init__: supplied empty password but not "
1250 "permitted for PDT encrypted files")
1252 if fixedparts is not None:
1253 if isinstance (fixedparts, list) is False:
1254 raise InvalidParameter ("__init__: IV fixed parts must be "
1255 "supplied as list, not %s"
1256 % type (fixedparts))
1257 self.fixed = fixedparts
1260 super().__init__ (password=password, key=key, counter=counter,
1261 strict_ivs=strict_ivs)
1264 def valid_fixed_part (self, iv):
1266 Check if a fixed part was already seen.
1268 # check if fixed part is known
1269 fixed, _cnt = struct.unpack (FMT_I2N_IV, iv)
1270 i = bisect.bisect_left (self.fixed, fixed)
1271 return i != len (self.fixed) and self.fixed [i] == fixed
1274 def check_consecutive_iv (self, iv):
1276 Check whether the counter part of the given IV is indeed the successor
1277 of the currently present counter. This should always be the case for
1278 the objects in a well formed PDT archive but should not be enforced
1279 when decrypting out-of-order.
1281 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
1282 if self.strict_ivs is True \
1283 and self.last_iv is not None \
1284 and self.last_iv [0] == fixed \
1285 and self.last_iv [1] != cnt - 1:
1286 raise NonConsecutiveIV ("iv %s counter not successor of "
1287 "last object (expected %d, found %d)"
1288 % (iv_fmt (self.last_iv [1]), cnt))
1289 self.last_iv = (iv, cnt)
1292 def next (self, hdr):
1294 Start decrypting the next object. The PDTCRYPT header for the object
1295 can be given either as already parsed object or as bytes.
1297 if isinstance (hdr, bytes) is True:
1298 hdr = hdr_read (hdr)
1299 elif isinstance (hdr, dict) is False:
1300 # this won’t catch malformed specs though
1301 raise InvalidParameter ("next: wrong type of parameter hdr: "
1302 "expected bytes or spec, got %s"
1305 paramversion = hdr ["paramversion"]
1310 raise InvalidHeader ("next: not a header %r" % hdr)
1312 super().next (self.password, paramversion, nacl, iv)
1313 if self.fixed is not None and self.valid_fixed_part (iv) is False:
1314 raise InvalidIVFixedPart ("iv %s has invalid fixed part"
1316 self.check_consecutive_iv (iv)
1319 defs = ENCRYPTION_PARAMETERS.get (paramversion, None)
1321 raise FormatError ("header contains unknown parameter version %d; "
1322 "maybe the file was created by a more recent "
1323 "version of Deltatar" % paramversion)
1325 if enc == "aes-gcm":
1327 ( algorithms.AES (self.key)
1328 , modes.GCM (iv, tag=self.tag)
1329 , backend = default_backend ()) \
1331 elif enc == "passthrough":
1332 self.enc = PassthroughCipher ()
1334 raise InternalError ("encryption parameter set %d refers to unknown "
1335 "mode %r" % (paramversion, enc))
1336 self.set_object_counter (self.cnt + 1)
1339 def done (self, tag=None):
1341 Stop decryption of the current object and finalize it with the active
1342 context. This will throw an *InvalidGCMTag* exception to indicate that
1343 the authentication tag does not match the data. If the tag is correct,
1344 the rest of the plaintext is returned.
1349 data = self.enc.finalize ()
1351 if isinstance (tag, bytes) is False:
1352 raise InvalidParameter ("done: wrong type of parameter "
1353 "tag: expected bytes, got %s"
1355 data = self.enc.finalize_with_tag (self.tag)
1356 except cryptography.exceptions.InvalidTag:
1357 raise InvalidGCMTag ("done: tag mismatch of object %d: %s "
1358 "rejected by finalize ()"
1359 % (self.cnt, binascii.hexlify (self.tag)))
1360 self.ctsize += len (data)
1361 self.stats ["out"] += len (data)
1365 def process (self, buf):
1367 Decrypt the bytes object *buf* with the active decryptor.
1369 if isinstance (buf, bytes) is False:
1370 raise InvalidParameter ("process: expected byte buffer, not %s"
1372 self.ctsize += len (buf)
1373 data = super().process (buf)
1374 self.ptsize += len (data)
1378 ###############################################################################
1380 ###############################################################################
1382 def _patch_global (glob, vow, n=None):
1384 Adapt upper file counter bound for testing IV logic. Completely unsafe.
1386 assert vow == "I am fully aware that this will void my warranty."
1387 r = globals () [glob]
1389 n = globals () [glob + "_DEFAULT"]
1390 globals () [glob] = n
1393 _testing_set_AES_GCM_IV_CNT_MAX = \
1394 partial (_patch_global, "AES_GCM_IV_CNT_MAX")
1396 _testing_set_PDTCRYPT_MAX_OBJ_SIZE = \
1397 partial (_patch_global, "PDTCRYPT_MAX_OBJ_SIZE")
1399 def open2_dump_file (fname, dir_fd, force=False):
1402 oflags = os.O_CREAT | os.O_WRONLY
1404 oflags |= os.O_TRUNC
1409 outfd = os.open (fname, oflags,
1410 stat.S_IRUSR | stat.S_IWUSR, dir_fd=dir_fd)
1411 except FileExistsError as exn:
1412 noise ("PDT: refusing to overwrite existing file %s" % fname)
1414 raise RuntimeError ("destination file %s already exists" % fname)
1415 if PDTCRYPT_VERBOSE is True:
1416 noise ("PDT: new output file %s (fd=%d)" % (fname, outfd))
1420 ###############################################################################
1421 ## freestanding invocation
1422 ###############################################################################
1424 PDTCRYPT_SUB_PROCESS = 0
1425 PDTCRYPT_SUB_SCRYPT = 1
1426 PDTCRYPT_SUB_SCAN = 2
1429 { "process" : PDTCRYPT_SUB_PROCESS
1430 , "scrypt" : PDTCRYPT_SUB_SCRYPT
1431 , "scan" : PDTCRYPT_SUB_SCAN }
1433 PDTCRYPT_SECRET_PW = 0
1434 PDTCRYPT_SECRET_KEY = 1
1436 PDTCRYPT_DECRYPT = 1 << 0 # decrypt archive with password
1437 PDTCRYPT_SPLIT = 1 << 1 # split archive into individual objects
1438 PDTCRYPT_HASH = 1 << 2 # output scrypt hash for file and given password
1440 PDTCRYPT_SPLITNAME = "pdtcrypt-object-%d.bin"
1441 PDTCRYPT_RESCUENAME = "pdtcrypt-rescue-object-%0.5d.bin"
1443 PDTCRYPT_VERBOSE = False
1444 PDTCRYPT_STRICTIVS = False
1445 PDTCRYPT_OVERWRITE = False
1446 PDTCRYPT_BLOCKSIZE = 1 << 12
1451 PDTCRYPT_DEFAULT_VER = 1
1452 PDTCRYPT_DEFAULT_PVER = 1
1454 # scrypt hashing output control
1455 PDTCRYPT_SCRYPT_INTRANATOR = 0
1456 PDTCRYPT_SCRYPT_PARAMETERS = 1
1457 PDTCRYPT_SCRYPT_DEFAULT = PDTCRYPT_SCRYPT_INTRANATOR
1459 PDTCRYPT_SCRYPT_FORMAT = \
1460 { "i2n" : PDTCRYPT_SCRYPT_INTRANATOR
1461 , "params" : PDTCRYPT_SCRYPT_PARAMETERS }
1463 PDTCRYPT_TT_COLUMNS = 80 # assume standard terminal
1465 class PDTDecryptionError (Exception):
1466 """Decryption failed."""
1468 class PDTSplitError (Exception):
1469 """Decryption failed."""
1472 def noise (*a, **b):
1473 print (file=sys.stderr, *a, **b)
1476 class PassthroughDecryptor (object):
1478 curhdr = None # write current header on first data write
1480 def __init__ (self):
1481 if PDTCRYPT_VERBOSE is True:
1482 noise ("PDT: no encryption; data passthrough")
1484 def next (self, hdr):
1485 ok, curhdr = hdr_make (hdr)
1487 raise PDTDecryptionError ("bad header %r" % hdr)
1488 self.curhdr = curhdr
1491 if self.curhdr is not None:
1495 def process (self, d):
1496 if self.curhdr is not None:
1502 def depdtcrypt (mode, secret, ins, outs):
1504 Remove PDTCRYPT layer from all objects encrypted with the secret. Used on a
1505 Deltatar backup this will yield a (possibly Gzip compressed) tarball.
1507 ctleft = -1 # length of ciphertext to consume
1508 ctcurrent = 0 # total ciphertext of current object
1509 total_obj = 0 # total number of objects read
1510 total_pt = 0 # total plaintext bytes
1511 total_ct = 0 # total ciphertext bytes
1512 total_read = 0 # total bytes read
1513 outfile = None # Python file object for output
1515 if mode & PDTCRYPT_DECRYPT: # decryptor
1517 if ks == PDTCRYPT_SECRET_PW:
1518 decr = Decrypt (password=secret [1], strict_ivs=PDTCRYPT_STRICTIVS)
1519 elif ks == PDTCRYPT_SECRET_KEY:
1520 key = binascii.unhexlify (secret [1])
1521 decr = Decrypt (key=key, strict_ivs=PDTCRYPT_STRICTIVS)
1523 raise InternalError ("‘%d’ does not specify a valid kind of secret"
1526 decr = PassthroughDecryptor ()
1529 """Dummy for non-split mode: output file does not vary."""
1532 if mode & PDTCRYPT_SPLIT:
1533 def nextout (outfile):
1535 We were passed an fd as outs for accessing the destination
1536 directory where extracted archive components are supposed
1541 if PDTCRYPT_VERBOSE is True:
1542 noise ("PDT: no output file to close at this point")
1544 if PDTCRYPT_VERBOSE is True:
1545 noise ("PDT: release output file %r" % outfile)
1546 # cleanup happens automatically by the GC; the next
1547 # line will error out on account of an invalid fd
1550 assert total_obj > 0
1551 fname = PDTCRYPT_SPLITNAME % total_obj
1553 outfd = open2_dump_file (fname, outs, force=PDTCRYPT_OVERWRITE)
1554 except RuntimeError as exn:
1555 raise PDTSplitError (exn)
1556 return os.fdopen (outfd, "wb", closefd=True)
1560 """ESPIPE is normal on non-seekable stdio stream."""
1563 except OSError as exn:
1564 if exn.errno == os.errno.ESPIPE:
1567 def out (pt, outfile):
1571 if PDTCRYPT_VERBOSE is True:
1572 noise ("PDT:\t· decrypt plaintext %d B" % (npt))
1574 nn = outfile.write (pt)
1575 except OSError as exn: # probably ENOSPC
1576 raise DecryptionError ("error (%s)" % exn)
1578 raise DecryptionError ("write aborted after %d of %d B" % (nn, npt))
1582 # current object completed; in a valid archive this marks either
1583 # the start of a new header or the end of the input
1584 if ctleft == 0: # current object requires finalization
1585 if PDTCRYPT_VERBOSE is True:
1586 noise ("PDT: %d finalize" % tell (ins))
1589 except InvalidGCMTag as exn:
1590 raise DecryptionError ("error finalizing object %d (%d B): "
1591 "%r" % (total_obj, len (pt), exn)) \
1594 if PDTCRYPT_VERBOSE is True:
1595 noise ("PDT:\t· object validated")
1597 if PDTCRYPT_VERBOSE is True:
1598 noise ("PDT: %d hdr" % tell (ins))
1600 hdr = hdr_read_stream (ins)
1601 total_read += PDTCRYPT_HDR_SIZE
1602 except EndOfFile as exn:
1603 total_read += exn.remainder
1604 if total_ct + total_obj * PDTCRYPT_HDR_SIZE != total_read:
1605 raise PDTDecryptionError ("ciphertext processed (%d B) plus "
1606 "overhead (%d × %d B) does not match "
1607 "the number of bytes read (%d )"
1608 % (total_ct, total_obj, PDTCRYPT_HDR_SIZE,
1610 # the single good exit
1611 return total_read, total_obj, total_ct, total_pt
1612 except InvalidHeader as exn:
1613 raise PDTDecryptionError ("invalid header at position %d in %r "
1614 "(%s)" % (tell (ins), exn, ins))
1615 if PDTCRYPT_VERBOSE is True:
1616 pretty = hdr_fmt_pretty (hdr)
1617 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1618 pretty.splitlines (), ""))
1619 ctcurrent = ctleft = hdr ["ctsize"]
1623 total_obj += 1 # used in file counter with split mode
1625 # finalization complete or skipped in case of first object in
1626 # stream; create a new output file if necessary
1627 outfile = nextout (outfile)
1629 if PDTCRYPT_VERBOSE is True:
1630 noise ("PDT: %d decrypt obj no. %d, %d B"
1631 % (tell (ins), total_obj, ctleft))
1633 # always allocate a new buffer since python-cryptography doesn’t allow
1634 # passing a bytearray :/
1635 nexpect = min (ctleft, PDTCRYPT_BLOCKSIZE)
1636 if PDTCRYPT_VERBOSE is True:
1637 noise ("PDT:\t· [%d] %d%% done, read block (%d B of %d B remaining)"
1639 100 - ctleft * 100 / (ctcurrent > 0 and ctcurrent or 1),
1641 ct = ins.read (nexpect)
1645 raise EndOfFile (nct,
1646 "hit EOF after %d of %d B in block [%d:%d); "
1647 "%d B ciphertext remaining for object no %d"
1648 % (nct, nexpect, off, off + nexpect, ctleft,
1654 if PDTCRYPT_VERBOSE is True:
1655 noise ("PDT:\t· decrypt ciphertext %d B" % (nct))
1656 pt = decr.process (ct)
1660 def deptdcrypt_mk_stream (kind, path):
1661 """Create stream from file or stdio descriptor."""
1662 if kind == PDTCRYPT_SINK:
1664 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: stdout")
1665 return sys.stdout.buffer
1667 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: file %s" % path)
1668 return io.FileIO (path, "w")
1669 if kind == PDTCRYPT_SOURCE:
1671 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: stdin")
1672 return sys.stdin.buffer
1674 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: file %s" % path)
1675 return io.FileIO (path, "r")
1677 raise ValueError ("bogus stream “%s” / %s" % (kind, path))
1680 def mode_depdtcrypt (mode, secret, ins, outs):
1682 total_read, total_obj, total_ct, total_pt = \
1683 depdtcrypt (mode, secret, ins, outs)
1684 except DecryptionError as exn:
1685 noise ("PDT: Decryption failed:")
1687 noise ("PDT: “%s”" % exn)
1689 noise ("PDT: Did you specify the correct key / password?")
1692 except PDTSplitError as exn:
1693 noise ("PDT: Split operation failed:")
1695 noise ("PDT: “%s”" % exn)
1697 noise ("PDT: Hint: target directory should be empty.")
1701 if PDTCRYPT_VERBOSE is True:
1702 noise ("PDT: decryption successful" )
1703 noise ("PDT: %.10d bytes read" % total_read)
1704 noise ("PDT: %.10d objects decrypted" % total_obj )
1705 noise ("PDT: %.10d bytes ciphertext" % total_ct )
1706 noise ("PDT: %.10d bytes plaintext" % total_pt )
1712 def mode_scrypt (pw, ins=None, nacl=None, fmt=PDTCRYPT_SCRYPT_INTRANATOR):
1714 paramversion = PDTCRYPT_DEFAULT_PVER
1716 hsh, nacl, version, paramversion = scrypt_hashsource (pw, ins)
1717 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1719 nacl = binascii.unhexlify (nacl)
1720 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1721 version = PDTCRYPT_DEFAULT_VER
1723 kdfname, params = defs ["kdf"]
1725 kdf = kdf_by_version (None, defs)
1726 hsh, _void = kdf (pw, nacl)
1730 if fmt == PDTCRYPT_SCRYPT_INTRANATOR:
1731 out = json.dumps ({ "salt" : base64.b64encode (nacl).decode ()
1732 , "key" : base64.b64encode (hsh) .decode ()
1733 , "paramversion" : paramversion })
1734 elif fmt == PDTCRYPT_SCRYPT_PARAMETERS:
1735 out = json.dumps ({ "salt" : binascii.hexlify (nacl).decode ()
1736 , "key" : binascii.hexlify (hsh) .decode ()
1737 , "version" : version
1738 , "scrypt_params" : { "N" : params ["N"]
1739 , "r" : params ["r"]
1740 , "p" : params ["p"]
1741 , "dkLen" : params ["dkLen"] } })
1743 raise RuntimeError ("bad scrypt output scheme %r" % fmt)
1748 def noise_output_candidates (cands, indent=8, cols=PDTCRYPT_TT_COLUMNS):
1750 Print a list of offsets without garbling the terminal too much.
1752 The indent is counted from column zero; if it is wide enough, the “PDT: ”
1753 marker will be prepended, considered part of the indentation.
1757 idt = " " * indent if indent < 5 else "PDT: " + " " * (indent - 5)
1762 init = True # prevent leading separator
1765 raise ValueError ("the requested indentation exceeds the line "
1766 "width by %d" % (indent - wd))
1776 if lpos > wd: # line break
1792 def mode_scan (secret, fname, outs=None, nacl=None):
1794 Dissect a binary file, looking for PDTCRYPT headers and objects.
1796 If *outs* is supplied, recoverable data will be dumped into the specified
1800 ifd = os.open (fname, os.O_RDONLY)
1801 except FileNotFoundError:
1802 noise ("PDT: failed to open %s readonly" % fname)
1807 if PDTCRYPT_VERBOSE is True:
1808 noise ("PDT: scan for potential sync points")
1809 cands = locate_hdr_candidates (ifd)
1810 if len (cands) == 0:
1811 noise ("PDT: scan complete: input does not contain potential PDT "
1812 "headers; giving up.")
1814 if PDTCRYPT_VERBOSE is True:
1815 noise ("PDT: scan complete: found %d candidates:" % len (cands))
1816 noise_output_candidates (cands)
1826 vdt, hdr = inspect_hdr (ifd, cand)
1827 if vdt == HDR_CAND_JUNK:
1830 off0 = cand + PDTCRYPT_HDR_SIZE
1831 if PDTCRYPT_VERBOSE is True:
1832 noise ("PDT: obj %d: read payload @%d" % (nobj, off0))
1833 pretty = hdr_fmt_pretty (hdr)
1834 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1835 pretty.splitlines (), ""))
1838 if outs is not None:
1839 ofname = PDTCRYPT_RESCUENAME % nobj
1840 ofd = open2_dump_file (ofname, outs, force=PDTCRYPT_OVERWRITE)
1843 ok = try_decrypt (ifd, off0, hdr, secret, ofd=ofd) == hdr ["ctsize"]
1847 if vdt == HDR_CAND_GOOD and ok is True:
1848 noise ("PDT: %d → ✓ valid object %d–%d"
1849 % (cand, off0, off0 + hdr ["ctsize"]))
1850 elif vdt == HDR_CAND_FISHY and ok is True:
1851 noise ("PDT: %d → × object %d–%d, corrupt header"
1852 % (cand, off0, off0 + hdr ["ctsize"]))
1853 elif vdt == HDR_CAND_GOOD and ok is False:
1854 noise ("PDT: %d → × object %d–%d, problematic payload"
1855 % (cand, off0, off0 + hdr ["ctsize"]))
1856 elif vdt == HDR_CAND_FISHY and ok is False:
1857 noise ("PDT: %d → × object %d–%d, corrupt header, problematic "
1858 "ciphertext" % (cand, off0, off0 + hdr ["ctsize"]))
1865 noise ("PDT: all headers ok")
1867 noise ("PDT: %d candidates not parseable as headers:" % len (junk))
1868 noise_output_candidates (junk)
1870 def usage (err=False):
1874 indent = ' ' * len (SELF)
1875 out ("usage: %s SUBCOMMAND { --help" % SELF)
1876 out (" %s | [ -v ] { -p PASSWORD | -k KEY }" % indent)
1877 out (" %s [ { -i | --in } { - | SOURCE } ]" % indent)
1878 out (" %s [ { -n | --nacl } { SALT } ]" % indent)
1879 out (" %s [ { -o | --out } { - | DESTINATION } ]" % indent)
1880 out (" %s [ -D | --no-decrypt ] [ -S | --split ]" % indent)
1881 out (" %s [ -f | --format ]" % indent)
1884 out ("\t\tSUBCOMMAND main mode: { process | scrypt }")
1886 out ("\t\t process: extract objects from PDT archive")
1887 out ("\t\t scrypt: calculate hash from password and first object")
1888 out ("\t\t-p PASSWORD password to derive the encryption key from")
1889 out ("\t\t-k KEY encryption key as 16 bytes in hexadecimal notation")
1890 out ("\t\t-s enforce strict handling of initialization vectors")
1891 out ("\t\t-i SOURCE file name to read from")
1892 out ("\t\t-o DESTINATION file to write output to")
1893 out ("\t\t-n SALT provide salt for scrypt mode in hex encoding")
1894 out ("\t\t-v print extra info")
1895 out ("\t\t-S split into files at object boundaries; this")
1896 out ("\t\t requires DESTINATION to refer to directory")
1897 out ("\t\t-D PDT header and ciphertext passthrough")
1898 out ("\t\t-f format of SCRYPT hash output (“default” or “parameters”)")
1900 out ("\tinstead of filenames, “-” may used to specify stdin / stdout")
1902 sys.exit ((err is True) and 42 or 0)
1912 def parse_argv (argv):
1913 global PDTCRYPT_OVERWRITE
1915 mode = PDTCRYPT_DECRYPT
1921 scrypt_format = PDTCRYPT_SCRYPT_DEFAULT
1924 SELF = os.path.basename (next (argvi))
1927 rawsubcmd = next (argvi)
1928 subcommand = PDTCRYPT_SUB [rawsubcmd]
1929 except StopIteration:
1930 bail ("ERROR: subcommand required")
1932 bail ("ERROR: invalid subcommand “%s” specified" % rawsubcmd)
1938 except StopIteration:
1939 bail ("ERROR: argument list incomplete")
1941 def checked_secret (t, arg):
1946 bail ("ERROR: encountered “%s” but secret already given" % arg)
1949 if arg in [ "-h", "--help" ]:
1952 elif arg in [ "-v", "--verbose", "--wtf" ]:
1953 global PDTCRYPT_VERBOSE
1954 PDTCRYPT_VERBOSE = True
1955 elif arg in [ "-i", "--in", "--source" ]:
1956 insspec = checked_arg ()
1957 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt from %s" % insspec)
1958 elif arg in [ "-p", "--password" ]:
1959 arg = checked_arg ()
1960 checked_secret (PDTCRYPT_SECRET_PW, arg)
1961 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with password")
1963 if subcommand == PDTCRYPT_SUB_PROCESS:
1964 if arg in [ "-s", "--strict-ivs" ]:
1965 global PDTCRYPT_STRICTIVS
1966 PDTCRYPT_STRICTIVS = True
1967 elif arg in [ "-o", "--out", "--dest", "--sink" ]:
1968 outsspec = checked_arg ()
1969 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
1970 elif arg in [ "-f", "--force" ]:
1971 PDTCRYPT_OVERWRITE = True
1972 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
1973 elif arg in [ "-S", "--split" ]:
1974 mode |= PDTCRYPT_SPLIT
1975 if PDTCRYPT_VERBOSE is True: noise ("PDT: split files")
1976 elif arg in [ "-D", "--no-decrypt" ]:
1977 mode &= ~PDTCRYPT_DECRYPT
1978 if PDTCRYPT_VERBOSE is True: noise ("PDT: not decrypting")
1979 elif arg in [ "-k", "--key" ]:
1980 arg = checked_arg ()
1981 checked_secret (PDTCRYPT_SECRET_KEY, arg)
1982 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with key")
1984 bail ("ERROR: unexpected positional argument “%s”" % arg)
1985 elif subcommand == PDTCRYPT_SUB_SCRYPT:
1986 if arg in [ "-n", "--nacl", "--salt" ]:
1987 nacl = checked_arg ()
1988 if PDTCRYPT_VERBOSE is True: noise ("PDT: salt key with %s" % nacl)
1989 elif arg in [ "-f", "--format" ]:
1990 arg = checked_arg ()
1992 scrypt_format = PDTCRYPT_SCRYPT_FORMAT [arg]
1994 bail ("ERROR: invalid scrypt output format %s" % arg)
1995 if PDTCRYPT_VERBOSE is True:
1996 noise ("PDT: scrypt output format “%s”" % scrypt_format)
1998 bail ("ERROR: unexpected positional argument “%s”" % arg)
1999 elif subcommand == PDTCRYPT_SUB_SCAN:
2000 if arg in [ "-o", "--out", "--dest", "--sink" ]:
2001 outsspec = checked_arg ()
2002 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
2003 elif arg in [ "-f", "--force" ]:
2004 PDTCRYPT_OVERWRITE = True
2005 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
2007 bail ("ERROR: unexpected positional argument “%s”" % arg)
2010 if PDTCRYPT_VERBOSE is True:
2011 noise ("ERROR: no password or key specified, trying $PDTCRYPT_PASSWORD")
2012 epw = os.getenv ("PDTCRYPT_PASSWORD")
2014 checked_secret (PDTCRYPT_SECRET_PW, epw.strip ())
2017 if PDTCRYPT_VERBOSE is True:
2018 noise ("ERROR: no password or key specified, trying $PDTCRYPT_KEY")
2019 ek = os.getenv ("PDTCRYPT_KEY")
2021 checked_secret (PDTCRYPT_SECRET_KEY, ek.strip ())
2024 if subcommand == PDTCRYPT_SUB_SCRYPT:
2025 bail ("ERROR: scrypt hash mode requested but no password given")
2026 elif mode & PDTCRYPT_DECRYPT:
2027 bail ("ERROR: encryption requested but no password given")
2029 if mode & PDTCRYPT_SPLIT and outsspec is None:
2030 bail ("ERROR: split mode is incompatible with stdout sink "
2033 if subcommand == PDTCRYPT_SUB_SCAN and outsspec is None:
2034 pass # no output by default in scan mode
2035 elif mode & PDTCRYPT_SPLIT or subcommand == PDTCRYPT_SUB_SCAN:
2036 # destination must be directory
2038 bail ("ERROR: mode is incompatible with stdout sink")
2041 os.makedirs (outsspec, 0o700)
2042 except FileExistsError:
2043 # if it’s a directory with appropriate perms, everything is
2044 # good; otherwise, below invocation of open(2) will fail
2046 outs = os.open (outsspec, os.O_DIRECTORY, 0o600)
2047 except FileNotFoundError as exn:
2048 bail ("ERROR: cannot create target directory “%s”" % outsspec)
2049 except NotADirectoryError as exn:
2050 bail ("ERROR: target path “%s” is not a directory" % outsspec)
2052 outs = deptdcrypt_mk_stream (PDTCRYPT_SINK, outsspec or "-")
2054 if subcommand == PDTCRYPT_SUB_SCAN:
2056 bail ("ERROR: please supply an input file for scanning")
2058 bail ("ERROR: input must be seekable; please specify a file")
2059 return True, partial (mode_scan, secret, insspec, outs, nacl=nacl)
2061 if subcommand == PDTCRYPT_SUB_SCRYPT:
2062 if secret [0] == PDTCRYPT_SECRET_KEY:
2063 bail ("ERROR: scrypt mode requires a password")
2064 if insspec is not None and nacl is not None \
2065 or insspec is None and nacl is None :
2066 bail ("ERROR: please supply either an input file or "
2071 if insspec is not None or subcommand != PDTCRYPT_SUB_SCRYPT:
2072 ins = deptdcrypt_mk_stream (PDTCRYPT_SOURCE, insspec or "-")
2074 if subcommand == PDTCRYPT_SUB_SCRYPT:
2075 return True, partial (mode_scrypt, secret [1].encode (), ins, nacl,
2078 return True, partial (mode_depdtcrypt, mode, secret, ins, outs)
2082 ok, runner = parse_argv (argv)
2084 if ok is True: return runner ()
2089 if __name__ == "__main__":
2090 sys.exit (main (sys.argv))