===============================================================================
crypto -- Encryption Layer for the Deltatar Backup
===============================================================================
- AES-GCM for the symmetric encryption;

- NIST Recommendation for Block Cipher Modes of Operation: Galois/Counter
  Mode (GCM) and GMAC
  http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf

- https://cryptome.org/2014/01/aes-gcm-v1.pdf

- Authentication weaknesses in GCM
  http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf

Trouble with python-cryptography packages: authentication tags can only be
passed in advance: https://github.com/pyca/cryptography/pull/3421
-------------------------------------------------------------------------------

Errors fall into roughly three categories:

- Cryptographic errors or invalid data.

    - ``InvalidGCMTag`` (decryption failed on account of an invalid GCM
      tag),
    - ``InvalidIVFixedPart`` (IV fixed part of object not found in list),
    - ``DuplicateIV`` (the IV of an encrypted object already occurred),
    - ``DecryptionError`` (used in CLI decryption for presenting error
      conditions to the user).

- Incorrect usage of the library.

    - ``InvalidParameter`` (non-conforming user supplied parameter),
    - ``InvalidHeader`` (data passed for reading not parsable into header),
    - ``FormatError`` (cannot handle header or parameter version),

- Bad internal state. If one of these is encountered it means that a state
  was reached that shouldn’t occur during normal processing.

Also, ``EndOfFile`` is used as a sentinel to communicate that a stream supplied
for reading is exhausted.

Initialization Vectors
-------------------------------------------------------------------------------

Initialization vectors are checked for reuse during the lifetime of a decryptor.
The fixed counters for metadata files cannot be reused and attempts to do so
will cause a ``DuplicateIV`` error. This means the length of objects encrypted
with a metadata counter is capped at 63 GB.
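
As a rough check of the 63 GB figure, the arithmetic can be sketched as
follows. The values mirror the constants ``AES_GCM_MAX_SIZE`` and
``PDTCRYPT_MAX_OBJ_SIZE_DEFAULT`` defined further down in this module; the
names used here are illustrative only, and “GB” is read as 2³⁰ bytes.

```python
# Illustrative names; the values follow the module constants.
GIB = 1 << 30

# GCM limits a single invocation to 2^39 - 256 bits of plaintext,
# which is 2^36 - 32 bytes, just short of 64 GiB.
GCM_MAX_BITS = (1 << 39) - (1 << 8)
GCM_MAX_BYTES = (1 << 36) - (1 << 5)

# The per-object cap of 63 GiB stays below that limit.
OBJ_CAP_BYTES = 63 * GIB

headroom = GCM_MAX_BYTES - OBJ_CAP_BYTES
print("GCM limit : %d B" % GCM_MAX_BYTES)
print("object cap: %d B" % OBJ_CAP_BYTES)
print("headroom  : %d B" % headroom)
```
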

For ordinary, non-metadata payload, there is an optional mode with strict IV
checking that causes a crypto context to fail if an IV encountered or created
was already used for decrypting or encrypting, respectively, an earlier object.
Note that this mode can trigger false positives when decrypting non-linearly,
e.g. when traversing the same object multiple times. Since the crypto context
has no notion of a position in a PDT encrypted archive, this condition must be
sorted out downstream.
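
The strict mode can be sketched as follows. This is a minimal illustration,
not the module’s implementation: the class ``IVTracker`` and the helper
``make_iv`` are hypothetical, and ``DuplicateIV`` is a local stand-in. Only the
IV layout (an 8 byte fixed part followed by a 32 bit little endian counter,
``"<8sL"``) is taken from the module.

```python
import os
import struct

IV_FMT = "<8sL"  # 8 byte fixed part followed by a 32 bit counter (12 bytes)


class DuplicateIV(Exception):
    """Stand-in for the module's exception of the same name."""


def make_iv(fixed, counter):
    """Pack fixed part and object counter into a 12 byte IV."""
    return struct.pack(IV_FMT, fixed, counter)


class IVTracker:
    """Hypothetical helper: remember every IV seen, optionally rejecting repeats."""

    def __init__(self, strict=False):
        self.strict = strict
        self.used = set()

    def check(self, iv):
        if self.strict and iv in self.used:
            raise DuplicateIV("iv %s was reused" % iv.hex())
        self.used.add(iv)


fixed = os.urandom(8)
tracker = IVTracker(strict=True)
tracker.check(make_iv(fixed, 3))
tracker.check(make_iv(fixed, 4))      # new counter, accepted
try:
    tracker.check(make_iv(fixed, 3))  # same object visited a second time
except DuplicateIV as exn:
    print("rejected:", exn)
```

The last call shows the false positive mentioned above: to the tracker,
revisiting an object is indistinguishable from genuine IV reuse.
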

-------------------------------------------------------------------------------

``crypto.py`` may be invoked as a script for decrypting, validating, and
splitting PDT encrypted files. Consult the usage message for details.

Decrypt from stdin using the password ‘foo’: ::

    $ crypto.py process foo -i - -o - <some-file.tar.gz.pdtcrypt >some-file.tar.gz

Output verbose information about the encrypted objects in the archive: ::

    $ crypto.py process foo -v -i some-file.tar.gz.pdtcrypt -o /dev/null
    PDT: decrypt from some-file.tar.gz.pdtcrypt
    PDT: decrypt to /dev/null
    PDT: source: file some-file.tar.gz.pdtcrypt
    PDT: sink: file /dev/null
    PDT: · version = 1 : 0100
    PDT: · paramversion = 1 : 0100
    PDT: · nacl : d270 b031 00d1 87e2 c946 610d 7b7f 7e5f
    PDT: · iv : 02ee 3dd7 a963 1eb1 0100 0000
    PDT: · ctsize = 591 : 4f02 0000 0000 0000
    PDT: · tag : 5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b
    PDT: 64 decrypt obj no. 1, 591 B
    PDT: · [64] 0% done, read block (591 B of 591 B remaining)
    PDT: · decrypt ciphertext 591 B
    PDT: · decrypt plaintext 591 B
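
The fields in the verbose output map onto the 64 byte object header. As a
sketch, such a header can be taken apart with ``struct``. The format string
below is an assumption derived from the field sizes listed in this module
(8 byte magic, two 16 bit integers, 16 byte salt, 12 byte IV, 64 bit
ciphertext size, 16 byte tag, all little endian), and ``parse_header`` is a
hypothetical helper rather than the module’s ``hdr_read``. The sample bytes are
those of the verbose output above.

```python
import binascii
import struct

HDR_FMT = "<8sHH16s12sQ16s"   # assumed layout, 64 bytes in total
IV_FMT = "<8sL"               # fixed part followed by the object counter


def parse_header(data):
    """Split a 64 byte header into its fields (illustrative only)."""
    magic, version, paramversion, nacl, iv, ctsize, tag = \
        struct.unpack(HDR_FMT, data)
    if magic != b"PDTCRYPT":
        raise ValueError("bad magic: %r" % magic)
    fixed, counter = struct.unpack(IV_FMT, iv)
    return {"version": version, "paramversion": paramversion,
            "nacl": nacl, "iv_fixed": fixed, "iv_counter": counter,
            "ctsize": ctsize, "tag": tag}


# Rebuild the header from the hex dump in the verbose output.
raw = b"PDTCRYPT" + binascii.unhexlify(
    "0100" "0100"
    "d270b03100d187e2c946610d7b7f7e5f"
    "02ee3dd7a9631eb101000000"
    "4f02000000000000"
    "5b2d6d8b8f82484212fd0b10b6e3369b")

hdr = parse_header(raw)
print(hdr["version"], hdr["paramversion"], hdr["iv_counter"], hdr["ctsize"])
```
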

Also, the mode *scrypt* allows deriving encryption keys. To calculate the
encryption key from the password ‘foo’ and the salt of the first object in a
PDT encrypted file: ::

    $ crypto.py scrypt foo -i some-file.pdtcrypt
    {"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}

The computed 16 byte key is given in Base64 notation in the value to ``key``
and can be fed into Python’s ``base64.b64decode()`` to obtain the
corresponding binary representation.

Note that in Scrypt hashing mode, no data integrity checks are performed.
If the wrong password is given, a wrong key will be derived. Whether the password
was indeed correct can only be determined by decrypting. Note that since PDT
archives essentially consist of a stream of independent objects, the salt and
other parameters may change. Thus a key derived using the above method from the
first object doesn’t necessarily apply to any of the subsequent objects.
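
For illustration, the JSON emitted in *scrypt* mode can be consumed as
sketched below. The ``salt`` and ``key`` values in the sample output are Base64
strings (note the ``==`` padding), so the sketch decodes them with
``base64.b64decode()``. The helper name ``decode_scrypt_output`` is made up for
this example.

```python
import base64
import json

SAMPLE = ('{"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", '
          '"key": "JH9EkMwaM4x9F5aim5gK/Q=="}')


def decode_scrypt_output(line):
    """Parse one JSON line; return (paramversion, salt, key) with binary fields."""
    obj = json.loads(line)
    salt = base64.b64decode(obj["salt"])
    key = base64.b64decode(obj["key"])
    return obj["paramversion"], salt, key


pver, salt, key = decode_scrypt_output(SAMPLE)
print("paramversion %d, salt %d B, key %d B" % (pver, len(salt), len(key)))
```

The key obtained this way only applies to objects that share the salt and
parameter version of the first header.
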
134 from functools import reduce, partial
144 except ImportError as exn:
if __name__ == "__main__": ## Work around the import mechanism lest Python’s
    pwd = os.getcwd()      ## preference for local imports cause a cyclical
    ## import (crypto → pylibscrypt → […] → ./tarfile → crypto).
    sys.path = [ p for p in sys.path if p.find ("deltatar") < 0 ]
153 from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
154 from cryptography.hazmat.backends import default_backend
158 __all__ = [ "hdr_make", "hdr_read", "hdr_fmt", "hdr_fmt_pretty"
160 , "PDTCRYPT_HDR_SIZE", "AES_GCM_IV_CNT_DATA"
161 , "AES_GCM_IV_CNT_INFOFILE", "AES_GCM_IV_CNT_INDEX"
165 ###############################################################################
167 ###############################################################################
169 class EndOfFile (Exception):
173 def __init__ (self, n=None, msg=None):
179 class InvalidParameter (Exception):
180 """Inputs not valid for PDT encryption."""
184 class InvalidHeader (Exception):
185 """Header not valid."""
189 class InvalidGCMTag (Exception):
191 The GCM tag calculated during decryption differs from that in the object
197 class InvalidIVFixedPart (Exception):
199 IV fixed part not in supplied list: either the backup is corrupt or the
200 current object does not belong to it.
205 class IVFixedPartError (Exception):
207 Error creating a unique IV fixed part: repeated calls to system RNG yielded
208 the same sequence of bytes as the last IV used.
213 class InvalidFileCounter (Exception):
215 When encrypting, an attempted reuse of a dedicated counter (info file,
216 index file) was caught.
221 class DuplicateIV (Exception):
223 During encryption, the current IV fixed part is identical to an already
224 existing IV (same prefix and file counter). This indicates tampering or
225 programmer error and cannot be recovered from.
230 class NonConsecutiveIV (Exception):
232 IVs not numbered consecutively. This is a hard error with strict IV
233 checking. Precludes random access to the encrypted objects.
238 class FormatError (Exception):
239 """Unusable parameters in header."""
243 class DecryptionError (Exception):
244 """Error during decryption with ``crypto.py`` on the command line."""
248 class Unreachable (Exception):
250 Makeshift __builtin_unreachable(); always a programmer error if
256 class InternalError (Exception):
257 """Errors not ascribable to bad user inputs or cryptography."""
261 ###############################################################################
262 ## crypto layer version
263 ###############################################################################
265 ENCRYPTION_PARAMETERS = \
267 { "kdf": ("dummy", 16)
268 , "enc": "passthrough" }
276 , "enc": "aes-gcm" } }
278 ###############################################################################
280 ###############################################################################
PDTCRYPT_HDR_MAGIC = b"PDTCRYPT"

PDTCRYPT_HDR_SIZE_MAGIC        = 8     #  8
PDTCRYPT_HDR_SIZE_VERSION      = 2     # 10
PDTCRYPT_HDR_SIZE_PARAMVERSION = 2     # 12
PDTCRYPT_HDR_SIZE_NACL         = 16    # 28
PDTCRYPT_HDR_SIZE_IV           = 12    # 40
PDTCRYPT_HDR_SIZE_CTSIZE       = 8     # 48
PDTCRYPT_HDR_SIZE_TAG          = 16    # 64 GCM auth tag

PDTCRYPT_HDR_SIZE = PDTCRYPT_HDR_SIZE_MAGIC + PDTCRYPT_HDR_SIZE_VERSION \
                  + PDTCRYPT_HDR_SIZE_PARAMVERSION + PDTCRYPT_HDR_SIZE_NACL \
                  + PDTCRYPT_HDR_SIZE_IV + PDTCRYPT_HDR_SIZE_CTSIZE \
                  + PDTCRYPT_HDR_SIZE_TAG # = 64

# precalculate offsets since Python can’t do constant folding over names
HDR_OFF_VERSION      = PDTCRYPT_HDR_SIZE_MAGIC
HDR_OFF_PARAMVERSION = HDR_OFF_VERSION      + PDTCRYPT_HDR_SIZE_VERSION
HDR_OFF_NACL         = HDR_OFF_PARAMVERSION + PDTCRYPT_HDR_SIZE_PARAMVERSION
HDR_OFF_IV           = HDR_OFF_NACL         + PDTCRYPT_HDR_SIZE_NACL
HDR_OFF_CTSIZE       = HDR_OFF_IV           + PDTCRYPT_HDR_SIZE_IV
HDR_OFF_TAG          = HDR_OFF_CTSIZE       + PDTCRYPT_HDR_SIZE_CTSIZE
307 FMT_I2N_IV = "<8sL" # 8 random bytes ‖ 32 bit counter
FMT_I2N_HDR = ("<"     # little endian byte order, no padding
312 "16s" # sodium chloride
318 AES_GCM_MAX_SIZE = (1 << 36) - (1 << 5) # 2^39 - 2^8 b ≅ 64 GB
319 PDTCRYPT_MAX_OBJ_SIZE_DEFAULT = 63 * (1 << 30) # 63 GB
320 PDTCRYPT_MAX_OBJ_SIZE = PDTCRYPT_MAX_OBJ_SIZE_DEFAULT
# index and info files are written on the fly while encrypting so their
# counters must be available in advance
324 AES_GCM_IV_CNT_INFOFILE = 1 # constant
325 AES_GCM_IV_CNT_INDEX = AES_GCM_IV_CNT_INFOFILE + 1
326 AES_GCM_IV_CNT_DATA = AES_GCM_IV_CNT_INDEX + 1 # also for multivolume
327 AES_GCM_IV_CNT_MAX_DEFAULT = 0xffFFffFF
328 AES_GCM_IV_CNT_MAX = AES_GCM_IV_CNT_MAX_DEFAULT
330 # IV structure and generation
331 PDTCRYPT_IV_GEN_MAX_RETRIES = 10 # ×
332 PDTCRYPT_IV_FIXEDPART_SIZE = 8 # B
333 PDTCRYPT_IV_COUNTER_SIZE = 4 # B
335 ###############################################################################
337 ###############################################################################
343 # , paramversion : u16
349 # fn hdr_read (f : handle) -> hdrinfo;
350 # fn hdr_make (f : handle, h : hdrinfo) -> IOResult<usize>;
351 # fn hdr_fmt (h : hdrinfo) -> String;
356 Read bytes as header structure.
358 If the input could not be interpreted as a header, fail with
363 mag, version, paramversion, nacl, iv, ctsize, tag = \
364 struct.unpack (FMT_I2N_HDR, data)
365 except Exception as exn:
366 raise InvalidHeader ("error unpacking header from [%r]: %s"
367 % (binascii.hexlify (data), str (exn)))
369 if mag != PDTCRYPT_HDR_MAGIC:
370 raise InvalidHeader ("bad magic in header: expected [%s], got [%s]"
371 % (PDTCRYPT_HDR_MAGIC, mag))
374 { "version" : version
375 , "paramversion" : paramversion
383 def hdr_read_stream (instr):
385 Read header from stream at the current position.
387 Fail with ``InvalidHeader`` if insufficient bytes were read from the
388 stream, or if the content could not be interpreted as a header.
390 data = instr.read(PDTCRYPT_HDR_SIZE)
394 elif ldata != PDTCRYPT_HDR_SIZE:
395 raise InvalidHeader ("hdr_read_stream: expected %d B, received %d B"
396 % (PDTCRYPT_HDR_SIZE, ldata))
397 return hdr_read (data)
400 def hdr_from_params (version, paramversion, nacl, iv, ctsize, tag):
402 Assemble the necessary values into a PDTCRYPT header.
404 :type version: int to fit uint16_t
405 :type paramversion: int to fit uint16_t
406 :type nacl: bytes to fit uint8_t[16]
407 :type iv: bytes to fit uint8_t[12]
    :type ctsize: int to fit uint64_t
409 :type tag: bytes to fit uint8_t[16]
411 buf = bytearray (PDTCRYPT_HDR_SIZE)
412 bufv = memoryview (buf)
415 struct.pack_into (FMT_I2N_HDR, bufv, 0,
417 version, paramversion, nacl, iv, ctsize, tag)
418 except Exception as exn:
419 return False, "error assembling header: %s" % str (exn)
421 return True, bytes (buf)
424 def hdr_make_dummy (s):
426 Create a header sized block of bytes initialized to a value derived from a
427 string. Used to verify we’ve jumped back correctly to the actual position
428 of the object header.
430 c = reduce (lambda a, c: a + ord(c), s, 0) % 0xFF
431 return bytes (bytearray (struct.pack ("B", c)) * PDTCRYPT_HDR_SIZE)
436 Assemble a header from the given header structure.
438 return hdr_from_params (version=hdr.get("version"),
439 paramversion=hdr.get("paramversion"),
440 nacl=hdr.get("nacl"), iv=hdr.get("iv"),
441 ctsize=hdr.get("ctsize"), tag=hdr.get("tag"))
444 HDR_FMT = "I2n_header { version: %d, paramversion: %d, nacl: %s[%d]," \
445 " iv: %s[%d], ctsize: %d, tag: %s[%d] }"
448 """Format a header structure into readable output."""
449 return HDR_FMT % (h["version"], h["paramversion"],
450 binascii.hexlify (h["nacl"]), len(h["nacl"]),
451 binascii.hexlify (h["iv"]), len(h["iv"]),
453 binascii.hexlify (h["tag"]), len(h["tag"]))
def hex_spaced_of_bytes (b):
    """Format bytes object, hexdump style."""
    return " ".join ([ "%.2x%.2x" % (c1, c2)
                       for c1, c2 in zip (b[0::2], b[1::2]) ]) \
         + (" %.2x" % b[-1] if len (b) % 2 == 1 else "") # odd lengths
463 def hdr_iv_counter (h):
464 """Extract the variable part of the IV of the given header."""
465 _fixed, cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
469 def hdr_iv_fixed (h):
470 """Extract the fixed part of the IV of the given header."""
471 fixed, _cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
475 hdr_dump = hex_spaced_of_bytes
479 """version = %-4d : %s
480 paramversion = %-4d : %s
487 def hdr_fmt_pretty (h):
489 Format header structure into multi-line representation of its contents and
490 their raw representation. (Omit the implicit “PDTCRYPT” magic bytes that
491 precede every header.)
493 return HDR_FMT_PRETTY \
495 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["version"])),
497 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["paramversion"])),
498 hex_spaced_of_bytes (h["nacl"]),
499 hex_spaced_of_bytes (h["iv"]),
501 hex_spaced_of_bytes (struct.pack (FMT_UINT64_LE, h["ctsize"])),
502 hex_spaced_of_bytes (h["tag"]))
504 IV_FMT = "((f %s) (c %d))"
507 """Format the two components of an IV in a readable fashion."""
508 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
509 return IV_FMT % (binascii.hexlify (fixed), cnt)
512 ###############################################################################
514 ###############################################################################
516 class Location (object):
520 def restore_loc_fmt (loc):
522 % (loc.n, loc.offset)
524 def locate_hdr_candidates (fd):
526 Walk over instances of the magic string in the payload, collecting their
527 positions. If the offset of the first found instance is not zero, the file
528 begins with leading garbage.
530 :return: The list of offsets in the file.
534 mm = mmap.mmap(fd, 0, mmap.MAP_SHARED, mmap.PROT_READ)
537 pos = mm.find (PDTCRYPT_HDR_MAGIC, pos)
546 HDR_CAND_GOOD = 0 # header marks begin of valid object
547 HDR_CAND_FISHY = 1 # inconclusive (tag mismatch, obj overlap etc.)
548 HDR_CAND_JUNK = 2 # not a header / object unreadable
551 def inspect_hdr (fd, off):
553 Attempt to parse a header in *fd* at position *off*.
555 Returns a verdict about the quality of that header plus the parsed header
559 _ = os.lseek (fd, off, os.SEEK_SET)
561 if os.lseek (fd, 0, os.SEEK_CUR) != off:
562 if PDTCRYPT_VERBOSE is True:
563 noise ("PDT: %d → dismissed (lseek() past EOF)" % off)
564 return HDR_CAND_JUNK, None
566 raw = os.read (fd, PDTCRYPT_HDR_SIZE)
567 if len (raw) != PDTCRYPT_HDR_SIZE:
568 if PDTCRYPT_VERBOSE is True:
569 noise ("PDT: %d → dismissed (EOF inside header)" % off)
570 return HDR_CAND_JUNK, None
574 except InvalidHeader as exn:
575 if PDTCRYPT_VERBOSE is True:
576 noise ("PDT: %d → dismissed (invalid: [%s])" % (off, str (exn)))
577 return HDR_CAND_JUNK, None
579 obj0 = off + PDTCRYPT_HDR_SIZE
580 objX = obj0 + hdr ["ctsize"]
582 eof = os.lseek (fd, 0, os.SEEK_END)
584 if PDTCRYPT_VERBOSE is True:
585 noise ("PDT: %d → EOF inside object (%d≤%d≤%d); adjusting size to "
586 "%d" % (off, obj0, eof, objX, (eof - obj0)))
587 # try reading up to the end
588 hdr ["ctsize"] = eof - obj0
589 return HDR_CAND_FISHY, hdr
591 return HDR_CAND_GOOD, hdr
594 def try_decrypt (ifd, off, hdr, secret, ofd=-1):
596 Attempt to decrypt the object in the (seekable) descriptor *ifd* starting
597 at *off* using the metadata in *hdr* and *secret*. An output fd can be
598 specified with *ofd*; if it is *-1* – the default –, the decrypted payload
601 Always creates a fresh decryptor, so validation steps across objects don’t
604 ctleft = hdr ["ctsize"]
608 if ks == PDTCRYPT_SECRET_PW:
609 decr = Decrypt (password=secret [1])
610 elif ks == PDTCRYPT_SECRET_KEY:
611 key = binascii.unhexlify (secret [1])
612 decr = Decrypt (key=key)
619 os.lseek (ifd, pos, os.SEEK_SET)
621 cnksiz = min (ctleft, PDTCRYPT_BLOCKSIZE)
622 cnk = os.read (ifd, cnksiz)
625 pt = decr.process (cnk)
629 if len (pt) > 0 and ofd != -1:
632 except Exception as exn:
633 noise ("PDT: error decrypting object %d–%d@%d, %d B remaining [%s]"
634 % (off, off + hdr ["ctsize"], pos, ctleft, exn))
640 ###############################################################################
641 ## passthrough / null encryption
642 ###############################################################################
644 class PassthroughCipher (object):
646 tag = struct.pack ("<QQ", 0, 0)
648 def __init__ (self) : pass
650 def update (self, b) : return b
652 def finalize (self) : return b""
654 def finalize_with_tag (self, _) : return b""
656 ###############################################################################
657 ## convenience wrapper
658 ###############################################################################
661 def kdf_dummy (klen, password, _nacl):
663 Fake KDF for testing purposes that is called when parameter version zero is
    # encode first so the repeat count is computed over bytes, not characters
    if isinstance (password, bytes) is False:
        password = password.encode ()
    q, r = divmod (klen, len (password))
    return password * q + password [:r], b""
672 SCRYPT_KEY_MEMO = { } # static because needed for both the info file and the archive
675 def kdf_scrypt (params, password, nacl):
677 Wrapper for the Scrypt KDF, corresponds to parameter version one. The
678 computation result is memoized based on the inputs to facilitate spawning
679 multiple encryption contexts.
684 dkLen = params["dkLen"]
687 nacl = os.urandom (params["NaCl_LEN"])
689 key_parms = (password, nacl, N, r, p, dkLen)
690 global SCRYPT_KEY_MEMO
691 if key_parms not in SCRYPT_KEY_MEMO:
692 SCRYPT_KEY_MEMO [key_parms] = \
693 pylibscrypt.scrypt (password, nacl, N, r, p, dkLen)
694 return SCRYPT_KEY_MEMO [key_parms], nacl
697 def kdf_by_version (paramversion=None, defs=None):
699 Pick the KDF handler corresponding to the parameter version or the
702 :rtype: function (password : str, nacl : str) -> str
704 if paramversion is not None:
705 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
707 raise InvalidParameter ("no encryption parameters for version %r"
709 (kdf, params) = defs["kdf"]
711 if kdf == "scrypt" : fn = kdf_scrypt
712 if kdf == "dummy" : fn = kdf_dummy
714 raise ValueError ("key derivation method %r unknown" % kdf)
715 return partial (fn, params)
718 ###############################################################################
720 ###############################################################################
722 def scrypt_hashsource (pw, ins):
724 Calculate the SCRYPT hash from the password and the information contained
725 in the first header found in ``ins``.
727 This does not validate whether the first object is encrypted correctly.
729 if isinstance (pw, str) is True:
731 elif isinstance (pw, bytes) is False:
732 raise InvalidParameter ("password must be a string, not %s"
734 if isinstance (ins, io.BufferedReader) is False and \
735 isinstance (ins, io.FileIO) is False:
736 raise InvalidParameter ("file to hash must be opened in “binary” mode")
739 hdr = hdr_read_stream (ins)
740 except EndOfFile as exn:
741 noise ("PDT: malformed input: end of file reading first object header")
746 pver = hdr ["paramversion"]
747 if PDTCRYPT_VERBOSE is True:
748 noise ("PDT: salt of first object : %s" % binascii.hexlify (nacl))
749 noise ("PDT: parameter version of archive : %d" % pver)
752 defs = ENCRYPTION_PARAMETERS.get(pver, None)
753 kdfname, params = defs ["kdf"]
754 if kdfname != "scrypt":
755 noise ("PDT: input is not an SCRYPT archive")
758 kdf = kdf_by_version (None, defs)
759 except ValueError as exn:
760 noise ("PDT: object has unknown parameter version %d" % pver)
762 hsh, _void = kdf (pw, nacl)
764 return hsh, nacl, hdr ["version"], pver
767 def scrypt_hashfile (pw, fname):
769 Calculate the SCRYPT hash from the password and the information contained
770 in the first header found in the given file. The header is read only at
773 with deptdcrypt_mk_stream (PDTCRYPT_SOURCE, fname or "-") as ins:
774 hsh, _void, _void, _void = scrypt_hashsource (pw, ins)
778 ###############################################################################
780 ###############################################################################
782 class Crypto (object):
784 Encryption context to remain alive throughout an entire tarfile pass.
789 cnt = None # file counter (uint32_t != 0)
790 iv = None # current IV
791 fixed = None # accu for 64 bit fixed parts of IV
792 used_ivs = None # tracks IVs
793 strict_ivs = False # if True, panic on duplicate object IV
802 info_counter_used = False
803 index_counter_used = False
805 def __init__ (self, *al, **akv):
806 self.used_ivs = set ()
807 self.set_parameters (*al, **akv)
810 def next_fixed (self):
815 def set_object_counter (self, cnt=None):
817 Safely set the internal counter of encrypted objects. Numerous
820 The same counter may not be reused in combination with one IV fixed
821 part. This is validated elsewhere in the IV handling.
823 Counter zero is invalid. The first two counters are reserved for
824 metadata. The implementation does not allow for splitting metadata
825 files over multiple encrypted objects. (This would be possible by
        assigning new fixed parts.) Thus in a Deltatar backup there is at most
        one object each with a counter value of one and of two. On creation of a
828 context, the initial counter may be chosen. The globals
829 ``AES_GCM_IV_CNT_INFOFILE`` and ``AES_GCM_IV_CNT_INDEX`` can be used to
830 request one of the reserved values. If one of these values has been
831 used, any further attempt of setting the counter to that value will
832 be rejected with an ``InvalidFileCounter`` exception.
834 Out of bounds values (i. e. below one and more than the maximum of 2³²)
835 cause an ``InvalidParameter`` exception to be thrown.
838 self.cnt = AES_GCM_IV_CNT_DATA
        if cnt < 1 or cnt > AES_GCM_IV_CNT_MAX + 1:
841 raise InvalidParameter ("invalid counter value %d requested: "
842 "acceptable values are from 1 to %d"
843 % (cnt, AES_GCM_IV_CNT_MAX))
844 if cnt == AES_GCM_IV_CNT_INFOFILE:
845 if self.info_counter_used is True:
846 raise InvalidFileCounter ("attempted to reuse info file "
847 "counter %d: must be unique" % cnt)
848 self.info_counter_used = True
849 elif cnt == AES_GCM_IV_CNT_INDEX:
850 if self.index_counter_used is True:
                raise InvalidFileCounter ("attempted to reuse index file "
                                          "counter %d: must be unique" % cnt)
853 self.index_counter_used = True
854 if cnt <= AES_GCM_IV_CNT_MAX:
857 # cnt == AES_GCM_IV_CNT_MAX + 1 → wrap
858 self.cnt = AES_GCM_IV_CNT_DATA
862 def set_parameters (self, password=None, key=None, paramversion=None,
863 nacl=None, counter=None, strict_ivs=False):
865 Configure the internal state of a crypto context. Not intended for
869 self.set_object_counter (counter)
870 self.strict_ivs = strict_ivs
872 if paramversion is not None:
873 self.paramversion = paramversion
876 self.key, self.nacl = key, nacl
879 if password is not None:
880 if isinstance (password, bytes) is False:
881 password = str.encode (password)
882 self.password = password
883 if paramversion is None and nacl is None:
884 # postpone key setup until first header is available
886 kdf = kdf_by_version (paramversion)
888 self.key, self.nacl = kdf (password, nacl)
891 def process (self, buf):
893 Encrypt / decrypt a buffer. Invokes the ``.update()`` method on the
894 wrapped encryptor or decryptor, respectively.
896 The Cryptography exception ``AlreadyFinalized`` is translated to an
897 ``InternalError`` at this point. It may occur in sound code when the GC
898 closes an encrypting stream after an error. Everywhere else it must be
902 raise RuntimeError ("process: context not initialized")
903 self.stats ["in"] += len (buf)
905 out = self.enc.update (buf)
906 except cryptography.exceptions.AlreadyFinalized as exn:
907 raise InternalError (exn)
908 self.stats ["out"] += len (out)
912 def next (self, password, paramversion, nacl, iv):
914 Prepare for encrypting another object: Reset the data counters and
915 change the configuration in case one of the variable parameters differs
916 from the last object. Also check the IV for duplicates and error out
917 if strict checking was requested.
921 self.stats ["obj"] += 1
923 self.check_duplicate_iv (iv)
925 if ( self.paramversion != paramversion
926 or self.password != password
927 or self.nacl != nacl):
928 self.set_parameters (password=password, paramversion=paramversion,
929 nacl=nacl, strict_ivs=self.strict_ivs)
932 def check_duplicate_iv (self, iv):
        Add an IV (the 12 byte representation as in the header) to the set of
        IVs seen so far. With strict checking enabled, an IV that is already
        in the set raises ``DuplicateIV``. Depending on the context, this may
        indicate a serious error (IV reuse).
938 if self.strict_ivs is True and iv in self.used_ivs:
939 raise DuplicateIV ("iv %s was reused" % iv_fmt (iv))
        # iv has not been used before; add to collection
941 self.used_ivs.add (iv)
946 Access the data counters.
948 return self.stats ["obj"], self.stats ["in"], self.stats ["out"]
953 Clear the current context regardless of its finalization state. The
954 next operation must be ``.next()``.
959 class Encrypt (Crypto):
965 def __init__ (self, version, paramversion, password=None, key=None, nacl=None,
966 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
968 The ctor will throw immediately if one of the parameters does not conform
972 :type version: int to fit uint16_t
973 :type paramversion: int to fit uint16_t
974 :param password: mutually exclusive with ``key``
        :type password: str
976 :param key: mutually exclusive with ``password``
        :param counter: initial object counter; the values
                        ``AES_GCM_IV_CNT_INFOFILE`` and
                        ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
                        and cannot be reused even with different fixed parts.
983 :type strict_ivs: bool
985 if password is None and key is None \
986 or password is not None and key is not None :
987 raise InvalidParameter ("__init__: need either key or password")
990 if isinstance (key, bytes) is False:
991 raise InvalidParameter ("__init__: key must be provided as "
992 "bytes, not %s" % type (key))
994 raise InvalidParameter ("__init__: salt must be provided along "
995 "with encryption key")
996 else: # password, no key
997 if isinstance (password, str) is False:
998 raise InvalidParameter ("__init__: password must be a string, not %s"
1000 if len (password) == 0:
1001 raise InvalidParameter ("__init__: supplied empty password but not "
1002 "permitted for PDT encrypted files")
1004 if isinstance (version, int) is False:
1005 raise InvalidParameter ("__init__: version number must be an "
1006 "integer, not %s" % type (version))
1008 raise InvalidParameter ("__init__: version number must be a "
1009 "nonnegative integer, not %d" % version)
1011 if isinstance (paramversion, int) is False:
1012 raise InvalidParameter ("__init__: crypto parameter version number "
1013 "must be an integer, not %s"
1014 % type (paramversion))
1015 if paramversion < 0:
1016 raise InvalidParameter ("__init__: crypto parameter version number "
1017 "must be a nonnegative integer, not %d"
1020 if nacl is not None:
1021 if isinstance (nacl, bytes) is False:
1022 raise InvalidParameter ("__init__: salt given, but of type %s "
1023 "instead of bytes" % type (nacl))
1024 # salt length would depend on the actual encryption so it can’t be
1025 # validated at this point
1027 self.version = version
1028 self.paramenc = ENCRYPTION_PARAMETERS.get (paramversion) ["enc"]
1030 super().__init__ (password, key, paramversion, nacl, counter=counter,
1031 strict_ivs=strict_ivs)
1034 def next_fixed (self, retries=PDTCRYPT_IV_GEN_MAX_RETRIES):
1036 Generate the next IV fixed part by reading eight bytes from
1037 ``/dev/urandom``. The buffer so obtained is tested against the fixed
1038 parts used so far to prevent accidental reuse of IVs. After a
1039 configurable number of attempts to create a unique fixed part, it will
1040 refuse to continue with an ``IVFixedPartError``. This is unlikely to
1041 ever happen on a normal system but may detect an issue with the random
1044 The list of fixed parts that were used by the context at hand can be
1045 accessed through the ``.fixed`` list. Its last element is the fixed
1046 part currently in use.
1050 fp = os.urandom (PDTCRYPT_IV_FIXEDPART_SIZE)
1051 if fp not in self.fixed:
1052 self.fixed.append (fp)
1055 raise IVFixedPartError ("error obtaining a unique IV fixed part from "
1056 "/dev/urandom; giving up after %d tries" % i)
        Construct a 12 byte IV from the current fixed part and the object
1064 return struct.pack(FMT_I2N_IV, self.fixed [-1], self.cnt)
1067 def next (self, filename=None, counter=None):
1069 Prepare for encrypting the next incoming object. Update the counter
1070 and put together the IV, possibly changing prefixes. Then create the
1073 The argument ``counter`` can be used to specify a file counter for this
1074 object. Unless it is one of the reserved values, the counter of
1075 subsequent objects will be computed from this one.
1077 If this is the first object in a series, ``filename`` is required,
1078 otherwise it is reused if not present. The value is used to derive a
1079 header sized placeholder to use until after encryption when all the
1080 inputs to construct the final header are available. This is then
1081 matched in ``.done()`` against the value found at the position of the
1082 header. The motivation for this extra check is primarily to assist
1083 format debugging: It makes stray headers easy to spot in malformed
1086 if filename is None:
1087 if self.lastinfo is None:
1088 raise InvalidParameter ("next: filename is mandatory for "
1090 filename, _dummy = self.lastinfo
1092 if isinstance (filename, str) is False:
                raise InvalidParameter ("next: filename must be a string, not %s"
1095 if counter is not None:
1096 if isinstance (counter, int) is False:
1097 raise InvalidParameter ("next: the supplied counter is of "
1098 "invalid type %s; please pass an "
1099 "integer instead" % type (counter))
1100 self.set_object_counter (counter)
1102 self.iv = self.iv_make ()
1103 if self.paramenc == "aes-gcm":
1105 ( algorithms.AES (self.key)
1106 , modes.GCM (self.iv)
1107 , backend = default_backend ()) \
1109 elif self.paramenc == "passthrough":
1110 self.enc = PassthroughCipher ()
1112 raise InvalidParameter ("next: parameter version %d not known"
1113 % self.paramversion)
1114 hdrdum = hdr_make_dummy (filename)
1115 self.lastinfo = (filename, hdrdum)
1116 super().next (self.password, self.paramversion, self.nacl, self.iv)
1118 self.set_object_counter (self.cnt + 1)
1122 def done (self, cmpdata):
        Complete encryption of an object. After this has been called, attempts
        to encrypt further data will cause an error until ``.next()`` is
        Returns a 64 byte buffer containing the object header with all
        values, including the “late” ones, e.g. the ciphertext size and the
1132 if isinstance (cmpdata, bytes) is False:
1133 raise InvalidParameter ("done: comparison input expected as bytes, "
1134 "not %s" % type (cmpdata))
1135 if self.lastinfo is None:
1136 raise RuntimeError ("done: encryption context not initialized")
1137 filename, hdrdum = self.lastinfo
1138 if cmpdata != hdrdum:
1139 raise RuntimeError ("done: bad sync of header for object %d: "
1140 "preliminary data does not match; this likely "
1141 "indicates a wrongly repositioned stream"
1143 data = self.enc.finalize ()
1144 self.stats ["out"] += len (data)
1145 self.ctsize += len (data)
1146 ok, hdr = hdr_from_params (self.version, self.paramversion, self.nacl,
1147 self.iv, self.ctsize, self.enc.tag)
1149 raise InternalError ("error constructing header: %r" % hdr)
1150 return data, hdr, self.fixed
1153 def process (self, buf):
1155 Encrypt a chunk of plaintext with the active encryptor. Returns the
1156 size of the input consumed. This **must** be checked downstream. If the
1157 maximum possible object size has been reached, the current context must
1158 be finalized and a new one established before any further data can be
1159 encrypted. The second argument is the remainder of the plaintext that
1160 was not encrypted for the caller to use immediately after the new
1163 if isinstance (buf, bytes) is False:
1164 raise InvalidParameter ("process: expected byte buffer, not %s"
1167 newptsize = self.ptsize + bsize
1168 diff = newptsize - PDTCRYPT_MAX_OBJ_SIZE
1171 newptsize = PDTCRYPT_MAX_OBJ_SIZE
1172 self.ptsize = newptsize
1173 data = super().process (buf [:bsize])
1174 self.ctsize += len (data)
1178 class Decrypt (Crypto):
1180 tag = None # GCM tag, part of header
1181 last_iv = None # check consecutive ivs in strict mode
1183 def __init__ (self, password=None, key=None, counter=None, fixedparts=None,
1186 Sanitizing ctor for the decryption context. ``fixedparts`` specifies a
1187 list of IV fixed parts accepted during decryption. If a fixed part is
1188 encountered that is not in the list, decryption will fail.
1190 :param password: mutually exclusive with ``key``
1191 :type password: bytes
1192 :param key: mutually exclusive with ``password``
        :param counter:     initial object counter; the values
                            ``AES_GCM_IV_CNT_INFOFILE`` and
                            ``AES_GCM_IV_CNT_INDEX`` are unique in each backup
                            set and cannot be reused even with different fixed
                            parts.
1198 :type fixedparts: bytes list
1200 if password is None and key is None \
1201 or password is not None and key is not None :
1202 raise InvalidParameter ("__init__: need either key or password")
1205 if isinstance (key, bytes) is False:
1206 raise InvalidParameter ("__init__: key must be provided as "
1207 "bytes, not %s" % type (key))
1208 else: # password, no key
1209 if isinstance (password, str) is False:
1210 raise InvalidParameter ("__init__: password must be a string, not %s"
            if len (password) == 0:
                raise InvalidParameter ("__init__: empty passwords are not "
                                        "permitted for PDT encrypted files")
1216 if fixedparts is not None:
1217 if isinstance (fixedparts, list) is False:
1218 raise InvalidParameter ("__init__: IV fixed parts must be "
1219 "supplied as list, not %s"
1220 % type (fixedparts))
1221 self.fixed = fixedparts
1224 super().__init__ (password=password, key=key, counter=counter,
1225 strict_ivs=strict_ivs)
1228 def valid_fixed_part (self, iv):
1230 Check if a fixed part was already seen.
1232 # check if fixed part is known
1233 fixed, _cnt = struct.unpack (FMT_I2N_IV, iv)
1234 i = bisect.bisect_left (self.fixed, fixed)
1235 return i != len (self.fixed) and self.fixed [i] == fixed
1238 def check_consecutive_iv (self, iv):
1240 Check whether the counter part of the given IV is indeed the successor
1241 of the currently present counter. This should always be the case for
1242 the objects in a well formed PDT archive but should not be enforced
1243 when decrypting out-of-order.
        fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
        if self.strict_ivs is True \
                and self.last_iv is not None \
                and self.last_iv [0] == fixed \
                and self.last_iv [1] != cnt - 1:
            raise NonConsecutiveIV ("iv %s counter not successor of "
                                    "last object (expected %d, found %d)"
                                    % (iv_fmt (iv), self.last_iv [1] + 1, cnt))
        # remember the fixed part rather than the whole IV; otherwise the
        # comparison above can never match on the next call
        self.last_iv = (fixed, cnt)
1256 def next (self, hdr):
1258 Start decrypting the next object. The PDTCRYPT header for the object
1259 can be given either as already parsed object or as bytes.
1261 if isinstance (hdr, bytes) is True:
1262 hdr = hdr_read (hdr)
1263 elif isinstance (hdr, dict) is False:
1264 # this won’t catch malformed specs though
1265 raise InvalidParameter ("next: wrong type of parameter hdr: "
1266 "expected bytes or spec, got %s"
1269 paramversion = hdr ["paramversion"]
1274 raise InvalidHeader ("next: not a header %r" % hdr)
1276 super().next (self.password, paramversion, nacl, iv)
1277 if self.fixed is not None and self.valid_fixed_part (iv) is False:
1278 raise InvalidIVFixedPart ("iv %s has invalid fixed part"
1280 self.check_consecutive_iv (iv)
1283 defs = ENCRYPTION_PARAMETERS.get (paramversion, None)
1285 raise FormatError ("header contains unknown parameter version %d; "
1286 "maybe the file was created by a more recent "
1287 "version of Deltatar" % paramversion)
1289 if enc == "aes-gcm":
1291 ( algorithms.AES (self.key)
1292 , modes.GCM (iv, tag=self.tag)
1293 , backend = default_backend ()) \
1295 elif enc == "passthrough":
1296 self.enc = PassthroughCipher ()
1298 raise InternalError ("encryption parameter set %d refers to unknown "
1299 "mode %r" % (paramversion, enc))
1300 self.set_object_counter (self.cnt + 1)
1303 def done (self, tag=None):
        Stop decryption of the current object and finalize it with the active
        context. Raises *InvalidGCMTag* if the authentication tag does not
        match the data. If the tag is correct, the rest of the plaintext is
        returned.
1313 data = self.enc.finalize ()
1315 if isinstance (tag, bytes) is False:
1316 raise InvalidParameter ("done: wrong type of parameter "
1317 "tag: expected bytes, got %s"
                data = self.enc.finalize_with_tag (tag)
1320 except cryptography.exceptions.InvalidTag:
1321 raise InvalidGCMTag ("done: tag mismatch of object %d: %s "
1322 "rejected by finalize ()"
1323 % (self.cnt, binascii.hexlify (self.tag)))
1324 self.ctsize += len (data)
1325 self.stats ["out"] += len (data)
1329 def process (self, buf):
1331 Decrypt the bytes object *buf* with the active decryptor.
1333 if isinstance (buf, bytes) is False:
1334 raise InvalidParameter ("process: expected byte buffer, not %s"
1336 self.ctsize += len (buf)
1337 data = super().process (buf)
1338 self.ptsize += len (data)
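# Illustration only, not part of the module: a minimal sketch of the two IV
# checks that ``Decrypt`` performs above, namely the binary search for a known
# fixed part and the strict successor test on the counter. The layout "<8sL"
# (8 byte fixed part followed by a 32 bit counter) and every ``sketch_*`` /
# ``SKETCH_*`` name are assumptions made for this example; the real layout is
# whatever ``FMT_I2N_IV`` specifies. Note that ``bisect`` only works if the
# list of fixed parts is sorted.

```python
import bisect
import struct

SKETCH_FMT_IV = "<8sL" # assumed stand-in for FMT_I2N_IV (12 bytes total)

def sketch_valid_fixed_part (known, iv):
    """Binary search the *sorted* list ``known`` for the fixed part of ``iv``."""
    fixed, _cnt = struct.unpack (SKETCH_FMT_IV, iv)
    i = bisect.bisect_left (known, fixed)
    return i != len (known) and known [i] == fixed

def sketch_is_successor (last, iv):
    """
    ``last`` is ``None`` or a pair ``(fixed, counter)`` describing the previous
    object. Objects with a different fixed part are not compared at all.
    """
    fixed, cnt = struct.unpack (SKETCH_FMT_IV, iv)
    if last is None or last [0] != fixed:
        return True
    return last [1] + 1 == cnt
```

# With ``known = sorted (parts)``, an IV whose fixed part is missing from the
# list is rejected by the first function, and a counter that skips a value is
# rejected by the second one.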
1342 ###############################################################################
1344 ###############################################################################
1346 def _patch_global (glob, vow, n=None):
1348 Adapt upper file counter bound for testing IV logic. Completely unsafe.
1350 assert vow == "I am fully aware that this will void my warranty."
1351 r = globals () [glob]
1353 n = globals () [glob + "_DEFAULT"]
1354 globals () [glob] = n
1357 _testing_set_AES_GCM_IV_CNT_MAX = \
1358 partial (_patch_global, "AES_GCM_IV_CNT_MAX")
1360 _testing_set_PDTCRYPT_MAX_OBJ_SIZE = \
1361 partial (_patch_global, "PDTCRYPT_MAX_OBJ_SIZE")
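# Illustration only, not part of the module: a minimal sketch of the pattern
# used by the ``_testing_set_*`` helpers above, i.e. a generic setter for a
# module global that is specialized with ``functools.partial``. The names
# ``SKETCH_LIMIT``, ``SKETCH_LIMIT_DEFAULT`` and ``sketch_*`` are hypothetical,
# and returning the previous value is an assumption about the intended
# behavior. The “vow” argument is left out for brevity.

```python
from functools import partial

SKETCH_LIMIT_DEFAULT = 100 # hypothetical default bound
SKETCH_LIMIT         = SKETCH_LIMIT_DEFAULT

def sketch_patch_global (glob, n=None):
    """Set the global ``glob`` to ``n`` or to its default; return the old value."""
    old = globals () [glob]
    if n is None: # no value given: restore the default
        n = globals () [glob + "_DEFAULT"]
    globals () [glob] = n
    return old

# bind the name of the global once, leaving only the new value as parameter
sketch_set_limit = partial (sketch_patch_global, "SKETCH_LIMIT")
```

# A test would call ``sketch_set_limit (5)`` before exercising the code under
# test and ``sketch_set_limit ()`` afterwards to restore the default.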
1363 def open2_dump_file (fname, dir_fd, force=False):
1366 oflags = os.O_CREAT | os.O_WRONLY
    if force is True:
1368 oflags |= os.O_TRUNC
1373 outfd = os.open (fname, oflags,
1374 stat.S_IRUSR | stat.S_IWUSR, dir_fd=dir_fd)
1375 except FileExistsError as exn:
1376 noise ("PDT: refusing to overwrite existing file %s" % fname)
1378 raise RuntimeError ("destination file %s already exists" % fname)
1379 if PDTCRYPT_VERBOSE is True:
1380 noise ("PDT: new output file %s (fd=%d)" % (fname, outfd))
1384 ###############################################################################
1385 ## freestanding invocation
1386 ###############################################################################
PDTCRYPT_SUB_PROCESS = 0
PDTCRYPT_SUB_SCRYPT  = 1
PDTCRYPT_SUB_SCAN    = 2

PDTCRYPT_SUB = \
    { "process" : PDTCRYPT_SUB_PROCESS
    , "scrypt"  : PDTCRYPT_SUB_SCRYPT
    , "scan"    : PDTCRYPT_SUB_SCAN }

PDTCRYPT_SECRET_PW  = 0
PDTCRYPT_SECRET_KEY = 1

PDTCRYPT_DECRYPT = 1 << 0 # decrypt archive with password
PDTCRYPT_SPLIT   = 1 << 1 # split archive into individual objects
PDTCRYPT_HASH    = 1 << 2 # output scrypt hash for file and given password

PDTCRYPT_SPLITNAME  = "pdtcrypt-object-%d.bin"
PDTCRYPT_RESCUENAME = "pdtcrypt-rescue-object-%0.5d.bin"

PDTCRYPT_VERBOSE   = False
PDTCRYPT_STRICTIVS = False
PDTCRYPT_OVERWRITE = False
PDTCRYPT_BLOCKSIZE = 1 << 12
PDTCRYPT_DEFAULT_VER  = 1
PDTCRYPT_DEFAULT_PVER = 1

# scrypt hashing output control
PDTCRYPT_SCRYPT_INTRANATOR = 0
PDTCRYPT_SCRYPT_PARAMETERS = 1
PDTCRYPT_SCRYPT_DEFAULT    = PDTCRYPT_SCRYPT_INTRANATOR

PDTCRYPT_SCRYPT_FORMAT = \
    { "i2n"    : PDTCRYPT_SCRYPT_INTRANATOR
    , "params" : PDTCRYPT_SCRYPT_PARAMETERS }

PDTCRYPT_TT_COLUMNS = 80 # assume standard terminal
class PDTDecryptionError (Exception):
    """Decryption failed."""

class PDTSplitError (Exception):
    """Split operation failed."""
1436 def noise (*a, **b):
1437 print (file=sys.stderr, *a, **b)
1440 class PassthroughDecryptor (object):
1442 curhdr = None # write current header on first data write
1444 def __init__ (self):
1445 if PDTCRYPT_VERBOSE is True:
1446 noise ("PDT: no encryption; data passthrough")
1448 def next (self, hdr):
1449 ok, curhdr = hdr_make (hdr)
1451 raise PDTDecryptionError ("bad header %r" % hdr)
1452 self.curhdr = curhdr
1455 if self.curhdr is not None:
1459 def process (self, d):
1460 if self.curhdr is not None:
1466 def depdtcrypt (mode, secret, ins, outs):
1468 Remove PDTCRYPT layer from all objects encrypted with the secret. Used on a
1469 Deltatar backup this will yield a (possibly Gzip compressed) tarball.
1471 ctleft = -1 # length of ciphertext to consume
1472 ctcurrent = 0 # total ciphertext of current object
1473 total_obj = 0 # total number of objects read
1474 total_pt = 0 # total plaintext bytes
1475 total_ct = 0 # total ciphertext bytes
1476 total_read = 0 # total bytes read
1477 outfile = None # Python file object for output
1479 if mode & PDTCRYPT_DECRYPT: # decryptor
1481 if ks == PDTCRYPT_SECRET_PW:
1482 decr = Decrypt (password=secret [1], strict_ivs=PDTCRYPT_STRICTIVS)
1483 elif ks == PDTCRYPT_SECRET_KEY:
1484 key = binascii.unhexlify (secret [1])
1485 decr = Decrypt (key=key, strict_ivs=PDTCRYPT_STRICTIVS)
1487 raise InternalError ("‘%d’ does not specify a valid kind of secret"
1490 decr = PassthroughDecryptor ()
1493 """Dummy for non-split mode: output file does not vary."""
1496 if mode & PDTCRYPT_SPLIT:
1497 def nextout (outfile):
1499 We were passed an fd as outs for accessing the destination
1500 directory where extracted archive components are supposed
1505 if PDTCRYPT_VERBOSE is True:
1506 noise ("PDT: no output file to close at this point")
1508 if PDTCRYPT_VERBOSE is True:
1509 noise ("PDT: release output file %r" % outfile)
1510 # cleanup happens automatically by the GC; the next
1511 # line will error out on account of an invalid fd
1514 assert total_obj > 0
1515 fname = PDTCRYPT_SPLITNAME % total_obj
1517 outfd = open2_dump_file (fname, outs, force=PDTCRYPT_OVERWRITE)
1518 except RuntimeError as exn:
1519 raise PDTSplitError (exn)
1520 return os.fdopen (outfd, "wb", closefd=True)
1524 """ESPIPE is normal on non-seekable stdio stream."""
        except OSError as exn:
            import errno # ``os.errno`` is no longer available as of Python 3.7
            if exn.errno == errno.ESPIPE:
1531 def out (pt, outfile):
1535 if PDTCRYPT_VERBOSE is True:
1536 noise ("PDT:\t· decrypt plaintext %d B" % (npt))
1538 nn = outfile.write (pt)
1539 except OSError as exn: # probably ENOSPC
1540 raise DecryptionError ("error (%s)" % exn)
1542 raise DecryptionError ("write aborted after %d of %d B" % (nn, npt))
1546 # current object completed; in a valid archive this marks either
1547 # the start of a new header or the end of the input
1548 if ctleft == 0: # current object requires finalization
1549 if PDTCRYPT_VERBOSE is True:
1550 noise ("PDT: %d finalize" % tell (ins))
1553 except InvalidGCMTag as exn:
1554 raise DecryptionError ("error finalizing object %d (%d B): "
1555 "%r" % (total_obj, len (pt), exn)) \
1558 if PDTCRYPT_VERBOSE is True:
1559 noise ("PDT:\t· object validated")
1561 if PDTCRYPT_VERBOSE is True:
1562 noise ("PDT: %d hdr" % tell (ins))
1564 hdr = hdr_read_stream (ins)
1565 total_read += PDTCRYPT_HDR_SIZE
1566 except EndOfFile as exn:
1567 total_read += exn.remainder
1568 if total_ct + total_obj * PDTCRYPT_HDR_SIZE != total_read:
1569 raise PDTDecryptionError ("ciphertext processed (%d B) plus "
1570 "overhead (%d × %d B) does not match "
1571 "the number of bytes read (%d )"
1572 % (total_ct, total_obj, PDTCRYPT_HDR_SIZE,
1574 # the single good exit
1575 return total_read, total_obj, total_ct, total_pt
1576 except InvalidHeader as exn:
            raise PDTDecryptionError ("invalid header at position %d in %r "
                                      "(%s)" % (tell (ins), ins, exn))
1579 if PDTCRYPT_VERBOSE is True:
1580 pretty = hdr_fmt_pretty (hdr)
1581 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1582 pretty.splitlines (), ""))
1583 ctcurrent = ctleft = hdr ["ctsize"]
1587 total_obj += 1 # used in file counter with split mode
1589 # finalization complete or skipped in case of first object in
1590 # stream; create a new output file if necessary
1591 outfile = nextout (outfile)
1593 if PDTCRYPT_VERBOSE is True:
1594 noise ("PDT: %d decrypt obj no. %d, %d B"
1595 % (tell (ins), total_obj, ctleft))
1597 # always allocate a new buffer since python-cryptography doesn’t allow
1598 # passing a bytearray :/
1599 nexpect = min (ctleft, PDTCRYPT_BLOCKSIZE)
1600 if PDTCRYPT_VERBOSE is True:
1601 noise ("PDT:\t· [%d] %d%% done, read block (%d B of %d B remaining)"
1603 100 - ctleft * 100 / (ctcurrent > 0 and ctcurrent or 1),
1605 ct = ins.read (nexpect)
1609 raise EndOfFile (nct,
1610 "hit EOF after %d of %d B in block [%d:%d); "
1611 "%d B ciphertext remaining for object no %d"
1612 % (nct, nexpect, off, off + nexpect, ctleft,
1618 if PDTCRYPT_VERBOSE is True:
1619 noise ("PDT:\t· decrypt ciphertext %d B" % (nct))
1620 pt = decr.process (ct)
1624 def deptdcrypt_mk_stream (kind, path):
1625 """Create stream from file or stdio descriptor."""
1626 if kind == PDTCRYPT_SINK:
1628 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: stdout")
1629 return sys.stdout.buffer
1631 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: file %s" % path)
1632 return io.FileIO (path, "w")
1633 if kind == PDTCRYPT_SOURCE:
1635 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: stdin")
1636 return sys.stdin.buffer
1638 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: file %s" % path)
1639 return io.FileIO (path, "r")
1641 raise ValueError ("bogus stream “%s” / %s" % (kind, path))
1644 def mode_depdtcrypt (mode, secret, ins, outs):
1646 total_read, total_obj, total_ct, total_pt = \
1647 depdtcrypt (mode, secret, ins, outs)
1648 except DecryptionError as exn:
1649 noise ("PDT: Decryption failed:")
1651 noise ("PDT: “%s”" % exn)
1653 noise ("PDT: Did you specify the correct key / password?")
1656 except PDTSplitError as exn:
1657 noise ("PDT: Split operation failed:")
1659 noise ("PDT: “%s”" % exn)
1661 noise ("PDT: Hint: target directory should be empty.")
1665 if PDTCRYPT_VERBOSE is True:
1666 noise ("PDT: decryption successful" )
1667 noise ("PDT: %.10d bytes read" % total_read)
1668 noise ("PDT: %.10d objects decrypted" % total_obj )
1669 noise ("PDT: %.10d bytes ciphertext" % total_ct )
1670 noise ("PDT: %.10d bytes plaintext" % total_pt )
1676 def mode_scrypt (pw, ins=None, nacl=None, fmt=PDTCRYPT_SCRYPT_INTRANATOR):
1678 paramversion = PDTCRYPT_DEFAULT_PVER
1680 hsh, nacl, version, paramversion = scrypt_hashsource (pw, ins)
1681 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1683 nacl = binascii.unhexlify (nacl)
1684 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1685 version = PDTCRYPT_DEFAULT_VER
1687 kdfname, params = defs ["kdf"]
1689 kdf = kdf_by_version (None, defs)
1690 hsh, _void = kdf (pw, nacl)
1694 if fmt == PDTCRYPT_SCRYPT_INTRANATOR:
1695 out = json.dumps ({ "salt" : base64.b64encode (nacl).decode ()
1696 , "key" : base64.b64encode (hsh) .decode ()
1697 , "paramversion" : paramversion })
1698 elif fmt == PDTCRYPT_SCRYPT_PARAMETERS:
1699 out = json.dumps ({ "salt" : binascii.hexlify (nacl).decode ()
1700 , "key" : binascii.hexlify (hsh) .decode ()
1701 , "version" : version
1702 , "scrypt_params" : { "N" : params ["N"]
1703 , "r" : params ["r"]
1704 , "p" : params ["p"]
1705 , "dkLen" : params ["dkLen"] } })
1707 raise RuntimeError ("bad scrypt output scheme %r" % fmt)
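# Illustration only, not part of the module: a minimal sketch of the two JSON
# output schemes produced by ``mode_scrypt`` above, one with base64 encoded
# salt and key, the other hex encoded with the scrypt parameters attached.
# It uses ``hashlib.scrypt`` from the standard library instead of the module's
# own KDF lookup; the function name and the small default cost ``n=1024`` are
# assumptions chosen to keep the example cheap, not the parameters of any
# actual parameter version.

```python
import base64
import binascii
import hashlib
import json

def sketch_scrypt_json (pw, salt, hexfmt=False, n=1024, r=8, p=1, dklen=16):
    """Derive a key from ``pw`` and ``salt``; serialize the result as JSON."""
    key = hashlib.scrypt (pw, salt=salt, n=n, r=r, p=p, dklen=dklen)
    if hexfmt is False: # compact scheme: base64, no parameters
        return json.dumps ({ "salt" : base64.b64encode (salt).decode ()
                           , "key"  : base64.b64encode (key) .decode () })
    # verbose scheme: hex encoding plus the scrypt parameters used
    return json.dumps ({ "salt" : binascii.hexlify (salt).decode ()
                       , "key"  : binascii.hexlify (key) .decode ()
                       , "scrypt_params" : { "N" : n, "r" : r, "p" : p
                                           , "dkLen" : dklen } })
```

# Both schemes carry the same key material; only the encoding and the amount
# of metadata differ.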
1712 def noise_output_candidates (cands, indent=8, cols=PDTCRYPT_TT_COLUMNS):
1714 Print a list of offsets without garbling the terminal too much.
1716 The indent is counted from column zero; if it is wide enough, the “PDT: ”
1717 marker will be prepended, considered part of the indentation.
1721 idt = " " * indent if indent < 5 else "PDT: " + " " * (indent - 5)
1726 init = True # prevent leading separator
1729 raise ValueError ("the requested indentation exceeds the line "
1730 "width by %d" % (indent - wd))
1740 if lpos > wd: # line break
1756 def mode_scan (secret, fname, outs=None, nacl=None):
1758 Dissect a binary file, looking for PDTCRYPT headers and objects.
1760 If *outs* is supplied, recoverable data will be dumped into the specified
1764 ifd = os.open (fname, os.O_RDONLY)
1765 except FileNotFoundError:
1766 noise ("PDT: failed to open %s readonly" % fname)
1771 if PDTCRYPT_VERBOSE is True:
1772 noise ("PDT: scan for potential sync points")
1773 cands = locate_hdr_candidates (ifd)
1774 if len (cands) == 0:
1775 noise ("PDT: scan complete: input does not contain potential PDT "
1776 "headers; giving up.")
1778 if PDTCRYPT_VERBOSE is True:
1779 noise ("PDT: scan complete: found %d candidates:" % len (cands))
1780 noise_output_candidates (cands)
1790 vdt, hdr = inspect_hdr (ifd, cand)
1791 if vdt == HDR_CAND_JUNK:
1794 off0 = cand + PDTCRYPT_HDR_SIZE
1795 if PDTCRYPT_VERBOSE is True:
1796 noise ("PDT: obj %d: read payload @%d" % (nobj, off0))
1797 pretty = hdr_fmt_pretty (hdr)
1798 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1799 pretty.splitlines (), ""))
1802 if outs is not None:
1803 ofname = PDTCRYPT_RESCUENAME % nobj
1804 ofd = open2_dump_file (ofname, outs, force=PDTCRYPT_OVERWRITE)
1807 ok = try_decrypt (ifd, off0, hdr, secret, ofd=ofd) == hdr ["ctsize"]
1811 if vdt == HDR_CAND_GOOD and ok is True:
1812 noise ("PDT: %d → ✓ valid object %d–%d"
1813 % (cand, off0, off0 + hdr ["ctsize"]))
1814 elif vdt == HDR_CAND_FISHY and ok is True:
1815 noise ("PDT: %d → × object %d–%d, corrupt header"
1816 % (cand, off0, off0 + hdr ["ctsize"]))
1817 elif vdt == HDR_CAND_GOOD and ok is False:
1818 noise ("PDT: %d → × object %d–%d, problematic payload"
1819 % (cand, off0, off0 + hdr ["ctsize"]))
1820 elif vdt == HDR_CAND_FISHY and ok is False:
1821 noise ("PDT: %d → × object %d–%d, corrupt header, problematic "
1822 "ciphertext" % (cand, off0, off0 + hdr ["ctsize"]))
1829 noise ("PDT: all headers ok")
1831 noise ("PDT: %d candidates not parseable as headers:" % len (junk))
1832 noise_output_candidates (junk)
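# Illustration only, not part of the module: a minimal sketch of the first
# stage of a scan as described for ``mode_scan`` above, i.e. collecting every
# offset at which a header might start. It works on an in-memory buffer and
# searches for a magic byte string; the magic value ``b"PDTCRYPT"`` and the
# function name are assumptions made for this example and need not match what
# ``locate_hdr_candidates`` does. Each candidate would still have to be
# inspected, since the magic can also occur by chance inside ciphertext.

```python
def sketch_locate_candidates (buf, magic=b"PDTCRYPT"):
    """Return the offsets of all occurrences of ``magic`` in ``buf``."""
    cands = []
    pos = buf.find (magic)
    while pos != -1:
        cands.append (pos)
        # continue one byte further so overlapping matches are not skipped
        pos = buf.find (magic, pos + 1)
    return cands
```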
1834 def usage (err=False):
1838 indent = ' ' * len (SELF)
1839 out ("usage: %s SUBCOMMAND { --help" % SELF)
1840 out (" %s | [ -v ] { -p PASSWORD | -k KEY }" % indent)
1841 out (" %s [ { -i | --in } { - | SOURCE } ]" % indent)
1842 out (" %s [ { -n | --nacl } { SALT } ]" % indent)
1843 out (" %s [ { -o | --out } { - | DESTINATION } ]" % indent)
1844 out (" %s [ -D | --no-decrypt ] [ -S | --split ]" % indent)
1845 out (" %s [ -f | --format ]" % indent)
    out ("\t\tSUBCOMMAND main mode: { process | scrypt | scan }")
    out ("\t\t process: extract objects from PDT archive")
    out ("\t\t scrypt: calculate hash from password and first object")
    out ("\t\t scan: look for PDTCRYPT headers and objects in a file")
1852 out ("\t\t-p PASSWORD password to derive the encryption key from")
1853 out ("\t\t-k KEY encryption key as 16 bytes in hexadecimal notation")
1854 out ("\t\t-s enforce strict handling of initialization vectors")
1855 out ("\t\t-i SOURCE file name to read from")
1856 out ("\t\t-o DESTINATION file to write output to")
1857 out ("\t\t-n SALT provide salt for scrypt mode in hex encoding")
1858 out ("\t\t-v print extra info")
1859 out ("\t\t-S split into files at object boundaries; this")
1860 out ("\t\t requires DESTINATION to refer to directory")
1861 out ("\t\t-D PDT header and ciphertext passthrough")
    out ("\t\t-f format of SCRYPT hash output (“i2n” or “params”)")
    out ("\tinstead of filenames, “-” may be used to specify stdin / stdout")
1866 sys.exit ((err is True) and 42 or 0)
1876 def parse_argv (argv):
1878 mode = PDTCRYPT_DECRYPT
1884 scrypt_format = PDTCRYPT_SCRYPT_DEFAULT
1887 SELF = os.path.basename (next (argvi))
1890 rawsubcmd = next (argvi)
1891 subcommand = PDTCRYPT_SUB [rawsubcmd]
1892 except StopIteration:
1893 bail ("ERROR: subcommand required")
1895 bail ("ERROR: invalid subcommand “%s” specified" % rawsubcmd)
1901 except StopIteration:
1902 bail ("ERROR: argument list incomplete")
1904 def checked_secret (t, arg):
1909 bail ("ERROR: encountered “%s” but secret already given" % arg)
1912 if arg in [ "-h", "--help" ]:
1915 elif arg in [ "-v", "--verbose", "--wtf" ]:
1916 global PDTCRYPT_VERBOSE
1917 PDTCRYPT_VERBOSE = True
1918 elif arg in [ "-i", "--in", "--source" ]:
1919 insspec = checked_arg ()
1920 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt from %s" % insspec)
1921 elif arg in [ "-p", "--password" ]:
1922 arg = checked_arg ()
1923 checked_secret (PDTCRYPT_SECRET_PW, arg)
1924 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with password")
1926 if subcommand == PDTCRYPT_SUB_PROCESS:
1927 if arg in [ "-s", "--strict-ivs" ]:
1928 global PDTCRYPT_STRICTIVS
1929 PDTCRYPT_STRICTIVS = True
1930 elif arg in [ "-o", "--out", "--dest", "--sink" ]:
1931 outsspec = checked_arg ()
1932 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
1933 elif arg in [ "-f", "--force" ]:
1934 global PDTCRYPT_OVERWRITE
1935 PDTCRYPT_OVERWRITE = True
1936 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
1937 elif arg in [ "-S", "--split" ]:
1938 mode |= PDTCRYPT_SPLIT
1939 if PDTCRYPT_VERBOSE is True: noise ("PDT: split files")
1940 elif arg in [ "-D", "--no-decrypt" ]:
1941 mode &= ~PDTCRYPT_DECRYPT
1942 if PDTCRYPT_VERBOSE is True: noise ("PDT: not decrypting")
1943 elif arg in [ "-k", "--key" ]:
1944 arg = checked_arg ()
1945 checked_secret (PDTCRYPT_SECRET_KEY, arg)
1946 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with key")
1948 bail ("ERROR: unexpected positional argument “%s”" % arg)
1949 elif subcommand == PDTCRYPT_SUB_SCRYPT:
1950 if arg in [ "-n", "--nacl", "--salt" ]:
1951 nacl = checked_arg ()
1952 if PDTCRYPT_VERBOSE is True: noise ("PDT: salt key with %s" % nacl)
1953 elif arg in [ "-f", "--format" ]:
1954 arg = checked_arg ()
1956 scrypt_format = PDTCRYPT_SCRYPT_FORMAT [arg]
1958 bail ("ERROR: invalid scrypt output format %s" % arg)
1959 if PDTCRYPT_VERBOSE is True:
1960 noise ("PDT: scrypt output format “%s”" % scrypt_format)
1962 bail ("ERROR: unexpected positional argument “%s”" % arg)
1963 elif subcommand == PDTCRYPT_SUB_SCAN:
1964 if arg in [ "-o", "--out", "--dest", "--sink" ]:
1965 outsspec = checked_arg ()
1966 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
                elif arg in [ "-f", "--force" ]:
                    # already declared ``global`` in the “process” branch
                    # above; repeating the declaration after that assignment
                    # is a syntax error
                    PDTCRYPT_OVERWRITE = True
                    if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
1972 bail ("ERROR: unexpected positional argument “%s”" % arg)
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: no password or key specified, trying $PDTCRYPT_PASSWORD")
1977 epw = os.getenv ("PDTCRYPT_PASSWORD")
1979 checked_secret (PDTCRYPT_SECRET_PW, epw.strip ())
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: no password or key specified, trying $PDTCRYPT_KEY")
1984 ek = os.getenv ("PDTCRYPT_KEY")
1986 checked_secret (PDTCRYPT_SECRET_KEY, ek.strip ())
1989 if subcommand == PDTCRYPT_SUB_SCRYPT:
1990 bail ("ERROR: scrypt hash mode requested but no password given")
1991 elif mode & PDTCRYPT_DECRYPT:
            bail ("ERROR: decryption requested but no password or key given")
1994 if mode & PDTCRYPT_SPLIT and outsspec is None:
1995 bail ("ERROR: split mode is incompatible with stdout sink "
1998 if subcommand == PDTCRYPT_SUB_SCAN and outsspec is None:
1999 pass # no output by default in scan mode
2000 elif mode & PDTCRYPT_SPLIT or subcommand == PDTCRYPT_SUB_SCAN:
2001 # destination must be directory
2003 bail ("ERROR: mode is incompatible with stdout sink")
2006 os.makedirs (outsspec, 0o700)
2007 except FileExistsError:
2008 # if it’s a directory with appropriate perms, everything is
2009 # good; otherwise, below invocation of open(2) will fail
2011 outs = os.open (outsspec, os.O_DIRECTORY, 0o600)
2012 except FileNotFoundError as exn:
2013 bail ("ERROR: cannot create target directory “%s”" % outsspec)
2014 except NotADirectoryError as exn:
2015 bail ("ERROR: target path “%s” is not a directory" % outsspec)
2017 outs = deptdcrypt_mk_stream (PDTCRYPT_SINK, outsspec or "-")
2019 if subcommand == PDTCRYPT_SUB_SCAN:
2021 bail ("ERROR: please supply an input file for scanning")
2023 bail ("ERROR: input must be seekable; please specify a file")
2024 return True, partial (mode_scan, secret, insspec, outs, nacl=nacl)
2026 if subcommand == PDTCRYPT_SUB_SCRYPT:
2027 if secret [0] == PDTCRYPT_SECRET_KEY:
2028 bail ("ERROR: scrypt mode requires a password")
2029 if insspec is not None and nacl is not None \
2030 or insspec is None and nacl is None :
2031 bail ("ERROR: please supply either an input file or "
2036 if insspec is not None or subcommand != PDTCRYPT_SUB_SCRYPT:
2037 ins = deptdcrypt_mk_stream (PDTCRYPT_SOURCE, insspec or "-")
2039 if subcommand == PDTCRYPT_SUB_SCRYPT:
2040 return True, partial (mode_scrypt, secret [1].encode (), ins, nacl,
2043 return True, partial (mode_depdtcrypt, mode, secret, ins, outs)
2047 ok, runner = parse_argv (argv)
2049 if ok is True: return runner ()
2054 if __name__ == "__main__":
2055 sys.exit (main (sys.argv))