6 ===============================================================================
7 crypto -- Encryption Layer for the Deltatar Backup
8 ===============================================================================
12 - AES-GCM for the symmetric encryption;
17 - NIST Recommendation for Block Cipher Modes of Operation: Galois/Counter
19 http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf
22 https://cryptome.org/2014/01/aes-gcm-v1.pdf
24 - Authentication weaknesses in GCM
25 http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf
27 Trouble with python-cryptography packages: authentication tags can only be
28 passed in advance: https://github.com/pyca/cryptography/pull/3421
31 -------------------------------------------------------------------------------
33 Errors fall into roughly three categories:
35 - Cryptographical errors or invalid data.
37 - ``InvalidGCMTag`` (decryption failed on account of an invalid GCM
39 - ``InvalidIVFixedPart`` (IV fixed part of object not found in list),
40 - ``DuplicateIV`` (the IV of an encrypted object already occurred),
41 - ``DecryptionError`` (used in CLI decryption for presenting error
42 conditions to the user).
44 - Incorrect usage of the library.
46 - ``InvalidParameter`` (non-conforming user supplied parameter),
47 - ``InvalidHeader`` (data passed for reading not parsable into header),
48 - ``FormatError`` (cannot handle header or parameter version),
51 - Bad internal state. If one of these is encountered it means that a state
52 was reached that shouldn’t occur during normal processing.
57 Also, ``EndOfFile`` is used as a sentinel to communicate that a stream supplied
58 for reading is exhausted.
60 Initialization Vectors
61 -------------------------------------------------------------------------------
63 Initialization vectors are checked reuse during the lifetime of a decryptor.
64 The fixed counters for metadata files cannot be reused and attempts to do so
65 will cause a DuplicateIV error. This means the length of objects encrypted with
66 a metadata counter is capped at 63 GB.
68 For ordinary, non-metadata payload, there is an optional mode with strict IV
69 checking that causes a crypto context to fail if an IV encountered or created
70 was already used for decrypting or encrypting, respectively, an earlier object.
71 Note that this mode can trigger false positives when decrypting non-linearly,
72 e. g. when traversing the same object multiple times. Since the crypto context
73 has no notion of a position in a PDT encrypted archive, this condition must be
74 sorted out downstream.
77 -------------------------------------------------------------------------------
79 ``crypto.py`` may be invoked as a script for decrypting, validating, and
80 splitting PDT encrypted files. Consult the usage message for details.
84 Decrypt from stdin using the password ‘foo’: ::
86 $ crypto.py process foo -i - -o - <some-file.tar.gz.pdtcrypt >some-file.tar.gz
88 Output verbose information about the encrypted objects in the archive: ::
90 $ crypto.py process foo -v -i some-file.tar.gz.pdtcrypt -o /dev/null
91 PDT: decrypt from some-file.tar.gz.pdtcrypt
92 PDT: decrypt to /dev/null
93 PDT: source: file some-file.tar.gz.pdtcrypt
94 PDT: sink: file /dev/null
96 PDT: · version = 1 : 0100
97 PDT: · paramversion = 1 : 0100
98 PDT: · nacl : d270 b031 00d1 87e2 c946 610d 7b7f 7e5f
99 PDT: · iv : 02ee 3dd7 a963 1eb1 0100 0000
100 PDT: · ctsize = 591 : 4f02 0000 0000 0000
101 PDT: · tag : 5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b
102 PDT: 64 decrypt obj no. 1, 591 B
103 PDT: · [64] 0% done, read block (591 B of 591 B remaining)
104 PDT: · decrypt ciphertext 591 B
105 PDT: · decrypt plaintext 591 B
109 Also, the mode *scrypt* allows deriving encryption keys. To calculate the
110 encryption key from the password ‘foo’ and the salt of the first object in a
111 PDT encrypted file: ::
113 $ crypto.py scrypt foo -i some-file.pdtcrypt
114 {"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}
116 The computed 16 byte key is given in hexadecimal notation in the value to
117 ``hash`` and can be fed into Python’s ``binascii.unhexlify()`` to obtain the
118 corresponding binary representation.
120 Note that in Scrypt hashing mode, no data integrity checks are being performed.
121 If the wrong password is given, a wrong key will be derived. Whether the password
122 was indeed correct can only be determined by decrypting. Note that since PDT
123 archives essentially consist of a stream of independent objects, the salt and
124 other parameters may change. Thus a key derived using above method from the
125 first object doesn’t necessarily apply to any of the subsequent objects.
134 from functools import reduce, partial
144 except ImportError as exn:
147 if __name__ == "__main__": ## Work around the import mechanism’s lest Python’s
148 pwd = os.getcwd() ## preference for local imports causes a cyclical
149 ## import (crypto → pylibscrypt → […] → ./tarfile → crypto).
150 sys.path = [ p for p in sys.path if p.find ("deltatar") < 0 ]
153 from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
154 from cryptography.hazmat.backends import default_backend
158 __all__ = [ "hdr_make", "hdr_read", "hdr_fmt", "hdr_fmt_pretty"
160 , "PDTCRYPT_HDR_SIZE", "AES_GCM_IV_CNT_DATA"
161 , "AES_GCM_IV_CNT_INFOFILE", "AES_GCM_IV_CNT_INDEX"
165 ###############################################################################
167 ###############################################################################
169 class EndOfFile (Exception):
173 def __init__ (self, n=None, msg=None):
179 class InvalidParameter (Exception):
180 """Inputs not valid for PDT encryption."""
184 class InvalidHeader (Exception):
185 """Header not valid."""
189 class InvalidGCMTag (Exception):
191 The GCM tag calculated during decryption differs from that in the object
197 class InvalidIVFixedPart (Exception):
199 IV fixed part not in supplied list: either the backup is corrupt or the
200 current object does not belong to it.
205 class IVFixedPartError (Exception):
207 Error creating a unique IV fixed part: repeated calls to system RNG yielded
208 the same sequence of bytes as the last IV used.
213 class InvalidFileCounter (Exception):
215 When encrypting, an attempted reuse of a dedicated counter (info file,
216 index file) was caught.
221 class DuplicateIV (Exception):
223 During encryption, the current IV fixed part is identical to an already
224 existing IV (same prefix and file counter). This indicates tampering or
225 programmer error and cannot be recovered from.
230 class NonConsecutiveIV (Exception):
232 IVs not numbered consecutively. This is a hard error with strict IV
233 checking. Precludes random access to the encrypted objects.
238 class FormatError (Exception):
239 """Unusable parameters in header."""
243 class DecryptionError (Exception):
244 """Error during decryption with ``crypto.py`` on the command line."""
248 class Unreachable (Exception):
250 Makeshift __builtin_unreachable(); always a programmer error if
256 class InternalError (Exception):
257 """Errors not ascribable to bad user inputs or cryptography."""
261 ###############################################################################
262 ## crypto layer version
263 ###############################################################################
265 ENCRYPTION_PARAMETERS = \
267 { "kdf": ("dummy", 16)
268 , "enc": "passthrough" }
276 , "enc": "aes-gcm" } }
278 ###############################################################################
280 ###############################################################################
282 PDTCRYPT_HDR_MAGIC = b"PDTCRYPT"
284 PDTCRYPT_HDR_SIZE_MAGIC = 8 # 8
285 PDTCRYPT_HDR_SIZE_VERSION = 2 # 10
286 PDTCRYPT_HDR_SIZE_PARAMVERSION = 2 # 12
287 PDTCRYPT_HDR_SIZE_NACL = 16 # 28
288 PDTCRYPT_HDR_SIZE_IV = 12 # 40
289 PDTCRYPT_HDR_SIZE_CTSIZE = 8 # 48
290 PDTCRYPT_HDR_SIZE_TAG = 16 # 64 GCM auth tag
292 PDTCRYPT_HDR_SIZE = PDTCRYPT_HDR_SIZE_MAGIC + PDTCRYPT_HDR_SIZE_VERSION \
293 + PDTCRYPT_HDR_SIZE_PARAMVERSION + PDTCRYPT_HDR_SIZE_NACL \
294 + PDTCRYPT_HDR_SIZE_IV + PDTCRYPT_HDR_SIZE_CTSIZE \
295 + PDTCRYPT_HDR_SIZE_TAG # = 64
297 # precalculate offsets since Python can’t do constant folding over names
298 HDR_OFF_VERSION = PDTCRYPT_HDR_SIZE_MAGIC
299 HDR_OFF_PARAMVERSION = HDR_OFF_VERSION + PDTCRYPT_HDR_SIZE_VERSION
300 HDR_OFF_NACL = HDR_OFF_PARAMVERSION + PDTCRYPT_HDR_SIZE_PARAMVERSION
301 HDR_OFF_IV = HDR_OFF_NACL + PDTCRYPT_HDR_SIZE_NACL
302 HDR_OFF_CTSIZE = HDR_OFF_IV + PDTCRYPT_HDR_SIZE_IV
303 HDR_OFF_TAG = HDR_OFF_CTSIZE + PDTCRYPT_HDR_SIZE_CTSIZE
307 FMT_I2N_IV = "<8sL" # 8 random bytes ‖ 32 bit counter
308 FMT_I2N_HDR = ("<" # host byte order
312 "16s" # sodium chloride
318 AES_KEY_SIZE = 16 # b"0123456789abcdef"
319 AES_KEY_SIZE_B64 = 24 # b'MDEyMzQ1Njc4OWFiY2RlZg=='
320 AES_GCM_MAX_SIZE = (1 << 36) - (1 << 5) # 2^39 - 2^8 b ≅ 64 GB
321 PDTCRYPT_MAX_OBJ_SIZE_DEFAULT = 63 * (1 << 30) # 63 GB
322 PDTCRYPT_MAX_OBJ_SIZE = PDTCRYPT_MAX_OBJ_SIZE_DEFAULT
324 # index and info files are written on-the fly while encrypting so their
325 # counters must be available inadvance
326 AES_GCM_IV_CNT_INFOFILE = 1 # constant
327 AES_GCM_IV_CNT_INDEX = AES_GCM_IV_CNT_INFOFILE + 1
328 AES_GCM_IV_CNT_DATA = AES_GCM_IV_CNT_INDEX + 1 # also for multivolume
329 AES_GCM_IV_CNT_MAX_DEFAULT = 0xffFFffFF
330 AES_GCM_IV_CNT_MAX = AES_GCM_IV_CNT_MAX_DEFAULT
332 # IV structure and generation
333 PDTCRYPT_IV_GEN_MAX_RETRIES = 10 # ×
334 PDTCRYPT_IV_FIXEDPART_SIZE = 8 # B
335 PDTCRYPT_IV_COUNTER_SIZE = 4 # B
337 # secret type: PW of string | KEY of char [16]
338 PDTCRYPT_SECRET_PW = 0
339 PDTCRYPT_SECRET_KEY = 1
341 ###############################################################################
343 ###############################################################################
349 # , paramversion : u16
355 # fn hdr_read (f : handle) -> hdrinfo;
356 # fn hdr_make (f : handle, h : hdrinfo) -> IOResult<usize>;
357 # fn hdr_fmt (h : hdrinfo) -> String;
362 Read bytes as header structure.
364 If the input could not be interpreted as a header, fail with
369 mag, version, paramversion, nacl, iv, ctsize, tag = \
370 struct.unpack (FMT_I2N_HDR, data)
371 except Exception as exn:
372 raise InvalidHeader ("error unpacking header from [%r]: %s"
373 % (binascii.hexlify (data), str (exn)))
375 if mag != PDTCRYPT_HDR_MAGIC:
376 raise InvalidHeader ("bad magic in header: expected [%s], got [%s]"
377 % (PDTCRYPT_HDR_MAGIC, mag))
380 { "version" : version
381 , "paramversion" : paramversion
389 def hdr_read_stream (instr):
391 Read header from stream at the current position.
393 Fail with ``InvalidHeader`` if insufficient bytes were read from the
394 stream, or if the content could not be interpreted as a header.
396 data = instr.read(PDTCRYPT_HDR_SIZE)
400 elif ldata != PDTCRYPT_HDR_SIZE:
401 raise InvalidHeader ("hdr_read_stream: expected %d B, received %d B"
402 % (PDTCRYPT_HDR_SIZE, ldata))
403 return hdr_read (data)
406 def hdr_from_params (version, paramversion, nacl, iv, ctsize, tag):
408 Assemble the necessary values into a PDTCRYPT header.
410 :type version: int to fit uint16_t
411 :type paramversion: int to fit uint16_t
412 :type nacl: bytes to fit uint8_t[16]
413 :type iv: bytes to fit uint8_t[12]
414 :type size: int to fit uint64_t
415 :type tag: bytes to fit uint8_t[16]
417 buf = bytearray (PDTCRYPT_HDR_SIZE)
418 bufv = memoryview (buf)
421 struct.pack_into (FMT_I2N_HDR, bufv, 0,
423 version, paramversion, nacl, iv, ctsize, tag)
424 except Exception as exn:
425 return False, "error assembling header: %s" % str (exn)
427 return True, bytes (buf)
430 def hdr_make_dummy (s):
432 Create a header sized block of bytes initialized to a value derived from a
433 string. Used to verify we’ve jumped back correctly to the actual position
434 of the object header.
436 c = reduce (lambda a, c: a + ord(c), s, 0) % 0xFF
437 return bytes (bytearray (struct.pack ("B", c)) * PDTCRYPT_HDR_SIZE)
442 Assemble a header from the given header structure.
444 return hdr_from_params (version=hdr.get("version"),
445 paramversion=hdr.get("paramversion"),
446 nacl=hdr.get("nacl"), iv=hdr.get("iv"),
447 ctsize=hdr.get("ctsize"), tag=hdr.get("tag"))
450 HDR_FMT = "I2n_header { version: %d, paramversion: %d, nacl: %s[%d]," \
451 " iv: %s[%d], ctsize: %d, tag: %s[%d] }"
454 """Format a header structure into readable output."""
455 return HDR_FMT % (h["version"], h["paramversion"],
456 binascii.hexlify (h["nacl"]), len(h["nacl"]),
457 binascii.hexlify (h["iv"]), len(h["iv"]),
459 binascii.hexlify (h["tag"]), len(h["tag"]))
462 def hex_spaced_of_bytes (b):
463 """Format bytes object, hexdump style."""
464 return " ".join ([ "%.2x%.2x" % (c1, c2)
465 for c1, c2 in zip (b[0::2], b[1::2]) ]) \
466 + (len (b) | 1 == len (b) and " %.2x" % b[-1] or "") # odd lengths
469 def hdr_iv_counter (h):
470 """Extract the variable part of the IV of the given header."""
471 _fixed, cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
475 def hdr_iv_fixed (h):
476 """Extract the fixed part of the IV of the given header."""
477 fixed, _cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
481 hdr_dump = hex_spaced_of_bytes
485 """version = %-4d : %s
486 paramversion = %-4d : %s
493 def hdr_fmt_pretty (h):
495 Format header structure into multi-line representation of its contents and
496 their raw representation. (Omit the implicit “PDTCRYPT” magic bytes that
497 precede every header.)
499 return HDR_FMT_PRETTY \
501 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["version"])),
503 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["paramversion"])),
504 hex_spaced_of_bytes (h["nacl"]),
505 hex_spaced_of_bytes (h["iv"]),
507 hex_spaced_of_bytes (struct.pack (FMT_UINT64_LE, h["ctsize"])),
508 hex_spaced_of_bytes (h["tag"]))
510 IV_FMT = "((f %s) (c %d))"
513 """Format the two components of an IV in a readable fashion."""
514 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
515 return IV_FMT % (binascii.hexlify (fixed), cnt)
518 ###############################################################################
520 ###############################################################################
522 class Location (object):
526 def restore_loc_fmt (loc):
528 % (loc.n, loc.offset)
530 def locate_hdr_candidates (fd):
532 Walk over instances of the magic string in the payload, collecting their
533 positions. If the offset of the first found instance is not zero, the file
534 begins with leading garbage.
536 :return: The list of offsets in the file.
540 mm = mmap.mmap(fd, 0, mmap.MAP_SHARED, mmap.PROT_READ)
543 pos = mm.find (PDTCRYPT_HDR_MAGIC, pos)
552 HDR_CAND_GOOD = 0 # header marks begin of valid object
553 HDR_CAND_FISHY = 1 # inconclusive (tag mismatch, obj overlap etc.)
554 HDR_CAND_JUNK = 2 # not a header / object unreadable
557 def inspect_hdr (fd, off):
559 Attempt to parse a header in *fd* at position *off*.
561 Returns a verdict about the quality of that header plus the parsed header
565 _ = os.lseek (fd, off, os.SEEK_SET)
567 if os.lseek (fd, 0, os.SEEK_CUR) != off:
568 if PDTCRYPT_VERBOSE is True:
569 noise ("PDT: %d → dismissed (lseek() past EOF)" % off)
570 return HDR_CAND_JUNK, None
572 raw = os.read (fd, PDTCRYPT_HDR_SIZE)
573 if len (raw) != PDTCRYPT_HDR_SIZE:
574 if PDTCRYPT_VERBOSE is True:
575 noise ("PDT: %d → dismissed (EOF inside header)" % off)
576 return HDR_CAND_JUNK, None
580 except InvalidHeader as exn:
581 if PDTCRYPT_VERBOSE is True:
582 noise ("PDT: %d → dismissed (invalid: [%s])" % (off, str (exn)))
583 return HDR_CAND_JUNK, None
585 obj0 = off + PDTCRYPT_HDR_SIZE
586 objX = obj0 + hdr ["ctsize"]
588 eof = os.lseek (fd, 0, os.SEEK_END)
590 if PDTCRYPT_VERBOSE is True:
591 noise ("PDT: %d → EOF inside object (%d≤%d≤%d); adjusting size to "
592 "%d" % (off, obj0, eof, objX, (eof - obj0)))
593 # try reading up to the end
594 hdr ["ctsize"] = eof - obj0
595 return HDR_CAND_FISHY, hdr
597 return HDR_CAND_GOOD, hdr
600 def try_decrypt (ifd, off, hdr, secret, ofd=-1):
602 Attempt to decrypt the object in the (seekable) descriptor *ifd* starting
603 at *off* using the metadata in *hdr* and *secret*. An output fd can be
604 specified with *ofd*; if it is *-1* – the default –, the decrypted payload
607 Always creates a fresh decryptor, so validation steps across objects don’t
610 Errors during GCM tag validation are ignored.
612 ctleft = hdr ["ctsize"]
616 if ks == PDTCRYPT_SECRET_PW:
617 decr = Decrypt (password=secret [1])
618 elif ks == PDTCRYPT_SECRET_KEY:
619 key = binascii.unhexlify (secret [1])
620 decr = Decrypt (key=key)
627 os.lseek (ifd, pos, os.SEEK_SET)
629 cnksiz = min (ctleft, PDTCRYPT_BLOCKSIZE)
630 cnk = os.read (ifd, cnksiz)
633 pt = decr.process (cnk)
638 except InvalidGCMTag:
639 noise ("PDT: GCM tag mismatch for object %d–%d"
640 % (off, off + hdr ["ctsize"]))
641 if len (pt) > 0 and ofd != -1:
644 except Exception as exn:
645 noise ("PDT: error decrypting object %d–%d@%d, %d B remaining [%s]"
646 % (off, off + hdr ["ctsize"], pos, ctleft, exn))
652 def readable_objects_offsets (ifd, secret, cands):
654 From a list of candidates, locate the ones that mark the start of actual
655 readable PDTCRYPT objects.
661 vdt, hdr = inspect_hdr (ifd, cand)
662 if vdt == HDR_CAND_JUNK:
663 pass # ignore unreadable ones
664 elif vdt in [HDR_CAND_GOOD, HDR_CAND_FISHY]:
665 off0 = cand + PDTCRYPT_HDR_SIZE
666 ok = try_decrypt (ifd, off0, hdr, secret) == hdr ["ctsize"]
672 def reconstruct_offsets (fname, secret):
673 ifd = os.open (fname, os.O_RDONLY)
676 cands = locate_hdr_candidates (ifd)
677 return readable_objects_offsets (ifd, secret, cands)
682 ###############################################################################
684 ###############################################################################
686 def make_secret (password=None, key=None):
688 Safely create a “secret” value that consists either of a key or a password.
689 Inputs are validated: the password is accepted as (UTF-8 encoded) bytes or
690 string; for the key only a bytes object of the proper size or a base64
691 encoded string thereof is accepted.
693 If both are provided, the key is preferred over the password; no checks are
694 performed whether the key is derived from the password.
696 :returns: secret value if inputs were acceptable | None otherwise.
699 if isinstance (key, str) is True:
700 key = key.encode ("utf-8")
701 if isinstance (key, bytes) is True:
702 if len (key) == AES_KEY_SIZE:
703 return (PDTCRYPT_SECRET_KEY, key)
704 if len (key) == AES_KEY_SIZE_B64:
706 key = base64.b64decode (key)
707 # the base64 processor is very tolerant and allows for
708 # arbitrary traling and leading data thus the data obtained
709 # must be checked for the proper length
710 if len (key) == AES_KEY_SIZE:
711 return (PDTCRYPT_SECRET_KEY, key)
712 except binascii.Error: # “incorrect padding”
714 elif password is not None:
715 if isinstance (password, str) is True:
716 return (PDTCRYPT_SECRET_PW, password)
717 elif isinstance (password, bytes) is True:
719 password = password.decode ("utf-8")
720 return (PDTCRYPT_SECRET_PW, password)
721 except UnicodeDecodeError:
727 ###############################################################################
728 ## passthrough / null encryption
729 ###############################################################################
731 class PassthroughCipher (object):
733 tag = struct.pack ("<QQ", 0, 0)
735 def __init__ (self) : pass
737 def update (self, b) : return b
739 def finalize (self) : return b""
741 def finalize_with_tag (self, _) : return b""
743 ###############################################################################
744 ## convenience wrapper
745 ###############################################################################
748 def kdf_dummy (klen, password, _nacl):
750 Fake KDF for testing purposes that is called when parameter version zero is
753 q, r = divmod (klen, len (password))
754 if isinstance (password, bytes) is False:
755 password = password.encode ()
756 return password * q + password [:r], b""
759 SCRYPT_KEY_MEMO = { } # static because needed for both the info file and the archive
762 def kdf_scrypt (params, password, nacl):
764 Wrapper for the Scrypt KDF, corresponds to parameter version one. The
765 computation result is memoized based on the inputs to facilitate spawning
766 multiple encryption contexts.
771 dkLen = params["dkLen"]
774 nacl = os.urandom (params["NaCl_LEN"])
776 key_parms = (password, nacl, N, r, p, dkLen)
777 global SCRYPT_KEY_MEMO
778 if key_parms not in SCRYPT_KEY_MEMO:
779 SCRYPT_KEY_MEMO [key_parms] = \
780 pylibscrypt.scrypt (password, nacl, N, r, p, dkLen)
781 return SCRYPT_KEY_MEMO [key_parms], nacl
784 def kdf_by_version (paramversion=None, defs=None):
786 Pick the KDF handler corresponding to the parameter version or the
789 :rtype: function (password : str, nacl : str) -> str
791 if paramversion is not None:
792 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
794 raise InvalidParameter ("no encryption parameters for version %r"
796 (kdf, params) = defs["kdf"]
798 if kdf == "scrypt" : fn = kdf_scrypt
799 if kdf == "dummy" : fn = kdf_dummy
801 raise ValueError ("key derivation method %r unknown" % kdf)
802 return partial (fn, params)
805 ###############################################################################
807 ###############################################################################
809 def scrypt_hashsource (pw, ins):
811 Calculate the SCRYPT hash from the password and the information contained
812 in the first header found in ``ins``.
814 This does not validate whether the first object is encrypted correctly.
816 if isinstance (pw, str) is True:
818 elif isinstance (pw, bytes) is False:
819 raise InvalidParameter ("password must be a string, not %s"
821 if isinstance (ins, io.BufferedReader) is False and \
822 isinstance (ins, io.FileIO) is False:
823 raise InvalidParameter ("file to hash must be opened in “binary” mode")
826 hdr = hdr_read_stream (ins)
827 except EndOfFile as exn:
828 noise ("PDT: malformed input: end of file reading first object header")
833 pver = hdr ["paramversion"]
834 if PDTCRYPT_VERBOSE is True:
835 noise ("PDT: salt of first object : %s" % binascii.hexlify (nacl))
836 noise ("PDT: parameter version of archive : %d" % pver)
839 defs = ENCRYPTION_PARAMETERS.get(pver, None)
840 kdfname, params = defs ["kdf"]
841 if kdfname != "scrypt":
842 noise ("PDT: input is not an SCRYPT archive")
845 kdf = kdf_by_version (None, defs)
846 except ValueError as exn:
847 noise ("PDT: object has unknown parameter version %d" % pver)
849 hsh, _void = kdf (pw, nacl)
851 return hsh, nacl, hdr ["version"], pver
854 def scrypt_hashfile (pw, fname):
856 Calculate the SCRYPT hash from the password and the information contained
857 in the first header found in the given file. The header is read only at
860 with deptdcrypt_mk_stream (PDTCRYPT_SOURCE, fname or "-") as ins:
861 hsh, _void, _void, _void = scrypt_hashsource (pw, ins)
865 ###############################################################################
867 ###############################################################################
869 class Crypto (object):
871 Encryption context to remain alive throughout an entire tarfile pass.
876 cnt = None # file counter (uint32_t != 0)
877 iv = None # current IV
878 fixed = None # accu for 64 bit fixed parts of IV
879 used_ivs = None # tracks IVs
880 strict_ivs = False # if True, panic on duplicate object IV
889 info_counter_used = False
890 index_counter_used = False
892 def __init__ (self, *al, **akv):
893 self.used_ivs = set ()
894 self.set_parameters (*al, **akv)
897 def next_fixed (self):
902 def set_object_counter (self, cnt=None):
904 Safely set the internal counter of encrypted objects. Numerous
907 The same counter may not be reused in combination with one IV fixed
908 part. This is validated elsewhere in the IV handling.
910 Counter zero is invalid. The first two counters are reserved for
911 metadata. The implementation does not allow for splitting metadata
912 files over multiple encrypted objects. (This would be possible by
913 assigning new fixed parts.) Thus in a Deltatar backup there is at most
914 one object with a counter value of one and two. On creation of a
915 context, the initial counter may be chosen. The globals
916 ``AES_GCM_IV_CNT_INFOFILE`` and ``AES_GCM_IV_CNT_INDEX`` can be used to
917 request one of the reserved values. If one of these values has been
918 used, any further attempt of setting the counter to that value will
919 be rejected with an ``InvalidFileCounter`` exception.
921 Out of bounds values (i. e. below one and more than the maximum of 2³²)
922 cause an ``InvalidParameter`` exception to be thrown.
925 self.cnt = AES_GCM_IV_CNT_DATA
927 if cnt == 0 or cnt > AES_GCM_IV_CNT_MAX + 1:
928 raise InvalidParameter ("invalid counter value %d requested: "
929 "acceptable values are from 1 to %d"
930 % (cnt, AES_GCM_IV_CNT_MAX))
931 if cnt == AES_GCM_IV_CNT_INFOFILE:
932 if self.info_counter_used is True:
933 raise InvalidFileCounter ("attempted to reuse info file "
934 "counter %d: must be unique" % cnt)
935 self.info_counter_used = True
936 elif cnt == AES_GCM_IV_CNT_INDEX:
937 if self.index_counter_used is True:
938 raise InvalidFileCounter ("attempted to reuse index file "
939 " counter %d: must be unique" % cnt)
940 self.index_counter_used = True
941 if cnt <= AES_GCM_IV_CNT_MAX:
944 # cnt == AES_GCM_IV_CNT_MAX + 1 → wrap
945 self.cnt = AES_GCM_IV_CNT_DATA
949 def set_parameters (self, password=None, key=None, paramversion=None,
950 nacl=None, counter=None, strict_ivs=False):
952 Configure the internal state of a crypto context. Not intended for
956 self.set_object_counter (counter)
957 self.strict_ivs = strict_ivs
959 if paramversion is not None:
960 self.paramversion = paramversion
963 self.key, self.nacl = key, nacl
966 if password is not None:
967 if isinstance (password, bytes) is False:
968 password = str.encode (password)
969 self.password = password
970 if paramversion is None and nacl is None:
971 # postpone key setup until first header is available
973 kdf = kdf_by_version (paramversion)
975 self.key, self.nacl = kdf (password, nacl)
978 def process (self, buf):
980 Encrypt / decrypt a buffer. Invokes the ``.update()`` method on the
981 wrapped encryptor or decryptor, respectively.
983 The Cryptography exception ``AlreadyFinalized`` is translated to an
984 ``InternalError`` at this point. It may occur in sound code when the GC
985 closes an encrypting stream after an error. Everywhere else it must be
989 raise RuntimeError ("process: context not initialized")
990 self.stats ["in"] += len (buf)
992 out = self.enc.update (buf)
993 except cryptography.exceptions.AlreadyFinalized as exn:
994 raise InternalError (exn)
995 self.stats ["out"] += len (out)
999 def next (self, password, paramversion, nacl, iv):
1001 Prepare for encrypting another object: Reset the data counters and
1002 change the configuration in case one of the variable parameters differs
1003 from the last object. Also check the IV for duplicates and error out
1004 if strict checking was requested.
1008 self.stats ["obj"] += 1
1010 self.check_duplicate_iv (iv)
1012 if ( self.paramversion != paramversion
1013 or self.password != password
1014 or self.nacl != nacl):
1015 self.set_parameters (password=password, paramversion=paramversion,
1016 nacl=nacl, strict_ivs=self.strict_ivs)
1019 def check_duplicate_iv (self, iv):
1021 Add an IV (the 12 byte representation as in the header) to the list. With
1022 strict checking enabled, this will throw a ``DuplicateIV``. Depending on
1023 the context, this may indicate a serious error (IV reuse).
1025 if self.strict_ivs is True and iv in self.used_ivs:
1026 raise DuplicateIV ("iv %s was reused" % iv_fmt (iv))
1027 # vi has not been used before; add to collection
1028 self.used_ivs.add (iv)
1031 def counters (self):
1033 Access the data counters.
1035 return self.stats ["obj"], self.stats ["in"], self.stats ["out"]
1040 Clear the current context regardless of its finalization state. The
1041 next operation must be ``.next()``.
1046 class Encrypt (Crypto):
1052 def __init__ (self, version, paramversion, password=None, key=None, nacl=None,
1053 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
1055 The ctor will throw immediately if one of the parameters does not conform
1056 to our expectations.
1058 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
1059 :type version: int to fit uint16_t
1060 :type paramversion: int to fit uint16_t
1061 :param password: mutually exclusive with ``key``
1062 :type password: bytes
1063 :param key: mutually exclusive with ``password``
1066 :type counter: initial object counter the values
1067 ``AES_GCM_IV_CNT_INFOFILE`` and
1068 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
1069 and cannot be reused even with different fixed parts.
1070 :type strict_ivs: bool
1072 if password is None and key is None \
1073 or password is not None and key is not None :
1074 raise InvalidParameter ("__init__: need either key or password")
1077 if isinstance (key, bytes) is False:
1078 raise InvalidParameter ("__init__: key must be provided as "
1079 "bytes, not %s" % type (key))
1081 raise InvalidParameter ("__init__: salt must be provided along "
1082 "with encryption key")
1083 else: # password, no key
1084 if isinstance (password, str) is False:
1085 raise InvalidParameter ("__init__: password must be a string, not %s"
1087 if len (password) == 0:
1088 raise InvalidParameter ("__init__: supplied empty password but not "
1089 "permitted for PDT encrypted files")
1091 if isinstance (version, int) is False:
1092 raise InvalidParameter ("__init__: version number must be an "
1093 "integer, not %s" % type (version))
1095 raise InvalidParameter ("__init__: version number must be a "
1096 "nonnegative integer, not %d" % version)
1098 if isinstance (paramversion, int) is False:
1099 raise InvalidParameter ("__init__: crypto parameter version number "
1100 "must be an integer, not %s"
1101 % type (paramversion))
1102 if paramversion < 0:
1103 raise InvalidParameter ("__init__: crypto parameter version number "
1104 "must be a nonnegative integer, not %d"
1107 if nacl is not None:
1108 if isinstance (nacl, bytes) is False:
1109 raise InvalidParameter ("__init__: salt given, but of type %s "
1110 "instead of bytes" % type (nacl))
1111 # salt length would depend on the actual encryption so it can’t be
1112 # validated at this point
1114 self.version = version
1115 self.paramenc = ENCRYPTION_PARAMETERS.get (paramversion) ["enc"]
1117 super().__init__ (password, key, paramversion, nacl, counter=counter,
1118 strict_ivs=strict_ivs)
1121 def next_fixed (self, retries=PDTCRYPT_IV_GEN_MAX_RETRIES):
1123 Generate the next IV fixed part by reading eight bytes from
1124 ``/dev/urandom``. The buffer so obtained is tested against the fixed
1125 parts used so far to prevent accidental reuse of IVs. After a
1126 configurable number of attempts to create a unique fixed part, it will
1127 refuse to continue with an ``IVFixedPartError``. This is unlikely to
1128 ever happen on a normal system but may detect an issue with the random
1131 The list of fixed parts that were used by the context at hand can be
1132 accessed through the ``.fixed`` list. Its last element is the fixed
1133 part currently in use.
1137 fp = os.urandom (PDTCRYPT_IV_FIXEDPART_SIZE)
1138 if fp not in self.fixed:
1139 self.fixed.append (fp)
1142 raise IVFixedPartError ("error obtaining a unique IV fixed part from "
1143 "/dev/urandom; giving up after %d tries" % i)
1148 Construct a 12-bytes IV from the current fixed part and the object
1151 return struct.pack(FMT_I2N_IV, self.fixed [-1], self.cnt)
1154 def next (self, filename=None, counter=None):
1156 Prepare for encrypting the next incoming object. Update the counter
1157 and put together the IV, possibly changing prefixes. Then create the
1160 The argument ``counter`` can be used to specify a file counter for this
1161 object. Unless it is one of the reserved values, the counter of
1162 subsequent objects will be computed from this one.
1164 If this is the first object in a series, ``filename`` is required,
1165 otherwise it is reused if not present. The value is used to derive a
1166 header sized placeholder to use until after encryption when all the
1167 inputs to construct the final header are available. This is then
1168 matched in ``.done()`` against the value found at the position of the
1169 header. The motivation for this extra check is primarily to assist
1170 format debugging: It makes stray headers easy to spot in malformed
1173 if filename is None:
1174 if self.lastinfo is None:
1175 raise InvalidParameter ("next: filename is mandatory for "
1177 filename, _dummy = self.lastinfo
1179 if isinstance (filename, str) is False:
1180 raise InvalidParameter ("next: filename must be a string, no %s"
1182 if counter is not None:
1183 if isinstance (counter, int) is False:
1184 raise InvalidParameter ("next: the supplied counter is of "
1185 "invalid type %s; please pass an "
1186 "integer instead" % type (counter))
1187 self.set_object_counter (counter)
1189 self.iv = self.iv_make ()
1190 if self.paramenc == "aes-gcm":
1192 ( algorithms.AES (self.key)
1193 , modes.GCM (self.iv)
1194 , backend = default_backend ()) \
1196 elif self.paramenc == "passthrough":
1197 self.enc = PassthroughCipher ()
1199 raise InvalidParameter ("next: parameter version %d not known"
1200 % self.paramversion)
1201 hdrdum = hdr_make_dummy (filename)
1202 self.lastinfo = (filename, hdrdum)
1203 super().next (self.password, self.paramversion, self.nacl, self.iv)
1205 self.set_object_counter (self.cnt + 1)
1209 def done (self, cmpdata):
1211 Complete encryption of an object. After this has been called, attempts
1212 of encrypting further data will cause an error until ``.next()`` is
1215 Returns a 64 bytes buffer containing the object header including all
1216 values including the “late” ones e. g. the ciphertext size and the
1219 if isinstance (cmpdata, bytes) is False:
1220 raise InvalidParameter ("done: comparison input expected as bytes, "
1221 "not %s" % type (cmpdata))
1222 if self.lastinfo is None:
1223 raise RuntimeError ("done: encryption context not initialized")
1224 filename, hdrdum = self.lastinfo
1225 if cmpdata != hdrdum:
1226 raise RuntimeError ("done: bad sync of header for object %d: "
1227 "preliminary data does not match; this likely "
1228 "indicates a wrongly repositioned stream"
1230 data = self.enc.finalize ()
1231 self.stats ["out"] += len (data)
1232 self.ctsize += len (data)
1233 ok, hdr = hdr_from_params (self.version, self.paramversion, self.nacl,
1234 self.iv, self.ctsize, self.enc.tag)
1236 raise InternalError ("error constructing header: %r" % hdr)
1237 return data, hdr, self.fixed
1240 def process (self, buf):
1242 Encrypt a chunk of plaintext with the active encryptor. Returns the
1243 size of the input consumed. This **must** be checked downstream. If the
1244 maximum possible object size has been reached, the current context must
1245 be finalized and a new one established before any further data can be
1246 encrypted. The second argument is the remainder of the plaintext that
1247 was not encrypted for the caller to use immediately after the new
1250 if isinstance (buf, bytes) is False:
1251 raise InvalidParameter ("process: expected byte buffer, not %s"
1254 newptsize = self.ptsize + bsize
1255 diff = newptsize - PDTCRYPT_MAX_OBJ_SIZE
1258 newptsize = PDTCRYPT_MAX_OBJ_SIZE
1259 self.ptsize = newptsize
1260 data = super().process (buf [:bsize])
1261 self.ctsize += len (data)
1265 class Decrypt (Crypto):
1267 tag = None # GCM tag, part of header
1268 last_iv = None # check consecutive ivs in strict mode
1270 def __init__ (self, password=None, key=None, counter=None, fixedparts=None,
1273 Sanitizing ctor for the decryption context. ``fixedparts`` specifies a
1274 list of IV fixed parts accepted during decryption. If a fixed part is
1275 encountered that is not in the list, decryption will fail.
1277 :param password: mutually exclusive with ``key``
1278 :type password: bytes
1279 :param key: mutually exclusive with ``password``
1281 :type counter: initial object counter the values
1282 ``AES_GCM_IV_CNT_INFOFILE`` and
1283 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
1284 and cannot be reused even with different fixed parts.
1285 :type fixedparts: bytes list
1287 if password is None and key is None \
1288 or password is not None and key is not None :
1289 raise InvalidParameter ("__init__: need either key or password")
1292 if isinstance (key, bytes) is False:
1293 raise InvalidParameter ("__init__: key must be provided as "
1294 "bytes, not %s" % type (key))
1295 else: # password, no key
1296 if isinstance (password, str) is False:
1297 raise InvalidParameter ("__init__: password must be a string, not %s"
1299 if len (password) == 0:
1300 raise InvalidParameter ("__init__: supplied empty password but not "
1301 "permitted for PDT encrypted files")
1303 if fixedparts is not None:
1304 if isinstance (fixedparts, list) is False:
1305 raise InvalidParameter ("__init__: IV fixed parts must be "
1306 "supplied as list, not %s"
1307 % type (fixedparts))
1308 self.fixed = fixedparts
1311 super().__init__ (password=password, key=key, counter=counter,
1312 strict_ivs=strict_ivs)
1315 def valid_fixed_part (self, iv):
1317 Check if a fixed part was already seen.
1319 # check if fixed part is known
1320 fixed, _cnt = struct.unpack (FMT_I2N_IV, iv)
1321 i = bisect.bisect_left (self.fixed, fixed)
1322 return i != len (self.fixed) and self.fixed [i] == fixed
1325 def check_consecutive_iv (self, iv):
1327 Check whether the counter part of the given IV is indeed the successor
1328 of the currently present counter. This should always be the case for
1329 the objects in a well formed PDT archive but should not be enforced
1330 when decrypting out-of-order.
1332 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
1333 if self.strict_ivs is True \
1334 and self.last_iv is not None \
1335 and self.last_iv [0] == fixed \
1336 and self.last_iv [1] != cnt - 1:
1337 raise NonConsecutiveIV ("iv %s counter not successor of "
1338 "last object (expected %d, found %d)"
1339 % (iv_fmt (self.last_iv [1]), cnt))
1340 self.last_iv = (iv, cnt)
1343 def next (self, hdr):
1345 Start decrypting the next object. The PDTCRYPT header for the object
1346 can be given either as already parsed object or as bytes.
1348 if isinstance (hdr, bytes) is True:
1349 hdr = hdr_read (hdr)
1350 elif isinstance (hdr, dict) is False:
1351 # this won’t catch malformed specs though
1352 raise InvalidParameter ("next: wrong type of parameter hdr: "
1353 "expected bytes or spec, got %s"
1356 paramversion = hdr ["paramversion"]
1361 raise InvalidHeader ("next: not a header %r" % hdr)
1363 super().next (self.password, paramversion, nacl, iv)
1364 if self.fixed is not None and self.valid_fixed_part (iv) is False:
1365 raise InvalidIVFixedPart ("iv %s has invalid fixed part"
1367 self.check_consecutive_iv (iv)
1370 defs = ENCRYPTION_PARAMETERS.get (paramversion, None)
1372 raise FormatError ("header contains unknown parameter version %d; "
1373 "maybe the file was created by a more recent "
1374 "version of Deltatar" % paramversion)
1376 if enc == "aes-gcm":
1378 ( algorithms.AES (self.key)
1379 , modes.GCM (iv, tag=self.tag)
1380 , backend = default_backend ()) \
1382 elif enc == "passthrough":
1383 self.enc = PassthroughCipher ()
1385 raise InternalError ("encryption parameter set %d refers to unknown "
1386 "mode %r" % (paramversion, enc))
1387 self.set_object_counter (self.cnt + 1)
1390 def done (self, tag=None):
1392 Stop decryption of the current object and finalize it with the active
1393 context. This will throw an *InvalidGCMTag* exception to indicate that
1394 the authentication tag does not match the data. If the tag is correct,
1395 the rest of the plaintext is returned.
1400 data = self.enc.finalize ()
1402 if isinstance (tag, bytes) is False:
1403 raise InvalidParameter ("done: wrong type of parameter "
1404 "tag: expected bytes, got %s"
1406 data = self.enc.finalize_with_tag (self.tag)
1407 except cryptography.exceptions.InvalidTag:
1408 raise InvalidGCMTag ("done: tag mismatch of object %d: %s "
1409 "rejected by finalize ()"
1410 % (self.cnt, binascii.hexlify (self.tag)))
1411 self.ctsize += len (data)
1412 self.stats ["out"] += len (data)
1416 def process (self, buf):
1418 Decrypt the bytes object *buf* with the active decryptor.
1420 if isinstance (buf, bytes) is False:
1421 raise InvalidParameter ("process: expected byte buffer, not %s"
1423 self.ctsize += len (buf)
1424 data = super().process (buf)
1425 self.ptsize += len (data)
1429 ###############################################################################
1431 ###############################################################################
1433 def _patch_global (glob, vow, n=None):
1435 Adapt upper file counter bound for testing IV logic. Completely unsafe.
1437 assert vow == "I am fully aware that this will void my warranty."
1438 r = globals () [glob]
1440 n = globals () [glob + "_DEFAULT"]
1441 globals () [glob] = n
1444 _testing_set_AES_GCM_IV_CNT_MAX = \
1445 partial (_patch_global, "AES_GCM_IV_CNT_MAX")
1447 _testing_set_PDTCRYPT_MAX_OBJ_SIZE = \
1448 partial (_patch_global, "PDTCRYPT_MAX_OBJ_SIZE")
1450 def open2_dump_file (fname, dir_fd, force=False):
1453 oflags = os.O_CREAT | os.O_WRONLY
1455 oflags |= os.O_TRUNC
1460 outfd = os.open (fname, oflags,
1461 stat.S_IRUSR | stat.S_IWUSR, dir_fd=dir_fd)
1462 except FileExistsError as exn:
1463 noise ("PDT: refusing to overwrite existing file %s" % fname)
1465 raise RuntimeError ("destination file %s already exists" % fname)
1466 if PDTCRYPT_VERBOSE is True:
1467 noise ("PDT: new output file %s (fd=%d)" % (fname, outfd))
1471 ###############################################################################
1472 ## freestanding invocation
1473 ###############################################################################
1475 PDTCRYPT_SUB_PROCESS = 0
1476 PDTCRYPT_SUB_SCRYPT = 1
1477 PDTCRYPT_SUB_SCAN = 2
1480 { "process" : PDTCRYPT_SUB_PROCESS
1481 , "scrypt" : PDTCRYPT_SUB_SCRYPT
1482 , "scan" : PDTCRYPT_SUB_SCAN }
1484 PDTCRYPT_DECRYPT = 1 << 0 # decrypt archive with password
1485 PDTCRYPT_SPLIT = 1 << 1 # split archive into individual objects
1486 PDTCRYPT_HASH = 1 << 2 # output scrypt hash for file and given password
1488 PDTCRYPT_SPLITNAME = "pdtcrypt-object-%d.bin"
1489 PDTCRYPT_RESCUENAME = "pdtcrypt-rescue-object-%0.5d.bin"
1491 PDTCRYPT_VERBOSE = False
1492 PDTCRYPT_STRICTIVS = False
1493 PDTCRYPT_OVERWRITE = False
1494 PDTCRYPT_BLOCKSIZE = 1 << 12
1499 PDTCRYPT_DEFAULT_VER = 1
1500 PDTCRYPT_DEFAULT_PVER = 1
1502 # scrypt hashing output control
1503 PDTCRYPT_SCRYPT_INTRANATOR = 0
1504 PDTCRYPT_SCRYPT_PARAMETERS = 1
1505 PDTCRYPT_SCRYPT_DEFAULT = PDTCRYPT_SCRYPT_INTRANATOR
1507 PDTCRYPT_SCRYPT_FORMAT = \
1508 { "i2n" : PDTCRYPT_SCRYPT_INTRANATOR
1509 , "params" : PDTCRYPT_SCRYPT_PARAMETERS }
1511 PDTCRYPT_TT_COLUMNS = 80 # assume standard terminal
1513 class PDTDecryptionError (Exception):
1514 """Decryption failed."""
1516 class PDTSplitError (Exception):
1517 """Decryption failed."""
1520 def noise (*a, **b):
1521 print (file=sys.stderr, *a, **b)
1524 class PassthroughDecryptor (object):
1526 curhdr = None # write current header on first data write
1528 def __init__ (self):
1529 if PDTCRYPT_VERBOSE is True:
1530 noise ("PDT: no encryption; data passthrough")
1532 def next (self, hdr):
1533 ok, curhdr = hdr_make (hdr)
1535 raise PDTDecryptionError ("bad header %r" % hdr)
1536 self.curhdr = curhdr
1539 if self.curhdr is not None:
1543 def process (self, d):
1544 if self.curhdr is not None:
1550 def depdtcrypt (mode, secret, ins, outs):
1552 Remove PDTCRYPT layer from all objects encrypted with the secret. Used on a
1553 Deltatar backup this will yield a (possibly Gzip compressed) tarball.
1555 ctleft = -1 # length of ciphertext to consume
1556 ctcurrent = 0 # total ciphertext of current object
1557 total_obj = 0 # total number of objects read
1558 total_pt = 0 # total plaintext bytes
1559 total_ct = 0 # total ciphertext bytes
1560 total_read = 0 # total bytes read
1561 outfile = None # Python file object for output
1563 if mode & PDTCRYPT_DECRYPT: # decryptor
1565 if ks == PDTCRYPT_SECRET_PW:
1566 decr = Decrypt (password=secret [1], strict_ivs=PDTCRYPT_STRICTIVS)
1567 elif ks == PDTCRYPT_SECRET_KEY:
1568 key = binascii.unhexlify (secret [1])
1569 decr = Decrypt (key=key, strict_ivs=PDTCRYPT_STRICTIVS)
1571 raise InternalError ("‘%d’ does not specify a valid kind of secret"
1574 decr = PassthroughDecryptor ()
1577 """Dummy for non-split mode: output file does not vary."""
1580 if mode & PDTCRYPT_SPLIT:
1581 def nextout (outfile):
1583 We were passed an fd as outs for accessing the destination
1584 directory where extracted archive components are supposed
1589 if PDTCRYPT_VERBOSE is True:
1590 noise ("PDT: no output file to close at this point")
1592 if PDTCRYPT_VERBOSE is True:
1593 noise ("PDT: release output file %r" % outfile)
1594 # cleanup happens automatically by the GC; the next
1595 # line will error out on account of an invalid fd
1598 assert total_obj > 0
1599 fname = PDTCRYPT_SPLITNAME % total_obj
1601 outfd = open2_dump_file (fname, outs, force=PDTCRYPT_OVERWRITE)
1602 except RuntimeError as exn:
1603 raise PDTSplitError (exn)
1604 return os.fdopen (outfd, "wb", closefd=True)
1608 """ESPIPE is normal on non-seekable stdio stream."""
1611 except OSError as exn:
1612 if exn.errno == os.errno.ESPIPE:
1615 def out (pt, outfile):
1619 if PDTCRYPT_VERBOSE is True:
1620 noise ("PDT:\t· decrypt plaintext %d B" % (npt))
1622 nn = outfile.write (pt)
1623 except OSError as exn: # probably ENOSPC
1624 raise DecryptionError ("error (%s)" % exn)
1626 raise DecryptionError ("write aborted after %d of %d B" % (nn, npt))
1630 # current object completed; in a valid archive this marks either
1631 # the start of a new header or the end of the input
1632 if ctleft == 0: # current object requires finalization
1633 if PDTCRYPT_VERBOSE is True:
1634 noise ("PDT: %d finalize" % tell (ins))
1637 except InvalidGCMTag as exn:
1638 raise DecryptionError ("error finalizing object %d (%d B): "
1639 "%r" % (total_obj, len (pt), exn)) \
1642 if PDTCRYPT_VERBOSE is True:
1643 noise ("PDT:\t· object validated")
1645 if PDTCRYPT_VERBOSE is True:
1646 noise ("PDT: %d hdr" % tell (ins))
1648 hdr = hdr_read_stream (ins)
1649 total_read += PDTCRYPT_HDR_SIZE
1650 except EndOfFile as exn:
1651 total_read += exn.remainder
1652 if total_ct + total_obj * PDTCRYPT_HDR_SIZE != total_read:
1653 raise PDTDecryptionError ("ciphertext processed (%d B) plus "
1654 "overhead (%d × %d B) does not match "
1655 "the number of bytes read (%d )"
1656 % (total_ct, total_obj, PDTCRYPT_HDR_SIZE,
1658 # the single good exit
1659 return total_read, total_obj, total_ct, total_pt
1660 except InvalidHeader as exn:
1661 raise PDTDecryptionError ("invalid header at position %d in %r "
1662 "(%s)" % (tell (ins), exn, ins))
1663 if PDTCRYPT_VERBOSE is True:
1664 pretty = hdr_fmt_pretty (hdr)
1665 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1666 pretty.splitlines (), ""))
1667 ctcurrent = ctleft = hdr ["ctsize"]
1671 total_obj += 1 # used in file counter with split mode
1673 # finalization complete or skipped in case of first object in
1674 # stream; create a new output file if necessary
1675 outfile = nextout (outfile)
1677 if PDTCRYPT_VERBOSE is True:
1678 noise ("PDT: %d decrypt obj no. %d, %d B"
1679 % (tell (ins), total_obj, ctleft))
1681 # always allocate a new buffer since python-cryptography doesn’t allow
1682 # passing a bytearray :/
1683 nexpect = min (ctleft, PDTCRYPT_BLOCKSIZE)
1684 if PDTCRYPT_VERBOSE is True:
1685 noise ("PDT:\t· [%d] %d%% done, read block (%d B of %d B remaining)"
1687 100 - ctleft * 100 / (ctcurrent > 0 and ctcurrent or 1),
1689 ct = ins.read (nexpect)
1693 raise EndOfFile (nct,
1694 "hit EOF after %d of %d B in block [%d:%d); "
1695 "%d B ciphertext remaining for object no %d"
1696 % (nct, nexpect, off, off + nexpect, ctleft,
1702 if PDTCRYPT_VERBOSE is True:
1703 noise ("PDT:\t· decrypt ciphertext %d B" % (nct))
1704 pt = decr.process (ct)
1708 def deptdcrypt_mk_stream (kind, path):
1709 """Create stream from file or stdio descriptor."""
1710 if kind == PDTCRYPT_SINK:
1712 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: stdout")
1713 return sys.stdout.buffer
1715 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: file %s" % path)
1716 return io.FileIO (path, "w")
1717 if kind == PDTCRYPT_SOURCE:
1719 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: stdin")
1720 return sys.stdin.buffer
1722 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: file %s" % path)
1723 return io.FileIO (path, "r")
1725 raise ValueError ("bogus stream “%s” / %s" % (kind, path))
1728 def mode_depdtcrypt (mode, secret, ins, outs):
1730 total_read, total_obj, total_ct, total_pt = \
1731 depdtcrypt (mode, secret, ins, outs)
1732 except DecryptionError as exn:
1733 noise ("PDT: Decryption failed:")
1735 noise ("PDT: “%s”" % exn)
1737 noise ("PDT: Did you specify the correct key / password?")
1740 except PDTSplitError as exn:
1741 noise ("PDT: Split operation failed:")
1743 noise ("PDT: “%s”" % exn)
1745 noise ("PDT: Hint: target directory should be empty.")
1749 if PDTCRYPT_VERBOSE is True:
1750 noise ("PDT: decryption successful" )
1751 noise ("PDT: %.10d bytes read" % total_read)
1752 noise ("PDT: %.10d objects decrypted" % total_obj )
1753 noise ("PDT: %.10d bytes ciphertext" % total_ct )
1754 noise ("PDT: %.10d bytes plaintext" % total_pt )
1760 def mode_scrypt (pw, ins=None, nacl=None, fmt=PDTCRYPT_SCRYPT_INTRANATOR):
1762 paramversion = PDTCRYPT_DEFAULT_PVER
1764 hsh, nacl, version, paramversion = scrypt_hashsource (pw, ins)
1765 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1767 nacl = binascii.unhexlify (nacl)
1768 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1769 version = PDTCRYPT_DEFAULT_VER
1771 kdfname, params = defs ["kdf"]
1773 kdf = kdf_by_version (None, defs)
1774 hsh, _void = kdf (pw, nacl)
1778 if fmt == PDTCRYPT_SCRYPT_INTRANATOR:
1779 out = json.dumps ({ "salt" : base64.b64encode (nacl).decode ()
1780 , "key" : base64.b64encode (hsh) .decode ()
1781 , "paramversion" : paramversion })
1782 elif fmt == PDTCRYPT_SCRYPT_PARAMETERS:
1783 out = json.dumps ({ "salt" : binascii.hexlify (nacl).decode ()
1784 , "key" : binascii.hexlify (hsh) .decode ()
1785 , "version" : version
1786 , "scrypt_params" : { "N" : params ["N"]
1787 , "r" : params ["r"]
1788 , "p" : params ["p"]
1789 , "dkLen" : params ["dkLen"] } })
1791 raise RuntimeError ("bad scrypt output scheme %r" % fmt)
1796 def noise_output_candidates (cands, indent=8, cols=PDTCRYPT_TT_COLUMNS):
1798 Print a list of offsets without garbling the terminal too much.
1800 The indent is counted from column zero; if it is wide enough, the “PDT: ”
1801 marker will be prepended, considered part of the indentation.
1805 idt = " " * indent if indent < 5 else "PDT: " + " " * (indent - 5)
1810 init = True # prevent leading separator
1813 raise ValueError ("the requested indentation exceeds the line "
1814 "width by %d" % (indent - wd))
1824 if lpos > wd: # line break
1840 def mode_scan (secret, fname, outs=None, nacl=None):
1842 Dissect a binary file, looking for PDTCRYPT headers and objects.
1844 If *outs* is supplied, recoverable data will be dumped into the specified
1848 ifd = os.open (fname, os.O_RDONLY)
1849 except FileNotFoundError:
1850 noise ("PDT: failed to open %s readonly" % fname)
1855 if PDTCRYPT_VERBOSE is True:
1856 noise ("PDT: scan for potential sync points")
1857 cands = locate_hdr_candidates (ifd)
1858 if len (cands) == 0:
1859 noise ("PDT: scan complete: input does not contain potential PDT "
1860 "headers; giving up.")
1862 if PDTCRYPT_VERBOSE is True:
1863 noise ("PDT: scan complete: found %d candidates:" % len (cands))
1864 noise_output_candidates (cands)
1874 vdt, hdr = inspect_hdr (ifd, cand)
1875 if vdt == HDR_CAND_JUNK:
1878 off0 = cand + PDTCRYPT_HDR_SIZE
1879 if PDTCRYPT_VERBOSE is True:
1880 noise ("PDT: obj %d: read payload @%d" % (nobj, off0))
1881 pretty = hdr_fmt_pretty (hdr)
1882 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1883 pretty.splitlines (), ""))
1886 if outs is not None:
1887 ofname = PDTCRYPT_RESCUENAME % nobj
1888 ofd = open2_dump_file (ofname, outs, force=PDTCRYPT_OVERWRITE)
1891 ok = try_decrypt (ifd, off0, hdr, secret, ofd=ofd) == hdr ["ctsize"]
1895 if vdt == HDR_CAND_GOOD and ok is True:
1896 noise ("PDT: %d → ✓ valid object %d–%d"
1897 % (cand, off0, off0 + hdr ["ctsize"]))
1898 elif vdt == HDR_CAND_FISHY and ok is True:
1899 noise ("PDT: %d → × object %d–%d, corrupt header"
1900 % (cand, off0, off0 + hdr ["ctsize"]))
1901 elif vdt == HDR_CAND_GOOD and ok is False:
1902 noise ("PDT: %d → × object %d–%d, problematic payload"
1903 % (cand, off0, off0 + hdr ["ctsize"]))
1904 elif vdt == HDR_CAND_FISHY and ok is False:
1905 noise ("PDT: %d → × object %d–%d, corrupt header, problematic "
1906 "ciphertext" % (cand, off0, off0 + hdr ["ctsize"]))
1913 noise ("PDT: all headers ok")
1915 noise ("PDT: %d candidates not parseable as headers:" % len (junk))
1916 noise_output_candidates (junk)
1918 def usage (err=False):
1922 indent = ' ' * len (SELF)
1923 out ("usage: %s SUBCOMMAND { --help" % SELF)
1924 out (" %s | [ -v ] { -p PASSWORD | -k KEY }" % indent)
1925 out (" %s [ { -i | --in } { - | SOURCE } ]" % indent)
1926 out (" %s [ { -n | --nacl } { SALT } ]" % indent)
1927 out (" %s [ { -o | --out } { - | DESTINATION } ]" % indent)
1928 out (" %s [ -D | --no-decrypt ] [ -S | --split ]" % indent)
1929 out (" %s [ -f | --format ]" % indent)
1932 out ("\t\tSUBCOMMAND main mode: { process | scrypt }")
1934 out ("\t\t process: extract objects from PDT archive")
1935 out ("\t\t scrypt: calculate hash from password and first object")
1936 out ("\t\t-p PASSWORD password to derive the encryption key from")
1937 out ("\t\t-k KEY encryption key as 16 bytes in hexadecimal notation")
1938 out ("\t\t-s enforce strict handling of initialization vectors")
1939 out ("\t\t-i SOURCE file name to read from")
1940 out ("\t\t-o DESTINATION file to write output to")
1941 out ("\t\t-n SALT provide salt for scrypt mode in hex encoding")
1942 out ("\t\t-v print extra info")
1943 out ("\t\t-S split into files at object boundaries; this")
1944 out ("\t\t requires DESTINATION to refer to directory")
1945 out ("\t\t-D PDT header and ciphertext passthrough")
1946 out ("\t\t-f format of SCRYPT hash output (“default” or “parameters”)")
1948 out ("\tinstead of filenames, “-” may used to specify stdin / stdout")
1950 sys.exit ((err is True) and 42 or 0)
1960 def parse_argv (argv):
1961 global PDTCRYPT_OVERWRITE
1963 mode = PDTCRYPT_DECRYPT
1969 scrypt_format = PDTCRYPT_SCRYPT_DEFAULT
1972 SELF = os.path.basename (next (argvi))
1975 rawsubcmd = next (argvi)
1976 subcommand = PDTCRYPT_SUB [rawsubcmd]
1977 except StopIteration:
1978 bail ("ERROR: subcommand required")
1980 bail ("ERROR: invalid subcommand “%s” specified" % rawsubcmd)
1986 except StopIteration:
1987 bail ("ERROR: argument list incomplete")
1989 def checked_secret (s):
1994 bail ("ERROR: encountered “%s” but secret already given" % arg)
1997 if arg in [ "-h", "--help" ]:
2000 elif arg in [ "-v", "--verbose", "--wtf" ]:
2001 global PDTCRYPT_VERBOSE
2002 PDTCRYPT_VERBOSE = True
2003 elif arg in [ "-i", "--in", "--source" ]:
2004 insspec = checked_arg ()
2005 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt from %s" % insspec)
2006 elif arg in [ "-p", "--password" ]:
2007 arg = checked_arg ()
2008 checked_secret (make_secret (password=arg))
2009 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with password")
2011 if subcommand == PDTCRYPT_SUB_PROCESS:
2012 if arg in [ "-s", "--strict-ivs" ]:
2013 global PDTCRYPT_STRICTIVS
2014 PDTCRYPT_STRICTIVS = True
2015 elif arg in [ "-o", "--out", "--dest", "--sink" ]:
2016 outsspec = checked_arg ()
2017 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
2018 elif arg in [ "-f", "--force" ]:
2019 PDTCRYPT_OVERWRITE = True
2020 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
2021 elif arg in [ "-S", "--split" ]:
2022 mode |= PDTCRYPT_SPLIT
2023 if PDTCRYPT_VERBOSE is True: noise ("PDT: split files")
2024 elif arg in [ "-D", "--no-decrypt" ]:
2025 mode &= ~PDTCRYPT_DECRYPT
2026 if PDTCRYPT_VERBOSE is True: noise ("PDT: not decrypting")
2027 elif arg in [ "-k", "--key" ]:
2028 arg = checked_arg ()
2029 checked_secret (make_secret (key=arg))
2030 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with key")
2032 bail ("ERROR: unexpected positional argument “%s”" % arg)
2033 elif subcommand == PDTCRYPT_SUB_SCRYPT:
2034 if arg in [ "-n", "--nacl", "--salt" ]:
2035 nacl = checked_arg ()
2036 if PDTCRYPT_VERBOSE is True: noise ("PDT: salt key with %s" % nacl)
2037 elif arg in [ "-f", "--format" ]:
2038 arg = checked_arg ()
2040 scrypt_format = PDTCRYPT_SCRYPT_FORMAT [arg]
2042 bail ("ERROR: invalid scrypt output format %s" % arg)
2043 if PDTCRYPT_VERBOSE is True:
2044 noise ("PDT: scrypt output format “%s”" % scrypt_format)
2046 bail ("ERROR: unexpected positional argument “%s”" % arg)
2047 elif subcommand == PDTCRYPT_SUB_SCAN:
2048 if arg in [ "-o", "--out", "--dest", "--sink" ]:
2049 outsspec = checked_arg ()
2050 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
2051 elif arg in [ "-f", "--force" ]:
2052 PDTCRYPT_OVERWRITE = True
2053 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
2055 bail ("ERROR: unexpected positional argument “%s”" % arg)
2058 if PDTCRYPT_VERBOSE is True:
2059 noise ("ERROR: no password or key specified, trying $PDTCRYPT_PASSWORD")
2060 epw = os.getenv ("PDTCRYPT_PASSWORD")
2062 checked_secret (make_secret (password=epw.strip ()))
2065 if PDTCRYPT_VERBOSE is True:
2066 noise ("ERROR: no password or key specified, trying $PDTCRYPT_KEY")
2067 ek = os.getenv ("PDTCRYPT_KEY")
2069 checked_secret (make_secret (key=ek.strip ()))
2072 if subcommand == PDTCRYPT_SUB_SCRYPT:
2073 bail ("ERROR: scrypt hash mode requested but no password given")
2074 elif mode & PDTCRYPT_DECRYPT:
2075 bail ("ERROR: encryption requested but no password given")
2077 if mode & PDTCRYPT_SPLIT and outsspec is None:
2078 bail ("ERROR: split mode is incompatible with stdout sink "
2081 if subcommand == PDTCRYPT_SUB_SCAN and outsspec is None:
2082 pass # no output by default in scan mode
2083 elif mode & PDTCRYPT_SPLIT or subcommand == PDTCRYPT_SUB_SCAN:
2084 # destination must be directory
2086 bail ("ERROR: mode is incompatible with stdout sink")
2089 os.makedirs (outsspec, 0o700)
2090 except FileExistsError:
2091 # if it’s a directory with appropriate perms, everything is
2092 # good; otherwise, below invocation of open(2) will fail
2094 outs = os.open (outsspec, os.O_DIRECTORY, 0o600)
2095 except FileNotFoundError as exn:
2096 bail ("ERROR: cannot create target directory “%s”" % outsspec)
2097 except NotADirectoryError as exn:
2098 bail ("ERROR: target path “%s” is not a directory" % outsspec)
2100 outs = deptdcrypt_mk_stream (PDTCRYPT_SINK, outsspec or "-")
2102 if subcommand == PDTCRYPT_SUB_SCAN:
2104 bail ("ERROR: please supply an input file for scanning")
2106 bail ("ERROR: input must be seekable; please specify a file")
2107 return True, partial (mode_scan, secret, insspec, outs, nacl=nacl)
2109 if subcommand == PDTCRYPT_SUB_SCRYPT:
2110 if secret [0] == PDTCRYPT_SECRET_KEY:
2111 bail ("ERROR: scrypt mode requires a password")
2112 if insspec is not None and nacl is not None \
2113 or insspec is None and nacl is None :
2114 bail ("ERROR: please supply either an input file or "
2119 if insspec is not None or subcommand != PDTCRYPT_SUB_SCRYPT:
2120 ins = deptdcrypt_mk_stream (PDTCRYPT_SOURCE, insspec or "-")
2122 if subcommand == PDTCRYPT_SUB_SCRYPT:
2123 return True, partial (mode_scrypt, secret [1].encode (), ins, nacl,
2126 return True, partial (mode_depdtcrypt, mode, secret, ins, outs)
2130 ok, runner = parse_argv (argv)
2132 if ok is True: return runner ()
2137 if __name__ == "__main__":
2138 sys.exit (main (sys.argv))