===============================================================================
crypto -- Encryption Layer for the Deltatar Backup
===============================================================================
- AES-GCM for the symmetric encryption;

- NIST Recommendation for Block Cipher Modes of Operation: Galois/Counter
  Mode (GCM) and GMAC
  http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf

- https://cryptome.org/2014/01/aes-gcm-v1.pdf

- Authentication weaknesses in GCM
  http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf

Trouble with python-cryptography packages: authentication tags can only be
passed in advance: https://github.com/pyca/cryptography/pull/3421
Errors
-------------------------------------------------------------------------------
Errors fall into roughly three categories:

- Cryptographic errors or invalid data.

    - ``InvalidGCMTag`` (decryption failed on account of an invalid GCM
      tag),
    - ``InvalidIVFixedPart`` (IV fixed part of object not found in list),
    - ``DuplicateIV`` (the IV of an encrypted object already occurred),
    - ``DecryptionError`` (used in CLI decryption for presenting error
      conditions to the user).

- Incorrect usage of the library.

    - ``InvalidParameter`` (non-conforming user supplied parameter),
    - ``InvalidHeader`` (data passed for reading not parsable into header),
    - ``FormatError`` (cannot handle header or parameter version),

- Bad internal state. If one of these is encountered it means that a state
  was reached that shouldn’t occur during normal processing.

Also, ``EndOfFile`` is used as a sentinel to communicate that a stream supplied
for reading is exhausted.
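As a rough illustration of how calling code might act on these categories, the
sketch below groups the exception names listed above into tuples and maps a
caught exception to a category. The stand-in classes, the ``classify()`` and
``run()`` helpers, and the decision to treat everything else as an internal
error are assumptions made to keep the example self-contained; real code would
import the classes from this module.

```python
# Stand-in classes mirroring the names documented above (assumption: in
# real use they would be imported from the crypto module instead).
class InvalidGCMTag (Exception): pass
class InvalidIVFixedPart (Exception): pass
class DuplicateIV (Exception): pass
class DecryptionError (Exception): pass
class InvalidParameter (Exception): pass
class InvalidHeader (Exception): pass
class FormatError (Exception): pass
class EndOfFile (Exception): pass

CRYPTO_ERRORS = (InvalidGCMTag, InvalidIVFixedPart, DuplicateIV, DecryptionError)
USAGE_ERRORS = (InvalidParameter, InvalidHeader, FormatError)

def classify (exn):
    # hypothetical helper: name the category an exception belongs to
    if isinstance (exn, CRYPTO_ERRORS):
        return "crypto"
    if isinstance (exn, USAGE_ERRORS):
        return "usage"
    if isinstance (exn, EndOfFile):
        return "eof"
    # assumption: anything unlisted is treated as bad internal state
    return "internal"

def run (action):
    # run a callable and report the error category, or "ok"
    try:
        action ()
    except Exception as exn:
        return classify (exn)
    return "ok"

def bad_tag ():
    raise InvalidGCMTag ("tag mismatch")

print (run (bad_tag))
```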
Initialization Vectors
-------------------------------------------------------------------------------

Initialization vectors are checked for reuse during the lifetime of a decryptor.
The fixed counters for metadata files cannot be reused and attempts to do so
will cause a ``DuplicateIV`` error. This means the length of objects encrypted
with a metadata counter is capped at 63 GB.
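The 63 GB figure can be related to the limit GCM places on a single
invocation. The arithmetic below is only a sketch; the local names mirror the
values of ``AES_GCM_MAX_SIZE`` and ``PDTCRYPT_MAX_OBJ_SIZE_DEFAULT`` defined
further down in this module and are otherwise illustrative.

```python
GIB = 1 << 30

# GCM limits a single plaintext to 2^39 - 2^8 bits, i.e. 2^36 - 2^5 bytes.
aes_gcm_max_size = (1 << 36) - (1 << 5)
# The per-object cap stays below that limit with some headroom.
max_obj_size = 63 * GIB

assert aes_gcm_max_size * 8 == (1 << 39) - (1 << 8)
assert max_obj_size < aes_gcm_max_size

print ("GCM limit : %d B" % aes_gcm_max_size)
print ("object cap: %d B" % max_obj_size)
print ("headroom  : %d B" % (aes_gcm_max_size - max_obj_size))
```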
For ordinary, non-metadata payload, there is an optional mode with strict IV
checking that causes a crypto context to fail if an IV encountered or created
was already used for decrypting or encrypting, respectively, an earlier object.
Note that this mode can trigger false positives when decrypting non-linearly,
e. g. when traversing the same object multiple times. Since the crypto context
has no notion of a position in a PDT encrypted archive, this condition must be
sorted out downstream.
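A minimal sketch of the strict check follows. It assumes the IV layout used by
this module (8 byte fixed part followed by a little-endian 32 bit counter); the
``IVTracker`` class is hypothetical and ``ValueError`` merely stands in for
``DuplicateIV``. It also shows the false positive mentioned above: visiting the
same object a second time is indistinguishable from IV reuse.

```python
import os
import struct

FMT_IV = "<8sL"   # 8 byte fixed part followed by a 32 bit counter

def iv_make (fixed, cnt):
    # pack fixed part and object counter into a 12 byte IV
    return struct.pack (FMT_IV, fixed, cnt)

class IVTracker (object):
    # hypothetical helper: remember IVs, optionally rejecting repeats
    def __init__ (self, strict=False):
        self.strict = strict
        self.used = set ()

    def check (self, iv):
        if self.strict and iv in self.used:
            raise ValueError ("iv %s was reused" % iv.hex ())
        self.used.add (iv)

fixed = os.urandom (8)
tracker = IVTracker (strict=True)
tracker.check (iv_make (fixed, 3))
tracker.check (iv_make (fixed, 4))      # new counter, accepted
try:
    tracker.check (iv_make (fixed, 3))  # same object visited again
    reused = False
except ValueError:
    reused = True
print ("second visit rejected:", reused)
```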
Command Line Utility
-------------------------------------------------------------------------------
``crypto.py`` may be invoked as a script for decrypting, validating, and
splitting PDT encrypted files. Consult the usage message for details.

Decrypt from stdin using the password ‘foo’: ::

    $ crypto.py process foo -i - -o - <some-file.tar.gz.pdtcrypt >some-file.tar.gz

Output verbose information about the encrypted objects in the archive: ::

    $ crypto.py process foo -v -i some-file.tar.gz.pdtcrypt -o /dev/null
    PDT: decrypt from some-file.tar.gz.pdtcrypt
    PDT: decrypt to /dev/null
    PDT: source: file some-file.tar.gz.pdtcrypt
    PDT: sink: file /dev/null
    PDT: · version = 1 : 0100
    PDT: · paramversion = 1 : 0100
    PDT: · nacl : d270 b031 00d1 87e2 c946 610d 7b7f 7e5f
    PDT: · iv : 02ee 3dd7 a963 1eb1 0100 0000
    PDT: · ctsize = 591 : 4f02 0000 0000 0000
    PDT: · tag : 5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b
    PDT: 64 decrypt obj no. 1, 591 B
    PDT: · [64] 0% done, read block (591 B of 591 B remaining)
    PDT: · decrypt ciphertext 591 B
    PDT: · decrypt plaintext 591 B
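The fields printed in verbose mode correspond to the fixed 64 byte object
header. As a sketch, the raw values from the sample output can be packed into
such a header and unpacked again. The format string is an assumption derived
from the field sizes defined in this module (8 byte magic, two 16 bit version
fields, 16 byte salt, 12 byte IV, 64 bit ciphertext size, 16 byte tag, all
little endian); ``unhex()`` is a helper introduced for this example only.

```python
import binascii
import struct

HDR_FMT = "<8sHH16s12sQ16s"          # assumed layout, little endian
assert struct.calcsize (HDR_FMT) == 64

def unhex (s):
    # turn a spaced hexdump like "0100 0000" into bytes
    return binascii.unhexlify (s.replace (" ", ""))

nacl = unhex ("d270 b031 00d1 87e2 c946 610d 7b7f 7e5f")
iv   = unhex ("02ee 3dd7 a963 1eb1 0100 0000")
tag  = unhex ("5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b")

raw = struct.pack (HDR_FMT, b"PDTCRYPT", 1, 1, nacl, iv, 591, tag)

magic, version, paramversion, nacl2, iv2, ctsize, tag2 = \
    struct.unpack (HDR_FMT, raw)
# the IV splits into the fixed part and the object counter
fixed, counter = struct.unpack ("<8sL", iv2)

print ("version", version, "paramversion", paramversion)
print ("ctsize", ctsize, "counter", counter)
print ("fixed part", binascii.hexlify (fixed).decode ())
```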
Also, the mode *scrypt* allows deriving encryption keys. To calculate the
encryption key from the password ‘foo’ and the salt of the first object in a
PDT encrypted file: ::

    $ crypto.py scrypt foo -i some-file.pdtcrypt
    {"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}

In the example above, the computed 16 byte key is given Base64-encoded in the
value of ``key`` and can be fed into Python’s ``base64.b64decode()`` to obtain
the corresponding binary representation.
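The values in the sample output above are 24 characters long and end in
``==``, which is what Base64 yields for 16 bytes. A short sketch of reading
such a line back in Python, assuming the JSON shape shown in the example:

```python
import base64
import json

line = '{"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}'

info = json.loads (line)
key  = base64.b64decode (info ["key"])
salt = base64.b64decode (info ["salt"])

print ("paramversion:", info ["paramversion"])
print ("key  (%d B): %s" % (len (key), key.hex ()))
print ("salt (%d B): %s" % (len (salt), salt.hex ()))
```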
Note that in Scrypt hashing mode, no data integrity checks are performed.
If the wrong password is given, a wrong key will be derived. Whether the password
was indeed correct can only be determined by decrypting. Note that since PDT
archives essentially consist of a stream of independent objects, the salt and
other parameters may change. Thus a key derived using the above method from the
first object doesn’t necessarily apply to any of the subsequent objects.
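Both caveats can be seen in a small experiment. The sketch below uses
``hashlib.scrypt`` from the standard library as a stand-in for the KDF wrapper
of this module, with deliberately cheap cost parameters chosen for the
example; they are not the parameters used for real archives. ``derive()`` is a
hypothetical helper.

```python
import hashlib
import os

# Illustrative cost parameters only; the values used for real archives are
# defined per parameter version and are not reproduced here.
N, R, P, DKLEN = 1 << 10, 8, 1, 16

def derive (password, salt):
    # stand-in for the module's KDF wrapper, using hashlib's scrypt
    return hashlib.scrypt (password, salt=salt, n=N, r=R, p=P, dklen=DKLEN)

salt_obj1 = os.urandom (16)
salt_obj2 = os.urandom (16)

key_right = derive (b"foo", salt_obj1)
key_wrong = derive (b"bar", salt_obj1)   # no error, just a different key
key_other = derive (b"foo", salt_obj2)   # same password, different salt

print ("wrong password yields same key:", key_right == key_wrong)
print ("other salt yields same key    :", key_right == key_other)
```

Neither call fails; only an attempt to decrypt and verify the GCM tag reveals
whether a derived key actually fits an object.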
134 from functools import reduce, partial
143 except ImportError as exn:
if __name__ == "__main__": ## Work around the import mechanism lest Python’s
    pwd = os.getcwd()      ## preference for local imports cause a cyclical
                           ## import (crypto → pylibscrypt → […] → ./tarfile → crypto).
149 sys.path = [ p for p in sys.path if p.find ("deltatar") < 0 ]
152 from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
153 from cryptography.hazmat.backends import default_backend
157 __all__ = [ "hdr_make", "hdr_read", "hdr_fmt", "hdr_fmt_pretty"
159 , "PDTCRYPT_HDR_SIZE", "AES_GCM_IV_CNT_DATA"
160 , "AES_GCM_IV_CNT_INFOFILE", "AES_GCM_IV_CNT_INDEX"
164 ###############################################################################
166 ###############################################################################
168 class EndOfFile (Exception):
172 def __init__ (self, n=None, msg=None):
178 class InvalidParameter (Exception):
179 """Inputs not valid for PDT encryption."""
183 class InvalidHeader (Exception):
184 """Header not valid."""
188 class InvalidGCMTag (Exception):
190 The GCM tag calculated during decryption differs from that in the object
196 class InvalidIVFixedPart (Exception):
198 IV fixed part not in supplied list: either the backup is corrupt or the
199 current object does not belong to it.
204 class IVFixedPartError (Exception):
206 Error creating a unique IV fixed part: repeated calls to system RNG yielded
207 the same sequence of bytes as the last IV used.
212 class InvalidFileCounter (Exception):
214 When encrypting, an attempted reuse of a dedicated counter (info file,
215 index file) was caught.
220 class DuplicateIV (Exception):
222 During encryption, the current IV fixed part is identical to an already
223 existing IV (same prefix and file counter). This indicates tampering or
224 programmer error and cannot be recovered from.
229 class NonConsecutiveIV (Exception):
231 IVs not numbered consecutively. This is a hard error with strict IV
232 checking. Precludes random access to the encrypted objects.
237 class FormatError (Exception):
238 """Unusable parameters in header."""
242 class DecryptionError (Exception):
243 """Error during decryption with ``crypto.py`` on the command line."""
247 class Unreachable (Exception):
249 Makeshift __builtin_unreachable(); always a programmer error if
255 class InternalError (Exception):
256 """Errors not ascribable to bad user inputs or cryptography."""
260 ###############################################################################
261 ## crypto layer version
262 ###############################################################################
264 ENCRYPTION_PARAMETERS = \
266 { "kdf": ("dummy", 16)
267 , "enc": "passthrough" }
275 , "enc": "aes-gcm" } }
277 ###############################################################################
279 ###############################################################################
281 PDTCRYPT_HDR_MAGIC = b"PDTCRYPT"
283 PDTCRYPT_HDR_SIZE_MAGIC = 8 # 8
284 PDTCRYPT_HDR_SIZE_VERSION = 2 # 10
285 PDTCRYPT_HDR_SIZE_PARAMVERSION = 2 # 12
286 PDTCRYPT_HDR_SIZE_NACL = 16 # 28
287 PDTCRYPT_HDR_SIZE_IV = 12 # 40
288 PDTCRYPT_HDR_SIZE_CTSIZE = 8 # 48
289 PDTCRYPT_HDR_SIZE_TAG = 16 # 64 GCM auth tag
291 PDTCRYPT_HDR_SIZE = PDTCRYPT_HDR_SIZE_MAGIC + PDTCRYPT_HDR_SIZE_VERSION \
292 + PDTCRYPT_HDR_SIZE_PARAMVERSION + PDTCRYPT_HDR_SIZE_NACL \
293 + PDTCRYPT_HDR_SIZE_IV + PDTCRYPT_HDR_SIZE_CTSIZE \
294 + PDTCRYPT_HDR_SIZE_TAG # = 64
296 # precalculate offsets since Python can’t do constant folding over names
297 HDR_OFF_VERSION = PDTCRYPT_HDR_SIZE_MAGIC
298 HDR_OFF_PARAMVERSION = HDR_OFF_VERSION + PDTCRYPT_HDR_SIZE_VERSION
299 HDR_OFF_NACL = HDR_OFF_PARAMVERSION + PDTCRYPT_HDR_SIZE_PARAMVERSION
300 HDR_OFF_IV = HDR_OFF_NACL + PDTCRYPT_HDR_SIZE_NACL
301 HDR_OFF_CTSIZE = HDR_OFF_IV + PDTCRYPT_HDR_SIZE_IV
302 HDR_OFF_TAG = HDR_OFF_CTSIZE + PDTCRYPT_HDR_SIZE_CTSIZE
306 FMT_I2N_IV = "<8sL" # 8 random bytes ‖ 32 bit counter
FMT_I2N_HDR = ("<" # little endian, standard sizes
311 "16s" # sodium chloride
317 AES_GCM_MAX_SIZE = (1 << 36) - (1 << 5) # 2^39 - 2^8 b ≅ 64 GB
318 PDTCRYPT_MAX_OBJ_SIZE_DEFAULT = 63 * (1 << 30) # 63 GB
319 PDTCRYPT_MAX_OBJ_SIZE = PDTCRYPT_MAX_OBJ_SIZE_DEFAULT
# index and info files are written on the fly while encrypting so their
# counters must be available in advance
323 AES_GCM_IV_CNT_INFOFILE = 1 # constant
324 AES_GCM_IV_CNT_INDEX = AES_GCM_IV_CNT_INFOFILE + 1
325 AES_GCM_IV_CNT_DATA = AES_GCM_IV_CNT_INDEX + 1 # also for multivolume
326 AES_GCM_IV_CNT_MAX_DEFAULT = 0xffFFffFF
327 AES_GCM_IV_CNT_MAX = AES_GCM_IV_CNT_MAX_DEFAULT
329 # IV structure and generation
330 PDTCRYPT_IV_GEN_MAX_RETRIES = 10 # ×
331 PDTCRYPT_IV_FIXEDPART_SIZE = 8 # B
332 PDTCRYPT_IV_COUNTER_SIZE = 4 # B
334 ###############################################################################
336 ###############################################################################
342 # , paramversion : u16
348 # fn hdr_read (f : handle) -> hdrinfo;
349 # fn hdr_make (f : handle, h : hdrinfo) -> IOResult<usize>;
350 # fn hdr_fmt (h : hdrinfo) -> String;
355 Read bytes as header structure.
357 If the input could not be interpreted as a header, fail with
362 mag, version, paramversion, nacl, iv, ctsize, tag = \
363 struct.unpack (FMT_I2N_HDR, data)
364 except Exception as exn:
365 raise InvalidHeader ("error unpacking header from [%r]: %s"
366 % (binascii.hexlify (data), str (exn)))
368 if mag != PDTCRYPT_HDR_MAGIC:
369 raise InvalidHeader ("bad magic in header: expected [%s], got [%s]"
370 % (PDTCRYPT_HDR_MAGIC, mag))
373 { "version" : version
374 , "paramversion" : paramversion
382 def hdr_read_stream (instr):
384 Read header from stream at the current position.
386 Fail with ``InvalidHeader`` if insufficient bytes were read from the
387 stream, or if the content could not be interpreted as a header.
389 data = instr.read(PDTCRYPT_HDR_SIZE)
393 elif ldata != PDTCRYPT_HDR_SIZE:
394 raise InvalidHeader ("hdr_read_stream: expected %d B, received %d B"
395 % (PDTCRYPT_HDR_SIZE, ldata))
396 return hdr_read (data)
399 def hdr_from_params (version, paramversion, nacl, iv, ctsize, tag):
401 Assemble the necessary values into a PDTCRYPT header.
403 :type version: int to fit uint16_t
404 :type paramversion: int to fit uint16_t
405 :type nacl: bytes to fit uint8_t[16]
406 :type iv: bytes to fit uint8_t[12]
    :type ctsize: int to fit uint64_t
408 :type tag: bytes to fit uint8_t[16]
410 buf = bytearray (PDTCRYPT_HDR_SIZE)
411 bufv = memoryview (buf)
414 struct.pack_into (FMT_I2N_HDR, bufv, 0,
416 version, paramversion, nacl, iv, ctsize, tag)
417 except Exception as exn:
418 return False, "error assembling header: %s" % str (exn)
420 return True, bytes (buf)
423 def hdr_make_dummy (s):
425 Create a header sized block of bytes initialized to a value derived from a
426 string. Used to verify we’ve jumped back correctly to the actual position
427 of the object header.
429 c = reduce (lambda a, c: a + ord(c), s, 0) % 0xFF
430 return bytes (bytearray (struct.pack ("B", c)) * PDTCRYPT_HDR_SIZE)
435 Assemble a header from the given header structure.
437 return hdr_from_params (version=hdr.get("version"),
438 paramversion=hdr.get("paramversion"),
439 nacl=hdr.get("nacl"), iv=hdr.get("iv"),
440 ctsize=hdr.get("ctsize"), tag=hdr.get("tag"))
443 HDR_FMT = "I2n_header { version: %d, paramversion: %d, nacl: %s[%d]," \
444 " iv: %s[%d], ctsize: %d, tag: %s[%d] }"
447 """Format a header structure into readable output."""
448 return HDR_FMT % (h["version"], h["paramversion"],
449 binascii.hexlify (h["nacl"]), len(h["nacl"]),
450 binascii.hexlify (h["iv"]), len(h["iv"]),
452 binascii.hexlify (h["tag"]), len(h["tag"]))
455 def hex_spaced_of_bytes (b):
456 """Format bytes object, hexdump style."""
457 return " ".join ([ "%.2x%.2x" % (c1, c2)
458 for c1, c2 in zip (b[0::2], b[1::2]) ]) \
459 + (len (b) | 1 == len (b) and " %.2x" % b[-1] or "") # odd lengths
462 def hdr_iv_counter (h):
463 """Extract the variable part of the IV of the given header."""
464 _fixed, cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
468 def hdr_iv_fixed (h):
469 """Extract the fixed part of the IV of the given header."""
470 fixed, _cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
474 hdr_dump = hex_spaced_of_bytes
478 """version = %-4d : %s
479 paramversion = %-4d : %s
486 def hdr_fmt_pretty (h):
488 Format header structure into multi-line representation of its contents and
489 their raw representation. (Omit the implicit “PDTCRYPT” magic bytes that
490 precede every header.)
492 return HDR_FMT_PRETTY \
494 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["version"])),
496 hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["paramversion"])),
497 hex_spaced_of_bytes (h["nacl"]),
498 hex_spaced_of_bytes (h["iv"]),
500 hex_spaced_of_bytes (struct.pack (FMT_UINT64_LE, h["ctsize"])),
501 hex_spaced_of_bytes (h["tag"]))
503 IV_FMT = "((f %s) (c %d))"
506 """Format the two components of an IV in a readable fashion."""
507 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
508 return IV_FMT % (binascii.hexlify (fixed), cnt)
511 ###############################################################################
513 ###############################################################################
515 class Location (object):
519 def restore_loc_fmt (loc):
521 % (loc.n, loc.offset)
523 def locate_hdr_candidates (fd):
525 Walk over instances of the magic string in the payload, collecting their
526 positions. If the offset of the first found instance is not zero, the file
527 begins with leading garbage.
529 :return: The list of offsets in the file.
533 mm = mmap.mmap(fd, 0, mmap.MAP_SHARED, mmap.PROT_READ)
536 pos = mm.find (PDTCRYPT_HDR_MAGIC, pos)
545 HDR_CAND_GOOD = 0 # header marks begin of valid object
546 HDR_CAND_FISHY = 1 # inconclusive (tag mismatch, obj overlap etc.)
547 HDR_CAND_JUNK = 2 # not a header / object unreadable
550 def inspect_hdr (fd, off):
552 Attempt to parse a header in *fd* at position *off*.
554 Returns a verdict about the quality of that header plus the parsed header
558 _ = os.lseek (fd, off, os.SEEK_SET)
560 if os.lseek (fd, 0, os.SEEK_CUR) != off:
561 if PDTCRYPT_VERBOSE is True:
562 noise ("PDT: %d → dismissed (lseek() past EOF)" % off)
563 return HDR_CAND_JUNK, None
565 raw = os.read (fd, PDTCRYPT_HDR_SIZE)
566 if len (raw) != PDTCRYPT_HDR_SIZE:
567 if PDTCRYPT_VERBOSE is True:
568 noise ("PDT: %d → dismissed (EOF inside header)" % off)
569 return HDR_CAND_JUNK, None
573 except InvalidHeader as exn:
574 if PDTCRYPT_VERBOSE is True:
575 noise ("PDT: %d → dismissed (invalid: [%s])" % (off, str (exn)))
576 return HDR_CAND_JUNK, None
578 obj0 = off + PDTCRYPT_HDR_SIZE
579 objX = obj0 + hdr ["ctsize"]
581 eof = os.lseek (fd, 0, os.SEEK_END)
583 if PDTCRYPT_VERBOSE is True:
584 noise ("PDT: %d → EOF inside object (%d≤%d≤%d); adjusting size to "
585 "%d" % (off, obj0, eof, objX, (eof - obj0)))
586 # try reading up to the end
587 hdr ["ctsize"] = eof - obj0
588 return HDR_CAND_FISHY, hdr
590 return HDR_CAND_GOOD, hdr
593 def try_decrypt (fd, off, hdr, secret, fname=None):
595 Attempt to decrypt the object in the (seekable) descriptor *fd* starting at
596 *off* using the metadata in *hdr* and *secret*. An output file can be
597 specified with *fname*; if it is *None*, the decrypted payload will be
600 Always creates a fresh decryptor, so validation steps across objects don’t
603 ctleft = hdr ["ctsize"]
607 if ks == PDTCRYPT_SECRET_PW:
608 decr = Decrypt (password=secret [1])
609 elif ks == PDTCRYPT_SECRET_KEY:
610 key = binascii.unhexlify (secret [1])
611 decr = Decrypt (key=key)
615 if fname is not None: raise NotImplementedError
620 os.lseek (fd, pos, os.SEEK_SET)
622 cnksiz = min (ctleft, PDTCRYPT_BLOCKSIZE)
623 cnk = os.read (fd, cnksiz)
626 _pt = decr.process (cnk)
629 except Exception as exn:
630 noise ("PDT: error decrypting object %d–%d@%d, %d B remaining [%s]"
631 % (off, off + hdr ["ctsize"], pos, ctleft, exn))
637 ###############################################################################
638 ## passthrough / null encryption
639 ###############################################################################
641 class PassthroughCipher (object):
643 tag = struct.pack ("<QQ", 0, 0)
645 def __init__ (self) : pass
647 def update (self, b) : return b
649 def finalize (self) : return b""
651 def finalize_with_tag (self, _) : return b""
653 ###############################################################################
654 ## convenience wrapper
655 ###############################################################################
658 def kdf_dummy (klen, password, _nacl):
660 Fake KDF for testing purposes that is called when parameter version zero is
    if isinstance (password, bytes) is False:
        password = password.encode ()
    # compute the repetitions on the encoded bytes so the result is klen long
    q, r = divmod (klen, len (password))
    return password * q + password [:r], b""
669 SCRYPT_KEY_MEMO = { } # static because needed for both the info file and the archive
672 def kdf_scrypt (params, password, nacl):
674 Wrapper for the Scrypt KDF, corresponds to parameter version one. The
675 computation result is memoized based on the inputs to facilitate spawning
676 multiple encryption contexts.
681 dkLen = params["dkLen"]
684 nacl = os.urandom (params["NaCl_LEN"])
686 key_parms = (password, nacl, N, r, p, dkLen)
687 global SCRYPT_KEY_MEMO
688 if key_parms not in SCRYPT_KEY_MEMO:
689 SCRYPT_KEY_MEMO [key_parms] = \
690 pylibscrypt.scrypt (password, nacl, N, r, p, dkLen)
691 return SCRYPT_KEY_MEMO [key_parms], nacl
694 def kdf_by_version (paramversion=None, defs=None):
696 Pick the KDF handler corresponding to the parameter version or the
699 :rtype: function (password : str, nacl : str) -> str
701 if paramversion is not None:
702 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
704 raise InvalidParameter ("no encryption parameters for version %r"
706 (kdf, params) = defs["kdf"]
708 if kdf == "scrypt" : fn = kdf_scrypt
709 if kdf == "dummy" : fn = kdf_dummy
711 raise ValueError ("key derivation method %r unknown" % kdf)
712 return partial (fn, params)
715 ###############################################################################
717 ###############################################################################
719 def scrypt_hashsource (pw, ins):
721 Calculate the SCRYPT hash from the password and the information contained
722 in the first header found in ``ins``.
724 This does not validate whether the first object is encrypted correctly.
726 if isinstance (pw, str) is True:
728 elif isinstance (pw, bytes) is False:
729 raise InvalidParameter ("password must be a string, not %s"
731 if isinstance (ins, io.BufferedReader) is False and \
732 isinstance (ins, io.FileIO) is False:
733 raise InvalidParameter ("file to hash must be opened in “binary” mode")
736 hdr = hdr_read_stream (ins)
737 except EndOfFile as exn:
738 noise ("PDT: malformed input: end of file reading first object header")
743 pver = hdr ["paramversion"]
744 if PDTCRYPT_VERBOSE is True:
745 noise ("PDT: salt of first object : %s" % binascii.hexlify (nacl))
746 noise ("PDT: parameter version of archive : %d" % pver)
749 defs = ENCRYPTION_PARAMETERS.get(pver, None)
750 kdfname, params = defs ["kdf"]
751 if kdfname != "scrypt":
752 noise ("PDT: input is not an SCRYPT archive")
755 kdf = kdf_by_version (None, defs)
756 except ValueError as exn:
757 noise ("PDT: object has unknown parameter version %d" % pver)
759 hsh, _void = kdf (pw, nacl)
761 return hsh, nacl, hdr ["version"], pver
764 def scrypt_hashfile (pw, fname):
766 Calculate the SCRYPT hash from the password and the information contained
767 in the first header found in the given file. The header is read only at
770 with deptdcrypt_mk_stream (PDTCRYPT_SOURCE, fname or "-") as ins:
771 hsh, _void, _void, _void = scrypt_hashsource (pw, ins)
775 ###############################################################################
777 ###############################################################################
779 class Crypto (object):
781 Encryption context to remain alive throughout an entire tarfile pass.
786 cnt = None # file counter (uint32_t != 0)
787 iv = None # current IV
788 fixed = None # accu for 64 bit fixed parts of IV
789 used_ivs = None # tracks IVs
790 strict_ivs = False # if True, panic on duplicate object IV
799 info_counter_used = False
800 index_counter_used = False
802 def __init__ (self, *al, **akv):
803 self.used_ivs = set ()
804 self.set_parameters (*al, **akv)
807 def next_fixed (self):
812 def set_object_counter (self, cnt=None):
814 Safely set the internal counter of encrypted objects. Numerous
817 The same counter may not be reused in combination with one IV fixed
818 part. This is validated elsewhere in the IV handling.
820 Counter zero is invalid. The first two counters are reserved for
821 metadata. The implementation does not allow for splitting metadata
822 files over multiple encrypted objects. (This would be possible by
823 assigning new fixed parts.) Thus in a Deltatar backup there is at most
824 one object with a counter value of one and two. On creation of a
825 context, the initial counter may be chosen. The globals
826 ``AES_GCM_IV_CNT_INFOFILE`` and ``AES_GCM_IV_CNT_INDEX`` can be used to
827 request one of the reserved values. If one of these values has been
828 used, any further attempt of setting the counter to that value will
829 be rejected with an ``InvalidFileCounter`` exception.
831 Out of bounds values (i. e. below one and more than the maximum of 2³²)
832 cause an ``InvalidParameter`` exception to be thrown.
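The rules above can be condensed into a small state machine. The following is
a sketch only: ``CounterState`` is a hypothetical stand-in for the counter
handling of a crypto context, the constants mirror the values of the module
globals, and builtin exceptions replace ``InvalidParameter`` and
``InvalidFileCounter``. Obtaining a fresh IV fixed part when the counter wraps
is left out.

```python
CNT_INFOFILE = 1               # reserved: info file
CNT_INDEX    = 2               # reserved: index file
CNT_DATA     = 3               # first counter for ordinary objects
CNT_MAX      = 0xffFFffFF

class CounterState (object):
    # hypothetical stand-in for the counter handling of a crypto context
    def __init__ (self):
        self.cnt = None
        self.reserved_used = set ()

    def set_counter (self, cnt=None):
        if cnt is None:
            self.cnt = CNT_DATA
            return
        if cnt == 0 or cnt > CNT_MAX + 1:
            raise ValueError ("invalid counter value %d" % cnt)
        if cnt in (CNT_INFOFILE, CNT_INDEX):
            if cnt in self.reserved_used:
                raise RuntimeError ("reserved counter %d reused" % cnt)
            self.reserved_used.add (cnt)
        if cnt <= CNT_MAX:
            self.cnt = cnt
        else:
            # cnt == CNT_MAX + 1: wrap around to the first data counter
            self.cnt = CNT_DATA

state = CounterState ()
state.set_counter (CNT_INFOFILE)
state.set_counter (CNT_MAX + 1)
print ("after wrap:", state.cnt)
```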
835 self.cnt = AES_GCM_IV_CNT_DATA
837 if cnt == 0 or cnt > AES_GCM_IV_CNT_MAX + 1:
838 raise InvalidParameter ("invalid counter value %d requested: "
839 "acceptable values are from 1 to %d"
840 % (cnt, AES_GCM_IV_CNT_MAX))
841 if cnt == AES_GCM_IV_CNT_INFOFILE:
842 if self.info_counter_used is True:
843 raise InvalidFileCounter ("attempted to reuse info file "
844 "counter %d: must be unique" % cnt)
845 self.info_counter_used = True
846 elif cnt == AES_GCM_IV_CNT_INDEX:
847 if self.index_counter_used is True:
                raise InvalidFileCounter ("attempted to reuse index file "
                                          "counter %d: must be unique" % cnt)
850 self.index_counter_used = True
851 if cnt <= AES_GCM_IV_CNT_MAX:
854 # cnt == AES_GCM_IV_CNT_MAX + 1 → wrap
855 self.cnt = AES_GCM_IV_CNT_DATA
859 def set_parameters (self, password=None, key=None, paramversion=None,
860 nacl=None, counter=None, strict_ivs=False):
862 Configure the internal state of a crypto context. Not intended for
866 self.set_object_counter (counter)
867 self.strict_ivs = strict_ivs
869 if paramversion is not None:
870 self.paramversion = paramversion
873 self.key, self.nacl = key, nacl
876 if password is not None:
877 if isinstance (password, bytes) is False:
878 password = str.encode (password)
879 self.password = password
880 if paramversion is None and nacl is None:
881 # postpone key setup until first header is available
883 kdf = kdf_by_version (paramversion)
885 self.key, self.nacl = kdf (password, nacl)
888 def process (self, buf):
890 Encrypt / decrypt a buffer. Invokes the ``.update()`` method on the
891 wrapped encryptor or decryptor, respectively.
893 The Cryptography exception ``AlreadyFinalized`` is translated to an
894 ``InternalError`` at this point. It may occur in sound code when the GC
895 closes an encrypting stream after an error. Everywhere else it must be
899 raise RuntimeError ("process: context not initialized")
900 self.stats ["in"] += len (buf)
902 out = self.enc.update (buf)
903 except cryptography.exceptions.AlreadyFinalized as exn:
904 raise InternalError (exn)
905 self.stats ["out"] += len (out)
909 def next (self, password, paramversion, nacl, iv):
911 Prepare for encrypting another object: Reset the data counters and
912 change the configuration in case one of the variable parameters differs
913 from the last object. Also check the IV for duplicates and error out
914 if strict checking was requested.
918 self.stats ["obj"] += 1
920 self.check_duplicate_iv (iv)
922 if ( self.paramversion != paramversion
923 or self.password != password
924 or self.nacl != nacl):
925 self.set_parameters (password=password, paramversion=paramversion,
926 nacl=nacl, strict_ivs=self.strict_ivs)
929 def check_duplicate_iv (self, iv):
        Add an IV (the 12 byte representation as in the header) to the set of
        IVs seen so far. With strict checking enabled, this will throw a
        ``DuplicateIV`` if the IV is already in that set. Depending on the
        context, this may indicate a serious error (IV reuse).
935 if self.strict_ivs is True and iv in self.used_ivs:
936 raise DuplicateIV ("iv %s was reused" % iv_fmt (iv))
        # iv has not been used before; add to collection
938 self.used_ivs.add (iv)
943 Access the data counters.
945 return self.stats ["obj"], self.stats ["in"], self.stats ["out"]
950 Clear the current context regardless of its finalization state. The
951 next operation must be ``.next()``.
956 class Encrypt (Crypto):
962 def __init__ (self, version, paramversion, password=None, key=None, nacl=None,
963 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
965 The ctor will throw immediately if one of the parameters does not conform
969 :type version: int to fit uint16_t
970 :type paramversion: int to fit uint16_t
971 :param password: mutually exclusive with ``key``
972 :type password: bytes
973 :param key: mutually exclusive with ``password``
        :param counter: initial object counter; the values
                        ``AES_GCM_IV_CNT_INFOFILE`` and
                        ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
                        and cannot be reused even with different fixed parts.
980 :type strict_ivs: bool
982 if password is None and key is None \
983 or password is not None and key is not None :
984 raise InvalidParameter ("__init__: need either key or password")
987 if isinstance (key, bytes) is False:
988 raise InvalidParameter ("__init__: key must be provided as "
989 "bytes, not %s" % type (key))
991 raise InvalidParameter ("__init__: salt must be provided along "
992 "with encryption key")
993 else: # password, no key
994 if isinstance (password, str) is False:
995 raise InvalidParameter ("__init__: password must be a string, not %s"
997 if len (password) == 0:
                raise InvalidParameter ("__init__: empty passwords are not "
                                        "permitted for PDT encrypted files")
1001 if isinstance (version, int) is False:
1002 raise InvalidParameter ("__init__: version number must be an "
1003 "integer, not %s" % type (version))
1005 raise InvalidParameter ("__init__: version number must be a "
1006 "nonnegative integer, not %d" % version)
1008 if isinstance (paramversion, int) is False:
1009 raise InvalidParameter ("__init__: crypto parameter version number "
1010 "must be an integer, not %s"
1011 % type (paramversion))
1012 if paramversion < 0:
1013 raise InvalidParameter ("__init__: crypto parameter version number "
1014 "must be a nonnegative integer, not %d"
1017 if nacl is not None:
1018 if isinstance (nacl, bytes) is False:
1019 raise InvalidParameter ("__init__: salt given, but of type %s "
1020 "instead of bytes" % type (nacl))
1021 # salt length would depend on the actual encryption so it can’t be
1022 # validated at this point
1024 self.version = version
1025 self.paramenc = ENCRYPTION_PARAMETERS.get (paramversion) ["enc"]
1027 super().__init__ (password, key, paramversion, nacl, counter=counter,
1028 strict_ivs=strict_ivs)
1031 def next_fixed (self, retries=PDTCRYPT_IV_GEN_MAX_RETRIES):
1033 Generate the next IV fixed part by reading eight bytes from
1034 ``/dev/urandom``. The buffer so obtained is tested against the fixed
1035 parts used so far to prevent accidental reuse of IVs. After a
1036 configurable number of attempts to create a unique fixed part, it will
1037 refuse to continue with an ``IVFixedPartError``. This is unlikely to
1038 ever happen on a normal system but may detect an issue with the random
1041 The list of fixed parts that were used by the context at hand can be
1042 accessed through the ``.fixed`` list. Its last element is the fixed
1043 part currently in use.
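A minimal sketch of this retry scheme, written as a free function for
illustration: the ``used`` list plays the role of ``.fixed``, the ``rng``
parameter is an assumption added so that a faulty random source can be
simulated, and ``RuntimeError`` stands in for ``IVFixedPartError``.

```python
import os

def next_fixed (used, retries=10, size=8, rng=os.urandom):
    # append a fresh fixed part to *used*, retrying on collisions
    for _ in range (retries):
        fp = rng (size)
        if fp not in used:
            used.append (fp)
            return fp
    raise RuntimeError ("no unique IV fixed part after %d tries" % retries)

used = []
first = next_fixed (used)
second = next_fixed (used)
print (len (used), first != second)

# A broken RNG that always returns the same bytes is caught on the second call.
stuck = lambda n: bytes (n)
fixed = []
next_fixed (fixed, rng=stuck)
try:
    next_fixed (fixed, rng=stuck)
    detected = False
except RuntimeError:
    detected = True
print ("stuck RNG detected:", detected)
```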
1047 fp = os.urandom (PDTCRYPT_IV_FIXEDPART_SIZE)
1048 if fp not in self.fixed:
1049 self.fixed.append (fp)
1052 raise IVFixedPartError ("error obtaining a unique IV fixed part from "
1053 "/dev/urandom; giving up after %d tries" % i)
        Construct a 12-byte IV from the current fixed part and the object
1061 return struct.pack(FMT_I2N_IV, self.fixed [-1], self.cnt)
1064 def next (self, filename=None, counter=None):
1066 Prepare for encrypting the next incoming object. Update the counter
1067 and put together the IV, possibly changing prefixes. Then create the
1070 The argument ``counter`` can be used to specify a file counter for this
1071 object. Unless it is one of the reserved values, the counter of
1072 subsequent objects will be computed from this one.
1074 If this is the first object in a series, ``filename`` is required,
1075 otherwise it is reused if not present. The value is used to derive a
1076 header sized placeholder to use until after encryption when all the
1077 inputs to construct the final header are available. This is then
1078 matched in ``.done()`` against the value found at the position of the
1079 header. The motivation for this extra check is primarily to assist
1080 format debugging: It makes stray headers easy to spot in malformed
1083 if filename is None:
1084 if self.lastinfo is None:
1085 raise InvalidParameter ("next: filename is mandatory for "
1087 filename, _dummy = self.lastinfo
1089 if isinstance (filename, str) is False:
                raise InvalidParameter ("next: filename must be a string, not %s"
1092 if counter is not None:
1093 if isinstance (counter, int) is False:
1094 raise InvalidParameter ("next: the supplied counter is of "
1095 "invalid type %s; please pass an "
1096 "integer instead" % type (counter))
1097 self.set_object_counter (counter)
1099 self.iv = self.iv_make ()
1100 if self.paramenc == "aes-gcm":
1102 ( algorithms.AES (self.key)
1103 , modes.GCM (self.iv)
1104 , backend = default_backend ()) \
1106 elif self.paramenc == "passthrough":
1107 self.enc = PassthroughCipher ()
1109 raise InvalidParameter ("next: parameter version %d not known"
1110 % self.paramversion)
1111 hdrdum = hdr_make_dummy (filename)
1112 self.lastinfo = (filename, hdrdum)
1113 super().next (self.password, self.paramversion, self.nacl, self.iv)
1115 self.set_object_counter (self.cnt + 1)
1119 def done (self, cmpdata):
        Complete encryption of an object. After this has been called, attempts
        to encrypt further data will cause an error until ``.next()`` is
        Returns a 64-byte buffer containing the object header including all
1126 values including the “late” ones e. g. the ciphertext size and the
1129 if isinstance (cmpdata, bytes) is False:
1130 raise InvalidParameter ("done: comparison input expected as bytes, "
1131 "not %s" % type (cmpdata))
1132 if self.lastinfo is None:
1133 raise RuntimeError ("done: encryption context not initialized")
1134 filename, hdrdum = self.lastinfo
1135 if cmpdata != hdrdum:
1136 raise RuntimeError ("done: bad sync of header for object %d: "
1137 "preliminary data does not match; this likely "
1138 "indicates a wrongly repositioned stream"
1140 data = self.enc.finalize ()
1141 self.stats ["out"] += len (data)
1142 self.ctsize += len (data)
1143 ok, hdr = hdr_from_params (self.version, self.paramversion, self.nacl,
1144 self.iv, self.ctsize, self.enc.tag)
1146 raise InternalError ("error constructing header: %r" % hdr)
1147 return data, hdr, self.fixed
1150 def process (self, buf):
1152 Encrypt a chunk of plaintext with the active encryptor. Returns the
1153 size of the input consumed. This **must** be checked downstream. If the
1154 maximum possible object size has been reached, the current context must
1155 be finalized and a new one established before any further data can be
1156 encrypted. The second argument is the remainder of the plaintext that
1157 was not encrypted for the caller to use immediately after the new
1160 if isinstance (buf, bytes) is False:
1161 raise InvalidParameter ("process: expected byte buffer, not %s"
1164 newptsize = self.ptsize + bsize
1165 diff = newptsize - PDTCRYPT_MAX_OBJ_SIZE
1168 newptsize = PDTCRYPT_MAX_OBJ_SIZE
1169 self.ptsize = newptsize
1170 data = super().process (buf [:bsize])
1171 self.ctsize += len (data)
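
# Illustrative sketch, not part of the original module: the size clamping
# that the docstring of ``Encrypt.process`` above describes, reduced to plain
# arithmetic. The helper name ``_sketch_clamp_chunk`` and its parameters are
# hypothetical; the real limit is ``PDTCRYPT_MAX_OBJ_SIZE``.
def _sketch_clamp_chunk (ptsize, bsize, max_obj_size):
    """
    Return ``(consumed, newptsize)``: how many of the *bsize* bytes offered
    still fit into the current object, and the resulting plaintext size.
    """
    newptsize = ptsize + bsize
    diff = newptsize - max_obj_size
    if diff > 0: # the chunk would exceed the object size limit
        bsize -= diff
        newptsize = max_obj_size
    return bsize, newptsize

# A caller would compare ``consumed`` against the length of its buffer and, if
# it is shorter, finalize the current object and feed the remainder to a new
# one.
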
1175 class Decrypt (Crypto):
1177 tag = None # GCM tag, part of header
1178 last_iv = None # check consecutive ivs in strict mode
1180 def __init__ (self, password=None, key=None, counter=None, fixedparts=None,
1183 Sanitizing ctor for the decryption context. ``fixedparts`` specifies a
1184 list of IV fixed parts accepted during decryption. If a fixed part is
1185 encountered that is not in the list, decryption will fail.
1187 :param password: mutually exclusive with ``key``
1188 :type password: str
1189 :param key: mutually exclusive with ``password``
1191 :param counter: initial object counter; the values
1192 ``AES_GCM_IV_CNT_INFOFILE`` and
1193 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
1194 and cannot be reused even with different fixed parts.
1195 :type fixedparts: bytes list
1197 if password is None and key is None \
1198 or password is not None and key is not None :
1199 raise InvalidParameter ("__init__: need either key or password")
1202 if isinstance (key, bytes) is False:
1203 raise InvalidParameter ("__init__: key must be provided as "
1204 "bytes, not %s" % type (key))
1205 else: # password, no key
1206 if isinstance (password, str) is False:
1207 raise InvalidParameter ("__init__: password must be a string, not %s"
1209 if len (password) == 0:
1210 raise InvalidParameter ("__init__: empty passwords are not "
1211 "permitted for PDT encrypted files")
1213 if fixedparts is not None:
1214 if isinstance (fixedparts, list) is False:
1215 raise InvalidParameter ("__init__: IV fixed parts must be "
1216 "supplied as list, not %s"
1217 % type (fixedparts))
1218 self.fixed = fixedparts
1221 super().__init__ (password=password, key=key, counter=counter,
1222 strict_ivs=strict_ivs)
1225 def valid_fixed_part (self, iv):
1227 Check whether the fixed part of *iv* is in the list of accepted fixed parts.
1229 # check if fixed part is known
1230 fixed, _cnt = struct.unpack (FMT_I2N_IV, iv)
1231 i = bisect.bisect_left (self.fixed, fixed)
1232 return i != len (self.fixed) and self.fixed [i] == fixed
1235 def check_consecutive_iv (self, iv):
1237 Check whether the counter part of the given IV is indeed the successor
1238 of the currently present counter. This should always be the case for
1239 the objects in a well formed PDT archive but should not be enforced
1240 when decrypting out-of-order.
1242 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
1243 if self.strict_ivs is True \
1244 and self.last_iv is not None \
1245 and self.last_iv [0] == fixed \
1246 and self.last_iv [1] != cnt - 1:
1247 raise NonConsecutiveIV ("iv %s counter not successor of "
1248 "last object (expected %d, found %d)"
1249 % (iv_fmt (iv), self.last_iv [1] + 1, cnt))
1250 self.last_iv = (fixed, cnt) # keep the fixed part so the check above can compare it
1253 def next (self, hdr):
1255 Start decrypting the next object. The PDTCRYPT header for the object
1256 can be given either as already parsed object or as bytes.
1258 if isinstance (hdr, bytes) is True:
1259 hdr = hdr_read (hdr)
1260 elif isinstance (hdr, dict) is False:
1261 # this won’t catch malformed specs though
1262 raise InvalidParameter ("next: wrong type of parameter hdr: "
1263 "expected bytes or spec, got %s"
1266 paramversion = hdr ["paramversion"]
1271 raise InvalidHeader ("next: not a header %r" % hdr)
1273 super().next (self.password, paramversion, nacl, iv)
1274 if self.fixed is not None and self.valid_fixed_part (iv) is False:
1275 raise InvalidIVFixedPart ("iv %s has invalid fixed part"
1277 self.check_consecutive_iv (iv)
1280 defs = ENCRYPTION_PARAMETERS.get (paramversion, None)
1282 raise FormatError ("header contains unknown parameter version %d; "
1283 "maybe the file was created by a more recent "
1284 "version of Deltatar" % paramversion)
1286 if enc == "aes-gcm":
1288 ( algorithms.AES (self.key)
1289 , modes.GCM (iv, tag=self.tag)
1290 , backend = default_backend ()) \
1292 elif enc == "passthrough":
1293 self.enc = PassthroughCipher ()
1295 raise InternalError ("encryption parameter set %d refers to unknown "
1296 "mode %r" % (paramversion, enc))
1297 self.set_object_counter (self.cnt + 1)
1300 def done (self, tag=None):
1302 Stop decryption of the current object and finalize it with the active
1303 context. This will throw an *InvalidGCMTag* exception to indicate that
1304 the authentication tag does not match the data. If the tag is correct,
1305 the rest of the plaintext is returned.
1310 data = self.enc.finalize ()
1312 if isinstance (tag, bytes) is False:
1313 raise InvalidParameter ("done: wrong type of parameter "
1314 "tag: expected bytes, got %s"
1316 data = self.enc.finalize_with_tag (tag)
1317 except cryptography.exceptions.InvalidTag:
1318 raise InvalidGCMTag ("done: tag mismatch of object %d: %s "
1319 "rejected by finalize ()"
1320 % (self.cnt, binascii.hexlify (self.tag)))
1321 self.ctsize += len (data)
1322 self.stats ["out"] += len (data)
1326 def process (self, buf):
1328 Decrypt the bytes object *buf* with the active decryptor.
1330 if isinstance (buf, bytes) is False:
1331 raise InvalidParameter ("process: expected byte buffer, not %s"
1333 self.ctsize += len (buf)
1334 data = super().process (buf)
1335 self.ptsize += len (data)
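
# Illustrative sketch, not part of the original module: the two IV checks of
# ``Decrypt`` (fixed part lookup and consecutive counter) as free functions.
# Assumptions: the layout ``"<8sL"`` (8 byte fixed part followed by a 32 bit
# counter) merely stands in for ``FMT_I2N_IV``, whose definition is not shown
# in this section; the list of fixed parts must be sorted because ``bisect``
# relies on that. All ``_sketch_*`` names are hypothetical.
import bisect
import struct

_SKETCH_FMT_IV = "<8sL" # assumption, see above

def _sketch_known_fixed_part (fixedparts, iv):
    """Binary search for the fixed part of *iv* in the sorted *fixedparts*."""
    fixed, _cnt = struct.unpack (_SKETCH_FMT_IV, iv)
    i = bisect.bisect_left (fixedparts, fixed)
    return i != len (fixedparts) and fixedparts [i] == fixed

def _sketch_is_successor (last, iv):
    """
    *last* is the ``(fixed, cnt)`` pair of the previous object or ``None``.
    Counters are only compared if both IVs share the same fixed part.
    """
    fixed, cnt = struct.unpack (_SKETCH_FMT_IV, iv)
    if last is None or last [0] != fixed:
        return True
    return last [1] == cnt - 1
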
1339 ###############################################################################
1341 ###############################################################################
1343 def _patch_global (glob, vow, n=None):
1345 Adapt upper file counter bound for testing IV logic. Completely unsafe.
1347 assert vow == "I am fully aware that this will void my warranty."
1348 r = globals () [glob]
1350 n = globals () [glob + "_DEFAULT"]
1351 globals () [glob] = n
1354 _testing_set_AES_GCM_IV_CNT_MAX = \
1355 partial (_patch_global, "AES_GCM_IV_CNT_MAX")
1357 _testing_set_PDTCRYPT_MAX_OBJ_SIZE = \
1358 partial (_patch_global, "PDTCRYPT_MAX_OBJ_SIZE")
1360 ###############################################################################
1361 ## freestanding invocation
1362 ###############################################################################
1364 PDTCRYPT_SUB_PROCESS = 0
1365 PDTCRYPT_SUB_SCRYPT = 1
1366 PDTCRYPT_SUB_SCAN = 2
1369 { "process" : PDTCRYPT_SUB_PROCESS
1370 , "scrypt" : PDTCRYPT_SUB_SCRYPT
1371 , "scan" : PDTCRYPT_SUB_SCAN }
1373 PDTCRYPT_SECRET_PW = 0
1374 PDTCRYPT_SECRET_KEY = 1
1376 PDTCRYPT_DECRYPT = 1 << 0 # decrypt archive with password
1377 PDTCRYPT_SPLIT = 1 << 1 # split archive into individual objects
1378 PDTCRYPT_HASH = 1 << 2 # output scrypt hash for file and given password
1380 PDTCRYPT_SPLITNAME = "pdtcrypt-object-%d.bin"
1382 PDTCRYPT_VERBOSE = False
1383 PDTCRYPT_STRICTIVS = False
1384 PDTCRYPT_OVERWRITE = False
1385 PDTCRYPT_BLOCKSIZE = 1 << 12
1390 PDTCRYPT_DEFAULT_VER = 1
1391 PDTCRYPT_DEFAULT_PVER = 1
1393 # scrypt hashing output control
1394 PDTCRYPT_SCRYPT_INTRANATOR = 0
1395 PDTCRYPT_SCRYPT_PARAMETERS = 1
1396 PDTCRYPT_SCRYPT_DEFAULT = PDTCRYPT_SCRYPT_INTRANATOR
1398 PDTCRYPT_SCRYPT_FORMAT = \
1399 { "i2n" : PDTCRYPT_SCRYPT_INTRANATOR
1400 , "params" : PDTCRYPT_SCRYPT_PARAMETERS }
1402 PDTCRYPT_TT_COLUMNS = 80 # assume standard terminal
1404 class PDTDecryptionError (Exception):
1405 """Decryption failed."""
1407 class PDTSplitError (Exception):
1408 """Splitting the archive into objects failed."""
1411 def noise (*a, **b):
1412 print (file=sys.stderr, *a, **b)
1415 class PassthroughDecryptor (object):
1417 curhdr = None # write current header on first data write
1419 def __init__ (self):
1420 if PDTCRYPT_VERBOSE is True:
1421 noise ("PDT: no encryption; data passthrough")
1423 def next (self, hdr):
1424 ok, curhdr = hdr_make (hdr)
1426 raise PDTDecryptionError ("bad header %r" % hdr)
1427 self.curhdr = curhdr
1430 if self.curhdr is not None:
1434 def process (self, d):
1435 if self.curhdr is not None:
1441 def depdtcrypt (mode, secret, ins, outs):
1443 Remove PDTCRYPT layer from all objects encrypted with the secret. Used on a
1444 Deltatar backup this will yield a (possibly Gzip compressed) tarball.
1446 ctleft = -1 # length of ciphertext to consume
1447 ctcurrent = 0 # total ciphertext of current object
1448 total_obj = 0 # total number of objects read
1449 total_pt = 0 # total plaintext bytes
1450 total_ct = 0 # total ciphertext bytes
1451 total_read = 0 # total bytes read
1452 outfile = None # Python file object for output
1454 if mode & PDTCRYPT_DECRYPT: # decryptor
1456 if ks == PDTCRYPT_SECRET_PW:
1457 decr = Decrypt (password=secret [1], strict_ivs=PDTCRYPT_STRICTIVS)
1458 elif ks == PDTCRYPT_SECRET_KEY:
1459 key = binascii.unhexlify (secret [1])
1460 decr = Decrypt (key=key, strict_ivs=PDTCRYPT_STRICTIVS)
1462 raise InternalError ("‘%d’ does not specify a valid kind of secret"
1465 decr = PassthroughDecryptor ()
1468 """Dummy for non-split mode: output file does not vary."""
1471 if mode & PDTCRYPT_SPLIT:
1472 def nextout (outfile):
1474 We were passed an fd as outs for accessing the destination
1475 directory where extracted archive components are supposed
1480 if PDTCRYPT_VERBOSE is True:
1481 noise ("PDT: no output file to close at this point")
1483 if PDTCRYPT_VERBOSE is True:
1484 noise ("PDT: release output file %r" % outfile)
1485 # cleanup happens automatically by the GC; the next
1486 # line will error out on account of an invalid fd
1489 assert total_obj > 0
1490 fname = PDTCRYPT_SPLITNAME % total_obj
1492 oflags = os.O_CREAT | os.O_WRONLY
1493 if PDTCRYPT_OVERWRITE is True:
1494 oflags |= os.O_TRUNC
1497 outfd = os.open (fname, oflags, 0o600, dir_fd=outs)
1498 if PDTCRYPT_VERBOSE is True:
1499 noise ("PDT: new output file %s → %d" % (fname, outfd))
1500 except FileExistsError as exn:
1501 noise ("PDT: refusing to overwrite existing file %s" % fname)
1503 raise PDTSplitError ("destination file %s already exists"
1506 return os.fdopen (outfd, "wb", closefd=True)
1510 """ESPIPE is normal on non-seekable stdio stream."""
1513 except OSError as exn:
1514 if exn.errno == errno.ESPIPE: # ``os.errno`` is gone since Python 3.7; needs ``import errno``
1517 def out (pt, outfile):
1521 if PDTCRYPT_VERBOSE is True:
1522 noise ("PDT:\t· decrypt plaintext %d B" % (npt))
1524 nn = outfile.write (pt)
1525 except OSError as exn: # probably ENOSPC
1526 raise DecryptionError ("error (%s)" % exn)
1528 raise DecryptionError ("write aborted after %d of %d B" % (nn, npt))
1532 # current object completed; in a valid archive this marks either
1533 # the start of a new header or the end of the input
1534 if ctleft == 0: # current object requires finalization
1535 if PDTCRYPT_VERBOSE is True:
1536 noise ("PDT: %d finalize" % tell (ins))
1539 except InvalidGCMTag as exn:
1540 raise DecryptionError ("error finalizing object %d (%d B): "
1541 "%r" % (total_obj, len (pt), exn)) \
1544 if PDTCRYPT_VERBOSE is True:
1545 noise ("PDT:\t· object validated")
1547 if PDTCRYPT_VERBOSE is True:
1548 noise ("PDT: %d hdr" % tell (ins))
1550 hdr = hdr_read_stream (ins)
1551 total_read += PDTCRYPT_HDR_SIZE
1552 except EndOfFile as exn:
1553 total_read += exn.remainder
1554 if total_ct + total_obj * PDTCRYPT_HDR_SIZE != total_read:
1555 raise PDTDecryptionError ("ciphertext processed (%d B) plus "
1556 "overhead (%d × %d B) does not match "
1557 "the number of bytes read (%d)"
1558 % (total_ct, total_obj, PDTCRYPT_HDR_SIZE,
1560 # the single good exit
1561 return total_read, total_obj, total_ct, total_pt
1562 except InvalidHeader as exn:
1563 raise PDTDecryptionError ("invalid header at position %d in %r "
1564 "(%s)" % (tell (ins), ins, exn))
1565 if PDTCRYPT_VERBOSE is True:
1566 pretty = hdr_fmt_pretty (hdr)
1567 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1568 pretty.splitlines (), ""))
1569 ctcurrent = ctleft = hdr ["ctsize"]
1573 total_obj += 1 # used in file counter with split mode
1575 # finalization complete or skipped in case of first object in
1576 # stream; create a new output file if necessary
1577 outfile = nextout (outfile)
1579 if PDTCRYPT_VERBOSE is True:
1580 noise ("PDT: %d decrypt obj no. %d, %d B"
1581 % (tell (ins), total_obj, ctleft))
1583 # always allocate a new buffer since python-cryptography doesn’t allow
1584 # passing a bytearray :/
1585 nexpect = min (ctleft, PDTCRYPT_BLOCKSIZE)
1586 if PDTCRYPT_VERBOSE is True:
1587 noise ("PDT:\t· [%d] %d%% done, read block (%d B of %d B remaining)"
1589 100 - ctleft * 100 / (ctcurrent > 0 and ctcurrent or 1),
1591 ct = ins.read (nexpect)
1595 raise EndOfFile (nct,
1596 "hit EOF after %d of %d B in block [%d:%d); "
1597 "%d B ciphertext remaining for object no %d"
1598 % (nct, nexpect, off, off + nexpect, ctleft,
1604 if PDTCRYPT_VERBOSE is True:
1605 noise ("PDT:\t· decrypt ciphertext %d B" % (nct))
1606 pt = decr.process (ct)
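
# Illustrative sketch, not part of the original module: reading the ciphertext
# of a single object in blocks of at most ``blocksize`` bytes, similar to the
# read loop in ``depdtcrypt`` above. Assumptions: ``_sketch_read_object`` is a
# hypothetical helper, *ins* is a buffered binary stream (so a short read
# means end of input), and ``EOFError`` stands in for the ``EndOfFile``
# exception of this module.
def _sketch_read_object (ins, ctsize, blocksize=1 << 12):
    ctleft = ctsize # ciphertext bytes still to consume
    blocks = []
    while ctleft > 0:
        nexpect = min (ctleft, blocksize)
        ct = ins.read (nexpect)
        nct = len (ct)
        if nct < nexpect: # input ended inside the object
            raise EOFError ("hit EOF after %d of %d B; %d B ciphertext "
                            "remaining" % (nct, nexpect, ctleft))
        ctleft -= nct
        blocks.append (ct)
    return blocks
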
1610 def deptdcrypt_mk_stream (kind, path):
1611 """Create stream from file or stdio descriptor."""
1612 if kind == PDTCRYPT_SINK:
1614 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: stdout")
1615 return sys.stdout.buffer
1617 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: file %s" % path)
1618 return io.FileIO (path, "w")
1619 if kind == PDTCRYPT_SOURCE:
1621 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: stdin")
1622 return sys.stdin.buffer
1624 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: file %s" % path)
1625 return io.FileIO (path, "r")
1627 raise ValueError ("bogus stream “%s” / %s" % (kind, path))
1630 def mode_depdtcrypt (mode, secret, ins, outs):
1632 total_read, total_obj, total_ct, total_pt = \
1633 depdtcrypt (mode, secret, ins, outs)
1634 except DecryptionError as exn:
1635 noise ("PDT: Decryption failed:")
1637 noise ("PDT: “%s”" % exn)
1639 noise ("PDT: Did you specify the correct key / password?")
1642 except PDTSplitError as exn:
1643 noise ("PDT: Split operation failed:")
1645 noise ("PDT: “%s”" % exn)
1647 noise ("PDT: Hint: target directory should be empty.")
1651 if PDTCRYPT_VERBOSE is True:
1652 noise ("PDT: decryption successful" )
1653 noise ("PDT: %.10d bytes read" % total_read)
1654 noise ("PDT: %.10d objects decrypted" % total_obj )
1655 noise ("PDT: %.10d bytes ciphertext" % total_ct )
1656 noise ("PDT: %.10d bytes plaintext" % total_pt )
1662 def mode_scrypt (pw, ins=None, nacl=None, fmt=PDTCRYPT_SCRYPT_INTRANATOR):
1664 paramversion = PDTCRYPT_DEFAULT_PVER
1666 hsh, nacl, version, paramversion = scrypt_hashsource (pw, ins)
1667 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1669 nacl = binascii.unhexlify (nacl)
1670 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1671 version = PDTCRYPT_DEFAULT_VER
1673 kdfname, params = defs ["kdf"]
1675 kdf = kdf_by_version (None, defs)
1676 hsh, _void = kdf (pw, nacl)
1680 if fmt == PDTCRYPT_SCRYPT_INTRANATOR:
1681 out = json.dumps ({ "salt" : base64.b64encode (nacl).decode ()
1682 , "key" : base64.b64encode (hsh) .decode ()
1683 , "paramversion" : paramversion })
1684 elif fmt == PDTCRYPT_SCRYPT_PARAMETERS:
1685 out = json.dumps ({ "salt" : binascii.hexlify (nacl).decode ()
1686 , "key" : binascii.hexlify (hsh) .decode ()
1687 , "version" : version
1688 , "scrypt_params" : { "N" : params ["N"]
1689 , "r" : params ["r"]
1690 , "p" : params ["p"]
1691 , "dkLen" : params ["dkLen"] } })
1693 raise RuntimeError ("bad scrypt output scheme %r" % fmt)
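
# Illustrative sketch, not part of the original module: the two encodings that
# ``mode_scrypt`` above uses for salt and key (base64 for the “i2n” format,
# hex for “params”), fed by ``hashlib.scrypt`` from the standard library.
# Assumptions: the scrypt parameters below are small placeholders chosen to
# run quickly, not the values from ``ENCRYPTION_PARAMETERS``; the helper name
# and its arguments are hypothetical, and the output is trimmed to the two
# encoded fields.
import base64
import binascii
import hashlib
import json

_SKETCH_SCRYPT_PARAMS = { "N" : 1 << 10, "r" : 8, "p" : 1, "dkLen" : 16 }

def _sketch_scrypt_json (pw, nacl, hexfmt=False, params=_SKETCH_SCRYPT_PARAMS):
    """Derive a key from *pw* and *nacl* (both bytes), return it as JSON."""
    hsh = hashlib.scrypt (pw, salt=nacl, n=params ["N"], r=params ["r"],
                          p=params ["p"], dklen=params ["dkLen"])
    enc = binascii.hexlify if hexfmt is True else base64.b64encode
    return json.dumps ({ "salt" : enc (nacl).decode ()
                       , "key"  : enc (hsh) .decode () })
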
1698 def noise_output_candidates (cands, indent=8, cols=PDTCRYPT_TT_COLUMNS):
1700 Print a list of offsets without garbling the terminal too much.
1702 The indent is counted from column zero; if it is wide enough, the “PDT: ”
1703 marker will be prepended, considered part of the indentation.
1707 idt = " " * indent if indent < 5 else "PDT: " + " " * (indent - 5)
1712 init = True # prevent leading separator
1715 raise ValueError ("the requested indentation exceeds the line "
1716 "width by %d" % (indent - wd))
1726 if lpos > wd: # line break
1742 def mode_scan (secret, fname, nacl=None):
1744 Dissect a binary file, looking for PDTCRYPT headers and objects.
1747 fd = os.open (fname, os.O_RDONLY)
1748 except FileNotFoundError:
1749 noise ("PDT: failed to open %s readonly" % fname)
1754 if PDTCRYPT_VERBOSE is True:
1755 noise ("PDT: scan for potential sync points")
1756 cands = locate_hdr_candidates (fd)
1757 if len (cands) == 0:
1758 noise ("PDT: scan complete: input does not contain potential PDT "
1759 "headers; giving up.")
1761 if PDTCRYPT_VERBOSE is True:
1762 noise ("PDT: scan complete: found %d candidates:" % len (cands))
1763 noise_output_candidates (cands)
1771 vdt, hdr = inspect_hdr (fd, cand)
1772 if vdt == HDR_CAND_JUNK:
1775 off0 = cand + PDTCRYPT_HDR_SIZE
1776 if PDTCRYPT_VERBOSE is True:
1777 noise ("PDT: read payload @%d" % off0)
1778 pretty = hdr_fmt_pretty (hdr)
1779 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1780 pretty.splitlines (), ""))
1782 ok = try_decrypt (fd, off0, hdr, secret) == hdr ["ctsize"]
1783 if vdt == HDR_CAND_GOOD and ok is True:
1784 noise ("PDT: %d → ✓ valid object %d–%d"
1785 % (cand, off0, off0 + hdr ["ctsize"]))
1786 elif vdt == HDR_CAND_FISHY and ok is True:
1787 noise ("PDT: %d → × object %d–%d, corrupt header"
1788 % (cand, off0, off0 + hdr ["ctsize"]))
1789 elif vdt == HDR_CAND_GOOD and ok is False:
1790 noise ("PDT: %d → × object %d–%d, problematic payload"
1791 % (cand, off0, off0 + hdr ["ctsize"]))
1792 elif vdt == HDR_CAND_FISHY and ok is False:
1793 noise ("PDT: %d → × object %d–%d, corrupt header, problematic "
1794 "ciphertext" % (cand, off0, off0 + hdr ["ctsize"]))
1801 noise ("PDT: all headers ok")
1803 noise ("PDT: %d candidates not parseable as headers:" % len (junk))
1804 noise_output_candidates (junk)
1806 def usage (err=False):
1810 indent = ' ' * len (SELF)
1811 out ("usage: %s SUBCOMMAND { --help" % SELF)
1812 out (" %s | [ -v ] { -p PASSWORD | -k KEY }" % indent)
1813 out (" %s [ { -i | --in } { - | SOURCE } ]" % indent)
1814 out (" %s [ { -n | --nacl } { SALT } ]" % indent)
1815 out (" %s [ { -o | --out } { - | DESTINATION } ]" % indent)
1816 out (" %s [ -D | --no-decrypt ] [ -S | --split ]" % indent)
1817 out (" %s [ -f | --format ]" % indent)
1820 out ("\t\tSUBCOMMAND main mode: { process | scrypt | scan }")
1822 out ("\t\t process: extract objects from PDT archive")
1823 out ("\t\t scrypt: calculate hash from password and first object")
1824 out ("\t\t-p PASSWORD password to derive the encryption key from")
1825 out ("\t\t-k KEY encryption key as 16 bytes in hexadecimal notation")
1826 out ("\t\t-s enforce strict handling of initialization vectors")
1827 out ("\t\t-i SOURCE file name to read from")
1828 out ("\t\t-o DESTINATION file to write output to")
1829 out ("\t\t-n SALT provide salt for scrypt mode in hex encoding")
1830 out ("\t\t-v print extra info")
1831 out ("\t\t-S split into files at object boundaries; this")
1832 out ("\t\t requires DESTINATION to refer to directory")
1833 out ("\t\t-D PDT header and ciphertext passthrough")
1834 out ("\t\t-f format of SCRYPT hash output (“i2n” or “params”)")
1836 out ("\tinstead of filenames, “-” may be used to specify stdin / stdout")
1838 sys.exit ((err is True) and 42 or 0)
1848 def parse_argv (argv):
1850 mode = PDTCRYPT_DECRYPT
1855 scrypt_format = PDTCRYPT_SCRYPT_DEFAULT
1858 SELF = os.path.basename (next (argvi))
1861 rawsubcmd = next (argvi)
1862 subcommand = PDTCRYPT_SUB [rawsubcmd]
1863 except StopIteration:
1864 bail ("ERROR: subcommand required")
1866 bail ("ERROR: invalid subcommand “%s” specified" % rawsubcmd)
1872 except StopIteration:
1873 bail ("ERROR: argument list incomplete")
1875 def checked_secret (t, arg):
1880 bail ("ERROR: encountered “%s” but secret already given" % arg)
1883 if arg in [ "-h", "--help" ]:
1886 elif arg in [ "-v", "--verbose", "--wtf" ]:
1887 global PDTCRYPT_VERBOSE
1888 PDTCRYPT_VERBOSE = True
1889 elif arg in [ "-i", "--in", "--source" ]:
1890 insspec = checked_arg ()
1891 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt from %s" % insspec)
1892 elif arg in [ "-p", "--password" ]:
1893 arg = checked_arg ()
1894 checked_secret (PDTCRYPT_SECRET_PW, arg)
1895 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with password")
1897 if subcommand == PDTCRYPT_SUB_PROCESS:
1898 if arg in [ "-s", "--strict-ivs" ]:
1899 global PDTCRYPT_STRICTIVS
1900 PDTCRYPT_STRICTIVS = True
1901 elif arg in [ "-o", "--out", "--dest", "--sink" ]:
1902 outsspec = checked_arg ()
1903 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
1904 elif arg in [ "-f", "--force" ]:
1905 global PDTCRYPT_OVERWRITE
1906 PDTCRYPT_OVERWRITE = True
1907 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
1908 elif arg in [ "-S", "--split" ]:
1909 mode |= PDTCRYPT_SPLIT
1910 if PDTCRYPT_VERBOSE is True: noise ("PDT: split files")
1911 elif arg in [ "-D", "--no-decrypt" ]:
1912 mode &= ~PDTCRYPT_DECRYPT
1913 if PDTCRYPT_VERBOSE is True: noise ("PDT: not decrypting")
1914 elif arg in [ "-k", "--key" ]:
1915 arg = checked_arg ()
1916 checked_secret (PDTCRYPT_SECRET_KEY, arg)
1917 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with key")
1919 bail ("ERROR: unexpected positional argument “%s”" % arg)
1920 elif subcommand == PDTCRYPT_SUB_SCRYPT:
1921 if arg in [ "-n", "--nacl", "--salt" ]:
1922 nacl = checked_arg ()
1923 if PDTCRYPT_VERBOSE is True: noise ("PDT: salt key with %s" % nacl)
1924 elif arg in [ "-f", "--format" ]:
1925 arg = checked_arg ()
1927 scrypt_format = PDTCRYPT_SCRYPT_FORMAT [arg]
1929 bail ("ERROR: invalid scrypt output format %s" % arg)
1930 if PDTCRYPT_VERBOSE is True:
1931 noise ("PDT: scrypt output format “%s”" % scrypt_format)
1933 bail ("ERROR: unexpected positional argument “%s”" % arg)
1934 elif subcommand == PDTCRYPT_SUB_SCAN:
1938 if PDTCRYPT_VERBOSE is True:
1939 noise ("PDT: no password or key specified, trying $PDTCRYPT_PASSWORD")
1940 epw = os.getenv ("PDTCRYPT_PASSWORD")
1942 checked_secret (PDTCRYPT_SECRET_PW, epw.strip ())
1945 if PDTCRYPT_VERBOSE is True:
1946 noise ("PDT: no password or key specified, trying $PDTCRYPT_KEY")
1947 ek = os.getenv ("PDTCRYPT_KEY")
1949 checked_secret (PDTCRYPT_SECRET_KEY, ek.strip ())
1952 if subcommand == PDTCRYPT_SUB_SCRYPT:
1953 bail ("ERROR: scrypt hash mode requested but no password given")
1954 elif mode & PDTCRYPT_DECRYPT:
1955 bail ("ERROR: decryption requested but no password or key given")
1957 if subcommand == PDTCRYPT_SUB_SCAN:
1959 bail ("ERROR: please supply an input file for scanning")
1961 bail ("ERROR: input must be seekable; please specify a file")
1962 return True, partial (mode_scan, secret, insspec, nacl)
1964 if subcommand == PDTCRYPT_SUB_SCRYPT:
1965 if secret [0] == PDTCRYPT_SECRET_KEY:
1966 bail ("ERROR: scrypt mode requires a password")
1967 if insspec is not None and nacl is not None \
1968 or insspec is None and nacl is None :
1969 bail ("ERROR: please supply either an input file or "
1974 if insspec is not None or subcommand != PDTCRYPT_SUB_SCRYPT:
1975 ins = deptdcrypt_mk_stream (PDTCRYPT_SOURCE, insspec or "-")
1977 if subcommand == PDTCRYPT_SUB_SCRYPT:
1978 return True, partial (mode_scrypt, secret [1].encode (), ins, nacl,
1981 if mode & PDTCRYPT_SPLIT: # destination must be directory
1982 if outsspec is None or outsspec == "-":
1983 bail ("ERROR: split mode is incompatible with stdout sink")
1987 os.makedirs (outsspec, 0o700)
1988 except FileExistsError:
1989 # if it’s a directory with appropriate perms, everything is
1990 # good; otherwise, below invocation of open(2) will fail
1992 outs = os.open (outsspec, os.O_DIRECTORY, 0o600)
1993 except FileNotFoundError as exn:
1994 bail ("ERROR: cannot create target directory “%s”" % outsspec)
1995 except NotADirectoryError as exn:
1996 bail ("ERROR: target path “%s” is not a directory" % outsspec)
1999 outs = deptdcrypt_mk_stream (PDTCRYPT_SINK, outsspec or "-")
2001 return True, partial (mode_depdtcrypt, mode, secret, ins, outs)
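
# Illustrative sketch, not part of the original module: the precedence of
# secret sources as far as the visible lines of ``parse_argv`` above suggest.
# A secret given on the command line wins, then ``$PDTCRYPT_PASSWORD``, then
# ``$PDTCRYPT_KEY``. ``_sketch_secret_from_env`` is a hypothetical helper; the
# kinds 0 and 1 mirror ``PDTCRYPT_SECRET_PW`` and ``PDTCRYPT_SECRET_KEY``.
def _sketch_secret_from_env (secret, environ):
    """Return a ``(kind, value)`` pair or ``None`` if no secret is available."""
    if secret is not None: # given on the command line
        return secret
    epw = environ.get ("PDTCRYPT_PASSWORD")
    if epw is not None:
        return (0, epw.strip ())
    ek = environ.get ("PDTCRYPT_KEY")
    if ek is not None:
        return (1, ek.strip ())
    return None

# Passing the environment as a plain mapping, e.g. ``os.environ``, keeps the
# lookup testable without touching the process environment.
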
2005 ok, runner = parse_argv (argv)
2007 if ok is True: return runner ()
2012 if __name__ == "__main__":
2013 sys.exit (main (sys.argv))