implement decryption for tolerant mode
[python-delta-tar] / deltatar / crypto.py
#!/usr/bin/env python3

"""
Intra2net 2017

===============================================================================
              crypto -- Encryption Layer for the Deltatar Backup
===============================================================================

Crypto stack:

    - AES-GCM for the symmetric encryption;
    - Scrypt as KDF.

References:

    - NIST Recommendation for Block Cipher Modes of Operation: Galois/Counter
      Mode (GCM) and GMAC
      http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf

    - AES-GCM v1:
      https://cryptome.org/2014/01/aes-gcm-v1.pdf

    - Authentication weaknesses in GCM
      http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf

Trouble with python-cryptography packages: authentication tags can only be
passed in advance: https://github.com/pyca/cryptography/pull/3421

Errors
-------------------------------------------------------------------------------

Errors fall into roughly three categories:

    - Cryptographical errors or invalid data.

        - ``InvalidGCMTag`` (decryption failed on account of an invalid GCM
          tag),
        - ``InvalidIVFixedPart`` (IV fixed part of object not found in list),
        - ``DuplicateIV`` (the IV of an encrypted object already occurred),
        - ``DecryptionError`` (used in CLI decryption for presenting error
          conditions to the user).

    - Incorrect usage of the library.

        - ``InvalidParameter`` (non-conforming user supplied parameter),
        - ``InvalidHeader`` (data passed for reading not parsable into header),
        - ``FormatError`` (cannot handle header or parameter version),
        - ``RuntimeError``.

    - Bad internal state. If one of these is encountered it means that a state
      was reached that shouldn’t occur during normal processing.

        - ``InternalError``,
        - ``Unreachable``.

Also, ``EndOfFile`` is used as a sentinel to communicate that a stream supplied
for reading is exhausted.

Initialization Vectors
-------------------------------------------------------------------------------

Initialization vectors are checked for reuse during the lifetime of a
decryptor. The fixed counters for metadata files cannot be reused and attempts
to do so will cause a ``DuplicateIV`` error. This means the length of objects
encrypted with a metadata counter is capped at 63 GB.

For ordinary, non-metadata payload, there is an optional mode with strict IV
checking that causes a crypto context to fail if an IV encountered or created
was already used for decrypting or encrypting, respectively, an earlier object.
Note that this mode can trigger false positives when decrypting non-linearly,
e. g. when traversing the same object multiple times. Since the crypto context
has no notion of a position in a PDT encrypted archive, this condition must be
sorted out downstream.

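The strict mode described above can be pictured with a minimal sketch. The
names ``IVTracker`` and ``FMT_IV`` are hypothetical and exist only for this
illustration; the assumption is that an IV is the 12 byte value stored in the
object header (8 byte fixed part followed by a 32 bit counter).

```python
import struct

FMT_IV = "<8sL"  # 8 byte fixed part, 32 bit little endian counter


class DuplicateIV(Exception):
    pass


class IVTracker:
    # Hypothetical helper for illustration, not part of crypto.py.
    def __init__(self, strict=False):
        self.strict = strict
        self.used = set()

    def check(self, iv):
        # With strict checking, a second sighting of the same IV is fatal.
        if self.strict and iv in self.used:
            fixed, cnt = struct.unpack(FMT_IV, iv)
            raise DuplicateIV("iv (%s, %d) was reused" % (fixed.hex(), cnt))
        self.used.add(iv)


tracker = IVTracker(strict=True)
iv = struct.pack(FMT_IV, bytes(8), 3)
tracker.check(iv)            # first sighting is only recorded
try:
    tracker.check(iv)        # reading the same object again trips the check
    reused = False
except DuplicateIV:
    reused = True
```

The second call shows the false positive mentioned above: the tracker cannot
tell a replayed IV from a legitimate second pass over the same object.
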
Command Line Utility
-------------------------------------------------------------------------------

``crypto.py`` may be invoked as a script for decrypting, validating, and
splitting PDT encrypted files. Consult the usage message for details.

Usage examples:

Decrypt from stdin using the password ‘foo’: ::

    $ crypto.py process foo -i - -o - <some-file.tar.gz.pdtcrypt >some-file.tar.gz

Output verbose information about the encrypted objects in the archive: ::

    $ crypto.py process foo -v -i some-file.tar.gz.pdtcrypt -o /dev/null
    PDT: decrypt from some-file.tar.gz.pdtcrypt
    PDT: decrypt to /dev/null
    PDT: source: file some-file.tar.gz.pdtcrypt
    PDT: sink: file /dev/null
    PDT: 0 hdr
    PDT: · version = 1 : 0100
    PDT: · paramversion = 1 : 0100
    PDT: · nacl : d270 b031 00d1 87e2 c946 610d 7b7f 7e5f
    PDT: · iv : 02ee 3dd7 a963 1eb1 0100 0000
    PDT: · ctsize = 591 : 4f02 0000 0000 0000
    PDT: · tag : 5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b
    PDT: 64 decrypt obj no. 1, 591 B
    PDT: · [64] 0% done, read block (591 B of 591 B remaining)
    PDT: · decrypt ciphertext 591 B
    PDT: · decrypt plaintext 591 B
    PDT: 655 finalize


Also, the mode *scrypt* allows deriving encryption keys. To calculate the
encryption key from the password ‘foo’ and the salt of the first object in a
PDT encrypted file: ::

    $ crypto.py scrypt foo -i some-file.pdtcrypt
    {"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}

The computed 16 byte key is given in Base64 encoding in the value to ``key``
and can be fed into Python’s ``base64.b64decode()`` to obtain the
corresponding binary representation.

Note that in Scrypt hashing mode, no data integrity checks are being performed.
If the wrong password is given, a wrong key will be derived. Whether the password
was indeed correct can only be determined by decrypting. Note that since PDT
archives essentially consist of a stream of independent objects, the salt and
other parameters may change. Thus a key derived using above method from the
first object doesn’t necessarily apply to any of the subsequent objects.
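
As an illustration, the output line of the *scrypt* mode can be consumed along
these lines. This is a sketch that assumes the output is a single JSON object
whose ``salt`` and ``key`` values are Base64 encoded, as the sample line above
suggests; the variable names are made up for the example.

```python
import base64
import json

# Output line copied from the usage example.
line = ('{"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", '
        '"key": "JH9EkMwaM4x9F5aim5gK/Q=="}')

record = json.loads(line)
key = base64.b64decode(record["key"])     # raw key bytes
salt = base64.b64decode(record["salt"])   # raw salt of the first object
```

Both values decode to 16 bytes, matching the ``dkLen`` and ``NaCl_LEN``
settings of parameter version one.
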

"""

import base64
import binascii
import bisect
import ctypes
import io
from functools import reduce, partial
import mmap
import os
import struct
import sys
import time
import types
try:
    import enum34
except ImportError as exn:
    pass

if __name__ == "__main__": ## Work around the import mechanism lest Python’s
    pwd = os.getcwd()      ## preference for local imports causes a cyclical
    ## import (crypto → pylibscrypt → […] → ./tarfile → crypto).
    sys.path = [ p for p in sys.path if p.find ("deltatar") < 0 ]

import pylibscrypt
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
import cryptography


__all__ = [ "hdr_make", "hdr_read", "hdr_fmt", "hdr_fmt_pretty"
          , "scrypt_hashfile"
          , "PDTCRYPT_HDR_SIZE", "AES_GCM_IV_CNT_DATA"
          , "AES_GCM_IV_CNT_INFOFILE", "AES_GCM_IV_CNT_INDEX"
          ]


###############################################################################
## exceptions
###############################################################################

class EndOfFile (Exception):
    """Reached EOF."""
    remainder = 0
    msg = 0
    def __init__ (self, n=None, msg=None):
        if n is not None:
            self.remainder = n
        self.msg = msg


class InvalidParameter (Exception):
    """Inputs not valid for PDT encryption."""
    pass


class InvalidHeader (Exception):
    """Header not valid."""
    pass


class InvalidGCMTag (Exception):
    """
    The GCM tag calculated during decryption differs from that in the object
    header.
    """
    pass


class InvalidIVFixedPart (Exception):
    """
    IV fixed part not in supplied list: either the backup is corrupt or the
    current object does not belong to it.
    """
    pass


class IVFixedPartError (Exception):
    """
    Error creating a unique IV fixed part: repeated calls to system RNG yielded
    the same sequence of bytes as the last IV used.
    """
    pass


class InvalidFileCounter (Exception):
    """
    When encrypting, an attempted reuse of a dedicated counter (info file,
    index file) was caught.
    """
    pass


class DuplicateIV (Exception):
    """
    During encryption, the current IV fixed part is identical to an already
    existing IV (same prefix and file counter). This indicates tampering or
    programmer error and cannot be recovered from.
    """
    pass


class NonConsecutiveIV (Exception):
    """
    IVs not numbered consecutively. This is a hard error with strict IV
    checking. Precludes random access to the encrypted objects.
    """
    pass


class FormatError (Exception):
    """Unusable parameters in header."""
    pass


class DecryptionError (Exception):
    """Error during decryption with ``crypto.py`` on the command line."""
    pass


class Unreachable (Exception):
    """
    Makeshift __builtin_unreachable(); always a programmer error if
    thrown.
    """
    pass


class InternalError (Exception):
    """Errors not ascribable to bad user inputs or cryptography."""
    pass


###############################################################################
## crypto layer version
###############################################################################

ENCRYPTION_PARAMETERS = \
    { 0: \
        { "kdf": ("dummy", 16)
        , "enc": "passthrough" }
    , 1: \
        { "kdf": ( "scrypt"
                 , { "dkLen"    : 16
                   , "N"        : 1 << 16
                   , "r"        : 8
                   , "p"        : 1
                   , "NaCl_LEN" : 16 })
        , "enc": "aes-gcm" } }

###############################################################################
## constants
###############################################################################

PDTCRYPT_HDR_MAGIC = b"PDTCRYPT"

PDTCRYPT_HDR_SIZE_MAGIC        = 8     #  8
PDTCRYPT_HDR_SIZE_VERSION      = 2     # 10
PDTCRYPT_HDR_SIZE_PARAMVERSION = 2     # 12
PDTCRYPT_HDR_SIZE_NACL         = 16    # 28
PDTCRYPT_HDR_SIZE_IV           = 12    # 40
PDTCRYPT_HDR_SIZE_CTSIZE       = 8     # 48
PDTCRYPT_HDR_SIZE_TAG          = 16    # 64 GCM auth tag

PDTCRYPT_HDR_SIZE = PDTCRYPT_HDR_SIZE_MAGIC        + PDTCRYPT_HDR_SIZE_VERSION \
                  + PDTCRYPT_HDR_SIZE_PARAMVERSION + PDTCRYPT_HDR_SIZE_NACL    \
                  + PDTCRYPT_HDR_SIZE_IV           + PDTCRYPT_HDR_SIZE_CTSIZE  \
                  + PDTCRYPT_HDR_SIZE_TAG # = 64

# precalculate offsets since Python can’t do constant folding over names
HDR_OFF_VERSION      = PDTCRYPT_HDR_SIZE_MAGIC
HDR_OFF_PARAMVERSION = HDR_OFF_VERSION      + PDTCRYPT_HDR_SIZE_VERSION
HDR_OFF_NACL         = HDR_OFF_PARAMVERSION + PDTCRYPT_HDR_SIZE_PARAMVERSION
HDR_OFF_IV           = HDR_OFF_NACL         + PDTCRYPT_HDR_SIZE_NACL
HDR_OFF_CTSIZE       = HDR_OFF_IV           + PDTCRYPT_HDR_SIZE_IV
HDR_OFF_TAG          = HDR_OFF_CTSIZE       + PDTCRYPT_HDR_SIZE_CTSIZE

FMT_UINT16_LE = "<H"
FMT_UINT64_LE = "<Q"
FMT_I2N_IV    = "<8sL"   # 8 random bytes ‖ 32 bit counter
FMT_I2N_HDR   = ("<"     # little endian, no padding
                 "8s"    # magic
                 "H"     # version
                 "H"     # paramversion
                 "16s"   # salt (“NaCl”)
                 "12s"   # iv
                 "Q"     # size
                 "16s")  # GCM tag

# aes+gcm
AES_GCM_MAX_SIZE              = (1 << 36) - (1 << 5) # 2^39 - 2^8 b ≅ 64 GB
PDTCRYPT_MAX_OBJ_SIZE_DEFAULT = 63 * (1 << 30)       #                63 GB
PDTCRYPT_MAX_OBJ_SIZE         = PDTCRYPT_MAX_OBJ_SIZE_DEFAULT

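# Illustration (not part of the module): a quick arithmetic check of the size
# limits above. It assumes the plaintext limit of 2^39 - 256 bits from NIST
# SP 800-38D; the names are local to this sketch.

```python
GCM_MAX_BITS = (1 << 39) - (1 << 8)    # plaintext limit in bits
GCM_MAX_BYTES = (1 << 36) - (1 << 5)   # the same limit expressed in bytes
OBJ_CAP_BYTES = 63 * (1 << 30)         # default cap on a single object

# The cap leaves a little under 1 GiB of headroom below the GCM limit.
headroom = GCM_MAX_BYTES - OBJ_CAP_BYTES
```
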
# index and info files are written on the fly while encrypting so their
# counters must be available in advance
AES_GCM_IV_CNT_INFOFILE     = 1 # constant
AES_GCM_IV_CNT_INDEX        = AES_GCM_IV_CNT_INFOFILE + 1
AES_GCM_IV_CNT_DATA         = AES_GCM_IV_CNT_INDEX    + 1 # also for multivolume
AES_GCM_IV_CNT_MAX_DEFAULT  = 0xffFFffFF
AES_GCM_IV_CNT_MAX          = AES_GCM_IV_CNT_MAX_DEFAULT

# IV structure and generation
PDTCRYPT_IV_GEN_MAX_RETRIES = 10 # ×
PDTCRYPT_IV_FIXEDPART_SIZE  =  8 # B
PDTCRYPT_IV_COUNTER_SIZE    =  4 # B

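# Illustration (not part of the module): how the 12 byte IV splits into its
# fixed part and counter. The IV bytes are taken from the sample header dump
# in the module docstring; FMT_IV mirrors FMT_I2N_IV and is local to the
# sketch.

```python
import struct

FMT_IV = "<8sL"   # 8 byte fixed part followed by a uint32 counter

iv = bytes.fromhex("02ee3dd7a9631eb101000000")

fixed, counter = struct.unpack(FMT_IV, iv)
rebuilt = struct.pack(FMT_IV, fixed, counter)
```

# The counter of the sample object is 1, i.e. the value reserved for the info
# file.
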
###############################################################################
## header, trailer
###############################################################################
#
# Interface:
#
#    struct hdrinfo
#      { version      : u16
#      , paramversion : u16
#      , nacl         : [u8; 16]
#      , iv           : [u8; 12]
#      , ctsize       : usize
#      , tag          : [u8; 16] }
#
#    fn hdr_read (f : handle) -> hdrinfo;
#    fn hdr_make (f : handle, h : hdrinfo) -> IOResult<usize>;
#    fn hdr_fmt (h : hdrinfo) -> String;
#

def hdr_read (data):
    """
    Read bytes as header structure.

    If the input could not be interpreted as a header, fail with
    ``InvalidHeader``.
    """

    try:
        mag, version, paramversion, nacl, iv, ctsize, tag = \
            struct.unpack (FMT_I2N_HDR, data)
    except Exception as exn:
        raise InvalidHeader ("error unpacking header from [%r]: %s"
                             % (binascii.hexlify (data), str (exn)))

    if mag != PDTCRYPT_HDR_MAGIC:
        raise InvalidHeader ("bad magic in header: expected [%s], got [%s]"
                             % (PDTCRYPT_HDR_MAGIC, mag))

    return \
        { "version" : version
        , "paramversion" : paramversion
        , "nacl" : nacl
        , "iv" : iv
        , "ctsize" : ctsize
        , "tag" : tag
        }


def hdr_read_stream (instr):
    """
    Read header from stream at the current position.

    Fail with ``InvalidHeader`` if insufficient bytes were read from the
    stream, or if the content could not be interpreted as a header.
    """
    data = instr.read(PDTCRYPT_HDR_SIZE)
    ldata = len (data)
    if ldata == 0:
        raise EndOfFile
    elif ldata != PDTCRYPT_HDR_SIZE:
        raise InvalidHeader ("hdr_read_stream: expected %d B, received %d B"
                             % (PDTCRYPT_HDR_SIZE, ldata))
    return hdr_read (data)


def hdr_from_params (version, paramversion, nacl, iv, ctsize, tag):
    """
    Assemble the necessary values into a PDTCRYPT header.

    :type version: int to fit uint16_t
    :type paramversion: int to fit uint16_t
    :type nacl: bytes to fit uint8_t[16]
    :type iv: bytes to fit uint8_t[12]
    :type ctsize: int to fit uint64_t
    :type tag: bytes to fit uint8_t[16]
    """
    buf = bytearray (PDTCRYPT_HDR_SIZE)
    bufv = memoryview (buf)

    try:
        struct.pack_into (FMT_I2N_HDR, bufv, 0,
                          PDTCRYPT_HDR_MAGIC,
                          version, paramversion, nacl, iv, ctsize, tag)
    except Exception as exn:
        return False, "error assembling header: %s" % str (exn)

    return True, bytes (buf)


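# Illustration (not part of the module): a self-contained round trip through
# the 64 byte header layout. The helper names pack_header / unpack_header are
# hypothetical; the field values are those of the sample dump in the module
# docstring.

```python
import struct

HDR_LAYOUT = "<8sHH16s12sQ16s"   # magic, version, paramversion, salt, iv, ctsize, tag
HDR_MAGIC = b"PDTCRYPT"


def pack_header(version, paramversion, nacl, iv, ctsize, tag):
    return struct.pack(HDR_LAYOUT, HDR_MAGIC, version, paramversion,
                       nacl, iv, ctsize, tag)


def unpack_header(data):
    mag, version, paramversion, nacl, iv, ctsize, tag = \
        struct.unpack(HDR_LAYOUT, data)
    if mag != HDR_MAGIC:
        raise ValueError("bad magic in header")
    return {"version": version, "paramversion": paramversion,
            "nacl": nacl, "iv": iv, "ctsize": ctsize, "tag": tag}


raw = pack_header(1, 1,
                  bytes.fromhex("d270b03100d187e2c946610d7b7f7e5f"),
                  bytes.fromhex("02ee3dd7a9631eb101000000"),
                  591,
                  bytes.fromhex("5b2d6d8b8f82484212fd0b10b6e3369b"))
hdr = unpack_header(raw)
```
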
def hdr_make_dummy (s):
    """
    Create a header sized block of bytes initialized to a value derived from a
    string. Used to verify we’ve jumped back correctly to the actual position
    of the object header.
    """
    c = reduce (lambda a, c: a + ord(c), s, 0) % 0xFF
    return bytes (bytearray (struct.pack ("B", c)) * PDTCRYPT_HDR_SIZE)


def hdr_make (hdr):
    """
    Assemble a header from the given header structure.
    """
    return hdr_from_params (version=hdr.get("version"),
                            paramversion=hdr.get("paramversion"),
                            nacl=hdr.get("nacl"), iv=hdr.get("iv"),
                            ctsize=hdr.get("ctsize"), tag=hdr.get("tag"))


HDR_FMT = "I2n_header { version: %d, paramversion: %d, nacl: %s[%d]," \
          " iv: %s[%d], ctsize: %d, tag: %s[%d] }"

def hdr_fmt (h):
    """Format a header structure into readable output."""
    return HDR_FMT % (h["version"], h["paramversion"],
                      binascii.hexlify (h["nacl"]), len(h["nacl"]),
                      binascii.hexlify (h["iv"]), len(h["iv"]),
                      h["ctsize"],
                      binascii.hexlify (h["tag"]), len(h["tag"]))


def hex_spaced_of_bytes (b):
    """Format bytes object, hexdump style."""
    return " ".join ([ "%.2x%.2x" % (c1, c2)
                       for c1, c2 in zip (b[0::2], b[1::2]) ]) \
         + (len (b) | 1 == len (b) and " %.2x" % b[-1] or "") # odd lengths

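# Illustration (not part of the module): the hexdump grouping spelled out with
# an explicit branch for odd lengths. hex_spaced is a hypothetical stand-in; it
# is meant to agree with hex_spaced_of_bytes for inputs of two or more bytes.

```python
import struct


def hex_spaced(b):
    # Group the bytes in pairs; a trailing odd byte forms its own group.
    groups = ["%.2x%.2x" % (c1, c2) for c1, c2 in zip(b[0::2], b[1::2])]
    if len(b) % 2 == 1:
        groups.append("%.2x" % b[-1])
    return " ".join(groups)


version_dump = hex_spaced(struct.pack("<H", 1))
ctsize_dump = hex_spaced(struct.pack("<Q", 591))
```

# The two dumps correspond to the ``version`` and ``ctsize`` lines of the
# verbose sample output in the module docstring.
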

def hdr_iv_counter (h):
    """Extract the variable part of the IV of the given header."""
    _fixed, cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
    return cnt


def hdr_iv_fixed (h):
    """Extract the fixed part of the IV of the given header."""
    fixed, _cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
    return fixed


hdr_dump = hex_spaced_of_bytes


HDR_FMT_PRETTY = \
"""version = %-4d : %s
paramversion = %-4d : %s
nacl : %s
iv : %s
ctsize = %-20d : %s
tag : %s
"""

def hdr_fmt_pretty (h):
    """
    Format header structure into multi-line representation of its contents and
    their raw representation. (Omit the implicit “PDTCRYPT” magic bytes that
    precede every header.)
    """
    return HDR_FMT_PRETTY \
                % (h["version"],
                   hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["version"])),
                   h["paramversion"],
                   hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["paramversion"])),
                   hex_spaced_of_bytes (h["nacl"]),
                   hex_spaced_of_bytes (h["iv"]),
                   h["ctsize"],
                   hex_spaced_of_bytes (struct.pack (FMT_UINT64_LE, h["ctsize"])),
                   hex_spaced_of_bytes (h["tag"]))

IV_FMT = "((f %s) (c %d))"

def iv_fmt (iv):
    """Format the two components of an IV in a readable fashion."""
    fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
    return IV_FMT % (binascii.hexlify (fixed), cnt)


###############################################################################
## restoration
###############################################################################

class Location (object):
    n = 0
    offset = 0

def restore_loc_fmt (loc):
    return "%d off:%d" \
        % (loc.n, loc.offset)

def locate_hdr_candidates (fd):
    """
    Walk over instances of the magic string in the payload, collecting their
    positions. If the offset of the first found instance is not zero, the file
    begins with leading garbage.

    :return: The list of offsets in the file.
    """
    cands = []

    mm = mmap.mmap(fd, 0, mmap.MAP_SHARED, mmap.PROT_READ)
    pos = 0
    while True:
        pos = mm.find (PDTCRYPT_HDR_MAGIC, pos)
        if pos == -1:
            break
        cands.append (pos)
        pos += 1

    return cands


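# Illustration (not part of the module): the same candidate scan over an
# in-memory buffer instead of a memory mapped file. find_magic is a
# hypothetical name; the test blob is made up for the example.

```python
MAGIC = b"PDTCRYPT"


def find_magic(buf):
    # Collect the offset of every occurrence of the magic string.
    offsets = []
    pos = buf.find(MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = buf.find(MAGIC, pos + 1)
    return offsets


# Four bytes of leading garbage, then two header sized blocks with a 7 byte
# payload in between.
blob = b"junk" + MAGIC + bytes(56) + b"payload" + MAGIC + bytes(56)
candidates = find_magic(blob)
```

# A first offset other than zero indicates leading garbage, as noted in the
# docstring of locate_hdr_candidates.
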
HDR_CAND_GOOD  = 0 # header marks begin of valid object
HDR_CAND_FISHY = 1 # inconclusive (tag mismatch, obj overlap etc.)
HDR_CAND_JUNK  = 2 # not a header / object unreadable


def inspect_hdr (fd, off):
    """
    Attempt to parse a header in *fd* at position *off*.

    Returns a verdict about the quality of that header plus the parsed header
    when readable.
    """

    _ = os.lseek (fd, off, os.SEEK_SET)

    if os.lseek (fd, 0, os.SEEK_CUR) != off:
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: %d → dismissed (lseek() past EOF)" % off)
        return HDR_CAND_JUNK, None

    raw = os.read (fd, PDTCRYPT_HDR_SIZE)
    if len (raw) != PDTCRYPT_HDR_SIZE:
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: %d → dismissed (EOF inside header)" % off)
        return HDR_CAND_JUNK, None

    try:
        hdr = hdr_read (raw)
    except InvalidHeader as exn:
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: %d → dismissed (invalid: [%s])" % (off, str (exn)))
        return HDR_CAND_JUNK, None

    obj0 = off + PDTCRYPT_HDR_SIZE
    objX = obj0 + hdr ["ctsize"]

    eof = os.lseek (fd, 0, os.SEEK_END)
    if eof < objX:
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: %d → EOF inside object (%d≤%d≤%d); adjusting size to "
                   "%d" % (off, obj0, eof, objX, (eof - obj0)))
        # try reading up to the end
        hdr ["ctsize"] = eof - obj0
        return HDR_CAND_FISHY, hdr

    return HDR_CAND_GOOD, hdr


def try_decrypt (fd, off, hdr, secret, fname=None):
    """
    Attempt to decrypt the object in the (seekable) descriptor *fd* starting at
    *off* using the metadata in *hdr* and *secret*. An output file can be
    specified with *fname*; if it is *None*, the decrypted payload will be
    discarded.

    Always creates a fresh decryptor, so validation steps across objects don’t
    apply.
    """
    ctleft = hdr ["ctsize"]
    pos = off

    ks = secret [0]
    if ks == PDTCRYPT_SECRET_PW:
        decr = Decrypt (password=secret [1])
    elif ks == PDTCRYPT_SECRET_KEY:
        key = binascii.unhexlify (secret [1])
        decr = Decrypt (key=key)
    else:
        raise RuntimeError

    if fname is not None: raise NotImplementedError

    decr.next (hdr)

    try:
        os.lseek (fd, pos, os.SEEK_SET)
        while ctleft > 0:
            cnksiz = min (ctleft, PDTCRYPT_BLOCKSIZE)
            cnk = os.read (fd, cnksiz)
            ctleft -= cnksiz
            pos += cnksiz
            _pt = decr.process (cnk)

        _pt = decr.done ()
    except Exception as exn:
        noise ("PDT: error decrypting object %d–%d@%d, %d B remaining [%s]"
               % (off, off + hdr ["ctsize"], pos, ctleft, exn))
        raise

    return pos - off


###############################################################################
## passthrough / null encryption
###############################################################################

class PassthroughCipher (object):

    tag = struct.pack ("<QQ", 0, 0)

    def __init__ (self) : pass

    def update (self, b) : return b

    def finalize (self) : return b""

    def finalize_with_tag (self, _) : return b""

###############################################################################
## convenience wrapper
###############################################################################


def kdf_dummy (klen, password, _nacl):
    """
    Fake KDF for testing purposes that is called when parameter version zero is
    encountered.
    """
    # encode first so that the key length is computed over bytes, not code points
    if isinstance (password, bytes) is False:
        password = password.encode ()
    q, r = divmod (klen, len (password))
    return password * q + password [:r], b""


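# Illustration (not part of the module): what the test-only KDF produces.
# dummy_key is a hypothetical stand-in that repeats the password until the
# requested key length is reached; it encodes the password before measuring it
# so that the result is klen bytes long for non-ASCII input too.

```python
def dummy_key(klen, password):
    if not isinstance(password, bytes):
        password = password.encode()
    q, r = divmod(klen, len(password))
    return password * q + password[:r]


key = dummy_key(16, "foo")
```
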
SCRYPT_KEY_MEMO = { } # static because needed for both the info file and the archive


def kdf_scrypt (params, password, nacl):
    """
    Wrapper for the Scrypt KDF, corresponds to parameter version one. The
    computation result is memoized based on the inputs to facilitate spawning
    multiple encryption contexts.
    """
    N = params["N"]
    r = params["r"]
    p = params["p"]
    dkLen = params["dkLen"]

    if nacl is None:
        nacl = os.urandom (params["NaCl_LEN"])

    key_parms = (password, nacl, N, r, p, dkLen)
    global SCRYPT_KEY_MEMO
    if key_parms not in SCRYPT_KEY_MEMO:
        SCRYPT_KEY_MEMO [key_parms] = \
            pylibscrypt.scrypt (password, nacl, N, r, p, dkLen)
    return SCRYPT_KEY_MEMO [key_parms], nacl


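# Illustration (not part of the module): the memoization pattern in isolation.
# To stay within the standard library this sketch swaps Scrypt for PBKDF2
# (hashlib.pbkdf2_hmac) with an arbitrary iteration count; only the caching
# logic is the point. All names are hypothetical.

```python
import hashlib
import os

KEY_MEMO = {}
STATS = {"derivations": 0}


def derive(password, salt=None, dklen=16):
    # A fresh salt is drawn only if the caller did not supply one.
    if salt is None:
        salt = os.urandom(16)
    memo_key = (password, salt, dklen)
    if memo_key not in KEY_MEMO:
        STATS["derivations"] += 1
        KEY_MEMO[memo_key] = hashlib.pbkdf2_hmac("sha256", password, salt,
                                                 1000, dklen)
    return KEY_MEMO[memo_key], salt


k1, salt = derive(b"foo")          # computes the key
k2, _ = derive(b"foo", salt)       # same inputs: served from the memo
```

# A second context created with the same password and salt thus avoids paying
# for the expensive derivation again.
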
def kdf_by_version (paramversion=None, defs=None):
    """
    Pick the KDF handler corresponding to the parameter version or the
    definition set.

    :rtype: function (password : bytes, nacl : bytes) -> (key : bytes, nacl : bytes)
    """
    if paramversion is not None:
        defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
    if defs is None:
        raise InvalidParameter ("no encryption parameters for version %r"
                                % paramversion)
    (kdf, params) = defs["kdf"]
    fn = None
    if kdf == "scrypt" : fn = kdf_scrypt
    if kdf == "dummy" : fn = kdf_dummy
    if fn is None:
        raise ValueError ("key derivation method %r unknown" % kdf)
    return partial (fn, params)


###############################################################################
## SCRYPT hashing
###############################################################################

def scrypt_hashsource (pw, ins):
    """
    Calculate the SCRYPT hash from the password and the information contained
    in the first header found in ``ins``.

    This does not validate whether the first object is encrypted correctly.
    """
    if isinstance (pw, str) is True:
        pw = str.encode (pw)
    elif isinstance (pw, bytes) is False:
        raise InvalidParameter ("password must be a string, not %s"
                                % type (pw))
    if isinstance (ins, io.BufferedReader) is False and \
            isinstance (ins, io.FileIO) is False:
        raise InvalidParameter ("file to hash must be opened in “binary” mode")
    hdr = None
    try:
        hdr = hdr_read_stream (ins)
    except EndOfFile as exn:
        noise ("PDT: malformed input: end of file reading first object header")
        noise ("PDT:")
        return 1

    nacl = hdr ["nacl"]
    pver = hdr ["paramversion"]
    if PDTCRYPT_VERBOSE is True:
        noise ("PDT: salt of first object : %s" % binascii.hexlify (nacl))
        noise ("PDT: parameter version of archive : %d" % pver)

    try:
        defs = ENCRYPTION_PARAMETERS.get(pver, None)
        kdfname, params = defs ["kdf"]
        if kdfname != "scrypt":
            noise ("PDT: input is not an SCRYPT archive")
            noise ("")
            return 1
        kdf = kdf_by_version (None, defs)
    except ValueError as exn:
        noise ("PDT: object has unknown parameter version %d" % pver)
        return 1 # no KDF available; bail out like the other error paths

    hsh, _void = kdf (pw, nacl)

    return hsh, nacl, hdr ["version"], pver


def scrypt_hashfile (pw, fname):
    """
    Calculate the SCRYPT hash from the password and the information contained
    in the first header found in the given file. The header is read only at
    offset zero.
    """
    with deptdcrypt_mk_stream (PDTCRYPT_SOURCE, fname or "-") as ins:
        hsh, _void, _void, _void = scrypt_hashsource (pw, ins)
        return hsh


###############################################################################
## AES-GCM context
###############################################################################

class Crypto (object):
    """
    Encryption context to remain alive throughout an entire tarfile pass.
    """
    enc          = None
    nacl         = None
    key          = None
    cnt          = None # file counter (uint32_t != 0)
    iv           = None # current IV
    fixed        = None # accu for 64 bit fixed parts of IV
    used_ivs     = None # tracks IVs
    strict_ivs   = False # if True, panic on duplicate object IV
    password     = None
    paramversion = None
    stats = { "in"  : 0
            , "out" : 0
            , "obj" : 0 }

    ctsize = -1
    ptsize = -1
    info_counter_used  = False
    index_counter_used = False

    def __init__ (self, *al, **akv):
        self.used_ivs = set ()
        self.set_parameters (*al, **akv)


    def next_fixed (self):
        # NOP for decryption
        pass


    def set_object_counter (self, cnt=None):
        """
        Safely set the internal counter of encrypted objects. Numerous
        constraints apply:

        The same counter may not be reused in combination with one IV fixed
        part. This is validated elsewhere in the IV handling.

        Counter zero is invalid. The first two counters are reserved for
        metadata. The implementation does not allow for splitting metadata
        files over multiple encrypted objects. (This would be possible by
        assigning new fixed parts.) Thus in a Deltatar backup there is at most
        one object with a counter value of one and at most one with a counter
        value of two. On creation of a context, the initial counter may be
        chosen. The globals ``AES_GCM_IV_CNT_INFOFILE`` and
        ``AES_GCM_IV_CNT_INDEX`` can be used to request one of the reserved
        values. If one of these values has been used, any further attempt of
        setting the counter to that value will be rejected with an
        ``InvalidFileCounter`` exception.

        Out of bounds values (i. e. below one and more than the maximum of 2³²)
        cause an ``InvalidParameter`` exception to be thrown.
        """
        if cnt is None:
            self.cnt = AES_GCM_IV_CNT_DATA
            return
        if cnt < 1 or cnt > AES_GCM_IV_CNT_MAX + 1:
            raise InvalidParameter ("invalid counter value %d requested: "
                                    "acceptable values are from 1 to %d"
                                    % (cnt, AES_GCM_IV_CNT_MAX))
        if cnt == AES_GCM_IV_CNT_INFOFILE:
            if self.info_counter_used is True:
                raise InvalidFileCounter ("attempted to reuse info file "
                                          "counter %d: must be unique" % cnt)
            self.info_counter_used = True
        elif cnt == AES_GCM_IV_CNT_INDEX:
            if self.index_counter_used is True:
                raise InvalidFileCounter ("attempted to reuse index file "
                                          "counter %d: must be unique" % cnt)
            self.index_counter_used = True
        if cnt <= AES_GCM_IV_CNT_MAX:
            self.cnt = cnt
            return
        # cnt == AES_GCM_IV_CNT_MAX + 1 → wrap
        self.cnt = AES_GCM_IV_CNT_DATA
        self.next_fixed ()


    def set_parameters (self, password=None, key=None, paramversion=None,
                        nacl=None, counter=None, strict_ivs=False):
        """
        Configure the internal state of a crypto context. Not intended for
        external use.
        """
        self.next_fixed ()
        self.set_object_counter (counter)
        self.strict_ivs = strict_ivs

        if paramversion is not None:
            self.paramversion = paramversion

        if key is not None:
            self.key, self.nacl = key, nacl
            return

        if password is not None:
            if isinstance (password, bytes) is False:
                password = str.encode (password)
            self.password = password
            if paramversion is None and nacl is None:
                # postpone key setup until first header is available
                return
            kdf = kdf_by_version (paramversion)
            if kdf is not None:
                self.key, self.nacl = kdf (password, nacl)


    def process (self, buf):
        """
        Encrypt / decrypt a buffer. Invokes the ``.update()`` method on the
        wrapped encryptor or decryptor, respectively.

        The Cryptography exception ``AlreadyFinalized`` is translated to an
        ``InternalError`` at this point. It may occur in sound code when the GC
        closes an encrypting stream after an error. Everywhere else it must be
        treated as a bug.
        """
        if self.enc is None:
            raise RuntimeError ("process: context not initialized")
        self.stats ["in"] += len (buf)
        try:
            out = self.enc.update (buf)
        except cryptography.exceptions.AlreadyFinalized as exn:
            raise InternalError (exn)
        self.stats ["out"] += len (out)
        return out


30019abf 909 def next (self, password, paramversion, nacl, iv):
704ceaa5
PG
910 """
911 Prepare for encrypting another object: Reset the data counters and
912 change the configuration in case one of the variable parameters differs
913 from the last object. Also check the IV for duplicates and error out
914 if strict checking was requested.
915 """
fa47412e
PG
916 self.ctsize = 0
917 self.ptsize = 0
918 self.stats ["obj"] += 1
30019abf
PG
919
920 self.check_duplicate_iv (iv)
921
6178061e
PG
922 if ( self.paramversion != paramversion
923 or self.password != password
924 or self.nacl != nacl):
1f3fd7b0 925 self.set_parameters (password=password, paramversion=paramversion,
30019abf
PG
926 nacl=nacl, strict_ivs=self.strict_ivs)
927
928
929 def check_duplicate_iv (self, iv):
704ceaa5
PG
930 """
931 Add an IV (the 12-byte representation as in the header) to the set of used
932 IVs. With strict checking enabled, this will throw a ``DuplicateIV`` if the
933 IV was seen before. Depending on the context, this may indicate a serious error (IV reuse).
934 """
30019abf
PG
935 if self.strict_ivs is True and iv in self.used_ivs:
936 raise DuplicateIV ("iv %s was reused" % iv_fmt (iv))
937 # iv has not been used before; add to collection
938 self.used_ivs.add (iv)
fa47412e
PG
939
940
633b18a9 941 def counters (self):
704ceaa5
PG
942 """
943 Access the data counters.
944 """
633b18a9
PG
945 return self.stats ["obj"], self.stats ["in"], self.stats ["out"]
946
947
8de91f4f
PG
948 def drop (self):
949 """
950 Clear the current context regardless of its finalization state. The
951 next operation must be ``.next()``.
952 """
953 self.enc = None
954
955
39accaaa
PG
956class Encrypt (Crypto):
957
48db09ba
PG
958 lastinfo = None
959 version = None
72a42219 960 paramenc = None
50710d86 961
1f3fd7b0 962 def __init__ (self, version, paramversion, password=None, key=None, nacl=None,
30019abf 963 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
704ceaa5
PG
964 """
965 The ctor will throw immediately if one of the parameters does not conform
966 to our expectations.
967
969 :type version: int to fit uint16_t
970 :type paramversion: int to fit uint16_t
971 :param password: mutually exclusive with ``key``
972 :type password: str
973 :param key: mutually exclusive with ``password``
974 :type key: bytes
975 :type nacl: bytes
976 :param counter: initial object counter; the values
977 ``AES_GCM_IV_CNT_INFOFILE`` and
978 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
979 and cannot be reused even with different fixed parts.
980 :type strict_ivs: bool
981 """
1f3fd7b0
PG
982 if password is None and key is None \
983 or password is not None and key is not None :
984 raise InvalidParameter ("__init__: need either key or password")
985
986 if key is not None:
987 if isinstance (key, bytes) is False:
988 raise InvalidParameter ("__init__: key must be provided as "
989 "bytes, not %s" % type (key))
990 if nacl is None:
991 raise InvalidParameter ("__init__: salt must be provided along "
992 "with encryption key")
993 else: # password, no key
994 if isinstance (password, str) is False:
995 raise InvalidParameter ("__init__: password must be a string, not %s"
996 % type (password))
997 if len (password) == 0:
998 raise InvalidParameter ("__init__: supplied empty password but not "
999 "permitted for PDT encrypted files")
36b9932a
PG
1000 # version
1001 if isinstance (version, int) is False:
1002 raise InvalidParameter ("__init__: version number must be an "
1003 "integer, not %s" % type (version))
1004 if version < 0:
1005 raise InvalidParameter ("__init__: version number must be a "
1006 "nonnegative integer, not %d" % version)
1007 # paramversion
1008 if isinstance (paramversion, int) is False:
1009 raise InvalidParameter ("__init__: crypto parameter version number "
1010 "must be an integer, not %s"
1011 % type (paramversion))
1012 if paramversion < 0:
1013 raise InvalidParameter ("__init__: crypto parameter version number "
1014 "must be a nonnegative integer, not %d"
1015 % paramversion)
1016 # salt
1017 if nacl is not None:
1018 if isinstance (nacl, bytes) is False:
1019 raise InvalidParameter ("__init__: salt given, but of type %s "
1020 "instead of bytes" % type (nacl))
1021 # salt length would depend on the actual encryption so it can’t be
1022 # validated at this point
b12110dd 1023 self.fixed = [ ]
48db09ba
PG
1024 self.version = version
1025 self.paramenc = ENCRYPTION_PARAMETERS.get (paramversion) ["enc"]
72a42219 1026
1f3fd7b0 1027 super().__init__ (password, key, paramversion, nacl, counter=counter,
30019abf 1028 strict_ivs=strict_ivs)
a393d9cb
PG
1029
1030
be124bca
PG
1031 def next_fixed (self, retries=PDTCRYPT_IV_GEN_MAX_RETRIES):
1032 """
1033 Generate the next IV fixed part by reading eight bytes from
1034 ``/dev/urandom``. The buffer so obtained is tested against the fixed
1035 parts used so far to prevent accidental reuse of IVs. After a
1036 configurable number of attempts to create a unique fixed part, it will
1037 refuse to continue with an ``IVFixedPartError``. This is unlikely to
1038 ever happen on a normal system but may detect an issue with the random
1039 generator.
1040
1041 The list of fixed parts that were used by the context at hand can be
1042 accessed through the ``.fixed`` list. Its last element is the fixed
1043 part currently in use.
1044 """
1045 i = 0
1046 while i < retries:
1047 fp = os.urandom (PDTCRYPT_IV_FIXEDPART_SIZE)
1048 if fp not in self.fixed:
1049 self.fixed.append (fp)
1050 return
1051 i += 1
1052 raise IVFixedPartError ("error obtaining a unique IV fixed part from "
1053 "/dev/urandom; giving up after %d tries" % i)
1054
1055
a393d9cb 1056 def iv_make (self):
704ceaa5
PG
1057 """
1058 Construct a 12-byte IV from the current fixed part and the object
1059 counter.
1060 """
b12110dd 1061 return struct.pack(FMT_I2N_IV, self.fixed [-1], self.cnt)
a393d9cb
PG
1062
1063
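The IV handling of ``next_fixed()`` and ``iv_make()`` can be sketched in isolation as follows. This is a minimal illustration and not the module's code: the format string ``"<8sL"`` (an eight byte fixed part followed by a 32 bit counter) and the names ``FIXED_SIZE``, ``IV_FMT``, ``new_fixed_part`` and ``make_iv`` are assumptions made for the example; the actual layout is whatever ``FMT_I2N_IV`` and ``PDTCRYPT_IV_FIXEDPART_SIZE`` define.

```python
import os
import struct

FIXED_SIZE = 8          # assumed size of the IV fixed part in bytes
IV_FMT     = "<8sL"     # assumed layout: fixed part, then 32 bit counter

def new_fixed_part (used, retries=10):
    """Draw a random fixed part not yet in ``used``; append and return it."""
    for _ in range (retries):
        fp = os.urandom (FIXED_SIZE)
        if fp not in used:
            used.append (fp)
            return fp
    raise RuntimeError ("no unique fixed part after %d tries" % retries)

def make_iv (fixed, cnt):
    """Pack fixed part and object counter into a 12 byte IV."""
    return struct.pack (IV_FMT, fixed, cnt)

used  = []
fixed = new_fixed_part (used)
iv    = make_iv (fixed, 1)
```

Keeping the fixed part random per context and the counter strictly increasing per object is what keeps IVs from repeating under one key, which is the property GCM depends on.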
cb7a3911 1064 def next (self, filename=None, counter=None):
704ceaa5
PG
1065 """
1066 Prepare for encrypting the next incoming object. Update the counter
1067 and put together the IV, possibly changing prefixes. Then create the
1068 new encryptor.
1069
1070 The argument ``counter`` can be used to specify a file counter for this
1071 object. Unless it is one of the reserved values, the counter of
1072 subsequent objects will be computed from this one.
1073
1074 If this is the first object in a series, ``filename`` is required,
1075 otherwise it is reused if not present. The value is used to derive a
1076 header sized placeholder to use until after encryption when all the
1077 inputs to construct the final header are available. This is then
1078 matched in ``.done()`` against the value found at the position of the
1079 header. The motivation for this extra check is primarily to assist
1080 format debugging: It makes stray headers easy to spot in malformed
1081 PDTCRYPT files.
1082 """
cb7a3911
PG
1083 if filename is None:
1084 if self.lastinfo is None:
1085 raise InvalidParameter ("next: filename is mandatory for "
1086 "first object")
1087 filename, _dummy = self.lastinfo
1088 else:
1089 if isinstance (filename, str) is False:
1090 raise InvalidParameter ("next: filename must be a string, not %s"
1091 % type (filename))
3031b7ae
PG
1092 if counter is not None:
1093 if isinstance (counter, int) is False:
1094 raise InvalidParameter ("next: the supplied counter is of "
1095 "invalid type %s; please pass an "
1096 "integer instead" % type (counter))
1097 self.set_object_counter (counter)
fac2cfe1 1098
50710d86 1099 self.iv = self.iv_make ()
72a42219 1100 if self.paramenc == "aes-gcm":
6178061e
PG
1101 self.enc = Cipher \
1102 ( algorithms.AES (self.key)
1103 , modes.GCM (self.iv)
1104 , backend = default_backend ()) \
1105 .encryptor ()
72a42219 1106 elif self.paramenc == "passthrough":
6178061e
PG
1107 self.enc = PassthroughCipher ()
1108 else:
b12110dd
PG
1109 raise InvalidParameter ("next: parameter version %d not known"
1110 % self.paramversion)
48db09ba
PG
1111 hdrdum = hdr_make_dummy (filename)
1112 self.lastinfo = (filename, hdrdum)
30019abf 1113 super().next (self.password, self.paramversion, self.nacl, self.iv)
72a42219 1114
3031b7ae 1115 self.set_object_counter (self.cnt + 1)
48db09ba 1116 return hdrdum
a393d9cb 1117
a393d9cb 1118
cd77dadb 1119 def done (self, cmpdata):
704ceaa5
PG
1120 """
1121 Complete encryption of an object. After this has been called, attempts
1122 to encrypt further data will cause an error until ``.next()`` is
1123 invoked properly.
1124
1125 Returns a triple consisting of the remaining ciphertext, the 64-byte
1126 object header including the “late” values, e. g. the ciphertext size and
1127 the GCM tag, and the list of IV fixed parts used by this context.
1128 """
36b9932a
PG
1129 if isinstance (cmpdata, bytes) is False:
1130 raise InvalidParameter ("done: comparison input expected as bytes, "
1131 "not %s" % type (cmpdata))
cb7a3911
PG
1132 if self.lastinfo is None:
1133 raise RuntimeError ("done: encryption context not initialized")
48db09ba
PG
1134 filename, hdrdum = self.lastinfo
1135 if cmpdata != hdrdum:
b12110dd
PG
1136 raise RuntimeError ("done: bad sync of header for object %d: "
1137 "preliminary data does not match; this likely "
1138 "indicates a wrongly repositioned stream"
1139 % self.cnt)
6178061e 1140 data = self.enc.finalize ()
633b18a9 1141 self.stats ["out"] += len (data)
cd77dadb 1142 self.ctsize += len (data)
48db09ba
PG
1143 ok, hdr = hdr_from_params (self.version, self.paramversion, self.nacl,
1144 self.iv, self.ctsize, self.enc.tag)
8a990744 1145 if ok is False:
b12110dd
PG
1146 raise InternalError ("error constructing header: %r" % hdr)
1147 return data, hdr, self.fixed
a393d9cb 1148
a393d9cb 1149
cd77dadb 1150 def process (self, buf):
704ceaa5
PG
1151 """
1152 Encrypt a chunk of plaintext with the active encryptor. Returns a pair
1153 of the size of the input consumed and the resulting ciphertext. The
1154 size **must** be checked downstream: if the maximum possible object size
1155 has been reached, the current context must be finalized and a new one
1156 established before any further data can be encrypted. The remainder of
1157 the plaintext that was not consumed is then left for the caller to pass
1158 in again once the new context is ready.
1159 """
36b9932a
PG
1160 if isinstance (buf, bytes) is False:
1161 raise InvalidParameter ("process: expected byte buffer, not %s"
1162 % type (buf))
cb7a3911
PG
1163 bsize = len (buf)
1164 newptsize = self.ptsize + bsize
1165 diff = newptsize - PDTCRYPT_MAX_OBJ_SIZE
1166 if diff > 0:
1167 bsize -= diff
1168 newptsize = PDTCRYPT_MAX_OBJ_SIZE
1169 self.ptsize = newptsize
1170 data = super().process (buf [:bsize])
cd77dadb 1171 self.ctsize += len (data)
cb7a3911 1172 return bsize, data
cd77dadb
PG
1173
1174
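Since ``Encrypt.process()`` may consume only part of its input, a caller has to loop until the buffer is exhausted. The sketch below shows that loop against a hypothetical stand-in, ``ToyContext``, which only mimics the ``(consumed, data)`` return convention and performs no encryption; ``MAX_OBJ_SIZE`` is a tiny placeholder for ``PDTCRYPT_MAX_OBJ_SIZE``, and the ``.done()`` call that finalizes a full object is omitted for brevity.

```python
MAX_OBJ_SIZE = 10   # tiny stand-in for PDTCRYPT_MAX_OBJ_SIZE

class ToyContext:
    """Hypothetical stand-in mimicking the ``(consumed, data)`` convention."""
    def __init__ (self):
        self.ptsize = 0
    def next (self):
        self.ptsize = 0
    def process (self, buf):
        room  = MAX_OBJ_SIZE - self.ptsize
        bsize = min (len (buf), room)
        self.ptsize += bsize
        return bsize, buf [:bsize]      # no actual encryption here

def feed (ctx, buf):
    """Pass all of ``buf`` through ``ctx``, starting new objects as needed."""
    objects = [b""]
    while len (buf) > 0:
        n, data = ctx.process (buf)
        objects [-1] += data
        buf = buf [n:]
        if len (buf) > 0:               # object is full: start the next one
            ctx.next ()
            objects.append (b"")
    return objects

chunks = feed (ToyContext (), b"0123456789abcdefghijXYZ")
```

With the real context, each iteration that leaves a remainder would first call ``.done()`` to obtain the final header, then ``.next()`` before resubmitting the leftover plaintext.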
39accaaa 1175class Decrypt (Crypto):
a393d9cb 1176
3031b7ae 1177 tag = None # GCM tag, part of header
3031b7ae 1178 last_iv = None # check consecutive ivs in strict mode
39accaaa 1179
1f3fd7b0 1180 def __init__ (self, password=None, key=None, counter=None, fixedparts=None,
ee6aa239 1181 strict_ivs=False):
704ceaa5
PG
1182 """
1183 Sanitizing ctor for the decryption context. ``fixedparts`` specifies a
1184 list of IV fixed parts accepted during decryption. If a fixed part is
1185 encountered that is not in the list, decryption will fail.
1186
1187 :param password: mutually exclusive with ``key``
1188 :type password: str
1189 :param key: mutually exclusive with ``password``
1190 :type key: bytes
1191 :param counter: initial object counter; the values
1192 ``AES_GCM_IV_CNT_INFOFILE`` and
1193 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
1194 and cannot be reused even with different fixed parts.
1195 :type fixedparts: bytes list
1196 """
1f3fd7b0
PG
1197 if password is None and key is None \
1198 or password is not None and key is not None :
1199 raise InvalidParameter ("__init__: need either key or password")
1200
1201 if key is not None:
1202 if isinstance (key, bytes) is False:
1203 raise InvalidParameter ("__init__: key must be provided as "
1204 "bytes, not %s" % type (key))
1205 else: # password, no key
1206 if isinstance (password, str) is False:
1207 raise InvalidParameter ("__init__: password must be a string, not %s"
1208 % type (password))
1209 if len (password) == 0:
1210 raise InvalidParameter ("__init__: supplied empty password but not "
1211 "permitted for PDT encrypted files")
36b9932a 1212 # fixed parts
50710d86 1213 if fixedparts is not None:
36b9932a
PG
1214 if isinstance (fixedparts, list) is False:
1215 raise InvalidParameter ("__init__: IV fixed parts must be "
1216 "supplied as list, not %s"
1217 % type (fixedparts))
b12110dd
PG
1218 self.fixed = fixedparts
1219 self.fixed.sort ()
ee6aa239 1220
a83fa4ed
PG
1221 super().__init__ (password=password, key=key, counter=counter,
1222 strict_ivs=strict_ivs)
39accaaa
PG
1223
1224
b12110dd 1225 def valid_fixed_part (self, iv):
704ceaa5
PG
1226 """
1227 Check whether the fixed part of an IV is in the list of accepted fixed parts.
1228 """
50710d86 1229 # check if fixed part is known
b12110dd
PG
1230 fixed, _cnt = struct.unpack (FMT_I2N_IV, iv)
1231 i = bisect.bisect_left (self.fixed, fixed)
1232 return i != len (self.fixed) and self.fixed [i] == fixed
50710d86
PG
1233
1234
ee6aa239 1235 def check_consecutive_iv (self, iv):
704ceaa5
PG
1236 """
1237 Check whether the counter part of the given IV is indeed the successor
1238 of the currently present counter. This should always be the case for
1239 the objects in a well formed PDT archive but should not be enforced
1240 when decrypting out-of-order.
1241 """
ee6aa239 1242 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
3031b7ae
PG
1243 if self.strict_ivs is True \
1244 and self.last_iv is not None \
ee6aa239
PG
1245 and self.last_iv [0] == fixed \
1246 and self.last_iv [1] != cnt - 1:
f6cd676f 1247 raise NonConsecutiveIV ("iv %s counter not successor of "
ee6aa239 1248 "last object (expected %d, found %d)"
f6cd676f 1249 % (iv_fmt (iv), self.last_iv [1] + 1, cnt))
ee6aa239
PG
1250 self.last_iv = (fixed, cnt)
1251
1252
79782fa9 1253 def next (self, hdr):
704ceaa5
PG
1254 """
1255 Start decrypting the next object. The PDTCRYPT header for the object
1256 can be given either as already parsed object or as bytes.
1257 """
dccfe104
PG
1258 if isinstance (hdr, bytes) is True:
1259 hdr = hdr_read (hdr)
36b9932a
PG
1260 elif isinstance (hdr, dict) is False:
1261 # this won’t catch malformed specs though
1262 raise InvalidParameter ("next: wrong type of parameter hdr: "
1263 "expected bytes or spec, got %s"
fbfda3d4 1264 % type (hdr))
36b9932a
PG
1265 try:
1266 paramversion = hdr ["paramversion"]
1267 nacl = hdr ["nacl"]
1268 iv = hdr ["iv"]
1269 tag = hdr ["tag"]
1270 except KeyError:
1271 raise InvalidHeader ("next: not a header %r" % hdr)
1272
30019abf 1273 super().next (self.password, paramversion, nacl, iv)
b12110dd 1274 if self.fixed is not None and self.valid_fixed_part (iv) is False:
f6cd676f
PG
1275 raise InvalidIVFixedPart ("iv %s has invalid fixed part"
1276 % iv_fmt (iv))
3031b7ae 1277 self.check_consecutive_iv (iv)
ee6aa239 1278
36b9932a 1279 self.tag = tag
b12110dd
PG
1280 defs = ENCRYPTION_PARAMETERS.get (paramversion, None)
1281 if defs is None:
1282 raise FormatError ("header contains unknown parameter version %d; "
1283 "maybe the file was created by a more recent "
1284 "version of Deltatar" % paramversion)
50710d86 1285 enc = defs ["enc"]
6178061e
PG
1286 if enc == "aes-gcm":
1287 self.enc = Cipher \
1288 ( algorithms.AES (self.key)
36b9932a 1289 , modes.GCM (iv, tag=self.tag)
6178061e
PG
1290 , backend = default_backend ()) \
1291 . decryptor ()
1292 elif enc == "passthrough":
1293 self.enc = PassthroughCipher ()
1294 else:
b12110dd
PG
1295 raise InternalError ("encryption parameter set %d refers to unknown "
1296 "mode %r" % (paramversion, enc))
f484f2d1 1297 self.set_object_counter (self.cnt + 1)
39accaaa
PG
1298
1299
db1f3ac7 1300 def done (self, tag=None):
704ceaa5
PG
1301 """
1302 Stop decryption of the current object and finalize it with the active
1303 context. This will throw an *InvalidGCMTag* exception to indicate that
1304 the authentication tag does not match the data. If the tag is correct,
1305 the rest of the plaintext is returned.
1306 """
633b18a9 1307 data = b""
db1f3ac7
PG
1308 try:
1309 if tag is None:
f484f2d1 1310 data = self.enc.finalize ()
db1f3ac7 1311 else:
36b9932a
PG
1312 if isinstance (tag, bytes) is False:
1313 raise InvalidParameter ("done: wrong type of parameter "
1314 "tag: expected bytes, got %s"
1315 % type (tag))
f484f2d1 1316 data = self.enc.finalize_with_tag (self.tag)
b0078f26 1317 except cryptography.exceptions.InvalidTag:
f08c604b 1318 raise InvalidGCMTag ("done: tag mismatch of object %d: %s "
b0078f26 1319 "rejected by finalize ()"
f08c604b 1320 % (self.cnt, binascii.hexlify (self.tag)))
50710d86 1321 self.ctsize += len (data)
633b18a9 1322 self.stats ["out"] += len (data)
b0078f26 1323 return data
00b3cd10
PG
1324
1325
47e27926 1326 def process (self, buf):
704ceaa5
PG
1327 """
1328 Decrypt the bytes object *buf* with the active decryptor.
1329 """
36b9932a
PG
1330 if isinstance (buf, bytes) is False:
1331 raise InvalidParameter ("process: expected byte buffer, not %s"
1332 % type (buf))
47e27926
PG
1333 self.ctsize += len (buf)
1334 data = super().process (buf)
1335 self.ptsize += len (data)
1336 return data
1337
1338
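The two IV checks performed during decryption, the fixed part lookup and the test for consecutive counters, can be illustrated independently of the class. The following is a sketch under assumptions: the IV layout ``"<8sL"`` stands in for ``FMT_I2N_IV``, and the helper names ``known_fixed_part`` and ``is_successor`` are invented for the example.

```python
import bisect
import struct

IV_FMT = "<8sL"   # assumed layout: 8 byte fixed part, 32 bit counter

def known_fixed_part (fixedparts, iv):
    """Binary search for the fixed part of ``iv`` in a *sorted* list."""
    fixed, _cnt = struct.unpack (IV_FMT, iv)
    i = bisect.bisect_left (fixedparts, fixed)
    return i != len (fixedparts) and fixedparts [i] == fixed

def is_successor (last, iv):
    """False only if ``iv`` shares the fixed part of ``last`` but does not
    carry the next counter value."""
    if last is None:
        return True
    fixed, cnt   = struct.unpack (IV_FMT, iv)
    lfixed, lcnt = struct.unpack (IV_FMT, last)
    return lfixed != fixed or lcnt + 1 == cnt

parts = sorted ([b"B" * 8, b"A" * 8])
iv1 = struct.pack (IV_FMT, b"A" * 8, 1)
iv2 = struct.pack (IV_FMT, b"A" * 8, 2)
iv4 = struct.pack (IV_FMT, b"A" * 8, 4)
ivx = struct.pack (IV_FMT, b"C" * 8, 1)
```

The binary search only works because the list is sorted up front, which is why the constructor sorts ``fixedparts`` once instead of scanning linearly for every object.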
00b3cd10 1339###############################################################################
770173c5
PG
1340## testing helpers
1341###############################################################################
1342
cb7a3911 1343def _patch_global (glob, vow, n=None):
770173c5
PG
1344 """
1345 Adapt upper file counter bound for testing IV logic. Completely unsafe.
1346 """
1347 assert vow == "I am fully aware that this will void my warranty."
cb7a3911
PG
1348 r = globals () [glob]
1349 if n is None:
1350 n = globals () [glob + "_DEFAULT"]
1351 globals () [glob] = n
770173c5
PG
1352 return r
1353
cb7a3911
PG
1354_testing_set_AES_GCM_IV_CNT_MAX = \
1355 partial (_patch_global, "AES_GCM_IV_CNT_MAX")
1356
1357_testing_set_PDTCRYPT_MAX_OBJ_SIZE = \
1358 partial (_patch_global, "PDTCRYPT_MAX_OBJ_SIZE")
1359
770173c5 1360###############################################################################
00b3cd10
PG
1361## freestanding invocation
1362###############################################################################
1363
da82bc58
PG
1364PDTCRYPT_SUB_PROCESS = 0
1365PDTCRYPT_SUB_SCRYPT = 1
f41973a6 1366PDTCRYPT_SUB_SCAN = 2
da82bc58
PG
1367
1368PDTCRYPT_SUB = \
1369 { "process" : PDTCRYPT_SUB_PROCESS
f41973a6
PG
1370 , "scrypt" : PDTCRYPT_SUB_SCRYPT
1371 , "scan" : PDTCRYPT_SUB_SCAN }
da82bc58 1372
a83fa4ed
PG
1373PDTCRYPT_SECRET_PW = 0
1374PDTCRYPT_SECRET_KEY = 1
1375
e3abcdf0
PG
1376PDTCRYPT_DECRYPT = 1 << 0 # decrypt archive with password
1377PDTCRYPT_SPLIT = 1 << 1 # split archive into individual objects
da82bc58 1378PDTCRYPT_HASH = 1 << 2 # output scrypt hash for file and given password
e3abcdf0
PG
1379
1380PDTCRYPT_SPLITNAME = "pdtcrypt-object-%d.bin"
1381
70ad9458 1382PDTCRYPT_VERBOSE = False
ee6aa239 1383PDTCRYPT_STRICTIVS = False
b07633d3 1384PDTCRYPT_OVERWRITE = False
15d3eefd 1385PDTCRYPT_BLOCKSIZE = 1 << 12
70ad9458
PG
1386PDTCRYPT_SINK = 0
1387PDTCRYPT_SOURCE = 1
1388SELF = None
1389
77058bab
PG
1390PDTCRYPT_DEFAULT_VER = 1
1391PDTCRYPT_DEFAULT_PVER = 1
1392
7b3940e5
PG
1393# scrypt hashing output control
1394PDTCRYPT_SCRYPT_INTRANATOR = 0
1395PDTCRYPT_SCRYPT_PARAMETERS = 1
4f6405d6 1396PDTCRYPT_SCRYPT_DEFAULT = PDTCRYPT_SCRYPT_INTRANATOR
7b3940e5
PG
1397
1398PDTCRYPT_SCRYPT_FORMAT = \
1399 { "i2n" : PDTCRYPT_SCRYPT_INTRANATOR
1400 , "params" : PDTCRYPT_SCRYPT_PARAMETERS }
1401
4c62ddc0 1402PDTCRYPT_TT_COLUMNS = 80 # assume standard terminal
15d3eefd
PG
1403
1404class PDTDecryptionError (Exception):
1405 """Decryption failed."""
1406
e3abcdf0
PG
1407class PDTSplitError (Exception):
1408 """Splitting the archive into individual objects failed."""
1409
15d3eefd
PG
1410
1411def noise (*a, **b):
591a722f 1412 print (file=sys.stderr, *a, **b)
15d3eefd
PG
1413
1414
89e1073c
PG
1415class PassthroughDecryptor (object):
1416
1417 curhdr = None # write current header on first data write
1418
1419 def __init__ (self):
1420 if PDTCRYPT_VERBOSE is True:
1421 noise ("PDT: no encryption; data passthrough")
1422
1423 def next (self, hdr):
1424 ok, curhdr = hdr_make (hdr)
1425 if ok is False:
1426 raise PDTDecryptionError ("bad header %r" % hdr)
1427 self.curhdr = curhdr
1428
1429 def done (self):
1430 if self.curhdr is not None:
1431 return self.curhdr
1432 return b""
1433
1434 def process (self, d):
1435 if self.curhdr is not None:
1436 d = self.curhdr + d
1437 self.curhdr = None
1438 return d
1439
1440
a83fa4ed 1441def depdtcrypt (mode, secret, ins, outs):
15d3eefd 1442 """
a83fa4ed
PG
1443 Remove PDTCRYPT layer from all objects encrypted with the secret. Used on a
1444 Deltatar backup this will yield a (possibly Gzip compressed) tarball.
15d3eefd
PG
1445 """
1446 ctleft = -1 # length of ciphertext to consume
1447 ctcurrent = 0 # total ciphertext of current object
15d3eefd
PG
1448 total_obj = 0 # total number of objects read
1449 total_pt = 0 # total plaintext bytes
1450 total_ct = 0 # total ciphertext bytes
1451 total_read = 0 # total bytes read
e3abcdf0
PG
1452 outfile = None # Python file object for output
1453
89e1073c 1454 if mode & PDTCRYPT_DECRYPT: # decryptor
a83fa4ed
PG
1455 ks = secret [0]
1456 if ks == PDTCRYPT_SECRET_PW:
1457 decr = Decrypt (password=secret [1], strict_ivs=PDTCRYPT_STRICTIVS)
1458 elif ks == PDTCRYPT_SECRET_KEY:
1459 key = binascii.unhexlify (secret [1])
1460 decr = Decrypt (key=key, strict_ivs=PDTCRYPT_STRICTIVS)
1461 else:
1462 raise InternalError ("‘%d’ does not specify a valid kind of secret"
1463 % ks)
89e1073c
PG
1464 else:
1465 decr = PassthroughDecryptor ()
1466
e3abcdf0
PG
1467 def nextout (_):
1468 """Dummy for non-split mode: output file does not vary."""
1469 return outs
1470
1471 if mode & PDTCRYPT_SPLIT:
1472 def nextout (outfile):
1473 """
1474 We were passed an fd as outs for accessing the destination
1475 directory in which the extracted archive components are supposed
1476 to end up.
1477 """
1478
1479 if outfile is None:
1480 if PDTCRYPT_VERBOSE is True:
1481 noise ("PDT: no output file to close at this point")
77058bab
PG
1482 else:
1483 if PDTCRYPT_VERBOSE is True:
1484 noise ("PDT: release output file %r" % outfile)
e3abcdf0
PG
1485 # cleanup happens automatically by the GC; the next
1486 # line will error out on account of an invalid fd
1487 #outfile.close ()
1488
1489 assert total_obj > 0
1490 fname = PDTCRYPT_SPLITNAME % total_obj
1491 try:
b07633d3
PG
1492 oflags = os.O_CREAT | os.O_WRONLY
1493 if PDTCRYPT_OVERWRITE is True:
1494 oflags |= os.O_TRUNC
1495 else:
1496 oflags |= os.O_EXCL
1497 outfd = os.open (fname, oflags, 0o600, dir_fd=outs)
e3abcdf0
PG
1498 if PDTCRYPT_VERBOSE is True:
1499 noise ("PDT: new output file %s → %d" % (fname, outfd))
1500 except FileExistsError as exn:
1501 noise ("PDT: refusing to overwrite existing file %s" % fname)
1502 noise ("")
1503 raise PDTSplitError ("destination file %s already exists"
1504 % fname)
1505
1506 return os.fdopen (outfd, "wb", closefd=True)
1507
15d3eefd 1508
47d22679 1509 def tell (s):
b09a99eb 1510 """ESPIPE is normal on non-seekable stdio stream."""
47d22679
PG
1511 try:
1512 return s.tell ()
1513 except OSError as exn:
b09a99eb 1514 if exn.errno == os.errno.ESPIPE:
47d22679
PG
1515 return -1
1516
e3abcdf0 1517 def out (pt, outfile):
15d3eefd
PG
1518 npt = len (pt)
1519 nonlocal total_pt
1520 total_pt += npt
70ad9458 1521 if PDTCRYPT_VERBOSE is True:
15d3eefd
PG
1522 noise ("PDT:\t· decrypt plaintext %d B" % (npt))
1523 try:
e3abcdf0 1524 nn = outfile.write (pt)
15d3eefd
PG
1525 except OSError as exn: # probably ENOSPC
1526 raise DecryptionError ("error (%s)" % exn)
1527 if nn != npt:
1528 raise DecryptionError ("write aborted after %d of %d B" % (nn, npt))
1529
1530 while True:
1531 if ctleft <= 0:
1532 # current object completed; in a valid archive this marks either
1533 # the start of a new header or the end of the input
1534 if ctleft == 0: # current object requires finalization
70ad9458 1535 if PDTCRYPT_VERBOSE is True:
47d22679 1536 noise ("PDT: %d finalize" % tell (ins))
5d394c0d
PG
1537 try:
1538 pt = decr.done ()
1539 except InvalidGCMTag as exn:
f08c604b
PG
1540 raise DecryptionError ("error finalizing object %d (%d B): "
1541 "%r" % (total_obj, len (pt), exn)) \
1542 from exn
e3abcdf0 1543 out (pt, outfile)
70ad9458 1544 if PDTCRYPT_VERBOSE is True:
15d3eefd
PG
1545 noise ("PDT:\t· object validated")
1546
70ad9458 1547 if PDTCRYPT_VERBOSE is True:
47d22679 1548 noise ("PDT: %d hdr" % tell (ins))
15d3eefd
PG
1549 try:
1550 hdr = hdr_read_stream (ins)
dd47d6a2 1551 total_read += PDTCRYPT_HDR_SIZE
ae3d0f2a
PG
1552 except EndOfFile as exn:
1553 total_read += exn.remainder
dd47d6a2 1554 if total_ct + total_obj * PDTCRYPT_HDR_SIZE != total_read:
15d3eefd
PG
1555 raise PDTDecryptionError ("ciphertext processed (%d B) plus "
1556 "overhead (%d × %d B) does not match "
1557 "the number of bytes read (%d B)"
dd47d6a2 1558 % (total_ct, total_obj, PDTCRYPT_HDR_SIZE,
15d3eefd
PG
1559 total_read))
1560 # the single good exit
1561 return total_read, total_obj, total_ct, total_pt
1562 except InvalidHeader as exn:
1563 raise PDTDecryptionError ("invalid header at position %d in %r "
ee6aa239 1564 "(%s)" % (tell (ins), ins, exn))
70ad9458 1565 if PDTCRYPT_VERBOSE is True:
15d3eefd
PG
1566 pretty = hdr_fmt_pretty (hdr)
1567 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1568 pretty.splitlines (), ""))
1569 ctcurrent = ctleft = hdr ["ctsize"]
89e1073c 1570
15d3eefd 1571 decr.next (hdr)
e3abcdf0
PG
1572
1573 total_obj += 1 # used in file counter with split mode
1574
1575 # finalization complete or skipped in case of first object in
1576 # stream; create a new output file if necessary
1577 outfile = nextout (outfile)
15d3eefd 1578
70ad9458 1579 if PDTCRYPT_VERBOSE is True:
15d3eefd 1580 noise ("PDT: %d decrypt obj no. %d, %d B"
47d22679 1581 % (tell (ins), total_obj, ctleft))
15d3eefd
PG
1582
1583 # always allocate a new buffer since python-cryptography doesn’t allow
1584 # passing a bytearray :/
1585 nexpect = min (ctleft, PDTCRYPT_BLOCKSIZE)
70ad9458 1586 if PDTCRYPT_VERBOSE is True:
15d3eefd 1587 noise ("PDT:\t· [%d] %d%% done, read block (%d B of %d B remaining)"
47d22679 1588 % (tell (ins),
15d3eefd
PG
1589 100 - ctleft * 100 / (ctcurrent > 0 and ctcurrent or 1),
1590 nexpect, ctleft))
1591 ct = ins.read (nexpect)
1592 nct = len (ct)
1593 if nct < nexpect:
47d22679 1594 off = tell (ins)
ae3d0f2a
PG
1595 raise EndOfFile (nct,
1596 "hit EOF after %d of %d B in block [%d:%d); "
15d3eefd
PG
1597 "%d B ciphertext remaining for object no %d"
1598 % (nct, nexpect, off, off + nexpect, ctleft,
1599 total_obj))
1600 ctleft -= nct
1601 total_ct += nct
1602 total_read += nct
1603
70ad9458 1604 if PDTCRYPT_VERBOSE is True:
15d3eefd
PG
1605 noise ("PDT:\t· decrypt ciphertext %d B" % (nct))
1606 pt = decr.process (ct)
e3abcdf0 1607 out (pt, outfile)
15d3eefd 1608
d6c15a52 1609
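The bookkeeping in ``depdtcrypt()`` boils down to alternating between reading a header and consuming the announced amount of ciphertext in blocks, then verifying at end of input that headers plus payload add up to the number of bytes read. The sketch below shows just that accounting on a toy format: the four byte length-only header, ``HDR_SIZE``, ``BLOCKSIZE`` and the helpers ``walk`` and ``obj`` are assumptions for illustration and do not reflect the real PDTCRYPT header.

```python
import io
import struct

HDR_SIZE  = 4      # toy header: payload length only (the real header is larger)
BLOCKSIZE = 8

def walk (ins):
    """Read header/payload pairs from ``ins``; return (read, objects, payload)."""
    total_read = total_obj = total_ct = 0
    while True:
        hdr = ins.read (HDR_SIZE)
        total_read += len (hdr)
        if len (hdr) < HDR_SIZE:
            # end of input: everything read must be accounted for
            if total_ct + total_obj * HDR_SIZE != total_read:
                raise ValueError ("trailing garbage: %d B" % len (hdr))
            return total_read, total_obj, total_ct
        (ctleft,) = struct.unpack (">L", hdr)
        total_obj += 1
        while ctleft > 0:
            block = ins.read (min (ctleft, BLOCKSIZE))
            if len (block) == 0:
                raise EOFError ("object %d truncated" % total_obj)
            ctleft     -= len (block)
            total_ct   += len (block)
            total_read += len (block)

def obj (payload):
    """Build one toy object: length header followed by the payload."""
    return struct.pack (">L", len (payload)) + payload

stream = io.BytesIO (obj (b"x" * 20) + obj (b"") + obj (b"yz"))
result = walk (stream)
```

The final consistency check is what distinguishes a clean end of archive from a stream that ends in a partial header.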
70ad9458 1610def deptdcrypt_mk_stream (kind, path):
d6c15a52 1611 """Create stream from file or stdio descriptor."""
70ad9458 1612 if kind == PDTCRYPT_SINK:
d6c15a52 1613 if path == "-":
70ad9458 1614 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: stdout")
d6c15a52
PG
1615 return sys.stdout.buffer
1616 else:
70ad9458 1617 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: file %s" % path)
d6c15a52 1618 return io.FileIO (path, "w")
70ad9458 1619 if kind == PDTCRYPT_SOURCE:
d6c15a52 1620 if path == "-":
70ad9458 1621 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: stdin")
d6c15a52
PG
1622 return sys.stdin.buffer
1623 else:
70ad9458 1624 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: file %s" % path)
d6c15a52
PG
1625 return io.FileIO (path, "r")
1626
1627 raise ValueError ("bogus stream “%s” / %s" % (kind, path))
1628
15d3eefd 1629
a83fa4ed 1630def mode_depdtcrypt (mode, secret, ins, outs):
da82bc58
PG
1631 try:
1632 total_read, total_obj, total_ct, total_pt = \
a83fa4ed 1633 depdtcrypt (mode, secret, ins, outs)
da82bc58
PG
1634 except DecryptionError as exn:
1635 noise ("PDT: Decryption failed:")
1636 noise ("PDT:")
1637 noise ("PDT: “%s”" % exn)
1638 noise ("PDT:")
a83fa4ed 1639 noise ("PDT: Did you specify the correct key / password?")
da82bc58
PG
1640 noise ("")
1641 return 1
1642 except PDTSplitError as exn:
1643 noise ("PDT: Split operation failed:")
1644 noise ("PDT:")
1645 noise ("PDT: “%s”" % exn)
1646 noise ("PDT:")
a83fa4ed 1647 noise ("PDT: Hint: target directory should be empty.")
da82bc58
PG
1648 noise ("")
1649 return 1
1650
1651 if PDTCRYPT_VERBOSE is True:
1652 noise ("PDT: decryption successful" )
1653 noise ("PDT: %.10d bytes read" % total_read)
1654 noise ("PDT: %.10d objects decrypted" % total_obj )
1655 noise ("PDT: %.10d bytes ciphertext" % total_ct )
1656 noise ("PDT: %.10d bytes plaintext" % total_pt )
1657 noise ("" )
1658
1659 return 0
1660
1661
7b3940e5 1662def mode_scrypt (pw, ins=None, nacl=None, fmt=PDTCRYPT_SCRYPT_INTRANATOR):
77058bab 1663 hsh = None
7b3940e5 1664 paramversion = PDTCRYPT_DEFAULT_PVER
77058bab
PG
1665 if ins is not None:
1666 hsh, nacl, version, paramversion = scrypt_hashsource (pw, ins)
1667 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1668 else:
1669 nacl = binascii.unhexlify (nacl)
7b3940e5 1670 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
77058bab
PG
1671 version = PDTCRYPT_DEFAULT_VER
1672
1673 kdfname, params = defs ["kdf"]
1674 if hsh is None:
1675 kdf = kdf_by_version (None, defs)
1676 hsh, _void = kdf (pw, nacl)
da82bc58
PG
1677
1678 import json
7b3940e5
PG
1679
1680 if fmt == PDTCRYPT_SCRYPT_INTRANATOR:
1681 out = json.dumps ({ "salt" : base64.b64encode (nacl).decode ()
1682 , "key" : base64.b64encode (hsh) .decode ()
1683 , "paramversion" : paramversion })
1684 elif fmt == PDTCRYPT_SCRYPT_PARAMETERS:
1685 out = json.dumps ({ "salt" : binascii.hexlify (nacl).decode ()
1686 , "key" : binascii.hexlify (hsh) .decode ()
1687 , "version" : version
1688 , "scrypt_params" : { "N" : params ["N"]
1689 , "r" : params ["r"]
1690 , "p" : params ["p"]
1691 , "dkLen" : params ["dkLen"] } })
1692 else:
1693 raise RuntimeError ("bad scrypt output scheme %r" % fmt)
1694
da82bc58
PG
1695 print (out)
1696
1697
4c62ddc0
PG
1698def noise_output_candidates (cands, indent=8, cols=PDTCRYPT_TT_COLUMNS):
1699 """
1700 Print a list of offsets without garbling the terminal too much.
1701
1702 The indent is counted from column zero; if it is wide enough, the “PDT: ”
1703 marker will be prepended, considered part of the indentation.
1704 """
1705 wd = cols - 1
1706 nc = len (cands)
1707 idt = " " * indent if indent < 5 else "PDT: " + " " * (indent - 5)
1708 line = idt
1709 lpos = indent
1710 sep = ","
1711 lsep = len (sep)
1712 init = True # prevent leading separator
1713
1714 if indent >= wd:
1715 raise ValueError ("the requested indentation exceeds the line "
1716 "width by %d" % (indent - wd))
1717
1718 for n in cands:
1719 ns = "%d" % n
1720 lns = len (ns)
1721 if init is False:
1722 line += sep
1723 lpos += lsep
1724
1725 lpos += lns
1726 if lpos > wd: # line break
1727 noise (line)
1728 line = idt
1729 lpos = indent + lns
1730 elif init is True:
1731 init = False
1732 else: # space
1733 line += ' '
1734 lpos += 1
1735
1736 line += ns
1737
1738 if lpos != indent:
1739 noise (line)
1740
1741
70a33834 1742def mode_scan (secret, fname, nacl=None):
f41973a6
PG
1743 """
1744 Dissect a binary file, looking for PDTCRYPT headers and objects.
1745 """
1746 try:
1747 fd = os.open (fname, os.O_RDONLY)
1748 except FileNotFoundError:
1749 noise ("PDT: failed to open %s readonly" % fname)
1750 noise ("")
1751 usage (err=True)
1752
1753 try:
1754 if PDTCRYPT_VERBOSE is True:
1755 noise ("PDT: scan for potential sync points")
1756 cands = locate_hdr_candidates (fd)
1757 if len (cands) == 0:
1758 noise ("PDT: scan complete: input does not contain potential PDT "
1759 "headers; giving up.")
1760 return -1
1761 if PDTCRYPT_VERBOSE is True:
4c62ddc0
PG
1762 noise ("PDT: scan complete: found %d candidates:" % len (cands))
1763 noise_output_candidates (cands)
6c8073ab 1764 except:
f41973a6 1765 os.close (fd)
6c8073ab 1766 raise
f41973a6 1767
6c8073ab
PG
1768 junk, todo = [], []
1769 try:
1770 for cand in cands:
1771 vdt, hdr = inspect_hdr (fd, cand)
1772 if vdt == HDR_CAND_JUNK:
1773 junk.append (cand)
1774 else:
1775 off0 = cand + PDTCRYPT_HDR_SIZE
1776 if PDTCRYPT_VERBOSE is True:
70a33834
PG
1777 noise ("PDT: read payload @%d" % off0)
1778 pretty = hdr_fmt_pretty (hdr)
1779 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1780 pretty.splitlines (), ""))
6c8073ab 1781
70a33834
PG
1782 ok = try_decrypt (fd, off0, hdr, secret) == hdr ["ctsize"]
1783 if vdt == HDR_CAND_GOOD and ok is True:
6c8073ab
PG
1784 noise ("PDT: %d → ✓ valid object %d–%d"
1785 % (cand, off0, off0 + hdr ["ctsize"]))
70a33834 1786 elif vdt == HDR_CAND_FISHY and ok is True:
6c8073ab
PG
1787 noise ("PDT: %d → × object %d–%d, corrupt header"
1788 % (cand, off0, off0 + hdr ["ctsize"]))
70a33834 1789 elif vdt == HDR_CAND_GOOD and ok is False:
6c8073ab
PG
1790 noise ("PDT: %d → × object %d–%d, problematic payload"
1791 % (cand, off0, off0 + hdr ["ctsize"]))
70a33834 1792 elif vdt == HDR_CAND_FISHY and ok is False:
6c8073ab
PG
1793 noise ("PDT: %d → × object %d–%d, corrupt header, problematic "
1794 "ciphertext" % (cand, off0, off0 + hdr ["ctsize"]))
1795 else:
1796 raise Unreachable
1797 finally:
1798 os.close (fd)
7b3940e5 1799
70a33834
PG
1800 if len (junk) == 0:
1801 noise ("PDT: all headers ok")
1802 else:
1803 noise ("PDT: %d candidates not parseable as headers:" % len (junk))
1804 noise_output_candidates (junk)
1805
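The four-way branch in `mode_scan` combines the header verdict with the result of the trial decryption. As a sketch of the same classification, the mapping can be written as a lookup table. The numeric values of the `HDR_CAND_*` constants and the helper name `describe` are assumptions made for this example; the real constants are defined elsewhere in the module.

```python
# Values assumed for illustration only.
HDR_CAND_GOOD, HDR_CAND_FISHY, HDR_CAND_JUNK = 0, 1, 2

# (header verdict, payload decrypted completely) -> message template
VERDICTS = {
    (HDR_CAND_GOOD,  True):  "✓ valid object %d–%d",
    (HDR_CAND_FISHY, True):  "× object %d–%d, corrupt header",
    (HDR_CAND_GOOD,  False): "× object %d–%d, problematic payload",
    (HDR_CAND_FISHY, False): "× object %d–%d, corrupt header, "
                             "problematic ciphertext",
}


def describe(cand, vdt, ok, off0, ctsize):
    """Format the scan result for the header candidate at offset *cand*."""
    try:
        fmt = VERDICTS[(vdt, ok)]
    except KeyError:
        # Junk candidates are collected separately and never reach this point.
        raise ValueError("unexpected verdict %r" % ((vdt, ok),))
    return ("PDT: %d → " + fmt) % (cand, off0, off0 + ctsize)
```

A table keeps the message for each combination in one place; the explicit `if`/`elif` chain in the original has the advantage that an impossible state is flagged with `Unreachable`.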
70ad9458
PG
1806def usage (err=False):
1807 out = print
1808 if err is True:
1809 out = noise
5afcb45d 1810 indent = ' ' * len (SELF)
da82bc58 1811 out ("usage: %s SUBCOMMAND { --help" % SELF)
5afcb45d 1812 out (" %s | [ -v ] { -p PASSWORD | -k KEY }" % indent)
77058bab
PG
1813 out (" %s [ { -i | --in } { - | SOURCE } ]" % indent)
1814 out (" %s [ { -n | --nacl } { SALT } ]" % indent)
1815 out (" %s [ { -o | --out } { - | DESTINATION } ]" % indent)
1816 out (" %s [ -D | --no-decrypt ] [ -S | --split ]" % indent)
7b3940e5 1817 out (" %s [ -f | --format ]" % indent)
70ad9458
PG
1818 out ("")
1819 out ("\twhere")
da82bc58
PG
1820 out ("\t\tSUBCOMMAND main mode: { process | scrypt }")
1821 out ("\t\t where:")
1822 out ("\t\t process: extract objects from PDT archive")
1823 out ("\t\t scrypt: calculate hash from password and first object")
a83fa4ed
PG
1824 out ("\t\t-p PASSWORD password to derive the encryption key from")
1825 out ("\t\t-k KEY encryption key as 16 bytes in hexadecimal notation")
e3abcdf0 1826 out ("\t\t-s enforce strict handling of initialization vectors")
70ad9458
PG
1827 out ("\t\t-i SOURCE file name to read from")
1828 out ("\t\t-o DESTINATION file to write output to")
77058bab 1829 out ("\t\t-n SALT provide salt for scrypt mode in hex encoding")
70ad9458 1830 out ("\t\t-v print extra info")
e3abcdf0
PG
1831 out ("\t\t-S split into files at object boundaries; this")
1832 out ("\t\t requires DESTINATION to refer to directory")
1833 out ("\t\t-D PDT header and ciphertext passthrough")
7b3940e5 1834 out ("\t\t-f format of SCRYPT hash output (“default” or “parameters”)")
70ad9458
PG
1835 out ("")
1836 out ("\tinstead of filenames, “-” may be used to specify stdin / stdout")
1837 out ("")
1838 sys.exit (42 if err is True else 0)
1839
1840
a83fa4ed
PG
1841def bail (msg):
1842 noise (msg)
1843 noise ("")
1844 usage (err=True)
1845 raise Unreachable
1846
1847
70ad9458
PG
1848def parse_argv (argv):
1849 global SELF
7b3940e5
PG
1850 mode = PDTCRYPT_DECRYPT
1851 secret = None
1852 insspec = None
1853 outsspec = None
1854 nacl = None
4f6405d6 1855 scrypt_format = PDTCRYPT_SCRYPT_DEFAULT
70ad9458
PG
1856
1857 argvi = iter (argv)
1858 SELF = os.path.basename (next (argvi))
1859
da82bc58
PG
1860 try:
1861 rawsubcmd = next (argvi)
1862 subcommand = PDTCRYPT_SUB [rawsubcmd]
1863 except StopIteration:
a83fa4ed 1864 bail ("ERROR: subcommand required")
da82bc58 1865 except KeyError:
a83fa4ed 1866 bail ("ERROR: invalid subcommand “%s” specified" % rawsubcmd)
da82bc58 1867
59d74e2b
PG
1868 def checked_arg ():
1869 nonlocal argvi
1870 try:
1871 return next (argvi)
1872 except StopIteration:
1873 bail ("ERROR: argument list incomplete")
1874
a83fa4ed
PG
1875 def checked_secret (t, arg):
1876 nonlocal secret
1877 if secret is None:
1878 secret = (t, arg)
da82bc58 1879 else:
a83fa4ed 1880 bail ("ERROR: encountered “%s” but secret already given" % arg)
da82bc58 1881
70ad9458
PG
1882 for arg in argvi:
1883 if arg in [ "-h", "--help" ]:
1884 usage ()
1885 raise Unreachable
1886 elif arg in [ "-v", "--verbose", "--wtf" ]:
1887 global PDTCRYPT_VERBOSE
1888 PDTCRYPT_VERBOSE = True
1889 elif arg in [ "-i", "--in", "--source" ]:
59d74e2b 1890 insspec = checked_arg ()
70ad9458 1891 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt from %s" % insspec)
a83fa4ed 1892 elif arg in [ "-p", "--password" ]:
59d74e2b 1893 arg = checked_arg ()
a83fa4ed
PG
1894 checked_secret (PDTCRYPT_SECRET_PW, arg)
1895 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with password")
70ad9458 1896 else:
da82bc58
PG
1897 if subcommand == PDTCRYPT_SUB_PROCESS:
1898 if arg in [ "-s", "--strict-ivs" ]:
1899 global PDTCRYPT_STRICTIVS
1900 PDTCRYPT_STRICTIVS = True
77058bab
PG
1901 elif arg in [ "-o", "--out", "--dest", "--sink" ]:
1902 outsspec = checked_arg ()
1903 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
da82bc58
PG
1904 elif arg in [ "-f", "--force" ]:
1905 global PDTCRYPT_OVERWRITE
1906 PDTCRYPT_OVERWRITE = True
1907 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
1908 elif arg in [ "-S", "--split" ]:
1909 mode |= PDTCRYPT_SPLIT
1910 if PDTCRYPT_VERBOSE is True: noise ("PDT: split files")
1911 elif arg in [ "-D", "--no-decrypt" ]:
1912 mode &= ~PDTCRYPT_DECRYPT
1913 if PDTCRYPT_VERBOSE is True: noise ("PDT: not decrypting")
a83fa4ed 1914 elif arg in [ "-k", "--key" ]:
59d74e2b 1915 arg = checked_arg ()
a83fa4ed
PG
1916 checked_secret (PDTCRYPT_SECRET_KEY, arg)
1917 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with key")
da82bc58 1918 else:
a83fa4ed 1919 bail ("ERROR: unexpected positional argument “%s”" % arg)
da82bc58 1920 elif subcommand == PDTCRYPT_SUB_SCRYPT:
77058bab
PG
1921 if arg in [ "-n", "--nacl", "--salt" ]:
1922 nacl = checked_arg ()
1923 if PDTCRYPT_VERBOSE is True: noise ("PDT: salt key with %s" % nacl)
7b3940e5
PG
1924 elif arg in [ "-f", "--format" ]:
1925 arg = checked_arg ()
1926 try:
1927 scrypt_format = PDTCRYPT_SCRYPT_FORMAT [arg]
1928 except KeyError:
1929 bail ("ERROR: invalid scrypt output format %s" % arg)
1930 if PDTCRYPT_VERBOSE is True:
1931 noise ("PDT: scrypt output format “%s”" % scrypt_format)
77058bab
PG
1932 else:
1933 bail ("ERROR: unexpected positional argument “%s”" % arg)
f41973a6
PG
1934 elif subcommand == PDTCRYPT_SUB_SCAN:
1935 pass
70ad9458 1936
a83fa4ed 1937 if secret is None:
ecb9676d 1938 if PDTCRYPT_VERBOSE is True:
a83fa4ed 1939 noise ("PDT: no password or key specified, trying $PDTCRYPT_PASSWORD")
ecb9676d
PG
1940 epw = os.getenv ("PDTCRYPT_PASSWORD")
1941 if epw is not None:
a83fa4ed
PG
1942 checked_secret (PDTCRYPT_SECRET_PW, epw.strip ())
1943
1944 if secret is None:
1945 if PDTCRYPT_VERBOSE is True:
1946 noise ("PDT: no password or key specified, trying $PDTCRYPT_KEY")
1947 ek = os.getenv ("PDTCRYPT_KEY")
1948 if ek is not None:
1949 checked_secret (PDTCRYPT_SECRET_KEY, ek.strip ())
ecb9676d 1950
a83fa4ed 1951 if secret is None:
da82bc58 1952 if subcommand == PDTCRYPT_SUB_SCRYPT:
a83fa4ed 1953 bail ("ERROR: scrypt hash mode requested but no password given")
da82bc58 1954 elif mode & PDTCRYPT_DECRYPT:
a83fa4ed
PG
1955 bail ("ERROR: decryption requested but no password or key given")
1956
f41973a6
PG
1957 if subcommand == PDTCRYPT_SUB_SCAN:
1958 if insspec is None:
1959 bail ("ERROR: please supply an input file for scanning")
1960 if insspec == '-':
1961 bail ("ERROR: input must be seekable; please specify a file")
70a33834 1962 return True, partial (mode_scan, secret, insspec, nacl)
f41973a6 1963
77058bab
PG
1964 if subcommand == PDTCRYPT_SUB_SCRYPT:
1965 if secret [0] == PDTCRYPT_SECRET_KEY:
1966 bail ("ERROR: scrypt mode requires a password")
1967 if insspec is not None and nacl is not None \
1968 or insspec is None and nacl is None :
1969 bail ("ERROR: please supply either an input file or "
1970 "the salt")
70ad9458
PG
1971
1972 # default to stdin
77058bab
PG
1973 ins = None
1974 if insspec is not None or subcommand != PDTCRYPT_SUB_SCRYPT:
1975 ins = deptdcrypt_mk_stream (PDTCRYPT_SOURCE, insspec or "-")
da82bc58
PG
1976
1977 if subcommand == PDTCRYPT_SUB_SCRYPT:
7b3940e5
PG
1978 return True, partial (mode_scrypt, secret [1].encode (), ins, nacl,
1979 fmt=scrypt_format)
da82bc58 1980
e3abcdf0
PG
1981 if mode & PDTCRYPT_SPLIT: # destination must be directory
1982 if outsspec is None or outsspec == "-":
a83fa4ed 1983 bail ("ERROR: split mode is incompatible with stdout sink")
e3abcdf0
PG
1984
1985 try:
1986 try:
1987 os.makedirs (outsspec, 0o700)
1988 except FileExistsError:
1989 # if it’s a directory with appropriate perms, everything is
1990 # good; otherwise, below invocation of open(2) will fail
1991 pass
1992 outs = os.open (outsspec, os.O_DIRECTORY, 0o600)
1993 except FileNotFoundError as exn:
a83fa4ed 1994 bail ("ERROR: cannot create target directory “%s”" % outsspec)
e3abcdf0 1995 except NotADirectoryError as exn:
a83fa4ed 1996 bail ("ERROR: target path “%s” is not a directory" % outsspec)
da82bc58 1997
e3abcdf0 1998 else:
89e1073c 1999 outs = deptdcrypt_mk_stream (PDTCRYPT_SINK, outsspec or "-")
da82bc58 2000
a83fa4ed 2001 return True, partial (mode_depdtcrypt, mode, secret, ins, outs)
15d3eefd
PG
2002
2003
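The order in which `parse_argv` looks for a secret (command line first, then `$PDTCRYPT_PASSWORD`, then `$PDTCRYPT_KEY`) can be sketched as a small standalone function. The name `resolve_secret`, the tag values, and passing the environment as a plain mapping are assumptions made for this example.

```python
# Tag values assumed for illustration only.
PDTCRYPT_SECRET_PW, PDTCRYPT_SECRET_KEY = 0, 1


def resolve_secret(cli_secret, environ):
    """Return a ``(tag, value)`` pair or ``None`` if no secret is available.

    *cli_secret* is the pair taken from the command line (or ``None``);
    *environ* is a mapping such as ``os.environ``.
    """
    if cli_secret is not None:          # command line always wins
        return cli_secret

    # Fall back to the environment: password first, then key.
    for var, tag in (("PDTCRYPT_PASSWORD", PDTCRYPT_SECRET_PW),
                     ("PDTCRYPT_KEY",      PDTCRYPT_SECRET_KEY)):
        val = environ.get(var)
        if val is not None:
            return tag, val.strip ()    # strip as the original does

    return None
```

Taking the environment as an argument avoids touching the process environment in tests; a real caller would pass `os.environ`.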
00b3cd10 2004def main (argv):
da82bc58 2005 ok, runner = parse_argv (argv)
f08c604b 2006
da82bc58 2007 if ok is True: return runner ()
15d3eefd 2008
da82bc58 2009 return 1
f08c604b 2010
00b3cd10
PG
2011
2012if __name__ == "__main__":
2013 sys.exit (main (sys.argv))
2014
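`parse_argv` does not run anything itself; it hands back a flag and a `functools.partial` object that `main` then invokes. A minimal sketch of that deferred-runner pattern follows. The names `mode_demo`, `parse` and `run` are hypothetical stand-ins, not functions of this module.

```python
from functools import partial


def mode_demo(secret, source):
    """Hypothetical mode function; returns a process exit status."""
    return 0 if secret and source else 1


def parse(argv):
    """Validate *argv* and bind the arguments without running the mode."""
    if len(argv) < 3:
        return False, None
    return True, partial(mode_demo, argv[1], argv[2])


def run(argv):
    """Counterpart of ``main``: call the runner only if parsing succeeded."""
    ok, runner = parse(argv)
    if ok is True:
        return runner()
    return 1
```

Separating parsing from execution this way lets the argument handling be exercised without performing any I/O.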