implement dump mode for tolerant decryption
[python-delta-tar] / deltatar / crypto.py
#!/usr/bin/env python3

"""
Intra2net 2017

===============================================================================
              crypto -- Encryption Layer for the Deltatar Backup
===============================================================================

Crypto stack:

    - AES-GCM for the symmetric encryption;
    - Scrypt as KDF.

References:

    - NIST Recommendation for Block Cipher Modes of Operation: Galois/Counter
      Mode (GCM) and GMAC
      http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf

    - AES-GCM v1:
      https://cryptome.org/2014/01/aes-gcm-v1.pdf

    - Authentication weaknesses in GCM
      http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf

Trouble with python-cryptography packages: authentication tags can only be
passed in advance: https://github.com/pyca/cryptography/pull/3421

Errors
-------------------------------------------------------------------------------

Errors fall into roughly three categories:

    - Cryptographical errors or invalid data.

        - ``InvalidGCMTag`` (decryption failed on account of an invalid GCM
          tag),
        - ``InvalidIVFixedPart`` (IV fixed part of object not found in list),
        - ``DuplicateIV`` (the IV of an encrypted object already occurred),
        - ``DecryptionError`` (used in CLI decryption for presenting error
          conditions to the user).

    - Incorrect usage of the library.

        - ``InvalidParameter`` (non-conforming user supplied parameter),
        - ``InvalidHeader`` (data passed for reading not parsable into header),
        - ``FormatError`` (cannot handle header or parameter version),
        - ``RuntimeError``.

    - Bad internal state. If one of these is encountered it means that a state
      was reached that shouldn’t occur during normal processing.

        - ``InternalError``,
        - ``Unreachable``.

Also, ``EndOfFile`` is used as a sentinel to communicate that a stream supplied
for reading is exhausted.

Initialization Vectors
-------------------------------------------------------------------------------

Initialization vectors are checked for reuse during the lifetime of a
decryptor. The fixed counters for metadata files cannot be reused and attempts
to do so will cause a ``DuplicateIV`` error. This means the length of objects
encrypted with a metadata counter is capped at 63 GB.

For ordinary, non-metadata payload, there is an optional mode with strict IV
checking that causes a crypto context to fail if an IV encountered or created
was already used for decrypting or encrypting, respectively, an earlier object.
Note that this mode can trigger false positives when decrypting non-linearly,
e. g. when traversing the same object multiple times. Since the crypto context
has no notion of a position in a PDT encrypted archive, this condition must be
sorted out downstream.

Command Line Utility
-------------------------------------------------------------------------------

``crypto.py`` may be invoked as a script for decrypting, validating, and
splitting PDT encrypted files. Consult the usage message for details.

Usage examples:

Decrypt from stdin using the password ‘foo’: ::

    $ crypto.py process foo -i - -o - <some-file.tar.gz.pdtcrypt >some-file.tar.gz

Output verbose information about the encrypted objects in the archive: ::

    $ crypto.py process foo -v -i some-file.tar.gz.pdtcrypt -o /dev/null
    PDT: decrypt from some-file.tar.gz.pdtcrypt
    PDT: decrypt to /dev/null
    PDT: source: file some-file.tar.gz.pdtcrypt
    PDT: sink: file /dev/null
    PDT: 0 hdr
    PDT: · version = 1 : 0100
    PDT: · paramversion = 1 : 0100
    PDT: · nacl : d270 b031 00d1 87e2 c946 610d 7b7f 7e5f
    PDT: · iv : 02ee 3dd7 a963 1eb1 0100 0000
    PDT: · ctsize = 591 : 4f02 0000 0000 0000
    PDT: · tag : 5b2d 6d8b 8f82 4842 12fd 0b10 b6e3 369b
    PDT: 64 decrypt obj no. 1, 591 B
    PDT: · [64] 0% done, read block (591 B of 591 B remaining)
    PDT: · decrypt ciphertext 591 B
    PDT: · decrypt plaintext 591 B
    PDT: 655 finalize


Also, the mode *scrypt* allows deriving encryption keys. To calculate the
encryption key from the password ‘foo’ and the salt of the first object in a
PDT encrypted file: ::

    $ crypto.py scrypt foo -i some-file.pdtcrypt
    {"paramversion": 1, "salt": "Cqzbk48e3peEjzWto8D0yA==", "key": "JH9EkMwaM4x9F5aim5gK/Q=="}

The computed 16 byte key is given in Base64 encoding in the value of ``key``
and can be fed into Python’s ``base64.b64decode()`` to obtain the corresponding
binary representation.

Note that in Scrypt hashing mode, no data integrity checks are performed. If
the wrong password is given, a wrong key will be derived. Whether the password
was indeed correct can only be determined by decrypting. Note that since PDT
archives essentially consist of a stream of independent objects, the salt and
other parameters may change. Thus a key derived using the above method from the
first object doesn’t necessarily apply to any of the subsequent objects.

"""

import base64
import binascii
import bisect
import ctypes
import io
from functools import reduce, partial
import mmap
import os
import struct
import stat
import sys
import time
import types
try:
    import enum34
except ImportError as exn:
    pass

if __name__ == "__main__": ## Work around the import mechanism lest Python’s
    pwd = os.getcwd()      ## preference for local imports causes a cyclical
    ## import (crypto → pylibscrypt → […] → ./tarfile → crypto).
    sys.path = [ p for p in sys.path if p.find ("deltatar") < 0 ]

import pylibscrypt
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
import cryptography


__all__ = [ "hdr_make", "hdr_read", "hdr_fmt", "hdr_fmt_pretty"
          , "scrypt_hashfile"
          , "PDTCRYPT_HDR_SIZE", "AES_GCM_IV_CNT_DATA"
          , "AES_GCM_IV_CNT_INFOFILE", "AES_GCM_IV_CNT_INDEX"
          ]


###############################################################################
## exceptions
###############################################################################

class EndOfFile (Exception):
    """Reached EOF."""
    remainder = 0
    msg       = 0
    def __init__ (self, n=None, msg=None):
        if n is not None:
            self.remainder = n
        self.msg = msg


class InvalidParameter (Exception):
    """Inputs not valid for PDT encryption."""
    pass


class InvalidHeader (Exception):
    """Header not valid."""
    pass


class InvalidGCMTag (Exception):
    """
    The GCM tag calculated during decryption differs from that in the object
    header.
    """
    pass


class InvalidIVFixedPart (Exception):
    """
    IV fixed part not in supplied list: either the backup is corrupt or the
    current object does not belong to it.
    """
    pass


class IVFixedPartError (Exception):
    """
    Error creating a unique IV fixed part: repeated calls to system RNG yielded
    the same sequence of bytes as the last IV used.
    """
    pass


class InvalidFileCounter (Exception):
    """
    When encrypting, an attempted reuse of a dedicated counter (info file,
    index file) was caught.
    """
    pass


class DuplicateIV (Exception):
    """
    During encryption, the current IV fixed part is identical to an already
    existing IV (same prefix and file counter). This indicates tampering or
    programmer error and cannot be recovered from.
    """
    pass


class NonConsecutiveIV (Exception):
    """
    IVs not numbered consecutively. This is a hard error with strict IV
    checking. Precludes random access to the encrypted objects.
    """
    pass


class FormatError (Exception):
    """Unusable parameters in header."""
    pass


class DecryptionError (Exception):
    """Error during decryption with ``crypto.py`` on the command line."""
    pass


class Unreachable (Exception):
    """
    Makeshift __builtin_unreachable(); always a programmer error if
    thrown.
    """
    pass


class InternalError (Exception):
    """Errors not ascribable to bad user inputs or cryptography."""
    pass


###############################################################################
## crypto layer version
###############################################################################

ENCRYPTION_PARAMETERS = \
    { 0: \
        { "kdf": ("dummy", 16)
        , "enc": "passthrough" }
    , 1: \
        { "kdf": ( "scrypt"
                 , { "dkLen"    : 16
                   , "N"        : 1 << 16
                   , "r"        : 8
                   , "p"        : 1
                   , "NaCl_LEN" : 16 })
        , "enc": "aes-gcm" } }

###############################################################################
## constants
###############################################################################

PDTCRYPT_HDR_MAGIC = b"PDTCRYPT"

PDTCRYPT_HDR_SIZE_MAGIC          = 8     #  8
PDTCRYPT_HDR_SIZE_VERSION        = 2     # 10
PDTCRYPT_HDR_SIZE_PARAMVERSION   = 2     # 12
PDTCRYPT_HDR_SIZE_NACL           = 16    # 28
PDTCRYPT_HDR_SIZE_IV             = 12    # 40
PDTCRYPT_HDR_SIZE_CTSIZE         = 8     # 48
PDTCRYPT_HDR_SIZE_TAG            = 16    # 64 GCM auth tag

PDTCRYPT_HDR_SIZE = PDTCRYPT_HDR_SIZE_MAGIC        + PDTCRYPT_HDR_SIZE_VERSION \
                  + PDTCRYPT_HDR_SIZE_PARAMVERSION + PDTCRYPT_HDR_SIZE_NACL    \
                  + PDTCRYPT_HDR_SIZE_IV           + PDTCRYPT_HDR_SIZE_CTSIZE  \
                  + PDTCRYPT_HDR_SIZE_TAG # = 64

# precalculate offsets since Python can’t do constant folding over names
HDR_OFF_VERSION      = PDTCRYPT_HDR_SIZE_MAGIC
HDR_OFF_PARAMVERSION = HDR_OFF_VERSION      + PDTCRYPT_HDR_SIZE_VERSION
HDR_OFF_NACL         = HDR_OFF_PARAMVERSION + PDTCRYPT_HDR_SIZE_PARAMVERSION
HDR_OFF_IV           = HDR_OFF_NACL         + PDTCRYPT_HDR_SIZE_NACL
HDR_OFF_CTSIZE       = HDR_OFF_IV           + PDTCRYPT_HDR_SIZE_IV
HDR_OFF_TAG          = HDR_OFF_CTSIZE       + PDTCRYPT_HDR_SIZE_CTSIZE

FMT_UINT16_LE = "<H"
FMT_UINT64_LE = "<Q"
FMT_I2N_IV    = "<8sL"   # 8 random bytes ‖ 32 bit counter
FMT_I2N_HDR   = ("<"     # little endian, no padding
                 "8s"    # magic
                 "H"     # version
                 "H"     # paramversion
                 "16s"   # sodium chloride
                 "12s"   # iv
                 "Q"     # size
                 "16s")  # GCM tag

# aes+gcm
AES_GCM_MAX_SIZE              = (1 << 36) - (1 << 5) # 2^39 - 2^8 bit ≅ 64 GB
PDTCRYPT_MAX_OBJ_SIZE_DEFAULT = 63 * (1 << 30)       #                  63 GB
PDTCRYPT_MAX_OBJ_SIZE         = PDTCRYPT_MAX_OBJ_SIZE_DEFAULT

# index and info files are written on the fly while encrypting so their
# counters must be available in advance
AES_GCM_IV_CNT_INFOFILE     = 1 # constant
AES_GCM_IV_CNT_INDEX        = AES_GCM_IV_CNT_INFOFILE + 1
AES_GCM_IV_CNT_DATA         = AES_GCM_IV_CNT_INDEX    + 1 # also for multivolume
AES_GCM_IV_CNT_MAX_DEFAULT  = 0xffFFffFF
AES_GCM_IV_CNT_MAX          = AES_GCM_IV_CNT_MAX_DEFAULT

# IV structure and generation
PDTCRYPT_IV_GEN_MAX_RETRIES = 10 # ×
PDTCRYPT_IV_FIXEDPART_SIZE  =  8 # B
PDTCRYPT_IV_COUNTER_SIZE    =  4 # B

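# Illustrative sketch, not part of the original module: a quick arithmetic
# check of the size limits above. It assumes the GCM plaintext limit of
# 2^39 - 2^8 bits per invocation given in NIST SP 800-38D. All ``_sketch_*``
# names are hypothetical and exist only for this example.
_sketch_gcm_max_bits  = (1 << 39) - (1 << 8)
_sketch_gcm_max_bytes = _sketch_gcm_max_bits // 8   # equals (1 << 36) - (1 << 5)
_sketch_obj_max_bytes = 63 * (1 << 30)              # default object size cap

def _sketch_fits_in_one_object (size):
    """Return True if *size* bytes stay within both limits."""
    return 0 <= size <= min (_sketch_obj_max_bytes, _sketch_gcm_max_bytes)

# The 63 GB default cap lies below the GCM limit, so the cap is the effective
# bound: ``_sketch_fits_in_one_object (64 * (1 << 30))`` is rejected.
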
###############################################################################
## header, trailer
###############################################################################
#
# Interface:
#
#    struct hdrinfo
#      { version      : u16
#      , paramversion : u16
#      , nacl         : [u8; 16]
#      , iv           : [u8; 12]
#      , ctsize       : usize
#      , tag          : [u8; 16] }
#
#    fn hdr_read (f : handle) -> hdrinfo;
#    fn hdr_make (f : handle, h : hdrinfo) -> IOResult<usize>;
#    fn hdr_fmt (h : hdrinfo) -> String;
#

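# Illustrative sketch, not part of the original module: how the 64 byte
# header described above maps onto a ``struct`` format string. The names
# ``_SKETCH_HDR_FMT`` and ``_sketch_hdr_roundtrip`` are hypothetical, and the
# field values passed in are up to the caller.
import struct

_SKETCH_HDR_FMT = "<8sHH16s12sQ16s" # magic, version, paramversion, salt, iv, ctsize, tag

def _sketch_hdr_roundtrip (version, paramversion, nacl, iv, ctsize, tag):
    """Pack the fields into a raw header, then unpack them again."""
    raw = struct.pack (_SKETCH_HDR_FMT, b"PDTCRYPT",
                       version, paramversion, nacl, iv, ctsize, tag)
    return raw, struct.unpack (_SKETCH_HDR_FMT, raw)

# Usage: ``raw, fields = _sketch_hdr_roundtrip (1, 1, bytes (16), bytes (12),
# 591, bytes (16))`` yields a header of exactly 64 bytes whose ``ctsize``
# field sits at offset 40 in little endian byte order.
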
def hdr_read (data):
    """
    Read bytes as header structure.

    If the input could not be interpreted as a header, fail with
    ``InvalidHeader``.
    """

    try:
        mag, version, paramversion, nacl, iv, ctsize, tag = \
            struct.unpack (FMT_I2N_HDR, data)
    except Exception as exn:
        raise InvalidHeader ("error unpacking header from [%r]: %s"
                             % (binascii.hexlify (data), str (exn)))

    if mag != PDTCRYPT_HDR_MAGIC:
        raise InvalidHeader ("bad magic in header: expected [%s], got [%s]"
                             % (PDTCRYPT_HDR_MAGIC, mag))

    return \
        {      "version" : version
        , "paramversion" : paramversion
        ,         "nacl" : nacl
        ,           "iv" : iv
        ,       "ctsize" : ctsize
        ,          "tag" : tag
        }


def hdr_read_stream (instr):
    """
    Read header from stream at the current position.

    Fail with ``InvalidHeader`` if insufficient bytes were read from the
    stream, or if the content could not be interpreted as a header.
    """
    data = instr.read(PDTCRYPT_HDR_SIZE)
    ldata = len (data)
    if ldata == 0:
        raise EndOfFile
    elif ldata != PDTCRYPT_HDR_SIZE:
        raise InvalidHeader ("hdr_read_stream: expected %d B, received %d B"
                             % (PDTCRYPT_HDR_SIZE, ldata))
    return hdr_read (data)


def hdr_from_params (version, paramversion, nacl, iv, ctsize, tag):
    """
    Assemble the necessary values into a PDTCRYPT header.

    :type      version: int to fit   uint16_t
    :type paramversion: int to fit   uint16_t
    :type         nacl: bytes to fit uint8_t[16]
    :type           iv: bytes to fit uint8_t[12]
    :type       ctsize: int to fit   uint64_t
    :type          tag: bytes to fit uint8_t[16]
    """
    buf  = bytearray (PDTCRYPT_HDR_SIZE)
    bufv = memoryview (buf)

    try:
        struct.pack_into (FMT_I2N_HDR, bufv, 0,
                          PDTCRYPT_HDR_MAGIC,
                          version, paramversion, nacl, iv, ctsize, tag)
    except Exception as exn:
        return False, "error assembling header: %s" % str (exn)

    return True, bytes (buf)


def hdr_make_dummy (s):
    """
    Create a header sized block of bytes initialized to a value derived from a
    string. Used to verify we’ve jumped back correctly to the actual position
    of the object header.
    """
    c = reduce (lambda a, c: a + ord(c), s, 0) % 0xFF
    return bytes (bytearray (struct.pack ("B", c)) * PDTCRYPT_HDR_SIZE)


def hdr_make (hdr):
    """
    Assemble a header from the given header structure.
    """
    return hdr_from_params (version=hdr.get("version"),
                            paramversion=hdr.get("paramversion"),
                            nacl=hdr.get("nacl"), iv=hdr.get("iv"),
                            ctsize=hdr.get("ctsize"), tag=hdr.get("tag"))


HDR_FMT = "I2n_header { version: %d, paramversion: %d, nacl: %s[%d]," \
                      " iv: %s[%d], ctsize: %d, tag: %s[%d] }"

def hdr_fmt (h):
    """Format a header structure into readable output."""
    return HDR_FMT % (h["version"], h["paramversion"],
                      binascii.hexlify (h["nacl"]), len(h["nacl"]),
                      binascii.hexlify (h["iv"]), len(h["iv"]),
                      h["ctsize"],
                      binascii.hexlify (h["tag"]), len(h["tag"]))


def hex_spaced_of_bytes (b):
    """Format bytes object, hexdump style."""
    return " ".join ([ "%.2x%.2x" % (c1, c2)
                       for c1, c2 in zip (b[0::2], b[1::2]) ]) \
           + (len (b) | 1 == len (b) and " %.2x" % b[-1] or "") # odd lengths


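# Illustrative sketch, not part of the original module: an alternative way to
# group a byte string into four digit hex words, as used in the verbose
# output. ``_sketch_hex_spaced`` is a hypothetical name. Unlike the function
# above it emits no leading space for a one byte input.
def _sketch_hex_spaced (b):
    """Return the hex digits of *b* in space separated groups of four."""
    h = b.hex ()
    return " ".join (h [i:i+4] for i in range (0, len (h), 4))

# Usage: the little endian 16 bit encoding of the number one, ``b"\x01\x00"``,
# is rendered as ``0100``.
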
def hdr_iv_counter (h):
    """Extract the variable part of the IV of the given header."""
    _fixed, cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
    return cnt


def hdr_iv_fixed (h):
    """Extract the fixed part of the IV of the given header."""
    fixed, _cnt = struct.unpack (FMT_I2N_IV, h ["iv"])
    return fixed


hdr_dump = hex_spaced_of_bytes


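# Illustrative sketch, not part of the original module: composing and
# splitting a 12 byte IV from its 8 byte fixed part and its 32 bit little
# endian counter, using the same ``"<8sL"`` layout as ``FMT_I2N_IV``. The
# ``_sketch_*`` names are hypothetical.
import struct

def _sketch_iv_make (fixed, counter):
    """Concatenate an 8 byte fixed part and a 32 bit counter."""
    return struct.pack ("<8sL", fixed, counter)

def _sketch_iv_split (iv):
    """Return the pair (fixed part, counter) of a 12 byte IV."""
    return struct.unpack ("<8sL", iv)

# Usage: a fixed part of ``02ee3dd7a9631eb1`` combined with counter 1 gives
# the IV ``02ee 3dd7 a963 1eb1 0100 0000``.
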
HDR_FMT_PRETTY = \
"""version = %-4d : %s
paramversion = %-4d : %s
nacl : %s
iv : %s
ctsize = %-20d : %s
tag : %s
"""

def hdr_fmt_pretty (h):
    """
    Format header structure into multi-line representation of its contents and
    their raw representation. (Omit the implicit “PDTCRYPT” magic bytes that
    precede every header.)
    """
    return HDR_FMT_PRETTY \
                % (h["version"],
                   hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["version"])),
                   h["paramversion"],
                   hex_spaced_of_bytes (struct.pack (FMT_UINT16_LE, h["paramversion"])),
                   hex_spaced_of_bytes (h["nacl"]),
                   hex_spaced_of_bytes (h["iv"]),
                   h["ctsize"],
                   hex_spaced_of_bytes (struct.pack (FMT_UINT64_LE, h["ctsize"])),
                   hex_spaced_of_bytes (h["tag"]))

IV_FMT = "((f %s) (c %d))"

def iv_fmt (iv):
    """Format the two components of an IV in a readable fashion."""
    fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
    return IV_FMT % (binascii.hexlify (fixed), cnt)


###############################################################################
## restoration
###############################################################################

class Location (object):
    n = 0
    offset = 0

def restore_loc_fmt (loc):
    return "%d off:%d" \
        % (loc.n, loc.offset)

def locate_hdr_candidates (fd):
    """
    Walk over instances of the magic string in the payload, collecting their
    positions. If the offset of the first found instance is not zero, the file
    begins with leading garbage.

    :return:    The list of offsets in the file.
    """
    cands = []

    mm = mmap.mmap(fd, 0, mmap.MAP_SHARED, mmap.PROT_READ)
    pos = 0
    while True:
        pos = mm.find (PDTCRYPT_HDR_MAGIC, pos)
        if pos == -1:
            break
        cands.append (pos)
        pos += 1

    return cands


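# Illustrative sketch, not part of the original module: the same candidate
# scan applied to an in-memory ``bytes`` object instead of a memory mapped
# file descriptor. ``_sketch_find_all`` is a hypothetical name.
def _sketch_find_all (haystack, needle=b"PDTCRYPT"):
    """Return all offsets at which *needle* occurs in *haystack*."""
    offsets = []
    pos = haystack.find (needle)
    while pos != -1:
        offsets.append (pos)
        # resume one byte further so that overlapping hits are not skipped
        pos = haystack.find (needle, pos + 1)
    return offsets

# A first offset other than zero indicates leading garbage, as in
# ``_sketch_find_all (b"junk" + b"PDTCRYPT")``.
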
HDR_CAND_GOOD  = 0 # header marks begin of valid object
HDR_CAND_FISHY = 1 # inconclusive (tag mismatch, obj overlap etc.)
HDR_CAND_JUNK  = 2 # not a header / object unreadable


def inspect_hdr (fd, off):
    """
    Attempt to parse a header in *fd* at position *off*.

    Returns a verdict about the quality of that header plus the parsed header
    when readable.
    """

    _ = os.lseek (fd, off, os.SEEK_SET)

    if os.lseek (fd, 0, os.SEEK_CUR) != off:
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: %d → dismissed (lseek() past EOF)" % off)
        return HDR_CAND_JUNK, None

    raw = os.read (fd, PDTCRYPT_HDR_SIZE)
    if len (raw) != PDTCRYPT_HDR_SIZE:
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: %d → dismissed (EOF inside header)" % off)
        return HDR_CAND_JUNK, None

    try:
        hdr = hdr_read (raw)
    except InvalidHeader as exn:
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: %d → dismissed (invalid: [%s])" % (off, str (exn)))
        return HDR_CAND_JUNK, None

    obj0 = off + PDTCRYPT_HDR_SIZE
    objX = obj0 + hdr ["ctsize"]

    eof = os.lseek (fd, 0, os.SEEK_END)
    if eof < objX:
        if PDTCRYPT_VERBOSE is True:
            noise ("PDT: %d → EOF inside object (%d≤%d≤%d); adjusting size to "
                   "%d" % (off, obj0, eof, objX, (eof - obj0)))
        # try reading up to the end
        hdr ["ctsize"] = eof - obj0
        return HDR_CAND_FISHY, hdr

    return HDR_CAND_GOOD, hdr


def try_decrypt (ifd, off, hdr, secret, ofd=-1):
    """
    Attempt to decrypt the object in the (seekable) descriptor *ifd* starting
    at *off* using the metadata in *hdr* and *secret*. An output fd can be
    specified with *ofd*; if it is *-1* – the default –, the decrypted payload
    will be discarded.

    Always creates a fresh decryptor, so validation steps across objects don’t
    apply.
    """
    ctleft = hdr ["ctsize"]
    pos    = off

    ks = secret [0]
    if ks == PDTCRYPT_SECRET_PW:
        decr = Decrypt (password=secret [1])
    elif ks == PDTCRYPT_SECRET_KEY:
        key = binascii.unhexlify (secret [1])
        decr = Decrypt (key=key)
    else:
        raise RuntimeError

    decr.next (hdr)

    try:
        os.lseek (ifd, pos, os.SEEK_SET)
        while ctleft > 0:
            cnksiz = min (ctleft, PDTCRYPT_BLOCKSIZE)
            cnk    = os.read (ifd, cnksiz)
            ctleft -= cnksiz
            pos    += cnksiz
            pt     = decr.process (cnk)
            if ofd != -1:
                os.write (ofd, pt)
        pt = decr.done ()
        if len (pt) > 0 and ofd != -1:
            os.write (ofd, pt)

    except Exception as exn:
        noise ("PDT: error decrypting object %d–%d@%d, %d B remaining [%s]"
               % (off, off + hdr ["ctsize"], pos, ctleft, exn))
        raise

    return pos - off


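# Illustrative sketch, not part of the original module: the block-wise read
# loop from above in isolation, operating on any object with a ``.read()``
# method. ``_sketch_read_exactly`` is a hypothetical name. As an assumption
# that departs from ``try_decrypt()``, it counts the bytes actually returned
# and raises ``EOFError`` on a short stream.
import io

def _sketch_read_exactly (stream, size, blocksize=4096):
    """Yield chunks from *stream* until *size* bytes have been read."""
    left = size
    while left > 0:
        cnk = stream.read (min (left, blocksize))
        if len (cnk) == 0:
            raise EOFError ("stream ended with %d B outstanding" % left)
        left -= len (cnk)
        yield cnk

# Usage: ``for cnk in _sketch_read_exactly (io.BytesIO (data), len (data)):``
# hands each chunk to the consumer, e. g. a decryptor’s update method.
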
###############################################################################
## passthrough / null encryption
###############################################################################

class PassthroughCipher (object):

    tag = struct.pack ("<QQ", 0, 0)

    def __init__ (self) : pass

    def update (self, b) : return b

    def finalize (self) : return b""

    def finalize_with_tag (self, _) : return b""

###############################################################################
## convenience wrapper
###############################################################################


def kdf_dummy (klen, password, _nacl):
    """
    Fake KDF for testing purposes that is called when parameter version zero is
    encountered.
    """
    # encode first so the repetition count is based on the byte length
    if isinstance (password, bytes) is False:
        password = password.encode ()
    q, r = divmod (klen, len (password))
    return password * q + password [:r], b""


SCRYPT_KEY_MEMO = { } # static because needed for both the info file and the archive


def kdf_scrypt (params, password, nacl):
    """
    Wrapper for the Scrypt KDF, corresponds to parameter version one. The
    computation result is memoized based on the inputs to facilitate spawning
    multiple encryption contexts.
    """
    N = params["N"]
    r = params["r"]
    p = params["p"]
    dkLen = params["dkLen"]

    if nacl is None:
        nacl = os.urandom (params["NaCl_LEN"])

    key_parms = (password, nacl, N, r, p, dkLen)
    global SCRYPT_KEY_MEMO
    if key_parms not in SCRYPT_KEY_MEMO:
        SCRYPT_KEY_MEMO [key_parms] = \
            pylibscrypt.scrypt (password, nacl, N, r, p, dkLen)
    return SCRYPT_KEY_MEMO [key_parms], nacl


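# Illustrative sketch, not part of the original module: the memoization
# pattern from above, with the standard library’s ``hashlib.scrypt()`` swapped
# in for ``pylibscrypt`` so the example has no third party dependency. The
# ``_sketch_*`` and ``_SKETCH_*`` names and the ``maxmem`` estimate are
# assumptions made for this example only.
import hashlib

_SKETCH_KEY_MEMO = { }

def _sketch_kdf_scrypt (password, salt, n=1 << 16, r=8, p=1, dklen=16):
    """Derive a key with Scrypt, caching the result per set of inputs."""
    memo_key = (password, salt, n, r, p, dklen)
    if memo_key not in _SKETCH_KEY_MEMO:
        _SKETCH_KEY_MEMO [memo_key] = hashlib.scrypt (
            password, salt=salt, n=n, r=r, p=p, dklen=dklen,
            # Scrypt needs about 128 * n * r bytes; leave generous headroom
            maxmem=256 * n * r * p + (1 << 20))
    return _SKETCH_KEY_MEMO [memo_key]

# A second call with identical inputs returns the cached key without running
# the expensive derivation again; any change to password, salt or cost
# parameters results in a new computation.
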
def kdf_by_version (paramversion=None, defs=None):
    """
    Pick the KDF handler corresponding to the parameter version or the
    definition set.

    :rtype: function (password : bytes, nacl : bytes) -> (bytes, bytes)
    """
    if paramversion is not None:
        defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
    if defs is None:
        raise InvalidParameter ("no encryption parameters for version %r"
                                % paramversion)
    (kdf, params) = defs["kdf"]
    fn = None
    if kdf == "scrypt" : fn = kdf_scrypt
    if kdf == "dummy"  : fn = kdf_dummy
    if fn is None:
        raise ValueError ("key derivation method %r unknown" % kdf)
    return partial (fn, params)


###############################################################################
## SCRYPT hashing
###############################################################################

def scrypt_hashsource (pw, ins):
    """
    Calculate the SCRYPT hash from the password and the information contained
    in the first header found in ``ins``.

    This does not validate whether the first object is encrypted correctly.
    """
    if isinstance (pw, str) is True:
        pw = str.encode (pw)
    elif isinstance (pw, bytes) is False:
        raise InvalidParameter ("password must be a string, not %s"
                                % type (pw))
    if isinstance (ins, io.BufferedReader) is False and \
            isinstance (ins, io.FileIO) is False:
        raise InvalidParameter ("file to hash must be opened in “binary” mode")
    hdr = None
    try:
        hdr = hdr_read_stream (ins)
    except EndOfFile as exn:
        noise ("PDT: malformed input: end of file reading first object header")
        noise ("PDT:")
        return 1

    nacl = hdr ["nacl"]
    pver = hdr ["paramversion"]
    if PDTCRYPT_VERBOSE is True:
        noise ("PDT: salt of first object          : %s" % binascii.hexlify (nacl))
        noise ("PDT: parameter version of archive  : %d" % pver)

    try:
        defs = ENCRYPTION_PARAMETERS.get(pver, None)
        kdfname, params = defs ["kdf"]
        if kdfname != "scrypt":
            noise ("PDT: input is not an SCRYPT archive")
            noise ("")
            return 1
        kdf = kdf_by_version (None, defs)
    except ValueError as exn:
        noise ("PDT: object has unknown parameter version %d" % pver)
        return 1 # no KDF available; bail out like the other error paths

    hsh, _void = kdf (pw, nacl)

    return hsh, nacl, hdr ["version"], pver


def scrypt_hashfile (pw, fname):
    """
    Calculate the SCRYPT hash from the password and the information contained
    in the first header found in the given file. The header is read only at
    offset zero.
    """
    with deptdcrypt_mk_stream (PDTCRYPT_SOURCE, fname or "-") as ins:
        hsh, _void, _void, _void = scrypt_hashsource (pw, ins)
        return hsh


###############################################################################
## AES-GCM context
###############################################################################

class Crypto (object):
    """
    Encryption context to remain alive throughout an entire tarfile pass.
    """
    enc          = None
    nacl         = None
    key          = None
    cnt          = None  # file counter (uint32_t != 0)
    iv           = None  # current IV
    fixed        = None  # accu for 64 bit fixed parts of IV
    used_ivs     = None  # tracks IVs
    strict_ivs   = False # if True, panic on duplicate object IV
    password     = None
    paramversion = None
    stats = { "in"  : 0
            , "out" : 0
            , "obj" : 0 }

    ctsize = -1
    ptsize = -1
    info_counter_used  = False
    index_counter_used = False

    def __init__ (self, *al, **akv):
        self.used_ivs = set ()
        self.set_parameters (*al, **akv)


    def next_fixed (self):
        # NOP for decryption
        pass


    def set_object_counter (self, cnt=None):
        """
        Safely set the internal counter of encrypted objects. Numerous
        constraints apply:

        The same counter may not be reused in combination with one IV fixed
        part. This is validated elsewhere in the IV handling.

        Counter zero is invalid. The first two counters are reserved for
        metadata. The implementation does not allow for splitting metadata
        files over multiple encrypted objects. (This would be possible by
        assigning new fixed parts.) Thus in a Deltatar backup there is at most
        one object each with a counter value of one and of two. On creation of
        a context, the initial counter may be chosen. The globals
        ``AES_GCM_IV_CNT_INFOFILE`` and ``AES_GCM_IV_CNT_INDEX`` can be used to
        request one of the reserved values. If one of these values has been
        used, any further attempt of setting the counter to that value will
        be rejected with an ``InvalidFileCounter`` exception.

        Out of bounds values (i. e. below one and more than the maximum of 2³²)
        cause an ``InvalidParameter`` exception to be thrown.
        """
        if cnt is None:
            self.cnt = AES_GCM_IV_CNT_DATA
            return
        if cnt < 1 or cnt > AES_GCM_IV_CNT_MAX + 1:
            raise InvalidParameter ("invalid counter value %d requested: "
                                    "acceptable values are from 1 to %d"
                                    % (cnt, AES_GCM_IV_CNT_MAX))
        if cnt == AES_GCM_IV_CNT_INFOFILE:
            if self.info_counter_used is True:
                raise InvalidFileCounter ("attempted to reuse info file "
                                          "counter %d: must be unique" % cnt)
            self.info_counter_used = True
        elif cnt == AES_GCM_IV_CNT_INDEX:
            if self.index_counter_used is True:
                raise InvalidFileCounter ("attempted to reuse index file "
                                          "counter %d: must be unique" % cnt)
            self.index_counter_used = True
        if cnt <= AES_GCM_IV_CNT_MAX:
            self.cnt = cnt
            return
        # cnt == AES_GCM_IV_CNT_MAX + 1 → wrap
        self.cnt = AES_GCM_IV_CNT_DATA
        self.next_fixed ()


    def set_parameters (self, password=None, key=None, paramversion=None,
                        nacl=None, counter=None, strict_ivs=False):
        """
        Configure the internal state of a crypto context. Not intended for
        external use.
        """
        self.next_fixed ()
        self.set_object_counter (counter)
        self.strict_ivs = strict_ivs

        if paramversion is not None:
            self.paramversion = paramversion

        if key is not None:
            self.key, self.nacl = key, nacl
            return

        if password is not None:
            if isinstance (password, bytes) is False:
                password = str.encode (password)
            self.password = password
            if paramversion is None and nacl is None:
                # postpone key setup until first header is available
                return
            kdf = kdf_by_version (paramversion)
            if kdf is not None:
                self.key, self.nacl = kdf (password, nacl)


    def process (self, buf):
        """
        Encrypt / decrypt a buffer. Invokes the ``.update()`` method on the
        wrapped encryptor or decryptor, respectively.

        The Cryptography exception ``AlreadyFinalized`` is translated to an
        ``InternalError`` at this point. It may occur in sound code when the GC
        closes an encrypting stream after an error. Everywhere else it must be
        treated as a bug.
        """
        if self.enc is None:
            raise RuntimeError ("process: context not initialized")
        self.stats ["in"] += len (buf)
        try:
            out = self.enc.update (buf)
        except cryptography.exceptions.AlreadyFinalized as exn:
            raise InternalError (exn)
        self.stats ["out"] += len (out)
        return out


30019abf 912 def next (self, password, paramversion, nacl, iv):
704ceaa5
PG
913 """
914 Prepare for encrypting another object: Reset the data counters and
915 change the configuration in case one of the variable parameters differs
916 from the last object. Also check the IV for duplicates and error out
917 if strict checking was requested.
918 """
fa47412e
PG
919 self.ctsize = 0
920 self.ptsize = 0
921 self.stats ["obj"] += 1
30019abf
PG
922
923 self.check_duplicate_iv (iv)
924
6178061e
PG
925 if ( self.paramversion != paramversion
926 or self.password != password
927 or self.nacl != nacl):
1f3fd7b0 928 self.set_parameters (password=password, paramversion=paramversion,
30019abf
PG
929 nacl=nacl, strict_ivs=self.strict_ivs)
930
931
932 def check_duplicate_iv (self, iv):
704ceaa5
PG
933 """
934 Add an IV (the 12-byte representation as in the header) to the set of used
935 IVs. With strict checking enabled, a repeated IV raises ``DuplicateIV``;
936 depending on the context, this may indicate a serious error (IV reuse).
937 """
30019abf
PG
938 if self.strict_ivs is True and iv in self.used_ivs:
939 raise DuplicateIV ("iv %s was reused" % iv_fmt (iv))
940 # iv has not been used before; add to collection
941 self.used_ivs.add (iv)
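The duplicate check above can be tried in isolation. The following is a minimal sketch, assuming a stand-in ``DuplicateIV`` exception and a hypothetical ``IVTracker`` class in place of the real ``Crypto`` context; the IV is formatted with ``bytes.hex()`` instead of the module's ``iv_fmt`` helper.

```python
class DuplicateIV(Exception):
    """Stand-in for the exception of the same name in the module."""


class IVTracker:
    """Hypothetical helper that only models the IV bookkeeping."""

    def __init__(self, strict_ivs=False):
        self.strict_ivs = strict_ivs
        self.used_ivs = set()

    def check_duplicate_iv(self, iv):
        # In strict mode a repeated IV is an error; otherwise it is only
        # recorded so that counters and statistics stay consistent.
        if self.strict_ivs and iv in self.used_ivs:
            raise DuplicateIV("iv %s was reused" % iv.hex())
        self.used_ivs.add(iv)
```

With ``strict_ivs=False`` the same IV may be passed repeatedly without an error, which is the behavior wanted when decrypting objects out of order.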
fa47412e
PG
942
943
633b18a9 944 def counters (self):
704ceaa5
PG
945 """
946 Access the data counters.
947 """
633b18a9
PG
948 return self.stats ["obj"], self.stats ["in"], self.stats ["out"]
949
950
8de91f4f
PG
951 def drop (self):
952 """
953 Clear the current context regardless of its finalization state. The
954 next operation must be ``.next()``.
955 """
956 self.enc = None
957
958
39accaaa
PG
959class Encrypt (Crypto):
960
48db09ba
PG
961 lastinfo = None
962 version = None
72a42219 963 paramenc = None
50710d86 964
1f3fd7b0 965 def __init__ (self, version, paramversion, password=None, key=None, nacl=None,
30019abf 966 counter=AES_GCM_IV_CNT_DATA, strict_ivs=True):
704ceaa5
PG
967 """
968 The ctor will throw immediately if one of the parameters does not conform
969 to our expectations.
970
972 :type version: int to fit uint16_t
973 :type paramversion: int to fit uint16_t
974 :param password: mutually exclusive with ``key``
975 :type password: str
976 :param key: mutually exclusive with ``password``
977 :type key: bytes
978 :type nacl: bytes
979 :param counter: initial object counter; the values
980 ``AES_GCM_IV_CNT_INFOFILE`` and
981 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
982 and cannot be reused even with different fixed parts.
983 :type strict_ivs: bool
984 """
1f3fd7b0
PG
985 if password is None and key is None \
986 or password is not None and key is not None :
987 raise InvalidParameter ("__init__: need either key or password")
988
989 if key is not None:
990 if isinstance (key, bytes) is False:
991 raise InvalidParameter ("__init__: key must be provided as "
992 "bytes, not %s" % type (key))
993 if nacl is None:
994 raise InvalidParameter ("__init__: salt must be provided along "
995 "with encryption key")
996 else: # password, no key
997 if isinstance (password, str) is False:
998 raise InvalidParameter ("__init__: password must be a string, not %s"
999 % type (password))
1000 if len (password) == 0:
1001 raise InvalidParameter ("__init__: empty passwords are not "
1002 "permitted for PDT encrypted files")
36b9932a
PG
1003 # version
1004 if isinstance (version, int) is False:
1005 raise InvalidParameter ("__init__: version number must be an "
1006 "integer, not %s" % type (version))
1007 if version < 0:
1008 raise InvalidParameter ("__init__: version number must be a "
1009 "nonnegative integer, not %d" % version)
1010 # paramversion
1011 if isinstance (paramversion, int) is False:
1012 raise InvalidParameter ("__init__: crypto parameter version number "
1013 "must be an integer, not %s"
1014 % type (paramversion))
1015 if paramversion < 0:
1016 raise InvalidParameter ("__init__: crypto parameter version number "
1017 "must be a nonnegative integer, not %d"
1018 % paramversion)
1019 # salt
1020 if nacl is not None:
1021 if isinstance (nacl, bytes) is False:
1022 raise InvalidParameter ("__init__: salt given, but of type %s "
1023 "instead of bytes" % type (nacl))
1024 # salt length would depend on the actual encryption so it can’t be
1025 # validated at this point
b12110dd 1026 self.fixed = [ ]
48db09ba
PG
1027 self.version = version
1028 self.paramenc = ENCRYPTION_PARAMETERS.get (paramversion) ["enc"]
72a42219 1029
1f3fd7b0 1030 super().__init__ (password, key, paramversion, nacl, counter=counter,
30019abf 1031 strict_ivs=strict_ivs)
a393d9cb
PG
1032
1033
be124bca
PG
1034 def next_fixed (self, retries=PDTCRYPT_IV_GEN_MAX_RETRIES):
1035 """
1036 Generate the next IV fixed part by reading eight bytes from
1037 ``/dev/urandom``. The buffer so obtained is tested against the fixed
1038 parts used so far to prevent accidental reuse of IVs. After a
1039 configurable number of attempts to create a unique fixed part, it will
1040 refuse to continue with an ``IVFixedPartError``. This is unlikely to
1041 ever happen on a normal system but may detect an issue with the random
1042 generator.
1043
1044 The list of fixed parts that were used by the context at hand can be
1045 accessed through the ``.fixed`` list. Its last element is the fixed
1046 part currently in use.
1047 """
1048 i = 0
1049 while i < retries:
1050 fp = os.urandom (PDTCRYPT_IV_FIXEDPART_SIZE)
1051 if fp not in self.fixed:
1052 self.fixed.append (fp)
1053 return
1054 i += 1
1055 raise IVFixedPartError ("error obtaining a unique IV fixed part from "
1056 "/dev/urandom; giving up after %d tries" % i)
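The retry loop above can be sketched outside the class as follows. This is an illustration only: the size of eight bytes is taken from the docstring, while the retry bound, the free function and the injectable ``rand`` parameter are assumptions made to keep the example testable.

```python
import os

FIXEDPART_SIZE = 8  # eight bytes, as stated in the docstring
MAX_RETRIES = 2     # hypothetical small bound, for illustration only


class IVFixedPartError(Exception):
    """Stand-in for the exception of the same name in the module."""


def next_fixed(seen, rand=os.urandom, retries=MAX_RETRIES):
    """Append a fresh fixed part to ``seen`` and return it."""
    for _ in range(retries):
        fp = rand(FIXEDPART_SIZE)
        if fp not in seen:
            seen.append(fp)
            return fp
    raise IVFixedPartError("no unique IV fixed part after %d tries" % retries)
```

Passing a broken generator such as ``lambda n: bytes(n)`` makes the second call fail, which is the condition the retry limit is meant to detect.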
1057
1058
a393d9cb 1059 def iv_make (self):
704ceaa5
PG
1060 """
1061 Construct a 12-byte IV from the current fixed part and the object
1062 counter.
1063 """
b12110dd 1064 return struct.pack(FMT_I2N_IV, self.fixed [-1], self.cnt)
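As a sketch of the IV layout: the definition of ``FMT_I2N_IV`` is not shown in this section, so the format string below is an assumption (an 8-byte fixed part followed by a 32-bit little-endian counter, 12 bytes in total).

```python
import struct

# Assumption: 8-byte fixed part, then an unsigned 32-bit little-endian counter.
FMT_IV = "<8sL"


def iv_make(fixed, cnt):
    """Pack fixed part and object counter into a 12-byte IV."""
    return struct.pack(FMT_IV, fixed, cnt)


def iv_split(iv):
    """Inverse of iv_make(): return the pair (fixed, cnt)."""
    return struct.unpack(FMT_IV, iv)
```

Splitting an IV this way is what the decryption side does when it checks the fixed part and the counter separately.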
a393d9cb
PG
1065
1066
cb7a3911 1067 def next (self, filename=None, counter=None):
704ceaa5
PG
1068 """
1069 Prepare for encrypting the next incoming object. Update the counter
1070 and put together the IV, possibly changing prefixes. Then create the
1071 new encryptor.
1072
1073 The argument ``counter`` can be used to specify a file counter for this
1074 object. Unless it is one of the reserved values, the counter of
1075 subsequent objects will be computed from this one.
1076
1077 If this is the first object in a series, ``filename`` is required,
1078 otherwise it is reused if not present. The value is used to derive a
1079 header sized placeholder to use until after encryption when all the
1080 inputs to construct the final header are available. This is then
1081 matched in ``.done()`` against the value found at the position of the
1082 header. The motivation for this extra check is primarily to assist
1083 format debugging: It makes stray headers easy to spot in malformed
1084 PDTCRYPT files.
1085 """
cb7a3911
PG
1086 if filename is None:
1087 if self.lastinfo is None:
1088 raise InvalidParameter ("next: filename is mandatory for "
1089 "first object")
1090 filename, _dummy = self.lastinfo
1091 else:
1092 if isinstance (filename, str) is False:
1093 raise InvalidParameter ("next: filename must be a string, not %s"
1094 % type (filename))
3031b7ae
PG
1095 if counter is not None:
1096 if isinstance (counter, int) is False:
1097 raise InvalidParameter ("next: the supplied counter is of "
1098 "invalid type %s; please pass an "
1099 "integer instead" % type (counter))
1100 self.set_object_counter (counter)
fac2cfe1 1101
50710d86 1102 self.iv = self.iv_make ()
72a42219 1103 if self.paramenc == "aes-gcm":
6178061e
PG
1104 self.enc = Cipher \
1105 ( algorithms.AES (self.key)
1106 , modes.GCM (self.iv)
1107 , backend = default_backend ()) \
1108 .encryptor ()
72a42219 1109 elif self.paramenc == "passthrough":
6178061e
PG
1110 self.enc = PassthroughCipher ()
1111 else:
b12110dd
PG
1112 raise InvalidParameter ("next: parameter version %d not known"
1113 % self.paramversion)
48db09ba
PG
1114 hdrdum = hdr_make_dummy (filename)
1115 self.lastinfo = (filename, hdrdum)
30019abf 1116 super().next (self.password, self.paramversion, self.nacl, self.iv)
72a42219 1117
3031b7ae 1118 self.set_object_counter (self.cnt + 1)
48db09ba 1119 return hdrdum
a393d9cb 1120
a393d9cb 1121
cd77dadb 1122 def done (self, cmpdata):
704ceaa5
PG
1123 """
1124 Complete encryption of an object. After this has been called, attempts
1125 to encrypt further data will cause an error until ``.next()`` is
1126 invoked properly.
1127
1128 Returns the remaining ciphertext, a 64-byte buffer containing the object
1129 header with all values including the “late” ones, e.g. the ciphertext
1130 size and the GCM tag, and the list of IV fixed parts used so far.
1131 """
36b9932a
PG
1132 if isinstance (cmpdata, bytes) is False:
1133 raise InvalidParameter ("done: comparison input expected as bytes, "
1134 "not %s" % type (cmpdata))
cb7a3911
PG
1135 if self.lastinfo is None:
1136 raise RuntimeError ("done: encryption context not initialized")
48db09ba
PG
1137 filename, hdrdum = self.lastinfo
1138 if cmpdata != hdrdum:
b12110dd
PG
1139 raise RuntimeError ("done: bad sync of header for object %d: "
1140 "preliminary data does not match; this likely "
1141 "indicates a wrongly repositioned stream"
1142 % self.cnt)
6178061e 1143 data = self.enc.finalize ()
633b18a9 1144 self.stats ["out"] += len (data)
cd77dadb 1145 self.ctsize += len (data)
48db09ba
PG
1146 ok, hdr = hdr_from_params (self.version, self.paramversion, self.nacl,
1147 self.iv, self.ctsize, self.enc.tag)
8a990744 1148 if ok is False:
b12110dd
PG
1149 raise InternalError ("error constructing header: %r" % hdr)
1150 return data, hdr, self.fixed
a393d9cb 1151
a393d9cb 1152
cd77dadb 1153 def process (self, buf):
704ceaa5
PG
1154 """
1155 Encrypt a chunk of plaintext with the active encryptor. Returns the
1156 size of the input consumed and the resulting ciphertext. The size
1157 **must** be checked downstream: if the maximum possible object size has
1158 been reached, the current context must be finalized and a new one
1159 established before any further data can be encrypted. The caller then
1160 resubmits the remainder of the plaintext that was not consumed once the
1161 new context is ready.
1162 """
36b9932a
PG
1163 if isinstance (buf, bytes) is False:
1164 raise InvalidParameter ("process: expected byte buffer, not %s"
1165 % type (buf))
cb7a3911
PG
1166 bsize = len (buf)
1167 newptsize = self.ptsize + bsize
1168 diff = newptsize - PDTCRYPT_MAX_OBJ_SIZE
1169 if diff > 0:
1170 bsize -= diff
1171 newptsize = PDTCRYPT_MAX_OBJ_SIZE
1172 self.ptsize = newptsize
1173 data = super().process (buf [:bsize])
cd77dadb 1174 self.ctsize += len (data)
cb7a3911 1175 return bsize, data
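How a caller might react to a short count from ``process()`` can be sketched as below. The tiny size limit, the free functions and the absence of any actual encryption are assumptions for illustration; only the clamping arithmetic mirrors the method above.

```python
MAX_OBJ_SIZE = 10  # hypothetical tiny limit; the real constant is much larger


def process(ptsize, buf, max_size=MAX_OBJ_SIZE):
    """Return (bytes consumed, new plaintext size) for one chunk."""
    consumed = min(len(buf), max_size - ptsize)
    return consumed, ptsize + consumed


def split_into_objects(data, max_size=MAX_OBJ_SIZE):
    """Distribute ``data`` over objects of at most ``max_size`` bytes."""
    objects, current, ptsize = [], b"", 0
    while data:
        n, ptsize = process(ptsize, data, max_size)
        current += data[:n]
        data = data[n:]          # remainder goes into the next object
        if ptsize == max_size:   # object full: finalize, start a new one
            objects.append(current)
            current, ptsize = b"", 0
    if current:
        objects.append(current)
    return objects
```

In the real class, the "finalize, start a new one" step corresponds to calling ``.done()`` followed by ``.next()`` before resubmitting the remainder.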
cd77dadb
PG
1176
1177
39accaaa 1178class Decrypt (Crypto):
a393d9cb 1179
3031b7ae 1180 tag = None # GCM tag, part of header
3031b7ae 1181 last_iv = None # check consecutive ivs in strict mode
39accaaa 1182
1f3fd7b0 1183 def __init__ (self, password=None, key=None, counter=None, fixedparts=None,
ee6aa239 1184 strict_ivs=False):
704ceaa5
PG
1185 """
1186 Sanitizing ctor for the decryption context. ``fixedparts`` specifies a
1187 list of IV fixed parts accepted during decryption. If a fixed part is
1188 encountered that is not in the list, decryption will fail.
1189
1190 :param password: mutually exclusive with ``key``
1191 :type password: str
1192 :param key: mutually exclusive with ``password``
1193 :type key: bytes
1194 :param counter: initial object counter; the values
1195 ``AES_GCM_IV_CNT_INFOFILE`` and
1196 ``AES_GCM_IV_CNT_INDEX`` are unique in each backup set
1197 and cannot be reused even with different fixed parts.
1198 :type fixedparts: bytes list
1199 """
1f3fd7b0
PG
1200 if password is None and key is None \
1201 or password is not None and key is not None :
1202 raise InvalidParameter ("__init__: need either key or password")
1203
1204 if key is not None:
1205 if isinstance (key, bytes) is False:
1206 raise InvalidParameter ("__init__: key must be provided as "
1207 "bytes, not %s" % type (key))
1208 else: # password, no key
1209 if isinstance (password, str) is False:
1210 raise InvalidParameter ("__init__: password must be a string, not %s"
1211 % type (password))
1212 if len (password) == 0:
1213 raise InvalidParameter ("__init__: empty passwords are not "
1214 "permitted for PDT encrypted files")
36b9932a 1215 # fixed parts
50710d86 1216 if fixedparts is not None:
36b9932a
PG
1217 if isinstance (fixedparts, list) is False:
1218 raise InvalidParameter ("__init__: IV fixed parts must be "
1219 "supplied as list, not %s"
1220 % type (fixedparts))
b12110dd
PG
1221 self.fixed = fixedparts
1222 self.fixed.sort ()
ee6aa239 1223
a83fa4ed
PG
1224 super().__init__ (password=password, key=key, counter=counter,
1225 strict_ivs=strict_ivs)
39accaaa
PG
1226
1227
b12110dd 1228 def valid_fixed_part (self, iv):
704ceaa5
PG
1229 """
1230 Check whether the fixed part of an IV is in the list of accepted fixed parts.
1231 """
50710d86 1232 # check if fixed part is known
b12110dd
PG
1233 fixed, _cnt = struct.unpack (FMT_I2N_IV, iv)
1234 i = bisect.bisect_left (self.fixed, fixed)
1235 return i != len (self.fixed) and self.fixed [i] == fixed
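The lookup relies on the list of fixed parts having been sorted in the constructor. A minimal sketch of the same membership test on a sorted list, with a hypothetical function name:

```python
import bisect


def contains_sorted(sorted_items, item):
    """Binary-search membership test; ``sorted_items`` must be sorted."""
    i = bisect.bisect_left(sorted_items, item)
    # bisect_left returns the insertion point, so a hit requires that the
    # index is in range and the element found there equals the item.
    return i != len(sorted_items) and sorted_items[i] == item
```

Compared with ``item in sorted_items`` this takes logarithmic rather than linear time, which matters only for long lists of fixed parts.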
50710d86
PG
1236
1237
ee6aa239 1238 def check_consecutive_iv (self, iv):
704ceaa5
PG
1239 """
1240 Check whether the counter part of the given IV is indeed the successor
1241 of the currently present counter. This should always be the case for
1242 the objects in a well formed PDT archive but should not be enforced
1243 when decrypting out-of-order.
1244 """
ee6aa239 1245 fixed, cnt = struct.unpack (FMT_I2N_IV, iv)
3031b7ae
PG
1246 if self.strict_ivs is True \
1247 and self.last_iv is not None \
ee6aa239
PG
1248 and self.last_iv [0] == fixed \
1249 and self.last_iv [1] != cnt - 1:
f6cd676f 1250 raise NonConsecutiveIV ("iv %s counter not successor of "
ee6aa239 1251 "last object (expected %d, found %d)"
f6cd676f 1252 % (iv_fmt (iv), self.last_iv [1] + 1, cnt))
ee6aa239
PG
1253 self.last_iv = (fixed, cnt)
1254
1255
79782fa9 1256 def next (self, hdr):
704ceaa5
PG
1257 """
1258 Start decrypting the next object. The PDTCRYPT header for the object
1259 can be given either as an already parsed object or as bytes.
1260 """
dccfe104
PG
1261 if isinstance (hdr, bytes) is True:
1262 hdr = hdr_read (hdr)
36b9932a
PG
1263 elif isinstance (hdr, dict) is False:
1264 # this won’t catch malformed specs though
1265 raise InvalidParameter ("next: wrong type of parameter hdr: "
1266 "expected bytes or spec, got %s"
fbfda3d4 1267 % type (hdr))
36b9932a
PG
1268 try:
1269 paramversion = hdr ["paramversion"]
1270 nacl = hdr ["nacl"]
1271 iv = hdr ["iv"]
1272 tag = hdr ["tag"]
1273 except KeyError:
1274 raise InvalidHeader ("next: not a header %r" % hdr)
1275
30019abf 1276 super().next (self.password, paramversion, nacl, iv)
b12110dd 1277 if self.fixed is not None and self.valid_fixed_part (iv) is False:
f6cd676f
PG
1278 raise InvalidIVFixedPart ("iv %s has invalid fixed part"
1279 % iv_fmt (iv))
3031b7ae 1280 self.check_consecutive_iv (iv)
ee6aa239 1281
36b9932a 1282 self.tag = tag
b12110dd
PG
1283 defs = ENCRYPTION_PARAMETERS.get (paramversion, None)
1284 if defs is None:
1285 raise FormatError ("header contains unknown parameter version %d; "
1286 "maybe the file was created by a more recent "
1287 "version of Deltatar" % paramversion)
50710d86 1288 enc = defs ["enc"]
6178061e
PG
1289 if enc == "aes-gcm":
1290 self.enc = Cipher \
1291 ( algorithms.AES (self.key)
36b9932a 1292 , modes.GCM (iv, tag=self.tag)
6178061e
PG
1293 , backend = default_backend ()) \
1294 . decryptor ()
1295 elif enc == "passthrough":
1296 self.enc = PassthroughCipher ()
1297 else:
b12110dd
PG
1298 raise InternalError ("encryption parameter set %d refers to unknown "
1299 "mode %r" % (paramversion, enc))
f484f2d1 1300 self.set_object_counter (self.cnt + 1)
39accaaa
PG
1301
1302
db1f3ac7 1303 def done (self, tag=None):
704ceaa5
PG
1304 """
1305 Stop decryption of the current object and finalize it with the active
1306 context. This will throw an *InvalidGCMTag* exception to indicate that
1307 the authentication tag does not match the data. If the tag is correct,
1308 the rest of the plaintext is returned.
1309 """
633b18a9 1310 data = b""
db1f3ac7
PG
1311 try:
1312 if tag is None:
f484f2d1 1313 data = self.enc.finalize ()
db1f3ac7 1314 else:
36b9932a
PG
1315 if isinstance (tag, bytes) is False:
1316 raise InvalidParameter ("done: wrong type of parameter "
1317 "tag: expected bytes, got %s"
1318 % type (tag))
f484f2d1 1319 data = self.enc.finalize_with_tag (tag)
b0078f26 1320 except cryptography.exceptions.InvalidTag:
f08c604b 1321 raise InvalidGCMTag ("done: tag mismatch of object %d: %s "
b0078f26 1322 "rejected by finalize ()"
f08c604b 1323 % (self.cnt, binascii.hexlify (self.tag)))
50710d86 1324 self.ctsize += len (data)
633b18a9 1325 self.stats ["out"] += len (data)
b0078f26 1326 return data
00b3cd10
PG
1327
1328
47e27926 1329 def process (self, buf):
704ceaa5
PG
1330 """
1331 Decrypt the bytes object *buf* with the active decryptor.
1332 """
36b9932a
PG
1333 if isinstance (buf, bytes) is False:
1334 raise InvalidParameter ("process: expected byte buffer, not %s"
1335 % type (buf))
47e27926
PG
1336 self.ctsize += len (buf)
1337 data = super().process (buf)
1338 self.ptsize += len (data)
1339 return data
1340
1341
00b3cd10 1342###############################################################################
770173c5
PG
1343## testing helpers
1344###############################################################################
1345
cb7a3911 1346def _patch_global (glob, vow, n=None):
770173c5
PG
1347 """
1348 Adapt upper file counter bound for testing IV logic. Completely unsafe.
1349 """
1350 assert vow == "I am fully aware that this will void my warranty."
cb7a3911
PG
1351 r = globals () [glob]
1352 if n is None:
1353 n = globals () [glob + "_DEFAULT"]
1354 globals () [glob] = n
770173c5
PG
1355 return r
1356
cb7a3911
PG
1357_testing_set_AES_GCM_IV_CNT_MAX = \
1358 partial (_patch_global, "AES_GCM_IV_CNT_MAX")
1359
1360_testing_set_PDTCRYPT_MAX_OBJ_SIZE = \
1361 partial (_patch_global, "PDTCRYPT_MAX_OBJ_SIZE")
1362
a808459e
PG
1363def open2_dump_file (fname, dir_fd, force=False):
1364 outfd = -1
1365
1366 oflags = os.O_CREAT | os.O_WRONLY
1367 if force is True:
1368 oflags |= os.O_TRUNC
1369 else:
1370 oflags |= os.O_EXCL
1371
1372 try:
1373 outfd = os.open (fname, oflags,
1374 stat.S_IRUSR | stat.S_IWUSR, dir_fd=dir_fd)
1375 except FileExistsError as exn:
1376 noise ("PDT: refusing to overwrite existing file %s" % fname)
1377 noise ("")
1378 raise RuntimeError ("destination file %s already exists" % fname)
1379 if PDTCRYPT_VERBOSE is True:
1380 noise ("PDT: new output file %s (fd=%d)" % (fname, outfd))
1381
1382 return outfd
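The flag handling in ``open2_dump_file`` can be sketched on its own. The function name and the ``overwrite`` parameter below are illustrative, and the example assumes a platform on which ``os.open`` supports ``dir_fd`` (for instance Linux).

```python
import os
import stat


def create_exclusive(fname, dir_fd, overwrite=False):
    """Create ``fname`` relative to the directory fd; return the new fd."""
    flags = os.O_CREAT | os.O_WRONLY
    # O_EXCL makes the call fail with FileExistsError if the file is
    # already there; O_TRUNC empties an existing file instead.
    flags |= os.O_TRUNC if overwrite else os.O_EXCL
    return os.open(fname, flags, stat.S_IRUSR | stat.S_IWUSR, dir_fd=dir_fd)
```

Opening relative to a directory descriptor avoids resolving the destination path again for every output file.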
1383
770173c5 1384###############################################################################
00b3cd10
PG
1385## freestanding invocation
1386###############################################################################
1387
da82bc58
PG
1388PDTCRYPT_SUB_PROCESS = 0
1389PDTCRYPT_SUB_SCRYPT = 1
f41973a6 1390PDTCRYPT_SUB_SCAN = 2
da82bc58
PG
1391
1392PDTCRYPT_SUB = \
1393 { "process" : PDTCRYPT_SUB_PROCESS
f41973a6
PG
1394 , "scrypt" : PDTCRYPT_SUB_SCRYPT
1395 , "scan" : PDTCRYPT_SUB_SCAN }
da82bc58 1396
a83fa4ed
PG
1397PDTCRYPT_SECRET_PW = 0
1398PDTCRYPT_SECRET_KEY = 1
1399
e3abcdf0
PG
1400PDTCRYPT_DECRYPT = 1 << 0 # decrypt archive with password
1401PDTCRYPT_SPLIT = 1 << 1 # split archive into individual objects
da82bc58 1402PDTCRYPT_HASH = 1 << 2 # output scrypt hash for file and given password
e3abcdf0 1403
a808459e
PG
1404PDTCRYPT_SPLITNAME = "pdtcrypt-object-%d.bin"
1405PDTCRYPT_RESCUENAME = "pdtcrypt-rescue-object-%0.5d.bin"
e3abcdf0 1406
70ad9458 1407PDTCRYPT_VERBOSE = False
ee6aa239 1408PDTCRYPT_STRICTIVS = False
b07633d3 1409PDTCRYPT_OVERWRITE = False
15d3eefd 1410PDTCRYPT_BLOCKSIZE = 1 << 12
70ad9458
PG
1411PDTCRYPT_SINK = 0
1412PDTCRYPT_SOURCE = 1
1413SELF = None
1414
77058bab
PG
1415PDTCRYPT_DEFAULT_VER = 1
1416PDTCRYPT_DEFAULT_PVER = 1
1417
7b3940e5
PG
1418# scrypt hashing output control
1419PDTCRYPT_SCRYPT_INTRANATOR = 0
1420PDTCRYPT_SCRYPT_PARAMETERS = 1
4f6405d6 1421PDTCRYPT_SCRYPT_DEFAULT = PDTCRYPT_SCRYPT_INTRANATOR
7b3940e5
PG
1422
1423PDTCRYPT_SCRYPT_FORMAT = \
1424 { "i2n" : PDTCRYPT_SCRYPT_INTRANATOR
1425 , "params" : PDTCRYPT_SCRYPT_PARAMETERS }
1426
4c62ddc0 1427PDTCRYPT_TT_COLUMNS = 80 # assume standard terminal
15d3eefd
PG
1428
1429class PDTDecryptionError (Exception):
1430 """Decryption failed."""
1431
e3abcdf0
PG
1432class PDTSplitError (Exception):
1433 """Splitting the archive into objects failed."""
1434
15d3eefd
PG
1435
1436def noise (*a, **b):
591a722f 1437 print (file=sys.stderr, *a, **b)
15d3eefd
PG
1438
1439
89e1073c
PG
1440class PassthroughDecryptor (object):
1441
1442 curhdr = None # write current header on first data write
1443
1444 def __init__ (self):
1445 if PDTCRYPT_VERBOSE is True:
1446 noise ("PDT: no encryption; data passthrough")
1447
1448 def next (self, hdr):
1449 ok, curhdr = hdr_make (hdr)
1450 if ok is False:
1451 raise PDTDecryptionError ("bad header %r" % hdr)
1452 self.curhdr = curhdr
1453
1454 def done (self):
1455 if self.curhdr is not None:
1456 return self.curhdr
1457 return b""
1458
1459 def process (self, d):
1460 if self.curhdr is not None:
1461 d = self.curhdr + d
1462 self.curhdr = None
1463 return d
1464
1465
a83fa4ed 1466def depdtcrypt (mode, secret, ins, outs):
15d3eefd 1467 """
a83fa4ed
PG
1468 Remove PDTCRYPT layer from all objects encrypted with the secret. Used on a
1469 Deltatar backup this will yield a (possibly Gzip compressed) tarball.
15d3eefd
PG
1470 """
1471 ctleft = -1 # length of ciphertext to consume
1472 ctcurrent = 0 # total ciphertext of current object
15d3eefd
PG
1473 total_obj = 0 # total number of objects read
1474 total_pt = 0 # total plaintext bytes
1475 total_ct = 0 # total ciphertext bytes
1476 total_read = 0 # total bytes read
e3abcdf0
PG
1477 outfile = None # Python file object for output
1478
89e1073c 1479 if mode & PDTCRYPT_DECRYPT: # decryptor
a83fa4ed
PG
1480 ks = secret [0]
1481 if ks == PDTCRYPT_SECRET_PW:
1482 decr = Decrypt (password=secret [1], strict_ivs=PDTCRYPT_STRICTIVS)
1483 elif ks == PDTCRYPT_SECRET_KEY:
1484 key = binascii.unhexlify (secret [1])
1485 decr = Decrypt (key=key, strict_ivs=PDTCRYPT_STRICTIVS)
1486 else:
1487 raise InternalError ("‘%d’ does not specify a valid kind of secret"
1488 % ks)
89e1073c
PG
1489 else:
1490 decr = PassthroughDecryptor ()
1491
e3abcdf0
PG
1492 def nextout (_):
1493 """Dummy for non-split mode: output file does not vary."""
1494 return outs
1495
1496 if mode & PDTCRYPT_SPLIT:
1497 def nextout (outfile):
1498 """
1499 We were passed an fd as outs for accessing the destination
1500 directory where extracted archive components are supposed
1501 to end up in.
1502 """
1503
1504 if outfile is None:
1505 if PDTCRYPT_VERBOSE is True:
1506 noise ("PDT: no output file to close at this point")
77058bab
PG
1507 else:
1508 if PDTCRYPT_VERBOSE is True:
1509 noise ("PDT: release output file %r" % outfile)
e3abcdf0
PG
1510 # cleanup happens automatically by the GC; the next
1511 # line will error out on account of an invalid fd
1512 #outfile.close ()
1513
1514 assert total_obj > 0
1515 fname = PDTCRYPT_SPLITNAME % total_obj
1516 try:
a808459e
PG
1517 outfd = open2_dump_file (fname, outs, force=PDTCRYPT_OVERWRITE)
1518 except RuntimeError as exn:
1519 raise PDTSplitError (exn)
e3abcdf0
PG
1520 return os.fdopen (outfd, "wb", closefd=True)
1521
15d3eefd 1522
47d22679 1523 def tell (s):
b09a99eb 1524 """ESPIPE is normal on non-seekable stdio stream."""
47d22679
PG
1525 try:
1526 return s.tell ()
1527 except OSError as exn:
b09a99eb 1528 if exn.errno == os.errno.ESPIPE:
47d22679
PG
1529 return -1
1530
e3abcdf0 1531 def out (pt, outfile):
15d3eefd
PG
1532 npt = len (pt)
1533 nonlocal total_pt
1534 total_pt += npt
70ad9458 1535 if PDTCRYPT_VERBOSE is True:
15d3eefd
PG
1536 noise ("PDT:\t· decrypt plaintext %d B" % (npt))
1537 try:
e3abcdf0 1538 nn = outfile.write (pt)
15d3eefd
PG
1539 except OSError as exn: # probably ENOSPC
1540 raise DecryptionError ("error (%s)" % exn)
1541 if nn != npt:
1542 raise DecryptionError ("write aborted after %d of %d B" % (nn, npt))
1543
1544 while True:
1545 if ctleft <= 0:
1546 # current object completed; in a valid archive this marks either
1547 # the start of a new header or the end of the input
1548 if ctleft == 0: # current object requires finalization
70ad9458 1549 if PDTCRYPT_VERBOSE is True:
47d22679 1550 noise ("PDT: %d finalize" % tell (ins))
5d394c0d
PG
1551 try:
1552 pt = decr.done ()
1553 except InvalidGCMTag as exn:
f08c604b
PG
1554 raise DecryptionError ("error finalizing object %d (%d B): "
1555 "%r" % (total_obj, len (pt), exn)) \
1556 from exn
e3abcdf0 1557 out (pt, outfile)
70ad9458 1558 if PDTCRYPT_VERBOSE is True:
15d3eefd
PG
1559 noise ("PDT:\t· object validated")
1560
70ad9458 1561 if PDTCRYPT_VERBOSE is True:
47d22679 1562 noise ("PDT: %d hdr" % tell (ins))
15d3eefd
PG
1563 try:
1564 hdr = hdr_read_stream (ins)
dd47d6a2 1565 total_read += PDTCRYPT_HDR_SIZE
ae3d0f2a
PG
1566 except EndOfFile as exn:
1567 total_read += exn.remainder
dd47d6a2 1568 if total_ct + total_obj * PDTCRYPT_HDR_SIZE != total_read:
15d3eefd
PG
1569 raise PDTDecryptionError ("ciphertext processed (%d B) plus "
1570 "overhead (%d × %d B) does not match "
1571 "the number of bytes read (%d)"
dd47d6a2 1572 % (total_ct, total_obj, PDTCRYPT_HDR_SIZE,
15d3eefd
PG
1573 total_read))
1574 # the single good exit
1575 return total_read, total_obj, total_ct, total_pt
1576 except InvalidHeader as exn:
1577 raise PDTDecryptionError ("invalid header at position %d in %r "
ee6aa239 1578 "(%s)" % (tell (ins), ins, exn))
70ad9458 1579 if PDTCRYPT_VERBOSE is True:
15d3eefd
PG
1580 pretty = hdr_fmt_pretty (hdr)
1581 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1582 pretty.splitlines (), ""))
1583 ctcurrent = ctleft = hdr ["ctsize"]
89e1073c 1584
15d3eefd 1585 decr.next (hdr)
e3abcdf0
PG
1586
1587 total_obj += 1 # used in file counter with split mode
1588
1589 # finalization complete or skipped in case of first object in
1590 # stream; create a new output file if necessary
1591 outfile = nextout (outfile)
15d3eefd 1592
70ad9458 1593 if PDTCRYPT_VERBOSE is True:
15d3eefd 1594 noise ("PDT: %d decrypt obj no. %d, %d B"
47d22679 1595 % (tell (ins), total_obj, ctleft))
15d3eefd
PG
1596
1597 # always allocate a new buffer since python-cryptography doesn’t allow
1598 # passing a bytearray :/
1599 nexpect = min (ctleft, PDTCRYPT_BLOCKSIZE)
70ad9458 1600 if PDTCRYPT_VERBOSE is True:
15d3eefd 1601 noise ("PDT:\t· [%d] %d%% done, read block (%d B of %d B remaining)"
47d22679 1602 % (tell (ins),
15d3eefd
PG
1603 100 - ctleft * 100 / (ctcurrent > 0 and ctcurrent or 1),
1604 nexpect, ctleft))
1605 ct = ins.read (nexpect)
1606 nct = len (ct)
1607 if nct < nexpect:
47d22679 1608 off = tell (ins)
ae3d0f2a
PG
1609 raise EndOfFile (nct,
1610 "hit EOF after %d of %d B in block [%d:%d); "
15d3eefd
PG
1611 "%d B ciphertext remaining for object no %d"
1612 % (nct, nexpect, off, off + nexpect, ctleft,
1613 total_obj))
1614 ctleft -= nct
1615 total_ct += nct
1616 total_read += nct
1617
70ad9458 1618 if PDTCRYPT_VERBOSE is True:
15d3eefd
PG
1619 noise ("PDT:\t· decrypt ciphertext %d B" % (nct))
1620 pt = decr.process (ct)
e3abcdf0 1621 out (pt, outfile)
15d3eefd 1622
d6c15a52 1623
70ad9458 1624def deptdcrypt_mk_stream (kind, path):
d6c15a52 1625 """Create stream from file or stdio descriptor."""
70ad9458 1626 if kind == PDTCRYPT_SINK:
d6c15a52 1627 if path == "-":
70ad9458 1628 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: stdout")
d6c15a52
PG
1629 return sys.stdout.buffer
1630 else:
70ad9458 1631 if PDTCRYPT_VERBOSE is True: noise ("PDT: sink: file %s" % path)
d6c15a52 1632 return io.FileIO (path, "w")
70ad9458 1633 if kind == PDTCRYPT_SOURCE:
d6c15a52 1634 if path == "-":
70ad9458 1635 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: stdin")
d6c15a52
PG
1636 return sys.stdin.buffer
1637 else:
70ad9458 1638 if PDTCRYPT_VERBOSE is True: noise ("PDT: source: file %s" % path)
d6c15a52
PG
1639 return io.FileIO (path, "r")
1640
1641 raise ValueError ("bogus stream “%s” / %s" % (kind, path))
1642
15d3eefd 1643
a83fa4ed 1644def mode_depdtcrypt (mode, secret, ins, outs):
da82bc58
PG
1645 try:
1646 total_read, total_obj, total_ct, total_pt = \
a83fa4ed 1647 depdtcrypt (mode, secret, ins, outs)
da82bc58
PG
1648 except DecryptionError as exn:
1649 noise ("PDT: Decryption failed:")
1650 noise ("PDT:")
1651 noise ("PDT: “%s”" % exn)
1652 noise ("PDT:")
a83fa4ed 1653 noise ("PDT: Did you specify the correct key / password?")
da82bc58
PG
1654 noise ("")
1655 return 1
1656 except PDTSplitError as exn:
1657 noise ("PDT: Split operation failed:")
1658 noise ("PDT:")
1659 noise ("PDT: “%s”" % exn)
1660 noise ("PDT:")
a83fa4ed 1661 noise ("PDT: Hint: target directory should be empty.")
da82bc58
PG
1662 noise ("")
1663 return 1
1664
1665 if PDTCRYPT_VERBOSE is True:
1666 noise ("PDT: decryption successful" )
1667 noise ("PDT: %.10d bytes read" % total_read)
1668 noise ("PDT: %.10d objects decrypted" % total_obj )
1669 noise ("PDT: %.10d bytes ciphertext" % total_ct )
1670 noise ("PDT: %.10d bytes plaintext" % total_pt )
1671 noise ("" )
1672
1673 return 0
1674
1675
7b3940e5 1676def mode_scrypt (pw, ins=None, nacl=None, fmt=PDTCRYPT_SCRYPT_INTRANATOR):
77058bab 1677 hsh = None
7b3940e5 1678 paramversion = PDTCRYPT_DEFAULT_PVER
77058bab
PG
1679 if ins is not None:
1680 hsh, nacl, version, paramversion = scrypt_hashsource (pw, ins)
1681 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
1682 else:
1683 nacl = binascii.unhexlify (nacl)
7b3940e5 1684 defs = ENCRYPTION_PARAMETERS.get(paramversion, None)
77058bab
PG
1685 version = PDTCRYPT_DEFAULT_VER
1686
1687 kdfname, params = defs ["kdf"]
1688 if hsh is None:
1689 kdf = kdf_by_version (None, defs)
1690 hsh, _void = kdf (pw, nacl)
da82bc58
PG
1691
1692 import json
7b3940e5
PG
1693
1694 if fmt == PDTCRYPT_SCRYPT_INTRANATOR:
1695 out = json.dumps ({ "salt" : base64.b64encode (nacl).decode ()
1696 , "key" : base64.b64encode (hsh) .decode ()
1697 , "paramversion" : paramversion })
1698 elif fmt == PDTCRYPT_SCRYPT_PARAMETERS:
1699 out = json.dumps ({ "salt" : binascii.hexlify (nacl).decode ()
1700 , "key" : binascii.hexlify (hsh) .decode ()
1701 , "version" : version
1702 , "scrypt_params" : { "N" : params ["N"]
1703 , "r" : params ["r"]
1704 , "p" : params ["p"]
1705 , "dkLen" : params ["dkLen"] } })
1706 else:
1707 raise RuntimeError ("bad scrypt output scheme %r" % fmt)
1708
da82bc58
PG
1709 print (out)
1710
1711
4c62ddc0
PG
1712def noise_output_candidates (cands, indent=8, cols=PDTCRYPT_TT_COLUMNS):
1713 """
1714 Print a list of offsets without garbling the terminal too much.
1715
1716 The indent is counted from column zero; if it is wide enough, the “PDT: ”
1717 marker will be prepended, considered part of the indentation.
1718 """
1719 wd = cols - 1
1720 nc = len (cands)
1721 idt = " " * indent if indent < 5 else "PDT: " + " " * (indent - 5)
1722 line = idt
1723 lpos = indent
1724 sep = ","
1725 lsep = len (sep)
1726 init = True # prevent leading separator
1727
1728 if indent >= wd:
1729 raise ValueError ("the requested indentation exceeds the line "
1730 "width by %d" % (indent - wd))
1731
1732 for n in cands:
1733 ns = "%d" % n
1734 lns = len (ns)
1735 if init is False:
1736 line += sep
1737 lpos += lsep
1738
1739 lpos += lns
1740 if lpos > wd: # line break
1741 noise (line)
1742 line = idt
1743 lpos = indent + lns
1744 elif init is True:
1745 init = False
1746 else: # space
1747 line += ' '
1748 lpos += 1
1749
1750 line += ns
1751
1752 if lpos != indent:
1753 noise (line)
1754
1755
a808459e 1756def mode_scan (secret, fname, outs=None, nacl=None):
f41973a6
PG
1757 """
1758 Dissect a binary file, looking for PDTCRYPT headers and objects.
a808459e
PG
1759
1760 If *outs* is supplied, recoverable data will be dumped into the specified
1761 directory.
f41973a6
PG
1762 """
1763 try:
a808459e 1764 ifd = os.open (fname, os.O_RDONLY)
f41973a6
PG
1765 except FileNotFoundError:
1766 noise ("PDT: failed to open %s readonly" % fname)
1767 noise ("")
1768 usage (err=True)
1769
1770 try:
1771 if PDTCRYPT_VERBOSE is True:
1772 noise ("PDT: scan for potential sync points")
a808459e 1773 cands = locate_hdr_candidates (ifd)
f41973a6
PG
1774 if len (cands) == 0:
1775 noise ("PDT: scan complete: input does not contain potential PDT "
1776 "headers; giving up.")
1777 return -1
1778 if PDTCRYPT_VERBOSE is True:
4c62ddc0
PG
1779 noise ("PDT: scan complete: found %d candidates:" % len (cands))
1780 noise_output_candidates (cands)
6c8073ab 1781 except:
a808459e 1782 os.close (ifd)
6c8073ab 1783 raise
f41973a6 1784
6c8073ab
PG
1785 junk, todo = [], []
1786 try:
a808459e 1787 nobj = 0
6c8073ab 1788 for cand in cands:
a808459e
PG
1789 nobj += 1
1790 vdt, hdr = inspect_hdr (ifd, cand)
6c8073ab
PG
1791 if vdt == HDR_CAND_JUNK:
1792 junk.append (cand)
1793 else:
1794 off0 = cand + PDTCRYPT_HDR_SIZE
1795 if PDTCRYPT_VERBOSE is True:
a808459e 1796 noise ("PDT: obj %d: read payload @%d" % (nobj, off0))
70a33834
PG
1797 pretty = hdr_fmt_pretty (hdr)
1798 noise (reduce (lambda a, e: (a + "\n" if a else "") + "PDT:\t· " + e,
1799 pretty.splitlines (), ""))
6c8073ab 1800
a808459e
PG
1801 ofd = -1
1802 if outs is not None:
1803 ofname = PDTCRYPT_RESCUENAME % nobj
1804 ofd = open2_dump_file (ofname, outs, force=PDTCRYPT_OVERWRITE)
1805
1806 try:
1807 ok = try_decrypt (ifd, off0, hdr, secret, ofd=ofd) == hdr ["ctsize"]
1808 finally:
1809 if ofd != -1:
1810 os.close (ofd)
70a33834 1811 if vdt == HDR_CAND_GOOD and ok is True:
6c8073ab
PG
1812 noise ("PDT: %d → ✓ valid object %d–%d"
1813 % (cand, off0, off0 + hdr ["ctsize"]))
70a33834 1814 elif vdt == HDR_CAND_FISHY and ok is True:
6c8073ab
PG
1815 noise ("PDT: %d → × object %d–%d, corrupt header"
1816 % (cand, off0, off0 + hdr ["ctsize"]))
70a33834 1817 elif vdt == HDR_CAND_GOOD and ok is False:
6c8073ab
PG
1818 noise ("PDT: %d → × object %d–%d, problematic payload"
1819 % (cand, off0, off0 + hdr ["ctsize"]))
70a33834 1820 elif vdt == HDR_CAND_FISHY and ok is False:
6c8073ab
PG
1821 noise ("PDT: %d → × object %d–%d, corrupt header, problematic "
1822 "ciphertext" % (cand, off0, off0 + hdr ["ctsize"]))
1823 else:
1824 raise Unreachable
1825 finally:
a808459e 1826 os.close (ifd)
7b3940e5 1827
70a33834
PG
1828 if len (junk) == 0:
1829 noise ("PDT: all headers ok")
1830 else:
1831 noise ("PDT: %d candidates not parseable as headers:" % len (junk))
1832 noise_output_candidates (junk)
1833
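The four-way verdict at the end of the `mode_scan` loop combines the header classification with the outcome of the trial decryption. A table-driven sketch of that decision might look as follows. The numeric values given to the `HDR_CAND_*` constants are placeholders, not the module's definitions, and `describe_candidate` is a hypothetical name.

```python
# Placeholder values; the real constants are defined elsewhere in crypto.py.
HDR_CAND_GOOD = 0
HDR_CAND_FISHY = 1
HDR_CAND_JUNK = 2

# (header verdict, payload decrypted completely) -> message fragment
VERDICTS = {
    (HDR_CAND_GOOD, True): "valid object",
    (HDR_CAND_FISHY, True): "corrupt header",
    (HDR_CAND_GOOD, False): "problematic payload",
    (HDR_CAND_FISHY, False): "corrupt header, problematic ciphertext",
}


def describe_candidate(vdt, payload_ok):
    """Map a header verdict and a decryption result to a description.

    Junk candidates never reach this point in the scan loop, so they are
    rejected here, as is any other unknown combination.
    """
    try:
        return VERDICTS[(vdt, payload_ok)]
    except KeyError:
        raise ValueError("no verdict for header class %r, payload %r"
                         % (vdt, payload_ok))
```

A lookup table keeps the combinations in one place; the chain of `elif` branches in the scan loop expresses the same mapping with one message per branch.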
70ad9458
PG
1834def usage (err=False):
1835 out = print
1836 if err is True:
1837 out = noise
5afcb45d 1838 indent = ' ' * len (SELF)
da82bc58 1839 out ("usage: %s SUBCOMMAND { --help" % SELF)
5afcb45d 1840 out (" %s | [ -v ] { -p PASSWORD | -k KEY }" % indent)
77058bab
PG
1841 out (" %s [ { -i | --in } { - | SOURCE } ]" % indent)
1842 out (" %s [ { -n | --nacl } { SALT } ]" % indent)
1843 out (" %s [ { -o | --out } { - | DESTINATION } ]" % indent)
1844 out (" %s [ -D | --no-decrypt ] [ -S | --split ]" % indent)
7b3940e5 1845 out (" %s [ -f | --format ]" % indent)
70ad9458
PG
1846 out ("")
1847 out ("\twhere")
da82bc58
PG
1848 out ("\t\tSUBCOMMAND main mode: { process | scrypt }")
1849 out ("\t\t where:")
1850 out ("\t\t process: extract objects from PDT archive")
1851 out ("\t\t scrypt: calculate hash from password and first object")
a83fa4ed
PG
1852 out ("\t\t-p PASSWORD password to derive the encryption key from")
1853 out ("\t\t-k KEY encryption key as 16 bytes in hexadecimal notation")
e3abcdf0 1854 out ("\t\t-s enforce strict handling of initialization vectors")
70ad9458
PG
1855 out ("\t\t-i SOURCE file name to read from")
1856 out ("\t\t-o DESTINATION file to write output to")
77058bab 1857 out ("\t\t-n SALT provide salt for scrypt mode in hex encoding")
70ad9458 1858 out ("\t\t-v print extra info")
e3abcdf0
PG
1859 out ("\t\t-S split into files at object boundaries; this")
1860 out ("\t\t requires DESTINATION to refer to directory")
1861 out ("\t\t-D PDT header and ciphertext passthrough")
7b3940e5 1862 out ("\t\t-f format of SCRYPT hash output (“default” or “parameters”)")
70ad9458
PG
1863 out ("")
1864 out ("\tinstead of filenames, “-” may be used to specify stdin / stdout")
1865 out ("")
1866 sys.exit ((err is True) and 42 or 0)
1867
1868
a83fa4ed
PG
1869def bail (msg):
1870 noise (msg)
1871 noise ("")
1872 usage (err=True)
1873 raise Unreachable
1874
1875
70ad9458
PG
1876def parse_argv (argv):
1877 global SELF
7b3940e5
PG
1878 mode = PDTCRYPT_DECRYPT
1879 secret = None
1880 insspec = None
1881 outsspec = None
a808459e 1882 outs = None
7b3940e5 1883 nacl = None
4f6405d6 1884 scrypt_format = PDTCRYPT_SCRYPT_DEFAULT
70ad9458
PG
1885
1886 argvi = iter (argv)
1887 SELF = os.path.basename (next (argvi))
1888
da82bc58
PG
1889 try:
1890 rawsubcmd = next (argvi)
1891 subcommand = PDTCRYPT_SUB [rawsubcmd]
1892 except StopIteration:
a83fa4ed 1893 bail ("ERROR: subcommand required")
da82bc58 1894 except KeyError:
a83fa4ed 1895 bail ("ERROR: invalid subcommand “%s” specified" % rawsubcmd)
da82bc58 1896
59d74e2b
PG
1897 def checked_arg ():
1898 nonlocal argvi
1899 try:
1900 return next (argvi)
1901 except StopIteration:
1902 bail ("ERROR: argument list incomplete")
1903
a83fa4ed
PG
1904 def checked_secret (t, arg):
1905 nonlocal secret
1906 if secret is None:
1907 secret = (t, arg)
da82bc58 1908 else:
a83fa4ed 1909 bail ("ERROR: encountered “%s” but secret already given" % arg)
da82bc58 1910
70ad9458
PG
1911 for arg in argvi:
1912 if arg in [ "-h", "--help" ]:
1913 usage ()
1914 raise Unreachable
1915 elif arg in [ "-v", "--verbose", "--wtf" ]:
1916 global PDTCRYPT_VERBOSE
1917 PDTCRYPT_VERBOSE = True
1918 elif arg in [ "-i", "--in", "--source" ]:
59d74e2b 1919 insspec = checked_arg ()
70ad9458 1920 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt from %s" % insspec)
a83fa4ed 1921 elif arg in [ "-p", "--password" ]:
59d74e2b 1922 arg = checked_arg ()
a83fa4ed
PG
1923 checked_secret (PDTCRYPT_SECRET_PW, arg)
1924 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with password")
70ad9458 1925 else:
da82bc58
PG
1926 if subcommand == PDTCRYPT_SUB_PROCESS:
1927 if arg in [ "-s", "--strict-ivs" ]:
1928 global PDTCRYPT_STRICTIVS
1929 PDTCRYPT_STRICTIVS = True
77058bab
PG
1930 elif arg in [ "-o", "--out", "--dest", "--sink" ]:
1931 outsspec = checked_arg ()
1932 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
da82bc58
PG
1933 elif arg in [ "-f", "--force" ]:
1934 global PDTCRYPT_OVERWRITE
1935 PDTCRYPT_OVERWRITE = True
1936 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
1937 elif arg in [ "-S", "--split" ]:
1938 mode |= PDTCRYPT_SPLIT
1939 if PDTCRYPT_VERBOSE is True: noise ("PDT: split files")
1940 elif arg in [ "-D", "--no-decrypt" ]:
1941 mode &= ~PDTCRYPT_DECRYPT
1942 if PDTCRYPT_VERBOSE is True: noise ("PDT: not decrypting")
a83fa4ed 1943 elif arg in [ "-k", "--key" ]:
59d74e2b 1944 arg = checked_arg ()
a83fa4ed
PG
1945 checked_secret (PDTCRYPT_SECRET_KEY, arg)
1946 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypting with key")
da82bc58 1947 else:
a83fa4ed 1948 bail ("ERROR: unexpected positional argument “%s”" % arg)
da82bc58 1949 elif subcommand == PDTCRYPT_SUB_SCRYPT:
77058bab
PG
1950 if arg in [ "-n", "--nacl", "--salt" ]:
1951 nacl = checked_arg ()
1952 if PDTCRYPT_VERBOSE is True: noise ("PDT: salt key with %s" % nacl)
7b3940e5
PG
1953 elif arg in [ "-f", "--format" ]:
1954 arg = checked_arg ()
1955 try:
1956 scrypt_format = PDTCRYPT_SCRYPT_FORMAT [arg]
1957 except KeyError:
1958 bail ("ERROR: invalid scrypt output format %s" % arg)
1959 if PDTCRYPT_VERBOSE is True:
1960 noise ("PDT: scrypt output format “%s”" % scrypt_format)
77058bab
PG
1961 else:
1962 bail ("ERROR: unexpected positional argument “%s”" % arg)
f41973a6 1963 elif subcommand == PDTCRYPT_SUB_SCAN:
a808459e
PG
1964 if arg in [ "-o", "--out", "--dest", "--sink" ]:
1965 outsspec = checked_arg ()
1966 if PDTCRYPT_VERBOSE is True: noise ("PDT: decrypt to %s" % outsspec)
1967 elif arg in [ "-f", "--force" ]:
1968 # PDTCRYPT_OVERWRITE was declared global further up in this function
1969 PDTCRYPT_OVERWRITE = True
1970 if PDTCRYPT_VERBOSE is True: noise ("PDT: overwrite existing files")
1971 else:
1972 bail ("ERROR: unexpected positional argument “%s”" % arg)
70ad9458 1973
a83fa4ed 1974 if secret is None:
ecb9676d 1975 if PDTCRYPT_VERBOSE is True:
a83fa4ed 1976 noise ("ERROR: no password or key specified, trying $PDTCRYPT_PASSWORD")
ecb9676d
PG
1977 epw = os.getenv ("PDTCRYPT_PASSWORD")
1978 if epw is not None:
a83fa4ed
PG
1979 checked_secret (PDTCRYPT_SECRET_PW, epw.strip ())
1980
1981 if secret is None:
1982 if PDTCRYPT_VERBOSE is True:
1983 noise ("ERROR: no password or key specified, trying $PDTCRYPT_KEY")
1984 ek = os.getenv ("PDTCRYPT_KEY")
1985 if ek is not None:
1986 checked_secret (PDTCRYPT_SECRET_KEY, ek.strip ())
ecb9676d 1987
a83fa4ed 1988 if secret is None:
da82bc58 1989 if subcommand == PDTCRYPT_SUB_SCRYPT:
a83fa4ed 1990 bail ("ERROR: scrypt hash mode requested but no password given")
da82bc58 1991 elif mode & PDTCRYPT_DECRYPT:
a83fa4ed
PG
1992 bail ("ERROR: decryption requested but no password or key given")
1993
a808459e
PG
1994 if mode & PDTCRYPT_SPLIT and outsspec is None:
1995 bail ("ERROR: split mode is incompatible with stdout sink "
1996 "(the default)")
1997
1998 if subcommand == PDTCRYPT_SUB_SCAN and outsspec is None:
1999 pass # no output by default in scan mode
2000 elif mode & PDTCRYPT_SPLIT or subcommand == PDTCRYPT_SUB_SCAN:
2001 # destination must be directory
2002 if outsspec == "-":
2003 bail ("ERROR: mode is incompatible with stdout sink")
2004 try:
2005 try:
2006 os.makedirs (outsspec, 0o700)
2007 except FileExistsError:
2008 # if it’s a directory with appropriate perms, everything is
2009 # good; otherwise, below invocation of open(2) will fail
2010 pass
2011 outs = os.open (outsspec, os.O_DIRECTORY, 0o600)
2012 except FileNotFoundError as exn:
2013 bail ("ERROR: cannot create target directory “%s”" % outsspec)
2014 except NotADirectoryError as exn:
2015 bail ("ERROR: target path “%s” is not a directory" % outsspec)
2016 else:
2017 outs = deptdcrypt_mk_stream (PDTCRYPT_SINK, outsspec or "-")
2018
f41973a6
PG
2019 if subcommand == PDTCRYPT_SUB_SCAN:
2020 if insspec is None:
2021 bail ("ERROR: please supply an input file for scanning")
2022 if insspec == '-':
2023 bail ("ERROR: input must be seekable; please specify a file")
a808459e 2024 return True, partial (mode_scan, secret, insspec, outs, nacl=nacl)
f41973a6 2025
77058bab
PG
2026 if subcommand == PDTCRYPT_SUB_SCRYPT:
2027 if secret [0] == PDTCRYPT_SECRET_KEY:
2028 bail ("ERROR: scrypt mode requires a password")
2029 if insspec is not None and nacl is not None \
2030 or insspec is None and nacl is None :
2031 bail ("ERROR: please supply either an input file or "
2032 "the salt")
70ad9458
PG
2033
2034 # default to stdin
77058bab
PG
2035 ins = None
2036 if insspec is not None or subcommand != PDTCRYPT_SUB_SCRYPT:
2037 ins = deptdcrypt_mk_stream (PDTCRYPT_SOURCE, insspec or "-")
da82bc58
PG
2038
2039 if subcommand == PDTCRYPT_SUB_SCRYPT:
7b3940e5
PG
2040 return True, partial (mode_scrypt, secret [1].encode (), ins, nacl,
2041 fmt=scrypt_format)
da82bc58 2042
a83fa4ed 2043 return True, partial (mode_depdtcrypt, mode, secret, ins, outs)
15d3eefd
PG
2044
2045
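The order in which `parse_argv` looks for a secret (command line first, then `$PDTCRYPT_PASSWORD`, then `$PDTCRYPT_KEY`) can be condensed into a small function. This is a sketch under assumptions: `resolve_secret` is a hypothetical name, the values of the two `PDTCRYPT_SECRET_*` constants are placeholders, and the environment is passed in as a mapping so that the lookup can be exercised without touching the real process environment.

```python
import os

# Placeholder values; the real constants are defined elsewhere in crypto.py.
PDTCRYPT_SECRET_PW = 0
PDTCRYPT_SECRET_KEY = 1


def resolve_secret(cli_secret, environ=None):
    """Return a ``(kind, value)`` pair or ``None`` if no secret is known.

    A secret given on the command line takes precedence; otherwise the
    password variable is consulted before the key variable.
    """
    if environ is None:
        environ = os.environ
    if cli_secret is not None:
        return cli_secret

    fallbacks = ((PDTCRYPT_SECRET_PW, "PDTCRYPT_PASSWORD"),
                 (PDTCRYPT_SECRET_KEY, "PDTCRYPT_KEY"))
    for kind, var in fallbacks:
        val = environ.get(var)
        if val is not None:
            # surrounding whitespace is dropped, as in parse_argv
            return (kind, val.strip())
    return None
```

With both variables set and nothing on the command line, the password wins, which mirrors the order of the two `os.getenv` lookups in `parse_argv`.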
00b3cd10 2046def main (argv):
da82bc58 2047 ok, runner = parse_argv (argv)
f08c604b 2048
da82bc58 2049 if ok is True: return runner ()
15d3eefd 2050
da82bc58 2051 return 1
f08c604b 2052
00b3cd10
PG
2053
2054if __name__ == "__main__":
2055 sys.exit (main (sys.argv))
2056