Planet Debian - http://planet.debian.org/

Sylvain Beucler: dot-zed archive file format

10 September, 2017 - 20:50

TL;DR: I reverse-engineered the .zed encrypted archive format.
Following a clean-room design, I'm providing a description that can be implemented by a third party.
Interested?

(reference version at: https://www.beuc.net/zed/)

.zed archive file format

Introduction

Archives with the .zed extension are conceptually similar to an encrypted .zip file.

In addition to a specific format, .zed files support multiple users: files are encrypted using the archive master key, which itself is encrypted for each user and/or authentication method (password, RSA key through certificate or PKCS#11 token). Metadata such as filenames is partially encrypted.

.zed archives are used stand-alone or attached to e-mails with the help of an MS Outlook plugin. A variant, which is not covered here, can encrypt/decrypt MS Windows folders on the fly, like ecryptfs.

In the spirit of academic and independent research, this document provides a description of the file format and encryption algorithms for this encrypted file archive.

See the conventions section for conventions and acronyms used in this document.

Structure overview

The .zed file format is composed of several layers.

  • The main container uses the Compound File Binary format (MS-CFB), which is notably used by MS Office 97-2003 .doc files. It contains several streams:

    • Metadata stream: in OLE Property Set format (MS-OLEPS), contains 2 blobs in a specific Type-Length-Value (TLV) format:

      • _ctlfile: global archive properties and access list
        It is obfuscated by means of static-key AES encryption.
        The properties include archive initial filename and a global IV.
        A global encryption key is itself encrypted in each user entry.

      • _catalog: file list
        Contains each file's metadata, indexed by a 15-byte identifier (stored in the 16-byte 'file_id' field, whose last byte is usually NUL).
        Directories are supported.
        Full filename is encrypted using AES.
        File extension is (redundantly) stored in clear, and so are file metadata such as modification time.

    • Each file in the archive is compressed with zlib and encrypted with the standard AES algorithm, in a separate stream.
      Several encryption schemes and key sizes are supported.
      The file stream is split in chunks of 512 bytes, individually encrypted.

    • Optional streams contain additional metadata as well as pictures to display in the application background ("watermarks"). They are not discussed here.

Or as a diagram:

+----------------------------------------------------------------------------------------------------+
| .zed archive (MS-CFB)                                                                              |
|                                                                                                    |
|  stream #1                         stream #2                       stream #3...                    |
| +------------------------------+  +---------------------------+  +---------------------------+     |
| | metadata (MS-OLEPS)          |  | encryption (AES)          |  | encryption (AES)          |     |
| |                              |  | 512-bytes chunks          |  | 512-bytes chunks          |     |
| | +--------------------------+ |  |                           |  |                           |     |
| | | obfuscation (static key) | |  | +-----------------------+ |  | +-----------------------+ |     |
| | | +----------------------+ | |  |-| compression (zlib)    |-|  |-| compression (zlib)    |-|     |
| | | |_ctlfile (TLV)        | | |  | |                       | |  | |                       | | ... |
| | | +----------------------+ | |  | | +---------------+     | |  | | +---------------+     | |     | 
| | +--------------------------+ |  | | | file contents |     | |  | | | file contents |     | |     |
| |                              |  | | |               |     | |  | | |               |     | |     |
| | +--------------------------+ |  |-| +---------------+     |-|  |-| +---------------+     |-|     |
| | | _catalog (TLV)           | |  | |                       | |  | |                       | |     |
| | +--------------------------+ |  | +-----------------------+ |  | +-----------------------+ |     |
| +------------------------------+  +---------------------------+  +---------------------------+     |
+----------------------------------------------------------------------------------------------------+

Encryption schemes

Several AES key sizes are supported, such as 128 and 256 bits.

The Cipher Block Chaining (CBC) block cipher mode of operation is used to decrypt multiple AES 16-byte blocks, which means an initialisation vector (IV) is stored in clear along with the ciphertext.

All filenames and file contents are encrypted using the same encryption mode, key and IV (e.g. if you remove and re-add a file in the archive, the resulting stream will be identical).

No cleartext padding is used during encryption; instead, several end-of-stream handlers are available, so the ciphertext has exactly the size of the cleartext (e.g. the size of the compressed file).

The following variants were identified in the 'encryption_mode' field.

STREAM

This is the end-of-stream handler for:

  • obfuscated metadata encrypted with static AES key
  • filenames and files in archives with 'encryption_mode' set to "AES-CBC-STREAM"
  • any AES ciphertext of size < 16 bytes, regardless of encryption mode

This end-of-stream handler is apparently specific to the .zed format, and is applied when the cleartext does not end on a 16-byte boundary; in this case special processing is performed on the last partial 16-byte block.

The encryption and decryption phases are identical: let's assume the last partial block of cleartext (for encryption) or ciphertext (for decryption) was appended after all the complete 16-byte blocks of ciphertext:

  • the second-to-last block of the ciphertext is encrypted in AES-ECB mode (i.e. block cipher encryption only, without XORing with the IV)

  • then XOR-ed with the last partial block (hence truncated to the length of the partial block)

In either case, if the full ciphertext is less than one AES block (< 16 bytes), then the IV is used instead of the second-to-last block.
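The tail processing described above can be sketched in Python as follows. This is an illustration, not the application's code: Python's standard library has no AES, so the block cipher is passed in as a callable (`encrypt_block`, a name chosen here), and the demo uses a SHA-256-based stand-in (`toy_block`) that is not AES and only serves to show that the operation is its own inverse.

```python
import hashlib
from typing import Callable

BLOCK = 16  # AES block size in bytes


def stream_tail(encrypt_block: Callable[[bytes], bytes], iv: bytes,
                full_blocks_ct: bytes, partial: bytes) -> bytes:
    """Encrypt or decrypt the trailing partial block (same operation both ways)."""
    if len(full_blocks_ct) % BLOCK != 0:
        raise ValueError("full_blocks_ct must hold whole 16-byte blocks")
    if not 0 < len(partial) < BLOCK:
        raise ValueError("partial block must be 1 to 15 bytes long")
    # Last complete ciphertext block, or the IV when there is no complete block.
    previous = full_blocks_ct[-BLOCK:] if full_blocks_ct else iv
    keystream = encrypt_block(previous)  # raw block encryption, no IV XOR
    # zip() stops at the shorter input, which truncates the keystream.
    return bytes(p ^ k for p, k in zip(partial, keystream))


def toy_block(block: bytes) -> bytes:
    """Stand-in for AES block encryption (NOT AES, demo only)."""
    return hashlib.sha256(block).digest()[:BLOCK]


iv = bytes(16)
ciphertext_blocks = bytes(range(32))  # made-up ciphertext of two full blocks
tail_ct = stream_tail(toy_block, iv, ciphertext_blocks, b"hello")
tail_pt = stream_tail(toy_block, iv, ciphertext_blocks, tail_ct)
print(tail_pt)  # prints b'hello'
```

With a real AES implementation, `encrypt_block` would be single-block ECB encryption under the relevant key.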

CTS

CTS or CipherText Stealing is the end-of-stream handler for:

  • filenames and files in archives with 'encryption_mode' set to "AES-CBC-CTS".
    • exception: if the size of the ciphertext is < 16 bytes, then "STREAM" is used instead.

It matches the CBC-CS3 variant as described in Recommendation for Block Cipher Modes of Operation: Three Variants of Ciphertext Stealing for CBC Mode.

Empty cleartext

Since empty filenames or metadata are invalid, and since all files are compressed (resulting in a minimum 8-byte zlib cleartext), no empty cleartext was encrypted in the archive.

Metadata stream

The metadata stream is named 05356861616161716149656b7a6565636e576a33317a7868304e63 (hexadecimal), i.e. the character with code 5 followed by '5haaaaqaIekzeecnWj31zxh0Nc' (ASCII).

The format used is OLE Property Set (MS-OLEPS).

It introduces 2 property names "_ctlfile" (index 3) and "_catalog" (index 4), and 2 instances of said properties each containing an application-specific VT_BLOB (type 0x0041).

_ctlfile: obfuscated global properties and access list

This subpart is stored under index 3 ("_ctlfile") of the MS-OLEPS metadata.

It consists of:

  • static delimiter 0765921A2A0774534752073361719300 (hexadecimal) followed by 0100 (hexadecimal) (18 bytes total)
  • 16-byte IV
  • ciphertext
  • 1 uint32be representing the length of all the above
  • static delimiter 0765921A2A0774534752073361719300 (hexadecimal) followed by "ZoneCentral (R)" (ASCII) and a NUL byte (32 bytes total)

The ciphertext is encrypted with AES-CBC "STREAM" mode using 128-bit static key 37F13CF81C780AF26B6A52654F794AEF (hexadecimal) and the prepended IV so as to obfuscate the access list. The ciphertext is continuous and not split in chunks (unlike files), even when it is larger than 512 bytes.
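As an illustration of the layout listed above, the following Python sketch splits a raw _ctlfile blob into its IV and ciphertext. The function name `split_ctlfile` is made up for this example, and reading "the length of all the above" as header + IV + ciphertext is an assumption; the declared length is therefore returned for the caller to inspect rather than enforced. The sample blob is synthetic.

```python
import struct
from typing import Tuple

DELIMITER = bytes.fromhex("0765921A2A0774534752073361719300")
HEADER = DELIMITER + bytes.fromhex("0100")        # 18 bytes
TRAILER = DELIMITER + b"ZoneCentral (R)\x00"      # 32 bytes
IV_SIZE = 16


def split_ctlfile(blob: bytes) -> Tuple[bytes, bytes, int]:
    """Return (iv, ciphertext, declared_length) from a raw _ctlfile blob."""
    if len(blob) < len(HEADER) + IV_SIZE + 4 + len(TRAILER):
        raise ValueError("_ctlfile blob too short")
    if not blob.startswith(HEADER) or not blob.endswith(TRAILER):
        raise ValueError("unexpected _ctlfile delimiters")
    # The uint32be length sits just before the trailing delimiter.
    length_pos = len(blob) - len(TRAILER) - 4
    (declared_length,) = struct.unpack(">I", blob[length_pos:length_pos + 4])
    iv_end = len(HEADER) + IV_SIZE
    return blob[len(HEADER):iv_end], blob[iv_end:length_pos], declared_length


# Synthetic blob for illustration (made-up IV and ciphertext bytes).
sample_iv, sample_ct = bytes(range(16)), b"\xaa" * 40
body = HEADER + sample_iv + sample_ct
blob = body + struct.pack(">I", len(body)) + TRAILER
parsed_iv, parsed_ct, declared = split_ctlfile(blob)
print(declared == len(body))  # prints True
```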

The decrypted text contains properties in a TLV format as described in _ctlfile TLV:

  • global archive properties as a 'fileprops' structure,

  • extra archive properties as an 'archive_extraprops' structure,

  • users access list as a series of 'passworduser' and 'rsauser' entries.

Archives may include "mandatory" users that cannot be removed. They are typically used to add an enterprise-wide recovery RSA key to all archives. Extreme care must be taken to protect this key, as it can decrypt all past archives generated from within that company.

_catalog: file list

This subpart is stored under index 4 ("_catalog") of the MS-OLEPS metadata.

It contains a series of 'fileprops' TLV structures, one for each file or directory.

The file hierarchy can be reconstructed by checking the 'parent_directory_id' field of each file entry. If 'parent_directory_id' is 0 then the file is located at the top level of the hierarchy, otherwise it is located under the directory with the matching 'file_id'.
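The reconstruction can be sketched in Python as below. The input shape (a dict mapping each 'file_id' to its parent id and already-decrypted name), the helper name `build_paths`, the "/" separator and the treatment of an all-zero or empty parent id as "0" are all assumptions made for this example.

```python
from typing import Dict, Tuple


def build_paths(entries: Dict[bytes, Tuple[bytes, str]]) -> Dict[bytes, str]:
    """Map each file_id to its full path, following parent ids upwards."""
    paths: Dict[bytes, str] = {}

    def resolve(file_id: bytes, visiting: frozenset) -> str:
        if file_id in paths:
            return paths[file_id]
        if file_id in visiting:
            raise ValueError("cycle in parent_directory_id chain")
        parent_id, name = entries[file_id]
        if not any(parent_id):            # all-zero (or empty) id: top level
            path = name
        else:
            path = resolve(parent_id, visiting | {file_id}) + "/" + name
        paths[file_id] = path
        return path

    for file_id in entries:
        resolve(file_id, frozenset())
    return paths


# Made-up identifiers and names, for illustration only.
DOCS, REPORT, README = b"\x01" * 16, b"\x02" * 16, b"\x03" * 16
catalog = {
    DOCS: (bytes(16), "docs"),
    REPORT: (DOCS, "report.txt"),
    README: (bytes(16), "readme.txt"),
}
tree = build_paths(catalog)
print(sorted(tree.values()))  # prints ['docs', 'docs/report.txt', 'readme.txt']
```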

TLV format

This format is a series of fields:

  • 4 bytes for Type (specified as a 4-byte hexadecimal value below)
  • 4 bytes for value Length (uint32be)
  • Value

Value semantics depend on its Type. It may contain a uint32be integer, a UTF-16LE string, a character sequence, or an inner TLV structure.

Unless otherwise noted, TLV structures appear once.

Some fields are optional and may not be present at all (e.g. 'archive_createdwith').

Some fields are unique within a structure (e.g. 'files_iv'), others may be repeated within a structure to form a list (e.g. 'fileprops' and 'passworduser').

The following top-level types have been identified and are detailed in the next sections:

  • 80110600: fileprops, used for the file list as well as for the global archive properties
  • 001b0600: archive_extraprops
  • 80140600: accesslist

Some additional unidentified types may be present.
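A minimal Python sketch of a reader for this format follows. It assumes that Length counts only the Value bytes (not the 8-byte header), which the description suggests but does not state outright. The names `iter_tlv`, `dump` and `tlv` are invented for the example, and the container set simply lists the types this document marks as "TLV structure".

```python
import struct
from typing import Iterator, List, Tuple

# Types described in this document as TLV structures (they hold inner TLVs).
CONTAINERS = {"80110600", "001b0600", "80140600", "80610600", "80620600"}


def iter_tlv(buf: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (type as hex string, raw value) for each field in buf."""
    pos = 0
    while pos < len(buf):
        if pos + 8 > len(buf):
            raise ValueError("truncated TLV header")
        type_hex = buf[pos:pos + 4].hex()
        (length,) = struct.unpack(">I", buf[pos + 4:pos + 8])
        end = pos + 8 + length
        if end > len(buf):
            raise ValueError("TLV value runs past end of buffer")
        yield type_hex, buf[pos + 8:end]
        pos = end


def dump(buf: bytes, depth: int = 0) -> List[str]:
    """Render nested TLVs as indented lines, recursing into containers."""
    lines = []
    for type_hex, value in iter_tlv(buf):
        if type_hex in CONTAINERS:
            lines.append("  " * depth + type_hex)
            lines.extend(dump(value, depth + 1))
        else:
            lines.append("  " * depth + f"{type_hex}: {value.hex()}")
    return lines


def tlv(type_hex: str, value: bytes) -> bytes:
    """Build one field (used here only to create sample data)."""
    return bytes.fromhex(type_hex) + struct.pack(">I", len(value)) + value


# Synthetic 'fileprops' with encryption_mode=104 and encryption_strength=32.
sample = tlv("80110600",
             tlv("80270200", struct.pack(">I", 104)) +
             tlv("80260200", struct.pack(">I", 32)))
print("\n".join(dump(sample)))
```

Unknown types are printed as raw hexadecimal, which fits the remark that unidentified types may be present.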

_ctlfile TLV

  • 80110600: fileprops (TLV structure): global archive properties
    • 00230400: archive_pathname (UTF-16LE string): initial archive filename (past versions also leaked the full pathname of the initial archive)
    • 80270200: encryption_mode (uint32be): 103 for "AES-CBC-STREAM", 104 for "AES-CBC-CTS"
    • 80260200: encryption_strength (uint32be): AES key size, in bytes (e.g. 32 means AES with a 256-bit key)
    • 80280500: files_iv (sequence of bytes): global IV for all filenames and file contents
  • 001b0600: archive_extraprops (TLV structure): additional archive properties (optional)
    • 00c40500: archive_creationtime (FILETIME): date and time when archive was initially created (optional)
    • 00c00400: archive_createdwith (UTF-16LE string): uuid-like structure describing the application that initialized the archive (optional)
      {00000188-1000-3CA8-8868-36F59DEFD14D} is Zed! Free 1.0.188.
  • 80140600: accesslist (TLV structure): describe the users, their key encryption and their permissions
    • 80610600: passworduser (TLV structure): user identified by password (0 or more)
    • 80620600: rsauser (TLV structure): user identified by RSA key (via file or PKCS#11 token) (0 or more)
    • Fields common to passworduser and rsauser:
      • 80710400: login (UTF-16LE string): user name
      • 80720300: login_md5 (sequence of bytes): used by the application to search for a user name
      • 807e0100: priv1 (uchar): user privileges; present and set to 1 when user is admin (optional)
      • 00830200: priv2 (uint32be): user privileges; present and set to 2 when user is admin, present and set to 5 when user is marked as mandatory, e.g. for recovery keys (optional)
      • 80740500: files_key_ciphertext (sequence of bytes): the archive encryption key, itself encrypted
      • 00840500: user_creationtime (FILETIME): date and time when the user was added to the archive
    • passworduser-specific fields:
      • 80760500: pbe_salt (sequence of bytes): salt for PBE
      • 80770200: pbe_iter (uint32be): number of iterations for PBE
      • 80780200: pkcs12_hashfunc (uint32be): hash function used for PBE and PBA key derivation
      • 80790500: pba_checksum (sequence of bytes): password derived with PBA to check for password validity
      • 807a0500: pba_salt (sequence of bytes): salt for PBA
      • 807b0200: pba_iter (uint32be): number of iterations for PBA
    • rsauser-specific fields:
      • 807d0500: certificate (sequence of bytes): user X509 certificate in DER format

_catalog TLV

  • 80110600: fileprops (TLV structure): describe the archive files (0 or more)
    • 80300500: file_id (sequence of bytes): a 16-byte unique identifier
    • 80310400: filename_halfanon (UTF-16LE string): half-anonymized filename, e.g. File1.txt (leaking filename extension)
    • 00380500: filename_ciphertext (sequence of bytes): encrypted filename; may have a trailing NUL byte once decrypted
    • 80330500: file_size (uint64le): decompressed file size in bytes
    • 80340500: file_creationtime (FILETIME): file creation date and time
    • 80350500: file_lastwritetime (FILETIME): file last modification date and time
    • 80360500: file_lastaccesstime (FILETIME): file last access date and time
    • 00370500: parent_directory_id (sequence of bytes): file_id of the parent directory, 0 is top-level
    • 80320100: is_dir (uint32be): 1 if entry is directory (optional)

Decrypting the archive AES key

rsauser

The user accessing the archive will be authenticated by comparing his/her X509 certificate with the one stored in the 'certificate' field using DER format.

The 'files_key_ciphertext' field is then decrypted using the PKCS#1 v1.5 encryption mechanism, with the private key that matches the user certificate.

passworduser

An intermediary user key, a user IV and an integrity checksum will be derived from the user password, using the deprecated PKCS#12 method described in RFC 7292, Appendix B.

Note: this is not PKCS#5 (nor PBKDF1/PBKDF2); it is an incompatible method from PKCS#12 that notably does not use HMAC.

The 'pkcs12_hashfunc' field defines the underlying hash function. The following values have been identified:

  • 21: SHA-1
  • 22: SHA-256

PBA - Password-based authentication

The user accessing the archive will be authenticated by deriving an 8-byte sequence from his/her password.

The parameters for the derivation function are:

  • ID: 3
  • 'pba_salt': the salt, typically an 8-byte random sequence
  • 'pba_iter': the iteration count, typically 200000

The derivation is checked against 'pba_checksum'.

PBE - Password-based encryption

Once the user is identified, 2 new values are derived from the password with different parameters to produce the IV and the key decryption key, with the same hash function:

  • 'pbe_salt': the salt, typically an 8-byte random sequence
  • 'pbe_iter': the iteration count, typically 100000

The parameters specific to user key are:

  • ID: 1
  • size: 32

The user key needs to be truncated to a length of 'encryption_strength', as specified in bytes in the archive properties.

The parameters specific to user IV are:

  • ID: 2
  • size: 16

Once the key decryption key and the IV are derived, 'files_key_ciphertext' is decrypted using AES CBC, with PKCS#7 padding.
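The PBA and PBE derivations can be sketched in Python with the standard library alone. `pkcs12_kdf` below follows my reading of RFC 7292, Appendix B.2; the helper names and the `user` dict layout are invented for the example. The password is taken as raw bytes because the encoding is not stated in this section: RFC 7292 itself uses a UTF-16BE string with a two-byte NUL terminator, and the demo assumes the same. The final AES-CBC decryption is left out since Python ships no AES.

```python
import hashlib
import hmac

HASH_NAMES = {21: "sha1", 22: "sha256"}  # 'pkcs12_hashfunc' values listed above


def pkcs12_kdf(hash_name: str, password: bytes, salt: bytes,
               iterations: int, key_id: int, size: int) -> bytes:
    """Key derivation as read from RFC 7292, Appendix B.2."""
    probe = hashlib.new(hash_name)
    u, v = probe.digest_size, probe.block_size

    def fill(data: bytes) -> bytes:
        # Repeat data up to the next multiple of v bytes (empty stays empty).
        if not data:
            return b""
        target = v * ((len(data) + v - 1) // v)
        return (data * (target // len(data) + 1))[:target]

    diversifier = bytes([key_id]) * v
    state = bytearray(fill(salt) + fill(password))
    out = b""
    while len(out) < size:
        a = hashlib.new(hash_name, diversifier + bytes(state)).digest()
        for _ in range(iterations - 1):
            a = hashlib.new(hash_name, a).digest()
        out += a
        # B is the digest repeated to v bytes; add B + 1 to every v-byte block.
        b_plus_1 = int.from_bytes((a * (v // u + 1))[:v], "big") + 1
        for j in range(0, len(state), v):
            block = (int.from_bytes(state[j:j + v], "big") + b_plus_1) % (1 << (8 * v))
            state[j:j + v] = block.to_bytes(v, "big")
    return out[:size]


def check_password(password: bytes, user: dict, hash_name: str) -> bool:
    """PBA: derive 8 bytes with ID 3 and compare with 'pba_checksum'."""
    derived = pkcs12_kdf(hash_name, password, user["pba_salt"], user["pba_iter"], 3, 8)
    return hmac.compare_digest(derived, user["pba_checksum"])


def derive_key_and_iv(password: bytes, user: dict, hash_name: str,
                      encryption_strength: int):
    """PBE: ID 1 gives the key (truncated), ID 2 gives the IV."""
    key = pkcs12_kdf(hash_name, password, user["pbe_salt"], user["pbe_iter"], 1, 32)
    iv = pkcs12_kdf(hash_name, password, user["pbe_salt"], user["pbe_iter"], 2, 16)
    return key[:encryption_strength], iv


# Made-up user entry with tiny iteration counts so the demo runs instantly.
pw = "secret".encode("utf-16-be") + b"\x00\x00"   # encoding is an assumption
user = {"pba_salt": b"\x01" * 8, "pba_iter": 10, "pbe_salt": b"\x02" * 8, "pbe_iter": 5}
user["pba_checksum"] = pkcs12_kdf(HASH_NAMES[21], pw, user["pba_salt"], 10, 3, 8)
print(check_password(pw, user, HASH_NAMES[21]))   # prints True
user_key, user_iv = derive_key_and_iv(pw, user, HASH_NAMES[21], 16)
```

With the typical iteration counts quoted above (200000 and 100000), each derivation is deliberately slow.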

Identifying file streams

The name of the MS-CFB stream is derived by shuffling the bytes from the 'file_id' field and then encoding the result as hexadecimal.

The reordering is:

Initial  offset: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Shuffled offset: 3 2 1 0 5 4 7 6 8 9 10 11 12 13 14 15

The 16th byte is usually a NUL byte, hence the stream identifier is a 30-character-long string.
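The reordering can be expressed in a few lines of Python. Two details in this sketch are assumptions: that a trailing NUL byte is dropped before hex-encoding (my reading of the 30-character remark), and that the hexadecimal digits are lowercase. The function name `stream_name` and the sample identifier are made up.

```python
# Position i of the shuffled id takes the byte at ORDER[i] of 'file_id'.
ORDER = (3, 2, 1, 0, 5, 4, 7, 6, 8, 9, 10, 11, 12, 13, 14, 15)


def stream_name(file_id: bytes) -> str:
    """Derive the MS-CFB stream name from a 16-byte 'file_id'."""
    if len(file_id) != 16:
        raise ValueError("file_id must be 16 bytes long")
    shuffled = bytes(file_id[i] for i in ORDER)
    if shuffled[-1] == 0:
        shuffled = shuffled[:-1]   # assumption: the NUL byte is not encoded
    return shuffled.hex()          # assumption: lowercase hexadecimal


sample_id = bytes(range(1, 16)) + b"\x00"
print(stream_name(sample_id))  # prints 0403020106050807090a0b0c0d0e0f
```

Since the permutation only swaps bytes within pairs and quads, applying it twice gives back the original order.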

Decrypting files

The compressed stream is split in chunks of 512 bytes, each of them encrypted separately using AES CBC and the global archive encryption scheme. Decryption uses the global AES key (retrieved using the user credentials), and the global IV (retrieved from the deobfuscated archive metadata).

The IV for each chunk is computed by:

  • expressing the current chunk number as little endian on 16 bytes
  • XORing it with the global IV
  • encrypting with the global AES key in ECB mode (without IV).
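The three steps above can be sketched in Python as follows. As Python has no AES in its standard library, the block cipher is passed in as a callable (`encrypt_block`), and the demo uses a SHA-256-based stand-in (`toy_block`) that is not AES. Whether chunk numbering starts at 0 or 1 is not stated here, so the chunk number is left to the caller; all function names are invented for the example.

```python
import hashlib
from typing import Callable, List

CHUNK_SIZE = 512


def chunk_iv(encrypt_block: Callable[[bytes], bytes],
             global_iv: bytes, chunk_number: int) -> bytes:
    """Compute the per-chunk IV from the global IV and the chunk number."""
    counter = chunk_number.to_bytes(16, "little")           # step 1
    mixed = bytes(c ^ g for c, g in zip(counter, global_iv))  # step 2
    return encrypt_block(mixed)                              # step 3 (ECB, no IV)


def split_chunks(stream: bytes) -> List[bytes]:
    """Cut an encrypted file stream into 512-byte chunks (the last may be shorter)."""
    return [stream[i:i + CHUNK_SIZE] for i in range(0, len(stream), CHUNK_SIZE)]


def toy_block(block: bytes) -> bytes:
    """Stand-in for AES block encryption under the global key (NOT AES)."""
    return hashlib.sha256(block).digest()[:16]


global_iv = bytes(range(16))            # made-up IV
chunks = split_chunks(bytes(1030))      # made-up 1030-byte stream
ivs = [chunk_iv(toy_block, global_iv, n) for n in range(len(chunks))]
print([len(c) for c in chunks])  # prints [512, 512, 6]
```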

Each chunk is an independent stream and the decryption process involves end-of-stream handling even if this is not the end of the actual file. This is particularly important for the CTS handler.

Note: this is not to be confused with the CTR block cipher mode of operation, which operates differently and requires a nonce.

Decompressing files

Compressed streams are zlib streams with default compression options and can be decompressed following the zlib format.
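In Python this step maps directly onto the standard `zlib` module. The sketch below adds an optional check against the 'file_size' field from the catalog; the helper name `decompress_file` is made up, and the sample data is synthetic. It also shows the 8-byte minimum mentioned in the "Empty cleartext" section.

```python
import zlib
from typing import Optional


def decompress_file(decrypted: bytes, expected_size: Optional[int] = None) -> bytes:
    """Inflate a decrypted file stream and optionally verify its size."""
    data = zlib.decompress(decrypted)
    if expected_size is not None and len(data) != expected_size:
        raise ValueError("decompressed size does not match 'file_size'")
    return data


original = b"example file contents\n" * 10
restored = decompress_file(zlib.compress(original), expected_size=len(original))
print(restored == original)      # prints True
print(len(zlib.compress(b"")))   # prints 8
```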

Test cases

Excluded for brevity, cf. https://www.beuc.net/zed/#test-cases.

Conventions and references

Feedback

Feel free to send comments to beuc@beuc.net. If you have .zed files that you think are not covered by this document, please send them as well (replace sensitive files with other ones). The author's GPG key can be found at 8FF1CB6E8D89059F.

Copyright (C) 2017 Sylvain Beucler

Copying and distribution of this file, with or without modification, are permitted in any medium without royalty provided the copyright notice and this notice are preserved. This file is offered as-is, without any warranty.

Sylvain Beucler: dot-zed archive file format

10 September, 2017 - 20:50

TL,DR: I reverse-engineered the .zed encrypted archive format.
Following a clean-room design, I'm providing a description that can be implemented by a third-party.
Interested?

(reference version at: https://www.beuc.net/zed/)

.zed archive file format Introduction

Archives with the .zed extension are conceptually similar to an encrypted .zip file.

In addition to a specific format, .zed files support multiple users: files are encrypted using the archive master key, which itself is encrypted for each user and/or authentication method (password, RSA key through certificate or PKCS#11 token). Metadata such as filenames is partially encrypted.

.zed archives are used as stand-alone or attached to e-mails with the help of a MS Outlook plugin. A variant, which is not covered here, can encrypt/decrypt MS Windows folders on the fly like ecryptfs.

In the spirit of academic and independent research this document provides a description of the file format and encryption algorithms for this encrypted file archive.

See the conventions section for conventions and acronyms used in this document.

Structure overview

The .zed file format is composed of several layers.

  • The main container is using the (MS-CFB), which is notably used by MS Office 97-2003 .doc files. It contains several streams:

    • Metadata stream: in OLE Property Set format (MS-OLEPS), contains 2 blobs in a specific Type-Length-Value (TLV) format:

      • _ctlfile: global archive properties and access list
        It is obfuscated by means of static-key AES encryption.
        The properties include archive initial filename and a global IV.
        A global encryption key is itself encrypted in each user entry.

      • _catalog: file list
        Contains each file metadata indexed with a 15-bytes identifier.
        Directories are supported.
        Full filename is encrypted using AES.
        File extension is (redundantly) stored in clear, and so are file metadata such as modification time.

    • Each file in the archive compressed with zlib and encrypted with the standard AES algorithm, in a separate stream.
      Several encryption schemes and key sizes are supported.
      The file stream is split in chunks of 512 bytes, individually encrypted.

    • Optional streams, contain additional metadata as well as pictures to display in the application background ("watermarks"). They are not discussed here.

Or as a diagram:

+----------------------------------------------------------------------------------------------------+
| .zed archive (MS-CBF)                                                                              |
|                                                                                                    |
|  stream #1                         stream #2                       stream #3...                    |
| +------------------------------+  +---------------------------+  +---------------------------+     |
| | metadata (MS-OLEPS)          |  | encryption (AES)          |  | encryption (AES)          |     |
| |                              |  | 512-bytes chunks          |  | 512-bytes chunks          |     |
| | +--------------------------+ |  |                           |  |                           |     |
| | | obfuscation (static key) | |  | +-----------------------+ |  | +-----------------------+ |     |
| | | +----------------------+ | |  |-| compression (zlib)    |-|  |-| compression (zlib)    |-|     |
| | | |_ctlfile (TLV)        | | |  | |                       | |  | |                       | | ... |
| | | +----------------------+ | |  | | +---------------+     | |  | | +---------------+     | |     | 
| | +--------------------------+ |  | | | file contents |     | |  | | | file contents |     | |     |
| |                              |  | | |               |     | |  | | |               |     | |     |
| | +--------------------------+ |  |-| +---------------+     |-|  |-| +---------------+     |-|     |
| | | _catalog (TLV)           | |  | |                       | |  | |                       | |     |
| | +--------------------------+ |  | +-----------------------+ |  | +-----------------------+ |     |
| +------------------------------+  +---------------------------+  +---------------------------+     |
+----------------------------------------------------------------------------------------------------+
Encryption schemes

Several AES key sizes are supported, such as 128 and 256 bits.

The Cipher Block Chaining (CBC) block cipher mode of operation is used to decrypt multiple AES 16-byte blocks, which means an initialisation vector (IV) is stored in clear along with the ciphertext.

All filenames and file contents are encrypted using the same encryption mode, key and IV (e.g. if you remove and re-add a file in the archive, the resulting stream will be identical).

No cleartext padding is used during encryption; instead, several end-of-stream handlers are available, so the ciphertext has exactly the size of the cleartext (e.g. the size of the compressed file).

The following variants were identified in the 'encryption_mode' field.

STREAM

This is the end-of-stream handler for:

  • obfuscated metadata encrypted with static AES key
  • filenames and files in archives with 'encryption_mode' set to "AES-CBC-STREAM"
  • any AES ciphertext of size < 16 bytes, regardless of encryption mode

This end-of-stream handler is apparently specific to the .zed format, and applied when the cleartext's does not end on a 16-byte boundary ; in this case special processing is performed on the last partial 16-byte block.

The encryption and decryption phases are identical: let's assume the last partial block of cleartext (for encryption) or ciphertext (for decryption) was appended after all the complete 16-byte blocks of ciphertext:

  • the second-to-last block of the ciphertext is encrypted in AES-ECB mode (i.e. block cipher encryption only, without XORing with the IV)

  • then XOR-ed with the last partial block (hence truncated to the length of the partial block)

In either case, if the full ciphertext is less then one AES block (< 16 bytes), then the IV is used instead of the second-to-last block.

CTS

CTS or CipherText Stealing is the end-of-stream handler for:

  • filenames and files in archives with 'encryption_mode' set to "AES-CBC-CTS".
    • exception: if the size of the ciphertext is < 16 bytes, then "STREAM" is used instead.

It matches the CBC-CS3 variant as described in Recommendation for Block Cipher Modes of Operation: Three Variants of Ciphertext Stealing for CBC Mode.

Empty cleartext

Since empty filenames or metadata are invalid, and since all files are compressed (resulting in a minimum 8-byte zlib cleartext), no empty cleartext was encrypted in the archive.

metadata stream

It is named 05356861616161716149656b7a6565636e576a33317a7868304e63 (hexadecimal), i.e. the character with code 5 followed by '5haaaaqaIekzeecnWj31zxh0Nc' (ASCII).

The format used is OLE Property Set (MS-OLEPS).

It introduces 2 property names "_ctlfile" (index 3) and "_catalog" (index 4), and 2 instances of said properties each containing an application-specific VT_BLOB (type 0x0041).

_ctlfile: obfuscated global properties and access list

This subpart is stored under index 3 ("_ctlfile") of the MS-OLEPS metadata.

It consists of:

  • static delimiter 0765921A2A0774534752073361719300 (hexadecimal) followed by 0100 (hexadecimal) (18 bytes total)
  • 16-byte IV
  • ciphertext
  • 1 uint32be representing the length of all the above
  • static delimiter 0765921A2A0774534752073361719300 (hexadecimal) followed by "ZoneCentral (R)" (ASCII) and a NUL byte (32 bytes total)

The ciphertext is encrypted with AES-CBC "STREAM" mode using 128-bit static key 37F13CF81C780AF26B6A52654F794AEF (hexadecimal) and the prepended IV so as to obfuscate the access list. The ciphertext is continuous and not split in chunks (unlike files), even when it is larger than 512 bytes.

The decrypted text contain properties in a TLV format as described in _ctlfile TLV:

  • global archive properties as a 'fileprops' structure,

  • extra archive properties as a 'archive_extraprops' structure

  • users access list as a series of 'passworduser' and 'rsauser entries.

Archives may include "mandatory" users that cannot be removed. They are typically used to add an enterprise wide recovery RSA key to all archives. Extreme care must be taken to protect these key, as it can decrypt all past archives generated from within that company.

_catalog: file list

This subpart is stored under index 4 ("_catalog") of the MS-OLEPS metadata.

It contains a series of 'fileprops' TLV structures, one for each file or directory.

The file hierarchy can be reconstructed by checking the 'parent_id' field of each file entry. If 'parent_id' is 0 then the file is located at the top-level of the hierarchy, otherwise it's located under the directory with the matching 'file_id'.

TLV format

This format is a series of fields :

  • 4 bytes for Type (specified as a 4-bytes hexadecimal below)
  • 4 bytes for value Length (uint32be)
  • Value

Value semantics depend on its Type. It may contain an uint32be integer, a UTF-16LE string, a character sequence, or an inner TLV structure.

Unless otherwise noted, TLV structures appear once.

Some fields are optional and may not be present at all (e.g. 'archive_createdwith').

Some fields are unique within a structure (e.g. 'files_iv'), other may be repeated within a structure to form a list (e.g. 'fileprops' and 'passworduser').

The following top-level types that have been identified, and detailed in the next sections:

  • 80110600: fileprops, used for the file list as well as for the global archive properties
  • 001b0600: archive_extraprops
  • 80140600: accesslist

Some additional unidentified types may be present.

_ctlfile TLV
  • 80110600: fileprops (TLV structure): global archive properties
    • 00230400: archive_pathname (UTF-16LE string): initial archive filename (past versions also leaked the full pathname of the initial archive)
    • 80270200: encryption_mode (utf32be): 103 for "AES-CBC-STREAM", 104 for "AES-CBC-CTS"
    • 80260200: encryption_strength (utf32be): AES key size, in bytes (e.g. 32 means AES with a 256-bit key)
    • 80280500: files_iv (sequence of bytes): global IV for all filenames and file contents
  • 001b0600: archive_extraprops (TLV structure): additionnal archive properties (optional)
    • 00c40500: archive_creationtime (FILETIME): date and time when archive was initially created (optional)
    • 00c00400: archive_createdwith (UTF-16LE string): uuid-like structure describing the application that initialized the archive (optional)
      {00000188-1000-3CA8-8868-36F59DEFD14D} is Zed! Free 1.0.188.
  • 80140600: accesslist (TLV structure): describe the users, their key encryption and their permissions
    • 80610600: passworduser (TLV structure): user identified by password (0 or more)
    • 80620600: rsauser (TLV structure): user identified by RSA key (via file or PKCS#11 token) (0 or more)
    • Fields common to passworduser and rsauser:
      • 80710400: login (UTF-16LE string): user name
      • 80720300: login_md5 (sequence of bytes): used by the application to search for a user name
      • 807e0100: priv1 (uchar): user privileges; present and set to 1 when user is admin (optional)
      • 00830200: priv2 (uint32be): user privileges; present and set to 2 when user is admin, present and set to 5 when user is a marked as mandatory, e.g. for recovery keys (optional)
      • 80740500: files_key_ciphertext (sequence of bytes): the archive encryption key, itself encrypted
      • 00840500: user_creationtime (FILETIME): date and time when the user was added to the archive
    • passworduser-specific fields:
      • 80760500: pbe_salt (sequence of bytes): salt for PBE
      • 80770200: pbe_iter (uint32be): number of iterations for PBE
      • 80780200: pkcs12_hashfunc (uint32be): hash function used for PBE and PBA key derivation
      • 80790500: pba_checksum (sequence of bytes): password derived with PBA to check for password validity
      • 807a0500: pba_salt (sequence of bytes): salt for PBA
      • 807b0200: pba_iter (uint32be): number of iterations for PBA
    • rsauser-specific fields:
      • 807d0500: certificate (sequence of bytes): user X509 certificate in DER format
_catalog TLV
  • 80110600: fileprops (TLV structure): describe the archive files (0 or more)
    • 80300500: file_id (sequence of bytes): a 16-byte unique identifier
    • 80310400: filename_halfanon (UTF-16LE string): half-anonymized filename, e.g. File1.txt (leaking filename extension)
    • 00380500: filename_ciphertext (sequence of bytes): encrypted filename; may have a trailing NUL byte once decrypted
    • 80330500: file_size (uint64le): decompressed file size in bytes
    • 80340500: file_creationtime (FILETIME): file creation date and time
    • 80350500: file_lastwritetime (FILETIME): file last modification date and time
    • 80360500: file_lastaccesstime (FILETIME): file last access date and time
    • 00370500: parent_directory_id (sequence of bytes): file_id of the parent directory, 0 is top-level
    • 80320100: is_dir (uint32be): 1 if entry is directory (optional)
Decrypting the archive AES key rsauser

The user accessing the archive will be authenticated by comparing his/her X509 certificate with the one stored in the 'certificate' field using DER format.

The 'files_key_ciphertext' field is then decrypted using the PKCS#1 v1.5 encryption mechanism, with the private key that matches the user certificate.

passworduser

An intermediary user key, a user IV and an integrity checksum will be derived from the user password, using the deprecated PKCS#12 method as described at rfc7292 appendix B.

Note: this is not PKCS#5 (nor PBKDF1/PBKDF2), this is an incompatible method from PKCS#12 that notably does not use HMAC.

The 'pkcs12_hashfunc' field defines the underlying hash function. The following values have been identified:

  • 21: SHA-1
  • 22: SHA-256
PBA - Password-based authentication

The user accessing the archive will be authenticated by deriving an 8-byte sequence from his/her password.

The parameters for the derivation function are:

  • ID: 3
  • 'pba_salt': the salt, typically an 8-byte random sequence
  • 'pba_iter': the iteration count, typically 200000

The derivation is checked against 'pba_checksum'.

PBE - Password-based encryption

Once the user is identified, 2 new values are derived from the password with different parameters to produce the IV and the key decryption key, with the same hash function:

  • 'pbe_salt': the salt, typically an 8-byte random sequence
  • 'pbe_iter': the iteration count, typically 100000

The parameters specific to user key are:

  • ID: 1
  • size: 32

The user key needs to be truncated to a length of 'encryption_strength', as specified in bytes in the archive properties.

The parameters specific to user IV are:

  • ID: 2
  • size: 16

Once the key decryption key and the IV are derived, 'files_key_ciphertext' is decrypted using AES CBC, with PKCS#7 padding.
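
Two small steps around that decryption can be sketched without any AES library: truncating the derived user key, and removing the PKCS#7 padding from the decrypted bytes. The helper names are hypothetical, and keeping the first 'encryption_strength' bytes of the key is my assumption of what "truncated" means here. The AES CBC call itself is left to whichever crypto library is at hand.

```python
def truncate_user_key(user_key: bytes, encryption_strength: int) -> bytes:
    """Keep the first `encryption_strength` bytes of the derived user key."""
    if not 0 < encryption_strength <= len(user_key):
        raise ValueError("encryption_strength does not fit the derived key")
    return user_key[:encryption_strength]


def pkcs7_unpad(plaintext: bytes, block_size: int = 16) -> bytes:
    """Strip PKCS#7 padding from the output of an AES CBC decryption."""
    if not plaintext or len(plaintext) % block_size:
        raise ValueError("length is not a positive multiple of the block size")
    pad = plaintext[-1]
    # Every padding byte must carry the padding length itself.
    if not 1 <= pad <= block_size or plaintext[-pad:] != bytes([pad]) * pad:
        raise ValueError("invalid PKCS#7 padding")
    return plaintext[:-pad]
```

A padding error at this point is a cheap hint that the wrong key or IV was derived.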

Identifying file streams

The name of the MS-CFB stream is derived by shuffling the bytes from the 'file_id' field and then encoding the result as hexadecimal.

The reordering is:

Initial  offset: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Shuffled offset: 3 2 1 0 5 4 7 6 8 9 10 11 12 13 14 15

The 16th byte is usually a NUL byte, hence the stream identifier is a 30-character-long string.
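
The reordering table can be written down directly in Python. This is only a sketch: the function name is made up, lowercase hexadecimal is an assumption, and so is the handling of the 16th byte (dropped when it is NUL, to obtain the 30-character name described above).

```python
# Shuffled offsets from the table above: output byte i is input byte ORDER[i].
ORDER = (3, 2, 1, 0, 5, 4, 7, 6, 8, 9, 10, 11, 12, 13, 14, 15)


def stream_name(file_id: bytes) -> str:
    """Derive the MS-CFB stream name from a 'file_id' of up to 16 bytes."""
    padded = file_id.ljust(16, b"\x00")
    shuffled = bytes(padded[i] for i in ORDER)
    # Assumption: a NUL 16th byte is not encoded, giving 30 hex characters.
    if shuffled[15] == 0:
        shuffled = shuffled[:15]
    return shuffled.hex()
```

Since the permutation only swaps bytes pairwise, applying it twice gives back the original order, so the same table works in both directions.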

Decrypting files

The compressed stream is split into chunks of 512 bytes, each of them encrypted separately using AES CBC and the global archive encryption scheme. Decryption uses the global AES key (retrieved using the user credentials), and the global IV (retrieved from the deobfuscated archive metadata).

The IV for each chunk is computed by:

  • expressing the current chunk number as little endian on 16 bytes
  • XORing it with the global IV
  • encrypting with the global AES key in ECB mode (without IV).
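
The three steps above can be sketched as follows. To stay independent of any particular AES library, the single-block ECB encryption is passed in as a callable. That parameter, the function name and the question of whether chunks are counted from 0 or 1 are assumptions not settled by this description.

```python
def chunk_iv(chunk_number: int, global_iv: bytes, ecb_encrypt) -> bytes:
    """Compute the per-chunk IV.

    `ecb_encrypt` is a caller-supplied function that encrypts one 16-byte
    block with the global AES key in ECB mode.
    """
    if len(global_iv) != 16:
        raise ValueError("the global IV must be 16 bytes long")
    # Step 1: chunk number as a 16-byte little-endian integer.
    counter = chunk_number.to_bytes(16, "little")
    # Step 2: XOR with the global IV.
    mixed = bytes(a ^ b for a, b in zip(counter, global_iv))
    # Step 3: encrypt the result with the global AES key (ECB, no IV).
    return ecb_encrypt(mixed)
```

Passing an identity function as `ecb_encrypt` is a convenient way to inspect the intermediate XOR value while debugging.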

Each chunk is an independent stream and the decryption process involves end-of-stream handling even if this is not the end of the actual file. This is particularly important for the CTS handler.

Note: this is not to be confused with the CTR block cipher mode of operation, which operates differently and requires a nonce.

Decompressing files

Compressed streams are zlib streams with default compression options and can be decompressed following the zlib format.
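
In Python, for example, the standard zlib module is enough for this step. The sketch below assumes the decrypted chunks have already been concatenated. The optional size check against the catalog's 'file_size' field is my own addition, and the function name is illustrative.

```python
import zlib


def decompress_stream(compressed: bytes, file_size=None) -> bytes:
    """Inflate a decrypted zlib stream, optionally checking the expected size."""
    data = zlib.decompress(compressed)
    if file_size is not None and len(data) != file_size:
        raise ValueError("decompressed size does not match 'file_size'")
    return data
```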

Test cases

Excluded for brevity, cf. https://www.beuc.net/zed/#test-cases.

Conventions and references

Feedback

Feel free to send comments to beuc@beuc.net. If you have .zed files that you think are not covered by this document, please send them as well (replace sensitive files with other ones). The author's GPG key can be found at 8FF1CB6E8D89059F.

Copyright (C) 2017 Sylvain Beucler

Copying and distribution of this file, with or without modification, are permitted in any medium without royalty provided the copyright notice and this notice are preserved. This file is offered as-is, without any warranty.

Charles Plessy: Summary of the discussion on off-line keys.

10 September, 2017 - 19:06

Last month, there has been an interesting discussion about off-line GnuPG keys and their storage systems on the debian-project@l.d.o mailing list. I tried to summarise it in the Debian wiki, in particular by creating two new pages.

Joachim Breitner: Less parentheses

10 September, 2017 - 17:10

Yesterday, at the Haskell Implementers Workshop 2017 in Oxford, I gave a lightning talk titled “syntactic musings”, where I presented three possibly useful syntactic features that one might want to add to a language like Haskell.

The talk caused quite some heated discussions, and since the Internet likes heated discussion, I will happily share these ideas with you.

Context aka. Sections

This is probably the most relevant of the three proposals. Consider a bunch of related functions, say analyseExpr and analyseAlt, like these:

analyseExpr :: Expr -> Expr
analyseExpr (Var v) = change v
analyseExpr (App e1 e2) =
  App (analyseExpr e1) (analyseExpr e2)
analyseExpr (Lam v e) = Lam v (analyseExpr e)
analyseExpr (Case scrut alts) =
  Case (analyseExpr scrut) (analyseAlt <$> alts)

analyseAlt :: Alt -> Alt
analyseAlt (dc, pats, e) = (dc, pats, analyseExpr e)

You have written them, but now you notice that you need to make them configurable, e.g. to do different things in the Var case. You thus add a parameter to all these functions, and hence an argument to every call:

type Flag = Bool

analyseExpr :: Flag -> Expr -> Expr
analyseExpr flag (Var v) = if flag then change1 v else change2 v
analyseExpr flag (App e1 e2) =
  App (analyseExpr flag e1) (analyseExpr flag e2)
analyseExpr flag (Lam v e) = Lam v (analyseExpr (not flag) e)
analyseExpr flag (Case scrut alts) =
  Case (analyseExpr flag scrut) (analyseAlt flag <$> alts)

analyseAlt :: Flag -> Alt -> Alt
analyseAlt flag (dc, pats, e) = (dc, pats, analyseExpr flag e)

I find this code problematic. The intention was: “flag is a parameter that an external caller can use to change the behaviour of this code, but when reading and reasoning about this code, flag should be considered constant.”

But this intention is neither easily visible nor enforced. And in fact, in the above code, flag does “change”, as analyseExpr passes something else in the Lam case. The idiom is indistinguishable from the environment idiom, where a locally changing environment (such as “variables in scope”) is passed around.

So we are facing exactly the same problem as when reasoning about a loop in an imperative program with mutable variables. And we (pure functional programmers) should know better: We cherish immutability! We want to bind our variables once and have them scope over everything we need to scope over!

The solution I’d like to see in Haskell is common in other languages (Gallina, Idris, Agda, Isar), and this is what it would look like here:

type Flag = Bool
section (flag :: Flag) where
  analyseExpr :: Expr -> Expr
  analyseExpr (Var v) = if flag then change1 v else change2 v
  analyseExpr (App e1 e2) =
    App (analyseExpr e1) (analyseExpr e2)
  analyseExpr (Lam v e) = Lam v (analyseExpr e)
  analyseExpr (Case scrut alts) =
    Case (analyseExpr scrut) (analyseAlt <$> alts)

  analyseAlt :: Alt -> Alt
  analyseAlt (dc, pats, e) = (dc, pats, analyseExpr e)

Now the intention is clear: Within a clearly marked block, flag is fixed and when reasoning about this code I do not have to worry that it might change. Either all variables will be passed to change1, or all to change2. An important distinction!

Therefore, inside the section, the type of analyseExpr does not mention Flag, whereas outside its type is Flag -> Expr -> Expr. This is a bit unusual, but not completely: You see precisely the same effect in a class declaration, where the type signatures of the methods do not mention the class constraint, but outside the declaration they do.

Note that idioms like implicit parameters or the Reader monad do not give the guarantee that the parameter is (locally) constant.

More details can be found in the GHC proposal that I prepared, and I invite you to raise concern or voice support there.

Curiously, this problem must have bothered me for longer than I remember: I discovered that seven years ago, I wrote a Template Haskell based implementation of this idea in the seal-module package!

Less parentheses 1: Bulleted argument lists

The next two proposals are all about removing parentheses. I believe that Haskell’s tendency to express complex code with no or few parentheses is one of its big strengths, as it makes it easier to visually parse programs. A common idiom is to use the $ operator to separate a function from a complex argument without parentheses, but it does not help when there are multiple complex arguments.

For that case I propose to steal an idea from the surprisingly successful markup language markdown, and use bulleted lists to indicate multiple arguments:

foo :: Baz
foo = bracket
        • some complicated code
          that is evaluated first
        • other complicated code for later
        • even more complicated code

I find this very easy to visually parse and navigate.

It is actually possible to do this now, if one defines (•) = id with infixl 0 •. A dedicated syntax extension (-XArgumentBullets) is preferable:

  • It only really adds readability if the bullets are nicely vertically aligned, which the compiler should enforce.
  • I would like to use $ inside these complex arguments, and multiple operators of precedence 0 do not mix. (infixl -1 • would help).
  • It should be possible to nest these, and distinguish different nesting levels based on their indentation.

Less parentheses 2: Whitespace precedence

The final proposal is the most daring. I am convinced that it improves readability and should be considered when creating a new language. As for Haskell, I am at the moment not proposing this as a language extension (but could be convinced to do so if there is enough positive feedback).

Consider this definition of append:

(++) :: [a] -> [a] -> [a]
[]     ++ ys = ys
(x:xs) ++ ys = x : (xs++ys)

Imagine you were explaining the last line to someone orally. How would you speak it? One common way to do so is to not read the parentheses out loud, but rather speak parenthesised expressions more quickly and add pauses otherwise.

We can do the same in syntax!

(++) :: [a] -> [a] -> [a]
[]   ++ ys = ys
x:xs ++ ys = x : xs++ys

The rule is simple: A sequence of tokens without any space is implicitly parenthesised.
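
To make the rule concrete, here is a toy sketch (in Python rather than Haskell, and not part of any actual proposal) that rewrites a line by adding the implied parentheses. It assumes the input contains no parentheses of its own and treats any run of symbol characters as one operator token; the function name is made up for illustration.

```python
import re

# An identifier, a number, or a run of symbol characters (an operator or "[]").
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_']*|[0-9]+|[^\sA-Za-z0-9_']+")


def explicit_parens(line: str) -> str:
    """Parenthesise every space-free chunk that holds more than one token."""
    chunks = []
    for chunk in line.split():
        if len(TOKEN.findall(chunk)) > 1:
            chunk = "(" + chunk + ")"
        chunks.append(chunk)
    return " ".join(chunks)
```

Applied to the last line of the definition above, explicit_parens("x:xs ++ ys = x : xs++ys") gives back "(x:xs) ++ ys = x : (xs++ys)", which is the conventional spelling.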

The reaction I got in Oxford was horror and disgust. And that is understandable – we are very used to ignoring spacing when parsing expressions (unless it is indentation, of course. Then we are no longer horrified, as our non-Haskell colleagues are when they see our code).

But I am convinced that once you let the rule sink in, you will have no problem parsing such code with ease, and soon even with greater ease than the parenthesised version. It is a very natural thing to look at the general structure, identify “compact chunks of characters”, mentally group them, and then go and separately parse the internals of the chunks and how the chunks relate to each other. More natural than first scanning everything for ( and ), matching them up, building a mental tree, and then digging deeper.

Incidentally, there was a non-programmer present during my presentation, and while she did not openly contradict the dismissive groan of the audience, I later learned that she found this variant quite obvious to understand and easier to read than the parenthesised code.

Some FAQs about this:

  • What about an operator with space on one side but not on the other?
    I’d simply forbid that, and hence enforce readable code.
  • Do operator sections still require parenthesis?
    Yes, I’d say so.
  • Does this overrule operator precedence?
    Yes! a * b+c == a * (b+c).
  • What is a token?
    Good question, and I am not decided. In particular: Is a parenthesised expression a single token? If so, then (Succ a)+b * c parses as ((Succ a)+b) * c, otherwise it should probably simply be illegal.
  • Can we extend this so that one space binds tighter than two spaces, and so on?
    Yes we can, but really, we should not.
  • This is incompatible with Agda’s syntax!
    Indeed it is, and I really like Agda’s mixfix syntax. Can’t have everything.
  • Has this been done before?
    I have not seen it in any language, but Lewis Wall has blogged this idea before.

Well, let me know what you think!

Lior Kaplan: PHP 7.2 is coming… mcrypt extension isn’t

10 September, 2017 - 15:56

Early September, it’s about 3 months before PHP 7.2 is expected to be released (schedule here). One of the changes is the removal of the mcrypt extension after it was deprecated in PHP 7.1. The main problem with the mcrypt extension is that it is based on libmcrypt, which has been abandoned by its upstream since 2007. That’s 10 years of keeping a library alive, moving the burden to distributions’ security teams. But this isn’t new, Remi already wrote about this two years ago: “About libmcrypt and php-mcrypt”.

But with the removal of the extension from the PHP code base (about F**King time), what was only recommended “nicely” until now will be enforced. And forcing people means some noise, although an alternative exists: PHP’s own openssl extension. But like many migrations that require code changes, it’s going slowly.

The goal of this post is to reach out to the PHP ecosystem, map the components (mostly frameworks and applications) that still require/recommend mcrypt, and pressure them to fix it before PHP 7.2 is released. I’ll appreciate the readers’ help with this mapping in the comments.

For example, Laravel‘s release notes for 5.1:

In previous versions of Laravel, encryption was handled by the mcrypt PHP extension. However, beginning in Laravel 5.1, encryption is handled by the openssl extension, which is more actively maintained.

Or, on the other hand, Joomla 3 requirements still mention mcrypt.

mcrypt safe:

mcrypt dependant:

For those who really need mcrypt, it is part of PECL, PHP’s extensions repository. You’re welcome to compile it at your own risk.


Filed under: Debian GNU/Linux, PHP

Russell Coker: Observing Reliability

9 September, 2017 - 17:18

Last year I wrote about how great my latest Thinkpad is [1] in response to a discussion about whether a Thinkpad is still the “Rolls Royce” of laptops.

It was a few months after writing that post that I realised that I omitted an important point. After I had that laptop for about a year the DVD drive broke and made annoying clicking sounds all the time in addition to not working. I removed the DVD drive and the result was that the laptop was lighter and used less power without missing any feature that I desired. As I had installed Debian on that laptop by copying the hard drive from my previous laptop I had never used the DVD drive for any purpose. After a while I got used to my laptop being like that and the gaping hole in the side of the laptop where the DVD drive used to be didn’t even register to me. I would prefer it if Lenovo sold Thinkpads in the T series without DVD drives, but it seems that only the laptops with tiny screens are designed to lack DVD drives.

For my use of laptops this doesn’t change the conclusion of my previous post. Now the T420 has been in service for almost 4 years which makes the cost of ownership about $75 per year. $1.50 per week as a tax deductible business expense is very cheap for such a nice laptop. About a year ago I installed a SSD in that laptop, it cost me about $250 from memory and made it significantly faster while also reducing heat problems. The depreciation on the SSD about doubles the cost of ownership of the laptop, but it’s still cheaper than a mobile phone and thus not in the category of things that are expected to last for a long time – while also giving longer service than phones usually do.

One thing that’s interesting to consider is the fact that I forgot about the broken DVD drive when writing about this. I guess every review has an unspoken caveat of “this works well for me but might suck badly for your use case”. But I wonder how many other things that are noteworthy I’m forgetting to put in reviews because they just don’t impact my use. I don’t think that I am unusual in this regard, so reading multiple reviews is the sensible thing to do.


François Marier: TLS Authentication on Freenode and OFTC

9 September, 2017 - 11:52

In order to easily authenticate with IRC networks such as OFTC and Freenode, it is possible to use client TLS certificates (also known as SSL certificates). In fact, it turns out that it's very easy to set up both on irssi and on znc.

Generate your TLS certificate

On a machine with good entropy, run the following command to create a keypair that will last for 10 years:

openssl req -nodes -newkey rsa:2048 -keyout user.pem -x509 -days 3650 -out user.pem -subj "/CN=<your nick>"

Then extract your key fingerprint using this command:

openssl x509 -sha1 -noout -fingerprint -in user.pem | sed -e 's/^.*=//;s/://g'
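
If openssl is not at hand on the machine where you need the fingerprint, the same value can be computed with a few lines of Python, since the fingerprint is simply the SHA-1 digest of the DER-encoded certificate. This is a sketch under that assumption; the function name is made up, and it only looks at the first certificate block in the file (user.pem also holds the private key, which is skipped).

```python
import base64
import hashlib
import re

CERT_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL)


def cert_fingerprint(pem_text: str) -> str:
    """Return the SHA-1 fingerprint of the first certificate, without colons."""
    match = CERT_BLOCK.search(pem_text)
    if match is None:
        raise ValueError("no certificate found")
    # The PEM body is the base64 encoding of the DER certificate.
    der = base64.b64decode("".join(match.group(1).split()))
    return hashlib.sha1(der).hexdigest().upper()
```

Calling cert_fingerprint(open("user.pem").read()) should then print the same string as the openssl and sed pipeline above.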
Share your fingerprints with NickServ

On each IRC network, do this:

/msg NickServ IDENTIFY Password1!
/msg NickServ CERT ADD <your fingerprint>

in order to add your fingerprint to the access control list.

Configure ZNC

To configure znc, start by putting the key in the right place:

cp user.pem ~/.znc/users/<your nick>/networks/oftc/moddata/cert/

and then enable the built-in cert plugin for each network in ~/.znc/configs/znc.conf:

<Network oftc>
    ...
            LoadModule = cert
    ...
</Network>
<Network freenode>
    ...
            LoadModule = cert
    ...
</Network>
Configure irssi

For irssi, do the same thing but put the cert in ~/.irssi/user.pem and then change the OFTC entry in ~/.irssi/config to look like this:

{
  address = "irc.oftc.net";
  chatnet = "OFTC";
  port = "6697";
  use_tls = "yes";
  tls_cert = "~/.irssi/user.pem";
  tls_verify = "yes";
  autoconnect = "yes";
}

and the Freenode one to look like this:

{
  address = "chat.freenode.net";
  chatnet = "Freenode";
  port = "7000";
  use_tls = "yes";
  tls_cert = "~/.irssi/user.pem";
  tls_verify = "yes";
  autoconnect = "yes";
}

That's it. That's all you need to replace password authentication with a much stronger alternative.

Vincent Fourmond: Extract many attachments from many mails in one go using ripmime

9 September, 2017 - 04:42
I was recently looking for a way to extract many attachments from a series of emails. I first had a look at the AttachmentExtractor thunderbird plugin, but it seems very old and not maintained anymore. So I've come up with another very simple solution that also works with any other mail client.

Just copy all the mails you want to extract attachments from to a single (temporary) mail folder, find out which file holds the mail folder and use ripmime on that file (ripmime is packaged for Debian). For my case, it looked like:

~ ripmime -i .icedove/XXXXXXX.default/Mail/pop.xxxx/tmp -d target-directory

Simple solution, but it saved me quite some time. Hope it helps!

Sven Hoexter: munin with TLS

8 September, 2017 - 21:41

Primarily a note for my future self so I don't have to find out what I did in the past once more.

If you're running some smaller systems scattered around the internet, without connecting them with a VPN, you might want your munin master and nodes to communicate with TLS and validate certificates. If you remember what to do, it's a rather simple and straightforward process. To manage the PKI I'll utilize the well known easyrsa script collection. For this special purpose CA I'll go with a flat layout. So it's one root certificate issuing all server and client certificates directly. Some very basic docs can also be found in the munin wiki.

master setup

For your '/etc/munin/munin.conf':

tls paranoid
tls_verify_certificate yes
tls_private_key /etc/munin/master.key
tls_certificate /etc/munin/master.crt
tls_ca_certificate /etc/munin/ca.crt
tls_verify_depth 1

A node entry with TLS will look like this:

[node1.stormbind.net]
    address [2001:db8::]
    use_node_name yes

Important points here:

  • "tls_certificate" is a Web Client Authentication certificate. The master connects to the nodes as a client.
  • "tls_ca_certificate" is the root CA certificate.
  • If you'd like to disable TLS connections, for example for localhost, set "tls disabled" in the node block.

For easy-rsa the following command invocations are relevant:

./easyrsa init-pki
./easyrsa build-ca
./easyrsa gen-req master
./easyrsa sign-req client master
./easyrsa set-rsa-pass master nopass
node setup

For your '/etc/munin/munin-node.conf':

tls paranoid
tls_verify_certificate yes
tls_private_key /etc/munin/node1.key
tls_certificate /etc/munin/node1.crt
tls_ca_certificate /etc/munin/ca.crt
tls_verify_depth 1

For easy-rsa the following command invocations are relevant:

./easyrsa gen-req node1
./easyrsa sign-req server node1
./easyrsa set-rsa-pass node1 nopass

Important points here:

  • "tls_certificate" on the node must be a server certificate.
  • You have to provide the CA here as well so we can verify the client certificate provided by the munin master.

Steinar H. Gunderson: Licensing woes

8 September, 2017 - 05:45

On releasing modified versions of GPLv3 software in binary form only (quote anonymized):

And in my opinion it's perfectly ok to give out a binary release of a project, that is a work in progress, so that people can try it out and coment on it. It's easier for them to have it as binary and not need to compile it themselfs. If then after a (long) while the code is still only released in binary form, then it's ok to start a discussion. But only for a quick test, that is unneccessary. So people, calm down and enjoy life!

I wonder at what point we got here.

Gunnar Wolf: It was thirty years ago today... (and a bit more): My first ever public speech!

8 September, 2017 - 01:35

I came across a folder with the most unexpected treasure trove: The text for my first ever public speech! (and some related materials)
In 1985, being nine years old, I went to the IDESE school, to learn Logo. I found my diploma over ten years ago and blogged about it in this same space. Of course, I don't expect any of you to remember what I wrote twelve years ago about a (then) twenty years old piece of paper!

I add to this very old stuff about Gunnar the four pages describing my game, Evitamono ("Avoid the monkey", approximately). I still remember the game quite vividly, including traumatic issues which were quite common back then; I wrote that «the sprites were accidentally deleted twice and the game once». I remember several of my peers telling about such experiences. Well, that is good if you account for the second system syndrome!

I also found the amazing course material for how to program sound and graphics in C64 BASIC. That was a course taken by ten year old kids. Kids that understood that you had to write [255,129,165,244,219,165,0,102] (see pages 3-5) into a memory location starting at 53248 to redefine a character so it looked like the graphic element you wanted. Of course, it was done with a set of POKEs, like everything on the C64. Or that you could program sound by setting the seven SID registers for each of the three voices (low frequency, high frequency, low pulse, high pulse, wave control, wave length, wave amplitude) in memory locations 54272 through 54292... And so on and on and on...

And as a proof that I did take the course:

...I don't think I could make most of my current BSc students make sense out of what is in the manual. But, being a kid in the 1980s, that was the only way to get a computer to do what you wanted. Yay for primitivity! :-D

Attachments:

  • Speech for "Evitamono" (1.29 MB)
  • Course material for sound and graphics programming in C64 BASIC (15.82 MB)
  • Proof that I was there! (4.86 MB)

Lior Kaplan: FOSScamp Syros 2017 – day 3

7 September, 2017 - 22:13

The 3rd day should have started with a Debian sprint and then a LibreOffice one, taking advantage of the fact that I was still attending, as that was my last day. But plans don’t always work out and we started 2 hours late. When everybody arrived, we got everyone together for a short daily meeting (scrum style). The people were divided into 3 teams for translating: Debian Installer, LibreOffice and Gnome. For each team we made a short list of what was left and what to start with. And in the end, who does what, so there would be no toe stepping. I was really proud of this and felt it was time well spent.

The current translation percentage for Albanian in LibreOffice is 60%. So my recommendation to the team is to translate master only and not touch the help translation. My plans ahead would be to improve the translation as much as possible for LibreOffice 6.0 and, near the branching point (set to November 20th by the release schedule), decide if it’s doable for the 6.0 life time or to set the goal at 6.1. In the 2nd case, we might try to backport translations to 6.0.

For the translation itself, I told the team about the KeyID language pack and referred them to the nightly builds. These tools should help with keeping the translation quality high.

For the Debian team, after deciding who works on what, I asked Silva to review the others’ work, as doing it myself started to take more and more of my time. It’s also good that the reviewer knows the target language and, unlike me, can catch more than syntax-only mistakes. Another point is that she’s more easily available to the team while I’m leaving soon, so I hope this reviewer role will stay part of the team.

With the time left I mostly worked on my own tasks, which were packaging the Albanian dictionary, resulting in https://packages.debian.org/sid/myspell-sq and making sure the dictionary is also part of LibreOffice resulting in https://gerrit.libreoffice.org/#/c/41906/ . When it is accepted, I want to upload it to the LibreOffice repository so all users can download and use the dictionary.

During the voyage home (ferry, bus, plane and train), I mailed Sergio Durigan Junior, my NM applicant, with a set of questions. My first action as an AM (:

Overall FOSScamp results for Albanian translation were very close to the goal I set (100%):

  • Albanian (sq) level1 – 99%
  • Albanian (sq) level2 – 25% (the rest is pending at #874497)
  • Albanian (sq) level3 – 100%

That’s the result of work by Silva Arapi, Eva Vranici, Redon Skikuli, Anisa Kuci and Nafie Shehu.


Filed under: Debian GNU/Linux, i18n & l10n, LibreOffice

Thomas Lange: My recent FAI activities

7 September, 2017 - 22:03

During DebConf 17 in Montréal I had a FAI demo session (video), where I showed how to create a customized installation CD and how to create a diskimage using the same configuration. This diskimage is ready for use with a VM software or can be booted inside a cloud environment.

During the last weeks I was working on FAI 5.4, which will be released in a few weeks. If you want to test it, use

deb https://fai-project.org/download beta-testing koeln

in your sources.list file.

The most important new feature will be the cross architecture support. I managed to create an ARM64 diskimage on an x86 host and boot this inside Qemu. Currently I am learning how to flash images onto my new Hikey960 board for booting my own Debian images on real hardware. The embedded world is still new for me and very different with respect to the boot process.

At DebConf, I also worked on debootstrap. I produced a set of patches which can speed up debootstrap by a factor of 2. See #871835 for details.


Reproducible builds folks: Reproducible Builds: Weekly report #123

7 September, 2017 - 16:54

Here's what happened in the Reproducible Builds effort between Sunday August 27 and Saturday September 2 2017:

Talks and presentations

Holger Levsen talked about our progress and our still-far goals at BornHack 2017 (Video).

Toolchain development and fixes

The Debian FTP archive will now reject changelogs where different entries have the same timestamps.

UDD now uses reproducible-tracker.json (~25MB) which ignores our tests for Debian unstable, instead of our full set of results in reproducible.json. Our tests for Debian unstable use a stricter definition of "reproducible" than what was recently added to Debian policy, and these stricter tests are currently more unreliable.

Packages reviewed and fixed, and bugs filed

Patches sent upstream:

Debian bugs filed:

Debian packages NMU-uploaded:

Reviews of unreproducible packages

25 package reviews have been added, 50 have been updated and 86 have been removed in this week, adding to our knowledge about identified issues.

Weekly QA work

During our reproducibility testing, FTBFS bugs have been detected and reported by:

  • Adrian Bunk (46)
  • Martín Ferrari (1)
  • Steve Langasek (1)
diffoscope development

Version 86 was uploaded to unstable by Mattia Rizzolo. It included previous weeks' contributions from:

  • Mattia Rizzolo
    • tests/binary: skip a test if the 'distro' module is not available.
    • Some code quality and style improvements.
  • Guangyuan Yang
    • tests/iso9660: support both cdrtools' and genisoimage's versions of isoinfo.
  • Chris Lamb
    • comparators/xml: Use name attribute over path to avoid leaking comparison full path in output.
    • Tidy diffoscope.progress a little.
  • Ximin Luo
    • Add a --tool-prefix-binutils CLI flag. Closes: #869868
    • On non-GNU systems, prefer some tools that start with "g". Closes: #871029
    • presenters/html: Don't traverse children whose parents were already limited. Closes: #871413
  • Santiago Torres-Arias
    • diffoscope.progress: Support the new fork of python-progressbar. Closes: #873157
reprotest development

Development continued in git with contributions from:

  • Ximin Luo:
    • Add -v/--verbose which is a bit more popular.
    • Make it possible to omit "auto" when building packages.
    • Refactor how the config file works, in preparation for new features.
    • chown -h for security.
Misc.

This week's edition was written by Ximin Luo, Chris Lamb, Bernhard M. Wiedemann and Holger Levsen & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

John Goerzen: Switching to xmonad + Gnome – and ditching a Mac

7 September, 2017 - 09:43

I have been using XFCE with xmonad for years now. I’m not sure exactly how many, but at least 6 years, if not closer to 10. Today I threw in the towel and switched to Gnome.

More recently, at a new job, I was given a Macbook Pro. I wasn’t entirely sure what to think of this, but I thought I’d give it a try. I found MacOS to be extremely frustrating and confining. It had no real support for a tiling window manager, and although projects like amethyst tried to approximate what xmonad can do on Linux, they were just too limited by the platform and were clunky. Moreover, the entire UI was surprisingly sluggish; maybe that was an induced effect from animations, but I don’t think that explains it. A Debian stretch install, even on inferior hardware, was snappy in a way that MacOS never was. So I have requested to swap for a laptop that will run Debian. The strange use of Command instead of Control for things, combined with the overall lack of configurability of keybindings, meant that I was going to always be fighting muscle memory moving from one platform to another. Not only that, but being back in the world of a Free Software OS means a lot.

Now then, back to the xmonad and XFCE situation. XFCE once worked very well with xmonad. Over the years, this got more challenging. Around the jessie (XFCE 4.10) time, I had to be very careful about when I would let it save my session, because it would easily break. With stretch, I had to write custom scripts because the panel wouldn’t show up properly, and even some application icons would be invisible, if things were started in a certain order. This took much trial and error and was still cumbersome.

Gnome 3, with its tightly-coupled Gnome Shell, has never been compatible with other window managers — at least not directly. A person could have always used MATE with xmonad — but a lot of people that run XFCE tend to have some Gnome 3 apps (for instance, evince) anyhow. Cinnamon also wouldn’t work with xmonad, because it is simply another tightly-coupled shell instead of Gnome Shell. And then today I discovered gnome-flashback. gnome-flashback is a Gnome 3 environment that uses the traditional X approach with a separate window manager (metacity of yore by default). Sweet.

It turns out that Debian’s xmonad has built-in support for it. If you know the secret: apt-get install gnome-session-flashback (OK, it’s not so secret; it’s even in xmonad’s README.Debian these days). Install that, plus gnome and gdm3, and things are nice. Configure xmonad with GNOME support and poof – goodness right out of the box, selectable from the gdm sessions list.

I still have some gripes about Gnome’s configurability (or lack thereof). But I’ve got to say: This environment is the first one I’ve ever used that got external display switching very nearly right without any configuration, and I include MacOS in that. Plug in an external display, and poof – it’s configured and set up. You can hit a toggle key (Windows+P by default) to change the configurations, or use the Display section in gnome-control-center. Unplug it, and it instantly reconfigures itself to put everything back on the laptop screen. Yessss! I used to have scripts to do this in the wheezy/jessie days. XFCE in stretch had numerous annoying failures in this area which rendered the internal display completely dark until the next reboot – very frustrating. With Gnome, it just works. And, even if you have “suspend on lid closed” turned on, if the system is powered up and hooked up to an external display, it will keep running even if the lid is closed, figuring you must be using it on the external screen. Another thing the Mac wouldn’t do well.

All in all, some pretty good stuff here. I continue to be impressed by stretch. It is darn impressive to put this OS on generic hardware and have it outshine the closed-ecosystem Mac!

Mike Gabriel: MATE 1.18 landed in Debian testing

6 September, 2017 - 16:04

This is to announce that finally all MATE Desktop 1.18 components have landed in Debian testing (aka buster).

Credits

Again a big thanks to the packaging team (esp. Vangelis Mouhtsis and Martin Wimpress, but also to Jeremy Bicha for constant advice and Aron Xu for joining the Debian+Ubuntu MATE Packaging Team and merging all the Ubuntu zesty and artful branches back to master).

Fully Available on all Debian-supported Architectures

The very special thing about this MATE 1.18 series for Debian is that MATE is now available on all Debian hardware architectures. See the "Buildd" column on our DDPO overview page [1]. Thanks to everyone in the Debian porters realm for providing feedback on my porting questions.

References

Kees Cook: security things in Linux v4.13

6 September, 2017 - 06:01

Previously: v4.12.

Here’s a short summary of some of the interesting security things in Sunday’s v4.13 release of the Linux kernel:

security documentation ReSTification
The kernel has been switching to formatting documentation with ReST, and I noticed that none of the Documentation/security/ tree had been converted yet. I took the opportunity to take a few passes at formatting the existing documentation and, at Jon Corbet’s recommendation, split it up between end-user documentation (which is mainly how to use LSMs) and developer documentation (which is mainly how to use various internal APIs). A bunch of these docs need some updating, so maybe with the improved visibility, they’ll get some extra attention.

CONFIG_REFCOUNT_FULL
Since Peter Zijlstra implemented the refcount_t API in v4.11, Elena Reshetova (with Hans Liljestrand and David Windsor) has been systematically replacing atomic_t reference counters with refcount_t. As of v4.13, there are now close to 125 conversions with many more to come. However, there were concerns over the performance characteristics of the refcount_t implementation from the maintainers of the net, mm, and block subsystems. In order to assuage these concerns and help the conversion progress continue, I added an “unchecked” refcount_t implementation (identical to the earlier atomic_t implementation) as the default, with the fully checked implementation now available under CONFIG_REFCOUNT_FULL. The plan is that for v4.14 and beyond, the kernel can grow per-architecture implementations of refcount_t that have performance characteristics on par with atomic_t (as done in grsecurity’s PAX_REFCOUNT).
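To make the difference between the two modes concrete, here is a toy Python model (not kernel code). It assumes the behaviour commonly described for the checked implementation: the counter saturates at its maximum instead of wrapping, and refuses to increment from zero. The class names, the bump() helper and the 8-bit width are hypothetical, chosen only to keep the example short.

```python
class UncheckedCounter:
    """Wraps around silently, like a plain fixed-width counter."""

    def __init__(self, value=1, bits=8):
        self.mask = (1 << bits) - 1
        self.value = value & self.mask

    def inc(self):
        # Overflow wraps to zero, which a caller would read as "free me".
        self.value = (self.value + 1) & self.mask


class CheckedCounter:
    """Saturates at the maximum and rejects increments from zero."""

    def __init__(self, value=1, bits=8):
        self.max = (1 << bits) - 1
        self.value = min(value, self.max)

    def inc(self):
        if self.value == 0:
            # A zero count means the object may already have been freed.
            raise RuntimeError("increment from zero: possible use-after-free")
        if self.value < self.max:
            self.value += 1
        # At the maximum the counter stays pinned (the object is leaked,
        # which is safer than freeing it while references still exist).


def bump(counter, times):
    """Increment a counter repeatedly and return the final value."""
    for _ in range(times):
        counter.inc()
    return counter.value
```

In this model, 255 increments take the unchecked 8-bit counter from 1 back to 0, while the checked one stays pinned at 255.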

CONFIG_FORTIFY_SOURCE
Daniel Micay created a version of glibc’s FORTIFY_SOURCE compile-time and run-time protection for finding overflows in the common string (e.g. strcpy, strcmp) and memory (e.g. memcpy, memcmp) functions. The idea is that since the compiler already knows the size of many of the buffer arguments used by these functions, it can already build in checks for buffer overflows. When all the sizes are known at compile time, this can actually allow the compiler to fail the build instead of continuing with a proven overflow. When only some of the sizes are known (e.g. destination size is known at compile-time, but source size is only known at run-time) run-time checks are added to catch any cases where an overflow might happen. Adding this found several places where minor leaks were happening, and Daniel and I chased down fixes for them.

One interesting note about this protection is that it only examines the size of the whole object (via __builtin_object_size(..., 0)). If you have a string within a structure, CONFIG_FORTIFY_SOURCE as currently implemented will make sure only that you can’t copy beyond the structure (but therefore, you can still overflow the string within the structure). The next step in enhancing this protection is to switch from 0 (above) to 1, which will use the closest surrounding subobject (e.g. the string). However, there are a lot of cases where the kernel intentionally copies across multiple structure fields, which means more fixes before this higher level can be enabled.
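The whole-object versus subobject distinction can be illustrated with a small Python model built on ctypes. This is only a sketch of the size arithmetic, not of the compiler builtin itself; the Record structure and the helper names are hypothetical.

```python
import ctypes


class Record(ctypes.Structure):
    # Hypothetical structure: an 8-byte string followed by another field.
    _fields_ = [
        ("name", ctypes.c_char * 8),
        ("flags", ctypes.c_uint32),
    ]


def whole_object_bound(struct_type, field_name):
    """Bytes from the start of the field to the end of the enclosing
    structure: the limit a mode-0 style check would enforce."""
    field = getattr(struct_type, field_name)
    return ctypes.sizeof(struct_type) - field.offset


def subobject_bound(struct_type, field_name):
    """Bytes in the field itself: the limit a mode-1 style check
    would enforce."""
    return getattr(struct_type, field_name).size


def copy_allowed(length, bound):
    """A copy passes the check when it fits within the bound."""
    return length <= bound
```

With this layout, a 10-byte copy into name passes the whole-object check (bound 12) even though it spills into flags, and is rejected by the subobject check (bound 8).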

NULL-prefixed stack canary
Rik van Riel and Daniel Micay changed how the stack canary is defined on 64-bit systems to always make sure that the leading byte is zero. This provides a deterministic defense against overflowing string functions (e.g. strcpy), since they will either stop an overflowing read at the NULL byte, or be unable to write a NULL byte, thereby always triggering the canary check. This does reduce the entropy from 64 bits to 56 bits for overflow cases where NULL bytes can be written (e.g. memcpy), but the trade-off is worth it. (Besides, x86_64’s canary was 32-bits until recently.)
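Here is a rough Python simulation of why a zero leading byte defeats strcpy-style overflows. It assumes a little-endian layout (as on x86_64), so the lowest-order byte of the canary comes first in memory, and a simplified frame of [buffer][canary][return address]. All names are hypothetical and the model ignores everything else that lives on a real stack.

```python
import secrets

CANARY_MASK = 0xFFFFFFFFFFFFFF00  # clears the lowest-order byte


def make_canary():
    """Random 64-bit canary whose first byte in memory is zero."""
    return secrets.randbits(64) & CANARY_MASK


def canary_bytes(canary):
    """In-memory representation, assuming little-endian byte order."""
    return canary.to_bytes(8, "little")


def strcpy_into_frame(buffer_size, canary, payload):
    """Model strcpy(buffer, payload) on a [buffer][canary][return] frame.

    Returns the canary bytes and return-address bytes after the copy.
    """
    frame = bytearray(buffer_size) + canary_bytes(canary) + bytes(8)
    # strcpy stops reading at the first NUL, then writes a terminating NUL.
    nul = payload.find(b"\x00")
    data = (payload if nul < 0 else payload[:nul]) + b"\x00"
    data = data[:len(frame)]  # keep the model inside the frame
    frame[:len(data)] = data
    return (bytes(frame[buffer_size:buffer_size + 8]),
            bytes(frame[buffer_size + 8:]))
```

Even an attacker who has leaked the canary gets nowhere in this model: replaying the correct value makes the copy stop at the canary's zero byte, so the return address is never reached, while any payload without a NUL clobbers that zero byte and fails the check.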

IPC refactoring
Partially in support of allowing IPC structure layouts to be randomized by the randstruct plugin, Manfred Spraul and I reorganized the internal layout of how IPC is tracked in the kernel. The resulting allocations are smaller and much easier to deal with, even if I initially missed a few needed container_of() uses.

randstruct gcc plugin
I ported grsecurity’s clever randstruct gcc plugin to upstream. This plugin allows structure layouts to be randomized on a per-build basis, providing a probabilistic defense against attacks that need to know the location of sensitive structure fields in kernel memory (which is most attacks). By moving things around in this fashion, attackers need to perform much more work to determine the resulting layout before they can mount a reliable attack.
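The core idea of a per-build layout can be sketched in a few lines of Python. This is a loose illustration only: the real plugin works on compiler internals and is considerably more involved, and the field names, the seed strings and the fixed 8-byte field size are all hypothetical.

```python
import random

# Hypothetical structure members, in source order.
FIELDS = ["uid", "gid", "cred", "ops", "flags"]


def randomized_layout(fields, build_seed):
    """Return the field order for one build, derived from its seed."""
    rng = random.Random(build_seed)
    layout = list(fields)
    rng.shuffle(layout)
    return layout


def field_offsets(layout, field_size=8):
    """Map each field to its byte offset, assuming equal-sized fields."""
    return {name: index * field_size for index, name in enumerate(layout)}
```

The same seed always reproduces the same layout (so one build is internally consistent), while an attacker without the seed has to discover the offset of, say, cred before relying on it.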

Unfortunately, due to the timing of the development cycle, only the “manual” mode of randstruct landed in upstream (i.e. marking structures with __randomize_layout). v4.14 will also have the automatic mode enabled, which randomizes all structures that contain only function pointers.

A large number of fixes to support randstruct have been landing from v4.10 through v4.13, most of which were already identified and fixed by grsecurity, but many were novel, either in newly added drivers, as whitelisted cross-structure casts, refactorings (like IPC noted above), or in a corner case on ARM found during upstream testing.

lower ELF_ET_DYN_BASE
One of the issues identified from the Stack Clash set of vulnerabilities was that it was possible to collide stack memory with the highest portion of a PIE program’s text memory since the default ELF_ET_DYN_BASE (the lowest possible random position of a PIE executable in memory) was already so high in the memory layout (specifically, 2/3rds of the way through the address space). Fixing this required teaching the ELF loader how to load interpreters as shared objects in the mmap region instead of as a PIE executable (to avoid potentially colliding with the binary it was loading). As a result, the PIE default could be moved down to ET_EXEC (0x400000) on 32-bit, entirely avoiding this subset of Stack Clash attacks. 64-bit could be moved to just above the 32-bit address space (0x100000000), leaving the entire 32-bit region open for VMs to do 32-bit addressing, but late in the cycle it was discovered that Address Sanitizer couldn’t handle it moving. With most of the Stack Clash risk only applicable to 32-bit, fixing 64-bit has been deferred until there is a way to teach Address Sanitizer how to load itself as a shared object instead of as a PIE binary.
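A little arithmetic shows how much room the 32-bit change buys. The Python sketch below assumes a 3 GiB user address space (a common 32-bit split, but an assumption here, not something stated above) and takes "2/3rds of the way through" literally; the function and constant names are hypothetical.

```python
GIB = 1 << 30

TASK_SIZE_32 = 3 * GIB   # assumed size of the 32-bit user address space
NEW_BASE_32 = 0x400000   # the ET_EXEC address mentioned above


def old_pie_base(task_size):
    """Old default: two thirds of the way through the address space."""
    return task_size // 3 * 2


def headroom(task_size, base):
    """Address space between the PIE base and the top, where the stack lives."""
    return task_size - base


old_base = old_pie_base(TASK_SIZE_32)
old_room = headroom(TASK_SIZE_32, old_base)
new_room = headroom(TASK_SIZE_32, NEW_BASE_32)
```

Under these assumptions the old base sits at 0x80000000 with only 1 GiB between it and the top of the address space, while the new base leaves almost the full 3 GiB.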

early device randomness
I noticed that early device randomness wasn’t actually getting added to the kernel entropy pools, so I fixed that to improve the effectiveness of the latent_entropy gcc plugin.

That’s it for now; please let me know if I missed anything. As a side note, I was rather alarmed to discover that due to all my trivial ReSTification formatting, and tiny FORTIFY_SOURCE and randstruct fixes, I made it into the most active 4.13 developers list (by patch count) at LWN with 76 patches: a whopping 0.6% of the cycle’s patches. ;)

Anyway, the v4.14 merge window is open!

© 2017, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

Gunnar Wolf: Made with Creative Commons: Over half translated, yay!

6 September, 2017 - 02:05

An image speaks for a thousand words...

And our translation project is worth several thousand words!
I am very happy and surprised to say we have surpassed the 50% mark of the Made with Creative Commons translation project. We have translated 666 out of 1210 strings (yay for 3v1l numbers)!
I really have to thank Weblate for hosting us and allowing for collaboration to happen there. And, of course, I have to thank the people that have jumped on board and helped the translation. We are over halfway there! Let's keep pushing!



PS - If you want to join the project, just get in Weblate and start translating right away, either to Spanish or other languages! (Polish, Dutch and Norwegian Bokmål are on their way) If you translate into Spanish, *please* read and abide by the specific Spanish translation guidelines.

Chris Lamb: Ask the dumb questions

5 September, 2017 - 18:51

In the same way it is vital to ask the "smart questions", it is equally important to ask the dumb ones.

Whilst your milieu might be, say, comparing and contrasting the finer details of commission structures between bond brokers, if you aren't quite sure of the topic, learn to be bold and confident enough to ask: I'm sorry, but what actually is a bond?

Don't consider this to be an all-or-nothing affair. After all, you might have at least some idea about what a bond is. Rather, adjust your tolerance to also ask for clarification when you are merely slightly unsure or merely slightly uncertain about a concept, term or reference.

So why do this? Most obviously, you are learning something and expanding your knowledge about the world, but a clarification can avoid problems later if you were mistaken in your assumptions.

Not only that, asking "can you explain that?" or admitting "I don't follow…" is being honest with yourself, and the vulnerability you show when admitting your ignorance opens you up to others, leading to closer friendships and working relationships.

We clearly have a tendency to want to come across as knowledgeable or, perhaps more honestly, we don't want to appear dumb or uninformed as it will bruise our ego. But the precise opposite is true: nodding and muddling your way through conversations you only partly understand is unlikely to cultivate true feelings of self-respect and a healthy self-esteem.

Since adopting this approach I have found I've rarely derailed the conversation. In fact, speaking up not only encourages and flatters others that you care about their subject, it has invariably led to related matters which are not only more inclusive but actually novel and interesting to all present.

So push through the voice in your head and be that elephant in the room. After all, you might not be the only person thinking it. If it helps, try reframing it to yourself as helping others…

You'll be finding it effortless soon enough. Indeed, asking the dumb question is actually a positive feedback loop where each question you pose helps you make others in the future. Excellence is not an act, but a habit.

Junichi Uekawa: It's already September.

5 September, 2017 - 07:26
It's already September. I didn't write much code last month. I wrote a CSV parser and felt a little depressed after reading RFC 4180. None of my CSV files were in CRLF.

Creative Commons License: the copyright of each article belongs to its respective author.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported license.