Planet Debian

Subscribe to Planet Debian feed
Planet Debian -
Updated: 35 min 30 sec ago

Chris Lamb: Free software activities in September 2016

1 October, 2016 - 04:44

Here is my monthly update covering what I have been doing in the free software world (previous month):

Reproducible builds

Whilst anyone can inspect the source code of free software for malicious flaws, most Linux distributions provide binary (or "compiled") packages to end users.

The motivation behind the Reproducible Builds effort is to allow verification that no flaws have been introduced — either maliciously and accidentally — during this compilation process by promising identical binary packages are always generated from a given source.

My work in the Reproducible Builds project was also covered in our weekly reports #71, #72, #71 & #74.

I made the following improvements to our tools:


diffoscope is our "diff on steroids" that will not only recursively unpack archives but will transform binary formats into human-readable forms in order to compare them.

  • Added a global Progress object to track the status of the comparison process allowing for graphical and machine-readable status indicators. I also blogged about this feature in more detail.
  • Moved the global Config object to a more Pythonic "singleton" pattern and ensured that constraints are checked on every change.


disorderfs is our FUSE filesystem that deliberately introduces nondeterminism into the results of system calls such as readdir(3).

  • Display the "disordered" behaviour we intend to show on startup. (#837689)
  • Support relative paths in command-line parameters (previously only absolute paths were permitted).


strip-nondeterminism is our tool to remove specific information from a completed build.

  • Fix an issue where temporary files were being left on the filesystem and add a test to avoid similar issues in future. (#836670)
  • Print an error if the file to normalise does not exist. (#800159)
  • Testsuite improvements:
    • Set the timezone in tests to avoid a FTBFS and add a File::StripNondeterminism::init method to the API to to set tzset everywhere. (#837382)
    • "Smoke test" the strip-nondeterminism(1) and dh_strip_nondeterminism(1) scripts to prevent syntax regressions.
    • Add a testcase for .jar file ordering and normalisation.
    • Check the stripping process before comparing file attributes to make it less confusing on failure.
    • Move to a lookup table for descriptions of stat(1) indices and use that for nicer failure messages.
    • Don't uselessly test whether the inode number has changed.
  • Run perlcritic across the codebase and adopt some of its prescriptions including explicitly using oct(..) for integers with leading zeroes, avoiding mixing high and low-precedence booleans, ensuring subroutines end with a return statement, etc.

I also submitted 4 patches to fix specific reproducibility issues in golang-google-grpc, nostalgy, python-xlib & torque.

Patches contributed Debian LTS

This month I have been paid to work 12.75 hours on Debian Long Term Support (LTS). In that time I did the following:

  • "Frontdesk" duties, triaging CVEs, etc.
  • Issued DLA 608-1 for mailman fixing a CSRF vulnerability.
  • Issued DLA 611-1 for jsch correcting a path traversal vulnerability.
  • Issued DLA 620-1 for libphp-adodb patching a SQL injection vulnerability.
  • Issued DLA 631-1 for unadf correcting a buffer underflow issue.
  • Issued DLA 634-1 for dropbear fixing a buffer overflow when parsing ASN.1 keys.
  • Issued DLA 635-1 for dwarfutils working around an out-of-bounds read issue.
  • Issued DLA 638-1 for the SELinux policycoreutils, patching a sandbox escape issue.
  • Enhanced Brian May's find-work --unassigned switch to take an optional "except this user" argument.
  • Marked matrixssl and inspircd as being unsupported in the current LTS version.
  • python-django 1:1.10.1-1 — New upstream release and ensure that django-admin startproject foo creates files with the correct shebang under Python 3.
  • gunicorn:
    • 19.6.0-5 — Don't call chown(2) if it would be a no-op to avoid failure under snap.
    • 19.6.0-6 — Remove now-obsolete conffiles and logrotate scripts; they should have been removed in 19.6.0-3.
  • redis:
    • 3.2.3-2 — Call ulimit -n 65536 by default from SysVinit scripts to normalise the behaviour with systemd. I also bumped the Debian package epoch as the "2:" prefix made it look like we are shipping version 2.x. I additionaly backported this upload to Debian Jessie.
    • 3.2.4-1 — New upstream release, add missing -ldl for dladdr(3) & add missing dependency on lsb-base.
  • python-redis (2.10.5-2) — Bump python-hiredis to Suggests to sync with Ubuntu and move to a machine-readable debian/copyright. I also backported this upload to Debian Jessie.
  • adminer (4.2.5-3) — Move mysql-server dependencies to default-mysql-server. I also backported this upload to Debian Jessie.
  • gpsmanshp (1.2.3-5) on behalf of the QA team:
    • Move to "minimal" debhelper style, making the build reproducible. (#777446 & #792991)
    • Reorder linker command options to build with --as-needed (#729726) and add hardening flags.
    • Move to machine-readable copyright file, add missing #DEBHELPER# tokens to postinst and prerm scripts, tidy descriptions & other debian/control fields and other smaller changes.

I sponsored the upload of 5 packages from other developers:

I also NMU'd:

RC bugs

I filed 37 FTBFS bugs against csoundqt, cups-filters, dymo-cups-drivers, easytag, erlang-p1-oauth2, erlang-p1-sqlite3, erlang-p1-xmlrpc, erlang-redis-client, fso-datad, gnome-python-desktop, gnote, gstreamermm-1.0, gtkglextmm, gupnp-dlna, haskell-hmatrix-gsl, jdeb, kryo-serializers, libcmrt, libfso-glib, libmonitoring-livestatus-perl, librasterlite2, network-manager, print-manager, psychtoolbox-3, python-3to2, python-tidylib, recutils, slang2, snd, sugar, tj3, transmission-remote-gtk, vino, webkit2pdf, xml-core, xml-core & xml-core.

I additionally filed 2 "important" bugs for packages that access the internet during build against gnupg2 & libgdata.

FTP Team

As a Debian FTP assistant I ACCEPTed 147 packages: alljoyn-services-1604, android-platform-external-doclava, android-platform-system-tools-aidl, aufs, bcolz, binwalk, bmusb, bruteforce-salted-openssl, cappuccino, captagent, chrome-gnome-shell, ciphersaber, cmark, colorfultabs, cppformat, dnsrecon, dogtag-pki, dxtool, e2guardian, flask-compress, fonts-mononoki, fwknop-gui, gajim-httpupload, glbinding, glewmx, gnome-2048, golang-github-googleapis-proto-client-go, google-android-installers, gsl, haskell-hmatrix-gsl, haskell-relational-query, haskell-relational-schemas, haskell-secret-sharing, hindsight, i8c, ip4r, java-string-similarity, khal, khronos-opencl-headers, liblivemedia, libshell-config-generate-perl, libshell-guess-perl, libstaroffice, libxml2, libzonemaster-perl, linux, linux-grsec-base, linux-signed, lua-sandbox, lua-torch-trepl, mbrola-br2, mbrola-br4, mbrola-de1, mbrola-de2, mbrola-de3, mbrola-ir1, mbrola-lt1, mbrola-lt2, mbrola-mx1, mimeo, mimerender, mongo-tools, mozilla-gnome-keyring, munin, node-grunt-cli, node-js-yaml, nova, open-build-service, openzwave, orafce, osmalchemy, pgespresso, pgextwlist, pgfincore, pgmemcache, pgpool2, pgsql-asn1oid, postbooks-schema, postgis, postgresql-debversion, postgresql-multicorn, postgresql-mysql-fdw, postgresql-unit, powerline-taskwarrior, prefix, pycares, pydl, pynliner, pytango, pytest-cookies, python-adal, python-applicationinsights, python-async-timeout, python-azure, python-azure-storage, python-blosc, python-can, python-canmatrix, python-chartkick, python-confluent-kafka, python-jellyfish, python-k8sclient, python-msrestazure, python-nss, python-pytest-benchmark, python-tenacity, python-tmdbsimple, python-typing, python-unidiff, python-xstatic-angular-schema-form, python-xstatic-tv4, quilt, r-bioc-phyloseq, r-cran-filehash, r-cran-png, r-cran-testit, r-cran-tikzdevice, rainbow-mode, repmgr, restart-emacs, restbed, ruby-azure-sdk, ruby-babel-source, ruby-babel-transpiler, ruby-diaspora-prosody-config, ruby-haikunator, ruby-license-finder, ruby-ms-rest, ruby-ms-rest-azure, ruby-rails-assets-autosize, ruby-rails-assets-blueimp-gallery, ruby-rails-assets-bootstrap, ruby-rails-assets-bootstrap-markdown, ruby-rails-assets-emojione, ruby-sprockets-es6, ruby-timeliness, rustc, skytools3, slony1-2, snmp-mibs-downloader, syslog-ng, test-kitchen, uctodata, usbguard, vagrant-azure, vagrant-mutate & vim.

Bdale Garbee: Second Retirement

30 September, 2016 - 11:23

At the end of August 2012, I announced my Early Retirement from HP. Two years later, my friend and former boss Martin Fink successfully recruited me to return to what later became Hewlett Packard Enterprise, as an HPE Fellow working on open source strategy in his Office of the CTO.

I'm proud of what I was was able to accomplish in the 25 months since then, but recent efforts to "simplify" HPE actually made things complicated for me. Between the announcement in late June that Martin intended to retire himself, and the two major spin-merger announcements involving Enterprise Services and Software... well...

The bottom line is that today, 30 September 2016, is my last day at HPE.

My plan is to "return to retirement" and work on some fun projects with my wife now that we are "empty nesters". I do intend to remain involved in the Free Software and open hardware worlds, but whether that might eventually involve further employment is something I'm going to try and avoid thinking about for a while...

There is a rocket launch scheduled nearby this weekend, after all!

Dirk Eddelbuettel: gcbd 0.2.6

29 September, 2016 - 10:39

A pure maintenance release 0.2.6 of the gcbd package is now on CRAN. The gcbd proposes a benchmarking framework for LAPACK and BLAS operations (as the library can exchanged in a plug-and-play sense on suitable OSs) and records result in local database. Recent / upcoming changes to DBMI and RSQLite let me to update the package; there are no actual functionality changes in this release.

CRANberries also provides a diffstat report for the latest release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Dirk Eddelbuettel: RcppCNPy 0.2.6

29 September, 2016 - 10:34

A new version of the RcppCNPy package arrived on CRAN a few days ago.

RcppCNPy provides R with read and write access to NumPy files thanks to the cnpy library by Carl Rogers.

This new release reflects all the suggestions and comments I received during the review process for the Journal of Open Source Software submission. I am happy to say that about twenty-nine days after I submitted, the paper was accepted and is now published.

Changes in version 0.2.6 (2016-09-25)
  • Expanded documentation in

  • Added examples to help page

  • Added CITATION file for JOSS paper

CRANberries also provides a diffstat report for the latest release. As always, feedback is welcome and the best place to start a discussion may be the GitHub issue tickets page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Kees Cook: security things in Linux v4.5

29 September, 2016 - 04:58

Some things I found interesting in the Linux kernel v4.5:

ptrace fsuid checking

Jann Horn fixed some corner-cases in how ptrace access checks were handled on special files in /proc. For example, prior to this fix, if a setuid process temporarily dropped privileges to perform actions as a regular user, the ptrace checks would not notice the reduced privilege, possibly allowing a regular user to trick a privileged process into disclosing things out of /proc (ASLR offsets, restricted directories, etc) that they normally would be restricted from seeing.

ASLR entropy sysctl

Daniel Cashman standardized the way architectures declare their maximum user-space ASLR entropy (CONFIG_ARCH_MMAP_RND_BITS_MAX) and then created a sysctl (/proc/sys/vm/mmap_rnd_bits) so that system owners could crank up entropy. For example, the default entropy on 32-bit ARM was 8 bits, but the maximum could be as much as 16. If your 64-bit kernel is built with CONFIG_COMPAT, there’s a compat version of the sysctl as well, for controlling the ASLR entropy of 32-bit processes: /proc/sys/vm/mmap_rnd_compat_bits.

Here’s how to crank your entropy to the max, without regard to what architecture you’re on:

for i in "" "compat_"; do f=/proc/sys/vm/mmap_rnd_${i}bits; n=$(cat $f); while echo $n > $f ; do n=$(( n + 1 )); done; done

strict sysctl writes

Two years ago I added a sysctl for treating sysctl writes more like regular files (i.e. what’s written first is what appears at the start), rather than like a ring-buffer (what’s written last is what appears first). At the time it wasn’t clear what might break if this was enabled, so a WARN was added to the kernel. Since only one such string showed up in searches over the last two years, the strict writing mode was made the default. The setting remains available as /proc/sys/kernel/sysctl_writes_strict.

seccomp NNP vs TSYNC fix

Jann Horn noticed and fixed a problem where if a seccomp filter was already in place on a process (after being installed by a privileged process like systemd, a container launcher, etc) then the setting of the “no new privs” flag could be bypassed when adding filters with the SECCOMP_FILTER_FLAG_TSYNC flag set. Bypassing NNP meant it might be possible to trick a buggy setuid program into doing things as root after a seccomp filter forced a privilege drop to fail (generally referred to as the “sendmail setuid flaw”). With NNP set, a setuid program can’t be run in the first place.

That’s it! Tomorrow I’ll cover v4.6…

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

Alessio Treglia: Emptiness and Form

29 September, 2016 - 03:33


In the perennial search of the meaning of life and the fundamental laws that govern nature, man was always faced – for millennia – with the mysterious concept of emptiness. What is emptiness? Does it really exist in nature? Is emptiness the non-being, as theorized by Parmenides?

Until the early years of the last century, technology had not yet been able to equip scientists with the necessary tools to investigate the innermost structure of matter, so the concept of emptiness was always faced with insights and metaphors that led, over the centuries, to a broad philosophical debate.

For the ancient atomist Greek philosophers, the existence of emptiness was not only possible but had become a necessity, becoming the ontological principle for the existence of being: for them, actually, the emptiness that permeates the atoms is what allows movement.

<Read More…[by Fabio Marzocca]>

Sean Whitton: 'Do you really need to do that?'

28 September, 2016 - 22:25

A new postdoc student arrived at our department this semester, and after learning that he uses GNU/Linux for all his computing, I invited him along to TFUG. During some of our meetings people asked “how could I do X on my GNU/Linux desktop?” and, jokingly, the postdoc would respond “the answer to your question is ‘do you really need to do that?’” Sometimes the more experienced GNU/Linux users at the table would respond to questions by suggesting that the user should simply give up on doing X, and the postdoc would slap his thigh and laugh and say “see? I told you that’s the answer!”

The phenomenon here is that people who have at some point made a commitment to at least try to use GNU/Linux for all their computing quickly find that they have come to value using GNU/Linux more than they value engaging in certain activities that only work well/at all under a proprietary operating system. I think that this is because they get used to being treated with respect by their computer. And indeed, one of the reasons I’ve almost entirely given up on computer gaming is that computer games are non-free software. “Are you sure you need to do that?” starts sounding like a genuine question rather than simply a polite way of saying that what someone wants to do can’t be achieved.

I suggest that this is a blessing in disguise. The majority of the things that you can only do under a proprietary operating system are things that it would be good for you if you were to switch them out for other activities. I’m not suggesting that switching to a GNU/Linux is a good way to give up on the entertainment industry. It’s a good way of moderating your engagement with the entertainment industry. Rather than logging onto Netflix, you might instead pop in a DVD of a movie. You can still engage with contemporary popular culture, but the technical barriers give you an opportunity to moderate your consumption: once you’ve finished watching the movie, the software won’t try to get you to watch something else by making a calculation as to what you’re most likely to assent to watching next based on what you’ve watched before. For this behaviour of the Netflix software is just another example of non-free software working against its user’s interests: watching a movie is good for you, but binge-watching a TV series probably isn’t. In cases like this, living in the world of Free Software makes it easier to engage with media healthily.

Chris Lamb: Diffoscope progress bar

28 September, 2016 - 18:45

Diffoscope is a diff utility which recursively unpacks archives, ISOs, etc., transforming a wide variety of files into human-readable forms before comparison instead of simply showing the raw difference in hexadecimal.

I recently added a progress bar when diffoscope is run on a terminal:

Note that as diffoscope can at any pointencounter an archive or format that requires unpacking, the progress will always be approximate and may even appear to go "backwards".

The implementation, available in version 61, is simple (see #1, #2, #3 & #4) but takes into account of a number of subtleties by using context managers to correctly track the state throughout.

Kees Cook: security things in Linux v4.4

28 September, 2016 - 05:47

Continuing with interesting security things in the Linux kernel, here’s v4.4. As before, if you think there’s stuff I missed that should get some attention, please let me know.


The CONFIG_STRICT_DEVMEM setting that has existed for a long time already protects system RAM from being accessible through the /dev/mem device node to root in user-space. Dan Williams added CONFIG_IO_STRICT_DEVMEM to extend this so that if a kernel driver has reserved a device memory region for use, it will become unavailable to /dev/mem also. The reservation in the kernel was to keep other kernel things from using the memory, so this is just common sense to make sure user-space can’t stomp on it either. Everyone should have this enabled.

If you’re looking to create a very bright line between user-space having access to device memory, it’s worth noting that if a device driver is a module, a malicious root user can just unload the module (freeing the kernel memory reservation), fiddle with the device memory, and then reload the driver module. So either just leave out /dev/mem entirely (not currently possible with upstream), build a monolithic kernel (no modules), or otherwise block (un)loading of modules (/proc/sys/kernel/modules_disabled).

seccomp UM

Mickaël Salaün added seccomp support (and selftests) for user-mode Linux. Moar architectures!

seccomp Checkpoint/Restore-In-Userspace

Tycho Andersen added a way to extract and restore seccomp filters from running processes via PTRACE_SECCOMP_GET_FILTER under CONFIG_CHECKPOINT_RESTORE. This is a continuation of his work (that I failed to mention in my prior post) from v4.3, which introduced a way to suspend and resume seccomp filters. As I mentioned at the time (and for which he continues to quote me) “this feature gives me the creeps.” :)

x86 W^X corrections

Stephen Smalley noticed that there was still a range of kernel memory (just past the end of the kernel code itself) that was incorrectly marked writable and executable, defeating the point of CONFIG_DEBUG_RODATA which seeks to eliminate these kinds of memory ranges. He corrected this and added CONFIG_DEBUG_WX which performs a scan of memory at boot time and yells loudly if unexpected memory protection are found. To nobody’s delight, it was shortly discovered the UEFI leaves chunks of memory in this state too, which posed an ugly-to-solve problem (which Matt Fleming addressed in v4.6).

x86_64 vsyscall CONFIG

I introduced a way to control the mode of the x86_64 vsyscall with a build-time CONFIG selection, though the choice I really care about is CONFIG_LEGACY_VSYSCALL_NONE, to force the vsyscall memory region off by default. The vsyscall memory region was always mapped into process memory at a fixed location, and it originally posed a security risk as a ROP gadget execution target. The vsyscall emulation mode was added to mitigate the problem, but it still left fixed-position static memory content in all processes, which could still pose a security risk. The good news is that glibc since version 2.15 doesn’t need vsyscall at all, so it can just be removed entirely. Any kernel built this way that discovered they needed to support a pre-2.15 glibc could still re-enable it at the kernel command line with “vsyscall=emulate”.

That’s it for v4.4. Tune in tomorrow for v4.5!

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

C.J. Adams-Collier: OpenDaylight Symposium 2016

28 September, 2016 - 00:17

I’ll write this later.  Keywords and notes inline











Open Source




OSI stack


Developers and users working together.

Fast Dev/Test cycles

Sessions start at 10:30

Message bus

Minimal install needs to be smaller

Most functionalities must be implemented in modules

Upgrade path complexity

Out of band control channel.
OPNFV customers don’t run stock releases, make customisations and plugins easier.

Message bus

Distribution vendors should not be expected to perform package maintenance.  Get the distribution ready for fast pathing in to backports, security, testing  and unstable.

Ben Hutchings: Debian LTS work, August 2016

27 September, 2016 - 17:41

I was assigned 14.75 hours of work by Freexian's Debian LTS initiative and carried over 0.7 from last month. I worked a total of 14 hours, carrying over 1.45 hours.

I finished preparing and finally uploaded an update for linux (3.2.81-2). This took longer than expected due to the difficulty of reproducing CVE-2016-5696 and verifying the backported fix. I also released an upstream stable update (3.2.82) which will go into the next update in wheezy LTS.

I discussed a few other security updates and issues on the debian-lts mailing list.

Kees Cook: security things in Linux v4.3

27 September, 2016 - 05:54

When I gave my State of the Kernel Self-Protection Project presentation at the 2016 Linux Security Summit, I included some slides covering some quick bullet points on things I found of interest in recent Linux kernel releases. Since there wasn’t a lot of time to talk about them all, I figured I’d make some short blog posts here about the stuff I was paying attention to, along with links to more information. This certainly isn’t everything security-related or generally of interest, but they’re the things I thought needed to be pointed out. If there’s something security-related you think I should cover from v4.3, please mention it in the comments. I’m sure I haven’t caught everything. :)

A note on timing and context: the momentum for starting the Kernel Self Protection Project got rolling well before it was officially announced on November 5th last year. To that end, I included stuff from v4.3 (which was developed in the months leading up to November) under the umbrella of the project, since the goals of KSPP aren’t unique to the project nor must the goals be met by people that are explicitly participating in it. Additionally, not everything I think worth mentioning here technically falls under the “kernel self-protection” ideal anyway — some things are just really interesting userspace-facing features.

So, to that end, here are things I found interesting in v4.3:


Russell King implemented this feature for ARM which provides emulated segregation of user-space memory when running in kernel mode, by using the ARM Domain access control feature. This is similar to a combination of Privileged eXecute Never (PXN, in later ARMv7 CPUs) and Privileged Access Never (PAN, coming in future ARMv8.1 CPUs): the kernel cannot execute user-space memory, and cannot read/write user-space memory unless it was explicitly prepared to do so. This stops a huge set of common kernel exploitation methods, where either a malicious executable payload has been built in user-space memory and the kernel was redirected to run it, or where malicious data structures have been built in user-space memory and the kernel was tricked into dereferencing the memory, ultimately leading to a redirection of execution flow.

This raises the bar for attackers since they can no longer trivially build code or structures in user-space where they control the memory layout, locations, etc. Instead, an attacker must find areas in kernel memory that are writable (and in the case of code, executable), where they can discover the location as well. For an attacker, there are vastly fewer places where this is possible in kernel memory as opposed to user-space memory. And as we continue to reduce the attack surface of the kernel, these opportunities will continue to shrink.

While hardware support for this kind of segregation exists in s390 (natively separate memory spaces), ARM (PXN and PAN as mentioned above), and very recent x86 (SMEP since Ivy-Bridge, SMAP since Skylake), ARM is the first upstream architecture to provide this emulation for existing hardware. Everyone running ARMv7 CPUs with this kernel feature enabled suddenly gains the protection. Similar emulation protections (PAX_MEMORY_UDEREF) have been available in PaX/Grsecurity for a while, and I’m delighted to see a form of this land in upstream finally.

To test this kernel protection, the ACCESS_USERSPACE and EXEC_USERSPACE triggers for lkdtm have existed since Linux v3.13, when they were introduced in anticipation of the x86 SMEP and SMAP features.

Ambient Capabilities

Andy Lutomirski (with Christoph Lameter and Serge Hallyn) implemented a way for processes to pass capabilities across exec() in a sensible manner. Until Ambient Capabilities, any capabilities available to a process would only be passed to a child process if the new executable was correctly marked with filesystem capability bits. This turns out to be a real headache for anyone trying to build an even marginally complex “least privilege” execution environment. The case that Chrome OS ran into was having a network service daemon responsible for calling out to helper tools that would perform various networking operations. Keeping the daemon not running as root and retaining the needed capabilities in children required conflicting or crazy filesystem capabilities organized across all the binaries in the expected tree of privileged processes. (For example you may need to set filesystem capabilities on bash!) By being able to explicitly pass capabilities at runtime (instead of based on filesystem markings), this becomes much easier.

For more details, the commit message is well-written, almost twice as long as than the code changes, and contains a test case. If that isn’t enough, there is a self-test available in tools/testing/selftests/capabilities/ too.

PowerPC and Tile support for seccomp filter

Michael Ellerman added support for seccomp to PowerPC, and Chris Metcalf added support to Tile. As the seccomp maintainer, I get excited when an architecture adds support, so here we are with two. Also included were updates to the seccomp self-tests (in tools/testing/selftests/seccomp), to help make sure everything continues working correctly.

That’s it for v4.3. If I missed stuff you found interesting, please let me know! I’m going to try to get more per-version posts out in time to catch up to v4.8, which appears to be tentatively scheduled for release this coming weekend.

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

Reproducible builds folks: Reproducible Builds: week 74 in Stretch cycle

27 September, 2016 - 04:25

Here is what happened in the Reproducible Builds effort between Sunday September 18 and Saturday September 24 2016:


We intend to participate in Outreachy Round 13 and look forward for new enthusiastic applications to contribute to reproducible builds. We're offering four different areas to work on:

  • Improve test and debugging tools.
  • Improving reproducibility of Debian packages.
  • Improving Debian infrastructure.
  • Help collaboration across distributions.
Reproducible Builds World summit #2

We are planning e a similar event to our Athens 2015 summit and expect to reveal more information soon. If you haven't been contacted yet but would like to attend, please contact h01ger.

Toolchain development and fixes

Mattia uploaded dpkg/ to our experimental repository. and covered the details for the upload in a mailing list post.

The most important change is the incorporation of improvements made by Guillem Jover (dpkg maintainer) to the .buildinfo generator. This is also in the hope that it will speed up the merge in the upstream.

One of the other relevant changes from before is that .buildinfo files generated from binary-only builds will no longer include the hash of the .dsc file in Checksums-Sha256 as documented in the specification.

Even if it was considered important to include a checksum of the source package in .buildinfo, storing it that way breaks other assumptions (eg. that Checksums-Sha256 contains only files part of that are part of a single upload, wheras the .dsc might not be part of that upload), thus we look forward for another solution to store the source checksum in .buildinfo.

Bugs filed Reviews of unreproducible packages

250 package reviews have been added, 4 have been updated and 4 have been removed in this week, adding to our knowledge about identified issues.

4 issue types have been added:

3 issue types have been updated:

Weekly QA work

FTBFS bugs have been reported by:

  • Chris Lamb (11)
  • Santiago Vila (2)
Documentation updates

h01ger created a new Jenkins job so that every commit pushed to the master branch for the website will update

diffoscope development strip-nondeterminism development reprotest development
  • The full rebuild of all packages in unstable (for all tested archs) with the new build path variation has been completed. This has had the result that we are down to ~75% reproducible packages in unstable now. In comparison, for testing (where we don't vary the build path) we are still at ~90%. IRC notifications for unstable have been enabled again. (Holger)
  • Make the notes job robust about bad data (see #833695 and #833738). (Holger)
  • Setup profitbricks-build7 running stretch as testing reproducible builds of F-Droid need to use a newer version of vagrant in order to support running vagrant VMs with kvm on kvm. (Holger)
  • The misbehaving 'opi2a' armhf node has been replaced with a Jetson-TK1 board kindly donated by NVidia. This machine is using an NVIDIA tegra-k1 (cortex-a15) quad-core board. (vagrant and Holger)

This week's edition was written by Chris Lamb, Holger Levsen and Mattia Rizzolo and reviewed by a bunch of Reproducible Builds folks on IRC.

Rhonda D'Vine: LP

26 September, 2016 - 17:00

I guess you know by now that I simply love music. It is powerful, it can move you, change your mood in a lot of direction, make you wanna move your body to it, even unknowingly have this happen, and remind you of situations you want to keep in mind. The singer I present to you was introduce to me by a dear friend with the following words: So this hasn't happened to me in a looooong time: I hear a voice and can't stop crying. I can't decide which song I should send to you thus I send three of which the last one let me think of you.

And I have to agree, that voice is really great. Thanks a lot for sharing LP with me, dear! And given that I got sent three songs and I am not good at holding excitement back, I want to share it with you, so here are the songs:

  • Lost On You: Her voice is really great in this one.
  • Halo: Have to agree that this is really a great cover.
  • Someday: When I hear that song and think about that it reminds my friend of myself I'm close to tears, too ...

Like always, enjoy!

/music | permanent link | Comments: 0 | Flattr this

Clint Adams: Collect the towers

26 September, 2016 - 06:57

Why is openbmap's North American coverage so sad? Is there a reason that RadioBeacon doesn't also submit to OpenCellID? Is there a free software Android app that submits data to OpenCellID?

Vincent Sanders: I'll huff, and I'll puff, and I'll blow your house in

26 September, 2016 - 06:23
Sometimes it really helps to have a different view on a problem and after my recent writings on my Public Suffix List (PSL) library I was fortunate to receive a suggestion from my friend Enrico Zini.

I had asked for suggestions on reducing the size of the library further and Enrico simply suggested Huffman coding. This was a technique I had learned about long ago in connection with data compression and the intervening years had made all the details fuzzy which explains why it had not immediately sprung to mind.

Huffman coding named for David A. Huffman is an algorithm that enables a representation of data which is very efficient. In a normal array of characters every character takes the same eight bits to represent which is the best we can do when any of the 256 values possible is equally likely. If your data is not evenly distributed this is not the case for example if the data was english text then the value is fifteen times more likely to be that for e than k.

So if we have some data with a non uniform distribution of probabilities we need a way the data be encoded with fewer bits for the common values and more bits for the rarer values. To be efficient we would need some way of having variable length representations without storing the length separately. The term for this data representation is a prefix code and there are several ways to generate them.

Such is the influence of Huffman on the area of prefix codes they are often called Huffman codes even if they were not created using his algorithm. One can dream of becoming immortalised like this, to join the ranks of those whose names are given to units or whole ideas in a field must be immensely rewarding, however given Huffman invented his algorithm and proved it to be optimal to answer a question on a term paper in his early twenties I fear I may already be a bit too late.

The algorithm itself is relatively straightforward. First a frequency analysis is performed, a fancy way of saying count how many of each character is in the input data. Next a binary tree is created by using a priority queue initialised with the nodes sorted by frequency.

The two least frequent items count is summed together and a node placed in the tree with the two original entries as child nodes. This step is repeated until a single node exists with a count value equal to the length of the input.

To encode data once simply walks the tree outputting a 0 for a left node or 1 for right node until reaching the original value. This generates a mapping of values to bit output, the input is then simply converted value by value to the bit output. To decode the data the data is used bit by bit to walk the tree to arrive at values.

If we perform this algorithm on the example string table *!asiabvcomcoopitamazonawsarsaves-the-whalescomputebasilicata we can reduce the 488 bits (61 * 8 bit characters) to 282 bits or 40% reduction. Obviously in a real application the huffman tree would need to be stored which would probably exceed this saving but for larger data sets it is probable this technique would yield excellent results on this kind of data.

Once I proved this to myself I implemented the encoder within the existing conversion program. Although my perl encoder is not very efficient it can process the entire PSL string table (around six thousand labels using 40KB or so) in less than a second, so unless the table grows massively an inelegant approach will suffice.

The resulting bits were packed into 32bit values to improve decode performance (most systems prefer to deal with larger memory fetches less frequently) and resulted in 18KB of output or 47% of the original size. This is a great improvement in size and means the statically linked test program is now 59KB and is actually smaller than the gzipped source data.

$ ls -alh test_nspsl
-rwxr-xr-x 1 vince vince 59K Sep 25 23:58 test_nspsl
$ ls -al public_suffix_list.dat.gz
-rw-r--r-- 1 vince vince 62K Sep 1 08:52 public_suffix_list.dat.gz

To be clear the statically linked program can determine if a domain is in the PSL with no additional heap allocations and includes the entire PSL ordered tree, the domain label string table and the huffman decode table to read it.

An unexpected side effect is that because the decode loop is small it sits in the processor cache. This appears to cause the string comparison function huffcasecmp() (which is not locale dependant because we know the data will be limited ASCII) performance to be close to using strcasecmp() indeed on ARM32 systems there is a very modest improvement in performance.

I think this is as much work as I am willing to put into this library but I am pleased to have achieved a result which is on par with the best of breed (libpsl still has a data representation 20KB smaller than libnspsl but requires additional libraries for additional functionality) and I got to (re)learn an important algorithm too.

Julian Andres Klode: Introducing TrieHash, a order-preserving minimal perfect hash function generator for C(++)

26 September, 2016 - 01:44

I introduce TrieHash an algorithm for constructing perfect hash functions from tries. The generated hash functions are pure C code, minimal, order-preserving and outperform existing alternatives. Together with the generated header files,they can also be used as a generic string to enumeration mapper (enums are created by the tool).


APT (and dpkg) spend a lot of time in parsing various files, especially Packages files. APT currently uses a function called AlphaHash which hashes the last 8 bytes of a word in a case-insensitive manner to hash fields in those files (dpkg just compares strings in an array of structs).

There is one obvious drawback to using a normal hash function: When we want to access the data in the hash table, we have to hash the key again, causing us to hash every accessed key at least twice. It turned out that this affects something like 5 to 10% of the cache generation performance.

Enter perfect hash functions: A perfect hash function matches a set of words to constant values without collisions. You can thus just use the index to index into your hash table directly, and do not have to hash again (if you generate the function at compile time and store key constants) or handle collision resolution.

As #debian-apt people know, I happened to play a bit around with tries this week before guillem suggested perfect hashing. Let me tell you one thing: My trie implementation was very naive, that did not really improve things a lot…

Enter TrieHash

Now, how is this related to hashing? The answer is simple: I wrote a perfect hash function generator that is based on tries. You give it a list of words, it puts them in a trie, and generates C code out of it, using recursive switch statements (see code generation below). The function achieves competitive performance with other hash functions, it even usually outperforms them.

Given a dictionary, it generates an enumeration (a C enum or C++ enum class) of all words in the dictionary, with the values corresponding to the order in the dictionary (the order-preserving property), and a function mapping strings to members of that enumeration.

By default, the first word is considered to be 0 and each word increases a counter by one (that is, it generates a minimal hash function). You can tweak that however:

= 0
WordLabel ~ Word
OtherWord = 9

will return 0 for an unknown value, map “Word” to the enum member WordLabel and map OtherWord to 9. That is, the input list functions like the body of a C enumeration. If no label is specified for a word, it will be generated from the word. For more details see the documentation

C code generation
switch(string[0] | 32) {
case 't':
    switch(string[1] | 32) {
    case 'a':
        switch(string[2] | 32) {
        case 'g':
            return Tag;
return Unknown;

Yes, really recursive switches – they directly represent the trie. Now, we did not really do a straightforward translation, there are some optimisations to make the whole thing faster and easier to look at:

First of all, the 32 you see is used to make the check case insensitive in case all cases of the switch body are alphabetical characters. If there are non-alphabetical characters, it will generate two cases per character, one upper case and one lowercase (with one break in it). I did not know that lowercase and uppercase characters differed by only one bit before, thanks to the clang compiler for pointing that out in its generated assembler code!

Secondly, we only insert breaks only between cases. Initially, each case ended with a return Unknown, but guillem (the dpkg developer) suggested it might be faster to let them fallthrough where possible. Turns out it was not faster on a good compiler, but it’s still more readable anywhere.

Finally, we build one trie per word length, and switch by the word length first. Like the 32 trick, his gives a huge improvement in performance.

Digging into the assembler code

The whole code translates to roughly 4 instructions per byte:

  1. A memory load,
  2. an or with 32
  3. a comparison, and
  4. a conditional jump.

(On x86, the case sensitive version actually only has a cmp-with-memory and a conditional jump).

Due to this may be one instruction more: On some architectures an unneeded zero-extend-byte instruction is inserted – this causes a 20% performance loss.

Performance evaluation

I run the hash against all 82 words understood by APT in Packages and Sources files, 1,000,000 times for each word, and summed up the average run-time:

host arch Trie TrieCase GPerfCase GPerf DJB plummer ppc64el 540 601 1914 2000 1345 eller mipsel 4728 5255 12018 7837 4087 asachi arm64 1000 1603 4333 2401 1625 asachi armhf 1230 1350 5593 5002 1784 barriere amd64 689 950 3218 1982 1776 x230 amd64 465 504 1200 837 693

Suffice to say, GPerf does not really come close.

All hosts except the x230 are Debian porterboxes. The x230 is my laptop with a a Core i5-3320M, barriere has an Opteron 23xx. I included the DJB hash function for another reference.

Source code

The generator is written in Perl, licensed under the MIT license and available from – I initially prototyped it in Python, but guillem complained that this would add new build dependencies to dpkg, so I rewrote it in Perl.

Benchmark is available from


See the script for POD documentation.

Filed under: General

Steinar H. Gunderson: Nageru @ Fyrrom

25 September, 2016 - 20:15

When Samfundet wanted to make their own Boiler Room spinoff (called “Fyrrom”—more or less a direct translation), it was a great opportunity to try out the new multitrack code in Nageru. After all, what can go wrong with a pretty much untested and unfinished git branch, right?

So we cobbled together a bunch of random equipment from here and there:

Hooked it up to Nageru:

and together with some great work from the people actually pulling together the event, this was the result. Lots of fun.

And yes, some bugs were discovered—of course, field testing without followup patches is meaningless (that would either mean you're not actually taking your test experience into account, or that your testing gave no actionable feedback and thus was useless), so they will be fixed in due time for the 1.4.0 release.

Sven Hoexter: in causa wosign

25 September, 2016 - 17:00

Since I kind of recommended the free WoSign CA in the past, I would like to point out the issues that have piled up. Mozilla has a writeup due to a removal discussion for NSS: (Yes I'm late with this post, about a month or two by now ...)

Since WoSign, or the person behind it, silently also bought StartCom we've now with StartSSL and WoSign two of the three free CAs in one hand with a questionable track record. That leaves everyone looking for a low budget option with Let's Encrypt.

Russ Allbery: podlators 4.08

25 September, 2016 - 09:28

A new release of the distribution that provides Pod::Man and Pod::Text for Perl documentation formatting.

The impetus for this release is fixing a rendering bug in Pod::Man that spewed stray bits of half-escaped *roff into the man page for the text "TRUE (1)". This turned out to be due to two interlocking bugs in the dark magic regexes that try to fix up formatting to make man pages look a bit better: incorrect double-markup in both small caps and as a man page reference, and incorrect interpretation of the string "\s0(1)". Both are fixed in this release.

podlators 4.00 changed Pod::Man to make piping POD through pod2man on standard input without providing the --name option an error, since there was no good choice for the man page title. This turned out to be too disruptive: the old behavior of tolerating this had been around for too long, and I got several bug reports. Since I think backward compatibility is extremely important for these tools, I've backed down from this change, and now Pod::Man and pod2man just silently use the man page name "STDIN" (which still fixes the original problem of being reproducible).

It is, of course, still a good idea to provide the name option when dealing with standard input, since "STDIN" isn't a very good man page title.

This release also adds new --lquote and --rquote options to pod2man to set the quote marks independently, and removes a test that relied on a POD construct that is going to become an error in Pod::Simple.

You can get the latest release from the podlators distribution page.


Creative Commons License ลิขสิทธิ์ของบทความเป็นของเจ้าของบทความแต่ละชิ้น
ผลงานนี้ ใช้สัญญาอนุญาตของครีเอทีฟคอมมอนส์แบบ แสดงที่มา-อนุญาตแบบเดียวกัน 3.0 ที่ยังไม่ได้ปรับแก้