Planet Debian

Planet Debian - http://planet.debian.org/

Matthew Garrett: ACPI, kernels and contracts with firmware

17 September, 2014 - 06:51
ACPI is a complicated specification - the latest version is 980 pages long. But that's because it's trying to define something complicated: an entire interface for abstracting away hardware details and making it easier for an unmodified OS to boot diverse platforms.

Inevitably, though, it can't define the full behaviour of an ACPI system. It doesn't explicitly state what should happen if you violate the spec, for instance. Obviously, in a just and fair world, no systems would violate the spec. But in the grim meathook future that we actually inhabit, systems do. We lack the technology to go back in time and retroactively prevent this, and so we're forced to deal with making these systems work.

This ends up being a pain in the neck in the x86 world, but it could be much worse. Way back in 2008 I wrote something about why the Linux kernel reports itself to firmware as "Windows" but refuses to identify itself as Linux. The short version is that "Linux" doesn't actually identify the behaviour of the kernel in a meaningful way. "Linux" doesn't tell you whether the kernel can deal with buffers being passed when the spec says it should be a package. "Linux" doesn't tell you whether the OS knows how to deal with an HPET. "Linux" doesn't tell you whether the OS can reinitialise graphics hardware.

Back then I was writing from the perspective of the firmware changing its behaviour in response to the OS, but it turns out that it's also relevant from the perspective of the OS changing its behaviour in response to the firmware. Windows 8 handles backlights differently to older versions. Firmware that's intended to support Windows 8 may expect this behaviour. If the OS tells the firmware that it's compatible with Windows 8, the OS has to behave compatibly with Windows 8.

In essence, if the firmware asks for Windows 8 support and the OS says yes, the OS is forming a contract with the firmware that it will behave in a specific way. If Windows 8 allows certain spec violations, the OS must permit those violations. If Windows 8 makes certain ACPI calls in a certain order, the OS must make those calls in the same order. Any firmware bug that is triggered by the OS not behaving identically to Windows 8 must be dealt with by modifying the OS to behave like Windows 8.

This sounds horrifying, but it's actually important. The existence of well-defined[1] OS behaviours means that the industry has something to target. Vendors test their hardware against Windows, and because Windows has consistent behaviour within a version[2] the vendors know that their machines won't suddenly stop working after an update. Linux benefits from this because we know that we can make hardware work as long as we're compatible with the Windows behaviour.

That's fine for x86. But remember when I said it could be worse? What if there were a platform that Microsoft weren't targeting? A platform where Linux was the dominant OS? A platform where vendors all test their hardware against Linux and expect it to have a consistent ACPI implementation?

Our even grimmer meathook future welcomes ARM to the ACPI world.

Software development is hard, and firmware development is software development with worse compilers. Firmware is inevitably going to rely on undefined behaviour. It's going to make assumptions about ordering. It's going to mishandle some cases. And it's the operating system's job to handle that. On x86 we know that systems are tested against Windows, and so we simply implement that behaviour. On ARM, we don't have that convenient reference. We are the reference. And that means that systems will end up accidentally depending on Linux-specific behaviour. Which means that if we ever change that behaviour, those systems will break.

So far we've resisted calls for Linux to provide a contract to the firmware in the way that Windows does, simply because there's been no need to - we can just implement the same contract as Windows. How are we going to manage this on ARM? The worst case scenario is that a system is tested against, say, Linux 3.19 and works fine. We make a change in 3.21 that breaks this system, but nobody notices at the time. Another system is tested against 3.21 and works fine. A few months later somebody finally notices that 3.21 broke their system and the change gets reverted, but oh no! Reverting it breaks the other system. What do we do now? The systems aren't telling us which behaviour they expect, so we're left with the prospect of adding machine-specific quirks. This isn't scalable.

Supporting ACPI on ARM means developing a sense of discipline around ACPI development that we simply haven't had so far. If we want to avoid breaking systems we have two options:

1) Commit to never modifying the ACPI behaviour of Linux.
2) Expose an interface that indicates which well-defined ACPI behaviour a specific kernel implements, and bump that whenever an incompatible change is made. Backward compatibility paths will be required if firmware only supports an older interface.

(1) is unlikely to be practical, but (2) isn't a great deal easier. Somebody is going to need to take responsibility for tracking ACPI behaviour and incrementing the exported interface whenever it changes, and we need to know who that's going to be before any of these systems start shipping. The alternative is a sea of ARM devices that only run specific kernel versions, which is exactly the scenario that ACPI was supposed to be fixing.

[1] Defined by implementation, not defined by specification
[2] Windows may change behaviour between versions, but always adds a new _OSI string when it does so. It can then modify its behaviour depending on whether the firmware knows about later versions of Windows.


Steinar H. Gunderson: The virtues of std::unique_ptr

17 September, 2014 - 06:30

Among all the changes in C++11, there's one that I don't feel has received enough attention: std::unique_ptr (or just unique_ptr; I'll drop the std:: from here on). The motivation is simple; assume a function like this:

Foo *func() {
        Foo *foo = new Foo;
        if (something_complicated()) {
                // Oops, something wrong happened
                return NULL;
        }
        foo->baz();
        return foo;
}

The memory leak is obvious; if something_complicated() returns true (something went wrong), we leak foo. The classical fix is:

Foo *func() {
        Foo *foo = new Foo;
        if (something_complicated()) {
                delete foo;
                return NULL;
        }
        foo->baz();
        return foo;
}

But this is cumbersome and easy to get wrong. Tools like valgrind have made this a lot easier to detect, but that's a poor substitute; what we want is a coding style where it's deliberately hard to make mistakes. Enter unique_ptr:

Foo *func() {
        unique_ptr<Foo> foo(new Foo);
        if (something_complicated()) {
                // unique_ptr<Foo> destructor deletes foo for us!
                return NULL;
        }
        foo->baz();
        return foo.release();
}

So we have introduced a notion of ownership; the function (or, more precisely, scope) now owns the Foo object. The only way we can leave the function and not have it destroyed is through an explicit call to release() (which returns the raw pointer and clears the unique_ptr). We have smart pointer semantics, so we can use -> just as if we had a regular pointer. In any case, the runtime overhead over a regular pointer is exactly zero.

Ownership does, of course, extend just fine to classes:

class Bar {
public:
        Bar() : foo(new Foo) {}

private:
        unique_ptr<Foo> foo;
};

In this case, the Bar object owns the Foo object, and will destroy it when it goes out of scope without having to do a manual delete in the destructor, operator= and so on; not to mention that it will make your object non-copy-constructible, so you won't get that wrong by mistake. (In this case, you could do the same just by “Foo foo;” instead of using unique_ptr, of course, modulo the copy constructor behavior and heap behavior.)

So far, we could do all of this in C++03. But C++11 includes a very helpful extra piece of the puzzle, namely move semantics. These allow us to transfer the ownership safely:

class Bar {
public:
        Bar(unique_ptr<Foo> arg_foo) : foo(move(arg_foo)) {}

private:
        unique_ptr<Foo> foo;
};

void func() {
        unique_ptr<Foo> foo(new Foo);
        // Do something with foo.
        Bar bar(move(foo));
        // ...
}

Below the line constructing bar, foo is empty and bar owns the Foo object! And at no point was the object without an owner; if there's no more code in the function, bar will immediately be destroyed, and the Foo object with it (since it has ownership). It also deals just fine with exception safety.

If you program with unique_ptr, it is genuinely very hard to get memory leaks. And it's much better than Java-style garbage collection; you don't get the RAM overhead GC needs, your objects are destroyed at predictable times, and destructors are run, so you can get reliable behavior on things like file handles, sockets and the likes, without having to resort to manual cleanup in a finally block. (In a sense, it's like a refcount that can only ever be 0 or 1.)

It sounds so innocuous on paper, but all great ideas are simple. So, go forth and unique_ptr!

Steve Kemp: Applications updating & phoning home

17 September, 2014 - 03:42

Personally I believe that any application packaged for Debian should not phone home, attempt to download plugins over HTTP at run-time, or update itself.

On that basis I've filed #761828.

As a project we have guidelines for what constitutes a "serious" bug, which generally boil down to a package containing a security issue, causing data-loss, or being unusable.

I'd like to propose that these kinds of tracking "things" are equally bad. If consensus could be reached, that would be a good thing for the freedom of our users.

(Oops, I slipped into "us" and "our users"; I'm just an outsider looking in. Mostly.)

Petter Reinholdtsen: Speeding up the Debian installer using eatmydata and dpkg-divert

16 September, 2014 - 20:00

The Debian installer could be a lot quicker. When we install more than 2000 packages in Skolelinux / Debian Edu using tasksel in the installer, unpacking the binary packages takes forever. A part of the slow I/O issue was discussed in bug #613428, about too much file system sync-ing done by dpkg, which is the package responsible for unpacking the binary packages. Other parts (like code executed by postinst scripts) might also sync to disk during installation. All this sync-ing to disk does not really make sense to me. If the machine crashes half-way through, I start over; I do not try to salvage the half-installed system. So the failure that sync-ing is supposed to protect against, a hardware or system crash, is not really relevant while the installer is running.

A few days ago, I thought of a way to get rid of all the file system sync()-ing in a fairly non-intrusive way, without the need to change the code in several packages. The idea is not new, but I have not heard anyone propose the approach using dpkg-divert before. It depends on the small and clever package eatmydata, which uses LD_PRELOAD to replace the system functions for syncing data to disk with functions doing nothing, thus allowing programs to live dangerously while speeding up disk I/O significantly. Instead of modifying the implementation of dpkg, apt and tasksel (which are the packages responsible for selecting, fetching and installing packages), it occurred to me that we could just divert the programs away and replace them with a simple shell wrapper calling "eatmydata $program $@", to get the same effect. Two days ago I decided to test the idea, and wrapped up a simple implementation for the Debian Edu udeb.

The effect was stunning. In my first test it reduced the running time of the pkgsel step (installing tasks) from 64 to less than 44 minutes (20 minutes shaved off the installation) on an old Dell Latitude D505 machine. I am not quite sure what the optimised time would have been, as I messed up the testing a bit, causing the debconf priority to get low enough for two questions to pop up during installation. As soon as I saw the questions I moved the installation along, but I do not know how long the questions were holding up the installation. I did some more measurements using Debian Edu Jessie, and got these results. The time measured is the time stamp in /var/log/syslog between the "pkgsel: starting tasksel" and the "pkgsel: finishing up" lines, if you want to do the same measurement yourself. In Debian Edu, the tasksel dialog does not show up, and the timing thus does not depend on how quickly the user handles the tasksel dialog.

Machine/setup                 Original tasksel      Optimised tasksel      Reduction
Latitude D505 Main+LTSP LXDE  64 min (07:46-08:50)  <44 min (11:27-12:11)  >20 min 18%
Latitude D505 Roaming LXDE    57 min (08:48-09:45)  34 min (07:43-08:17)   23 min 40%
Latitude D505 Minimal         22 min (10:37-10:59)  11 min (11:16-11:27)   11 min 50%
Thinkpad X200 Minimal         6 min (08:19-08:25)   4 min (08:04-08:08)    2 min 33%
Thinkpad X200 Roaming KDE     19 min (09:21-09:40)  15 min (10:25-10:40)   4 min 21%

The test is done using a netinst ISO on a USB stick, so some of the time is spent downloading packages. The connection to the Internet was 100Mbit/s during testing, so downloading should not be a significant factor in the measurement. Download typically took a few seconds to a few minutes, depending on the amount of packages being installed.

The speedup is implemented by using two hooks in Debian Installer: the pre-pkgsel.d hook to set up the diverts, and the finish-install.d hook to remove the diverts at the end of the installation. I picked the pre-pkgsel.d hook instead of the post-base-installer.d hook because I test using an ISO without the eatmydata package included, and the post-base-installer.d hook in Debian Edu can only operate on packages included in the ISO. The negative effect of this is that I am unable to activate this optimization for the kernel installation step in d-i. If the code were moved to the post-base-installer.d hook, the speedup would be larger, covering more of the installation.

I've implemented this in the debian-edu-install git repository, and plan to provide the optimization as part of the Debian Edu installation. If you want to test this yourself, you can create two files in the installer (or in an udeb). One shell script needs to go into /usr/lib/pre-pkgsel.d/, with content like this:

#!/bin/sh
set -e
. /usr/share/debconf/confmodule
info() {
    logger -t my-pkgsel "info: $*"
}
error() {
    logger -t my-pkgsel "error: $*"
}
override_install() {
    apt-install eatmydata || true
    if [ -x /target/usr/bin/eatmydata ] ; then
        for bin in dpkg apt-get aptitude tasksel ; do
            file=/usr/bin/$bin
            # Test that the file exists and has not been diverted already.
            if [ -f /target$file ] ; then
                info "diverting $file using eatmydata"
                printf "#!/bin/sh\neatmydata $bin.distrib \"\$@\"\n" \
                    > /target$file.edu
                chmod 755 /target$file.edu
                in-target dpkg-divert --package debian-edu-config \
                    --rename --quiet --add $file
                ln -sf ./$bin.edu /target$file
            else
                error "unable to divert $file, as it is missing."
            fi
        done
    else
        error "unable to find /usr/bin/eatmydata after installing the eatmydata package"
    fi
}

override_install

To clean up, another shell script should go into /usr/lib/finish-install.d/ with code like this:

#! /bin/sh -e
. /usr/share/debconf/confmodule
error() {
    logger -t my-finish-install "error: $@"
}
remove_install_override() {
    for bin in dpkg apt-get aptitude tasksel ; do
        file=/usr/bin/$bin
        if [ -x /target$file.edu ] ; then
            rm /target$file
            in-target dpkg-divert --package debian-edu-config \
                --rename --quiet --remove $file
            rm /target$file.edu
        else
            error "Missing divert for $file."
        fi
    done
    sync # Flush file buffers before continuing
}

remove_install_override

In Debian Edu, I placed both code fragments in a separate script edu-eatmydata-install and call it from the pre-pkgsel.d and finish-install.d scripts.

By now you might ask whether this change should get into the normal Debian installer too. I suspect it should, but I am not sure the current debian-installer coordinators will find it useful enough. It also depends on the side effects of the change. I'm not aware of any, but I guess we will see if the change is safe after some more testing. Perhaps there is some package in Debian depending on sync() and fsync() having an effect? Perhaps it should go into its own udeb, to allow those of us wanting to enable it to do so without affecting everyone.

Hideki Yamane: Intel 910 SSD 400GB - $420

16 September, 2014 - 16:10

Intel SSD 910 (400GB, SSDPEDOX400G301) is cheaper than ever in Japan - only $420 (and its spec sheet says "Recommended Customer Price BULK: $1929.00", wow).

Ritesh Raj Sarraf: apt-offline 1.5

16 September, 2014 - 02:17

I am very pleased to announce the release of apt-offline, version 1.5.

In version 1.4, the offline bug report functionality had to be dropped. In version 1.5, it is back again. apt-offline now uses the new Debian native BTS library. Thanks to its developers, this library is much more slim and neat. The only catch is that it depends on the SOAPpy library, which is currently not part of the Python standard library. If you run apt-offline on Debian, you may not have to worry, as I will add a Recommends on that package. For users using it on Microsoft Windows, please ensure that you have the SOAPpy library installed. It is available on PyPI.

The old bundled magic library has been replaced with the version of python magic library that Debian ships. This library is derived from the file package and is portable on almost all Unixes. For Debian users, there will be a Recommends on it too.

There were also a bunch of old, outstanding, and annoying bugs that have been fixed in this release. For a full list of changes, please refer to the git logs.

With this release, apt-offline should be in good shape for the Jessie release.

apt-offline is available on Alioth @ https://alioth.debian.org/projects/apt-offline/


Keith Packard: xserver-pending-fixes

16 September, 2014 - 01:14
A Forest of X Server Changes

We’ve got about another month left in the X server merge window for 1.17 and I’ve written a small set of fixes which haven’t been reviewed yet for merging. I thought I’d advertise them a bit and see if I couldn’t encourage a few of you to take a look and see if they’re useful, correct and complete.

All of these are in my personal X server repository:

git://people.freedesktop.org/~keithp/xserver.git
Cleaning up the X Registry

Branch: registry-fixes

I’ll bet most of you don’t even know about this code. It serves as a database mapping various X enumerations to strings to aid in diagnostics. For the security extensions, SECURITY and XSELinux, it holds names for all of the requests, events and errors in the core protocol and all registered extensions. For X-Resource, it has the names of the registered resource types.

The X registry gets the request, event and error data from a file, “protocol.txt”, which is installed in /usr/lib/xorg/protocol.txt on my machine. It gets the resource names as a part of resource type allocation.

So, what’s wrong with this? Three basic things:

  1. A simple bug — protocol.txt is left open while the server runs. This consumes a file descriptor for no good reason.

  2. protocol.txt is read and parsed even if the security extensions aren’t available. This wastes time and memory.

  3. The resource names are kept even if X-Resource isn’t in use.

The fixes remove the configure options for including the registry code; these functions are only used by the above extensions, so we can tell whether to include the code based solely on whether the extensions are being built.

Getting rid of the TCP listener by default

Branch: listen-fixes

We’ve had the ‘-nolisten’ option for a while now to disable inbound TCP connections. It’s useful for security reasons, but we’ve never enabled this by default. This patch sequence provides configure options for each of the listen sockets (tcp, unix and local), leaves unix and local enabled by default and disables tcp by default.

A new option, ‘-listen’, is added which allows the user to override the -nolisten defaults in case they actually want to use TCP connections to X.

Glamor bug fixes

branch: glamor-fixes

This branch fixes two bugs:

  1. Scale a large pixmap down to a small pixmap. This happens when you display enormous images in a web page. Iceweasel sends the whole huge image to X and uses Render to scale it to the screen. If the image is larger than a single texture, the X server splits it up into tiles, but the code which tries to perform the merged scale is just broken. Five patches fix this.

  2. Shader-based trapezoids. This code uses area coverage to compute trapezoids. That violates the Render spec, which requires point sampling. Further, the performance of these trapezoids is lower than software (by a lot). This one patch removes the code.

Present bug fixes

branch: present-fixes

A selection of small bug fixes:

  1. Clear pending flips at CloseScreen. This removes a reference to any pending flip pixmap, allowing it to be freed. Otherwise, we’ll leak memory across server reset.

  2. Add support for PresentOptionCopy. This has been in the protocol spec for a while, and was completely trivial to implement. However, it never got done. One tiny little patch.

  3. Expose the Present API to drivers via sdksyms.sh. Until now, the present extension APIs have only been available inside the X server. This exposes them to drivers. This took a few cleanup patches first.

Use Present for Glamor XV

branch: glamor-present-xv

Painting XV to the screen should be done at vblank time to avoid tearing. Present offers vblank synchronized operations. Hooking those two together required a few new present APIs to expose the vblank functionality outside of the present code, then a bit of glamor code to hook up that new API to the XV bits.

Switching Glamor to a GL core profile context

branch: glamor-core-profile

This patch set is still in progress, but demonstrates how close we are. We’ll be requiring OpenGL 3.3 for this so that we get texture swizzling, which is required for our single channel objects.

The changes present on the branch are:

  1. Switch single channel surfaces from GL_ALPHA to GL_RED.

  2. Use vertex array objects.

  3. Switch ephyr over to using a core 3.3 profile.

Still left to do is

  1. Switch Render code to VBOs

The core code uses VBOs everywhere, but the Render code doesn’t. This means that all Render drawing fails, which makes the resulting server not very useful.

My main objective for getting this done is to reduce memory usage by about 16MB, which is the space allocated for software rendering in Mesa in case someone does something which the hardware doesn’t handle, and that can only happen with some legacy OpenGL APIs.

Please help out!

All of these friendly little patches are looking for a bit of review so that they can get merged before the 1.17 window closes.

Thomas Goirand: Backporting libjs-angularjs and libjs-d3 to Wheezy

16 September, 2014 - 01:13

If you didn’t notice, Javascript isn’t as simple as it used to be… Want to backport the 2 simple javascript libs? No problem. You then “just” need to backport a bunch of other packages which are build-dependencies… (and file #761670, #761672, and #761674 on the way when rebuilding…). Here’s the short list:

gyp
node-abbrev
node-ansi
node-ansi-color-table
node-archy
node-async
node-block-stream
node-combined-stream
node-contextify
node-cookie-jar
node-cssom
node-delayed-stream
node-diff
node-eyes
node-forever-agent
node-form-data
node-fstream
node-fstream-ignore
node-github-url-from-git
node-glob
node-graceful-fs
node-gyp
node-htmlparser
node-inherits
node-ini
node-jake
node-jsdom
node-json-stringify-safe
node-lockfile
node-lru-cache
node-marked
node-mime
node-minimatch
node-mkdirp
node-mute-stream
node-node-uuid
node-nopt
node-normalize-package-data
node-npmlog
node-once
node-optimist
node-osenv
node-qs
node-queue-async
node-read
node-read-package-json
node-request
node-retry
node-rimraf
node-semver
node-sha
node-sigmund
node-slide
node-smash
node-tar
node-tunnel-agent
node-uglify
node-underscore
node-utilities
node-vows
node-whic
node-which
node-wordwrap
nodejs
npm
ruby-ronn

Yes, that’s 66 packages above… And of course, backporting some ruby stuff makes sense… :)

Vincent Sanders: NetSurf 3.2

15 September, 2014 - 23:08
We recently released a new version of NetSurf. This was largely to address numerous small bugs, but it also includes the persistent caching implementation I have written about previously. A release used to require the release manager (usually me) to perform a lot of manual processes, and while we had a checklist it was far too easy to miss things.

The Continuous Integration (CI) system, combined with signed release tags in git, has resulted in a greatly simplified process; indeed, it has become almost completely automated. The majority of the manual work is now confined to the tasks that require actual decision making and checking that we are releasing what was intended.

By having the CI system build release binaries, the project now has a much clearer and, importantly, traceable process. I can recommend such a system to any project that produces releases, especially if it releases binaries for any of its targets.

I have also managed to package and upload this version of NetSurf ready for the Debian Jessie release. I would like to thank Jonathan Wiltshire for his assistance in ensuring this was a good quality package.

The release incorporates the successfully merged work of Rupinder Singh, who was our GSoC 2014 student. Rupinder mainly made improvements to our core DOM implementation and was very responsive and enthusiastic throughout his time, despite the mentor team sometimes not being available.

This work goes towards improving NetSurf in the future by ensuring the underlying features are present in our core libraries. The GSoC mentors and all project developers are pleased with the results of this year's GSoC participation and would like to thank everyone involved in making our participation possible.

Along with the good news comes a little bad:
PowerPC Mac OS X
Despite repeated calls for assistance with new hardware and Java builds, none has been forthcoming, meaning that from this release we are no longer able to ship PowerPC builds for Mac OS X.

The main issue is that the last version of Mac OS X that runs on PPC is Leopard, and there is no viable port of the Java 1.6 necessary for our CI system to run. Additionally, the fully loaded PPC Mac mini (kindly donated to us by Mythic Beasts) had become far too slow to keep up with our builds and was causing long delays.
Bugs
We have a lot of bugs; in fact, just during this release cycle 30 more bugs were reported than we closed. So while the new bug reporting system has been a success, and our users are reporting issues when they find them, the development team is not keeping up.

The failure to keep up stems from the underlying issue of lack of manpower. We have relatively few active developers, which is especially problematic when there are many users for a platform, such as RISC OS, but the maintainer is unable to commit enough time to fixing issues.

If you would like to help making NetSurf a better browser we are always happy to work with new contributors.

Junichi Uekawa: ARM assembly.

15 September, 2014 - 19:51
ARM assembly. I was reading up on some docs about Unified Assembly Language (UAL), and got confused. I don't seem to be able to find a comprehensive doc about what works and what doesn't. Heh.

Julien Danjou: Python bad practice, a concrete case

15 September, 2014 - 19:09

A lot of people read up on good Python practice, and there's plenty of information about that on the Internet. Many tips are included in the book I wrote this year, The Hacker's Guide to Python. Today I'd like to show a concrete case of code that I don't consider being the state of the art.

In my last article, where I talked about my new project Gnocchi, I wrote about how I tested, hacked on and then ditched whisper. Here I'm going to explain part of my thought process and a few things that raised my eyebrows when hacking on this code.

Before I start, please don't get the spirit of this article wrong. It's in no way a personal attack on the authors and contributors (whom I don't know). Furthermore, whisper is a piece of code that is in production in thousands of installations, storing metrics for years. While I can argue that I consider the code not to be following best practice, it definitely works well enough and is valuable to a lot of people.

Tests

The first thing that I noticed when trying to hack on whisper is the lack of tests. There's only one file containing tests, named test_whisper.py, and the coverage it provides is pretty low. One can check that using the coverage tool.

$ coverage run test_whisper.py
...........
----------------------------------------------------------------------
Ran 11 tests in 0.014s
 
OK
$ coverage report
Name           Stmts   Miss  Cover
----------------------------------
test_whisper     134      4    97%
whisper          584    227    61%
----------------------------------
TOTAL            718    231    67%


While one would think that 61% is "not so bad", taking a quick peek at the actual test code shows that the tests are incomplete. What I mean by incomplete is that they, for example, use the library to store values into a database, but they never check whether the results can be fetched and whether the fetched results are accurate. This is a good reason one should never blindly trust the test coverage percentage as a quality metric.

When I tried to modify whisper, I ended up making wrong changes that the tests still let pass, because they do not check the entire cycle of the values fed into the database.
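
To illustrate, a round-trip test of the kind that is missing might look roughly like the sketch below. The create() and fetch() signatures are the ones quoted later in this article; update()'s exact signature and the shape of fetch()'s return value are my assumptions about the whisper API.

# Sketch of a round-trip test: store a value, fetch it back, check it survived.
# update()'s signature and fetch()'s return shape are assumed, not taken from
# whisper's documentation.
import os
import tempfile
import time
import unittest

import whisper


class TestRoundTrip(unittest.TestCase):
    def test_update_then_fetch(self):
        path = os.path.join(tempfile.mkdtemp(), 'roundtrip.wsp')
        # One archive: 1-second resolution, 60 points retained.
        whisper.create(path, [(1, 60)])

        now = int(time.time())
        whisper.update(path, 42.0, now - 5)

        (start, end, step), values = whisper.fetch(path, now - 30, now)
        # The value we stored must actually come back out.
        self.assertIn(42.0, values)


if __name__ == '__main__':
    unittest.main()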

No PEP 8, no Python 3

The code doesn't respect PEP 8. A run of flake8 + hacking shows 732 errors… While this does not impact how the code runs, it makes it more painful to hack on than most Python projects.

The hacking tool also shows that the code is not Python 3 ready as there is usage of Python 2 only syntax.

A good way to fix that would be to set up tox and add a few targets for PEP 8 checks and Python 3 tests. Even if the test suite is not complete, starting by having flake8 run without errors and the few unit tests working with Python 3 should put the project in a better light.

Not using idiomatic Python

A lot of the code could be simplified by using idiomatic Python. Let's take a simple example:

def fetch(path,fromTime,untilTime=None,now=None):
  fh = None
  try:
    fh = open(path,'rb')
    return file_fetch(fh, fromTime, untilTime, now)
  finally:
    if fh:
      fh.close()


That piece of code could be easily rewritten as:

def fetch(path,fromTime,untilTime=None,now=None):
  with open(path, 'rb') as fh:
    return file_fetch(fh, fromTime, untilTime, now)


This way, the function actually looks so simple that one can even wonder why it should exist – but why not.

Usage of loops could also be made more Pythonic:

for i,archive in enumerate(archiveList):
  if i == len(archiveList) - 1:
    break


could actually be:

for i, archive in enumerate(itertools.islice(archiveList, len(archiveList) - 1)):


That reduces the code size and makes it easier to read through the code.

Wrong abstraction level

Also, one thing that I noticed in whisper is that it abstracts its features at the wrong level.

Take the create() function, it's pretty obvious:

def create(path,archiveList,xFilesFactor=None,aggregationMethod=None,sparse=False,useFallocate=False):
  # Set default params
  if xFilesFactor is None:
    xFilesFactor = 0.5
  if aggregationMethod is None:
    aggregationMethod = 'average'

  #Validate archive configurations...
  validateArchiveList(archiveList)

  #Looks good, now we create the file and write the header
  if os.path.exists(path):
    raise InvalidConfiguration("File %s already exists!" % path)
  fh = None
  try:
    fh = open(path,'wb')
    if LOCK:
      fcntl.flock( fh.fileno(), fcntl.LOCK_EX )

    aggregationType = struct.pack( longFormat, aggregationMethodToType.get(aggregationMethod, 1) )
    oldest = max([secondsPerPoint * points for secondsPerPoint,points in archiveList])
    maxRetention = struct.pack( longFormat, oldest )
    xFilesFactor = struct.pack( floatFormat, float(xFilesFactor) )
    archiveCount = struct.pack(longFormat, len(archiveList))
    packedMetadata = aggregationType + maxRetention + xFilesFactor + archiveCount
    fh.write(packedMetadata)
    headerSize = metadataSize + (archiveInfoSize * len(archiveList))
    archiveOffsetPointer = headerSize

    for secondsPerPoint,points in archiveList:
      archiveInfo = struct.pack(archiveInfoFormat, archiveOffsetPointer, secondsPerPoint, points)
      fh.write(archiveInfo)
      archiveOffsetPointer += (points * pointSize)

    #If configured to use fallocate and capable of fallocate use that, else
    #attempt sparse if configure or zero pre-allocate if sparse isn't configured.
    if CAN_FALLOCATE and useFallocate:
      remaining = archiveOffsetPointer - headerSize
      fallocate(fh, headerSize, remaining)
    elif sparse:
      fh.seek(archiveOffsetPointer - 1)
      fh.write('\x00')
    else:
      remaining = archiveOffsetPointer - headerSize
      chunksize = 16384
      zeroes = '\x00' * chunksize
      while remaining > chunksize:
        fh.write(zeroes)
        remaining -= chunksize
      fh.write(zeroes[:remaining])

    if AUTOFLUSH:
      fh.flush()
      os.fsync(fh.fileno())
  finally:
    if fh:
      fh.close()


The function is doing everything: checking if the file doesn't exist already, opening it, building the structured data, writing this, building more structure, then writing that, etc.

That means that the caller has to give a file path, even if it just wants a whisper data structure to store itself elsewhere. StringIO() could be used to fake a file handler, but it will fail if the call to fcntl.flock() is not disabled – and it is inefficient anyway.

There are a lot of other functions in the code, such as setAggregationMethod(), that mix file handling – even doing things like os.fsync() – with manipulating structured data. This is definitely not a good design, especially for a library, as it turns out that reusing the functions in a different context is near impossible.
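
As an illustration of a different split, here is a sketch (with an invented helper name, pack_header(), and assuming whisper's module-level format constants, which appear in the create() code quoted above, are importable) that builds the header bytes without touching any file:

# Hypothetical refactoring: serialize the header in a pure function, leave all
# file handling to a thin caller. pack_header() is an invented name; the
# *Format/*Size constants are whisper's own module-level names.
import struct

from whisper import (aggregationMethodToType, archiveInfoFormat, archiveInfoSize,
                     floatFormat, longFormat, metadataSize, pointSize)


def pack_header(archiveList, xFilesFactor=0.5, aggregationMethod='average'):
    aggregationType = struct.pack(longFormat,
                                  aggregationMethodToType.get(aggregationMethod, 1))
    oldest = max(secondsPerPoint * points
                 for secondsPerPoint, points in archiveList)
    header = (aggregationType
              + struct.pack(longFormat, oldest)
              + struct.pack(floatFormat, float(xFilesFactor))
              + struct.pack(longFormat, len(archiveList)))

    offset = metadataSize + archiveInfoSize * len(archiveList)
    for secondsPerPoint, points in archiveList:
        header += struct.pack(archiveInfoFormat, offset, secondsPerPoint, points)
        offset += points * pointSize
    return header

create() would then shrink to "open the file, write pack_header(...), pre-allocate the data area", and a caller that wants to keep the structure somewhere other than a local file could reuse pack_header() directly.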

Race conditions

There are race conditions, for example in create() (see added comment):

if os.path.exists(path):
  raise InvalidConfiguration("File %s already exists!" % path)
fh = None
try:
  # TOO LATE I ALREADY CREATED THE FILE IN ANOTHER PROCESS YOU ARE GOING TO
  # FAIL WITHOUT GIVING ANY USEFUL INFORMATION TO THE CALLER :-(
  fh = open(path,'wb')


That code should be:

try:
  fh = os.fdopen(os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL), 'wb')
except OSError as e:
  if e.errno == errno.EEXIST:
    raise InvalidConfiguration("File %s already exists!" % path)


to avoid any race condition.

Unwanted optimization

We saw earlier the fetch() function that is barely useful, so let's take a look at the file_fetch() function that it calls.

def file_fetch(fh, fromTime, untilTime, now = None):
  header = __readHeader(fh)
  [...]


The first thing the function does is to read the header from the file handler. Let's take a look at that function:

def __readHeader(fh):
  info = __headerCache.get(fh.name)
  if info:
    return info

  originalOffset = fh.tell()
  fh.seek(0)
  packedMetadata = fh.read(metadataSize)

  try:
    (aggregationType,maxRetention,xff,archiveCount) = struct.unpack(metadataFormat,packedMetadata)
  except:
    raise CorruptWhisperFile("Unable to read header", fh.name)
  [...]


The first thing the function does is to look into a cache. Why is there a cache?

It actually caches the header, with an index based on the file path (fh.name). Except that if one decides, for example, not to use a file and to cheat using StringIO, then it does not have any name attribute, and this code path will raise an AttributeError.

One has to set a fake name manually on the StringIO instance, and it must be unique so nobody messes with the cache:

import StringIO
 
packedMetadata = <some source>
fh = StringIO.StringIO(packedMetadata)
fh.name = "myfakename"
header = __readHeader(fh)


The cache may actually be useful when accessing files, but it's definitely useless when not using files. And it's not clear that the complexity (even if small) that the cache adds is worth it. I doubt most whisper-based tools are long-running processes, so the cache that really matters when accessing the files is the one handled by the operating system kernel, and that one is going to be much more efficient anyway, and shared between processes. There's also no expiry of that cache, which could end up wasting tons of memory.

Docstrings

None of the docstrings are written in a parsable syntax like Sphinx. This means you cannot generate documentation in a nice format that a developer using the library could read easily.

The documentation is also not up to date:

def fetch(path,fromTime,untilTime=None,now=None):
  """fetch(path,fromTime,untilTime=None)
  [...]
  """

def create(path,archiveList,xFilesFactor=None,aggregationMethod=None,sparse=False,useFallocate=False):
  """create(path,archiveList,xFilesFactor=0.5,aggregationMethod='average')
  [...]
  """


This is something that could be avoided if a proper format were picked for the docstrings. A tool could then be used to warn when there's a divergence between the actual function signature and the documented one, such as a missing argument.
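
For example, a Sphinx-style docstring for the fetch() function seen earlier could look like this (the parameter descriptions are mine, inferred from the names, not taken from whisper):

def fetch(path, fromTime, untilTime=None, now=None):
    """Fetch data points from a whisper file.

    :param str path: path to the whisper file
    :param float fromTime: epoch time marking the start of the requested range
    :param float untilTime: epoch time marking the end of the range,
        defaults to the current time
    :param float now: override "the current time", mainly useful in tests
    :returns: a time-info tuple and the list of fetched values
    """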

Duplicated code

Last but not least, there's a lot of code duplicated around in the scripts provided by whisper in its bin directory. These scripts should be very lightweight and use the console_scripts facility of setuptools, but they actually contain a lot of (untested) code. Furthermore, some of that code is partially duplicated from the whisper.py library, which is against DRY.

Conclusion

There are a few more things that made me stop considering whisper, but these are part of the whisper features, not necessarily code quality. One can also point out that the code is very condensed and hard to read, and that's a more general problem about how it is organized and abstracted.

A lot of these defects are actually points that made me start writing The Hacker's Guide to Python a year ago. Running into this kind of code makes me think it was a really good idea to write a book on advice to write better Python code!

The Hacker's Guide to Python

A book I wrote talking about designing Python applications, state of the art, advice to apply when building your application, various Python tips, etc. Interested? Check it out.

Cyril Brulebois: Freelance Debian consultant

15 September, 2014 - 17:20

I’m not used to talking about my day job but here’s an exception.

Over the past few years I worked in two startups (3 years each). It was nice to spend time in different areas: one job was mostly about research and development in a Linux cluster environment; the other one was about maintaining a highly-customized, Linux-based operating system, managing a small support team, and performing technological surveillance in IT security.

In the meanwhile I’ve reached a milestone: 10 years with Debian. I had been wondering for a few months whether I could try my luck going freelance, becoming a Debian consultant. I finally decided to go ahead and started in August!

The idea is to lend a hand for various Debian-related things like systems administration, development/debugging, packaging/repository maintenance, or Debian Installer support, be it one-shot or on a regular basis. I didn’t think about trainings/workshops at first but sharing knowledge is something I’ve always liked, even if I didn’t become a teacher.

For those interested, details can be found on my website: https://mraw.org/.

Of course this doesn’t mean I’m going to put an end to my volunteer activities within Debian, especially as a Debian Installer release manager. Quite the contrary in fact! See the August and September debian-boot@ archives, which have been busy months. :)

Martin Pitt: autopkgtest 3.5: Reboot support, Perl/Ruby implicit tests

15 September, 2014 - 16:23

Last week’s autopkgtest 3.5 release (in Debian sid and Ubuntu Utopic) brings several new features which I’d like to announce.

Tests that reboot

For testing low-level packages like init or the kernel it is sometimes desirable to reboot the testbed in the middle of a test. For example, I added a new boot_and_services systemd autopkgtest which configures grub to boot with systemd as pid 1, reboots, and then checks that the most important services like lightdm, D-BUS, NetworkManager, and cron come up as expected. (This test will be expanded a lot in the future to cover other areas like the journal, logind, etc.)

In a testbed which supports rebooting (currently only QEMU), your test will now find an “autopkgtest-reboot” command, which the test calls with an arbitrary “marker” string. autopkgtest will then reboot the testbed, save/restore any files it needs to (like the test's file tree or previously created artifacts), and then re-run the test with ADT_REBOOT_MARK=mymarker.

The new “Reboot during a test” section in README.package-tests explains this in detail with an example.
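
For a rough idea of the flow (the README's example is the reference; this is only a sketch, and what the test checks here is a placeholder), a rebooting test could be structured like this:

#!/usr/bin/python
# Sketch of a DEP-8 test that reboots once. autopkgtest-reboot and
# ADT_REBOOT_MARK are the documented interface; the state file is made up.
import os
import subprocess
import sys

mark = os.environ.get('ADT_REBOOT_MARK', '')

if not mark:
    # First run: record some state, then ask the testbed to reboot and re-run
    # this test with the marker "after-setup". The call does not return
    # normally; the test is restarted after the reboot.
    with open('state-before-reboot', 'w') as f:
        f.write('configured\n')
    subprocess.call(['autopkgtest-reboot', 'after-setup'])
elif mark == 'after-setup':
    # Second run, after the reboot: the working tree was saved and restored,
    # so the state recorded before the reboot should still be there.
    with open('state-before-reboot') as f:
        if f.read().strip() != 'configured':
            sys.exit(1)
    print('state survived the reboot')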

Implicit test metadata for similar packages

The Debian pkg-perl team recently discussed how to add package tests to the ~3,000 Perl packages. For most of these the test metadata looks pretty much the same, so they created a new pkg-perl-autopkgtest package which centralizes the logic. autopkgtest 3.5 now supports an implicit debian/tests/control file to avoid having to modify several thousand packages with exactly the same file.

An initial run already looked quite promising: 65% of the packages pass their tests. There will be a few iterations to identify common failures and fix those in pkg-perl-autopkgtest and autopkgtest itself now.

There is still some discussion about how implicit test control files go together with the DEP-8 specification, as other runners like sadt do not support them yet. Most probably we’ll declare those packages XS-Testsuite: autopkgtest-pkg-perl instead of the usual autopkgtest.

In the same vein, Debian’s Ruby maintainer (Antonio Terceiro) added implicit test control support for Ruby packages. We haven’t done a mass test run with those yet, but their structure will probably look very similar.

Gergely Nagy: Looking ahead

15 September, 2014 - 14:34

A little more than a year ago, I started working as a syslog-ng OSE developer full-time. That was a tremendously important milestone in my career, as one of the goals I wanted to achieve in life - to work on free software for a living - became a reality. Rocket boots were fired up, and we accomplished quite a lot in the past year, and I'm very, very proud of the work we did - we, the whole community. I enjoyed every bit of it, but as it turns out, some of the other desires I wish to pursue, and new challenges I am looking for, will lead me in a different direction. At the end of August, I handed in my resignation, and this past Friday was my last work day at BalaBit.

There are a few questions that will likely be asked, and I'll try my best to answer them beforehand. Questions such as: What will happen to syslog-ng?, Who will be the new maintainer?, How does this affect your Debian work?, and so on, and so forth.

syslog-ng

What will happen to syslog-ng?

After careful consideration, with the syslog-ng team at BalaBit, we decided that they will take over OSE-related responsibilities as well. They will do releases, they will engage the people on GitHub and the mailing list, they will take care of the @sngOSE twitter account.

During the past few months, we've been working on pushing the team closer to the community: we moved issues to GitHub, the team submitted pull requests, and in general, we moved closer to each other in every possible way.

While they may not have the open source maintainer experience I had, they are all capable folk, and will quickly learn on the job. The team taking over also has the advantage of having more manpower, and better collaboration between the premium edition and OSE.

Who will be the syslog-ng maintainer?

There will be no single bottleneck. This has both good and bad implications, but I believe the good ones outweigh the disadvantages.

What about the roadmap? When will 3.6.1 happen?

The roadmap is already laid out for 3.6, but it is the team's judgement when it will be released. The latest 3.6.0beta2 was released together with the team, too. I expect there will be delays (I planned 3.6.1 to be released on September 27th), but not much; a few weeks, perhaps.

What will your involvement in syslog-ng development be?

With changing jobs, my involvement will drastically decrease. I will remain a syslog-ng user, and I will continue packaging it for Debian and Ubuntu, and will keep my unofficial repository running. I do not see myself contributing much, perhaps bug reports, opinions and an occasional idea.

Nevertheless, my expertise is still considerable, and I will help the team and the community in whatever way I can - but I will be severely time constrained.

Debian

While BalaBit were very free software friendly, and allowed me to work on Debian tasks from time to time, my other activities took up an increasingly big chunk of my paid time, and I had to cut back a little. At the new job, things will be a bit different: while I won't have as much time to do Debian work on the job, I will have a lot more free time, in which I hope to do more for Debian.

I will - as promised above - continue maintaining syslog-ng, and all other packages I currently maintain. Apart from those, I wish to make myself more useful within the Clojure and Hy teams.

Miscellaneous

Where are you moving to?

This is something I will answer in due time. For now, I will have two weeks off, which I wish to spend my way, picking up projects I neglected for far too long.

Gregor Herrmann: RC bugs 2014/34-37

15 September, 2014 - 05:13

the perl 5.20 transition is over, debconf14 is over, so I should have more time for RC bugs? yes & no: I fixed some, but only in "our" (as in: pkg-perl) packages:

  • #711418 – src:libanyevent-dbi-perl: "libanyevent-dbi-perl: FTBFS: Failed test 'Using an unknown function results in error'"
    add patch for newer SQLite (pkg-perl)
  • #756566 – libxml-dt-perl: "libxml-dt-perl: Insecure use of temporary files (CVE-2014-5260)"
    upload new upstream release (pkg-perl)
  • #759838 – src:padre: "padre: FTBFS: Failed test 'no warnings'"
    fix 2/3 of the test failures in git, last one fixed by dod (pkg-perl)
  • #759942 – src:cpanminus: "cpanminus: FTBFS: Can't write to cpanm home '/sbuild-nonexistent/.cpanm': You should fix it with chown/chmod first."
    set HOME once more in debian/rules (pkg-perl)
  • #759964 – src:libhtml-formhandler-model-dbic-perl: "libhtml-formhandler-model-dbic-perl: FTBFS: dh_auto_test: make -j1 test returned exit code 2"
    upload new upstream release (pkg-perl)
  • #761312 – libdbix-class-resultset-recursiveupdate-perl: "libdbix-class-resultset-recursiveupdate-perl: missing dependency on liblist-moreutils-perl"
    add missing dependency (pkg-perl)
  • #761313 – libgeo-google-mapobject-perl: "libgeo-google-mapobject-perl: missing dependency on libjson-perl"
    add missing dependency (pkg-perl)
  • #761315 – libfile-read-perl: "libfile-read-perl: missing dependency on libfile-slurp-perl"
    add missing dependency (pkg-perl)
  • #761319 – libpod-wordlist-hanekomu-perl: "libpod-wordlist-hanekomu-perl: missing dependency on libtest-spelling-perl"
    add missing dependency (pkg-perl)
  • #761526 – src:liblocale-maketext-gettext-perl: "liblocale-maketext-gettext-perl: FTBFS: Tests failures"
    skip test which needs internet and a writable $HOME (pkg-perl)
  • #761558 – src:libposix-strptime-perl: "libposix-strptime-perl: FTBFS: Tests failures"
    skip test which needs internet and a writable $HOME (pkg-perl)

Ben Armstrong: Bluff Wilderness Trail Hike, Summer 2014

14 September, 2014 - 06:30

Happy to be back from our yearly hike with my friend, Ryan Neily, on the Bluff Wilderness Trail. We’re proud of our achievement, hiking all four loops. Including the trip to and from the head of the trail, that was 30 km in all. Exhausting, but well worth it.

On the trip we bumped into one of the people from WRWEO who helps to maintain the trail, and stopped for a bit to swap stories and tips about hiking it. Kudos to Nancy for helping keep this trail beautiful and accessible. We really appreciate the tireless work of this organization, and the thought they’ve put into it. It’s a treasure!

Keith Packard: Altos1.5

14 September, 2014 - 02:47
AltOS 1.5 — EasyMega support, features and bug fixes

Bdale and I are pleased to announce the release of AltOS version 1.5.

AltOS is the core of the software for all of the Altus Metrum products. It consists of firmware for our cc1111, STM32L151, LPC11U14 and ATtiny85 based electronics and Java-based ground station software.

This is a major release of AltOS, including support for our new EasyMega board and a host of new features and bug fixes.

AltOS Firmware — EasyMega added, new features and fixes

Our new flight computer, EasyMega, is a TeleMega without any radios:

  • 9 DoF IMU (3 axis accelerometer, 3 axis gyroscope, 3 axis compass).

  • Orientation tracking using the gyroscopes (and quaternions, which are lots of fun!)

  • Four fully-programmable pyro channels, in addition to the usual apogee and main channels.

AltOS Changes

We’ve made a few improvements in the firmware:

  • The APRS secondary station identifier (SSID) is now configurable by the user. By default, it is set to the last digit of the serial number.

  • Continuity of the four programmable pyro channels on EasyMega and TeleMega is now indicated via the beeper. Four tones are sent out after the continuity indication for the apogee and main channels with high tones indicating continuity and low tones indicating an open circuit.

  • Configurable telemetry data rates. You can now select among 38400 (the previous value, and still the default), 9600 or 2400 bps. To take advantage of this, you’ll need to reflash your TeleDongle or TeleBT.

AltOS Bug Fixes

We also fixed a few bugs in the firmware:

  • TeleGPS had separate flight logs, one for each time the unit was turned on. Turning the unit on to test stuff and turning it back off would consume one of the flight log ‘slots’ on the board; once all of the slots were full, no further logging would take place. Now, TeleGPS appends new data to an existing single log.

  • Increase the maximum computed altitude from 32767m to 2147483647m. Back when TeleMetrum v1.0 was designed, we never dreamed we’d be flying to 100k’ or more. Now that’s surprisingly common, and so we’ve increased the size of the altitude data values to fit modern rocketry needs.

  • Continuously evaluate pyro firing condition during delay period. The previous firmware would evaluate the pyro firing requirements, and once met, would delay by the indicated amount and then fire the channel. If the conditions had changed state, the channel would still fire. Now, the conditions are continuously evaluated during the delay period and if they change state, the event is suppressed.

  • Allow negative values in the pyro configuration. Now you can select a negative speed to indicate a descent rate or a negative acceleration value to indicate acceleration towards the ground.

AltosUI and TeleGPS — EasyMega support, OS integration and more

The AltosUI and TeleGPS applications have a few changes for this release:

  • EasyMega support. That was a simple matter of adapting the existing TeleMega support.

  • Added icons for our file types, and hooked up the file manager so that AltosUI, TeleGPS and/or MicroPeak are used to view any of our data files.

  • Configuration support for APRS SSIDs, and telemetry data rates.

Laura Arjona: Disabling comments in the blog

13 September, 2014 - 22:23

I’m getting more spam than I can stand on this blog. Comments are moderated, so the public is not suffering from it, only me. From time to time I go to my dashboard and clean out the spam. I’m afraid that I will delete some legit comment in one of these spam-cleaning fevers, or, more probably, that a legit comment will wait in the queue for several days (weeks?), just because I’m too lazy to deal with spam and let days pass by (until the fever comes).

I think I’m going to follow the wisdom of Bradley M. Kuhn and link to a pump.io note for comments on my blog posts (disabling them here in WordPress.com). I usually post a notice when I write something in my blog, so the only task is to update the blog post with the pump.io URL of the thread for comments.

While WordPress.com allows writing comments quickly, without the need for an account (you write just a name, an email, and the comment), in pump you need to have an account and sign in to comment. That looks like a bad thing, a barrier for people to participate. But of course, it stops spam :)

After thinking about it a bit: pump.io is a federated network, you can choose the pump server that you want, you can create a fake account, you don’t need to provide personal information… and it’s another way to promote one of the social networks where I live. Other systems link to Facebook, Twitter, or other places, and nobody complains! Even when those services don’t have any of the advantages of being a federated, free-software-powered social network :)

If anybody doesn’t want to use pump.io but wants to comment, other ways to reach me or the related blog post are:

  • Comment in the GNUSocial fediverse: the post announcing the thread for each blog post will be propagated to my quitter.se account too.
  • While I’m still using Twitter, you can comment on the corresponding tweet, but beware that I’m seriously thinking about closing my account there, since I rarely use it and don’t like the platform.
  • Drop me an email and I can post the comment on your behalf (if you want your comment to be “anonymous”, please say so in the email).

So now it’s decided, and this is the first post of this new experiment. This text is posted in pump.io too, and you can comment there :)



Steve Kemp: Storing and distributing secrets.

13 September, 2014 - 04:10

I run a number of hosts, and they are controlled via a server automation tool I wrote called slaughter [Documentation].

The policies I use to control my hosts are public and I don't want to make them private because they serve as good examples.

Because the roles are public I don't want to embed passwords in them, which means I need something to hold secrets securely. In my case secrets are things like plaintext-passwords. I want those secrets to be secure and unavailable from untrusted hosts.

The simplest solution I could think of was an IP-address based ACL and a simple webserver. A client requests something like:

  • http://secret.example.com/user-passwords

That returns a JSON object if the requesting host is permitted to read the data. Otherwise it returns an HTTP 403 error.

The layout is very simple:

|-- secrets
|   |-- 206.190.139.148
|   |   `-- auth.json
|   |-- 127.0.0.1
|   |   `-- example.json
|   `-- 80.68.84.109
|       `-- chat.json

Each piece of data is beneath a directory/symlink which controls the read-only access. If the request comes in from the suitable IP it is granted, if not it is denied.

For example a failing case:

skx@desktop ~ $ curl  http://sss.steve.org.uk/chat
missing/permission denied

A working case:

root@chat ~ # curl  http://sss.steve.org.uk/chat
{ "steve": "haha", "bot": "notreally" }

(The JSON suffix is added automatically.)

It is hardly rocket-science, but I couldn't find anything else packaged neatly for this - only things like auth/secstore and factotum. So I'll share if it is useful.
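
For readers curious how little is needed, here is a minimal sketch of the same idea (this is not the actual sss code; it uses Python 2's BaseHTTPServer, and the /srv/secrets path and the port are made up):

# Minimal sketch of an IP-ACL'd secret server: the client's address picks the
# directory, the request path picks the JSON file. Not the real sss code.
import os
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

SECRETS = "/srv/secrets"


class SecretHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        client_ip = self.client_address[0]
        # basename() keeps requests from escaping the secrets directory.
        name = os.path.basename(self.path)
        target = os.path.join(SECRETS, client_ip, name + ".json")
        if os.path.isfile(target):
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            with open(target) as f:
                self.wfile.write(f.read())
        else:
            self.send_error(403, "missing/permission denied")


if __name__ == "__main__":
    HTTPServer(("", 8080), SecretHandler).serve_forever()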

Simple Secret Sharing, or Steve's secret storage.

Richard Hartmann: Release Critical Bug report for Week 37

13 September, 2014 - 04:08

Remember, remember; the fifth of November.

The UDD bugs interface currently knows about the following release critical bugs:

  • In Total: 1422
    • Affecting Jessie: 410 That's the number we need to get down to zero before the release. They can be split in two big categories:
      • Affecting Jessie and unstable: 355 Those need someone to find a fix, or to finish the work to upload a fix to unstable:
        • 52 bugs are tagged 'patch'. Please help by reviewing the patches, and (if you are a DD) by uploading them.
        • 26 bugs are marked as done, but still affect unstable. This can happen due to missing builds on some architectures, for example. Help investigate!
        • 277 bugs are neither tagged patch, nor marked done. Help make a first step towards resolution!
      • Affecting Jessie only: 55 Those are already fixed in unstable, but the fix still needs to migrate to Jessie. You can help by submitting unblock requests for fixed packages, by investigating why packages do not migrate, or by reviewing submitted unblock requests.
        • 0 bugs are in packages that are unblocked by the release team.
        • 55 bugs are in packages that are not unblocked.

Graphical overview of bug stats thanks to azhag:


Creative Commons License: the copyright of each article belongs to its respective author.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported license.