Planet Debian

Planet Debian - http://planet.debian.org/

Charles Plessy: Browsing debian-private via SSH

26 November, 2014 - 17:35

I recently realised that one can browse the archives of debian-private via SSH. I find this a good compromise between subscription and ignorance. Here is, for instance, the command for November:

ssh -t master.debian.org mutt -f /home/debian/archive/debian-private/debian-private.201411

Francois Marier: Hiding network disconnections using an IRC bouncer

26 November, 2014 - 17:30

A bouncer can be a useful tool if you rely on IRC for team communication and instant messaging. The most common use of such a server is to be permanently connected to IRC and to buffer messages while your client is disconnected.

However, that's not what got me interested in this tool. I'm not looking for another place where messages accumulate and wait to be processed later. I'm much happier if people email me when I'm not around.

Instead, I wanted to do to irssi what mosh did to ssh clients: transparently handle and hide temporary disconnections. Here's how I set everything up.

Server setup

The first step is to install znc:

apt-get install znc

Make sure you get the 1.0 series (in jessie or trusty, not wheezy or precise) since it has much better multi-network support.

Then, as a non-root user, generate a self-signed TLS certificate for it:

openssl req -x509 -sha256 -newkey rsa:2048 -keyout znc.pem -nodes -out znc.crt -days 365

and make sure you use something like irc.example.com as the subject name: that is the hostname you will be connecting to from your IRC client.

Then install the certificate in the right place:

mkdir ~/.znc
mv znc.pem ~/.znc/
cat znc.crt >> ~/.znc/znc.pem

Once that's done, you're ready to create a config file for znc using the znc --makeconf command, again as the same non-root user:

  • create separate znc users if you have separate nicks on different networks
  • use your nickserv password as the server password for each network
  • enable ssl
  • say no to the chansaver and nickserv plugins

Finally, open the IRC port (tcp port 6697 by default) in your firewall:

iptables -A INPUT -p tcp --dport 6697 -j ACCEPT
Client setup (irssi)

On the client side, the official documentation covers a number of IRC clients, but the irssi page was quite sparse.

Here's what I used for the two networks I connect to (irc.oftc.net and irc.mozilla.org):

servers = (
  {
    address = "irc.example.com";
    chatnet = "OFTC";
    password = "fmarier/oftc:Passw0rd1!";
    port = "6697";
    use_ssl = "yes";
    ssl_verify = "yes";
    ssl_cafile = "~/.irssi/certs/znc.crt";
  },
  {
    address = "irc.example.com";
    chatnet = "Mozilla";
    password = "francois/mozilla:Passw0rd1!";
    port = "6697";
    use_ssl = "yes";
    ssl_verify = "yes";
    ssl_cafile = "~/.irssi/certs/znc.crt";
  }
);

Of course, you'll need to copy your znc.crt file from the server into ~/.irssi/certs/znc.crt.

Make sure that you're no longer authenticating with the nickserv from within irssi. That's znc's job now.

Wrapper scripts

So far, this is a pretty standard znc+irssi setup. What makes it work with my workflow is the wrapper script I wrote to enable znc before starting irssi and then prompt to turn it off after exiting:

#!/bin/bash
ssh irc.example.com "pgrep znc || znc"
irssi
read -p "Terminate the bouncer? [y/N] " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]
then
  ssh irc.example.com killall -sSIGINT znc
fi

Now, instead of typing irssi to start my IRC client, I use irc.

If I'm exiting irssi before commuting or because I need to reboot for a kernel update, I keep the bouncer running. At the end of the day, I say yes to killing the bouncer. That way, I don't have a backlog to go through when I wake up the next day.

Sune Vuorela: QImage and QPixmap in a Qt Quick item

26 November, 2014 - 04:50

For reasons I don't know, a Qt Quick item that can show a QImage or a QPixmap is apparently missing. The current Image QML item only works with data that can be represented by a URL.

So I wrote one that kind of works. Comments most welcome.

It is found on git.kde.org: http://quickgit.kde.org/?p=scratch/sune/imageitem.git

Oh, and the KDE End of Year fundraiser is still running. https://www.kde.org/fundraisers/yearend2014/. Go support it if you haven’t already.

Holger Levsen: 20141125-change

26 November, 2014 - 04:48
Change

Not many people adapt to fundamental changes easily, but at least people can change at all. I'm sure what looks funny now has also been a painful experience, but... - that's life. Sometimes it sucks. And suddenly...

Enrico Zini: mock-webserver

26 November, 2014 - 00:22
A mock webserver to use for unit testing HTTP clients

With python -m SimpleHTTPServer it's easy to bring up an HTTP server for testing HTTP client code; however, it only supports GET requests, and I needed to test an HTTP client that has to perform a file upload.

It took way more than I originally expected to put this together, so here it is, hopefully saving other people (including future me) some time:

#!/usr/bin/python3

import http.server
import cgi
import socketserver
import hashlib
import json

PORT = 8081

class Handler(http.server.SimpleHTTPRequestHandler):
    def do_POST(self):
        info = {
            "method": "POST",
            "headers": { k: v for k, v in self.headers.items() },
        }

        form = cgi.FieldStorage(
            fp=self.rfile,
            headers=self.headers,
            environ={'REQUEST_METHOD':'POST',
                     'CONTENT_TYPE':self.headers['Content-Type'],
                     })

        postdata = {}
        for k in form.keys():
            if form[k].file:
                buf = form.getvalue(k)
                postdata[k] = {
                    "type": "file",
                    "name": form[k].filename,
                    "size": len(buf),
                    # json.dumps will not serialize a byte() object, so we
                    # return the shasum instead of the file body
                    "sha256": hashlib.sha256(buf).hexdigest(),
                }
            else:
                vals = form.getlist(k)
                if len(vals) == 1:
                    postdata[k] = {
                        "type": "field",
                        "val": vals[0],
                    }
                else:
                    postdata[k] = {
                        "type": "multifield",
                        "vals": vals,
                    }

        info["postdata"] = postdata

        resbody = json.dumps(info, indent=1)
        print(resbody)

        resbody = resbody.encode("utf-8")

        self.send_response(200)
        self.send_header("Content-type", "application/json")
        self.send_header("Content-Length", str(len(resbody)))
        self.end_headers()

        self.wfile.write(resbody)

class TCPServer(socketserver.TCPServer):
    # Allow restarting the mock server without needing to wait for the socket
    # to leave TIME_WAIT: we only listen locally, and we may restart often in
    # some workflows
    allow_reuse_address = True

httpd = TCPServer(("", PORT), Handler)

print("serving at port", PORT)
httpd.serve_forever()
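
For a quick smoke test of this server, here is a small client sketch (not part of Enrico's post) that POSTs a file and checks the JSON reply; it assumes the server above is running locally on port 8081 and that the third-party requests library is installed:

#!/usr/bin/python3

import hashlib

import requests  # third-party; assumed available

payload = b"hello world"

# Send a multipart/form-data POST with a single file field.
resp = requests.post(
    "http://localhost:8081/",
    files={"upload": ("hello.txt", payload)},
)
info = resp.json()

# The mock server reports the file's name, size and SHA-256 instead of its body.
assert info["method"] == "POST"
assert info["postdata"]["upload"]["name"] == "hello.txt"
assert info["postdata"]["upload"]["size"] == len(payload)
assert info["postdata"]["upload"]["sha256"] == hashlib.sha256(payload).hexdigest()
print("mock server reply:", info)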

Thorsten Glaser: d-i preseeding is not the answer

26 November, 2014 - 00:00

This post details what the d-i team currently shows as the only way.

It has several shortcomings and one missing documentation part.

Shortcoming: --purge is missing from the apt-get invocation. This leaves packages in “rc” state (requiring a manual dpkg --purge to completely remove them later, as they are then invisible to apt).

Worse shortcoming: this still leaves all dependencies pulled in by systemd around on the system, because packages installed by debootstrap are not eligible for “apt-get --purge autoremove”. Additionally, it does not influence debootstrap’s (nōn-existent, see #557322, #668001, #768062) dependency resolver, leading to possibly pessimistic package selections.

Missing: you can just hit Alt-F2 and enter the command…

	in-target apt-get --purge -y install sysvinit-core
 

… there, no need to preseed. But this does not eliminate the aforementioned shortcomings, of course.

Scott Kitterman: On being excellent to each other

25 November, 2014 - 23:47

There has been a lot of discussion recently where there is strong disagreement, even about how to discuss the disagreement. Here’s a few thoughts on the matter.

The thing I personally find most annoying is that when someone thinks what someone else says is inappropriate and says so, the inevitable response seems to be to scream censorship. When people do that, I'm pretty sure they don't know what the word censorship actually means. Debian/Ubuntu/Insert Project Name Here resources are not public spaces, and no government is telling people what they can and can't say.

When you engage in speech and people respond to that speech, even if you don’t feel all warm and fuzzy after reading the response, it’s not censorship. It’s called discussion.

When someone calls out speech that they think is inappropriate, the proper response is not to blame a Code of Conduct or some other set of rules. Projects that have a code also have a process for dealing with claims that the code has been violated. Unless someone invokes that process (which almost never happens), the code is irrelevant. What's relevant is that someone is having a problem with what you are saying or how you are saying it, and is in some way hurt by it.

Let's focus on that. The rules are irrelevant; what matters is working together in a collegial way. I really don't think project members actively want other project members to feel bad/unsafe, but it's hard to get outside one's own defensive reaction to being called out. So please pay less attention to how you're feeling about things and try to see things from the other side. If we can all do a bit more of that, then things can be better for all of us.

Final note: If you’ve gotten this far and thought “Oh, that other person is doing this to me”, I have news for you – it’s not just them.

Chris Lamb: Validating Django model attribute assignment

25 November, 2014 - 21:54

Ever done the following?

>>> user = User.objects.get(pk=102)
>>> user.superuser = True
>>> user.save()

# Argh, why is this user now not a superuser...

Here's a dirty hack to validate these:

import sys

from django.db import models
from django.conf import settings

FIELDS = {}
EXCEPTIONS = {
    'auth.User': ('backend',),
}

def setattr_validate(self, name, value):
    super(models.Model, self).__setattr__(name, value)

    # Real field names cannot start with underscores
    if name.startswith('_'):
        return

    # Magic
    if name == 'pk':
        return

    k = '%s.%s' % (self._meta.app_label, self._meta.object_name)
    try:
        fields = FIELDS[k]
    except KeyError:
        fields = FIELDS[k] = set(
            getattr(x, y) for x in self._meta.fields
            for y in ('attname', 'name')
        )

    # Field is in allowed list
    if name in fields:
        return

    # Field is in known exceptions
    if  name in EXCEPTIONS.get(k, ()):
        return

    # Always allow Django internals to set values (eg. aggregates)
    if 'django/db/models' in sys._getframe().f_back.f_code.co_filename:
        return

    raise ValueError(
        "Refusing to set unknown attribute '%s' on %s instance. "
        "(Did you misspell %s?)" % (name, k, ', '.join(fields))
    )

# Let's assume we have good test coverage
if settings.DEBUG:
    models.Model.__setattr__ = setattr_validate

Now:

>>> user = User.objects.get(pk=102)
>>> user.superuser = True
...
ValueError: Refusing to set unknown attribute 'superuser' on auth.User instance. (Did you misspell 'username', 'first_name', 'last_name', 'is_active', 'email', 'is_superuser', 'is_staff', 'last_login', 'password', 'id', 'date_joined')

(Django can be a little schizophrenic on this — Model.save()'s update_fields keyword argument validates its fields, as does prefetch_related, but it's taking select_related a little while to land.)
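
Since the guard is explicitly pitched at test coverage, here is a minimal sketch of a test exercising it. This is not from the original post: it assumes the snippet above has been imported with the patch active (for example with DEBUG forced on in test settings) and that pytest plus pytest-django are available.

import pytest

from django.contrib.auth.models import User


@pytest.mark.django_db
def test_misspelled_attribute_is_rejected():
    user = User.objects.create(username="example")
    # The real field is is_superuser, so this assignment should be refused.
    with pytest.raises(ValueError):
        user.superuser = True
    # The correctly spelled attribute is still accepted.
    user.is_superuser = True
    user.save()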

Dirk Eddelbuettel: Rcpp now used by 300 CRAN packages

25 November, 2014 - 19:14

This morning, Rcpp reached another round milestone: 300 packages on CRAN now depend on it (as measured by Depends, Imports and LinkingTo declarations). The graph on the left depicts the growth of Rcpp usage over time. There are 41 more packages on BioConductor (which is not included in the chart).

The first and less detailed part uses manually saved entries; the second half of the data set was generated semi-automatically via a short script appending updates to a small file-based backend. A list of user packages is kept on this page.

Also displayed in the graph is the relative proportion of CRAN packages using Rcpp. The four per-cent hurdle was cleared just before useR! 2014 where I showed a similar graph (as two distinct graphs) in my invited talk. We may well hit five per-cent before the end of the year.

300 is a pretty humbling and staggering number. It is also interesting that we cleared 200 only at the end of April, and 250 in early August.

So from everybody behind Rcpp, a heartfelt Thank You! to all the users and of course other contributors.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

DebConf team: DebConf14 final report (Posted by Uli Scholler, and the DebConf Team)

25 November, 2014 - 15:48

The Final Report for DebConf14 is complete and the DebConf team proudly presents it to the world.

DebConf14, which was held in Portland, Oregon, USA, in August 2014, was a big success. Our final report captures the essence of this year’s conference in pictures and words:

  • talks and how we selected them
  • face-to-face meetings and their effect on building trust
  • events such as the day trip or the infamous cheese & wine party
  • the university venue
  • a selection of attendees' impressions

And of course there are numbers, budget, and statistics.

Read, enjoy, and share!

The DebConf team

Erich Schubert: Installing Debian with sysvinit

25 November, 2014 - 15:24
First let me note that I am using systemd, so these things here are untested by me. See e.g. Petter's and Simon's blog entries on the same overall topic. According to the Debian installer maintainers, the only accepted way to install Debian with sysvinit is to use preseeding. This can either be done at the installer boot prompt by manually typing the magic spell:
preseed/late_command="in-target apt-get install -y sysvinit-core"
or by using a preseeding file (which is a really nice feature I used for installing my Hadoop nodes) to do the same:
d-i preseed/late_command string in-target apt-get install -y sysvinit-core
If you are a sysadmin, using preseeding can save you a lot of typing. Put all your desired configuration into preseeding files, put them on a webserver (best with a short name resolvable by local DNS). Let's assume you have set up the DNS name d-i.example.com, and your DHCP is configured such that example.com is on the DNS search list. You can also add a vendor extension to DHCP to serve a full URL. Manually enabling preseeding then means adding
auto url=d-i
to the installer boot command line (d-i is the hostname I suggested setting up in your DNS before, and the full URL would then be http://d-i.example.com/d-i/jessie/./preseed.cfg). Preseeding is well documented in Appendix B of the installer manual, but it will nevertheless require a number of iterations to get everything working as desired for a fully automatic install like the one I used for my Hadoop nodes. There might be an easier option.
I have filed a wishlist bug suggesting to use the tasksel mechanism to allow the user to choose sysvinit at installation time. However, it got turned down by the Debian installer maintainers quite rudely with a "No." - essentially a "shut the f... up and go away", which is in my opinion an inappropriate way to discard a reasonable user wishlist request. Since I don't intend to use sysvinit anymore, I will not be pursuing this option further. It is, as far as I can tell, still untested. If it works, it might be the least-effort, least-invasive option to allow the installation of a sysvinit Jessie (except for the above command-line magic). If you have an interest in sysvinit, you (not me, because I don't use sysvinit) should now test whether this approach works.
  1. Get the patch proposed to add a task-sysvinit package.
  2. Build an installer CD with this tasksel (maybe this documentation is helpful for this step).
  3. Test whether the patch works. Report results to above bug report, so that others interested in sysvinit can find them easily.
  4. Find and fix bugs if it didn't work. Repeat.
  5. Publish the modified ("forked") installer, and get user feedback.
If you are then still up for a fight, you can try to convince the maintainers (or go the nasty way, and ask the CTTE for their opinion, to start another flamewar and make more maintainers give up) that this option should be added to the mainline installer. And hurry up, or you may at best get this into Jessie reloaded, 8.1 - chances are that the release manager will not accept such patches this late anymore. The sysvinit supporters should have investigated this option much, much earlier instead of losing time on the GR. Again, I won't be doing this job for you. I'm happy with systemd. But patches and proofs of concept are what make open source work, not GRs and MikeeUSA's crap videos spammed to the LKML...

(And yes, I am quite annoyed by the way the Debian installer maintainers handled the bug report. This is not how open-source collaboration is supposed to work. I tried to file a proper wishlist bug report, suggesting a solution that I could not find discussed anywhere before, and got back just this "No. Shut up." answer. I'm not sure if I will be reporting a bug in debian-installer ever again, if this is the way they handle bug reports...)

I do care about our users, though. If you look at popcon "vote" results, we have 4179 votes for sysvinit-core and 16918 votes for systemd-sysv (graph), indicating that of those already testing jessie and beyond - neglecting 65 upstart votes, and assuming that there is no bias against upgrading if you prefer sysvinit - about 20% appear to prefer sysvinit (in fact, they may even have manually switched back to sysvinit after being upgraded to systemd unintentionally?). These are users that we should listen to, and that we should consider adding an installer option for, too.

Gunnar Wolf: 10 PRINT CHR$(205.5+RND(1)); : GOTO 10 (also known as #10print )

25 November, 2014 - 12:41

The line of BASIC code that appears as the subject of this post is the title of a book I just finished reading, and enjoyed thoroughly. The book is available online for download under a CC-BY-NC-SA 3.0 License, so you can take a good look at it before (or instead of) buying it. It is, though, among the books I will enjoy having on my shelf; the printing is of very good quality.

And what is this book about? Well, of course, it analyzes that very simple line of code, as it ran on the Commodore 64 thirty years ago.

And the analysis is made from every possible angle: What do mazes mean in culture? What have they meant in cultures through history? What about regularity in art (mainly 20th century art)? How would this code look (or how would it be adapted) on contemporary non-C64 computers? And in other languages more popular today? What does randomness mean? And what does random() mean? What is BASIC, and how did it come to the C64? What is the C64, and where did it come from? And several other beautiful chapters.

The book was collaboratively written by ten different authors, in a Wiki-like fashion. And... Well, what else is there to say? I enjoyed so much reading through long chapters of my childhood, of what attracted me to computers, of my cultural traits and values... I really hope that, in due time, I can be a part of such a beautiful project!

Kenshi Muto: Bug #668001

25 November, 2014 - 10:00

If the bug title of #668001 was not "debootstrap: cant install systemd instead of sysvinit", but was like "debootstrap ignores everything from the first pipe character to the end of Depends/Pre-Depends line.", it would be treated more carefully ;)

My patch posting #20 aims to fix it.

Well, I hope this bug will be solved in jessie+1 or backports.

Dirk Eddelbuettel: YATORP -- Yet Another Tutorial on R Packaging

25 November, 2014 - 09:42

What the world needs right now is yet another tutorial on R packages and their creation. Luckily, this last Friday and Saturday, I had the opportunity to present in a workshop organized by Frank DiTraglia at Penn's shiny new Warren Center, and held at Wharton.

Given the Warren Center's focus, the workshop centered around Big Data and Open Science with R. Yihui Xie and I alternated in delivering four units: an Introduction to R, Writing R packages, Dynamic Documents with R, and HPC with Rcpp and RcppArmadillo.

So I had to come up with a plan for teaching how to create R packages -- and decided to do it from the very bottom up, clearly introducing the underlying R CMD ... commands and only then switching to taking advantage of an environment such as the RStudio IDE.

The resulting slides are now available on my presentations page. The code examples are in a repo subdirectory on GitHub as well. While both were designed to support the parallel live instruction offered in the workshop, I would be interested in feedback (preferably via email) about how useful the slides are by themselves.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Rogério Brito: Problems with Emacs 24.4

25 November, 2014 - 00:16

This is, essentially, a call for help, as I don't really know which program is at fault here.

Given that Emacs's upstream converted their repository from bzr to git, all the commits in mirror repositories became "invalid" in relation to the official repository.

What does this mean in practical terms, in my case? Well, bear with me while I try to report my steps.

Noticing a regression and reporting a bug

There is a regression with Emacs 24.4 relative to 24.3, which I discovered after Emacs 24.4 became available in Debian's sid.

The regression in particular is that Emacs 24.4 doesn't seem to respect my Xresources, while 24.3 does (and this is 100% reproducible: I kept the binary packages of version 24.3 of emacs24 and I can install and reinstall things).

When I reported this to upstream, I received a reply that it worked fine for another person who was using XFCE on unstable.

Testing various Desktop environments

As I am using the MATE desktop environment, I proceeded to test this assertion by installing XFCE. Emacs 24.4 read my Xresources. I went ahead and installed LXDE. It worked again. I tried once more with GNOME 3, but "regular" GNOME 3 just crashed. I tried with GNOME 3 Classic and Emacs 24.4 just worked again.

Going deep into the rabbit's hole

Then, I got more curious and I tried to see why things worked the way that they did and given that there was a mirror of the Emacs repo on github, I cloned it and started to git bisect to find the first problematic commit (I have no idea if bzr even offers something like git's bisect and I wouldn't really know how to do it as quickly as I do with git).

To cut a long story short, after many recompiles, many wasted hours, and a lot of wasted electrical energy, I found a bad commit and reported it.

I received no response after that.

The new repo enters in action

Of course, all my hard work bisecting things was completely invalidated after the transition to the new repository went live.

To make things relevant again, I used the awesome powers of git, restricting the changes of the newly cloned repository to the e-mail of the committer in question (Chong Yidong) and, from there, I proceeded to another painful process of git bisects.

And, sure enough, the first bad commit was the same one that I found with the previous tree.

Semi-blindly reverting this commit, and also semi-blindly resolving the conflicts, makes Emacs from master work again on my system, but I highly suspect that (given the way I did it) it would not really be appropriate for upstream.

But given also that I failed to receive feedback after my original report, I am not too confident that this bug can be solved soon (even if it doesn't qualify for being fixed in Debian 8).

After all this, I don't really know what else to do. I even filed a bug report (more like a request for help) to the Debian MATE maintainers.

As a side note, I would have filed a bug to upstream MATE, but it is not really clear what the proper procedure to report bugs to them is---they seem to use github's issues system, but given that they have separate repositories for each component of the project, and that I don't know precisely what repository to report to (or even if it applies to MATE after all), I am more or less paralyzed.

A side note

I must say that the conversion was well done by Eric Raymond, because the whole .git repository of the new repo is only about 200MB, with history going back to 1985, while the other repository had about 800GB.

John Goerzen: My boys love 1986 computing

24 November, 2014 - 10:27

Yesterday, Jacob (age 8) asked to help me put together a 30-year-old computer from parts in my basement. Meanwhile, Oliver (age 5) asked Laura to help him learn cursive. Somehow, this doesn’t seem odd for a Saturday at our place.

Let me tell you how this came about.

I've had a project going on for a while now to load data from old floppies. It's been fun, and had a surprise twist the other day: my parents gave me an old TRS-80 Color Computer II (aka “CoCo 2”). It was, in fact, my first computer, one they got for me when I was in Kindergarten. It is nearly 30 years old.

I have been musing lately about the great disservice Apple did the world by making computers easy to learn — namely the fact that few people ever bother to learn about them. Who bothers to learn about them when, on the iPhone for instance, the case is sealed shut, the lifespan is 1 or 2 years for many purchasers, and the platform is closed in lots of ways?

I had forgotten how finicky computers used to be. But after some days struggling with IDE incompatibilities, booting issues, etc., when I actually managed to get data off a machine that had last booted in 1999, I had quite the sense of accomplishment, which I rarely have lately. I did something that was hard to do in a world where most of the interfaces don’t work with equipment that old (even if nominally they are supposed to.)

The CoCo is one of those computers normally used with a floppy drive or cassette recorder to store programs. You type DIR, and you feel the clack of the drive heads through the desk. You type CLOAD and you hear the relay click closed to turn on the tape motor. You wiggle cables around until they make contact just right. You power-cycle for the times when the reset button doesn’t quite do the job. The details of how it works aren’t abstracted away by innumerable layers of controllers, interfaces, operating system modules, etc. It’s all right there, literally vibrating your desk.

So I thought this could be a great opportunity for Jacob to learn a few more computing concepts, such as the difference between mass storage and RAM, plus a great way to encourage him to practice critical thinking. So we trekked down to the basement and came up with handfuls of parts. We brought up the computer, some joysticks, all sorts of tangled cables. We needed adapters, an old TV. Jacob helped me hook everything up, and then the moment of truth: success! A green BASIC screen!

I added more parts, but struck out when I tried to connect the floppy drive. The thing just wouldn’t start up right whenever the floppy controller cartridge was installed. I cleaned the cartridge. I took it apart, scrubbed the contacts, even did a re-seat of the chips. No dice.

So I fired up my CoCo emulator (xroar), and virtually “saved” some programs to cassette (a .wav file). I then burned those .wav files to an audio CD, brought up an old CD player from the basement, connected the “cassette in” plug to the CD player’s headphone jack, and presto — instant programs. (Well, almost. It takes a couple of minutes to load a program from audio codes.)

The picture above is Oliver cackling at one of the very simplest BASIC programs there is: “number find.” The computer picks a random number between 1 and 2000, and asks the user to guess it, giving a “too low” or “too high” clue with each incorrect guess. Oliver delighted in giving invalid input (way too high numbers, or things that weren’t numbers at all) and cackled at the sarcastic error messages built into the program. During Jacob’s turn, he got very serious about it, and is probably going to be learning about how to calculate halfway points before too long.

But imagine my pride when this morning, Jacob found the new CD I had made last night (correcting a couple recordings), found my one-line instruction on just part of how to load a program, and correctly figured out by himself all the steps to do in order (type CLOAD on the CoCo, advance the CD to the proper track, press play on the player, wait for it to load on the CoCo, then type RUN).

I ordered a replacement floppy controller off eBay tonight, and paid $5 for a coax adapter that should fix some video quality issues. I rescued some 5.25″ floppies from my trash can from another project, so they should have plenty of tools for exploration.

It is so much easier for them to learn how a disk drive works, and even what the heck a track is, when you can look at a floppy drive with the cover off and see the heads move. There are other things we can do with more modern equipment — Jacob has shown a lot of interest in Arduino projects — but I have so far drawn a blank on ways to really let kids discover how a modern PC (let alone a modern phone or tablet) works.

Dimitri John Ledkov: Analyzing public OpenPGP keys

24 November, 2014 - 04:15
The OpenPGP Message Format (RFC 4880) defines key structures and wire formats (OpenPGP packets) well. Thus, when I looked into public key network (SKS) server setup, I quickly found pointers to dump files in said format for bootstrapping a key server.

I did not feel like experimenting with Python and instead opted for Go, and found the http://code.google.com/p/go.crypto/openpgp/packet library, which has comprehensive support for parsing OpenPGP low-level structures. I've downloaded the SKS dump, verified its MD5SUM hashes (lolz), and went ahead to process the files in Go.

With help from http://github.com/lib/pq and database/sql, I've written a small program to churn through all the dump files, filter for primary RSA keys (not subkeys) and inject them into a database table. The things that I have chosen to inject are fingerprint, N, E. N & E are the modulus of the RSA key pair and the public exponent. Together they form a public part of an RSA keypair. So far, nothing fancy.

Next I've run an SQL query to see how unique things are... and found 92 unique N & E pairs that have from two up to fifteen duplicates each. In total that is 231 unique fingerprints which use key material with a known duplicate in the public key network. That didn't sound good. And also odd - given that over 940 000 other RSA keys managed to gather enough entropy to pull a unique key out of the keyspace haystack (which is humongously huge, by the way).
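
The check itself boils down to grouping fingerprints by their (N, E) material. The original work used Go plus an SQL query; the following is only a small Python sketch of the same idea, with made-up toy values:

from collections import defaultdict

def find_shared_moduli(keys):
    """Group primary-key fingerprints by their (N, E) public material."""
    by_material = defaultdict(list)
    for fingerprint, n, e in keys:
        by_material[(n, e)].append(fingerprint)
    # Keep only the (N, E) pairs used by more than one fingerprint.
    return {pair: fps for pair, fps in by_material.items() if len(fps) > 1}

# Toy example: two "keys" sharing the same modulus and exponent.
keys = [
    ("FP1", 0xC0FFEE, 65537),
    ("FP2", 0xC0FFEE, 65537),
    ("FP3", 0xBEEF, 65537),
]
for (n, e), fps in find_shared_moduli(keys).items():
    print("shared key material used by:", ", ".join(fps))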

Having the list of the keys, I've fetched them and they do not look like regular keys - their UIDs do not have names & emails, instead they look like something from the monkeysphere. The keys look like they are originally used for TLS and/or SSH authentication, but were converted into OpenPGP format and uploaded into the public key server. This reminded me of the Debian's SSL key generation vulnerability CVE-2008-0166. So these keys might have been generated with bad entropy due to affected tools by that CVE and later converted to OpenPGP.

Looking at the openssl-blacklist package, it should be relatively easy for me to generate all possible RSA key-pairs, and I believe all the other material that is hashed to generate the fingerprint is also available (RFC 4880#12.2). Thus it should be reasonably possible to generate matching private keys, generate revocation certificates and publish the revocation certificates with pointers to CVE-2008-0166. (Or email them to the people who have signed the given monkeysphered keys.) When I have a minute I will work on generating openpgp-blacklist-type scripts to address this.

If anyone is interested in the Go source code I've written to process openpgp packets, please drop me a line and I'll publish it on github or something.

Steinar H. Gunderson: Scaling analysis.sesse.net

24 November, 2014 - 03:06

As I previously mentioned, I've been running live chess analysis during the Carlsen–Anand World Chess Championship match. Now it's all over (congratulations to Magnus!), so I thought I should write a few words about scaling, as we ended up peaking at (I think) 1527 simultaneous users, much more than the system was originally designed for (2–3 or so :-) ).

Let me explain first the requirements. Generally the backend system outputs analysis as soon as it comes in from the two chess engines (although rate-limited so that it doesn't output more than once a second), and we want to push this out to the clients as soon as possible. The clients are regular web browsers (both on mobile and on desktop; I haven't checked the ratio) running a fair amount of JavaScript; they generally request an URL in a loop, and whenever something comes in, they display it on the chessboard, wait 100 ms (just as a safeguard) and then go fetch again.

Of course, I could have just had the clients ask every second, but it seems inelegant and a bit wasteful, especially for mobile. If the analysis has come far, or even has stopped entirely since e.g. the game is over, there's no need to go fetch the same data over and over again. Instead, what I want is a system where, if the client already has the latest data, the HTTP request hangs until there's more data, and then gets it immediately. Together with this, there's also a special header that says how many people are connected (which is also shown to the viewers). If a client has been hanging/sleeping for more than 30 seconds, I just re-send the latest analysis; in this world of NATs, transparent proxies and other unpredictable network conditions, I don't want to have connections hanging for minutes with no idea of whether I can actually answer them when the time comes.

The client tells the server (in a CGI parameter, again for simplicity so that I don't have to deal with caching proxies etc. on the way) in the request the timestamp of the latest data it has. This leads to the following different scenarios:

  1. A client comes in and has no existing data. They should get the latest data, immediately.
  2. A client has old data, and re-asks. They should also get the latest data, immediately.
  3. A client already has the newest data, which causes it to hang, and no new data is ready within 30 seconds. They should get the latest data anew (or I could have returned some other HTTP code, but I decided not to get fancy).
  4. A client already has the newest data, but new data comes in while the request is hanging. They should get the new data.
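
To make the client behaviour concrete, here is a rough Python sketch of that polling loop (the real clients are JavaScript in the browser; the URL, the name of the CGI parameter and the header names below are made up for illustration):

import time

import requests  # third-party; assumed available

URL = "http://analysis.example.net/analysis"  # hypothetical endpoint
latest_ts = None

while True:
    # Tell the server which data we already have; it hangs for up to ~30 s
    # if we are current, so the client-side timeout must be larger than that.
    params = {"ts": latest_ts} if latest_ts else {}
    resp = requests.get(URL, params=params, timeout=60)
    latest_ts = resp.headers.get("X-Analysis-Timestamp")  # hypothetical header
    viewers = resp.headers.get("X-Num-Viewers")           # hypothetical header
    print("new analysis (%s viewers): %s..." % (viewers, resp.text[:60]))
    time.sleep(0.1)  # the 100 ms safeguard mentioned above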

Unsurprisingly, this leads to a lot of clients being in the “hanging” state at the same time, and then when new analysis comes in, there's a thundering herd of clients that should have it at the same time (and then come back for more soon after).

Like I wrote earlier, Node.js was pretty much the ideal case model-wise for this; there's only one process to handle all of them, which means the extra memory overhead per hanging client is very low, and when there's new data, we can just send to all of them after each other. Furthermore, since there's only one process, it is also easy to find the viewer count: Simply count all the hanging clients, plus the ones that I think are simply processing the latest data and should come back with a new request soon (the limit was five seconds or so).

However, around 6–700 clients, I started getting issues with requests not coming through. It turns out that the single Node.js process just couldn't handle all that many clients and started hitting the roof CPU-wise. Everyone who's done a bit of performance work in nontrivial systems knows that you can't really optimize anything without profiling it first, but unfortunately, Node.js was extremely limited here. There were some systems that send lots of data to external services, which I didn't really feel like using. Then, there were some systems that try to interpret V8's debug output logs, and they simply gave out bogus answers.

In particular, they claimed 93% of my time was spent in glibc, but couldn't say where in glibc, and when I pointed perf at them, it was pretty clear that most of the time was actually spent in JavaScript and V8 support functions. I took a guess at my viewer count functions, optimized so I didn't have to count as often, and it helped—but I still wasn't really confident it would scale very far. Usually people start up more Node.js workers and then have some load balancer in front, but it would make the viewer counting much more complicated, and the CPU would need to be shared with the chess engine itself, so I wasn't happy with the “just give it more cores” approach either.

So I turned to everyone's favorite website scaling tool, Varnish. With lots of help from Lasse (Varnish Software) and Tollef (ex-Varnish Software, now Fastly), I got things working; it was a sort of bumpy road, though, especially as I hit two different crash bugs in Varnish 4.0.2 that only manifested themselves under actual production load. Here's what I ended up with (running on git master):

The first thing to realize is that we're not trying to keep backend traffic entirely minimal, just cut it significantly. For instance, if Varnish sees #1 (no existing data) and #2 (old existing data) as different and fires off two different backend requests for them, it doesn't really matter. However, we really want all the hanging clients to get the same backend connection; thankfully, Varnish gives us this entirely by itself with its backend coalescing; if it has a backend request going and another one comes in for the same URL, it simply puts the second one on the sleep list and gives both the same response when it comes back. Also, if a slow or new client doesn't manage to get onto the hanging request (i.e., it comes in after the backend response arrived), it should simply be served entirely out of Varnish' cache.

A lot of this comes automatically, and some of it comes with some cooperation between the backend and the VCL. In particular, we can let the client tag the response with the timestamp of the data, and once something comes in, simply purge/ban every object with a different timestamp header, causing us to never give out stale data from the Varnish cache. Varnish bans can be a bit tricky since they're checked lazily, and if you're not entirely careful, you can end up with a very long ban list, but it seems to work well in practice.

However, the distinction between #3 and #4 gives us a problem. We now have a situation where people ask for an URL, it times out after 30 seconds and gives us a response... which we then should give out to everybody, but the next time they ask, we should hang again! This was incredibly tricky to get right; the combination of TTL, Expires headers, grace, and the problem of clock skew for long-running requests (what exactly is the Date timestamp supposed to mean; request received, first byte of backend response, or something else?) and so on was just too much for me. Eventually I got tired of reading the Varnish source code (which, frankly, I find quite opaque) and decided to just sidestep the problem. Now, instead of the 30-second timeout, the backend simply just touches the data file every 30 seconds if it hasn't been changed, so it gets a new timestamp every time. Problem solved.

Finally, there's the counting problem; the backend doesn't see all the requests anymore, so we need a different way of counting. I solve this by tailing the access logs (using varnishncsa) in a separate process, comparing to the updates of the analysis file, and then trying to figure out if they've fallen out or not. Then I simply inject the viewer count into the backend every second. Problem solved, again. (Well, at low traffic numbers, seemingly there's some sort of buffering somewhere, causing me to see the requests way too late, and this causes the count to oscillate down between 1 and the real number somehow. But I don't care too much right now.)
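
The counting side could look something like the following sketch; it is not the author's actual script, just an illustration of tailing varnishncsa output and counting clients seen within a recent window (the window length and the reporting are assumptions):

import subprocess
import time

WINDOW = 40  # seconds; roughly "hung for up to 30 s" plus a bit of slack

last_seen = {}
# varnishncsa prints one NCSA-style log line per request; the client address
# is the first field of each line.
proc = subprocess.Popen(["varnishncsa"], stdout=subprocess.PIPE,
                        universal_newlines=True)

for line in proc.stdout:
    client = line.split(" ", 1)[0]
    now = time.time()
    last_seen[client] = now
    # Forget clients that have not been seen recently, then report the rest.
    for old in [c for c, t in last_seen.items() if now - t >= WINDOW]:
        del last_seen[old]
    print("viewers:", len(last_seen))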

So, there you have it. Varnish' threaded architecture isn't super for this kind of thing; in a sense, much less than Node.js. However, even in this day and age, optimized C beats JavaScript any day of the week; seemingly by a factor five or so. In the end, the 1500 clients were handled with CPU usage of about 40% of one core. I don't really like the fact that it needs ~1500 worker threads for those 1500 clients (I had to increase it from the default of 1024 in order to keep the HTTP errors away), but I used taskset to restrict it to two physical cores in order not to disturb the chess worker threads too much (they are already rather sensitive to the kernel's scheduling decisions).

So, how far can it go? Well, those 1500 clients needed about 33 Mbit/sec, so we can go to ~45k based on bandwidth (the server is on gigabit). At that point, though, I sincerely doubt that both Varnish and the chess engine can keep going on the same machine, so I'd have to move it externally. So next up, maybe Fastly? Well, at least if they start supporting IPv6.

You can find all the code, including the Varnish snippets, in the git repository. Until next time—perhaps WCC 2016! Waiting for Carlsen–Caruana. :-)

Final bonus: Munin graphs. Everyone loves Munin graphs; it's the Comic Sans of system administration.

Iustin Pop: Debian, Debian…

23 November, 2014 - 16:23

Due to some technical issues, I've been without access to my lists subscription email for a bit more than a week. Once I regained access and proceeded to read the batch of emails, I was - once again - shocked. Shocked at the amount of emails spent on the systemd issue, shocked at the number of people resigning, shocked at the amount of mud thrown.

I just hope that the GR results will finally mean silence and getting over the last 3-6 months.

For the record:

  • I seconded the GR because I believed we were moving too fast (I wanted one full release as a transition period, even if that's a long time)
  • I am quite happy with the result of the GR!
  • I am not happy with the amount of people leaving (I hope they're just taking a break)
  • I am, as usual, behind on my Debian packaging ☹

However, some of the more recent emails on -private give me more hope, so I'm looking forward to the next 6 months. I wonder how this will all look in two years?

(Side-note: emacs-nox shows me the italic word above as italic in text mode: I had never seen that before, and didn't know that it's possible to have italic fonts in xterm! What is this trickery⁈ … it seems to be related to the font I use, fun!)


Creative Commons License: copyright of each article belongs to its respective author.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported license.