Planet Debian


Christoph Egger: Installing a python systemd service?

1 hour 53 min ago

As web search engines and IRC seem to be of no help, maybe someone here has a helpful idea. I have some service written in python that comes with a .service file for systemd. I now want to build and install a working service file from the software's setup.py. I can override the build/build_py commands of setuptools, however that way I still lack knowledge of the bindir/prefix where my service script will be installed. Ideas?
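
One possible direction (an untested sketch; "myservice" and the unit contents are placeholders, not the actual project): subclass the setuptools install command, which does know the resolved bindir/prefix at install time, and generate the .service file from a template there:

# Hypothetical sketch: generate the unit at install time, when the
# final script directory (install_scripts) is known.
import os
from setuptools import setup
from setuptools.command.install import install

SERVICE_TEMPLATE = """\
[Unit]
Description=My service

[Service]
ExecStart={execstart}
"""

class install_with_service(install):
    def run(self):
        install.run(self)
        # self.install_scripts is the resolved bindir for entry points,
        # self.prefix the install prefix -- both known at this point.
        unit = SERVICE_TEMPLATE.format(
            execstart=os.path.join(self.install_scripts, "myservice"))
        unit_dir = os.path.join(self.prefix, "lib/systemd/system")
        self.mkpath(unit_dir)
        with open(os.path.join(unit_dir, "myservice.service"), "w") as f:
            f.write(unit)

setup(name="myservice", cmdclass={"install": install_with_service})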

Steinar H. Gunderson: Why does software development take so long?

5 hours 39 min ago

Nageru 1.4.0 is out (and on its way through the Debian upload process right now), so now you can do live video mixing with multichannel audio to your heart's content. I've already blogged about most of the interesting new features, so instead, I'm trying to answer a question: What took so long?

To be clear, I'm not saying 1.4.0 took more time than I really anticipated (on the contrary, I pretty much understood the scope from the beginning, and there was a reason why I didn't go for building this stuff into 1.0.0); but if you just look at the changelog from the outside, it's not immediately obvious why “multichannel audio support” should take the better part of three months of development. What I'm going to say is of course going to be obvious to most software developers, but not everyone is one, and perhaps my experiences will be illuminating.

Let's first look at some obvious things that aren't the case: First of all, development is not primarily limited by typing speed. There are about 9,000 lines of new code in 1.4.0 (depending a bit on how you count), and if it were just about typing them in, I would be done in a day or two. On a good keyboard, I can type plain text at more than 800 characters per minute—but you hardly ever write code for even a single minute at that speed. Just as when writing a novel, most time is spent thinking, not typing.

I also didn't spend a lot of time backtracking; most code I wrote actually ended up in the finished product as opposed to being thrown away. (I'm not as lucky in all of my projects.) It's pretty common to do so if you're in an exploratory phase, but in this case, I had a pretty good idea of what I wanted to do right from the start, and that plan seemed to work. This wasn't a difficult project per se; it just needed to be done (which, in a sense, just increases the mystery).

However, even if this isn't at the forefront of science in any way (most code in the world is pretty pedestrian, after all), there's still a lot of decisions to make, on several levels of abstraction. And a lot of those decisions depend on information gathering beforehand. Let's take a look at an example from late in the development cycle, namely support for using MIDI controllers instead of the mouse to control the various widgets.

I've kept a pretty meticulous TODO list; it's just a text file on my laptop, but it serves the purpose of a ghetto bugtracker. For 1.4.0, it contains 83 work items (a single-digit number of them are not ticked off, mostly because I decided not to do those things), which corresponds roughly 1:2 to the number of commits. So let's have a look at what the ~20 MIDI controller items went into.

First of all, to allow MIDI controllers to influence the UI, we need a way of getting to it. Since Nageru is single-platform on Linux, ALSA is the obvious choice (if not, I'd probably have to look for a library to put in-between), but seemingly, ALSA has two interfaces (raw MIDI and sequencer). Which one do you want? It sounds like raw MIDI is what we want, but actually, it's the sequencer interface (it does more of the MIDI parsing for you, and generally is friendlier).

The first question is where to start picking events from. I went the simplest path and just said I wanted all events—anything else would necessitate a UI, a command-line flag, figuring out if we wanted to distinguish between different devices with the same name (and not all devices potentially even have names), and so on. But how do you enumerate devices? (Relatively simple, thankfully.) What do you do if the user inserts a new one while Nageru is running? (Turns out there's a special device you can subscribe to that will tell you about new devices.) What if you get an error on subscription? (Just print a warning and ignore it; it's legitimate not to have access to all devices on the system. By the way, for PCM devices, all of these answers are different.)

So now we have a sequencer device, how do we get events from it? Can we do it in the main loop? Turns out it probably doesn't integrate too well with Qt, but it's easy enough to put it in a thread. The class dealing with the MIDI handling now needs locking; what mutex granularity do we want? (Experience will tell you that you nearly always just want one mutex. Two mutexes give you all sorts of headaches with ordering them, and nearly never gives any gain.) ALSA expects us to poll() a given set of descriptors for data, but on shutdown, how do you break out of that poll to tell the thread to go away? (The simplest way on Linux is using an eventfd.)
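
As an aside, the eventfd pattern is small enough to sketch; here it is in Python rather than Nageru's C++ (os.eventfd needs Python 3.10+ on Linux, and handle_midi_events is a made-up placeholder):

import os
import select

shutdown_fd = os.eventfd(0)  # becomes readable once written to

def midi_thread(seq_fds):
    poller = select.poll()
    for fd in seq_fds:
        poller.register(fd, select.POLLIN)
    poller.register(shutdown_fd, select.POLLIN)
    while True:
        for fd, _event in poller.poll():
            if fd == shutdown_fd:
                return  # main thread asked us to shut down
            # NB: drain *all* pending events here; see the quirk below.
            handle_midi_events(fd)

# From the main thread, to stop the worker:
#   os.eventfd_write(shutdown_fd, 1)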

There's a quirk where if you get two or more MIDI messages right after each other and only read one, poll() won't trigger to alert you there are more left. Did you know that? (I didn't. I also can't find it documented. Perhaps it's a bug?) It took me some looking into sample code to find it. Oh, and ALSA uses POSIX error codes to signal errors (like “nothing more is available”), but it doesn't use errno.

OK, so you have events (like “controller 3 was set to value 47”); what do you do about them? The meaning of the controller numbers is different from device to device, and there's no open format for describing them. So I had to make a format describing the mapping; I used protobuf (I have lots of experience with it) to make a simple text-based format, but it's obviously a nightmare to set up 50+ controllers by hand in a text file, so I had to make a UI for this. My initial thought was to make a grid of spinners (similar to how the input mapping dialog already worked), but then I realized that there isn't an easy way to make headlines in Qt's grid. (You can substitute a label widget for a single cell, but not for an entire row. Who knew?) So after some searching, I found out that it would be better to have a tree view (Qt Creator does this), and then you can treat that more-or-less as a table for the rows that should be editable.

Of course, guessing controller numbers is impossible even in an editor, so I wanted it to respond to MIDI events. This means the editor needs to take over the role as MIDI receiver from the main UI. How do you do that in a thread-safe way? (Reuse the existing mutex; you don't generally want to use atomics for complicated things.) Thinking about it, shouldn't the MIDI mapper just support multiple receivers at a time? (Doubtful; you don't want your random controller fiddling during setup to actually influence the audio on a running stream. And would you use the old or the new mapping?)

And do you really need to set up every single controller for each bus, given that the mapping is pretty much guaranteed to be similar for them? Making a “guess bus” button doesn't seem too difficult, where if you have one correctly set up controller on the bus, it can guess from a neighboring bus (assuming a static offset). But what if there's conflicting information? OK; then you should disable the button. So now the enable/disable status of that button depends on which cell in your grid has the focus; how do you get at those events? (Install an event filter, or subclass the spinner.) And so on, and so on, and so on.

You could argue that most of these questions go away with experience; if you're an expert in a given API, you can answer most of these questions in a minute or two even if you haven't heard the exact question before. But you can't expect even experienced developers to be experts in all possible libraries; if you know everything there is to know about Qt, ALSA, x264, ffmpeg, OpenGL, VA-API, libusb, microhttpd and Lua (in addition to C++11, of course), I'm sure you'd be a great fit for Nageru, but I'd wager that very few developers fit that bill. I've written C++ for almost 20 years now (almost ten of them professionally), and that experience certainly helps boost productivity, but I can't say I expect a 10x reduction in my own development time at any point.

You could also argue, of course, that spending so much time on the editor is wasted, since most users will only ever see it once. But here's the point: it's not actually a lot of time. The only reason why it seems like so much is that I bothered to write two paragraphs about it; it's not a particular pain point, it just adds to the total. Also, the first impression matters a lot—if the user can't get the editor to work, they also can't get the MIDI controller to work, and are likely to just go do something else.

A common misconception is that just switching languages or using libraries will help you a lot. (Witness the never-ending stream of software that advertises “written in Foo” or “uses Bar” as if it were a feature.) For the former, note that nothing I've said so far is specific to my choice of language (C++), and I've certainly avoided a bunch of battles by making that specific choice over, say, Python. For the latter, note that most of these problems are actually related to library use—libraries are great, and they solve a bunch of problems I'm really glad I didn't have to worry about (how should each button look?), but they still give their own interaction problems. And even when you're a master of your chosen programming environment, things still take time, because you have all those decisions to make on top of your libraries.

Of course, there are cases where libraries really solve your entire problem and your code gets reduced to 100 trivial lines, but that's really only when you're solving a problem that's been solved a million times before. Congrats on making that blog in Rails; I'm sure you're advancing the world. (To make things worse, usually this breaks down when you want to stray ever so slightly from what was intended by the library or framework author. What seems like a perfect match can suddenly become a development trap where you spend more of your time trying to become an expert in working around the given library than actually doing any development.)

The entire thing reminds me of the famous essay No Silver Bullet by Fred Brooks, but perhaps even more so, this quote from John Carmack's .plan has stuck with me (incidentally about mobile game development in 2006, but the basic story still rings true):

To some degree this is already the case on high end BREW phones today. I have a pretty clear idea what a maxed out software renderer would look like for that class of phones, and it wouldn't be the PlayStation-esq 3D graphics that seems to be the standard direction. When I was doing the graphics engine upgrades for BREW, I started along those lines, but after putting in a couple days at it I realized that I just couldn't afford to spend the time to finish the work. "A clear vision" doesn't mean I can necessarily implement it in a very small integral number of days.

In a sense, programming is all about what your program should do in the first place. The “how” question is just the “what”, moved down the chain of abstractions until it ends up where a computer can understand it, and at that point, the three words “multichannel audio support” have become those 9,000 lines that describe in perfect detail what's going on.

Daniel Pocock: FOSDEM 2017 Real-Time Communications Call for Participation

6 hours 30 min ago

FOSDEM is one of the world's premier meetings of free software developers, with over five thousand people attending each year. FOSDEM 2017 takes place 4-5 February 2017 in Brussels, Belgium.

This email contains information about:

  • Real-Time communications dev-room and lounge,
  • speaking opportunities,
  • volunteering in the dev-room and lounge,
  • related events around FOSDEM, including the XMPP summit,
  • social events (the legendary FOSDEM Beer Night and Saturday night dinners provide endless networking opportunities),
  • the Planet aggregation sites for RTC blogs
Call for participation - Real Time Communications (RTC)

The Real-Time dev-room and Real-Time lounge is about all things involving real-time communication, including: XMPP, SIP, WebRTC, telephony, mobile VoIP, codecs, peer-to-peer, privacy and encryption. The dev-room is a successor to the previous XMPP and telephony dev-rooms. We are looking for speakers for the dev-room and volunteers and participants for the tables in the Real-Time lounge.

The dev-room is only on Saturday, 4 February 2017. The lounge will be present for both days.

To discuss the dev-room and lounge, please join the FSFE-sponsored Free RTC mailing list.

To be kept aware of major developments in Free RTC, without being on the discussion list, please join the Free-RTC Announce list.

Speaking opportunities

Note: if you used FOSDEM Pentabarf before, please use the same account/username

Real-Time Communications dev-room: deadline 23:59 UTC on 17 November. Please use the Pentabarf system to submit a talk proposal for the dev-room. On the "General" tab, please look for the "Track" option and choose "Real-Time devroom". Link to talk submission.

Other dev-rooms and lightning talks: some speakers may find their topic is in the scope of more than one dev-room. You are encouraged to apply to more than one dev-room and also to consider proposing a lightning talk, but please be kind enough to tell us if you do this by filling out the notes in the form.

You can find the full list of dev-rooms on this page and apply for a lightning talk at

Main track: the deadline for main track presentations is 23:59 UTC 31 October. Leading developers in the Real-Time Communications field are encouraged to consider submitting a presentation to the main track.

First-time speaking?

FOSDEM dev-rooms are a welcoming environment for people who have never given a talk before. Please feel free to contact the dev-room administrators personally if you would like to ask any questions about it.

Submission guidelines

The Pentabarf system will ask for many of the essential details. Please remember to re-use your account from previous years if you have one.

In the "Submission notes", please tell us about:

  • the purpose of your talk
  • any other talk applications (dev-rooms, lightning talks, main track)
  • availability constraints and special needs

You can use HTML and links in your bio, abstract and description.

If you maintain a blog, please consider providing us with the URL of a feed with posts tagged for your RTC-related work.

We will be looking for relevance to the conference and dev-room themes: presentations aimed at developers of free and open source software, on RTC-related topics.

Please feel free to suggest a duration between 20 minutes and 55 minutes but note that the final decision on talk durations will be made by the dev-room administrators. As the two previous dev-rooms have been combined into one, we may decide to give shorter slots than in previous years so that more speakers can participate.

Please note FOSDEM aims to record and live-stream all talks. The CC-BY license is used.

Volunteers needed

To make the dev-room and lounge run successfully, we are looking for volunteers:

  • FOSDEM provides video recording equipment and live streaming, volunteers are needed to assist in this
  • organizing one or more restaurant bookings (depending upon the number of participants) for the evening of Saturday, 4 February
  • participation in the Real-Time lounge
  • helping attract sponsorship funds for the dev-room to pay for the Saturday night dinner and any other expenses
  • circulating this Call for Participation (text version) to other mailing lists

See the mailing list discussion for more details about volunteering.

Related events - XMPP and RTC summits

The XMPP Standards Foundation (XSF) has traditionally held a summit in the days before FOSDEM. There is discussion about a similar summit taking place on 2 and 3 February 2017. XMPP Summit web site - please join the mailing list for details.

We are also considering a more general RTC or telephony summit, potentially in collaboration with the XMPP summit. Please join the Free-RTC mailing list and send an email if you would be interested in participating, sponsoring or hosting such an event.

Social events and dinners

The traditional FOSDEM beer night occurs on Friday, 3 February.

On Saturday night, there are usually dinners associated with each of the dev-rooms. Most restaurants in Brussels are not so large so these dinners have space constraints and reservations are essential. Please subscribe to the Free-RTC mailing list for further details about the Saturday night dinner options and how you can register for a seat.

Spread the word and discuss

If you know of any mailing lists where this CfP would be relevant, please forward this email (text version). If this dev-room excites you, please blog or microblog about it, especially if you are submitting a talk.

If you regularly blog about RTC topics, please send details about your blog to the planet site administrators:

Planet sites and admin contacts:

  • All projects: Free-RTC Planet (contact)
  • XMPP: Planet Jabber (contact)
  • SIP: Planet SIP (contact)
  • SIP (Español): Planet SIP-es (contact)

Please also link to the Planet sites from your own blog or web site as this helps everybody in the free real-time communications community.


For any private queries, contact us directly, and for any other queries please ask on the Free-RTC mailing list.

The dev-room administration team:

Joachim Breitner: Showcasing Applicative

9 hours 24 min ago

My plan for this week’s lecture of the CIS 194 Haskell course at the University of Pennsylvania is to dwell a bit on the concepts of Functor, Applicative and Monad, and to highlight the value of the Applicative abstraction.

I quite like the example that I came up with, so I want to share it here. In the interest of long-term archival and stand-alone presentation, I include all the material in this post.1


In case you want to follow along, start with these imports:

import Data.Char
import Data.Maybe
import Data.List

import System.Environment
import System.IO
import System.Exit
The parser

The starting point for this exercise is a fairly standard parser-combinator monad, which happens to be the result of the student’s homework from last week:

newtype Parser a = P (String -> Maybe (a, String))

runParser :: Parser t -> String -> Maybe (t, String)
runParser (P p) = p

parse :: Parser a -> String -> Maybe a
parse p input = case runParser p input of
    Just (result, "") -> Just result
    _ -> Nothing -- handles both no result and leftover input

noParserP :: Parser a
noParserP = P (\_ -> Nothing)

pureParserP :: a -> Parser a
pureParserP x = P (\input -> Just (x,input))

instance Functor Parser where
    fmap f p = P $ \input -> do
        (x, rest) <- runParser p input
        return (f x, rest)

instance Applicative Parser where
    pure = pureParserP
    p1 <*> p2 = P $ \input -> do
        (f, rest1) <- runParser p1 input
        (x, rest2) <- runParser p2 rest1
        return (f x, rest2)

instance Monad Parser where
    return = pure
    p1 >>= k = P $ \input -> do
        (x, rest1) <- runParser p1 input
        runParser (k x) rest1

anyCharP :: Parser Char
anyCharP = P $ \input -> case input of
    (c:rest) -> Just (c, rest)
    []       -> Nothing

charP :: Char -> Parser ()
charP c = do
    c' <- anyCharP
    if c == c' then return ()
               else noParserP

anyCharButP :: Char -> Parser Char
anyCharButP c = do
    c' <- anyCharP
    if c /= c' then return c'
               else noParserP

letterOrDigitP :: Parser Char
letterOrDigitP = do
    c <- anyCharP
    if isAlphaNum c then return c else noParserP

orElseP :: Parser a -> Parser a -> Parser a
orElseP p1 p2 = P $ \input -> case runParser p1 input of
    Just r -> Just r
    Nothing -> runParser p2 input

manyP :: Parser a -> Parser [a]
manyP p = (pure (:) <*> p <*> manyP p) `orElseP` pure []

many1P :: Parser a -> Parser [a]
many1P p = pure (:) <*> p <*> manyP p

sepByP :: Parser a -> Parser () -> Parser [a]
sepByP p1 p2 = (pure (:) <*> p1 <*> (manyP (p2 *> p1))) `orElseP` pure []

A parser using this library for, say, CSV files could take this form:

parseCSVP :: Parser [[String]]
parseCSVP = manyP parseLine
  where
    parseLine = parseCell `sepByP` charP ',' <* charP '\n'
    parseCell = do
        charP '"'
        content <- manyP (anyCharButP '"')
        charP '"'
        return content
We want EBNF

Often when we write a parser for a file format, we might also want to have a formal specification of the format. A common form for such a specification is EBNF. This might look as follows, for a CSV file:

cell = '"', {not-quote}, '"';
line = (cell, {',', cell} | ''), newline;
csv  = {line};

It is straight-forward to create a Haskell data type to represent an EBNF syntax description. Here is a simple EBNF library (data type and pretty-printer) for your convenience:

data RHS
  = Terminal String
  | NonTerminal String
  | Choice RHS RHS
  | Sequence RHS RHS
  | Optional RHS
  | Repetition RHS
  deriving (Show, Eq)

ppRHS :: RHS -> String
ppRHS = go 0
  where
    go _ (Terminal s)     = surround "'" "'" $ concatMap quote s
    go _ (NonTerminal s)  = s
    go a (Choice x1 x2)   = p a 1 $ go 1 x1 ++ " | " ++ go 1 x2
    go a (Sequence x1 x2) = p a 2 $ go 2 x1 ++ ", "  ++ go 2 x2
    go _ (Optional x)     = surround "[" "]" $ go 0 x
    go _ (Repetition x)   = surround "{" "}" $ go 0 x

    surround c1 c2 x = c1 ++ x ++ c2

    p a n | a > n     = surround "(" ")"
          | otherwise = id

    quote '\'' = "\\'"
    quote '\\' = "\\\\"
    quote c    = [c]

type Production = (String, RHS)
type BNF = [Production]

ppBNF :: BNF -> String
ppBNF = unlines . map (\(i,rhs) -> i ++ " = " ++ ppRHS rhs ++ ";")
Code to produce EBNF

We had a good time writing combinators that create complex parsers from primitive pieces. Let us do the same for EBNF grammars. We could simply work on the RHS type directly, but we can do something more nifty: We create a data type that keeps track, via a phantom type parameter, of which Haskell type the given EBNF syntax describes:

newtype Grammar a = G RHS

ppGrammar :: Grammar a -> String
ppGrammar (G rhs) = ppRHS rhs

So a value of type Grammar t is a description of the textual representation of the Haskell type t.

Here is one simple example:

anyCharG :: Grammar Char
anyCharG = G (NonTerminal "char")

Here is another one. This one does not describe any interesting Haskell type, but is useful when spelling out the special characters in the syntax described by the grammar:

charG :: Char -> Grammar ()
charG c = G (Terminal [c])

A combinator that creates new grammars from two existing grammars:

orElseG :: Grammar a -> Grammar a -> Grammar a
orElseG (G rhs1) (G rhs2) = G (Choice rhs1 rhs2)

We want the convenience of our well-known type classes in order to combine these values some more:

instance Functor Grammar where
    fmap _ (G rhs) = G rhs

instance Applicative Grammar where
    pure x = G (Terminal "")
    (G rhs1) <*> (G rhs2) = G (Sequence rhs1 rhs2)

Note how the Functor instance does not actually use the function. How should it? There are no values inside a Grammar!

We cannot define a Monad instance for Grammar: We would start with (G rhs1) >>= k = …, but there is simply no way of getting a value of type a that we can feed to k. So we will do without a Monad instance. This is interesting, and we will come back to that later.

Like with the parser, we can now begin to build on the primitive example to build more complicated combinators:

manyG :: Grammar a -> Grammar [a]
manyG p = (pure (:) <*> p <*> manyG p) `orElseG` pure []

many1G :: Grammar a -> Grammar [a]
many1G p = pure (:) <*> p <*> manyG p

sepByG :: Grammar a -> Grammar () -> Grammar [a]
sepByG p1 p2 = ((:) <$> p1 <*> (manyG (p2 *> p1))) `orElseG` pure []

Let us run a small example:

dottedWordsG :: Grammar [String]
dottedWordsG = many1G (manyG anyCharG <* charG '.')
*Main> putStrLn $ ppGrammar dottedWordsG
'', ('', char, ('', char, ('', char, ('', char, ('', char, ('', …

Oh my, that is not good. Looks like the recursion in manyG does not work well, so we need to avoid that. But anyway, we want to be explicit in the EBNF grammars about where something can be repeated, so let us just make many a primitive:

manyG :: Grammar a -> Grammar [a]
manyG (G rhs) = G (Repetition rhs)

With this definition, we already get a simple grammar for dottedWordsG:

*Main> putStrLn $ ppGrammar dottedWordsG
'', {char}, '.', {{char}, '.'}

This already looks like a proper EBNF grammar. One thing that is not nice about it is that there is an empty string ('') in a sequence (…,…). We do not want that.

Why is it there in the first place? Because our Applicative instance is not lawful! Remember that pure id <*> g == g should hold. One way to achieve that is to improve the Applicative instance to optimize this case away:

instance Applicative Grammar where
    pure x = G (Terminal "")
    G (Terminal "") <*> G rhs2 = G rhs2
    G rhs1 <*> G (Terminal "") = G rhs1
    (G rhs1) <*> (G rhs2) = G (Sequence rhs1 rhs2)
Now we get what we want:
*Main> putStrLn $ ppGrammar dottedWordsG
{char}, '.', {{char}, '.'}

Remember our parser for CSV files above? Let me repeat it here, this time using only Applicative combinators, i.e. avoiding (>>=), (>>), return and do-notation:

parseCSVP :: Parser [[String]]
parseCSVP = manyP parseLine
  where
    parseLine = parseCell `sepByP` charP ',' <* charP '\n'
    parseCell = charP '"' *> manyP (anyCharButP '"') <* charP '"'

And now we try to rewrite the code to produce Grammar instead of Parser. This is straight-forward with the exception of anyCharButP. The parser code for that is inherently monadic, and we just do not have a Monad instance. So we work around the issue by making that a “primitive” grammar, i.e. introducing a non-terminal in the EBNF without a production rule – pretty much like we did for anyCharG:

primitiveG :: String -> Grammar a
primitiveG s = G (NonTerminal s)

parseCSVG :: Grammar [[String]]
parseCSVG = manyG parseLine
  where
    parseLine = parseCell `sepByG` charG ',' <* charG '\n'
    parseCell = charG '"' *> manyG (primitiveG "not-quote") <* charG '"'

Of course the names parse… are not quite right any more, but let us just leave that for now.

Here is the result:

*Main> putStrLn $ ppGrammar parseCSVG
{('"', {not-quote}, '"', {',', '"', {not-quote}, '"'} | ''), '

The line break is weird. We do not really want newlines in the grammar. So let us make that primitive as well, and replace charG '\n' with newlineG:

newlineG :: Grammar ()
newlineG = primitiveG "newline"

Now we get

*Main> putStrLn $ ppGrammar parseCSVG
{('"', {not-quote}, '"', {',', '"', {not-quote}, '"'} | ''), newline}

which is nice and correct, but still not quite the easily readable EBNF that we saw further up.

Code to produce EBNF, with productions

We currently let our grammars produce only the right-hand side of one EBNF production, but really, we want to produce a RHS that may refer to other productions. So let us change the type accordingly:

newtype Grammar a = G (BNF, RHS)

runGrammer :: String -> Grammar a -> BNF
runGrammer main (G (prods, rhs)) = prods ++ [(main, rhs)]

ppGrammar :: String -> Grammar a -> String
ppGrammar main g = ppBNF $ runGrammer main g

Now we have to adjust all our primitive combinators (but not the derived ones!):

charG :: Char -> Grammar ()
charG c = G ([], Terminal [c])

anyCharG :: Grammar Char
anyCharG = G ([], NonTerminal "char")

manyG :: Grammar a -> Grammar [a]
manyG (G (prods, rhs)) = G (prods, Repetition rhs)

mergeProds :: [Production] -> [Production] -> [Production]
mergeProds prods1 prods2 = nub $ prods1 ++ prods2

orElseG :: Grammar a -> Grammar a -> Grammar a
orElseG (G (prods1, rhs1)) (G (prods2, rhs2))
    = G (mergeProds prods1 prods2, Choice rhs1 rhs2)

instance Functor Grammar where
    fmap _ (G bnf) = G bnf

instance Applicative Grammar where
    pure x = G ([], Terminal "")
    G (prods1, Terminal "") <*> G (prods2, rhs2)
        = G (mergeProds prods1 prods2, rhs2)
    G (prods1, rhs1) <*> G (prods2, Terminal "")
        = G (mergeProds prods1 prods2, rhs1)
    G (prods1, rhs1) <*> G (prods2, rhs2)
        = G (mergeProds prods1 prods2, Sequence rhs1 rhs2)

primitiveG :: String -> Grammar a
primitiveG s = G ([], NonTerminal s)

The use of nub when combining productions removes duplicates that might be used in different parts of the grammar. Not efficient, but good enough for now.

Did we gain anything? Not yet:

*Main> putStr $ ppGrammar "csv" (parseCSVG)
csv = {('"', {not-quote}, '"', {',', '"', {not-quote}, '"'} | ''), newline};

But we can now introduce a function that lets us tell the system where to give names to a piece of grammar:

nonTerminalG :: String -> Grammar a -> Grammar a
nonTerminalG name (G (prods, rhs))
  = G (prods ++ [(name, rhs)], NonTerminal name)

Ample use of this in parseCSVG yields the desired result:

parseCSVG :: Grammar [[String]]
parseCSVG = manyG parseLine
  where
    parseLine = nonTerminalG "line" $
        parseCell `sepByG` charG ',' <* newlineG
    parseCell = nonTerminalG "cell" $
        charG '"' *> manyG (primitiveG "not-quote") <* charG '"'
*Main> putStr $ ppGrammar "csv" (parseCSVG)
cell = '"', {not-quote}, '"';
line = (cell, {',', cell} | ''), newline;
csv = {line};

This is great!

Unifying parsing and grammar-generating

Note how similar parseCSVG and parseCSVP are! Would it not be great if we could implement that functionality only once, and get both a parser and a grammar description out of it? This way, the two would never be out of sync!

And surely this must be possible. The tool to reach for is of course to define a type class that abstracts over the parts where Parser and Grammar differ. So we have to identify all functions that are primitive in one of the two worlds, and turn them into type class methods. This includes char and orElse. It includes many, too: Although manyP is not primitive, manyG is. It also includes nonTerminal, which does not exist in the world of parsers (yet), but we need it for the grammars.

The primitiveG function is tricky. We use it in grammars when the code that we might use while parsing is not expressible as a grammar. So the solution is to let it take two arguments: a String, used as a descriptive non-terminal in a grammar, and a Parser a, used in the parsing code.

Finally, the type classes that we expect, Applicative (and thus Functor), are added as constraints on our type class:

class Applicative f => Descr f where
    char :: Char -> f ()
    many :: f a -> f [a]
    orElse :: f a -> f a -> f a
    primitive :: String -> Parser a -> f a
    nonTerminal :: String -> f a -> f a

The instances are easily written:

instance Descr Parser where
    char = charP
    many = manyP
    orElse = orElseP
    primitive _ p = p
    nonTerminal _ p = p

instance Descr Grammar where
    char = charG
    many = manyG
    orElse = orElseG
    primitive s _ = primitiveG s
    nonTerminal s g = nonTerminalG s g

And we can now take the derived definitions, of which so far we had two copies, and define them once and for all:

many1 :: Descr f => f a -> f [a]
many1 p = pure (:) <*> p <*> many p

anyChar :: Descr f => f Char
anyChar = primitive "char" anyCharP

dottedWords :: Descr f => f [String]
dottedWords = many1 (many anyChar <* char '.')

sepBy :: Descr f => f a -> f () -> f [a]
sepBy p1 p2 = ((:) <$> p1 <*> (many (p2 *> p1))) `orElse` pure []

newline :: Descr f => f ()
newline = primitive "newline" (charP '\n')

And thus we now have our CSV parser/grammar generator:

parseCSV :: Descr f => f [[String]]
parseCSV = many parseLine
  where
    parseLine = nonTerminal "line" $
        parseCell `sepBy` char ',' <* newline
    parseCell = nonTerminal "cell" $
        char '"' *> many (primitive "not-quote" (anyCharButP '"')) <* char '"'

We can now use this definition both to parse and to generate grammars:

*Main> putStr $ ppGrammar "csv" (parseCSV)
cell = '"', {not-quote}, '"';
line = (cell, {',', cell} | ''), newline;
csv = {line};
*Main> parse parseCSV "\"ab\",\"cd\"\n\"\",\"de\"\n\n"
Just [["ab","cd"],["","de"],[]]
The INI file parser and grammar

As a final exercise, let us transform the INI file parser into a combined thing. Here is the parser (another artifact of last week’s homework) again using applicative style2:

-- INIFile comes from last week’s homework; presumably:
-- type INIFile = [(String, [(String, String)])]
parseINIP :: Parser INIFile
parseINIP = many1P parseSection
  where
    parseSection =
        (,) <$  charP '['
            <*> parseIdent
            <*  charP ']'
            <*  charP '\n'
            <*> (catMaybes <$> manyP parseLine)
    parseIdent = many1P letterOrDigitP
    parseLine = parseDecl `orElseP` parseComment `orElseP` parseEmpty

    parseDecl = Just <$> (
        (,) <$> parseIdent
            <*  manyP (charP ' ')
            <*  charP '='
            <*  manyP (charP ' ')
            <*> many1P (anyCharButP '\n')
            <*  charP '\n')

    parseComment =
        Nothing <$ charP '#'
                <* many1P (anyCharButP '\n')
                <* charP '\n'

    parseEmpty = Nothing <$ charP '\n'

Transforming that to a generic description is quite straight-forward. We use primitive again to wrap letterOrDigitP:

descrINI :: Descr f => f INIFile
descrINI = many1 parseSection
  where
    parseSection =
        (,) <$  char '['
            <*> parseIdent
            <*  char ']'
            <*  newline
            <*> (catMaybes <$> many parseLine)
    parseIdent = many1 (primitive "alphanum" letterOrDigitP)
    parseLine = parseDecl `orElse` parseComment `orElse` parseEmpty

    parseDecl = Just <$> (
        (,) <$> parseIdent
            <*  many (char ' ')
            <*  char '='
            <*  many (char ' ')
            <*> many1 (primitive "non-newline" (anyCharButP '\n'))
            <*  newline)

    parseComment =
        Nothing <$ char '#'
                <* many1 (primitive "non-newline" (anyCharButP '\n'))
                <* newline

    parseEmpty = Nothing <$ newline

This yields this not very helpful grammar (abbreviated here):

*Main> putStr $ ppGrammar "ini" descrINI
ini = '[', alphanum, {alphanum}, ']', newline, {alphanum, {alphanum}, {' '}…

But with a few uses of nonTerminal, we get something really nice:

descrINI :: Descr f => f INIFile
descrINI = many1 parseSection
  where
    parseSection = nonTerminal "section" $
        (,) <$  char '['
            <*> parseIdent
            <*  char ']'
            <*  newline
            <*> (catMaybes <$> many parseLine)
    parseIdent = nonTerminal "identifier" $
        many1 (primitive "alphanum" letterOrDigitP)
    parseLine = nonTerminal "line" $
        parseDecl `orElse` parseComment `orElse` parseEmpty

    parseDecl = nonTerminal "declaration" $ Just <$> (
        (,) <$> parseIdent
            <*  spaces
            <*  char '='
            <*  spaces
            <*> remainder)

    parseComment = nonTerminal "comment" $
        Nothing <$ char '#' <* remainder

    remainder = nonTerminal "line-remainder" $
        many1 (primitive "non-newline" (anyCharButP '\n')) <* newline

    parseEmpty = Nothing <$ newline

    spaces = nonTerminal "spaces" $ many (char ' ')
*Main> putStr $ ppGrammar "ini" descrINI
identifier = alphanum, {alphanum};
spaces = {' '};
line-remainder = non-newline, {non-newline}, newline;
declaration = identifier, spaces, '=', spaces, line-remainder;
comment = '#', line-remainder;
line = declaration | comment | newline;
section = '[', identifier, ']', newline, {line};
ini = section, {section};
Recursion (variant 1)

What if we want to write a parser/grammar-generator that is able to generate the following grammar, which describes terms that are additions and multiplications of natural numbers:

const = digit, {digit};
spaces = {' ' | newline};
atom = const | '(', spaces, expr, spaces, ')', spaces;
mult = atom, {spaces, '*', spaces, atom}, spaces;
plus = mult, {spaces, '+', spaces, mult}, spaces;
expr = plus;

The production of expr is recursive (via plus, mult, atom). We have seen above that simply defining a Grammar a recursively does not go well.

One solution is to add a new combinator for explicit recursion, which replaces nonTerminal in the type class:

class Applicative f => Descr f where
    recNonTerminal :: String -> (f a -> f a) -> f a

instance Descr Parser where
    recNonTerminal _ p = let r = p r in r

instance Descr Grammar where
    recNonTerminal = recNonTerminalG

recNonTerminalG :: String -> (Grammar a -> Grammar a) -> Grammar a
recNonTerminalG name f =
    let G (prods, rhs) = f (G ([], NonTerminal name))
    in G (prods ++ [(name, rhs)], NonTerminal name)

nonTerminal :: Descr f => String -> f a -> f a
nonTerminal name p = recNonTerminal name (const p)

runGrammer :: String -> Grammar a -> BNF
runGrammer main (G (prods, NonTerminal nt)) | main == nt = prods
runGrammer main (G (prods, rhs)) = prods ++ [(main, rhs)]

The change in runGrammer avoids adding a pointless expr = expr production to the output.

This lets us define a parser/grammar-generator for the arithmetic expressions given above:

data Expr = Plus Expr Expr | Mult Expr Expr | Const Integer
    deriving Show

mkPlus :: Expr -> [Expr] -> Expr
mkPlus = foldl Plus

mkMult :: Expr -> [Expr] -> Expr
mkMult = foldl Mult

parseExpr :: Descr f => f Expr
parseExpr = recNonTerminal "expr" $ \ exp ->
    ePlus exp

ePlus :: Descr f => f Expr -> f Expr
ePlus exp = nonTerminal "plus" $
    mkPlus <$> eMult exp
           <*> many (spaces *> char '+' *> spaces *> eMult exp)
           <*  spaces

eMult :: Descr f => f Expr -> f Expr
eMult exp = nonTerminal "mult" $
    mkMult <$> eAtom exp
           <*> many (spaces *> char '*' *> spaces *> eAtom exp)
           <*  spaces

eAtom :: Descr f => f Expr -> f Expr
eAtom exp = nonTerminal "atom" $
    aConst `orElse` eParens exp

aConst :: Descr f => f Expr
aConst = nonTerminal "const" $ Const . read <$> many1 digit

eParens :: Descr f => f a -> f a
eParens inner =
    id <$  char '('
       <*  spaces
       <*> inner
       <*  spaces
       <*  char ')'
       <*  spaces

And indeed, this works:

*Main> putStr $ ppGrammar "expr" parseExpr
const = digit, {digit};
spaces = {' ' | newline};
atom = const | '(', spaces, expr, spaces, ')', spaces;
mult = atom, {spaces, '*', spaces, atom}, spaces;
plus = mult, {spaces, '+', spaces, mult}, spaces;
expr = plus;
Recursion (variant 2)

Interestingly, there is another solution to this problem, which avoids introducing recNonTerminal and explicitly passing around the recursive call (i.e. the exp in the example). To implement that we have to adjust our Grammar type as follows:

newtype Grammar a = G ([String] -> (BNF, RHS))

The idea is that the list of strings is those non-terminals that we are currently defining. So in nonTerminal, we check if the non-terminal to be introduced is currently in the process of being defined, and then simply ignore the body. This way, the recursion is stopped automatically:

nonTerminalG :: String -> (Grammar a) -> Grammar a
nonTerminalG name (G g) = G $ \seen ->
    if name `elem` seen
    then ([], NonTerminal name)
    else let (prods, rhs) = g (name : seen)
         in (prods ++ [(name, rhs)], NonTerminal name)

After adjusting the other primitives of Grammar (including the Functor and Applicative instances) to type-check again, and putting nonTerminal back into the type class, we observe that this parser/grammar generator for expressions, with genuine recursion, now works:

parseExp :: Descr f => f Expr
parseExp = nonTerminal "expr" ePlus

ePlus :: Descr f => f Expr
ePlus = nonTerminal "plus" $
    mkPlus <$> eMult
           <*> many (spaces *> char '+' *> spaces *> eMult)
           <*  spaces

eMult :: Descr f => f Expr
eMult = nonTerminal "mult" $
    mkMult <$> eAtom
           <*> many (spaces *> char '*' *> spaces *> eAtom)
           <*  spaces

eAtom :: Descr f => f Expr
eAtom = nonTerminal "atom" $
    aConst `orElse` eParens parseExp

Note that the recursion is only going to work if there is at least one call to nonTerminal somewhere around the recursive calls. We still cannot implement many as naively as above.


If you want to play more with this: The homework is to define a parser/grammar-generator for EBNF itself, as specified in this variant:

identifier = letter, {letter | digit | '-'};
spaces = {' ' | newline};
quoted-char = non-quote-or-backslash | '\\', '\\' | '\\', '\'';
terminal = '\'', {quoted-char}, '\'', spaces;
non-terminal = identifier, spaces;
option = '[', spaces, rhs, spaces, ']', spaces;
repetition = '{', spaces, rhs, spaces, '}', spaces;
group = '(', spaces, rhs, spaces, ')', spaces;
atom = terminal | non-terminal | option | repetition | group;
sequence = atom, {spaces, ',', spaces, atom}, spaces;
choice = sequence, {spaces, '|', spaces, sequence}, spaces;
rhs = choice;
production = identifier, spaces, '=', spaces, rhs, ';', spaces;
bnf = production, {production};

This grammar is set up so that the precedence of , and | is correctly implemented: a , b | c will parse as (a, b) | c.

In this syntax for BNF, terminal characters are quoted, i.e. inside '…', a ' is replaced by \' and a \ is replaced by \\ – this is done by the function quote in ppRHS.

If you do this, you should be able to round-trip with the pretty-printer, i.e. parse back what it wrote:

*Main> let bnf1 = runGrammer "expr" parseExpr
*Main> let bnf2 = runGrammer "expr" parseBNF
*Main> let f = Data.Maybe.fromJust . parse parseBNF . ppBNF
*Main> f bnf1 == bnf1
*Main> f bnf2 == bnf2

The last line is quite meta: We are using parseBNF as a parser on the pretty-printed grammar produced from interpreting parseBNF as a grammar.


We have again seen an example of the excellent support for abstraction in Haskell: Being able to define so very different things such as a parser and a grammar description with the same code is great. Type classes helped us here.

Note that it was crucial that our combined parser/grammers are only able to use the methods of Applicative, and not Monad. Applicative is less powerful, so by giving less power to the user of our Descr interface, the other side, i.e. the implementation, can be more powerful.

The reason why Applicative is ok, but Monad is not, is that in Applicative, the results do not affect the shape of the computation, whereas in Monad, the whole point of the bind operator (>>=) is that the result of the computation is used to decide the next computation. And while this is perfectly fine for a parser, it just makes no sense for a grammar generator, where there simply are no values around!

We have also seen that a phantom type, namely the parameter of Grammar, can be useful, as it lets the type system make sure we do not write nonsense. For example, the type of orElseG ensures that both grammars that are combined here indeed describe something of the same type.

  1. It seems to be the week of applicative-appraising blog posts: Brent has posted a nice piece about enumerations using Applicative yesterday.

  2. I like how in this alignment of <*> and <* the > point out where the arguments are that are being passed to the function on the left.

Dirk Eddelbuettel: Rblpapi 0.3.5

10 hours 55 min ago

A new release of Rblpapi is now on CRAN. Rblpapi provides a direct interface between R and the Bloomberg Terminal via the C++ API provided by Bloomberg Labs (but note that a valid Bloomberg license and installation is required).

This is the sixth release since the package first appeared on CRAN last year. This release brings new functionality via a new function (getPortfolio()) and an extended one (getTicks()), as well as several fixes:

Changes in Rblpapi version 0.3.5 (2016-10-25)
  • Add new function getPortfolio to retrieve portfolio data via bds (John in #176)

  • Extend getTicks() to (optionally) return non-numeric data as part of data.frame or data.table (Dirk in #200)

  • Similarly extend getMultipleTicks (Dirk in #202)

  • Correct statement on timestamp for getBars (Closes issue #192)

  • Minor edits to a few files in order to either please R(-devel) CMD check --as-cran, or update documentation

Courtesy of CRANberries, there is also a diffstat report for this release. As always, more detailed information is on the Rblpapi page. Questions, comments etc should go to the issue tickets system at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Laura Arjona Reina: Rankings, Condorcet and free software: Calculating the results for the Stretch Artwork Survey

16 hours 57 min ago
We had 12 candidates for the Debian Stretch Artwork, and a survey was set up to allow people to vote for the one they prefer.

The survey was run on my LimeSurvey instance; LimeSurvey is nice free software with a lot of features. It provides a “Ranking” question type, and it was very easy to allow people to “vote” in the Debian style (Debian uses the Condorcet method in its elections).

However, although LimeSurvey offers statistics and even graphics to show the results of many types of questions, its output for the Ranking type is not useful, so I had to export the data and use another tool to find the winner.

Export the data from LimeSurvey

I’ve created a read-only user to visit the survey site. With this visitor you can explore the survey questionnaire, its results, and export the data. URL: Username: stretch Password: artwork

First attempt, the quick and easy (and nonfree, I guess)

There is an online tool to calculate the Condorcet winner. The steps I followed to feed the tool with the data from LimeSurvey were these:

1. Went to the admin interface of LimeSurvey, selected the stretch artwork survey, then responses and statistics, and export results to application.
2. Selected “Completed responses only”, “Question codes”, “Answer codes”, and exported to CSV (results_stretch1.csv).
3. Opened the CSV with LibreOffice Calc and removed these columns: id, submitdate, lastpage, startlanguage.
4. Removed the first row containing the headers and saved the result (results_stretch2.csv).
5. In the command line:
sort results_stretch2.csv | uniq -c > results_stretch3.csv
6. Opened results_stretch3.csv with LibreOffice Calc, choosing “merge delimiters” when importing.
7. Removed the first column (blank), added a column between the numbers and the first ranked option, and filled that column with the value “:”. Saved (results_stretch4.csv).
8. Opened results_stretch4.csv with my preferred editor, searched and replaced “,:,” with “:”, and after that searched and replaced “,” with “>”. Saved the result (results_stretch5.csv).
9. Went to the online solver, selected Condorcet basic, “tell me some things”, and pasted the contents of results_stretch5.csv there. The results are in results_stretch1.html.

But where is the source code of this Condorcet tool?

I couldn’t find the source code (nor license) of the solver by Eric Gorr. The tool is mentioned in a list of voting tools where it is noted when a tool is libre software. But not in this case. There, I found another tool, VoteEngine, which is open source, so I tried that.

Second attempt: VoteEngine, a free and open source tool written in Python

I used a modification of voteengine-0.99 (the original zip is available online, and so is a diff with the changes I made: basically Numeric -> numpy and Int -> int, in order for it to work on Debian stable). Steps 1 to 4 are the same as in the first attempt.

5. Sorted the 12 different options alphabetically and assigned a letter to each one (saved the assignments in a file called stretch_key.txt).
6. Opened results_stretch2.csv with my favorite editor, and searched and replaced the names of the different options with their corresponding letters from the stretch_key.txt file. Searched and replaced “,” with “ ” (space). Then saved the results into the file results_stretch3_voteengine.txt.
7. Copied the input.txt file from voteengine-0.99 into stretch.txt and edited the options to our needs. Pasted the contents of results_stretch3_voteengine.txt at the end of stretch.txt.
8. In the command line:
./ <stretch.txt  > winner.txt
(winner.txt contains the results for the Condorcet method.)

9. Edited stretch.txt again to change the method to Schulze and recalculated the results, and did the same with the Smith method. The winner under all 3 methods is the same. I pasted the summary of these 3 methods (Schulze and Smith provide a ranked list) in stretch_results.txt.

If it can be done, it can be done with R…

I found the algstat R package, which includes a “condorcet” function, but I couldn’t make it work with the data. I’m not sure how the data needs to be shaped. I’m sure that this can be done in R and the problem is me, in this case. Comments are welcome, and I’ll try to ask a friend whose R skills are better than mine!
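
For the curious, the pairwise tally at the heart of the Condorcet method is small enough to sketch directly (a toy Python illustration, not the code used for the survey; it assumes complete rankings without ties):

from itertools import combinations

def condorcet_winner(options, ballots):
    # ballots: each a list of options, most preferred first
    wins = {}
    for a, b in combinations(options, 2):
        a_over_b = sum(1 for r in ballots if r.index(a) < r.index(b))
        wins[(a, b)] = a_over_b
        wins[(b, a)] = len(ballots) - a_over_b
    for c in options:
        # a Condorcet winner beats every other option head-to-head
        if all(wins[(c, o)] > wins[(o, c)] for o in options if o != c):
            return c
    return None  # no Condorcet winner exists

ballots = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
print(condorcet_winner(["A", "B", "C"], ballots))  # -> A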
And another SaaS

I found one more SaaS tool, with published source code. It would be interesting to deploy a local instance to drive future surveys, but this time I didn’t want to fight with PHP in order to use only the “solver” part, nor install another SaaS on my home server just to find that I need some other dependency or whatever. I’ll keep an eye on this, though, because it looks like a modern and active project.

Finally, devotee

Well, and which software does Debian use for its elections? There is a git repository with devotee that you can clone. I found that although the tool is quite modular, it’s written specifically for the Debian case (votes received by mail, GPG signed, there is a quorum, and other particularities), and I was not sure if I could use it with my data. It is written in Perl, and so I understood it less well than the Python of VoteEngine. Maybe I’ll return to it, though, when I have more time, to try to put our data into the shape of a typical tally.txt file and then see if the module solving the Condorcet winner can work for me.

That’s all, folks! (for now…)

Comments

You can comment on this blog post in this thread.
Filed under: Tools Tagged: data mining, Debian, English, SaaS, statistics

Jose M. Calhariz: New packages for Amanda on the works

18 hours 28 min ago

Because of the upgrade of perl, amanda is currently broken on Debian testing and unstable. The problem is known, and I am working with my sponsor to create new packages that solve it. Please hang on a little longer.

Bits from Debian: "softWaves" will be the default theme for Debian 9

19 hours 19 min ago

The theme "softWaves" by Juliette Taka Belin has been selected as default theme for Debian 9 'stretch'.

After the Debian Desktop Team made the call for proposing themes, a total of twelve choices were submitted, and every Debian contributor was given the opportunity to vote on them in a survey. We received 3,479 responses ranking the different choices, and softWaves was the winner among them.

We'd like to thank all the designers that participated, providing nice wallpapers and artwork for Debian 9, and encourage everybody interested in this area of Debian to join the Design Team. Packaging all of the proposals so they are easily available in Debian is being considered. If you want to help in this effort, or package any other artwork (for example, designed particularly to be accessibility-friendly), please contact the Debian Desktop Team, but hurry up, because the freeze for new packages in the next release of Debian starts on January 5th, 2017.

This is the second time that Debian ships a theme by Juliette Belin, who also created the theme "Lines" that enhances our current stable release, Debian 8. Congratulations, Juliette, and thank you very much for your continued commitment to Debian!

Julian Andres Klode: Introducing DNS66, a host blocker for Android

25 October, 2016 - 23:20

I’m proud (yes, really) to announce DNS66, my host/ad blocker for Android 5.0 and newer. It’s been around since last Thursday on F-Droid, but it never really got a formal announcement.

DNS66 creates a local VPN service on your Android device, and diverts all DNS traffic to it, possibly adding new DNS servers you can configure in its UI. It can use hosts files for blocking whole sets of hosts or you can just give it a domain name to block (or multiple hosts files/hosts). You can also whitelist individual hosts or entire files by adding them to the end of the list. When a host name is looked up, the query goes to the VPN which looks at the packet and responds with NXDOMAIN (non-existing domain) for hosts that are blocked.
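
That last step is simpler than it may sound: a blocked query is answered by echoing the query back with the response flag set and RCODE 3 (NXDOMAIN). A rough sketch of just that transformation (in Python for brevity; DNS66 itself is Java, and real code would validate the packet first):

def nxdomain_response(query: bytes) -> bytes:
    # DNS header: ID (2 bytes), flags (2 bytes), then four 16-bit counts.
    flags_hi = query[2] | 0x80           # set QR=1: this is a response
    flags_lo = (query[3] & 0xF0) | 0x03  # set RCODE=3: NXDOMAIN
    return (query[:2]                    # keep the query ID
            + bytes([flags_hi, flags_lo])
            + query[4:6]                 # keep QDCOUNT
            + b"\x00\x00" * 3            # zero AN/NS/ARCOUNT
            + query[12:])                # echo the question section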

You can find DNS66 here:

F-Droid is the recommended source to install from. DNS66 is licensed under the GNU GPL 3, or (mostly) any later version.

Implementation Notes

DNS66’s core logic is based on another project,  dbrodie/AdBuster, which arguably has the cooler name. I translated that from Kotlin to Java, and cleaned up the implementation a bit:

All work is done in a single thread by using poll() to detect when to read/write stuff. Each DNS request is sent via a new UDP socket, and poll() polls over all UDP sockets, a Device Socket (for the VPN’s tun device) and a pipe (so we can interrupt the poll at any time by closing the pipe).

We literally redirect your DNS servers, meaning that all traffic to your configured DNS server's address is routed to the VPN. The VPN only understands DNS traffic, though, so you might have trouble if your DNS server also happens to serve something else. I plan to change that at some point to emulate multiple DNS servers with fake IPs, but this was a first step to get it working with fallback: Android can now transparently fall back to other DNS servers without having to be aware that they are routed via the VPN.

We also need to deal with timing out queries that we received no answer for: DNS66 stores the query into a LinkedHashMap and overrides the removeEldestEntry() method to remove the eldest entry if it is older than 10 seconds or there are more than 1024 pending queries. This means that it only times out up to one request per new request, but it eventually cleans up fine.
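
The eviction rule is easy to get wrong, so here is a minimal sketch of the idea (in Python rather than DNS66's Java, with made-up names; the real code overrides LinkedHashMap.removeEldestEntry()):

import time
from collections import OrderedDict

MAX_PENDING = 1024
TIMEOUT_S = 10

class PendingQueries:
    # Insertion-ordered map that sheds its eldest entry when it is too
    # old or the map is too large: at most one eviction per insertion.
    def __init__(self):
        self._queries = OrderedDict()

    def add(self, query_id, query):
        self._queries[query_id] = (time.monotonic(), query)
        eldest_id, (ts, _) = next(iter(self._queries.items()))
        if (len(self._queries) > MAX_PENDING
                or time.monotonic() - ts > TIMEOUT_S):
            del self._queries[eldest_id]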


Filed under: Android, Uncategorized

Michal Čihař: New features on Hosted Weblate

25 October, 2016 - 23:00

Today, a new version has been deployed on Hosted Weblate. It brings many long-requested features and enhancements.

Adding a project to your watched list got way simpler; you can now do it on the project page using the watch button.

Another feature which will be liked by project admins is that they can now change project metadata without contacting me. This works on both the project and component level.

And adding some fancy things, there is a new badge showing the status of translations into all languages. This is how it looks for Weblate itself.

As you can see it can get pretty big for projects with many translations, but you get complete picture of the translation status in it.

You can find all these features in the upcoming Weblate 2.9, which should be released next week. The complete list of changes in Weblate 2.9 is described in our documentation.

Filed under: Debian English phpMyAdmin SUSE Weblate

Jaldhar Vyas: Aaargh gcc 5.x You Suck

25 October, 2016 - 13:45


I had to write a quick program today which is going to be run many thousands of times a day, so it has to run fast. I decided to do it in c++ instead of the usual perl or javascript because it seemed appropriate and I've been playing around a lot with c++ lately, trying to update my knowledge of its modern features. So 200 LOC later I was almost done, so I ran the program through valgrind, a good habit I've been trying to instill. That's when I got a reminder of why I avoid c++.

==37698== HEAP SUMMARY:
==37698==     in use at exit: 72,704 bytes in 1 blocks
==37698==   total heap usage: 5 allocs, 4 frees, 84,655 bytes allocated
==37698== LEAK SUMMARY:
==37698==    definitely lost: 0 bytes in 0 blocks
==37698==    indirectly lost: 0 bytes in 0 blocks
==37698==      possibly lost: 0 bytes in 0 blocks
==37698==    still reachable: 72,704 bytes in 1 blocks
==37698==         suppressed: 0 bytes in 0 blocks

One of the things I've learnt, and which I've been trying to apply more rigorously, is to avoid manual memory management (news/deletes) as much as possible in favor of modern c++ features such as std::unique_ptr etc. By my estimation there should only be three places in my code where memory is allocated, and none of them should leak. Where do the others come from? And why is there a missing free (or delete)? Now the good news is that valgrind is saying that the memory is not technically leaking. It is still reachable at exit, but that's ok because the OS will reclaim it. But this program will run a lot and I think it could still lead to problems over time, such as memory fragmentation, so I wanted to understand what was going on. Not to mention the bad aesthetics of it.

My first assumption (one which has served me well over the years) was that I had screwed up somewhere. Or perhaps it could be some behind-the-scenes compiler magic. It turned out to be the latter -- sort of -- as I found out only after two hours of jiggling code in different ways and googling for clues. That's when I found this Stack Overflow question, which suggests that it is either a valgrind bug or a compiler bug. The answer specifically mentions gcc 5.1. I was using Ubuntu LTS, which has gcc 5.4, so I have just gone ahead and assumed all 5.x versions of gcc have this problem. Sure enough, compiling the same program on Debian stable, which has gcc 4.9, gave this...

==6045== HEAP SUMMARY:
==6045==     in use at exit: 0 bytes in 0 blocks
==6045==   total heap usage: 3 allocs, 3 frees, 10,967 bytes allocated
==6045== All heap blocks were freed -- no leaks are possible

...Much better. The executable was substantially smaller too. The time was not a total loss, however: I learned that valgrind is pronounced val-grinned (it's from Norse mythology), not val-grind as I had thought. So I have that going for me, which is nice.

Russ Allbery: Review: Lord of Emperors

25 October, 2016 - 11:04

Review: Lord of Emperors, by Guy Gavriel Kay

Series: Sarantine Mosaic #2
Publisher: Eos
Copyright: 2000
Printing: February 2001
ISBN: 0-06-102002-8
Format: Mass market
Pages: 560

Lord of Emperors is the second half of a work that began with Sailing to Sarantium and is best thought of as a single book split for publishing reasons. You want to read the two together and in order.

As is typical for this sort of two-part work, it's difficult to review the second half without spoilers. I'll be more vague about the plot and the characters than normal, and will mark one bit that's arguably a bit of a spoiler (although I don't think it would affect the enjoyment of the book).

At the end of Sailing to Sarantium, we left Crispin in the great city, oddly and surprisingly entangled with some frighteningly powerful people and some more mundane ones (insofar as anyone is mundane in a Guy Gavriel Kay novel, but more on that in a bit). The opening of Lord of Emperors takes a break from the city to introduce a new people, the Bassanids, and a new character, Rustem of Karakek. While Crispin is still the heart of this story, the thread that binds the entirety of the Sarantine Mosaic together, Rustem is the primary protagonist for much of this book. I had somehow forgotten him completely since my first read of this series many years ago. I have no idea how.

I mentioned in my review of the previous book that one of the joys of reading this series is competence porn: watching the work of someone who is extremely good at what they do, and experiencing vicariously some of the passion and satisfaction they have for their work. Kay's handling of Crispin's mosaics is still the highlight of the series for me, but Rustem's medical practice (and Strumosus, and the chariot races) comes close. Rustem is a brilliant doctor by the standards of the time, utterly frustrated with the incompetence of the Sarantine doctors, but also weaving his own culture's belief in omens and portents into his actions. He's more reserved, more laconic than Crispin, but is another character with focused expertise and a deep internal sense of honor, swept unexpectedly into broader affairs and attempting to navigate them by doing the right thing in each moment. Kay fills this book with people like that, and it's compelling reading.

Rustem's entrance into the city accidentally sets off a complex chain of events that draws together all of the major characters of Sailing to Sarantium and adds a few more. The stakes are no less than war and control of major empires, and here Kay departs firmly from recorded history into his own creation. I had mentioned in the previous review that Justinian and Theodora are the clear inspirations for this story; that remains true, and many other characters are easy to map, but don't expect history to go here the way that it did in our world. Kay's version diverges significantly, and dramatically.

But one of the things I love the most about this book is its focus on the individual acts of courage, empathy, and ethics of each of the characters, even when those acts do not change the course of empires. The palace intrigue happens, and is important, but the individual acts of Kay's large cast get just as much epic narrative attention even if they would never appear in a history book. The most globally significant moment of the book is not the most stirring; that happens slightly earlier, in a chariot race destined to be forgotten by history. And the most touching moment of the book is a moment of connection between two people who would never appear in history, over the life of a third, that matters so much to the reader only because of the careful attention to individual lives and personalities Kay has shown over the course of hundreds of pages.

A minor spoiler follows in the next paragraph, although I don't think it affects the reading of the book.

One brilliant part of Kay's fiction is that he doesn't have many villains, and goes to some lengths to humanize the actions of nearly everyone in the book. But sometimes the author's deep dislike of one particular character shows through, and here it's Pertennius (the clear analogue of Procopius). In a way, one could say the entirety of the Sarantine Mosaic is a rebuttal of the Secret History. But I think Kay's contrast between Crispin's art (and Scortius's, and Strumosus's) and Pertennius's history has a deeper thematic goal. I came away from this book feeling like the Sarantine Mosaic as a whole stands in contrast to a traditional history, stands against a reduction of people to dates and wars and buildings and governments. Crispin's greatest work attempts to capture emotion, awe, and an inner life. The endlessly complex human relationships shown in this book running beneath the political events occasionally surface in dramatic upheavals, but in Kay's telling the ones that stay below the surface are just as important. And while much of the other art shown in this book differs from Crispin's in being inherently ephemeral, it shares that quality of being the art of life, of complexity, of people in dynamic, changing, situational understanding of the world, exercising competence in some area that may or may not be remembered.

Kay raises to the level of epic the bits of history that don't get recorded, and, in his grand and self-conscious fantasy epic style, encourages the reader to feel those just as deeply as the ones that will have later historical significance. The measure of people, their true inner selves, is often shown in moments that Pertennius would dismiss and consider unworthy of recording in his history.

End minor spoiler.

I think Lord of Emperors is the best part of the Sarantine Mosaic duology. It keeps the same deeply enjoyable view of people doing things they are extremely good at while correcting some of the structural issues in the previous book. Kay continues to use a large cast, and continues to cut between viewpoint characters to show each event from multiple angles, but he has a better grasp of timing and order here than in Sailing to Sarantium. I never got confused about the timeline, thanks in part to more frequent and more linear scene cuts. And Lord of Emperors passes, with flying colors, the hardest test of a novel with a huge number of viewpoint characters: when Kay cuts to a new viewpoint, my reaction is almost always "yes, I wanted to see what they were thinking!" and almost never "wait, no, go back!".

My other main complaint about Sailing to Sarantium was the treatment of women, specifically the irresistibility of female sexual allure. Kay thankfully tones that down a lot here. His treatment of women is still a bit odd — one notices that five women seem to all touch the lives of the same men, and little room is left for Platonic friendship between the genders — but they're somewhat less persistently sexualized. And the women get a great deal of agency in this book, and a great deal of narrative respect.

That said, Lord of Emperors is also emotionally brutal. It's beautifully done, and entirely appropriate to the story, and Kay does provide a denouement that takes away a bit of the sting. But it's still very hard to read in spots if you become as invested in the characters and in the world as I do. Kay is writing epic that borders on tragedy, and uses his full capabilities as a writer to make the reader feel it. I love it, but it's not a book that I want to read too often.

As with nearly all Kay, the Sarantine Mosaic as a whole is intentional, deliberate epic writing, wearing its technique on its sleeve and making no apologies. There is constant foreshadowing, constant attempts to draw larger conclusions or reveal great principles of human nature, and a very open, repeated stress on the greatness and importance of events while they're being described. This works for me, but it doesn't work for everyone. If it doesn't work for you, the Sarantine Mosaic is unlikely to change your mind. But if you're in the mood for that type of story, I think this is one of Kay's best, and Lord of Emperors is the best half of the book.

Rating: 10 out of 10

Gunnar Wolf: On the results of vote "gr_private2"

25 October, 2016 - 08:46

Given that I started the GR process, and that I called for discussion and votes, I feel it is somehow my duty to also put a simple wrap-up to this process. Of course, I'll say many things already well known to my fellow Debian people, but non-Debianers read this too.

So, for further context, if you need it, please read my previous blog post, where I was about to send the call for votes. It summarizes the situation and proposals; you will find we had a nice set of messages during September. I have to thank all the involved parties, most especially Ian Jackson, who spent a lot of energy summing up the situation and clarifying the different bits to everyone involved.

So, we held the vote; you may be interested in looking at the detailed vote statistics for the 235 correctly received votes, and most importantly, the results:

First of all, I'll say I'm actually surprised at the results, as I expected Ian's proposal (acknowledge difficulty; I actually voted this proposal as my top option) to win and mine (repeal previous GR) to be last; it turns out the winning option was Iain's (remain private). But all in all, I am happy with the results: as I said during the discussion, I was much disappointed with the results of the previous GR on this topic. And, yes, it seems the breaking point was when many people thought the privacy status of posted messages was in jeopardy; we cannot really compare with what said vote would have produced had we followed the strategy of leaving the original resolution text instead of replacing it, but I believe it would have passed. In fact, one more surprise of this iteration was that I expected Further Discussion to be ranked higher, somewhere among the three explicit options. I am happy, of course, that we got such overwhelming clarity about what the project as a whole prefers.

And what was gained or lost with this whole exercise? Well, if nothing else, we stop lying. For over ten years, we have had an accepted resolution binding us to release the messages sent to debian-private given such-and-such conditions... but we never got around to implementing it. We now know that debian-private will remain private... but we should keep reminding ourselves to use the list as little as possible.

For a project such as Debian, which is often seen as a beacon of doing the right thing no matter what, I feel that being explicit about not lying to ourselves is of great importance. Yes, we have the principle of not hiding our problems, but it has long been argued that the use of this list is not hiding our problems. Private communication can happen whenever you have humans involved, even if administratively we tried to avoid it.

Any of the three running options could have won, and I'd be happy. My #1 didn't win, but my #2 did. And, I am sure, it's for the best of the project as a whole.

Chris Lamb: Concorde

25 October, 2016 - 01:59

Today marks the 13th anniversary of the day the last passenger flight from New York arrived in the UK. Every seat was filled, a feat that had become increasingly rare for a plane that was a technological marvel but a commercial flop…

  • Only 20 aircraft were ever built despite 100 orders, most of them cancelled in the early 1970s.
  • Taxiing to the runway consumed 2 tons of fuel.
  • The white colour scheme was specified to reduce the outer temperature by about 10°C.
  • In a promotional deal with Pepsi, F-BTSD was temporarily painted blue. Due to the change of colour, Air France were advised to remain at Mach 2 for no more than 20 minutes at a time.
  • At supersonic speed the fuselage would heat up and expand by as much as 30cm. The most obvious manifestation of this was a gap that opened up on the flight deck between the flight engineer's console and the bulkhead. On some retiring aircraft's final supersonic flights, the flight engineers placed their caps in this expanded gap, permanently wedging them in place as the gap shrank again.
  • At Concorde's altitude a breach of cabin integrity would result in a loss of pressure so severe that passengers would quickly suffer from hypoxia despite application of emergency oxygen. Concorde was thus built with smaller windows to reduce the rate of loss in such a breach.
  • The high cruising altitude meant passengers received almost twice the amount of radiation as on a conventional long-haul flight. To prevent excessive radiation exposure, the flight deck included a radiometer; if the radiation level became too high, pilots would descend below 45,000 feet.
  • BA's Concorde service had a greater number of passengers who booked a flight and then failed to appear than any other aircraft in the fleet.
  • Market research later in Concorde's life revealed that many customers thought Concorde was more expensive than it actually was. Ticket prices were progressively raised to match these perceptions.
  • The fastest transatlantic airliner flight was from New York JFK to London Heathrow on 7 February 1996 by British Airways' G-BOAD in 2 hours, 52 minutes, 59 seconds from takeoff to touchdown. It was aided by a 175 mph tailwind.

See also: A Rocket to Nowhere.

Reproducible builds folks: Reproducible Builds: week 78 in Stretch cycle

24 October, 2016 - 23:10

What happened in the Reproducible Builds effort between Sunday October 16 and Saturday October 22 2016:

Media coverage

Upcoming events

To enable everyone to rebuild everything reproducibly, we have the concept of .buildinfo files which generally describe the environment used for a particular build, the inputs and the outputs, and, in the Debian case, are available per package/architecture/version tuple. We anticipate the next dpkg upload to sid will create such .buildinfo files by default and while it's clear that we need to teach dak to deal with them (see #763822), it's not actually clear how to handle .buildinfo files after dak has processed them and how to make them available to the world.

To this end, Chris Lamb has started development on a highly proof-of-concept .buildinfo server licensed under the GNU AGPLv3. Source

Reproducible work in other projects
  • Ximin Luo submitted a patch to GCC as a prerequisite for future patches to make debugging symbols reproducible.
Packages reviewed and fixed, and bugs filed

Reviews of unreproducible packages

99 package reviews have been added, 3 have been updated and 6 have been removed in this week, adding to our knowledge about identified issues.

6 issue types have been added:

Weekly QA work

During reproducibility testing, some FTBFS bugs have been detected and reported by:

  • Chris Lamb (23)
  • Daniel Reichelt (2)
  • Lucas Nussbaum (1)
  • Santiago Vila (18)
diffoscope development
  • h01ger increased the diskspace for reproducible content on Jenkins. Thanks to ProfitBricks.
  • Valerie Young supplied a patch to make the Python SQL interface more SQLite/PostgreSQL agnostic.
  • lynxis worked hard to make LEDE and OpenWrt builds happen on two hosts.

Our poll to find a good time for an IRC meeting is still running until Tuesday, October 25th; please reply as soon as possible.

We need a logo! Some ideas and requirements for a Reproducible Builds logo have been documented in the wiki. Contributions very welcome, even if simply by forwarding this information.

This week's edition was written by Chris Lamb & Holger Levsen and reviewed by a bunch of Reproducible Builds folks on IRC.

Russ Allbery: Review: The Design of Everyday Things

24 October, 2016 - 11:17

Review: The Design of Everyday Things, by Don Norman

Publisher: Basic Books
Copyright: 2013
ISBN: 0-465-05065-4
Format: Trade paperback
Pages: 298

There are several editions of this book (the first under a different title, The Psychology of Everyday Things). This review is for the Revised and Expanded Edition, first published in 2013 and quite significantly revised compared to the original. I probably read at least some of the original for a class in human-computer interaction around 1994, but that was long enough ago that I didn't remember any of the details.

I'm not sure how much impact this book has had outside of the computer field, but The Design of Everyday Things is a foundational text of HCI (human-computer interaction) despite the fact that many of its examples and much of its analysis is not specific to computers. Norman's goal is clearly to write a book that's fundamental to the entire field of design; not having studied the field, I don't know if he succeeded, but the impact on computing was certainly immense. This is the sort of book that everyone ends up hearing about, if not necessarily reading, in college. I was looking forward to filling a gap in my general knowledge.

Having now read it cover-to-cover, would I recommend others invest the time? Maybe. But probably not.

There are several things this book does well. One of the most significant is that it builds a lexicon and a set of general principles that provide a way of talking about design issues. Lexicons are not the most compelling reading material (see also Design Patterns), but having a common language is useful. I still remember affordances from college (probably from this book or something else based on it). Norman also adds, and defines, signifiers, constraints, mappings, and feedback, and talks about the human process of building a conceptual model of the objects with which one is interacting.

Even more useful, at least in my opinion, is the discussion of human task-oriented behavior. The seven stages of action is a great systematic way of analyzing how humans perform tasks, where those actions can fail, and how designers can help minimize failure. One thing I particularly like about Norman's presentation here is the emphasis on the feedback cycle after performing a task, or a step in a task. That feedback, and what makes good or poor feedback, is (I think) an underappreciated part of design and something that too often goes missing. I thought Norman was a bit too dismissive of simple beeps as feedback (he thinks they don't carry enough information; while that's not wrong, I think they're far superior to no feedback at all), but the emphasis on this point was much appreciated.

Beyond these dry but useful intellectual frameworks, though, Norman seems to have a larger purpose in The Design of Everyday Things: making a passionate argument for the importance of design and for not tolerating poor design. This is where I think his book goes a bit off the rails.

I can appreciate the boosterism of someone who feels an aspect of creating products is underappreciated and underfunded. But Norman hammers on the unacceptability of bad design to the point of tedium, and seems remarkably intolerant of, and unwilling to confront, the reasons why products may be released with poor designs for their eventual users. Norman clearly wishes that we would all boycott products with poor designs and prize usability above most (all?) other factors in our decisions. Equally clearly, this is not happening, and Norman knows it. He even describes some of the reasons why not, most notably (and most difficultly) the fact that the purchasers of many products are not the eventual users. Stoves are largely sold to builders, not kitchen cooks. Light switches are laid out for the convenience of the electrician; here too, the motive for the builder to spend additional money on better lighting controls is unclear. So much business software is purchased by people who will never use it directly, and may have little or no contact with the people who do. These layers of economic separation result in deep disconnects of incentive structure between product manufacturers and eventual consumers.

Norman acknowledges this, writes about it at some length, and then seems to ignore the point entirely, returning to ranting about the deficiencies of obviously poor design and encouraging people to care more about design. This seems weirdly superficial in this foundational of a book. I came away half-convinced that these disconnects of incentive (and some related problems, such as the unwillingness to invest in proper field research or the elaborate, expensive, and lengthy design process Norman lays out as ideal) are the primary obstacle in the way of better-designed consumer goods. If that's the case, then this is one of the largest, if not the largest, obstacle in the way of doing good design, and I would have expected this foundational of a book to tackle it head-on and provide some guidance for how to fight back against this problem. But Norman largely doesn't.

There is some mention of this in the introduction. Apparently much of the discussion of the practical constraints on product design in the business world was added in this revised edition, and perhaps what I'm seeing is the limitations of attempting to revise an existing text. But that also implies that the original took an even harder line against poor design. Throughout, Norman is remarkably high-handed in his dismissal of bad design, focusing more on condemnation than on an investigation of why bad design might happen and what we, as readers, can learn from that process to avoid repeating it. Norman does provide extensive analysis of the design process and the psychology of human interaction, but still left me with the impression that he believes most design failures stem from laziness and stupidity. The negativity and frustration got a bit tedious by the middle of the book.

There's quite a lot here that someone working in design, particularly interface design, should be at least somewhat familiar with: affordances, signifiers, the importance of feedback, the psychological model of tasks and actions, and the classification of errors, just to name a few. However, I'm not sure this book is the best medium for learning those things. I found it a bit tedious, a bit too arrogant, and weirdly unconcerned with feasible solutions to the challenge of mismatched incentives. I also didn't learn that much from it; while the concepts here are quite important, most of them I'd picked up by osmosis from working in the computing field for twenty years.

In that way, The Design of Everyday Things reminded me a great deal of the Gang of Four's Design Patterns, even though it's a more readable book and less of an exercise in academic classification. The concepts presented are useful and important, but I'm not sure I can recommend the book as a book. It may be better to pick up the same concepts as you go, with the help of Internet searches and shorter essays.

Rating: 6 out of 10

Dirk Eddelbuettel: World Marathon Majors: Five Star Finisher!

24 October, 2016 - 09:41

A little over eight years ago, I wrote a short blog post which somewhat dryly noted that I had completed the five marathons constituting the World Marathon Majors. I had completed Boston, Chicago and New York during 2007, adding London and then Berlin (with a personal best) in 2008. The World Marathon Majors existed then, but I was not aware of a website. The organisation was aiming to raise the profile of the professional and very high-end aspect of the sport. But marathoning is funny as they let somewhat regular folks like you and me into the same race. And I always wondered if someone kept track of regular folks completing the suite...

I have been running a little less the last few years, though I did get around to completing the Illinois Marathon earlier this year (I only tweeted about it and still have not added anything to the running section of my blog). But two weeks ago, I was once again handing out water cups at the Chicago Marathon, sending along two tweets when the elite wheelchair and elite male runners flew by. To the first, the World Marathon Majors account replied, which led me to their website, which in turn led me to the Five Star Finisher page, and the newer / larger Six Star Finisher page now that Tokyo has been added.

And in short, one can now request one's record to be added (if they check out). So I did. And now I am on the Five Star Finisher page!

I don't think I'll ever surpass that as a runner. The table header and my row look like this:

If only my fifth / sixth grade physical education teacher could see that---he was one of those early running nuts from the 1970s and made us run towards / around this (by now enlarged) pond and boy did I hate that :) Guess it did have some long lasting effects. And I casually circled the lake a few years ago, starting much further away from my parents place. Once you are in the groove for distance...

But leaving that aside, running has been fun, and with some luck I may have another one or two marathons or Ragnar Relays left. The only really bad part about this is that I may have to get myself to Tokyo after all (for something that is not an ISM workshop) ...

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Daniel Silverstone: Gitano - Approaching Release - Deprecated commands

24 October, 2016 - 09:24

As mentioned previously I am working toward getting Gitano into Stretch. Last time we spoke about lace, on which a colleague and friend of mine (Richard Maw) did a large pile of work. This time I'm going to discuss deprecation approaches and building more capability out of fewer features.

First, a little background -- Gitano is written in Lua which is a deliberately small language whose authors spend more time thinking about what they can remove from the language spec than they do what they could add in. I first came to Lua in the 3.2 days, a little before 4.0 came out. (The authors provide a lovely timeline in case you're interested.) With each of the releases of Lua which came after 3.2, I was struck with how the authors looked to take a number of features which the language had, and collapse them into more generic, more powerful, smaller, fewer features.

This approach to design stuck with me over the subsequent decade, and when I began Gitano I tried to have the smallest number of core features/behaviours, from which could grow the power and complexity I desired. Gitano is, at its core, a set of files in a single format (clod) stored in a consistent manner (Git) which mediate access to a resource (Git repositories). Some of those files result in emergent properties such as the concept of the 'owner' of a repository (though that can simply be considered the value of the project.owner property for the repository). Indeed the concept of the owner of a repository is a fiction generated by the ACL system with a very small amount of collusion from the core of Gitano. Yet until recently Gitano had a first class command set-owner which would alter that one configuration value.

[gitano]  set-description ---- Set the repo's short description (Takes a repo)
[gitano]         set-head ---- Set the repo's HEAD symbolic reference (Takes a repo)
[gitano]        set-owner ---- Sets the owner of a repository (Takes a repo)

Those of you with Gitano installations may see the above if you ask it for help. Yet you'll also likely see:

[gitano]           config ---- View and change configuration for a repository (Takes a repo)

The config command gives you access to the repository configuration file (which, yes, you could access over git instead, but the config command can be delegated in a more fine-grained fashion without having to write hooks). Given the config command has all the functionality of the three specific set-* commands shown above, it was time to remove the specific commands.


If you have automation which uses the set-description, set-head, or set-owner commands, you will want to switch to the config command before you migrate your server to the current or any future version of Gitano.

In brief, where you had:

ssh git@gitserver set-FOO repo something

You now need:

ssh git@gitserver config repo set project.FOO something

It looks a little more wordy but it is consistent with the other features that are keyed from the project configuration, such as:

ssh git@gitserver config repo set cgitrc.section Fooble Section Name

And, of course, you can see what configuration is present with:

ssh git@gitserver config repo show

Or look at a specific value with:

ssh git@gitserver config repo show specific.key

As always, you can get more detailed (if somewhat cryptic) help with:

ssh git@gitserver help config

Next time I'll try and touch on the new PGP/GPG integration support.

Francois Marier: Tweaking Referrers For Privacy in Firefox

24 October, 2016 - 07:00

The Referer header has been a part of the web for a long time. Websites rely on it for a few different purposes (e.g. analytics, ads, CSRF protection) but it can be quite problematic from a privacy perspective.

Thankfully, there are now tools in Firefox to help users and developers mitigate some of these problems.


In a nutshell, the browser adds a Referer header to all outgoing HTTP requests, revealing to the server on the other end the URL of the page you were on when you placed the request. For example, it tells the server where you were when you followed a link to that site, or what page you were on when you requested an image or a script. There are, however, a few limitations to this simplified explanation.
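As a concrete illustration (the hostnames here are hypothetical), when a page on news.example embeds a third-party script, the browser's request for that script might look like this on the wire:

GET /tracker.js HTTP/1.1
Host: ads.example
Referer: https://news.example/articles/private-topic?q=secret

The Referer line is what tells ads.example exactly which page, including its query string, triggered the request.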

First of all, by default, browsers won't send a referrer if you place a request from an HTTPS page to an HTTP page. This would reveal potentially confidential information (such as the URL path and query string which could contain session tokens or other secret identifiers) from a secure page over an insecure HTTP channel. Firefox will however include a Referer header in HTTPS to HTTPS transitions unless network.http.sendSecureXSiteReferrer (removed in Firefox 52) is set to false in about:config.

Secondly, using the new Referrer Policy specification web developers can override the default behaviour for their pages, including on a per-element basis. This can be used both to increase or reduce the amount of information present in the referrer.

Legitimate Uses

Because the Referer header has been around for so long, a number of techniques rely on it.

Armed with the Referer information, analytics tools can figure out:

  • where website traffic comes from, and
  • how users are navigating the site.

Another place where the Referer is useful is as a mitigation against cross-site request forgeries. In that case, a website receiving a form submission can reject that form submission if the request originated from a different website.

It's worth pointing out that this CSRF mitigation might be better implemented via a separate header that could be restricted to particularly dangerous requests (i.e. POST and DELETE requests) and only include the information required for that security check (i.e. the origin).
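As a rough sketch of that idea (my own illustration, not any particular framework's API; "https://example.com" stands in for the site's real origin), such a check could look like:

import java.util.Optional;

// Illustrative sketch of an Origin-based CSRF check restricted to
// state-changing requests.
final class CsrfCheck {
    private static final String EXPECTED_ORIGIN = "https://example.com";

    static boolean isAllowed(String method, Optional<String> originHeader) {
        // Only guard dangerous, state-changing methods.
        if (!method.equals("POST") && !method.equals("DELETE")) {
            return true;
        }
        // Reject if the header is absent or names another site.
        return originHeader.map(EXPECTED_ORIGIN::equals).orElse(false);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("POST", Optional.of("https://example.com"))); // true
        System.out.println(isAllowed("POST", Optional.of("https://evil.test")));   // false
        System.out.println(isAllowed("GET", Optional.empty()));                    // true
    }
}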

Problems with the Referrer

Unfortunately, this header also creates significant privacy and security concerns.

The most obvious one is that it leaks part of your browsing history to sites you visit as well as all of the resources they pull in (e.g. ads and third-party scripts). It can be quite complicated to fix these leaks in a cross-browser way.

These leaks can also expose private, personally-identifiable information when it is part of the query string. One of the most high-profile examples is the accidental leakage of user searches by

Solutions for Firefox Users

While web developers can use the new mechanisms exposed through the Referrer Policy, Firefox users can also take steps to limit the amount of information they send to websites, advertisers and trackers.

In addition to enabling Firefox's built-in tracking protection by setting privacy.trackingprotection.enabled to true in about:config, which will prevent all network connections to known trackers, users can control when the Referer header is sent by setting network.http.sendRefererHeader to:

  • 0 to never send the header
  • 1 to send the header only when clicking on links and similar elements
  • 2 (default) to send the header on all requests (e.g. images, links, etc.)

It's also possible to put a limit on the maximum amount of information that the header will contain by setting network.http.referer.trimmingPolicy to one of the following values (a small sketch after this list illustrates each level):

  • 0 (default) to send the full URL
  • 1 to send the URL without its query string
  • 2 to only send the scheme, host and port

or using the network.http.referer.XOriginTrimmingPolicy option (added in Firefox 52) to only restrict the contents of referrers attached to cross-origin requests.
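To make those trimming levels concrete, here is a small sketch of what each value keeps from a referrer URL; this is my own illustration of the policy semantics, not Firefox's implementation:

import java.net.URI;

// Illustrative only: 0 = full URL, 1 = URL without the query string,
// 2 = scheme, host and port.
final class ReferrerTrim {
    static String trim(String referrer, int policy) {
        URI uri = URI.create(referrer);
        switch (policy) {
            case 2:  // scheme, host and port only
                return uri.getScheme() + "://" + uri.getAuthority() + "/";
            case 1:  // drop the query string, keep the path
                return uri.getScheme() + "://" + uri.getAuthority() + uri.getPath();
            default: // 0: send the full URL
                return referrer;
        }
    }

    public static void main(String[] args) {
        String url = "https://example.com/search?q=secret";
        System.out.println(trim(url, 0)); // https://example.com/search?q=secret
        System.out.println(trim(url, 1)); // https://example.com/search
        System.out.println(trim(url, 2)); // https://example.com/
    }
}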

Site owners can opt to share less information with other sites, but they can't share any more than what the user trimming policies allow.

Another approach is to disable the Referer when doing cross-origin requests (from one site to another). The network.http.referer.XOriginPolicy preference can be set to:

  • 0 (default) to send the referrer in all cases
  • 1 to send a referrer only when the base domains are the same
  • 2 to send a referrer only when the full hostnames match

If you try to remove all referrers (i.e. network.http.sendRefererHeader = 0), you will most likely run into problems on a number of sites, for example:

The first two have been worked around successfully by setting network.http.referer.spoofSource to true, an advanced setting which always sends the destination URL as the referrer, thereby not leaking anything about the original page.

Unfortunately, the last two are examples of the kind of breakage that can only be fixed through a whitelist (an approach supported by the smart referer add-on) or by temporarily using a different browser profile.

My Recommended Settings

As with my cookie recommendations, I recommend strengthening your referrer settings, but not disabling (or spoofing) the referrer entirely.

While spoofing does solve many of the breakage problems mentioned above, it also effectively disables the anti-CSRF protections that some sites rely on and that have tangible user benefits. A better approach is to limit the amount of information that leaks through cross-origin requests.

If you are willing to live with some amount of breakage, you can simply restrict referrers to the same site by setting:

network.http.referer.XOriginPolicy = 2

or to sites which belong to the same organization (i.e. same ETLD/public suffix) using:

network.http.referer.XOriginPolicy = 1

This prevents leaks to third parties while giving websites all of the information that they can already see in their own server logs.

On the other hand, if you prefer a weaker but more compatible solution, you can trim cross-origin referrers down to just the scheme, hostname and port:

network.http.referer.XOriginTrimmingPolicy = 2

I have not yet found user-visible breakage using this last configuration. Let me know if you find any!

Carl Chenet: PyMoneroWallet: the Python library for the Monero wallet

24 October, 2016 - 05:00

Do you know the Monero cryptocurrency? It's a cryptocurrency, like Bitcoin, focused on security, privacy and untraceability. It's a great project, launched in 2014 and today traded as XMR on all the cryptocurrency exchange platforms (like Kraken or Poloniex).

So what’s new? In order to work with a Monero wallet from Python applications, I just wrote a Python library to use the Monero wallet: PyMoneroWallet.

Using PyMoneroWallet is as easy as:

$ python3
>>> from monerowallet import MoneroWallet
>>> mw = MoneroWallet()
>>> mw.getbalance()
{'unlocked_balance': 2262265030000, 'balance': 2262265030000}

Lots of features are included; you should have a look at the documentation of the monerowallet module to know them all, but here are some of them:

And so on. Have a look at the complete documentation for the full list of available functions.

Feel free to contribute to this young project to help spread the use of Monero by using the PyMoneroWallet project in your Python applications.

