Planet Debian

Planet Debian - https://planet.debian.org/

Antoine Beaupré: Tip of the day: batch PDF conversion with LibreOffice

8 October, 2019 - 23:28

Someone asked me today why they couldn't write on the DOCX document they received from a student using the pen in their Onyx Note Pro reader. The answer, of course, is that while the Onyx can read those files, it can't annotate them: that only works with PDFs.

The next question, then, is of course: do I really need to open each file separately and save it as PDF? That's going to take forever, I have 30 students per class!

Fear not, shell scripting and headless mode fly in to the rescue!

As it turns out, one of the LibreOffice parameters allows you to run batch operations on files. By calling:

libreoffice --headless --convert-to pdf *.docx

LibreOffice will happily convert all the *.docx files in the current directory to PDF. But because navigating the commandline can be hard, I figured I could push this a tiny little bit further and wrote the following script:

#!/bin/sh
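# Nautilus passes the selected files as arguments; "$@" forwards them all to LibreOffice.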

exec libreoffice --headless --convert-to pdf "$@"

Drop this in ~/.local/share/nautilus/scripts/libreoffice-pdf, mark it executable, and voilà! You can batch-convert basically any text file (or anything supported by LibreOffice, really) into PDF.

Now I wonder if this would be a useful addition to the Debian package, anyone?

Steve Kemp: A blog overhaul

8 October, 2019 - 22:00

When this post becomes public I'll have successfully redeployed my blog!

My blog originally started in 2005 as a WordPress installation; at some point I used Mephisto, and then I wrote my own solution.

My project was pretty cool; I'd parse a directory of text-files, one file for each post, and insert them into an SQLite database. From there I'd initiate a series of plugins, each one generating something specific (a rough sketch follows the list):

  • One plugin would output an archive page.
  • Another would generate a tag cloud.
  • Yet another would generate the actual search-results for a particular month/year, or tag-name.
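
For illustration, here is a minimal sketch of that kind of pipeline in Python; the file layout, schema and plugin here are my own assumptions, not Steve's actual code:

import glob
import sqlite3

# Load one-post-per-file text files into an in-memory SQLite database,
# then hand the database to a series of plugins, one per output type.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (path TEXT, title TEXT, body TEXT)")

for path in sorted(glob.glob("posts/*.txt")):
    with open(path) as fh:
        title = fh.readline().strip()
        body = fh.read()
    conn.execute("INSERT INTO posts VALUES (?, ?, ?)", (path, title, body))

def archive_plugin(db):
    # One plugin: emit an archive page listing every post title.
    titles = [row[0] for row in db.execute("SELECT title FROM posts ORDER BY path")]
    with open("archive.html", "w") as out:
        out.write("<ul>%s</ul>" % "".join("<li>%s</li>" % t for t in titles))

for plugin in (archive_plugin,):
    plugin(conn)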

All in all the solution was flexible, and it wasn't too slow because finding posts via the SQLite database was pretty quick.

Anyway, I've come to realize that the freedom and architecture were overkill. I don't need to do fancy presentation, and I don't need a loosely-coupled set of plugins.

So now I have a simpler solution which uses my existing template, uses my existing posts - with only a few cleanups - and generates the site from scratch, including all the comments, in less than 2 seconds.

After running make clean, a complete rebuild via make upload (which deploys the generated site to the remote host via rsync) takes 6 seconds.

I've lost the ability to be flexible in some areas, but I've gained all the speed. The old project took somewhere between 20 and 60 seconds to build, depending on what had changed.

In terms of simplifying my life I've dropped the remote installation of a site-search which means I can now host this site on a static site with only a single handler to receive any post-comments. (I was 50/50 on keeping comments. I didn't want to lose those I'd already received, and I do often find valuable and interesting contributions from readers, but being 100% static had its appeal too. I guess they stay for the next few years!)

Jonas Meurer: debian lts report 2019.09

8 October, 2019 - 21:34
Debian LTS report for September 2019

This month I was allocated 10 hours and carried over 9.5 hours from August. Unfortunately, again I didn't find much time to work on LTS issues, partially because I was travelling. I spent 5 hours on the task listed below. That means that I carry over 14.5 hours to October.


Jamie McClelland: Editing video without a GUI? Really?

8 October, 2019 - 20:19

It seems counterintuitive - if ever there was a program in need of a graphical user interface, it's a non-linear video editing program.

However, as part of the May First board elections, I discovered otherwise.

We asked each board candidate to submit a 1-2 minute video introduction about why they want to be on the board. My job was to connect them all into a single video.

I had an unrealistic thought that I could find some simple tool to concatenate them all together (like mkvmerge), but I soon realized that this approach requires that everyone use the exact same format, codec, bit rate, sample rate and blah blah blah.

I soon realized that I needed to actually make a video, not compile one. I create videos so infrequently that I often forget the name of the video editing software I used last time, so it takes some searching. This time I found that I had openshot-qt installed, but when I tried to run it, I got a backtrace (which someone else has already reported).

I considered looking for another GUI editor, but I wasn't that interested in learning what might be a complicated user interface when what I need is so simple.

So I kept searching and found melt. Wow.

I ran:

melt originals/* -consumer avformat:all.webm acodec=libopus vcodec=libvpx

And a while later I had a video. Impressive. It handled people who submitted their videos in portrait mode on their cell phones in mp4 as well as webcam submissions using webm/vp9 in landscape mode.

Thank you melt developers!

Jonathan McDowell: Ada Lovelace Day: 5 Amazing Women in Tech

8 October, 2019 - 14:00

It’s Ada Lovelace Day and I’ve been lax in previous years about celebrating some of the talented women in technology I know or follow on the interwebs. So, to make up for it, here are 5 amazing technologists.

Allison Randal

I was initially aware of Allison through her work on Perl, was vaguely aware of the fact she was working on Ubuntu, briefly overlapped with her at HPE (and thought it was impressive HP were hiring such high-calibre Free Software folk) when she was working on OpenStack, and have had the pleasure of meeting her in person due to the fact we both work on Debian. In the continuing theme of being able to do all things tech she’s currently studying for a PhD at Cambridge (the real one), and has already written a fascinating paper about the security misconceptions around virtual machines and containers. She’s also been doing things with home automation, properly, with local speech recognition rather than relying on any external assistant service (I will, eventually, find the time to follow her advice and try this out for myself).

Alyssa Rosenzweig

Graphics are one of the many things I just can’t do. I’m not artistic and I’m in awe of anyone who is capable of wrangling bits to make computers do graphical magic. People who can reverse engineer graphics hardware that would otherwise only be supported by icky binary blobs impress me even more. Alyssa is such a person, working on the Panfrost driver for ARM’s Mali Midgard + Bifrost GPUs. The lack of a Free driver stack for this hardware is a real problem for the ARM ecosystem and she has been tirelessly working to bring this to many ARM based platforms. I was delighted when I saw one of my favourite Free Software consultancies, Collabora, had given her an internship over the summer. (Selfishly I’m hoping it means the Gemini PDA will eventually be able to run an upstream kernel with accelerated graphics.)

Angie McKeown

The first time I saw Angie talk it was about the user experience of Virtual Reality, and how it had an entirely different set of guidelines to conventional user interfaces. In particular the premise of not trying to shock or surprise the user while they’re in what can be a very immersive environment. Obvious once someone explains it to you! Turns out she was also involved in the early days of custom PC builds and internet cafes in Northern Ireland, and has interesting stories to tell. These days she’s concentrating on cyber security - I have her to thank for convincing me to persevere with Ghidra - having looked at Bluetooth security as part of her Master’s. She’s also deeply aware of the implications of the GDPR and has done some interesting work on thinking about how it affects the computer gaming industry - both from the perspective of the author, and the player.

Claire Wilgar

I’m not particularly fond of modern web design. That’s unfair of me, but web designers seem happy to load megabytes of Javascript from all over the internet just to display the most basic of holding pages. Indeed it seems that such things now require all the includes rather than being simply a matter of HTML, CSS and some graphics, all from the same server. Claire talked at Women Techmakers Belfast about moving away from all of this bloat and back to a minimalistic approach with improved performance, responsiveness and usability, without sacrificing functionality or presentation. She said all the things I want to say to web designers, but from a position of authority, being a front end developer as her day job. It’s great to see someone passionate about front-end development who wants to do things the right way, and talks about it in a way that even people without direct experience of the technologies involved (like me) can understand and appreciate.

Karen Sandler

There aren’t enough people out there who understand law and technology well. Karen is one of the few I’ve encountered who do, and not only that, but really, really gets Free software and the impact of the four freedoms on users in a way many pure technologists do not. She’s had a successful legal career that’s transitioned into being the general counsel for the Software Freedom Law Center, been the executive director of GNOME and is now the executive director of the Software Freedom Conservancy. As someone who likes to think he knows a little bit about law and technology I found Karen’s wealth of knowledge and eloquence slightly intimidating the first time I saw her speak (I think at some event in San Francisco), but I’ve subsequently (gratefully) discovered she has an incredible amount of patience (and ability) when trying to explain the nuances of free software legal issues.

Julien Danjou: Python and fast HTTP clients

7 October, 2019 - 16:30

Nowadays, it is more than likely that you will have to write an HTTP client for your application that will have to talk to another HTTP server. The ubiquity of REST APIs makes HTTP a first-class citizen. That's why knowing optimization patterns is a prerequisite.

There are many HTTP clients in Python; the most widely used and easiest to work with is requests. It is the de facto standard nowadays.

Persistent Connections

The first optimization to take into account is the use of a persistent connection to the Web server. Persistent connections have been standard since HTTP 1.1, though many applications do not leverage them. This lack of optimization is simple to explain if you know that when using requests in its simple mode (e.g. with the get function) the connection is closed on return. To avoid that, an application needs to use a Session object that allows reusing an already opened connection.

import requests

session = requests.Session()
session.get("http://example.com")
# Connection is re-used
session.get("http://example.com")
Using Session with requests

Each connection is stored in a pool of connections (10 by default), the size of
which is also configurable:

import requests


session = requests.Session()
adapter = requests.adapters.HTTPAdapter(
    pool_connections=100,
    pool_maxsize=100)
session.mount('http://', adapter)
response = session.get("http://example.org")
Changing pool size

Reusing the TCP connection to send out several HTTP requests offers a number of performance advantages:

  • Lower CPU and memory usage (fewer connections opened simultaneously).
  • Reduced latency in subsequent requests (no TCP handshaking).
  • Exceptions can be raised without the penalty of closing the TCP connection.

The HTTP protocol also provides pipelining, which allows sending several requests on the same connection without waiting for the replies to come (think batch). Unfortunately, this is not supported by the requests library. However, pipelining requests may not be as fast as sending them in parallel. Indeed, the HTTP 1.1 protocol forces the replies to be sent in the same order as the requests were sent – first-in first-out.

Parallelism

requests also has one major drawback: it is synchronous. Calling requests.get("http://example.org") blocks the program until the HTTP server replies completely. Having the application wait and do nothing is a drawback here: the program could be doing something else rather than sitting idle.

A smart application can mitigate this problem by using a pool of threads, like the ones provided by concurrent.futures. It allows parallelizing the HTTP requests very easily.

from concurrent import futures

import requests


with futures.ThreadPoolExecutor(max_workers=4) as executor:
    # Submit requests.get directly; a lambda wrapper is not needed.
    future_list = [
        executor.submit(requests.get, "http://example.org")
        for _ in range(8)
    ]

# Leaving the "with" block waits for all submitted futures to complete.
results = [
    f.result().status_code
    for f in future_list
]

print("Results: %s" % results)
Using futures with requests

This pattern being quite useful, it has been packaged into a library named requests-futures. The usage of Session objects is made transparent to the developer:

from requests_futures import sessions


session = sessions.FuturesSession()

futures = [
    session.get("http://example.org")
    for _ in range(8)
]

results = [
    f.result().status_code
    for f in futures
]

print("Results: %s" % results)
Using requests-futures

By default, an executor with two worker threads is created, but a program can easily customize this value by passing the max_workers argument or even its own executor to the FuturesSession object – for example like this: FuturesSession(executor=ThreadPoolExecutor(max_workers=10)).
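
As a quick illustration of that last call (a sketch, not code taken from the requests-futures documentation):

from concurrent.futures import ThreadPoolExecutor

from requests_futures import sessions


# A FuturesSession backed by a larger thread pool than the default two workers.
session = sessions.FuturesSession(executor=ThreadPoolExecutor(max_workers=10))

futures = [session.get("http://example.org") for _ in range(10)]
print("Results: %s" % [f.result().status_code for f in futures])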

Asynchronicity

As explained earlier, requests is entirely synchronous. That blocks the application while waiting for the server to reply, slowing down the program. Making HTTP requests in threads is one solution, but threads do have their own overhead and this implies parallelism, which is not something everyone is always glad to see in a program.

Starting with version 3.5, Python offers asynchronicity at its core through asyncio. The aiohttp library provides an asynchronous HTTP client built on top of asyncio. This library allows sending requests in series but without waiting for the first reply to come back before sending the next one. In contrast to HTTP pipelining, aiohttp sends the requests over multiple connections in parallel, avoiding the ordering issue explained earlier.

import aiohttp
import asyncio


async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return response


loop = asyncio.get_event_loop()

coroutines = [get("http://example.com") for _ in range(8)]

results = loop.run_until_complete(asyncio.gather(*coroutines))

print("Results: %s" % results)
Using aiohttp

All those solutions (using Session, threads, futures or asyncio) offer different approaches to making HTTP clients faster.

Performance

The snippet below is an HTTP client sending requests to httpbin.org, an HTTP API that provides (among other things) an endpoint simulating a long request (a second here). This example implements all the techniques listed above and times them.

import contextlib
import time

import aiohttp
import asyncio
import requests
from requests_futures import sessions

URL = "http://httpbin.org/delay/1"
TRIES = 10


@contextlib.contextmanager
def report_time(test):
    t0 = time.time()
    yield
    print("Time needed for `%s' called: %.2fs"
          % (test, time.time() - t0))


with report_time("serialized"):
    for i in range(TRIES):
        requests.get(URL)


session = requests.Session()
with report_time("Session"):
    for i in range(TRIES):
        session.get(URL)


session = sessions.FuturesSession(max_workers=2)
with report_time("FuturesSession w/ 2 workers"):
    futures = [session.get(URL)
               for i in range(TRIES)]
    for f in futures:
        f.result()


session = sessions.FuturesSession(max_workers=TRIES)
with report_time("FuturesSession w/ max workers"):
    futures = [session.get(URL)
               for i in range(TRIES)]
    for f in futures:
        f.result()


async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            await response.read()

loop = asyncio.get_event_loop()
with report_time("aiohttp"):
    loop.run_until_complete(
        asyncio.gather(*[get(URL)
                         for i in range(TRIES)]))
Program to compare the performance of the different approaches

Running this program gives the following output:

Time needed for `serialized' called: 12.12s
Time needed for `Session' called: 11.22s
Time needed for `FuturesSession w/ 2 workers' called: 5.65s
Time needed for `FuturesSession w/ max workers' called: 1.25s
Time needed for `aiohttp' called: 1.19s

Unsurprisingly, the slowest result comes from the naive serialized version, since all the requests are made one after another without reusing the connection: 12 seconds to make 10 requests.

Using a Session object and therefore reusing the connection means saving 8% in terms of time, which is already a big and easy win. At a minimum, you should always use a Session.

If your system and program allow the usage of threads, it is a good call to use them to parallelize the requests. However, threads have some overhead and they are not weightless: they need to be created, started and then joined.

Unless you are still using an old version of Python, using aiohttp is without a doubt the way to go nowadays if you want to write a fast and asynchronous HTTP client. It is the fastest and most scalable solution, as it can handle hundreds of parallel requests. The alternative, managing hundreds of threads in parallel, is not a great option.
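
One caveat worth noting: the aiohttp snippets above open a new ClientSession per request for brevity. A single session shared across requests reuses the underlying connection pool and scales better; the sketch below is an illustration of that idea, not code from the original article.

import asyncio

import aiohttp


async def fetch_all(url, count):
    # Share one ClientSession (and its connection pool) across all requests.
    async with aiohttp.ClientSession() as session:
        async def get(u):
            async with session.get(u) as response:
                return response.status
        return await asyncio.gather(*[get(url) for _ in range(count)])

loop = asyncio.get_event_loop()
print("Results: %s" % loop.run_until_complete(fetch_all("http://example.org", 8)))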

Streaming

Another speed optimization that can be efficient is streaming the requests. When making a request, by default the body of the response is downloaded immediately. The stream parameter provided by the requests library and the content attribute of aiohttp both provide a way to avoid loading the full content into memory as soon as the request is executed.

import requests


# Use `with` to make sure the response stream is closed and the connection can
# be returned back to the pool.
with requests.get('http://example.org', stream=True) as r:
    print(list(r.iter_content()))
Streaming with requests
import aiohttp
import asyncio


async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.content.read()

loop = asyncio.get_event_loop()
tasks = [asyncio.ensure_future(get("http://example.com"))]
loop.run_until_complete(asyncio.wait(tasks))
print("Results: %s" % [task.result() for task in tasks])
Streaming with aiohttp

Not loading the full content is extremely important in order to avoid allocating potentially hundreds of megabytes of memory for nothing. If your program does not need to access the entire content as a whole but can work on chunks, it is probably better to just use those methods. For example, if you're going to save and write the content to a file, reading only a chunk and writing it at the same time is going to be much more memory efficient than reading the whole HTTP body, allocating a giant pile of memory, and then writing it to disk.
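
For example, here is a minimal sketch of that save-to-disk pattern with requests (the URL and file name are placeholders):

import requests


# Stream the body to disk chunk by chunk instead of holding it all in memory.
with requests.get("http://example.org/big-file", stream=True) as r:
    r.raise_for_status()
    with open("big-file", "wb") as out:
        for chunk in r.iter_content(chunk_size=65536):
            out.write(chunk)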

I hope that'll make it easier for you to write proper HTTP clients and requests. If you know any other useful technique or method, feel free to write it down in the comment section below!

Antoine Beaupré: This is why native apps matter

7 October, 2019 - 07:15

I was just watching a web stream on YouTube today and wondering why my CPU was so busy. So I fired up top and saw that my web browser (Firefox) took up around 70% of a CPU to play the stream.

I thought, "this must be some high resolution crazy stream! how modern! such wow!" Then I thought, wait, this is the web, there must be something insane going on.

So I did a little experiment: I started chromium --temp-profile on the stream, alongside vlc (which can also play Youtube streams!). Then I took a snapshot of the top(1) command after 5 minutes. Here are the results:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
16332 anarcat   20   0 1805160 269684 102660 S  60,2   1,7   3:34.96 chromium
16288 anarcat   20   0  974872 119752  87532 S  33,2   0,7   1:47.51 chromium
16410 anarcat   20   0 2321152 176668  80808 S  22,0   1,1   1:15.83 vlc
 6641 anarcat   20   0   21,1g 520060 137580 S  13,8   3,2  55:36.70 x-www-browser
16292 anarcat   20   0  940340  83980  67080 S  13,2   0,5   0:41.28 chromium
 1656 anarcat   20   0 1970912  18736  14576 S  10,9   0,1   4:47.08 pulseaudio
 2256 anarcat   20   0  435696  93468  78120 S   7,6   0,6  16:03.57 Xorg
16262 anarcat   20   0 3240272 165664 127328 S   6,2   1,0   0:31.06 chromium
  920 message+  20   0   11052   5104   2948 S   1,3   0,0   2:43.37 dbus-daemon
17915 anarcat   20   0   16664   4164   3276 R   1,3   0,0   0:02.07 top

To deconstruct this, you can see my Firefox process (masquerading as x-www-browser), which has been running for a long time. It's taken 55 hours of CPU time, but let's ignore that for now as it's not in the benchmark. What I find fascinating is that there are at least 4 chromium processes running here, and they collectively take up over 7 minutes of CPU time.

Compare this to a little over one (1!!!11!!!) minute of CPU time for VLC, and you realize why people are so ranty about everything being packaged as web apps these days. It's basically using up an order of magnitude more processing power (and therefore electric power and slave labor) to watch those silly movies in your web browser than in a proper video player.

Keep that in mind next time you let Youtube go on a "autoplay Donald Drumpf" playlist...

Iustin Pop: IronBike Einsiedeln 2019

7 October, 2019 - 04:00

The previous race I did at the end of August was soo much fun—I forgot how much fun this is—that I did the unthinkable and registered for the IronBike as well, at a short (but not the shortest) distance.

Why unthinkable? Last time I was here, three years ago, I apparently told my wife I would never, ever go to this race again, since it’s not a fun one. Steep up, steep down, or flat, but no flowing sections.

Well, it seems I forgot I said that; I only remembered “it’s a hard race”, so I registered for the 53km distance, where there are no age categories. Just “Herren Fun” ☺

September

The month of September was a pretty good one, overall, and I managed to even lose 2kg (of fat, I’m 100% sure), and continued to increase my overall fitness - 23→34 CTL in Training Peaks.

I also did the Strava Escape challenge and the Sufferfest Yoga challenge, so core fitness improved as well. And I was more rested: Training Peaks form was -8, up from -18 for previous race.

Overall, I felt reasonably confident to complete (compete?) the shorter distance; official distance 53km, 1’400m altitude.

Race (day)

The race for this distance starts very late (around half past ten), so it was an easy morning, breakfast, drive to place, park, get dressed, etc.

The weather was actually better than I feared, it was plain shorts and t-shirt, no need for arm warmers, or anything like that. And it got sunnier as the race went on, which reminded me why on my race prep list it says “Sunscreen (always)”, and not just sunscreen.

And yes, this race as well, even at this “fun” distance, started very strong. The first 8.7km are flat (total altitude gain 28m, i.e. nothing), and I did them at an average speed of 35.5km/h! Again, this is on a full-suspension mountain bike, with thick tires. I actually drafted, a lot, and it was the first time I realised even with MTB, knowing how to ride in a group is useful.

And then the first climb starts: about 8.5km, gaining 522m of altitude (only around a third of the goal…), and it was tiring. On the few places where the climb flattened I saw again that one of the few ways I can gain on people is by timing my effort well enough that when it flattens, I can start strongly and thus gain, sometimes significantly.

I don’t know if this is related to the fact that I’m on the heavier but also stronger side (this is comparing peak 5s power with other people on Sufferfest/etc.), or just I was fresh enough, but I was able to apply this repeatedly over the first two thirds of the race.

At this moment I was still well, and the only negative aspect—my lower back pain, which started very early—was on and off, not continuous, so all good.

First descent, 4.7km, average almost -12%, thus losing almost all the gain, ~450m. And this was an interesting descent: a lot of it was on a quite wide trail, but “paved” with square wood logs. The distance between the wood logs, about 3-4cm, was not filled well enough with soil, so it was a very jittery ride. I saw a lot of people unable to ride there (to my surprise), and, an even bigger surprise, probably 10 or 15 people with flats. I guess they went low enough on pressure to get bites (not sure if that’s the correct term) and thus a flat. I had no issues; I was running a bit of a high pressure (lucky guess), which I lowered later. But here it served me very well.

Another short uphill, 1.8km/230m altitude, but I gave up and pushed the bike. Not fancy, but when I got off the bike it was 16%!! And that’s not my favourite place in the world to be. Pushing the bike was not that bad; I was going 4.3km/h versus 5.8km/h beforehand, so I wasn’t losing much time. But yes, I was losing time.

Then another downhill, thankfully not so steep so I could enjoy going down for longer, and then a flat section. Overall almost an 8km flat section, which I remembered quite well. Actually I remembered quite a few things from the previous race, despite the different route, but this flat section especially I remembered in two ways:

  • that I was already very, very tired back then
  • and that I didn’t remember at all what was coming next

So I start on this flat section, expecting good progress, only to find myself going slower and slower. I couldn’t understand why it was so tiring to go straight, until I started approaching the group ahead. I realised then it was a “false flat”; I knew the term before, but didn’t understand well what it means, but now I got it ☺ About 1km, at a mean grade of 4.4%, which is not high when you know it’s a climb, but it is not flat. Once I realised this, I felt better. The other ~6.5km I spent pursuing a large group ahead, which I managed to catch about half a kilometre before the end of the flat.

I enjoy this section, then the road turns and I see what my brain didn’t want to remember: the last climb, 4km long, only ~180m of altitude, but a mean grade of 10%, and I don’t have energy for all of it. Which is kind of hilarious: 180m of altitude is only twice and a bit my daily one-way commute, so peanuts, but I’m quite tired from both previous climbs and chasing that group, so after 11 minutes I get off the bike and start pushing.

And the pushing is the same as before, actually even better. I was going 5.2km/h on average here, versus 6.1km/h before, so less than 1km/h difference. This walking speed is faster than usual walking; as it was not very steep, I was walking fast. Since I was able to walk fast and felt recovered, I get back on the bike before the climb finished, only to feel a solid cramp in the right leg as soon as I try to start pedalling. OK, let me pedal with the left; ouch, a cramp as well. So I’m going almost as slow as walking, just managing the cramps and keeping them below critical, and after about 5 minutes they go away.

It was the first time I got such cramps, or any cramps, during a race. I’m not sure I understand why; I did drink energy drinks, not just water (so I assume it was not an acute lack of some electrolytes). Reading on the internet, it seems cramps are not a solved issue, and are most likely just due to hard efforts at high intensity. It was scary for about ten seconds as I feared a full cramping of one or both legs (and being clipped in… not funny), but once I realised I could keep it under control, it was just interesting.

I remembered now the rest of the route (ahem, see below), so I knew I was looking at a long (7km), flat-ish section (100m gain) before the last, relatively steep descent. Steep as in -24.4% max gradient, with an average of -15%. This done, I was on relatively flat roads (paved, even), and my Garmin was showing 1’365m of altitude gain since the start of the race. The difference from the official number of 1’400m was only 35m, which I was sure was due to simple measurement errors; the distance so far was 50.5km, so I was looking at the remaining 2.5km being a fast run towards the end.

Then in a village we exit the main road and take a small road going up a bit, and then up a bit more, and then—I was pretty annoyed here already—the road opens onto a climb. A CLIMB!!!! I could see the top of the hill, people going up, and it was not a small climb! It was not 35m, nothing near that; it was a short but real climb!

I was swearing with all my power in various languages. I had completely and totally forgotten this last climb, I did not plan for it, and I was pissed off ☺ It turned out to be 1.3km long, with a mean grade of 11.3% (!) and 123m of elevation gain. That was almost 100m more than I had planned for… I got off the bike again, especially as it was a bit tricky to pedal where the gradient was 20% (!!!), so I pushed, then I got back on, then back off, and then back on. At the end of the climb there was a photographer again, so I got on the bike and hoped I might get a smiling picture, which almost happened:

Pretending I’m enjoying this, and showing how much weight I should get rid of :/

And then from here on it was clean, easy downhill, at most -20% grade, back into Einsiedeln, and back to the start with a 10m climb, which I pushed hard; I overtook a few people (lol), passed the finish, and promptly found the first possible place to get off the bike and lie down. This was all behind me now:

Thanks VeloViewer!

And, my Garmin said 1’514m altitude, more than 100m more than the official number. Boo!

Aftermath

It was the first race after which I actually had to lie down. The first race I got cramps in, as well ☺ My third-highest 1h30m heart rate value ever, and a lot of other 2019 peaks too. And I was dead tired.

I learned it’s very important to know well the route altitude profile, in order to be able to time your efforts well. I got reminded I suck at climbing (so I need to train this over and over), but I learned that I can push better than other people at the end of a climb, so I need to train this strength as well.

I also learned that you can’t rely on the official race data, since it can be off by more than 100m. Or maybe I can’t rely on my Garmin(s), but with a barometric altimeter, I don’t think the problem was there.

I think I ate a bit too much during the race, which was not optimal, but I was very hungry already after the first climb…

But the biggest thing overall was that, despite there being no age groups, I got a better placement than I expected. I finished in 3h37m14s, 1h21m56s behind first place, but a good enough time to place 325th out of 490 finishers.

By my calculations, 490 finishers means the thirds are 1-163, 164-326, and 327-490. Which means, I was in the 2nd third, not in the last one!!! Hence the subtitle of this post, moving up, since I usually am in the bottom third, not the middle one. And there were also 11 DNF people, so I’m well in the middle third ☺

Joke aside, this made me very happy, as it’s the first time I feel like efforts (in a bad year like this, even) do amount to something. So good.

Speaking of DNFs, I was shocked at the number of people I saw either trying to fix their bike on the side, or walking their bike (to the next repair post), or just shouting angrily at a broken component. I counted probably close to 20 such people, a lot of them after that first descent, but given only 11 DNFs, I guess many people do carry spare tubes. I didn’t; hope is a strategy sometimes ☺

Now, with the season really closed, time to start thinking about next year. And how to lose those 10kg of fat I still need to lose…

Antoine Beaupré: Calibre replacement considerations

7 October, 2019 - 02:27
Summary

TL;DR: I'm considering replacing those various Calibre components with...

See below for why, and for a deeper discussion of all the features.

Problems with Calibre

Calibre is an amazing piece of software: it allows users to manage ebooks on their desktop and a multitude of ebook readers. It's used by Linux geeks as well as Windows power-users and vastly surpasses any native app shipped by ebook manufacturers. I know almost exactly zero people who have an ebook reader and do not use Calibre.

However, it has had many problems over the years:

The latest issue (lack of Python 3) is the last straw, for me. While Calibre is an awesome piece of software, I can't help but think it's doing too much, and the wrong way. It's one of those tools that looks amazing on the surface, but when you look underneath, it's a monster that is impossible to maintain, a liability that is just bound to cause more problems in the future.

What does Calibre do anyways

So let's say I wanted to get rid of Calibre, what would that mean exactly? What do I actually use Calibre for anyways?

Calibre is...

  • an ebook viewer: Calibre ships with the ebook-viewer command, which allows one to browse a vast variety of ebook formats. I rarely use this feature, since I read my ebooks on an e-reader, on purpose. Besides, there is a good variety of ebook readers, on different platforms, that can replace Calibre here:

    • Atril, MATE's version of Evince, supports ePUBs (Evince doesn't)
    • MuPDF also reads ePUBs without problems and is really fast
    • fbreader also supports ePUBs, but is much slower than all those others
    • Emacs (of course) supports ebooks through nov.el
    • Okular apparently supports ePUBs, but I must be missing a library because it doesn't actually work here
    • coolreader is another alternative, not yet in Debian (#715470)
    • lucidor also looks interesting, but is not packaged in Debian either (although upstream provides a .deb)
    • koreader and plato are good alternatives for the Kobo reader (although koreader also now has builds for Debian)
  • an ebook editor: Calibre also ships with an ebook-edit command, which allows you to do all sorts of nasty things to your ebooks. I have rarely used this tool, having found it hard to use and not giving me the results I needed, in my use case (which was to reformat ePUBs before publication). For this purpose, Sigil is a much better option, now packaged in Debian. There are also various tools that render to ePUB: I often use the Sphinx documentation system for that purpose, and have been able to produce ePUBs from LaTeX for some projects.

  • a file converter: Calibre can convert between many ebook formats, to accommodate the various readers. In my experience, this doesn't work very well: the layout is often broken and I have found it's much better to find pristine copies of ePUB books than fight with the converter. There are, however, very few alternatives to this functionality, unfortunately.

  • a collection browser: this is the main functionality I would miss from Calibre. I am constantly adding books to my library, and Calibre does have this incredibly nice functionality of just hitting "add book" and having it Just Do The Right Thing™ after that. Specifically, what I like is that it:

    • sort, view, and search books in folders, per author, date, editor, etc
    • quick search is especially powerful
    • allows downloading and editing metadata (like covers) easily
    • track read/unread status (although that's a custom field I had to add)

    Calibre is, as far as I know, the only tool that goes so deep in solving that problem. The Liber web server, however, does provide similar search and metadata functionality. It also supports migrating from an existing Calibre database as it can read the Calibre metadata stores.

    This also connects with the more general "book inventory" problem I have, which involves an inventory of physical books and a directory of online articles. See also firefox (Zotero section) and ?bookmarks for a longer discussion of that problem.

  • a device synchronization tool: I mostly use Calibre to synchronize books with an ebook-reader. It can also automatically update the database on the ebook with relevant metadata (e.g. collection or "shelves"), although I do not really use that feature. I do like to use Calibre to quickly search and prune books from my ebook reader, however. I might be able to use git-annex for this, given that I already use it to synchronize and back up my ebook collection in the first place...

  • an RSS reader: I used this for a while to read RSS feeds on my ebook-reader, but it was pretty clunky. Calibre would be continuously generating new ebooks based on those feeds and I would never read them, because I would never find the time to transfer them to my ebook viewer in the first place. Instead, I use a regular RSS feed reader. I ended up writing my own, feed2exec, and when I find an article I like, I add it to Wallabag, which gets synced to my reader using wallabako, another tool I wrote.

  • an ebook web server: Calibre can also act as a web server, presenting your entire ebook collection as a website. It also supports acting as an OPDS directory, which is kind of neat. There are, as far as I know, no alternatives for such a system, although there are servers to share and store ebooks, like Trantor or Liber.

Note that I might have forgotten functionality in Calibre in the above list: I'm only listing the things I have used or am using on a regular basis. For example, you can have a USB stick with Calibre on it to carry the actual software, along with the book library, around on different computers, but I never used that feature.

So there you go. It's a colossal task! And while it's great that Calibre does all those things, I can't help but think that it would be better if Calibre was split up in multiple components, each maintained separately. I would love to use only the document converter, for example. It's possible to do that on the commandline, but it still means I have the entire Calibre package installed.

Maybe a simple solution, from Debian's point of view, would be to split the package into multiple components, with the GUI and web servers packaged separately from the commandline converter. This way I would be able to install only the parts of Calibre I need and have limited exposure to other security issues. It would also make it easier to run Calibre headless, in a virtual machine or remote server for extra isolation, for example.

Norbert Preining: RIP (for now) Calibre in Debian

6 October, 2019 - 09:14

The current purge of all Python2 related packages has a direct impact on Calibre. The latest version of Calibre requires Python modules that are not (anymore) available for Python 2, which means that Calibre >= 4.0 will for the foreseeable future not be available in Debian.

I have just uploaded version 3.48, which is the last version that can run on Debian. From now on, until upstream Calibre switches to Python 3, this will be the last version of Calibre in Debian.

In case you need newer features (including the occasional security fixes), I recommend switching to the upstream installer, which is rather clean (installing into /opt/calibre, creating some links to the startup programs, and installing completions for zsh and bash). It also prepares an uninstaller that reverts these changes.

Enjoy.

Reproducible Builds: Reproducible Builds in September 2019

6 October, 2019 - 03:43

Welcome to the September 2019 report from the Reproducible Builds project!

In these reports we outline the most important things that we have been up to over the past month. As a quick refresher of what our project is about: whilst anyone can inspect the source code of free software for malicious changes, most software is distributed to end users or servers as precompiled binaries. The motivation behind the reproducible builds effort is to ensure zero changes have been introduced during these compilation processes. This is achieved by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised.

In September’s report, we cover:

  • Media coverage & events: more presentations, preventing Stuxnet, etc.
  • Upstream news: kernel reproducibility, grafana, systemd, etc.
  • Distribution work: reproducible images in Arch Linux, policy changes in Debian, etc.
  • Software development: yet more work on diffoscope, upstream patches, etc.
  • Misc news & getting in touch: from our mailing list, how to contribute, etc.

If you are interested in contributing to our project, please visit our Contribute page on our website.

Media coverage & events

This month Vagrant Cascadian attended the 2019 GNU Tools Cauldron in Montréal, Canada and gave a presentation entitled Reproducible Toolchains for the Win (video).

In addition, our project was highlighted as part of a presentation by Andrew Martin at the All Systems Go conference in Berlin titled Rootless, Reproducible & Hermetic: Secure Container Build Showdown, and Björn Michaelsen from the Document Foundation presented at the 2019 LibreOffice Conference in Almería in Spain on the status of reproducible builds in the LibreOffice office suite.

In academia, Anastasis Keliris and Michail Maniatakos from the New York University Tandon School of Engineering published a paper titled ICSREF: A Framework for Automated Reverse Engineering of Industrial Control Systems Binaries (PDF) that speaks to concerns regarding the security of Industrial Control Systems (ICS) such as those attacked via Stuxnet. The paper outlines their ICSREF tool for reverse-engineering binaries from such systems and furthermore demonstrates a scenario whereby a commercial smartphone equipped with ICSREF could be easily used to compromise such infrastructure.

Lastly, it was announced that Vagrant Cascadian will present a talk at SeaGL in Seattle, Washington during November titled There and Back Again, Reproducibly.

2019 Summit

Registration for our fifth annual Reproducible Builds summit that will take place between 1st → 8th December in Marrakesh, Morocco has opened and personal invitations have been sent out.

Similar to previous incarnations of the event, the heart of the workshop will be three days of moderated sessions with surrounding “hacking” days and will include a huge diversity of participants from Arch Linux, coreboot, Debian, F-Droid, GNU Guix, Google, Huawei, in-toto, MirageOS, NYU, openSUSE, OpenWrt, Tails, Tor Project and many more. If you would like to learn more about the event and how to register, please visit our dedicated event page.

Upstream news

Ben Hutchings added documentation to the Linux kernel regarding how to make the build reproducible. As he mentioned in the commit message, the kernel is “actually” reproducible but the end-to-end process was not previously documented in one place and thus Ben describes the workflow and environment needed to ensure a reproducible build.

Daniel Edgecumbe submitted a pull request which was subsequently merged to the logging/journaling component of systemd in order that the output of e.g. journalctl --update-catalog does not differ between subsequent runs despite there being no changes in the input files.

Jelle van der Waa noticed that if the grafana monitoring tool was built within a source tree devoid of Git metadata then the current timestamp was used instead, leading to an unreproducible build. To avoid this, Jelle submitted a pull request in order that it use SOURCE_DATE_EPOCH if available.
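
The actual grafana fix is in Go; the Python sketch below is purely an illustration of the SOURCE_DATE_EPOCH convention, not the patch itself:

import os
import time


def build_timestamp():
    # Prefer the externally supplied SOURCE_DATE_EPOCH so that rebuilding the
    # same source tree embeds the same timestamp; fall back to "now" otherwise.
    return int(os.environ.get("SOURCE_DATE_EPOCH", time.time()))


print(build_timestamp())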

Mes (a Scheme-based compiler for our “sister” bootstrappable builds effort) announced their 0.20 release.

Distribution work

Bernhard M. Wiedemann posted his monthly Reproducible Builds status update for the openSUSE distribution. Thunderbird and kernel-vanilla packages will be among the larger ones to become reproducible soon and there were additional Python patches to help reproducibility issues of modules written in this language that have C bindings.

OpenWrt is a Linux-based operating system targeting embedded devices such as wireless network routers. This month, Paul Spooren (aparcar) switched the toolchain to use GCC version 8 by default in order to support -ffile-prefix-map=, which permits a varying build path without affecting the binary result of the build []. In addition, Paul updated the kernel-defaults package to ensure that the SOURCE_DATE_EPOCH environment variable is considered when creating the /init directory.

Alexander “lynxis” Couzens began working on a set of build scripts for creating firmware and operating system artifacts in the coreboot distribution.

Lukas Pühringer prepared an upload which was sponsored by Holger Levsen of python-securesystemslib version 0.11.3-1 to Debian unstable. python-securesystemslib is a dependency of in-toto, a framework to protect the integrity of software supply chains.

Arch Linux

The mkinitcpio component of Arch Linux was updated by Daniel Edgecumbe in order that it generates reproducible initramfs images by default, meaning that two subsequent runs of mkinitcpio produce two files that are identical at the binary level. The commit message elaborates on its methodology:

Timestamps within the initramfs are set to the Unix epoch of 1970-01-01. Note that in order for the build to be fully reproducible, the compressor specified (e.g. gzip, xz) must also produce reproducible archives. At the time of writing, as an inexhaustive example, the lzop compressor is incapable of producing reproducible archives due to the insertion of a runtime timestamp.
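
To make the idea concrete, here is a small Python sketch of the same timestamp-clamping technique. This is an illustration only: initramfs images are cpio archives and mkinitcpio itself is a shell script, so none of this is the actual implementation.

import gzip
import tarfile

# Create a small sample file to archive (placeholder content).
with open("hello.txt", "w") as fh:
    fh.write("hello\n")


def clamp(tarinfo):
    # Clamp metadata that varies between builds.
    tarinfo.mtime = 0                      # Unix epoch, as in the mkinitcpio change
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = "root"
    return tarinfo


# gzip normally embeds the current time in its header; mtime=0 avoids that.
with gzip.GzipFile("image.tar.gz", "wb", mtime=0) as gz:
    with tarfile.open(fileobj=gz, mode="w") as tar:
        tar.add("hello.txt", filter=clamp)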

In addition, a bug was created to track progress on making the Arch Linux ISO images reproducible.

Debian

In July, Holger Levsen filed a bug against the underlying tool that maintains the Debian archive (“dak”) after he noticed that .buildinfo metadata files were not being automatically propagated in the case that packages had to be manually approved in “NEW queue”. After it was pointed out that the files were being retained in a separate location, Benjamin Hof proposed a patch for the issue that was merged and deployed this month.

Aurélien Jarno filed a bug against the Debian Policy (#940234) to request that a section be added regarding the reproducibility of source packages. Whilst there is already a section about reproducibility in the Policy, it only mentions binary packages. Aurélien suggests that it:

… might be a good idea to add a new requirement that repeatedly building the source package in the same environment produces identical .dsc files.

In addition, 51 reviews of Debian packages were added, 22 were updated and 47 were removed this month adding to our knowledge about identified issues. Many issue types were added by Chris Lamb including buildpath_in_code_generated_by_bison, buildpath_in_postgres_opcodes and ghc_captures_build_path_via_tempdir.

Software development Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. It is run countless times a day on our testing infrastructure and is essential for identifying fixes and causes of non-deterministic behaviour.

This month, Chris Lamb uploaded versions 123, 124 and 125 and made the following changes:

  • New features:

    • Add /srv/diffoscope/bin to the Docker image path. (#70)
    • When skipping tests due to the lack of an installed tool, print the package that might provide it. []
    • Update the “no progressbar” logging message to match the parallel missing tlsh module warnings. []
    • Update “requires foo” messages to clarify that they are referring to Python modules. []
  • Testsuite updates:

    • The test_libmix_differences ELF binary test requires the xxd tool. (#940645)
    • Build the OCaml test input files on-demand rather than shipping them with the package in order to prevent test failures with OCaml 4.08. (#67)
    • Also conditionally skip the identification and “no differences” tests as we require the OCaml compiler to be present when building the test files themselves. (#940471)
    • Rebuild our test squashfs images to exclude the character device as they require root or fakeroot to extract. (#65)
  • Many code cleanups, including dropping some unnecessary control flow [], dropping unnecessary pass statements [] and dropping explicit inheritance from the object class as it is unnecessary in Python 3 [].

In addition, Marc Herbert completely overhauled the handling of ELF binaries, particularly around many assumptions that were previously being made via file extensions, etc. [][][] and updated the testsuite to support a newer version of the coreboot utilities []. Mattia Rizzolo then ensured that diffoscope does not crash when the progress bar module is missing but the functionality was requested [] and made our version checking code more lenient []. Lastly, Vagrant Cascadian not only updated diffoscope to versions 123 and 125, he also enabled a more complete test suite in the GNU Guix distribution. [][][][][][]

Project website

There was yet more effort put into our website this month, including:

In addition, Cindy Kim added in-toto to our “Who is Involved?” page, James Fenn updated our homepage to fix a number of spelling and grammar issues [] and Peter Conrad added BitShares to our list of projects interested in Reproducible Builds [].

strip-nondeterminism

strip-nondeterminism is our tool to remove specific non-deterministic results from successful builds. This month, Marc Herbert made a huge number of changes including:

  • GNU ar handler:
    • Don’t corrupt the pseudo file mode of the symbols table.
    • Add test files for “symtab” (/) and long names (//).
    • Don’t corrupt the SystemV/GNU table of long filenames.
  • Add a new $File::StripNondeterminism::verbose global and, if enabled, tell the user that ar(1) could not set the symbol table’s mtime.

In addition, Chris Lamb performed some issue investigation with the Debian Perl Team regarding issues in the Archive::Zip module including a problem with corruption of members that use bzip compression as well as a regression whereby various metadata fields were not being updated that was reported in/around Debian bug #940973.

Test framework

We operate a comprehensive Jenkins-based testing framework that powers tests.reproducible-builds.org.

  • Alexander “lynxis” Couzens:
    • Fix missing .xcompile in the build system. []
    • Install the GNAT Ada compiler on all builders. []
    • Don’t install the iasl ACPI power management compiler/decompiler. []
  • Holger Levsen:
    • Correctly handle the $DEBUG variable in OpenWrt builds. []
    • Refactor and notify the #archlinux-reproducible IRC channel for problems in this distribution. []
    • Ensure that only one mail is sent when rebooting nodes. []
    • Unclutter the output of a Debian maintenance job. []
    • Drop a “todo” entry as we have been varying on a merged /usr for some time now. []

In addition, Paul Spooren added an OpenWrt snapshot build script which downloads .buildinfo and related checksums from the relevant download server and attempts to rebuild and then validate them for reproducibility. []

The usual node maintenance was performed by Holger Levsen [][][], Mattia Rizzolo [] and Vagrant Cascadian [][].

reprotest

reprotest is our end-user tool to build the same source code twice in different environments and then check the binaries produced by each build for differences. This month, a change by Dmitry Shachnev was merged to not use the faketime wrapper at all when asked not to vary time [] and Holger Levsen subsequently released this as version 0.7.9 after dramatically overhauling the packaging [][].

Misc news & getting in touch

On our mailing list Rebecca N. Palmer started a thread titled Addresses in IPython output which points out and attempts to find a solution to a problem with Python packages, whereby objects that don’t have an explicit string representation have a default one that includes their memory address. This causes problems with reproducible builds if/when such output appears in generated documentation.
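
A tiny illustration of the problem (my own example, not taken from the thread): the default repr() of a Python object embeds its memory address, which changes from run to run, so any generated documentation that captures such output will differ between builds.

class Example:
    pass


# Prints something like "<__main__.Example object at 0x7f3c2c1d2e80>";
# the address changes on every run, so captured output is not reproducible.
print(repr(Example()))


class Stable:
    def __repr__(self):
        return "Stable()"  # an explicit, deterministic representation


# Always prints "Stable()".
print(repr(Stable()))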

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:


This month’s report was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen, Jelle van der Waa, Mattia Rizzolo and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

Ritesh Raj Sarraf: Setting a Lotus Pot

5 October, 2019 - 23:26
Experiences setting up a Lotus pond pot: a novice’s first-time experience setting up a Lotus pond and germinating Lotus seeds into a full plant.

The trigger

Our neighbors have a very nice Lotus setup in the front of their garden, with flowers blooming in it. It is really a pleasing experience to see. With lifestyles limiting us to specifics, I’m glad to be around like-minded people. So we decided to set up a Lotus pond pot in our garden too.

Hunting the pot

About two-thirds of our garden has been laid out with Mexican grass, with the exception of some odd spots where there are 2 large concrete tanks in the ground and other small tanks. The large one is fairly big, with dimensions of around 3.5 x 3.5 ft. We wanted to make use of that spot. I checked out some of the available pots and found a good-looking circular pot carved in stone, of around 2 ft, but it was quite expensive and didn’t fit my budget.

With the stone pot out of my budget, the other options left were

  • One molded in cement
  • Granite

We looked at another pot maker who made pots from cement molds. They had some small pot samples of around 1 x 1 ft, but the finished product samples didn’t look very good. From the available samples, we weren’t sure how the pot we wanted would turn out. Also, the price difference wasn’t proportionate. The vendor would take a month’s time to make one. With no ready-made sample of that size in place, this wasn’t something we were very enthused to explore.

We instead chose to explore the possibility of building one with granite slabs. The first reason was that granite vendors were more easily available. But second and most important, given the spot I had chosen, I felt a square granite-based setup would be an equally good fit. Also, the granite pot was budget-friendly. Finally, we settled on a 3 x 3 ft granite pot.

And this is what our initial Lotus Pot looked like.

Note: The granite pot was quite heavy, close to 200 kilograms. It took 3 of us to place it at the designated spot.

Actual Lotus pot

As you can see from the picture above, there’s another pot inside the larger granite pot itself. Lotuses grow in water and sludge, so we needed to provide the same setup. We used one of our biryani pots to hold the sludge. This is where the Lotus’s roots and tuber will live and grow, inside the water and sludge.

With an open pot, when you have water stored, you have other challenges to take care of.

  • Aeration of the water
  • Mosquitoes

We bought a solar water fountain to take care of the water. It is a nice device that works very well under sunlight. Remember, your Lotus needs a well-sunlit area, so the solar water fountain was a perfect fit for this scenario.

But with standing water come mosquitoes. As most would do, we chose to put some fish into the pond to take care of that problem. We put in some mollies, which we hope will multiply. Also, the waste they produce should work as a supplementary fertilizer for the Lotus plant.

Aeration is very important for the fish, and so far our solar water fountain seems to be doing a good job there, keeping the fish happy and healthy.

So in around a week, we had the Lotus Pot in place along with a water fountain and some fish.

The Lotus Plant itself

With the setup in place, it was now time to head to the nearby nursery and get the Lotus plant. It was an utter surprise not to find a lotus plant in any of the nurseries. After quite a lot of searching, we came across one nursery that had some lotus plants. They didn't look healthy, but they were the only option we had. The bigger disappointment was the price, which was insanely high for a plant. We returned home without a lotus, a little disheartened.

Thank you Internet

The internet has connected the world so well. I looked online and was delighted to see so many people sharing their experiences with lotuses in articles and YouTube videos. You can find all the necessary information there. With that, we were all charged up to venture into the next step: growing the lotus from seed instead.

The Lotus seeds

First, we went online and placed an order for 15 lotus seeds. Soon they were delivered. And this is what Lotus seeds look like.

Lotus Seeds

I had never in my life seen lotus seeds before. The shell is very, very hard. An impressive remnant of the Lotus plant.

Germinating the seed

Germinating the seed is an experience of its own, given how hard the lotus shell is. There are very good articles and videos on the internet explaining the steps to germinate a seed. In a gist: you need to scratch the pointy end of the seed's shell just enough to expose the inner membrane, and then submerge it in water for around 7-10 days. Every day you'll be able to watch the germination progress.

Lotus seed with the pointy end

Make sure to scratch the pointy end well enough, while ensuring you don't damage the inner membrane of the seed. You also need to change the water in the container daily and keep it in a decently lit area.

Here's an interesting bit, specific to my own experience. It turned out that the seeds we purchased online weren't of good quality. Except for one, none of the seeds germinated, and the one that did didn't sprout properly. It popped out a single shoot, but the shoot didn't grow much. In all, it didn't work.

But while we were waiting for the seeds to be delivered, my wife looked at the pictures of the seeds I was studying online and realized that we already had lotus seeds at home. It turns out these seeds are used in our Hindu puja rituals; in Hindi they are called कमल गट्टा (kamal gatta). We had some seeds remaining from a puja, so in parallel we used those for germination, and they sprouted very well.

Unfortunately, I had not taken any pictures of them during the germination phase.

The sprouting phase should be done in a tall glass. This allows the shoots to grow long, as eventually the seedling needs to be set into the actual pot. During the germination phase the shoot will grow around an inch a day, aiming to reach the surface of the water. Once it reaches the surface, it will eventually start developing roots at the base.

Now it is time to start preparing to sow the seed into the sub-pot which holds the sludge.

Sowing your seeds

Initial Lotus growing in the sludge

This picture is from a couple of days after I sowed the seeds. When you transfer the shoots into the sludge, push a finger into the sludge to make room for the seed, then gently place the seed in. You can also cover the sub-pot with some gravel to keep the sludge intact and clean.

Once your shoots get accustomed to the new environment, they start to grow again. That is what you see in the picture above, where the shoot reaches the surface of the water and then starts developing into a beautiful lotus leaf.

Lotus leaf

It is a pleasure to see the lotus leaves floating on the water. From what I have read on the internet, flowering is going to take well over a year. But for now, the lotus leaves, their veins, and the water droplets on them are very soothing to watch.

Lotus leaf veins

Lotus leaf veins and water droplets

Final Result

As of today, this is what the final setup looks like. Hopefully, in a year's time, there'll be flowers.

Lotus pot setup now

John Goerzen: Resurrecting Ancient Operating Systems on Debian, Raspberry Pi, and Docker

5 October, 2019 - 05:09

I wrote recently about my son playing Zork on a serial terminal hooked up to a PDP-11, and how I eventually bought a vt420 (ok, some vt420s and vt510s, I couldn’t stop at one) and hooked it up to a Raspberry Pi.

This led me down another path: there is a whole set of hardware and software that I’ve never used. For some, it fell out of favor before I could read (and for others, before I was even born).

The thing is – so many of these old systems have a legacy that we live in today. So much so, in fact, that we are now seeing articles about how modern CPUs are fast PDP-11 emulators in a sense. The PDP-11, and its close association with early Unix, lives on in the sense that its design influenced microprocessors and operating systems to this day. The DEC vt100 terminal is, nowadays, known far better as that thing that is emulated, but it was, in fact, a physical thing. Some goes back into even mistier times; Emacs, for instance, grew out of the MIT ITS project but was later ported to TOPS-20 before being associated with Unix. vi grew up in 2BSD, and according to wikipedia, was so large it could barely fit in the memory of a PDP-11/70. Also in 2BSD, a buggy version of Zork appeared — so buggy, in fact, that the save game option was broken. All of this happened in the late 70s.

When we think about the major developments in computing, we often hear of companies like IBM, Microsoft, and Apple. Of course their contributions are undeniable, and emulators for old versions of DOS are easily available for every major operating system, plus many phones and tablets. But as the world is moving heavily towards Unix-based systems, the Unix heritage is far more difficult to access.

My plan with purchasing and setting up an old vt420 wasn’t just to do that and then leave. It was to make it useful for modern software, and also to run some of these old systems under emulation.

To that end, I have released my vintage computing collection – both a script for setting up on a system, and a docker image. You can run Emacs and TECO on TOPS-20, zork and vi on 2BSD, even Unix versions 5, 6, and 7 on a PDP-11. And for something particularly rare, RDOS on a Data General Nova. I threw in some old software compiled for modern systems: Zork, Colossal Cave, and Gopher among them. The bsdgames collection and some others are included as well.
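If you want to try the collection via Docker, the invocation is the usual pull-and-run. To be clear, the image name below is only a placeholder I made up for illustration; substitute whatever name the project's own documentation gives:

# Placeholder image name -- check the collection's README for the real one
docker pull example/vintage-computing
docker run -it --rm example/vintage-computing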

I hope you enjoy playing with the emulated big-iron systems of the 70s and 80s. And in a dramatic turnabout of scale and cost, those machines which used to cost hundreds of thousands of dollars can now be run far faster under emulation on a $35 Raspberry Pi.

Ben Hutchings: Kernel Recipes 2019, part 2

4 October, 2019 - 20:30

This conference only has a single track, so I attended almost all the talks. This time I didn't take notes but I've summarised all the talks I attended. This is the second and last part of that; see part 1 if you missed it.

XDP closer integration with network stack

Speaker: Jesper Dangaard Brouer

Details and slides: https://kernel-recipes.org/en/2019/xdp-closer-integration-with-network-stack/

Video: Youtube

The speaker introduced XDP and how it can improve network performance.

The Linux network stack is extremely flexible and configurable, but this comes at some performance cost. The kernel has to generate a lot of metadata about every packet and check many different control hooks while handling it.

The eXpress Data Path (XDP) was introduced a few years ago to provide a standard API for doing some receive packet handling earlier, in a driver or in hardware (where possible). XDP rules can drop unwanted packets, forward them, pass them directly to user-space, or allow them to continue through the network stack as normal.
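As a concrete point of reference (my own addition, not something covered in the talk), attaching a compiled XDP program to an interface is a one-liner with iproute2. The interface name, object file and ELF section below are placeholders that depend on how your BPF program was built; run as root:

# Attach an XDP program to eth0 (kernel prefers driver mode, falls back to generic)
ip link set dev eth0 xdp obj xdp_prog.o sec xdp
# Detach it again
ip link set dev eth0 xdp off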

He went on to talk about how recent and proposed future extensions to XDP allow re-using parts of the standard network stack selectively.

This talk was supposedly meant for kernel developers in general, but I don't think it would be understandable without some prior knowledge of the Linux network stack.

Faster IO through io_uring

Speaker: Jens Axboe

Details and slides: https://kernel-recipes.org/en/2019/talks/faster-io-through-io_uring/

Video: Youtube. (This is part way through the talk, but the earlier part is missing audio.)

The normal APIs for file I/O, such as read() and write(), are blocking, i.e. they make the calling thread sleep until I/O is complete. There is a separate kernel API and library for asynchronous I/O (AIO), but it is very restricted; in particular it only supports direct (uncached) I/O. It also requires two system calls per operation, whereas blocking I/O only requires one.

Recently the io_uring API was introduced as an entirely new API for asynchronous I/O. It uses ring buffers, similar to hardware DMA rings, to communicate operations and completion status between user-space and the kernel, which is far more efficient. It also removes most of the restrictions of the current AIO API.

The speaker went into the details of this API and showed performance comparisons.
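If you would rather see io_uring in action without coding against the raw API, recent fio versions expose it as an I/O engine. A minimal sketch, assuming fio >= 3.13 and a kernel with io_uring support (roughly 5.1 or newer):

# 4k random reads through the io_uring engine, queue depth 32, direct I/O
fio --name=uring-test --ioengine=io_uring --rw=randread --bs=4k \
    --size=256M --iodepth=32 --direct=1 --runtime=30 --time_based

Running the same job with --ioengine=libaio or --ioengine=sync gives a rough feel for the kind of comparison the speaker was presenting.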

The Next Steps toward Software Freedom for Linux

Speaker: Bradley Kuhn

Details: https://kernel-recipes.org/en/2019/talks/the-next-steps-toward-software-freedom-for-linux/

Slides: http://ebb.org/bkuhn/talks/Kernel-Recipes-2019/kernel-recipes.html

Video: Youtube

The speaker talked about the importance of the GNU GPL to the development of Linux, in particular the ability of individual developers to get complete source code and to modify it to their local needs.

He described how, for a large proportion of devices running Linux, the complete source for the kernel is not made available, even though this is required by the GPL. So there is a need for GPL enforcement—demanding full sources from distributors of Linux and other works covered by GPL, and if necessary suing to obtain them. This is one of the activities of his employer, Software Freedom Conservancy, and has been carried out by others, particularly Harald Welte.

In one notable case, the Linksys WRT54G, the release of source after a lawsuit led to the creation of the OpenWRT project. This is still going many years later and supports a wide range of networking devices. He proposed that the Conservancy's enforcement activity should, in the short term, concentrate on a particular class of device where there would likely be interest in creating a similar project.

Suricata and XDP

Speaker: Eric Leblond

Details and slides: https://kernel-recipes.org/en/2019/talks/suricata-and-xdp/

Video: Youtube

The speaker described briefly how an Intrusion Detection System (IDS) interfaces to a network, and why it's important to be able to receive and inspect all relevant packets.

He then described how the Suricata IDS uses eXpress Data Path (XDP, explained in an earlier talk) to filter and direct packets, improving its ability to handle very high packet rates.

CVEs are dead, long live the CVE!

Speaker: Greg Kroah-Hartman

Details and slides: https://kernel-recipes.org/en/2019/talks/cves-are-dead-long-live-the-cve/

Video: Youtube

Common Vulnerabilities and Exposures Identifiers (CVE IDs) are a standard, compact way to refer to specific software and hardware security flaws.

The speaker explained problems with the way CVE IDs are currently assigned and described, including assignments for bugs that don't impact security, lack of assignment for many bugs that do, incorrect severity scores, and missing information about the changes required to fix the issue. (My work on CIP's kernel CVE tracker addresses some of these problems.)

The average time between assignment of a CVE ID and a fix being published is apparently negative for the kernel, because most such IDs are being assigned retrospectively.

He proposed to replace CVE IDs with "change IDs" (i.e. abbreviated git commit hashes) identifying bug fixes.

Driving the industry toward upstream first

Speaker: Enric Balletbo i Serra

Details and slides: https://kernel-recipes.org/en/2019/talks/driving-the-industry-toward-upstream-first/

Video: Youtube

The speaker talked about how the Chrome OS developers have tried to reduce the difference between the kernels running on Chromebooks, and the upstream kernel versions they are based on. This has succeeded to the point that it is possible to run a current mainline kernel on at least some Chromebooks (which he demonstrated).

Formal modeling made easy

Speaker: Daniel Bristot de Oliveira

Details and slides: https://kernel-recipes.org/en/2019/talks/formal-modeling-made-easy/

Video: Youtube

The speaker explained how formal modelling of (parts of) the kernel could be valuable. A formal model will describe how some part of the kernel works, in a way that can be analysed and proven to have certain properties. It is also necessary to verify that the model actually matches the kernel's implementation.

He explained the methodology he used for modelling the real-time scheduler provided by the PREEMPT_RT patch set. The model used a number of finite state machines (automata), with conditions on state transitions that could refer to other state machines. He added (I think) tracepoints for all state transitions in the actual code and a kernel module that verified that at each such transition the model's conditions were met.

In the process of this he found a number of bugs in the scheduler.

Kernel documentation: past, present, and future

Speaker: Jonathan Corbet

Details and slides: https://kernel-recipes.org/en/2019/kernel-documentation-past-present-and-future/

Video: Youtube

The speaker is the maintainer of the Linux kernel's in-tree documentation. He spoke about how the documentation has been reorganised and reformatted in the past few years, and what work is still to be done.

GNU poke, an extensible editor for structured binary data

Speaker: Jose E Marchesi

Details and slides: https://kernel-recipes.org/en/2019/talks/gnu-poke-an-extensible-editor-for-structured-binary-data/

Video: Youtube

The speaker introduced and demonstrated his project, the "poke" binary editor, which he thinks is approaching a first release. It has a fairly powerful and expressive language which is used for both interactive commands and scripts. Type definitions are somewhat C-like, but poke adds constraints, offset/size types with units, and types of arbitrary bit width.

The expected usage seems to be that you write a script ("pickle") that defines the structure of a binary file format, use poke interactively or through another script to map the structures onto a specific file, and then read or edit specific fields in the file.

Norbert Preining: TensorFlow 2.0 with GPU on Debian/sid

4 October, 2019 - 15:37

Some time ago I wrote about how to get TensorFlow (1.x) running on what was then current Debian/sid. It turns out that this isn't correct anymore and needs an update, so here it is: getting the most up-to-date TensorFlow 2.0 running with NVIDIA GPU support on Debian/sid.

Step 1: Install CUDA 10.0

Follow more or less the instructions here and do

wget -O- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub | sudo tee /etc/apt/trusted.gpg.d/nvidia-cuda.asc
echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /" | sudo tee /etc/apt/sources.list.d/nvidia-cuda.list
sudo apt-get update
sudo apt-get install cuda-libraries-10-0

Warning! Don’t install the 10-1 version since the TensorFlow binaries need 10.0.

This will install lots of libs into /usr/local/cuda-10.0 and add the respective directory to the ld.so path by creating a file /etc/ld.so.conf.d/cuda-10-0.conf.
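A quick sanity check (my addition, not part of the original instructions) is to refresh the linker cache and confirm the CUDA runtime is actually visible to it:

sudo ldconfig
ldconfig -p | grep libcudart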

Step 2: Install CUDA 10.0 CuDNN

One dependency that is difficult to satisfy is the cuDNN libraries. In our case we need the version 7 library for CUDA 10.0. To download these files one needs an NVIDIA developer account, which is quick and painless to create. After that, go to the cuDNN page, where one needs to select Download cuDNN v7.N.N (xxxx NN, YYYY), for CUDA 10.0 and then cuDNN Runtime Library for Ubuntu18.04 (Deb).

As of today this will download a file libcudnn7_7.6.4.38-1+cuda10.0_amd64.deb, which needs to be installed with dpkg -i libcudnn7_7.6.4.38-1+cuda10.0_amd64.deb.
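To confirm the cuDNN runtime ended up where TensorFlow will look for it, a quick check along these lines should show libcudnn.so.7 (again my addition, just a sanity check):

dpkg -l libcudnn7
ldconfig -p | grep libcudnn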

Step 3: Install Tensorflow for GPU

This is the easiest one and can be done as explained on the TensorFlow installation page using

pip3 install --upgrade tensorflow-gpu

This will install several other dependencies, too.
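If you would rather not let pip touch the system site-packages, the same install works inside a virtualenv. A sketch, assuming the python3-venv package is installed (this is an alternative, not what the post above does):

python3 -m venv ~/tf2-venv
. ~/tf2-venv/bin/activate
pip install --upgrade tensorflow-gpu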

Step 4: Check that everything works

Last but not least, make sure that TensorFlow can be loaded and find your GPU. This can be done with the following one-liner, and in my case gives the following output:

$ python3 -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
....(lots of output)
2019-10-04 17:29:26.020013: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3390 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
tf.Tensor(444.98087, shape=(), dtype=float32)
$

I haven’t tried to get R working with the newest TensorFlow/Keras combination, though. Hope the above helps.

Matthew Garrett: Investigating the security of Lime scooters

4 October, 2019 - 13:04
(Note: to be clear, this vulnerability does not exist in the current version of the software on these scooters. Also, this is not the topic of my Kawaiicon talk.)

I've been looking at the security of the Lime escooters. These caught my attention because:
(1) There's a whole bunch of them outside my building, and
(2) I can see them via Bluetooth from my sofa
which, given that I'm extremely lazy, made them more attractive targets than something that would actually require me to leave my home. I did some digging. Limes run Linux and have a single running app that's responsible for scooter management. They have an internal debug port that exposes USB and which, until this happened, ran adb (as root!) over this USB. As a result, there's a fair amount of information available in various places, which made it easier to start figuring out how they work.

The obvious attack surface is Bluetooth (Limes have wifi, but only appear to use it to upload lists of nearby wifi networks, presumably for geolocation if they can't get a GPS fix). Each Lime broadcasts its name as Lime-12345678 where 12345678 is 8 digits of hex. They implement Bluetooth Low Energy and expose a custom service with various attributes. One of these attributes (0x35 on at least some of them) sends Bluetooth traffic to the application processor, which then parses it. This is where things get a little more interesting. The app has a core event loop that can take commands from multiple sources and then makes a decision about which component to dispatch them to. Each command is of the following form:

AT+type,password,time,sequence,data$

where type is one of either ATH, QRY, CMD or DBG. The password is a TOTP derived from the IMEI of the scooter, the time is simply the current date and time of day, the sequence is a monotonically increasing counter and the data is a blob of JSON. The command is terminated with a $ sign. The code is fairly agnostic about where the command came from, which means that you can send the same commands over Bluetooth as you can over the cellular network that the Limes are connected to. Since locking and unlocking is triggered by one of these commands being sent over the network, it ought to be possible to do the same by pushing a command over Bluetooth.

Unfortunately for nefarious individuals, all commands sent over Bluetooth are ignored until an authentication step is performed. The code I looked at had two ways of performing authentication - you could send an authentication token that was derived from the scooter's IMEI and the current time and some other stuff, or you could send a token that was just an HMAC of the IMEI and a static secret. Doing the latter was more appealing, both because it's simpler and because doing so flipped the scooter into manufacturing mode at which point all other command validation was also disabled (bye bye having to generate a TOTP). But how do we get the IMEI? There's actually two approaches:

1) Read it off the sticker that's on the side of the scooter (obvious, uninteresting)
2) Take advantage of how the scooter's Bluetooth name is generated

Remember the 8 digits of hex I mentioned earlier? They're generated by taking the IMEI, encrypting it using DES and a static key (0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88), discarding the first 4 bytes of the output and turning the last 4 bytes into 8 digits of hex. Since we're discarding information, there's no way to immediately reverse the process - but IMEIs for a given manufacturer are all allocated from the same range, so we can just take the entire possible IMEI space for the modem chipset Lime use, encrypt all of them and end up with a mapping of name to IMEI (it turns out this doesn't guarantee that the mapping is unique - for around 0.01%, the same name maps to two different IMEIs). So we now have enough information to generate an authentication token that we can send over Bluetooth, which disables all further authentication and enables us to send further commands to disconnect the scooter from the network (so we can't be tracked) and then unlock and enable the scooter.

(Note: these are actual crimes)

This all seemed very exciting, but then a shock twist occurred - earlier this year, Lime updated their authentication method and now there's actual asymmetric cryptography involved and you'd need to engage in rather more actual crimes to obtain the key material necessary to authenticate over Bluetooth, and all of this research becomes much less interesting other than as an example of how other companies probably shouldn't do it.

In any case, congratulations to Lime on actually implementing security!


Joey Hess: Project 62 Valencia Floor Lamp review

4 October, 2019 - 03:30

From Target, this brass finish floor lamp evokes 60's modernism, updated for the mid-Anthropocene with a touch plate switch.

The integrated microcontroller consumes a mere 2.2 watts while the lamp is turned off, in order to allow you to turn the lamp on with a stylish flick. With a 5 watt LED bulb (sold separately), the lamp will total a mere 7.2 watts while on, making it extremely energy efficient. While off, the lamp consumes a mere 19 kilowatt-hours per year.

Though the lamp shade at first appears perhaps flimsy, while you are drilling a hole in it to add a physical switch, you will discover metal, though not brass all the way through. Indeed, this lamp should last for generations, should the planet continue to support human life for that long.

As an additional bonus, the small plastic project box that comes free in this lamp will delight any electrical enthusiast. As will the approximately 1 hour conversion process to delete the touch switch phantom load. The 2 cubic feet of styrofoam packaging is less delightful.

Two allen screws attach the pole to the base; one was missing in my lamp. Also, while the base is very heavily weighted, the lamp still rocks a bit when using the aftermarket switch. So I am forced to give it a mere 4 out of 5 stars.

Molly de Blanc: Free software activities (September 2019)

3 October, 2019 - 23:49

September marked the end of summer and the end of my summer travel.  Paid and non-paid activities focused on catching up with things I fell behind on while traveling. Towards the middle of September, the world of FOSS blew up, and then blew up again, and then blew up again.

Free software activities: Personal
  • I caught up on some Debian Community Team emails I’ve been behind on. The CT is in search of new team members. If you think you might be interested in joining, please contact us.
  • After much deliberation, the OSI decided to appoint two directors to the board. We will decide who they will be in October, and are welcoming nominations.
  • On that note, the OSI had a board meeting.
  • Wrote a blog post on rights and freedoms to create a shared vocabulary for future writing concerning user rights. I also wrote a bit about leadership in free software.
  • I gave out a few pep talks. If you need a pep talk, hmu.
Free software activities: Professional
  • Wrote and published the September Friends of GNOME Update.
  • Interviewed Sammy Fung for the GNOME Engagement Blog.
  • Did a lot of behind the scenes work for GNOME, that you will hopefully see more of soon!
  • I spent a lot of time fighting with CiviCRM.
  • I attended GitLab Commit on behalf of GNOME, to discuss how we implement GitLab.

 

Thorsten Alteholz: My Debian Activities in September 2019

3 October, 2019 - 22:08

FTP master

This month I accepted 246 packages and rejected 28. The overall number of packages that got accepted was 303.

Debian LTS

This was my sixty-third month of doing work for the Debian LTS initiative, started by Raphael Hertzog at Freexian.

This month my overall workload was 23.75 hours. During that time I did LTS uploads of:

    [DLA 1911-1] exim4 security update for one CVE
    [DLA 1936-1] cups security update for one CVE
    [DLA 1935-1] e2fsprogs security update for one CVE
    [DLA 1934-1] cimg security update for 8 CVEs
    [DLA 1939-1] poppler security update for 3 CVEs

I also started to work on opendmarc and spip but have not finished testing yet.
Last but not least I did some days of frontdesk duties.

Debian ELTS

This month was the sixteenth ELTS month.

During my allocated time I uploaded:

  • ELA-160-1 of exim4
  • ELA-166-1 of libpng
  • ELA-167-1 of cups
  • ELA-169-1 of openldap
  • ELA-170-1 of e2fsprogs

I also did some days of frontdesk duties.

Other stuff

This month I uploaded new packages of …

I also uploaded new upstream versions of …

I improved packaging of …

On my Go challenge I uploaded golang-github-rivo-uniseg, golang-github-bruth-assert, golang-github-xlab-handysort, golang-github-paypal-gatt.

I also sponsored the following packages: golang-gopkg-libgit2-git2go.v28.

Birger Schacht: Installing and running Signal on Tails

3 October, 2019 - 15:52

Because the topic comes up every now and then, I thought I'd write down how to install and run Signal on Tails. These instructions are based on the 2nd Beta of Tails 4.0 - the 4.0 release is scheduled for October 22nd. I'm not sure if these steps also work on Tails 3.x; I seem to remember having some problems with installing flatpaks on Debian Stretch.

The first thing to do is to enable the Additional Software feature of Tails persistence (the Personal Data feature is also required, but that one is enabled by default when configuring persistence). Don’t forget to reboot afterwards. When logging in after the reboot, please set an Administration Password.

The approach I use to run Signal on Tails is using flatpak, so install flatpak either via Synaptic or via commandline:

sudo apt install flatpak

Tails then asks if you want to add flatpak to your additional software and I recommend doing so. The list of additional software can be checked via Applications → System Tools → Additional Software. The next thing you need to do is set up the directories: flatpak installs the software packages either system-wide in $prefix/var/lib/flatpak/[1] or per user in $HOME/.local/share/flatpak/ (the latter lets you manage your flatpaks without having to use elevated permissions). User-specific data of the apps goes into $HOME/.var/app. This means we have to create directories in our Persistent folder for those two locations and then link them to their targets in /home/amnesia.

I recommend putting these commands into a script (i.e. /home/amnesia/Persistent/flatpak-setup.sh) and making it executable (chmod +x /home/amnesia/Persistent/flatpak-setup.sh):

#!/bin/sh

mkdir -p /home/amnesia/Persistent/flatpak
mkdir -p /home/amnesia/.local/share
ln -s /home/amnesia/Persistent/flatpak /home/amnesia/.local/share/flatpak
mkdir -p /home/amnesia/Persistent/app
mkdir -p /home/amnesia/.var
ln -s /home/amnesia/Persistent/app /home/amnesia/.var/app

Now you need to add a flatpak remote and install signal:

amnesia@amnesia:~$ torify flatpak remote-add --user --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
amnesia@amnesia:~$ torify flatpak install flathub org.signal.Signal

This will take a couple of minutes.
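Before wiring up the proxy, you can verify that the install landed in the per-user location; flatpak's list command should show org.signal.Signal (a small check of my own, assuming a reasonably recent flatpak):

amnesia@amnesia:~$ flatpak list --user --app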

To show Signal the way to the next whiskey bar through Tor, the HTTP_PROXY and HTTPS_PROXY environment variables have to be set. I recommend again putting this into a script (i.e. /home/amnesia/Persistent/signal.sh):

#!/bin/sh

export HTTP_PROXY=socks://127.0.0.1:9050
export HTTPS_PROXY=socks://127.0.0.1:9050
flatpak run org.signal.Signal

Yay it works!

To update Signal you have to run

amnesia@amnesia:~$ torify flatpak update

To make the whole thing a bit more comfortable, the folder symlinks can be created automatically on login using a Gnome autostart script. For that to work you have to have the Dotfiles feature of Tails enabled. Then you can create a /live/persistence/TailsData_unlocked/dotfiles/.config/autostart/FlatpakSetup.desktop file:

[Desktop Entry]
Name=FlatpakSetup
GenericName=Setup Flatpak on Tails
Comment=This script runs the flatpak-setup.sh script on start of the user session
Exec=/live/persistence/TailsData_unlocked/Persistent/flatpak-setup.sh
Terminal=false
Type=Application

By adding a /live/persistence/TailsData_unlocked/dotfiles/.local/share/applications/Signal.desktop file to the dotfiles folder, Signal also shows up among the Gnome applications with a nice Signal icon:

[Desktop Entry]
Name=Signal
GenericName=Signal Desktop Messenger
Exec=/home/amnesia/Persistent/signal.sh
Terminal=false
Type=Application
Icon=/home/amnesia/.local/share/flatpak/app/org.signal.Signal/current/active/files/share/icons/hicolor/128x128/apps/org.signal.Signal.png

  1. It is also possible to configure additional system-wide installation locations; details are documented in flatpak-installation(5).


Creative Commons License: the copyright of each article belongs to its respective author. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.