Planet Debian

Planet Debian - http://planet.debian.org/

Dirk Eddelbuettel: Rcpp now used by 10 percent of CRAN packages

23 August, 2017 - 18:18

Over the last few days, Rcpp passed another noteworthy hurdle. It is now used by over 10 percent of packages on CRAN (as measured by Depends, Imports and LinkingTo, but excluding Suggests). As of this morning 1130 packages use Rcpp out of a total of 11275 packages. The graph on the left shows the growth of both outright usage numbers (in darker blue, left axis) and relative usage (in lighter blue, right axis).

Older posts on this blog took note when Rcpp passed round hundreds of packages, most recently in April for 1000 packages. The growth rates for both Rcpp, and of course CRAN, are still staggering. A big thank you to everybody who makes this happen, from R Core and CRAN to all package developers, contributors, and of course all users driving this. We have built ourselves a rather impressive ecosystem.

So with that a heartfelt Thank You! to all users and contributors of R, CRAN, and of course Rcpp, for help, suggestions, bug reports, documentation, encouragement, and, of course, code.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Jonathan McDowell: Notes on upgrading from Jessie to Stretch

23 August, 2017 - 03:29

I upgraded my last major machine from Jessie to Stretch last week. That machine was the one running the most services, but I’d made notes while updating various others to ensure it went smoothly. Below are the things I noted along the way, both for my own reference and in case they are of use to anyone else.

  • Roundcube with the sqlite3 backend stopped working after the upgrade; fix was to edit /etc/roundcube/debian-db-roundcube.php and change sqlite3:// to sqlite:// in the $config['db_dsnw'] line.
  • Dovecot no longer supports SSLv2, so I had to remove !SSLv2 from the ssl_protocols list in /etc/dovecot/conf.d/10-ssl.conf.
  • Duplicity now tries to do a mkdir so I had to change from the scp:// backend to the sftp:// backend in my backup scripts.
  • Needed to add needs_root_rights=yes to /etc/X11/Xwrapper.config so Kodi systemd unit could still start it on a new VT. Need to figure out how to get this working without the need for root.
  • Upgrading fail2ban would have been easier if I’d dropped my additions in /etc/fail2ban/jail.d/ rather than the master config. Fixed for next time.
  • ejabberd continues to be a pain; I do wonder if it’s worth running an XMPP server these days. I certainly don’t end up using it to talk to people myself.
  • Upgrading 1200+ packages takes a long time, even when the majority of them don’t have any questions to ask during the process.
  • PostgreSQL upgrades have got so much easier. pg_upgradecluster 9.4 main chugged away but did exactly what I needed.

Other than those points things were pretty smooth. Nice work by all those involved!

John Goerzen: The Eclipse

22 August, 2017 - 22:17

Highway US-81 in northern Kansas and southern Nebraska is normally a pleasant, sleepy sort of drive. It was upgraded to a 4-lane road not too long ago, but as far as 4-lane roads go, its traffic is typically light. For drives from Kansas to South Dakota, it makes a pleasant route.

Yesterday was eclipse day. I strongly suspect that highway 81 had more traffic that day than it ever has before, or ever will again. For nearly the entire 3-hour drive to Geneva, NE, it was packed — though mostly still moving at a good speed. And for our entire drive back, highway 81 and every other southbound road we used was so full it felt like rush hour in Dallas. (Well, not quite. Traffic was still moving.) I believe scenes like this were played out across the continent.

I’ve been taking a lot of photos, and writing about our new baby Martha lately. Now it’s time to write a bit about some more adventures with Jacob and Oliver – they’re now in third and fifth grades in school.

We had been planning to fly, and airports I called were either full, or were planning to park planes in the grass, or even shut down some runways to use for parking. The airport in the little town of Beatrice, NE (which I had visited twice before) was even going to have a temporary FAA control tower. At the last minute, due to some storm activity near home at departure time, we unloaded the plane and drove instead.

The atmosphere at the fairgrounds in Geneva was festive. One family had brought bubbles for their kids — and extras to share.

I had bought the boys a book about the eclipse, which they were reading before and during the event. They were both great, safe users of their eclipse glasses.

Jacob caught a toad, and played with it for awhile. He wanted to bring it home with us, but I convinced him to let me take a picture of him with his toad friend instead.

While we were waiting for totality, a number of buses from the local school district arrived. So by the time the big moment arrived, we could hear the distant roar of delight and applause from the school children gathered at the far end of the field, plus all the excitement nearby. Both boys were absolutely ecstatic to be witnessing it (and so was I!) “Wow!” “Awesome!” And simple cackles of delight were heard. On the drive home, they both kept talking about how amazing it was, and it was “once in a lifetime.”

We enjoyed our “eclipse neighbors” – the woman from San Antonio next to us, the surprise discovery of another family from just a few miles from us parked two cars down, even running into relatives at a restaurant on the way home. The applause from all around when it started – and when it ended. And the feeling, which is hard to describe, of awe and amazement at the wonders of our world and our universe.

There are many problems with the world right now, but somehow there’s something right about people coming together from all over to enjoy it.

Sven Hoexter: whois databases and registration spam

22 August, 2017 - 21:59

Lately I experienced a new kind of spam, at least new to myself. It seems that spammers abuse registration input fields that do not implement strong enough validation, and echo back several values from the registration process in some kind of welcome mail, basically filling the spam message into the name and surname fields.

So far I found a bunch of those originating from the following AS: AS49453, AS50896, AS200557 and AS61440. The first three belong to something identifying itself as "QUALITYNETWORK". The last one, AS61440, seems to be involved only partially with some networks being delegated to "Atomohost".

To block them it's helpful to query the public radb service whois.radb.net for all networks belonging to the specific AS like this:

whois -h whois.radb.net -- '-i origin AS50896'

Another bunch of batch-capable whois services are provided by Team Cymru. They have some examples at the end of https://www.team-cymru.org/IP-ASN-mapping.html.
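
Since those services all speak the plain whois protocol (one query line out, the response back, connection closed), the same radb query can also be scripted. Below is a minimal Python sketch of mine; the AS number and the "route:" attribute filtering are example choices, not anything prescribed by radb:

import socket

def radb_networks(asn, server="whois.radb.net", port=43):
    """Return the prefixes of all route objects originating from asn."""
    with socket.create_connection((server, port)) as sock:
        sock.sendall(("-i origin %s\r\n" % asn).encode("ascii"))
        response = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            response += chunk
    # each announced prefix appears as a "route:" attribute
    # (IPv6 prefixes would show up as "route6:" instead)
    return [line.split(":", 1)[1].strip()
            for line in response.decode("ascii", "replace").splitlines()
            if line.startswith("route:")]

for network in radb_networks("AS50896"):
    print(network)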

In this specific case the spam was for "www.robotas.ru", which is currently terminated at CloudFlare and redirects via a JS document.location to "http://link31.net/b494d/ooo/", which in turn redirects via a JS window.location to "http://revizor-online.ga/", which is again hosted at CloudFlare. The page at the end plays some strange YouTube video - currently at around 1900 plays, so not that widely spread. In the end it's an interesting indicator of the spam campaign's success.

Norbert Preining: Debian packages as Graph Database

22 August, 2017 - 20:08

I have been playing around with Neo4j as graph database, and searching for a big dataset I decided to look at Debian packages (source and binary) from stable, testing, sid, and experimental, and represent all of that in a big graph database.

While this is far from ready, the following entities and relations are represented:

  • source packages, unversioned and versioned
  • binary packages, unversioned and versioned
  • maintainers
  • all dependencies, including alternatives and versioned dependencies
  • relations like maintains, builds, etc
  • suites (stable, testing, sid, experimental)

The graph currently has 220618 nodes and 782323 edges. My first attempt at importing this into the database was to generate one long cypher statement and then throw that at cypher-shell. Well, that was not the best idea. After 24h I stopped the process and rewrote the generation script to generate csv files. Using neo4j-import, the same amount of data was imported in 5secs (!!!).
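
For anyone curious what the csv route looks like, here is a minimal hypothetical sketch of the idea: neo4j-import expects plain csv files whose headers carry the :ID, :LABEL, :START_ID, :END_ID and :TYPE markers, so generating them from parsed package data is straightforward (the package data below is a stand-in, not Debian's):

import csv

# stand-in data: binary package -> list of its dependencies
packages = {
    "rcpp": ["r-base-core"],
    "r-base-core": [],
}

with open("nodes.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name:ID", ":LABEL"])  # header markers read by neo4j-import
    for package in packages:
        writer.writerow([package, "BinaryPackage"])

with open("edges.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([":START_ID", ":END_ID", ":TYPE"])
    for package, dependencies in packages.items():
        for dependency in dependencies:
            writer.writerow([package, dependency, "DEPENDS"])

# then, roughly: neo4j-import --nodes nodes.csv --relationships edges.csv ...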

What I would like to get in the future is the whole package history as well, and maybe also include all the bugs in the database … if only I had easily accessible and parseable information about these items (Debian Q&A maybe?). If you have any suggestions, please let me know.

More to come, stay tuned.

Daniel Silverstone: Building a USB descriptor table set

22 August, 2017 - 17:58

In order to proceed further on our USB/STM32 odyssey, we need to start to build a USB descriptor set for our first prototype piece of code. For this piece, we're going to put together a USB device which is vendor-specific class-wise and has a single configuration with one interface containing a single endpoint, none of which we're going to actually implement. What we're after is just to get the information presented to the computer so that lsusb can see it.

To get these built, let's refer to information we discovered and recorded in a previous post about how descriptors go together.

Device descriptor

Remembering that values which are > 1 byte in length are always stored little-endian, we can construct our device descriptor as:

Our device descriptor:

Field Name          Value     Bytes
bLength             18        0x12
bDescriptorType     DEVICE    0x01
bcdUSB              USB 2.0   0x00 0x02
bDeviceClass        0         0x00
bDeviceSubClass     0         0x00
bDeviceProtocol     0         0x00
bMaxPacketSize      64        0x40
idVendor            TEST      0xff 0xff
idProduct           TEST      0xff 0xff
bcdDevice           0.0.1     0x01 0x00
iManufacturer       1         0x01
iProduct            2         0x02
iSerialNumber       3         0x03
bNumConfigurations  1         0x01

We're using the vendor ID and product id 0xffff because at this point we don't have any useful values for this (it costs $5,000 to register a vendor ID).

This gives us a final byte array of:

0x12 0x01 0x00 0x02 0x00 0x00 0x00 0x40 (Early descriptor)

0xff 0xff 0xff 0xff 0x01 0x00 0x01 0x02 0x03 0x01 (and the rest)

We're reserving string ids 1, 2, and 3, for the manufacturer string, product name string, and serial number string respectively. I'm deliberately including them all so that we can see it all come out later in lsusb.

If you feed the above hex sequence into a USB descriptor decoder then you can check my working.
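
In the same spirit, here is a little Python sketch of mine that unpacks those 18 bytes according to the field table above; struct's "<" prefix takes care of the little-endian multi-byte fields:

    import struct

    DEV_DESC = bytes([0x12, 0x01, 0x00, 0x02, 0x00, 0x00, 0x00, 0x40,
                      0xff, 0xff, 0xff, 0xff, 0x01, 0x00, 0x01, 0x02,
                      0x03, 0x01])

    # B = u8, H = little-endian u16; the layout mirrors the field table
    names = ["bLength", "bDescriptorType", "bcdUSB", "bDeviceClass",
             "bDeviceSubClass", "bDeviceProtocol", "bMaxPacketSize",
             "idVendor", "idProduct", "bcdDevice", "iManufacturer",
             "iProduct", "iSerialNumber", "bNumConfigurations"]
    for name, value in zip(names, struct.unpack("<BBHBBBBHHHBBBB", DEV_DESC)):
        print("%-18s 0x%04x" % (name, value))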

Endpoint Descriptor

We want a single configuration, which covers our one interface, with one endpoint in it. Let's start with the endpoint...

Our bulk IN endpoint:

Field Name        Value     Bytes
bLength           7         0x07
bDescriptorType   ENDPOINT  0x05
bEndpointAddress  EP2IN     0x82
bmAttributes      BULK      0x02
wMaxPacketSize    64        0x40 0x00
bInterval         IGNORED   0x00

We're giving a single bulk IN endpoint, since that's the simplest thing to describe at this time. This endpoint will never be ready and so nothing will ever be read into the host.

All that gives us:

0x07 0x05 0x82 0x02 0x40 0x00 0x00

Interface Descriptor

The interface descriptor prefaces the endpoint set, and thanks to our simple single endpoint, and no plans for alternate interfaces, we can construct the interface simply as:

Our single simple interface:

Field Name          Value      Bytes
bLength             9          0x09
bDescriptorType     INTERFACE  0x04
bInterfaceNumber    1          0x01
bAlternateSetting   1          0x01
bNumEndpoints       1          0x01
bInterfaceClass     0          0x00
bInterfaceSubClass  0          0x00
bInterfaceProtocol  0          0x00
iInterface          5          0x05

All that gives us:

0x09 0x04 0x01 0x01 0x01 0x00 0x00 0x00 0x05

Configuration descriptor

Finally we can put it all together and get the configuration descriptor...

Our sole configuration, encapsulating the interface and endpoint above:

Field Name           Value                 Bytes
bLength              9                     0x09
bDescriptorType      CONFIG                0x02
wTotalLength         9+9+7                 0x19 0x00
bNumInterfaces       1                     0x01
bConfigurationValue  1                     0x01
iConfiguration       4                     0x04
bmAttributes         Bus powered, no wake  0x80
bMaxPower            500mA                 0xfa

The wTotalLength field is interesting. It contains the configuration length, the interface length, and the endpoint length, hence 9 plus 9 plus 7 is 25.

This gives:

0x09 0x02 0x19 0x00 0x01 0x01 0x04 0x80 0xfa
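
As a sanity check on that arithmetic, a small Python sketch (mine, not part of the original working) can walk the concatenated configuration descriptor one bLength-sized step at a time and confirm the total:

    CONF_DESC = bytes([0x09, 0x02, 0x19, 0x00, 0x01, 0x01, 0x04, 0x80, 0xfa,
                       0x09, 0x04, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x05,
                       0x07, 0x05, 0x82, 0x02, 0x40, 0x00, 0x00])

    w_total_length = CONF_DESC[2] | (CONF_DESC[3] << 8)  # little-endian u16

    # every descriptor starts with its own bLength, so hop from one to the next
    offset, lengths = 0, []
    while offset < len(CONF_DESC):
        lengths.append(CONF_DESC[offset])
        offset += CONF_DESC[offset]

    print(lengths, sum(lengths) == w_total_length)  # [9, 9, 7] True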

String descriptors

We allowed ourselves a total of five strings: iManufacturer, iProduct, iSerial (from the device descriptor), iConfiguration (from the configuration descriptor), and iInterface (from the interface descriptor) respectively.

Our string descriptors will therefore be:

String descriptor zero, en_GB only:

Field Name       Value   Bytes
bLength          4       0x04
bDescriptorType  STRING  0x03
wLangID[0]       en_GB   0x09 0x08

0x04 0x03 0x09 0x08

...and...

String descriptor one, iManufacturer:

Field Name       Value                 Bytes
bLength          38                    0x26
bDescriptorType  STRING                0x03
bString          "Rusty Manufacturer"  ...

0x26 0x03 0x52 0x00 0x75 0x00 0x73 0x00

0x74 0x00 0x79 0x00 0x20 0x00 0x4d 0x00

0x61 0x00 0x6e 0x00 0x75 0x00 0x66 0x00

0x61 0x00 0x63 0x00 0x74 0x00 0x75 0x00

0x72 0x00 0x65 0x00 0x72 0x00

(You get the idea, there's no point me breaking down the rest of the string descriptors here, suffice it to say that the other strings are appropriate for the values they represent - namely product, serial, configuration, and interface.)
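
Rather than hand-encoding the remaining strings, a tiny helper can generate any of them, since string descriptors are just a two-byte header followed by the UTF-16LE text; this is my own sketch, not code from the prototype:

    def string_descriptor(text):
        payload = text.encode("utf-16-le")
        # bLength counts the two header bytes too; 0x03 is the STRING type
        return bytes([2 + len(payload), 0x03]) + payload

    descriptor = string_descriptor("Rusty Manufacturer")
    print(" ".join("0x%02x" % byte for byte in descriptor))
    # begins 0x26 0x03 0x52 0x00 0x75 0x00 ... matching the bytes above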

Putting it all together

Given all the above, we have a device descriptor which is standalone, then a configuration descriptor which encompasses the interface and endpoint descriptors too. Finally we have a string descriptor table with six entries: the first is the set of available languages, and the rest are our strings. In total we have:

    // Device descriptor
    const DEV_DESC: [u8; 18] = [
        0x12, 0x01, 0x00, 0x02, 0x00, 0x00, 0x00, 0x40,
        0xff, 0xff, 0xff, 0xff, 0x01, 0x00, 0x01, 0x02,
        0x03, 0x01
    ];

    // Configuration descriptor
    const CONF_DESC: [u8; 25] = [
        0x09, 0x02, 0x19, 0x00, 0x01, 0x01, 0x04, 0x80, 0xfa,
        0x09, 0x04, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x05,
        0x07, 0x05, 0x82, 0x02, 0x40, 0x00, 0x00
    ];

    // String Descriptor zero
    const STR_DESC_0: [u8; 4] = [0x04, 0x03, 0x09, 0x08];

    // String Descriptor 1, "Rusty Manufacturer"
    const STR_DESC_1: [u8; 38] = [
        0x26, 0x03, 0x52, 0x00, 0x75, 0x00, 0x73, 0x00,
        0x74, 0x00, 0x79, 0x00, 0x20, 0x00, 0x4d, 0x00,
        0x61, 0x00, 0x6e, 0x00, 0x75, 0x00, 0x66, 0x00,
        0x61, 0x00, 0x63, 0x00, 0x74, 0x00, 0x75, 0x00,
        0x72, 0x00, 0x65, 0x00, 0x72, 0x00
    ];

    // String Descriptor 2, "Rusty Product"
    const STR_DESC_2: [u8; 28] = [
        0x1c, 0x03, 0x52, 0x00, 0x75, 0x00, 0x73, 0x00,
        0x74, 0x00, 0x79, 0x00, 0x20, 0x00, 0x50, 0x00,
        0x72, 0x00, 0x6f, 0x00, 0x64, 0x00, 0x75, 0x00,
        0x63, 0x00, 0x74, 0x00
    ];

    // String Descriptor 3, "123ABC"
    const STR_DESC_3: [u8; 14] = [
        0x0e, 0x03, 0x31, 0x00, 0x32, 0x00, 0x33, 0x00,
        0x41, 0x00, 0x42, 0x00, 0x43, 0x00
    ];

    // String Descriptor 4, "Rusty Configuration"
    const STR_DESC_4: [u8; 40] = [
        0x28, 0x03, 0x52, 0x00, 0x75, 0x00, 0x73, 0x00,
        0x74, 0x00, 0x79, 0x00, 0x20, 0x00, 0x43, 0x00,
        0x6f, 0x00, 0x6e, 0x00, 0x66, 0x00, 0x69, 0x00,
        0x67, 0x00, 0x75, 0x00, 0x72, 0x00, 0x61, 0x00,
        0x74, 0x00, 0x69, 0x00, 0x6f, 0x00, 0x6e, 0x00
    ];

    // String Descriptor 5, "Rusty Interface"
    const STR_DESC_5: [u8; 32] = [
        0x20, 0x03, 0x52, 0x00, 0x75, 0x00, 0x73, 0x00,
        0x74, 0x00, 0x79, 0x00, 0x20, 0x00, 0x49, 0x00,
        0x6e, 0x00, 0x74, 0x00, 0x65, 0x00, 0x72, 0x00,
        0x66, 0x00, 0x61, 0x00, 0x63, 0x00, 0x65, 0x00
    ];

With the above, we're a step closer to our first prototype which will hopefully be enumerable. Next time we'll look at beginning our prototype low level USB device stack mock-up.

Aigars Mahinovs: Debconf 17 photo retrospective

22 August, 2017 - 01:49

Debconf17 has come and gone by too fast, so we could all use a moment looking back at all the fun and serious happenings of the main event in the Debian social calendar. You can find my full photo gallery on Google, Flickr and Debconf Share.

Make sure to check out the Debconf17 group photo and as an extra special treat for you - enjoy the "living" Debconf17 group photo!

 

Dirk Eddelbuettel: RcppArmadillo 0.7.960.1.1

21 August, 2017 - 02:20

On the heels of the very recent bi-monthly RcppArmadillo release comes a quick bug-fix release 0.7.960.1.1 which just got onto CRAN (and I will ship a build to Debian in a moment).

There were three distinct issues I addressed in three quick pull requests:

  • The excellent Google Summer of Code work by Binxiang Ni had only encountered direct use of sparse matrices as produced by the Matrix package. However, while we waited for 0.7.960.1.0 to make it onto CRAN, the quanteda package switched to derived classes---which we now account for via the is() method of our S4 class. Thanks to Kevin Ushey for reminding me we had is().
  • We somehow failed to account for the R 3.4.* and Rcpp 0.12.{11,12} changes for package registration (with .registration=TRUE), so we ensured we only have one fastLm symbol.
  • The build did not take too well to systems without OpenMP, so we now explicitly disable OpenMP support via an Armadillo configuration variable. In general, client packages probably want to enable C++11 support when using OpenMP (explicitly), but we prefer to not upset too many (old) users. However, our configure check now also wants g++ 4.7.2 or later, just like Armadillo.

Armadillo is a powerful and expressive C++ template library for linear algebra aiming towards a good balance between speed and ease of use, with a syntax deliberately close to Matlab. RcppArmadillo integrates this library with the R environment and language---and is widely used by (currently) 382 other packages on CRAN---an increase of 52 since the CRAN release in June!

Changes in this release relative to the previous CRAN release are as follows:

Changes in RcppArmadillo version 0.7.960.1.1 (2017-08-20)
  • Added improved check for inherited S4 matrix classes (#162 fixing #161)

  • Changed fastLm C++ function to fastLm_impl to not clash with R method (#164 fixing #163)

  • Added OpenMP check for configure (#166 fixing #165)

Courtesy of CRANberries, there is a diffstat report. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Steinar H. Gunderson: Random codec notes

21 August, 2017 - 02:15

Post-Solskogen, there haven't been all that many commits in the main Nageru repository, but that doesn't mean the project is standing still. In particular, I've been working with NVIDIA to shake out a crash bug in their drivers (which in itself uncovered some interesting debugging techniques, although in the end, the bug turned out just to be uncovered by the boring standard technique of analyzing crash dumps and writing a minimal program to reproduce). But I've also been looking at intraframe codecs; my sort-of plan was to go to VideoLAN Dev Days to present my findings, but unfortunately, there seems to be a schedule conflict, so instead, you can have some scattered random notes:

  • JPEG is really impressive technology. It was made in 1992, and it's still frighteningly close to the state of the art! Beating it without getting a lot slower isn't trivial at all; witness that even with WebP etc., we still don't have a JPEG-killer because it's just so incredibly good. (Granted, in 1992, decompressing a JPEG would easily take half a minute or more.)
  • rANS is pretty neat; it gives you almost all the advantages of arithmetic coding, but does without divisions in the decoder (and if you have static probabilities and fast high multiplies, also in the encoder). I still don't feel like I understand it fully, though. And it has the odd property of having to encode and decode in different directions, which may or may not be a problem to you. (A toy round-trip sketch follows this list.) But ryg's interleaving tricks to get SIMD are pretty nice; I won't claim to fully understand those either. Maybe I will need to eventually.
  • GPU performance models continue to bend my head. GPUs may have managed to make a parallel programming model that people actually manage to use, but getting really high performance out of compute shaders is still hard to get used to.
  • It really annoys me that Quick Sync doesn't support luma-only encoding (ie. 4:0:0); the best you can do is seemingly 4:2:0 and then have empty chroma planes, which wastes resources. It would have been a nice way to hack around the limitation that you can't have 4:2:2 or alpha.
  • Time-frequency switching is the solution to a problem that sounds trivial if you're a DSP beginner and nearly impossible if you've actually done some DSP. I'm not intuitively convinced it's worth it if you have fast float muladds, but it's pretty neat. Allowing multiple block sizes to an intraframe codec has a cost in that you'll need a size decision heuristic, though, which pretty much instantly complicates the encoder.
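
Since the core of rANS is small enough to show in full, here is a toy Python round-trip of the basic unrenormalized, big-integer variant (my own sketch for building intuition, nothing from Nageru), which also makes the encode-backwards/decode-forwards property visible:

def cumulative(freqs):
    # cumulative frequency table and total, in a fixed symbol order
    cum, start = {}, 0
    for symbol in sorted(freqs):
        cum[symbol] = start
        start += freqs[symbol]
    return cum, start

def rans_encode(symbols, freqs):
    cum, total = cumulative(freqs)
    x = 0
    for s in reversed(symbols):  # note: encoding runs backwards
        x = (x // freqs[s]) * total + cum[s] + (x % freqs[s])
    return x

def rans_decode(x, freqs, count):
    cum, total = cumulative(freqs)
    out = []
    for _ in range(count):  # ...and decoding runs forwards
        slot = x % total
        s = next(t for t in cum if cum[t] <= slot < cum[t] + freqs[t])
        out.append(s)
        x = freqs[s] * (x // total) + slot - cum[s]
    return out

freqs = {"a": 3, "b": 1}
message = list("abaa")
assert rans_decode(rans_encode(message, freqs), freqs, len(message)) == message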

Work in progress :-) Maybe something more coherent will come out eventually.

Edit: Forgot about TF switching!

Joachim Breitner: Compose Conference talk video online

21 August, 2017 - 01:50

Three months ago, I gave a talk at the Compose::Conference in New York about how Chris Smith and I added the ability to create networked multi-user programs to the educational Haskell programming environment CodeWorld, and finally the recording of the talk is available on YouTube (and is being discussed on reddit):

It was the talk where I got the most positive feedback afterwards, and I think this is partly due to how I created the presentation: instead of showing static slides, I programmed the complete visual display from scratch as an “interaction” within the CodeWorld environment, including all transitions, a working embedded game of Pong and a simulated multi-player environment with adjustable message delays. I have put the code for the presentation online.

Chris and I have written about this for ICFP'17, and thanks to open access I can actually share the paper freely with you and under a CC license. If you come to Oxford you can see me perform a shorter version of this talk again.

Vincent Bernat: IPv6 route lookup on Linux

20 August, 2017 - 23:00

TL;DR: With its implementation of IPv6 routing tables using radix trees, Linux offers subpar performance (450 ns for a full view — 40,000 routes) compared to IPv4 (50 ns for a full view — 500,000 routes) but fair memory usage (20 MiB for a full view).

In a previous article, we had a look at IPv4 route lookup on Linux. Let’s see how different IPv6 is.

Lookup trie implementation

Looking up a prefix in a routing table comes down to finding the most specific entry matching the requested destination. A common structure for this task is the trie, a tree structure where each node's key has its parent's key as a prefix.

With IPv4, Linux uses a level-compressed trie (or LPC-trie), providing good performance with low memory usage. For IPv6, Linux uses a more classic radix tree (or Patricia trie). There are three reasons for not sharing:

  • The IPv6 implementation (introduced in Linux 2.1.8, 1996) predates the IPv4 implementation based on LPC-tries (in Linux 2.6.13, commit 19baf839ff4a).
  • The feature set is different. Notably, IPv6 supports source-specific routing[1] (since Linux 2.1.120, 1998).
  • The IPv4 address space is denser than the IPv6 address space. Level-compression is therefore quite efficient with IPv4. This may not be the case with IPv6.

The trie in the below illustration encodes 6 prefixes:

For a more in-depth explanation of the different ways to encode a routing table into a trie and a better understanding of radix trees, see the explanations for IPv4.

The following figure shows the in-memory representation of the previous radix tree. Each node corresponds to a struct fib6_node. When a node has the RTN_RTINFO flag set, it embeds a pointer to a struct rt6_info containing information about the next-hop.

The fib6_lookup_1() function walks the radix tree in two steps:

  1. walking down the tree to locate the potential candidate, and
  2. checking the candidate and, if needed, backtracking until a match.

Here is a slightly simplified version without source-specific routing:

static struct fib6_node *fib6_lookup_1(struct fib6_node *root,
                                       struct in6_addr  *addr)
{
    struct fib6_node *fn;
    __be32 dir;

    /* Step 1: locate potential candidate */
    fn = root;
    for (;;) {
        struct fib6_node *next;
        dir = addr_bit_set(addr, fn->fn_bit);
        next = dir ? fn->right : fn->left;
        if (next) {
            fn = next;
            continue;
        }
        break;
    }

    /* Step 2: check prefix and backtrack if needed */
    while (fn) {
        if (fn->fn_flags & RTN_RTINFO) {
            struct rt6key *key;
            key = &fn->leaf->rt6i_dst;
            if (ipv6_prefix_equal(&key->addr, addr, key->plen)) {
                if (fn->fn_flags & RTN_RTINFO)
                    return fn;
            }
        }

        if (fn->fn_flags & RTN_ROOT)
            break;
        fn = fn->parent;
    }

    return NULL;
}
Caching

While IPv4 lost its route cache in Linux 3.6 (commit 5e9965c15ba8), IPv6 still has a caching mechanism. However, cache entries are put directly in the radix tree instead of in a distinct structure.

Since Linux 2.1.30 (1997) and until Linux 4.2 (commit 45e4fd26683c), almost any successful route lookup inserts a cache entry in the radix tree. For example, a router forwarding a ping between 2001:db8:1::1 and 2001:db8:3::1 would get those two cache entries:

$ ip -6 route show cache
2001:db8:1::1 dev r2-r1  metric 0
    cache
2001:db8:3::1 via 2001:db8:2::2 dev r2-r3  metric 0
    cache

These entries are cleaned up by the ip6_dst_gc() function controlled by the following parameters:

$ sysctl -a | grep -F net.ipv6.route
net.ipv6.route.gc_elasticity = 9
net.ipv6.route.gc_interval = 30
net.ipv6.route.gc_min_interval = 0
net.ipv6.route.gc_min_interval_ms = 500
net.ipv6.route.gc_thresh = 1024
net.ipv6.route.gc_timeout = 60
net.ipv6.route.max_size = 4096
net.ipv6.route.mtu_expires = 600

The garbage collector is triggered at most every 500 ms when there are more than 1024 entries or at least every 30 seconds. The garbage collection won’t run for more than 60 seconds, except if there are more than 4096 routes. When running, it will first delete entries older than 30 seconds. If the number of cache entries is still greater than 4096, it will continue to delete more recent entries (but no more recent than 512 jiffies, which is the value of gc_elasticity) after a 500 ms pause.

Starting from Linux 4.2 (commit 45e4fd26683c), only a PMTU exception would create a cache entry. A router doesn’t have to handle those exceptions, so only hosts would get cache entries. And they should be pretty rare. Martin KaFai Lau explains:

Out of all IPv6 RTF_CACHE routes that are created, the percentage that has a different MTU is very small. In one of our end-user facing proxy server, only 1k out of 80k RTF_CACHE routes have a smaller MTU. For our DC traffic, there is no MTU exception.

Here is what a cache entry with a PMTU exception looks like:

$ ip -6 route show cache
2001:db8:1::50 via 2001:db8:1::13 dev out6  metric 0
    cache  expires 573sec mtu 1400 pref medium
Performance

We consider three distinct scenarios:

Excerpt of an Internet full view
In this scenario, Linux acts as an edge router attached to the default-free zone. Currently, the size of such a routing table is a little bit above 40,000 routes.
/48 prefixes spread linearly with different densities
Linux acts as a core router inside a datacenter. Each customer or rack gets one or several /48 networks, which need to be routed around. With a density of 1, /48 subnets are contiguous.
/128 prefixes spread randomly in a fixed /108 subnet
Linux acts as a leaf router for a /64 subnet with hosts getting their IP using autoconfiguration. It is assumed all hosts share the same OUI and therefore the first 40 bits are fixed. In this scenario, neighbor reachability information for the /64 subnet is converted into routes by some external process and redistributed among other routers sharing the same subnet[2].
Route lookup performance

With the help of a small kernel module, we can accurately benchmark[3] the ip6_route_output_flags() function and correlate the results with the radix tree size:

Getting meaningful results is challenging due to the size of the address space. None of the scenarios have a fallback route and we only measure time for successful hits[4]. For the full view scenario, only the range from 2400::/16 to 2a06::/16 is scanned (it contains more than half of the routes). For the /128 scenario, the whole /108 subnet is scanned. For the /48 scenario, the range from the first /48 to the last one is scanned. For each range, 5000 addresses are picked semi-randomly. This operation is repeated until we get 5000 hits or until 1 million tests have been executed.

The relation between the maximum depth and the lookup time is incomplete and I can’t explain the difference of performance between the different densities of the /48 scenario.

We can extract two important performance points:

  • With a full view, the lookup time is 450 ns. This is almost ten times the budget for forwarding at 10 Gbps — which is about 50 ns.
  • With an almost empty routing table, the lookup time is 150 ns. This is still over the time budget for forwarding at 10 Gbps.

With IPv4, the lookup time for an almost empty table was 20 ns while the lookup time for a full view (500,000 routes) was a bit above 50 ns. How to explain such a difference? First, the maximum depth of the IPv4 LPC-trie with 500,000 routes was 6, while the maximum depth of the IPv6 radix tree for 40,000 routes is 40.

Second, while both IPv4’s fib_lookup() and IPv6’s ip6_route_output_flags() functions have a fixed cost implied by the evaluation of routing rules, IPv4 has several optimizations when the rules are left unmodified[5]. Those optimizations are removed on the first modification. If we cancel those optimizations, the lookup time for IPv4 is impacted by about 30 ns. This still leaves a 100 ns difference with IPv6 to be explained.

Let’s compare how time is spent in each lookup function. Here is a CPU flamegraph for IPv4’s fib_lookup():

Only 50% of the time is spent in the actual route lookup. The remaining time is spent evaluating the routing rules (about 30 ns). This ratio is dependent on the number of routes we inserted (only 1000 in this example). It should be noted the fib_table_lookup() function is executed twice: once with the local routing table and once with the main routing table.

The equivalent flamegraph for IPv6’s ip6_route_output_flags() is depicted below:

Here is an approximate breakdown on the time spent:

  • 50% is spent in the route lookup in the main table,
  • 15% is spent in handling locking (IPv4 is using the more efficient RCU mechanism),
  • 5% is spent in the route lookup of the local table,
  • most of the remaining is spent in routing rule evaluation (about 100 ns)[6].

Why is the evaluation of routing rules less efficient with IPv6? Again, I don’t have a definitive answer.

History

The following graph shows the performance progression of route lookups through Linux history:

All kernels are compiled with GCC 4.9 (from Debian Jessie). This version is able to compile older kernels as well as current ones. The kernel configuration is the default one with CONFIG_SMP, CONFIG_IPV6, CONFIG_IPV6_MULTIPLE_TABLES and CONFIG_IPV6_SUBTREES options enabled. Some other unrelated options are enabled to be able to boot them in a virtual machine and run the benchmark.

There are three notable performance changes:

  • In Linux 3.1, Eric Dumazet slightly delays the copy of route metrics to fix the undesirable sharing of route-specific metrics by all cache entries (commit 21efcfa0ff27). Each cache entry now gets its own metrics, which explains the performance hit for the non-/128 scenarios.
  • In Linux 3.9, Yoshifuji Hideaki removes the reference to the neighbor entry in struct rt6_info (commit 887c95cc1da5). This should have led to a performance increase. The small regression may be due to cache-related issues.
  • In Linux 4.2, Martin KaFai Lau prevents the creation of cache entries for most route lookups. The most noticeable performance improvement comes with commit 4b32b5ad31a6. The second one is from commit 45e4fd26683c, which effectively removes creation of cache entries, except for PMTU exceptions.
Insertion performance

Another interesting performance-related metric is the insertion time. Linux is able to insert a full view in less than two seconds. For some reason, the insertion time is not linear above 50,000 routes and climbs very fast to 60 seconds for 500,000 routes.

Despite its more complex insertion logic, the IPv4 subsystem is able to insert 2 million routes in less than 10 seconds.

Memory usage

Radix tree nodes (struct fib6_node) and routing information (struct rt6_info) are allocated with the slab allocator[7]. It is therefore possible to extract the information from /proc/slabinfo when the kernel is booted with the slab_nomerge flag:

# sed -ne 2p -e '/^ip6_dst/p' -e '/^fib6_nodes/p' /proc/slabinfo | cut -f1 -d:
#  name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
fib6_nodes         76101  76104     64   63    1
ip6_dst_cache      40090  40090    384   10    1

In the above example, the used memory is 76104×64+40090×384 bytes (about 20 MiB). The number of struct rt6_info matches the number of routes while the number of nodes is roughly twice the number of routes:
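
For the curious, the same arithmetic can be done straight from /proc/slabinfo with a few lines of Python (my own sketch; the two cache names are the ones shown above):

# total the memory used by the IPv6 routing caches
# (run as root, ideally with the kernel booted with slab_nomerge)
wanted = {"fib6_nodes", "ip6_dst_cache"}
total = 0
with open("/proc/slabinfo") as f:
    for line in f:
        fields = line.split()
        if fields and fields[0] in wanted:
            # columns: name, active_objs, num_objs, objsize, ...
            total += int(fields[2]) * int(fields[3])
print("%.1f MiB" % (total / 2**20))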

The memory usage is therefore quite predictable and reasonable, as even a small single-board computer can support several full views (20 MiB for each):

The LPC-trie used for IPv4 is more efficient: when 512 MiB of memory is needed for IPv6 to store 1 million routes, only 128 MiB are needed for IPv4. The difference is mainly due to the size of struct rt6_info (336 bytes) compared to the size of IPv4’s struct fib_alias (48 bytes): IPv4 puts most information about next-hops in struct fib_info structures that are shared with many entries.

Conclusion

The takeaways from this article are:

  • upgrade to Linux 4.2 or more recent to avoid excessive caching,
  • route lookups are noticeably slower compared to IPv4 (by an order of magnitude),
  • the CONFIG_IPV6_MULTIPLE_TABLES option incurs a fixed penalty of 100 ns per lookup,
  • memory usage is fair (20 MiB for 40,000 routes).

Compared to IPv4, IPv6 in Linux doesn’t foster the same interest, notably in terms of optimization. Hopefully, things are changing as its adoption and use “at scale” are increasing.

  1. For a given destination prefix, it’s possible to attach source-specific prefixes:

    ip -6 route add 2001:db8:1::/64 \
      from 2001:db8:3::/64 \
      via fe80::1 \
      dev eth0
    

    Lookup is first done on the destination address, then on the source address. 

  2. This is quite different from the classic scenario where Linux acts as a gateway for a /64 subnet. In this case, the neighbor subsystem stores the reachability information for each host and the routing table only contains a single /64 prefix. 

  3. The measurements are done in a virtual machine with one vCPU and no neighbors. The host is an Intel Core i5-4670K running at 3.7 GHz during the experiment (CPU governor set to performance). The benchmark is single-threaded. Many lookups are performed and the result reported is the median value. Timings of individual runs are computed from the TSC. 

  4. Most of the packets in the network are expected to be routed to a destination. However, this also means the backtracking code path is not used in the /128 and /48 scenarios. Having a fallback route gives far different results and makes it difficult to ensure we explore the address space correctly. 

  5. The exact same optimizations could be applied for IPv6. Nobody has done it yet. 

  6. Compiling out table support effectively removes those last 100 ns. 

  7. There are also per-CPU pointers allocated directly (4 bytes per entry per CPU on a 64-bit architecture). We ignore this detail. 

Dirk Eddelbuettel: #10: Compacting your Shared Libraries, After The Build

20 August, 2017 - 20:40

Welcome to the tenth post in the rarely ranting R recommendations series, or R4 for short. A few days ago we showed how to tell the linker to strip shared libraries. As discussed in the post, there are two options. One can either set up ~/.R/Makevars by passing the strip-debug option to the linker. Alternatively, one can adjust src/Makevars in the package itself with a bit of Makefile magic.

Of course, there is a third way: just run strip --strip-debug over all the shared libraries after the build. As the path is standardized, and the shell does proper globbing, we can just do

$ strip --strip-debug /usr/local/lib/R/site-library/*/libs/*.so

using a double-wildcard to get all packages (in that R package directory) and all their shared libraries. Users on macOS probably want .dylib on the end, users on Windows want another computer as usual (just kidding: use .dll). Either may have to adjust the path which is left as an exercise to the reader.
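
If you would rather script it and see the savings per library, here is a small Python sketch of mine doing the same thing (same path assumption as above):

import glob
import os
import subprocess

total_before = total_after = 0
for lib in glob.glob("/usr/local/lib/R/site-library/*/libs/*.so"):
    before = os.path.getsize(lib)
    subprocess.run(["strip", "--strip-debug", lib], check=True)
    after = os.path.getsize(lib)
    total_before += before
    total_after += after
    print("%-60s %7.1fK -> %7.1fK" % (lib, before / 1024, after / 1024))
print("total: %.1fM -> %.1fM" % (total_before / 2**20, total_after / 2**20))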

The impact can be Yuge as illustrated in the following dotplot:

This illustration is in response to a mailing list post. Last week, someone claimed on r-help that tidyverse would not install on Ubuntu 17.04. And this is of course patently false as many of us build and test on Ubuntu and related Linux systems, Travis runs on it, CRAN tests them etc pp. That poor user had somehow messed up their default gcc version. Anyway: I fired up a Docker container, installed r-base-core plus three required -dev packages (for xml2, openssl, and curl) and ran a single install.packages("tidyverse"). In a nutshell, following the launch of Docker for an Ubuntu 17.04 container, it was just

$ apt-get update
$ apt-get install r-base libcurl4-openssl-dev libssl-dev libxml2-dev
$ apt-get install mg          # a tiny editor
$ mg /etc/R/Rprofile.site     # to add a default CRAN repo
$ R -e 'install.packages("tidyverse")'

which not only worked (as expected) but also installed a whopping fifty-one packages (!!) of which twenty-six contain a shared library. A useful little trick is to run du with proper options to total, summarize, and use human units which reveals that these libraries occupy seventy-eight megabytes:

root@de443801b3fc:/# du -csh /usr/local/lib/R/site-library/*/libs/*so
4.3M    /usr/local/lib/R/site-library/Rcpp/libs/Rcpp.so
2.3M    /usr/local/lib/R/site-library/bindrcpp/libs/bindrcpp.so
144K    /usr/local/lib/R/site-library/colorspace/libs/colorspace.so
204K    /usr/local/lib/R/site-library/curl/libs/curl.so
328K    /usr/local/lib/R/site-library/digest/libs/digest.so
33M     /usr/local/lib/R/site-library/dplyr/libs/dplyr.so
36K     /usr/local/lib/R/site-library/glue/libs/glue.so
3.2M    /usr/local/lib/R/site-library/haven/libs/haven.so
272K    /usr/local/lib/R/site-library/jsonlite/libs/jsonlite.so
52K     /usr/local/lib/R/site-library/lazyeval/libs/lazyeval.so
64K     /usr/local/lib/R/site-library/lubridate/libs/lubridate.so
16K     /usr/local/lib/R/site-library/mime/libs/mime.so
124K    /usr/local/lib/R/site-library/mnormt/libs/mnormt.so
372K    /usr/local/lib/R/site-library/openssl/libs/openssl.so
772K    /usr/local/lib/R/site-library/plyr/libs/plyr.so
92K     /usr/local/lib/R/site-library/purrr/libs/purrr.so
13M     /usr/local/lib/R/site-library/readr/libs/readr.so
4.7M    /usr/local/lib/R/site-library/readxl/libs/readxl.so
1.2M    /usr/local/lib/R/site-library/reshape2/libs/reshape2.so
160K    /usr/local/lib/R/site-library/rlang/libs/rlang.so
928K    /usr/local/lib/R/site-library/scales/libs/scales.so
4.9M    /usr/local/lib/R/site-library/stringi/libs/stringi.so
1.3M    /usr/local/lib/R/site-library/tibble/libs/tibble.so
2.0M    /usr/local/lib/R/site-library/tidyr/libs/tidyr.so
1.2M    /usr/local/lib/R/site-library/tidyselect/libs/tidyselect.so
4.7M    /usr/local/lib/R/site-library/xml2/libs/xml2.so
78M     total
root@de443801b3fc:/# 

Looks like dplyr wins this one at thirty-three megabytes just for its shared library.

But with a single stroke of strip we can reduce all this down a lot:

root@de443801b3fc:/# strip --strip-debug /usr/local/lib/R/site-library/*/libs/*so
root@de443801b3fc:/# du -csh /usr/local/lib/R/site-library/*/libs/*so
440K    /usr/local/lib/R/site-library/Rcpp/libs/Rcpp.so
220K    /usr/local/lib/R/site-library/bindrcpp/libs/bindrcpp.so
52K     /usr/local/lib/R/site-library/colorspace/libs/colorspace.so
56K     /usr/local/lib/R/site-library/curl/libs/curl.so
120K    /usr/local/lib/R/site-library/digest/libs/digest.so
2.5M    /usr/local/lib/R/site-library/dplyr/libs/dplyr.so
16K     /usr/local/lib/R/site-library/glue/libs/glue.so
404K    /usr/local/lib/R/site-library/haven/libs/haven.so
76K     /usr/local/lib/R/site-library/jsonlite/libs/jsonlite.so
20K     /usr/local/lib/R/site-library/lazyeval/libs/lazyeval.so
24K     /usr/local/lib/R/site-library/lubridate/libs/lubridate.so
8.0K    /usr/local/lib/R/site-library/mime/libs/mime.so
52K     /usr/local/lib/R/site-library/mnormt/libs/mnormt.so
84K     /usr/local/lib/R/site-library/openssl/libs/openssl.so
76K     /usr/local/lib/R/site-library/plyr/libs/plyr.so
32K     /usr/local/lib/R/site-library/purrr/libs/purrr.so
648K    /usr/local/lib/R/site-library/readr/libs/readr.so
400K    /usr/local/lib/R/site-library/readxl/libs/readxl.so
128K    /usr/local/lib/R/site-library/reshape2/libs/reshape2.so
56K     /usr/local/lib/R/site-library/rlang/libs/rlang.so
100K    /usr/local/lib/R/site-library/scales/libs/scales.so
496K    /usr/local/lib/R/site-library/stringi/libs/stringi.so
124K    /usr/local/lib/R/site-library/tibble/libs/tibble.so
164K    /usr/local/lib/R/site-library/tidyr/libs/tidyr.so
104K    /usr/local/lib/R/site-library/tidyselect/libs/tidyselect.so
344K    /usr/local/lib/R/site-library/xml2/libs/xml2.so
6.6M    total
root@de443801b3fc:/#

Down to six point six megabytes. Not bad for one command. The chart visualizes the respective reductions. Clearly, C++ packages (and their template use) lead to more debugging symbols than plain old C code. But once stripped, the size differences are not that large.

And just to be plain, what we showed previously in post #9 does the same, only already at the installation stage. The effects are not cumulative.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Sean Whitton: Debian Policy call for participation -- August 2017

20 August, 2017 - 04:44

At the Debian Policy BoF at DebConf17, Solveig suggested that we could post summaries of recent activity in policy bugs to Planet Debian, as a kind of call for participation. Russ Allbery had written a script to generate such a summary some time ago, but it couldn’t handle the usertags the policy team uses to progress bugs through the policy changes process. Today I enhanced the script to handle usertags and I’m pleased to be able to post a summary of our bugs.

Consensus has been reached and help is needed to write a patch

#172436 BROWSER and sensible-browser standardization

#273093 document interactions of multiple clashing package diversions

#299007 Transitioning perms of /usr/local

#314808 Web applications should use /usr/share/package, not /usr/share/doc/…

#425523 Describe error unwind when unpacking a package fails

#452393 Clarify difference between required and important priorities

#476810 Please clarify 12.5, “Copyright information”

#484673 file permissions for files potentially including credential informa…

#491318 init scripts “should” support start/stop/restart/force-reload - why…

#556015 Clarify requirements for linked doc directories

#568313 Suggestion: forbid the use of dpkg-statoverride when uid and gid ar…

#578597 Recommend usage of dpkg-buildflags to initialize CFLAGS and al.

#582109 document triggers where appropriate

#587279 Clarify restrictions on main to non-free dependencies

#587991 perl-policy: /etc/perl missing from Module Path

#592610 Clarify when Conflicts + Replaces et al are appropriate

#613046 please update example in 4.9.1 (debian/rules and DEB_BUILD_OPTIONS)

#614807 Please document autobuilder-imposed build-dependency alternative re…

#616462 clarify wording of parenthetical in section 2.2.1

#628515 recommending verbose build logs

#661928 recipe for determining shlib package name

#664257 document Architecture name definitions

#682347 mark ‘editor’ virtual package name as obsolete

#683222 say explicitly that debian/changelog is required in source packages

#685506 copyright-format: new Files-Excluded field

#685746 debian-policy Consider clarifying the use of recommends

#688251 Built-Using description too aggressive

#757760 please document build profiles

#759316 Document the use of /etc/default for cron jobs

#761219 document versioned Provides

#767839 Linking documentation of arch:any package to arch:all

#770440 policy should mention systemd timers

#773557 Avoid unsafe RPATH/RUNPATH

#780725 PATH used for building is not specified

#793499 The Installed-Size algorithm is out-of-date

#810381 Update wording of 5.6.26 VCS-* fields to recommend encryption

#823256 Update maintscript arguments with dpkg >= 1.18.5

#833401 virtual packages: dbus-session-bus, dbus-default-session-bus

#835451 Building as root should be discouraged

#838777 Policy 11.8.4 for x-window-manager needs update for freedesktop menus

#845715 Please document that packages are not allowed to write outside thei…

#853779 Clarify requirements about update-rc.d and invoke-rc.d usage in mai…

Wording proposed, awaiting review from anyone and/or seconds by DDs

#542288 Versions for native packages, NMU’s, and binary only uploads

#582109 document triggers where appropriate

#630174 forbid installation into /lib64

#645696 [copyright-format] clearer definitions and more consistent License:…

#648271 11.8.3 “Packages providing a terminal emulator” says xterm passes -…

#649530 [copyright-format] clearer definitions and more consistent License:…

#662998 stripping static libraries

#732445 debian-policy should encourage verification of upstream cryptograph…

#737796 copyright-format: support Files: paragraph with both abbreviated na…

#756835 Extension of the syntax of the Packages-List field.

#786470 [copyright-format] Add an optional “License-Grant” field

#835451 Building as root should be discouraged

#844431 Packages should be reproducible

#845255 Include best practices for packaging database applications

#850729 Documenting special version number suffixes

Merged for the next release

#587279 Clarify restrictions on main to non-free dependencies

#616462 clarify wording of parenthetical in section 2.2.1

#732445 debian-policy should encourage verification of upstream cryptograph…

#844431 Packages should be reproducible

Holger Levsen: 20170819-lasercutter-sprint

19 August, 2017 - 23:14
laser-cutter sprint

So I'm overcoming my jetlag after DebConf17 by helping to make the Alioth sprint happen, and while it's good to witness work on the upcoming git.debian.org replacement, I'm rather minding my own business instead of getting involved…

And so I got interested in this laser cutter, which was set up in the CCCHH hackerspace two months ago and which is nicely documented, so I managed to learn how to take my first baby steps with the laser cutter in one evening:

Basically there is a hosted web application named 'LaserWeb4' for which a pre-configuration exists, so that one only needs to load an image, scale and position it and tune the laser settings a bit. The laser itself is inside a cage, which has a physical safety switch that will turn off the laser if the cage is opened. Obviously the setup is a lot more complex and there are many parameters to tune, and I basically just learned one thing, which is "printing images on wood", but "printing images on a laptop cover" should be pretty similar and something to learn in the future.

And now I'm even teaching weasel how to use this thing (and he already made interesting new mistakes) and it looks like Ganneff & formorer are next. Fun fun fun!

Oh, and the Alioth sprint also seems to be quite productive, but I'll leave reporting about this to others.

Vasudev Kamath: Writing a UDP Broadcast Receiver as Python Iterator

19 August, 2017 - 20:28

I had to write a small Python application to listen for broadcast messages and process them. These broadcast messages are actually a sort of discovery message used to find peers in a network. Writing a simple UDP server to listen on a particular port was easy, but while designing the application I was wondering how I could plug this server into my main code. There are two possibilities:

  1. Use the threading module of Python to run the server code in the background and give it a callback to communicate the data to the main thread.
  2. Periodically read some messages from the server code and then dispose of the server.

I didn't like the first approach because I would need to pass a callback function and would somehow end up complicating the code. The second approach sounded sane, but I wanted to make the server behave more like an iterator. I searched around to see if someone had attempted to write something similar, but did not find anything useful (maybe my Googling skills aren't good enough). Anyway, I thought: what is wrong with trying? If it works, I'll be happy that I did something different :-).

The first thing needed to make an iterator in Python is to have the functions __iter__ and __next__ defined in your class. The Python 2 iterator protocol wanted next to be defined instead of __next__, so for portable code you can define a next function which in turn calls __next__.

So here is my first shot at writing the BroadCastReceiver class.

from socket import socket, AF_INET, SOCK_DGRAM, SOL_SOCKET, SO_REUSEADDR


class BroadCastReceiver:

    def __init__(self, port, msg_len=8192):
        self.sock = socket(AF_INET, SOCK_DGRAM)
        self.sock.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
        self.sock.bind(('', port))
        self.msg_len = msg_len

    def __iter__(self):
        return self

    def __next__(self):
        try:
            # recvfrom returns (data, address), in that order
            data, addr = self.sock.recvfrom(self.msg_len)
            return addr, data
        except Exception as e:
            print("Got exception trying to recv %s" % e)
            raise StopIteration

This version of the code can be used in a for loop to read UDP broadcasts from the socket. One problem is that if no packet is received, which might be due to a disconnected network, the loop will just block forever. So I modified the code slightly to add a timeout parameter. The changed portion of the code is below.

...
    def __init__(self, port, msg_len=8192, timeout=15):
        self.sock = socket(AF_INET, SOCK_DGRAM)
        self.sock.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
        self.sock.settimeout(timeout)
        self.msg_len = msg_len
        self.sock.bind(('', port))

 ...

So now if the network is disconnected or no packet is received for the timeout period, we get a socket.timeout exception, due to which StopIteration is raised, causing the for loop using the server as an iterator to exit. This avoids blocking our periodic code forever when the network is disconnected or no messages are received for a long time (maybe because we are connected to the wrong network).

Now everything looks fine, except for one part: if we create the server object each time our periodic code is called, we will have a binding issue, as we did not properly close the socket once the iterator stopped. So I added socket-closing code in the __del__ function of the class. __del__ is called when the garbage collector reclaims the object once it goes out of scope.

...
    def __del__(self):
        self.sock.close()

So the server can be used in a for loop or by passing the server object to the next built-in function. Here are two examples.

r = BroadCastReceiver(5000, timeout=10)
count = 0
for (address, data) in r:
    print('Got packet from %s: %s' % (address, data))
    count += 1
    # do whatever you want with data
    if count > 10:
        break

Here we use a counter variable to track the iterations, and after some number of them we exit the for loop. Another way is to use a for loop over a range, like below.

r = BroadCastReceiver(5000,  timeout=10)
for i in range(20):
    try:
        address, data = next(r)
        # do whatever you want with data
    except:
        break

Here an additional try block is needed inside the for loop to guard the call to next; this handles the timeout or any other exception and exits the loop. In the first case this is not needed, as StopIteration is understood by for.

Both use cases described above are mostly useful when it is not critical to handle each and every packet (mostly peer discovery) and packets will keep being sent. So if we miss some peers in one iteration, we will still catch them in the next one. We just need to make sure we provide a big enough counter to catch most peers in each iteration.

If it's critical to receive each packet, we can safely move this iterating logic to a separate thread which keeps receiving packets and processes the data as needed; a minimal sketch of that follows.
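
For completeness, here is that threaded variant (my own illustration, reusing the BroadCastReceiver class from above and a queue to hand packets over to the main thread):

import queue
import threading

packets = queue.Queue()

def receive_loop():
    # the iterator ends on timeout, so just loop around it forever
    while True:
        for address, data in BroadCastReceiver(5000, timeout=10):
            packets.put((address, data))

worker = threading.Thread(target=receive_loop, daemon=True)
worker.start()

# main thread: process packets whenever convenient
while True:
    address, data = packets.get()
    print('Got packet from %s: %s' % (address, data))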

For now I have tried this pattern mostly with the UDP protocol, but I'm sure that with some modification it can be used with TCP as well. I'll be happy to get feedback from Pythonistas out there on what you think of this approach. :-)

Olivier Grégoire: Conclusion Google Summer of Code 2017 with GNU

19 August, 2017 - 19:57


GNU Ring project with GNU organisation

1. Me

Before getting into the thick of my project, let me present myself:
I am Olivier Grégoire (Gasuleg), and I study IT engineering at École de Technologie Supérieure in Montreal.
I am a technician in electronics, and I now study computer science.
I applied to GSoC because I love the concept of the project that I worked on and I really wanted to be part of it.

2. My Project

During this GSoC, I worked on the GNU Ring project.
What is Ring?

  • A telephone: a simple tool to connect, communicate and share.
  • A teleconferencing tool: easily join calls to create conferences with multiple participants.
  • A media sharing tool: Ring supports a variety of video input options, including multiple cameras and image and video files, and the selection of audio inputs and outputs; all this is supported by multiple high quality audio and video codecs.
  • A messenger: send text messages during calls or outside of calls (as long as your peer is connected).
  • A building block for your IoT project: re-use the universal communications technology of Ring with its portable library on your system of choice.
    – ring.cx

What did I need to do?

This project is, at the moment, unstable due to a lack of automated tests. Only part of the code is tested; I needed to improve this.
To do that, I need to:

  • Reimplement some unit tests to check the components of pure SIP accounts.
  • Research and test automation strategies that integrate the compilation system and Jenkins verification.
  • Write more unit tests for the critical functions in order to increase the code coverage.

My proposal

3. What my code can do

The Code

Here are the links to the code I was working on all throughout the Google Summer of Code:

Patch                                             Status
black box testing sip                             Merged
refactoring + video_input unit test               On Review
fix sip error                                     On Review
new unit test: smartools                          On Review
new unit test: account_factory                    On Review
new unit test: util classes                       On Review
new unit test: archiver, conference, preferences  On Review
new unit test: dring, threadloop                  WIP

4. What’s next?

  • Add gcov to know the code coverage
  • Increase the code coverage
5. Thanks

I would like to thank the following:

  • The Google Summer of Code organisation, for this wonderful experience.
  • GNU, for accepting my project proposal and letting me embark on this fantastic adventure.
  • My mentor, Mr Guillaume Roguez, and all the ring team, for being there to help me.

Arturo Borrero González: Running Suricata 4.0 with Debian Stretch

19 August, 2017 - 17:56

Do you know what’s happening in the wires of your network? There is a major FLOSS player in the field of real time intrusion detection (IDS), inline intrusion prevention (IPS) and network security monitoring (NSM). I’m talking about Suricata, a mature, fast and robust network threat detection engine. Suricata is a community driven project, supported by the Open InfoSec Foundation (OISF).

For those who doesn’t know how Suricata works, it usually runs by loading a set of pre-defined rules for matching different network protocols and flow behaviours. In this regards, Suricata has been always ruleset-compatible with the other famous IDS: snort.

The last major release of Suricata is 4.0.0, and I’m uploading the package for Debian stretch-backports as I write this line. This means the updated package should be available for general usage after the usual buildds processing ends inside the Debian archive.

You might be wondering how to start using Suricata 4.0 with Debian Stretch. First, I would recommend reading the upstream documentation.

My recommendation is to run Suricata from stretch-backports or from testing, and just installing the package should be enough to get the environment up and running:

% sudo aptitude install suricata

You can check that the installation was good:

% sudo systemctl status suricata
● suricata.service - Suricata IDS/IDP daemon
   Loaded: loaded (/lib/systemd/system/suricata.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2017-08-19 12:50:49 CEST; 44min ago
     Docs: man:suricata(8)
           man:suricatasc(8)
           https://redmine.openinfosecfoundation.org/projects/suricata/wiki
 Main PID: 1101 (Suricata-Main)
    Tasks: 8 (limit: 4915)
   CGroup: /system.slice/suricata.service
           └─1101 /usr/bin/suricata -D --af-packet -c /etc/suricata/suricata.yaml --pidfile /var/run/suricata.pid

ago 19 12:50:44 nostromo systemd[1]: Starting Suricata IDS/IDP daemon...
ago 19 12:50:47 nostromo suricata[1032]: 19/8/2017 -- 12:50:47 - <Notice> - This is Suricata version 4.0.0 RELEASE
ago 19 12:50:49 nostromo systemd[1]: Started Suricata IDS/IDP daemon.

You can interact with Suricata using the suricatasc tool:

% sudo suricatasc -c uptime
{"message": 3892, "return": "OK"}

And start inspecting the generated logs at /var/log/suricata/.
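
Suricata's richest output is the EVE JSON log, where each line is one standalone JSON event; here is a small Python sketch of mine for pulling the alerts out of it (the path below is the usual default, but check the outputs section of your suricata.yaml):

import json

with open("/var/log/suricata/eve.json") as f:
    for line in f:
        event = json.loads(line)
        if event.get("event_type") == "alert":
            print("%s  %s -> %s  %s" % (
                event.get("timestamp"),
                event.get("src_ip"),
                event.get("dest_ip"),
                event["alert"]["signature"]))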

The default configuration, in the file /etc/suricata/suricata.yaml, comes with some preconfigured values. For a proper integration into your environment, you should tune the configuration file: define your networks, network interfaces, running modes, and so on (refer to the upstream documentation for this).

In my case, I tested suricata by inspecting the traffic of my laptop. After installation, I only had to switch the network interface:

[...]
# Linux high speed capture support
af-packet:
  - interface: wlan0
[...]

After a restart, I started seeing some alerts:

% sudo systemctl restart suricata
% sudo tail -f /var/log/suricata/fast.log
08/19/2017-14:03:04.025898  [**] [1:2012648:3] ET POLICY Dropbox Client Broadcasting [**] \
	[Classification: Potential Corporate Privacy Violation] [Priority: 1] {UDP} 192.168.1.36:17500 -> 255.255.255.255:17500

One of the main things when running Suricata is to keep your ruleset up to date. In Debian, we have the suricata-oinkmaster package which comes with some handy options to automate your ruleset updates using the Oinkmaster software. Please note that this is a Debian-specific glue to integrate and automate Suricata with Oinkmaster.

To get this functionality, simply install the package:

% sudo aptitude install suricata-oinkmaster

A daily cron-job will be enabled. Check suricata-oinkmaster-updater(8) for more info.

By the way, did you know that Suricata can easily handle big loads of traffic (e.g., 10 Gbps)? And I heard some scaling work is in mind to reach 100 Gbps.

I have been in charge of the Suricata package in Debian for a while, several years already, with the help of some other DD hackers: Pierre Chifflier (pollux) and Sascha Steinbiss (satta), among others. Due to this work, I believe the package is really well integrated into Debian, ready to use and with some powerful features. And, of course, we are open to suggestions and bug reports.

So, this is it, another great stuff you can do with Debian :-)

Sean Whitton: The knowledge that one has an unread message is equivalent to a 10 point drop in one's IQ

19 August, 2017 - 00:37

According to Daniel Pocock’s talk at DebConf17’s Open Day, hearing a ping from your messaging or e-mail app or seeing a visual notification of a new unread message has an equivalent effect on your ability to concentrate as

  • a 10 point drop in your IQ; or

  • drinking a glass of wine.

This effect is probably at least somewhat mitigated by reading the message, but that is a context switch, and we all know what those do to your concentration. So if you want to get anything done, be sure to turn off notifications.

Raphaël Hertzog: Freexian’s report about Debian Long Term Support, July 2017

18 August, 2017 - 21:16

Like each month, here comes a report about the work of paid contributors to Debian LTS.

Individual reports

In July, about 181 work hours have been dispatched among 11 paid contributors. Their reports are available:

  • Antoine Beaupré did 20h (out of 16h allocated + 4 extra hours).
  • Ben Hutchings did 14 hours (out of 15h allocated, thus keeping 1 extra hour for August).
  • Chris Lamb did 18 hours.
  • Emilio Pozuelo Monfort did 18.5 hours (out of 23.5 hours allocated + 8 hours remaining, thus keeping 13 hours for August).
  • Guido Günther did 10 hours.
  • Hugo Lefeuvre did nothing due to personal problems (out of 2h allocated + 10 extra hours, thus keeping 12 extra hours for August).
  • Markus Koschany did 23.5 hours.
  • Ola Lundqvist did not publish his report yet (out of 14h allocated + 2 extra hours).
  • Raphaël Hertzog did 7 hours (out of 12 hours allocated but he gave back his remaining hours).
  • Roberto C. Sanchez did 19.5 hours (out of 23.5 hours allocated + 12 hours remaining, thus keeping 16 extra hours for August).
  • Thorsten Alteholz did 23.5 hours.
Evolution of the situation

The number of sponsored hours increased slightly with two new sponsors: Leibniz Rechenzentrum (silver sponsor) and Catalyst IT Ltd (bronze sponsor).

The security tracker currently lists 74 packages with a known CVE and the dla-needed.txt file 64. The number of packages with open issues increased by almost 50% compared to last month. Hopefully this backlog will get cleared up when the unused hours are actually put to use. In any case, this evolution is worth watching.

Thanks to our sponsors

New sponsors are in bold.

