Planet Debian

Planet Debian - http://planet.debian.org/

Evgeni Golov: starting the correct Chromium profile when opening links from IRC

27 August, 2017 - 23:18

I am using Chromium/Chrome as my main browser and I also use its profile/people feature to separate my work profile (bookmarks, cookies, etc) from my private one.

However, Chromium always opens links in the window (and therefore the profile) that was last in the foreground. That is pretty much not what I want, especially when I open a link from IRC that might lead to some shady rick-roll page.

Thankfully, getting the list of available Chromium profiles is pretty easy and so is displaying a few buttons using Python.

To do so I wrote cadmium, which scans the available Chromium profiles and lets you start any of them, or Chromium's Incognito Mode. On machines with SELinux it can even launch Chromium in the SELinux sandbox.
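The profile scan can be sketched roughly as follows. This is not cadmium's actual code: it assumes that Chromium keeps its profile list under the profile.info_cache key of the "Local State" JSON file in ~/.config/chromium, and that the --profile-directory and --incognito switches select the profile. The function names are made up for illustration.

```python
import json
from pathlib import Path


def list_profiles(local_state):
    """Map profile directory names to their display names.

    `local_state` is the parsed "Local State" JSON. The profile.info_cache
    layout is an assumption about Chromium, not taken from cadmium.
    """
    cache = local_state.get("profile", {}).get("info_cache", {})
    return {directory: info.get("name", directory)
            for directory, info in cache.items()}


def launch_command(directory=None, url=None, binary="chromium"):
    """Build the argument list for one profile, or Incognito if None."""
    cmd = [binary]
    if directory is None:
        cmd.append("--incognito")
    else:
        cmd.append(f"--profile-directory={directory}")
    if url:
        cmd.append(url)
    return cmd


def read_local_state(config_dir=None):
    """Read the Local State file; return an empty dict if it is missing."""
    if config_dir is None:
        # Assumed default location on Linux.
        config_dir = Path.home() / ".config" / "chromium"
    path = Path(config_dir) / "Local State"
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    for directory, name in list_profiles(read_local_state()).items():
        print(name, "->", " ".join(launch_command(directory)))
```

Passing the result of launch_command() to subprocess.Popen would then start the chosen profile; a small GUI only needs one button per entry returned by list_profiles().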

No more links opened in the wrong profile. Yay!

Andrew Cater: BBQ Cambridge 2017 - post 5 - and a bit of a retrospective

27 August, 2017 - 22:01
Thanks to all the sponsors of this BBQ who have made this so awesome.

This is also post 100 in this blog - looking back, 90 or so of the 100 have been from Cambridge, which just goes to show how much of the world revolves around a radius of about five miles from here.

Likewise, there are folk in the room whom I've known for 20 years even if I'm dreadful at remembering stuff. There's also scope for remembering absent friends who have got us this far and are no longer with us, for whatever reason.

I've just handed over some CDs and DVDs which, if readable, have a collective memory back to Debian 0.93 in about 1994 - even if not readable, they're a document of how far we've come from boot floppies to VMs, Blu-ray size images and architectures undreamt of all those years ago.

Andrew Cater: Helping out around the edges ...

27 August, 2017 - 21:49
for two point releases of Debian CDs.

Lots of testing, lots of folk chatting on IRC on #debian-cd - it's a good process.

Very impressed by processes behind the scenes to obtain necessary computer accounts, access to machines and various other things that are absolutely necessary and invisible from the outside. Hofstadter's law applies of course - it always takes longer than you think, even when you take into account Hofstadter's law.

Also many thanks for the patience and tolerance of people I've known for many years but who I get to see all too seldom. It's a nice group to be with, as ever.

Andrew Cater: BBQ Cambridge 2017 - post 4

27 August, 2017 - 21:41
Room full of people with laptops and an amount of chatting going on. Annoyingly, I can't get the thing I want to work but there's a whole load of other folk deep into dealing with all sorts.

The garden is also full but I'm guessing everyone is under the gazebos - it's now hot and sunny, unusual for a British holiday weekend.

Holger Levsen: config-2017-07-30

27 August, 2017 - 20:49

Holger Levsen: 20170827-coreboot-build-environment

27 August, 2017 - 19:55
setting up a coreboot build environment, including an Ada compiler

So without much explanation, this is how lynxis told me to set up a coreboot build environment. It contains an Ada compiler, which is needed to build the free graphics initialisation for Intel cards (so no binary VGA BIOS blob is needed).

The Ada compiler is built automatically by default if its build dependencies are installed:

sudo apt install build-essential bison flex zlib1g-dev ncurses-dev gnat
git clone --recursive https://review.coreboot.org/p/coreboot.git
cd coreboot/
git submodule update --init --checkout 3rdparty/blobs   # for the x230 this only contains microcode updates
bash util/crossgcc/buildgcc -j 4 -P IASL

coreboot is then built as usual:

make menuconfig
cd util/crossgcc
make -j 4 SKIP_GDB=1 build-i386 build-x64 build_make
cd ../..
make

That's it.

(I've just left out the steps for choosing the coreboot revision and validating it, as well as choosing a configuration with make menuconfig, as this is better documented elsewhere.)

Andrew Cater: BBQ Cambridge 2017 - post 3

27 August, 2017 - 19:24
One set of gazebos put up: kilos of mushrooms eaten, bacon, mushrooms and all the trimmings barbequed and consumed by the hordes. Now laptops are sprouting in the garden under the gazebos as the temperature is soaring.

Some folk are quiet in the house under fans typing and cooling off.

Masses of washing up is being done - as ever, it's how many people you can fit into a kitchen.

Now it will all go quiet for a bit as everyone lets the breakfast go down :)

Superb hospitality - we're _SO_ lucky to have Steve and Jo do this so readily.

Hideki Yamane: Let's send patches to debian-policy (rst file is your friend :-)

27 August, 2017 - 18:26
As I posted before, the debian-policy package now uses Sphinx. This means you can edit and send patches for Debian Policy more easily than ever. Get the source (install the devscripts package and run 'debcheckout debian-policy') and dig into the policy directory. There are several rst files, one for each chapter.



Open one with your favorite editor and edit it (most editors probably support reStructuredText; if yours does not, check for an extension).


An rst file is friendlier than the old policy.xml file :-)

Then, commit and create patches with 'git format-patch'. Not very complicated, right?

Andrew Cater: BBQ Cambridge 2017 - post 2

27 August, 2017 - 17:50
We were all up until about 0100 :) House full of folk talking about all sorts, a game of Mao. Garden full of people clustered round the barbeque or sitting chatting - I had a long chat about Debian, what it means and how it's often an easier world to deal with and move in than the world of work, office politics or whatever - being here is being at home.

Arguments in the kitchen over how far coffee "just happens" with the magic bean to cup machine, some folk are in the garden preparing for breakfast at noon.

I missed the significance of this week's date - the 26th anniversary of Linus' original announcement of Linux in 1991 fell on Friday. Probably the first user of Linux who installed it from scratch was Lars Wirzenius - who was here yesterday.

Debian's 24th birthday was just about ten days ago on 16th August, making it the second oldest distribution and I reckon I've been using it for twenty one of those years - I wouldn't change it for the world.

Andrew Cater: OMGWTFBBQ Cambridge 2017

26 August, 2017 - 22:16
Funny this - I only blog when I'm in Cambridge :) I'm sure there's a blog back in the day from a BBQ a good few years ago. This is almost deja vu - a room full of Debian types - the crazy family - Thinkpads on a lot of laps and lots of chat around the room.

Colin showing round an amateur radio project in a tobacco tin, coding happening at the back of the room. Pepper the dog sitting on my foot - everything pretty well normal.

Nathan Handler: freenode #live

26 August, 2017 - 07:00

For those of you who might not be aware, freenode is an IRC network that caters to free and open source projects. We have had a goal for a number of years to hold freenode #live, an in-person conference where our staff, projects, FOSS supporters, and IRC enthusiasts could all get together in one place. Thanks to some very generous sponsorship, primarily by Private Internet Access, this conference is finally happening! The first ever freenode #live conference will take place October 28 - 29, 2017 in Bristol, United Kingdom.

This conference will only be a success with support from the community. Luckily, there are many ways to help out.

  1. Register to attend. We are working on putting together a great lineup of speakers that you will not want to miss. Registering allows us to finalize many logistical details surrounding the event to make sure everything runs smoothly.
  2. Do you represent a FOSS group? We would love to have your group present at the event. There will be an exhibitor hall where you can advertise your organization and attract new users and contributors.
  3. Submit a talk. Our call for proposals is still open. New and experienced speakers are welcome. Feel free to contact 2017-team@freenode.live and we would be happy to work with you to come up with a talk idea or provide feedback.
  4. If you or your company are interested in helping to sponsor this event, please reach out to exhibit@freenode.live. We have opportunities for budgets of all sizes.

I look forward to hopefully seeing you at freenode #live!

Steve Kemp: Interesting times debugging puppet

26 August, 2017 - 04:00

I recently upgraded a bunch of systems from Jessie to Stretch, and as a result of that one of my hosts has started showing me a lot of noise in an hourly cron-email:

Command line is not complete. Try option "help"

I've been ignoring these emails for the past while, but today I sat down to track down the source. It was obviously coming from facter, the system that puppet uses to gather information about hosts.

Running facter --debug made that apparent:

 root@smaug ~ # facter --debug
 Found no suitable resolves of 1 for ec2_metadata
 value for ec2_metadata is still nil
 value for netmask_git is still nil
 value for ipaddress6_lo is still nil
 value for macaddress_lo is still nil
 value for ipaddress_master is still nil
 value for ipaddress6_master is still nil
 Command line is not complete. Try option "help"
 value for netmask_master is still nil
 value for ipaddress_skx_mail is still nil
 ..

There we see the issue, and it is obviously relating to our master interface.

To cut a long story short, /usr/lib/ruby/vendor_ruby/facter/util/ip.rb contains some code which eventually runs this:

 ip link show $interface

That works on all other interfaces I have:

  $ ip link show git
  6: git: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000

But not on master:

  $ ip link show master
  Command line is not complete. Try option "help"

I ninja-edited the code from this:

  ethbond = regex.match(%x{/sbin/ip link show '#{interface}'})

to:

  ethbond = regex.match(%x{/sbin/ip link show dev '#{interface}'})

And suddenly puppet runs without any errors. I'm not 100% sure if this is an actual bug, but it is something of a surprise anyway.

This host runs KVM guests; one of them is a puppet-master with the local name master, hence the name of the interface. Similarly, the interface git is associated with the KVM guest behind git.steve.org.uk.
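The same fix can be sketched outside of facter. The helper names below are hypothetical; the only point is that the explicit dev keyword stops ip from reading an interface called master as its own master keyword, which expects a further argument.

```python
import subprocess


def ip_link_show_args(interface):
    """Return an argv for `ip link show` that works for any interface name.

    Without the explicit `dev`, a name such as "master" is parsed as one of
    ip's own keywords, and ip then reports an incomplete command line.
    """
    return ["/sbin/ip", "link", "show", "dev", interface]


def ip_link_show(interface):
    """Run the command and return its output (hypothetical helper)."""
    result = subprocess.run(ip_link_show_args(interface),
                            capture_output=True, text=True, check=False)
    return result.stdout
```

Passing the arguments as a list also avoids the shell quoting that the %x{...} form in the Ruby code relies on.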

Reproducible builds folks: Reproducible Builds: Weekly report #121

26 August, 2017 - 02:50

Here's what happened in the Reproducible Builds effort between Sunday August 13 and Saturday August 19 2017:

Reproducible Builds finally mandated by Debian Policy

"Packages should build reproducibly" was merged into Debian policy! The added text is as follows and has been included into debian-policy 4.1.0.0:

Reproducibility
---------------

Packages should build reproducibly, which for the purposes of this
document [#]_ means that given

- a version of a source package unpacked at a given path;
- a set of versions of installed build dependencies;
- a set of environment variable values;
- a build architecture; and
- a host architecture,

repeatedly building the source package for the build architecture on
any machine of the host architecture with those versions of the build
dependencies installed and exactly those environment variable values
set will produce bit-for-bit identical binary packages.

It is recommended that packages produce bit-for-bit identical binaries
even if most environment variables and build paths are varied.  It is
intended for this stricter standard to replace the above when it is
easier for packages to meet it.

.. [#]
   This is Debian's precisification of the `reproducible-builds.org
   definition `_.
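As a rough illustration of what "bit-for-bit identical" means in practice, the sketch below compares the binary packages produced by two builds by hashing them. It is not one of the project's tools (diffoscope and reprotest, mentioned further down, do far more); the function names and the directory layout are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large packages need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_builds(first_dir, second_dir, pattern="*.deb"):
    """Return {filename: True/False}; True means identical bytes in both builds."""
    first = {p.name: p for p in Path(first_dir).glob(pattern)}
    second = {p.name: p for p in Path(second_dir).glob(pattern)}
    report = {}
    for name in sorted(first.keys() | second.keys()):
        if name in first and name in second:
            report[name] = sha256_of(first[name]) == sha256_of(second[name])
        else:
            # A package missing from one build cannot be reproducible.
            report[name] = False
    return report
```

A package counts as reproducible under the definition above only if every entry in the report is True when the inputs listed in the policy text are held constant.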

  • Holger Levsen wrote a blog post briefly describing the background and implications of this. To quote him: "we are not 94% done yet, rather more like half done or so. We still need tools and processes to enable anyone to independently verify that a given binary comes from the sources it is said to be coming, this will involve distributing .buildinfo files and providing user interfaces in APT and elsewhere and probably also systematic rebuilds by us and other parties. And 6% or 7% of the archive is still a lot of packages, eg. in Buster we currently still have 273 unreproducible key packages and for a large part we don't have patches yet so there is still a lot of work ahead."
  • There were discussion threads on Hacker News and Reddit.
  • Our long-term goal is that Policy mandates that packages "must" be reproducible, but for that we need to show further progress and also reach a consensus on .buildinfo files and much more.

Reproducible work in other projects

Bernhard M. Wiedemann's reproducibleopensuse scripts now work on Debian buster on the openSUSE Build Service with the latest versions of osc and obs-build.

Toolchain development and fixes

#872514 was opened on devscripts by Chris Lamb to add a reproducible-check program to report on the reproducibility status of installed packages.

Packages reviewed and fixed, and bugs filed

Upstream reports:

  • Bernhard M. Wiedemann:

Debian reports:

Debian non-maintainer uploads:

Reviews of unreproducible packages

47 package reviews have been added, 58 have been updated and 39 have been removed this week, adding to our knowledge about identified issues.

4 issue types have been updated:

Weekly QA work

During our reproducibility testing, FTBFS bugs have been detected and reported by:

  • Adrian Bunk (59)
  • Bastien Roucariès (1)
  • James Clarke (1)
  • Jeremy Bicha (1)

diffoscope development

Development continued in git, including the following contributions:

  • Ximin Luo:
    • presenters: html: Don't traverse children whose parents were already limited (Closes: #871413)
    • On a non-GNU system, prefer tools that start with "g" for certain whitelisted commands. (Closes: #871029)
    • Add a --tool-prefix-binutils CLI flag. (Closes: #869868)
  • Chris Lamb:
    • Temporarily revert "Bump Standards-Version to 4.0.1" to avoid spurious CI test failures.
    • comparators.xml: Use name attribute over path to avoid leaking comparison full path in output.
    • Code style fixes.

disorderfs development

Development continued in git, including the following contributions:

  • Chris Lamb:
    • Add simple autopkgtest.

reprotest development

Development continued in git, including the following contributions:

  • Ximin Luo:
    • Choose an existent HOME for the "control" build. (Closes: #860428)
    • Update debian/changelog with Santiago's changes.
  • Santiago Torres:
    • Abstract parts of autopkgtest to support running on non-Debian systems.
    • Add a --host-distro flag to support that too.

tests.reproducible-builds.org

Mattia fixed the script which creates the HTML representation of our database schema so that it no longer appends .html twice to the filename.

Misc.

This week's edition was written by Ximin Luo, Chris Lamb and Holger Levsen & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

Michal Čihař: New projects on Hosted Weblate

25 August, 2017 - 23:00

Hosted Weblate also provides free hosting for free software projects. The hosting request queue has grown too long, so it's time to process it and include new projects.

This time, the newly hosted projects include:

If you want to support this effort, please donate to Weblate; recurring donations are especially welcome to keep this service alive. You can make them on Liberapay or Bountysource.

Filed under: Debian English SUSE Weblate

Steve McIntyre: Let's BBQ again, like we did last summer!

25 August, 2017 - 09:00

It's that time again! Another year, another OMGWTFBBQ! We're expecting 50 or so Debian folks at our place in Cambridge this weekend, ready to natter, geek, socialise and generally have a good time. Let's hope the weather stays nice, but if not we have gazebo technology... :-)

Many thanks to a number of awesome companies and people near and far who are sponsoring the important refreshments for the weekend:

I've even been working on the garden this week to improve it ready for the event. If you'd like to come and haven't already told us, please add yourself to the wiki page!

Dirk Eddelbuettel: BH 1.65.0-1

25 August, 2017 - 08:13

The BH package on CRAN was updated today to version 1.65.0. BH provides a sizeable portion of the Boost C++ libraries as a set of template headers for use by R, possibly with Rcpp as well as other packages.

This release upgrades the version of Boost to the rather new upstream version Boost 1.65.0 released earlier this week, and adds two new libraries: align and sort.

I had started the upgrade process a few days ago under release 1.64.0. Rigorous checking of reverse dependencies showed that mvnfast needed a small change (which was trivial: just seeding the RNG prior to running tests), which Matteo did in no time with a fresh CRAN upload. rstan needs a bit more work but should be ready real soon now, and we are awaiting a new version. And once I switched to the just-released Boost 1.65.0 it became apparent that Cyclops no longer needs its embedded copy of Boost iterator, and Marc already made that change with yet another fresh CRAN upload. It is a true pleasure to work in such a responsive and collaborative community.

Changes in version 1.65.0-1 (2017-08-24)
  • Upgraded to Boost 1.64 and then 1.65 installed directly from upstream source with several minor tweaks (as before)

  • Fourth tweak corrects a misplaced curly brace (see the Boost ublas GitHub repo and its issue #40)

  • Added Boost align (as requested in #32)

  • Added Boost sort (as requested in #35)

  • Added Boost multiprecision by fixing a script typo (as requested in #42)

  • Updated Travis CI support via newer run.sh

Via CRANberries, there is a diffstat report relative to the previous release.

Comments and suggestions are welcome via the mailing list or the issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Michal Čihař: Taking over siphashc for Python

24 August, 2017 - 23:00

For some time we've been using the siphash algorithm to speed up string lookups in Weblate. Even though Python uses it internally, it's not exposed in the standard library, so several third-party modules have appeared on PyPI. Of all these, siphashc, or rather its Python 3 fork siphashc3, seemed to perform best, so I started to use that.

However, it turned out that none of them is actively maintained anymore. The original version lacks Python 3 support, while siphashc3 uses odd versioning which causes problems for some pip versions.

After trying to get a fix into siphashc3 without much success, I spoke to the original author of siphashc and he agreed to hand over maintainership to me. So its new home is at https://github.com/WeblateOrg/siphashc and a new release is already available on PyPI.

Note: Originally we were using MD5 in Weblate, but siphash has proven to be faster and fits into 64 bits, which makes it easier to store and index in SQL databases as LONGINT.
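The size argument can be illustrated with the standard library alone. Since siphash is not exposed by Python's hashlib, the sketch below uses BLAKE2b truncated to 8 bytes purely as a stand-in for "some 64-bit hash"; it is not what Weblate or siphashc compute, and the function names are hypothetical.

```python
import hashlib

# Range of a signed 64-bit integer column.
INT64_MIN, INT64_MAX = -(2 ** 63), 2 ** 63 - 1


def md5_as_int(text):
    """Interpret the 128-bit MD5 digest as an unsigned integer."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")


def hash64_as_signed_int(text):
    """64-bit stand-in hash (BLAKE2b cut to 8 bytes, NOT siphash)."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big", signed=True)


if __name__ == "__main__":
    print(md5_as_int("hello").bit_length())  # needs far more than 64 bits
    print(INT64_MIN <= hash64_as_signed_int("hello") <= INT64_MAX)
```

An MD5 value has to be stored as a string or binary column, while any 64-bit hash fits a single integer column that the database can index directly.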

Filed under: Debian English SUSE Weblate

Urvika Gola: Much awaited.. DebConf’17 in Montreal.

24 August, 2017 - 11:15

On 5th August I got a chance to attend, speak at and experience DebConf 2017 in Montreal, Canada. The conference was ‘stretch’ed from 6 August to 12 August.

Seasons of Debian – Summer of Code & Winter of Outreachy

It is pretty late for me to document my DebConf fun-learning-experiences, thanks to my delaying tactics, which I need to overcome.
But better late than never: I had an amazing time at DebConf. I got to meet and learn from my Outreachy mentor, Daniel Pocock!

Picture of Daniel and me captured by: Dorina Mosku

One thing about DebConf I loved was the amount of Diversity in Debian family!

As a beginner, I got a big picture of all the projects that are out there. Daniel helped me a lot in getting started with packaging in Debian. I really appreciate the time he took to guide me at DebConf, and Pranav remotely.

One specific line I liked from Daniel’s talk on Open Day, 5th August, “Free Communications with Free Software and Debian”, was this one about free RTC (Real Time Communication):

..Instead of communication controlling the user, the user can control the communication..

I talked about free RTC, my Project Lumicall and about my journey being an Outreachy Intern with Debian. I also covered my co-speaker’s project work on Lumicall being a GSoC 2016 student.

Picture captured by – Aigars Mahinovs

Managing Debian’s RTC services – Daniel Pocock

Meeting the Outreachy family feels amaazzing! Karen Sandler, executive director of the Software Freedom Conservancy gave a talk on the Significance and Impact of Outreachy and Debian’s support for the programme.

with Karen Sandler and Outreachy alumni

DebConf 2017 has been a wonderful conference with the community being very welcoming and helpful.


Sylvestre Ledru: Rebuild of Debian using Clang 3.9, 4.0 and 5.0

24 August, 2017 - 05:09

tldr: The percentage of failure is decreasing, Clang support is improving but there is a long way to go.

The goal of this initiative is to rebuild Debian using Clang as a compiler instead of gcc. I have been doing this analysis for the last 6 years.

Recently, we rebuilt the Debian archive with Clang 3.9.1 (July 6th), 4.0.1 (July 6th) and 5.0 rc2 (August 20th).

For various reasons, we had not performed a rebuild since June 2016 with version 3.8. Therefore, we took the opportunity to do three over the last month.

Now, the 3.9 & 4.0 results are impacted by a build failure affecting all haskell packages (the -no-pie option did not exist in those Clang versions; I introduced it in clang 5.0). Fixing this issue with 5.0 removed more than 860 failures.

Also, for the same versions, Qt's compiler detection considers that Clang is not a C++11 compiler because clang++, by default, defines __cplusplus as 199711L (-std=c++11 has to be added to get a correct __cplusplus). See https://bugreports.qt.io/browse/QTBUG-62535 for more information. Some discussions happened on the upstream mailing list about changing the default C++ dialect.
For example, with 4.0, this is causing 132 errors. With 5.0, probably thanks to a new Qt version, roughly the same number of packages are failing, but for a different reason: the "nodiscard" attribute is used incorrectly, and gcc only triggers a warning where clang triggers an error.

In parallel, ignoring the haskell build failures, the numbers slightly increased since last year even if the overall percentage decreased (new packages being uploaded to the archive).

Version   Build failures   Ignoring haskell pkgs
3.8       1367 / 5.6%
3.9       2274 / 8.1%      1618 / 5.8%
4.0       2311 / 8.3%      1655 / 5.9%
5.0       1445 / 5.1%

In parallel, new warnings and errors showed up in Clang.
This is causing a new set of build failures (especially with the usage of -Werror).

A few examples:
* Starting with 4.0, clang triggers the error "ordered comparison between pointer and zero ('char *' and 'int')".
* Similarly, with this version, -Wmain gains a new warning which is triggered when a bool literal is returned from main.
* clang also introduced a new warning called -Waddress-of-packed-member, causing 5 new errors.
* With the same version, clang can trigger a new error when auto is used in a function return type.

Now, as a conclusion, having Debian built with clang by default is still a long shot.
First, when Clang became usable for a general audience, gcc was lagging in terms of warning and error detection. Now, gcc is in a much better position than it was, decreasing the interest in having clang replace gcc. In parallel, most of the efforts in terms of warning and mistake detection are currently done under the clang-tidy umbrella, making them less intrusive as part of this initiative (but harder to use and to deploy).
As an example, the gcc warning -Wmisleading-indentation has been implemented as a clang-tidy checker.
Second, the very permissive license of clang has been a key factor in the switch for some operating systems, such as those of the PS4, Mac OS X or FreeBSD. With Debian, the community is generally happy with the GPL.
Third, performance is similar enough that it is not worth the work, except for some projects with very special needs.

Last, although it is much easier to contribute to llvm/clang than to gcc (no copyright assignment and an actual review system, for example), this isn't a big differentiator for most projects.

Of course, I will continue to run and analyze these rebuilds, as this is a great source of information for clang upstream developers to improve the compatibility with gcc and understand some impacts. However, until there is a big game changer, I will stop pursuing the goal of having Debian switch to clang instead of gcc. I will also stop work on the debile project (which aimed to rebuild packages in the background).

Original post blogged on b2evolution.

Antoine Beaupré: The supposed decline of copyleft

24 August, 2017 - 00:00

At DebConf17, John Sullivan, the executive director of the FSF, gave a talk on the supposed decline in the use of copyleft licenses in free-software projects. In his presentation, Sullivan questioned the notion that permissive licenses, like the BSD or MIT licenses, are gaining ground at the expense of the traditionally dominant copyleft licenses from the FSF. While there does seem to be a rise in the use of permissive licenses, in general, there are several possible explanations for the phenomenon.

When the rumor mill starts

Sullivan gave a recent example of the claim of the decline of copyleft in an article on Opensource.com by Jono Bacon from February 2017 that showed a histogram of license usage between 2010 and 2017 (seen below).

From that, Bacon elaborates possible reasons for the apparent decline of the GPL. The graphic used in the article was actually generated by Stephen O'Grady in a January article, The State Of Open Source Licensing, which said:

In Black Duck's sample, the most popular variant of the GPL – version 2 – is less than half as popular as it was (46% to 19%). Over the same span, the permissive MIT has gone from 8% share to 29%, while its permissive cousin the Apache License 2.0 jumped from 5% to 15%.

Sullivan, however, argued that the methodology used to create both articles was problematic. Neither contains original research: the graphs actually come from the Black Duck Software "KnowledgeBase" data, which was partly created from the old Ohloh web site now known as Open Hub.

To show one problem with the data, Sullivan mentioned two free-software projects, GNU Bash and GNU Emacs, that had been showcased on the front page of Ohloh.net in 2012. On the site, Bash was (and still is) listed as GPLv2+, whereas it changed to GPLv3 in 2011. He also claimed that "Emacs was listed as licensed under GPLv3-only, which is a license Emacs has never had in its history", although I wasn't able to verify that information from the Internet archive. Basically, according to Sullivan, "the two projects featured on the front page of a site that was using [the Black Duck] data set were wrong". This, in turn, seriously brings into question the quality of the data:

I reported this problem and we'll continue to do that but when someone is not sharing the data set that they're using for other people to evaluate it and we see glimpses of it which are incorrect, that should give us a lot of hesitation about accepting any conclusion that comes out of it.

Reproducible observations are necessary to the establishment of solid theories in science. Sullivan didn't try to contact Black Duck to get access to the database, because he assumed (rightly, as it turned out) that he would need to "pay for the data under terms that forbid you to share that information with anybody else". So I wrote Black Duck myself to confirm this information. In an email interview, Patrick Carey from Black Duck confirmed its data set is proprietary. He believes, however, that through a "combination of human and automated techniques", Black Duck is "highly confident at the accuracy and completeness of the data in the KnowledgeBase". He did point out, however, that "the way we track the data may not necessarily be optimal for answering the question on license use trend" as "that would entail examination of new open source projects coming into existence each year and the licenses used by them".

In other words, even according to Black Duck, its database may not be useful to establish the conclusions drawn by those articles. Carey did agree with those conclusions intuitively, however, saying that "there seems to be a shift toward Apache and MIT licenses in new projects, though I don't have data to back that up". He suggested that "an effective way to answer the trend question would be to analyze the new projects on GitHub over the last 5-10 years." Carey also suggested that "GitHub has become so dominant over the recent years that just looking at projects on GitHub would give you a reasonable sampling from which to draw conclusions".

Indeed, GitHub published a report in 2015 that also seems to confirm MIT's popularity (45%), surpassing copyleft licenses (24%). The data is, however, not without its own limitations. For example, in the above graph going back to the inception of GitHub in 2008, we see a rather abnormal spike in 2013, which seems to correlate with the launch of the choosealicense.com site, described by GitHub as "our first pass at making open source licensing on GitHub easier".

In his talk, Sullivan was critical of the initial version of the site, which he described as biased toward permissive licenses. Because the GitHub project creation page links to the site, Sullivan explained that the site's bias could have actually influenced GitHub users' license choices. Following a talk from Sullivan at FOSDEM 2016, GitHub addressed the problem later that year by rewording parts of the front page to be more accurate, but any change in license choice obviously doesn't show in the report produced in 2015 and won't affect choices users have already made. Therefore, there is reasonable doubt that GitHub's subset of software projects is actually representative of the larger free-software community.

In search of solid evidence

So it seems we are missing good, reproducible results to confirm or dispel these claims. Sullivan explained that it is a difficult problem, if only in the way you select which projects to analyze: the impact of an MIT-licensed personal wiki will obviously be vastly different from, say, a GPL-licensed C compiler or kernel. We may want to distinguish between active and inactive projects. Then there is the problem of code duplication, both across publication platforms (a project may be published on GitHub and SourceForge for example) and across projects (code may be copy-pasted between projects). We should think about how to evaluate the license of a given project: different files in the same code base regularly have different licenses, and often none at all. This is why having a clear, documented and publicly available data set and methodology is critical. Without this, the assumptions made are not clear and it is unreasonable to draw certain conclusions from the results.

It turns out that some researchers did that kind of open research in 2016 in a paper called "The Debsources Dataset: Two Decades of Free and Open Source Software" [PDF] by Matthieu Caneill, Daniel M. Germán, and Stefano Zacchiroli. The Debsources data set is the complete Debian source code that covers a large history of the Debian project and therefore includes thousands of free-software projects of different origins. According to the paper:

The long history of Debian creates a perfect subject to evaluate how FOSS licenses use has evolved over time, and the popularity of licenses currently in use.

Sullivan argued that the Debsources data set is interesting because of its quality: every package in Debian has been reviewed by multiple humans, including the original packager, but also by the FTP masters to ensure that the distribution can legally redistribute the software. The existence of a package in Debian provides a minimal "proof of use": unmaintained packages get removed from Debian on a regular basis and the mere fact that a piece of software gets packaged in Debian means at least some users found it important enough to work on packaging it. Debian packagers make specific efforts to avoid code duplication between packages in order to ease security maintenance. The data set covers a period longer than Black Duck's or GitHub's, as it goes all the way back to the Hamm 2.0 release in 1998. The data and how to reproduce it are freely available under a CC BY-SA 4.0 license.

Sullivan presented the above graph from the research paper, which shows the evolution of software license use in the Debian archive. Whereas previous graphs showed statistics in percentages, this one shows absolute numbers, in which no decline in copyleft licenses can be distinguished. To quote the paper again:

The top license is, once again, GPL-2.0+, followed by: Artistic-1.0/GPL dual-licensing (the licensing choice of Perl and most Perl libraries), GPL-3.0+, and Apache-2.0.

Indeed, looking at the graph, at most we see a rise of the Apache and MIT licenses and no decline of the GPL per se, although its adoption does seem to slow down in recent years. We should also mention the possibility that Debian's data set has the opposite bias: toward GPL software. The Debian project is, naturally, culturally quite different from the GitHub community and even the larger free-software ecosystem, which could explain the disparity in the results. We can only hope that a similar analysis will eventually be performed on the much larger Software Heritage data set, which may give more representative results. The paper acknowledges this problem:

Debian is likely representative of enterprise use of FOSS as a base operating system, where stable, long-term and seldomly updated software products are desirable. Conversely Debian is unlikely representative of more dynamic FOSS environments (e.g., modern Web-development with micro libraries) where users, who are usually developers themselves, expect to receive library updates on a daily basis.

The Debsources research also shares methodology limitations with Black Duck: while Debian packages are reviewed before uploading and we can rely on the copyright information provided by Debian maintainers, the research also relies on automated tools (specifically FOSSology) to retrieve license information.

Sullivan also warned against "ascribing reason to numbers": people may have different reasons for choosing a particular license. Developers may choose the MIT license because it has fewer words, for compatibility reasons, or simply because "their lawyers told them to". It may not imply an actual deliberate philosophical or ideological choice.

Finally, he brought up the theory that the rise of non-copyleft licenses isn't necessarily to the detriment of the GPL. He explained that, even if there is an actual decline, it may not be much of a problem if there is an overall growth of free software to the detriment of proprietary software. He reminded the audience that non-copyleft licenses are still free software, according to the FSF and the Debian Free Software Guidelines, so their rise is still a positive outcome. Even if the GPL is a better tool to accomplish the goal of a free-software world, we can all acknowledge that the conversion of proprietary software to more permissive (and certainly simpler) licenses is definitely heading in the right direction.
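
The distinction between a shrinking share and a shrinking absolute count, which underlies both the Debsources graph and this last argument, can be illustrated with a small sketch. The figures below are invented purely for illustration; they are not taken from the paper, the talk, or any of the reports discussed, and the `shares` helper is a hypothetical name.

```python
def shares(counts):
    """Return each license family's share of the total, as a fraction."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

# Hypothetical package counts at two points in time (made-up numbers).
earlier = {"copyleft": 600, "permissive": 400}
later = {"copyleft": 900, "permissive": 1100}

for label, counts in (("earlier", earlier), ("later", later)):
    for name, share in shares(counts).items():
        print(f"{label:8} {name:11} {counts[name]:5d} packages  {share:.0%}")
```

With these invented numbers, the copyleft share falls from 60% to 45% even though the number of copyleft packages grows from 600 to 900, which is why a graph in percentages and a graph in absolute numbers can tell different stories about the same data.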

[I would like to thank the DebConf organizers for providing meals for me during the conference.]

Note: this article first appeared in the Linux Weekly News.

Creative Commons License: the copyright of each article belongs to its respective owner.
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported license.