Planet Debian

Planet Debian - http://planet.debian.org/

Guido Günther: Debian Fun in May 2016

11 June, 2016 - 00:38
Debian LTS

May marked the thirteenth month I contributed to Debian LTS under the Freexian umbrella. I spent 17.25 hours working on these LTS things:

  • Fixed CVE-2014-7210 in pdns resulting in DLA-492-1
  • Fixed the build failure of Icedove on armhf resulting in DLA-472-2
  • Forward ported our nss and nspr enhancements to the current versions in testing to continue the discussion on using the same nss and nspr versions in all suites, including some ABI compliance research (thanks abi-compliance-tester!), resulting in 824872.
  • Backported Icedove 45 and Enigmail to wheezy to check if we can continue to support it - we can, with minor tweaks. Upload will happen in June.
  • While at that, added some autopkgtests for Icedove 45 resulting in 809723 (already applied).
  • Released DLA-498-1 for ruby-active-model-3.2 to address CVE-2016-0753.
  • Reviewed the updates of ruby-active-record-3.2 for CVE-2015-7577 and eglibc.

Other Debian stuff

  • Uploaded libvirt 1.3.4 to sid, 1.3.5~rc1 to experimental
  • Uploaded libosinfo 0.3.0 to sid
  • Uploaded git-buildpackage 0.7.4 to sid including experimental multiple tarball support for gbp buildpackage

Jaminy Prabaharan: Weekly Report for GSoC16-week 1 and week2

10 June, 2016 - 20:18

After introducing ourselves to the community, we started contributing to free and open source software. Since this was the first week, I went through the theory that would help me with coding. It is always safer to study the theory before coding, so that less time is spent on debugging.

 

The following were my week 1-4 plans:

  • Getting familiar with Python and writing code that connects to an email account (using IMAP) and examines every message in every folder.
  • Writing a basic Python script to look at the “To”, “From” and “CC” headers of every email message in the folder, identifying all the names and email addresses and writing them to a CSV file.
  • Scanning the body of each message for phone numbers (for messages in plain text format).
  • Cleaning up the phone numbers, writing them in international format and putting those in the CSV file too. (Telify recognizes phone numbers on web pages or in email messages and converts them into clickable links. This works with many CTI applications, SIP clients, Skype, Netmeeting, snom phones, the AGFEO TK-Suite client, SerVonic IXI-PCS and others. In general, any telephony software, device or interface that can be controlled through a URL will most likely let you call a phone number directly from your browser or email client.)
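
The last two items of the plan could be sketched roughly as follows. This is only an illustration, not the project's code: the pattern is deliberately loose, the function names are made up for the example, and the default country code "44" and the handling of a leading "0" as a national trunk prefix are assumptions that do not hold for every country. A dedicated library would be a better choice for real data.

```python
import re

# Loose, illustrative pattern: an optional "+", then a run of at least eight
# characters that starts and ends with a digit and may contain spaces, dots,
# dashes or parentheses in between.
PHONE_RE = re.compile(r"(?<![\w+])\+?\(?\d[\d ().-]{6,}\d(?!\w)")


def to_international(raw, default_country_code="44"):
    """Normalize one matched number to +<country><number> (assumed rules)."""
    digits = re.sub(r"\D", "", raw)
    if raw.lstrip().startswith("+"):
        # Already written in international format.
        return "+" + digits
    if digits.startswith("00"):
        # "00" is a common international call prefix.
        return "+" + digits[2:]
    if digits.startswith("0"):
        # Assumption: a single leading 0 is a national trunk prefix.
        digits = digits[1:]
    return "+" + default_country_code + digits


def find_phone_numbers(text):
    """Return all phone-like strings in a plain text body, normalized."""
    return [to_international(match) for match in PHONE_RE.findall(text)]


BODY = "Call me on +44 20 7946 0958 or at the office: 020 7946 0123.\nOrder 12 is ready."
numbers = find_phone_numbers(BODY)
```

Short numbers such as the "12" in the sample body are ignored because the pattern requires at least eight characters.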

In these two weeks I have completed the first two tasks in the list. I have committed and pushed my work to GitHub: https://github.com/Jaminy/GSoC

I successfully logged into the email account using IMAP and examined each folder. I examined the inbox to extract the “To”, “From” and “CC” headers of the last message received. I am now trying to extract the details of all messages to put them in a CSV file.
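
The header extraction step might look something like the sketch below. It is not taken from the repository above: the function names and the sample message are invented for the example, and it works on the raw text of a single message so that it runs without a mail server. With IMAP, that raw text would typically come from imaplib's fetch call with the RFC822 message part.

```python
import csv
import io
from email import message_from_string
from email.utils import getaddresses

# Hypothetical sample message standing in for one fetched over IMAP.
RAW_MESSAGE = """\
From: Alice Example <alice@example.org>
To: Bob Example <bob@example.org>, carol@example.org
Cc: Dave Example <dave@example.org>
Subject: Hello

Body text.
"""


def extract_contacts(raw_message):
    """Return (header, name, address) tuples for the From, To and Cc headers."""
    msg = message_from_string(raw_message)
    contacts = []
    for header in ("From", "To", "Cc"):
        # A header may appear more than once and may list several addresses.
        values = msg.get_all(header, [])
        for name, address in getaddresses(values):
            if address:
                contacts.append((header, name, address))
    return contacts


def contacts_to_csv(contacts):
    """Serialize the contacts as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["header", "name", "address"])
    writer.writerows(contacts)
    return out.getvalue()


contacts = extract_contacts(RAW_MESSAGE)
csv_text = contacts_to_csv(contacts)
```

Letting the csv module do the quoting avoids broken rows when a display name contains a comma.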


Ben Hutchings: Debian LTS work, May 2016

10 June, 2016 - 18:02

I was assigned another 15 hours of work by Freexian's Debian LTS initiative, but only worked a total of 10 hours. I intend to make up for this in June.

I began preparing the next stable update for Linux 3.2 on kernel.org, but haven't yet sent it out for review. I rebased the wheezy-security branch onto Linux 3.2.80, and added fixes for one more security issue and one data corruption issue affecting aufs.

I started a week on the front desk, triaging new issues for wheezy.

Martin Pitt: autopkgtest 4.0: Simplified CLI, deprecating “adt”

10 June, 2016 - 03:24

Historically, the “adt-run” command line has allowed multiple tests; as a consequence, arguments like --binary or --override-control were position dependent, which confused users a lot (#795274, #785068, #795274, LP #1453509). On the other hand I don’t know anyone or any CI system which actually makes use of the “multiple tests on a single command line” feature.

The command line also was a bit confusing in other ways, like the explicit --built-tree vs. --unbuilt-tree and the magic / vs. // suffixes, or option vs. positional arguments to specify tests.

The other long-standing confusion is the pervasive “adt” acronym, which is still from the very early times when “autopkgtest” was called “autodebtest” (this was changed one month after autodebtest’s inception, in 2006!).

Thus in some recent night/weekend hack sessions I’ve worked on a new command line interface and consistent naming. This is now available in autopkgtest 4.0 in Debian unstable and Ubuntu Yakkety. You can download and use the deb package on Debian jessie and Ubuntu ≥ 14.04 LTS as well. (I will provide official backports after the first bug fix release after this got some field testing.)

New “autopkgtest” command

The adt-run program is now superseded by autopkgtest:

  • It accepts exactly one tested source package, and gives a proper error if none or more than one (often unintended) is given. Binaries to be tested, --override-control, etc. can now be specified in any order, making the arguments position independent. So you now can do things like:
    autopkgtest *.dsc *.deb [...]

    Before, *.deb only applied to the following test.

  • The explicit --source, --click-source etc. options are gone, the type of tested source/binary packages, including built vs. unbuilt tree, is detected automatically. Tests are now only specified with positional arguments, without the need (or possibility) to explicitly specify their type. The one exception is --installed-click com.example.myapp as possible names are the same as for apt source package names.
    # Old:
    adt-run --unbuilt-tree pkgs/foo-2 [...]
    # or equivalently:
    adt-run pkgs/foo-2// [...]
    
    # New:
    autopkgtest pkgs/foo-2
    # Old:
    adt-run --git-source http://example.com/foo.git [...]
    # New:
    autopkgtest http://example.com/foo.git [...]
    
  • The virtualization server is now separated with a double instead of a triple dash, as the former is standard Unix syntax.
  • It defaults to the current directory if that is a Debian source package. This makes the command line particularly simple for the common case of wanting to run tests in the package you are just changing:
    autopkgtest -- schroot sid

    Assuming the current directory is an unbuilt Debian package, this will build the package, and run the tests in ./debian/tests against the built binaries.

  • The virtualization server must be specified with its “short” name only, e. g. “ssh” instead of “adt-virt-ssh”. They also don’t get installed into $PATH any more, as it’s hardly useful to call them directly.

README.running-tests got updated to the new CLI; as usual, you can also read the HTML online.

The old adt-run CLI is still available with unchanged behaviour, so it is safe to upgrade existing CI systems to that version.

Image build tools

All adt-build* tools got renamed to autopkgtest-build*, and got changed to build images prefixed with “autopkgtest” instead of “adt”. For example, adt-build-lxc ubuntu xenial now produces an autopkgtest-xenial container instead of adt-xenial.

In order to not break existing CI systems, the new autopkgtest package contains symlinks to the old adt-build* commands; when called through them, the tools also produce images with the old “adt-” prefix.

Environment variables in tests

Finally, there is a set of environment variables that are exported by autopkgtest for use in tests and image customization tools, which now got renamed from ADT_* to AUTOPKGTEST_*:

  • AUTOPKGTEST_APT_PROXY
  • AUTOPKGTEST_ARTIFACTS
  • AUTOPKGTEST_AUTOPILOT_MODULE
  • AUTOPKGTEST_NORMAL_USER
  • AUTOPKGTEST_REBOOT_MARK
  • AUTOPKGTEST_TMP

As these are being used in existing tests and tools, autopkgtest also exports/checks those under their old ADT_* name. So tests can be converted gradually over time (this might take several years).
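
A test that has to run under both old and new versions during such a transition could look up the new name first and fall back to the old one. The helper below is a hypothetical sketch of that idea, not part of autopkgtest; only the two variable prefixes come from the text above.

```python
import os


def autopkgtest_env(name, environ=None):
    """Look up AUTOPKGTEST_<name>, falling back to the legacy ADT_<name>.

    `environ` defaults to the process environment; passing a dict makes
    the helper easy to try out.
    """
    env = os.environ if environ is None else environ
    for prefix in ("AUTOPKGTEST_", "ADT_"):
        value = env.get(prefix + name)
        if value is not None:
            return value
    return None


# Example: prefer the new name when both are set.
tmp_dir = autopkgtest_env("TMP", {"AUTOPKGTEST_TMP": "/new", "ADT_TMP": "/old"})
```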

Feedback

As usual, if you find a bug or have a suggestion how to improve the CLI, please file a bug in Debian or in Launchpad. The new CLI is recent enough that we still have some liberty to change it.

Happy testing!

Patrick Schoenfeld: Ansible: Indenting in Templates

9 June, 2016 - 16:09

When using ansible to configure systems and services, templates can reach a significant complexity.  Proper indenting can help to improve the readability of the templates, which is very important for further maintenance.

Unfortunately, the default settings for the jinja2 template engine in ansible enable only trim_blocks, while a combination with lstrip_blocks would be better. But here comes the good news:

It’s possible to enable that setting on a per-template basis. The secret is to add a special comment to the very first line of a template:

#jinja2: lstrip_blocks: True

This setting does the following: if enabled, leading spaces and tabs “are stripped from the start of a line to a block”.

So a resulting template could look like this:

global
{% for setting in global_settings %}
    {% if setting ... %}
    option {{ setting }}
    {% endif %}
{% endfor %}

Unfortunately (or fortunately, if you want to see it this way), this does not strip leading spaces and tabs where the indentation is followed by plain text, e.g. the whitespace in line 4 is preserved. So, as a matter of fact, if you care about the indentation in the resulting target file, you need to indent those lines according to the indentation wanted in the target file instead, as is done in the example.

In less simple cases, with deeper nesting, this may seem odd, but hey: it’s the best compromise between a good, readable template and a consistently indented output file.

NOKUBI Takatsugu: Recurrent Convolutional Neural Networks for Text Classification

9 June, 2016 - 14:38

I made a simple implementation of text classification with Recurrent CNN.

https://github.com/knok/rcnn-text-classification

It uses chainer, a Deep Learning framework.

Recurrent convolutional neural networks for text classification
Siwei Lai, Liheng Xu, Kang Liu, Jun Zhao, Chinese Academy of Sciences, China
AAAI. 2015.

Daniel Pocock: Working to pass GSoC

9 June, 2016 - 00:11

GSoC students have officially been coding since 23 May (about 2.5 weeks) and are almost half-way to the mid-summer evaluation (20 - 27 June). Students who haven't completed some meaningful work before that deadline don't receive payment and in such a large program, there is no possibility to give students extensions or let them try and catch up later.

Every project and every student is different: some are still getting to know their environment while others have already done enough to pass the mid-summer evaluation.

I'd like to share a few tips to help students ensure they don't inadvertently fail the mid-summer evaluation:

Kill electronic distractions

As a developer of real-time communications projects, many people will find it ironic or hypocritical that this is at the top of my list.

Switch off the mobile phone or put it in silent mode so it doesn't even vibrate. Research has suggested that physically turning it off and putting it out of sight has significant benefits. Disabling the voicemail service can be an effective way of making sure no time is lost listening to a bunch of messages later. Some people may grumble at first but if they respect you, they'll get into the habit of emailing you and waiting for you to respond when you are not working.

Get out a piece of paper and make a list of all the desktop notifications on your computer, whether they are from incoming emails, social media, automatic updates, security alerts or whatever else. Then figure out how to disable them all one-by-one.

Use email to schedule fixed times for meetings with mentors. Some teams/projects also have fixed daily or weekly times for IRC chat. For a development project like GSoC, it is not necessary or productive to be constantly on call for 3 straight months.

Commit every day

Habits are a powerful thing. Successful students have a habit of making at least one commit every day. The "C" in GSoC is for Code and commits are a good way to prove that coding is taking place.

GSoC is not a job, it is like a freelance project. There is no safety-net for students who get sick or have an accident and mentors are not bosses, each student is expected to be their own boss. Although Google has started recommending students work full time, 40 hours per week, it is unlikely any mentors have any way to validate these hours. Mentors can look for a commit log, however, and simply won't be able to pass a student if there isn't code.

There may be one day per week where a student writes a blog or investigates a particularly difficult bug and puts a detailed report in the bug tracker but by the time we reach the second or third week of GSoC, most students are making at least one commit in 3 days out of every 5.

Consider working away from home/family/friends

Can you work without anybody interrupting you for at least five or six hours every day?

Do you feel pressure to help with housework, cooking, siblings or other relatives? Even if there is no pressure to do these things, do you find yourself wandering away from the computer to deal with them anyway?

Do family, friends or housemates engage in social activities, games or other things in close proximity to where you work?

All these things can make a difference between passing and failing.

Maybe these things were tolerable during high school or university. GSoC, however, is a stepping stone into professional life and that means making a conscious decision to shut those things out and focus. Some students have the ability to manage these distractions well, but it is not for everybody. Think about how leading sports stars or musicians find a time and space to be "in the zone" when training or rehearsing, this is where great developers need to be too.

Some students find the right space in a public library or campus computer lab. Some students have been working in hacker spaces or at empty desks in local IT companies. These environments can also provide great networking opportunities.

Managing another summer job concurrently with GSoC

It is no secret that some GSoC students have another job as well. Sometimes the mentor is aware of it, sometimes it has not been disclosed.

The fact is, some students have passed GSoC while doing a summer job or internship concurrently, but some have also failed badly in both GSoC and their summer job. Choosing one or the other is the best way to succeed, get the best results and maximize the quality of learning and community interaction. For students in this situation, it is not yet too late to decide to withdraw from GSoC or the other job.

If doing a summer job concurrently with GSoC is unavoidable, the chance of success can be greatly increased by doing the GSoC work in the mornings, before starting the other job. Some students have found that they actually finish more quickly and produce better work when GSoC is constrained to a period of 4 or 5 hours each morning and their other job is only in the afternoon. On the other hand, if a student doesn't have the motivation or energy to get up and work on GSoC before the other job then this is a strong sign that it is better to withdraw from GSoC now.

Gunnar Wolf: University degrees and sysadmin skills

9 June, 2016 - 00:03

I'll tune in to the post-based conversation being held on Planet Debian: Russell Coker wonders about what's needed to get university graduates with enough skills for a sysadmin job, to which Lucas Nussbaum responds with his viewpoints. They present very contrasting views of what students need, and for a good reason, I'd say: Lucas is an academic; I don't know for sure about Russell, but he seems to be a down-to-earth, dirty-handed, proficient sysadmin working in the field. They both come into contact with newcomers to their fields, and will notice different shortcomings.

I tend to side with Lucas' view. That does not come as a surprise, as I've been working for over 15 years in a university, and in the last few years I started moving from being a mostly-operative sysadmin in an academic setting towards becoming an academic who spends most of his time sysadmining. Subtle but important distinction.

I teach at the BSc level at UNAM, and am a Masters student at IPN (respectively, Mexico's largest and second-largest universities). And yes, the lack of sysadmin abilities in both is surprising. But so is the lack of a good understanding of programming. And I'm sure that, were I to dig into several different fields, I'd feel the same: student formation is very basic in each of those fields.

But I see that as natural. Of course, if I were to judge people as geneticists as they graduate from Biology, or were I to judge them as topologists as they graduate from Mathematics, or any other discipline in which I'm not an expert, I'd surely not know where to start — Given I have about 20 years of professional life on my shoulders, I'm quite skewed as to what is basic for a computing professional. And of course, there are severe holes in my formation, in areas I never used. I know next to nothing of electronics, my mathematical basis is quite flaky, and I'm a poor excuse when talking about artificial intelligence.

Where am I going with this? A university degree (BSc in English, which would amount to "licenciatura" in Spanish) is not for specialization. It is meant to give a sufficiently broad panorama of the field, and all of the tools needed to start digging deeper and specializing, either by yourself, working in a given field and learning its details as you go, or by going through a postgraduate program (Specialization, Masters, Doctorate).

Even most of my colleagues at the Masters in Engineering in Security and Information Technology lack a good formation in fields I consider essential. However, what does information security mean? Many among them are working on the legal implications of several laws that touch our field. Many others are working on authenticity issues in images, audio and other such media. Many others are trying to come up with mathematical ways to cheapen the enormous burden of crypto operations (say, "shaving" CPU cycles off a very large exponentiation). Others are designing autonomous learning mechanisms to characterize malware. Were I, as a computing professional, to start talking about their research, I'd surely reveal I know nothing about it and get laughed at. That's because I haven't specialized in those fields.

University education should give a broad universal basis to enter a professional field. It should not focus on teaching tools or specific procedures (although some should surely be presented as examples or case studies). Although I'd surely be happy if my university's graduates were to know everything about administering a Debian system, that would be wrong for a university to aim at; I'd criticize it the same way I currently criticize programs that mix together university formation and industry certification as if they were related.

Reproducible builds folks: Reproducible builds: week 58 in Stretch cycle

8 June, 2016 - 21:08

What happened in the Reproducible Builds effort between May 29th and June 4th 2016:

Media coverage

Ed Maste will present Reproducible Builds in FreeBSD at BSDCan 2016 in Ottawa, Canada on June 11th.

GSoC and Outreachy updates

Toolchain fixes

  • Paul Gevers uploaded fpc/3.0.0+dfsg-5 with a new helper script fp-fix-timestamps, which helps with reproducibility issues of PPU files in freepascal packages.
  • Sascha Steinbiss uploaded a patched version of epydoc to our experimental repository to test a patch for the use_epydoc issue.

Other upstream fixes

Packages fixed

The following 53 packages have become reproducible due to changes in their build-dependencies: angband blktrace code-saturne coinor-symphony device-tree-compiler mpich rtslib ruby-bcrypt ruby-bson-ext ruby-byebug ruby-cairo ruby-charlock-holmes ruby-curb ruby-dataobjects-sqlite3 ruby-escape-utils ruby-ferret ruby-ffi ruby-fusefs ruby-github-markdown ruby-god ruby-gsl ruby-hdfeos5 ruby-hiredis ruby-hitimes ruby-hpricot ruby-kgio ruby-lapack ruby-ldap ruby-libvirt ruby-libxml ruby-msgpack ruby-ncurses ruby-nfc ruby-nio4r ruby-nokogiri ruby-odbc ruby-oj ruby-ox ruby-raindrops ruby-rdiscount ruby-redcarpet ruby-redcloth ruby-rinku ruby-rjb ruby-rmagick ruby-rugged ruby-sdl ruby-serialport ruby-sqlite3 ruby-unicode ruby-yajl ruby-zoom thin

The following packages have become reproducible after being fixed:

Some uploads have addressed some reproducibility issues, but not all of them:

Uploads with an unknown result because they fail to build:

  • h2database/1.4.192-1 by Emmanuel Bourg, which forces a specific locale to generate documentation.

Patches submitted that have not made their way to the archive yet:

  • #825764 against docbook-ebnf by Chris Lamb: sort list of globbed files.
  • #825857 against python-setuptools by Anton Gladky: sort list of files in native_libs.txt.
  • #825968 against epydoc by Sascha Steinbiss: traverse lists in sorted order.
  • #826051 against dh-lua by Reiner Herrmann: sort list of Lua versions embedded into control file.
  • #826093 against osc by Alexis Bienvenüe: use SOURCE_DATE_EPOCH for manpage date.
  • #826158 against texinfo by Alexis Bienvenüe: use SOURCE_DATE_EPOCH for dates in makeinfo output.
  • #826162 against slime by Alexis Bienvenüe: sort list of contributors locale-independently.
  • #826209 against fastqtl by Chris Lamb: normalize permissions and order in tarball.
  • #826309 against gnupg2 by intrigeri: don't embed hostname and timestamp into gpgv.exe.
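
Two techniques recur in these patches: taking dates from SOURCE_DATE_EPOCH instead of the current time, and sorting file lists before using them. A minimal sketch of both, using hypothetical helper names and not taken from any of the patches listed, could be:

```python
import os
import time


def build_date(environ=None):
    """Return the date to embed in generated output, as YYYY-MM-DD in UTC.

    SOURCE_DATE_EPOCH (seconds since the Unix epoch) wins if it is set;
    otherwise the current time is used, which is not reproducible.
    """
    env = os.environ if environ is None else environ
    epoch = int(env.get("SOURCE_DATE_EPOCH", time.time()))
    return time.strftime("%Y-%m-%d", time.gmtime(epoch))


def stable_file_list(names):
    """Sort names by code point, independent of locale and readdir order."""
    return sorted(names)


date = build_date({"SOURCE_DATE_EPOCH": "1465344000"})
files = stable_file_list(["b.xml", "a.xml", "C.xml"])
```

Formatting in UTC matters as well, since the build machine's timezone would otherwise leak into the output.
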

Package reviews

45 reviews have been added, 25 have been updated and 25 have been removed in this week.

12 FTBFS bugs have been reported by Chris Lamb and Niko Tyni.

diffoscope development

  • diffoscope 53 was released by Mattia Rizzolo, with:
    • various improvements on temporary file handling;
    • fix a crash when comparing directories with broken symlinks (#818856);
    • great improvement on the deb(5) support (#818414), by Reiner Herrmann;
    • add FreeBSD packages in --list-tools, by Ed Maste.
  • diffoscope 54 was released shortly after to address a regression involving --list-tools, where a syntax error prevented proper listing of all tools.

strip-nondeterminism development

Mattia uploaded strip-nondeterminism 0.018-1 which improved support for *.epub files.

tests.reproducible-builds.org

Misc.

Last week we also learned about progress of reproducible builds in FreeBSD. Ed Maste announced a change to record the build timestamp during ports building, which is required for later reproduction.

This week's edition was written by Reiner Herrmann, Holger Levsen and Chris Lamb and reviewed by a bunch of Reproducible builds folks on IRC.

Jonathan Dowland: Some tools for working with Docker images

8 June, 2016 - 19:45

For developing complex, real-world Docker images, there are a number of tools that can make life easier.

The first thing to realise is that the Dockerfile format is severely limited. At work, we have eventually outgrown it and it has been replaced with a structured YAML document that is processed into a Dockerfile by a tool called dogen. There are several advantages to this, but I'll point out two: firstly, having data about the image available in a structured format makes automatically deriving technical documentation very easy. Secondly, some of the quirks of Dockerfiles, such as the ADD command respecting the environment's umask, are worked around in the dogen tool.

We have a large suite of integration tests that we run against images to make sure that we haven't introduced regressions during their development. The core of this is the Container Testing Framework, which makes use of the Behave system.

Each command that is run in a Dockerfile generates a new docker image layer. In practice, this can mean a real-world image has a great number of layers underneath it. Docker-dot-com have resisted introducing layer squashing into their tools, but with both hard limits for layers in some of the storage backends, and performance issues for most of the rest, this is a very real issue. Marek Goldmann wrote a squashing tool that we use to control the number of intermediate layers that are introduced by our images.

Finally, even with tools like dogen and ctf, we would like to be able to have more sophisticated tools than shell scripts for configuring images, both at image build time and container run time. We want to do this without introducing extra dependencies inside the images which will not otherwise be used for their operation.

Ansible could be a solution for this, but there are practical issues with relying on it for runtime configuration in our situation. For that reason David Becvarik is designing and implementing Container Configuration Tool, or cct, a framework for performing configuration of containers written in Python.

Tanguy Ortolo: Process command line arguments in shell

8 June, 2016 - 18:29

When writing a wrapper script, one often has to process the command line arguments to transform them according to one's needs: to change some arguments, to remove or insert some, or perhaps to reorder them.

Naive approach

The naive approach to do that is¹:

# Process arguments, building a new argument list
new_args=""
for arg in "$@"
do
    case "$arg"
    in
        --foobar)
            # Convert --foobar to the new syntax --foo=bar
            new_args="$new_args --foo=bar"
        ;;
        *)
            # Take other options as they are
            new_args="$new_args $arg"
        ;;
    esac
done

# Call the actual program
exec program $new_args

This naive approach is simple, but fragile, as it will break on arguments that contain a space. For instance, calling wrapper --foobar "some file" (where some file is a single argument) will result in the call program --foo=bar some file (where some and file are two distinct arguments).

Correct approach

To handle spaces in arguments, we need either:

  • to quote them in the new argument list, but that requires escaping possible quotes they contain, which would be error-prone, and implies using external programs such as sed;
  • to use an actual list or array, which is a feature of advanced shells such as Bash or Zsh, not standard shell…

… except standard shell does support arrays, or rather, it does support one specific array: the positional parameter list "$@"². This leads to one solution to process arguments in a reliable way, which consists in rebuilding the positional parameter list with the built-in command set --:

# Process arguments, building a new argument list in "$@"
# "$@" will need to be cleared, not right now but on first iteration only
first_iter=1
for arg in "$@"
do
    if [ "$first_iter" -eq 1 ]
    then
        # Clear the argument list
        set --
        first_iter=0
    fi
    case "$arg"
    in
        --foobar) set -- "$@" --foo=bar ;;
        *) set -- "$@" "$arg" ;;
    esac
done

# Call the actual program
exec program "$@"

Notes

  1. If you prefer, for arg in "$@" can be simplified to just for arg.
  2. As a reminder, and contrary to what it looks like, quoted "$@" does not expand to a single field, but to one field per positional parameter.

Lucas Nussbaum: Re: Sysadmin Skills and University Degrees

8 June, 2016 - 15:04

Russell Coker wrote about Sysadmin Skills and University Degrees. I couldn’t agree more that a major deficiency in Computer Science degrees is the lack of sysadmin training. It seems like most sysadmins learned most of what they know from experience. It’s very hard to recruit young engineers (freshly out of university) for sysadmin jobs, and the job interviews are often a bit depressing. Sysadmin jobs are also not very popular with this audience, probably because university curriculums fail to emphasize what’s exciting about those jobs.

However, I think I disagree rather deeply with Russell’s detailed analysis.

First, Version Control. Well, I think that it’s pretty well covered in university curriculums nowadays. From my point of view, teaching CS in Université de Lorraine (France), mostly in Licence Professionnelle Administration de Systèmes, Réseaux et Applications à base de Logiciels Libres (warning: French), a BSc degree focusing on Linux systems administration, it’s not unusual to see student projects with a mandatory use of Git. And it doesn’t seem to be a major problem for students (which always surprises me). However, I wouldn’t rate Version Control as the most important thing that is required for a sysadmin. Similarly, Dependencies and Backups are things that should be covered, but probably not as first class citizens.

I think that there are several pillars in the typical sysadmin knowledge.

First and foremost, sysadmins need a good understanding of the inner workings of an operating system. I sometimes feel that many Operating Systems Design courses are a bit too much focused on the “Design” side of things. Yes, it’s useful to understand the low-level mechanisms, and be able to (mentally) recreate an OS from scratch. But it’s also interesting to know how real systems are actually built, and what trade-offs are involved. I very much enjoyed reading Brendan Gregg’s Systems Performance: Enterprise and the Cloud because each chapter starts with a great overview of how things are in the real world, with a very good level of detail. Also, addressing OS design from the point of view of performance could be a way to turn those courses into something more attractive for students: many people like to measure, benchmark, optimize things, and it’s quite easy to demonstrate how different designs, or different configurations, make a big difference in terms of performance in the context of OS design. It’s possible to be a sysadmin and ignore, say, the existence of the VFS, but there’s a large class of problems that you will never be able to solve. It can be a good trade-off for a curriculum (e.g. at the BSc level) to decide to ignore most of the low-level stuff, but it’s important to be aware of it.

Students also need to learn how to design a proper infrastructure (that meets requirements in terms of scalability, availability, security, and maybe elasticity). Yes, backups are important. But monitoring is, too. As well as high availability. In order to scale, it’s important to be able to automate stuff. Russell writes that Sysadmins need some programming skills, but that’s mostly scripting and basic debugging. Well, when you design an infrastructure, or when you use configuration management tools such as Puppet, in some sense, you are programming, and in terms of the need to abstract things, it’s actually similar to doing object-oriented programming, with similar choices (should I use that off-the-shelf puppet module, or re-develop my own? How should everything fit together?). Also, when debugging, it’s often useful to be able to dig into code, understand what the developer was trying to do, and whether the expected behavior actually matches what you are seeing. It often results in spending a lot of time to create a one-line fix, and it requires very advanced programming skills. Again, it’s possible to be a sysadmin with only limited software development knowledge, but there’s a large class of things that you are unlikely to address properly.

I think that what makes sysadmin jobs both very interesting and very challenging is that they require a very wide range of knowledge. There’s often the opportunity to learn about new stuff (much more than in software development jobs). Of course, the difficult question is where to draw the line. What is the sysadmin knowledge that every CS graduate should have, even in curriculums not targeting sysadmin jobs? What is the sysadmin knowledge for a sysadmin BSc degree? For a sysadmin MSc degree?

Russell Coker: Sysadmin Skills and University Degrees

8 June, 2016 - 13:10

I think that a major deficiency in Computer Science degrees is the lack of sysadmin training.

Version Control

The first thing that needs to be added is the basics of version control. CVS (which is now regarded as obsolete) was initially released when I was in the first year of university. But SCCS and RCS had been in use for some time. I think that the people who designed my course were remiss in not adding any mention of version control (not even strategies for saving old versions of your work). One could say that they taught us about version control by letting us accidentally delete our assignments. :-#

If a course is aimed at just teaching programmers (as most CS degrees are) then version control for group assignments should be a standard part of the course. Having some marks allocated for the quality of comments in the commit log would also be good.

A modern CS degree should cover distributed version control, which means covering Git, as it’s the most popular distributed version control system nowadays.

For people who want to work as sysadmins (as opposed to developers who run their own PCs) a course should have an optional subject for version control of an entire system. That includes tools like etckeeper for version control of system configuration and tools like Puppet for automated configuration and system maintenance.

Dependencies

It’s quite reasonable for a CS degree to provide simplified problems for the students to solve so they can concentrate on one task. But in the real world the problems are more complex. One of the more difficult parts of managing real systems is dependencies. You have issues of header files etc. at compile time and library versions at deployment. Often you need a program to run on systems with different versions of the OS, which means making it compile on each of them and dealing with differences in behaviour.

There are lots of hacky things that people do to deal with dependencies in systems. People link compiled programs statically, install custom versions of interpreters in user home directories or /usr/local for daemons, and do many other things. These things can have bad consequences including data loss, system downtime, and security problems. It’s not always wrong to do such things, but it’s something that should only be done with knowledge of the potential consequences and a plan for mitigating them. A CS degree should teach the potential advantages and disadvantages of these options to allow graduates to make informed decisions.

Backups

I’ve met many people who call themselves computer professionals and think that backups aren’t needed. I’ve seen production systems that were designed in a way that backups were impossible. The lack of backups is a serious problem for the entire industry.

Some lectures about backups could be part of a version control subject in a general CS degree. For a degree that majors in Sysadmin at least one subject about backups is appropriate.

For any backup (even backing up your home PC) you should have offsite backups to deal with fire damage, multiple backups of different ages (especially important now that encryption malware is a serious threat), and a plan for how fast you can restore things.
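As a small illustration of keeping "multiple backups of different ages", here is a minimal shell sketch of a retention step. It rests on assumptions that are not in the post: backups are stored as entries named backup-YYYY-MM-DD inside one directory, and prune_backups is a hypothetical helper name. It only covers retention by count; offsite copies and restore planning are separate problems.

```shell
#!/bin/sh
# prune_backups DIR KEEP (hypothetical helper)
# Assumes entries in DIR are named backup-YYYY-MM-DD, so that sorting the
# names also sorts them by age. Anything not matching that pattern is ignored.
prune_backups() {
    dir=$1
    keep=$2
    # Newest first, skip the first KEEP names, delete whatever remains.
    ls -1 "$dir" \
        | grep '^backup-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$' \
        | sort -r \
        | tail -n +"$((keep + 1))" \
        | while IFS= read -r name; do
            rm -rf -- "$dir/$name"
        done
}
```

Calling prune_backups /srv/backups 7 (the path is only an example) would keep the seven most recent dated backups and remove older ones. A real policy would probably keep some older snapshots too, since encryption malware may go unnoticed for longer than a week.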

The most common use of backups is to deal with the case of deleting the wrong file. Unfortunately this case seems to be the most rarely mentioned.

Another common situation that should be covered is a configuration error that results in a system that won’t boot correctly. It’s a very common problem and one that can be solved quickly if you are prepared but which can take a long time if you aren’t.

For a Sysadmin course it is important to cover backups of systems in remote datacenters.

Hardware

A good CS degree should cover the process of selecting suitable hardware. Programmers often get to advise on the hardware used to run their code, especially at smaller companies. Reliability features such as RAID, ECC RAM, and clustering should be covered.

Planning for upgrades is a very important part of this which is usually not taught. Not only do you need to plan for an upgrade without much downtime or cost but you also need to plan for what upgrades are possible. Will your system need more powerful hardware next year than you will be able to buy by then? If so you need to plan for a cluster now.

For a Sysadmin course some training about selecting cloud providers and remote datacenter hosting should be provided. There are many complex issues that determine whether it’s most appropriate to use a cloud service, hosted virtual machines, hosted physical servers managed by the ISP, hosted physical servers purchased by the client, or on-site servers. Often a large system will involve 2 or more of those options; even some small companies use 3 or more of them to try and provide the performance and reliability they need at a price they can afford.

We Need Sysadmin Degrees

Covering the basic coding skills takes a lot of time. I don’t think we can reasonably expect a CS degree to cover all that and also give good coverage to sysadmin work. While some basic sysadmin skills are needed by every programmer I think we need to have separate majors for people who want a career in system administration.

Sysadmins need some programming skills, but that’s mostly scripting and basic debugging. Someone whose main job is as a sysadmin can probably expect to never make any significant change to a program that’s more than 10,000 lines long. A large amount of the programming in a CS degree can be replaced by “file a bug report” for a sysadmin degree.

This doesn’t mean that sysadmins shouldn’t be doing software development or that they aren’t good at it. One noteworthy fact is that it appears that the most common job among developers of the Debian distribution of Linux is System Administration. Developing an OS involves some of the most intensive and demanding programming. But I think that more than a few people who do such work would have skipped a couple of programming subjects in favour of sysadmin subjects if they were given a choice.

Suggestions

Did I miss anything? What other sysadmin skills should be taught in a CS degree?

Do any universities teach these things now? If so please name them in the comments. It is good to help people find universities that teach them what they want to learn and help them in their career.

Francois Marier: Simple remote mail queue monitoring

8 June, 2016 - 12:30

In order to monitor some of the machines I maintain, I rely on a simple email setup using logcheck. Unfortunately that system completely breaks down if mail delivery stops.

This is the simple setup I've come up with to ensure that mail doesn't pile up on the remote machine.

Server setup

The first thing I did on the server side was to follow Sean Whitton's advice and configure postfix so that it keeps undelivered emails for 10 days (instead of 5 days, the default):

postconf -e maximal_queue_lifetime=10d

Then I created a new user:

adduser mailq-check

with a password straight out of pwgen -s 32.

I gave ssh permission to that user:

adduser mailq-check sshuser

and then authorized my new ssh key (see next section):

sudo -u mailq-check -i
mkdir ~/.ssh/
cat - > ~/.ssh/authorized_keys

Laptop setup

On my laptop, the machine from which I monitor the server's mail queue, I first created a new password-less ssh key:

ssh-keygen -t ed25519 -f ~/.ssh/egilsstadir-mailq-check
cat ~/.ssh/egilsstadir-mailq-check.pub

which I then installed on the server.
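Since the key has no passphrase, one possible refinement (not part of the setup described here) is to install it with a forced command, so that it can only run mailq. The sketch below is a minimal illustration under stated assumptions: install_mailq_key is a hypothetical helper, /usr/bin/mailq is an assumed path, and the options used are standard OpenSSH authorized_keys options.

```shell
#!/bin/sh
# install_mailq_key SSH_DIR PUBKEY (hypothetical helper, run as the mailq-check user)
#   SSH_DIR: the user's ssh directory, e.g. ~/.ssh
#   PUBKEY:  the single line printed by `cat ~/.ssh/egilsstadir-mailq-check.pub`
install_mailq_key() {
    ssh_dir=$1
    pubkey=$2

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    # Forced command plus no-* options: this key cannot open a shell, get a
    # terminal, or forward ports. /usr/bin/mailq is an assumed location.
    printf 'command="/usr/bin/mailq",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty %s\n' \
        "$pubkey" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

With a forced command in place, sshd runs mailq whatever command the client asks for, so the monitoring command on the laptop can stay as it is.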

Then I added this cronjob in /etc/cron.d/egilsstadir-mailq-check:

0 2 * * * francois /usr/bin/ssh -i /home/francois/.ssh/egilsstadir-mailq-check mailq-check@egilsstadir mailq | grep -v "Mail queue is empty"

and that's it. I get a (locally delivered) email whenever the mail queue on the server is non-empty.
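The cron job works because cron only sends mail when a job produces output, and grep -v drops the one line that mailq prints for an empty queue. The same logic can be sketched as a small function, which makes it easy to try locally with a stand-in command; check_queue is a hypothetical name, not something from the setup above.

```shell
#!/bin/sh
# check_queue CMD [ARGS...] (hypothetical helper)
# Runs the given command, which is expected to print mailq output, and
# passes through everything except the "Mail queue is empty" line.
check_queue() {
    "$@" | grep -v "Mail queue is empty"
    # grep -v exits non-zero when it filtered out every line, which is the
    # healthy case here, so report success either way.
    return 0
}
```

For example, check_queue ssh -i ~/.ssh/egilsstadir-mailq-check mailq-check@egilsstadir mailq prints nothing when the queue is empty and prints the queue listing otherwise. Errors from ssh itself go to stderr, which cron also mails, so an unreachable server is reported as well.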

There is a race condition built into this setup since it's possible that the server will want to send an email at 2am. However, all that does is send a spurious warning email in that case and so it's a pretty small price to pay for a dirt simple setup that's unlikely to break.

Enrico Zini: You'll thank me later

7 June, 2016 - 17:43

I agree with this post by Matthew Garrett.

I am quite convinced that most of the communities that I have known are vulnerable to people who are good manipulators of people.

Also, in my experience, manipulation by negating, pushing, or reframing the boundaries of people tends not to be recognised as manipulation, let alone abusive behaviour.

It's not about physically forcing people to do things that they don't want to do. It's about pushing people, again and again, wearing them out, making them feel like, despite their actual needs and wants, saying "yes" to you is the only viable way out.

It can happen for sex, and it can happen for getting a patch merged. It can happen out of habit. It can happen for pretty much anything.

Consent culture was not part of my education, and it was something I've had to discover for myself. I assume that to be a common experience, and that pushing against boundaries does happen, even without malicious intentions, on a regular basis.

However, it is not ok.

Take insisting. It is not the same as persisting. Persisting is what I do when I advocate for change. Persisting is what I do when the first version of my code segfaults. Insisting is what I do when a person says "no" to me and I don't want to accept it.

Is it ok to insist that a friend, who you think is sick, goes and gets help?

Is it ok to insist that a friend, who you think is sexually repressed, pushes through their boundaries to explore their sexuality with you?

In both cases, one may say, or think, trust me, you'll thank me afterwards. In both cases, what if afterwards I have nothing to thank you for?

I see a common pattern in you'll thank me afterwards situations. It can be in good faith, it can be creepy, it can be abusive, and most of the time, what it is, is dangerously unclear to most of the people involved.

I think that in a community like Debian, at the level of personal interaction, Insisting is not ok.

I think that in a community like Debian, at the level of personal interaction, "You'll thank me afterwards" is not ok.

When I say it's not ok I mean that it should not happen. If it happens, people must be free to say "stop". If it doesn't stop, people must expect to be able to easily find support, understanding, and help to make it stop.

Just like when people upload untested packages.

Pushing against personal boundaries of people is not ok, and pushing against personal boundaries does happen. When you get involved in a new community, such as Debian, find out early where, if that happens, you can find support, understanding, and help to make it stop.

If you cannot find any, or if the only thing you can find is people who say "it never happens here", consider whether you really want to be in that community.

Matthew Garrett: Be wary of heroes

7 June, 2016 - 10:33
Inspiring change is difficult. Fighting the status quo typically means being able to communicate so effectively that powerful opponents can't win merely by outspending you. People need to read your work or hear you speak and leave with enough conviction that they in turn can convince others. You need charisma. You need to be smart. And you need to be able to tailor your message depending on the audience, even down to telling an individual exactly what they need to hear to take your side. Not many people have all these qualities, but those who do are powerful and you want them on your side.

But the skills that allow you to convince people that they shouldn't listen to a politician's arguments are the same skills that allow you to convince people that they shouldn't listen to someone you abused. The ability that allows you to argue that someone should change their mind about whether a given behaviour is of social benefit is the same ability that allows you to argue that someone should change their mind about whether they should sleep with you. The visibility that gives you the power to force people to take you seriously is the same visibility that makes people afraid to publicly criticise you.

We need these people, but we also need to be aware that their talents can be used to hurt as well as to help. We need to hold them to higher standards of scrutiny. We need to listen to stories about their behaviour, even if we don't want to believe them. And when there are reasons to believe those stories, we need to act on them. That means people need to feel safe in coming forward with their experiences, which means that nobody should have the power to damage them in reprisal. If you're not careful, allowing charismatic individuals to become the public face of your organisation gives them that power.

There's no reason to believe that someone is bad merely because they're charismatic, but this kind of role allows a charismatic abuser both a great deal of cover and a great deal of opportunity. Sometimes people are just too good to be true. Pretending otherwise doesn't benefit anybody but the abusers.

Creative Commons License: the copyright in each article belongs to its respective author.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported license.