Planet Debian

Planet Debian - http://planet.debian.org/

Dirk Eddelbuettel: Introducing RcppParallel: Getting R and C++ to work (some more) in parallel

16 July, 2014 - 22:56
A common theme over the last few decades has been that we could afford to simply sit back and let computer (hardware) engineers take care of increases in computing speed thanks to Moore's law. That same line of thought now frequently points out that we are getting closer and closer to the physical limits of what Moore's law can do for us.

So the new best hope is (and has been) parallel processing. Even our smartphones have multiple cores, and most if not all retail PCs now possess two, four or more cores. Real computers, aka somewhat decent servers, can be had with 24, 32 or more cores as well, and all that is before we even consider GPU coprocessors or other upcoming changes.

And sometimes our tasks are embarrassingly simple, as is the case with many data-parallel jobs: we can use higher-level operations such as those offered by the base R package parallel to spawn multiple processing tasks and gather the results. I covered all this in some detail in previous talks on High Performance Computing with R (and you can also consult the Task View on High Performance Computing with R which I edit).
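As a minimal illustration of that style of data-parallel work (the parallel functions are real; the toy workload is made up for this sketch):

library(parallel)

## toy data-parallel job: one hundred bootstrap means of a large vector,
## fanned out over four worker processes
x <- rnorm(1e6)
res <- mclapply(1:100,
                function(i) mean(sample(x, length(x), replace = TRUE)),
                mc.cores = 4)   ## mclapply() forks; on Windows use parLapply() with a cluster instead
head(unlist(res))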

But sometimes we can't use data-parallel approaches. Hence we have to redo our algorithms. Which is really hard. R itself has been relying on the (fairly mature) OpenMP standard for some of its operations. Luke Tierney's (awesome) keynote in May at our (sixth) R/Finance conference mentioned some of the issues related to OpenMP. One which matters is that OpenMP works really well on Linux, and either not so well (Windows) or not at all (OS X, due to the usual issue with the gcc/clang switch enforced by Apple; the good news is that the OpenMP toolchain is expected to make it to OS X in some more performant form "soon"). R is still expected to make wider use of OpenMP in future versions.

Another tool which has been around for a few years, and which can be considered equally mature, is the Intel Threaded Building Blocks library, or TBB. JJ recently started to wrap this up for use by R. The first approach resulted in a (now superseded, see below) package TBB. But hardware and OS issues bite once again, as the Intel TBB does not really build that well for the Windows toolchain used by R (which is based on MinGW).

(And yes, there are two more options. But Boost Threads requires linking which precludes easy use as e.g. via our BH package. And C++11 with its threads library (based on Boost Threads) is not yet as widely available as R and Rcpp which means that it is not a real deployment option yet.)

Now, JJ, being as awesome as he is, went back to the drawing board and integrated a second threading toolkit: TinyThread++, a small header-only library without further dependencies. Not as feature-rich as Intel Threaded Building Blocks, but at least available everywhere. So a new package RcppParallel, so far only on GitHub, wraps around both TinyThread++ and Intel Threaded Building Blocks and offers a consistent interface available on all platforms used by R.

Better still, JJ also authored several pieces demonstrating this new package for the Rcpp Gallery:

All four are interesting and demonstrate different aspects of parallel computing via RcppParallel. But the last article is key. Based on a question by Jim Bullard, and then written with Jim, it shows how a particular matrix distance metric (which is missing from R) can be implemented in a serial manner in both R, and also via Rcpp. The key implementation, however, uses both Rcpp and RcppParallel and thereby achieves a truly impressive speed gain as the gains from using compiled code (via Rcpp) and from using a parallel algorithm (via RcppParallel) are multiplicative! Between JJ's and my four-core machines the gain was between 200 and 300 fold---which is rather considerable. For kicks, I also used a much bigger machine at work which came in at an even larger speed gain (but gains become clearly sublinear as the number of cores increases; there are however some tuning parameters).
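To give a flavour of the interface, here is a minimal sketch of an RcppParallel worker; the sqrt payload is merely illustrative, but the Worker/parallelFor pattern is the one the Gallery articles use:

// [[Rcpp::depends(RcppParallel)]]
#include <Rcpp.h>
#include <RcppParallel.h>
#include <algorithm>
#include <cmath>
using namespace RcppParallel;

// a Worker applies its payload to whatever slice [begin, end) it is handed
struct SqrtWorker : public Worker {
    const RVector<double> input;
    RVector<double> output;
    SqrtWorker(const Rcpp::NumericVector in, Rcpp::NumericVector out)
        : input(in), output(out) {}
    void operator()(std::size_t begin, std::size_t end) {
        std::transform(input.begin() + begin, input.begin() + end,
                       output.begin() + begin, ::sqrt);
    }
};

// [[Rcpp::export]]
Rcpp::NumericVector parallelSqrt(Rcpp::NumericVector x) {
    Rcpp::NumericVector out(x.size());
    SqrtWorker w(x, out);
    parallelFor(0, x.size(), w);   // slices the index range across threads
    return out;
}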

So these are exciting times. I am sure there will be lots more to come. For now, head over to the RcppParallel package and start playing. Further contributions to the Rcpp Gallery are not only welcome but strongly encouraged.

Vincent Sanders

16 July, 2014 - 19:04
It is no great secret that my colleagues at Collabora have been doing work with the Raspberry Pi Foundation.

My desk is very near Marco and I often see him working with the various Pi boards. Recently he obtained one of the new B+ units for testing and I thought it looked a little sad sat naked on his desk.

To remedy this bare board problem I designed and built a laser-cut case for him, and now that the B+ has been publicly released I can make the design freely available.

The design is completely original, though it is inspired by several other plastic "clip" type designs I have seen. Originally I created and debugged the case design for my parallella, though tweaking it for the Pi was pretty easy.

The design is under a CC attribution licence, and I ought to say that my employer is in no way responsible for this; it's all my own fault.

Wouter Verhelst: Reprepro for RPM

16 July, 2014 - 18:43

Dear lazyweb,

reprepro is a great tool. I hand it some configuration and a bunch of packages, and it creates the necessary directory structure, moves the packages to the right location, and generates a (signed) Debian package repository. Obviously it would be possible to do all that reprepro does by hand—by calling things like cp and dpkg-scanpackages and gpg directly—but it's easy to forget a step when doing so, and having a tool that just does things for me is wonderful. The fact that it does so only on request (i.e., when I know something has changed, rather than "once every so often") is also quite useful.
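For reference, the configuration really is just a stanza per suite (a minimal sketch; the origin, codename and key ID are made up):

# conf/distributions
Origin: example.org
Label: Example repository
Codename: wheezy
Architectures: amd64 i386 source
Components: main
Description: Example Debian package repository
SignWith: ABCD1234

After which something like reprepro -b /srv/repo includedeb wheezy foo_1.0-1_amd64.deb does all of the copying, index generation and signing in one step.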

At work, I currently need to maintain a bunch of package repositories. The Debian package archives there are maintained with reprepro, but I currently maintain the RPM archives pretty much by hand: create the correct directories, copy the right files to the right places, run createrepo over the correct directories (and in the case of the OpenSUSE repository, also run gpg), and a bunch of other things specific to our local installation. As if to prove my above point, apparently I forgot to do a few things there, meaning some of the RPM repositories didn't actually work correctly, and my testing didn't catch it.
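Spelled out, the manual dance looks roughly like this (a sketch; the paths are invented, and the gpg step applies to the OpenSUSE case):

mkdir -p /srv/rpm/opensuse-13.1/x86_64
cp foo-1.0-1.x86_64.rpm /srv/rpm/opensuse-13.1/x86_64/
createrepo /srv/rpm/opensuse-13.1/x86_64
# sign the generated repository metadata
gpg --detach-sign --armor /srv/rpm/opensuse-13.1/x86_64/repodata/repomd.xml

Each of those steps is easy to forget, which is exactly the point above.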

Which makes me wonder how RPM package repositories are usually maintained. When one needs to maintain just a bunch of packages for a number of servers, well, running createrepo manually isn't too much of a problem. When it gets beyond one's own systems, however, and when you need to support multiple builds for multiple versions of multiple distributions, having to maintain all those repositories by hand is probably not the best idea.

So, dear lazyweb: how do large RPM repositories maintain state of the packages, the distributions they belong to, and similar things?

Please don't say "custom scripts"

Juliana Louback: Become an Open-source Contributor Video Conference

16 July, 2014 - 18:06

A couple weeks ago I was contacted by Yehuda Korotkin through one of the Debian mailing lists. Yehuda is a tech professor at one of Israel’s leading colleges for women. He proposed a video conference to present the open source community to his class and explain how they can contribute to open source projects. I volunteered to participate and last Monday (July 14th 2014) we had our virtual meeting, which I hope was the first of many.

Shauna G. also participated from Boston, making it an Israel - USA - Brazil meetup. Pretty impressive, eh? The conference was hosted on Google Hangouts and the video is available for viewing here. Run time is around 2.5 hours, so to save you time I've kindly posted a summary of the conference! To be frank, I look ghastly half the time so I'd much prefer that you read this post.

First there was a Q&A period which was more of a discussion panel, followed by a hands-on session where the girls made their first contribution to an open source project. Here’s the gist of it:

Q&A

  • What is open source [software]? Software that allows free access to its source code (ergo ‘open’), permitting analysis and any modifications to the original code. All these permissions are contained in a license included in the software. Note that the term ‘open source’ is now applied to a lot of things other than software, and I assume there’s a different definition for those cases.

  • Who is the community behind open source software? Shauna pointed out that there are many kinds of projects and many kinds of communities backing them. As an example of a for-profit model, Shauna cited Red Hat, which offers Linux (an open source OS) for free and profits from support services. Some projects are supported by NGOs (example: Center for Open Science) with a specific objective; a series of other projects are volunteer-based or are the result of a hobby project. The contributors vary accordingly; they are not only programmers but also people with a non-technical background. In sum, there's a lot of variety in the open source community as a whole, and details of structure vary case to case.

  • Why is it important that women get involved in technology? Men and women are different for a series of reasons, among them our biological composition, influences of society and life experiences. Technology is for everyone, male or female, and we need to be able to design products that will suit everyone well. If the development team is composed of only one sex, it's likely that factors will be overlooked or ignored, and the result won't be a democratic experience. Shauna gave some examples of this, such as a voice-powered location search software that could easily identify stereotypically ‘male’ keywords but not many of interest to females.

    In addition to this, it's known that there is a severe workforce deficiency in the tech industry. One proposed solution is to encourage more women to follow a career in tech. The argument is based on the assumption that any boy who is remotely interested in tech has a high likelihood of studying tech, but not every girl. Often girls are not encouraged to study STEM-related topics, and in many places tech is considered inappropriate for women. I've read that tech companies' engineering teams are around 12% female and 88% male. That number is consistent with the stats of female college grads for STEM majors. I think boys who love tech will keep getting into tech, but there are a lot of girls who could be great engineers if they only gave it a shot. Women should be encouraged to pursue a tech career not only so they'll get to work in such an incredible field but also so they can help assuage the gap in supply and demand present in the tech workforce today.

  • Why is open source important? We don't always get everything we want/need from off-the-shelf software. The great thing about open source is that you are free to tweak the code at will and add whatever features you'd like. Sometimes there are even more important necessities, such as security requirements. With open source you don't need to just take the manufacturers' word that the code is bulletproof; open it and check it out yourself. Another factor that increases the value of open source is its stability. As Linus Torvalds said, “given enough eyeballs, all bugs are shallow”. With so many developers opening, studying and testing the code, bugs are quickly identified and removed. The fact that open source software is ‘free’ doesn't reflect negatively on the quality; to the contrary, plenty of open source software is the best that's out there.

  • Why is it important to contribute to open source? To start, it's not written anywhere that you have to contribute. That said, it sure is nice if you do. I think there are two main reasons why people contribute. One is they want to ‘give back’ to the open source community. The other is they want to add some missing detail, feature or entire product. Of course, usually if you need something, someone else does as well, so by supplying your own need you end up helping many people along the way. But one thing that I think applies especially to students is that it's a great way to gain experience. You have to fit in some practical learning along with all that theory, and one can only be so enthusiastic about the CS projects done in school that are usually discarded at the end of the semester. By contributing to an open source project, you are doing something that's real and that people will actually use. You also have access to an incredible network of mentors who are real pros and more than willing to help you out. Maybe I've just been lucky, but they are really nice people who won't snicker if you ask a stupid question. Well, maybe they do. But I haven't seen it happen and I'm the queen of stupid questions. Many times this is their pet project, so they're thrilled to have your contribution. Another great thing is that you can choose what you want to work with. It's easy to find a tech internship out there; what's not easy is finding an interesting internship. You might work with something you dislike or (gasp) end up just bringing people coffee. But we do that stuff because we need the work experience. With open source you have a huge array of projects you can work on, and one of them is certain to suit your interests.

  • How to become a contributor? When using an open source software, you might notice things you’d like to change, or actual bugs. Let the development team know about it and be sure to also let them know if you think you can figure out the fix. And if that hasn’t happened yet, try exploring Github’s open source repositories and check out the ‘Issues’ option (upper right corner). There’s usually a list of bugs or wishlist features that you just might be the man/woman for.

Hands on

For the practical part of our conference, Yehuda's students contributed to JSCommunicator's Issue #5, which requests i18n support for major languages. JSCommunicator is a SIP communication tool developed in HTML and JavaScript, currently supporting audio/video calls and SIP SIMPLE instant messaging.

First we set up Github on their local machines and forked the JSCommunicator repo. We learned the basic git commands needed to push changes to their remote repositories and to create a pull request. This is explained in further detail here.
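For those following along at home, the sequence boils down to something like this (a sketch; the fork URL is a placeholder):

git clone https://github.com/<username>/jscommunicator.git   # clone your fork
cd jscommunicator
git checkout -b hebrew-translation          # topic branch for the change
# ... edit the i18n resource files ...
git add -A
git commit -m "Add Hebrew translation"
git push origin hebrew-translation          # then open a pull request on GitHub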

At the end of the conference, with nearly an hour of overtime due to my tendency to be overly verbose, Yehuda's class made their first contribution with a Hebrew translation for JSCommunicator!

All in all, it was great to virtually meet Yehuda and his epic students, who all seemed to be very clever engineers. I'm looking forward to our next meetup once I gain some time-management skills.

Raphaël Hertzog: Spotify migrates 5000 servers from Debian to Ubuntu

16 July, 2014 - 16:07

Or yet another reason why it’s really important that we succeed with Debian LTS. Last year we heard of Dreamhost switching to Ubuntu because they can maintain a stable Ubuntu release for longer than a Debian stable release (and this despite the fact that Ubuntu only supports software in its main section, which misses a lot of popular software).

A few days ago, we learned that Spotify took a similar decision:

A while back we decided to move onto Ubuntu for our backend server deployment. The main reasons for this was a predictable release cycle and long term support by upstream (this decision was made before the announcement that the Debian project commits to long term support as well.) With the release of the Ubuntu 14.04 LTS we are now in the process of migrating our ~5000 servers to that distribution.

This is just further proof that we have to provide long term support for Debian releases if we want to stay relevant in big deployments.

But the task is daunting and it’s difficult to find volunteers to do the job. That’s why I believe that our best answer is to get companies to contribute financially to Debian LTS.

We managed to convince a handful of companies already, and July is the first month where paid contributors have joined the effort for a modest participation of 21 work hours (watch out for Thorsten Alteholz and Holger Levsen on debian-lts and debian-lts-announce). But we need to multiply this figure by at least 5 or 6 to do a proper job of maintaining Debian 6.

So grab the subscription form and have a chat with your management. It’s time to convince your company to join the initiative. Don’t hesitate to get in touch if you have questions or if you prefer that I contact a representative of your company. Thank you!


Jon Dowland: Mac

16 July, 2014 - 15:18

My job exposes me to a large variety of computing systems and I regularly use Mac, Windows and Linux desktops. My main desktop environment at home and work has been Debian GNU/Linux for over 10 years. However every now and then I take a little "holiday" and use something else for a few weeks. Often I'm spurred on by some niggle or other on the GNOME desktop, or burn-out with whatever the current contentious issue of the moment is in Debian. Usually I switch to Windows and use it as an excuse to play some computer games.

Last November I had just such an excuse to take a holiday but this time I opted to go for Mac. I had a back-log of Mac issues to investigate at work anyway.

I haven't looked back.

It appears I have switched for good. I've been meaning to write about this for some time, but I couldn't quite get the words right. I doubted I could express my frustrations in a constructive, helpful way: even if my experiences are useful and my discoveries valuable, perhaps I would put them across in a way that seemed inciteful rather than insightful. I wasn't sure anyone cared. Certainly the GNOME community doesn't seem interested in feedback.

It turns out that one person that doesn't care is me: I didn't realise just how broken the F/OSS desktop is. The straw that broke the camel's back was the file manager replacing type-ahead find with a search, but (to seamlessly switch metaphor) it turns out I'd been cut a thousand times already. I'm not just on the other side of the fence, I'm several fields away.

Sometimes community people write about their concerns with whether they're going in the right direction, or how to tell the difference between legitimate complaints, trolls and whiners. When I look at conferences now, the sea of Thinkpads was replaced with a sea of Apple Macs a long time ago, and the Thinkpads haven't come back. I'd suggest: don't worry about the whiners. Worry about the leavers.

What does this mean for my Debian involvement? Well, you can't help but have noticed that I've done very little this year. I've written nearly exclusively about music so far. The good news is: I still regularly use Debian, and I still intend to stay involved, just not on the desktop. I'm essentially only maintaining two packages now, lhasa and squishyball. I might pick up a few more (possibly archivemail if the situation doesn't improve) but I'm happy with a low package load; I'd like to make sure the ones I do maintain are maintained well. The sum of all my Debian efforts this year have been to get these two (or three) ship-shape. I have a bunch of other things I'd like to achieve in Debian which are not packages, and a larger package load would just distract from them. (We really are too package-oriented in Debian.)

Michal Čihař: New UI for Weblate

16 July, 2014 - 00:00

For quite some time I have been working on a new UI for Weblate. As time is always limited, progress is not as fast as I would like, but I think it's time to show the current status to a wider audience.

Almost all pages have been rewritten; the major missing parts are the zen mode and source-string review. So it's time to play with it on our demo server. The UI is responsive, so it works more or less on different screen sizes, though I really don't expect people to translate on a mobile phone, so not much tweaking was done for small resolutions.

Anyway I'd like to hear as much feedback as possible :-).

Filed under: English phpMyAdmin SUSE Weblate

Mario Lang

15 July, 2014 - 16:15
Mixing vinyl again

The turntables have me back, after quite a long mixing break.

I used to do straight 4-to-the-floor, mostly acid or hardtek. You can find an old mix of mine on SoundCloud. This one actually dates back to 2006.

But currently I am more into drum and bass. It is an interesting mixing experience, since it is considerably harder. Here is a small but very recent minimix. Experts in the genre might notice that I am mostly spinning stuff from BlackOutMusicNL, admittedly my favourite label right now.

Laura Arjona: New GPG Key!

14 July, 2014 - 03:47

Achievement unlocked: I have a new GPG key:

0xF22674467E4AF4A3

pub   4096R/7E4AF4A3 2014-07-13 [expires: 2016-07-12]
Fingerprint = 445E 3AD0 3690 3F47 E19B  37B2 F226 7446 7E4A F4A3
uid                  Laura Arjona Reina <laura.arjona@upm.es>
uid                  Laura Arjona Reina <larjona@fsfe.org>
uid                  Laura Arjona Reina <larjona99@gmail.com>
sub   3072R/CC706B74 2014-07-13 [expires: 2016-07-12]
sub   3072R/7E51465F 2014-07-13 [expires: 2016-07-12]
sub   4096R/74C23D6E 2014-07-13 [expires: 2016-07-12]

The master key is 4096-bit and stored in a safe place, and the two 3072-bit subkeys are stored on an FSFE Smartcard (I cannot store 4096-bit keys there).

I have carefully followed the FSFE SmartCard Howto and “Creating the perfect GPG keypair” by Alex Cabal for strengthening hash preferences and creating a revocation certificate.
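The step that moves the subkeys onto the card boils down to something like this (a sketch following those howtos, with prompts abbreviated):

gpg --edit-key 0xF22674467E4AF4A3
gpg> key 1          # select the first subkey
gpg> keytocard      # move it onto the smartcard
gpg> key 1          # deselect
gpg> key 2
gpg> keytocard
gpg> save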

It seems everything works as intended. The passphrase is strong and this time I will not forget it.

As a first celebration, half a litre of ice cream is waiting for me after dinner :)

People who know me and are around Madrid, please send me an encrypted mail as a test or for normal communication, and ping me to meet and sign keys :)

One more step towards involvement in Debian and free software, controlling my digital life and communications, and becoming familiar with these technologies so I can teach them to my son as ‘the natural way’.

Yay!


Filed under: My experiences and opinion, Tools Tagged: Contributing to libre software, Debian, encryption, English, Free Software, gpg, libre software, Moving into free software, mswl-cases

Steve Kemp: A brief twitter experiment

14 July, 2014 - 03:08

So I've recently posted a few links on Twitter, and I see followers clicking them. But I also see random hits.

Tonight I posted a link to http://transient.email/, a domain I use for "anonymous" emailing, specifically to see which bots hit the URL.

Within two minutes I had 15 visitors, the first few of which were:

IP               User-Agent                                                                              Request
199.16.156.124   Twitterbot/1.0;                                                                         GET /robots.txt
199.16.156.126   Twitterbot/1.0;                                                                         GET /robots.txt
54.246.137.243   python-requests/1.2.3 CPython/2.7.2+ Linux/3.0.0-16-virtual                             HEAD /
74.112.131.243   Mozilla/5.0 ();                                                                         GET /
50.18.102.132    Google-HTTP-Java-Client/1.17.0-rc (gzip)                                                HEAD /
50.18.102.132    Google-HTTP-Java-Client/1.17.0-rc (gzip)                                                HEAD /
199.16.156.125   Twitterbot/1.0;                                                                         GET /robots.txt
185.20.4.143     Mozilla/5.0 (compatible; TweetmemeBot/3.0; +http://tweetmeme.com/)                      GET /
23.227.176.34    MetaURI API/2.0 +metauri.com                                                            GET /
74.6.254.127     Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp);    GET /robots.txt

So what jumps out? The twitterbot makes several requests for /robots.txt, but never actually fetches the page itself, which is interesting because there is indeed a prohibition in the supplied /robots.txt file.

A surprise was that both Google and Yahoo seem to follow Twitter links in almost real-time. Though the Yahoo site parsed and honoured /robots.txt, the Google spider seemed to only make HEAD requests - and never actually look for the content or the robots file.

In addition to this a bunch of hosts from the Amazon EC2 space made requests, which was perhaps not a surprise. Some automated processing, and classification, no doubt.

Anyway beer. It's been a rough weekend.

Dimitri John Ledkov: Hacking on launchpadlib

13 July, 2014 - 20:32
So here is a quick sample of my progress playing around with launchpadlib using lp-shell from lptools:
In [1]: lp
Out[1]: <launchpadlib.launchpad.Launchpad at 0x7f49ecc649b0>

In [2]: lp.distributions
Out[2]: <launchpadlib.launchpad.DistributionSet at 0x7f49ddf0e630>

In [3]: lp.distributions['ubuntu']
Out[3]: <distribution at https://api.launchpad.net/1.0/ubuntu>

In [4]: lp.distributions['ubuntu'].display_name
Out[4]: 'Ubuntu'

In [5]: lp.distributions['ubuntu'].summary
Out[5]: 'Ubuntu is a complete Linux-based operating system, freely available with both community and professional support.'

In [7]: import sys; print(sys.version)
3.4.1 (default, Jun 9 2014, 17:34:49)
[GCC 4.8.3]

There is not much yet, but it's a start. A python3 port of launchpadlib is coming soon. It has been attempted a few times before and I am leveraging that work. Porting this stack has proven to be the most difficult python3 port I have ever done. But there is always python-libvirt that still needs porting ;-)

Some of the above exists only as merge proposals against launchpadlib & lazr.restfulclient, and requires modules not yet packaged in the archive. When trying it out, I'm still getting a lot of run-time asserts and things that haven't been picked up by e.g. pyflakes3 and have not been unit-tested yet.

Russ Allbery: Review: Neptune's Brood

13 July, 2014 - 12:54

Review: Neptune's Brood, by Charles Stross

Series: Freyaverse #2
Publisher: Ace
Copyright: July 2013
ISBN: 1-101-62453-1
Format: Kindle
Pages: 325

Neptune's Brood is set in the same universe as Saturn's Children, but I wouldn't call it a sequel. It takes place considerably later, after substantial expansion of the robot civilization to the stars, and features entirely different characters (or, if there was overlap, I didn't notice). It also represents a significant shift in tone: while Saturn's Children is clearly a Heinlein pastiche and parody, Neptune's Brood takes its space opera more seriously. There is some situational humor — assault auditors, for example — but this book is played mostly straight, and I detected little or no Heinlein. This is Stross fleshing out his own space opera concept.

This being Stross, that concept is not exactly conventional. This is a space opera about economics. Specifically, it's a space opera about interstellar economics, a debt pyramid, and a very interesting remapping of the continual growth requirements of capitalism to the outward expansion of colonization. The first-person protagonist comes from a "family" (as in Saturn's Children, the concept exists but involves rather more aggressive control of the instantiated "children") of bankers, but she is a forensic accountant and historian who specializes in analysis of financial scams. As you might expect, this is a significant clue about the plot.

Neptune's Brood opens with Krina in search of her sister. She is supposed to be an itinerant scholar, moving throughout colonized space to spend some time with various scattered sisters, spreading knowledge and expanding her own. But it's clear from the start of the book that something else is going on, even before an assassin with Krina's face appears on her trail. Unfortunately, it takes roughly a third of the book to learn just what is happening beneath the surface, and most of that time is spent in a pointless interlude on a flying cathedral run by religious fanatics.

The religion is a callback to Saturn's Children: robots who are trying to spread original humanity (the Fragiles) to the stars. This mostly doesn't go well, and is going particularly poorly for the ship that Krina works for passage on. But this is a brief gag that I thought went on much too long. The plot happens to Krina for this first section of the book rather than the other way around, little of lasting significance other than some character introductions occurs, and the Church itself, while playing a minor role in the later plot, is not worth the amount of attention that it gets. The best parts of the early book are the interludes in which Krina explains major world concepts to the reader. These are absolutely blatant infodumping, and I'm not sure how Stross gets away with them, but somehow he does, at least for me. They remind me of some of his blog posts, except tighter and fit into an interesting larger structure.

Thankfully, once Krina finally arrives on Shin-Tethys, the plot improves considerably. There was a specific moment for me when the book became interesting: when Krina finds her sib's quarters in Shin-Tethys and analyzes what she finds there. It's the first significant thing in the book that she does rather than have done to her or thrust upon her, and she's a much better character when she's making decisions. This is also about the point where Stross starts fully explaining slow money, which is key to both the economics and the plot, and the plot starts to unwind its various mysteries and identify the motives of the players.

Even then, Krina suffers from a lack of agency. Only at rare intervals does she get a chance to affect the story. Most of what she did of relevance to this book she did in the past, and while those descriptions of the backstory are interesting, they don't entirely make up for a passive protagonist. Thankfully, the other characters are varied and interesting enough, and the political machinations and cascading revelations captivating enough, that the last part of the book was very satisfying even with Krina along for the ride.

This is a Stross novel, so it's full of two-dollar technical words mixed with technobabble. However, it shares with Saturn's Children the recasting of robots as the norm and fleshy humans as the exception, which means much of the technobabble is a straight substitution for our normal babble about meaty bodies and often works as an alienation technique. That makes it a bit more tolerable for me, although I still wished Stross would turn down the manic vocabulary in places. This bothers some people more than others; if you had no trouble with Accelerando, Neptune's Brood will pose no problems.

I don't think the first section of this book was successful, but I liked the rest well enough to recommend it. If you like your space opera with a heavy dose of economics, a realistic attitude towards deep space exploration without faster-than-light technology, and a realistic perspective on the hostility of alien planets to Earth life, Neptune's Brood is a good choice. And any book that quotes David Graeber's Debt, and whose author has clearly paid attention to its contents, wins bonus points from me.

Rating: 8 out of 10

Dirk Eddelbuettel: RcppArmadillo 0.4.320.0

13 July, 2014 - 07:45
While I was out at the (immensely impressive and equally enjoyable) useR! 2014 conference at UCLA, Conrad provided a bug-fix release 4.320 of Armadillo, the nifty templated C++ library for linear algebra. I quickly rolled that into RcppArmadillo release 0.4.320.0 which has been on CRAN and in Debian for a good week now.

This release fixes some minor things with the sparse and dense eigensolvers (as well as one RNG issue probably of lesser interest to R users deploying the RNGs from R) as shown in the NEWS entry below.

Changes in RcppArmadillo version 0.4.320.0 (2014-07-03)
  • Upgraded to Armadillo release Version 4.320 (Daintree Tea Raider)

    • expanded eigs_sym() and eigs_gen() to use an optional tolerance parameter

    • expanded eig_sym() to automatically fall back to standard decomposition method if divide-and-conquer fails

    • automatic installer enables use of C++11 random number generator when using gcc 4.8.3+ in C++11 mode

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.
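As an aside, the new tolerance argument of the sparse eigensolvers mentioned in the NEWS entry above can be exercised like so (a minimal standalone C++ sketch, not code from the package itself):

#include <armadillo>

int main() {
    // build a random sparse symmetric matrix
    arma::sp_mat B = arma::sprandu<arma::sp_mat>(1000, 1000, 0.01);
    arma::sp_mat A = 0.5 * (B + B.t());

    arma::vec eigval;
    arma::mat eigvec;
    // five largest-magnitude eigenvalues, with an explicit tolerance
    arma::eigs_sym(eigval, eigvec, A, 5, "lm", 1e-10);
    eigval.print("eigenvalues:");
    return 0;
}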

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Paul Tagliamonte: Saturday's the new Sunday

12 July, 2014 - 08:41

Hello, World!

For those of you who enforce my Sundays on me (keep doing that, thank you!), I'll be swapping my Saturdays with my Sundays.

That’s right! In this new brave world, I’ll be taking Saturdays off, not Sundays. Feel free to pester me all day on Sunday, now!

This means, as a logical result, I will not be around tomorrow, Saturday.

Much love.

Matt Brown: GPG Key Management Rant

12 July, 2014 - 08:17

2014 and it’s still annoyingly hard to find a reasonable GPG key management system for personal use… All I want is to keep the key material isolated from any Internet connected host, without requiring me to jump through major inconvenience every time I want to use the key.

An HSM/Smartcard of some sort is an obvious choice, but they all suck in their own ways:
* FSFE smartcard – it's a smartcard, so it requires a reader, and readers are generally not particularly portable compared to a USB stick.
* Yubikey Neo – restricted to 2048 bits, doesn’t allow imports of primary keys (only subkeys), so you either generate on device and have no backup, or maintain some off-device primary key with only subkeys on the Neo, negating the main benefits of it in the first place.
* Smartcard HSM – similar problems to the Neo, plus not really well supported by GPG (needs 2.0 with specific supporting module version requirements).
* Cryptostick – made by some Germans, sounds potentially great, but perpetually out of stock.

Which leaves basically only the “roll your own” dm-crypt+LUKS usb stick approach. It obviously works well, and is what I currently use, but it’s a bunch of effort to maintain, particularly if you decide, as I have, that the master key material can never touch a machine with a network connection. The implication is that you now need to keep an airgapped machine around, and maintain a set of subkeys that are OK for use on network connected machines to avoid going mad playing sneakernet for every package upload.
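For the record, the subkey juggling itself is only a handful of commands (a sketch; the key ID is a placeholder):

# on the airgapped machine: add signing/encryption subkeys to the primary key
gpg --edit-key 0xDEADBEEF
gpg> addkey        # e.g. RSA (sign only), 4096 bits
gpg> save

# export only the secret subkeys for the networked machines;
# the primary secret key never leaves the airgapped host
gpg --export-secret-subkeys 0xDEADBEEF > subkeys.gpg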

The ideal device would be a USB form factor, supporting import of 4096 bit keys, across all GPG capabilities, but with all crypto ops happening on-device, so the key material never leaves the stick once imported. Ideally also cheap enough (e.g. ~100ish currency units) that I can acquire two for redundancy.

As far as I can tell, such a device does not exist on this planet. It’s almost enough to make a man give up on Debian and go live a life of peace and solitude with the remaining 99.9% of the world who don’t know or care about this overly complicated mess of encryption we’ve wrought for ourselves.

end rant.

Steve Kemp: A partial perl-implementation of Redis

12 July, 2014 - 04:36

So recently I got into trouble running Redis on a host, because the data no longer fits into RAM.

As an interim measure I fixed this by bumping the RAM allocated to the guest, but a real solution was needed. I figure there are three real alternatives:

  • Migrate to Postgres, MySQL, or similar.
  • Use an alternative Redis implementation.
  • Do something creative.

Looking around I found a couple of Redis-alternatives, but I was curious to see how hard it would be to hack something useful myself, as a creative solution.

This evening I spotted Protocol::Redis, which is a perl module for decoding/encoding data to/from a Redis server.

Thinking "Ahah" I wired this module up to AnyEvent::Socket. The end result was predis - A perl-implementation of Redis.

It's a limited implementation which stores data in an SQLite database, and currently has support for:

  • get/set
  • incr/decr
  • del/ping/info

It isn't hugely fast, but it is fast enough, and it should be possible to use alternative backends in the future.

I suspect I'll not add sets/hashes, but it could be done if somebody was keen.

Russ Allbery: Net::Duo 1.00

12 July, 2014 - 04:22

This is the first release of Net::Duo, which provides an object-oriented Perl interface for the Duo Security REST APIs. It attempts to abstract some of the API details and provide an object-oriented view of the returned objects in order to make use of the API in Perl code more natural than dealing with JSON data structures directly. Currently, some parts of the Auth and Admin APIs are implemented, alongside generic methods to call any of the JSON-based APIs.

The approach I took with this module was a bit of a science experiment, and I'm still not entirely sure what I think about the results. Duo Security offers sample Perl code that provides the equivalent of the call and call_json Net::Duo methods but stops there. One sends in data structures and gets back data structures from JSON and manipulates everything in that format.

I prefer a more object-oriented style, and want the module to do a bit more of the work for me, so this implementation wraps some of the APIs in objects with method calls. For updates, there are setters for the object itself and then a commit method to push the changes to Duo. This requires more implementation effort, and each API that should get richer treatment has to be modelled, but the resulting code looks like more natural object-oriented code.
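A hypothetical sketch of the style I mean (the setter-plus-commit pattern is as described above, but the constructor arguments and accessor names here are purely illustrative, not necessarily the module's real API):

use Net::Duo::Admin;

# construct a client, tweak a user locally, then push the change to Duo
my $duo  = Net::Duo::Admin->new({ key_file => '/etc/duo/admin.json' });  # illustrative
my ($user) = $duo->users;             # illustrative accessor
$user->email('jdoe@example.com');     # setter changes only the local object
$user->commit;                        # commit pushes the change to Duo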

I wasn't completely sure going in whether the effort-to-reward tradeoff made sense, and having finished the module sufficiently for Stanford's immediate needs, I'm still not sure. It was certainly more effort to write the base module this way, but on the other hand it also meant that I could map Perl notions of true and false to Duo's and provide much simpler methods for common operations. I still think this will make the code more maintainable in the long run, but I think it's within the margin of difference of opinion.

Regardless, you can get the latest version from the Net::Duo distribution page and shortly from CPAN as well.

Joseph Bisch: GSoC Update

12 July, 2014 - 00:30

It has been a couple of months since my last GSoC update and a lot has happened since then.

You can view the web interface at metrics.debian.net. There is a dynamic interface viewable with JavaScript enabled, and a static interface viewable with JS disabled. I am currently working on the dynamic interface.

In the weeks since my last blog post I have accomplished a lot. There is support for pull and push metrics. Pull metrics run on the debmetrics server and pull data from some source. Push metrics run on a remote server and push data to the debmetrics server via HTTP POST. There is flask code and the static interface portion uses a general route, so that the flask code does not need to be modified every time a metric is added. There is a makefile that runs manifest2index.py and manifest2orm.py to generate the index page of the web interface and the SQLAlchemy model code respectively. I have a config file that allows the user to set the location of the manifest, pull_scripts, and graph_scripts directories. I have a debmetrics.wsgi file for easy deployment of debmetrics.
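As a rough illustration of the general-route idea (a sketch, not the actual debmetrics code; all names are invented):

from flask import Flask, render_template

app = Flask(__name__)

# a single generic route serves every static metric page, so adding a
# new metric to the manifest requires no new flask code
@app.route('/<metric_name>')
def metric_page(metric_name):
    return render_template('metric.html', metric=metric_name)

if __name__ == '__main__':
    app.run()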

There is a single layout.html that is the base layout for all the other pages.

Nosetests is used for tests. Sphinx and readthedocs are used for the documentation. I wrote a minified_grabber script that downloads all the JS and css files that are not included with debmetrics itself. That helps make deployment easy.

If you want to take a look at the code you can view the repository.

I still have more work to do before the end of GSoC, mostly with the dynamic web interface and packaging of debmetrics.

To keep up to date on my progress, you can either read this blog, or you can read the soc-coordination mailing list.

Russell Coker: Improving Computer Reliability

11 July, 2014 - 12:51

In a comment on my post about Taxing Inferior Products [1] Ben pointed out that most crashes are due to software bugs. Both Ben and I work on the Debian project and have had significant experience of software causing system crashes for Debian users.

But I still think that the widespread adoption of ECC RAM is a good first step towards improving the reliability of the computing infrastructure.

Currently when software developers receive bug reports they always wonder whether the bug was caused by defective hardware. So when bugs can’t be reproduced (or can’t be reproduced in a way that matches the bug report) they often get put in a list of random crash reports and no further attention is paid to them.

When a system has ECC RAM and a filesystem that uses checksums for all data and metadata we can have greater confidence that random bugs aren’t due to hardware problems. For example if a user reports a file corruption bug they can’t repeat that occurred when using the Ext3 filesystem on a typical desktop PC I’ll wonder about the reliability of storage and RAM in their system. If however the same bug report came from someone who had ECC RAM and used the ZFS filesystem then I would be more likely to consider it a software bug.

The current situation is that every part of a typical PC is unreliable. When a bug can be attributed to one of several pieces of hardware, the OS kernel and even malware (in the case of MS Windows) it’s hard to know where to start in tracking down a bug. Most users have given up and accepted that crashing periodically is just what computers do. Even experienced Linux users sometimes give up on trying to track down bugs properly because it’s sometimes very difficult to file a good bug report. For the typical computer user (who doesn’t have the power that a skilled Linux user has) it’s much worse, filing a bug report seems about as useful as praying.

One of the features of ECC RAM is that the motherboard can inform the user (either at boot time, after an NMI reboot, or through system diagnostics) of the problem so it can be fixed. A feature of filesystems such as ZFS and BTRFS is that they can inform the user of drive corruption problems, sometimes before any data is lost.

My recommendation of BTRFS in regard to system integrity does have a significant caveat: currently the system reliability decrease due to crashes outweighs the reliability increase due to checksums. This isn't all bad because at least when BTRFS crashes you know what the problem is, and BTRFS is rapidly improving in this regard. When I discuss BTRFS in posts like this one I'm considering the theoretical issues related to the design, not the practical issues of software bugs. That said, I've twice had a BTRFS filesystem seriously corrupted by a faulty DIMM on a system without ECC RAM.

Related posts:

  1. Reliability of RAID ZDNet has an insightful article by Robin Harris predicting the...
  2. Improving Blog Latency to Benefit Readers I just read an interesting post about latency and how...
  3. Banking with an Infected Computer Bruce Schneier summarised a series of articles about banking security...

Steve Kemp: Blogspam moved, redis alternatives being examined

10 July, 2014 - 16:45

As my previous post suggested, I'd been running a service for a few years, using Redis as a key-value store.

Redis is lovely. If your dataset will fit in RAM. Otherwise it dies hard.

Inspired by Memcached, which is a simple key=value store, redis allows for more operations: using sets, using hashes, etc, etc.

As it transpires I mostly set keys to values, so it crossed my mind last night that an alternative to rewriting the service to use a non-RAM-constrained store might be to juggle redis out and replace it with something else.

If it were possible to have a redis-compatible API which secretly stored the data in leveldb, sqlite, or even Berkeley DB, then that would solve my problem of RAM-constraints, and also be useful.

Looking around there are a few projects in this area: the nds fork of redis, ssdb, etc.

I was hoping to find a Perl Redis::Server module, but sadly nothing exists. I should look at the various node.js stub-servers which exist as they might be easy to hack too.

Anyway, the short version is that this might be a way forward; the real solution might be to use sqlite or postgres, but that would take a few days' work. For the moment the service has been moved to a donated guest and has 2GB of RAM instead of the paltry 512MB it was running on previously.

Happily the server is installed/maintained by my slaughter tool so reinstalling took about ten minutes - the only hard part was migrating the Redis-contents, and that's trivial thanks to the integrated "slave of" support. (I should write that up regardless though.)


Creative Commons License: the copyright of each article belongs to its respective author.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.