There was a regression in Linux 3.2.65, which unfortunately was included in this weekend's Debian stable point release (7.8) as I didn't point out the bug reports to the stable release team. At least some systems are now failing to resume after suspending to RAM; instead they reboot.
I have tracked down the change that caused this, and it should be fixed as part of a security update soon. The change is in code specific to 64-bit x86 (i.e. the Debian amd64 architecture). If you need suspend/resume to work, you might wish to avoid upgrading the linux-image-3.2.0-4-amd64 package until that future update.
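One way to hold back that package is sketched below. It assumes apt-mark is available; the run helper and the APPLY variable are made-up names for this illustration, so that the command is only printed unless you explicitly opt in.

```shell
pkg=linux-image-3.2.0-4-amd64

# Dry run by default: print the command instead of executing it.
# Set APPLY=1 and run as root to actually place the hold.
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        echo "$@"
    fi
}

run apt-mark hold "$pkg"
# Once the fixed kernel is available, release the hold again with:
#   apt-mark unhold linux-image-3.2.0-4-amd64
```

While the hold is in place, regular upgrades should leave the currently installed kernel package alone.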
A maintenance release of RcppClassic, now at version 0.9.6, went out to CRAN today. This package provides a maintained version of the otherwise deprecated first Rcpp API; no new projects should use it.
No changes were made to user-facing code. The Makevars file was changed to accommodate a request by the CRAN Maintainer to keep it free of GNU Make extensions. At the same time, we overhauled the look and feel of the (very short) vignette. Build instructions were updated both in the vignette and in the included example package. Other accumulated changes since the last release were updates to the DESCRIPTION and NAMESPACE files as well as two namespace-related R code updates.
Dear X2Go Community, dear friends,
as many of you may know, I have been contributing a considerable amount
of time to upstream-maintaining X2Go over the past 4 years. I provided
new X2Go components (Python X2Go, PyHoca X2Go Client, a publicly
available X2Go Session Broker, X2Go MATE Bindings, etc.) and focused on
making X2Go a wide-spread community project. For the last 2-3 years I
have been in the role of the X2Go project coordinator and various other
With the beginning of 2015, I will pass on several of those roles to
other people in the project, see the below list for already assigned and
- project/community coordinator (continued by Stefan Baur)
- development coordination (continued by Heinz-Markus Graesing,
very probably introducing some sort of agile development)
- release management (n.n.)
- i18n team leader (n.n.)
- package maintenance (continued by Oleksandr Shneyder)
- Git administrator (continued by Mihai Moldovan)
- bug tracker administrator (continued by Michael DePaulo)
The reasons for tremendously reducing my workload on X2Go are these:
- more time for development, less involvement in organizational tasks
- more time for paid/contracted work (also in the X2Go context)
- spend some of my time on doing Remote Desktop Computing research
- be more available to Debian and Ubuntu as a package maintainer
- be more available to my family
In several internal exchanges we (Heinz, Stefan, Mihai, Mike#2,
Thursday was my first day back with HP. I've joined Steve Geary's group to work on Linux support for “the machine”
I had a great time at Intel and wish my old team all the best.
First sequel to my Link Pack “series” (I’ll remove the quotes when it’s LP#05): Link Pack #01.
This time I’m going for fewer articles, to try to keep things less overwhelming. There’s no special theme, and I’m actually leaving out some nice things I read recently. On the plus side, that means I have good material for a Link Pack #03.
Also, I’m gonna stick with Link Pack as a name, because it’s good enough :-).
A Teenager’s View on Social Media: Written by an actual teen
A well thought-out and realistic take on how social media is being used nowadays by teenagers. I have seen the patterns the author describes, and actually follow many of them. Does that mean I’m still a teenager?
It’s interesting that the messaging and group-messaging part of the article is very US centric, or at least very US centric from my point of view. WhatsApp is the default messenger application south of the states, and fills the role of “somewhere you can chat with people without having to give them your full personal information”, that is, a place where you can chat with someone without running out of SMS and without adding them on Facebook (which would open them to stalk your whole profile and other friends). Some carriers in South America offer unlimited plans for specific applications like WhatsApp.
What Would Jesus Buy? (2007) — Full movie
“Reverend” Billy Talen from the Church of Stop Shopping Gospel is trying to prevent the Shopocalypse from happening. It’s an entertaining story of a group of funny guys and girls trying to share a message with comedy (that means A+ on my list). Simple and independent, a nice film.
13 Nutrition Lies That Made The World Sick And Fat
A pet peeve of mine. Nutrition is not really that complicated, but unfortunately there are a lot of myths that lead people to make really bad decisions. If you only read one thing in 2015, read this.
Bottom Line: The low-fat, high-carb diet recommended by the mainstream nutrition organizations is a miserable failure and has been repeatedly proven to be ineffective.
Bottom Line: Low-carb diets are the easiest, healthiest and most effective way to lose weight and reverse metabolic disease. It is pretty much a scientific fact at this point.
GM’s hit and run: How a lawyer, mechanic, and engineer blew open the worst auto scandal in history
Cars are so complex nowadays, and dependent on electronics, that I’m honestly afraid of them. I have made software for many years and I know how hard, even impossible, it is to get things “perfect”. I can’t imagine how hard it is for something so critical as brakes, steering wheels, etc. Even cameras can’t get focus right sometimes, and it’s been many many years.
Countless articles have been written about General Motors and its massive recalls earlier this year. What hasn’t been fully told is how GM might have gotten away with multiple counts of consumercide were it not for the efforts of three men: a Georgia lawyer, a Mississippi mechanic, and a Florida engineer.
Brooke Melton needn’t have died that night. She was killed by a corporation’s callous disregard for the safety of its customers, made worse by a regulatory agency reluctant to regulate.
The Long Game: Part 1 and The Long Game: Part 2
Two very short (less than 5 minutes) video essays about how notable people in the story of creativity are always celebrated without mentioning the boring years when they were nothing but losers. It’s a fun little video, worth a watch for the idea and the interesting editing. It feels like someone really wanted to create these.
The UDD bugs interface currently knows about the following release critical bugs:
- In Total: 177 bugs affecting
- Affecting Jessie: 157 (key packages: 92) That's the number we need to get down to zero before the release. They can be split in two big categories:
  - Affecting Jessie and unstable: 124 (key packages: 74) Those need someone to find a fix, or to finish the work to upload a fix to unstable:
    - 19 bugs are tagged 'patch'. (key packages: 12) Please help by reviewing the patches, and (if you are a DD) by uploading them.
    - 4 bugs are marked as done, but still affect unstable. (key packages: 0) This can happen due to missing builds on some architectures, for example. Help investigate!
    - 101 bugs are neither tagged patch, nor marked done. (key packages: 62) Help make a first step towards resolution!
  - Affecting Jessie only: 33 (key packages: 18) Those are already fixed in unstable, but the fix still needs to migrate to Jessie. You can help by submitting unblock requests for fixed packages, by investigating why packages do not migrate, or by reviewing submitted unblock requests.
How do we compare to the Squeeze and Wheezy release cycles?

Week  Squeeze        Wheezy          Jessie
43    284 (213+71)   468 (332+136)   319 (240+79)
44    261 (201+60)   408 (265+143)   274 (224+50)
45    261 (205+56)   425 (291+134)   295 (229+66)
46    271 (200+71)   401 (258+143)   427 (313+114)
47    283 (209+74)   366 (221+145)   342 (260+82)
48    256 (177+79)   378 (230+148)   274 (189+85)
49    256 (180+76)   360 (216+155)   226 (147+79)
50    204 (148+56)   339 (195+144)   ???
51    178 (124+54)   323 (190+133)   189 (134+55)
52    115 (78+37)    289 (190+99)    147 (112+35)
1     93 (60+33)     287 (171+116)   140 (104+36)
2     82 (46+36)     271 (162+109)   157 (124+33)
3     25 (15+10)     249 (165+84)
4     14 (8+6)       244 (176+68)
5     2 (0+2)        224 (132+92)
6     release!       212 (129+83)
7     release+1      194 (128+66)
8     release+2      206 (144+62)
9     release+3      174 (105+69)
10    release+4      120 (72+48)
11    release+5      115 (74+41)
12    release+6      93 (47+46)
13    release+7      50 (24+26)
14    release+8      51 (32+19)
15    release+9      39 (32+7)
16    release+10     20 (12+8)
17    release+11     24 (19+5)
18    release+12     2 (2+0)
This is long overdue, so here goes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1,SHA512

I'm transitioning my GPG key from an old 1024D key to a new 4096R key. The old key will continue to be valid for some time, but I prefer all new correspondance to be encrypted to the new key, and will be making all signatures going forward with the new key. This transition document is signed with both keys to validate the transition. If you have signed my old key, I would appreciate signatures on my new key as well, provided that your signing policy permits that without re-authenticating me.

Old key:
pub 1024D/0x5DD5685778D621B4 2000-03-07
Key fingerprint = 0F3C 34D1 E4A3 8FC6 435C 01BA 5DD5 6857 78D6 21B4

New key:
pub 4096R/0x1D661A372FED8F94 2013-12-30
Key fingerprint = 9A17 578F 8646 055C E19D E309 1D66 1A37 2FED 8F94
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEARECAAYFAlSwEaIACgkQXdVoV3jWIbQW5QCgoFHVU/D4fKSbvmGv3nNy3MAW
S2UAn075ztmxQ8Y9/22crbUug1sEjfh5iQIcBAEBCgAGBQJUsBGiAAoJEB1mGjcv
7Y+U9PgP/29jPvrNcdWsLI8YK9U6+JzS+TMXNyfp6CQXc8O/+zJwqvvxNpqY3rLM
5otRLIEJ2EVdiF8sCWTDGusS9NkMePzumR0AFAR0iltIkekO5O0HbHhK0sXJQv0s
EipDpFRO9k4/CBpJEy6Pkkxwd3ndtmwrL1/oKeVmM4E62PJd9ofMpQb/gMUsrA8u
F8xoOXY8Os82Rrd759PypSxNecjd6SYaVJTHgFbZ0QIMJkdKaufifzARdw+v5jwg
8Q11BhpYxvUSugZgiciKA6RjRK5bfRnT8VQPFd0zneilsIW13zz/jub9df/vtM5L
vY/6jHvXczYXSG8EGpHJQCD3KtQJPWZ0Nz9rAm4emEPmR2qav6KGARatYAm0RBqZ
Y81YUEuiWzGli6DH1m9SQe8bqM/J94vQAAX9VqUn2gz0Z0Ey25kVQE7NOGsbbGVS
vD/E74FSk1At9/RGpstrfEjsDKPRman2xk/oZe+08sRB22CJl40N4tZV9AkCJNom
HHGZKp+VEKaCEiLUIRjKTHt2HTThg39zmxl+OnoTSFYvloxrDJyi9SxZgCAmBhbD
7kLkaSDmdUj6CmoilGU+gd2zmQl2D+RHinYZBxOUf1vi1MDLWNcLIMgrz4mRXgzE
YKkG0newf9UbyJw42sXe2ukNQBIqBcL/DmAhG7V+r0RD7MQnMEYy
=09bN
-----END PGP SIGNATURE-----
The new key is available from keyservers, e.g. pgp.mit.edu or others.
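After fetching the key it is worth comparing its fingerprint against the one in the transition statement. A minimal sketch of that comparison follows; normalize_fpr and fpr_matches are hypothetical helper names, and the gpg invocations in the comments are the usual ones but are not part of what gets executed here.

```shell
# Fingerprint of the new key, copied from the transition statement.
expected='9A17 578F 8646 055C E19D E309 1D66 1A37 2FED 8F94'

# Strip all whitespace and upper-case the hex digits, so that
# differences in spacing or case do not matter.
normalize_fpr() {
    printf '%s' "$1" | tr -d '[:space:]' | tr 'a-f' 'A-F'
}

# Succeeds only if the given fingerprint equals the expected one.
fpr_matches() {
    [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$expected")" ]
}

# Typical manual steps (need network access, so only shown as comments):
#   gpg --keyserver pgp.mit.edu --recv-keys 0x1D661A372FED8F94
#   gpg --fingerprint 0x1D661A372FED8F94
# Then paste the fingerprint gpg prints:
fpr_matches '9a17578f8646055ce19de3091d661a372fed8f94' && echo "fingerprint matches"
```

A matching fingerprint only shows that the fetched key is the one named in the statement; the two signatures on the statement itself are what tie it to the old key.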
In other news: Yes, I've not been blogging much recently, will try to do updates more often. In the meantime, you can also refer to my Twitter account for random stuff or the new sigrok Twitter account for sigrok-related posts.
I recently had to upload a large number (~1 million) of files to Amazon S3.
My first attempts revolved around s3cmd (and subsequently s4cmd) but both projects seem to be based around analysing all the files first, rather than blindly uploading them. This not only requires a large amount of memory; non-trivial experimentation, fiddling and patching are also needed to avoid unnecessary stat(2) calls. I even tried a simple find | xargs -P 5 s3cmd put [..] but I just didn't trust it to handle errors correctly.
I finally alighted on s3-parallel-put, which worked out well. Here's a brief rundown on how to use it:
- First, change to your source directory. This is to ensure that the filenames created in your S3 bucket are not prefixed with the directory structure of your local filesystem: whilst s3-parallel-put has a --prefix option, it is ignored if you pass a fully-qualified source, i.e. one starting with a /.
- Run with --dry-run --limit=1 and check that the resulting filenames will be correct after all:
$ export AWS_ACCESS_KEY_ID=FIXME
$ export AWS_SECRET_ACCESS_KEY=FIXME
$ /path/to/bin/s3-parallel-put \
    --bucket=my-bucket \
    --host=s3.amazonaws.com \
    --put=stupid \
    --insecure \
    --dry-run --limit=1 \
    .
[..] INFO:s3-parallel-put[putter-21714]:./yadt/profile.Profile/image/circle/807.jpeg -> yadt/profile.Profile/image/circle/807.jpeg [..]
- Remove --dry-run --limit=1, and let it roll.
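The dry-run check can also be automated rather than done by eye. The sketch below assumes the log lines keep the "source -> key" shape shown above; keys_are_relative is a made-up helper name, not part of s3-parallel-put.

```shell
# Reads dry-run log lines on stdin. Succeeds if no destination key
# (the text after the last " -> ") starts with a slash.
keys_are_relative() {
    # sed prints only the destination part of lines containing " -> ";
    # grep -q then looks for a leading "/", and "!" inverts the result.
    ! sed -n 's/^.* -> //p' | grep -q '^/'
}

# Example with the log line from the dry run above:
echo 'INFO:s3-parallel-put[putter-21714]:./yadt/profile.Profile/image/circle/807.jpeg -> yadt/profile.Profile/image/circle/807.jpeg' \
    | keys_are_relative && echo "keys look fine"
```

To use it on a real dry run, pipe the tool's log output into the function; depending on where the log is written, that may need a 2>&1 on the s3-parallel-put command.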
I have just backported the php-redis (php5-redis) 2.2.5-1~bpo70+1 package for Debian Wheezy. Thanks to the ftp-masters for their quick ACCEPT :)
Now you can install and use the redis PHP extension from the official repositories, see:
The main change is a switch to the curl() function from the eponymous package by Jeroen Ooms. This was caused by random.org now using https instead of http, and the fact that the url() function from R does not cope well with the redirect. Besides this (enforced) change, everything else remains the same.
Normally, this doesn’t work as one might naively expect:
program > firstfile > secondfile
The second redirection will override the first one. You’d have to use an external tool to make this work, maybe something like:
program | tee firstfile secondfile
But with zsh, this type of thing actually works. It will duplicate the output and write it to multiple files.
This feature also works with a combination of redirections and pipes. For example
ls > foo | grep bar
will write the complete directory listing into file foo and print out files matching bar to the terminal.
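For comparison, shells without this feature need tee for the same effect. A minimal sketch; list_and_filter and its three arguments are names invented for this example.

```shell
# Portable equivalent of zsh's "ls > foo | grep bar":
#   $1 = directory to list
#   $2 = file that receives the complete listing
#   $3 = pattern to filter for on the terminal
list_and_filter() {
    ls "$1" | tee "$2" | grep -- "$3"
}

# Small demonstration in a scratch directory.
demo_dir=$(mktemp -d)
demo_out=$(mktemp)
touch "$demo_dir/bar1" "$demo_dir/baz"
list_and_filter "$demo_dir" "$demo_out" bar
rm -rf "$demo_dir" "$demo_out"
```

The listing file is kept outside the listed directory here so that it cannot show up in its own listing.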
That’s great, but this feature pops up in unexpected places.
I have a shell function that checks whether a given command produces any output on stderr:
! myprog "$arg" 2>&1 >/dev/null | grep .
The effect of this is:
- If no stderr is produced, the exit code is 0.
- If stderr is produced, the exit code is 1 and the stderr is shown.
(Note the ordering of 2>&1 >/dev/null to redirect stderr to stdout and silence the original stdout, as opposed to the more common incantation of >/dev/null 2>&1, which silences both stderr and stdout.)
The reason for this is that myprog has a bug that causes it to print errors but not produce a proper exit status in some cases.
Now how will my little shell function snippet behave under zsh? Well, it’s quite confusing at first, but the following happens. If there is stderr output, then only stderr is printed. If there is no stderr output, then stdout is passed through instead. But that’s not what I wanted.
This can be reproduced simply:
ls --bogus 2>&1 >/dev/null | grep .
prints an error message, as expected, but
ls 2>&1 >/dev/null | grep .
prints a directory listing. That’s because zsh redirects stdout to both /dev/null and the pipe, which makes the redirection to /dev/null pointless.
Note that in bash, the second command prints nothing.
This behavior can be changed by turning off the MULTIOS option (see zshmisc man page), and my first instinct was to do that, but options are not lexically scoped (I think), so this would break again if the option was somehow changed somewhere else. Also, I think I kind of like that option for interactive use.
My workaround is to use a subshell:
! ( myprog "$arg" 2>&1 >/dev/null ) | grep .
The long-term fix will probably be to write an external shell script in bash or plain POSIX shell.
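As a sketch of what such a portable version might look like, the subshell form can be wrapped in a plain POSIX shell function. The name stderr_is_empty is made up for this illustration.

```shell
# Succeeds (exit 0) when the given command writes nothing to stderr.
# Otherwise prints the captured stderr on stdout and fails (exit 1).
stderr_is_empty() {
    # Inside the subshell, 2>&1 first points stderr at the pipe, then
    # >/dev/null discards the original stdout. Keeping both
    # redirections inside the subshell separates them from the pipe.
    ! ( "$@" 2>&1 >/dev/null ) | grep .
}

stderr_is_empty sh -c 'echo to-stdout' && echo "nothing on stderr"
```

One caveat of the grep . test: stderr output consisting only of empty lines is treated as no output at all.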
I got a belated Christmas present today. Thanks Jo + Simon!
About a month ago, I blogged about extremon. As a reminder, ExtreMon is a monitoring tool that allows you to view things as they are happening, rather than with the ~5 minute delay that munin gives you, and also avoiding the quad-state limitation of Nagios' "good", "bad", "ugly", and "unknown" states. No, they're not really called that. Yes, I know you knew that.
Anyway. In my blog post, I explained how you can set up ExtreMon, and I also set up a fairly limited demo version on my own server. But I have since realized that while it is functional, it doesn't actually show why ExtreMon is so great. In an effort to remedy that, I present you an example of what ExtreMon can do.
Let's start with a screenshot of the ExtreMon console at the customer for which I spent time trying to figure out how to get it up and running:
Click for full sized version. You'll note that even in that full-sized version, many things are unreadable. This is because the ExtreMon console allows one to move around (right mouse button drag for zoom; left mouse button drag for moving around; control+RMB for rotate; center mouse button to reset to default); so what matters is that everything fits on the screen, not whether it is all readable (if you need to read, you zoom).
The image shows 18 rectangles. Each rectangle represents a single machine in this particular customer's HPC cluster. The top three rectangles are the cluster's file servers; the rest are its high performance nodes.
You'll note that the left fileserver has 8 processor cores (top row), 8 network cards (bottom row, left part), and it also shows information on its memory usage (bottom row, small rectangle in the middle) as well as its NFS client and server procedure calls (bottom row, slightly larger rectangles to the right). This file server is the one on which I installed ZFS a while back; hence the large amount of disks visible in the middle row. The leftmost disk is the root filesystem (which is an ext4 off a hardware RAID1); the two rightmost "disks" are the PCIe-attached SSDs which are used for the ZFS L2ARC and write log. The other disks in this file server nicely show how ZFS does write load balancing over all its disks.
The second file server has a hardware RAID1 on which it stores all its data; as such, there is only one disk graph there. It is also somewhat more limited in network, as it has only two NICs. It does, however, also have 8 cores.
The last file server has no more than four processor cores; in addition, it also does not have a hardware RAID controller, so it must use software RAID over its four hard disks. This server is used for archival purposes, mostly, since it is insufficient for most anything else.
As said, the other nodes are the "compute nodes", where the hard work is done. Most of these compute nodes have 16 cores each; two have 12 instead. When this particular screenshot was taken, four of the nodes (the ones showing red in their processor graphs) were hard at work; the others seem to have been mostly idling. In addition to the familiar memory, NFS (client only), network, and processor graphs, these nodes also show a "swap space" graph (just below the memory one), which seems fine for most nodes, except for the bottom left one (which shows a few bars that are coloured yellow rather than green).
The green/yellow/red stuff is supposed to represent the "ok", "warning", "bad" states that would be familiar from Nagios. In this particular case, however, where "processor is busy all the time" is actually a wanted state, a low amount of idleness on the part of the processor isn't actually a problem, on the contrary. I did therefore consider modifying the ExtreMon configuration so that the processor graphs would not show red when the system was under high load; however, I found that differences in colour like this actually make it easier to see, at a glance, which machines are busy -- and that's one of the main reasons why we wanted to set this up.
If you look carefully, you can find a particular processor core in the graph which shows 100% usage for "idle", "system", and "softirq", at the same time. Obviously that can't be the case, so there's a bug somewhere. Frank seems to believe it is a bug in CollectD; I haven't verified that. At any rate, though, this isn't usually a problem, due to the high update frequency of ExtreMon.
The amount of data that's flowing through ExtreMon is amazing:
- 22 values for NFS (times two for the file servers) per server: 22x2x3+22x15
- 4 values for memory: 4x18
- 3 values for swap: 3x15
- 8 values per CPU core: 8x8x2+8x4+8x12x2+8x16x13
- 2 values per disk: 2x25+2+2x4
- 2 values per NIC: 2x8x12+2x2x2+2x4x4
Which renders a grand total of 2887 data points that are shown in this particular screenshot; and then I'm not even counting all the intermediate values, some of which also pass through ExtreMon. Nor am I counting the extra bits which have since been added (this screenshot is a few days old, now, and I'm still finetuning things). Yet even so, ExtreMon manages to update those values once every few seconds, in the worst case. As a result, the display isn't static for a moment, constantly moving and updating data so that what you see is never out of date for more than a second or two.
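The sum can be checked with plain shell arithmetic. The figures below are copied from the list above; only the function and variable names are invented for the example.

```shell
extremon_total() {
    nfs=$((22*2*3 + 22*15))                  # file servers count twice, compute nodes once
    mem=$((4*18))                            # every machine
    swap=$((3*15))                           # compute nodes only
    cpu=$((8*8*2 + 8*4 + 8*12*2 + 8*16*13))  # 8 values per core
    disk=$((2*25 + 2 + 2*4))                 # 2 values per disk
    nic=$((2*8*12 + 2*2*2 + 2*4*4))          # 2 values per NIC
    echo $((nfs + mem + swap + cpu + disk + nic))
}

extremon_total   # prints 2887
```

The processor cores account for 2016 of those values, which is by far the largest share.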
via Le Monde, 16 Sep 2014:
Le Monde: What should a left-wing policy be? A regulation of capitalism, or a policy of radical break with this economic system?
B.M.: […] We are indeed heading towards an economy of sharing, of things being free of charge, of free software. The central figure of tomorrow will be the researcher who, when he gives something to the community, does not lose it. The researcher answers the fundamental needs of humankind: creation, curiosity, change, progress. He is obliged to cooperate. Cooperation channels violence, which liberalism hoped to channel through gentle commerce! What lies beyond capitalism will be an economy of solidarity and fraternity. Today, the unavoidable question concerns the nature of work.[…] -- Bernard Maris
23 Sep 1946 - 7 Jan 2015
LCA 2015 is next week so it seems like a good time to offer some suggestions for other delegates based on observations of past LCAs. There’s nothing LCA specific about the advice, but everything is based on events that happened at past LCAs.
Don’t Oppose a Lecture
Question time at the end of a lecture isn’t the time to demonstrate that you oppose everything about the lecture. Discussion time between talks at a mini-conf isn’t a time to demonstrate that you oppose the entire mini-conf. If you think a lecture or mini-conf is entirely wrong then you shouldn’t attend.
The conference organisers decide which lectures and mini-confs are worthy of inclusion and the large number of people who attend the conference are signalling their support for the judgement of the conference organisers. The people who attend the lectures and mini-confs in question want to learn about the topics in question and people who object should be silent. If someone gives a lecture about technology which appears to have a flaw then it might be OK to ask one single question about how that issue is resolved; apart from that, the lecture hall is for the lecturer to describe their vision.
The worst example of this was between talks at the Haecksen mini-conf last year when an elderly man tried at great length to convince me that everything about feminism is wrong. I’m not sure to what degree the Haecksen mini-conf is supposed to be a feminist event, but I think it’s quite obviously connected to feminism – which of course was why he wanted to pull that stunt. After he discovered that I was not going to be convinced and that I wasn’t at all interested in the discussion he went to the front of the room to make a sexist joke and left.
Consider Your Share of Conference Resources
I’ve previously written about the length of conference questions. Question time after a lecture is a resource that is shared among all delegates. Consider whether you are asking more questions than the other delegates and whether the questions are adding benefit to other people. If not then send email to the speaker or talk to them after their lecture.
Note that good questions can add significant value to the experience of most delegates. For example when a lecturer appears to be having difficulty in describing their ideas to the audience then good questions can make a real difference, but it takes significant skill to ask such questions.
Dorm Walls Are Thin
LCA is one of many conferences that is typically held at a university with dorm rooms offered for delegates. Dorm rooms tend to have thinner walls than hotel rooms so it’s good to avoid needless noise at night. If one of your devices is going to make sounds at night please check the volume settings before you start it. At one LCA I was startled at about 2AM by the sound of a very loud porn video from a nearby dorm room. The volume was reduced within a few seconds, but it’s difficult to get to sleep quickly after that sort of surprise.
If you set an alarm then try to avoid waking other people. If you set an early alarm and then just get up then other people will get back to sleep, but pressing “snooze” repeatedly for several hours (as has been done in the past) is anti-social. Generally I think that an alarm should be at a low volume unless it is set for less than an hour before the first lecture – in which case waking people in other dorm rooms might be doing them a favor.
Phones in Lectures
Do I need to write about this? Apparently I do because people keep doing it!
Phones can be easily turned to vibrate mode; most people who I’ve observed taking calls in LCA lectures have managed this, but it’s worth noting for those who don’t.
There are very few good reasons for actually taking a call when in a lecture. If the hospital calls to tell you that they have found a matching organ donor then it’s a good reason to take the call, but I can’t think of any other good example.
Many LCA delegates do system administration work and get calls at all times of the day and night when servers have problems. But that isn’t an excuse for having a conversation in the middle of the lecture hall while the lecture is in progress (as has been done). If you press the green button on a phone you can then walk out of the lecture hall before talking. It’s expected that mobile phone calls sometimes have signal problems at the start of the call, so no-one is going to be particularly surprised if it takes 10 seconds before you say hello.
As an aside, I think that the requirement for not disturbing other people depends on the number of people who are there to be disturbed. In tutorials there are fewer people and the requirements for avoiding phone calls are less strict. In BoFs the requirements are less strict again. But the above is based on behaviour I’ve witnessed in mini-confs and main lectures.
Smoking
It is the responsibility of people who consume substances to ensure that their actions don’t affect others. For smokers that means smoking far enough away from lecture halls that it’s possible for other delegates to attend the lecture without breathing in smoke. Don’t smoke in the lecture halls or near the doorways.
Also, using an e-cigarette is still smoking; don’t do it in a lecture hall.
Photography
Unwanted photography can be harassment. I don’t think there’s a need to ask for permission to photograph people who harass others or break the law. But photographing people who merely break the social agreement as to what should be done in a lecture probably isn’t justified. At a previous LCA a man wanted to ask so many questions at a keynote lecture that he had a page of written notes (seriously). That was obviously outside the expected range of behaviour, but probably didn’t justify the many people who photographed him.
A Final Note
I don’t think that LCA is in any way different from other conferences in this regard. Also I don’t think that there’s much that conference organisers can or should do about such things.
As I have just moved to a new home, I had to declare my new address to all my providers, including banks and administrations which require a proof of address, which can be a phone, DSL or electricity bill.
Well, this is just stupid, as, by definition, one will only have a bill after at least a month. Until then, that means the bank will keep a false address, and that the mail they send may not be delivered to the customer.
Now, bankers and employees of similar administrations, if you could use some common sense, I have some information for you: when someone moves to a new home, unless he is hosted by someone else, he is either a renter or an owner. Well, you should know that a renter has one contract that proves it, which is called a lease. And an owner has one paper that proves it, which is called a title, or, before it has been issued by the administration, a certificate of sale. Now if you do not accept that as a proof of address, you just suck.
Besides, such zeal to check one's address is just pointless, as it is easy to get a proof of address without waiting for a phone, DSL or electricity bill (or to prove a false address, actually…) by just faking one. And as a reminder, at least in France, forgery is punishable by law but defined as an alteration of truth which can cause a prejudice, which means modifying a previous electricity bill to prove your actual address is not considered forgery (but using the same means to prove a false address is, of course!).
All I want for 2015 is a Free/Open Source Software social network which is:
- easy to register on (no reCaptcha disability-discriminator or similar, a simple openID, activation emails that actually arrive);
- has an email help address or online support or phone number or something other than the website which can be used if the registration system causes a problem;
- can email when things happen that I might be interested in;
- can email me summaries of what’s happened last week/month in case I don’t know what I’m interested in;
- doesn’t email me too much (but this is rare);
- interacts well with other websites (allows long-term members to post links, sends trackbacks or pingbacks to let the remote site know we’re talking about them, makes it easy for us to dent/tweet/link to the forum nicely, and so on);
- isn’t full of spam (has limits on link-posting, moderators are contactable/accountable and so on, and the software gives them decent anti-spam tools);
- lets me back up my data;
- is friendly and welcoming and trolls are kept in check.
Is this too much to ask for? Does it exist already?
Every few days, someone publishes a new guide, tutorial, library or framework about web scraping, the practice of extracting information from websites where an API is either not provided or is otherwise incomplete.
However, I find these resources fundamentally deceptive — the arduous parts of "real world" scraping simply aren't in the parsing and extraction of data from the target page, the typical focus of these articles.
The difficulties are invariably in "post-processing"; working around incomplete data on the page, handling errors gracefully and retrying in some (but not all) situations, keeping on top of layout/URL/data changes to the target site, not hitting your target site too often, logging into the target site if necessary and rotating credentials and IP addresses, respecting robots.txt, target site being utterly braindead, keeping users meaningfully informed of scraping progress if they are waiting for it, target site adding and removing data resulting in a null-leaning database schema, sane parallelisation in the presence of prioritisation of important requests, difficulties in monitoring a scraping system due to its implicitly non-deterministic nature, and general problems associated with long-running background processes in web stacks.
In other words, extracting the right text on the page is by far the easiest and most trivial part, with little practical difference between an admittedly cute jQuery-esque parsing library and just using a blunt regular expression.
It would be quixotic to simply retort that sites should provide "proper" APIs but I would love to see more attempts at solutions that go beyond the superficial.
This release mostly solidifies and fixes things. Support for saving integer objects, which was expanded in release 0.2.3, was not entirely correct. Operations on big-endian systems were not up to snuff either.
Wush Wu helped in getting this right with very diligent testing and patching particularly on big-endian hardware. We also got a pull request from Romain to reflect better const correctness at the Rcpp side of things. Last but not least we obliged by the CRAN Maintainers to not assume one could call gzip from system() call because, well, you guessed it.Changes in version 0.2.4 (2015-01-05)
Support for saving integer objects was not correct and has been fixed.
Support for loading and saving on 'big endian' systems was incomplete; it has been greatly expanded and corrected, thanks in large part to very diligent testing as well as patching by Wush Wu.
The implementation now uses const iterators, thanks to a pull request by Romain Francois.
The vignette no longer assumes that one can call gzip via system as the world's leading consumer OS may disagree.
CRANberries also provides a diffstat report for the latest release. As always, feedback is welcome, and the rcpp-devel mailing list off the R-Forge page for Rcpp may be the best place to start a discussion. GitHub issue tickets are also welcome.
I promised to write about this a long time ago, ooops... :-)
Another ARM port in Debian - yay!
arm64 is officially a release architecture for Jessie, aka Debian version 8. That's taken a lot of manual porting and development effort over the last couple of years, and it's also taken a lot of CPU time - there are ~21,000 source packages in Debian Jessie! As is often the case for a brand new architecture like arm64 (or AArch64, to use ARM's own terminology), hardware can be really difficult to get hold of. In time this will cease to be an issue as hardware becomes more commoditised, but in Debian we really struggled to get hold of equipment for a very long time during the early part of the port.
First bring-up in Debian Ports
To start with, we could use ARM's own AArch64 software models to build the first few packages. This worked, but only very slowly. Then Chen Baozi and the folks running the Tianhe-2 supercomputer project in Guangzhou, China contacted us to offer access to some arm64 hardware, and this is what Wookey used for bootstrapping the new port in the unofficial Debian Ports archive. This has now become the normal way for new architectures to get into Debian. We got most of the archive built in debian-ports this way, and we could then use those results to seed the initial core set of packages in the main Debian archive.
Second bring-up - moving into the main Debian archive
By the time that first Debian bring-up was done, ARM was starting to produce its own "Juno" development boards, and with the help of my boss^4 James McNiven we managed to acquire a couple of those machines for use as official Debian build machines. The existing machines in China were faster, but for various reasons quite difficult to maintain as official Debian machines. So I set up the Junos as buildds just before going to DebConf in August 2014. They ran very well, and (for dev boards!) were very fast and stable. They built a large chunk of the Debian archive, but as the release freeze for Jessie grew close we weren't quite there. There was a small but persistent backlog of un-built packages that were causing us issues, plus the Juno machines are/were not quite suitable as porter boxes for Debian developers all over the world to use for debugging their packages on the new architecture.
More horsepower - Linaro machines
This is where Linaro came to our aid. Linaro's goal is to help improve Free and Open Source Software on ARM, and one of the more recent projects in Linaro is a cluster of servers that are made available for software developers to use to get early access to ARMv8 (arm64) hardware. It's a great way for people who are interested in this new architecture to try things out, port their software or indeed just help with the general porting effort.
As Debian is seen as such an important part of the FLOSS ecosystem, we managed to negotiate dedicated access to three of the machines in that cluster for Debian's use and we set those up in October, shortly before the freeze for Jessie. Andy Doan spent a lot of his time getting these machines going for us, and then I set up two of them as build machines and one as the porter box we were still needing.
With these extra machines available, we quickly caught up with the ever-busy "Needs-Build" queue and we've got sufficient build power now to keep things going for the Jessie release. We were officially added to the list of release architectures at the Cambridge mini-Debconf in November, and all is looking good now!
And in the future?
I've organised the loan of another arm64 machine from AMD for Debian to use for further porting and/or building. We're also expecting that more and more machines will be coming out soon as vendors move on from prototyping to producing real customer equipment. Once that's happened, more kit will be available and everybody will be able to have arm64-powered computers in the server room, on their desk and even inside their laptop! Mine will be running Debian Jessie... :-)
Thanks!
There's been a lot of people involved in the Debian arm64 bootstrapping at various stages, so many that I couldn't possibly credit them all! I'll highlight some, though. :-)
First of all, Wookey's life has revolved around this port for the last few years, tirelessly porting, fixing and hacking out package builds to get us going. We've had loads of help from other teams in Debian, particularly the massive patience of the DSA folks with getting early machines up and running and the prodding of the ftpmaster, buildd and release teams when we've been grinding our way through ever more package builds and dependency loops. We've also had really good support from toolchain folks in Debian and ARM, fixing bugs as we've found them by stressing new code and new machines. We've had a number of other people helping by filing bugs and posting patches to help us get things built and working. And (last but not least!) thanks to all the folks who've helped us beg and borrow the hardware to make the Debian arm64 port a reality.
Rumours of even more ARM ports coming soon are entirely scurrilous... *grin*