Archive for the ‘Hacks’ Category

my UAV blog

Friday, April 15th, 2011

For those following my blog, I thought I should mention that I have a separate blog for the stuff I’m doing as part of the CanberraUAV project, which is building an entry for next year’s Outback Challenge.

The posting I made today may interest other Linux hackers as well. Up to now I’ve been using an old OLPC laptop as my ground control station for my UAV, because it is sunlight readable. I’ve now replaced that with a Thinkpad T41 modified to be sunlight readable by Nic Schraudolph from the MakeHackVoid group in Canberra.

de-beeping an LG microwave

Saturday, October 30th, 2010

We recently bought a new microwave, an LG MS3446VRL, chosen solely on the basis of it fitting our largest dish.

When we got it home we found one very annoying feature – it beeps loudly and often. Not just when it has finished cooking, but it keeps beeping every minute after that until you go and press a button or open it up to shut the damn thing up. It is amazing how annoying this can be.

After a few weeks of putting up with this I finally de-beeped it. After removing the cover we spotted something that looked a bit like a beeper on the circuit board, with a part number of TDP-2020P. A quick google search confirmed its beeping purpose. Then it was just a matter of applying brute force to rip the damn thing off the circuit board in order to get a beepless microwave. Ahh, the sound of silence!

In case anyone else has a beeping LG microwave that they want to silence, the part that needs to be removed is the TDP-2020P beeper on the circuit board.

debugging startup problems on Ubuntu

Thursday, July 29th, 2010

I recently upgraded my home server from Ubuntu Karmic to Lucid. It did not go well.

The actual apt-get dist-upgrade went fine, with only minor problems which were easy enough to fix. The problem came when I rebooted. The boot started fine, but then got stuck at the purple boot page, which showed “Ubuntu 10.04” and 5 dots which cycled between white and red. It never got past that point.

The usual thing to do at this point is to reboot in single user mode and start debugging startup scripts. Unfortunately I found that single user mode with Ubuntu Lucid was not useful as it doesn’t start a shell until after a huge pile of other things are started. In my case a ‘single’ boot got stuck at the same point. Getting rid of the quiet and splash options, and adding nomodeset also didn’t help.

I found that if I booted an older kernel (2.6.31-19) then the system came up OK. That pointed to a likely driver issue. I could have just settled for that older kernel, but part of the reason for going to Lucid was to get a newer ALSA with better support for HDMI audio, so I didn’t really want to stick to an older kernel. I also wanted to know why the problem was happening.

I was also able to get a shell using the latest kernel by using the init=/bin/bash trick, but that doesn’t help to actually debug the problem. To debug startup problems you need to be able to watch the startup process in action, to see what is waiting. This is much harder these days with the new upstart init system now used in Ubuntu, as startup is much more parallel than it used to be. Adding some echo lines to init scripts used to be a useful technique, but it is much harder to get anything sensible out of that when using upstart.

To try to debug the problem I initially had a look for any startup debugging options. I found some promising options in /etc/default/rcS, and tried setting VERBOSE=yes and SULOGIN=yes. I found that the VERBOSE=yes option was somewhat useful, as it gave me some information on what jobs were started/waiting, but it didn’t really allow me to pin down the problem. The parallelism in upstart again made interpreting the output hard. When it says that a job is waiting it doesn’t say what it is waiting on, so you have no idea what the underlying problem really is.

Despite the promising name, and the nice description in the rcS(5) manpage, the SULOGIN=yes option didn’t seem to do anything at all. A grep for SULOGIN in the startup scripts didn’t find any hits, so I suspect it isn’t actually implemented.

As usual, the real key to solving the problem was a hack. I added the following to /etc/default/rcS:

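# run in a background subshell so the rest of startup is not held up
(
# wait 10 seconds into the boot
/bin/sleep 10
# bring the interface up by hand with a static address
/sbin/ifconfig eth0 192.168.2.10 up
# start sshd directly rather than waiting for its startup job
/usr/sbin/sshd
) > /dev/null 2>&1 &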

The idea behind this hack was to allow me to log in with ssh from my laptop during the startup process and watch what was going on. This worked really well and meant that I was finally able to debug the startup process with the most recent Lucid kernel.
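Since sshd only appears about ten seconds into the boot, it can help to have the laptop poll for it instead of retrying ssh by hand. The following is a minimal sketch of that idea, not part of the original hack. The function name wait_for_port and its timeout values are made up for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=1.0):
    """Poll until a TCP connect to (host, port) succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port("192.168.2.10", 22) on the laptop right after rebooting the server should return True once the sshd started from /etc/default/rcS is reachable, at which point a normal ssh login can follow.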

I rebooted again, logged into the system with ssh from my laptop, and started poking around with ps and initctl to see what was going on. I had assumed that “initctl list” would give me the information I needed. It does show what jobs are waiting, but as with the VERBOSE=yes messages it doesn’t tell you what it is waiting on.
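To make the “initctl list” output easier to scan over ssh, the job lines can be filtered down to the ones that are not running. This is only a sketch: the helper names are hypothetical, the SAMPLE text is made-up illustrative output rather than a capture from this machine, and it still cannot say what a job is waiting on.

```python
import re

# One job line of `initctl list`: "job [(instance)] goal/state[, process PID]"
_LINE = re.compile(
    r"^(?P<job>\S+)(?: \((?P<instance>[^)]*)\))? "
    r"(?P<goal>\w+)/(?P<state>[\w-]+)(?:, process (?P<pid>\d+))?"
)

# Made-up sample output, for illustration only.
SAMPLE = """\
cryptdisks-enable stop/waiting
ssh start/running, process 812
network-interface (eth0) start/running
plymouth stop/waiting
"""


def parse_initctl_list(text):
    """Turn `initctl list` output into a list of dicts, one per job line."""
    jobs = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:  # indented continuation lines do not match and are skipped
            jobs.append(m.groupdict())
    return jobs


def not_running(jobs):
    """Names of jobs whose state is anything other than 'running'."""
    return [j["job"] for j in jobs if j["state"] != "running"]
```

On the live system the text would come from running “initctl list” in the ssh session; not_running(parse_initctl_list(SAMPLE)) picks out cryptdisks-enable and plymouth from the sample.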

Poking around some more I saw 3 things that were suspicious:

1) cryptdisks-enable was shown as “waiting”. I don’t have any encrypted disks on this system, so why should it be waiting?

2) dmesg showed a segfault in plymouth, which is the process that asks for user input during startup (it also does splash screens). This could be linked to why cryptdisks was waiting, as it’s possible that cryptdisks wanted a passphrase (for what disk though? I don’t have any encrypted disks).

3) dmesg also showed a lot of warnings from the dvb-usb-cxusb driver

As I was running low on time I decided to try the triple whammy of removing the cryptsetup package, removing the dvb-usb and dvb-usb-cxusb drivers (by moving them out of /lib/modules and running depmod) and removing the plymouth-theme-ubuntu-text package to try to simplify plymouth. This did the trick and my system now boots fine.

I still have the puzzle as to what is really causing the problem (and thus which of the changes matter), but I can leave that for another day. I thought it would be worthwhile sharing the ssh debug hack in case other people are also trying to debug upstart startup problems.

recovering kaddressbook entries

Friday, June 25th, 2010

I’ve been using kaddressbook for a long time, long after I switched from KDE to Gnome. It has worked well until now.

After returning from vacation I decided to update my laptop from Ubuntu Karmic to Lucid, and then found that kaddressbook wouldn’t start. Digging into it I found a maze of errors related to mysql and innodb. I hadn’t even realized that kaddressbook was using mysql for my addressbook till now. It’s a bit of an overkill for my little home addressbook :-)

After spending some time tracing mysql and trying to work out why I was getting “InnoDB: No valid checkpoint found”, I realised there was an easier way. Inside that mysql database was just a set of binary VCARD records. So the quick fix was:

mlgrep.py BEGIN:VCARD END:VCARD ~/.local/share/akonadi/db_data/ibdata1

putting the result in a file. I then imported the file into the evolution contacts app and all was well. Of course, this assumes records aren’t split within the db, but it seems to work well enough for this quick fix.

See http://samba.org/ftp/unpacked/junkcode/mlgrep.py for mlgrep.
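For readers who just want the idea, the extraction amounts to a multi-line search between two markers in a binary file. The sketch below is not mlgrep.py itself and makes no claim to match its behaviour. The function names are hypothetical, it reads the whole file into memory, and it shares the assumption that records aren’t split within the db.

```python
import re


def extract_records(data, begin=b"BEGIN:VCARD", end=b"END:VCARD"):
    """Return every begin..end span found in a bytes blob, markers included."""
    # DOTALL lets '.' cross newlines; '.*?' stops at the first end marker.
    pattern = re.compile(re.escape(begin) + b".*?" + re.escape(end), re.DOTALL)
    return pattern.findall(data)


def extract_from_file(path):
    """Read a file as raw bytes and pull out the records it contains."""
    with open(path, "rb") as f:
        return extract_records(f.read())
```

Writing the returned records to a file, one after another separated by newlines, gives something a contacts app can try to import.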

Cheers, Tridge

chocolate truffles!

Sunday, January 10th, 2010

After having some fun building a coffee roaster with automatic power control and a python UI, my wife asked if I could help her with chocolate tempering for making chocolate truffles.

The result is a new use for pyRoast, and some great “techno truffles” !

See http://coffeesnobs.com.au/YaBB.pl?num=1263085051 for all the details and a short movie.

Cheers, Tridge

Fun with gdb

Thursday, April 23rd, 2009

Ben Elliston just gave a great lunchtime talk at OzLabs on gdb. Ben has actually read the gdb manual, and in doing so has discovered a lot of neat tricks!

One that he missed in his talk is some cute stuff you can do with the p command (the short form for ‘print’). For example, say you have some process that is sending its output to /dev/null, but you want to see that output. That can happen with long running daemons that were started without debugging enabled for example. What you need to do is tell that program to change file descriptor 1 to point to a file, instead of /dev/null.

Let’s look at an example:


 lsof -n |grep null|grep 1u
 gconfd-2  17243     tridge    1u      CHR                1,3              5961 /dev/null

We see from lsof that gconfd-2 has pid of 17243 and is redirecting stdout (fd 1) to /dev/null. Now let’s attach with gdb, close fd 1, and re-open it on a file in /tmp


 tridge@blu:~$ gdb --pid 17243
 GNU gdb 6.8-debian
 Copyright (C) 2008 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
 and "show warranty" for details.
 This GDB was configured as "x86_64-linux-gnu".
 Attaching to process 17243

ok, we’re attached, now the dirty work:


 (gdb) p close(1)
 $1 = 0
 (gdb) p fopen("/tmp/gconfd.log", "w")
 $2 = 27041792
 (gdb) p fileno($2)
 $6 = 1
 (gdb) quit

Now if we check with lsof, we see the fd is now open on a log file


 gconfd-2  17243     tridge    1u      REG                8,1        0 12277473 /tmp/gconfd.log
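On Linux the same check can be made without lsof, because each entry under /proc/PID/fd is a symlink to whatever that descriptor has open. A small sketch, with fd_target as a made-up helper name:

```python
import os


def fd_target(pid, fd):
    """Return what file descriptor `fd` of process `pid` points at (Linux only)."""
    # /proc/<pid>/fd/<fd> is a symlink to the open file, pipe or socket.
    return os.readlink(f"/proc/{pid}/fd/{fd}")
```

For the example above, fd_target(17243, 1) would be expected to name the log file in /tmp once the gdb commands have run. Reading another user’s process needs the appropriate permissions.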

You may ask why I used fopen() and not open(). First off, “w” is easier than remembering the right flag bits for open(). Secondly, open() may in fact be open64 or some other symbol, whereas I’ve found that fopen() always seems to work.

Notice also that I checked that fopen() gave us the right file descriptor back by calling fileno() on the output of fopen(). It will always give you the lowest available fd, which is usually right. If it isn’t the right one then use p with dup2() to move it to the right one.
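The lowest-available-fd rule and the dup2() fallback can be tried safely outside gdb. The sketch below does the same steps inside an ordinary Python process, using os.open() in place of fopen(). The function name reopen_fd is made up for illustration.

```python
import os


def reopen_fd(target_fd, path):
    """Point target_fd at path: close it, open the file, dup2() if needed."""
    os.close(target_fd)
    # open() hands back the lowest free descriptor, usually target_fd itself.
    new_fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    if new_fd != target_fd:
        # A lower descriptor was free, so move the file to where we want it.
        os.dup2(new_fd, target_fd)
        os.close(new_fd)
    return target_fd
```

The dup2() branch only runs when some descriptor below target_fd happened to be free, which is the case the paragraph above warns about.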

Another use

This type of hack can also be used to solve the problem of unmounting a filesystem that has a process running with a file open on the filesystem read-write. The kernel doesn’t allow that, and the usual approach is to kill off all those processes. But what if you don’t want one of those processes to die? You can use the above trick to move the fd to point at a different filesystem. By using lsof to find all the ones you need to move, you can eventually remount the filesystem read-only, or even unmount it. This allows you to swap filesystems with running processes using them. You can minimise the effect on the processes by using gdb -x to script it (so it isn’t paused waiting for nervous fingers to type the right commands).
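One way to prepare such a gdb -x script is to generate it, so the commands are ready before gdb ever stops the process. The sketch below only builds the command file text and the gdb command line, and it does not run gdb. It mirrors the three p commands from the transcript above, using $ (gdb’s most recent value) in place of $2. The function names are hypothetical, the dup2() fallback is left out, and newer gdb releases may insist on a cast for these calls, such as p (int)close(1).

```python
def gdb_redirect_script(fd, path):
    """Build gdb commands that re-point `fd` of the attached process at `path`."""
    if '"' in path or "\\" in path or "\n" in path:
        raise ValueError("path must not contain quotes, backslashes or newlines")
    return "\n".join([
        f"p close({fd})",
        f'p fopen("{path}", "w")',
        "p fileno($)",  # $ is the value just printed, i.e. the FILE pointer
        "",
    ])


def gdb_command(pid, script_path):
    """Argument list for running the script non-interactively against pid."""
    return ["gdb", "--batch", "-x", script_path, "--pid", str(pid)]
```

The script text would be written to a file and the argument list passed to something like subprocess.run(). It is worth checking the printed fileno() value afterwards, as in the interactive session.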