Ben Elliston just gave a great lunchtime talk at OzLabs on gdb. Ben has actually read the gdb manual, and in doing so has discovered a lot of neat tricks!
One that he missed in his talk is some cute stuff you can do with the p command (the short form for ‘print’). For example, say you have some process that is sending its output to /dev/null, but you want to see that output. That can happen with long running daemons that were started without debugging enabled for example. What you need to do is tell that program to change file descriptor 1 to point to a file, instead of /dev/null.
Let’s look at an example:
lsof -n |grep null|grep 1u
gconfd-2 17243 tridge 1u CHR 1,3 5961 /dev/null
We see from lsof that gconfd-2 has pid of 17243 and is redirecting stdout (fd 1) to /dev/null. Now let’s attach with gdb, close fd 1, and re-open it on a file in /tmp
tridge@blu:~$ gdb --pid 17243
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Attaching to process 17243
ok, we’re attached, now the dirty work:
(gdb) p close(1)
$1 = 0
(gdb) p fopen("/tmp/gconfd.log", "w")
$2 = 27041792
(gdb) p fileno($2)
$6 = 1
(gdb) quit
Now if we check with lsof, we see the fd is now open on a log file
gconfd-2 17243 tridge 1u REG 8,1 0 12277473 /tmp/gconfd.log
You may ask why I used fopen() and not open(). First off, “w” is easier than remembering the right flag bits to open. Secondly, open may in fact be open64 or some other symbol, whereas I’ve found that fopen() always seems to work.
Notice also that I checked that fopen() gave us the right file descriptor back by calling fileno() on the output of fopen(). It will always give you the lowest available fd, which is usually right. If it isn’t the right one then use p with dup2() to move it to the right one.
Another use
This type of hack can also be used to solve the problem of unmounting a filesystem that has a process running with a file open on the filesystem read-write. The kernel doesn’t allow that, and the usual approach is to kill off all those processes. But what if you don’t want one of those processes to die? You can use the above trick to move the fd to point at a different filesystem. By using lsof to find all the ones you need to move, you can eventually mount the filesystem read-only, or even unmount it. This allows you to swap filesystems with running processes using them. You can minimise the effect on the processes by using gdb -x to script it (so it isn’t paused waiting for nervous fingers to type the right commands).