my UAV blog

April 15th, 2011

For those following my blog, I thought I should mention that I have a separate blog for the stuff I’m doing as part of the CanberraUAV project, which is building an entry for next year’s Outback Challenge.

The posting I made today may interest other Linux hackers as well. Up to now I’ve been using an old OLPC laptop as my ground control station for my UAV, because it is sunlight readable. I’ve now replaced that with a Thinkpad T41 modified to be sunlight readable by Nic Schraudolph from the MakeHackVoid group in Canberra.

Linux powered coffee roasting

February 5th, 2011

At the end of my LCA talk this year I forgot to show the final slide, which had the links to the code for the USB preload and pyRoast.

So if you want to try adapting the USB preload code for your own project, the slides of my talk may be useful.

worst abuse of preload I’ve ever seen

February 4th, 2011

We recently had a curious bug report from a Samba user. The bug report showed a strange hang in our provision script. Andrew Bartlett worked with the user to get a gdb backtrace, which showed that an internal heimdal library was calling out to a net_read() function in /opt/lib/libmediaclient.so. That seemed strange, as net_read() is an internal heimdal function (it is part of the roken library), so why would it be calling a “media” library?

The answer turned out to be quite bizarre! The authors of the Sundtek driver install their driver using this script. That script is easily the worst abuse of loader preload mechanisms that I have ever seen. I’m a big fan of LD_PRELOAD as a driver development aid, and a debug tool, but this “driver” goes much further.

ld.so.preload is evil

The script installs the libmediaclient.so library in /etc/ld.so.preload. It implements a “driver” for the Sundtek USB device by intercepting 48 functions (including open()/close()/read()/poll()/mmap() etc) in _all_ installed programs. For those of you who don’t know, the /etc/ld.so.preload mechanism is one of those oh so tempting things that no developer should ever use. The way it works is that any library listed in that file is “preloaded” into all binaries on the system, and its functions override any functions of the same name in all those binaries.

In this case the author of this driver decided that he would avoid writing a real driver by instead intercepting library calls from all programs on the system and faking the return values. He didn’t just intercept standard interfaces, but also intercepted a bunch of non-standard functions. The interception of net_read() is what broke Heimdal, which in turn broke Samba.

Interesting conversation

I thought it would be useful to email the author of the driver to ask them to stop using ld.so.preload. I was pleasantly surprised when a few minutes later Markus Rechberger popped up on #samba-technical to discuss the problem. I wish all driver authors were this responsive.

Unfortunately Markus had turned up to tell me that there was no other way than using ld.so.preload. He insisted that problems were rare (how would he know? it isn’t him that gets the bug reports!) and that it would be far too difficult for his users to have to run a script that uses LD_PRELOAD instead. He thinks CUSE is too unstable and it’s far too difficult to write a driver for lots of different kernel versions.
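For comparison, here is a minimal sketch of what a per-program LD_PRELOAD wrapper could look like. It sets the variable only for the one child process, so no other binary on the system is affected. The library path is the one from the backtrace above; the helper names `preload_env` and `run_with_preload` are made up for this illustration and are not part of the Sundtek package.

```python
import os
import subprocess

# Path taken from the gdb backtrace above; adjust for your install.
PRELOAD_LIB = "/opt/lib/libmediaclient.so"


def preload_env(lib, base_env=None):
    """Return a copy of the environment with lib prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD takes a list of libraries separated by spaces or colons.
    env["LD_PRELOAD"] = lib if not existing else lib + ":" + existing
    return env


def run_with_preload(argv, lib=PRELOAD_LIB):
    """Run a single program with the library preloaded and return its exit code."""
    return subprocess.call(argv, env=preload_env(lib))
```

Calling `run_with_preload(["some-tv-app"])` (a hypothetical program name) would start just that application with the interposing library, leaving Samba, Heimdal and everything else alone.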

The conversation was quite surreal. I asked towards the end if he would mind if I posted a log, but he asked me not to. That is a pity, as I think it would make a great case study for what has gone wrong with some Linux device driver development.

Flavours of Kangaroo Valley

January 29th, 2011

I was in big trouble with Susan with LCA being on the week of Australia Day. Australia Day is Susan’s birthday, and also our wedding anniversary! To make up for it we’ve gone to Kangaroo Valley for a few days to enjoy some walks and a bit of cooking – one of our favourite pastimes.

Italian Desserts

Today we had a wonderful afternoon cooking (and then eating!) Italian desserts. We were at the Flavours of the Valley cooking school in Kangaroo Valley. The description of the “Afternoon Delights” class on the website is quite understated, so we were expecting to perhaps make a coffee and an Italian cake, but instead what we got was a wonderful 2 hour tour of Italian dessert making. We made 4 Italian sweets in 2 hours, which was very intense but quite fantastic.

Latuigi

We started off by making some sweet pasta dough, which after it had rested for a while we later made into Latuigi (Crostoli), a type of sweet fried pasta with icing sugar. We were surprised at how easy it was to roll out and cut the pasta into the intricate twirls that make up the Latuigi.

Espresso fudge cakes and lemon curd cakes

Next Susan and I each started on a cake. I made the espresso fudge cakes with chocolate ganache. I rarely make desserts – Susan normally does those, so making fudge cakes was quite a change for me. Despite my inexperience they turned out very well, especially when eaten straight from the oven!

Susan loves anything with lemon in it, so she chose the lemon curd cakes with meringue topping. She snuck some extra lemon into the recipe, as she always thinks one can never have too much lemon in a dessert. The lemon curd was fantastic, and the meringue was just nicely toasted in the oven to finish them off.

Chocolate pasta

After that came the chocolate pasta! We made chocolate ravioli with raspberry filling, covered with white chocolate sauce and all set on a raspberry mousse base, and with some fresh raspberries on top. It was an absolutely fantastic dessert, and surprisingly simple to make.

Highly recommended

If you want to learn a bit of Italian cooking in a bush setting then Susan and I can definitely recommend Flavours of the Valley. The hosts (Toni and Robert) are very friendly and really passionate about Italian cooking. It is a great bush setting, with wonderful scenery.

We’re going back tomorrow for the main course – two types of gnocchi, polenta and creamy sauces. Yum!

Great conference!

January 29th, 2011

LCA2011 was a true triumph under extraordinary circumstances. The organisers pulled off a miracle to get the conference moved to a new venue at such short notice, and still left us with a fantastic LCA.

The most memorable moment for me was in Jon Oxer’s talk – the “come hither” gesture with the Kinect to bring the AR.Drone towards him was truly inspired. Wonderful stuff!

Congratulations to the organisers, and also to all the speakers, helpers and everyone else who made the conference a success.

See you all in Ballarat next year!

Samba integration with bind9

December 14th, 2010

I’ve spent the last couple of weeks immersed in the bind9 source code. The bind DNS server is one of the oldest applications still in active use in the free software community, and it plays a pivotal role in the world’s internet infrastructure. Working in the darker corners of this venerable codebase has been a very interesting experience!

Samba4 and DNS

Samba4 needs a DNS server, and not just any DNS server. It needs a DNS server that has good support for kerberos signed TKEY/GSSAPI updates, and it needs a DNS server that can integrate with the DRS replication that Active Directory uses for DNS updates between domain controllers. This presents a bit of a challenge.

Up to now we’ve been recommending bind9, but it has been a frustrating experience. Problems with dynamic DNS updates have been the biggest source of bug reports that we’ve had from users, mostly because getting the bind9 configuration right with all the necessary kerberos settings is quite tricky. It was bad enough that after one particularly frustrating session trying to debug DNS updates at the AD plugfest, Metze and Kai stayed up late and started work on a new DNS server built in to Samba4. It looked like we were on our way to having yet another protocol implemented in Samba.

fixing the problems

I wasn’t quite ready to give up on bind however. A few months ago I had applied for access to the bind9 CVS tree (there is currently no anonymous access, something which I’m hoping may get fixed sometime soon), and started working towards making bind9 easier to configure for Samba4 users. The bind developers had accepted some patches previously which fixed a few problems with TKEY/GSSAPI updates, so I was hopeful I’d be able to get a new round of patches in.

This time I was lucky enough to come across Michael Graff, a long time bind developer. Michael was kind enough to teach me a bit about the process of getting patches into bind, and he has responded very quickly to patches I have submitted. We spent an hour on the phone comparing notes on Samba and bind9 development practices, which really helped me understand what is needed.

Encouraged by this, I thought I’d try something a bit more ambitious than just making Samba4 and bind9 easier to configure.

DLZ drivers

A problem in Samba4 we’ve been putting off for quite a while is how we reconcile the bind9 DNS database with the DRS updates that AD uses for replication. With a Samba4 AD DC configured to use bind9, we will have changes coming into the DNS database via 4 different sources:

  1. kerberos signed dynamic DNS updates over the DNS protocol
  2. RODC DNS updates over the netlogon MSRPC pipe
  3. DRS replication updates from other domain controllers
  4. direct modifications via LDAP (DNS in AD is stored in LDAP)

The first type of update can be handled by bind. We’re currently handling the 2nd type of update by constructing an nsupdate script and asking bind to do the modification, using TKEY/GSSAPI. We’re not coping at all with the 3rd and 4th type of update. This was one of the other motivations for Kai and Metze starting on our own DNS server.
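To make the second case more concrete, here is a rough sketch of generating an nsupdate script for a single record. This is not the actual Samba code; the function name `build_nsupdate_script` and the example host names and addresses are invented for illustration. The `server`, `update delete`, `update add` and `send` lines are standard nsupdate commands.

```python
def build_nsupdate_script(server, name, ttl, rtype, data, delete_first=True):
    """Return nsupdate input that (re)registers a single DNS record."""
    lines = ["server %s" % server]
    if delete_first:
        # Remove any stale record of this type before adding the new one.
        lines.append("update delete %s %s" % (name, rtype))
    lines.append("update add %s %d %s %s" % (name, ttl, rtype, data))
    lines.append("send")
    return "\n".join(lines) + "\n"


# Hypothetical example values (documentation address range).
script = build_nsupdate_script("127.0.0.1", "rodc1.example.com", 900, "A", "192.0.2.10")
print(script)
```

The resulting text would then be fed to `nsupdate -g` on standard input, where `-g` asks nsupdate to sign the update using GSS-TSIG with the current kerberos credentials.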

To solve this, we really need the DNS database to be held in ldb. Not only that, we need the DNS server to be able to directly update that database, with transaction support.

The closest thing to this in the current bind9 code is something called a ‘DLZ driver‘. A DLZ driver is an abstraction inside bind9 that allows for new database drivers. The DLZ patches to bind were started in 2002 by Rob Butler, and have since been integrated as a standard part of bind9.

So, we just need a DLZ driver for ldb in Samba4? Not quite. DLZ has two major limitations that prevent us from solving the problems of Samba4 integration.

The first limitation is that DLZ drivers are all integrated directly into the bind9 source tree. This means to modify a DLZ driver you need to rebuild bind. Samba4 is undergoing rapid development and it is unlikely to be practical to ask users to wait for the next bind9 release before they can get a bugfix to the Samba DLZ driver. We also want Samba4 to be easy to install and use, which means asking users to patch bind9 themselves is not really going to work, especially once we try for wider deployment.

The solution is to write a DLZ driver which is a shim around other external DLZ drivers. To add this, I wrote a new DLZ driver “dlz dlopen” which uses dlopen() to attach to an external shared library implementing the SDLZ API. One of the key features of this driver is that it doesn’t require that the external drivers have access to the bind9 source code to build. This is achieved by passing named function pointers using a varargs interface, so the external driver doesn’t need direct access to any bind9 libraries. Using a simple string based API (which closely follows the existing SDLZ API in DLZ – thanks Rob!) means that access to the bind9 headers is also not required.
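The idea of passing helpers by name can be modelled in a few lines of Python. This is only a toy model of the technique, not the C interface itself: the entry point `create_driver` and the helper names "log" and "put_record" are made up for the illustration. The host passes a flat list of name/callable pairs ending in a terminator, and the driver picks out the helpers it knows by name, so it never needs the host's headers or libraries.

```python
def create_driver(name, *helpers):
    """Toy model of a driver entry point taking (name, callable) pairs.

    helpers is a flat sequence: "name1", func1, "name2", func2, ..., None.
    The trailing None plays the role of the NULL that ends a C varargs list.
    """
    callbacks = {}
    it = iter(helpers)
    for helper_name in it:
        if helper_name is None:
            break
        func = next(it, None)
        if func is None:
            raise ValueError("helper %r has no function" % helper_name)
        callbacks[helper_name] = func
    # The driver looks helpers up by name, so it needs no host headers.
    return {"name": name, "callbacks": callbacks}


messages = []
driver = create_driver("example", "log", messages.append, "put_record", print, None)
driver["callbacks"]["log"]("driver loaded")
```

In the C version the same pattern means the shared library only has to agree with the host on strings and function signatures.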

Then it was just a matter of adding a DLZ driver to the Samba4 tree and we could start serving queries from ldb directly through bind9.

dynamic updates and DLZ

This still left the 2nd problem. The DLZ driver API doesn’t support dynamic updates. Existing users of DLZ do updates directly to their database, not via DNS updates, whereas we need to support both direct database updates and DNS dynamic updates.

Adding update support to DLZ required a much more complex set of patches. This is where I really started to dive into the bind9 data structures, and it was fascinating. The bind9 code has some very nice features. The internal APIs are very well documented, and the coding style leans towards heavy use of asserts and very careful pool based memory management. It is really quite impressive, as long as you are willing to delve into some pretty esoteric data structures. I still don’t understand all of the data structures, but it seems that I can ignore some of them for what we need.

I finally got working DLZ updates this evening, and put together a sample driver and a set of patches. I’m hoping this will be useful for non-Samba users as well, and hopefully some of the other current DLZ drivers will be updated to take advantage of the new DLZ features.

Now to see if I can convince the bind9 developers to add this to the next release!

Thanks partner! A year of pair programming

December 2nd, 2010

This week marks the first anniversary of starting to do daily pair programming with another Samba developer, Andrew Bartlett. It has been an absolutely fantastic experience, and I thought this would be a good time to write up what we’ve been doing.

For many years I’ve been doing ad-hoc pair programming with various people. I have used a variety of techniques, including IRC combined with a shared GNU screen session, VNC sessions, NX sessions and lots of other combinations. What is different about the last year is that it wasn’t just a bit here and a bit there – we started working together on a daily basis.

We started off using long phone calls combined with a VNC session to share one of our screens. I originally used x11vnc to share my desktop, but we’ve more recently moved to using x0vncserver, part of the TigerVNC package. We’ve also moved from SIP phone calls to using a mumble server, which has allowed us to open up our coding sessions to other Samba developers, which has been very helpful.

daily coding sessions

The pattern we’ve got into now is that we both log in to mumble around breakfast time. Like Andrew I tend to eat breakfast in front of my computer, while having a quick look at sites like lwn and slashdot. Andrew usually joins mumble around the same time, and we chat about what we are going to work on that day. After we’ve both finished our cereal we dive into coding.

The most common pattern is that Andrew connects to my desktop over VNC, which I have shared over a VPN. I have a script which runs this command in a terminal:

x0vncserver QueryConnect=1 QueryConnectTimeout=30 AlwaysShared=1 PollingCycle=300 ZlibLevel=9 SecurityTypes=None \
     AcceptKeyEvents=off AcceptPointerEvents=off PasswordFile=$HOME/private/VNC/team.pass

that shares my current desktop read-only, and prompts me when someone connects. It also sets up a slow polling cycle, so the VNC session doesn’t chew too much bandwidth and our mumble VOIP session doesn’t degrade. It allows for multiple people to connect, which is really useful when Zahari or Kamen join in from Bulgaria, or Jelmer joins from the Netherlands.

audio setup

Andrew uses a fancy wireless headset which he can switch between being a pulseaudio device for mumble, and being a phone handset for when people ring him. The headset means he can walk around his house while we’re coding, which really suits him, as he often needs to help Kirsty out by hanging nappies on the line or doing a bit of cooking, and we can keep talking about the latest kerberos issue while he’s doing that.

I use a cheap desktop microphone (a Logitech one) along with desktop speakers. That means I don’t need to wear a headset all day, which is much more comfortable. If you adjust the mumble audio settings carefully you can avoid echo and avoid having the noise of typing come through, while still having really clear audio with a desktop microphone. I put the microphone on a rubber mouse mat to isolate it a bit from the desk.

writing code

With the above setup I just use my usual coding environment, which is emacs plus a bunch of GNU screen sessions in a gnome-terminal. Andrew watches me code and test, while we discuss the approach to each problem as it comes up, and he suggests different approaches. When we are working on a piece of code where Andrew is the expert (eg. all the auth code, kerberos etc), we often switch around so I’m connecting to his VNC server instead and I comment on his code while he is writing it. I use a command like this to connect:

 xtightvncviewer -passwd $HOME/private/VNC/abartlet.pass -viewonly -quality 3 -compresslevel 9 server

That gives me a pretty good view, while minimising bandwidth usage. When we need to share files, we either rsync them somewhere, or we push them to a personal branch on a git repository.

the results

The results over the last year have been really amazing. Between the two of us Andrew and I have pushed over 2500 patches to the Samba master repository over a year of pair programming, which is more than twice what we managed in the previous year. I find it really interesting that despite only one of us typing at a time, we get much more done with pair programming than when we work separately. The results are even more notable when you take into account that in the last year Andrew has been rebuilding his house and looking after a new baby!

I think the reason it works so well is that it tends to minimise procrastination. When I code alone and I’m stuck on a bit of code, I often find myself drifting off to read slashdot or muck about with some new application that I’ve found. That happens a lot less when someone else is watching over your shoulder on VNC. We discuss how we’re going to solve the problem and then we solve it, without the hours of procrastination in between.

The code also ends up as much higher quality, with far fewer bugs. Andrew is great at spotting subtle issues in my code while I’m typing it in, which saves a lot of debugging time.

It’s also interesting that we’ve found that pair programming works a lot better when we aren’t in the same room. We both live in Canberra, which means we could drive over and code right next to each other, but we find we just don’t get nearly as much done that way.

some highlights from the year

Pair programming is also just a lot more fun than programming alone. After having been a lone coder for 20 years, moving to pair programming was a revelation. By being on mumble all day, I get a lot more social interaction with other developers. I think it’s fair to say that most of my social interaction is now over the mumble VOIP link.

One really fun time was during the SNIA CIFS conference in Santa Clara in September and the AD plugfest the following week in Redmond. Andrew couldn’t travel as Celeste had just been born, so I took my desktop microphone with me and used it to allow Andrew to attend the plugfests remotely. That worked extremely well! We set up some speakers attached to my laptop, and again shared my screen with VNC. Andrew was able to participate fully in the plugfests, even if it meant he didn’t get any of the free food. His knowledge of authentication protocols was essential to many of the problems we faced during those two weeks.

There have also been lots of fun moments when Celeste has decided to chime into the conversations as only a 2 month old baby can, with Andrew often holding her on his lap while we are coding. At my end of the link I have a similar effect when my little dog Nessie (a king charles cavalier) comes into my study to ask for a biscuit by whining pathetically. She nearly always gets the biscuit.

Another amusing incident was when I got a letter from my SIP provider threatening to cut me off as I was costing them too much with these 8 hour untimed phone calls every day. That was before we discovered mumble, so I just switched to a different SIP provider.

The switch to mumble, which made it easy for us to have several people connected at once, was also a big advance. It allowed Andrew and me to help a number of new Samba developers find their feet with some very tricky code by working with them more directly than we could over IRC. We’ve spent many hours working with Nadya, Kamen, Anatoliy, Zahari, Kai and many others on some very tricky code, and it has helped them to become better Samba developers. It works well despite them being spread out all over the world.

give it a go!

If you haven’t tried pair programming then you really should give it a try. Find someone else working on the same project as yourself and see if your coding styles are compatible enough for shared screen coding sessions to work. I’m sure it won’t work for everyone, but when it does work it’s a fantastic way of making yourself more productive while having a lot of fun.

Thanks partner!

Finally I’d like to say a huge thank you to Andrew for being such a great coding partner for the last year. Your patience when I’m coding something badly has been fantastic, and your support when a test just refuses to pass has been brilliant. Thank you!

samba-tool gets closer to python only

November 28th, 2010

The ‘samba-tool’ command (which replaces the old ‘net’ command in Samba4) has slowly been migrating to python. It started off as a pure C program, but with the adoption of python as the primary scripting language for Samba4, we’ve been moving it one subcommand at a time to python.

Currently it is a C program that calls out to python for any subcommands not implemented in the C part of the code. As of last week we had just 5 subcommands left in C, with all the rest implemented using the very nice netcmd python framework that Jelmer did. That works, but it does mean the command line parsing is a bit of a mess, as command line options are first handled by popt in C, then handled again by python code. That leads to some odd behavior.
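As a rough illustration of why a single parser is cleaner, here is a small sketch of a python-only subcommand dispatcher in which each option is parsed exactly once. It uses the standard argparse module as a stand-in; it is not the netcmd framework, and the handler functions and the `action` argument are placeholders invented for this example.

```python
import argparse


def cmd_drs(args):
    return "drs: %s" % args.action   # placeholder for real replication code


def cmd_gpo(args):
    return "gpo: %s" % args.action   # placeholder for real GPO code


def build_parser():
    parser = argparse.ArgumentParser(prog="samba-tool")
    sub = parser.add_subparsers(dest="subcommand")
    for name, handler in (("drs", cmd_drs), ("gpo", cmd_gpo)):
        p = sub.add_parser(name)
        p.add_argument("action")
        p.add_argument("--verbose", action="store_true")
        # Each subcommand records its own handler, so dispatch is one lookup.
        p.set_defaults(func=handler)
    return parser


def main(argv):
    args = build_parser().parse_args(argv)
    if not hasattr(args, "func"):
        return "no subcommand given"
    return args.func(args)
```

With this shape, `main(["drs", "status"])` goes straight to the drs handler with its options already parsed, and there is no second layer that might interpret the same flags differently.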

Yesterday I decided to tackle a couple more of them, in the hope of finally getting rid of the C wrapper in the near future. I started with the ‘samba-tool drs’ commands, which Kamen had done in C earlier in the year. The drs subcommands allow admins to control and query DRS replication, and are a core piece of the command set for any Samba4 sysadmin. I was pleased to find I could re-do all of the drs subcommands in python using about 1/4 of the code, while gaining some better printing of options and flags.

GPO subcommand

I’ve now started on the samba-tool gpo subcommands, which are for administering Group Policy Objects. That mostly involves some simple LDAP calls, which python is really good at (via the samdb interface), but it will also need some file operations, which will finally give me the excuse to create python interfaces for CIFS file operations. Meanwhile, Andrew Bartlett is working on some token/access_check calls that I will need to test whether a user has access to a GPO object.

This effort is also a good chance for me to learn a bit more about administering GPOs. One of the challenges I have with Samba development is that I don’t actually have much experience as a Windows sysadmin, so I’ve rarely had to deal with the finer details of administering collections of GPOs. Rewriting our GPO admin tool in python should cure me of that deficiency.

After I’ve finished the conversion of ‘samba-tool’ GPO I think we’ll be ready to ditch the C wrapper for samba-tool. I think Jelmer should have the honor of doing that final git rm, as it was his efforts that started us down this track of converting the tool to python. It’s been a very worthwhile effort, but it has taken quite a long time!

w2k3 TSIG/GSS DNS updates

Hongwei (a very helpful Microsoft engineer) and I are still trying to find the cause of the failure of Windows2003 server to register some DNS records with bind9 using TSIG/GSSAPI. I’ve sent Hongwei some new TTT (‘Time Travel Trace’) logs of lsass.exe and svchost.exe in Windows2003 going through its initial registration on reboot. That combined with a parallel network capture should tell us what is going on. The really strange part is that the registration of the _udp, _tcp and _msdcs names works fine, but the registration of the A record for the machine doesn’t ever attempt to negotiate a TKEY needed for a TSIG/GSS DNS update. It is probably related to the split between the dhcp (svchost.exe) process on windows2003 doing the A records, and lsass.exe doing the other DNS names.

I wish we had something like TTT for Linux. It really is an amazing tool for debugging complex issues. When Hongwei gets a trace from me, he can load it into a debugger and navigate backwards and forwards in time in the trace, seeing exactly what is happening in a source level debugger. I’ve seen some pretty impressive demos of reversible debugging using valgrind on Linux, but nothing that yet matches what Hongwei can do with TTT.

replacing halogen downlights with LEDs

November 23rd, 2010

Our house, like many others, is plagued by horrible 12V halogen downlights. We have about 30 of them scattered through the house. I’ve been experimenting with various ways of replacing them for a while now. Apart from the amount of power they draw, they are also a fire hazard. I’ve noticed far too many of the bulbs are close to pieces of wood, and I hate to think what would happen if a rat nested on a little used bulb that was then switched on.

First attempt – GU10 compact fluorescents

My first attempt at fixing this was a couple of years ago, when I replaced 3 of the 12V transformers with 240V GU10 fittings and put in some compact fluoro bulbs. I was never very happy with them: they are slow to come on, and don’t give as much light as I’d hoped. I’ve left those in, but decided on a different approach for the rest.

LED MR16 bulbs

More recently I’ve been trying various LED MR16 12V downlight bulbs, which can go into the existing fittings with the existing transformers. I’ve tried 3 different LED bulbs so far:

Subjectively, the best_led bulbs are the brightest, and they are the first ones we’ve bought that pass the WAF for my wife. The dealextreme bulbs are very cheap, but are also quite dull. They are fine for the less important parts of the house, but no good for the kitchen and marginal for the lounge room. The ledcentral ones are better, but not as good as the best_led ones.

The other lesson is that LED bulb marketing is crazy. They all claim to be equivalent to a 50W halogen, but none of them are. The wattage and lumen claims are also very suspect.

Measuring the power usage

I used an Arlec power meter and a multimeter to measure the actual power for these 3 bulbs and a “50W” halogen. The results are shown below (all the numbers are quite approximate, as the Arlec meter only shows 2 digits for power).

Bulb               AC power (W)   AC current (A)   Power factor (%)   DC current (A)   DC power (W)
“50W” halogen      58             0.24             96                 3.65             44
“3x3W” bulb        12             0.12             40                 0.34             4
“6W” bulb          13             0.12             43                 0.37             4.5
“360 lumen” bulb   10             0.12             33                 0.26             3

All of the LED bulbs consume much less power than their rating. With LED bulbs, we have the strange marketing situation that a ‘green’ device is being advertised as consuming much more power than it actually does. If that “3x3W” bulb actually used 9W, then perhaps it would actually be a good 50W halogen replacement, but as it only actually uses 4W it doesn’t have much of a chance. LEDs are not that efficient.

I recently saw some 10W GU10 bulbs from ledcentral at Andrew Bartlett’s house, and they are really bright! They are the first LEDs I’ve seen which really could replace 50W halogens reasonably. Unfortunately they are currently $59 each, so I’ll be waiting for the price to drop before buying any of those.

I will also need to look at replacing my old transformers at some stage. With the LED bulbs I’m now using most of my power just heating the transformer. Some specialist LED transformers are probably worthwhile.
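The transformer overhead can be worked out from the measurements above. The sketch below assumes a nominal 12V at the bulb (which matches the “DC power” column to within rounding) and treats the difference between the mains-side power and the bulb power as loss before the bulb. The numbers are copied from the table; the function names are just for this illustration.

```python
SUPPLY_VOLTS = 12.0  # assumed nominal output of the downlight transformers

# (label, AC power in W, DC current in A) copied from the table above
MEASUREMENTS = [
    ("50W halogen", 58, 3.65),
    ("3x3W bulb", 12, 0.34),
    ("6W bulb", 13, 0.37),
    ("360 lumen bulb", 10, 0.26),
]


def dc_power(dc_amps, volts=SUPPLY_VOLTS):
    """Power delivered to the bulb: P = V * I."""
    return volts * dc_amps


def transformer_share(ac_watts, dc_amps):
    """Fraction of the mains power that never reaches the bulb."""
    return (ac_watts - dc_power(dc_amps)) / ac_watts


for label, ac_watts, dc_amps in MEASUREMENTS:
    print("%-15s bulb %.1f W, lost before the bulb %.0f%%"
          % (label, dc_power(dc_amps), 100 * transformer_share(ac_watts, dc_amps)))
```

On these approximate figures roughly two thirds of the mains power drawn by each LED bulb goes somewhere other than the bulb, compared with about a quarter for the halogen.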

In the meantime, the best_led bulbs seem to be the best choice so far. We’re pretty happy with 6 of them in the kitchen.

Testing against Windows with wintest

November 20th, 2010

The project for this week grew out of a need to check that the Samba4 HOWTO actually works. I’ve previously run through the steps of the HOWTO manually on a regular basis (well, as regularly as my schedule allows!), but as we get more features it was taking longer and longer to do manually. I also only had time to test a subset of the widely used versions of Windows out there. So this week I decided to solve the problem by writing a script to test the HOWTO automatically.

wintest – testing Samba against Windows

We’ve always tested Samba against Windows of course, and Samba also has a long tradition of highly automated testing, but we haven’t really cracked the problem of automatically testing against Windows before. It isn’t for lack of trying – Sam Liddicott put a lot of effort into this a couple of years back, and Jim McDonough has also tried, along with efforts by myself and others. The problem has been making it reliable enough and broad enough to be really useful.

So up to now the Samba testsuite has mostly concentrated on testing Samba using its own test suites against itself. We also validate those test suites against Windows, but that isn’t quite the same thing as directly testing Samba against Windows.

This week’s attempt resulted in a new test framework that I’ve called wintest. The framework is based on using the excellent pexpect system, along with a very old and crufty bit of technology – Windows telnet.

The test scripts read a config file which defines what Windows virtual machines you have, and how to access them. For example it provides the snapshot name, the commands to run your VM system (I use VirtualBox myself), the usernames and passwords etc etc. The scripts then run through a battery of tests, joining Samba as a DC and RODC to Windows DCs, joining Windows boxes to Samba domains, checking replication status, checking dynamic DNS etc etc.
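The test commands refer to config values using ${NAME} placeholders. As a minimal sketch of how such placeholders could be expanded, the standard string.Template class understands the same ${NAME} syntax. This is an assumption about the general approach, not a copy of the wintest code, and the config values shown are hypothetical.

```python
from string import Template

# Hypothetical settings; a real config would name your own VMs and realm.
CONFIG = {
    "WIN_HOSTNAME": "w2k8r2a",
    "HOSTNAME": "samba1",
    "LCREALM": "example.com",
    "DOMAIN": "EXAMPLE",
}


def expand(text, config=CONFIG):
    """Replace ${NAME} placeholders with values from the config."""
    # substitute() raises KeyError for an unknown name, which catches typos early.
    return Template(text).substitute(config)


print(expand("net use t: \\\\${HOSTNAME}.${LCREALM}\\test"))
```

Keeping machine names and passwords out of the test scripts this way means the same tests can run against any set of virtual machines described in the config file.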

Using a few wrappers around pexpect made the test system very expressive. For example, this snippet:

    child = t.open_telnet("${WIN_HOSTNAME}", "${DOMAIN}\\administrator", "${PASSWORD1}", set_time=True)
    child.sendline("net use t: \\\\${HOSTNAME}.${LCREALM}\\test")
    child.expect("The command completed successfully")

nicely checks that we can mount a share, while this one:

    t.retry_cmd("bin/smbclient -L ${WIN_HOSTNAME} -Utest2%${PASSWORD2} -k no",
                ['LOGON_FAILURE'])

waits for a user deletion to propagate to a Windows server via DRS replication.
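The retry pattern in that second snippet can be sketched as a small helper. This is not the wintest implementation of retry_cmd; it is a simplified stand-in where `run` is any callable that returns the command output as text, and the name `retry_until` and its parameters are invented for this example.

```python
import time


def retry_until(run, expected, retries=10, delay=1.0, sleep=time.sleep):
    """Call run() until its output contains every string in expected.

    Returns the matching output, or raises RuntimeError once the retries
    are used up.
    """
    output = ""
    for attempt in range(retries):
        output = run()
        if all(s in output for s in expected):
            return output
        if attempt + 1 < retries:
            sleep(delay)  # give replication time to catch up
    raise RuntimeError("expected %r not seen; last output: %r" % (expected, output))
```

Polling like this suits replication tests, because the change arrives after an unpredictable delay and a fixed sleep would be either too short to be reliable or too long to be pleasant.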

Lots of bugs found!

As is always the case with new test suites, I found heaps of bugs in Samba4. A particularly interesting one was a case of replication starvation where a single dead DC could cause us to stop replicating from all DCs, as notify events starved the running of the replication queue.

w2k3 dynamic DNS problems

The scripts also found one bug that I haven’t been able to fix yet. For each of the Windows virtual machines, the test checks that TSIG/GSS dynamic DNS updates work. I found that it works fine with WinXP, Windows7, Windows2008 and Windows2008R2, but fails for some DNS names for Windows2003. It is quite bizarre, and a couple of days of staring at traces hasn’t yet found the answer.

What happens is that Windows does its usual trick of first trying an unsigned DNS update, and expects to get a failure. That usually then triggers Windows to try a TSIG/GSS signed update. That happens fine for all the _msdcs names, but not for the A record of the Windows machine itself. For some strange reason when it gets the failure on the unsigned update it just gives up. I’ve asked Hongwei if there is any way we can get a TTT (Time Travel Trace) of this to see why Windows isn’t trying a TSIG update.