Testing against Windows with wintest

The project for this week grew out of a need to check that the Samba4 HOWTO actually works. I’ve previously run through the steps of the HOWTO manually on a regular basis (well, as regularly as my schedule allows!), but as we added more features it was taking longer and longer. I also only had time to test a subset of the widely used versions of Windows out there. So this week I decided to solve the problem by writing a script to test the HOWTO automatically.

wintest – testing Samba against Windows

We’ve always tested Samba against Windows of course, and Samba also has a long tradition of highly automated testing, but we haven’t really cracked the problem of automatically testing against Windows before. It isn’t for lack of trying – Sam Liddicott put a lot of effort into this a couple of years back, and Jim McDonough has also tried, along with efforts by myself and others. The problem has been making it reliable enough and broad enough to be really useful.

So up to now the Samba testsuite has mostly concentrated on testing Samba using its own test suites against itself. We also validate those test suites against Windows, but that isn’t quite the same thing as directly testing Samba against Windows.

This week’s attempt resulted in a new test framework that I’ve called wintest. The framework is based on the excellent pexpect system, along with a very old and crufty bit of technology – Windows telnet.

The test scripts read a config file which defines what Windows virtual machines you have, and how to access them. For example it provides the snapshot name, the commands to run your VM system (I use VirtualBox myself), the usernames and passwords, and so on. The scripts then run through a battery of tests: joining Samba as a DC and RODC to Windows DCs, joining Windows boxes to Samba domains, checking replication status, checking dynamic DNS, etc.
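To give a feel for how a config file like this can drive the `${NAME}` variables used in the test snippets, here is a minimal sketch. It is not the wintest code: the `NAME : value` line format, the helper names `load_config` and `substitute`, and all the example values are assumptions made for illustration. It only relies on Python's standard `string.Template`, which happens to use the same `${NAME}` syntax.

```python
from string import Template


def load_config(text):
    """Parse 'NAME : value' lines into a dict (the line format is an assumption).

    Blank lines and lines starting with '#' are skipped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed config line: %r" % line)
        settings[name.strip()] = value.strip()
    return settings


def substitute(text, settings):
    """Expand ${NAME} references; an unknown name raises KeyError."""
    return Template(text).substitute(settings)


# Hypothetical values, for illustration only.
EXAMPLE = """
# one Windows VM and one Samba host
WIN_HOSTNAME : w2k8r2a
HOSTNAME : samba1
LCREALM : example.test
"""

settings = load_config(EXAMPLE)
command = substitute(r"net use t: \\${HOSTNAME}.${LCREALM}\test", settings)
print(command)
```

Raising on an unknown variable, rather than silently leaving `${NAME}` in place, makes a typo in a test script fail immediately instead of sending a nonsense command to a VM.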

Using a few wrappers around pexpect made the test system very expressive. For example, this snippet:

    child = t.open_telnet("${WIN_HOSTNAME}", "${DOMAIN}\\administrator", "${PASSWORD1}", set_time=True)
    child.sendline("net use t: \\\\${HOSTNAME}.${LCREALM}\\test")
    child.expect("The command completed successfully")

nicely checks that we can mount a share, while this one:

    t.retry_cmd("bin/smbclient -L ${WIN_HOSTNAME} -Utest2%${PASSWORD2} -k no",
                ['LOGON_FAILURE'])

waits for a user deletion to propagate to a Windows server via DRS replication.
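The retry idea is simple enough to sketch in a few lines. The version below is an illustration only, not the wintest implementation: the `retries`, `delay`, `run` and `sleep` parameters and the `run_shell` helper are assumptions. The `run` and `sleep` hooks exist so the loop can be exercised without a real Windows server.

```python
import subprocess
import time


def run_shell(cmd):
    """Run cmd through the shell and return stdout and stderr combined."""
    result = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.stdout


def retry_cmd(cmd, contains, retries=30, delay=2.0,
              run=run_shell, sleep=time.sleep):
    """Re-run cmd until its output contains every string in `contains`.

    Returns the matching output, or raises RuntimeError once `retries`
    attempts have been used up.
    """
    output = ""
    for attempt in range(retries):
        output = run(cmd)
        if all(expected in output for expected in contains):
            return output
        # No point sleeping after the final attempt.
        if attempt + 1 < retries:
            sleep(delay)
    raise RuntimeError("gave up on %r after %d attempts; last output: %r"
                       % (cmd, retries, output))
```

Polling like this suits replication tests because the change arrives at some unknown point after the command that triggered it, so a single check straight away would fail spuriously.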

Lots of bugs found!

As is always the case with new test suites, I found heaps of bugs in Samba4. A particularly interesting one was a case of replication starvation where a single dead DC could cause us to stop replicating from all DCs, as notify events starved the running of the replication queue.

w2k3 dynamic DNS problems

The scripts also found one bug that I haven’t been able to fix yet. For each of the Windows virtual machines, the test checks that TSIG/GSS dynamic DNS updates work. I found that it works fine with WinXP, Windows7, Windows2008 and Windows2008R2, but fails for some DNS names for Windows2003. It is quite bizarre, and a couple of days of staring at traces hasn’t yet found the answer.

What happens is that Windows does its usual trick of first trying an unsigned DNS update, and expects to get a failure. That usually then triggers Windows to try a TSIG/GSS signed update. That happens fine for all the _msdcs names, but not for the A record of the Windows machine itself. For some strange reason when it gets the failure on the unsigned update it just gives up. I’ve asked Hongwei if there is any way we can get a TTT (Time Travel Trace) of this to see why Windows isn’t trying a TSIG update.
