Ubuntu diskless fat client

Note: This site will be moved to http://linuxpc.info/.

This is work in progress. Last update: 12-11-2009 (updates for Karmic).

If you have any questions or suggestions, feel free to enter them in my guestbook.

This document assumes you have basic knowledge of the TCP/IP setup in your environment. The goal of this project is to create a server that diskless clients can boot from. Unlike in the LTSP project, all applications started on a diskless client run on that client, not on the server. In this setup the diskless client is not a lightweight machine serving as an X terminal (a thin client); it is a regular machine, except that it may lack a disk. Adding a new diskless client should be a matter of running one command. If you would rather do a custom install on each individual client, you should take a look at this and this. If you like the idea described here but are not in a hurry, you can wait for the LTSPFatClients project to finish. Currently it is just a spec, but the idea is similar, and I'm sure that once that project is finished it will require less work than the procedure I'm about to describe.

Ok, I see most of our audience has left now... but you're still here! Still with us? Ok, let's proceed :-)

Overview

In this doc our domain is mydomain.com. The default gateway also forwards DNS; its IP address is, let's say, 192.168.1.1. The server is 192.168.1.2 and the client is 192.168.1.3. The network is 192.168.1.0, the netmask is 255.255.255.0 and the broadcast address is 192.168.1.255. These values need to be adjusted to your environment. On the other hand, if you see 127.0.0.1 (i.e. localhost), leave it as-is.
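
If you are unsure which network and broadcast addresses belong to your own IP address and netmask, they can be computed with plain shell arithmetic. The following is only a sketch: the function names net_addr and bcast_addr are made up for this example and are not part of any tool used in this document.

```sh
# Hypothetical helpers (not part of the original setup).
# net_addr IP MASK   -> prints the network address
# bcast_addr IP MASK -> prints the broadcast address

net_addr() (
    # The function body runs in a subshell, so changing IFS stays local.
    IFS=.
    # Split both dotted quads: IP becomes $1..$4, MASK becomes $5..$8.
    set -- $1 $2
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

bcast_addr() (
    IFS=.
    set -- $1 $2
    # OR each octet with the inverted mask octet (255 - mask).
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
)
```

With the example values of this document, net_addr 192.168.1.3 255.255.255.0 prints 192.168.1.0 and bcast_addr 192.168.1.3 255.255.255.0 prints 192.168.1.255.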

Let's see what should happen when a diskless client starts up:

Install the software

Type:
aptitude install dhcp3-server tftpd-hpa nfs-kernel-server syslinux

Configure DHCP

In the early days of PXE, you would need a PXE server as well, but nowadays this is usually no longer required. If you need it (rare), see /usr/share/doc/syslinux/pxelinux.doc.

Make sure any existing DHCP servers (routers!) do not respond when your diskless client asks for an IP address.

Create a new /etc/dhcp3/dhcpd.conf, like this one:

ddns-update-style none;
subnet 192.168.1.0 netmask 255.255.255.0 {}

# Standard configuration directives...

option subnet-mask 255.255.255.0;
option broadcast-address 192.168.1.255;
option routers 192.168.1.1;
option domain-name "mydomain.com";
option domain-name-servers 192.168.1.1;

filename "/pxelinux.0";
next-server 192.168.1.2;
In addition, one of our readers suggests using the following option if your NFS server uses large packets:
option interface-mtu 9000;

The domain and IP addresses used in this chapter need to be adjusted to your environment. Later on, an entry for each diskless client will be added to this file. Such an entry will look like this:

host my1stclient.mydomain.com {
   hardware ethernet 00:e0:18:37:44:ea;
   fixed-address 192.168.1.3;
}
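
Since such an entry is needed for every client, it can be generated instead of typed. Here is a minimal sketch; dhcp_host_entry is a hypothetical function name, and the actual addfat script may do this differently.

```sh
# Hypothetical helper (an illustration, not taken from addfat).
# dhcp_host_entry HOST DOMAIN MAC IP -> prints a host block for dhcpd.conf
dhcp_host_entry() {
    printf 'host %s.%s {\n' "$1" "$2"
    printf '   hardware ethernet %s;\n' "$3"
    printf '   fixed-address %s;\n' "$4"
    printf '}\n'
}
```

The output could be appended to /etc/dhcp3/dhcpd.conf, for example with dhcp_host_entry my1stclient mydomain.com 00:e0:18:37:44:ea 192.168.1.3 >> /etc/dhcp3/dhcpd.conf, after which the DHCP server has to be restarted.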

Configure pxelinux

The bootloader pxelinux (part of syslinux) can download a text file to display and then ask the user what to do. Depending on the user's choice, it either downloads the kernel using TFTP and starts it, or passes control back to the BIOS so the machine can boot from local media, such as a hard drive. A more reliable way to boot from local media is to use the chainloader chain.c32. Documentation for pxelinux can be found in /usr/share/doc/syslinux/pxelinux.doc. We do not use pxegrub because it needs a hardware-dependent driver for each type of network card, whereas pxelinux uses UNDI, which is hardware independent. Basically this means that all PXE network cards (including future ones) will be supported automatically. Using menu.c32 you can create cursor-controlled menus, which I will show later on.

First we have to copy the kernel, the bootloader, its menu program and the chainloader to the TFTP server's root directory. Do this:

mkdir -p /tftpboot/boot/pxelinux.cfg
cd /tftpboot/boot
cp -i /boot/vmlinuz-`uname -r` ./kernel
cp -i /usr/lib/syslinux/pxelinux.0 .
cp -i /usr/lib/syslinux/menu.c32 .
cp -i /usr/lib/syslinux/chain.c32 .
cd pxelinux.cfg
Now create a file called default, which provides the contents of the menu. Here is an example that I use at our department, giving the user a choice between 4 servers and a local hard drive with another operating system.
DEFAULT menu.c32
PROMPT 0
MENU TITLE Network boot menu
TIMEOUT 30

LABEL 1
   MENU LABEL Linux server #^1 (Slackware)
   MENU DEFAULT
   KERNEL 192.168.1.9::/kernel
   APPEND ro ip=bootp root=/dev/nfs nfsroot=192.168.1.9:/tftpboot/root/%s

LABEL 2
   MENU LABEL Linux server #^2 (Slackware)
   KERNEL 192.168.1.10::/X86PC/pxelinux/vmlinuz-2.6.9
   APPEND ro ip=bootp root=/dev/nfs nfsroot=192.168.1.10:/tftpboot/root/%s

LABEL 3
   MENU LABEL Linux server #^3 (Gentoo)
   KERNEL 192.168.1.11::/kernel
   APPEND ro ip=dhcp root=/dev/nfs nfsroot=192.168.1.11:/tftpboot/root/%s

LABEL 4
   MENU LABEL Linux server #^4 (Ubuntu)
   KERNEL kernel
   APPEND ro ip=dhcp initrd=boot.img nfsroot=192.168.1.2:/tftpboot/root/%s

LABEL windblows
   MENU LABEL Microsoft ^Windblows
   KERNEL chain.c32
   APPEND hd0 0
The character following each caret (^) is shown in bold and can be used as a keyboard shortcut. In this example the user can type 1, 2, 3, 4 or W to jump to the corresponding entry, or use the cursor keys.

There is one problem with Ubuntu, though. In Ubuntu the nfsroot option is not handled by the kernel (which is why the root=/dev/nfs option can be left out for the Ubuntu server). Instead it is handled by the initramfs ramdisk, in a different way: the IP address is mandatory, and the system cannot replace the %s with the client's IP address, so you would have to hardcode it. That is why, as you will see later on, the script that creates the client's root filesystem (addfat) also creates a modified copy of the menu with the %s already replaced by the client's IP address. This copy is named after the client's MAC address, so the bootloader tries it first before falling back to the original default menu. This workaround works.
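
To give an idea of what this workaround involves, here is a rough sketch. It relies on the pxelinux convention that a per-client configuration file is named 01- followed by the MAC address in lowercase, with dashes instead of colons. The function names pxe_cfg_name and make_client_menu are invented for this example; the real addfat script may be organized differently.

```sh
# Hypothetical helpers (an illustration, not taken from addfat).

# pxe_cfg_name MAC -> prints the per-client file name pxelinux looks for,
# e.g. 01-00-e0-18-37-44-ea for 00:E0:18:37:44:EA
pxe_cfg_name() {
    # Lowercase the hex digits and turn the colons into dashes.
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

# make_client_menu DEFAULT_FILE IP -> prints the menu with every %s
# replaced by the client's IP address
make_client_menu() {
    sed "s/%s/$2/g" "$1"
}
```

Run from /tftpboot/boot/pxelinux.cfg, something like make_client_menu default 192.168.1.3 > "$(pxe_cfg_name 00:e0:18:37:44:ea)" would then create the client-specific copy of the menu.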

Here is a screenshot of a diskless client showing the menu. (soon to come...).

Configure TFTPD-HPA

Edit /etc/default/tftpd-hpa. It should look like this:
RUN_DAEMON="yes"
OPTIONS="-l -s /tftpboot/boot"
Run /etc/init.d/tftpd-hpa start

Configure initramfs

Ubuntu uses a ramdisk with the kernel modules on it (among other things). A big advantage of this method is that you don't need to recompile the kernel each time the hardware changes. The drivers for the network cards do not have to be part of the kernel image: they can still be modules.

To get this to work in our diskless setup we need to change

BOOT=local
to
BOOT=nfs
in /etc/initramfs-tools/initramfs.conf. Next, type:
mkinitramfs -o /tftpboot/boot/boot.img
This creates a new ramdisk image, suitable for netbooting, in the TFTP root directory so the kernel can find it.

Change

BOOT=nfs
back to
BOOT=local
in /etc/initramfs-tools/initramfs.conf.
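
Because it is easy to forget to switch BOOT back to local, the three steps can be wrapped in a small script. This is only a sketch under the assumption that the file contains a line starting with BOOT=; the function names set_boot_mode and build_netboot_initrd are made up for this example.

```sh
# Hypothetical helpers (not part of the original setup).

# set_boot_mode CONF MODE -> rewrites the BOOT= line in CONF to BOOT=MODE
set_boot_mode() {
    sbm_tmp=$(mktemp) || return 1
    # Write back through the original file so its permissions are kept.
    sed "s/^BOOT=.*/BOOT=$2/" "$1" > "$sbm_tmp" && cat "$sbm_tmp" > "$1"
    sbm_rc=$?
    rm -f "$sbm_tmp"
    return $sbm_rc
}

# build_netboot_initrd CONF OUT -> builds OUT with BOOT=nfs, then
# restores BOOT=local even if mkinitramfs failed
build_netboot_initrd() {
    set_boot_mode "$1" nfs || return 1
    mkinitramfs -o "$2"
    build_rc=$?
    # Always switch back so the server's own initramfs keeps booting from disk.
    set_boot_mode "$1" local
    return $build_rc
}
```

As root, build_netboot_initrd /etc/initramfs-tools/initramfs.conf /tftpboot/boot/boot.img would then perform the steps described above in one go.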

If booting fails later on, try adding the relevant network module to /etc/initramfs-tools/modules. Example file:

sky2

Create the NFS root

Usually a single root filesystem is created and shared among all clients. Not in this setup: each client will have its own root filesystem. This requires more disk space than a shared NFS root, and each copy needs to be kept in sync with the server's root, but it also eliminates the conflicts, race conditions and other weird problems you get with a shared NFS root.

Because a new NFS root has to be set up for each client, I made a script for this, currently named addfat. This works perfectly for me, but please review the script and modify it before you actually use it. It contains a couple of potentially dangerous rm commands that could screw up your system. Anyway, it basically should do the following things:

Configure portmap

If necessary comment out this line in /etc/default/portmap:
OPTIONS="-i 127.0.0.1"
(place a # at the beginning of this line.)

Configure NFS

Add this to /etc/exports:
/opt                    192.168.1.0/24(ro,async,no_root_squash,no_subtree_check)
/usr                    192.168.1.0/24(ro,async,no_root_squash,no_subtree_check)
/home                   192.168.1.0/24(rw,sync,no_root_squash,no_subtree_check)
/tftpboot/root          192.168.1.0/24(rw,async,no_root_squash,no_subtree_check)
Run /etc/init.d/nfs-kernel-server restart.

Important note

Recent kernels (2.6.17 and newer), used in Edgy and later, have an NFS bug that may affect the stability of the diskless clients. If applications start to freeze or crash, or if you get write or permission denied error messages when there is in fact no error, make sure you have enabled the option no_subtree_check in /etc/exports as shown above to work around the problem. Then restart the NFS server.
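
A quick way to spot exports that still lack the option is to filter the file with grep. The sketch below assumes one export per line; exports_without_option is a hypothetical name chosen for this example.

```sh
# Hypothetical helper (not part of the original setup).
# exports_without_option FILE OPTION -> prints the export lines in FILE
# that do not mention OPTION
exports_without_option() {
    # Drop blank lines and comments, then keep the lines lacking OPTION.
    # -w matches whole words only, so subtree_check does not count as
    # no_subtree_check.
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | grep -v -w -e "$2"
}
```

Running exports_without_option /etc/exports no_subtree_check prints nothing when every export carries the option.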

Automatic configuration of Xorg

I'm working on that. The problem is /etc/X11/xorg.conf has a couple of "hardware dependent" lines. We need to automatically edit or regenerate this file during boot.

Make sure NFS file locks are working

If NFS file locks are not working, KDE may hang during startup and OpenOffice.org will probably abort with an error message. Though the needed daemons (lockd and rpc.statd) should be started automatically, there may be a problem with hostname resolution. In short, your hostname should be an FQDN, i.e. include a domain, like myhost.mydomain.com, not just myhost. So make sure /etc/hostname contains an FQDN. Also, in /etc/hosts the first name following the IP address on each line should be an FQDN. A short alias may be added after the FQDN. Here is an example entry from /etc/hosts:
192.168.1.3    my1stclient.mydomain.com my1stclient
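
Both conditions can be checked with a few lines of shell. This is a minimal sketch that simply treats any name with a dot between two non-empty parts as an FQDN; is_fqdn and first_host_name are hypothetical names used only here.

```sh
# Hypothetical helpers (not part of the original setup).

# is_fqdn NAME -> succeeds if NAME contains a domain part
is_fqdn() {
    case "$1" in
        .*|*.|*..*) return 1 ;;   # leading dot, trailing dot or empty label
        *.*) return 0 ;;          # at least one dot between labels
        *) return 1 ;;            # short name without a domain
    esac
}

# first_host_name FILE IP -> prints the first name listed after IP
# in a hosts file
first_host_name() {
    awk -v ip="$2" '$1 == ip { print $2; exit }' "$1"
}
```

For example, is_fqdn "$(cat /etc/hostname)" checks the hostname, and is_fqdn "$(first_host_name /etc/hosts 192.168.1.3)" checks the entry of the first client.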

Configure Cups

In /etc/cups/cupsd.conf, change the following line from:
Listen localhost:631
to:
Listen *:631

Change the following line from:

Browsing off
to:
Browsing on

and add the following line somewhere at the top:

ServerName 192.168.1.2

Add the user cupsys to the group shadow, using the following command:

adduser cupsys shadow

Restart cupsd using the following commands:

/etc/init.d/cups stop
/etc/init.d/cups start

Add the printers using the web interface at http://localhost:631/. Print a test page. In case of problems, check for conflicting CUPS servers advertising the same printers. These printers show up with /dev/null URIs, which are usually wrong. Also check that ServerName is correct, otherwise IPP processes will keep retrying to resolve and connect forever. You can check for such processes using ps waxu.

Configure OpenLDAP and PHPLDAPadmin (updated)

Because each client has its own root, it also has its own /etc/passwd, so if a user changes his/her password on a client, the new password is only known on that particular client; the other clients still use the old password. To overcome this we need a centralized password file on the server which can be accessed by each client. We use LDAP for this. In the past I used NIS, but LDAP looks more promising. The LDIF files LDAP uses can easily be generated and edited using scripts. LDAP integrates nicely with PAM. There is also a user-friendly web interface called phpldapadmin.

Setting up LDAP is probably the hardest part of this document. You may choose to skip this chapter, leaving each diskless client with its own /etc/passwd. And then of course there are other alternatives besides LDAP and NIS, like Kerberos. The choice is entirely yours!

Let's install the software:

aptitude install slapd ldap-utils libpam-ldap libpam-smbpass libnss-ldap \
   migrationtools phpldapadmin
In case you're wondering what all of this does, simply said:

When installing libnss-ldap the system will ask for the IP of the LDAP server, which in our case is our server's IP address. It will also ask for the distinguished name of the search base. The what? :-) Let's say it's a bit like the search domain in DNS, which in our case would be mydomain.com, in LDAP written as dc=mydomain,dc=com. You will have to replace mydomain and com to suit your needs, but you need to keep the rest: it's the LDAP way of specifying things. Note that this has no relationship with DNS, so you don't need to keep the LDAP dc components in sync with FQDNs. You can just invent another name for your LDAP setup.
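
If you do choose to derive the search base from your DNS domain, the conversion is mechanical. The following sketch shows it; domain_to_base_dn is a made-up name, and nothing in LDAP requires you to build the base this way.

```sh
# Hypothetical helper (not part of the original setup).
# domain_to_base_dn DOMAIN -> prints the matching search base,
# e.g. dc=mydomain,dc=com for mydomain.com
domain_to_base_dn() {
    # Prefix the first label with dc= and turn every dot into ",dc=".
    printf 'dc=%s\n' "$(printf '%s' "$1" | sed 's/\./,dc=/g')"
}
```

So domain_to_base_dn mydomain.com prints dc=mydomain,dc=com, the value used throughout this document.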

Answer No when asked if you would like to create a local root database, and No to the option "Database requires logging in".

The configuration file for slapd is /etc/ldap/slapd.conf. Edit it (only relevant lines shown):

# Use md5 to hash the passwords
password-hash {md5}

TLSCertificateFile /etc/ssl/certs/myip.pem
TLSCertificateKeyFile /etc/ssl/private/myip.key
TLSCACertificateFile /etc/ssl/certs/myip.pem

suffix          "dc=mydomain,dc=com"

rootdn          "cn=Manager,dc=mydomain,dc=com"
rootpw          {MD5}ahdhgirheiufhiusfhWtjw==

access to attrs=userPassword
        by dn="uid=root,ou=People,dc=mydomain,dc=com" write
        by anonymous auth
        by self write
        by * none

access to *
        by dn="uid=root,ou=People,dc=mydomain,dc=com" write
        by * read
The MD5 hash above needs to be replaced by the hash returned by the following command:
slappasswd -h {md5}
When asked for a password, you need to type the new password for the LDAP Manager, which is like the "root" user in Unix. Just invent a password and remember it.
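
In case you wonder what the string behind rootpw contains: an {MD5} value is the base64 encoding of the raw MD5 digest of the password. The sketch below reproduces that with openssl, purely as an illustration. The function name ldap_md5_hash is invented, it assumes the openssl command is installed, and it takes the password as an argument (visible in the process list), so use slappasswd for real passwords.

```sh
# Hypothetical helper, for illustration only; use slappasswd in practice.
# ldap_md5_hash PASSWORD -> prints {MD5} followed by the base64-encoded
# binary MD5 digest of PASSWORD
ldap_md5_hash() {
    printf '{MD5}%s\n' "$(printf '%s' "$1" | openssl dgst -md5 -binary | openssl base64)"
}
```

For the password secret this prints {MD5}Xr4ilOzQ4PCOq3aQ0qbuaQ==.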

The configuration file for the LDAP clients in ldap-utils is /etc/ldap/ldap.conf. Edit it. It should look like this:

BASE            dc=mydomain,dc=com
URI             ldaps://192.168.1.2:636/
TLS_REQCERT     allow

Please note that in Ubuntu 8.04 a hostname is required; an IP address is not good enough for secure connections.

Create a self-signed certificate and key pair. Type:

cd /etc/ssl
openssl req -new -x509 -nodes \
   -out certs/myip.pem -keyout private/myip.key -days 999999
When asked for CommonName type the IP of your LDAP-server, in our case 192.168.1.2.

Edit /etc/default/slapd so that it contains this line:

SLAPD_SERVICES="ldaps:/// ldap://127.0.0.1/"
Type:
/etc/init.d/slapd restart
Test the LDAP server:
ldapsearch -D "cn=Manager,dc=mydomain,dc=com" -W -x
ldapsearch -H 'ldaps://192.168.1.2/' -D "cn=Manager,dc=mydomain,dc=com" -W -x
ldapsearch -H 'ldap://127.0.0.1/' -D "cn=Manager,dc=mydomain,dc=com" -W -x
ldapsearch -W -x
You need to type the LDAP Manager password for the first 3 commands. At the last command just press Enter: here we test the anonymous login. In Ubuntu 8.04 you will need the hostname in the second command instead of the IP address. If you see something like:
result: 32 No such object
that's ok.

Go to /usr/share/migrationtools and edit migrate_common.ph. In Ubuntu 8.04 this file has been moved to /usr/share/perl5.

# $DEFAULT_MAIL_DOMAIN = "padl.com";
$DEFAULT_BASE = "dc=mydomain,dc=com";
# $DEFAULT_MAIL_HOST = "mail.padl.com";
$EXTENDED_SCHEMA = 1;
Type:
./migrate_base.pl > /tmp/base.ldif
./migrate_passwd.pl /etc/passwd /tmp/passwd.ldif
./migrate_group.pl /etc/group /tmp/group.ldif
ldapadd -D "cn=Manager,dc=mydomain,dc=com" -W -x -f /tmp/base.ldif
ldapadd -D "cn=Manager,dc=mydomain,dc=com" -W -x -f /tmp/passwd.ldif
ldapadd -D "cn=Manager,dc=mydomain,dc=com" -W -x -f /tmp/group.ldif
rm /tmp/base.ldif /tmp/passwd.ldif /tmp/group.ldif
Note that ldapadd has an option -c to continue after errors (instead of aborting).

In the next section, 2 files must be edited: /etc/libnss-ldap.conf and /etc/pam_ldap.conf. However in Ubuntu 8.04 these 2 files have been merged into /etc/ldap.conf.

Edit /etc/libnss-ldap.conf:

These changes are needed for security by using ldaps:// (with encryption) instead of ldap:// (without encryption).

Do the same thing with /etc/pam_ldap.conf:

Edit /etc/nsswitch.conf, changing these 3 lines:

passwd:         files ldap
group:          files ldap
shadow:         files ldap
Check if LDAP works with NSS:
getent passwd | grep 0:0
You should see 2 identical lines, one coming from /etc and one coming from LDAP.

Setting up PAM to work with LDAP (and Samba)

We would like to put the user account information in the LDAP. Samba (used if you would like to share, for example, the home directories with Windows clients) uses its own password database, so we would like to sync that with LDAP automatically. If you don't want this, leave out the lines containing pam_smbpass.so.

Edit /etc/pam.d/common-auth:

auth     sufficient   pam_unix.so nullok_secure
auth     requisite    pam_ldap.so use_first_pass
auth     optional     pam_smbpass.so migrate
Edit /etc/pam.d/common-account:
account   sufficient   pam_unix.so
account   required     pam_ldap.so
Edit /etc/pam.d/common-password:
password   sufficient   pam_unix.so nullok obscure min=4 max=8 md5
password   requisite    pam_ldap.so
password   optional     pam_smbpass.so nullok try_first_pass use_authtok
Edit /etc/pam.d/common-session:
session   required   pam_env.so readenv=1
session   required   pam_unix.so
session   optional   pam_ldap.so
session   required   pam_mkhomedir.so skel=/etc/skel/ umask=077
session   optional   pam_foreground.so
The first line has nothing to do with LDAP. It's there to read /etc/environment even when logging in using KDM. The home directory will be created automatically if it does not exist.

In order to turn off the caching of hosts, edit this line in /etc/nscd.conf:

enable-cache            hosts           no
Edit the following lines in /usr/share/phpldapadmin/config/config.php:
$config->custom->session['blowfish'] = 'some random string';
$ldapservers->SetValue($i,'server','name','NUGE LDAP Server');
$ldapservers->SetValue($i,'server','host','ldap://127.0.0.1/');
$ldapservers->SetValue($i,'server','port','389');
$ldapservers->SetValue($i,'server','base',array('dc=mydomain,dc=com'));
$ldapservers->SetValue($i,'server','auth_type','session');
$ldapservers->SetValue($i,'login','dn','');
$ldapservers->SetValue($i,'login','pass','');
$ldapservers->SetValue($i,'server','tls',false);
$ldapservers->SetValue($i,'server','low_bandwidth',false);
$ldapservers->SetValue($i,'appearance','password_hash','md5');
$ldapservers->SetValue($i,'login','attr','dn');
Replace 'some random string' with something really random :-) Keep the IP 127.0.0.1 as-is (localhost).

Assuming you already have Apache, PHP and a web browser installed, start a web browser and go to https://localhost/phpldapadmin/. Log in as cn=Manager,dc=mydomain,dc=com. You can also log in anonymously, but then you cannot see the password hashes and you cannot change anything.

This is a perfect moment to clean up the LDAP. Remove everything except Group and People.

Check if LDAP works with PAM. When you change a password, the corresponding password hash in the LDAP should also change, and you should be able to use this new password anywhere.

Adding a diskless fat client

Adding a client is just a matter of running addfat!
# addfat.sh
This program adds a workstation to the system
Enter hostname (without domain): linuxpc3
Enter IP-address (example 123.45.67.89): 192.168.1.3
Enter MAC-address (example 11:22:33:aa:bb:cc): 00:e0:18:35:53:ea
Copying files... tar: var/run/xdmctl/dmctl-\:0/socket: socket ignored
tar: var/run/xdmctl/dmctl/socket: socket ignored
tar: var/run/mysqld/mysqld.sock: socket ignored
tar: var/run/dbus/system_bus_socket: socket ignored
tar: var/run/cups/cups.sock: socket ignored
tar: var/run/acpid.socket: socket ignored
tar: var/run/nscd/socket: socket ignored
tar: dev/log: socket ignored
Done
Configuring system...
Generating public/private dsa key pair.
Your identification has been saved in etc/ssh/ssh_host_dsa_key.
Your public key has been saved in etc/ssh/ssh_host_dsa_key.pub.
The key fingerprint is:
fb:db:2b:41:61:74:75:14:23:00:e0:02:30:05:0f:57 root@linuxpc3.mydomain.com
Generating public/private rsa key pair.
Your identification has been saved in etc/ssh/ssh_host_rsa_key.
Your public key has been saved in etc/ssh/ssh_host_rsa_key.pub.
The key fingerprint is:
2f:aa:5f:53:56:a6:bd:84:7c:e7:6b:a7:6a:0d:34:44 root@linuxpc3.mydomain.com
Done
Restarting dhcpd:
Internet Systems Consortium DHCP Server V3.0.3
Copyright 2004-2005 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/
 * Stopping DHCP server                                            [ ok ]
Internet Systems Consortium DHCP Server V3.0.3
Copyright 2004-2005 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/
 * Starting DHCP server:                                           [ ok ]
Done
You can safely ignore the messages about the ignored sockets.

Possible problems

Problem: client keeps booting from other media
Solution: Go to the BIOS setup and check the boot device priority. If using onboard LAN, check that it is enabled and that the LAN boot ROM is enabled. If not, enable it, save, reboot, re-enter the BIOS setup, and check the boot device priority again. Some NICs have their own setup; for example, 3COM uses ctrl-alt-b to go to the NIC setup. Check that the boot method is PXE.

Problem: client keeps waiting for DHCP response
Solution: Check using ps if the DHCP server is running. Check that it is the only one running (disable any other DHCP servers on the network that respond to the client). Start dhcpd by hand in debug mode, using the -d option, to see what is happening.

Problem: client keeps waiting for TFTP response
Solution: Check using ps if the TFTP server is running. Check the next-server option in /etc/dhcp3/dhcpd.conf. This should point to the TFTP server.

Problem: client cannot find boot file
Solution: Check in /etc/default/tftpd-hpa that the root path after the -s option exists. Check that the boot file exists and is readable by other users.

Problem: kernel cannot mount root filesystem
Solution: Try adding the relevant network module to /etc/mkinitramfs/modules (Dapper) or /etc/initramfs-tools/modules (Edgy). See the section about initramfs. Check that nfsd is running. Check that /etc/exports is set up correctly. Check the menu files under /tftpboot/boot/pxelinux.cfg to see if the nfsroot option is specified correctly.

Problem: client keeps waiting for /usr
Solution: Check /etc/exports to see if /usr is exported correctly. Check that /tftpboot/root/*/etc/rcS.d/S44mountnfs.sh exists.

Problem: KDE hangs. OpenOffice.org exits with an error
Solution: Probably an NFS file locking issue. NFS file locking is done partly in the kernel and partly in user space (rpc.statd). The two parts resolve hostnames in different ways, as the kernel cannot use the standard C library for this. If the nodename the kernel uses differs from what is in /etc/hosts, for example because the nodename includes a domain and the hostname in /etc/hosts does not, things will break. Check that /etc/hostname contains an FQDN (i.e. includes a domain). Check that all hostnames specified directly after the IP address are FQDNs.

Problem: applications start to freeze or crash; write or permission denied error messages appear when there is in fact no error
Solution: NFS bug found in kernels used in Edgy (2.6.17). Add the no_subtree_check option in /etc/exports.