How to Lose your Job with SSH, part 1

A less sensational title for this post would have been “SSH Remote Forwarding,” but that’s not nearly as fun.

I used to be responsible for one of the few entry points into a global network. The company had actual manufacturing secrets — their products included various machines of war. We had internal firewalls to protect sites from each other, even when the site didn’t have Internet access. All Internet connections had to go through proxies. We did not allow external DNS to reach the desktop. If you typed ping google.com on your desktop, you’d get a “host unknown” error. The company had invested in VPN technology that blocked all but approved clients, and only permitted clients on the approved list if they were running the approved anti-virus scanners and other security software.

I was frequently asked to open direct Internet access for random applications. Most of these requests were rejected unless the user could explain what they wanted and why it was business-critical. Some of the people who asked were technically literate, and became indignant when I rejected their request for outbound SSH to external servers or equipment. After all, SSH is “Secure Shell.” It says secure RIGHT IN THE NAME. They just want to check their personal email. How could I possibly reject this eminently sensible request?

I’ve since left this job, but I’ve had the same discussion more than once afterwards.

Most SSH users have no idea of SSH’s flexibility. Arbitrary SSH connections are a nightmare for maintaining any sort of secure information perimeter. Remote port forwarding is one reason why.

When most people mention SSH port forwarding, they’re thinking of local port forwarding. You forward a TCP port on your client to a TCP port on the SSH server. This lets you, say, tunnel SMTP inside SSH, so your client can relay mail through your server without any complicated Sendmail rules.

Remote forwarding lets you do the reverse: forward a TCP port on the SSH server to a TCP port on the SSH client.

Suppose I permit a user to SSH to an external server. The desktop is behind a NAT. There are no port mappings from the NAT to the desktop; it has the same connectivity you might give a secretary, except for the outbound SSH access to a single host. That user sets up remote port forwarding from a TCP port attached to the server’s public IP to the client’s SSH daemon. I’m using an OpenBSD desktop, but you can get SSH clients for most other operating systems, including Windows. I’m using OpenSSH, of course, but most clients (including PuTTY) do remote port forwarding.

To do remote port forwarding in OpenSSH, run:

$ ssh -R remoteIP:remoteport:localIP:localport hostname

If you don’t specify an IP address to attach to the SSH server, the server attaches to 127.0.0.1. (You can also skip the first colon in this case.)
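The two forms of the -R argument described above can be sketched with a small shell helper. The function name build_remote_forward is made up for illustration; it only prints the argument and never opens a connection.

```shell
#!/bin/sh
# Hypothetical helper: assemble the argument for "ssh -R".
# Usage: build_remote_forward BIND_ADDRESS REMOTE_PORT LOCAL_HOST LOCAL_PORT
# An empty BIND_ADDRESS yields the three-part form, which the server
# attaches to its loopback address by default.
build_remote_forward() {
    bind=$1
    rport=$2
    lhost=$3
    lport=$4
    if [ -n "$bind" ]; then
        printf '%s:%s:%s:%s\n' "$bind" "$rport" "$lhost" "$lport"
    else
        printf '%s:%s:%s\n' "$rport" "$lhost" "$lport"
    fi
}

# Print both forms without connecting anywhere.
build_remote_forward "" 2222 localhost 22
build_remote_forward 127.0.0.1 2222 localhost 22
```

Either line of output can be handed to ssh as the value of -R.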

My desktop runs sshd. I want to attach port 2222 on the SSH server pride.blackhelicopters.org to port 22 on my SSH client using remote port forwarding.

client$ ssh -R 2222:localhost:22 pride

I leave this connection up and go home. Perhaps I run top in the command window, to prevent the SSH session from timing out at the corporate firewall. Now at home, I log into the SSH server and run:

pride$ ssh -p 2222 localhost

My SSH request to the local machine will get tunneled inside the existing SSH connection out of the network. I will get a logon prompt for my client inside the secure network. When I can access one supposedly secure machine, I can start ripping data out of the local file servers and maybe even access other internal sites. All of the trouble the company expended to prevent unauthorized access is now moot.

If you attach the remote port forwarding to the server’s public IP (which only works when the server’s sshd has GatewayPorts enabled), then anyone on the Internet can attack your desktop’s SSH daemon. People are attacking SSH servers, even on odd ports.

Of course, remote port forwarding in this environment would violate company policy. If caught, you’d lose your job. But to catch this abuse, the network administrator would need to realize that large data transfers were taking place in off hours over these limited-use channels. You could, say, use flow analysis to write automated reports that notice and alarm when large amounts of traffic pass over these “rarely-used” channels. You’d have to be a real bastard to think of that. But the reports are easy to write, and the look on the abuser’s face when you confront them with graphs and numbers is priceless.
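A minimal sketch of such a report, assuming the flow collector can export one flow per line as “source destination destination-port bytes.” That format, the function name, and the threshold are all assumptions for illustration; real flow tools have their own export formats.

```shell
#!/bin/sh
# Hypothetical report: total the bytes sent over SSH (port 22) per
# source/destination pair and print the pairs that exceed a limit.
# Assumed input format, one flow per line: SRC DST DPORT BYTES
# Usage: heavy_ssh_pairs FLOWFILE LIMIT_BYTES
heavy_ssh_pairs() {
    awk -v limit="$2" '
        $3 == 22 { total[$1 " " $2] += $4 }
        END {
            for (pair in total)
                if (total[pair] > limit)
                    printf "%s %.0f\n", pair, total[pair]
        }
    ' "$1" | sort
}
```

Run nightly from cron against the day’s flows, something like heavy_ssh_pairs flows.txt 100000000 would list every desktop that pushed more than about 100MB through its “rarely-used” SSH channel.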

The point is, the next time your employer’s network administrator rejects your sensible request for SSH access to your home server, don’t be too hard on the poor slob.

Want more SSH fun? Check out the book recommended by the OpenSSH project, SSH Mastery.

IP Tables and VoIP

Here’s an iptables ruleset for a VoIP server with a Web interface. The goals are to allow management hosts to communicate with the server freely, allow VoIP and HTTP(S) from the public, and drop everything else. It’s designed to be used as /etc/iptables.rules, and loaded with:

# iptables-restore < /etc/iptables.rules

In Linux, you’re supposed to adjust the firewall at the command line. This implies an ability to retain the firewall ruleset in your head, as well as an ability to type correctly. Neither of these is true for me. Here’s my /etc/iptables.rules:


*filter
#management
-A INPUT -s 192.168.0.0/16 -i eth0 -j ACCEPT
-A INPUT -s 10.0.0.0/8 -i eth0 -j ACCEPT

#Web interface
-A INPUT -p tcp -m tcp --dport 443 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 80 -j ACCEPT

#VoIP
-A INPUT -p udp -m udp --dport 5080 -j ACCEPT
-A INPUT -p udp -m udp --dport 5061 -j ACCEPT
-A INPUT -p udp -m udp --dport 5060 -j ACCEPT
-A INPUT -p udp -m udp --dport 10000:20000 -j ACCEPT
-A INPUT -p udp -m udp --dport 1025:65534 -j ACCEPT

#keep state
-A INPUT -p tcp -m state --state ESTABLISHED -j ACCEPT
-A INPUT -p udp -m state --state ESTABLISHED -j ACCEPT
-A INPUT -i eth0 -j DROP

#allow outbound
-A OUTPUT -p tcp -m state --state NEW,ESTABLISHED -j ACCEPT
-A OUTPUT -p udp -m state --state NEW,ESTABLISHED -j ACCEPT
COMMIT

The section labeled “management” is where the rules allowing access from my management network go. Management hosts may connect to this server on any port desired. Add additional lines for additional subnets.

The Web interface rules permit inbound HTTP(S) connections, and the VoIP section supports phone calls.
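When a ruleset like this gets replicated across servers, it can help to list exactly which ports it opens to the world. The following is a rough sketch, not part of iptables itself: the function name is hypothetical, and it only understands rules written in the simple style shown above.

```shell
#!/bin/sh
# Hypothetical audit helper: print protocol/port for every INPUT rule
# that accepts traffic to a specific destination port.
# Usage: list_accepted_ports RULESFILE
list_accepted_ports() {
    awk '
        /^-A INPUT/ && /-j ACCEPT/ {
            proto = ""; port = ""
            for (i = 1; i < NF; i++) {
                if ($i == "-p") proto = $(i + 1)
                if ($i == "--dport") port = $(i + 1)
            }
            # Rules without --dport (such as the management subnets) are skipped.
            if (port != "") print proto "/" port
        }
    ' "$1"
}
```

Run against /etc/iptables.rules, this makes wide ranges such as udp/1025:65534 easy to spot.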

After working with iptables for a while, I feel perfectly qualified to say: I vastly prefer PF. Or even ipfilter. But now that I have the ruleset worked out, I can easily replicate it across all my VoIP servers.

NFSv4 and UIDs on OpenSolaris and Ubuntu

NFS clients and servers negotiate to use the highest NFS version they both support. NFSv4 usually performs much better than NFSv3, but requires a little more setup. Here I get NFSv4 working between an OpenSolaris file server and a diskless Ubuntu client. In theory, a plain mount(8) gives us an NFSv4 mount.

# mount server:/data1/opennebula/on22 /mnt/
#

Use nfsstat -m to see what kind of mount they negotiated:

# nfsstat -m
...
/mnt from server:/data1/opennebula/on22
Flags: rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.0.2.2,minorversion=0,addr=192.0.2.1

We have NFSv4, huzzah! Go look at the files.

# ls -lai /mnt/
total 12K
5 drwxr-xr-x 8 4294967294 4294967294 8 2011-04-19 11:50 .
3 drwxr-xr-x 21 root root 26 2011-03-17 10:22 ..
6 drwxr-xr-x 2 4294967294 4294967294 24 2011-04-19 11:50 bin
29 drwxr-xr-x 16 4294967294 4294967294 21 2011-04-19 11:50 etc
74 drwxr-xr-x 2 4294967294 4294967294 2 2011-04-19 11:50 include
75 drwxr-xr-x 7 4294967294 4294967294 7 2011-04-19 11:50 lib
296 drwxr-xr-x 5 4294967294 4294967294 5 2011-04-19 11:50 share
332 drwxr-xr-x 5 4294967294 4294967294 10 2011-04-19 11:50 var

A UID of 4294967294? That’s awesome. Wrong, but awesome. 4294967294 is 2^32 - 2, which is what -2 looks like when stored as an unsigned 32-bit integer. Many Unix-like systems assign nobody and nogroup (the standard unprivileged NFS accounts) a UID and GID of -2, which shows up as 65534 when stored in 16 bits. While my files are owned by uid 1003 on the server, and the client’s mount point is owned by uid 1003, NFSv4 defaults to mapping all UIDs to nobody. Use rpc.idmapd to map UIDs between systems. Go to /etc/default/nfs-common and enable idmapd.

NEED_IDMAPD=yes

Lower case seems to be required: I originally set this to YES and the process didn’t start.

Reboot the client, and the files are now owned by nobody. Well, at least that’s a legitimate system user, one originally created for NFS. The files are owned by UID 1003 on the server, however.

Here’s where NFSv4 gets interesting. In NFSv3 and earlier, file ownership over NFS is controlled by UID. Systems administrators worked hard to keep UIDs synchronized across their systems so that NFS permissions would be consistent. You can remap UIDs over NFS, of course, but maintaining those maps is vastly annoying.

NFSv4 maps file permissions by UIDs, but uses usernames for ACLs and ownership. Both must be correct, or common operations won’t work as expected. I have an OpenSolaris NFS server that contains lots of files for lots of diskless systems with lots of different usernames. Some of those usernames do not exist on the fileserver. While I keep user accounts in LDAP, I (mostly) don’t bother with system or program accounts. To share files over NFSv4, though, the accounts must exist on both client and server.

NFSv4 uses helper programs to map usernames and UIDs: nfsmapid on OpenSolaris, rpc.idmapd on Ubuntu, and nfsuserd on FreeBSD. (Please insert a screaming rant here: these are all basically the same program. Why, why, why change the name? We don’t give ping(8) different names even though it has completely different under-the-hood implementations on each operating system, do we? Sheesh.)

NFSv4 maps usernames within a domain, generally (but not necessarily) the machine’s domain name. If the NFSv4 client and server domain names don’t match, all the usernames will show up as “nobody.” OpenSolaris’ nfsmapid pulls the domain name from the machine’s domain name. I had to set the domain name on Ubuntu 10.10 in /etc/idmapd.conf.
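On the Ubuntu side the setting is the Domain key in the [General] section of /etc/idmapd.conf. A small sketch for reading it back, so it can be compared against the server’s domain; the function name is hypothetical and the parsing is deliberately simple.

```shell
#!/bin/sh
# Hypothetical helper: print the Domain value from the [General]
# section of an idmapd.conf-style file.
# Usage: idmap_domain [FILE]   (FILE defaults to /etc/idmapd.conf)
idmap_domain() {
    awk -F= '
        /^\[/ { section = $0 }
        section == "[General]" && $1 ~ /^[ \t]*Domain[ \t]*$/ {
            # Trim whitespace around the value before printing it.
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
            print $2
        }
    ' "${1:-/etc/idmapd.conf}"
}
```

If this prints nothing, or prints something other than the domain the server is using, expect everything to be owned by nobody.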

NFSv4 now works in my environment.

Note that NFSv4 also has a variety of other changes. All exports are part of a single unified namespace. OpenSolaris handles that for you. If you use a different NFSv4 server, you might need to manage that namespace yourself. But that’ll be a topic for another post, when I get my FreeBSD/ZFS/iSCSI/NFSv4 server working.

WordPress LDAP auth on Ubuntu

I support too many servers and applications to manage separate user databases for each. LDAP is a must. If an application can’t hook up to LDAP, I don’t want it. WordPress can be configured to use LDAP, and has several different LDAP plugins. I’ve had mixed results with PHP LDAP plugins. I usually find that having the application trust Apache’s authentication, and attaching Apache to LDAP, gives better results in my environment.

Note that my WordPress installations usually have only one or two registered users. They are administrators. Most people cannot register. If you want to hook hundreds of LDAP users into WordPress, and manage them completely through LDAP, you’ll need to find an LDAP-specific plugin that meets your needs. In this environment, where I’m just looking for administrator password synchronization, it’s good enough.

This particular Web server runs Ubuntu 10.04 with Apache and WordPress 3.1. To enable LDAP auth in Apache, run:

# a2enmod authnz_ldap
# /etc/init.d/apache2 restart

On the WordPress side, install the HTTP Authentication plugin. This tells WordPress to trust the Web server’s authentication.

WordPress won’t read a list of usernames from basic auth. You’ll need to create your users. (Again, this is for a couple of admin accounts, not for massive user databases.)

WordPress protects its administrative directory, /wp-admin/, automatically redirecting requests to the page wp-login.php. For this plugin to work, we must require LDAP auth to the one file wp-login.php. Here’s the Apache configuration for the WordPress directory.


Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all

<Files wp-login.php>
AuthType Basic
AuthName "Web Admins Only"
AuthBasicProvider ldap
AuthLDAPURL "ldap://ldapserver1.domain.com/dc=domain,dc=com" STARTTLS
AuthLDAPGroupAttribute memberUid
AuthLDAPGroupAttributeIsDN off
require ldap-group cn=wordpressadmins,ou=groups,dc=domain,dc=com
</Files>

Note that my LDAP servers do not require an LDAP login to validate a user. If yours do, you’ll need to add the bind username and password to this configuration.

Restart Apache, open a new browser, go to the site, and hit the Login button. You should get an Apache login window. Enter your username and password, and you’ll reach the WordPress control panel.

You’re now handing your LDAP username and password to WordPress. You do have WordPress available over SSL, don’t you? Configure Apache so that http://wordpress.domain.com is also available as https://wordpress.domain.com, and add the following near the top of wp-config.php.

//we like SSL
define('FORCE_SSL_LOGIN', true);
define('FORCE_SSL_ADMIN', true);

WordPress will now pass user credentials and cookies over SSL.

Wherein I learn about initrd

Post summary: will someone PLEASE port a recent KVM to any BSD? There’s beer in it for you.

I’ve been attempting to upgrade my diskless virtualization cluster to Ubuntu 10.10. Diskless boot worked fine in the ESXi test area, but real hardware would not boot. This same hardware booted fine with Ubuntu 10.04 and 9.whatever. When I looked at the console, I saw:

ipconfig: no devices to configure
ipconfig: no devices to configure
ipconfig: no devices to configure
ipconfig: no devices to configure
/init: .: line 3: can't open '/tmp/net-*.conf'
[ 2.300079] Kernel panic - not syncing: Attempted to kill init!
[ 2.306052] Pid: 1, comm: init Not tainted 2.6.35-27-server #48-Ubuntu
[ 2.312653] Call Trace:
[ 2.315161] [] panic+0x90/0x113
[ 2.320025] [] forget_original_parent+0x33d/0x350
[ 2.326433] [] ? put_files_struct+0xc4/0xf0
[ 2.332339] [] exit_notify+0x1b/0x190
[ 2.337699] [] do_exit+0x1d5/0x400
[ 2.342817] [] ? do_page_fault+0x159/0x350
[ 2.348609] [] do_group_exit+0x55/0xd0
[ 2.354076] [] sys_exit_group+0x17/0x20
[ 2.359617] [] system_call_fastpath+0x16/0x1b

The useful messages are obviously further up, but the scrollback buffer is fubar. (Apparently when an Ubuntu box dies, it dies really really hard.) A serial console let me scroll back through the boot messages.

...
[ 2.004954] Uniform CD-ROM driver Revision: 3.20
[ 2.009944] sr 0:0:1:0: Attached scsi generic sg0 type 5
[ 2.015651] Freeing unused kernel memory: 836k freed
[ 2.021283] Write protecting the kernel read-only data: 10240k
[ 2.027551] Freeing unused kernel memory: 320k freed
[ 2.033118] Freeing unused kernel memory: 1620k freed
Loading, please wait...
[ 2.063067] udev[81]: starting version 163
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/nfs-top ... done.
FATAL: Could not load /lib/modules/2.6.35-27-server/modules.dep: No such file or directory
FATAL: Could not load /lib/modules/2.6.35-27-server/modules.dep: No such file or directory
ipconfig: no devices to configure
ipconfig: no devices to configure
ipconfig: no devices to configure

The machine cannot find its modules directory? Odd. A packet sniffer found that the diskless client didn’t send an NFS request. It was just giving up after running initrd. I carefully reviewed the serial console output and compared it to the test Ubuntu systems, and found that the initial ramdisk wasn’t attaching a device driver to the Ethernet interface.

Initrd is an “initial ramdisk.” The boot loader hands it to the kernel at boot. It holds just enough device drivers and scripts to find and mount the root file system, where the rest of the device drivers live. If you install a machine in one environment, the initial ramdisk includes only the device drivers for that environment.

Checking /etc/udev/rules.d/70-persistent-net.rules of the older system revealed:

# PCI device 0x14e4:0x1659 (tg3)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:17:31:d8:42:52", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

The Ethernet cards on my physical servers use Linux’s tg3 driver.

To add this driver to initrd, I went to the Ubuntu 10.10 server on my ESXi test box and created the file /usr/share/initramfs-tools/modules.d/tg3. That file contained a single word, tg3. I then created the new initrd with:

# update-initramfs -u -k all
# mkinitramfs -o /home/mwlucas/initrd.img-2.6.35-27-server-pxe

Copy that image to my TFTP server, reboot the hardware, and everything boots.
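Before copying a new image to the TFTP server, it’s worth confirming that the driver really made it in. The following is a sketch under the assumption that the initrd is a gzip-compressed cpio archive; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: read an initrd file listing on standard input and
# succeed if it contains the named kernel module (NAME.ko).
# Usage: ... | initrd_has_module NAME
initrd_has_module() {
    grep -q "/$1\.ko\$"
}

# Assumed usage on a gzip-compressed cpio initrd (not run here):
#   zcat initrd.img-2.6.35-27-server-pxe | cpio -it | initrd_has_module tg3 && echo "tg3 present"
```

If the check fails, the diskless client will most likely repeat the “no devices to configure” dance.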

pxelinux.cfg/* versus RCS

I’m a fan of version control in systems administration. If you don’t have a central VCS for your server configuration files, you can always use RCS. I habitually add #$Id$ at the top of configuration files, so I can easily see who touched this file last and when.

On an unrelated note, I’m upgrading my virtualization cluster to Ubuntu 10.10. The worker nodes run diskless. Each diskless node reads a configuration file over TFTP. Mine looked like the following:

#$Id$

LABEL linux
KERNEL vmlinuz-2.6.35-27-server
APPEND root=/dev/nfs initrd=initrd.img-2.6.35-27-server-pxe nfsroot=192.0.2.2:/data1/imagine,noacl ip=dhcp rw
TIMEOUT 0

This has worked fine for a year or so now, with me changing the kernel and initrd versions as I upgraded. With the Ubuntu 10.10 update, however, some pieces of hardware wouldn’t reboot. Most booted fine, but a few didn’t come back up again.

This is notably annoying because the hardware is in a remote datacenter. Driving out to view the console messages burns an hour and, more annoyingly, requires that I stir my lazy carcass out of my house. I have a serial console on one of the machines, but not on the affected one. Fortunately, I do have remote power, and I can make changes on the diskless filesystem.

Packet sniffing revealed that the machine successfully made a TFTP request, then just… stopped. This exact same configuration and filesystem worked on other machines, however. Except that the affected machines all had #$Id$ on the first line of their pxelinux.cfg file, and machines that booted successfully didn’t.

That shouldn’t matter. Really, it shouldn’t. pxelinux.cfg files accept comments. But I removed the tag, making the first line the LABEL statement, and power cycled the machine. And it came up perfectly.

Apparently this particular rev of Linux PXE is incompatible with version control ID tags. Oh joy, oh rapture!
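To find any other boot configurations waiting to bite, a check along these lines might help. It’s a sketch only: the function name is hypothetical, and it looks for nothing more than a first line that starts with #$Id.

```shell
#!/bin/sh
# Hypothetical check: succeed if the first line of a file begins with
# an RCS Id keyword inside a "#" comment, expanded or not.
# Usage: starts_with_id_tag FILE
starts_with_id_tag() {
    head -n 1 "$1" | grep -q '^#\$Id'
}

# Assumed usage, run from the TFTP root (not run here):
#   for f in pxelinux.cfg/*; do starts_with_id_tag "$f" && echo "$f"; done
```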