I just dumped Solaris off my underutilized (because of Solaris...) Sun Fire V120 and did a fresh install of Debian Etch for the sparc64 architecture. I burned the sparc ISO image and borrowed an IDE SuperSlim CD-ROM drive from another server for the install, but the V120 would not recognize the device no matter what I tried, which seems to be a common issue with some Sun gear. Sweet! Off to netboot-land 🙂
On a spare laptop (running Debian, of course):
aptitude install rarpd bootp tftpd
Set up rarpd by putting the V120's MAC and IP addresses in /etc/ethers, so that it can hand the V120 its IP when the box sends its RARP broadcast (grab the MAC for the Sun box from the console startup messages):
00:03:BA:16:85:6B 192.168.1.30
Set up bootp to tell the V120 where to go for the install image. The location of the tftp directory, the name of the image, the client IP and netmask, the boot server IP, and the client hardware (MAC) address go in /etc/bootptab:
client:\
hd=/srv/tftp:\
bf=boot.img:\
ip=192.168.1.30:\
sm=255.255.255.0:\
sa=192.168.1.26:\
ha=0003BA16856B:
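Note that the same MAC address is written with colons in /etc/ethers but as a bare hex string in the ha= field of /etc/bootptab. A minimal sketch of that conversion, in case you are setting up more than one box (mac_to_ha is a hypothetical helper name of mine, not part of bootp):

```shell
#!/bin/sh
# Hypothetical helper: turn a colon-separated MAC address (the form used
# in /etc/ethers) into the bare uppercase hex string used in the ha=
# field of the bootptab entry above.
mac_to_ha() {
    printf '%s\n' "$1" | tr -d ':' | tr 'a-f' 'A-F'
}

mac_to_ha 00:03:ba:16:85:6b
```

For the V120 above this prints 0003BA16856B, matching the ha= line.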
Download the sparc64 boot image to the tftpd directory:
# cd /srv/tftp/
# wget http://http.us.debian.org/debian/dists/etch/main/installer-sparc/current/images/sparc64/netboot/2.6/boot.img
When the V120 netboots, it ignores the file name we told it to look for (boot.img) and instead asks tftpd for a file named after the IP address it has just been given, written in uppercase hex (see Preparing Files for TFTP Net Booting for more info). I just cheated and tailed syslog to see what was being requested:
# tail -f /var/log/syslog
Now, netboot the V120 from the OpenBoot “ok” prompt on the console:
ok boot net
Going back to the syslog tail, I found:
# tail -f /var/log/syslog
Dec 19 20:41:31 apollo rarpd[4751]: RARP request from 00:03:ba:16:85:6b on eth0
Dec 19 20:41:31 apollo rarpd[4751]: link lo
Dec 19 20:41:31 apollo rarpd[4751]: addr 127.0.0.1/8 on lo
Dec 19 20:41:31 apollo rarpd[4751]: link eth0
Dec 19 20:41:31 apollo rarpd[4751]: addr 192.168.1.26/24 on eth0
Dec 19 20:41:31 apollo rarpd[4751]: RARP response to 00:03:ba:16:85:6b 192.168.1.30 on eth0
Dec 19 20:41:31 apollo in.tftpd[10704]: connect from 192.168.1.30 (192.168.1.30)
Dec 19 20:41:31 apollo tftpd[10705]: tftpd: trying to get file: C0A8011E
Dec 19 20:41:31 apollo tftpd[10705]: tftpd: serving file from /srv/tftp
Dec 19 20:41:35 apollo in.tftpd[10708]: connect from 192.168.1.30 (192.168.1.30)
Dec 19 20:41:35 apollo tftpd[10709]: tftpd: trying to get file: C0A8011E
...
Nice! A quick symlink is all we need while the V120 keeps requesting the file C0A8011E:
# ln -s boot.img C0A8011E
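If you would rather not wait for the request to show up in syslog, the name can be worked out ahead of time: C0A8011E is simply 192.168.1.30 with each octet written as two uppercase hex digits. A small sketch, assuming a POSIX shell (ip_to_tftp_name is a hypothetical helper name of mine):

```shell
#!/bin/sh
# Hypothetical helper: print the file name a netbooting client requests
# from tftpd, i.e. its IPv4 address as eight uppercase hex digits.
# The function body runs in a subshell so the IFS change does not leak.
ip_to_tftp_name() (
    IFS=.
    # Split the dotted quad into four positional parameters.
    set -- $1
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
)

ip_to_tftp_name 192.168.1.30
```

This prints C0A8011E, so the symlink could be created with something like ln -s boot.img "$(ip_to_tftp_name 192.168.1.30)" from inside /srv/tftp.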
And then I see on the V120 console that it is happily downloading its boot image file. (happy debian sparc dance ensues!)
From there on out, the text debian-installer should be quite familiar.
Gotchas:
I ran into an issue when running the disk partitioner: even after creating an empty partition table, the partitioner failed with an error message along the lines of (from memory) “you may have too many primary partitions” and refused to create any partitions. Somewhere in my digging around, I came across a web page or mailing list post saying that debian-installer may fail if the disk holds a previous Solaris install, and that the cure is to wipe the master boot record from the disk. Back out to the main menu dialog, where there is a “start a shell” option. Pick that! In the shell, just dd zeros over the 512-byte MBR of the drive, exit, go back to the partitioning option, and all should be well.
# dd if=/dev/zero of=/dev/sda bs=1 count=512
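If you are nervous about zeroing the start of a disk, you could keep a copy of the sector first. This is only a sketch under my own assumptions: wipe_first_sector is a hypothetical helper, /dev/sda has to really be the disk you mean, and the backup path must be somewhere writable in the installer shell. conv=notrunc is there so the same function can be tried safely on an ordinary file.

```shell
#!/bin/sh
# Hypothetical helper: save the first 512-byte sector of a disk (or a
# test file) to a backup file, then overwrite that sector with zeros.
wipe_first_sector() {
    target=$1
    backup=$2
    # Keep a copy of sector 0; stop if the copy fails.
    dd if="$target" of="$backup" bs=512 count=1 2>/dev/null || return 1
    # Zero sector 0 only; notrunc leaves the rest of a regular file intact.
    dd if=/dev/zero of="$target" bs=512 count=1 conv=notrunc 2>/dev/null
}

# Example call (not run here, it is destructive):
#   wipe_first_sector /dev/sda /tmp/sda-sector0.bak
```

Writing the backup file back with dd would restore the old sector if the wipe turns out to be a mistake.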
Another little consideration was that the Linux kernel (not just Debian) enumerates the Ethernet devices in the opposite order to the labels on the back of the server: eth1=Net0 and eth0=Net1. Since I may one day ask someone in a data center to plug the primary interface of the server into a switch, I want the physical labels to be the point of reference, so I commented my /etc/network/interfaces file accordingly:
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
## eth1 really is the primary interface!
## eth1 is the port labeled "Net0" on the SunFire V120
# The primary network interface
auto eth1
iface eth1 inet static
address 192.168.1.30
netmask 255.255.255.0
gateway 192.168.1.1
## eth0 really is the secondary interface!
## eth0 is the port labeled "Net1" on the SunFire V120
# The secondary network interface
auto eth0
iface eth0 inet static
address 192.168.10.30
netmask 255.255.255.0
After the install was complete and Debian had booted, I found the console flooded with repeated messages about the disconnected eth0 (Net1) device switching between 10baseT and 100baseT while trying to find connectivity:
Debian GNU/Linux 4.0 sol ttyS0
sol login: eth0: switching to forced 10bt
eth0: switching to forced 100bt
eth0: switching to forced 10bt
eth0: switching to forced 100bt
eth0: switching to forced 10bt
...
Thanks to a 2003 post by Marta Pla i Castells to the debian-powerpc mailing list, I found my console-saving grace: an update to /etc/default/klogd that raises the minimum priority a kernel message needs in order to reach the console, which keeps the annoying and unimportant (to me) eth0 message off the serial terminal. The default console log level is 7. Dropping it to 6 means that only messages with a priority of “notice” (5) or more severe are printed on the console. Since that makes these messages disappear, it appears by elimination that this particular message is of “info” (6) priority.
#KLOGD="-x"
KLOGD="-x -c 6"
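The numbers are easy to get backwards, because a lower number means a more severe message. As I understand it, a kernel message is printed on the console only when its priority number is strictly less than the console log level. A toy model of that rule (reaches_console is a hypothetical function of mine, not something klogd provides):

```shell
#!/bin/sh
# Toy model of the console filter: a kernel message is shown only if its
# priority number is strictly lower (more severe) than the console level.
reaches_console() {
    console_level=$1
    msg_priority=$2
    [ "$msg_priority" -lt "$console_level" ]
}

# With "-c 6": warning (4) and notice (5) pass, info (6) does not.
for prio in 4 5 6; do
    if reaches_console 6 "$prio"; then
        echo "priority $prio: shown"
    else
        echo "priority $prio: hidden"
    fi
done
```

With level 6 this reports priorities 4 and 5 as shown and 6 as hidden, which is why the “info” messages from eth0 stop appearing.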
My fix to suppress the same messages in /var/log/syslog was to raise the minimum priority for that file in /etc/syslog.conf from everything to “notice” (this selector covers all facilities, not just the kernel). The messages are still logged to /var/log/messages:
#*.*;auth,authpriv.none -/var/log/syslog
*.notice;auth,authpriv.none -/var/log/syslog
Happy Sparc Hacking!