Graying Matter: 2007

Wednesday, August 15, 2007

RHN Satellite Server Tasks

In playing with the Satellite server in preparation for my "formal" training with Red Hat, I've come across several nuggets worth mentioning.

First, the nightly tasks of sync'ing and saving.

For sync'ing the server, I have a cronjob that runs each night. This is right from the Satellite server's manual.

# Randomly sync the server's channels
0 23 * * * perl -le 'sleep rand 9000' && satellite-sync --email >/dev/null 2>/dev/null
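The perl one-liner in that cron entry just sleeps for a random stretch of up to 9000 seconds (2.5 hours) so the sync does not always start at the same moment. As a rough sketch only, the same idea can be written in plain bash; the function name random_delay is made up for illustration and is not part of the Satellite tooling.

```shell
#!/bin/bash
# random_delay MAX  (hypothetical helper, not from the Satellite manual)
# Prints a whole number of seconds in the range 0..MAX-1.
# $RANDOM spans 0..32767, so this works for MAX values up to 32768,
# with a slight bias toward smaller numbers from the modulo.
random_delay() {
    local max=$1
    echo $(( RANDOM % max ))
}

# Possible use in a wrapper script:
#   sleep "$(random_delay 9000)" && satellite-sync --email
```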

Backing up the server is a little more involved. I wrote a short script called rhndb-backup.sh that gets executed through cron. Again, the manual from Red Hat is really handy here. The key to this script is the use of "db-control", which comes with the Oracle install from Red Hat.

#!/bin/bash
# dump a copy of the Satellite DB into /opt/dj-dbbackup
# v1.0
# JBC 20070226
# v1.1
# Fixed db-control typo
# Added echos and date stamps
# JBC 20070227
/bin/echo "Starting DB backup process"
/bin/date
/sbin/service rhn-satellite stop
/bin/echo "Executing db-control backup script"
su --shell=/bin/bash - oracle -c "/usr/bin/db-control backup /opt/dj-dbbackup"
/sbin/service rhn-satellite start
su --shell=/bin/bash - oracle -c "/usr/bin/db-control report"
/bin/echo "Ending DB backup process"
/bin/date

Here's the cron entry. I keep a commented-out entry that would email me the output in case I need to debug an ongoing issue. The important output comes from the report command sent to db-control. It's key in telling you whether the tablespaces are filling up.

# Do a dump of the oracle DB of the Satellite server so that
# legato captures it for backing it up.
#0 6 * * * /usr/local/sbin/rhndb-backup.sh | /bin/mail -s "Nightly Satellite DB dump" jason.consorti@dowjones.com
0 6 * * * /usr/local/sbin/rhndb-backup.sh >/dev/null 2>/dev/null

A daily Legato backup takes care of copying the DB dump to tape.

When I do find the tablespaces in the Oracle database filling up, I extend the space, following the manual's instructions.

[root@sbkrhelsatp01 ~]# su --shell=/bin/bash - oracle -c "/usr/bin/db-control report"
Tablespace Size Used Avail Use%
DATA_TBS 5.3G 5G 372.8M 93%
SYSTEM 250M 115.8M 134.1M 46%
TOOLS 128M 3M 124.9M 2%
UNDO_TBS 1000M 512.1M 487.8M 51%
USERS 128M 64K 127.9M 0%
[root@sbkrhelsatp01 ~]# su --shell=/bin/bash - oracle -c "/usr/bin/db-control extend DATA_TBS"
Extending DATA_TBS... done.
[root@sbkrhelsatp01 ~]# su --shell=/bin/bash - oracle -c "/usr/bin/db-control report"
Tablespace Size Used Avail Use%
DATA_TBS 5.8G 5G 872.7M 85%
SYSTEM 250M 115.8M 134.1M 46%
TOOLS 128M 3M 124.9M 2%
UNDO_TBS 1000M 512.1M 487.8M 51%
USERS 128M 64K 127.9M 0%
[root@sbkrhelsatp01 ~]#
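Watching the Use% column by eye gets old, so here is a minimal sketch of a filter that reads the report and prints any tablespace at or above a threshold. It assumes the five-column layout shown in the output above; the function name check_tablespaces and the threshold of 90 are illustrative choices, not anything from the Red Hat manual.

```shell
#!/bin/bash
# check_tablespaces THRESHOLD  (hypothetical helper)
# Reads "db-control report" output on stdin and prints the name and Use%
# of every tablespace at or above THRESHOLD percent.
# Returns 1 if at least one tablespace was printed, 0 otherwise.
check_tablespaces() {
    local threshold=$1
    awk -v limit="$threshold" '
        $1 == "Tablespace" { next }          # skip the header line
        NF == 5 {
            pct = $5
            sub(/%$/, "", pct)               # "93%" -> "93"
            if (pct + 0 >= limit) { print $1, $5; hit = 1 }
        }
        END { exit hit ? 1 : 0 }
    '
}
```

It could be fed straight from the report, for example: su --shell=/bin/bash - oracle -c "/usr/bin/db-control report" | check_tablespaces 90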

I look forward to learning more at the class!

Tuesday, August 14, 2007

RU Embarrassed?

In the news this evening is a, frankly, shocking development in what should be yesterday's news. I'm talking about Kia Vaughn's lawsuit against Don Imus.

As you may have guessed, or heard, Kia is (or perhaps, was) a player on Rutgers' Women's Basketball team. She's suing Don Imus for her "damaged reputation."

As an alum (RC `94) and current Masters candidate, I am truly embarrassed. I thought sports activities were supposed to develop character, character of the kind that allows you to rise above adversity.

Succumbing to the litigious culture that pervades our society reveals a weak personality that is only interested in cashing in on someone else's fortune. Absent any further information that is available to the public at this time, I believe Kia has allowed herself to exhibit this kind of weakness.

In this case, it is the puzzlingly successful Don Imus's fortune that is the target of this seemingly greedy impulse. I am not a fan of Don Imus. I recall fondly his ribbing of Edison High School's marching band, but that's as far as I can say I was ever entertained by him.

His offhanded remark about the Rutgers team was just another feeble attempt by the unfunny Imus to keep his dated material "hip." Offensive, maybe, but not something that anyone would say was a true, believable characterization of the Rutgers team. No reasonable person listening to the show would honestly believe his characterization.

Before the kerfuffle over such a non-event, no reasonable person would have expected such a comment to impact anyone on the team. However, now that the women of the Rutgers team decided to grab the spotlight and make it a big issue, I can easily make the case that someone would be LESS likely to regard the players in esteem as they have shown how thin-skinned and weak they are in handling a trivial event. How would I, as an employer, weigh their candidacy for a position knowing their inability to deal with insults in a dignified manner? I'd be afraid they would not be able to handle themselves in a high stress situation.

My point is that anyone can easily make a case that the players did more damage to their own reputation than any harm offered by Don Imus' off-the-cuff parody.

What could our sports programs be teaching our students that would lure them into the trap of what seems like easy money rather than looking to turn the event into something that could pay lifelong dividends by having shown grace under pressure to any would-be employer, partner or investor?

Tuesday, July 10, 2007

Stratus FT 4300 and RHEL, Part 3

Now down to some sysadmin hacking.

The object here is to create a custom software channel on our Red Hat Satellite server for Stratus' kernel rpm's for the ft 4300 and to make this as automatic as possible.
To see what I was getting into, I pulled down the rpm's from their repository at "http://pman3.com/ftLinux/4.0/redhat/". Basically they were just several debug kernels from Red Hat and yum. Nothing custom, nothing fancy, nothing numerous. I suppose Stratus wants the debug kernels available to help in troubleshooting issues.

A side note here: I have to laud Stratus' support. It is top notch, at least for VOS, their proprietary operating system.

Even though these rpm's are not critical to the operation of the ft4300 running RHEL, I decided to move ahead with the software channel.

First up, I wrote a bash script to pull down the rpms and push them into a software channel.

#!/bin/bash
#
# 5 Jul 2007 Jason B Consorti
# Rev 1.0

# Wget Options Explanation
# --no-verbose
# Quiet, but not TOO QUIET
# -e
# Use the following as if they were in a .wgetrc file.
# Used here to explicitly set the proxy server
# --no-clobber
# Don't bother downloading something we already have.
# --recursive
# Act like a crawler and grab everything
# -l 1
# Stop wget from moving about the website
# --no-directories
# Don't bother making a directory hierarchy; just put
# all grabbed files into one spot.
# --directory-prefix
# Put all of the retrieved files into /var/stratus
# --accept \*rpm
# We're only interested in pulling down rpm files.

/usr/bin/wget --no-verbose -e 'http_proxy = http://our.corporate.proxy.server' --no-clobber --recursive -l 1 --no-directories --directory-prefix=/var/stratus --accept \*rpm http://pman3.com/ftLinux/4.0/redhat/

# This find command will look in the /var/stratus directory for any recently
# downloaded rpms and will then push them into the Satellite server's
# software channel for our stratus boxes.
# A special user account set up for this purpose is used.

/usr/bin/find /var/stratus -mtime -1 -name \*rpm -exec /usr/bin/rhnpush --channel=stratus-yum-x86_64 --username=stratus --password=PASSWORD {} \;

I then created a user called "stratus" and a custom software channel called "stratus-yum-x86_64" and made it a child of RHEL AS 4 for x86_64 servers. I also gave the user "stratus" privileges to push to that channel.

I created the /var/stratus subdirectory and added this script to root's crontab.

# Daily pull of Stratus' YUM channel for RHEL 4
0 7 * * * /usr/local/bin/stratus_daily.sh >/dev/null 2>/dev/null

The script is a little kludgy: there are no locks, and the find command is a cheap way out. I'll have to work on that later.
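One possible direction for that later work, as a sketch only: a mkdir-based lock so two runs cannot overlap, and a plain list of already-pushed files instead of relying on -mtime (wget normally stamps downloads with the server's modification time, so a file's age on disk may not reflect when it was fetched). The names with_lock, unpushed_rpms, the lock directory, the list file, and the exit code 75 are all made up for illustration.

```shell
#!/bin/bash
# with_lock LOCKDIR COMMAND [ARGS...]  (hypothetical helper)
# mkdir either creates LOCKDIR or fails, atomically, so only one run
# at a time gets past this point. Returns 75 if the lock is already held.
with_lock() {
    local lockdir=$1 status
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another run holds $lockdir; skipping" >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return $status
}

# unpushed_rpms DIR LIST  (hypothetical helper)
# Prints the rpm paths found in DIR that are not yet recorded in LIST.
# After a successful rhnpush, the caller would append the path to LIST.
unpushed_rpms() {
    local dir=$1 list=$2
    touch "$list"
    comm -23 <(find "$dir" -name '*.rpm' | sort) <(sort "$list")
}
```

The daily script could then wrap its work as with_lock /var/stratus/.lock some_function, and loop over the output of unpushed_rpms /var/stratus /var/stratus/pushed.list to decide what to hand to rhnpush.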

Saturday, July 7, 2007

British style CCTV systems in the US

The recent attacks in Britain, and the manner in which suspects were quickly rounded up, brought the attention of some news outlets to the UK's wide use of CCTV. I've seen some media exposure given to the question of not only whether the US should adopt a similar system but also whether such a measure would even be constitutional.

Let's set aside all arguments first as to whether the CCTV system was useful to the speedy apprehension of the suspected bombers. I can't foresee any reasonable argument to the contrary. Due to SCORES of years dealing with terrorism associated with the effort to unite Ireland, Britain seems to have very valuable experience in using law enforcement as a tool against terrorism. The CCTV system is but one consequence of this experience.

Concerns for a British style CCTV system on this side of the pond revolve around whether it would be a violation of American civil liberties. Indeed, this is a very valid concern.

Though I may not agree that a CCTV system that watches all public areas is a terrible intrusion into our right to privacy (a right interpreted by the Supreme Court as enshrined in the First Amendment, and therefore framed and even trumped by it), I in principle would not want a government bureaucrat overseeing such a system, especially with my tax dollars.

Could there be a way to have an effective CCTV system to aid law enforcement that would satisfy civil libertarians and fiscal conservatives? Possibly.

Today, there are cameras all over this nation; they are in ATM's, overseeing car lots, monitoring office buildings, in supermarkets, in malls and even private web cams. Often, law enforcement will ask for access to recorded tapes or even subpoena to have such access when investigating crimes.

Though these cameras are unconnected and, compared to the UK system, sparse, they are very useful to law enforcement. Imagine their usefulness with a system in place to expedite access to their content.

In this system, private owners of CCTV could work to link their content. Sophisticated software already exists that can cull license plate (sorry, in Jersey they're plates, not tags) numbers automatically. Is it hard to imagine software in the near future that can be given fuzzier directives like "find a white male, late 30's, in a red shirt?" In the search for suspects over a wide area, either by request of the police or maybe even the victim themselves, this could be a powerful tool.

What would be the incentive to be a part of such a network? Perhaps there would be an advertising advantage. Wouldn't a car dealer want to proudly proclaim their social conscience in being a part of a system that helped recover X number of abducted children per year? As with many things, the more popular this system would become, the more powerful it would be.

This system would have to be funded by its participants, so this "advertisement campaign" isn't free. The power of this "endorsement" would have to outweigh its cost to the participants. This is a deep flaw in my concept.

The advantage of this system goes beyond just the fiscal. It would also ruffle fewer feathers by demonstrating that the government is not in control of this system and therefore it is not the government that would "violate" people's rights to privacy. Fewer people would object and more people would support such a system.

But as with all "majority rule" decisions, that doesn't make it automatically moral or right. The argument would still remain that people's rights to privacy were being violated.

This is where I remind myself that the right to privacy is not enumerated, but framed by the First Amendment. I agree that the reasonable expectation of privacy has its limitations when you step out your door and walk down the street.

There should be no difference between a human being staring at me from across the street and a web cam perched on a roof down the block tracking me. I'm not uncomfortable with either. If that creepy guy stood in my bushes to look in my window, or that web cam was behind a hole in a ceiling tile in a bathroom, then there would be hell to pay.

Tying together ATM and convenience store cameras is not such a strong intrusion into our lives such that I would object to such a scheme.

Stratus FT 4300 and RHEL, Part 2

Well, it turns out that the only packages the ft 4300 needs are the kernel packages. This means that Stratus' customizations to the inittab and sysinit scripts are done manually. That could be done better.

I am a fervent believer in using rpm's for just about everything. We use them to create sysadmin accounts. We use them to delete old sysadmin accounts. We use them to distribute ssh keys. We "strongly" encourage the application developers to deliver all packages as rpm's and have been relatively successful getting them to follow through on that.

Typically the application developers will get third party software as tarballs or even (shudder) zip files. Turning these packages into rpm's is pretty trivial but it took a while to get the application developers into the habit of doing so. Having all of these packages as rpm's makes deployment and support possible across the hundreds of servers we support. Key to this success is using software channels on our Red Hat Satellite server.

But I digress.

Basically, I made backups of the two files in question and forced the up2date.

# cd /etc/
# cp inittab inittab.`date +%s`
# cd /etc/rc.d/
# cp rc.sysinit rc.sysinit.`date +%s`
# up2date --nox -f initscripts

It turned out that inittab didn't get touched after the up2date, so no action was needed there. The rc.sysinit file was a different story, so I copied back the backup I made (even though up2date makes a backup itself, I am extra-cautious).

After that drama was over, I ran the full "up2date -u" and watched it upgrade 551 other packages. 551. Why on earth did Stratus feel the need to do a FULL INSTALL? I don't need squirrelmail or thunderbird on a fault tolerant server!!!

Next up will be creating a custom software channel on the Satellite server for the ft 4300's kernel rpm's.

Thursday, July 5, 2007

GRUB, RHEL and a corrupted MBR

GRUB

That's all the screen said.

Just "GRUB." It's almost like seeing "PC LOAD LETTER." My first reaction is "what the #$%! does that mean?" to borrow the famous "Office Space" phrase.

Well, I know what GRUB is and what it does. What strikes me is why, when it gets screwed up early in the bootstrap process, that's all it says on the screen. Couldn't it say "loading GRUB", or something more meaningful, so I'd know at what stage of the bootstrapping it coughed blood?

But alas, no.

GRUB

Well, I've seen this several times before, and today saw it again: when booting, and after the BIOS work is done, the system just shows "GRUB" and hangs. Nothing else happens. Pretty much, this means that there is a problem with, well, GRUB.

Unfortunately, I've never taken a serious mental note on how the problem was fixed (I handed it off to other SA's to fix in the past) so I spent several hours today doing the same thing over and over again expecting different results.

IDIOT

In the end, the fix was as follows.

First, boot from the RHEL boot CD. When prompted, go into rescue mode.

boot: linux rescue

Follow through on the instructions to choose your language and keyboard. Don't worry about getting network up and running, so opt not to use it. Choose "CONTINUE" when prompted whether to have it find an installed copy of RHEL or not. You want it.

Next, when presented with a shell, chroot that RHEL image.

# chroot /mnt/sysimage

This will come back with another shell. Run a "df" to confirm it made the root change so that it appears you booted from the internal disks.

Next, invoke GRUB with the right options.

# grub --batch --device-map=/boot/grub/device.map --config-file=/boot/grub/grub.conf --no-floppy

From the GRUB shell, re-install the MBR.

grub> root (hd0,0)
grub> setup (hd0)
grub> quit

It is important to "quit" out of GRUB so that anything cached gets dumped.

My mistakes involved mainly re-installing GRUB without specifying the options, installing GRUB not on the master boot record but on the boot block of the first partition, and not using the rescue CD to execute the grub shell.

You must pay attention to your devices. For me, and the typical RHEL install, "hd0" is the root disk, and "hd0,0" is the "/boot" partition. If you have "/boot" installed on, say, your 2nd partition, you'd use "hd0,1" as your "root". You might not even have "hd0" mapped out. Review your "/boot/grub/device.map" file. For this particular RHEL install on an IBM 366 with a hardware mirrored internal drive, our file maps "sda" to "hd0". Our Dell boxes map the same way, too.
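To double-check which GRUB name a disk has before typing "root" and "setup", a small sketch like the following can read the mapping out of device.map. It assumes the usual two-column layout of that file, such as "(hd0) /dev/sda"; the function name grub_name_for is hypothetical.

```shell
#!/bin/bash
# grub_name_for DISK [MAPFILE]  (hypothetical helper)
# Prints the GRUB device name (without parentheses) that MAPFILE assigns
# to DISK. MAPFILE defaults to /boot/grub/device.map.
grub_name_for() {
    local disk=$1 map=${2:-/boot/grub/device.map}
    awk -v disk="$disk" '$2 == disk { gsub(/[()]/, "", $1); print $1 }' "$map"
}

# Example, run inside the chroot:
#   grub_name_for /dev/sda
```

If nothing is printed, the disk is not mapped at all, which is the "you might not even have hd0 mapped out" case.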

Friday, June 29, 2007

Stratus FT 4300 and RHEL

We've started to evaluate Stratus' FT4300 using Red Hat Enterprise Linux.

What we've learned so far is impressive: the server operates as robustly as it would with VOS. After installation, we've unplugged modules while running a video from the RHEL server. It never missed a beat.

It should be noted, however, that I/O redundancy is handled by RHEL as the hardware presents all the I/O channels, and not some kind of metadevice that represents a redundant pair of paths.

We did entitle the RHEL server to our Satellite server and noticed one problem right off the bat: the RHEL installed by Stratus required connectivity to their yum server for updates. We commented this line out from /etc/sysconfig/rhn/sources:

yum Stratus_Technologies_ft_Linux_4.0 http://pman3.com/ftLinux/4.0/

And ran up2date. It complains about initscripts, right away:

Testing package set / solving RPM inter-dependencies...
There was a package dependency problem. The message was:

To solve all dependencies for the RPMs you have selected, The following
packages you have marked to exclude would have to be added to the set:

Package Name Reason For Skipping
======================================================================
initscripts-7.93.29.EL-1 Config modified

Now, one would be tempted to run "up2date -f" and force the issue, but I knew right away that Stratus had to have its hooks in somewhere. You see, even though the I/O redundancy is handled by RHEL, the hardware remains aware of when there is a lack of redundancy. For instance, after restoring power to one of the modules following a test of its resiliency, RHEL's MD had to remirror the root drive. During this time, the hardware flashed its LEDs in that characteristic way that VOS systems do to signal that they are not currently in a redundant and fail-safe mode. Once mdadm showed the mirroring to be complete, the lights stopped flashing.

A quick run of rpm's verify showed exactly what configuration files were altered:

[root@sbkrhelstratp01 rhn]# rpm -Vc initscripts
S.5....T. c /etc/inittab
S.5....T. c /etc/rc.d/rc.sysinit
[root@sbkrhelstratp01 rhn]#
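For anyone who doesn't read rpm's verify flags daily: each letter in that first column marks one attribute that differs from what the package recorded (S is size, 5 is the MD5 sum, T is the modification time), and a dot means that attribute checked out. Here is a small sketch that spells the letters out; the function name explain_verify_flags and the wording of the labels are my own, not rpm's.

```shell
#!/bin/bash
# explain_verify_flags FLAGS  (hypothetical helper)
# Turns an "rpm -V" flag string such as "S.5....T." into words.
# Dots and any unrecognized characters are ignored.
explain_verify_flags() {
    local flags=$1 out="" c i
    for (( i = 0; i < ${#flags}; i++ )); do
        c=${flags:i:1}
        case $c in
            S) out+="size " ;;
            M) out+="mode " ;;
            5) out+="md5 " ;;
            D) out+="device " ;;
            L) out+="symlink " ;;
            U) out+="owner " ;;
            G) out+="group " ;;
            T) out+="mtime " ;;
        esac
    done
    echo "${out% }"
}
```

So both files above changed in size, checksum and modification time, which is what you would expect from hand-edited config files.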

A careful scan of the rc.sysinit file drew my attention to several operations that were commented on by Stratus. These seemed to involve RAID. I fully expected to find something in inittab requiring respawns, and sure enough, there they were:

osm:2345:respawn:/opt/ft/sbin/osm
ftmo:12345:respawn:/opt/ft/sbin/miceope

My options at this point include saving copies of these and forcing the up2date, or perhaps reaching out to the yum server at Stratus to see if it keeps an updated rpm of initscripts.

Ideally, I'd like to create a custom software channel on my satellite server based upon the yum server at Stratus.

Right now, I'm stalled.

Monday, June 25, 2007

Found this photo on Timothy Allen's photojournal. It is the very definition of irony.

Friday, June 22, 2007

SSH Key Agent and Screen

I love screen. I use it whenever I can. I even experimented a bit with ratpoison, that's how much I love screen. One thing that drove me mad, though, was that SSH's key agent (ssh-agent) and screen are not good buddies. The problem is that old window sessions point to old SSH sockets to the agent. If I detach my screen session, log out, log back in later, and reattach to that session, SSH points to old sockets. What's the point of screen if I can't logout and login keeping a persistent state of things? With SSH being core to everything I do, I can't go without it. At work, key agents are especially important with our smartcards.

So, I made a hack to allow me to forward my key info through my screen sessions. This hack is, well, a hack, but it works for me.

First things first, edit your .screenrc file to contain a line like this:

setenv SSH_AUTH_SOCK $HOME/tmp/socket

This makes every window in your screen session point to a custom socket rather than the system-set socket to your key agent.

Next, make a script that does something like this:

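#!/bin/sh
# Point the fixed path named in .screenrc at the current agent socket.
# -f keeps rm quiet on the first run, when there is no old link yet.
/usr/bin/rm -f /export/home/username/tmp/socket
/usr/bin/ln -s "$SSH_AUTH_SOCK" /export/home/username/tmp/socket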

This script creates a softlink from our own socket to the real key agent socket as presented by SSH_AUTH_SOCK. I called this script "screen-ssh-agent" and stuck it in my personal bin directory. Now, for your login, you need something like this to execute:

~/bin/screen-ssh-agent

Old-timey SA's like myself use tcsh, so I just added this to my ".login".

Now, after I log in to this box and kick off screen, running ssh from any window inside will refer to the statically named file "tmp/socket" that links to the real socket that is uniquely created and named by sshd every time I log in.

One key to rule them all!

Veritas Volume Replicator and Red Hat

Here's some advice: if you plan on using Veritas Volume Replicator with Red Hat Enterprise Linux, AVOID USING EXT3FS!

I've seen the combo just hang on sending updates to the remote secondary. It would just sit there when trying to drain the Storage Replicator Log (SRL); vxrlink status would show it not being drained.

To remedy this, we had to force the SRL to overflow into DCM logging by creating a big enough bogus file (mkfile or dd if=/dev/zero) and then use vradmin to resync. Forcing it to clear the DCM was the only way to make the problem go away and resume replication!

This problem would appear almost randomly and we eliminated the size of the volumes as a factor. One volume was in the terabytes while another was several gigabytes. Both exhibited this problem.

Both ext2fs and vxfs worked fine. To preserve file system journaling, we went with vxfs. We had no reason not to; we had originally gone with ext3fs only because there were performance questions about vxfs in a past project on Solaris. Since this new project was on RHEL, we found no reason not to convert to vxfs.

So, stick to vxfs with vvr. YMMV, but it worked well for us.

Newly Gray

This is the first entry for my blog, Graying Matter. Posts will include matters relating to politics and technology.
