Diff’ing files over the network

This is a godsend. Wish I had thought about doing this before.


$ diff source/worksforme.php <(ssh -n me@liveserver cat /home/me/source/worksforme.php)

You can also compare files on two remote hosts.


$ diff <(ssh -n me@testserver cat /home/me/source/worksforme.php) <(ssh -n me@clientserver cat /home/me/source/worksforme.php)
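The <( ) syntax used above is process substitution, which bash, zsh and ksh provide but a plain POSIX sh does not. In such a shell, much the same result can be had by piping the remote file into diff and passing - as the second file. A minimal sketch; the remote_diff helper name is made up for illustration:

```shell
# remote_diff LOCAL_FILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND and diffs its output against LOCAL_FILE.
# diff reads its second "file" from stdin when given "-".
remote_diff() {
    local_file=$1
    shift
    "$@" | diff "$local_file" -
}

# Intended use, with the host and paths from the example above:
#   remote_diff source/worksforme.php \
#       ssh -n me@liveserver cat /home/me/source/worksforme.php
```

As with diff itself, the exit status is 0 when the files match and 1 when they differ, so the helper can also be used in scripts.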

Finding files by date with the Linux find command

Here’s something I’ve wanted to know how to do since forever.

Use this trick to find files that have been modified since some arbitrary date:

$ touch -d "13 may 2001 17:54:19" date_marker
$ find . -newer date_marker

To find files that have not changed since that date, use the cnewer and negation conditions. Note that -cnewer tests the inode status-change time (ctime), not a creation time:

$ find . ! -cnewer date_marker

And to delete them, use the built-in “delete” action, e.g.:

$ find . ! -cnewer date_marker -delete
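Since -delete is unforgiving, it can be worth rehearsing the match in a scratch directory first. The sketch below assumes mktemp is available, uses touch -t (the POSIX spelling of the timestamp) and sticks to the mtime-based -newer test; the file names are made up:

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
touch -t 200001010000 "$demo/old.txt"          # mtime: 1 Jan 2000
touch "$demo/new.txt"                          # mtime: now
touch -t 200105131754.19 "$demo/date_marker"   # mtime: 13 May 2001 17:54:19

# Files modified after the marker.
newer=$(cd "$demo" && find . -type f -newer date_marker)

# Files NOT modified after the marker. The marker is never newer than
# itself, so it would match too; exclude it by name.
older=$(cd "$demo" && find . -type f ! -newer date_marker ! -name date_marker)

echo "newer: $newer"
echo "older: $older"
```

This reports ./new.txt as newer and ./old.txt as older. Once a printed list looks right on real data, the same expression can be rerun with -delete appended.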

Discovered in the Irish Linux Users Group’s exceptional online tutorial.

Kernel conflicts in really old versions of Fedora

If you find yourself upgrading a wicked old Fedora distro, you may run across an error like this:

Error: Package initscripts needs kernel < 2.6.12, this is not available.

Try upgrading your kernel. Before you do, make sure that only the latest installed version is hanging around: because of a bug in legacy versions of yum, any older versions of the kernel must be removed first.

To check your installed kernels, run:

$ rpm -qa | grep kernel

Note that if you simply run “rpm -q kernel”, you may miss smp kernels or other variants. The above will give you a complete picture. Remove all except the very latest version, then run yum update.

rsync and bzip2-compressed data

Because it transfers only the deltas between source and destination files, rsync is a great backup tool when working with uncompressed data. The structure of compressed data, however, can change drastically between backups, defeating the benefits of rsync. I’d read somewhere recently that bzip2’s “blocking” design might make it a viable compression format to use with rsync, so I ran an ad hoc experiment this morning to check this out.

Uncompressed Data

Here are the results from rsync’ing an uncompressed MySQL database with a few minor record changes.

total: matches=1634 hash_hits=2136 false_alarms=0 data=21227
sent 6.73K bytes received 9.92K bytes 6.66K bytes/sec
total size is 2.69M speedup is 161.44

Nice. Only about 7K transferred. Roughly the size of the change.

bzip2 Compressed Data

And here are the results from rsync’ing the same MySQL database, compressed in advance with bzip2.

total: matches=596 hash_hits=17533 false_alarms=1 data=876602
sent 876.99K bytes received 7.73K bytes 353.89K bytes/sec
total size is 1.64M speedup is 1.85

Whoa! What amounts to about a 7K change is resulting in roughly 130x the data sent (877K versus under 7K).
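The speedup figure rsync prints appears to be simply the total size divided by the bytes sent plus received. A quick awk check under that assumption, treating K as 1,000 and M as 1,000,000 and feeding in the rounded figures from the two runs above (the speedup helper name is made up):

```shell
# speedup TOTAL_K SENT_K RECEIVED_K   (all in thousands of bytes)
# Hypothetical helper: total size divided by bytes on the wire.
speedup() {
    awk -v total="$1" -v sent="$2" -v recv="$3" \
        'BEGIN { printf "%.2f\n", total / (sent + recv) }'
}

speedup 2690 6.73 9.92      # uncompressed run
speedup 1640 876.99 7.73    # bzip2 run
```

This prints 161.56 and 1.85. The second matches rsync's own figure; the first is slightly off from the reported 161.44, presumably because the inputs here are already rounded.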

Which makes sense: digging into the details of bzip2, I see that the algorithm chunks data in 100K to 900K blocks. So I suppose that using bzip2 might make sense if you have an incrementally growing data store that adds about 100K of data between backups, and where the older data rarely, if ever, changes. Barring that, to achieve the benefits of rsync, uncompressed data is probably the way to go.

That said, there seems to be a version of gzip with an --rsyncable switch for Debian.  The BeezNest has a great article on this here.

More tail Tails

Thanks to the little Cygwin setup/upgrade app, I periodically come across useful new utilities.  For example, even though I’ve only been playing with these two for a couple of days, I’m already wondering how I ever did without them.

multitail

multitail renders multiple tail logs in an ncurses-formatted window. Perfect for monitoring access and error logs at the same time.  Much better than interleaving multiple logs with tail.  Now I don’t have to squint through output in an attempt to figure out which data is from which log.

since

since is tail with a memory. Whereas tail displays the last 10 lines (or whatever value you define with -n/--lines) of a file, since starts from where you left off the last time you ran it. This is an excellent tool for when you’re toggling between watching a log, tweaking configuration parameters, and then returning to log monitoring.
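To make the “tail with a memory” idea concrete, here is a toy sketch of the bookkeeping involved. It is not how since itself is implemented; the since_sketch name and the .offset state file are invented for illustration:

```shell
# since_sketch FILE [STATE_FILE]
# Toy illustration only: print whatever was appended to FILE since the
# previous call, remembering the byte offset in STATE_FILE.
since_sketch() {
    file=$1
    state=${2:-$file.offset}          # hypothetical state-file location
    offset=0
    if [ -f "$state" ]; then
        offset=$(cat "$state")
    fi
    size=$(($(wc -c < "$file")))      # arithmetic strips any padding
    if [ "$size" -lt "$offset" ]; then
        offset=0                      # file shrank (rotated?): start over
    fi
    tail -c +"$((offset + 1))" "$file"
    echo "$size" > "$state"
}
```

The first call on a log prints the whole file; later calls print only what has been appended since, and nothing at all if the log has not grown.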

Going to have to upgrade Cygwin more often.

Sherlocking Linux Distros

You log in to a mysterious new box. There is no login message. You poke around and before long you start to wonder “So what the heck distro is this anyway?”

$ uname -a
just tells you all about the kernel. Hmmm.  A mystery.

To pull up details on the distribution, take a peek in /etc/issue. This text file is typically what is displayed to users before the login prompt, and usually contains distribution-specific details. Likewise, look for /etc/*release or /etc/*version, which various distributions use to tag the release version.
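Those checks can be rolled into a few lines of shell. A rough sketch; the whatdistro name and the optional root-directory argument (handy for inspecting a mounted or chrooted system) are made up here:

```shell
# whatdistro [ROOT]
# Hypothetical helper: print /etc/issue plus any /etc/*release and
# /etc/*version files found under ROOT (default: the running system).
whatdistro() {
    root=${1:-}
    for f in "$root"/etc/issue "$root"/etc/*release "$root"/etc/*version; do
        [ -f "$f" ] || continue   # skip globs that matched nothing
        printf '== %s ==\n' "$f"
        cat "$f"
    done
    return 0
}
```

Run with no arguments, it dumps whichever of those files exist on the current box, each under a header naming the file it came from.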

Elementary!

Why I love Gentoo

After doing an emerge world on an older box I get:

* Messages for package sys-libs/com_err-1.40.4:

* PLEASE PLEASE take note of this
* Please make *sure* to run revdep-rebuild now
* Certain things on your system may have linked against a
* different version of com_err -- those things need to be
* recompiled.  Sorry for the inconvenience

In other words, “This update is going to break your system. We kind of screwed up. Something changed. Here’s how to fix it.”

So now I don’t have to google around frantically trying to figure out why randomserviced is suddenly failing. And the Gentoo guys even apologize.

Honesty, transparency, and humility in software. Go figure.