Politics and Technology.

Wednesday, April 15, 2009

Stripping Control-M

One annoying little consequence of being a Unix geek in a Windows world is the common problem of having to strip the "^M" characters (a.k.a. "control-m", or carriage returns) that show up before the line breaks of text files. These files were typically created on a Windows box, where each line ends with a carriage return followed by a line feed (CRLF), and then transferred to a Unix box, where only a line feed (LF) is used.
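Before stripping anything, it can help to confirm that a file really has Windows line endings. Here is a minimal sketch, assuming a POSIX shell and grep; the function name count_crlf_lines is made up for illustration and is not a standard tool.

```shell
# Hypothetical helper: print how many lines of a file end in a carriage return.
count_crlf_lines() {
    # Build a literal CR character; command substitution only trims newlines.
    cr=$(printf '\r')
    # grep splits input on line feeds, so a CRLF line shows up as "text<CR>".
    # -c prints the number of matching lines (0 if there are none).
    grep -c "${cr}\$" "$1"
}
```

Running count_crlf_lines windows.txt on a file that came from a Windows box should print a non-zero count; after any of the conversions it should print 0.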

As the adage goes, there's more than one way to skin a cat (it just depends on how you want the pelt to look), and there are many little tricks and tools to remove the "^M" from text files. Unfortunately, I go through this exercise so infrequently that every time I encounter the problem, I have forgotten some of the solutions and am forced to spend 10 or so minutes looking them up.

So, at least for my sake, I'm going to record some ways here. Where "^" is indicated, hold the "Ctrl" key and press the next character.


Perl Judo


# perl -pe 's/\x0D//g' windows.txt > unix.txt


Transcode Karate


# tr -d '\r' < windows.txt > unix.txt
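One thing to keep in mind is that tr deletes every carriage return in the file, including any that sit in the middle of a line. If only the ones right before a line break should go, a sed substitution anchored at the end of the line is one option. This is a rough sketch, assuming a POSIX shell and sed; strip_trailing_cr is a name I made up for it.

```shell
# Hypothetical helper: remove a carriage return only when it ends a line.
# Reads the named files, or standard input when no file is given.
strip_trailing_cr() {
    # Build a literal CR character to embed in the sed expression.
    cr=$(printf '\r')
    # The "$" anchor limits the match to a CR directly before the line feed.
    sed "s/${cr}\$//" "$@"
}
```

Usage would look like strip_trailing_cr windows.txt > unix.txt, which matches the other recipes here.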


GNU Kung Fu (Solaris variety)


# dos2unix windows.txt unix.txt


GNU Kung Fu (Linux variety)


# dos2unix -n windows.txt unix.txt


Editor Jujitsu (Vi flavored)


:%s/^v^m//g


Editor Jujitsu (Emacs flavored)


esc-x comint-strip-ctrl-m
