Info

I often use the following arguments to perl:

-e Executes the code given on the command line instead of reading a script file
-n Wraps your code in a loop over the input lines (while (<>) { ... }). The lines come from the files named on the command line (the diamond operator) or from stdin, and each one is available in $_
-p Like -n, but also prints $_ at the end of each iteration

Perl One-Liners

One-Liner: Count the number of times a specific character appears in each line

This counts the number of quotation marks in each line and prints it

perl -ne '$cnt = tr/"//;print "$cnt\n"' inputFileName.txt

One-Liner: Add string to beginning of each line

Adds string to each line, followed by a tab
perl -pe 's/(.*)/string\t$1/' inFile > outFile

One-Liner: Add newline to end of each line

Appends an extra newline to each line, so the output is double-spaced. The pattern $ matches just before the existing line end.
perl -pe 's/$/\n/' all.sent.classOnly > all.sent.classOnly.sep

One-Liner: Print only some columns of a file

Columns are separated by a space
cut fileWithLotsOfColumns.txt -d" " -f 1,2,3,4 > fileWithOnlyFirst4Cols.txt

One-Liner: Print all columns except the first

cut -d" " -f 1 --complement filename > filename.

One-Liner: Replace a pattern with another one inside the file with backup

Replace all occurrences of pattern1 (e.g. [0-9]) with pattern2. The file is edited in place (-i) and the original is kept as inputFile.bak.
perl -p -i.bak -w -e 's/pattern1/pattern2/g' inputFile

One-Liner: Print only lines without uppercase letters

Go through a file with one word per line and print only the words that do not contain any uppercase letters.
perl -ne 'print unless m/[A-Z]/' allWords.txt > allWordsOnlyLowercase.txt

One-Liner: Print one word per line

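Go through the file, split each line at every space and print the words one per line. The -l switch strips the line end before splitting and adds it back when printing, so no extra blank lines appear.
perl -lne 'print join("\n", split(/ /,$_))' someText.txt > wordsPerLine.txt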

One-Liner: Kill all screen sessions (no remorse)

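Since there's no screen command that kills all screen sessions regardless of what they're doing, here's a perl one-liner that extracts the PID of every session listed by screen -ls and really kills ALL of them without remorse.
screen -ls | perl -ne 'print "$1\n" if /(\d+)\./' | xargs kill -9
Each PID is printed on its own line, and only for lines that actually match, so xargs receives separate arguments.
The killall command may also do the job...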

One-Liner: Return all unique words in a text document (divided by spaces), sorted by their counts (how often they appear)

Assuming there are no punctuation marks. The -a switch splits each line on white space into @F.
perl -lane 'print for @F' documents.txt > wordsOnePerLine.txt
sort wordsOnePerLine.txt | uniq -c | sort -n > wordCountsSorted.txt

One-Liner: Delete all special characters

Or in other words, delete every character that is not a letter or white space (line ends count as white space and are kept).
perl -pe 's/[^a-zA-Z\s]+//g' text_withSpecial.txt > text_lettersOnly.txt

One-Liner: Lower case everything

perl -pe 'tr/A-Z/a-z/' textWithUpperCase.txt > textwithoutuppercase.txt

One-Liner Combination: Combine lower-casing with word counting and sorting

perl -pe 'tr/A-Z/a-z/' sentences.txt | perl -lne 'print join("\n", split(/ /,$_))' | sort | uniq -c | sort -n

One-Liner: Print only one column

Print only the second column of the data when a tab is used as the separator. The -l switch strips the line end and adds one after each printed value.
perl -lne '@F = split("\t", $_); print $F[1];' columnFileWithTabs.txt > justSecondColumn.txt

One-Liner: Print only text between tags

perl -ne 'print "$1\n" while m/<a>(.*?)<\/a>/g' textFile
The same as a script:
Extracting multiple multiline patterns between a start and an end tag
Here, we want to extract everything between <parse> and </parse>.
#!/usr/bin/perl -w
use strict;

# Undefine the input record separator so the whole file is read at once
local $/;

open(my $fh, '<', "yourFile.xml") || die("Could not open file: $!");
my $content = <$fh>;
close($fh);

# /s lets . match newlines, /g finds every match in turn
while ($content =~ m/<parse>(.*?)<\/parse>/sg) {
    print "$1\n";
}

One-Liner: Sort lines by their length

perl -e 'print sort {length $a <=> length $b} <>' textFile

One-Liner: Print second column, unless it contains a number

perl -lane 'print $F[1] unless $F[1] =~ m/[0-9]/' wordCounts.txt

One-Liner: Trim/collapse white space and replace newlines with something else

echo "The cat sat on the mat
asd sad das " | perl -ne 's/\n/ /; print $_; print(";")' | perl -ne 's/\s+/ /g; print $_'

One-Liner: Get the average of one column from certain lines

grep "another criterion" thisDataFile.txt | perl -ne '@F = split(",", $_); print "$F[29]\n";' | awk '{sum+=$1} END { print "Average = ",sum/NR}'

One-Liner: How to sort a file by a column

Columns are separated by a space; we sort numerically (-n) by the 10th column only (-k10,10).
The sort utility does the job here, no perl needed ;)
sort -t' ' -n -k10,10 eSet1_both.txt
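
A quick way to check the key options is to try them on a few made-up lines. The sketch below sorts by the second column instead of the tenth; the -kN,N form limits the key to field N alone, and -n makes 2 sort before 10.

```shell
# Numeric sort on column 2 of space-separated input
sorted=$(printf 'b 10\na 2\nc 33\n' | sort -t' ' -n -k2,2)
printf '%s\n' "$sorted"
```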

One-Liner: Replace specific space but also copy a group of matches

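Matches a group of digits at the beginning of a line that is followed by a space and a quotation mark, then replaces that space with a tab while copying the digits back via $1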
perl -p -i.bak -w -e 's/^([0-9]+) "/$1\t"/g' someFile.txt

More info

Sending multiple commands to screen session on different machines

#!/usr/bin/perl -w

# This script creates screen sessions, ssh's to machines and executes code on these machines.
# parameters: -s (start) -r (run) -q (quit)
# HowTo:
# 1) change the executed code, property folder and prefix to your values
# 2) select your machines
# 3) on the machine where you want your screen sessions to live, start them: ./clusterSubmitJobs.pl -s
# 4) run the code in each session: ./clusterSubmitJobs.pl -r
# 5) once you're done and want to quit all your sessions: ./clusterSubmitJobs.pl -q
# author: richard socher.org

use strict;
use Getopt::Std;
use List::Util qw[min max];
my %options=();
getopts("srq",\%options);

#------------------
# files to be considered
my $folder = '/folderWithInputFiles';
my $prefix = 'tests_';
my $ext = '.config';
# code to run with files
my $code = './runMyScript.sh -configFile ';

# deprecated by mstat
my @freemachines = ('machine1.yourPlace.edu', 'machine2.yourPlace.edu');
#-------------------

# $folder has no trailing slash, so add the path separator explicitly
my $full = $folder . '/' . $prefix . '*' . $ext;
print "Using files: $full \n";

my @files = glob("$full*");
my $numMachines = @freemachines;
my $numFiles = @files;
my $minNum = min($numMachines,$numFiles);

for (my $i = 0; $i < $minNum; $i++) {
    if ($options{s}){
        print "Creating screen session: $freemachines[$i] for \t $files[$i] \n";
        # start a detached session named after the machine, then ssh into it (\015 is Enter)
        system("screen -d -m -S $freemachines[$i]");
        system("screen -S $freemachines[$i] -p 0 -X stuff \"ssh $freemachines[$i]\015\"");
    }

    if ($options{r}){
        print "run: screen -S $freemachines[$i] -p 0 -X stuff \"$code $files[$i]\"\n";
        system("screen -S $freemachines[$i] -p 0 -X stuff \"$code $files[$i]\015\"");
    }

    if ($options{q}){
        print "screen -S $freemachines[$i] -p 0 -X stuff \"exit\"\n";
        # exit is sent twice: presumably once for the ssh login and once for the local shell
        system("screen -S $freemachines[$i] -p 0 -X stuff \"exit\015\"");
        system("screen -S $freemachines[$i] -p 0 -X stuff \"exit\015\"");
    }
}

How to install new CPAN modules?

perl -MCPAN -e shell # go to CPAN install mode
install Bundle::CPAN # update CPAN
reload cpan
install Set::Scalar