Info


I often use the following arguments to perl:

  • -e Executes the given line of code instead of a script file

  • -n Wraps your code in a loop over the input lines, read through the diamond operator (the named files, or stdin). Each line is available in $_; nothing is printed unless your code prints it

  • -p Like -n, but also prints $_ at the end of every iteration


Perl One-Liners




One-Liner: Count the number of times a specific character appears in each line


  • This counts the number of quotation marks in each line and prints it

perl -ne '$cnt = tr/"//;print "$cnt\n"' inputFileName.txt


One-Liner: Add string to beginning of each line


  • Adds string, followed by a tab, to the beginning of each line

perl -pe 's/(.*)/string\t$1/' inFile > outFile


One-Liner: Add newline to end of each line


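  • Append a newline to each line (double-spaces the file). $ matches just before the line's own trailing newline, so the extra \n lands at the end of the line

perl -pe 's/$/\n/' all.sent.classOnly > all.sent.classOnly.sep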


One-Liner: Print only some columns of a file


  • Columns separated by a space

cut -d" " -f 1,2,3,4 fileWithLotsOfColumns.txt > fileWithOnlyFirst4Cols.txt


One-Liner: Print all columns except the first


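  • Columns separated by a space; --complement is a GNU cut option (cut -d" " -f 2- gives the same result without it)

cut -d" " -f 1 --complement filename > filename.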


One-Liner: Replace a pattern with another one inside the file with backup


  • Replace all occurrences of pattern1 (e.g. [0-9]) with pattern2. The file is edited in place; the original is kept as inputFile.bak

perl -p -i.bak -w -e 's/pattern1/pattern2/g' inputFile


One-Liner: Print only words without uppercase letters


  • Go through file and only print words that do not have any uppercase letters.

perl -ne 'print unless m/[A-Z]/' allWords.txt > allWordsOnlyLowercase.txt


One-Liner: Print one word per line


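  • Go through file, split line at each space and print words one per line. The -l switch strips the newline from each input line and adds one to every print, so no extra blank lines appear.

perl -lne 'print join("\n", split(/ /,$_))' someText.txt > wordsPerLine.txt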


One-Liner: Kill all screen sessions (no remorse)


  • Since there's no screen command that would kill all screen sessions regardless of what they're doing, here's a perl one-liner that really kills ALL screen sessions without remorse. It prints one session PID per line and only for lines that actually contain one.

screen -ls | perl -ne 'print "$1\n" if /^\s+(\d+)\./' | xargs kill -9

  • The killall command may also do the job...


One-Liner: Return all unique words in a text document (divided by spaces), sorted by their counts (how often they appear)


  • Assuming no punctuation marks:

perl -ne 'print join("\n", split(/\s+/,$_));print("\n")' documents.txt > wordsOnePerLine.txt
sort wordsOnePerLine.txt | uniq -c | sort -n > wordCountsSorted.txt


One-Liner: Delete all special characters


  • Or in other words, delete every character that is not a letter, white space or line end (replace with nothing)

perl -pe 's/[^a-zA-Z\s]+//g' text_withSpecial.txt > text_lettersOnly.txt


One-Liner: Lower case everything


  • perl -pe 'tr/A-Z/a-z/' textWithUpperCase.txt > textwithoutuppercase.txt


One-Liner Combination: Combine lower-casing with word counting and sorting


  • perl -pe 'tr/A-Z/a-z/' sentences.txt | perl -lne 'print join("\n", split(/ /,$_))' | sort | uniq -c | sort -n


One-Liner: Print only one column


  • Print only the second column of the data when tabs are used as the separator

  • perl -F'\t' -lane 'print $F[1]' columnFileWithTabs.txt > justSecondColumn.txt


One-Liner: Print only text between tags


  • perl -ne 'while (m/<a>(.*?)<\/a>/g){print "$1\n"}' textFile

  • The same as a script:

  • Extracting multiple multiline patterns between a start and an end tag

    • Here, we want to extract everything between <parse> and </parse>.

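    • #!/usr/bin/perl -w
      use strict;

      # Slurp mode: with $/ undefined, <$dat> reads the whole file at once
      local $/;

      open(my $dat, '<', "yourFile.xml") || die("Could not open file: $!");
      my $content = <$dat>;
      close($dat);

      # /s lets . match newlines, /g finds every <parse>...</parse> block
      while ($content =~ m/<parse>(.*?)<\/parse>/sg) {
          print "$1\n";
      }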


One-Liner: Sort lines by their length


  • perl -e 'print sort {length $a <=> length $b} <>' textFile


One-Liner: Print second column, unless it contains a number


  • perl -lane 'print $F[1] unless $F[1] =~ m/[0-9]/' wordCounts.txt


One-Liner: Trim/collapse white space and replace newlines with something else


  • echo "The cat sat on the mat
    asd sad das " | perl -ne 's/\n/ /; print $_; print(";")' | perl -ne 's/\s+/ /g; print $_'


One-Liner: Get the average of one column from certain lines


  • Averages the 30th comma-separated column ($F[29]) over all lines that match the grep pattern

  • grep "another criterion" thisDataFile.txt | perl -F, -lane 'print $F[29]' | awk '{sum+=$1} END { if (NR) print "Average = ",sum/NR}'


One-Liner: How to sort a file by a column


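  • Columns are separated by a space, we sort numerically (-n) and we sort by the 10th column only (-k10,10; a plain -k10 would use everything from column 10 to the end of the line as the key)

  • the standard sort utility does the job here, no perl needed ;)

  • sort -t' ' -n -k10,10 eSet1_both.txt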
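A small sketch of the same sort on made-up two-column data, using the second column as the key:

```shell
# Hypothetical input; numeric sort puts 2 before 10, unlike a plain text sort.
by_col=$(printf 'b 10\na 2\nc 33\n' | sort -t' ' -n -k2,2)

printf '%s\n' "$by_col"
```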


One-Liner: Replace specific space but also copy a group of matches


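  • Matches a group of digits at the beginning of a line that is followed by a space and a double quote, then replaces that space with a tab while copying the digits back in via $1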

  • perl -p -i.bak -w -e 's/^([0-9]+) "/$1\t"/g' someFile.txt


More info



Sending multiple commands to screen sessions on different machines


  • #!/usr/bin/perl -w

    # This script creates screen sessions, ssh's to machines and executes code on these machines.
    # parameters: -s (start) -r (run) -q (quit)
    # HowTo:
    # 1) change the executed code, property folder and prefix to your values
    # 2) select your machines
    # 3) on the machine that should host the screen sessions, start them: ./clusterSubmitJobs.pl -s
    # 4) send the code to every session: ./clusterSubmitJobs.pl -r
    # 5) once you're done and want to quit all your sessions: ./clusterSubmitJobs.pl -q
    # author: richard socher.org

    use strict;
    use Getopt::Std;
    use List::Util qw[min max];
    my %options=();
    getopts("srq",\%options);

    #------------------
    # files to be considered
    my $folder = '/folderWithInputFiles';
    my $prefix = 'tests_';
    my $ext = '.config';
    # code to run with files
    my $code = './runMyScript.sh -configFile ';

    # deprecated by mstat
    my @freemachines = ('machine1.yourPlace.edu', 'machine2.yourPlace.edu');
    #-------------------


    # $folder has no trailing slash, so add one before the file prefix
    my $full = $folder . '/' . $prefix . '*' . $ext;
    print "Using files: $full \n";

    my @files = glob("$full*");
    my $numMachines = @freemachines;
    my $numFiles = @files;
    my $minNum = min($numMachines,$numFiles);

    for (my $i = 0; $i < $minNum; $i++) {
        if ($options{s}) {
            # create a detached session named after the machine, then ssh into it
            print "Creating screen session: $freemachines[$i] for \t $files[$i] \n";
            system("screen -d -m -S $freemachines[$i]");
            system("screen -S $freemachines[$i] -p 0 -X stuff \"ssh $freemachines[$i]\015\"");
        }

        if ($options{r}) {
            # type the command into the session; \015 is the carriage return that submits it
            print "run: screen -S $freemachines[$i] -p 0 -X stuff \"$code $files[$i]\"\n";
            system("screen -S $freemachines[$i] -p 0 -X stuff \"$code $files[$i]\015\"");
        }

        if ($options{q}) {
            # first exit leaves the ssh connection, second exit closes the screen session
            print "screen -S $freemachines[$i] -p 0 -X stuff \"exit\"\n";
            system("screen -S $freemachines[$i] -p 0 -X stuff \"exit\015\"");
            system("screen -S $freemachines[$i] -p 0 -X stuff \"exit\015\"");
        }
    }


How to install new CPAN modules?


  • perl -MCPAN -e shell # go to CPAN install mode
    install Bundle::CPAN # update CPAN
    reload cpan
    install Set::Scalar