Guidelines for Writing Perl Code

1) Start off with a disclaimer notice. A sample disclaimer notice, which is
usually part of all the scripts deployed at SLAC, is:

    #--------------------------------------------------------------*/
    #                       DISCLAIMER NOTICE                      */
    #                                                              */
    # This document and/or portions of the material and data       */
    # furnished herewith, was developed under sponsorship of the   */
    # U.S. Government. Neither the U.S. nor the U.S.D.O.E., nor    */
    # the Leland Stanford Junior University, nor their employees,  */
    # nor their respective contractors, subcontractors, or their   */
    # employees, makes any warranty, express or implied, or        */
    # assumes any liability or responsibility for accuracy,        */
    # completeness or usefulness of any information, apparatus,    */
    # product or process disclosed, or represents that its use     */
    # will not infringe privately-owned rights. Mention of any     */
    # product, its manufacturer, or suppliers shall not, nor is it */
    # intended to, imply approval, disapproval, or fitness for any */
    # particular use. The U.S. and the University at all times     */
    # retain the right to use and disseminate same for any purpose */
    # whatsoever.                                                  */
    #--------------------------------------------------------------*/
    # Copyright (c) 2006
    # The Board of Trustees of
    # the Leland Stanford Junior University. All Rights Reserved.

2) Provide a description of the script's or subroutine's function, what
input it uses and what output it produces. It is often helpful to include
examples of input and output. Provide the author and creation date. Provide
information on how to use the script (some examples are usually useful),
including any options and their default values.

3) All Perl modules should be checked for possible syntax errors using the
perl -c option.

4) It is good practice to declare all variables, using my, our or local.
Enable the strict pragma (use strict;) so that Perl enforces this.

5) Use the taint and warning features.
Go over the attached Annex A to read and understand the benefits of using
strict, taint and warnings. The first line of your code should look like:
a. #!/usr/bin/perl -w
b. Or, if this is a CGI script, then use the taint (-T) option also:
   #!/usr/bin/perl -wT

6) Check the return codes of any functions called (e.g. opening or closing
a file) and provide a diagnostic message for anomalous results, together
with the error code(s).

7) For warnings, diagnostics, and information about how the script is
working etc., use STDERR. For the output that is the purpose of the script
use STDOUT, so the output can be piped to another process.

8) If the script produces a report, then include in the report output the
name of the script that created it (you can usually use $0), the time it
was created (use scalar(localtime())), whom to contact for further
information and/or who ran the script (scalar(getpwuid($<))). It is also
good to include the name of the host that ran the script. One way to do
this is:

    use Sys::Hostname;
    use Socket;   # provides the AF_INET constant
    # Packed IPv4 address of the local host
    my $ipaddr=gethostbyname(hostname());
    # The four octets of the address
    my ($a, $b, $c, $d)=unpack('C4',$ipaddr);
    # Reverse lookup to get the canonical host name
    my ($hostname,$aliases, $addrtype, $length, @addrs)=gethostbyaddr($ipaddr,AF_INET);

9) Where possible re-use code rather than replicating it; this should
result in easier maintenance. Subroutines can help in this.

10) Code Commenting: Ensure that the code is properly commented. There
should be comments, if not with each statement, then at least with each
block of code or loop.
For instance, a file or a major block of code should begin with the purpose
and description of the code:

    ##########################################
    #Get TULIP responsive host distances between Guthrie and Tulip
    # The Input gives the IP addresses and distances of responsive TULIP hosts
    # Example:
    # 30.87.34.2,8551.236157
    # 140.181.96.29,18488.23791
    # 203.181.248.234,5148.805353
    # 203.181.248.241,5148.805353
    # 134.160.224.254,16684.70444
    # 136.142.111.46,1685.3686
    # Returns:
    # A hash keyed by IP address, with the distance as the value.

Each subroutine should also have a brief description like:

    ###########################################
    #Executes synack to a target, and parses the results to provide
    # 0 if successful or -1 otherwise. If unsuccessful then prints
    #an error message to STDOUT.
    # The target is provided in the argument, i.e. $_[0]
    #Example of Unix synack output that is parsed by this sub:
    #SYN-ACK to dns1.ethz.ch (129.132.98.12), 4 Packets
    #
    # connection for seq no: 0 timed out within 1.000 Secs
    # connection for seq no: 1 timed out within 1.000 Secs
    # connection for seq no: 2 timed out within 1.000 Secs
    # connection for seq no: 3 timed out within 1.000 Secs
    #
    # Waiting for outstanding packets (if any)..........
    #
    #
    # ***** Round Trip Statistics of SYN-ACK to dns1.ethz.ch (Port = 22) ******
    # 4 packets transmitted, 0 packets received, 100.00 percent packet loss
    # round-trip (ms) min/avg/max = 0.000/0.000/0.000 (std = 0.000)
    # (median = 0.000) (interquartile range = 0.000)
    # (25 percentile = 0.000) (75 percentile = 0.000)
    sub synack {
      my $target=$_[0];
      if($^O=~"linux") {
        my $success=0;
        ...

It is always good to maintain a change history. It is of great help when
applications are in production environments and someone needs to make a
slight modification.
So include a version number, the date, and the major changes made for this
version, and comment out older version numbers and their descriptions, for
example:

    #my $version="3.73, 3/2/06, Les Cottrell";
    # Replaced use of uname with $^O to improve tainting
    # Increased the debug output to show the pwd
    #my $version="3.74, 5/8/06, Les Cottrell";
    #Enabled pings from onsite to onsite
    my $version="4.0, 5/13/06, Les Cottrell";
    #Added synack for SLAC

The rule of thumb is to keep this in mind while coding: "How would it feel
if someone gave me similar code, I had to make one small change, and it
took me a whole day just to find that one line? If I don’t want to mess up
someone’s day, I had better put in useful comments."

11) Layout style
See http://www-iepm.slac.stanford.edu/admin/perlstyles.html
Also, go through the following link for better understanding:
http://perl.apache.org/docs/2.0/devel/core/coding_style.html

12) Variable and File Names
Variable and file names often convey more than comments. Try to use good
variable names. Long names are not always good, but sometimes they are
needed to convey the meaning. See
http://www.unix.org.ua/orelly/linux/cgi/ch16_02.htm
Avoid using the same names repeatedly; for instance, if downsites is a
script, it should not also be the name of the database. Just like you
don’t want to name your kids Waqar1, Waqar2 and Waqar3, you should not
name the files downsites, downsites10 and downsites2.

Golden Principles: To get a brief overview of the golden principles,
please see:
http://www.bbc.co.uk/guidelines/newmedia/technical/perl.shtml
http://www.wpi.edu/Pubs/Policies/Web/CGI/guidelines.html

13) Options
Use Getopt (see Perl manual) for options. If you need options then you
must also add a USAGE section to your code. An example of a USAGE is given
below:

    (my $progname = $0) =~ s'^.*/''; # strip path components, if any
    my $allurl = 'http://www-iepm.slac.stanford.edu/cgi-wrap/pingtable.pl?format=tsv' .
        '&file=minimum_rtt&by=by-node&size=100&tick=allmonthly&from=WORLD' .
        '&to=WORLD&ex=none&dataset=hep&percentage=any';
    my $dir = "/afs/slac.stanford.edu/package/netmon/pinger/pingerDB/code/pingermanagement";
    my $file = "$dir/data/minrtt";
    my $allfile="$dir/minrtt2";
    my $version="0.6, 5/15/06, Waqar and Les";
    my $url = 'http://www-iepm.slac.stanford.edu/cgi-wrap/pingtable.pl?format=tsv' ### by default , get las
        . '&file=minimum_rtt&by=by-node&size=100&tick=last60days&from=WORLD'
        . '&to=WORLD&ex=none&dataset=hep&percentage=any';
    my $USAGE = "Function: This script downloads the PingER data and saves it
    Usage:\t $progname [opts]
    Opts: [-i allmonthly] [-v]
    Where:
      allmonthly selects the URL for the input and output, the defaults are:
        input URL:   $url
        output file: $file
      if allmonthly is specified then the input URL is
        $allurl
      and the output file is
        $allfile
      -v provides this output.
    Version: $version
    ";
    ...
    use Getopt::Std;         # standard replacement for the old getopts.pl
    our ($opt_i, $opt_v);    # set by getopts()
    getopts('i:v');          # -i takes a value, -v is a flag
    if(defined($opt_v)) {print $USAGE; exit 1;}

Annex A
-------
Perl: Strict, Warnings, and Taint
By Thomas Gutschmidt
Taken from: http://www.developer.com/lang/perl/article.php/1478301

There are three common tools to help Perl programmers write clean and
maintainable code: the strict pragma, the warnings pragma, and taint
checking.

Strict and warnings are probably the two most commonly used Perl pragmas,
and are frequently used to catch "unsafe code." When Perl is set up to use
these pragmas, the Perl compiler will check for, issue warnings against,
and disallow certain programming constructs and techniques. In Perl (5.6.0
or later), pragmas are set up with the use command:

    use strict;
    use warnings;

The strict pragma checks for unsafe programming constructs. Strict forces a
programmer to declare all variables as package or lexically scoped
variables. Strict also forces specific syntax with sub, forcing the
programmer to call each subroutine explicitly.
The programmer also needs to use quotes around all strings, which forces a
distrust of bare words.

The warnings pragma sends warnings when the Perl compiler detects a
possible typographical error and looks for potential problems. There are a
number of possible warnings (check the man pages or Activestate's perldiag
document page), but warnings mainly look for the most common syntax
mistakes and common scripting bugs.

Perl's ability to allow unchecked data has led to a number of problems with
Web servers and CGI utilizing Perl. To assist programmers in avoiding
suspect data, Perl has taint checking built in. Taint protects code by
automatically tagging any variable assigned from outside of the program as
"tainted" and therefore unsafe. Taint specifically marks user input, file
input, and environment variables. With taint enabled, a program cannot use
tainted data to affect anything outside of the actual script. Data that has
been marked as tainted spreads the mark to any other data it comes in
contact with, so if you use a tainted variable to change a second variable
within the script, the second variable also becomes tainted.

Taint basically halts any tainted data being sent through eval, system,
exec, or open calls. Taint also stops you from calling any external program
without first setting a PATH environment variable. To turn on taint, you
need to use the -T switch:

    perl -T program.pl

On a Web server, taint should be added to the first line of the script,
which names the path the Web server uses to find the Perl interpreter. If
Perl is located within the /usr/local/bin directory, each Perl script on
that Web server should begin with the line:

    #!/usr/local/bin/perl -T

It is normally recommended that all Perl scripts used online or on a Web
server should have taint enabled, and that programmers always use the
strict and warnings pragmas. None of these are capable of writing secure
code for you, however.
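As an illustration of the taint rules described above, here is a minimal
sketch of the usual way to handle outside data: validate it with a regular
expression and use only the captured text. The helper name
untaint_filename, the allowed character set, the default file name and the
PATH value are all assumptions made for this example; they are not taken
from the article. Run it with perl -T to see taint checking in effect.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the name if it contains only characters we
# consider safe, otherwise returns nothing (undef in scalar context).
# Text captured by a successful regular expression match is not tainted.
sub untaint_filename {
    my ($name) = @_;
    return unless defined $name;
    if ($name =~ /^([A-Za-z0-9_][A-Za-z0-9_.-]*)\z/) {
        return $1;
    }
    return;
}

# With taint enabled, external programs cannot be run until PATH is set
# explicitly. The directories chosen here are an assumption.
$ENV{PATH} = '/bin:/usr/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# The command-line argument counts as outside data.
my $requested = defined $ARGV[0] ? $ARGV[0] : 'report.txt';
my $safe = untaint_filename($requested);
if (defined $safe) {
    print "accepted: $safe\n";
} else {
    print STDERR "rejected unsafe file name\n";
}
```

The pattern is deliberately narrow: it rejects path separators, shell
metacharacters and names beginning with a dot, so a value such as
../etc/passwd never reaches open or system.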
Warnings will alert you to common bugs, strict will alert you to common
syntax errors, and taint forces you as a programmer to think about what you
are doing with outside data.

One of the RFCs (requests for comments) in Perl 6 is more of a request for
stagnancy. RFC number 16 is to "Keep default Perl free of constraints such
as warnings and strict." One of Perl's strengths is its ability to be a
quick and dirty solution. Quick hacks, one liners, and fast fixes are the
breath of life of Perl, but they also often break common syntactical rules
that taint, warnings, and strict would detect. This RFC will help ensure
that Perl continues to be the language of choice when it comes to the
often-necessary quick fix. But, by definition, quick fixes need to be
quickly written; making these useful tools the default behavior would
require one to add lines of code to turn them off. This would be a major
drawback for short, simple scripts, while larger projects would hardly
suffer by being forced to include them if the programmer needs to use them.
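Returning to the numbered guidelines, items 4 to 8 can be pulled together
in one short sketch. This is only an illustration under stated assumptions:
the subroutine names report_header and count_lines, the contact address and
the report layout are hypothetical, and use warnings is used in place of
the -w switch.

```perl
use strict;
use warnings;
use Sys::Hostname;

# Hypothetical contact; replace with the real maintainer.
my $contact = 'someone@example.org';

# Guideline 8: build a report header giving the script name, the creation
# time, the host, who ran the script and whom to contact.
sub report_header {
    my ($script, $who) = @_;
    my $when = scalar(localtime());
    my $host = hostname();
    return "Report created by $script on $host at $when\n"
         . "Run by: $who; contact: $contact\n";
}

# Guideline 6: check the return codes of open and close, and report the
# system error ($!) on failure. Returns the number of lines read, or -1.
sub count_lines {
    my ($path) = @_;
    my $fh;
    unless (open($fh, '<', $path)) {
        print STDERR "$0: cannot open $path: $!\n";   # guideline 7
        return -1;
    }
    my $n = 0;
    while (defined(my $line = <$fh>)) {
        $n++;
    }
    unless (close($fh)) {
        print STDERR "$0: error closing $path: $!\n";
        return -1;
    }
    return $n;
}

# Who ran the script; getpwuid may find no entry for the user id.
my $who = scalar(getpwuid($<));
$who = 'unknown' unless defined $who;

# By default count the lines of this script itself.
my $input = defined $ARGV[0] ? $ARGV[0] : $0;
my $lines = count_lines($input);

# The report itself goes to STDOUT so that it can be piped onward.
print report_header($0, $who);
print "$input: $lines lines\n" if $lines >= 0;
```

Keeping diagnostics on STDERR means that redirecting or piping the output
of the script captures only the report, while error messages still reach
the terminal or the log.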