File Uploads Via PERL And CGI

Disclaimer

These instructions/steps worked for me running CentOS. They may very well work for you on Red Hat-like or other distributions. Please note that if you decide to use these instructions on your machine, you do so entirely at your own discretion, and that neither this site, sgowtham.com, nor its author is responsible for any/all damage – intellectual or otherwise.


A while ago, I wrote about uploading files to a web server via PHP, and this post follows along similar lines – only accomplishing the same task using PERL. There are numerous articles on the web and in books, but what follows here is what worked for me – it is meant to serve as a Note2Self, but if you find it useful, please feel free to use it.

A Form To Select The File

Let us suppose that the HTML file that displays this form is called NewFile.html and the PERL file that does the actual uploading work (and some more) is called Uploader.cgi.

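<!-- The data encoding type, enctype, MUST be specified as below -->
<form action="Uploader.cgi" method="post" enctype="multipart/form-data">
Select A File <input type="file" name="uploaded_file" />
<input type="submit" value="Upload This File" />
</form>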

Uploader.cgi

Suppose that the end-user selected a file to upload and hit the Upload This File button. Uploader.cgi now takes over and does a few things. The source code, adapted from a SitePoint article, follows:

#! /usr/bin/perl -wT
 
# -w switch to make Perl warn us of any potential dangers in our code.
# -T switch turns on taint checking. This ensures that any untrusted 
# input to the script, such as the uploaded file's filename, is marked as tainted.
 
use strict;
use CGI;
use CGI::Carp qw ( fatalsToBrowser );
use File::Basename;
 
# To prevent the server from being overloaded by huge file uploads,
# the allowable size of an uploaded file can be limited to 2 MB.
#
# 1024 bytes in 1 kB; 1024 kB in 1 MB (1048576 bytes in 1 MB)
$CGI::POST_MAX = 1024 * 1024 * 2;
 
# Some characters, such as slashes (/), are dangerous in filenames,
# as they might allow file uploads to any directory. Letters,
# digits, underscores, hyphens and periods may be considered safe.
my $safe_characters = "a-zA-Z0-9_.-";
 
# In Red Hat (like) distributions, the DocumentRoot variable in Apache
# is set by default to '/var/www/html'. Suppose that all uploaded files
# will be put into a folder called 'uploads' under DocumentRoot.
# The 'uploads' folder must be writable by the web server
# (for example, with 777 permission).
my $upload_location = "/var/www/html/uploads";
 
# The next step is to create a CGI object (assigned to $query below).
# This allows one to access methods in the CGI.pm library. One can then
# read in the filename of the uploaded file.
my $query = CGI->new;
 
# If there was a problem uploading the file -- for example, the 
# file was bigger than the $CGI::POST_MAX setting -- $filename 
# will be empty. The following bit tests for such problems.
#
# The value within param() is the same as the name used in the form
my $filename = $query->param("uploaded_file");

if ( !$filename ) {
 print $query->header ( );
 print "There was a problem uploading the file. Maybe it is too big?";
 exit;
}
 
# The first thing to do with filename is use the 'fileparse' routine in
# File::Basename module to split the filename into its leading path 
# (if any), the filename itself, and the file extension. One can then 
# safely ignore the leading path. Not only does this help prevent 
# attempts to save the file anywhere on the web server, but 
# some browsers send the whole path to the file on the user's hard drive.
my ( $name, $path, $extension ) = fileparse ( $filename, '\..*' );
$filename = $name . $extension;
 
# Convert any spaces in the filename to underscores
$filename =~ tr/ /_/;
 
# Remove any characters that are not in safe character list 
$filename =~ s/[^$safe_characters]//g;
 
# Untaint the $filename variable. 
# This variable is tainted because it contains potentially unsafe 
# data passed by the browser. The only way to untaint a tainted 
# variable is to use regular expression matching to extract the 
# safe characters:
if ( $filename =~ /^([$safe_characters]+)$/ ) {
 $filename = $1;
} else {
 die "Filename contains invalid characters";
}
 
# The upload method can be used to grab the file handle of the
# uploaded file (which points to a temporary file created by CGI.pm).
my $upload_filehandle = $query->upload("uploaded_file");
 
# The contents of the handle can now be read and saved to a new file
# under the 'uploads' folder. One can retain the uploaded file's name
# as the name of the new file. The three-argument form of open keeps
# the write mode separate from the file path.
open ( my $out_fh, '>', "$upload_location/$filename" ) or die "$!";
binmode $out_fh;

while ( <$upload_filehandle> ) {
 print $out_fh $_;
}

close $out_fh;
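
The filename handling in the script can be tried on its own, outside a CGI request. The following is a minimal sketch and not part of the original script: `sanitize_filename` is a hypothetical helper name, and the sample filenames are made up for illustration. It applies the same steps (drop the leading path, turn spaces into underscores, keep only the safe characters) and returns undef when nothing usable remains.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

# Same safe character list as in Uploader.cgi
my $safe_characters = "a-zA-Z0-9_.-";

# Hypothetical helper: returns a cleaned filename, or undef if
# no safe characters are left after cleaning.
sub sanitize_filename {
    my ($raw) = @_;

    # Drop the leading path; keep the name and everything from the first dot
    my ( $name, $path, $extension ) = fileparse( $raw, '\..*' );
    my $filename = $name . $extension;

    # Spaces become underscores; anything outside the safe list is removed
    $filename =~ tr/ /_/;
    $filename =~ s/[^$safe_characters]//g;

    # The capture group yields an untainted copy when run under -T
    return $filename =~ /^([$safe_characters]+)$/ ? $1 : undef;
}

# Made-up sample inputs
for my $raw ( '/home/user/My Paper (final).pdf', '../../etc/passwd', '???' ) {
    my $clean = sanitize_filename($raw);
    printf "%-35s => %s\n", $raw, defined $clean ? $clean : '(rejected)';
}
```

With these inputs, the first name becomes My_Paper_final.pdf, the second is reduced to passwd (its leading path is discarded), and the third is rejected because no safe characters remain.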

Specific MIME Type

The purpose of this code is to facilitate uploading journal publications to our research group website. Since most journals provide a PDF version of published material, and PDF is as platform-independent a document format as one can get, it wouldn't make much sense to upload anything but PDF files.
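
One way to enforce a PDF-only rule is to look at the first bytes of the uploaded data, since PDF files begin with the marker `%PDF-`. What follows is a minimal sketch under that assumption and is not taken from the original script: `looks_like_pdf` is a hypothetical helper, it assumes the handle is seekable, and the in-memory sample data merely stands in for the handle returned by `$query->upload`. The browser-supplied content type (available in CGI.pm via `uploadInfo`) could be checked as well, but it is sent by the client and so serves only as a hint.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if the data on $fh starts with the PDF marker.
# The handle is rewound afterwards so the caller can still save the file.
sub looks_like_pdf {
    my ($fh) = @_;
    binmode $fh;
    my $got = read( $fh, my $magic, 5 );
    seek( $fh, 0, 0 );
    return defined $got && $got == 5 && $magic eq '%PDF-';
}

# Made-up in-memory samples standing in for uploaded files
my %samples = (
    'paper.pdf' => "%PDF-1.4\nsample body",
    'notes.txt' => "Just some text\n",
);

for my $name ( sort keys %samples ) {
    open( my $fh, '<', \$samples{$name} ) or die "Cannot open sample: $!";
    print "$name: ", ( looks_like_pdf($fh) ? "accept" : "reject" ), "\n";
    close $fh;
}
```

In Uploader.cgi, such a check would most naturally sit right after the call to `upload` and before the file is written to the uploads folder.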
