CGI and mod_perl

CGI

CGI (Common Gateway Interface) is a standard that defines how web servers communicate with external programs. It lets a web server execute programs written in various languages, such as Perl, Python, and PHP, in response to web requests. CGI scripts are typically used to generate dynamic web pages, process form data, and access databases.

Working

Here's how CGI works:

  • A web browser sends a request to a web server for a specific URL.

  • The web server checks if the URL maps to a static file (e.g., HTML, CSS, image).

  • If it's not a static file, the web server searches for a CGI script that matches the URL.

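  • If a CGI script is found, the web server launches it as a new process and passes it the request data: URL parameters arrive in environment variables such as QUERY_STRING, and the body of a POST request arrives on standard input.

  • The CGI script executes and writes its output (HTTP headers followed by, e.g., HTML code) to standard output.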

  • The web server sends the output back to the web browser.
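The steps above can be sketched as a small Perl CGI script. This is a minimal illustration, not a production script: the name parameter, the greeting, and the helper subroutines are assumptions made for the example, and the parsing is done by hand so that the script needs no modules beyond core Perl.

```perl
#!/usr/bin/perl
# Minimal CGI sketch: request data comes in through environment variables,
# and the response (headers, blank line, body) goes out on STDOUT.
use strict;
use warnings;

# Decode one URL-encoded component ("+" means space, %XX is a byte).
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Escape characters that are special in HTML so input cannot inject markup.
sub html_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Parse "a=1&b=2" into a hash (the last value wins for repeated keys).
sub parse_query {
    my ($query) = @_;
    my %param;
    for my $pair (split /[&;]/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $param{ url_decode($key) } = url_decode($value // '');
    }
    return %param;
}

# Build the complete CGI response for a given query string.
sub build_response {
    my ($query) = @_;
    my %param = parse_query($query);
    my $name  = html_escape($param{name} // 'world');   # "name" is a hypothetical parameter
    return "Content-Type: text/html; charset=utf-8\r\n\r\n"
         . "<html><body><h1>Hello, $name!</h1></body></html>\n";
}

# The web server sets QUERY_STRING before launching the script.
print build_response($ENV{QUERY_STRING} // '');
```

To try it without a web server, set the variable by hand, for example QUERY_STRING='name=Ada' perl hello.cgi (the file name is arbitrary). A server configured for CGI does the same thing on every matching request, which is why each request costs a new process.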

CGI is simple and language-independent, but it has limitations, chiefly performance. Every time a CGI script is executed, a new process is created on the server; for a Perl script, the interpreter must also start up and compile the script each time. This consumes significant resources, especially when there are many simultaneous requests. These costs led to the development of alternative solutions, one of which is mod_perl.

mod_perl

mod_perl is an Apache module that embeds the Perl interpreter directly into the web server. (Apache modules are components that can be compiled in or dynamically loaded to extend the functionality of the HTTP server.) This allows Perl code to run much faster than it does as a CGI script, because no separate process has to be launched for each request.

Working

With mod_perl, the Perl interpreter is started once, when the server starts, rather than once per request. Perl code is compiled and kept in memory, and subsequent requests are handled by the persistent interpreter. This eliminates the overhead of starting a new Perl process for each request, resulting in faster execution and reduced resource consumption.
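One visible consequence of a persistent interpreter is that file-level variables survive from one request to the next. The sketch below simulates this in plain Perl, without Apache: calling handle_request() several times in one process stands in for several requests reaching the same interpreter. All names here are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# File-level state: initialised once, when the code is compiled.
# Under plain CGI it would be reset for every request, because each
# request runs in a brand-new process.
my $requests_served = 0;
my %page_cache;          # stands in for expensive one-time setup

# Hypothetical per-request entry point.
sub handle_request {
    my ($page) = @_;
    $requests_served++;
    # The "expensive" work happens only the first time a page is requested.
    $page_cache{$page} //= '<h1>' . ucfirst($page) . '</h1>';
    return sprintf('%s (request #%d served by this interpreter)',
                   $page_cache{$page}, $requests_served);
}

# Simulate three requests arriving at one long-lived interpreter.
for my $page (qw(home about home)) {
    print handle_request($page), "\n";
}
```

The counter keeps climbing and the second request for home reuses the cached value. The same persistence is also a common source of surprises when code written for CGI assumes that every variable starts out fresh.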

mod_perl also extends the Apache API, allowing developers to write Apache modules entirely in Perl. This gives developers access to all stages of the request processing cycle and allows them to manipulate Apache's internal tables and state mechanisms. This level of control and integration is not possible with traditional CGI.

To ease the transition from CGI, mod_perl includes features for running existing CGI scripts with little or no modification. For example, Apache::Registry and Apache::PerlRun (named ModPerl::Registry and ModPerl::PerlRun in mod_perl 2.0) are two mod_perl modules that execute CGI scripts much faster than traditional CGI, because they take advantage of the persistent Perl interpreter embedded in the server.

When Apache receives a request, it processes it in 12 phases. Breaking request processing into phases gives a programmer the opportunity to hook into the process at any of them. Apache supplies a standard default handler for every phase.

Modules take control of request processing at each of the phases through a set of well-defined hooks provided by Apache. The subroutine or function in charge of a particular request phase is called a handler. Apache also provides modules with a comprehensive set of functions they can call to achieve common tasks, including file I/O, sending HTTP headers, and parsing URIs. These functions are collectively known as the Apache API.

Like other Apache modules, mod_perl is written in C, registers handlers for request phases and uses the Apache API. However, mod_perl doesn't directly process requests. Rather, it allows you to write handlers in Perl. When the Apache core yields control to mod_perl through one of its registered handlers, mod_perl dispatches processing to one of the registered Perl handlers.
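As an illustration of hooking into a phase other than response generation, here is a minimal sketch of a Perl handler for the access-control phase, written against the mod_perl 2.0 API. The module name MyApache2::BlockAgent and the BadBot pattern are hypothetical, and the code runs only inside Apache with mod_perl loaded.

```perl
package MyApache2::BlockAgent;   # hypothetical module name

use strict;
use warnings;

use Apache2::RequestRec ();      # provides $r->headers_in
use APR::Table ();               # provides get() on the headers table
use Apache2::Const -compile => qw(OK FORBIDDEN);

# Runs during the access-control phase, before any response is generated.
sub handler {
    my $r = shift;               # request object passed in by mod_perl

    my $agent = $r->headers_in->get('User-Agent') // '';

    # Refuse requests from an (illustrative) unwanted client.
    return Apache2::Const::FORBIDDEN if $agent =~ /BadBot/i;

    # Otherwise let Apache continue with the remaining phases.
    return Apache2::Const::OK;
}

1;   # a Perl module must end with a true value
```

Such a module would be attached to its phase with a directive like PerlAccessHandler MyApache2::BlockAgent inside a <Location> section. The status constant it returns tells Apache whether to carry on with the request or stop it.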

The <Location> section in the Apache configuration (httpd.conf) assigns a number of rules that the server follows when the request's URI matches the location.

    <Location /foo>
        SetHandler modperl
        PerlResponseHandler FooServer
    </Location>

This configuration causes all requests for URIs starting with /foo to be handled by mod_perl, using the handler from the FooServer Perl module.

Directives:

SetHandler

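Setting SetHandler to perl-script or modperl tells Apache that mod_perl will handle response generation. In mod_perl 2.0, modperl is the leaner of the two, while perl-script additionally sets up a CGI-like environment (for example, populating %ENV) before the handler runs.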

PerlResponseHandler

This tells mod_perl to use the FooServer Perl module to handle response generation.

By default, mod_perl expects the module registered with a Perl*Handler directive to provide a subroutine named handler() that handles the request. Thus, if your module implements this subroutine, you can register the handler with mod_perl just by specifying the module name.
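The source names the FooServer module but does not show it, so the following is only a sketch of what a minimal version might look like under the mod_perl 2.0 API. The response text is invented for the example, and the file (FooServer.pm) is assumed to be somewhere Apache's Perl interpreter can find it.

```perl
package FooServer;

use strict;
use warnings;

use Apache2::RequestRec ();   # provides $r->content_type
use Apache2::RequestIO ();    # provides $r->print
use Apache2::Const -compile => qw(OK);

# mod_perl calls the subroutine named handler() by default.
sub handler {
    my $r = shift;            # request object passed in by mod_perl

    $r->content_type('text/plain');
    $r->print("Hello from FooServer\n");   # illustrative response body

    return Apache2::Const::OK;   # the response phase completed successfully
}

1;   # a Perl module must end with a true value
```

Because the subroutine is called handler(), the directive PerlResponseHandler FooServer is enough to register it. A subroutine with a different name would have to be named in full in the directive, for example PerlResponseHandler FooServer::serve for a hypothetical serve() subroutine.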