
ROBOTS

The robots.txt file is a document that tells search engines which pages they are and aren't allowed to show in their search results. It can also ban specific search engines from crawling the website altogether.

User-agent: *
Allow: /
Disallow: /staff-portal
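
A minimal sketch of how the Disallow entries could be pulled out of a robots.txt body for review. The function name disallowed_paths is hypothetical, and the parsing is deliberately simple: it ignores which User-agent group a rule belongs to.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return every non-empty Disallow path found in a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, _, value = line.partition(":")
        # An empty "Disallow:" means "nothing is disallowed", so skip it.
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    example = "User-agent: *\nAllow: /\nDisallow: /staff-portal\n"
    print(disallowed_paths(example))
```

Run against the example above, this should list /staff-portal as the one path the owner does not want crawled.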

Favicon

The favicon is a small icon displayed in the browser's address bar or tab, used for branding a website. When a framework is used to build a website, the favicon that ships with the installation is sometimes left over. If the website developer doesn't replace it with a custom one, it can give us a clue about which framework is in use.


<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Welcome to my webpage!</title>
<link rel="shortcut icon" type="image/x-icon" href="images/favicon.ico"/>
</head>
  • We can download the favicon, compute its MD5 hash, and look the hash up in the OWASP favicon database for a match:
    curl https://website.com/images/favicon.ico | md5sum
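
The same hashing step can be sketched in Python, assuming only the standard library. The function names favicon_md5 and fetch_favicon_md5 are hypothetical, and the URL is the placeholder used in these notes.

```python
import hashlib
import urllib.request


def favicon_md5(data: bytes) -> str:
    """Return the hex MD5 digest of the raw favicon bytes."""
    return hashlib.md5(data).hexdigest()


def fetch_favicon_md5(url: str, timeout: float = 10.0) -> str:
    """Download the favicon at `url` and return its MD5 digest."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return favicon_md5(response.read())


if __name__ == "__main__":
    # Placeholder URL from the notes; replace it with a real target.
    print(fetch_favicon_md5("https://website.com/images/favicon.ico"))
```

The printed digest is the value to search for in the favicon database.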

SITEMAP

Unlike robots.txt, which restricts what crawlers may look at, the sitemap (typically sitemap.xml) lists every page the website owner wishes to have listed on a search engine.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://website.com/administrators</loc>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>https://website.com/hidden_area</loc>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
</urlset>
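
A minimal sketch of extracting the listed URLs from a sitemap with the standard library XML parser. The function name sitemap_locations is hypothetical. Any namespace prefix on the tags is stripped, so the sketch works whether or not the urlset element declares the sitemap namespace.

```python
import xml.etree.ElementTree as ET


def sitemap_locations(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    locations = []
    for element in root.iter():
        # Strip any "{namespace}" prefix so namespaced and plain tags both match.
        tag = element.tag.rsplit("}", 1)[-1]
        if tag == "loc" and element.text:
            locations.append(element.text.strip())
    return locations


if __name__ == "__main__":
    example = (
        "<urlset>"
        "<url><loc>https://website.com/administrators</loc></url>"
        "<url><loc>https://website.com/hidden_area</loc></url>"
        "</urlset>"
    )
    for location in sitemap_locations(example):
        print(location)
```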

HTTP Headers

The headers a web server returns with its response can sometimes contain useful information, such as the web server software and possibly the programming or scripting language in use.

HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
X-Powered-By: PHP/7.4.3
Date: Mon, 19 Jul 2021 14:39:09 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
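
As a rough sketch, the interesting headers can be picked out of a raw response like the one above. The function names parse_headers and technology_hints are hypothetical, and the parser assumes one header per line with no folded continuation lines.

```python
def parse_headers(raw_response: str) -> dict[str, str]:
    """Parse the header block of a raw HTTP response into a dict.

    Header names are lower-cased because they are case-insensitive.
    """
    headers = {}
    lines = raw_response.strip().splitlines()
    for line in lines[1:]:  # the first line is the status line
        if not line.strip():
            break  # a blank line ends the header block
        name, separator, value = line.partition(":")
        if separator:
            headers[name.strip().lower()] = value.strip()
    return headers


def technology_hints(headers: dict[str, str]) -> dict[str, str]:
    """Keep only the headers that commonly reveal the software stack."""
    interesting = ("server", "x-powered-by")
    return {name: headers[name] for name in interesting if name in headers}


if __name__ == "__main__":
    raw = (
        "HTTP/1.1 200 OK\n"
        "Server: nginx/1.18.0 (Ubuntu)\n"
        "X-Powered-By: PHP/7.4.3\n"
        "Content-Type: text/html; charset=UTF-8\n"
    )
    print(technology_hints(parse_headers(raw)))
```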

Framework Stack

Once we have established the framework of a website, either from the favicon or from clues in the page source such as comments, copyright notices or credits, we can locate the framework's website and learn more about the software from there.

<!--
Page Generated in 0.04109 Seconds using the Custom Framework v1.2 ( https://website.com/sites/Custom-Framework )
-->
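
A minimal sketch of collecting HTML comments from a page source so they can be read for framework names and versions. The function name html_comments is hypothetical, and a regular expression is used for brevity rather than a full HTML parser.

```python
import re

# Non-greedy match so each comment is captured separately; DOTALL lets a
# comment span several lines.
COMMENT_PATTERN = re.compile(r"<!--(.*?)-->", re.DOTALL)


def html_comments(page_source: str) -> list[str]:
    """Return each HTML comment body with its whitespace collapsed."""
    return [" ".join(body.split()) for body in COMMENT_PATTERN.findall(page_source)]


if __name__ == "__main__":
    source = (
        "<html><body>\n"
        "<!--\n"
        "Page Generated in 0.04109 Seconds using the Custom Framework v1.2 "
        "( https://website.com/sites/Custom-Framework )\n"
        "-->\n"
        "</body></html>"
    )
    for comment in html_comments(source):
        print(comment)
```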

URL

The URL may hint at the technology used by the website. In this case, the .action extension suggests Apache Struts.

www.website.com/showcase.action -> showcase.action
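
A small sketch of turning a file extension in a URL into a technology hint. The function name technology_from_url and the EXTENSION_HINTS table are hypothetical. Apart from the .action entry taken from these notes, the entries are illustrative additions, and a match is only a hint, never proof.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlsplit

# Illustrative mapping; extend it as needed.
EXTENSION_HINTS = {
    ".action": "Apache Struts",
    ".php": "PHP",
    ".aspx": "ASP.NET",
    ".jsp": "JavaServer Pages",
}


def technology_from_url(url: str) -> Optional[str]:
    """Return a technology hint based on the URL's file extension, if any."""
    if "://" not in url:
        # Without a scheme, urlsplit would treat the host as part of the path.
        url = "//" + url
    path = urlsplit(url).path
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_HINTS.get(suffix)


if __name__ == "__main__":
    print(technology_from_url("www.website.com/showcase.action"))
```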