ROBOTS
The robots.txt file is a document that tells search engines which pages they are and aren't allowed to show in their results. It can also ban specific search engines from crawling the website altogether.
User-agent: *
Allow: /
Disallow: /staff-portal
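As a rough sketch of how the Disallow entries could be pulled out during content discovery, the Python snippet below parses a robots.txt body that has already been downloaded. The function name `disallowed_paths` and the `sample` text are illustrative assumptions, and the parsing is deliberately simpler than a full robots.txt parser.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return the paths named in Disallow rules, in order of appearance."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        # An empty Disallow value means "nothing is disallowed", so skip it.
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Sample content mirroring the example above (hypothetical site).
sample = """User-agent: *
Allow: /
Disallow: /staff-portal
"""

print(disallowed_paths(sample))
```

Each path returned this way is a candidate location worth visiting manually, since the site owner did not want it listed.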
Favicon
The favicon is a small icon displayed in the browser's address bar or tab, used for branding a website. When a framework is used to build a website, a favicon that ships with the installation is sometimes left over. If the website developer doesn't replace it with a custom one, it can give us a clue about which framework is in use.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Welcome to my webpage!</title>
<link rel="shortcut icon" type="image/x-icon" href="images/favicon.ico"/>
</head>
- We can compute the MD5 hash of the favicon and look it up in the OWASP favicon database for a match.
┌──────┐ ┌────────────────────────────────────────┐ ┌────────┐
| curl | -► | https://website.com/images/favicon.ico | -► | MD5SUM |
└──────┘ └────────────────────────────────────────┘ └────────┘
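The curl and md5sum pipeline in the diagram could be approximated in Python as follows. This is a minimal sketch using only the standard library; the helper names `md5_hex` and `favicon_md5` are assumptions, and the URL is the placeholder from the diagram rather than a real target.

```python
import hashlib
import urllib.request


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the given bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


def favicon_md5(url: str, timeout: float = 10.0) -> str:
    """Download the favicon at `url` and return its MD5 hash."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return md5_hex(response.read())


# Example call (placeholder URL from the diagram; not executed here):
# print(favicon_md5("https://website.com/images/favicon.ico"))
```

The resulting hex string is what would be searched for in the favicon database.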
SITEMAP
The sitemap.xml file gives a list of every file the website owner wishes to be listed on a search engine.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://website.com/administrators</loc>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>https://website.com/hidden_area</loc>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
</urlset>
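To turn a sitemap like the one above into a list of URLs to visit, something along these lines may help. It is a sketch that assumes the XML has already been fetched into a string; `sitemap_locations` and `sample_sitemap` are hypothetical names, and the tag comparison ignores namespaces so it works whether or not the file declares one.

```python
import xml.etree.ElementTree as ET


def sitemap_locations(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for element in root.iter():
        # Tags may be namespace-qualified as "{namespace}loc";
        # compare only the local name.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


# Trimmed-down sample based on the example above (hypothetical site).
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://website.com/administrators</loc></url>
  <url><loc>https://website.com/hidden_area</loc></url>
</urlset>"""

for location in sitemap_locations(sample_sitemap):
    print(location)
```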
HTTP Headers
HTTP response headers can sometimes contain useful information, such as the web server software and possibly the programming or scripting language in use.
HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
X-Powered-By: PHP/7.4.3
Date: Mon, 19 Jul 2021 14:39:09 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
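The two headers of most interest in the response above are Server and X-Powered-By. The sketch below shows one way they might be extracted from a raw response head that has been saved as text. The function names `parse_headers` and `technology_hints` are assumptions made for illustration.

```python
def parse_headers(raw_response_head: str) -> dict[str, str]:
    """Parse a raw HTTP response head into a dict keyed by lowercase header name."""
    lines = raw_response_head.strip().splitlines()
    headers = {}
    for line in lines[1:]:  # the first line is the status line
        if ":" not in line:
            continue
        name, value = line.split(":", 1)
        headers[name.strip().lower()] = value.strip()
    return headers


def technology_hints(headers: dict[str, str]) -> dict[str, str]:
    """Keep only the headers that commonly disclose the software in use."""
    return {name: headers[name] for name in ("server", "x-powered-by") if name in headers}


# The response head from the example above.
sample_head = """HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
X-Powered-By: PHP/7.4.3
Date: Mon, 19 Jul 2021 14:39:09 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
"""

print(technology_hints(parse_headers(sample_head)))
```

Header names are lowercased because HTTP header names are case-insensitive.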
Framework Stack
Once we have established the framework of a website, either from the favicon or by looking for clues in the page source such as comments, copyright notices or credits, we can then locate the framework's website.
<!--
Page Generated in 0.04109 Seconds using the Custom Framework v1.2 ( https://website.com/sites/Custom-Framework )
-->
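Comments like the one above can be collected from a saved page source with a short script. This is only a sketch: the regular expression is a simple approximation rather than a full HTML parser, and `html_comments` and `sample_page` are hypothetical names.

```python
import re

# Non-greedy match so that each comment is captured separately;
# DOTALL lets a comment span several lines.
COMMENT_PATTERN = re.compile(r"<!--(.*?)-->", re.DOTALL)


def html_comments(page_source: str) -> list[str]:
    """Return the text of every HTML comment found in the page source."""
    return [match.strip() for match in COMMENT_PATTERN.findall(page_source)]


# Sample page containing the comment from the example above.
sample_page = """<html>
<body>
<p>Welcome</p>
<!--
Page Generated in 0.04109 Seconds using the Custom Framework v1.2 ( https://website.com/sites/Custom-Framework )
-->
</body>
</html>"""

for comment in html_comments(sample_page):
    print(comment)
```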
URL
The URL may hint at the technology used by the website. In this case, the .action extension suggests Apache Struts.
┌─────────────────────────────────┐ ┌─────────────────┐
| www.website.com/showcase.action | -► | showcase.action |
└─────────────────────────────────┘ └─────────────────┘
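The same idea can be sketched as a small lookup from file extension to a likely technology. Only the .action entry comes from the example above. The other entries in `EXTENSION_HINTS` are common conventions added here as assumptions, and the function name `technology_from_url` is hypothetical. An extension is a hint, not proof, since servers can map any extension to any backend.

```python
import posixpath
from typing import Optional
from urllib.parse import urlsplit

# Illustrative mapping; only ".action" is taken from the example above.
EXTENSION_HINTS = {
    ".action": "Apache Struts",
    ".php": "PHP",
    ".aspx": "ASP.NET",
    ".jsp": "JavaServer Pages",
}


def technology_from_url(url: str) -> Optional[str]:
    """Guess the technology from the file extension in the URL path."""
    # urlsplit only recognises the host when a scheme or "//" is present.
    if "//" not in url:
        url = "//" + url
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lower()
    return EXTENSION_HINTS.get(extension)


print(technology_from_url("www.website.com/showcase.action"))
```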