Domain & Infrastructure OSINT for Pentest Recon

A practical guide to domain and infrastructure OSINT pentest recon: CT logs, passive DNS, Shodan, Censys, urlscan, certificate pivoting, and scope boundaries.

Reconnaissance is the phase where a penetration test is won or lost before a single exploit fires. The richer your pre-engagement picture of a target’s domain footprint and internet-facing infrastructure, the more precisely you can direct active testing, and the less time you waste hammering assets that belong to a third party. This guide walks through the core data sources and chaining techniques used for domain and infrastructure OSINT during pentest recon, and draws a clear line between passive observation and active interaction.

Scope reminder: Every technique in this article is passive OSINT unless explicitly marked otherwise. Using these methods against infrastructure you are not authorized to test may violate the Computer Fraud and Abuse Act, the UK Computer Misuse Act, or equivalent legislation in your jurisdiction. Always operate within a signed Rules of Engagement document.

Why Passive Recon Matters Before You Touch Anything

Active scanning — sending packets, probing ports, fuzzing endpoints — is detectable, leaves log entries, and is legally meaningful once it crosses a scope boundary. Passive OSINT, by contrast, queries data that third parties have already collected and made available. It lets you:

Map the full attack surface before agreeing to final scope, so you can have an informed conversation with the client.
Discover shadow IT, forgotten subdomains, and infrastructure that the client’s own team may not know exists.
Build pivot chains (IP → certificate → domain → ASN → more IPs) in seconds that would take hours to reconstruct through active scanning.
Avoid testing assets outside scope, which protects both you and your client.

Certificate Transparency Logs: Your First Pivot Point

Every publicly trusted TLS certificate is logged to Certificate Transparency (CT) log servers as a condition of browser trust. This means that every hostname named in such a certificate is permanently and publicly recorded, regardless of whether the host is still live. Hosts covered only by a wildcard certificate appear only as the wildcard entry, so CT data is a floor on the subdomain count rather than a complete inventory.

Key sources:

crt.sh — Free, browser-accessible, SQL-queryable. Search %.example.com to pull all certificates containing a name under that domain, typically as a Subject Alternative Name (SAN). The % is the wildcard character of a SQL LIKE pattern and matches any sequence of characters.
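Google’s Transparency Report — Historically overlapped significantly with crt.sh and occasionally indexed certificates earlier. Google has since retired its certificate search, so confirm availability before building it into a workflow.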
Cert Spotter (SSLMate) — Offers a monitored feed useful for defenders, but the search interface is also usable for recon.

Certificate-based pivoting is where CT logs become truly powerful. A single certificate may cover ten SANs across multiple apex domains, revealing business units, acquisitions, or internal naming conventions that aren’t in public DNS. Check the subject organization and issuer fields as well: certificates from a private internal CA normally do not appear in CT logs, but internal hostnames do leak when a company inadvertently uses a publicly trusted CA for internal infrastructure.

Practical workflow:

Query %.example.com on crt.sh.
Export all unique SANs and apex domains.
Cross-reference apex domains against the client’s known legal entities — any unfamiliar domains warrant scope clarification before active testing.
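
The export step above can be sketched in Python. This is a minimal sketch, assuming crt.sh's JSON output (output=json) where each record carries a newline-separated name_value field. The function names are illustrative, and the apex heuristic is a deliberate simplification that mishandles multi-label public suffixes such as co.uk.

```python
from urllib.parse import urlencode


def crtsh_url(domain: str) -> str:
    """Build the crt.sh JSON query URL for %.domain (no request is sent here)."""
    return "https://crt.sh/?" + urlencode({"q": f"%.{domain}", "output": "json"})


def unique_sans(records: list[dict]) -> set[str]:
    """Collect unique, lower-cased names from crt.sh-style JSON records."""
    names = set()
    for record in records:
        # name_value holds one or more names separated by newlines.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name:
                names.add(name)
    return names


def naive_apex(name: str) -> str:
    """Assumption: the apex is the last two labels (wrong for co.uk and similar)."""
    labels = name.lstrip("*.").split(".")
    return ".".join(labels[-2:])


# Sample records shaped like crt.sh output; the hostnames are made up.
sample = [
    {"name_value": "www.example.com\napi.example.com"},
    {"name_value": "*.example.com"},
    {"name_value": "portal.example.net"},
]
sans = unique_sans(sample)
apexes = {naive_apex(name) for name in sans}
```

For real engagements a Public Suffix List parser is a safer way to derive apex domains than the two-label heuristic.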

Passive DNS: Historical Infrastructure Mapping

Passive DNS (pDNS) databases record DNS query/response pairs observed by participating resolvers over time. Unlike a live DNS lookup, pDNS shows you what a domain resolved to six months ago, and which other domains pointed at a given IP. That is an invaluable signal for finding infrastructure that has been decommissioned, moved, or rebranded but may still be accessible.

Key sources:

SecurityTrails — One of the most comprehensive pDNS archives. Shows A, AAAA, MX, NS, TXT, and CNAME history for any domain. The free tier is limited but sufficient for initial recon.
Farsight DNSDB — The industry benchmark for passive DNS data volume and historical depth. Requires a subscription; academic and security researcher access programs exist.
RiskIQ PassiveTotal (now Microsoft Defender Threat Intelligence) — Combines pDNS with WHOIS history and certificate data in a single pivot interface.
VirusTotal (covered in more detail below) — Includes pDNS records under the Relations tab for domains and IPs.

DNS history pivot workflow:

Look up the target apex domain. Note all historical A records.
For each historical IP, reverse-query pDNS: what other domains have ever resolved to this IP? This reveals shared hosting relationships, sibling services, and legacy infrastructure.
Check NS record history: a change in nameservers can indicate a domain transfer, a cloud migration, or — most usefully — the original hosting provider before a CDN was placed in front.
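MX record history often reveals the email provider or email security vendor in use (Google Workspace, Proofpoint, Mimecast) and, occasionally, self-hosted mail server IPs. Mail is not routed through the CDN, so those IPs may sit in the same network range as the origin servers the CDN is hiding.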
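
The forward and reverse lookups in this workflow can be sketched over exported pDNS data. This is a minimal sketch: the Resolution record is a hypothetical flattened format, not any provider's schema, and the sample rows use documentation-reserved addresses.

```python
from collections import defaultdict, namedtuple

# Hypothetical flattened pDNS row; real providers return richer schemas.
Resolution = namedtuple("Resolution", "domain ip first_seen last_seen")


def historical_ips(records, domain):
    """Every IP the domain has been observed resolving to."""
    return {r.ip for r in records if r.domain == domain}


def siblings_by_ip(records, domain):
    """For each historical IP, the other domains seen on that same IP."""
    siblings = defaultdict(set)
    for ip in historical_ips(records, domain):
        for r in records:
            if r.ip == ip and r.domain != domain:
                siblings[ip].add(r.domain)
    return dict(siblings)


records = [
    Resolution("example.com", "192.0.2.10", "2021-01-01", "2022-06-30"),
    Resolution("example.com", "198.51.100.7", "2022-07-01", "2024-01-15"),
    Resolution("legacy.example.net", "192.0.2.10", "2020-03-01", "2022-06-30"),
    Resolution("other.test", "203.0.113.5", "2023-01-01", "2023-02-01"),
]
pivots = siblings_by_ip(records, "example.com")
```

A sibling on a shared-hosting IP is not necessarily related to the client, so treat each hit as a lead for scope clarification, not as a confirmed asset.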

Infrastructure Search Engines: Shodan, Censys, and Netlas

These platforms continuously scan the public internet and index banners, certificates, and service metadata. Querying them is passive — you are reading their collected data, not sending packets to the target.

Shodan

Shodan indexes port banners, HTTP headers, TLS certificates, and service fingerprints. Useful queries for pentest recon:

ssl.cert.subject.cn:"example.com" — Finds hosts presenting a certificate for the target domain, even on non-standard ports.
org:"Example Corp" — Scopes by ASN organization name. Combine with port: filters to narrow.
http.title:"Example" — Matches page titles. Useful when you suspect a staging environment with a recognizable title but unknown subdomain.
hostname:.example.com — Reverse DNS matches.

Shodan Monitor is aimed at defenders, but it draws on the same dataset you are querying.
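
The filters above can be composed and their results reduced in a few lines of Python. This is a sketch, assuming the third-party shodan package and the usual shape of its search response (a matches list whose entries carry ip_str and port). The helper names are illustrative, and run_search is only a thin wrapper that is not exercised here.

```python
def shodan_queries(domain: str, org: str) -> list[str]:
    """Compose the certificate, organization, and hostname filters listed above."""
    return [
        f'ssl.cert.subject.cn:"{domain}"',
        f'org:"{org}"',
        f"hostname:.{domain}",
    ]


def run_search(api_key: str, query: str) -> dict:
    """Query Shodan's index; this reads Shodan's data and sends nothing to the target."""
    import shodan  # third-party package: pip install shodan

    return shodan.Shodan(api_key).search(query)


def exposed_services(result: dict) -> list[tuple[str, int]]:
    """Reduce a search response to sorted, de-duplicated (ip, port) pairs."""
    return sorted({(m["ip_str"], m["port"]) for m in result.get("matches", [])})


# Made-up response fragment in the assumed shape.
sample_result = {
    "matches": [
        {"ip_str": "192.0.2.10", "port": 443},
        {"ip_str": "192.0.2.10", "port": 8443},
        {"ip_str": "192.0.2.10", "port": 443},
    ]
}
services = exposed_services(sample_result)
```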

Censys

Censys offers a more structured, field-based query language (Lucene-style rather than SQL) and indexes IPv4, IPv6, and certificate data separately. Its certificate search is often more current than Shodan’s because it actively re-crawls CT log entries.

parsed.names: example.com (in the Certificates index) — Returns all certificates with example.com in any name field.
services.tls.certificates.leaf_data.subject.common_name: "example.com" (in the Hosts index) — Finds live hosts presenting that certificate.

Censys is particularly strong for IPv6 enumeration and for finding hosts where the certificate is the only reliable identifier.

Netlas

Netlas is a newer entrant worth adding to your workflow. It indexes HTTP/HTTPS response bodies, headers, and certificates and allows full-text search against response content — a capability the others offer only partially.

Search for internal product names, copyright strings, or custom HTTP headers specific to your target’s tech stack.
The domain: filter and certificate-based searches broadly mirror Shodan and Censys functionality, but the response body indexing can surface admin panels and API endpoints that don’t appear in certificate or banner data.

urlscan.io: Passive Browser-Based Recon

urlscan.io is a sandboxed URL scanner that records screenshots, DOM content, outbound requests, cookies, and redirects for submitted URLs. Critically for recon, it maintains a public archive of scans submitted by other users — meaning someone may have already scanned your target’s login portal, staging environment, or internal-facing application that leaked onto the internet.

Recon uses:

Search page.domain:example.com to find all public scans of pages on the target domain.
Review screenshots for application names, version strings, or internal branding.
The Requests tab of any scan lists every resource the page loaded — CDN URLs, API endpoints, analytics platforms, and sometimes internal IP addresses leaked via HTTP headers or JavaScript source maps.
Filter by date to find scans of pages that no longer exist in live DNS — evidence of decommissioned applications.

urlscan results are particularly valuable for identifying third-party SaaS products and integrations that expand the logical attack surface even when they are outside the direct IP scope.
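
Archive results can be pulled into a timeline for review. The following is a sketch, assuming the urlscan.io Search API response shape (a results list whose entries carry task.time, page.url, and page.ip). Treat those field names as assumptions to verify against the current API documentation.

```python
from urllib.parse import urlencode


def search_url(domain: str) -> str:
    """Build an archive search URL; searching does not trigger a new scan."""
    return "https://urlscan.io/api/v1/search/?" + urlencode({"q": f"page.domain:{domain}"})


def summarize_scans(response: dict) -> list[dict]:
    """Flatten archived scans to time, URL, and hosting IP, oldest first."""
    rows = []
    for hit in response.get("results", []):
        task, page = hit.get("task", {}), hit.get("page", {})
        rows.append({
            "time": task.get("time", ""),
            "url": page.get("url", ""),
            "ip": page.get("ip", ""),
        })
    # ISO 8601 timestamps sort chronologically as plain strings.
    return sorted(rows, key=lambda row: row["time"])


# Made-up response fragment in the assumed shape.
sample_response = {
    "results": [
        {"task": {"time": "2024-02-01T10:00:00.000Z"},
         "page": {"url": "https://app.example.com/login", "ip": "192.0.2.10"}},
        {"task": {"time": "2022-05-20T08:30:00.000Z"},
         "page": {"url": "https://staging.example.com/", "ip": "198.51.100.7"}},
    ]
}
timeline = summarize_scans(sample_response)
```

Old entries whose hostnames no longer resolve are the candidates for decommissioned applications mentioned above.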

VirusTotal: Infrastructure Relationships at Scale

VirusTotal’s graph and relationship features are underused by many pentesters. Beyond malware scanning, VT aggregates:

pDNS records for domains and IPs (sourced from multiple providers).
Subdomains observed in passive traffic.
Communicating files — malware samples that have beaconed to an IP. These can reveal whether the IP has a history as a C2 server, which is useful context when scoping cloud infrastructure.
Resolutions — the full bidirectional IP-to-domain resolution history.
Referrer URLs — other URLs that linked to the target, sometimes exposing unlisted admin portals.

The VirusTotal Graph tool lets you build a visual pivot map: start from a domain, expand to IPs, then expand those IPs to sibling domains, certificates, and files — all in a single session without touching the target network.

Attack Surface Enumeration: Chaining the Pivots

The real skill in infrastructure OSINT is chaining data sources so that each finding seeds the next query. A representative pivot chain looks like this:

crt.sh (subdomains) 
  → pDNS (historical IPs per subdomain) 
    → Shodan/Censys (services on those IPs) 
      → Certificate SANs on discovered services 
        → crt.sh (new subdomains from those certificates) 
          → urlscan (application fingerprinting) 
            → VirusTotal (sibling domains, malware history)
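
One way to mechanize a chain like this is a breadth-first worklist that records where each finding came from. This is a minimal sketch: the source functions here are stand-in dictionary lookups, not real API clients, and every name in it is illustrative.

```python
from collections import deque


def pivot(seeds, sources, max_depth=3):
    """Breadth-first expansion; returns {indicator: (parent, source_name)}."""
    provenance = {seed: (None, "seed") for seed in seeds}
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        indicator, depth = queue.popleft()
        if depth >= max_depth:
            continue  # stop expanding, but keep what was already found
        for name, lookup in sources.items():
            for found in lookup(indicator):
                if found not in provenance:  # first sighting wins
                    provenance[found] = (indicator, name)
                    queue.append((found, depth + 1))
    return provenance


# Stand-in data; in practice each lookup would wrap a passive data source.
CT = {"example.com": ["vpn.example.com"]}
PDNS = {"vpn.example.com": ["192.0.2.10"]}
HOSTS = {"192.0.2.10": ["legacy.example.net"]}
sources = {
    "ct": lambda x: CT.get(x, []),
    "pdns": lambda x: PDNS.get(x, []),
    "hosts": lambda x: HOSTS.get(x, []),
}
found = pivot(["example.com"], sources)
```

The depth limit matters in practice: shared hosting and CDN IPs can fan out to thousands of unrelated domains within a couple of hops.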

At each step, you should be tracking:

IP ranges and ASNs — Build a CIDR map. Which ranges are registered directly to the client, and which are hosted on AWS, GCP, or Azure? Cloud IPs require explicit authorization before active testing.
Technology stack signals — Server headers, TLS cipher suites, JARM fingerprints (available in Shodan), and HTTP response bodies all contribute to a technology profile.
Organizational metadata — WHOIS registrant data (where not privacy-shielded), abuse contacts, and registration dates. Domains registered recently may indicate a new product or a phishing campaign.
Certificate anomalies — Self-signed certificates, certificates with internal hostnames in SANs, or certificates issued by unexpected CAs may indicate test environments or misconfigured infrastructure.
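
The CIDR map from the first item can be sketched with Python's standard ipaddress module. The ranges below are documentation-reserved placeholders and the provider label is hypothetical. In practice the client ranges come from WHOIS/RIR data and the cloud ranges from each provider's published lists.

```python
import ipaddress


def classify_ips(ips, client_cidrs, cloud_cidrs):
    """Label each IP as client-registered, cloud:<provider>, or unknown."""
    client_nets = [ipaddress.ip_network(c) for c in client_cidrs]
    cloud_nets = [
        (provider, ipaddress.ip_network(c))
        for provider, cidrs in cloud_cidrs.items()
        for c in cidrs
    ]
    labels = {}
    for ip in ips:
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in client_nets):
            labels[ip] = "client-registered"
            continue
        provider = next((p for p, net in cloud_nets if addr in net), None)
        labels[ip] = f"cloud:{provider}" if provider else "unknown"
    return labels


labels = classify_ips(
    ["192.0.2.10", "198.51.100.7", "203.0.113.5"],
    client_cidrs=["192.0.2.0/24"],
    cloud_cidrs={"example-cloud": ["198.51.100.0/24"]},  # hypothetical provider
)
```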

Automation and Tooling

Several open-source tools automate parts of this chain:

Amass — Actively maintained, integrates CT logs, pDNS (via API keys), and brute-force subdomain enumeration. Note: the brute-force module is active and should be disabled for passive-only phases.
Subfinder — Passive subdomain enumeration from multiple APIs; clean output suitable for piping.
theHarvester — Aggregates domain, email, and IP data from search engines and data sources.
DNSx — Fast DNS resolution and record querying; useful for validating which discovered subdomains are currently live.

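Amass, Subfinder, and theHarvester can be run in passive-only modes. DNSx cannot, because live resolution is its purpose, so hold it back until resolution is permitted. Read the documentation and disable any module that sends DNS queries directly to the target’s authoritative nameservers, as that constitutes active reconnaissance.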

Where OSINT Ends and Active Testing Begins

This distinction is legally and professionally significant. As a rule of thumb:

Action	Classification
Querying crt.sh, Shodan, Censys, pDNS databases	Passive OSINT
Resolving a subdomain via a public recursive resolver	Gray zone — generates a query visible to the target’s authoritative NS
Sending a DNS query directly to the target’s authoritative nameserver	Active
Fetching a URL from the target’s web server	Active
Port scanning any IP	Active
Submitting a URL to urlscan.io for a new scan	Active (urlscan will fetch the URL)
Browsing the public urlscan.io archive	Passive OSINT

Authoritative DNS queries are the most commonly misunderstood boundary. When you run dig example.com @ns1.example.com, your query hits the target’s nameserver directly and may be logged with your source IP. Most security teams do not monitor this, but some mature ones do. If your Rules of Engagement start with a passive-only phase, restrict DNS queries to public recursive resolvers (8.8.8.8, 1.1.1.1) and pDNS archives until active testing is authorized. A recursive lookup can still reach the target’s authoritative server on a cache miss, but it arrives from the resolver’s address rather than yours.

CDN and WAF bypass: A common goal of infrastructure OSINT is finding the origin IP behind a CDN. Discovering this IP through CT logs and pDNS is passive. Sending requests directly to that IP is active and requires explicit authorization, even if the IP is publicly routable.

Cloud asset enumeration: Permutation-based S3 bucket or Azure blob enumeration (e.g., example-backup.s3.amazonaws.com) involves making HTTP requests to AWS infrastructure. This is active testing, not OSINT, and must be in scope.
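
The classification in this section can be encoded as a guard that tooling checks before it acts. This is a sketch only: the action names are hypothetical labels for the rows and paragraphs above, and the fail-closed default is a design choice, not a legal rule.

```python
from enum import Enum


class Kind(Enum):
    PASSIVE = "passive OSINT"
    GRAY = "gray zone"
    ACTIVE = "active"


# Hypothetical action labels mapped to the classifications given above.
CLASSIFICATION = {
    "query_ct_shodan_censys_pdns": Kind.PASSIVE,
    "browse_urlscan_archive": Kind.PASSIVE,
    "resolve_via_public_resolver": Kind.GRAY,
    "query_authoritative_ns": Kind.ACTIVE,
    "fetch_target_url": Kind.ACTIVE,
    "port_scan": Kind.ACTIVE,
    "submit_urlscan_scan": Kind.ACTIVE,
    "request_origin_ip_directly": Kind.ACTIVE,
    "enumerate_cloud_buckets": Kind.ACTIVE,
}


def is_permitted(action: str, active_authorized: bool, gray_allowed: bool = False) -> bool:
    """Fail closed: any action not listed is treated as active."""
    kind = CLASSIFICATION.get(action, Kind.ACTIVE)
    if kind is Kind.PASSIVE:
        return True
    if kind is Kind.GRAY:
        return gray_allowed or active_authorized
    return active_authorized
```

Whether gray-zone actions are acceptable during a passive phase is something to settle in the Rules of Engagement, which is why it is a separate flag here.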

Practical Scope Management During Recon

A well-structured recon phase produces a deliverable before active testing begins: a proposed scope expansion document. This lists:

IP ranges and ASNs discovered through OSINT that were not in the original scope.
Subdomains and apex domains linked to the client through certificate and DNS pivots.
Third-party SaaS platforms identified through urlscan and VirusTotal that fall outside direct testing scope but represent logical attack vectors.
Cloud regions and provider accounts associated with the client.

Presenting this to the client before active testing serves two purposes: it demonstrates thoroughness, and it gives the client the opportunity to explicitly include or exclude newly discovered assets. This protects you legally and tends to produce a more comprehensive final report.
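
The domain portion of that document is essentially a set difference between what recon found and what the Statement of Work already covers. A minimal sketch follows, with illustrative names, matching on label boundaries so that notexample.com is not mistaken for a subdomain of example.com.

```python
def in_scope(name: str, scoped_apexes: set[str]) -> bool:
    """True if name equals a scoped apex or is a subdomain of one."""
    name = name.lower().rstrip(".")
    return any(name == apex or name.endswith("." + apex) for apex in scoped_apexes)


def scope_expansion(discovered, scoped_apexes):
    """Sorted list of discovered names the current scope does not cover."""
    return sorted(name for name in set(discovered) if not in_scope(name, scoped_apexes))


candidates = scope_expansion(
    ["api.example.com", "legacy.example.net", "notexample.com", "example.com"],
    {"example.com"},
)
```

Each name in the output is a question for the client, not a target, until it is added to scope in writing.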

Building a Repeatable Workflow

For consistent results across engagements, codify your recon chain:

Seed domains — Collect all apex domains from the Statement of Work and supplement with CT log searches for the organization’s registered legal name.
Subdomain enumeration — Run Subfinder with all API keys configured (Shodan, Censys, SecurityTrails, VirusTotal, urlscan). Passive mode only at this stage.
IP resolution — Resolve all discovered subdomains via public resolvers to find which are live. Deduplicate and map to ASNs via whois or BGP tools like bgp.he.net.
Service fingerprinting (passive) — Query Shodan and Censys for each discovered IP range. Export port, banner, and certificate data.
Certificate pivoting — Extract all SANs from discovered certificates. Feed new apex domains back into the seed domains step.
Historical context — Query pDNS and VirusTotal for each discovered domain and IP. Note historical infrastructure and technology shifts.
Application fingerprinting — Search urlscan.io archive for discovered domains. Review screenshots and outbound request graphs.
Scope review — Compile all newly discovered assets and seek written authorization before proceeding to active testing.

Conclusion

Domain and infrastructure OSINT is not a preliminary checkbox in pentest recon; it is a discipline that directly determines the quality of every subsequent phase. CT logs surface subdomains that active brute-forcing misses. Passive DNS reveals infrastructure that has been deliberately obscured. Shodan, Censys, and Netlas fingerprint services that haven’t been announced in any bug bounty program. urlscan and VirusTotal provide application and relationship context that pure network scanning cannot. Chaining these sources together methodically, with clear notes on what was observed versus inferred, produces an attack surface map that is both comprehensive and defensible.

Keep the passive/active boundary explicit in your notes and your reporting. The data sources covered here are powerful precisely because they operate at arm’s length from the target. The moment you send a packet, you have crossed into a different legal and professional territory — one that requires written authorization and a signed scope document before you proceed.