Python client for the HashScanner API — query 1.5 billion+ NIST NSRL known file hashes (MD5 / SHA-1 / SHA-256) by API, single or in bulk.
HashScanner puts the NIST National Software Reference Library online so you can filter the known out of your data and focus on the unknown — without downloading and maintaining the ~700 GB RDS yourself.
🔑 You need a free API key. Create an account at https://www.hashscanner.com/register — every plan (including the free tier) includes API access. Your key is in your dashboard.
pip install hashscanner

from hashscanner import Client
hs = Client("hs_xxxx_sk_xxxx")  # or set HASHSCANNER_API_KEY
result = hs.lookup("d41d8cd98f00b204e9800998ecf8427e")
if result.found:
    print(result.type, result.file_name, result.product, "via", result.source)
else:
    print("not in NSRL — worth a closer look")

A match means the file is known (cataloged in NSRL), not that it is safe, clean, or malicious. Use it to set aside files you already recognise.
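A typical use of single lookups is triaging files on disk: hash each file, look it up, and keep only the misses. The following is a minimal sketch of that idea, not part of the client. `md5_of_file` and `unknown_files` are hypothetical helpers, and `lookup` is assumed to be any callable that behaves like `hs.lookup` above, taking a hex hash and returning an object with a boolean `.found` attribute.

```python
import hashlib
from pathlib import Path
from typing import Callable, Iterable, List


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF)
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unknown_files(paths: Iterable[Path], lookup: Callable) -> List[Path]:
    """Keep only the files whose hash the lookup does not find.

    Assumption: `lookup` takes a hex hash string and returns an object
    with a boolean `.found` attribute, as `hs.lookup` does in this README.
    """
    return [p for p in paths if not lookup(md5_of_file(p)).found]
```

With a real client this might be called as `unknown_files((p for p in Path("evidence").rglob("*") if p.is_file()), hs.lookup)`, where `evidence` is a placeholder directory name. Each file costs one request this way, so for large sets the bulk workflow is likely the better fit.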
For large sets — up to 100,000 hashes per job — submit a bulk job. The client handles the submit → poll → download flow for you:
hashes = ["d41d8cd9...", "da39a3ee...", ...]
# JSON: returns a list of result dicts
for record in hs.bulk(hashes):
    print(record["hash"], record["found"])

# CSV: returns the raw CSV text
csv_text = hs.bulk(hashes, format="csv")

Prefer to drive the steps yourself?
job = hs.submit_bulk(hashes, format="json") # -> BulkJob (queued)
job = hs.wait(job, poll_interval=3) # poll until completed/failed
for record in hs.iter_results(job): # stream NDJSON results
    ...

results = hs.lookup_many(["<hash1>", "<hash2>", "<hash3>"])

The package installs a hashscanner command:
export HASHSCANNER_API_KEY="hs_xxxx_sk_xxxx"
# single lookup
hashscanner lookup d41d8cd98f00b204e9800998ecf8427e
hashscanner lookup d41d8cd9... --json
# bulk: one hash per line (use '-' for stdin), JSON (NDJSON) or CSV
hashscanner bulk hashes.txt
hashscanner bulk hashes.txt --format csv -o results.csv
cat hashes.txt | hashscanner bulk -

All errors derive from hashscanner.HashScannerError:
| Exception | When |
|---|---|
| AuthenticationError | 401: key missing/invalid |
| SubscriptionInactiveError | 403: renew/upgrade |
| RateLimitError | 429: per-minute rate limit or monthly quota (.retry_after, .reset) |
| BadRequestError | 400: bad hash / job too large |
| NotFoundError | 404: unknown/expired bulk job |
| JobFailedError | bulk job finished failed |
| APIError | other non-2xx |
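The `.retry_after` attribute on RateLimitError lends itself to a small retry wrapper. The sketch below is one possible shape, under stated assumptions: `call_with_retry` is a hypothetical helper, the exception class is passed in rather than imported, and `retry_after` is assumed to be a number of seconds (the table does not state its unit).

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Type[Exception],
    max_attempts: int = 5,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying whenever it raises `retry_on`.

    Waits for the exception's `retry_after` attribute when it is set
    (assumed to be seconds), otherwise for `default_wait` seconds.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            wait = getattr(exc, "retry_after", None)
            sleep(float(wait) if wait is not None else default_wait)
    # Only reached when the loop body never ran
    raise ValueError("max_attempts must be at least 1")
```

Assuming RateLimitError is importable from the package, usage might look like `call_with_retry(lambda: hs.lookup(h), retry_on=RateLimitError)`. Since a 429 can also mean the monthly quota is exhausted, a capped number of attempts is safer than retrying indefinitely.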
A single-lookup miss is not an error — it returns LookupResult(found=False).
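BadRequestError covers malformed hashes and oversized jobs, and both can be screened for before any request is sent. The sketch below shows one way to do that, assuming hypothetical helpers `split_valid` and `batches`. It checks only length and hex characters (32, 40, or 64 for MD5, SHA-1, SHA-256); the server's exact validation rules are not documented here and may differ. The 100,000 limit is the per-job figure quoted earlier in this README.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# Hex digest lengths: MD5 = 32, SHA-1 = 40, SHA-256 = 64 characters.
_VALID_LENGTHS = {32, 40, 64}
_HEX = re.compile(r"[0-9a-fA-F]+")

MAX_JOB_SIZE = 100_000  # per-job limit quoted in this README


def split_valid(hashes: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate well-formed hex digests from everything else.

    Returns (good, bad): `good` holds whitespace-stripped digests,
    `bad` holds the rejected inputs exactly as they were given.
    """
    good: List[str] = []
    bad: List[str] = []
    for raw in hashes:
        h = raw.strip()
        if len(h) in _VALID_LENGTHS and _HEX.fullmatch(h):
            good.append(h)
        else:
            bad.append(raw)
    return good, bad


def batches(hashes: List[str], size: int = MAX_JOB_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` hashes."""
    for start in range(0, len(hashes), size):
        yield hashes[start:start + size]
```

A caller could then loop with `for batch in batches(good): hs.bulk(batch)` and report the entries in `bad` separately instead of letting one malformed line fail the whole job.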
- API documentation: https://www.hashscanner.com/api
- Pricing & limits: https://www.hashscanner.com/pricing
- Sign up (free): https://www.hashscanner.com/register
License: MIT