hashscanner/hashscanner-python

HashScanner Python Client

Python client for the HashScanner API — query 1.5 billion+ NIST NSRL known file hashes (MD5 / SHA-1 / SHA-256) by API, single or in bulk.

HashScanner puts the NIST National Software Reference Library online so you can filter the known out of your data and focus on the unknown — without downloading and maintaining the ~700 GB RDS yourself.

🔑 You need a free API key. Create an account at https://www.hashscanner.com/register — every plan (including the free tier) includes API access. Your key is in your dashboard.

Install

pip install hashscanner

Quick start

from hashscanner import Client

hs = Client("hs_xxxx_sk_xxxx")          # or set HASHSCANNER_API_KEY

result = hs.lookup("d41d8cd98f00b204e9800998ecf8427e")
if result.found:
    print(result.type, result.file_name, result.product, "via", result.source)
else:
    print("not in NSRL — worth a closer look")

A match means the file is known (cataloged in NSRL) — not that it is safe, clean, or malicious. Use it to set aside files you already recognise.
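One way to apply this is to hash local files and keep only the ones NSRL does not know. The sketch below is illustrative only: `hash_file` and `unknown_files` are hypothetical helpers, not part of the client, and `lookup` stands for any callable that returns an object with a `found` attribute, such as `hs.lookup` from the quick start.

```python
import hashlib


def hash_file(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unknown_files(paths, lookup):
    """Yield the paths whose digest the lookup callable does not find.

    `lookup` is an assumption: any callable taking a hex digest and
    returning an object with a boolean `found` attribute.
    """
    for path in paths:
        if not lookup(hash_file(path)).found:
            yield path


# Hypothetical usage with the client from the quick start:
#   for path in unknown_files(paths, hs.lookup):
#       print("not in NSRL:", path)
```

Passing the lookup in as a callable keeps the helper independent of the client, so it can be exercised with a stub.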

Bulk lookups (async)

For large sets — up to 100,000 hashes per job — submit a bulk job. The client handles the submit → poll → download flow for you:

hashes = ["d41d8cd9...", "da39a3ee...", ...]

# JSON: returns a list of result dicts
for record in hs.bulk(hashes):
    print(record["hash"], record["found"])

# CSV: returns the raw CSV text
csv_text = hs.bulk(hashes, format="csv")
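If a hash list exceeds the 100,000-per-job limit, it can be split into several jobs. The following is a minimal sketch under that assumption; `batches` is a hypothetical helper, not part of the client, and it also drops blank entries and exact duplicates before splitting.

```python
from itertools import islice

MAX_HASHES_PER_JOB = 100_000  # per-job limit stated in this README


def batches(hashes, size=MAX_HASHES_PER_JOB):
    """Yield lists of at most `size` unique, non-blank hashes, order preserved."""
    stripped = (h.strip() for h in hashes)
    unique = dict.fromkeys(h for h in stripped if h)  # dict keeps first-seen order
    it = iter(unique)
    while batch := list(islice(it, size)):
        yield batch


# Hypothetical usage with the client:
#   for batch in batches(all_hashes):
#       for record in hs.bulk(batch):
#           print(record["hash"], record["found"])
```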

Prefer to drive the steps yourself?

job = hs.submit_bulk(hashes, format="json")   # -> BulkJob (queued)
job = hs.wait(job, poll_interval=3)           # poll until completed/failed
for record in hs.iter_results(job):           # stream NDJSON results
    ...
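For readers curious what a poll step generally involves, here is a generic poll-with-timeout loop. It is not the client's implementation of `wait`; `poll_until`, `fetch`, and `is_done` are hypothetical names, and the status strings in the usage comment are taken from the comments above.

```python
import time


def poll_until(fetch, is_done, interval=3.0, timeout=600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() every `interval` seconds until is_done(state) is true.

    Raises TimeoutError if the state is still not done after `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        state = fetch()
        if is_done(state):
            return state
        if clock() >= deadline:
            raise TimeoutError(f"still not done after {timeout} seconds")
        sleep(interval)


# Hypothetical usage, where get_status() returns "queued", "completed", ...:
#   final = poll_until(get_status, lambda s: s in {"completed", "failed"})
```

Injecting `sleep` and `clock` is a design choice that lets the loop be tested without real delays.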

A few small lookups concurrently

results = hs.lookup_many(["<hash1>", "<hash2>", "<hash3>"])

Command line

The package installs a hashscanner command:

export HASHSCANNER_API_KEY="hs_xxxx_sk_xxxx"

# single lookup
hashscanner lookup d41d8cd98f00b204e9800998ecf8427e
hashscanner lookup d41d8cd9... --json

# bulk: input is one hash per line (use '-' for stdin); output is JSON (NDJSON) or CSV
hashscanner bulk hashes.txt
hashscanner bulk hashes.txt --format csv -o results.csv
cat hashes.txt | hashscanner bulk -
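Because the bulk input is one hash per line, it can help to check a file before submitting it. The sketch below relies only on the standard hex lengths of the three digest types (32 for MD5, 40 for SHA-1, 64 for SHA-256); `classify_hash` and `split_valid` are hypothetical helpers, and the API's own validation rules may differ.

```python
import string

# Hex-digit lengths of the digest types this README mentions.
HEX_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}
_HEX_DIGITS = set(string.hexdigits)


def classify_hash(value):
    """Return 'md5', 'sha1' or 'sha256' for a well-formed hex digest, else None."""
    value = value.strip()
    if value and all(ch in _HEX_DIGITS for ch in value):
        return HEX_LENGTHS.get(len(value))
    return None


def split_valid(lines):
    """Split lines into (valid, rejected) lists, skipping blank lines."""
    valid, rejected = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        (valid if classify_hash(text) else rejected).append(text)
    return valid, rejected


# Hypothetical usage:
#   with open("hashes.txt") as fh:
#       valid, rejected = split_valid(fh)
```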

Errors

All errors derive from hashscanner.HashScannerError:

| Exception | When |
| --- | --- |
| AuthenticationError | 401: key missing/invalid |
| SubscriptionInactiveError | 403: renew/upgrade |
| RateLimitError | 429: per-minute rate limit or monthly quota (.retry_after, .reset) |
| BadRequestError | 400: bad hash / job too large |
| NotFoundError | 404: unknown/expired bulk job |
| JobFailedError | bulk job finished as failed |
| APIError | other non-2xx |

A single-lookup miss is not an error — it returns LookupResult(found=False).
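A rate-limit error is usually worth retrying after a pause. The sketch below shows one possible retry wrapper; `call_with_retry` is a hypothetical helper, and it assumes that `.retry_after`, when present, is a number of seconds, which this README does not state. When the attribute is missing it falls back to exponential backoff.

```python
import time


def call_with_retry(fn, retry_on, max_attempts=5, default_delay=1.0,
                    sleep=time.sleep):
    """Call fn(), retrying on `retry_on` exceptions up to max_attempts times.

    Assumption: exc.retry_after, if set, is a delay in seconds.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = getattr(exc, "retry_after", None)
            if delay is None:
                delay = default_delay * 2 ** (attempt - 1)
            sleep(delay)


# Hypothetical usage (the import path of RateLimitError is an assumption):
#   from hashscanner import RateLimitError
#   result = call_with_retry(lambda: hs.lookup(h), RateLimitError)
```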

License

MIT
