[uss_qualifier/reports] Add test artifacts obfuscator#1477
[uss_qualifier/reports] Add test artifacts obfuscator#1477BenjaminPelletier wants to merge 4 commits into
Conversation
| Detailed information on command-line options and toggles can be retrieved using the `--help` flag: | ||
|
|
||
| ```bash | ||
| PYTHONPATH=. uv run --index https://pypi.org/simple python monitoring/uss_qualifier/reports/obfuscate.py --help |
There was a problem hiding this comment.
nit: This could be simplified to PYTHONPATH=. uv run monitoring/uss_qualifier/reports/obfuscate.py --help
(python is not needed, --index is environment-specific)
| -v "/path/to/local/input_artifacts:/input" \ | ||
| -v "/path/to/local/output_dir:/output" \ | ||
| interuss/monitoring \ | ||
| python monitoring/uss_qualifier/reports/obfuscate.py /input /output/obfuscated_artifacts.zip |
There was a problem hiding this comment.
The path is wrong no, monitoring/ is extra? Also it should be run via uv no?
docker run --rm \
-v "/path/to/local/input_artifacts:/input" \
-v "/path/to/local/output_dir:/output" \
interuss/monitoring \
python monitoring/uss_qualifier/reports/obfuscate.py
python: can't open file '/app/monitoring/monitoring/uss_qualifier/reports/obfuscate.py': [Errno 2]
docker run --rm \
-v "/path/to/local/input_artifacts:/input" \
-v "/path/to/local/output_dir:/output" \
interuss/monitoring \
uv run uss_qualifier/reports/obfuscate.py
usage: obfuscate.py [-h] [--no-participants] [--no-hostnames] [--no-tokens]
input output
obfuscate.py: error: the following arguments are required: input, output
| python monitoring/uss_qualifier/reports/obfuscate.py /input /output/obfuscated_artifacts.zip | |
| uv run uss_qualifier/reports/obfuscate.py /input /output/obfuscated_artifacts.zip |
| elif k == "manager" and isinstance(v, str): | ||
| participant_ids.add(v) | ||
| elif isinstance(v, str): | ||
| for url in find_urls(v): |
There was a problem hiding this comment.
reuse scan_text bellow? it do the same loop
|
|
||
|
|
||
| def test_find_urls(): | ||
| text = "Check http://dss1.uss1.localutm/dss/v1, or go to https://github.com/interuss. Also see (http://localhost:8082/status)." |
There was a problem hiding this comment.
Should we test thoses extra cases?
http://user@localhost
http://user:password@localhost
http://localhost?q=a2
| # 2. Obfuscate hostnames | ||
| if config.obfuscate_hostnames: | ||
| for h, mapped_h in hostname_map.items(): | ||
| s = re.sub(rf"\b{re.escape(h)}\b", mapped_h, s) |
There was a problem hiding this comment.
That an edge case, but if we have url in uppercase for some reason, urlparse normalize it to lowercase and it won't be replaced there.
urlparse("http://TEST").hostname
'test'
| s = re.sub(rf"\b{re.escape(h)}\b", mapped_h, s) | |
| s = re.sub(rf"\b{re.escape(h)}\b", mapped_h, s, flags=re.IGNORECASE) |
|
|
||
| # 1. Obfuscate tokens | ||
| if config.obfuscate_tokens: | ||
| s = re.sub(r"(?i)\bBearer\s+\S+", "Bearer REDACTED", s) |
There was a problem hiding this comment.
| s = re.sub(r"(?i)\bBearer\s+\S+", "Bearer REDACTED", s) | |
| s = re.sub(r"(?i)\bBearer\s+\S+", "Bearer REDACTED", s, flags=re.IGNORECASE) |
To handle cases like bEarER that may be valid
| # Generate maps | ||
| participant_map = {} | ||
| for idx, pid in enumerate( | ||
| sorted(sorted(participant_ids), key=len, reverse=True), start=1 |
| scan_json(v, participant_ids, hostnames) | ||
| elif isinstance(obj, list): | ||
| for item in obj: | ||
| scan_json(item, participant_ids, hostnames) |
There was a problem hiding this comment.
should scan_json do something on strings ?
There is an isinstance(v, str) in dict branch, but not in list, meaning str in dict and list are not handled the same way
Some users may be hesitant to include full test artifacts when reporting issues or requesting help because those artifacts may contain semi-sensitive information. This PR adds a tool that attempts to ease this problem somewhat by obfuscating sensitive information while still retaining nearly-full diagnostic value for the artifacts.
Written mostly with Gemini.