MAINT: Refactor printers into lightweight printer module#1732
MAINT: Refactor printers into lightweight printer module#1732rlundeen2 wants to merge 14 commits into
Conversation
Create new pyrit/printer/ module with abstract base classes that contain all formatting logic. Data-fetching operations (CentralMemory calls) are abstract methods implemented by framework subclasses. This enables thin clients to reuse all pretty-printing by subclassing the base printers and implementing data-fetching via REST endpoints. The thin client only needs pyrit.models + pyrit.identifiers + colorama. Changes: - New pyrit/printer/ module with attack_result, scenario_result, scorer subpackages - ConsoleAttackPrinterBase: all attack console formatting, abstract get_conversation/get_scores - ConsoleScenarioPrinterBase: all scenario console formatting - ConsoleScorerPrinterBase: all scorer formatting, abstract get_objective/harm_metrics - Existing framework printers refactored to thin subclasses (backward compatible) - Added to_dict()/from_dict() to AttackResult, ScenarioResult, ScenarioIdentifier, ConversationReference, Score, MessagePiece, Message for serialization round-tripping - Message.to_full_dict() added for rich serialization (to_dict() unchanged for compat) All 675 existing tests pass with no modifications. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move framework CentralMemory implementations into pyrit/printer/ alongside their base classes. CentralMemory is imported lazily inside constructors, so thin clients importing the module never pay the SQLAlchemy cost. - ConsoleAttackResultPrinter now lives in pyrit.printer.attack_result.console - ConsoleScenarioResultPrinter now lives in pyrit.printer.scenario_result.console - ConsoleScorerPrinter now lives in pyrit.printer.scorer.console - Old locations (executor/attack/printer/, scenario/printer/, score/printer/) become pure re-exports for backward compatibility - Updated test patch paths to match new module locations Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…6.0) Old locations now use PEP 562 __getattr__ lazy re-exports with DeprecationWarning. Only concrete classes are re-exported (not bases). - pyrit.executor.attack.printer → pyrit.printer.attack_result - pyrit.scenario.printer → pyrit.printer.scenario_result - pyrit.score.printer → pyrit.printer.scorer - Updated all internal callers to new canonical paths - Old ABC files (attack_result_printer.py, scenario_result_printer.py, scorer_printer.py) kept for now but deprecated via __init__.py Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…of bases Concrete classes that use CentralMemory/scorer registry are now named to clearly indicate their data source: - ConsoleAttackResultPrinter → ConsoleAttackMemoryPrinter - ConsoleScenarioResultPrinter → ConsoleScenarioMemoryPrinter - ConsoleScorerPrinter → ConsoleScorerMemoryPrinter Moved ScorerEvaluationIdentifier (pyrit internal) from base class into the concrete ConsoleScorerMemoryPrinter. Base classes now contain only formatting logic with no pyrit-internal imports beyond models/identifiers. Deprecated re-exports at old paths still work (mapping old names to new), scheduled for removal in 0.16.0. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Created MarkdownAttackPrinterBase + MarkdownAttackMemoryPrinter in pyrit/printer/attack_result/markdown.py (same pattern as console) - Deleted dead old ABC files: - pyrit/executor/attack/printer/attack_result_printer.py - pyrit/scenario/printer/scenario_result_printer.py - pyrit/score/printer/scorer_printer.py - Old markdown_printer.py now a deprecation re-export shim - Updated all internal imports and test patches Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Changed dict[str, object] to dict[str, Any] in MessagePiece.from_dict() and Message.from_dict() to satisfy pyright (dict.get returns object otherwise) - Added Any import to message_piece.py and message.py - Wrapped get_prompt_scores return in list() for Sequence -> list coercion - Added isinstance check in display_image_async for type safety Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Added return type annotation (-> type) to all __getattr__ deprecation shims - Added noqa: B027 to display_image_async intentional no-op default - Added Returns/Raises sections to short docstrings (DOC201, DOC501) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
objective_target_identifier and objective_scorer_identifier may be None when deserializing from dicts. The printer bases already handle None. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| ) | ||
|
|
||
|
|
||
| class ConsoleAttackMemoryPrinter(ConsoleAttackPrinterBase): |
There was a problem hiding this comment.
nit: why not name this ConsoleAttackResultPrinter for continuity?
There was a problem hiding this comment.
I want to distinguish between the printer that uses memory and the future printer that uses the rest api. So there will also be a ConsoleAttackRestPrinter
Similar with other classes
| from pyrit.models import AttackOutcome, Message, Score | ||
|
|
||
|
|
||
| class AttackResultPrinterBase(ABC): |
There was a problem hiding this comment.
should we have an empty implementation of this defined as AttackResultPrinter so we can avoid importing ABC as that? Or do we intend for the console & markdown printers to be the only concrete implementations?
| from pyrit.models.conversation_reference import ConversationReference | ||
|
|
||
| return { | ||
| "conversation_id": self.conversation_id, |
There was a problem hiding this comment.
First reaction to all the to_dict and from_dict implementations is "wow that's a lot of string literals" 😆
I asked copilot if there was an easier way to reduce these to avoid duplication and also make it more maintainable and here's what it came up with - lmk what you think!
Suggestion: Reduce string-literal duplication in to_dict / from_dict
Across AttackResult, ScenarioResult, Score, MessagePiece, RetryEvent, ConversationReference, etc., every field name appears as a string literal 2–3 times (once in to_dict, once in
from_dict, and once as the attribute). This is error-prone and verbose.Since the dict keys almost always match the attribute names exactly, we could add a small shared helper like:
def _serialize_value(value):
if isinstance(value, datetime): return value.isoformat()
if isinstance(value, Enum): return value.value
if isinstance(value, uuid.UUID): return str(value)
if hasattr(value, 'to_dict'): return value.to_dict()
if isinstance(value, (list, set)): return [_serialize_value(v) for v in value]
if isinstance(value, dict): return {k: _serialize_value(v) for k, v in value.items()}
return value
Then most to_dict() implementations collapse to:
def to_dict(self):
return {k: _serialize_value(v) for k, v in vars(self).items() if not k.startswith('_')}
Simple classes like RetryEvent and ConversationReference would need zero overrides. More complex ones could call the base and patch a few keys. This could also be a Serializable mixin
so to_dict() is inherited.
There was a problem hiding this comment.
I like how you're thinking about this Nina!
But I think in this case is the easier; from_dict is tricky. And I think if the serialization is split, it makes it tough. And things like ComponentIdentifier is fairly incompatible since it flattens various things. So I think the mixin would need a lot of special cases
As is, it's verbose, but I think it's easier to debug and maintain.
BUT I did add some tests based on this comment so we can verify
Class.from_dict(obj.to_dict()) == obj
The printer classes for attacks, scenarios, and scorers were tightly coupled to
CentralMemoryand the module they were in; they fetched conversations and scores at print time. This made them unusable from thin clients that don't have the full PyRIT framework but need the same pretty-printing with data fetched via REST APIs.This refactor moves abstract base classes into a new
pyrit/printer/module. Data-fetching operations are abstract, so in the framework we can grab them from memory. Or a REST API caller can grab them from the REST endpoints.All functionality is the same; but this allows us to use the same printing once we replace our cli frontend to use the REST API