Skip to content

Feat/sum mult profiles#369

Open
graceg571 wants to merge 6 commits into
mggg:3.4.1from
graceg571:feat/sum-mult-profiles
Open

Feat/sum mult profiles#369
graceg571 wants to merge 6 commits into
mggg:3.4.1from
graceg571:feat/sum-mult-profiles

Conversation

@graceg571

Copy link
Copy Markdown

Closes #331

Adds a utility method 'sum_profiles' for summing a list of profiles (RankProfile or ScoreProfile).

Implementation

Iterates through the list and accumulates with +, delegating the actual combination to the existing __add__ methods on RankProfile and ScoreProfile.

Raises ValueError on empty list because the correct empty profile type can't be inferred. Raises TypeError if a mix of RankProfiles and ScoreProfiles are given as input. Checks before iterating through list to avoid performing unnecessary adds.

Tests

Added test cases for:

  • sum_profiles([...]) with an empty list
  • sum_profiles([...]) with a list of 1 profile
  • sum_profiles([...]) with a list of mixed ScoreProfiles and RankProfiles
  • sum_profiles([...]) for list of ScoreProfiles and list of RankProfiles

@graceg571 graceg571 requested a review from peterrrock2 June 1, 2026 13:55
@graceg571 graceg571 self-assigned this Jun 1, 2026
@graceg571 graceg571 changed the base branch from main to 3.4.1 June 1, 2026 20:33
@EdouardHeitzmann

Copy link
Copy Markdown
Collaborator

Just a thought, but using a python loop to repeatedly call RankProfile.__add__(...) can be computationally expensive below:

for profile in profiles[1:]:
        combined_profiles = combined_profiles + profile

It could speed things up to copy all the underlying profile.dfs into a list, then call pd.concat once on the whole list once -- it's a quadratic speedup in the number of profiles copied (I think!), since for a list of n profiles the first once gets copied n-1 times, the next one n-2, etc under the current setup.

@EdouardHeitzmann

Copy link
Copy Markdown
Collaborator

Upon further thought, mimicking the RankProfile.__add__ logic for the many-profiles version of addition as I suggest above would introduce some maintainability concerns (since the same logic would then exist in two separate places). Maybe a clean fix would be to move out the logic currently in __add__ into the sum_profiles helper you're currently working on, make it compatible with arbitrary-length lists of profiles, and then call sum_profiles inside of __add__:

def __add__(self, other) -> RankProfile:
        """
        Add two PreferenceProfiles by combining their ballot lists.
        """
        if not isinstance(other, RankProfile):
            raise TypeError("Unsupported operand type. Must be an instance of RankProfile.")

        # insert any other validation here

        return sum_profiles([self, other])

Sorry to be nosing in to this PR, I think the work you're doing is awesome!

@peterrrock2 peterrrock2 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notes:

  • Move to internal methods with guaranteed return types (e.g. _sum_rank_profiles)

Comment thread src/votekit/pref_profile/utils.py Outdated
Comment on lines +502 to +523
def sum_profiles(profiles: Sequence[RankProfile | ScoreProfile]) -> RankProfile | ScoreProfile:
"""
Combines multiple PreferenceProfiles by combining their ball lists.

Args:
profiles (Sequence[PreferenceProfile]): The profiles to sum.

Returns:
PreferenceProfile: The combined preference profile.

Raises:
ValueError: Cannot sum an empty list of profiles.
TypeError: All profiles must be of the same type.
"""

from votekit.pref_profile.pref_profile import RankProfile, ScoreProfile

if len(profiles) == 0:
raise ValueError("Cannot sum an empty list of profiles.")

if len(profiles) == 1:
return profiles[0]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def sum_profiles(profiles: Sequence[RankProfile | ScoreProfile]) -> RankProfile | ScoreProfile:
"""
Combines multiple PreferenceProfiles by combining their ball lists.
Args:
profiles (Sequence[PreferenceProfile]): The profiles to sum.
Returns:
PreferenceProfile: The combined preference profile.
Raises:
ValueError: Cannot sum an empty list of profiles.
TypeError: All profiles must be of the same type.
"""
from votekit.pref_profile.pref_profile import RankProfile, ScoreProfile
if len(profiles) == 0:
raise ValueError("Cannot sum an empty list of profiles.")
if len(profiles) == 1:
return profiles[0]
def sum_profiles(profiles: Sequence[RankProfile | ScoreProfile]) -> RankProfile | ScoreProfile:
"""
Combines multiple PreferenceProfiles by combining their ball lists.
Args:
profiles (Sequence[PreferenceProfile]): The profiles to sum.
Returns:
PreferenceProfile: A new PreferenceProfile object containing the combined preference profile.
Raises:
ValueError: Cannot sum an empty list of profiles.
TypeError: All profiles must be of the same type.
"""
from votekit.pref_profile.pref_profile import RankProfile, ScoreProfile
if len(profiles) == 0:
raise ValueError("Cannot sum an empty list of profiles.")
if len(profiles) == 1:
return profiles[0] # Change to copy. Might be worth adding a __copy__ method

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a copy method to RankProfile and ScoreProfile. Updated the return comment

@peterrrock2

Copy link
Copy Markdown
Collaborator

Upon further thought, mimicking the RankProfile.__add__ logic for the many-profiles version of addition as I suggest above would introduce some maintainability concerns (since the same logic would then exist in two separate places). Maybe a clean fix would be to move out the logic currently in __add__ into the sum_profiles helper you're currently working on, make it compatible with arbitrary-length lists of profiles, and then call sum_profiles inside of __add__:

def __add__(self, other) -> RankProfile:
        """
        Add two PreferenceProfiles by combining their ballot lists.
        """
        if not isinstance(other, RankProfile):
            raise TypeError("Unsupported operand type. Must be an instance of RankProfile.")

        # insert any other validation here

        return sum_profiles([self, other])

Sorry to be nosing in to this PR, I think the work you're doing is awesome!

No worries @EdouardHeitzmann. Thank you for including your input!

@peterrrock2 peterrrock2 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of quick things on this!

Comment on lines +519 to +526
total_cand = set().union(*[set(profile.candidates) for profile in score_profiles])
total_dfs = []
for p in score_profiles:
curr_df = p.df.copy()
curr_cand = set(p.candidates)
for cand in total_cand - curr_cand:
curr_df[cand] = [np.nan] * len(curr_df)
total_dfs.append(curr_df)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could speed this up a little bit by not copying the dataframes that already contain a score for every candidate in total_cand. Avoids a secondary df copy.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not copy the data frame if they already contain a score for each candidate or the same ranking length for summing ScoreProfiles and RankProfiles respectively.

max_ranking_length=3,
)
summed_profile = sum_profiles([profile])
assert summed_profile == profile

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert summed_profile == profile
assert summed_profile == profile
assert id(summed_profile) != id(profile) # make sure that we return a copy

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

candidates=["A", "B", "C", "D"],
)
summed_profile = sum_profiles([profile])
assert summed_profile == profile

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert summed_profile == profile
assert summed_profile == profile
assert id(summed_profile) != id(profile)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Utility method for summing profiles

3 participants