Skip to content

fix expression vocab memory management and incorrect reallocation of aggregates storage#3175

Closed
abalabin-bamfunds wants to merge 1 commit into
perspective-dev:masterfrom
abalabin-bamfunds:fix-expression-vocab-mem-2
Closed

fix expression vocab memory management and incorrect reallocation of aggregates storage#3175
abalabin-bamfunds wants to merge 1 commit into
perspective-dev:masterfrom
abalabin-bamfunds:fix-expression-vocab-mem-2

Conversation

@abalabin-bamfunds
Copy link
Copy Markdown

also fix and add ability to create tables with index to the sys_info example

This PR fixes a problem when having a string-valued expression column on views of tables that frequently update made perspective grow unlimited amount of memory and slow down dramatically with time. While I could not get a test together that would show this clearly, the results of manual testing are below.

Using the updated sys_info example and the JSON pasted below for the "Generated Data" view the observations before the fix looked like this:
image
after the fix:
image

JSON:

{
  "version": "4.4.1",
  "columns_config": {},
  "plugin": "Datagrid",
  "plugin_config": {
    "columns": {},
    "scroll_lock": false,
    "edit_mode": "READ_ONLY"
  },
  "settings": true,
  "table": "superstore_4",
  "theme": "Pro Light",
  "title": "Generated Data",
  "group_by": [
    "String 1"
  ],
  "split_by": [],
  "sort": [],
  "filter": [],
  "group_rollup_mode": "rollup",
  "expressions": {
    "S": "concat(\"String 1\", \"String 2\")"
  },
  "columns": [
    "S",
    "Float 2",
    "Float 3",
    "Float 4",
    "Integer 0",
    "Integer 1",
    "Datetime 4",
    "Boolean 0"
  ],
  "aggregates": {
    "Datetime 4": "dominant",
    "S": "any",
    "Boolean 0": "dominant"
  }
}

…aggregates storage

fix and improve the sys_info example

    Signed-off-by: Alexander Balabin <abalabin@bamfunds.com>
@texodus
Copy link
Copy Markdown
Member

texodus commented May 21, 2026

Thanks for the PR and research, @abalabin-bamfunds !

It's a real issue and the fix and prose are super helpful.

However this fix has a memory safety issue. t_expression_vocab is a wrapper around t_vocab exclusively used to prevent ExprTK from interning new strings while it holds references to old ones, hence the mechanism in this file which calls allocate_new_vocab() instead of t_vocab::extend, the latter of which invalidates existing string references.

Small side issue - the DCO Bot failed your PR for lacking a sig. I can see in the comments that you signed it, but the DCO Bot is not recognizing your sig. Perhaps because it is indented? We really appreciate you making sure to cover this step up front (it means you took the time to read the CONTRIBUTING.md! Thank you!) but the bot can be finicky sometimes, if you did not use -s when committing it should have formatted this correctly. If you did this already, perhaps the DCO Bot needs a PR!

Back to the issue at hand. I've gone ahead and re-implemented a fix and added a test for this issue in #3177 that is memory stable, so closing this PR.

@texodus texodus closed this May 21, 2026
@abalabin-bamfunds
Copy link
Copy Markdown
Author

Thanks @texodus

Agree re memory safety - I did not spot that letting vocab growing over pre-allocated size invalidates pointers and weirdly (and thankfully!) did not get any crashes while running with this fix for a few months. It is cool to have a proper fix.

Re DCO - I think this was a squash commit from pre-signed commits done with -s so I would guess indentation comes from squashing commit messages. Will keep an eye on it, or maybe the DCO bot could allow some whitespace in front on sign-offs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants