Merge pull request #6389 from specify/issue-6121#8149
Conversation
Automatic DB user creation for new Specify 7 instances
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR implements a three-tier database user architecture (master, migrator, app) with environment-driven credential defaults, a startup orchestration script that creates/validates users and privileges, expanded Django multi-alias DATABASES with a DB_ALIAS selector, and migration tooling/entrypoint changes to run targeted migrations. ChangesDatabase User Setup and Runtime Integration
🚥 Pre-merge checks | ❌ 2❌ Failed checks (2 warnings)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@CodeRabbit full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
specifyweb/settings/__init__.py (1)
90-100:⚠️ Potential issue | 🟠 Major | ⚡ Quick winStop building the SQLAlchemy URLs with master credentials.
SA_DATABASE_URLandSA_TEST_DB_URLstill always useMASTER_NAME/MASTER_PASSWORD, so any SQLAlchemy code path bypasses the newapp/migrationssplit and continues to depend on elevated credentials. That undercuts the least-privilege model even whenDATABASES["default"]is routed correctly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@specifyweb/settings/__init__.py` around lines 90 - 100, get_sa_db_url currently hard-codes MASTER_NAME/MASTER_PASSWORD so SA_DATABASE_URL and SA_TEST_DB_URL keep using elevated creds; change the construction so SA_DATABASE_URL and SA_TEST_DB_URL use the credentials from the Django DATABASES["default"] entry instead of MASTER_NAME/MASTER_PASSWORD. Update get_sa_db_url to accept explicit user/password (or build the URL inline) and use DATABASES["default"]["USER"], DATABASES["default"]["PASSWORD"], DATABASES["default"]["HOST"], DATABASES["default"]["PORT"] when setting SA_DATABASE_URL and SA_TEST_DB_URL (keeping the same db name and charset and preserving the f'test_{DATABASE_NAME}' behavior).
🧹 Nitpick comments (2)
Dockerfile (1)
18-18: 💤 Low valueRedundant
mariadb-clientinstallation.
mariadb-clientis installed at line 18 in thecommonstage, butrun-common(which inheritsFROM common) reinstalls it at line 141. The second installation is unnecessary and adds build time.♻️ Suggested fix at line 141
- apt-get -y install --no-install-recommends rsync mariadb-client && \ + apt-get -y install --no-install-recommends rsync && \Also applies to: 141-141
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Dockerfile` at line 18, The Dockerfile installs mariadb-client twice: once in the common stage and again in the run-common stage; remove the redundant installation from the run-common stage so mariadb-client is only installed in the common stage. Locate the two stages named "common" and "run-common" in the Dockerfile, and delete the mariadb-client package entry from the apt-get install list in the run-common stage (leaving the installation in the common stage intact); verify no other dependent lines in run-common assume a separate install step..github/workflows/test.yml (1)
67-72: ⚡ Quick winUse distinct CI users so this workflow actually exercises the new permission split.
Right now
master,migrator, andappall resolve to the same credentials, so the test job won't catch permission-boundary regressions in the new setup script or alias wiring.Also applies to: 125-130
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/test.yml around lines 67 - 72, The workflow currently uses identical credentials for MASTER, MIGRATOR and APP users so tests won't validate permission boundaries; update the environment variables (MASTER_NAME / MASTER_PASSWORD, MIGRATOR_NAME / MIGRATOR_PASSWORD, APP_USER_NAME / APP_USER_PASSWORD) to use distinct usernames and passwords (and mirror the same distinct values in the second block referenced around lines 125-130) so CI exercises the different permission sets and reveals regressions in the setup/alias wiring.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docker-entrypoint.sh`:
- Line 13: Fix the typo in the inline comment that reads "Setup db users and run
mirgations" by updating it to "Setup db users and run migrations" (locate the
comment adjacent to the ./sp7_db_setup_check.sh invocation in
docker-entrypoint.sh and replace "mirgations" with "migrations").
In `@sp7_db_setup_check.sh`:
- Around line 40-43: The required-vars check omits the master DB credentials
used later; update the validation in sp7_db_setup_check.sh to include
MASTER_USER_NAME and MASTER_USER_PASSWORD so the script fails early with a clear
error if those environment variables are missing; specifically modify the
conditional that checks DB_HOST, DB_PORT, MIGRATOR_PASSWORD, DB_NAME,
APP_USER_NAME, and APP_USER_PASSWORD to also require MASTER_USER_NAME and
MASTER_USER_PASSWORD and ensure the echo error message mentions the missing
master credentials.
- Around line 76-95: The shell inserts unescaped env vars (DB_NAME,
MIGRATOR_NAME, MIGRATOR_PASSWORD, MASTER_USER_PASSWORD, etc.) directly into
mysql SQL strings (see DB_EXISTS and USER_EXISTS checks and the CREATE DATABASE
/ CREATE USER mysql -e invocations), which allows quotes/backticks in secrets to
break or alter SQL; fix by adding small helper functions to safely escape
identifiers and string literals before interpolation (e.g., escape SQL
identifiers by doubling/backslash-escaping backticks and wrap in backticks, and
escape SQL string literals by replacing single-quote with '\'' and wrapping in
single quotes), then use those helpers when building the queries for DB_NAME,
MIGRATOR_NAME, MIGRATOR_USER_HOST, MIGRATOR_PASSWORD, and any other env var used
in mysql -e calls so all CREATE DATABASE/CREATE USER and SELECT queries use
escaped values.
- Around line 1-5: Enable fail-fast behavior so the script aborts if the initial
migration fails: add strict shell options (e.g., set -euo pipefail) near the
top, and/or explicitly check the exit status of the base_specify_migration
invocation and exit with non‑zero if it fails before running the next migration
step (the invocation of manage.py migrate or any "second migration" commands);
ensure any conditional gating uses the exit code of the base_specify_migration
command (or the variable/command name base_specify_migration) to prevent
continuing on partial initialization.
- Around line 129-152: The current loop (reading GRANTS_PARSED and setting
migrator_has_access) treats any grant other than USAGE as sufficient; change the
privilege check so migrator_has_access is only true when the parsed privs string
contains migration-capable privileges (e.g. ALL PRIVILEGES or at least one of
CREATE, ALTER, DROP, INDEX, CREATE VIEW, CREATE ROUTINE or other DDL privileges
your app requires) rather than any non-USAGE token. Update the condition that
sets migrator_has_access (the block that computes privs via sed and currently
uses grep -Eiq '(^|[, ])USAGE([, ]|$)') to instead match ALL or a regex like
CREATE|ALTER|DROP|INDEX|CREATE[[:space:]]VIEW|CREATE[[:space:]]ROUTINE
(case-insensitive) and only set migrator_has_access=true when that match
succeeds; keep the existing fallback branch that reassigns
MIGRATOR_NAME/MIGRATOR_PASSWORD/MIGRATION_DB_ALIAS unchanged.
- Around line 195-239: The verifier currently expects a single GRANT line to
include all REQUIRED_PRIVS; instead, build a union of privileges from all
relevant grant lines before deciding. Modify the loop over APP_GRANTS_LINES to,
for each line that matches a relevant scope (global ON *.* TO or DB-scoped ON
`?${DB_NAME}`?.* TO), extract the privilege list portion (text between "GRANT"
and "ON"), normalize commas/spaces, and add each privilege into a cumulative set
(associative array or deduped list); after processing all lines, check that
every REQUIRED_PRIVS member is present in that union and set
app_has_required_permissions accordingly (rather than calling
has_all_privs_in_line per-line). Keep APP_GRANTS_RAW, REQUIRED_PRIVS,
APP_GRANTS_LINES, DB_NAME, app_has_required_permissions and the final fallback
logic intact.
- Around line 91-95: The condition guarding the CREATE USER branch is checking
APP_USER_NAME but the branch creates MIGRATOR_NAME; change the guard to check
MIGRATOR_NAME instead (i.e., use MIGRATOR_NAME != "root") so we verify the
migrator account before attempting CREATE USER; keep the existing USER_EXISTS
check and CREATE USER call (MIGRATOR_NAME, MIGRATOR_USER_HOST,
MIGRATOR_PASSWORD) unchanged so the root fallback still works.
In `@specifyweb/settings/__init__.py`:
- Around line 85-88: Validate the DJANGO_DB_ALIAS before using it to index
DATABASES: check that DB_ALIAS (from os.getenv("DJANGO_DB_ALIAS", "default")) is
a key in the DATABASES dict and if not either set DB_ALIAS = "default" or raise
a clear ImportError/RuntimeError with a descriptive message; update the block
around DB_ALIAS and DATABASES so the code only does
deepcopy(DATABASES[DB_ALIAS]) when DB_ALIAS is present, otherwise fallback to
default or raise an explicit error indicating the invalid alias and listing
valid keys.
In `@specifyweb/settings/specify_settings.py`:
- Around line 62-72: Replace the hardcoded placeholders for MIGRATOR_NAME,
MIGRATOR_PASSWORD, APP_USER_NAME, and APP_USER_PASSWORD with environment lookups
that fall back to the intended defaults (use os.environ.get), e.g. derive
MASTER_NAME and MASTER_PASSWORD from env first, then set MIGRATOR_* =
os.environ.get('MIGRATOR_NAME', MASTER_NAME) and MIGRATOR_PASSWORD =
os.environ.get('MIGRATOR_PASSWORD', MASTER_PASSWORD), and similarly
APP_USER_NAME = os.environ.get('APP_USER_NAME', MIGRATOR_NAME) and
APP_USER_PASSWORD = os.environ.get('APP_USER_PASSWORD', MIGRATOR_PASSWORD); also
fix the existing fallback bug by using the correct env key ('MASTER_PASSWORD')
when computing MASTER_PASSWORD instead of the duplicated wrong key.
In `@specifyweb/specify/management/commands/base_specify_migration.py`:
- Around line 33-34: The bare except that sets alias = 'master' should be
replaced with a narrow exception handler: catch the specific exceptions that can
occur when computing alias (e.g., IndexError, KeyError, ValueError) and
optionally log the exception before falling back to 'master' (e.g., except
(IndexError, KeyError, ValueError) as e: logger.warning("... %s", e); alias =
'master'). Do the same replacement for the second occurrence at lines 46-47 so
both bare except blocks are removed in favor of specific exception handling and
logging.
- Around line 31-35: The fallback to 'master' is a no-op because
transaction.atomic(using=alias) doesn't raise until entered; change the check to
actively open the connection and propagate the proper alias: attempt to access
connections[alias] and call conn.ensure_connection() (or conn.cursor()) inside
the try/except; on exception set alias = 'master' and reassign conn =
connections[alias] before using transaction.atomic(using=alias); update usage of
transaction.atomic, conn, connections and alias so they always refer to the same
resolved alias.
---
Outside diff comments:
In `@specifyweb/settings/__init__.py`:
- Around line 90-100: get_sa_db_url currently hard-codes
MASTER_NAME/MASTER_PASSWORD so SA_DATABASE_URL and SA_TEST_DB_URL keep using
elevated creds; change the construction so SA_DATABASE_URL and SA_TEST_DB_URL
use the credentials from the Django DATABASES["default"] entry instead of
MASTER_NAME/MASTER_PASSWORD. Update get_sa_db_url to accept explicit
user/password (or build the URL inline) and use DATABASES["default"]["USER"],
DATABASES["default"]["PASSWORD"], DATABASES["default"]["HOST"],
DATABASES["default"]["PORT"] when setting SA_DATABASE_URL and SA_TEST_DB_URL
(keeping the same db name and charset and preserving the f'test_{DATABASE_NAME}'
behavior).
---
Nitpick comments:
In @.github/workflows/test.yml:
- Around line 67-72: The workflow currently uses identical credentials for
MASTER, MIGRATOR and APP users so tests won't validate permission boundaries;
update the environment variables (MASTER_NAME / MASTER_PASSWORD, MIGRATOR_NAME /
MIGRATOR_PASSWORD, APP_USER_NAME / APP_USER_PASSWORD) to use distinct usernames
and passwords (and mirror the same distinct values in the second block
referenced around lines 125-130) so CI exercises the different permission sets
and reveals regressions in the setup/alias wiring.
In `@Dockerfile`:
- Line 18: The Dockerfile installs mariadb-client twice: once in the common
stage and again in the run-common stage; remove the redundant installation from
the run-common stage so mariadb-client is only installed in the common stage.
Locate the two stages named "common" and "run-common" in the Dockerfile, and
delete the mariadb-client package entry from the apt-get install list in the
run-common stage (leaving the installation in the common stage intact); verify
no other dependent lines in run-common assume a separate install step.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 592e1ec7-0909-4d16-bfcb-2545929d08dc
📒 Files selected for processing (10)
.env.github/workflows/test.ymlDockerfileMakefiledocker-entrypoint.shsp7_db_setup_check.shspecifyweb/backend/permissions/initialize.pyspecifyweb/settings/__init__.pyspecifyweb/settings/specify_settings.pyspecifyweb/specify/management/commands/base_specify_migration.py
…L values before interpolation
|
@CodeRabbit full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 7
♻️ Duplicate comments (2)
sp7_db_setup_check.sh (2)
266-271:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUnion app privileges across all
SHOW GRANTSlines.MariaDB can split a valid permission set across multiple
GRANTstatements. The current per-line check can reject a correct app account on existing installs and spuriously enter the fallback path.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sp7_db_setup_check.sh` around lines 266 - 271, The current loop checks each grant line independently using grant_line_has_required_privs and may fail when privileges are split across multiple SHOW GRANTS lines; instead accumulate/union privileges from all APP_GRANTS_LINES and then test the combined set against APP_REQUIRED_PRIVS. Update the logic that iterates APP_GRANTS_LINES to extract privileges from each line (reusing or adding a helper used by grant_line_has_required_privs), merge them into a single collection, and set app_has_required_permissions=true only if the union contains all APP_REQUIRED_PRIVS; keep references to APP_GRANTS_LINES, grant_line_has_required_privs (or the new extractor), APP_REQUIRED_PRIVS, and app_has_required_permissions so reviewers can locate the change.
167-181:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCheck
MIGRATOR_NAMEin this guard.This branch creates
${MIGRATOR_NAME}, but the root special-case is checking${APP_USER_NAME}. If the migrator falls back toroot, the script still tries to createroot@'%'.🔧 One-line fix
-if [[ "$USER_EXISTS" -eq 0 && "$APP_USER_NAME" != "root" ]]; then +if [[ "$USER_EXISTS" -eq 0 && "$MIGRATOR_NAME" != "root" ]]; then🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sp7_db_setup_check.sh` around lines 167 - 181, The guard is checking "$APP_USER_NAME" but the block creates the migrator user (${MIGRATOR_NAME}/${SQL_MIGRATOR_NAME}), so if the migrator resolves to root the script will still attempt to create root; change the conditional to check the migrator variable instead (e.g. replace "$APP_USER_NAME" != "root" with "$MIGRATOR_NAME" != "root" or "$SQL_MIGRATOR_NAME" != "root" to match the variables used in USER_EXISTS and CREATE USER), ensuring the check uses the same migrator identifier as in USER_EXISTS/CREATE USER.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/test.yml:
- Around line 67-72: The CI is provisioning only the default MYSQL_* user, so
fix the env var values so the "master" and migrator aliases map to the actual
privileged account: set MASTER_NAME (and MIGRATOR_NAME if you want migrator
privileged) to "root" and set MASTER_PASSWORD (and MIGRATOR_PASSWORD) to the
same value as MYSQL_ROOT_PASSWORD, while keeping APP_USER_NAME/APP_USER_PASSWORD
pointing to the unprivileged MYSQL_USER/MYSQL_PASSWORD; update the workflow env
block that defines
MASTER_NAME/MASTER_PASSWORD/MIGRATOR_NAME/MIGRATOR_PASSWORD/APP_USER_NAME/APP_USER_PASSWORD
and regenerate the specifyweb/settings/local_specify_settings.py so the
master/migrator aliases refer to the real root credentials instead of the
non‑privileged MYSQL_USER.
In `@docker-entrypoint.sh`:
- Around line 11-17: The entrypoint currently wraps ./sp7_db_setup_check.sh with
set +e but never checks its exit status, allowing the container to continue on
failures; update docker-entrypoint.sh to fail fast by either removing set +e
around ./sp7_db_setup_check.sh or, if you need to preserve it, capture the
script's exit code (e.g., rc=$?) immediately after running
./sp7_db_setup_check.sh and if rc != 0 call echo with an error and exit $rc so
the container stops when sp7_db_setup_check.sh fails; reference the invocation
of ./sp7_db_setup_check.sh in docker-entrypoint.sh and ensure set -e is active
after this check.
In `@Makefile`:
- Around line 17-19: The django_migrations Makefile target diverges from startup
flow because it never uses the --use-override flag; update the target
(django_migrations) so both manage.py calls include a conditional override flag
(e.g., add a MAKE variable like USE_OVERRIDE that defaults empty and append
$(USE_OVERRIDE) to the base_specify_migration and migrate invocations), so
callers can run make django_migrations USE_OVERRIDE=--use-override to match
sp7_db_setup_check.sh behavior.
In `@sp7_db_setup_check.sh`:
- Around line 117-118: The app runtime privilege list (APP_REQUIRED_PRIVS) still
contains schema-changing rights; remove "ALTER" and "INDEX" (and any other
schema mutation privileges) from APP_REQUIRED_PRIVS and ensure they remain in
MIGRATION_REQUIRED_PRIVS so only the migrator/privileged path has schema-change
permissions; update both occurrences (the current declaration and the duplicate
around lines 243-246) to reflect this split and keep MIGRATION_REQUIRED_PRIVS as
the authoritative list for ALTER/INDEX.
- Around line 13-19: The assignments to MIGRATOR_NAME, MIGRATOR_PASSWORD,
APP_USER_NAME and APP_USER_PASSWORD fail under set -u when the env vars are
absent; change those lines to use safe parameter expansion (e.g.
${MIGRATOR_NAME:-} etc.) so they evaluate to an empty string instead of causing
an “unbound variable” error and allow the later derivation logic to run; update
the occurrences in sp7_db_setup_check.sh for MIGRATOR_NAME, MIGRATOR_PASSWORD,
APP_USER_NAME and APP_USER_PASSWORD accordingly.
In `@specifyweb/backend/permissions/initialize.py`:
- Around line 87-96: The early-return table-existence check that protects
assign_users_to_roles() should be extracted into a shared helper (e.g., add a
function like sp6_tables_exist() or ensure_sp6_tables_present()) and then called
from both assign_users_to_roles() and assign_users_to_roles_during_testing();
move the current cursor-based SELECT COUNT(*) check (the information_schema
query that looks for 'specifyuser_spprincipal' and 'spuserrole') into that
helper so users_collections_for_sp6(...) is never invoked when those tables are
absent, and update initialize() to call the new helper in the test-only
assignment path as well.
In `@specifyweb/specify/management/commands/base_specify_migration.py`:
- Around line 38-53: The current probe around SELECT 1 FROM django_migrations
inside transaction.atomic(using=alias) swallows all exceptions via a bare except
and can hide non-“table missing” errors; change the logic in the block that uses
conn.cursor(), cursor.execute(...) and the exists flag to either use Django's DB
introspection (connection.introspection.table_names() or
connection.introspection.table_exists) to check for the django_migrations table,
or catch only the specific DB-API error that indicates “table does not exist”
(e.g. the MySQL error/exception type your backend raises) and re-raise any other
exceptions (or call transaction.set_rollback(True) before continuing) so you
don't mask permission/connection errors and so atomic(using=alias) behaves
correctly.
---
Duplicate comments:
In `@sp7_db_setup_check.sh`:
- Around line 266-271: The current loop checks each grant line independently
using grant_line_has_required_privs and may fail when privileges are split
across multiple SHOW GRANTS lines; instead accumulate/union privileges from all
APP_GRANTS_LINES and then test the combined set against APP_REQUIRED_PRIVS.
Update the logic that iterates APP_GRANTS_LINES to extract privileges from each
line (reusing or adding a helper used by grant_line_has_required_privs), merge
them into a single collection, and set app_has_required_permissions=true only if
the union contains all APP_REQUIRED_PRIVS; keep references to APP_GRANTS_LINES,
grant_line_has_required_privs (or the new extractor), APP_REQUIRED_PRIVS, and
app_has_required_permissions so reviewers can locate the change.
- Around line 167-181: The guard is checking "$APP_USER_NAME" but the block
creates the migrator user (${MIGRATOR_NAME}/${SQL_MIGRATOR_NAME}), so if the
migrator resolves to root the script will still attempt to create root; change
the conditional to check the migrator variable instead (e.g. replace
"$APP_USER_NAME" != "root" with "$MIGRATOR_NAME" != "root" or
"$SQL_MIGRATOR_NAME" != "root" to match the variables used in USER_EXISTS and
CREATE USER), ensuring the check uses the same migrator identifier as in
USER_EXISTS/CREATE USER.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: a0548793-ef4f-4d7a-ac7c-e105ea21b274
📒 Files selected for processing (10)
.env.github/workflows/test.ymlDockerfileMakefiledocker-entrypoint.shsp7_db_setup_check.shspecifyweb/backend/permissions/initialize.pyspecifyweb/settings/__init__.pyspecifyweb/settings/specify_settings.pyspecifyweb/specify/management/commands/base_specify_migration.py
|
NOTES:
|
Automatic DB user creation for new Specify 7 instances
Fixes #6121
Checklist
self-explanatory (or properly documented)
specify7/specifyweb/specify/management/commands/run_key_migration_functions.py
Line 50 in ea04665
Testing instructions
Summary by CodeRabbit
New Features
Chores