Security Best Practices

Production hardening guidance for secrets management, JWT configuration, policy governance, and network security.

Secrets and credentials

Admin API key

  • Set ADMIN_API_KEY to a cryptographically random value (32+ bytes). Never reuse it across environments.
  • Rotate it by updating the environment variable and restarting the service. The old key stops working as soon as the service restarts, which invalidates all existing admin sessions.
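
One way to produce a suitable value is Python's standard secrets module, which draws from the operating system's CSPRNG. This is a minimal sketch; the function name generate_admin_api_key is illustrative and not part of Permix.

```python
import secrets


def generate_admin_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random key built from num_bytes of CSPRNG output."""
    if num_bytes < 32:
        # The guidance above asks for 32 or more bytes of randomness.
        raise ValueError("use at least 32 bytes of randomness")
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print once and store the value in your secrets manager, not in shell history.
    print(generate_admin_api_key())
```

Generate a separate value for each environment rather than copying one key between them.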

Service API keys

  • Create one key per service — never share keys between unrelated workloads.
  • Revoke keys immediately when a service is decommissioned: DELETE /api/v1/service-api-keys/{id}.
  • Store keys in a secrets manager (Vault, AWS Secrets Manager, GCP Secret Manager). Never commit them to source control or bake them into container images.
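
The revocation call can be scripted as part of a decommissioning runbook. The sketch below only builds the request. It assumes this route is authenticated with the X-Admin-Api-Key header described under Network security, and the base URL is a placeholder; check both against your deployment.

```python
import urllib.request
from urllib.parse import quote


def build_revoke_request(base_url: str, key_id: str, admin_api_key: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for one service API key."""
    # Percent-encode the id so it cannot alter the path.
    url = f"{base_url.rstrip('/')}/api/v1/service-api-keys/{quote(key_id, safe='')}"
    return urllib.request.Request(
        url,
        method="DELETE",
        # Assumption: this route accepts the admin key header.
        headers={"X-Admin-Api-Key": admin_api_key},
    )


if __name__ == "__main__":
    # Hypothetical internal address and key id, for illustration only.
    req = build_revoke_request("http://permix.internal:8080", "example-key-id", "from-secrets-manager")
    print(req.get_method(), req.full_url)
```

To send it, pass the request to urllib.request.urlopen with a timeout, from a host inside the private network.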

Java SDK credentials

  • Pass ADMIN_CLIENT_ID and ADMIN_CLIENT_SECRET as environment variables, not in application.properties.
  • Prefer service API key auth over OAuth2 client credentials for resource registration — it removes the dependency on a live OIDC token endpoint at startup.

JWT configuration

  • Always set OIDC_ENABLED=true in production. Disabling JWT validation removes all authentication.
  • Use RS256 or ES256. Symmetric algorithms (for example HS256) are not supported by this service.
  • Keep JWT lifetimes short (5–15 minutes for user tokens). Longer-lived tokens widen the window in which a stolen token can be replayed.
  • Include only the claims the service needs. Avoid embedding sensitive PII in JWT payloads that are forwarded to Permix.
  • Configure EXCLUDED_ROLES to strip internal service-account roles (e.g. offline_access, uma_authorization) that should not influence policy decisions.
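
To check that tokens issued by your IdP follow the algorithm and lifetime guidance above, a sample token can be inspected offline. The following is a minimal sketch using only the standard library. It does not verify the signature and is not a substitute for the service's own validation; the function name audit_token and the 15-minute ceiling are choices made for this example.

```python
import base64
import json
from typing import List

ALLOWED_ALGS = {"RS256", "ES256"}
MAX_LIFETIME_SECONDS = 15 * 60  # upper end of the 5-15 minute guidance


def _decode_segment(segment: str) -> dict:
    """Decode one base64url JWT segment, restoring stripped padding."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def audit_token(token: str) -> List[str]:
    """Return configuration findings for a token. Does NOT verify the signature."""
    header_b64, payload_b64, _signature = token.split(".")
    header = _decode_segment(header_b64)
    claims = _decode_segment(payload_b64)

    findings = []
    if header.get("alg") not in ALLOWED_ALGS:
        findings.append(f"unexpected alg: {header.get('alg')!r}")

    issued_at, expires_at = claims.get("iat"), claims.get("exp")
    if issued_at is None or expires_at is None:
        findings.append("missing iat or exp claim")
    elif expires_at - issued_at > MAX_LIFETIME_SECONDS:
        findings.append(f"lifetime of {expires_at - issued_at}s exceeds {MAX_LIFETIME_SECONDS}s")
    return findings
```

An empty list means the sample token matches the guidance. Run it against tokens from a test realm, and avoid pasting production tokens into shared tooling.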

JWKS endpoint security

  • Ensure your IdP's JWKS endpoint is served over HTTPS with a valid certificate.
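  • The ValidatorCache uses the last known-good JWKS as a fallback on fetch failure. Monitor JWKS fetch errors in your IdP logs. During extended JWKS unavailability the fallback can mask key rotation issues: tokens signed with a newly rotated key fail validation, while a retired key that is still cached continues to be accepted.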
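
The fallback behaviour can be illustrated with a small cache that keeps serving the previous key set when a refresh fails. This is a sketch of the general pattern, not the actual ValidatorCache implementation; the class name, the fetch callable, and the failure counter are assumptions made for the example.

```python
from typing import Callable, Optional


class LastKnownGoodJwks:
    """Serve the most recent successfully fetched JWKS when a refresh fails."""

    def __init__(self, fetch: Callable[[], dict]):
        self._fetch = fetch  # hypothetical callable that returns the parsed JWKS
        self._cached: Optional[dict] = None
        self.consecutive_failures = 0  # export this as a metric and alert on it

    def get(self) -> dict:
        try:
            jwks = self._fetch()
        except Exception:
            self.consecutive_failures += 1
            if self._cached is None:
                # Nothing to fall back to: surface the original error.
                raise
            return self._cached
        self._cached = jwks
        self.consecutive_failures = 0
        return jwks
```

A counter such as consecutive_failures is what makes the stale state visible: the cache keeps answering, so only the metric reveals that the key set has stopped refreshing.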

Policy governance

  • Apply least-privilege defaultRoles when registering resources. Start with the minimum roles required and expand as needed.
  • Use ABAC deny policies at high priority to enforce hard constraints that must not be overridden by RBAC rules.
  • Soft-deletes on ABAC policies (deleted_at timestamp) preserve the full history. Query deleted policies for incident analysis and compliance audits.
  • Review GET /api/v1/resources/{id}/policies periodically to identify stale or overly broad RBAC rules.
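
The interaction between a high-priority ABAC deny policy and RBAC rules can be pictured with a toy evaluator. Everything here is hypothetical: the AbacPolicy shape, the convention that a larger number means higher priority, and the order in which ABAC and RBAC are consulted are assumptions for illustration, not a description of how Permix evaluates policies.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class AbacPolicy:
    name: str
    priority: int  # assumption: larger number is evaluated first
    effect: str  # "deny" or "allow"
    matches: Callable[[dict], bool]


def decide(request: dict, abac_policies: List[AbacPolicy], rbac_allowed_roles: Set[str]) -> str:
    """Return "allow" or "deny"; the first matching ABAC policy wins over RBAC."""
    for policy in sorted(abac_policies, key=lambda p: p.priority, reverse=True):
        if policy.matches(request):
            return policy.effect
    # No ABAC policy matched: fall back to a plain role check.
    if set(request.get("roles", [])) & rbac_allowed_roles:
        return "allow"
    return "deny"
```

In this model a request from a suspended account is denied by the ABAC policy even when the caller holds a role that RBAC would otherwise allow, which is the hard-constraint behaviour the guidance above describes.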

Network security

  • Do not expose Permix directly to the internet. Place it behind an internal load balancer or service mesh.
  • Restrict access to POST /api/v1/check to trusted internal services only.
  • Protect /admin/* routes with network-level controls in addition to the X-Admin-Api-Key header — admin routes should never be reachable from outside your private network.

Kubernetes deployment

  • Mount ADMIN_API_KEY, POSTGRES_PASSWORD, and service API keys from Kubernetes Secrets, not ConfigMaps.
  • Configure liveness and readiness probes using /healthz/live and /healthz/ready. The readiness probe checks the database and Casbin engine — do not bypass it.
  • Run the container as a non-root user. The binary performs no filesystem writes after startup, so a read-only root filesystem is worth considering as well.

Database security

  • Create a dedicated PostgreSQL user with minimal privileges for the service: CONNECT on the authz database, plus SELECT, INSERT, UPDATE, and DELETE on its tables. Do not grant superuser.
  • Enable SSL on the PostgreSQL connection (sslmode=require or verify-full).
  • All tenant data rows carry tenant_id. The service enforces tenant isolation at the query layer, but PostgreSQL row-level security can be added as a defence-in-depth measure.
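
A small guard in deployment tooling can stop an unencrypted connection setting from reaching production. The sketch below builds a libpq-style connection URI and rejects any sslmode other than the two named above. The function name, host, and user are placeholders, and whether Permix accepts its database settings as a URI is an assumption.

```python
from urllib.parse import quote, urlencode

# The two modes recommended above; anything else is rejected.
ACCEPTED_SSLMODES = {"require", "verify-full"}


def build_postgres_uri(host: str, port: int, dbname: str, user: str,
                       sslmode: str = "verify-full") -> str:
    """Build a connection URI without a password and with SSL enforced."""
    if sslmode not in ACCEPTED_SSLMODES:
        raise ValueError(f"sslmode {sslmode!r} does not enforce SSL as required")
    query = urlencode({"sslmode": sslmode})
    # The password is deliberately left out of the URI; libpq clients can read it
    # from the PGPASSWORD environment variable or a password file instead.
    return f"postgresql://{quote(user, safe='')}@{host}:{port}/{quote(dbname, safe='')}?{query}"
```

Of the two modes, verify-full also checks the server certificate and host name, while require only encrypts the connection.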

Monitoring and alerting

  • Alert on a sustained increase in 403 responses from POST /api/v1/check — this may indicate a misconfigured policy or a probing attack.
  • Alert on 503 from GET /healthz/ready — this means the database or Casbin engine is unavailable and all authorizations are failing.
  • Log matched_rule_id from check responses to correlate access decisions with specific policy rules during incident analysis.
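
"Sustained" can be made concrete by requiring the 403 rate to stay above a threshold for several consecutive measurement windows before alerting. The sketch below shows that idea with a fixed threshold rather than a learned baseline. The class name, the threshold, and the window count are assumptions to tune for your traffic; most monitoring systems offer an equivalent built-in rule.

```python
from collections import deque


class SustainedDenialAlert:
    """Fire only when the 403 rate exceeds a threshold for N windows in a row."""

    def __init__(self, threshold: float, windows_required: int):
        self.threshold = threshold
        # Keeps one boolean per window; older windows fall off automatically.
        self._recent = deque(maxlen=windows_required)

    def observe(self, denied: int, total: int) -> bool:
        """Record one window of POST /api/v1/check counts; return True to alert."""
        rate = denied / total if total else 0.0
        self._recent.append(rate > self.threshold)
        return len(self._recent) == self._recent.maxlen and all(self._recent)
```

Requiring consecutive windows filters out short spikes, such as a single client retrying a denied request, while still catching a misconfigured policy or a slow probing pattern.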