Advanced PostgreSQL
Row-Level Security (RLS) for multi-tenancy, range partitioning for massive tables, and pg_cron for scheduled tasks inside the database.
Scaling and Security at the Data Layer
As your application grows, doing everything in application code becomes risky and slow. If you build a multi-tenant SaaS, filtering by tenant_id in every single PHP query is a bug waiting to happen. If your logs table reaches 100 million rows, DELETE FROM logs WHERE created_at < NOW() - INTERVAL '30 days' will lock the table and bring down production.
Postgres has advanced features specifically designed to handle these scale and security issues at the database level.
Learning Objectives
- Use Row-Level Security (RLS) to guarantee tenant isolation
- Use Range Partitioning to split massive tables without changing application code
- Understand
pg_cronfor scheduling tasks directly inside Postgres
Row-Level Security (RLS)
In a multi-tenant application, users from "Company A" must never see data belonging to "Company B".
The standard (flawed) approach is to add WHERE tenant_id = :id to every query in PHP. If a developer forgets that WHERE clause on one API endpoint, data leaks.
Row-Level Security (RLS) enforces this at the Postgres level. Even if the PHP developer runs SELECT * FROM posts, Postgres will intercept it and only return the rows the current tenant is allowed to see.
Step 1: Enable RLS
First, enable RLS on the table:
ALTER TABLE posts ENABLE ROW LEVEL SECURITY;(Note: If you enable RLS but create no policies, the default policy is DENY ALL. No rows will be visible.)
Step 2: Create a Policy
Create a policy that restricts access based on a session variable:
CREATE POLICY tenant_isolation ON posts
USING (tenant_id = current_setting('app.tenant_id', true)::INT);The USING clause defines what rows can be read and updated.
current_setting('app.tenant_id', true)reads a custom configuration variable. Thetruemeans "don't throw an error if the setting doesn't exist (return null)".::INTcasts it to an integer to match thetenant_idcolumn type.
Step 3: Set the context in PHP
Before running queries for a specific tenant, tell Postgres who the current tenant is. Do this once per HTTP request (e.g., in your framework's middleware or base controller).
$tenantId = 5; // e.g., determined from the logged-in user or the subdomain
$pdo = $db->getConnection();
$pdo->exec("SET app.tenant_id = {$tenantId}");Now, when you run this query:
$posts = $db->query('SELECT * FROM posts')->get();Postgres automatically rewrites it to effectively be SELECT * FROM posts WHERE tenant_id = 5. The isolation is guaranteed by the database engine.
Partitioning
When a table gets too large (e.g., tens of gigabytes), standard indexes become too large to fit in memory, and deleting old data causes massive I/O spikes.
Table Partitioning splits one large logical table into multiple smaller physical tables. Your application still queries the main table as if nothing changed.
Range Partitioning (e.g., by month)
This is ideal for time-series data like logs, events, or posts.
Create the parent table and declare the partition key:
CREATE TABLE event_logs (
id BIGSERIAL,
event_type TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (created_at);(Note: The partition key created_at must be part of the primary key constraint if you have one, which complicates things slightly. Often, partitioned event tables don't use primary keys at all.)
Create the child tables (partitions) for specific date ranges:
CREATE TABLE event_logs_2024_01 PARTITION OF event_logs
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE event_logs_2024_02 PARTITION OF event_logs
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');When you INSERT INTO event_logs, Postgres automatically routes the row to the correct child table.
Why Partitioning Matters
- Partition Pruning: If you query
WHERE created_at BETWEEN '2024-01-15' AND '2024-01-20', Postgres entirely ignoresevent_logs_2024_02and scans only the January table. RunEXPLAIN ANALYZEto see this in action. - Instant Deletion: To delete data older than January, you don't run
DELETE. You just drop the partition:DROP TABLE event_logs_2024_01. It happens instantly and reclaims disk space immediately with zero locking.
pg_cron
If you need to run a task every hour (e.g., deleting expired sessions, refreshing materialized views), you usually set up a Linux cron job that calls a PHP script.
If the task is purely data manipulation, you can use the pg_cron extension to run the job entirely inside Postgres.
Enabling pg_cron
In your docker-compose.yml, you must preload the library:
db:
image: postgres:16-alpine
command: postgres -c shared_preload_libraries=pg_cronThen in Postgres:
CREATE EXTENSION IF NOT EXISTS pg_cron;Scheduling Jobs
Delete expired sessions every day at 3 AM:
SELECT cron.schedule('0 3 * * *', $$DELETE FROM sessions WHERE expires_at < NOW()$$);Check job status:
SELECT * FROM cron.job_run_details ORDER BY start_time DESC LIMIT 5;End-of-Phase Project: Multi-Tenant Blog Platform
You have all the building blocks. Your final project is to build a multi-tenant blog platform (like Medium or Substack) using these advanced features.
Schema Requirements
- Tenants:
id,name,domain - Users:
id,tenant_id,email,password_hash - Posts:
id,tenant_id,user_id,title,body,search_vector,created_at
Database Architecture Rules
- Must use RLS: All queries to
usersandpostsmust be protected by Row-Level Security. Application code should never containWHERE tenant_id = ?. - Must use Full-Text Search: The
poststable must use atsvectorgenerated column, indexed with GIN, for searching articles. - Migrations: All schema changes must be written as numbered migration files in
database/migrations/. - Connection Pooling: Your
docker-compose.ymlmust includepgbouncer. PHP must connect to PgBouncer, not directly to Postgres.
API Requirements
- Middleware that extracts the
tenant_idfrom the request (e.g., from anX-Tenant-Domainheader) and executesSET app.tenant_id = ?. POST /posts— Create a post.GET /posts— List posts (paginated). Postgres RLS will automatically ensure they only see their tenant's posts.GET /search?q=docker— Full text search across the tenant's posts.GET /export— (Bonus) Stream the tenant's posts out as CSV.
Your Task
Load the broken challenge:
php artisan challenge:start db-missing-rlsA controller Http/controllers/tenant/posts.php lists posts for a tenant. It tries to isolate data by fetching the tenant ID and doing WHERE tenant_id = :id. But if another developer modifies this query later and forgets the WHERE clause, data will leak.
You must implement Row-Level Security to protect the data at the DB level.
- Fix the Migration: In
database/migrations/003_enable_rls.sql, write the SQL to enable RLS on thepoststable and create a policy that checkstenant_id = current_setting('app.tenant_id')::INT. - Fix the Controller: In
Http/controllers/tenant/posts.php, executeSET app.tenant_id = :idbefore theSELECTquery runs. - Remove the WHERE clause: Remove
WHERE tenant_id = :idfrom the controller'sSELECTquery to prove that RLS is doing the filtering.
Verify:
php artisan challenge:verify