
How to Build a Static Analysis Engine That Catches Cloud Code Bugs Before They Ship

Zovia Studio | 12 min read
cloud-code static-analysis quality parse

At a Glance

  • NoSQL backends like Parse don’t enforce field names — a typo like user.get('emial') silently returns undefined at runtime, and you won’t know until a user reports missing data
  • You can build a pre-deploy static analysis gate that validates every class name, field reference, and pointer chain against your database schema
  • A variable tracking engine with three maps (query, object, array) can follow data through assignments, loops, and pointer chains to resolve types
  • Two-pass analysis solves the hardest problem: validating field references inside helper functions whose parameter types are only known at their call sites
  • We built this for our Parse codebase (797 cloud functions, 111 classes) — this post is the full blueprint so you can build one too

Every field reference, every class name, every pointer chain — validated against your database schema before a single line reaches production.


At Zovia, we ship cloud functions that power calendar intelligence, smart grocery lists, inventory management, and family check-ins. Our Parse backend has 111 classes, 797 cloud functions, and 253 JavaScript files across the ecosystem. A single typo in a field name — user.get('emial') instead of user.get('email') — would silently return undefined at runtime, causing subtle data bugs that are painful to trace.

So we built something that makes those bugs impossible: a schema assertion validator that statically analyzes every cloud code file against our live database schema. If it fails, the deploy is blocked. No exceptions.

This post breaks down exactly how it works — the architecture, the tracking engine, the gaps we closed, and the design decisions we made — so you can build the same thing for your own codebase.


The Problem: Silent Runtime Failures

Parse Server (and most NoSQL backends) doesn’t enforce field names at the application layer. You can write user.get('literally_anything') and it compiles, deploys, and runs — returning undefined with no error. Same for class names: new Parse.Query('Uesr') creates a query against a nonexistent class and silently returns nothing.

This isn’t unique to Parse. MongoDB, Firestore, DynamoDB — any schemaless database has this problem. The flexibility that makes NoSQL easy to start with is the same flexibility that lets typos slip through to production.

In any codebase with hundreds of cloud functions, these kinds of bugs are inevitable. You need a safety net that catches them at build time, not at 2 AM when a user reports missing data.

The Solution: A Pre-Deploy Static Analysis Gate

The idea is simple: export your database schema as JSON, then write a script that scans every cloud code file and checks every class name and field reference against that schema. Wire it into your deploy pipeline so nothing ships without passing.

Here’s what ours looks like:

npm run validate:schema  →  PASS (0 errors)  →  deploy allowed
npm run validate:schema  →  FAIL (3 errors)  →  deploy blocked

It’s wired into the deployment pipeline at multiple levels:

"predeploy": "npm run validate:pre-deploy",
"preflight:quick": "npm run validate:schema && npm run test:contracts",
"preflight:full": "npm run validate:schema && npm run test:contracts && npm run test:flows"

Every path to production runs through the validator first. If you want to adopt this pattern, the key principle is: make it mandatory, not optional. A validation step that developers can skip will be skipped.


What to Validate

Here are the five categories of checks that give the highest return on investment.

1. Class Names

Every reference to a database class should be checked against your schema:

new Parse.Query('OrderReceipt')     // Is 'OrderReceipt' a real class?
new Parse.Object('TeamMember')     // Is 'TeamMember' a real class?
Parse.Object.extend('GameSession')   // Is 'GameSession' a real class?

You’ll need to handle built-in classes (_User, _Role, _Session, _Installation) and maintain a curated list of known exceptions — legacy classes you reference intentionally but haven’t migrated yet.

If a class doesn’t exist in the schema and isn’t in either allowlist: deploy blocked.
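The check itself is small. A sketch, assuming schemaClasses is a Set of class names from the exported schema (the allowlist entries here are illustrative):

```javascript
// Parse's built-in system classes are always legal references.
const BUILT_IN_CLASSES = new Set(['_User', '_Role', '_Session', '_Installation']);

// Curated allowlist of intentional legacy references, each with a reason.
const KNOWN_EXCEPTIONS = new Set([
  // 'LegacyAuditLog', // hypothetical: pre-migration class still read by a cleanup job
]);

// A class name is valid if it's in the schema, a built-in, or a known exception.
function isValidClass(name, schemaClasses) {
  return schemaClasses.has(name)
    || BUILT_IN_CLASSES.has(name)
    || KNOWN_EXCEPTIONS.has(name);
}
```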

2. Query Field Names

Every field passed to a query method should be validated against the target class:

const q = new Parse.Query('OrderReceipt');
q.equalTo('storeId', 'abc');      // Does 'storeId' exist on OrderReceipt?
q.include('space');                // Does 'space' exist on OrderReceipt?
q.descending('purchaseDate');      // Does 'purchaseDate' exist on OrderReceipt?

We validate against 29 query methods: equalTo, notEqualTo, lessThan, greaterThan, containedIn, exists, doesNotExist, matchesQuery, include, select, ascending, descending, and more. For your own validator, start with the methods your codebase actually uses and expand from there.
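One way to implement this check, sketched with a shortened method list: match variable.method('field') against the known query methods, resolve the variable through a queryMap of variable name to class name, and look the field up in a schema Map of class name to fields object. (All names are illustrative; dotted paths like include('a.b') are validated on their first segment here.)

```javascript
// Subset of query methods whose first string argument is a field name.
const QUERY_METHODS = ['equalTo', 'notEqualTo', 'lessThan', 'greaterThan',
  'containedIn', 'exists', 'doesNotExist', 'include', 'select',
  'ascending', 'descending'];

function checkQueryFields(line, queryMap, schema) {
  const errors = [];
  const re = new RegExp(
    String.raw`(\w+)\.(${QUERY_METHODS.join('|')})\(['"]([\w.]+)['"]`, 'g');
  for (const [, varName, method, field] of line.matchAll(re)) {
    const cls = queryMap.get(varName);
    if (!cls) continue; // untracked variable: skip here, surface as a warning elsewhere
    const fields = schema.get(cls);
    // Dotted paths (include('a.b')) traverse pointers; validate the first segment.
    if (fields && !(field.split('.')[0] in fields)) {
      errors.push(`'${field}' not on ${cls} (${method})`);
    }
  }
  return errors;
}
```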

3. Object Field Methods

When a variable’s class is known, every .get(), .set(), .unset(), .increment(), .add(), .addUnique(), and .remove() call should be checked:

const receipt = await q.first();       // receipt is an OrderReceipt
receipt.get('storeId');                 // Valid field?
receipt.set('totalAmount', 42);        // Valid field?
receipt.increment('itemCount');         // Valid field?

4. Pointer Safety

This catches a common Parse SDK pitfall where pointer fields are accessed assuming they’re always fetched objects:

// Dangerous: .id doesn't exist on raw pointers
const spaceId = receipt.get('space').id;

// Safe: handles both fetched objects and raw pointers
const ptr = receipt.get('space');
const spaceId = ptr?.id || ptr?.objectId;

If your ORM or SDK has similar patterns where references can be either populated objects or raw IDs, add checks for those too.
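A heuristic for flagging the dangerous pattern might look like this sketch, assuming objectMap maps variable names to class names and schema maps class names to a fields object of { type, targetClass } definitions (names are illustrative):

```javascript
// Flag obj.get('field').id when the schema says 'field' is a Pointer:
// on an unfetched pointer, .id is undefined and .objectId holds the id.
function findUnsafePointerAccess(line, objectMap, schema) {
  const warnings = [];
  const re = /(\w+)\.get\(['"](\w+)['"]\)\.id\b/g;
  for (const [, varName, field] of line.matchAll(re)) {
    const cls = objectMap.get(varName);
    const def = cls && (schema.get(cls) || {})[field];
    if (def && def.type === 'Pointer') {
      warnings.push(
        `${varName}.get('${field}').id may be undefined on a raw pointer; ` +
        `prefer ptr?.id || ptr?.objectId`);
    }
  }
  return warnings;
}
```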

5. Duplicate Function Definitions

In a codebase split across many files, it’s easy to accidentally define the same cloud function twice. The last definition silently wins, breaking the first caller. Your validator should detect this:

DUPLICATE_FUNCTION: Cloud function "processReceipt" is defined 2 times.
  Also at: cloud/zupply/legacy.js:45
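Detection is a simple counting pass over all files. A sketch (the files input shape and function names are illustrative):

```javascript
// files: Map of file path -> source text.
// Returns one error string per cloud function defined more than once.
function findDuplicateFunctions(files) {
  const seen = new Map(); // function name -> ["file:line", ...]
  for (const [file, source] of files) {
    source.split('\n').forEach((line, i) => {
      const m = line.match(/Parse\.Cloud\.define\(['"](\w+)['"]/);
      if (m) {
        if (!seen.has(m[1])) seen.set(m[1], []);
        seen.get(m[1]).push(`${file}:${i + 1}`);
      }
    });
  }
  return [...seen]
    .filter(([, sites]) => sites.length > 1)
    .map(([name, sites]) =>
      `DUPLICATE_FUNCTION: "${name}" defined ${sites.length} times (${sites.join(', ')})`);
}
```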

The Hard Part: Knowing What Type a Variable Is

Validating new Parse.Query('OrderReceipt').equalTo('storeId', x) is straightforward — the class name is right there on the same line. But most real code looks like this:

const query = new Parse.Query('OrderReceipt');
query.equalTo('status', 'active');

const receipts = await query.find();
for (const receipt of receipts) {
  const store = receipt.get('store');    // What class is 'receipt'?
  const name = store.get('name');        // What class is 'store'?
}

To validate receipt.get('store'), you need to know that receipt is an OrderReceipt. To validate store.get('name'), you need to know that store is whatever class OrderReceipt.store points to.

This is where the variable tracking engine comes in — and this is the part worth building carefully, because it’s what separates a toy validator from a real one.

Variable Tracking: Following Data Through Code

The validator maintains three tracking maps as it processes each file line by line:

Map        | Tracks                                     | Example
-----------|--------------------------------------------|-------------------------
queryMap   | Query variables and their target class     | q -> OrderReceipt
objectMap  | Object variables and their resolved class  | receipt -> OrderReceipt
arrayMap   | Array variables and their element class    | receipts -> OrderReceipt

These maps should be reset at function boundaries (every Parse.Cloud.define, Parse.Cloud.job, trigger, or standalone function declaration) to prevent cross-function pollution.

A fourth map — extendMap — persists across the entire file to handle Parse.Object.extend patterns.

Tracking Through Query Results

const q = new Parse.Query('OrderReceipt');   // queryMap['q'] = 'OrderReceipt'
const receipt = await q.first();               // objectMap['receipt'] = 'OrderReceipt'
const receipts = await q.find();               // arrayMap['receipts'] = 'OrderReceipt'
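A sketch of the declaration-tracking pass for a single line, mutating the three maps (the function name and regexes are illustrative):

```javascript
// Track declarations: queries, single-object results, and array results.
function trackLine(line, maps) {
  const { queryMap, objectMap, arrayMap } = maps;
  let m;
  if ((m = line.match(/(?:const|let|var)\s+(\w+)\s*=\s*new Parse\.Query\(['"](\w+)['"]\)/))) {
    queryMap.set(m[1], m[2]);                                        // q -> OrderReceipt
  } else if ((m = line.match(/(?:const|let|var)\s+(\w+)\s*=\s*await\s+(\w+)\.(?:first|get)\(/))) {
    if (queryMap.has(m[2])) objectMap.set(m[1], queryMap.get(m[2])); // receipt -> OrderReceipt
  } else if ((m = line.match(/(?:const|let|var)\s+(\w+)\s*=\s*await\s+(\w+)\.find\(/))) {
    if (queryMap.has(m[2])) arrayMap.set(m[1], queryMap.get(m[2]));  // receipts -> OrderReceipt
  }
}
```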

Tracking Through Loops

for (const receipt of receipts) { ... }          // objectMap['receipt'] from arrayMap['receipts']
receipts.forEach(r => { ... })                   // objectMap['r'] from arrayMap['receipts']
receipts.forEach((r) => { ... })                 // same
receipts.forEach(async (r) => { ... })           // same
receipts.forEach(function(r) { ... })            // same
receipts.map(r => { ... })                       // objectMap['r'] from arrayMap['receipts']
const filtered = receipts.filter(r => r.get('active'));  // arrayMap['filtered'] from arrayMap['receipts']

Handle every iteration variant your codebase uses. This is tedious but important — each missed pattern is a hole in your coverage.
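The loop patterns above reduce to one rule: the iteration variable inherits the element class from arrayMap. A sketch covering for...of, forEach, and map (filter, which stays an array, is omitted here; names are illustrative):

```javascript
// Propagate element class from arrayMap to objectMap for the iteration variable.
function trackLoops(line, arrayMap, objectMap) {
  const patterns = [
    /for\s*\(\s*(?:const|let|var)\s+(\w+)\s+of\s+(\w+)\s*\)/,         // for (const r of receipts)
    /(\w+)\.(?:forEach|map)\(\s*(?:async\s*)?\(?\s*(\w+)\s*\)?\s*=>/, // receipts.forEach(r => / (r) =>
    /(\w+)\.(?:forEach|map)\(\s*(?:async\s+)?function\s*\(\s*(\w+)/,  // receipts.forEach(function (r)
  ];
  for (const [idx, re] of patterns.entries()) {
    const m = line.match(re);
    if (!m) continue;
    // for...of captures (iterVar, arrVar); the method patterns capture (arrVar, iterVar).
    const [iterVar, arrVar] = idx === 0 ? [m[1], m[2]] : [m[2], m[1]];
    if (arrayMap.has(arrVar)) objectMap.set(iterVar, arrayMap.get(arrVar));
    return;
  }
}
```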

Tracking Through Pointer Chains

This is where it gets interesting. When your schema says a field is a Pointer type with a targetClass, the tracker can follow the chain:

const receipt = await q.first();
// objectMap['receipt'] = 'OrderReceipt'

const store = receipt.get('store');
// Schema says OrderReceipt.store is Pointer<RetailStore>
// objectMap['store'] = 'RetailStore'

store.get('name');
// Validates 'name' exists on RetailStore

This works through request.user as well:

const profile = request.user.get('profile');
// Schema says _User.profile is Pointer<UserProfile>
// objectMap['profile'] = 'UserProfile'
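In sketch form: when an assignment reads a field that the schema marks as a Pointer, the assigned variable inherits the targetClass. Here schema is assumed to map class names to a fields object of { type, targetClass }, and request.user is special-cased to the built-in _User class (names are illustrative):

```javascript
// Follow pointer chains: const store = receipt.get('store') makes
// 'store' an instance of whatever class OrderReceipt.store points to.
function trackPointerAssignment(line, objectMap, schema) {
  const m = line.match(/(?:const|let|var)\s+(\w+)\s*=\s*(\w+)(?:\.user)?\.get\(['"](\w+)['"]\)/);
  if (!m) return;
  const [, target, source, field] = m;
  // request.user always resolves to the built-in _User class.
  const srcClass = line.includes('request.user') ? '_User' : objectMap.get(source);
  const def = srcClass && (schema.get(srcClass) || {})[field];
  if (def && def.type === 'Pointer') objectMap.set(target, def.targetClass);
}
```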

Tracking Through Triggers

For beforeSave, afterSave, beforeDelete, and afterDelete triggers, track the trigger class to resolve request.object:

Parse.Cloud.beforeSave('OrderReceipt', async (request) => {
  request.object.get('storeId');     // Validates against OrderReceipt
  request.object.set('verified', true); // Validates against OrderReceipt
});
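A sketch of the trigger case: remember the trigger's class under the dotted name request.object so later field checks can resolve it (this assumes the field-check regexes are widened to allow dotted variable names; the function name is illustrative):

```javascript
// On a trigger registration line, record the class that request.object will be.
function trackTriggerClass(line, objectMap) {
  const m = line.match(/Parse\.Cloud\.(?:beforeSave|afterSave|beforeDelete|afterDelete)\(['"](\w+)['"]/);
  if (m) objectMap.set('request.object', m[1]);
}
```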

Closing the Tracking Gaps

When we first built the variable tracking engine, it could resolve 54% of all .get() calls — 2,069 out of 3,822. The remaining 46% were invisible to the validator. We analyzed the gaps and found four specific patterns the engine couldn’t follow. If you’re building your own validator, plan for these from the start.

Gap 1: Variable Re-assignment (23% of untracked)

The engine only tracked declarations with const/let/var. Re-assignments were invisible:

let receipt;
receipt = await query.first();       // Was NOT tracked (no const/let/var)
receipt.get('storeId');              // Unknown class, skipped

Fix: Add regex patterns for bare assignments (^\s+(\w+)\s*=\s*await\s+...), anchored to leading whitespace to avoid matching object property assignments like obj.prop = .... This covers .first(), .get(), .find(), and pointer traversal re-assignments.
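A sketch of that fix. The pattern requires a bare identifier immediately after the leading whitespace, so obj.prop = ... never matches because the dot breaks the `(\w+)\s*=` sequence (names are illustrative):

```javascript
// Track bare re-assignments: `receipt = await query.first();` with no declaration keyword.
function trackReassignment(line, queryMap, objectMap) {
  const m = line.match(/^\s+(\w+)\s*=\s*await\s+(\w+)\.(?:first|get)\(/);
  if (m && queryMap.has(m[2])) objectMap.set(m[1], queryMap.get(m[2]));
}
```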

Gap 2: .map() Callbacks (14% of untracked)

We tracked .forEach() in every variant but had missed .map() entirely:

const names = receipts.map(r => r.get('storeName'));
// 'r' was NOT tracked, 'storeName' was NOT validated

Fix: Add .map() tracking that mirrors the existing .forEach() patterns — arrow functions with and without parentheses, and async variants.

Gap 3: Chained .get().get() (20% of untracked)

Inline pointer traversal without an intermediate variable was invisible:

receipt.get('store')?.get('name');   // 'name' was NOT validated
request.user.get('space')?.get('title'); // 'title' was NOT validated

Fix: A dedicated regex matches obj.get('field1')?.get('field2') chains. It resolves field1 as a pointer on the source class, finds the target class, then validates field2 against that target class.
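A sketch of that chained-get check, assuming the same schema shape as before (class name to fields object of { type, targetClass }; names are illustrative):

```javascript
// Validate obj.get('field1')?.get('field2'): resolve field1 as a pointer
// on the source class, then check field2 against the pointer's target class.
function checkChainedGet(line, objectMap, schema) {
  const errors = [];
  const re = /(\w+)\.get\(['"](\w+)['"]\)\??\.get\(['"](\w+)['"]\)/g;
  for (const [, varName, field1, field2] of line.matchAll(re)) {
    const cls = objectMap.get(varName);
    const def = cls && (schema.get(cls) || {})[field1];
    if (def && def.type === 'Pointer') {
      const targetFields = schema.get(def.targetClass) || {};
      if (!(field2 in targetFields)) {
        errors.push(`'${field2}' not on ${def.targetClass}`);
      }
    }
  }
  return errors;
}
```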

Gap 4: Function Parameters (37% of untracked — the hardest)

This is the biggest gap and the most interesting to solve. Helper functions receive Parse objects as parameters, but the engine has no way to know their types:

async function buildIndex(receipt) {
  receipt.get('storeId');    // What class is 'receipt'? Unknown.
  receipt.get('items');      // Can't validate.
}

// Later in the file:
const r = await query.first();   // r is OrderReceipt
await buildIndex(r);              // Passes OrderReceipt as 'receipt'

The function definition appears before the call site that would tell you the parameter type. A single-pass analysis can’t solve this.

Fix: Two-pass analysis.

Pass 1 processes the entire file normally, but additionally records:

  • Every function definition and its parameter names
  • Every call site where a tracked variable is passed as an argument

After pass 1, you know: “function buildIndex has parameter receipt, and at line 50, argument 0 is an OrderReceipt.”

Pass 2 re-processes the file with this knowledge. When it encounters function buildIndex(receipt), it pre-populates the tracking map: objectMap['receipt'] = 'OrderReceipt'. Now every .get() call inside that function body is validated.

The second pass only runs when pass 1 actually discovers resolvable function parameters. For files with no helper functions, it’s a no-op. This keeps the validator fast for the common case.
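The two passes can be sketched like this. This is a heavily simplified version (single-argument calls only, same-file only; all names are illustrative): pass 1 records definitions and call sites with tracked arguments, and pass 2 pre-seeds the map when it re-enters a function body.

```javascript
// Pass 1: record function definitions and call sites whose argument
// is a tracked variable, so parameter types can be resolved.
function collectParamTypes(lines, objectMap) {
  const defs = new Map();     // fn name -> [param names]
  const resolved = new Map(); // fn name -> { paramIndex: className }
  for (const line of lines) {
    let m;
    if ((m = line.match(/function\s+(\w+)\s*\(([^)]*)\)/))) {
      defs.set(m[1], m[2].split(',').map(s => s.trim()).filter(Boolean));
    } else if ((m = line.match(/(\w+)\(\s*(\w+)\s*\)/))) {
      // Call site with a single tracked argument (simplified).
      if (defs.has(m[1]) && objectMap.has(m[2])) {
        resolved.set(m[1], { 0: objectMap.get(m[2]) });
      }
    }
  }
  return { defs, resolved };
}

// Pass 2: on entering a resolvable function, pre-populate objectMap
// so field references inside the body can be validated.
function seedParams(line, defs, resolved, objectMap) {
  const m = line.match(/function\s+(\w+)\s*\(/);
  if (!m || !resolved.has(m[1])) return;
  const params = defs.get(m[1]) || [];
  for (const [idx, cls] of Object.entries(resolved.get(m[1]))) {
    if (params[idx]) objectMap.set(params[idx], cls);
  }
}
```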


Error Severity: Blocking vs. Warning

This is a design decision that will make or break adoption of your validator. Not all findings are equal. We recommend a deliberate two-tier system:

Finding                             | Severity | Blocks Deploy?
------------------------------------|----------|---------------
Invalid class name                  | Error    | Yes
Invalid field on same-line query    | Error    | Yes
Duplicate cloud function            | Error    | Yes
Invalid field via variable tracking | Warning  | No
Unsafe pointer access pattern       | Warning  | No

Why should variable-tracked fields be warnings instead of errors?

Because the variable tracking is regex-based, not AST-based. It’s a best-effort heuristic. If the tracking misidentifies a variable’s class (say, due to complex control flow the regex can’t follow), it produces a false positive. False positives that block deploys erode trust in the tool, and teams start adding // schema-ignore everywhere. By making these warnings, you get visibility without the risk of blocking valid code.

Same-line query fields can be errors because they’re deterministic — the class name is literally on the same line as the field reference. No heuristic involved.

The principle: block on certainty, warn on inference.


The Escape Hatch

Every good validator needs an escape hatch. Sometimes you need to reference something the validator can’t understand — a dynamically constructed class name, a test fixture, a migration script. A single inline comment should suppress validation for that line:

const q = new Parse.Query('DynamicClassName'); // schema-ignore

Use this sparingly and deliberately. If your team starts adding // schema-ignore everywhere, the validator is probably too aggressive — dial back the error/warning boundary rather than suppressing findings wholesale.
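Implementation-wise, the escape hatch is one predicate applied before any finding is recorded (a sketch; the function name is illustrative):

```javascript
// Drop all findings on lines carrying the suppression marker.
function isSuppressed(line) {
  return /\/\/\s*schema-ignore\b/.test(line);
}
```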


Results

Our validator runs on every deploy across the entire cloud codebase:

Files scanned: 253
Cloud functions: 797
Schema classes: 111
Duration: ~5 seconds

It catches real bugs. Misspelled field names, references to renamed classes, queries against fields that were removed in a schema migration — all caught before they reach production.

The two-tier error/warning system means we block on high-confidence findings and surface lower-confidence ones for review. The known exceptions list lets us acknowledge intentional legacy references without cluttering the output.

And the two-pass function parameter tracking means even helper functions deep in the call chain get their field references validated.


Design Decisions Worth Stealing

These are the choices we’d make again if we built this from scratch. Consider them for your own implementation.

Regex over AST. We chose regex-based analysis over a full JavaScript AST parser. An AST would be more precise but also heavier, slower, and harder to maintain. The regex approach handles the patterns your codebase actually uses, runs in seconds, and the code is readable enough that anyone on the team can add a new pattern. Start with regex. Move to AST only if you hit a wall.

Warnings, not just errors. A validator that only blocks or only warns is less useful than one that does both. The two-tier approach lets you expand tracking aggressively (every new pattern starts as a warning) without risking false-positive deploy blocks.

Schema as source of truth. The validator reads schema.json directly — the same file that defines the database. There’s no separate “validator config” to keep in sync. If the schema changes, the validator sees it immediately. Whatever your schema source is (exported JSON, GraphQL SDL, Prisma schema), read it directly.

Known exceptions over suppression. Instead of // schema-ignore everywhere, maintain a curated list of classes you know don’t exist in the schema. Each one has a comment explaining why. This makes the exceptions visible and reviewable, not scattered across hundreds of files.

Scope reset at function boundaries. Variable tracking resets at every cloud function definition, job, trigger, and standalone function. This prevents a variable typed in one function from leaking into another. The tradeoff — losing tracking across function boundaries — is exactly what the two-pass analysis recovers.


What’s Next

A validator like this is a living tool. Every time you discover a new pattern that produces an untracked reference, add a regex for it. The architecture — line-by-line processing with pluggable tracking patterns — makes this straightforward.

Some areas we’re still working on:

  • Cross-file function tracking. Currently, the two-pass analysis only works within a single file. Functions imported via require() aren’t tracked.
  • Destructuring patterns. const { storeId, items } = receipt.toJSON() bypasses .get() entirely and isn’t tracked.
  • Dynamic field names. obj.get(fieldVar) where the field name is in a variable can’t be statically validated.

But these are diminishing returns. The engine catches the vast majority of field and class reference errors, and the warning system surfaces the rest for human review.

If you’re running a Parse backend (or any NoSQL backend with cloud functions), you can build this. The schema is already there. The patterns are in this post. The hardest part is the variable tracking engine, and now you have the blueprint for that too.

Production quality from the start. That’s the standard.


Built at Zovia Studio. We ship apps that families use every day, and we take the “every day” part seriously.
