Error Medic

ServiceNow Configuration Troubleshooting Guide: Fix Crashes, Timeouts, Slow Performance & Migration Errors

Fix ServiceNow configuration issues including crashes, timeouts, slow performance, and data migration failures. Step-by-step diagnostic commands and proven fixes.

Key Takeaways
  • ServiceNow crashes and 'Transaction cancelled: maximum execution time exceeded' errors are most often caused by runaway Script Includes, infinite loops in Business Rules, or exhausted heap memory in the application node JVM
  • Slow performance and timeout errors (HTTP 503, 'glide.script.timeout' violations) typically stem from unindexed table queries, bloated sys_log tables exceeding 10M rows, or misconfigured connection pool settings under System Diagnostics > Stats
  • Data migration failures ('Transform Map failed: coercion error', duplicate sys_id collisions) are resolved by validating field-level coercion scripts in Transform Maps, enabling 'Run business rules' selectively, and using the Import Set Coercion utility before production runs
  • Quick fix sequence: (1) flush application cache via /cache.do, (2) check node memory with System Diagnostics > Stats, (3) review slow query log in sys_log_stats, (4) kill runaway transactions from System Diagnostics > Active Transactions
Fix Approaches Compared
Method | When to Use | Time | Risk
Cache Flush (/cache.do) | After configuration changes cause unexpected behavior or stale data appears in forms/lists | 2–5 min (node restart) | Low: safe in off-peak; causes brief 503 for active sessions
Kill Active Transaction (System Diagnostics > Active Transactions) | Single runaway script consuming >100% CPU; 'Transaction cancelled' errors flooding sys_log | Immediate | Low: kills only selected transaction thread
Increase glide.script.timeout (sys_properties) | Legitimate long-running integrations or scheduled jobs hitting default 30s limit | 1 min | Medium: raising timeout masks underlying inefficiency; set per-scope if possible
Add Database Index (sys_db_object > Indexes) | Repeated slow query warnings for a specific table/field combination in sys_log_stats | 5–30 min (background) | Low-Medium: indexes consume disk; validate cardinality first
Import Set Coercion Repair Script | Data migration Transform Map throwing 'coercion error' or type mismatch on date/reference fields | 15–60 min | Medium: test on sub-production instance before production run
Node Memory Tuning (glide.memory.max) | Multiple nodes showing heap near limit in System Diagnostics > Stats; frequent OutOfMemoryError in node logs | 30 min + restart | High: requires change window; incorrect values can prevent node startup

Understanding ServiceNow Configuration Errors

ServiceNow is a multi-instance, Java-based SaaS platform running on Glide (its proprietary application framework). Most configuration problems manifest in one of four areas: scripting runtime, database layer, import/transform pipeline, or node JVM health. Identifying which layer is failing dramatically shortens mean time to resolution.

When ServiceNow is not working, the platform surfaces errors in three primary locations:

  • sys_log table (All > System Logs > System Log > All)
  • Node log files at /var/log/servicenow/<instance>/ (on-prem) or via HI portal for cloud
  • Browser console (F12) for client-side script errors

Always check all three before assuming a root cause.


Step 1: Diagnose — Identify the Failure Layer

1a. Check Active Transactions for Runaway Scripts

Navigate to System Diagnostics > Active Transactions. Look for transactions with elapsed time >30 seconds or CPU >80%. Common culprits display as:

URL: /api/now/table/incident
Script: GlideRecord query on cmdb_ci (no index on u_custom_field)
Elapsed: 142s | CPU: 98%

Click the transaction row and select Cancel Transaction to immediately free the thread. This does not fix the root cause but stops the bleeding.

1b. Parse the System Log for Error Patterns

Filter sys_log by level = error and sys_created_on > [last 1 hour]. Key error signatures to recognize:

  • Transaction cancelled: maximum execution time of 30 seconds exceeded → Script timeout; check Business Rules and Script Includes on the affected table
  • GlideRecord: query requires table scan → Missing index; note the table and field name
  • Java heap space or OutOfMemoryError → Node memory exhausted; requires JVM tuning or node restart
  • Import set transform failed: coercion error on field [field_name] → Type mismatch in Transform Map field mapping
  • Deadlock detected on table [table_name] → Concurrent write conflict; check scheduled jobs overlapping
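The signature list above can be turned into a small triage helper. The following is a minimal sketch in plain JavaScript, assuming the log messages have already been exported as strings; `classifyLogMessage` and the layer labels are hypothetical names chosen for illustration, not part of any ServiceNow API.

```javascript
// Hypothetical helper: map a sys_log message to the failure layer it suggests.
// The patterns mirror the signatures listed above; the layer labels are illustrative.
var SIGNATURES = [
  { pattern: /maximum execution time/i, layer: 'scripting runtime' },
  { pattern: /requires table scan/i, layer: 'database layer' },
  { pattern: /deadlock detected/i, layer: 'database layer' },
  { pattern: /java heap space|outofmemoryerror/i, layer: 'node JVM' },
  { pattern: /coercion error/i, layer: 'import/transform pipeline' }
];

function classifyLogMessage(message) {
  var text = String(message || '');
  for (var i = 0; i < SIGNATURES.length; i++) {
    if (SIGNATURES[i].pattern.test(text)) {
      return SIGNATURES[i].layer;
    }
  }
  return 'unknown';
}

// Prints "scripting runtime"
console.log(classifyLogMessage('Transaction cancelled: maximum execution time of 30 seconds exceeded'));
```

Grouping an hour of error rows by the returned layer gives a quick indication of where to start in Step 2.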

1c. Check Node Health via Stats

Navigate directly to https://<instance>.service-now.com/stats.do. This endpoint returns raw JVM and thread statistics without requiring a license. Key metrics:

  • memory.heap.used / memory.heap.max — if ratio >0.85, the node is memory-pressured
  • thread.count.active — values persistently >200 on a single node indicate thread starvation
  • db.pool.size / db.pool.active — if active approaches pool size (default 40), database connections are exhausted
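To make the thresholds above concrete, here is a minimal sketch that applies them to values read off stats.do. The input field names (`heapUsed`, `heapMax`, `activeThreads`, `dbPoolActive`, `dbPoolSize`) are assumptions for illustration and not the exact keys the endpoint emits, and "approaches pool size" is interpreted as 90% or more, which is also an assumption.

```javascript
// Hypothetical helper: apply the thresholds above to values read from stats.do.
// The input field names are assumptions, not the exact keys stats.do emits.
function assessNodeHealth(stats) {
  var warnings = [];
  var heapRatio = stats.heapUsed / stats.heapMax;
  if (heapRatio > 0.85) {
    warnings.push('memory pressure: heap ratio ' + heapRatio.toFixed(2));
  }
  if (stats.activeThreads > 200) {
    warnings.push('possible thread starvation: ' + stats.activeThreads + ' active threads');
  }
  // "Approaches pool size" is interpreted here as 90% or more (an assumption).
  if (stats.dbPoolActive >= 0.9 * stats.dbPoolSize) {
    warnings.push('database pool nearly exhausted: ' + stats.dbPoolActive + '/' + stats.dbPoolSize);
  }
  return warnings;
}

var sample = { heapUsed: 3600, heapMax: 4096, activeThreads: 150, dbPoolActive: 38, dbPoolSize: 40 };
console.log(assessNodeHealth(sample));
```

A single sample cannot show that a thread count is persistently high, so in practice the check would be run against several readings taken a few minutes apart.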

1d. Identify Slow Queries

Navigate to System Diagnostics > Stats > Slow Queries or query sys_log_stats directly:

SELECT table_name, query_count, avg_duration_ms, max_duration_ms
FROM sys_log_stats
WHERE avg_duration_ms > 5000
ORDER BY max_duration_ms DESC
LIMIT 25;

On cloud instances, use the Performance Analytics module or the Slow Query Analyzer plugin (com.glide.slow_query_analyzer) instead of direct DB access.
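Where direct database access is unavailable but the slow-query rows can be exported as JSON, the same filter and ordering as the SQL above can be applied client-side. This is a minimal sketch; `topSlowQueries` is a hypothetical helper, and the sample rows are made-up values shaped like the columns named in the query.

```javascript
// Hypothetical helper: same filter/sort/limit as the SQL above, applied to exported rows.
// Values are converted with Number() because exported fields often arrive as strings.
function topSlowQueries(rows, thresholdMs, limit) {
  return rows
    .filter(function (row) { return Number(row.avg_duration_ms) > thresholdMs; })
    .sort(function (a, b) { return Number(b.max_duration_ms) - Number(a.max_duration_ms); })
    .slice(0, limit);
}

// Made-up sample data for illustration only.
var sampleRows = [
  { table_name: 'cmdb_ci', avg_duration_ms: '7200', max_duration_ms: '142000' },
  { table_name: 'incident', avg_duration_ms: '1200', max_duration_ms: '9000' },
  { table_name: 'sys_audit', avg_duration_ms: '6100', max_duration_ms: '250000' }
];

topSlowQueries(sampleRows, 5000, 25).forEach(function (row) {
  console.log(row.table_name + ' | avg: ' + row.avg_duration_ms + 'ms | max: ' + row.max_duration_ms + 'ms');
});
```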


Step 2: Fix — Resolve by Layer

Layer A: Scripting Runtime Fixes

Timeout errors in Script Includes: Open the Script Include via System Definition > Script Includes, locate the offending script from the sys_log error, and add execution guards:

// Anti-pattern: unbounded GlideRecord loop
var all = new GlideRecord('cmdb_ci');
all.query(); // Full table scan on 500k records
while (all.next()) {
  // per-record work runs for every row in the table
}

// Fix: filter the query and cap the row count
var gr = new GlideRecord('cmdb_ci');
gr.addQuery('operational_status', '1');
gr.setLimit(1000); // Prevent runaway
gr.query();
while (gr.next()) {
  // per-record work runs for at most 1000 matching rows
}
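Beyond a row limit, an execution guard can also enforce a time budget so a loop exits before the platform cancels the transaction. The following is a minimal sketch in plain JavaScript; `createLoopGuard` is a hypothetical helper, and a plain array stands in for the GlideRecord result set so the example runs outside an instance.

```javascript
// Hypothetical guard: returns a function that reports whether the loop may continue.
// It stops after maxRows iterations or once maxMillis has elapsed, whichever comes first.
function createLoopGuard(maxRows, maxMillis) {
  var started = Date.now();
  var rows = 0;
  return function mayContinue() {
    rows++;
    return rows <= maxRows && (Date.now() - started) <= maxMillis;
  };
}

// A plain array stands in for a GlideRecord result set.
var records = [];
for (var i = 0; i < 5000; i++) {
  records.push(i);
}

var mayContinue = createLoopGuard(1000, 5000);
var processed = 0;
for (var j = 0; j < records.length && mayContinue(); j++) {
  processed++; // per-record work would go here
}
console.log('processed ' + processed + ' of ' + records.length);
```

In a server-side script the same guard would sit in the loop condition, for example `while (gr.next() && mayContinue())`, with the time budget set comfortably below the configured script timeout.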

For Business Rules, navigate to System Definition > Business Rules, filter by the affected table, and check the When field. Business Rules set to before that run complex queries on large tables are a frequent source of timeouts. Where possible, move expensive logic into async Business Rules (When set to async) or scheduled jobs so it runs outside the user's transaction.

Layer B: Performance & Timeout Fixes

Increase script timeout for specific scopes (avoid global increase):

Navigate to System Properties (sys_properties.list) and modify:

  • glide.script.timeout — default 30 (seconds); raise to 60 only for specific integration scopes
  • glide.db.max_view_records — default 10000; reduce if list view queries are timing out
  • glide.ui.list_edit.max_records — reduce from default 50 to 20 for tables >1M rows

Add missing database index:

  1. Navigate to System Definition > Tables, open the affected table
  2. Select the Database Indexes related list
  3. Click New, select the column(s) identified in slow query logs
  4. Set Unique only if the column values are guaranteed to be unique
  5. Save — ServiceNow creates the index asynchronously (monitor in System Diagnostics > Database > Indexes)

Layer C: Data Migration Fixes

Resolve Transform Map coercion errors:

Open the failing Transform Map (System Import Sets > Transform Maps). For each field showing coercion errors in the log:

  1. Set Type explicitly (do not leave as Auto)
  2. For reference fields, set Reference qualifier to restrict lookup scope
  3. Add a Coerce field script for complex transformations:
// Coerce field script for date format mismatch
// Source: "23/02/2026" → Target: ServiceNow date format
// u_source_date is a placeholder; use the import set column that holds the date
var raw = source.u_source_date.toString();
answer = gs.dateGenerate(
  raw.substring(6, 10) + '-' +  // year
  raw.substring(3, 5)  + '-' +  // month
  raw.substring(0, 2),          // day
  '00:00:00'
);
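The substring approach above assumes every source value is a well-formed DD/MM/YYYY string. A minimal sketch of a stricter conversion follows, in plain JavaScript so it can be tried outside an instance; `toIsoDate` is a hypothetical helper that returns null for anything it cannot parse, so bad rows can be skipped or logged instead of producing a coercion error.

```javascript
// Hypothetical helper: convert "DD/MM/YYYY" to "YYYY-MM-DD", or return null if invalid.
function toIsoDate(value) {
  var match = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(String(value).trim());
  if (!match) {
    return null;
  }
  var day = Number(match[1]);
  var month = Number(match[2]);
  var year = Number(match[3]);
  // Round-trip through Date (UTC) to reject impossible dates such as 31/02/2026.
  var probe = new Date(Date.UTC(year, month - 1, day));
  if (probe.getUTCFullYear() !== year ||
      probe.getUTCMonth() !== month - 1 ||
      probe.getUTCDate() !== day) {
    return null;
  }
  return match[3] + '-' + match[2] + '-' + match[1];
}

console.log(toIsoDate('23/02/2026')); // 2026-02-23
console.log(toIsoDate('31/02/2026')); // null
console.log(toIsoDate('2026-02-23')); // null (already in a different format)
```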

For sys_id collision errors during migration (duplicate record errors), enable Coalesce on the sys_id or natural key field in the Transform Map field mapping. Coalescing forces an update rather than insert when a matching record is found.
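The effect of coalescing can be illustrated with a small in-memory model. This is only a sketch of the update-or-insert behaviour described above, not ServiceNow's implementation; `importWithCoalesce`, the `serial` key, and the sample rows are all hypothetical.

```javascript
// Illustrative in-memory model of coalescing; not ServiceNow's implementation.
// target is an object keyed by the coalesce value; rows are incoming import rows.
function importWithCoalesce(target, rows, coalesceField) {
  var result = { inserted: 0, updated: 0 };
  rows.forEach(function (row) {
    var key = row[coalesceField];
    var exists = Object.prototype.hasOwnProperty.call(target, key);
    if (!exists) {
      target[key] = {};
    }
    // Copy the incoming values onto the new or matching record.
    Object.keys(row).forEach(function (field) {
      target[key][field] = row[field];
    });
    if (exists) {
      result.updated++;
    } else {
      result.inserted++;
    }
  });
  return result;
}

var table = {};
var incoming = [
  { serial: 'A1', name: 'web01' },
  { serial: 'B2', name: 'db01' },
  { serial: 'A1', name: 'web01-renamed' } // same key: updates instead of inserting
];
console.log(importWithCoalesce(table, incoming, 'serial'));
console.log(table.A1.name);
```

Without a coalesce field, the third row would be inserted as a second record for the same asset, which is the duplicate-record situation the paragraph above describes.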

Run a pre-flight validation: Before executing against production, use a Background Script to validate source data:

// Pre-flight: count import set records that will cause coercion failures
var gr = new GlideRecord('sys_import_set_row');
gr.addQuery('sys_import_set', 'YOUR_IMPORT_SET_SYS_ID');
gr.addQuery('sys_transform_map', 'YOUR_TRANSFORM_SYS_ID');
gr.addNullQuery('u_target_field'); // null reference fields
gr.query();
gs.info('Records with null reference: ' + gr.getRowCount());

Layer D: Node JVM Memory Fixes

For on-premise instances, edit /etc/service-now/<instance>/glide.properties:

glide.memory.max=4096
glide.memory.initial=1024
glide.db.pool.max=60

For cloud instances, open a case in the HI portal requesting a node memory profile review; direct JVM parameter access is not available. Provide the stats.do heap output as evidence.

After any node configuration change, flush the application cache via https://<instance>.service-now.com/cache.do (requires admin role) and monitor node restart progress in System Diagnostics > Node Log.


Step 3: Verify the Fix

  1. Re-run the original failing operation
  2. Check sys_log for recurrence of the error signature
  3. Confirm stats.do shows heap ratio <0.75 and active DB connections <80% of pool
  4. For data migration fixes, run the Transform Map against a 100-record test Import Set before full execution
  5. Document the change in Change Management with before/after stats.do screenshots

Frequently Asked Questions

bash
#!/usr/bin/env bash
# ServiceNow Configuration Diagnostic Script
# Requires: curl, jq, and valid admin credentials
# Usage: INSTANCE=dev12345 SN_USER=admin SN_PASS=secret bash sn_diag.sh

INSTANCE="${INSTANCE:?Set INSTANCE env var (e.g., dev12345)}"
SN_USER="${SN_USER:?Set SN_USER env var}"
SN_PASS="${SN_PASS:?Set SN_PASS env var}"
BASE_URL="https://${INSTANCE}.service-now.com"
AUTH="${SN_USER}:${SN_PASS}"

echo "============================="
echo " ServiceNow Diagnostic Suite"
echo " Instance: ${INSTANCE}"
echo "============================="

# 1. Check node health via stats.do
printf '\n[1/6] Node Health (stats.do)\n'
curl -s -u "${AUTH}" "${BASE_URL}/stats.do" \
  | grep -E 'memory|thread|db.pool|uptime' \
  | head -20

# 2. Fetch recent ERROR-level sys_log entries (last 10)
printf '\n[2/6] Recent System Errors (sys_log)\n'
curl -s -u "${AUTH}" \
  -H "Accept: application/json" \
  "${BASE_URL}/api/now/table/syslog?sysparm_query=level%3Derror%5EORDERBYDESCsys_created_on&sysparm_limit=10&sysparm_fields=sys_created_on,source,message" \
  | jq -r '.result[] | "[" + .sys_created_on + "] " + .source + ": " + .message[:120]'

# 3. List active transactions (requires admin)
printf '\n[3/6] Active Transactions\n'
curl -s -u "${AUTH}" \
  -H "Accept: application/json" \
  "${BASE_URL}/api/now/table/sys_running_transaction?sysparm_limit=20&sysparm_fields=thread_id,elapsed_time,name,cpu_time" \
  | jq -r '.result[] | .thread_id + " | elapsed: " + .elapsed_time + "s | cpu: " + .cpu_time + "% | " + .name'

# 4. Find slow queries from sys_log_stats
printf '\n[4/6] Slow Query Log (avg > 5000ms)\n'
curl -s -u "${AUTH}" \
  -H "Accept: application/json" \
  "${BASE_URL}/api/now/table/sys_log_stats?sysparm_query=avg_duration_ms%3E5000%5EORDERBYDESCmax_duration_ms&sysparm_limit=10&sysparm_fields=table_name,query_count,avg_duration_ms,max_duration_ms" \
  | jq -r '.result[] | .table_name + " | count: " + (.query_count|tostring) + " | avg: " + (.avg_duration_ms|tostring) + "ms | max: " + (.max_duration_ms|tostring) + "ms"'

# 5. Check for missing indexes on flagged tables (interactive — review output)
printf '\n[5/6] Tables Missing Indexes for Queried Fields\n'
echo "Navigate to: ${BASE_URL}/sys_db_object_list.do?sysparm_query=ORDERBYname"
echo "Run slow query reports in: System Diagnostics > Stats > Slow Queries"

# 6. Validate Transform Map coercion issues (import sets)
printf '\n[6/6] Import Set Transform Errors (last 24h)\n'
curl -s -u "${AUTH}" \
  -H "Accept: application/json" \
  "${BASE_URL}/api/now/table/sys_import_set_run?sysparm_query=sys_created_onRELATIVEGT%40dayofweek%40ago%401%5Eerror_count%3E0&sysparm_limit=10&sysparm_fields=sys_created_on,import_set,error_count,insert_count,update_count" \
  | jq -r '.result[] | .sys_created_on + " | set: " + .import_set + " | errors: " + (.error_count|tostring) + " | inserts: " + (.insert_count|tostring) + " | updates: " + (.update_count|tostring)'

printf '\n===== Diagnostic Complete =====\n'
echo "Next step: Review output above, then open HI portal or"
echo "navigate to ${BASE_URL}/cache.do to flush cache if needed."

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps and SRE engineers with 10+ years of combined experience administering enterprise ITSM platforms including ServiceNow, Jira Service Management, and BMC Remedy. Our contributors have held roles as ServiceNow Certified System Administrators, Certified Implementation Specialists, and platform architects at Fortune 500 organizations. All troubleshooting guides are validated against real production incidents and reviewed against the latest ServiceNow release documentation before publication.
