Operations Wiki Entries (Handling Stuck Run Types)

Sections:

  • List of Operations:

    1. DB_MONITOR_ALL: Monitors the entire database system for errors and resource bottlenecks.

    2. PROC_RESTART [process_name]: Restarts specific database processes.

    3. LOG_FETCH [process_id]: Fetches logs for a particular process to diagnose issues.

    4. SUBSCRIBER_LIST: Lists registered subscribers for application processes.

    5. COMP_START and COMP_STOP: Start and stop application components.

  • Failure Investigation Processes:

    1. Run Diagnostic Tool:
      Use RUN_TYPE_DIAG [run_type_id] to diagnose a stuck run type. The tool reports whether the cause is a resource bottleneck, a network failure, or a component error.

    2. Investigate Logs:
      Fetch logs using LOG_FETCH [process_id] for the process behind the affected run type to identify detailed error messages.

    3. Check Resource Utilization:
      Use RES_CHECK to monitor CPU and memory utilization during process execution.

    4. Process Restart:
      Execute PROC_RESTART [process_id] to restart stuck processes.
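
The four investigation steps above can be sketched as an ordered command sequence. This is a minimal illustration in Python, not part of the operations tooling: the function name `investigation_commands` and the sample IDs are hypothetical, and how the commands are actually submitted (console, CLI, or script) is not specified in this wiki, so the sketch only builds the command strings. It assumes LOG_FETCH and PROC_RESTART take a process ID, as listed elsewhere in this entry.

```python
from typing import List


def investigation_commands(run_type_id: str, process_id: str) -> List[str]:
    """Return the failure-investigation commands in the documented order.

    Assumption: LOG_FETCH and PROC_RESTART are given a process ID,
    while RUN_TYPE_DIAG is given a run type ID. RES_CHECK takes no argument.
    """
    return [
        f"RUN_TYPE_DIAG {run_type_id}",   # 1. Run the diagnostic tool
        f"LOG_FETCH {process_id}",        # 2. Investigate logs
        "RES_CHECK",                      # 3. Check CPU and memory utilization
        f"PROC_RESTART {process_id}",     # 4. Restart the stuck process
    ]


if __name__ == "__main__":
    # "RT-001" and "4821" are placeholder IDs used only for illustration.
    for step, command in enumerate(investigation_commands("RT-001", "4821"), start=1):
        print(f"{step}. {command}")
```

Keeping the restart as the last entry reflects the order of the list: diagnostics, logs, and resource figures are collected before the process state is lost to a restart.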

  • Operational Commands for Monitoring:

    1. MONITOR_RUN_TYPE [run_type_id]: Monitor a specific run type for status and logs.

    2. HEALTH_CHECK: Runs a full system health check to identify problem areas.

  • Operational Commands for Starting and Stopping Components:

    1. START_PROCESS [component_name]: Starts a specific application component.

    2. STOP_PROCESS [component_name]: Stops a specific application component.

  • Operational Commands for Restarting Components Following Failure:

    1. PROC_RESTART [component_name]: Restarts a failed component.

    2. SYSTEM_RESTART: Restarts all processes in the application and resets its DB connections.
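
One way to combine the two restart commands is to try the narrower PROC_RESTART first and fall back to SYSTEM_RESTART only if the component does not recover. The sketch below is an assumption about policy, not a documented procedure: the escalation order, the `max_attempts` limit, the function name `restart_with_escalation`, and the `run_command` callback (which stands in for whatever mechanism submits commands and reports success) are all hypothetical.

```python
from typing import Callable


def restart_with_escalation(
    component_name: str,
    run_command: Callable[[str], bool],
    max_attempts: int = 2,
) -> str:
    """Try PROC_RESTART up to max_attempts times, then fall back to SYSTEM_RESTART.

    run_command is a hypothetical callback: it submits one command string
    and returns True if the command succeeded.
    """
    for _ in range(max_attempts):
        if run_command(f"PROC_RESTART {component_name}"):
            return "component_restarted"
    # The component restart did not succeed, so restart the whole application.
    run_command("SYSTEM_RESTART")
    return "system_restarted"


if __name__ == "__main__":
    issued = []

    def always_fails(command: str) -> bool:
        # Stand-in runner that records each command and reports failure.
        issued.append(command)
        return False

    outcome = restart_with_escalation("ingest_worker", always_fails)
    print(outcome)
    print(issued)
```

Because SYSTEM_RESTART affects every process in the application, keeping it as the last resort limits disruption to components that are still healthy.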

  • Appendix:

    • Application Timeslices: How to handle timeslice errors:

      • TIMESLICE_ADJUST [time]: Adjusts timeslice allocation for processes.

    • Subscriber Lists: Run SUBSCRIBER_LIST to display subscribers for key processes.

    • Data Source Management: Run SOURCE_RELOAD to reload data sources without a restart.