CALM Operations Capabilities: Job & Automation Monitoring
What is Job & Automation Monitoring and what can we do with it?
Job & Automation Monitoring is an SAP Cloud ALM application for evaluating the status of automated actions and background jobs. It provides transparency into Execution Status, Application Status, Start Delay and Run Time.
It is a central monitoring application that supports jobs running on different platforms and offers a unified user experience with a common look-and-feel and handling pattern.
The application helps IT and business users understand how successfully automation processes run across all involved cloud and on-premise services and systems.
It collects individual job execution data, correlates the data with the related job definitions and evaluates executions using historical data. To make the current status of executed jobs easy to understand, a rating is propagated to the job level and to the service level.
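The rating propagation described above can be pictured with a small sketch. This is not how SAP Cloud ALM is implemented; it only assumes that the worst rating wins at each level. The `Rating` enum (including a Green value), the function name and the sample service and job names are hypothetical.

```python
from enum import IntEnum


class Rating(IntEnum):
    # Ordered so that max() picks the worst rating (assumed ordering).
    GREEN = 0
    YELLOW = 1
    RED = 2


def propagate_ratings(latest_executions):
    """Roll execution ratings up to the job level and the service level.

    latest_executions: iterable of (service, job, rating) tuples, one tuple
    per rated metric of the latest execution of a job.
    """
    job_ratings = {}
    service_ratings = {}
    for service, job, rating in latest_executions:
        key = (service, job)
        # A job is as bad as its worst metric rating.
        job_ratings[key] = max(job_ratings.get(key, Rating.GREEN), rating)
        # A service is as bad as its worst job.
        service_ratings[service] = max(
            service_ratings.get(service, Rating.GREEN), rating
        )
    return job_ratings, service_ratings


# Hypothetical sample data: two services, three jobs.
jobs, services = propagate_ratings([
    ("billing", "invoice_run", Rating.GREEN),
    ("billing", "dunning_run", Rating.RED),
    ("hr", "payroll_export", Rating.YELLOW),
])
print(services["billing"].name)  # RED
```

Keeping the ratings in an ordered enum means the roll-up is a single `max()` per level, which mirrors the "worst rating" columns shown in the monitoring tables.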
With Job & Automation Monitoring we can:
- Analyze which jobs have a long run time or a high failure rate, and whether any jobs are deteriorating.
- Get alerted and notified.
- See the status of a defined set of jobs.
- See exception message details for a job execution.
- See whether the latest execution of a job failed to finish successfully, failed to process its data successfully, started with a delay or ran exceptionally long.
How to start using Job & Automation Monitoring Application?
To start using Job & Automation Monitoring, we only need to configure the SAP services and systems we want to monitor so that they push job execution data to the SAP Cloud ALM Job & Automation Monitoring application.
After that we can start fine-tuning the monitoring to our requirements. The application offers many options for that.
We can set Run Time and Delay thresholds, rename a job, and set the retention time in days for aggregation. We can also configure events for alerting, with a wide range of personalization options.
In Event Settings:
- We can configure events for four event types:
  - Critical Application Status
  - Critical Execution Status
  - Critical Runtime
  - Critical Delay
- We can choose the name of an alert and, for Execution Status and Application Status events, the rating status for which we want to be alerted:
  - Red
  - Yellow
  - Red or Yellow
- For Critical Runtime and Critical Delay alerts, we can set thresholds in minutes for the Red and Yellow ratings.
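To illustrate how minute-based thresholds and the rating status selection interact, here is a minimal sketch. It assumes that a value at or above a threshold triggers the corresponding rating and that anything below the Yellow threshold counts as "Green"; both the comparison rule and the function names are assumptions for illustration, not the application's documented behavior.

```python
def rate_by_threshold(value_minutes, yellow_minutes, red_minutes):
    """Rate a run time or start delay against Yellow and Red thresholds."""
    if value_minutes >= red_minutes:
        return "Red"
    if value_minutes >= yellow_minutes:
        return "Yellow"
    return "Green"  # assumed name for "no threshold exceeded"


def should_alert(rating, alert_on):
    """alert_on is one of "Red", "Yellow" or "Red or Yellow"."""
    return rating in alert_on.split(" or ")


# Hypothetical job that ran 12 minutes with thresholds of 10 and 30 minutes.
rating = rate_by_threshold(12, yellow_minutes=10, red_minutes=30)
print(rating, should_alert(rating, "Red or Yellow"))  # Yellow True
```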
Event Filters:
We can use Event Filters to raise events only for certain jobs, or to exclude certain jobs from creating alerts.
- Available parameters for Event Filters are:
  - Job/Automation Executable Name
  - Job/Automation Execution User
  - Job/Automation Name
- The operator can be:
  - Is
  - Contains
  - Is Not
- The value of the chosen parameter.
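The parameter, operator and value triple can be sketched as a small data structure. The class below is hypothetical and only mirrors the options listed above; it assumes case-sensitive comparison and that several filters are combined with AND, neither of which is stated in the source.

```python
from dataclasses import dataclass


@dataclass
class EventFilter:
    parameter: str  # e.g. "Job/Automation Name"
    operator: str   # "Is", "Contains" or "Is Not"
    value: str

    def matches(self, job: dict) -> bool:
        actual = job.get(self.parameter, "")
        if self.operator == "Is":
            return actual == self.value
        if self.operator == "Contains":
            return self.value in actual
        if self.operator == "Is Not":
            return actual != self.value
        raise ValueError(f"Unknown operator: {self.operator}")


def passes_filters(job: dict, filters: list) -> bool:
    # Assumption: every filter must match for the event to be raised.
    return all(f.matches(job) for f in filters)


# Hypothetical job: raise events for nightly jobs, except for a test user.
job = {
    "Job/Automation Name": "NIGHTLY_INVOICE_RUN",
    "Job/Automation Execution User": "BATCH_USER",
}
filters = [
    EventFilter("Job/Automation Name", "Contains", "NIGHTLY"),
    EventFilter("Job/Automation Execution User", "Is Not", "TEST_USER"),
]
print(passes_filters(job, filters))  # True
```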
Event Actions:
- We can switch the Create Alert action on or off.
- We can switch the Send Email To action on or off and add the email recipients to whom the alert notification is sent.
- We can switch Start an Operation Flow on or off and add the operation flow definition that automates the alert handling process.
- We can switch Create ServiceNow Ticket on or off and add the subscription to our ServiceNow instance.
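As a way to picture the four action switches together, the following sketch models them as one configuration object. All field names are hypothetical, and as a simplification an action counts as switched on when its target (recipient, flow definition or subscription) is filled in, whereas the application has a separate on/off switch for each action.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EventActions:
    create_alert: bool = True
    email_recipients: List[str] = field(default_factory=list)
    operation_flow: Optional[str] = None           # operation flow definition
    servicenow_subscription: Optional[str] = None  # ServiceNow subscription

    def enabled(self) -> List[str]:
        """Return the names of the actions that would run for an event."""
        actions = []
        if self.create_alert:
            actions.append("Create Alert")
        if self.email_recipients:
            actions.append("Send Email To")
        if self.operation_flow:
            actions.append("Start an Operation Flow")
        if self.servicenow_subscription:
            actions.append("Create ServiceNow Ticket")
        return actions


# Hypothetical setup: alert plus email, no operation flow and no ticket.
actions = EventActions(email_recipients=["ops@example.com"])
print(actions.enabled())  # ['Create Alert', 'Send Email To']
```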
What do we see in Job & Automation Monitoring?
Overview
When we access the Job & Automation Monitoring application, the Overview page shows a consolidated view of the services and systems in scope.
The page gives us a summary status of job executions for the services in scope. Tiles on the page show the status of the latest execution of every job: Execution Status (With Technical Exceptions), Application Status (With Application Exceptions) and Run Time (With High Run Time). Each tile also displays the number of Open Alerts and the status of Business Service Management events (maintenance etc.).
From the Overview page we can navigate directly to the list of jobs sorted by Execution Status, the list of jobs sorted by Application Status, the list of jobs with High Run Time and the list of Open Alerts. By clicking on the name of a service or system, we can navigate directly to the list of all jobs for that service or system.
Monitoring
When we open the Monitoring page, we first see a list of the services and systems in scope.
The list includes information about:
- Type of Service/System
- Total number of Jobs that have at least one reported execution
- Type of Job or Automation
- Execution Status, which displays the worst rating together with the number of jobs that currently have issues finishing successfully
- Application Status, which displays the worst rating together with the number of jobs that currently have issues processing application data
- Number of jobs that currently have issues starting on time, with the worst rating
- Number of jobs that have issues finishing within their normal run time, with the worst rating
- Open alerts for the service or system
By clicking on the name of a service or system, or on the button in the last column of the table, we can drill down into the list of jobs for that service or system.
The monitoring tables offer a wide variety of filter settings that let us monitor and analyze jobs efficiently.
If a job name in the list has a pen icon in front of it, the data provider found that the name of the job or automation is constantly changing. By clicking on the icon we can change the job name so that it represents all related executions.
By clicking on the job name we can drill down into the executions of the job and further into the details of an individual execution.
Exceptions
On the Executions page, we can see the exceptions of jobs and automations related to the application data as per the selected scope.
By clicking on the exception icon of a job or automation, we can drill down to the detailed description of the individual exceptions. The details also include relevant context information.
By clicking on the system or service name, we navigate to a chart showing the number of exceptions per category.
By clicking on a category we can drill down into the detailed descriptions of the individual exceptions, which provide the same context information in the same view as when accessed from the Executions Overview page.
Analysis
On the Analysis page we can view analytical information about jobs and automations based on the metrics collected by the application (run time, failure rate etc.).
Here we can filter, sort, and change the order of columns.
We can also drill down into trend graphs for an individual job or automation.
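To make metrics such as failure rate and run-time deterioration concrete, here is a minimal sketch of how they could be derived from a job's execution history. The input format, the function name and the 20 % deterioration margin are assumptions for illustration; the application's actual calculation is not described in this article.

```python
from statistics import mean


def analyze_job(executions, margin=1.2):
    """Summarize a job's history.

    executions: list of dicts with 'run_time_s' (seconds) and 'failed'
    (bool), ordered from oldest to newest.
    """
    if not executions:
        raise ValueError("at least one execution is required")
    run_times = [e["run_time_s"] for e in executions]
    failure_rate = sum(1 for e in executions if e["failed"]) / len(executions)
    # Compare the newer half of the history with the older half.
    half = len(run_times) // 2
    older, recent = run_times[:half], run_times[half:]
    deteriorating = bool(older) and mean(recent) > margin * mean(older)
    return {
        "failure_rate": failure_rate,
        "avg_run_time_s": mean(run_times),
        "deteriorating": deteriorating,
    }


# Hypothetical history: run time grows from 60 s to 100 s, one failure.
history = [
    {"run_time_s": 60, "failed": False},
    {"run_time_s": 60, "failed": False},
    {"run_time_s": 90, "failed": True},
    {"run_time_s": 100, "failed": False},
]
print(analyze_job(history))
```

Splitting the history into an older and a newer half is only one simple way to detect a trend; the trend graphs in the application show the full series instead.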
Alerting
On the Alerting page we can see a list of open alerts within the selected scope.
We can filter the list by Alert Name, Message, Status, Processor, and Object Details. The page provides many actions for working on issues quickly and efficiently.
To analyze an alert, we can drill down into the alert details to get information on the related message.
By choosing the Actions button we can assign or remove alert processors, confirm open alerts, add comments, send notifications and even start an operation flow to automate the handling of known alerts.
The button next to Actions opens the Action Logs.
Conclusion
Job & Automation Monitoring gives us an overview of job executions in a distributed, heterogeneous landscape and a central platform to monitor all aspects needed to ensure business continuity.
Jobs and automations are the backbone of business processes. To make sure they all run successfully, we can monitor that they run on time, finish successfully, process data without errors or warnings and do not deteriorate in run time.
Besides showing the current status and indicating exceptional situations, the application lets us discover critical trends and even avoid downtimes.
After we configure events, the application can perform automatic actions when an event occurs. These actions can be creating alerts, sending email notifications or even triggering the automatic execution of an operation automation procedure.
Job & Automation Monitoring helps us identify job and automation issues immediately and trigger automatic actions, creating seamless alert handling procedures tailored to the customer's needs.