# Release 2.0. Scenario problem management (signals) and single monitoring window

# Purpose and objectives of version 2.0

Our team is pleased to present the new, second version of Acure. Acure has been driven by the concept of "monitoring as code" since its inception, and the new version is a significant milestone along that path. The main idea of version 2.0 is dynamic scenario problem management, which greatly simplifies monitoring, especially for systems with a dynamic environment (containers, microservice architecture, Kubernetes). Simply put, instead of setting up hundreds or thousands of static triggers and constantly checking that they are still relevant, you now write a few scripts and the platform does the rest.

Another key feature of the new version is the "single window" concept: information about all objects is collected in one place, which minimizes switching between screens. In Acure 2.0, the dashboard is merged with the Service Model graph, service mode information, and Configuration Item health. In subsequent releases, logs, results of synthetic test builds and, of course, metrics will be added.

The technological component of the product has also been upgraded: for a number of critical services in 2.0, gRPC, the remote procedure call system from Google, was piloted. This architectural decision significantly accelerated inter-microservice interaction, which is especially important for the scenario approach, where users can call API functions from their scripts.

# 1. Signals. Scenario management of problems and rejection of triggers

As noted above, scenario problem management is the main idea of the new version of Acure and the embodiment of the "monitoring as code" concept. In previous versions, the main tool for deduplicating and correlating primary events (alerts and logs) was the synthetic trigger, controlled by rules written as Lua scripts. Despite the undeniable flexibility of our triggers, using Acure to manage large dynamic environments raised a number of difficulties. First, because a trigger is static, each case required a separate trigger with its own rule, which meant significant setup effort and pushed the number of triggers into the hundreds of thousands. It also created a problem of constant synchronization with external monitoring systems: you had to keep track of every change in them. And if a trigger was deleted or unlinked from a CI, its history was lost. All this made it difficult to use our product fully for monitoring dynamically changing environments.

In Acure 2.0, synthetic triggers are replaced by signals driven by scripts on a low-code engine. Unlike a synthetic trigger, which only changes its status, a signal is a short-lived dynamic object, similar to a task in a regular task tracker. A signal is opened by an alert (or a set of alerts), other confirming alerts can then be attached to it, and at some point the signal is closed by an alert or a scheduler event.

Consider the simplest case: deduplicating alerts about a metric exceeding its threshold. Suppose that, while the threshold is exceeded, the source generates an alert every 5 minutes. In Acure, a signal is opened on the first alert, and all subsequent confirmations are attached to that signal. When the metric returns to its original state and no new alerts for this metric arrive within 30 minutes, Acure closes the signal. If the situation repeats, a new signal is generated; signals that were triggered earlier and are already closed remain unchanged.

In addition to the short life cycle, the approach to managing signals has also changed. With a trigger, we first linked the trigger to the required CI manually or via the API, then set the script for managing its statuses (either manually or through a template) and the event prefilter. Thus, a separate script was responsible for each individual trigger. In version 2.0, by analogy with the previously proven Service Model auto-building, we switched completely to scenario-based signal control. It is now inside the script that signals are opened and closed, the logic of binding them to CIs is defined, and primary events (alerts) are attached to them. Moreover, signals are managed by your entire set of scenarios: you can create one large alert deduplication scenario or split it into many small ones, without any restrictions.

The connection between a signal and a CI is also established in the script. The binding logic can be completely arbitrary, depending on how your CMDB is organized and how monitoring systems and log sources are configured. You can now access CMDB functions directly from the script: find the required CI or group of CIs by attributes and associate them with the created signal.
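The signal life cycle described in this section can be illustrated with a minimal Python sketch. This is not Acure's scripting API: the `Alert`, `Signal` and `SignalStore` classes, the in-memory `CMDB` dictionary and the `find_cis` helper are all hypothetical names introduced for illustration. The sketch also simplifies the closing rule to "no new alerts for 30 minutes", checked on a scheduler tick.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

# Quiet period taken from the example in the text: 30 minutes without alerts.
QUIET_PERIOD = timedelta(minutes=30)


@dataclass
class Alert:
    metric: str
    host: str
    received_at: datetime


@dataclass
class Signal:
    key: str
    ci_ids: list
    opened_at: datetime
    alerts: list = field(default_factory=list)
    closed_at: Optional[datetime] = None

    @property
    def is_open(self) -> bool:
        return self.closed_at is None


# Hypothetical in-memory CMDB: CI id -> attributes.
CMDB = {
    "ci-1": {"type": "host", "hostname": "web-01"},
    "ci-2": {"type": "host", "hostname": "db-01"},
}


def find_cis(**attrs) -> list:
    """Return ids of CIs whose attributes match every given key/value pair."""
    return [ci_id for ci_id, ci in CMDB.items()
            if all(ci.get(k) == v for k, v in attrs.items())]


class SignalStore:
    def __init__(self) -> None:
        self.signals = []

    def _find_open(self, key: str) -> Optional[Signal]:
        for signal in self.signals:
            if signal.key == key and signal.is_open:
                return signal
        return None

    def on_alert(self, alert: Alert) -> Signal:
        """Open a signal on the first alert, attach later alerts as confirmations."""
        key = f"{alert.host}/{alert.metric}"
        signal = self._find_open(key)
        if signal is None:
            # Binding to CIs happens here, in the script, via a CMDB lookup.
            signal = Signal(key=key,
                            ci_ids=find_cis(hostname=alert.host),
                            opened_at=alert.received_at)
            self.signals.append(signal)
        signal.alerts.append(alert)
        return signal

    def on_tick(self, now: datetime) -> None:
        """Scheduler event: close open signals that have been quiet long enough."""
        for signal in self.signals:
            if signal.is_open and now - signal.alerts[-1].received_at >= QUIET_PERIOD:
                signal.closed_at = now
```

In this sketch, three alerts for the same host and metric arriving 5 minutes apart all attach to one signal. A tick 35 minutes after the last alert closes it, and a later alert opens a new signal while the closed one stays unchanged.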

# 2. Single monitoring window

The "single window" concept is designed to speed up an engineer's work with monitored objects and thus to minimize the time needed to resolve an incident. In 2.0, four screens (Main, Operational, Timeline, Service Model) were replaced by a single monitoring window in which the dashboard is combined with the SM graph. The single window is visually divided into two parts. The left panel shows a list of filtered CIs with their health and status. The right panel changes depending on the mode selected by the operator: SM graph, CI card, list of signals, service modes, or change log. The left panel also serves as an additional filter for the right panel, and multiple selection with the CTRL key held down is available. The screen offers many cross-links and filters.

In Acure 2.0, the global parametric filter of SM maps has been significantly redesigned: quick filter sets have appeared, and the constructor has completely changed. Another new feature is transition points between maps. You can link maps to certain configuration items, after which transition points appear on those configuration items. This makes it possible to move from one map to another directly on the SM graph.

In upcoming releases, the single window will continue to develop: new filters, widgets and charts will appear, and the customization options will be extended. A function for managing table columns is under development.

# 3. Service Model in 2.0

The Service Model has changed as well. Two additions have appeared: CI attributes, which let you store a variety of information about a CI, and the CI life cycle.

CI attributes not only store information about an object but are also an important component for scripting. Attributes are widely used in scripts, for example, to define the relationship between a CI and a signal. The set of CI attributes is defined by the Acure administrator for each CI type. Local extension of the model (for a specific CI) is also possible: users can use the special system attribute "labels". Attributes are strictly typed. Attribute types are identical to the types and structures of the automation engine and can be extended with special libraries.

The CI life cycle has also been added to Acure. In 2.0, only the standard life cycle is available, consisting of two stages: in-operation and archived. In future versions, it will be possible to create your own life cycle diagrams and apply them to certain types of CIs. Please note that archived CIs are not included in the calculations, and new signals cannot be linked to them.
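The rules described in this section (a typed attribute set per CI type, free-form per-CI labels, and a two-stage life cycle that blocks new signals on archived CIs) can be sketched in Python as follows. The `SCHEMAS` table, the `CI` class and its method names are hypothetical and are not part of Acure. Modelling "labels" as a plain dictionary is also an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    IN_OPERATION = "in-operation"
    ARCHIVED = "archived"


# Hypothetical schema set by an administrator: CI type -> {attribute: type}.
SCHEMAS = {
    "host": {"hostname": str, "cpu_cores": int},
}


@dataclass
class CI:
    ci_type: str
    attributes: dict = field(default_factory=dict)
    labels: dict = field(default_factory=dict)  # local, per-CI extension
    stage: Stage = Stage.IN_OPERATION
    signal_ids: list = field(default_factory=list)

    def set_attribute(self, name: str, value) -> None:
        """Accept only attributes declared for this CI type, with the declared type."""
        schema = SCHEMAS[self.ci_type]
        if name not in schema:
            raise KeyError(f"{name!r} is not defined for CI type {self.ci_type!r}")
        if not isinstance(value, schema[name]):
            raise TypeError(f"{name!r} expects {schema[name].__name__}")
        self.attributes[name] = value

    def link_signal(self, signal_id: str) -> None:
        """Archived CIs cannot receive new signals."""
        if self.stage is Stage.ARCHIVED:
            raise ValueError("cannot link a new signal to an archived CI")
        self.signal_ids.append(signal_id)
```

In this sketch, setting `cpu_cores` to a string raises `TypeError`, while anything specific to a single CI goes into `labels` without touching the type-level schema.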

# 4. Extension of the Automaton.Core library

In addition to the new library for working with signals in our low-code engine, we continue to expand the set of ready-made standard functions available to our users for writing their own automation scripts. The new version of Acure adds the following functions:

- Array functions: ArrayLength, ArrayCreate, ArraySelect, ArrayIntersect, ArrayContains, ArrayUnion
- Time functions: Now, UTCNow, ParseDateTime, TryParseDateTime, DateRangeToNow, ConvertToMilliseconds
- Other functions: CompareNumber, IsNull
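The release notes list these functions by name only. To give a rough idea of what helpers of this kind typically do, here are Python stand-ins for a few of them. The semantics below (order-preserving, de-duplicated set operations, a "try" parser that returns `None` on failure, ISO 8601 as the input format) are assumptions inferred from the names, not the documented behaviour of Automaton.Core.

```python
from datetime import datetime, timedelta
from typing import Optional


def array_intersect(a: list, b: list) -> list:
    """Elements of a that also occur in b, in a's order, without duplicates."""
    seen = set()
    result = []
    for item in a:
        if item in b and item not in seen:
            seen.add(item)
            result.append(item)
    return result


def array_union(a: list, b: list) -> list:
    """Elements of a followed by new elements of b, without duplicates."""
    seen = set()
    result = []
    for item in a + b:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def try_parse_datetime(text: str) -> Optional[datetime]:
    """Parse an ISO 8601 string; return None instead of raising on bad input."""
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        return None


def convert_to_milliseconds(duration: timedelta) -> int:
    """Whole milliseconds in a duration, using exact integer division."""
    return duration // timedelta(milliseconds=1)
```

Helpers like these are typically useful in deduplication scripts, for example to compare the tag sets of two alerts or to turn a quiet period into a timer value.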

# 5. Other

The release includes a number of useful improvements:

- Active Directory: support for LDAP filters, which allows more flexible rules for synchronization with Acure.
- Management of CI service modes in the graphical interface: you can now set service modes for your objects directly from the interface.
- Environment variables for YAML scripts in stream tasks: more convenient and customizable setup of ETL processes in Acure.
- Information on the total amount of data in a stream.