The client had a semi-manual internal process for assigning categories to tickets based on their severity. This ticketing system could not keep up with the expanding business, and because of the sheer volume of incoming log messages, multiple high-severity errors went unnoticed, resulting in several days of production downtime.
- The existing semi-manual ticketing process is to be replaced with an automatic ticket classification system built on Azure Machine Learning.
- The existing ticket classification system depended on individual teams to view, analyze and classify logs. Tickets were then opened and assigned to internal teams based on severity. The large number of tickets and the involvement of many teams resulted in delays. An automated system is required that can parse all the logs and open tickets based on their importance. Tickets can have different severities, ranging from Level 1 to Level 5. Level 1 tickets are business-critical and require immediate action, while Level 5 tickets are to be acted upon within three days.
- Overall system cost should not increase. Cost reduction is preferable but not mandatory.
- Tickets must be classified correctly according to their severity.
- Metadata editing: Logs are generated across thousands of application servers and machines. These logs are stored across multiple databases and repositories, in a variety of data formats, throughout the client's infrastructure. The metadata for these datasets is edited to ensure uniformity and to allow merging and further downstream processing.
- Data preparation: Raw datasets contain inconsistent rows with missing values and redundant records. Data cleaning ensures a good-quality dataset for the machine learning process.
- Duplicate rows are removed and missing values are replaced with mean/median values.
- Feature engineering: Input text data is converted into integer values for dimensionality reduction. Columns are also aggregated to create new columns; for example, the average of several features can be used instead of the individual features.
- Modeling/algorithm selection: A suitable machine learning algorithm is selected to ensure correct classification of tickets.
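The data preparation step in the list above (dropping duplicate rows and filling missing values with the median) can be sketched in plain Python. This is a minimal illustration, not the Azure module used in the project; the function `clean_rows` and the column names `log_id` and `response_ms` are hypothetical.

```python
from statistics import median


def clean_rows(rows, numeric_columns):
    """Drop exact duplicate rows, then fill missing numeric values with the column median."""
    # Remove duplicates while preserving first-seen order.
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique_rows.append(dict(row))

    # Impute missing values (None) with the median of the observed values.
    for column in numeric_columns:
        observed = [r[column] for r in unique_rows if r[column] is not None]
        if not observed:
            continue  # nothing to impute from
        fill_value = median(observed)
        for r in unique_rows:
            if r[column] is None:
                r[column] = fill_value
    return unique_rows


# Hypothetical log records; the column names are illustrative only.
raw = [
    {"log_id": 1, "response_ms": 120},
    {"log_id": 1, "response_ms": 120},   # exact duplicate
    {"log_id": 2, "response_ms": None},  # missing value
    {"log_id": 3, "response_ms": 200},
]
cleaned = clean_rows(raw, ["response_ms"])
```

Here the duplicate record is dropped and the missing `response_ms` is filled with the median of the remaining observed values. Using the mean instead would only require swapping `median` for `statistics.mean`.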
Before the metadata of the fetched data was edited, the teams responsible for the various databases and repositories were instructed to introduce a unique key column in their tables. After data aggregation, the metadata editing step was executed: the required columns were identified in the tables and labeled. In addition, Boolean and numeric values were converted to and treated as categorical values, and dates and timestamps were converted to numerical values.
After metadata editing, the required columns were extracted from the various chunks of the dataset. They were merged with an RDBMS-style inner join on the unique key column, which produced a larger merged dataset with a single unique key column.
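As a rough sketch of the merge step, the following shows an inner join of two row sets on a shared key column. The helper `inner_join`, the key name `log_key`, and the sample records are assumptions made for illustration; they are not taken from the client's schema.

```python
def inner_join(left_rows, right_rows, key):
    """Return merged rows for key values that appear in both inputs."""
    # Index the right-hand rows by key so each lookup is O(1).
    right_index = {}
    for row in right_rows:
        right_index.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        # Rows without a match on the other side are dropped (inner join).
        for right in right_index.get(left[key], []):
            joined.append({**left, **right})
    return joined


# Hypothetical chunks of data that share the unique key column "log_key".
app_logs = [
    {"log_key": "a1", "body": "disk full"},
    {"log_key": "b2", "body": "login ok"},
]
host_info = [
    {"log_key": "a1", "host": "srv-01"},
    {"log_key": "c3", "host": "srv-09"},
]
merged = inner_join(app_logs, host_info, "log_key")
```

Only the key `a1` exists in both inputs, so `merged` holds a single row that combines the columns of both sources and keeps one `log_key` column.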
Using raw text as input has a significant impact on performance because of the size of the textual feature set. Feature hashing converts variable-length text documents into equal-length numeric feature vectors, which supports dimensionality reduction and makes the lookup of feature weights faster. In other words, it converts a stream of text into a set of features represented as integers. This hashed feature set is then passed to the machine learning algorithm. In this step a hashing bit size of 15 was selected, and the entire dataset column containing the log body was converted to hash values.
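The idea behind feature hashing can be illustrated with a short sketch. A bit size of 15 gives 2^15 = 32,768 possible feature indices, so every log body maps into the same fixed-width space regardless of its length. The function `hash_features`, the whitespace tokenization, and the use of `zlib.crc32` as the hash function are assumptions chosen for a self-contained example; they are not the internals of the Azure feature hashing step.

```python
import zlib

HASH_BITS = 15                 # bit size stated in the text
NUM_BUCKETS = 2 ** HASH_BITS   # 32,768 possible feature indices


def hash_features(text, num_buckets=NUM_BUCKETS):
    """Map text of any length to a sparse fixed-width count vector.

    The vector is returned as {bucket_index: count}; indices that are
    absent from the dict are implicitly zero.
    """
    features = {}
    for token in text.lower().split():
        # crc32 is a stand-in hash chosen for this sketch only.
        bucket = zlib.crc32(token.encode("utf-8")) % num_buckets
        features[bucket] = features.get(bucket, 0) + 1
    return features


# Hypothetical log bodies of different lengths.
short_log = hash_features("Disk full")
long_log = hash_features("Connection timeout while writing to disk on disk array")
```

Both results index into the same 32,768-wide space. Different tokens can collide in one bucket; a larger bit size lowers the collision rate at the cost of a wider feature vector.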
Multiclass Decision Forest
This algorithm works by building multiple decision trees and then voting on the most popular output class. Voting is a form of aggregation in which each tree in a classification decision forest outputs a label value, and the aggregation process selects the label value with the highest frequency. The preprocessed data from the previous step is fed into the multiclass decision forest algorithm for training. Once the model is trained, it can be used to classify newly generated logs into severity classes.
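The voting step described above can be sketched as a simple frequency count over the labels returned by the individual trees. The function `majority_vote` and the sample votes are hypothetical, and the sketch covers only the aggregation, not how the trees are trained.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees."""
    counts = Counter(tree_predictions)
    # most_common(1) yields the (label, count) pair with the highest count;
    # ties are resolved in favor of the label that was seen first.
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical severity levels predicted by five trees for one log entry.
votes = [1, 3, 1, 2, 1]
predicted = majority_vote(votes)
```

Three of the five trees vote for severity level 1, so that label is selected for the log entry.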
The machine-learning-based ticket classification system met the business requirements by analyzing all the generated logs and classifying them according to severity. A significant reduction in the time required to process the logs was observed. The new ticketing system was able to handle the business scaling requirement, and in the long run it will help reduce running costs because fewer human teams are required in the automated process.