Designing and Implementing a Ticket Categorisation System for Automatic Classification of Logs Using Azure Machine Learning
The client had a semi-manual internal process for assigning categories to tickets based on their severity. This ticketing system could not keep up with the expanding business: because of the sheer volume of incoming log messages, multiple high-severity errors went unnoticed, resulting in several days of production downtime.
- The existing semi-manual ticketing process is to be replaced with an automatic ticket classification system built on Azure Machine Learning.
- The existing ticket classification system depended on individual teams to view, analyze and classify logs. Tickets were then opened and assigned to internal teams based on severity. The large number of tickets and the involvement of many teams resulted in delays. An automated system is required that can parse all the logs and open tickets based on their importance. Ticket severity ranges from Level 1 to Level 5. Level 1 tickets are critical for the business and require immediate action, while Level 5 tickets are to be acted upon within three days.
- Overall system cost should not increase. Cost reduction is preferable but not mandatory.
- Tickets must be classified correctly according to their severity.
- Metadata editing: Logs are generated across thousands of application servers and machines. These logs are stored across multiple databases and repositories, in a variety of data formats, throughout the client’s infrastructure. The metadata for these datasets is to be edited to ensure uniformity and to allow merging and further downstream processing.
- Data preparation: The raw datasets contain inconsistent rows with missing values and redundant records. Data cleaning is performed to ensure a good-quality dataset for the machine learning process.
- Duplicate rows are removed, and missing values are replaced with mean or median values.
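The cleaning step described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the field names (`host`, `response_ms`, `error_count`), the sample rows and the helper `clean_rows` are hypothetical and do not come from the client's data, and the median is used here although the mean would work the same way.

```python
from statistics import median


def clean_rows(rows, numeric_fields):
    """Drop exact duplicate rows, then fill missing numeric values with the column median."""
    # 1. Remove exact duplicates while preserving the original order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input rows are not modified

    # 2. Replace missing values (None) with the median of the observed values.
    for field in numeric_fields:
        observed = [r[field] for r in unique if r.get(field) is not None]
        if not observed:
            continue  # nothing to impute from; leave the gaps as they are
        fill = median(observed)
        for r in unique:
            if r.get(field) is None:
                r[field] = fill
    return unique


# Hypothetical log-derived rows, for illustration only.
raw_logs = [
    {"host": "app-01", "response_ms": 120, "error_count": 3},
    {"host": "app-01", "response_ms": 120, "error_count": 3},   # exact duplicate
    {"host": "app-02", "response_ms": None, "error_count": 1},  # missing value
    {"host": "app-03", "response_ms": 300, "error_count": 7},
]
cleaned = clean_rows(raw_logs, numeric_fields=["response_ms", "error_count"])
```

Duplicates are removed before imputation so that repeated rows do not skew the median used to fill the gaps.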
Feature engineering: This involves converting input text data into integer values, which reduces dimensionality. Various columns are aggregated to create new columns; for example, the average of several features can be used instead of the individual features.
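As a rough illustration of the two ideas in this step, the sketch below maps a text column to integer codes and replaces several related columns with their average. The column names (`component`, `cpu_1m`, `cpu_5m`, `cpu_15m`) and both helper functions are assumptions made for the example, not part of the original system.

```python
def encode_categories(values):
    """Map each distinct text value to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


def add_average(rows, source_fields, target_field):
    """Replace several numeric columns with a single column holding their average."""
    result = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in source_fields}
        new_row[target_field] = sum(row[f] for f in source_fields) / len(source_fields)
        result.append(new_row)
    return result


# Hypothetical rows: one text column and three related numeric columns.
rows = [
    {"component": "database", "cpu_1m": 90, "cpu_5m": 70, "cpu_15m": 50},
    {"component": "web", "cpu_1m": 20, "cpu_5m": 30, "cpu_15m": 40},
    {"component": "database", "cpu_1m": 60, "cpu_5m": 60, "cpu_15m": 60},
]

component_codes, code_table = encode_categories([r["component"] for r in rows])
features = add_average(rows, ["cpu_1m", "cpu_5m", "cpu_15m"], "cpu_avg")
for feature_row, code in zip(features, component_codes):
    feature_row["component"] = code  # text value replaced by its integer code
```

After this step each row has two numeric features instead of one text column and three numeric ones, which is the kind of reduction the paragraph above describes.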
- Modeling/algorithm selection: A suitable machine learning algorithm is to be selected to ensure correct classification of tickets.
Multiclass Decision Forest
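To give a feel for how a decision-forest classifier assigns severity levels, here is a minimal local sketch. It does not use the Azure Machine Learning component named above; scikit-learn's `RandomForestClassifier` is swapped in as a comparable ensemble of decision trees. The two features (`error_count`, `avg_response_ms`), the training rows and the three severity levels shown are invented for the example.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical training data: [error_count, avg_response_ms] per log window.
X_train = [
    [50, 900], [45, 850], [55, 950],   # severe incidents
    [20, 400], [22, 450], [18, 420],   # moderate issues
    [1, 100], [2, 120], [0, 90],       # minor issues
]
# Severity labels; only levels 1, 3 and 5 appear in this toy sample.
y_train = [1, 1, 1, 3, 3, 3, 5, 5, 5]

# A forest of decision trees; random_state fixes the seed for repeatable runs.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Classify two unseen log windows.
X_new = [[48, 880], [1, 110]]
predicted = model.predict(X_new)
print(list(predicted))
```

In a real setting the model would be trained on the cleaned and engineered features, with all five severity levels represented, and evaluated on held-out tickets before being used to open tickets automatically.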
The machine-learning-based ticket classification system met the business requirements by analyzing all generated logs and classifying them according to severity. A significant reduction in the time required to process the logs was observed. The new ticketing system was able to handle the business's scaling requirements and, in the long run, will help reduce running costs, since fewer human teams are required in the automated process.