New LogAnalysis with 109x speed
Nikolett Hegedüs


The previous version of SenseLog (which powers our robust LogAnalysis module) processed every log file at startup and then watched the files for changes. It spent a lot of resources on the dates in the log rows. This was necessary in that version because SenseLog had to recognize each change and decide whether it needed to act on it. As a result, processing the log files took longer.

The current version processes only the changes. For delegated logs, SenseLog positions itself at the end of the file. This way there is no need for extra resources.

How did it get faster?

The new version deals only with changes in the log files, so only newly generated logs are processed. For now it does not handle the existing content of the files. (We plan to introduce a feature later that will let you scan the full contents of the log files using the SenseLog rules.)
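Processing only the changes can be pictured as remembering a read offset per file. The following is a minimal sketch of that idea, not SenseLog's actual code; the `LogTail` class and its method names are made up for illustration.

```python
import os


class LogTail:
    """Hypothetical helper: remembers a read offset for one log file."""

    def __init__(self, path):
        self.path = path
        # Start at the current end of the file, so existing content is skipped.
        self.offset = os.path.getsize(path)

    def read_new_lines(self):
        """Return only the complete lines appended since the last call."""
        if os.path.getsize(self.path) < self.offset:
            # The file shrank (truncated or rotated): start from the beginning.
            self.offset = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        # Leave an unfinished last line in place until its newline arrives.
        end = data.rfind(b"\n") + 1
        self.offset += end
        return data[:end].decode("utf-8", errors="replace").splitlines()
```

Each call to `read_new_lines()` touches only the bytes written since the previous call, so the cost depends on how much was appended, not on the total size of the file.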

In many rules we managed to simplify the regexes so much that pattern matching became faster still. Besides fixing the regexes of each rule, we made sure that a rule matches only one regex against a given log row, which also resulted in a significant speedup.

We use the Aho-Corasick algorithm in the rules, which simplifies and speeds up the search.
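For readers who have not met it, Aho-Corasick finds every occurrence of a whole set of keywords in a single pass over the text. Below is a small, self-contained sketch of the algorithm itself. How SenseLog connects it to its rules is not described here, so the function names and the keyword list are only illustrative assumptions.

```python
from collections import deque


def build_automaton(keywords):
    """Build the Aho-Corasick trie with failure links for the given keywords."""
    goto = [{}]   # goto[node][char] -> child node
    fail = [0]    # fail[node] -> node to fall back to on a mismatch
    out = [[]]    # out[node] -> keywords that end at this node
    for word in keywords:
        node = 0
        for ch in word:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(word)
    # Breadth-first pass: shallower nodes get their failure links first.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit keywords that end at the fallback node.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def find_keywords(automaton, text):
    """Return (start_index, keyword) for every keyword occurrence in text."""
    goto, fail, out = automaton
    node = 0
    hits = []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for word in out[node]:
            hits.append((i - len(word) + 1, word))
    return hits


# Illustrative keywords only; real rules would supply their own.
automaton = build_automaton(["Failed password", "Invalid user"])
hits = find_keywords(automaton, "sshd: Invalid user admin from 10.0.0.1")
```

The automaton is built once and reused for every log row. One plausible use is as a cheap keyword check that decides whether a more expensive regex needs to run at all.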

We have developed a method that measures how often each log file changes and classifies the files accordingly. The more often a file changes, the more often we check it. If a file turns out to change rarely, we check it less often, saving unnecessary steps (and processor time).
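One simple way to picture this kind of adaptive checking is a polling interval that shrinks while a file keeps changing and grows while it stays quiet. The sketch below is an assumption for illustration only: the `AdaptiveInterval` class, the halving and doubling policy and the limits are not taken from SenseLog.

```python
class AdaptiveInterval:
    """Hypothetical scheduler: busy files are polled more often, quiet ones less."""

    def __init__(self, initial=1.0, minimum=0.25, maximum=60.0):
        self.interval = initial   # seconds until the next check
        self.minimum = minimum
        self.maximum = maximum

    def update(self, changed):
        """Adjust the interval after a check and return the new value."""
        if changed:
            # The file is active: check it sooner next time.
            self.interval = max(self.minimum, self.interval / 2)
        else:
            # Nothing new: back off to save processor time.
            self.interval = min(self.maximum, self.interval * 2)
        return self.interval
```

A watcher would keep one such object per log file and sleep for the interval returned by `update()` before looking at that file again.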

How did we do it?

We took great care with planning: we made the code cleaner so that both current and later extensions are simpler and more transparent. With fewer classes we work more efficiently than in the first version of SenseLog, and we have also brought the division of responsibilities up to date. The regexes were optimized to find matches in as few steps as possible. We were careful not to spend too many resources on class instances. The same goes for pattern matching: we tried to keep the number of these function calls in the module to a minimum, as they are relatively costly operations. Only within reason, of course, because we need pattern matching to find the "wickedness" in the log files.

We use the Object Pool design pattern, as this also supports "cost-effectiveness".
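As a reminder of what the pattern looks like, here is a minimal sketch of an object pool. The `LineParser` class is a made-up stand-in for any object that is expensive to create, and none of the names come from SenseLog.

```python
class LineParser:
    """Hypothetical stand-in for an object that is costly to construct."""

    def __init__(self):
        self.fields = {}

    def reset(self):
        # Clear per-use state so the next user gets a clean object.
        self.fields.clear()


class ObjectPool:
    """Hands out idle instances instead of constructing a new one each time."""

    def __init__(self, factory, size):
        self._factory = factory
        self._free = [factory() for _ in range(size)]

    def acquire(self):
        # Reuse an idle instance if there is one, otherwise create a new one.
        return self._free.pop() if self._free else self._factory()

    def release(self, obj):
        obj.reset()
        self._free.append(obj)


pool = ObjectPool(LineParser, size=4)
parser = pool.acquire()
parser.fields["ip"] = "10.0.0.1"
pool.release(parser)  # ready to be reused for the next log row
```

The saving comes from paying the construction cost a few times up front rather than once per log row.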

We have also made it possible to customize the SenseLog module settings directly from the Dashboard.

Benchmark:

Based on our tests, 1 million logs in 1000 log files are processed in 13 seconds (roughly 77,000 logs per second).

