[Bro] sum stats q.
Dk Jack
2018-11-29 01:31:46 UTC
Hi,
I am trying to use the Bro sumstats framework. Based on the examples, I came up
with the script shown at the end of the email. In the script, I am counting
the number of http requests for each method+uri combination.

As dictated by the framework, I am calling observe for each request. At the
end, I expected the sumstats totals to add up to the number of requests in my
pcap. However, this doesn't seem to be the case. I am trying to understand if
I made a mistake in how I am using the framework or if something else is
going on.

For example, I ran the script on the try.bro.org website using the http.pcap
available there. Per my analysis, there should be 197 requests in the pcap.
When I dump each of my stats into a log file, I expect the hits column of
the log to add up to 197, but it does not. Running the script against my
own pcap also gives different numbers from what I would expect.

Any help understanding the issue is appreciated... Thanks

Dk.

PS: you can copy and paste this script into the try.bro.org website and run it
against the http.pcap.

@load base/utils/site
@load base/frameworks/sumstats

module HttpStats;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        ts: time &log;
        method: string &log;
        uri: string &log;
        hits: count &log;
    };

    global update_http_stats: function(method: string, uri: string);
}

global scount: count = 0;

event bro_init() &priority=5
{
    print "Creating HttpStats log stream and HTTP sumstats";
    flush_all();

    # Create the stream.
    Log::create_stream(HttpStats::LOG, [$columns=Info, $path="http-stats"]);

    local r1 = SumStats::Reducer($stream="http-stats",
                                 $apply=set(SumStats::SUM));

    SumStats::create([$name="http-stats",
                      $epoch=5sec,
                      $reducers=set(r1),
                      $epoch_result(ts: time, key: SumStats::Key, result:
                                    SumStats::Result) =
                      {
                          local r = result["http-stats"];
                          local host_uri_vec = split_string(key$str, /,/);
                          local method = host_uri_vec[0];
                          local uri = host_uri_vec[1];
                          #local hits = double_to_count(floor(r$sum));
                          local hits = double_to_count(floor(r$num));

                          # prep the record
                          local log_rec: Info = [$ts=ts, $method=method,
                                                 $uri=uri, $hits=hits];
                          Log::write(HttpStats::LOG, log_rec);
                      }
                     ]);
}

event bro_done()
{
    Reporter::info(fmt("scount=%d", scount));
}

function update_http_stats(method: string, uri: string)
{
    local key = cat_sep(",", "-", method, uri);

    scount += 1;

    # count URI hits.
    SumStats::observe("http-stats", SumStats::Key($str=key),
                      SumStats::Observation($num=1));
}

event http_request(c: connection, method: string, original_URI: string,
                   unescaped_URI: string, version: string)
{
    update_http_stats(method, unescaped_URI);
}
Azoff, Justin S
2018-11-29 17:24:04 UTC
Hi!

This is all my fault 😞. Currently trybro limits log output to 200 lines for each file: it shows the first 100 and the last 100. I had always intended to make that more obvious and to allow that '200' parameter to be changed, but forgot all about it. It was mostly done as a performance optimization - the log output can be quite large, and the result would either take too long to transfer to the client or the browser would freeze trying to render a table with 20k rows. The good news is that it is already a parameter on the backend; it just needs to be exposed to the API.

If you increase the epoch interval in your script to 500secs, all the records are output, since the total number of rows is then just under 200.

If you run it with a local bro binary you should get the output you are expecting as well.

That said, the script you posted would likely have issues if run on a cluster. The short time interval combined with the potential for a large number of unique 'keys' in sumstats would cause a large amount of load on the manager. If you're not running it on a cluster on live traffic it should work fine though. If you do want to run that exact analysis on a cluster I can write you a version that uses events directly and would perform a bit better under load.

--
- Justin
Azoff, Justin S
2018-11-30 19:14:41 UTC
>
> Hi Justin,
> Thanks for responding. My problem is not with try.bro.org but with how sumstats seem to work. I was just using try.bro.org to demonstrate the issue in case someone wanted to try my test.
>

Hi,

While trying to reproduce your problem I found that this was fixed a few months ago:

https://github.com/bro/bro/commit/3495b2fa9d84e8105a79e24e4e9a2f9181318f1a#diff-3248d64d10c61bb0656f5c167feca5f0

I ended up tracking down the root cause only to realize this is already fixed in 2.6 :-) Never hurts to practice bro script debugging though. Turns out the old script was deleting entries from a table while iterating over it, which is undefined behavior in bro (and in many other languages).
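
For illustration only, here is a minimal sketch of one common way to avoid deleting from a table while iterating over it: collect the keys first, then delete in a second pass over the collected set. It is not the change made in the commit linked above, and the names seen, drop_small and doomed are hypothetical.

```bro
# Hypothetical table of per-key counters.
global seen: table[string] of count;

# Remove every entry whose count is below the threshold.
function drop_small(threshold: count)
    {
    # First pass: only collect the keys to remove; 'seen' is not modified.
    local doomed: set[string] = set();
    for ( k in seen )
        {
        if ( seen[k] < threshold )
            add doomed[k];
        }

    # Second pass: iterate over the separate set and delete from 'seen'.
    for ( d in doomed )
        delete seen[d];
    }

event bro_init()
    {
    seen["a"] = 1;
    seen["b"] = 10;
    drop_small(5);
    print |seen|;
    }
```

The table being iterated is never the one being modified, so the loop does not depend on how the interpreter handles concurrent modification.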

I have a directory with http.pcap and your script (s.bro)

I run a bro 2.5.5 container and count the results, getting 128 instead of 197.

***@mbp:~/b$ docker run -t -i --rm -v `pwd`:/b broplatform/bro:2.5.5
***@cbd05c9035c3:/# cd /b
***@cbd05c9035c3:/b# bro -r http.pcap s.bro
Creating HttpStats log stream and HTTP sumstats
1320279683.449294 ./s.bro, line 55: scount=197
***@cbd05c9035c3:/b#
***@cbd05c9035c3:/b# cat http-stats.log |bro-cut hits | awk '{s+=$1} END {printf "%.0f\n", s}'
128

Now I do the same test again but using bro 2.6 released yesterday and get the correct result of 197:

***@mbp:~/b$ docker run -t -i --rm -v `pwd`:/b broplatform/bro:2.6
***@869655245d1d:/# cd /b
***@869655245d1d:/b# bro -r http.pcap s.bro
Creating HttpStats log stream and HTTP sumstats
1320279683.449294 ./s.bro, line 55: scount=197
***@869655245d1d:/b#
***@869655245d1d:/b# cat http-stats.log |bro-cut hits | awk '{s+=$1} END {printf "%.0f\n", s}'
197


--
Justin
Dk Jack
2018-11-30 20:03:59 UTC
Thanks for investigating this Justin. I was scratching my head for two days :)

Btw, I am using 2.4.1. Since my requirements were very simple, I ended up creating my own table and writing the accumulated counts to the log periodically using the ‘schedule’ primitive. That’s working correctly. Hopefully, I can get rid of that and move to the sumstats version when I upgrade my bro to 2.6.
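
A minimal sketch of that kind of workaround, assuming a plain table keyed by method and URI plus a self-rescheduling event. The module name HttpCount, the flush_interval option and the http-count log path are made up for illustration; this is not the exact script described above.

```bro
module HttpCount;

export {
    redef enum Log::ID += { LOG };

    type Info: record {
        ts:     time   &log;
        method: string &log;
        uri:    string &log;
        hits:   count  &log;
    };

    # Hypothetical knob: how often the counters are written out.
    const flush_interval = 5sec &redef;
}

# Running totals per (method, uri) pair.
global counts: table[string, string] of count;

global flush_timer: event();

# Write every counter to the log, then empty the table in one step
# (no deletes while the loop is running).
function flush_counts()
    {
    for ( [method, uri] in counts )
        Log::write(LOG, [$ts=network_time(), $method=method,
                         $uri=uri, $hits=counts[method, uri]]);

    clear_table(counts);
    }

event flush_timer()
    {
    flush_counts();
    schedule flush_interval { flush_timer() };
    }

event bro_init()
    {
    Log::create_stream(LOG, [$columns=Info, $path="http-count"]);
    schedule flush_interval { flush_timer() };
    }

event http_request(c: connection, method: string, original_URI: string,
                   unescaped_URI: string, version: string)
    {
    if ( [method, unescaped_URI] !in counts )
        counts[method, unescaped_URI] = 0;

    counts[method, unescaped_URI] += 1;
    }

event bro_done()
    {
    # Flush whatever accumulated since the last timer tick.
    flush_counts();
    }
```

Keying the table on two separate strings avoids having to join and re-split the method and URI, which would misbehave for URIs that contain the separator character.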

Thanks again.

Dk.
