Friday, November 14, 2014

node-netflowv9 is updated to support netflow v1, v5, v7 and v9

My NetFlow module for Node.JS has been updated. It now supports more NetFlow versions - NetFlow v1, v5, v7 and v9. It has also been modified so it can be used as an event emitter (instead of doing callbacks). The old model is still supported, but now you can also do:

var Collector = require('node-netflowv9');

Collector({port: 3000}).on('data',function(flow) {
    console.log(flow);
});
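
The original callback interface (the same one described in the NetFlow v9 post further down this blog) is still supported as well:

var Collector = require('node-netflowv9');

Collector(function(flow) {
    console.log(flow);
}).listen(3000);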
Additionally, the module now supports and decodes option templates and option data flows for NetFlow v9.

Wednesday, October 22, 2014

My perl-cwmp patches are merged


Hello,
I've used perl-cwmp here and there. It is a nice, really small, light and simple TR-069 ACS, with a very easy install and no heavy requirements. You can read the whole code in a few minutes and make your own modifications. I am using it in a lot of small "special" cases, where you need something fast and specific, or a very complex workflow that cannot be implemented by any other ACS server.

However, this project has been stalled for a while. I've found that a lot of modern TR-069/CWMP agents do not work well with perl-cwmp.

There are quite a few reasons behind those problems:

- Some of the agents are very strict - they expect the SOAP message to be formatted in a specific way, not the way perl-cwmp does it
- Some of the agents are compiled with a not-so-smart, static expansion of the CWMP XSD file. That means they expect a strict type specification in the SOAP message and strict ordering

perl-cwmp does not "compile" the CWMP XSD, does not send strict requests, and does not interpret the responses strictly. It does not automatically set the correct property type in the request according to the spec, because it never reads the spec. It always assumes that the property type is a string.

To allow perl-cwmp to be fixed and adjusted to work with those types of TR-069 agents, I've made a few modifications to the code, and I am happy to announce they have been accepted and merged into the main code:

The first modification is that I've updated the SOAP header (according to the current standard). It was incorrectly set, and many TR-069 devices I have tested (basically all devices based on the Broadcom TR-069 client) rejected the request.

The second modification is that all the properties may now have a type specified. Unless you specify the type, it is always assumed to be a string. That allows the ACS to set property values on agents that do a strict type check on set.

InternetGatewayDevice.ManagementServer.PeriodicInformInterval: #xsd:unsignedInt#60

The #...# part specifies the type of the property. In the example above, we are setting the unsignedInt value 60 to PeriodicInformInterval.

You can also set the value of a property by reading the value of another property.
For that you can use ${ property name }

Here is an example of how to set the PPP password to be the value of the serial number:

InternetGatewayDevice.WANDevice.1.WANConnectionDevice.1.WANPPPConnection.1.Password: ${InternetGatewayDevice.DeviceInfo.SerialNumber}

And last but not least - now you can execute a small piece of code, or an external script, and set the value of a property to the output of that code. You can do that with $[ code ]

Here is an example of how to set a random value for the PeriodicInformInterval:

InternetGatewayDevice.ManagementServer.PeriodicInformInterval: #xsd:unsignedInt#$[60 + int(rand(100))]

Here is another example, showing how to execute an external script that could take this decision:
InternetGatewayDevice.ManagementServer.PeriodicInformInterval: #xsd:unsignedInt#$[ `./externalscript.sh ${InternetGatewayDevice.LANDevice.1.LANEthernetInterfaceConfig.1.MACAddress} ${InternetGatewayDevice.DeviceInfo.SerialNumber}` ]

The last modification I've made allows perl-cwmp to "fork" a new process when a TR-069 request arrives. It used to be single-threaded code, which meant the agents had to wait until the previous task was completed. However, if the TCP listening queue is full, or the ACS is very busy, some of the agents will assume there is no response and time out. You may then have to wait up to 24h (the default periodic inform interval for some vendors) until you get the next request. Now that can be avoided.

All this is very valuable for dynamic and automated configurations without modifying the core code - you just modify the configuration file.

Saturday, October 4, 2014

Why MVC?

As you all probably know, the MVC approach has become very fashionable lately. MVC stands for Model-View-Controller, where your data, your visualization and your gluing/managing code are expected to be fully separated into separate files (they are not really separated, as they are linked to each other in the same program). As I come from the world of system and embedded programming, it was hard for me to understand the reason behind this.
Instinctively I thought it should be somehow related to ease of development. Maybe it makes it easier to separate the work of the UI designers, the back-end communication (and development) and the UI execution control. You can easily split the work among different people with different skills, I thought. But now I realize it is something absolutely different.
It is maintainability, and therefore easier support!
And it is best illustrated with HTML.

You can easily insert JavaScript code directly within an HTML tag:
<INPUT TYPE=BUTTON onClick="alert('blabla')" VALUE="Click Me!">

If you go for the MVC approach, you should have separate code that does something like this:
Separate HTML:
<INPUT ID="myButton" TYPE=BUTTON VALUE="Click Me!">

Separate JavaScript:
document.getElementById("myButton").addEventListener("click", function() { alert('blabla') })

It is obvious - MVC is more expensive in terms of code, structure, style and preparation. So why walk this extra mile? Some programmers with my background would usually say - it has overhead and is therefore inefficient to program.

However, if you have software that has to be rewritten constantly - introducing new functionality and new features, fixing it - you have a lot of other issues to deal with. Your major problem will be the maintainability and readability of your code.
And I am sure everyone will agree that having all your control code, execution and control flow kept together in the same code structure is much better than having it scattered among a lot of data processing code and UI visualizations.

If you have huge HTML code with a lot of JavaScript separated and bound directly into the tags (non-MVC), it is extremely hard to keep in mind what all the events are that happen and what the order of execution of the code is. MVC makes that much, much easier, even though in the beginning it may be costly with some extra overhead.

Wednesday, October 1, 2014

Sencha ExtJS grid update in real time from the back-end

Hello to all,

I love using Sencha ExtJS in some projects as it is the most complete JavaScript UI framework, even though it is kind of slow, not fast to react, and CPU and memory expensive. ExtJS allows very fast and lazy development of otherwise complex UIs, and especially if you use Sencha Architect you can minimize the UI development time, focusing only on the important things in your code.

However, ExtJS has quite a few drawbacks - missing features, or things that are overly complex and hard to keep in mind for an inexperienced developer (like their Controller idea).

Here I would like to show you a little example of how you can implement a very simple real-time update of Sencha grids (tables) from the back-end for a multi-user application.

Why do you need this?
I often develop apps that have to be used by multiple people at the same time, sharing and modifying the same data.

In such a situation, a developer usually has to resolve all those conflicting cases where two users try to modify the same data. And Sencha ExtJS grids are not very helpful here. Sencha uses the concept of a Store that interacts with the data of the back-end (for example by using a REST API); the Store is then assigned to a visualization object like a ComboBox or a Grid (table). If you modify a table (with the help of the Cell Edit or Row Edit plugin) whose store has the autoSync property set to true, then any modification you make automatically generates a REST POST/PUT/DELETE query to inform the back-end. It could not be easier for a developer, right? But the data sent to the back-end contains the whole modified row - all the properties. At first sight this is not an issue. But it is, if you have multiple users editing the same table at the same time. The problem happens because the Sencha Store caches the data. So if User1 modifies it - it is stored on the server. But if User2 then modifies the same row but a different column, it will do that over the old data and can overwrite User1's modification. The back-end cannot know which property has been modified and which not, and which of the two modifications has to be kept.
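
For context, here is a minimal sketch of such a Store wired to a REST proxy with autoSync enabled. The field names and exact configuration are assumptions for the example, not taken from a real application:

// Minimal sketch of a Store with autoSync and a REST proxy (assumed fields/config)
Ext.define('MyApp.store.MyRest', {
    extend: 'Ext.data.Store',
    storeId: 'myrest',              // the same name as the back-end REST method
    autoLoad: true,
    autoSync: true,                 // every grid edit immediately fires POST/PUT/DELETE
    fields: ['_id', 'name', 'value'],
    proxy: {
        type: 'rest',
        url: '/rest/myrest',
        reader: { type: 'json' }
    }
});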
There are a lot of tricks a developer usually uses to avoid these conflicts. One is keeping a version number with each data row on the server, which the UI clients receive with their GETs. When a modification arrives, it is accepted only if the client sends the same version number as the one stored on the server, and then the version on the server is increased. If another modification arrives carrying older cached data, it is rejected because it has a different version number. The client then receives an error, and the UI software may refresh its data, updating the versions and the content shown to the user.
This is quite a popular model, but it is not very nice for the user. The problem is that with multiple users working with the application and modifying the same data at the same time, each user will constantly be outdated and will constantly receive errors, losing all of their modifications.
The only good solution, both for the users and for the system in general, is to update the data in real time in all UI applications when a change happens. This does not remove all possibilities for conflict, but it minimizes them greatly, making the whole operation much more pleasant for the end user.
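
For illustration only, here is a hedged sketch of that version-check idea in the same Express/MongoDB style as the code below (handler and field names are made up):

// Hypothetical optimistic-locking update: accept the change only if the client's
// version matches the one stored on the server, then bump the version
function restPutMyrestVersioned(req,res) {
  var id = ObjectID.createFromHexString(req.param('id'));
  var clientVersion = req.body.version;
  delete req.body.version;                  // the server owns the version counter
  db.collection('myrest').findAndModify(
     { _id: id, version: clientVersion },   // matches only if the client is up to date
     [['_id','asc']],
     { $set: req.body, $inc: { version: 1 } },
     { 'new': true },
     function(err,q) {
        if (err || (!q)) return res.send(409); // outdated copy - the UI has to refresh
        return res.send(200,q);
     });
}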

This problem, and the need to resolve it, happens quite often. Google Spreadsheets and later Google Docs introduced real-time updates between the UIs of all users modifying the same document about 4 years ago.

Example
I would like to show here that it is not really hard to update the Stores of ExtJS applications in real time.
It actually requires very little additional code.

Let's imagine we are using a UI developed in Sencha ExtJS with Stores communicating through REST with the back-end. The back-end for this example will be Node.JS and MongoDB.

Between Node.JS and the ExtJS UI there will be a Socket.IO session that we will use to push the updates from Node.JS to the ExtJS Store. I love Socket.IO because it provides a simple WebSockets interface with a fallback to an HTTP polling model in case WebSockets cannot be opened (which happens a lot if you are unlucky enough to use Microsoft security software, for example - it blocks WebSockets).

On the MongoDB side we may use capped collections. I love capped collections - not only are they limited in size, but they also allow you to bind a trigger (make the collection tailable) that will receive any new insertion immediately when it happens.

So imagine your Node.JS express REST code looks something like this:

// Assumes an Express app ("app"), a connected MongoDB database ("db")
// and ObjectID from the mongodb driver are already set up
app.get('/rest/myrest',restGetMyrest);
app.put('/rest/myrest/:id',restPutMyrest);
app.post('/rest/myrest/:id',restPostMyrest);
app.del('/rest/myrest/:id',restDelMyrest);

function restGetMyrest(req,res) { // READ REST method
   db.collection('myrest').find().toArray(function(err,q) { return res.send(200,q) })
}

function restPutMyrest(req,res) { // UPDATE REST method
  var id = ObjectID.createFromHexString(req.param('id'));
  db.collection('myrest').findAndModify({ _id: id }, [['_id','asc']], { $set: req.body }, { safe: true, 'new': true }, function(err,q) {
      if (err || (!q)) return res.send(500);
      db.collection('capDb').insert({ method: 'myrest', op: 'update', data: q }, function() {});
      return res.send(200,q);
  })
}

function restPostMyrest(req,res) { // CREATE REST method
  var id = ObjectID.createFromHexString(req.param('id'));
  req.body._id = id; // merge the id from the URL into the new document
  db.collection('myrest').insert(req.body, { safe: true }, function(err,q) {
      if (err || (!q)) return res.send(500);
      setTimeout(function() {
         db.collection('capDb').insert({ method: 'myrest', op: 'create', data: q[0] }, function() {});
      },250);
      return res.send(200,q);
  })
}

function restDelMyrest(req,res) { // DELETE REST method
  var id = ObjectID.createFromHexString(req.param('id'));
  db.collection('myrest').remove({ _id: id }, { safe: true }, function(err,q) {
      if (err || (!q)) return res.send(500);
      db.collection('capDb').insert({ method: 'myrest', op: 'delete', data: { _id: id } }, function() {});
      return res.send(201,{});
  })
}

As you can see above, we have implemented a classic CRUD REST method named "myrest", retrieving and storing data in a MongoDB collection named 'myrest'. However, with every modification we also store that modification in a MongoDB capped collection named "capDb".
We use this capped collection as an internal communication mechanism within Node.JS. You can use events instead, or you can directly send the message to the Socket.IO receiver. However, I like capped collections, as they bring a lot of advantages - there can be multiple Node.JS processes listening on a capped collection and receiving the updates simultaneously. So it is easier to implement clusters that way, including notifying Node.JS processes distributed over different machines.

So now, maybe in another file or anywhere else, you may have simple Node.JS Socket.IO code looking like this:

// Assumes sIo is the Socket.IO server instance attached to your HTTP server
var s = sIo.of('/updates');
db.createCollection("capDb", { capped: true, size: 100000 }, function (err, col) {
   var stream = col.find({},{ tailable: true, awaitdata: true, numberOfRetries: -1 }).stream();
   stream.on('data',function(doc) {
       s.emit(doc.op,doc);
   });
});
 
With this little code above we are basically broadcasting the content of the last insertion in the tailable capDb to everyone connected with Socket.IO to /updates. We also create this collection, if it does not already exist.

This is everything you need in Node.JS :)

Now we can get back to the ExtJS code. You simply need to have this code executed somewhere in your HTML application:

var socket = io.connect('/updates');
socket.on('create', function(msg) {
   var s = Ext.StoreMgr.get(msg.method);
   if ((!s)||(s.getCount()>s.pageSize)||s.findRecord('id',msg.data._id)) return;
   s.suspendAutoSync();
   s.add(msg.data);
   s.commitChanges();
   s.resumeAutoSync();
});
socket.on('update', function(msg) {
   var s = Ext.StoreMgr.get(msg.method);
   var r;
   if ((!s)||(!(r=s.findRecord('id',msg.data._id)))) return;
   s.suspendAutoSync();
   for (var k in msg.data) if (r.get(k) != msg.data[k]) r.set(k,msg.data[k]);
   s.commitChanges();
   s.resumeAutoSync();
});
socket.on('delete',function(msg) {
   var s = Ext.StoreMgr.get(msg.method);
   var r;
   if ((!s)||(!(r=s.findRecord('id',msg.data._id)))) return;
   s.suspendAutoSync();
   s.remove(r);
   s.commitChanges();
   s.resumeAutoSync();
});

This is all.
Basically, what we do from end to end is this -
If Node.JS receives any CRUD REST operation, it updates the data in MongoDB, but for Create, Update and Delete it also notifies all the listening web clients about this operation over Socket.IO (in my example, I use a tailable capped collection in MongoDB as an internal messaging bus, but you can emit to Socket.IO directly or use another messaging bus like an EventEmitter).

Then ExtJS receives the update over Socket.IO and assumes that the method property contains the name of the Store that has to be updated. We find the store, suspend autoSync if it is enabled (otherwise we could get into an update->autosync->rest->update loop), modify the content of the record (or the store) and resume autoSync.

With this simple code you can broadcast all the modifications of your data between all the ExtJS users that are currently online, so they can see updates in real time in their grids.

A single REST method may be used by multiple stores. In such a case, you have to modify the code with some association between the REST method name and all the related stores, as sketched below.
However, for this simple example, that is unnecessary.
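
A hedged sketch of such an association (the store names are made up):

// Hypothetical mapping between a REST method name and the stores that use it
var methodToStores = {
   myrest: ['myrestGrid', 'myrestCombo']
};

function eachStoreFor(method, cb) {
   (methodToStores[method] || [method]).forEach(function(storeId) {
       var s = Ext.StoreMgr.get(storeId);
       if (s) cb(s);
   });
}

// and then inside the socket handlers:
// eachStoreFor(msg.method, function(s) { /* add/update/remove as above */ });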

Some other day, I may show you the "ExtJS WebSockets CRUD proxy" I made, where you have only one communication channel between the stores and the back-end - Socket.IO. It is much faster and removes the need to have REST code in your server at all.

Friday, June 13, 2014

Fun with JavaScript inheritance

One could say that JavaScript does not support native inheritance for OOP - and could not be more wrong. JavaScript happens to have one of the most powerful OOP models I have ever seen :)

What's more, JavaScript does its inheritance through prototype chaining, which actually provides even more power than you can imagine.

See the following simple example, which shows chained inheritance (not the typical one-step prototype->object inheritance everyone knows):

~$ node
> a={}
{}
> b = Object.create(a)
{}
> c = Object.create(b)
{}
> a.test = 1
1
> a
{ test: 1 }
> b
{}
> c
{}
> c.test
1
> b.text=2
2
> b
{ text: 2 }
> b.test
1
> c.test
1
> c.text
2
> c
{}

So you can see above how we can create a very easy but powerful chain of objects. What's more, this works for ANY object, chaining BOTH the properties and the prototypes of the objects, so you can modify either the objects directly or their prototypes, and everything is inherited with a two-shadow priority model.

~$ node
> a=[1,2,3,4]
[ 1, 2, 3, 4 ]
> b=Object.create(a)
{}
> b
{}
> b[1]
2
> b.length
4
> b
{}

So now you can see that b is still an (empty) object but has behavior and properties like an array :)

Operator overloading with JavaScript

JavaScript happens to have an extremely powerful OOP tool-set that is largely unknown. I am surprised every day by its power and flexibility.

One of the things I didn't know was possible at all with JavaScript is operator overloading.
And this is understandable - JavaScript doesn't have strict variable types. Therefore any type of polymorphism is very hard to implement, as it is bound to the object/variable types.

However, you can easily do polymorphism with JavaScript and as it happens, you can easily do operator overloading too :)

How does that work?

It is really easy - JavaScript retrieves the value of an object using the valueOf method. And every object has valueOf, with the exception of a few primitive types (as in Java and C++). Therefore you can use that to do operator overloading.

Let me show you an example:

~$ node
> a={}
{}
> a == 3
false
> a.valueOf = function() { return 3 }
[Function]
> a == 3
true

And the best part is that you can do that on the prototype (or on an inherited object):

~$ node
> {} == 3
false
> Object.prototype.valueOf = function() { return 3 }
[Function]
> {} == 3
true

New improved version of the node-netflowv9 module for Node.JS

As I have mentioned before, I have implemented a NetFlowV9 compatible library for decoding of Cisco NetFlow version 9 packets for Node.JS.

Now I am upgrading it to version 0.1, which has a few updates:
  • bug fixes (including avoidance of an issue that happens with ASR9k and IOS XR 4.3)
  • now you can start the collector (with a second parameter of true) in a mode where you receive only one callback per packet instead of one callback per flow (the default mode). That could be useful if you want to count lost packets (otherwise the relation between netflow packet and callback is lost)
  • decreased code size
  • now the module compiles the templates dynamically into a function (using new Function). I like this approach very much, as it creates really fast functions (in contrast to eval, a Function is always JIT compiled) and it allows me to spare loops, function calls and memory copies. I like to do things like that with every data structure that allows it. As an effect of this, the new module is about 3 times faster in all the live tests I was able to perform. A small sketch of the idea follows below
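
To illustrate the idea (this is not the module's actual generated code, just a minimal sketch with an assumed field layout):

// Build a decoder function once per template; decoding a flow is then a single call
// with no per-field branching. Each field is assumed to be a 4-byte unsigned integer
// at a fixed offset, purely for the example.
function compileTemplate(fields) {
    var body = 'var flow = {};\n';
    fields.forEach(function (f) {
        body += 'flow.' + f.name + ' = buf.readUInt32BE(' + f.offset + ');\n';
    });
    body += 'return flow;';
    return new Function('buf', body);   // JIT-compiled, unlike eval
}

var decode = compileTemplate([
    { name: 'in_pkts',  offset: 0 },
    { name: 'in_bytes', offset: 4 }
]);
// decode(someBuffer) -> { in_pkts: ..., in_bytes: ... }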


Tuesday, June 10, 2014

Simple example for Node.JS sflow collector

Sometimes you can use SFlow or NetFlow to add extra intelligence to your network. The collectors available on the internet are usually there just to collect and store data used for accounting or nice graphs. But they either do not allow you to execute your own code when certain rules/thresholds are reached, or they do not react in real time (in general, the protocols delay you too - you cannot expect NetFlow accounting to be used in real time at all, and while SFlow has modes that are a bit faster to react by design, it is still not considered real-time sampling/accounting).

Just imagine you have a simple goal - you want to automatically detect floods and notify the operators or you can even automatically apply filters.

If you have an algorithm that can distinguish the incorrect traffic from the normal traffic from NetFlow/SFlow sampling you may like to execute an operation immediately when that happens.

The modern DoS attacks and floods may be complex and hard to detect. But the main difficulty is making the currently available NetFlow/SFlow collector software do that for you and then trigger/execute an external application.

However, it is very easy to program it yourself.

I am giving you a simple example that uses the node-sflow module to collect packet samples, measure how many of them match a certain destination IP address and, if they are above a certain pps threshold, execute an external program (that is supposed to block that traffic). Then, after a period of time, it executes another program (that is supposed to unblock the traffic).

This program is very small - about 120 lines of code - and allows you to use a complex configuration file where you can define a list of rules that can optionally match VLANs and networks for the sampled packet, and then count how many samples you have per destination for that rule. The rule list is executed until the first match, in the configured order within the array; that allows you to create black and white lists and different thresholds per network and VLAN, or to have different rules for overlapping IP addresses as long as they belong to different VLANs. A purely illustrative sketch of such a rule list is shown below.
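
Purely as an illustration (the real configuration format is documented in the repository), a rule list might conceptually look like this:

// Illustrative only - field names are assumptions, not the actual config format
var rules = [
    // stricter threshold for a specific vlan + network
    { vlan: 10, network: '10.0.0.0/8',  thresholdPps: 10000,
      block: './block.sh',   unblock: './unblock.sh', holdSeconds: 300 },
    // catch-all rule, evaluated last (first match wins)
    { network: '0.0.0.0/0', thresholdPps: 50000,
      block: './block.sh',   unblock: './unblock.sh', holdSeconds: 600 }
];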

Keep in mind this is just example software, there to show you how to use the node-sflow and pcap modules together! It is not supposed to be used in production, unless you definitely know what you are doing!

The goal of this example is just to show you how easy it is to add extra logic within your network.

The code is available on GitHub here: https://github.com/delian/sflow-collector/

Monday, June 9, 2014

RTMP Api for Node.JS to ease the implementation of RTMP servers and clients

Hello to all,

As I mentioned before, I needed to implement an RTMP streaming server in Node.JS. All of the available modules for an RTMP implementation in Node's NPM repository were incomplete, incorrect or unusable. Not only that, but the librtmp used by libav tools like avconv and avplay was incorrect and incomplete.
The same goes for most of the implementations I've checked (covering Perl, Python and others). I've tried to fix a few of them but in the end I had to write one of my own.

This is my library of RTMP related tools and API for Node.JS. It is named node-rtmpapi and is available in the npm repository. Also you can get it here - https://github.com/delian/node-rtmpapi

It works well for me, and it has been tested with MistServer, OrbanEncoders and librtmp (from libav).

That does not mean it will work for you, though :)

RTMP is a quite badly documented protocol and an extremely badly implemented one.
During my tests I have seen issues like crashes of libraries (including Adobe's original one) if the upper-layer commands have been sent in an unexpected order (although this is allowed by the RTMP protocol, and the order of the upper-layer commands is not documented at all). I have also seen (within Adobe's rtmp library) an incorrect implementation of the setPeerBandwidth command.

Generally, each RTMP implementation is on its own, and the only way to make it work is to adjust and tune it according to the software you communicate with.

Therefore I separated my code into utils that allow me to write my own RTMP server relatively easily and to adjust it according to my needs.

The current library supports only TCP as a transport (TLS and HTTP/HTTPS are easy to implement, but I haven't focused on them yet).

It provides separate code that implements streaming (readQueue), the chunk layer of the protocol (rtmpChunk), the upper-layer messaging (assembling and disassembling of messages over chunks, rtmpMessage), stream processing (rtmpStream) and a basic server implementation without the business logic (rtmpServer).

Simplified documentation is provided at the GitHub repository.

The current library uses callbacks for each upper-layer command it receives. I am planning to migrate the code to use Node streams and to trigger events per command instead of callbacks. This will greatly simplify the usage and the understanding of the library for a Node programmer. However, this is the future, and in order to preserve compatibility, I will probably name it something different (like node-streams-rtmpapi).

AMF0/3 encoding/decoding in Node.JS

I am writing my own RTMP restreamer (RTMP is Adobe's dying streaming protocol, widely used with Flash) in Node.JS.

Although there are quite a few RTMP modules, none of them is complete, operates with Node.JS buffers, or fully supports either AMF0 or AMF3 encoding and decoding.

So I had to write one on my own.

The first module is the AMF0/AMF3 utils that allow me to encode or decode AMF data. AMF is a binary encoding used in Adobe's protocols, very similar to BER (used in ITU's protocols) but supporting complex objects. In general, the goal of AMF is to encode ActionScript objects into binary. As ActionScript is a language belonging to the JavaScript family, ActionScript objects are basically JavaScript objects (with the exception of some simplified arrays).

My module is named node-amfutils and is now available in the public NPM repository as well as here https://github.com/delian/node-amfutils

It is not fully complete nor very well tested, as I have a very limited environment to do the tests in. However, it works for me and provides the best AMF0 and AMF3 support currently available for Node.JS -
  • It can encode/decode all the objects defined in both AMF0 and AMF3 (the other AMF modules in the npm repository support partial AMF0 or partial AMF3)
  • It uses Node.JS buffers (it is not necessary to do string-to-buffer-to-string conversion, as you have to do with the other modules)

It is easy to use this module. You just have to do something like this:

var amfUtils = require('node-amfutils');
var buffer = amfUtils.amf0Encode([{ a: "xxx" }, null]);

Sunday, June 8, 2014

SFlow version 5 module for Node.JS

Unfortunately, as with NetFlow Version 9, SFlow version 5 (and SFlow in general) has not been very well supported by the Node.JS community up to now.

I needed a modern SFlow version 5 compatible module, so I had to write one of my own.

Please welcome the newest module in Node.JS's NPM that can decode SFlow version 5 packets and be used in the development of simple and easy SFlow collectors! The module is named node-sflow and you can look at its code here https://github.com/delian/node-sflow

Please be careful, as in the next days I may change the object structure of the flow representation to simplify it! Any tests and experiments are welcome.

The sflow module is available in the public npm (npm install node-sflow) repository.

To use it you have to do:
var Collector = require('node-sflow');

Collector(function(flow) {
    console.log(flow);
}).listen(3000); 
In general SFlow is a much more powerful protocol than NetFlow, even in its latest version (version 9). It can represent more complex counters, report errors, drops and full packet headers (not only their properties), collect information from interfaces, flows and VLANs, and combine them into much more complex reports.

However, the SFlow support in the agents - the networking equipment - is usually extremely simplified, far from the richness and complexity the SFlow protocol may provide. Most of the vendors just do packet sampling and send it over SFlow as a raw packet/frame header with an associated, unclear counter.

If you are facing the issue described above, this module cannot help much. You will just get the raw packet header (usually Ethernet + IP header) as a Node.JS buffer, and then you have to decode it on your own. I want to keep the node-sflow module simple and I don't plan to decode raw packet headers there, as this is not a feature of SFlow itself.

If you need to decode the raw packet header, I can suggest one easy solution. You can use the pcap module from the npm repository and decode the raw header with it:
var Collector = require('node-sflow');
var pcap = require('pcap');

Collector(function(flow) {
    if (flow && flow.flow.records && flow.flow.records.length>0) {
        flow.flow.records.forEach(function(n) {
            if (n.type == 'raw') {
                if (n.protocolText == 'ethernet') {
                    try {
                        var pkt = pcap.decode.ethernet(n.header, 0);
                        if (pkt.ethertype!=2048) return;
                        console.log('VLAN',pkt.vlan?pkt.vlan.id:'none','Packet',pkt.ip.protocol_name,pkt.ip.saddr,':',pkt.ip.tcp?pkt.ip.tcp.sport:pkt.ip.udp.sport,'->',pkt.ip.daddr,':',pkt.ip.tcp?pkt.ip.tcp.dport:pkt.ip.udp.dport)
                    } catch(e) { console.log(e); }
                }
            }
        });
    }
}).listen(3000);

Thursday, June 5, 2014

NetFlow Version 9 module for Node.JS

I am writing some small automation scripts to help me in my work from time to time. I needed a NetFlow collector and I wanted to write it in JavaScript for Node.JS because of my general desire to support this platform, enabling the JavaScript language in generic application programming and system programming.

Node.JS probably has the best package manager on the market (for a framework), named npm. It is extremely easy to install and maintain a package, to keep dependencies, or even to "scope" it in a local installation, avoiding the need for root permissions on your machine. This is great. However, most of the packages registered in the npm database are junk. A lot of code is left without any development, has generic bugs, or is simply incomplete. I strongly suggest to the nodejs community to introduce package statuses based on public voting, marking each module with "production", "stable", "unstable" or "development" quality, and to make npm search look in "production" and "stable" by default. Actually, npm already has a way to do that, but it leaves the marking decision to the package owner.

Anyway, I was looking for a NetFlow v9 module that could allow me to capture netflow traffic of this version. Unfortunately, the only module supporting NetFlow was node-Netflowd. It does support NetFlow version 5 but has a lot of issues with NetFlow v9, to say the least. After a few hours testing it, in the end I decided to write one of my own.

So please welcome the newest Node.JS module that supports collecting and decoding of NetFlow version 9 flows, named "node-netflowv9".

This module supports only NetFlow v9 and has to be used only for it.
The library is very, very simple - about 250 lines of code - and supports all of the publicly defined Cisco properties, including variable-length numbers and IPv6 addressing.

It is very easy to use it. You just have to do something like this:
var Collector = require('node-netflowv9');

Collector(function(flow) {
    console.log(flow);
}).listen(3000);
The flow will be represented as a JavaScript object in a format very similar to this:
{ header: 
  { version: 9,
     count: 25,
     uptime: 2452864139,
     seconds: 1401951592,
     sequence: 254138992,
     sourceId: 2081 },
  rinfo: 
  { address: '15.21.21.13',
     family: 'IPv4',
     port: 29471,
     size: 1452 },
  flow: 
  { in_pkts: 3,
     in_bytes: 144,
     ipv4_src_addr: '15.23.23.37',
     ipv4_dst_addr: '16.16.19.165',
     input_snmp: 27,
     output_snmp: 16,
     last_switched: 2452753808,
     first_switched: 2452744429,
     l4_src_port: 61538,
     l4_dst_port: 62348,
     out_as: 0,
     in_as: 0,
     bgp_ipv4_next_hop: '16.16.1.1',
     src_mask: 32,
     dst_mask: 24,
     protocol: 17,
     tcp_flags: 0,
     src_tos: 0,
     direction: 1,
     fw_status: 64,
     flow_sampler_id: 2 } }
There will be a callback for each flow, not just one for each packet. If the packet contains 10 flows, there will be 10 callbacks, each containing a different flow. This simplifies the collector code as you don't have to loop through the flows on your own.

Keep in mind that NetFlow v9 does not have a fixed structure (in contrast to NetFlow v1/v5) and is based on templates. It depends on the platform which properties it will set in the templates and what their order will be. You always have to test your NetFlow v9 collector configuration. This library tries to simplify that as much as possible, but it cannot compensate for it.

My general feeling is that SFlow is much better defined and much more powerful than NetFlow in general. NetFlow v9 is the closest Cisco product that can provide (but does not necessarily provide) similar functionality. However, the behavior and the functionality of NetFlow v9 differ between the different Cisco products. On some you can define aggregations and templates on your own. On some (IOS XR) you can't, and you use NetFlow v9 as a replacement for NetFlow v5. On some other Cisco products (Nexus 7000) there is no support for NetFlow at all, but there is SFlow :)

In all of the Cisco products, the interfaces are sent as SNMP interface indexes. However, this index may not be persistent (between device reboots), and to associate it with an interface name you have to implement a cached SNMP GET to the interface table OID on your own. A hypothetical sketch of the caching idea follows below.
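
A hedged sketch of that caching idea (snmpGetIfName below is a hypothetical helper that you would implement with an SNMP library of your choice):

// Cache ifIndex -> interface name per exporter so the SNMP GET happens only once
var ifNameCache = {};

function resolveIfName(router, ifIndex, cb) {
    var key = router + '#' + ifIndex;
    if (ifNameCache[key]) return cb(null, ifNameCache[key]);
    // snmpGetIfName is a placeholder for a real SNMP GET of IF-MIB::ifName.<ifIndex>
    snmpGetIfName(router, ifIndex, function(err, name) {
        if (!err) ifNameCache[key] = name;
        cb(err, name);
    });
}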

Because of the impressive performance of modern JavaScript, this little module performs really fast in Node.JS. I have a complex collector implemented with configurable, evaluated aggregations that uses on average less than 2% CPU on a virtual machine, processing about 100 packets with flows and about 1000 flow statistics per second.

Update:
http://deliantech.blogspot.com/2014/06/new-improved-version-of-node-netflowv9.html

My first post

Although I have had a blog for a very long time, it is in Bulgarian and I have never written anything in English publicly before.
So this is the first post of my first (public) blog in English.
English is not my native language. Please be gentle with my language and style mistakes, and note them so I can learn.

The goal of this little blog is mainly to present my opinion on some IT tech stuff like gadgets and programming. I am a busy man and I usually have no time to write. Be prepared for very long gaps of silence here :)