Sunday, June 01, 2008

Network Flow: Uni-Directional VS Bi-Directional

If you are working on network flow research, you have probably heard of Uni-Directional and Bi-Directional Network Flow. I will try to explain what they are here. Let's take a quick look at what network flow is first -

Network Flow is a packet or a sequence of packets that belongs to a certain network session (conversation) between two endpoints but delimited by the setting of flow generation tool. To cut it short, it provides network traffic summarization by metering or accounting certain attributes of the network session.

The endpoints here are defined as below -

Layer 2 Endpoint - Source Mac Address | Destination Mac Address
Layer 3 Endpoint - Source IP Address | Destination IP Address
Layer 4 Endpoint - Source Port | Destination Port

Before we dive into UniFlow and BiFlow, let's look at the definitions of Uni and Bi here -

http://www.yourdictionary.com/uni-prefix

http://www.yourdictionary.com/bi-prefix

Uni - one; having or consisting of one only; regarded as a single entity

Bi - using two or both; joining two, combining or involving two

In the context of Uni/Bi Directional Flow, Uni means single and Bi means both. Now, let's make it clearer.

Uni-Directional = Single Directional

Bi-Directional = Both Directional

I put up the illustration in the diagram below.

Uni-Directional Flow


Bi-Directional Flow

Now for a simple example: host A sends 90 bytes to host B and host B replies with 120 bytes. Here's the output -

Uni-Directional Network Flow
Srcaddr  Direction  Dstaddr  Total Bytes
Host A   ->         Host B   90
Host B   ->         Host A   120

Bi-Directional Network Flow
Srcaddr  Direction  Dstaddr  Total Bytes  Src Bytes  Dst Bytes
Host A   <->        Host B   210          90         120

The Srcaddr and Dstaddr are the endpoints here. In Uni-Directional Flow, the first flow record shows only the total bytes sent by Host A (an attribute of Host A) and nothing about Host B. The next record shows that Host B sends 120 bytes to Host A (an attribute of Host B). The total bytes are accounted from a single endpoint (either Host A or Host B) only. In BiFlow, however, you can see that Host A sends 90 bytes (Source Bytes) and Host B replies with 120 bytes (Destination Bytes). The total bytes are the sum of the source and destination bytes. To summarize them -

Uni-Directional Network Flow Model - One direction at a time; every flow record contains the attributes of a single endpoint only.

Bi-Directional Network Flow Model - Both directions at a time; every flow record contains the attributes of both endpoints.
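To make the two models concrete, here is a minimal Python sketch that rebuilds the Host A / Host B example above. The packet list, the function names and the record field names are all made up for illustration; this is not how NetFlow or Argus are implemented, just the accounting idea.

```python
from collections import defaultdict

# Hypothetical packet log for the example above: (source, destination, bytes).
packets = [("Host A", "Host B", 90), ("Host B", "Host A", 120)]


def uniflow(packets):
    """One record per direction: the key is the ordered (src, dst) pair."""
    totals = defaultdict(int)
    for src, dst, size in packets:
        totals[(src, dst)] += size
    return dict(totals)


def biflow(packets):
    """One record per conversation: the first sender seen becomes the source."""
    records = {}
    for src, dst, size in packets:
        key = frozenset((src, dst))  # same key for both directions
        rec = records.setdefault(
            key, {"src": src, "dst": dst, "src_bytes": 0, "dst_bytes": 0}
        )
        if src == rec["src"]:
            rec["src_bytes"] += size
        else:
            rec["dst_bytes"] += size
    for rec in records.values():
        rec["tot_bytes"] = rec["src_bytes"] + rec["dst_bytes"]
    return list(records.values())


uni_records = uniflow(packets)
bi_records = biflow(packets)
print(uni_records)
print(bi_records)
```

The only real difference is the key: the uniflow key keeps source and destination in order, while the biflow key ignores the order and keeps separate counters for each side.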

Theory is tough sometimes, so here are some practical examples -

Cisco NetFlow uses the Uni-Directional model for flow generation

Argus uses the Bi-Directional model for flow generation

To draw a good picture of Uni-Directional and Bi-Directional Network Flow, it's best to compare them.

1. Network Flow data which is generated by Argus 3 natively
2. Network Flow data which is generated by Cisco NetFlow version 5

The flow records below are generated from the same network session.

Cisco NetFlow(UniFlow):


Argus(BiFlow):


Flow record property:
SrcAddr = Source Address
Sport = Source Port
Dir = Direction
DstAddr = Destination Address
Dport = Destination Port
SrcPkts = Source Packets
DstPkts = Destination Packets
TotPkts = Total Packets
SrcBytes = Source Bytes
DstBytes = Destination Bytes
TotBytes = Total Bytes

Sometimes I like to think that UniFlow is stateless and BiFlow is stateful.

I will continue writing this Network Flow series, and I hope you enjoy it. Stay tuned for the next one - Traffic Matrix. And of course the HeX 021 series too.

Argus 3 Tip:
You can convert Argus BiFlow to UniFlow by using the -M rmon option.
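Conceptually, going from BiFlow to UniFlow just means splitting one record into two, one per direction. The Python sketch below shows that idea only; the function and field names are hypothetical and it is not a description of what Argus does internally.

```python
def split_biflow(rec):
    """Split one bidirectional record into two unidirectional records."""
    forward = {"src": rec["src"], "dst": rec["dst"], "tot_bytes": rec["src_bytes"]}
    reverse = {"src": rec["dst"], "dst": rec["src"], "tot_bytes": rec["dst_bytes"]}
    return [forward, reverse]


# Hypothetical BiFlow record taken from the Host A / Host B example.
biflow_record = {"src": "Host A", "dst": "Host B", "src_bytes": 90, "dst_bytes": 120}
uniflow_records = split_biflow(biflow_record)
for r in uniflow_records:
    print(r["src"], "->", r["dst"], r["tot_bytes"])
```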

Peace (;])

Comments:

Pablo said...

Hello again Lee :) !!!
I have read several definitions of network flows and have seen that they can be defined in several ways. A definition that I liked is as follows: "A flow is a burst of traffic from the same source and heading to the same destination. If the space between two packets exceeds some inter-flow gap, they are said to belong to separate flows. This approach is also known as timeout-based flow profiling. Flows are identified by a five tuple consisting of source IP address, source port, destination IP address, destination port, and transport layer protocol." Others have suggested alternative approaches to profiling flows.
I think the timeout is important for the flow definition, so I would like to ask you whether the timeout is measured between consecutive packets or between the first packet and the currently captured packet.
Thanks.
Pablo

C.S.Lee said...

hi pablo,

I go with the definition -

Network Flow is a packet or a sequence of packets that belongs to a certain network session (conversation) between two endpoints but delimited by the setting of flow generation tool. To cut it short, it provides network traffic summarization by metering or accounting certain attributes of the network session.

I mention "but delimited by the setting of flow generation tool" because different flow-based tools tend to have different settings for their timeout value. The timeout is often counted from the last packet seen in the associated flow. If that flow doesn't see another associated packet within a certain time period, it is terminated, and any later packets generate a new flow record even though they belong to the same network session.

All the flow-based tools use this approach to avoid hogging memory. Bear in mind that keeping a long timeout value can be resource intensive, which is also why some tools allow you to define different timeout values for different sets of protocols to create more accurate flow records.
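Here is a rough Python sketch of that idle timeout idea, assuming the timer restarts at every packet. The function name, the 60 second value and the timestamps are made up for illustration and are not taken from any particular tool.

```python
def split_by_idle_timeout(timestamps, idle_timeout):
    """Group sorted packet timestamps (in seconds) into flows by idle gap."""
    flows = []
    for ts in timestamps:
        # Compare against the last packet of the current flow, not the first.
        if flows and ts - flows[-1][-1] <= idle_timeout:
            flows[-1].append(ts)
        else:
            flows.append([ts])
    return flows


# Hypothetical packet arrival times for one network session.
packet_times = [0.0, 1.0, 2.5, 70.0, 71.0]
flows = split_by_idle_timeout(packet_times, idle_timeout=60.0)
print(flows)
```

With these numbers the gap between 2.5 and 70.0 exceeds the timeout, so the same session ends up as two flow records.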

If you are using Argus, it can correct the flow records using racluster, as I have shown in my previous post here -

http://geek00l.blogspot.com/2007/12/network-flow-demystified.html

Personally I don't like to define a flow as a burst of traffic from the same source heading to the same destination. In fact, it doesn't really need to be a burst of traffic (you wouldn't call one packet a burst of traffic; think of the case where I send a single invalid packet and the other host doesn't respond, so the flow contains one packet only).

As for flows being identified by a five-tuple consisting of src ip, src port, dst ip, dst port and protocol, that's what we call the flow identifier. You determine whether a particular packet is in the same flow by tracking this 5-tuple. This is the most basic or standard way of layer 3 IP flow modeling.

Cheers ;]
