Using AWS Flow Logs to Detect Network Intruders


How do you know if there are people in your network who shouldn’t be? The easiest way to find out is to use AWS Flow Logs, an AWS feature that captures metadata from all IP traffic flowing through your AWS network. It records information such as IPs, ports, and protocols (no data payload), which lets you see what is legitimate traffic and what is not.

Unfortunately, Amazon doesn’t make it easy. AWS gives you an API and a console to access this information. With the console, you can only search through one network interface at a time, so if you are running multiple nodes, you will have to look through each interface individually. Or, you can use the API and write a program to sort through the data.


This is where FlowLog-Stats can help you out. We will pull the data nightly and create a dashboard for you to review. We parse the mountains of information into actionable data so you don’t have to.


We also provide value-added information by cross-referencing the IP list with malicious threat lists. You can see at a glance whether your network is communicating with known bad IPs and what threat level each of these IPs carries. For example, you most likely don’t want to leave port 3389 (Windows Remote Desktop Protocol) or port 3306 (MySQL) open to the internet. The table below shows that these ports are being actively scanned by malicious sources.
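As a rough illustration of the kind of check described above, here is a minimal Python sketch that counts flows aimed at those two ports. It assumes the default space-separated Flow Logs record format (14 fields, with the destination port seventh, the action thirteenth, and the log status last); the function name and the sample records are made up for the example and are not FlowLog-Stats code.

```python
from collections import Counter

# Ports called out in the text; extend as needed.
SENSITIVE_PORTS = {3389: "RDP", 3306: "MySQL"}


def sensitive_port_hits(lines, ports=SENSITIVE_PORTS):
    """Count flows to sensitive destination ports, keyed by (port, action)."""
    hits = Counter()
    for line in lines:
        fields = line.split()
        # Skip malformed lines and NODATA/SKIPDATA records ("-" in traffic fields).
        if len(fields) != 14 or fields[13] != "OK":
            continue
        dstport, action = int(fields[6]), fields[12]
        if dstport in ports:
            hits[(dstport, action)] += 1
    return hits


# Hypothetical records using documentation-reserved IP ranges.
sample = [
    "2 123456789010 eni-abc123de 203.0.113.12 172.31.16.21 51000 3389 6 3 180 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-abc123de 198.51.100.4 172.31.16.21 40000 3306 6 5 300 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-abc123de 203.0.113.12 172.31.16.21 51001 3389 6 2 120 1418530070 1418530130 REJECT OK",
]
for (port, action), count in sorted(sensitive_port_hits(sample).items()):
    print(SENSITIVE_PORTS[port], port, action, count)
```

An ACCEPT count on one of these ports is the interesting signal: it means the security group or network ACL let the connection through.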


Head on over to FlowLog-Stats to see how we can help you out and produce a useful daily dashboard on what is happening in your cloud.



Riak TS Cluster with Docker


The Riak TS database, a distributed NoSQL key/value store optimized for time series data, was recently open sourced. Time series data is any metric measured over time, such as CPU usage or requests per second. This type of database is great for metrics, log analytics, IoT sensor data, and so on.

This post will show you how to set up a 3-node Riak TS cluster with Docker Compose. Clone the GitHub repository that contains the docker-compose file and change into that directory.

You should pull the image first:

docker pull garland/riak-ts:1.3.0-1

Next, we will bring up the cluster by executing the docker-compose command:

docker-compose up

A bunch of log messages will fly by; here is a snippet:

riaktsdocker_riak1_1 is up-to-date
Recreating riaktsdocker_riak3_1
Recreating riaktsdocker_riak2_1
Attaching to riaktsdocker_riak1_1, riaktsdocker_riak3_1, riaktsdocker_riak2_1
riak3_1 | ++ grep -v
riak3_1 | ++ grep -Eo '([0-9]*\.){3}[0-9]*'
riak3_1 | ++ grep -Eo 'inet (addr:)?([0-9]*\.){3}[0-9]*'
riak3_1 | ++ ifconfig
riak3_1 | + LOCAL_IP=
riak3_1 | ++ head -1
riak3_1 | ++ ping -c 1 riak1
riak3_1 | ++ grep -E -o '[0-9]+.[0-9]+.[0-9]+.[0-9]+'
riak3_1 | + RIAK_1_IP=




riak3_1 | 2016-05-24 05:50:58.159 [info] @riak_core_gossip:log_node_changed:351 'riak@' changed from 'joining' to 'valid'
riak3_1 | 2016-05-24 05:50:58.159 [info] @riak_core_gossip:log_node_changed:351 'riak@' changed from 'joining' to 'valid'
riak2_1 | 2016-05-24 05:50:58.129 [info] @riak_core_gossip:log_node_changed:351 'riak@' changed from 'joining' to 'valid'
riak2_1 | 2016-05-24 05:50:58.129 [info] @riak_core_gossip:log_node_changed:351 'riak@' changed from 'joining' to 'valid'

Once you see the last joining messages, the cluster is up!   You can view the 3 containers running with this command:

docker ps

This output will show 3 containers running and the names of the containers:

473fb8937d96 garland/riak-ts:1.3.0-1 /bin/sh -c /opt/star 11 minutes ago Up 11 minutes riaktsdocker_riak2_1
add0832fca71 garland/riak-ts:1.3.0-1 /bin/sh -c /opt/star 11 minutes ago Up 11 minutes riaktsdocker_riak3_1
7a6fc715eece garland/riak-ts:1.3.0-1 /bin/sh -c /opt/star 12 minutes ago Up 11 minutes riaktsdocker_riak1_1

You can verify by using the riak-admin command to see the status of the cluster:

docker exec riaktsdocker_riak2_1 riak-admin member_status

Here’s the cluster status output:

================================= Membership ==================================
Status Ring Pending Node
valid 34.4% -- 'riak@'
valid 32.8% -- 'riak@'
valid 32.8% -- 'riak@'
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

Now you have a 3 node Riak TS cluster running locally on your machine!

How To Use AWS Flow Logs

Amazon Flow Logs is a free AWS feature that “captures information about IP traffic going to and from network interfaces”. It captures metadata about the IP traffic in your cloud (it does not capture the payload itself because that would be too much data to save).

What is in this metadata?

  • interface ID (which network interface, and therefore which instance, the traffic belongs to)
  • source and destination IP and port
  • protocol (TCP, UDP, ICMP, etc.)
  • number of packets transferred
  • number of bytes transferred
  • time/duration
  • whether the traffic was accepted or rejected
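To make the fields above concrete, here is a minimal Python sketch that parses one record. It assumes the default space-separated Flow Logs format, whose 14 fields are version, account ID, interface ID, source address, destination address, source port, destination port, protocol, packets, bytes, start, end, action, and log status. The class and function names are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowRecord:
    interface_id: str
    srcaddr: str
    dstaddr: str
    srcport: int
    dstport: int
    protocol: int    # IANA protocol number, e.g. 6 = TCP, 17 = UDP, 1 = ICMP
    packets: int
    byte_count: int
    start: int       # capture window start, Unix seconds
    end: int         # capture window end, Unix seconds
    action: str      # "ACCEPT" or "REJECT"


def parse_flow_record(line: str) -> Optional[FlowRecord]:
    """Parse one default-format record; return None if it carries no traffic data."""
    f = line.split()
    # NODATA/SKIPDATA records use "-" for the traffic fields, so skip them.
    if len(f) != 14 or f[13] != "OK":
        return None
    return FlowRecord(
        interface_id=f[2], srcaddr=f[3], dstaddr=f[4],
        srcport=int(f[5]), dstport=int(f[6]), protocol=int(f[7]),
        packets=int(f[8]), byte_count=int(f[9]),
        start=int(f[10]), end=int(f[11]), action=f[12],
    )


record = parse_flow_record(
    "2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
print(record)
```

The duration in the list above is not stored directly; it is the difference between the `end` and `start` timestamps of the capture window.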


This information can be useful in a myriad of ways to different users of your AWS cloud.

For the Ops or DevOps users, Amazon Flow Logs can provide information on what the usual traffic pattern should be.  You’ll know how much user traffic comes inbound from the internet and from where in the world by mapping the IP to a geographical location. You’ll find out who the top talkers are on the network and which machine is sending the most traffic.  This gives you a profile of how your cloud operates.
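Finding the top talkers boils down to summing bytes per source address. The sketch below shows one way to do that in Python, again assuming the default record format (source address is the fourth field, bytes the tenth); the function name and sample data are hypothetical.

```python
from collections import Counter


def top_talkers(lines, n=10):
    """Sum bytes per source address and return the n largest as (ip, bytes) pairs."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        # Skip malformed lines and NODATA/SKIPDATA records.
        if len(fields) != 14 or fields[13] != "OK":
            continue
        totals[fields[3]] += int(fields[9])
    return totals.most_common(n)


sample = [
    "2 123456789010 eni-abc123de 10.0.0.5 10.0.0.9 44000 443 6 10 1500 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-abc123de 10.0.0.7 10.0.0.9 44001 443 6 4 400 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-abc123de 10.0.0.5 10.0.0.8 44002 80 6 8 2500 1418530070 1418530130 ACCEPT OK",
]
print(top_talkers(sample, n=2))
```

Mapping each address to a geographical location would need an external GeoIP database, which is outside the scope of this sketch.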

Security users can use Amazon Flow Logs to find out which malicious IPs are trying to talk to their network and which machines they’re talking to. These IP addresses can be cross-referenced with curated lists that external security analysts produce to identify their threat severity. You can also use this service to find data leakage, for example when a large amount of data is sent outbound to a host that is not yours.
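Both security checks can be sketched in a few lines of Python with the standard `ipaddress` module. The example assumes flows have already been reduced to `(source, destination, bytes)` tuples, and that the threat list and your own address ranges are available as CIDR strings. The function names, the threshold, and the addresses (taken from documentation-reserved ranges) are all placeholders.

```python
import ipaddress
from collections import Counter, defaultdict


def threat_hits(flows, threat_networks):
    """Count flows whose source or destination falls inside a listed network."""
    nets = [ipaddress.ip_network(n) for n in threat_networks]
    hits = Counter()
    for src, dst, _nbytes in flows:
        for addr in (src, dst):
            ip = ipaddress.ip_address(addr)
            if any(ip in net for net in nets):
                hits[addr] += 1
    return hits


def large_outbound_transfers(flows, own_networks, threshold_bytes):
    """Total bytes sent from your hosts to outside hosts; keep pairs over the threshold."""
    own = [ipaddress.ip_network(n) for n in own_networks]

    def is_own(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in own)

    totals = defaultdict(int)
    for src, dst, nbytes in flows:
        if is_own(src) and not is_own(dst):
            totals[(src, dst)] += nbytes
    return {pair: total for pair, total in totals.items() if total >= threshold_bytes}


flows = [
    ("10.0.0.5", "203.0.113.9", 500),
    ("198.51.100.7", "10.0.0.5", 40),
    ("10.0.0.5", "10.0.0.6", 999999),
    ("10.0.0.5", "203.0.113.9", 700),
]
print(threat_hits(flows, ["203.0.113.0/24"]))
print(large_outbound_transfers(flows, ["10.0.0.0/8"], threshold_bytes=1000))
```

A real threat feed would also carry a severity score per entry; that lookup is left out here because feed formats vary.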

AWS gives you this data in a log format. It is up to you to ingest this data, parse it, and produce analytics from it. FlowLog-Stats takes care of this process for you and provides you with a dashboard that is refreshed daily. Check out our demo dashboard.


Production Ready Jenkins 2.0 in Two Steps with Docker


This blog post will show you how to run production-ready Jenkins in two steps. We have been using this for over six months with great success! Just a note: you can only run this on a machine that has Docker installed.

We are going to use the official image from Jenkins located here. The following scripts can also be found in this GitHub repository.

First, we will create the Upstart init script so that when the system boots or reboots, or if the container fails, it will restart this process just like any other server process you run.

Step 1 – Write the start configuration file

Using your favorite text editor, open /etc/init/jenkins.conf as the root user and add this content:

description "Starts the Jenkins Docker server"

start on filesystem or runlevel [2345]
stop on shutdown

script
 echo [`date`] Jenkins is starting >> /var/log/jenkins.log
 # Remove any previous container; "|| true" ignores the error if none exists
 docker stop jenkins || true
 docker rm jenkins || true
 exec docker run --name jenkins -p 80:8080 -p 50000:50000 -v /opt/jenkins_home:/var/jenkins_home jenkins:2.0-alpine
end script

pre-stop script
 docker stop jenkins
 echo [`date`] Jenkins is stopping >> /var/log/jenkins.log
end script

# "respawn" enables automatic restarts; the limit stops retrying after 2 failures in 60 seconds
respawn
respawn limit 2 60

This code accomplishes several things:

  1. Starts this service when the system boots and shuts it down when the system goes down
  2. Logs the starting and stopping to the file /var/log/jenkins.log
  3. Maps the Jenkins port to port 80 on your server
  4. Maps the Jenkins home directory to your local server path: /opt/jenkins_home

Number 4 is the most important function. If you change files inside a Docker container and the container is recreated, the files do not persist. By running this code, you map the files Jenkins writes to your local file system so that if the server reboots or Jenkins restarts, all of your settings and jobs will be safe. Plus, all you have to do to back up Jenkins is save the folder /opt/jenkins_home.

Step 2 – Start it up

This step is really easy.  You simply have to start it up:

service jenkins start

You can now go to http://localhost to configure Jenkins.  Jenkins now has a setup process and requires a password.  You can retrieve the password in the logs when it first starts up.


You can also view and tail the logs of Jenkins in the Docker logs with this command:

docker logs -ft jenkins

You will see a section like this with the password:

2016-05-05T17:25:49.011787978Z *************************************************************
2016-05-05T17:25:49.011808021Z *************************************************************
2016-05-05T17:25:49.011814273Z *************************************************************
2016-05-05T17:25:49.011824960Z Jenkins initial setup is required. An admin user has been created and a password generated.
2016-05-05T17:25:49.011834803Z Please use the following password to proceed to installation:
2016-05-05T17:25:49.011845543Z e30bed7d3c104005aeadd45de3a60d37
2016-05-05T17:25:49.011855603Z This may also be found at: /var/jenkins_home/secrets/initialAdminPassword
2016-05-05T17:25:49.011865623Z *************************************************************
2016-05-05T17:25:49.011871858Z *************************************************************
2016-05-05T17:25:49.011877088Z *************************************************************
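Because /var/jenkins_home inside the container is mapped to /opt/jenkins_home on the host, the password file mentioned in the log should also be readable from the host. A minimal Python sketch, assuming that mapping and sufficient file permissions (the function name is made up for the example):

```python
from pathlib import Path


def read_initial_admin_password(jenkins_home="/opt/jenkins_home"):
    """Return the generated setup password from the mapped Jenkins home directory."""
    path = Path(jenkins_home) / "secrets" / "initialAdminPassword"
    return path.read_text().strip()


if __name__ == "__main__":
    try:
        print(read_initial_admin_password())
    except OSError as err:
        # Typically "file not found" before first start, or "permission denied".
        print(f"Could not read the password file: {err}")
```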

You can now walk through the initial setup. All of the install files and configuration files Jenkins creates will be saved to /opt/jenkins_home.

We have found running Jenkins with Docker really easy. We don’t even need to install Java onto the machine!

This blog post was written by Garland Kan, co-founder of FlowLog-Stats and DevOps Engineering Consultant for several start-ups in the Bay Area. FlowLog-Stats is a service that takes the detailed server IP flow logs provided by Amazon’s Flow Logs and translates them into easy-to-understand metrics, reports, and graphs.

S3cmd in a Docker Container


Most of the systems I run now have Docker on them, and I try to install as little as possible on the base system. There will be times when you need to ship items off to S3 or fetch items from it. This is where this tool becomes useful. Without having to install anything onto the system, I can run a Docker command to get and put items from S3.

S3cmd is a powerful command line tool that allows you to perform various tasks against the Amazon S3 service. It is widely used to copy and upload objects to S3.

This Docker image is small (31MB) and is built with the Alpine Linux base image. You can find it on Docker Hub as garland/docker-s3cmd.


The documentation in the repository describes all of this but I give more details here:

Here is how to get items from S3:

If you have some files in S3 at s3://my.bucket/database and you want to copy that down to the local server to /opt/database, you can run this docker command to copy it.


docker run \
--env aws_key=${AWS_KEY} \
--env aws_secret=${AWS_SECRET} \
--env cmd=sync-s3-to-local \
--env SRC_S3=${BUCKET} \
-v ${LOCAL_FILE}:/opt/dest \
garland/docker-s3cmd

Here is how to put items to S3:

Now, let’s do the reverse.  I have a local folder named /opt/database and I want to copy that over to S3 to s3://my.bucket/database2


docker run \
--env aws_key=${AWS_KEY} \
--env aws_secret=${AWS_SECRET} \
--env cmd=sync-local-to-s3 \
--env DEST_S3=${BUCKET} \
-v ${LOCAL_FILE}:/opt/src \
garland/docker-s3cmd

Run in interactive mode and browse around S3:

You can also run it in interactive mode by using the s3cmd command just like you would on a command line.  This will allow you to browse around the S3 buckets and download and upload files.


docker run -it \
--env aws_key=${AWS_KEY} \
--env aws_secret=${AWS_SECRET} \
--env cmd=interactive \
-v /:/opt/dest \
garland/docker-s3cmd /bin/sh

You will now have to execute a script to set up the s3cmd config file with your keys. You just have to run:


Now you can use the s3cmd like normal.  For example, to get a listing of all of the buckets in S3:

s3cmd ls /

All of your files on the local machine have been mapped to /opt/dest inside the container.  You can find any files you put on S3 there.


AWS CLI in a Docker Container


Most of the systems I run now have Docker on them, and I try to install as little as possible on the base system. This blog post will walk you through how to use the container and two of the many useful commands available in the AWS CLI tool.

You can find an automated build of this container on Docker Hub as garland/aws-cli-docker. This Docker image is small (only 30MB) because it was built with the Alpine Linux base image.

Starting the Docker container

It is pretty simple to start the Docker container and get a shell:

docker run \
-it \
--env AWS_DEFAULT_REGION=us-west-2 \
garland/aws-cli-docker /bin/sh

Once you are in the shell, you can use any of the supported commands. For example, you can copy/upload items to S3, list EC2 instances, and start EC2 instances. A command list and the usage guide can be found here.

Copy files to S3

If you have a set of files on your local server that you want to copy over to S3,  you can use this tool to do that.  I’ve written instructions for copying over files in /opt/database to s3://garland.public.bucket/database below.

First, you need to restart the container, this time mapping the directory you want to copy over.

docker run \
-it \
--env AWS_DEFAULT_REGION=us-west-2 \
-v /opt/database:/opt/database \
garland/aws-cli-docker /bin/sh

This adds the Docker -v option which maps a path from your local server to inside the container.  The format is /local/path:/inside/container/path

Now that you are inside the container with a shell, you can execute this to copy the folder over.

aws s3 sync /opt/database s3://garland.public.bucket/database

The entire database folder has been uploaded to the garland.public.bucket bucket.

You can get the help pages for any level of the CLI. For example, you can type in aws help to open the help pages for the top level of the CLI. It will show you all of the AWS resources it can control. You can then delve deeper and find help for each resource. For example, type in aws s3 help to open a help menu specific to S3 tasks and usage.

Copy files to S3 – Automated

If you don’t want to do the copy in an interactive shell, you can run the container with the command in one line!

docker run \
--env AWS_DEFAULT_REGION=us-west-2 \
-v /opt/database:/opt/database \
garland/aws-cli-docker \
aws s3 sync /opt/database s3://garland.public.bucket/database

You’ll notice that the -it switch, the Docker switch for an interactive terminal, was removed. I also replaced the /bin/sh command with the S3 command from above. You can easily automate a task this way: work out the command in interactive mode first, then put it in a script or run it non-interactively like this.

Copy files to S3 – Automated and Background

Docker can do so much more than that though!  What if the copy (or any operation) takes a long time and you don’t want to hold up your current shell? You can easily background this task with Docker.

docker run \
-d \
--env AWS_DEFAULT_REGION=us-west-2 \
-v /opt/database:/opt/database \
garland/aws-cli-docker \
aws s3 sync /opt/database s3://garland.public.bucket/database

The only change I’ve made to the previous example is adding the -d switch.  This tells Docker to background the task.  Now, how will you get the output from that command?

When you ran the previous command, it returned an ID to you.  Copy that ID and run:

docker logs [ID HERE]

This will return all of the stdout from the AWS CLI command that ran.

Setting up FlowLog-Stats (2/2): Enabling permissions to read Flow Logs

This is the second of two blog posts on what you’ll need to do to set up FlowLog-Stats. This blog post outlines instructions for giving FlowLog-Stats read only access to your AWS Flow Logs and should take less than 10 minutes. To learn how to enable Flow Logs on AWS, please read more here.

These steps apply whether or not you are going to use FlowLog-Stats. If you wanted to enable AWS Flow Logs and give read permission to any application, this is how you would set it up.

These instructions outline one of the easiest methods for creating a machine user and granting this user rights to read the Flow Logs so that FlowLog-Stats can pull the data and process it.  By creating a user for only this purpose, you can audit this user and restrict permissions.  This step is optional – you can also just give FlowLog-Stats full access if you’d like.  Rest assured that our code base doesn’t do anything but read from the Flow Logs.

Step One (Optional): Create a machine user

Go to Service -> IAM to create a new user.  On the left hand side, click on Users.  Then near the top middle click on the Create New Users button.


Now enter in a user name (for example, machine.flowlog-stats).  Then click the Create button on the bottom right.


It will bring you to this screen. If you click Show User Security Credentials, you will see an Access Key ID and a Secret Access Key. Copy both down. This is the last time you can retrieve the secret key, because AWS will not show it to you again.


After you record the credentials, you can click on the Close link at the bottom. This will bring you back to the user list screen so you can give this user permission.

Step Two: Granting Read Only Access

On the user list screen, click on the machine.flowlog-stats user to see its details.


Click on the Attach Policy button.


In the search filter, put in CloudWatchLogsReadOnly


Click the check box and then click the Attach Policy button on the bottom right.


You’re done! Now this user has READ ONLY access to your CloudWatch Logs and FlowLog-Stats can create your daily dashboard. If you haven’t enabled Flow Logs on AWS, please read more here.

Setting up FlowLog-Stats (1/2): Enabling Flow Logs on AWS

This is the first of two blog posts on what you’ll need to do to set up FlowLog-Stats. This blog post gives instructions for enabling Flow Logs on AWS and should take less than 10 minutes. To learn how to give FlowLog-Stats read only access to these logs, please read more here.

Step One: Create a log group

In the Amazon AWS console, go to Services->CloudWatch. Then select Logs on the left hand side. Click on Action->Create log group. Give this log group a name, such as naming it after the VPC.


Step Two: Create a Flow Log

In the Amazon AWS console, go to Services->VPC and select the VPC. In the lower pane of the console, click on the Flow Logs tab. Then, click on the Create Flow Log button.

Note: Flow Logs are a VPC feature. They can be enabled for a whole VPC, a subnet, or an individual network interface within a VPC.


This brings up a dialog box for you to enter in the information about the Flow Log for this VPC.


Create a new role by clicking on the Set up permission link, which will open a new window.


After creating the role, go back to the Create Flow Log window or tab and select the role you just created. (Note: When you start typing in the role name, the role name will auto-populate.)

For the Destination log group type in the name of the log group you created in Step One.  Then click on the Create Flow Log button.

You’re done! You’ve now enabled Flow Logs for this VPC and it will start collecting metrics on the network flows going through this VPC. If you haven’t given FlowLog-Stats read only access to these logs, please read more here.



Viewing Flow Logs in the AWS Console

You can view the Flow Logs in the AWS console. You might have to wait a few minutes before the logs show up. Go to Services->CloudWatch and select Logs on the left-hand side. You will see the log group you created above. Click on it and you will see all of the network interfaces that are sending traffic.


You can click on one of these interfaces to see the logs.


Yeah, it is pretty hard to see what is going on in here. You can use the filter to narrow the view to an IP address or port, but you can only look at one interface’s traffic at a time.

If you are wondering what the fields are in each log entry, here is the documentation provided by AWS:

We have found the Flow Logs information very useful but the interface to the data not very good. If you want to analyze this data, you definitely cannot do it in the web interface alone. You almost always have to start writing programs using the AWS SDK to pull this information in, digest it, and then produce the analysis or reports you want. This is the very reason why we created FlowLog-Stats: it does all of the hard work for you. You just have to give it access.
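For a sense of what such a program looks like, here is a minimal Python sketch that reads every record in a log group across all interfaces. It is not FlowLog-Stats code. It assumes a boto3 CloudWatch Logs client is passed in and relies on the `filter_log_events` paginator; the function name, the log group name, and the region in the usage comment are placeholders.

```python
def iter_flow_log_messages(logs_client, log_group, start_ms=None):
    """Yield raw flow log lines from every stream (network interface) in a log group."""
    paginator = logs_client.get_paginator("filter_log_events")
    kwargs = {"logGroupName": log_group}
    if start_ms is not None:
        kwargs["startTime"] = start_ms  # milliseconds since the Unix epoch
    # Without logStreamNames, the call searches all streams in the group.
    for page in paginator.paginate(**kwargs):
        for event in page["events"]:
            yield event["message"]


# Usage, assuming boto3 is installed and credentials with read access are configured:
#
#   import boto3
#   client = boto3.client("logs", region_name="us-west-2")
#   for line in iter_flow_log_messages(client, "my-vpc-flow-logs"):
#       print(line)
```

Passing the client in as a parameter keeps the function easy to test with a stand-in object and leaves credential handling to the caller.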





FlowLog-Stats analyzes your AWS network data

Hello world and welcome to FlowLog-Stats! The team is putting the final touches on our product and is excited to start talking about FlowLog-Stats. Today, we’ll share what we do and how we do it.

What we do: transform a sea of network data from AWS into actionable graphs and charts

FlowLog-Stats enables you to troubleshoot and analyze your AWS cloud infrastructure. All you need to do is enable AWS Flow Logs (more info: Flow log Intro, Flowlog Docs) and give us read access to the VPCs you want us to analyze. In just two steps, you’re all set up!

Flow Logs are today’s equivalent of the NetFlow information that was traditionally collected at the network layer about your infrastructure’s traffic. When we all transitioned to the cloud, this information, which is used to monitor the health and security of your network, just wasn’t available anymore. In 2015, this changed when AWS released the Flow Logs feature, which provides network metadata such as source/destination IP, port, UDP/TCP, byte size, etc. for each EC2 instance through CloudWatch.


Unfortunately for us security analysts, AWS provides a really primitive interface to look at all this information.  There is an API to pull the information but no easy way to analyze it.


FlowLog-Stats takes this sea of data and provides you with useful charts and graphs about what is going on with your cloud infrastructure.

How we do it: the network visualization tools that FlowLog-Stats creates daily

Top Traffic Sources: top 10 lists by source, rejected traffic, and traffic volume. These are the default charts typically provided by other log analysis solutions.


Top Traffic Locations: infographic illustrating where the traffic in your network is coming from and going to.


Traffic Flow: visualize which IPs are talking to each other and how much they are talking to each other with this chord diagram.


Threat Identification: ranked list of the highest threat IPs that your network is talking to. These threatening IP lists are published and updated frequently; FlowLog-Stats continually checks these lists so you don’t have to.


Does this sound useful to you? What other information do you want FlowLog-Stats to provide? Please let us know in the comments!