One of the many challenges in information security is collecting, managing, and applying threat intelligence. Typically, threat intelligence comes from a variety of disparate sources, such as IDS rules (Sourcefire / Emerging Threats), server/application logs, historical breach data, private/public feeds, security appliances…the list goes on. The difficulty lies in normalizing this data and turning it into actionable intelligence that can be effectively used across an enterprise. In my own experience, I primarily relied on IDS rules and data pulled from log analysis. But even then, I never had a nice method of maintaining and distributing this intelligence.
Collective Intelligence Framework
Enter the Collective Intelligence Framework (CIF). I’ve read a couple of positive articles about CIF over the last year and was excited to finally give it a shot. In short, CIF aims to solve the problem described above. It’s an open source threat intelligence management system that parses, normalizes, processes, stores, queries, shares, and produces threat intelligence. It ships with several feeds and lets you add or create your own. Additionally, CIF sorts the intelligence data into various “assessments” (botnet, malware, phishing, etc.) and classifies intelligence by “confidence” levels (95+ is the most reliable data, 85+ is very reliable, 75+ is somewhat reliable, etc.).
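To make the assessment and confidence idea concrete, here is a minimal Python sketch of filtering intelligence records on both values. The record layout (dicts with `address`, `assessment`, and `confidence` keys), the sample data, and the function name are all assumptions made for illustration; this is not CIF’s actual schema or API.

```python
from typing import Dict, List

# Hypothetical records; the field names are assumed for illustration only.
RECORDS: List[Dict] = [
    {"address": "bad.example.com", "assessment": "malware", "confidence": 95},
    {"address": "phish.example.net", "assessment": "phishing", "confidence": 85},
    {"address": "maybe.example.org", "assessment": "malware", "confidence": 65},
]


def filter_records(records: List[Dict], assessment: str, min_confidence: int) -> List[Dict]:
    """Keep records with the given assessment at or above the confidence floor."""
    return [
        r for r in records
        if r["assessment"] == assessment and r["confidence"] >= min_confidence
    ]


if __name__ == "__main__":
    # Only malware entries with a confidence of at least 85 survive.
    for rec in filter_records(RECORDS, "malware", 85):
        print(rec["address"], rec["confidence"])
```

Raising or lowering the floor is the whole trade-off: a higher value gives fewer false positives but a shorter list.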
The installation process is fairly tedious and will take some time. There are a lot of packages to install and configuration changes to make before you even reach the classic “configure && make install”, and even after that there’s still a bit more to do. For the most part, you should be able to follow the instructions on the CIF site. I went with an Ubuntu 12.04 x64 VM for my test server.
Be warned, the system requirements listed are no joke. I figured I could get away with a low-powered VM with 1 CPU core and 1 GB RAM on my laptop, but quickly found my keyboard burning and my fans spinning. I ended up having to create another VM on my home server and give it 2 CPU cores and 4 GB RAM. If you look at the system requirements for a “Small Install”, you’ll find that this still doesn’t meet the recommended specs! Even so, this setup seems to run fine with all the default intel sources and feeds.
At its most basic, CIF can be used as a threat intelligence database that can be easily queried. You can query for things like IP addresses, domain names, and MD5 hashes. There are two ways to query CIF: the native client or the browser plugin. Let’s take a look at some examples using both.
Query for a malicious domain with a confidence level of 85
Query for a malicious MD5 hash
A couple of notes on these queries. First, in the native client queries, the “-f” argument was used only to fit the results on my screen. It lets you specify which fields to print, but it’s not needed when running queries normally.
Second, you might’ve noticed that each time a query is made, it is automatically logged into the database with an assessment value of “search” and a description value of “search <query>.” At first I found this a bit noisy, as it clutters the results. However, I believe CIF does this to help you track your own and your team’s queries. That way you could easily say, “Oh hey, I searched for this before…X months ago, and it looks like it’s come up again.” Just a guess. Either way, you can exclude these entries from your results by adding “-e search” to the query command.
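If you’re post-processing query results yourself, the same separation is easy to do in code. The following is a minimal Python sketch that splits logged searches from real intelligence records. The record layout (dicts with `assessment`, `description`, and `detecttime` keys) and the function name are assumptions for illustration, not CIF’s real output format.

```python
from typing import Dict, List, Tuple


def split_search_entries(records: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Return (searches, intel): logged searches first, everything else second."""
    searches = [r for r in records if r.get("assessment") == "search"]
    intel = [r for r in records if r.get("assessment") != "search"]
    return searches, intel


if __name__ == "__main__":
    # Hypothetical results for one query; field names are assumed.
    results = [
        {"assessment": "search", "description": "search example.com", "detecttime": "2013-01-05"},
        {"assessment": "malware", "description": "malware domain", "detecttime": "2013-02-11"},
        {"assessment": "search", "description": "search example.com", "detecttime": "2013-06-20"},
    ]
    searches, intel = split_search_entries(results)
    print("previously searched on:", [s["detecttime"] for s in searches])
    print("intel records:", len(intel))
```

Keeping the search entries around rather than discarding them preserves the “have we looked this up before?” history while still giving you a clean list of actual intelligence.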
Here’s where CIF goes beyond a simple threat intelligence repository. If you enable feed generation (which you’ll want to do!), CIF can generate an entire feed of actionable data. This means that, rather than simply querying to see if a URL is malicious, you can generate a whole feed of malicious URLs and hand it to your security appliances so they can detect and respond accordingly. For example, you could generate a list of malicious IP addresses and create iptables rules or Snort rules to block and detect any related activity. Let’s take a look at some examples.
Malware IP Addresses
Are you starting to see how powerful CIF is? You could have a cron job generate new intelligence feeds daily and push iptables rules to your public servers and Snort rules to your IDS automatically. Very cool.
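CIF can emit these formats itself, but to show what the cron job idea boils down to, here is a minimal Python sketch that turns a plain list of addresses into iptables commands and Snort rules. The function names, the rule message text, and the starting SID are assumptions of mine; the input is assumed to be one IPv4 address or CIDR block per entry, and invalid entries are skipped.

```python
import ipaddress
from typing import Iterable, List


def _ipv4_networks(addresses: Iterable[str]) -> List[str]:
    """Validate entries and return them in normalized CIDR form."""
    valid = []
    for raw in addresses:
        try:
            net = ipaddress.ip_network(raw.strip(), strict=False)
        except ValueError:
            continue  # not an IP address or CIDR block, skip it
        if net.version == 4:  # iptables is IPv4 only; IPv6 would need ip6tables
            valid.append(str(net))
    return valid


def iptables_rules(addresses: Iterable[str]) -> List[str]:
    """Build one DROP rule per address for inbound traffic."""
    return [f"iptables -A INPUT -s {net} -j DROP" for net in _ipv4_networks(addresses)]


def snort_rules(addresses: Iterable[str], first_sid: int = 1000001) -> List[str]:
    """Build one alert rule per address; SIDs start in the local-rules range."""
    rules = []
    for offset, net in enumerate(_ipv4_networks(addresses)):
        rules.append(
            f'alert ip $HOME_NET any -> {net} any '
            f'(msg:"Traffic to listed malware address {net}"; '
            f"sid:{first_sid + offset}; rev:1;)"
        )
    return rules


if __name__ == "__main__":
    feed = ["203.0.113.7", "not-an-ip", "198.51.100.0/24"]
    print("\n".join(iptables_rules(feed)))
    print("\n".join(snort_rules(feed)))
```

Validating every entry with `ipaddress` before building a command matters here, since the output is meant to be run as root on public servers.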
What intrigues me the most, though, is CIF’s ability to generate Bro intel. Bro is an amazing protocol-aware network analysis tool. I highly recommend trying it out if you haven’t already (it’s free!). Bro’s recently introduced intel framework is a great way to leverage CIF’s intelligence data. You’ll need to configure Bro to enable the framework and read in the intel. Let’s see some examples of what this intel looks like.
Malware IP Addresses – Bro
Malware Domains – Bro
Botnets – Bro
Malware MD5s – Bro
Looking at the Bro Perl plugin, it doesn’t look like CIF supports MD5 output for Bro. Bro already has the ability to look up hashes in the Team Cymru malware database, but it’d be cool to feed it hashes from CIF. You could either update CIF’s Bro output plugin (located at /usr/local/share/perl/5.14.2/Iodef/Pb/Format/Bro.pm on my setup) or output the data in some other format and write your own script to parse and format it for Bro (likely what I’d do given my rusty Perl skills).
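As a rough idea of what that second option could look like, here is a minimal Python sketch that turns a list of MD5 hashes into lines for a Bro intel file. It assumes the tab-separated intel file layout with a `#fields` header and the `Intel::FILE_HASH` indicator type (so a Bro version whose intel framework supports file hashes); the function name and the `source`/`desc` values are placeholders of mine, and getting the hashes out of CIF is left out entirely.

```python
import re
from typing import Iterable, List

MD5_RE = re.compile(r"^[0-9a-f]{32}$")

# Column header expected at the top of a Bro intel file (tab-separated).
HEADER = "\t".join(["#fields", "indicator", "indicator_type", "meta.source", "meta.desc"])


def bro_intel_lines(hashes: Iterable[str], source: str = "CIF", desc: str = "malware") -> List[str]:
    """Return intel file lines for every well-formed MD5 hash in the input."""
    lines = [HEADER]
    for raw in hashes:
        digest = raw.strip().lower()
        if not MD5_RE.match(digest):
            continue  # skip anything that is not a 32-character hex digest
        lines.append("\t".join([digest, "Intel::FILE_HASH", source, desc]))
    return lines


if __name__ == "__main__":
    sample = ["D41D8CD98F00B204E9800998ECF8427E", "definitely-not-a-hash"]
    print("\n".join(bro_intel_lines(sample)))
```

Writing the result to a file and pointing Bro’s intel framework at it would be the remaining step; Bro is picky about the fields being separated by real tabs, not spaces.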
You could complete the feedback loop by having a script parse Bro logs for notable events, such as a high number of failed SSH logins from a given IP address, strange URLs, etc., and then send this data back into CIF. From there, the intelligence feeds can be refreshed and distributed back out to your critical servers and security appliances. Pretty sweet.
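The log-parsing half of that loop might look something like the following minimal Python sketch. It reads Bro’s tab-separated log format (using the `#fields` header line to name the columns) and counts matching rows per source address. The column holding the login result differs between Bro versions, so the field name and value to match are parameters; the `status` / `failure` pair and the sample log below are assumptions. Submitting the results back into CIF is not shown.

```python
from collections import Counter
from typing import Iterable, List


def count_by_source(log_lines: Iterable[str], match_field: str, match_value: str,
                    source_field: str = "id.orig_h") -> Counter:
    """Count rows where match_field == match_value, grouped by source address."""
    fields: List[str] = []
    counts: Counter = Counter()
    for line in log_lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]  # column names follow the tag
            continue
        if not line or line.startswith("#") or not fields:
            continue  # other header lines, blanks, or data before the header
        row = dict(zip(fields, line.split("\t")))
        if row.get(match_field) == match_value:
            counts[row.get(source_field, "-")] += 1
    return counts


def noisy_sources(counts: Counter, threshold: int) -> List[str]:
    """Return addresses seen at least `threshold` times, sorted."""
    return sorted(ip for ip, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Simplified, made-up log excerpt; a real ssh.log has more columns.
    sample = [
        "#separator \\x09",
        "#fields\tts\tid.orig_h\tstatus",
        "1380000000.0\t203.0.113.7\tfailure",
        "1380000001.0\t203.0.113.7\tfailure",
        "1380000002.0\t198.51.100.9\tsuccess",
        "1380000003.0\t203.0.113.7\tfailure",
    ]
    failures = count_by_source(sample, "status", "failure")
    print(noisy_sources(failures, threshold=3))
```

The threshold is the knob to tune: too low and you feed your own typos into the intelligence database, too high and slow brute-force attempts slip through.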
In my short time with CIF, I’ve been impressed by its ability to parse, normalize, and distribute intelligence data. I’m hoping to spend more time playing with it so I can better understand how to fit it into my own workflows. There’s a lot of potential in this tool and I hope more people use and contribute to it. It’s another example of great free and open source security software that’s readily available for security teams to take advantage of.