Episode 335

From Paul's Security Weekly
Palo Alto Networks
Tenable Network Security
The SANS Institute
Pwnie Express
Black Hills Information Security

Episode Media

mp3 pt 1

mp3 pt 2

Announcements & Shameless Plugs

PaulDotCom Security Weekly - Episode 335 for Thursday June 13th, 2013

  • Defensive Intuition (the consulting arm of PaulDotCom Enterprises) and Black Hills Information Security have joined forces to offer what we dub "Badass Security from the Badlands". If you're tired of the pentest puppy mills and their copy-and-paste scan results in pentest reports, visit BlackHillsInfosec.com for all your training, Offensive Countermeasures, and assessment needs!
  • We are looking for sponsors for monthly webcasts in conjunction with SANS - contact paul -at- hacknaked.tv for details!
  • Tickets for the Security BSides Rhode Island two-day conference on June 14th and 15th are sold out. Limited walk-ins available. Featured presentations from the soon-to-be-famous but currently still-available-for-autographs Allison Nixon, as well as from Josh Wright, Kevin Finisterre, Kati Rodzon, Mike Murray, Bruce Potter, Joe McCray, Ron Gula, Ben Jackson, Dave Maynor, and the entire PaulDotCom crew!
  • The Stogie Geeks Show! - Kick some ash with the Stogie Geeks, Sunday nights at 8:30 PM EST. Come have a cigar with us! If you are in the Rhode Island area, please visit our sponsor, the Havana Cigar Club; it's an awesome place to have a drink! Make sure you print out your $5.00-off coupon here!

Special Segment with Dave "Rel1k" Kennedy: Connecting the Dots on Bypassing AV

Dave Kennedy is the CEO of TrustedSec, former CSO of a Fortune 1000 company, founder of DerbyCon, and creator of the Social-Engineer Toolkit and Artillery tools. Dubbed "The James Brown of InfoSec" for his work ethic, Dave *is* the nicest, coolest, bad-ass technical CEO on the planet*. Heavily involved with BackTrack and the Social-Engineer Framework, Dave works on a variety of open-source projects, such as AV Bypass. *He is also the subject of a man-crush by both Carlos and Mike Perez of our show.


We recently ran a Tech Segment on Bypassing AV by Chris Truncer. Dave Kennedy has agreed to come on to help clarify prior art, update us on his thoughts on the approach and automation done by Veil, and give us an update on DerbyCon.

CycleOverride with JP Bourget and Bruce Potter

We have JP Bourget and Bruce Potter on the show to announce their ball-busting ride across the USA, CycleOverride. CycleOverride is planning a series of rides over the coming years that revolve around information security and fundraising for organizations important to the infosec community. Support the EFF in support of CycleOverride.

Interview: Bill Stearns

Bill is a Security Analyst and Instructor for CloudPassage. He also serves as a content author and faculty member at the SANS Institute, teaching the Linux System Administration, Perimeter Protection, Securing Linux and Unix, and Intrusion Detection tracks. He was the chief architect of one commercial and two open source firewalls and is an active contributor to multiple projects in the Linux development effort. His spare time is spent coordinating and feeding a major antispam blacklist.

Bill's articles and tools can be found in online journals and at http://www.stearns.org.


  1. How did you get your start in information security?
  2. Has the use of passwords run its course? What are other options?
  3. What can you do to convince the average user to give up passwords?
  4. You've been a proponent of Linux for a while. Has Linux adoption slowed or should we consider Android and the upcoming Ubuntu OS as the Linux that finally infiltrated the masses?
  5. Tell us about the work you've been doing with the antispam blacklist. What's been most effective against spam? Is the answer to spam "internet registration" of your email address as some folks want (internet ID)?


Show Links:


Bill's Yubikey giveaway!

  • 50 keys
  • Honor system: you haven’t used one before and will give it a try.
  • One per person.
  • Send in your name, mailing address, and the name of an application that supports Yubikeys to: yubikey@stearns.org. US only; if you're outside of the US, order one from the Yubico store.
  • Keys will be mailed the week of June 24th
  • We'd love to hear what you did with it - send your write-up to yubikey -AT- stearns.org

Tech Segment: Phil Hagen on logstash

Philip Hagen started his security career while attending the US Air Force Academy, but later shifted to government contracting, providing technical services for exotic IT security projects. Most recently, Phil formed Lewes Technology Consulting, LLC, where he performs forensic casework and information security training, including the creation of a new SANS course: FOR572, Advanced Network Forensics and Analysis.


The practice of "network forensics" is getting broader. It now encompasses a wider range of investigative practices, not just low-level technical challenges.

Historically, a network forensicator might spend a lot of his or her time digging through pcap files. However, network captures are massive, and unless they were created at the time of an incident, they simply can't be re-created after the fact. Consider, though, that each network transaction touches many different systems, which often create invaluable log records that the forensicator can use to piece together an incident. A comprehensive investigation will incorporate these sources of evidence into the process. Just as system-level forensic examinations now include both disk and memory forensics, the network side must expand in scope as well.

A reasonably sized network environment might generate several thousand events per second across dozens or hundreds of systems and devices. Collecting this evidence from many sources (switches, routers, firewalls, servers, infrastructure devices, workstations) could quickly bog down an incident response team. After collecting it, the team would soon face countless log formats to learn and parse.

Technology has helped to solve this problem. The venerable syslog daemon has been a core part of UNIX-like operating systems for decades, and it allows real-time forwarding of syslog events to a central storage location. SIEM platforms also collect data in this fashion. However, commercial solutions tend to be expensive, inflexible, and/or lacking features that cater to incident responders and investigators. Fortunately, there are several great free and open-source options in this space, such as rsyslog and syslog-ng, as well as more "intelligent" and flexible tools such as Logstash and ELSA.


On this show, I'll talk about Logstash, and how it can be used to handle the kinds of data and use cases that a network forensicator, incident responder, or investigator may typically encounter. It's free and open-source, and is under active development at http://logstash.net/. If you're considering a solution like this, you should definitely evaluate the other tools, but I've really taken to Logstash for a few reasons we'll talk about.

Logstash is simply an "ingester". It reads data in from various sources, applies filters, and sends each filtered and formatted event to one or more destinations. The primary Logstash developer, Jordan Sissel, gave a great talk at PuppetConf in 2012. You can watch the video here (36 min) to get an idea of what it does and how it works. As Jordan explains, he built Logstash out of necessity, to support operations at a large web hosting company. I think the key features are its extensive array of inputs, filters, and outputs, which make Logstash a very attractive platform for a lot of different users. Although it's primarily designed to read live logs as they arrive, it can also ingest from existing files. This makes it ideal for a network investigation, since new logs may be found at any stage in the process. Logstash provides a *ton* of other input methods as well - various databases and log "shippers" on individual systems can all be enabled in parallel, creating a massive log vacuum.
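To make the input/filter/output flow concrete, here is a minimal configuration sketch. The port number and file path are placeholders of my own, not values from the show; you'd tailor them to the case at hand:

input {
  # Receive live syslog events forwarded from servers and network devices
  syslog {
    type => "syslog"
    port => 5514
  }
  # Also ingest an existing evidence file collected during the case
  file {
    type => "syslog"
    path => "/cases/example/evidence/named.log"
  }
}

filter {
  # grok, date, geoip, and other filters go here (examples below)
}

output {
  # Print parsed events to the console while testing the configuration
  stdout { }
}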

There are also a lot of output modes, but probably the most straightforward for a network forensicator or investigator to use is the built-in "ElasticSearch" indexer/database. This backend provides a highly-optimized storage engine for typical log data.
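As a sketch, sending parsed events to that backend is a single stanza in the output section. The "embedded" option shown here runs an ElasticSearch instance inside the Logstash process, which is convenient for a standalone investigation VM; check the option names against your Logstash version's documentation:

output {
  # Store every parsed event in ElasticSearch so it can be searched later (e.g., from Kibana)
  elasticsearch {
    embedded => true
  }
}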

Filters are where Logstash's power lies. Today, there are 29 different filters distributed by default. There are basics like grep and xml filtering, but also some that are very useful in our field. Date normalization is critical - and there is a dedicated plugin for just that function. There are also modules to provide IP address geolocation, data checksumming, and hash-based anonymization of data fields. These are all great, but perhaps the most supremely useful filter is the "grok" filter.
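Before turning to grok, here is a rough sketch of how a couple of those other filters look in a configuration file. The field names are placeholders of my own, and the option names are taken from the Logstash documentation and may vary between releases, so verify them against the version you're running:

# Normalize the event timestamp parsed out of the raw syslog line
date {
  match => [ "timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
}

# Attach geolocation data derived from a previously extracted IP address field
geoip {
  source => "src_ip"
}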

The "grok" parsing syntax takes most of the pain out of regular expressions, allowing human-readable field name assignments. Put simply: you can tokenize a known format of input data into usable fields. This helps in the analysis phase because you can use those field names and values to craft queries across the source evidence, extracting valuable leads and findings in record time. RegExp wizards may initially scoff at the idea of simplifying something as tried-and-true as the regular expression. However, I have found that grok still exposes that raw power under the hood, but in a way that allows standardization and repetition. Obviously, in this business, those are absolutely critical aspects of our business processes.

Logstash comes with a good crop of grok patterns, including numbers, IP addresses, MAC addresses, Windows UNC paths, URLs, and more. Here are a few that will give you a good idea of how grok patterns are constructed:

INT (?:[+-]?(?:[0-9]+))
NONNEGINT \b(?:[0-9]+)\b
WORD \b\w+\b
UUID [A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}
MAC (?:%{CISCOMAC}|%{WINDOWSMAC}|%{COMMONMAC})
COMMONMAC (?:(?:[A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2})
IP (?<![0-9])(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))(?![0-9])
URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?

You can see that they can be nested, as in the URI and MAC patterns. I didn't include all of the source patterns, but you can get the basic idea. Obviously, they're cryptic enough for the RegExp wizards, but they make that wizardry consistent and accessible to mere mortals.

Using grok to match against a string is pretty straightforward. The grok syntax is:

%{SYNTAX:SEMANTIC:TYPECONVERSION}

  • SYNTAX is a pattern name defined in the core set or in your own pattern database. When the SYNTAX matches, the grok succeeds and takes further actions, depending on how it was constructed. (More on this in a bit.) If it is part of a larger match pattern, the entire pattern must match for this boolean evaluation to succeed.
  • SEMANTIC is simply the name you want to assign to the matched value. Think of the SEMANTIC as the variable name and the SYNTAX match as its value. The SEMANTIC is optional - if omitted, the SYNTAX will just be used for matching, and nothing will be assigned.
  • TYPECONVERSION is also optional. By default, all matches are assigned as strings. However, you can specify "int" or "float" to cast the matched values accordingly. This allows mathematical comparisons against the data.
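For example, a hypothetical log fragment such as "transfer took 0.042 seconds, 5123 bytes" could be tokenized with:

%{WORD:action} took %{NUMBER:duration:float} seconds, %{INT:bytes:int} bytes

This yields a string field "action", a float field "duration", and an integer field "bytes", the latter two ready for numeric comparisons in later queries.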

Creating grok statements can be tricky - sometimes as tricky as raw RegExps. There is a *great* tool to help with this process, though. The Grok Debugger site allows you to enter input data and a grok string. In real-time, it indicates if the match was successful, and if so, what SEMANTIC assignments were made. It's an excellent resource that will save thousands of facepalms when creating grok patterns. Another clever feature on the site is "Grok Discover", which allows you to paste a raw string, and then determines what standard patterns can be used to match it.

After a successful grok match, there are a number of actions that can be taken. I won't be able to go into them all in this piece, but here is an example grok stanza that I created to match BIND query log records. In this line of work, DNS query logs can be a very useful source of evidence to identify what internal clients performed a given DNS lookup - for example, a known command-and-control hostname. The records look like this in syslog:

Jun 9 14:20:11 muse named[922]: client 10.3.16.11#46714: query: mirrors.kernel.org IN A + (10.3.58.10)
Jun 9 14:20:11 muse named[922]: client 10.3.16.11#59038: query: centos.tcpdiag.net IN A + (10.3.58.10)
Jun 9 14:20:11 muse named[922]: client 10.3.16.11#40392: query: mirror.wiredtree.com IN A + (10.3.58.10)
Jun 9 14:20:11 muse named[922]: client 10.3.16.11#44702: query: mirror.trouble-free.net IN A + (10.3.58.10)
Jun 9 14:20:11 muse named[922]: client 10.3.16.11#34970: query: bay.uchicago.edu IN A + (10.3.58.10)
Jun 9 14:20:11 muse named[922]: client 10.3.16.11#48844: query: centos.supsec.org IN A + (10.3.58.10)
Jun 9 14:20:11 muse named[922]: client 10.3.16.11#60222: query: archive.cs.uu.nl IN A + (10.3.58.10)

I stripped the leading syslog header (through and including the process name and PID) off using a previous grok statement from the Logstash Cookbook. This left the remaining part of each line, starting with "client", for me to manually parse. Here is the section I created:

grok {
  type => "syslog"
  pattern => [ "client %{IPORHOST:dns_client}#%{POSINT:dns_sourceport}: query: %{HOST:dns_query} %{NOTSPACE} %{NOTSPACE:dns_rectype}" ]
  add_tag => "got_dnsquery"
}

The "type" was assigned upon ingest, and that part of the stanza ensures this grok won't be applied to any other data source. The "pattern" contains the grok syntax to be applied. Finally, if successful, Logstash will assign a tag of "got_dnsquery" to the record before proceeding through the rest of the config file. These tags are arbitrary string assignments, and will be inserted to the database when the parse is completed. When performing analysis through a front-end, we can use it as a basis for the query. "Find all records tagged with "got_dnsquery" that contain the field "dns_query" with a value of "krmiakrd.co.cc" for example. The tags can also be used for later processing directives in the Logstash config file - replace the "dns_client" field with its SHA1 hash, but only for events that contain a tag of "got_dnsquery" might be a useful theoretical example. Cross-source queries could also be created. Upon identifying an IP address that is believed to belong to a compromised system, querying all of its DNS and proxy accesses is as easy as searching for records where the "dns_client" or "squid_client" fields are set to the suspect IP.

As you can see, parsing and tokenizing raw text can be a very useful capability for many situations in operations and in forensics. Making sense of free-form data allows us to quickly extract value from raw evidence. By using a tool like Logstash, a forensicator, incident responder, or investigator could build a library of common log formats, then ingest evidence data into an instance running on a VMware image, for example. This would allow case segregation and help ensure proper evidence handling, without requiring such a function to be deployed on your "everyday" system. With enough practice using grok and the other filter plugins, incorporating new data formats (using the debugger, of course) becomes a snap.

The core features that Logstash provides are extremely powerful. In just a few hours or days, you can create an instance that will ingest thousands of log events per minute. With tuning and appropriate hardware, that can grow to tens of thousands per second. All that is great, but without a way to query the data, we're not getting very far, are we?

Kibana web interface

Although Logstash is distributed with an integrated web front-end, I don't recommend using it. It's a bit clunky, and there are great free alternatives. I have been quite impressed with Kibana. It took somewhere around 30-60 seconds to install and configure. If you've ever used a SIEM or a web front-end to a log aggregator before, you'll be immediately comfortable with Kibana. As you can see in the screenshot below, there is a search form with time window selection at the top, a field list on the left side, and a histogram with raw data as the main section.

Event details for a Snare-relayed Windows Eventlog record

Clicking on any event displays all fields and other values that were parsed from the original log event. Here is an example of a Windows event that was delivered via the Snare syslog client and parsed with a grok pattern that I created.

Popup window detailing breakdown of HTTP return codes from a Squid proxy log

The field list displays all fields, tags, and other assigned values that are present in data returned by the search. Clicking any of these displays a pop-up with values that you can include/exclude from the search with the magnifying glass or "no" symbol, respectively.

From the Kibana interface, you can export data to CSV, watch a dashboard-like stream of new results as they arrive (if you're ingesting live log files, of course), and interact with your log data in many useful ways.

DNS query results for the 10.3.16.11 client

For example, let's say you need to know all DNS lookups that the "10.3.16.11" IP made between 0600 and 0630 on June 9th. Simply use the following search specification:

@tags:got_dnsquery dns_client:10.3.16.11 AND @timestamp:["2013-06-09T06:00:00Z" TO "2013-06-09T06:30:00Z"]

It doesn't take more than an instant to plunder the available data and see the results shown in the screenshot.

This would be exceptionally handy if looking for DNS lookups for known malware C2 hostnames.

DHCP ACK messages for the 10.3.59.43/floor3-PC host

Another common situation is establishing the periods of activity for a certain system on the network. Since most environments are configured with DHCP, we have logs associating MAC addresses with the times they were active on the network. Since the MAC address is often, but not always, a unique value traceable to a specific piece of hardware, this can be invaluable in establishing a rough timeline for a subject's activity. I wrote a quick grok pattern to pull the useful fields from these messages, which made the following query available:

dhcpmessagetype:"DHCPACK" AND @fields.hwaddr:"08:00:27:ad:38:ca"

The results show when the system with that particular MAC address was active. I've selected that system's assigned IP addresses, and, in this case, the hostname values were available as well. The IP address would then be useful in searching additional evidence that only logged IP addresses, but not MACs. In this case, there was just one IP assignment throughout the system's activity. If there were more, the statistical popup window would provide a quick view into the intersection of additional data fields. Exporting these values to a CSV file would be a quick and easy way to start or add to an activity timeline.
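For reference, my actual DHCP pattern isn't reproduced above, but as a hypothetical sketch, a grok stanza for ISC dhcpd DHCPACK syslog lines (for example, "DHCPACK on 10.3.59.43 to 08:00:27:ad:38:ca (floor3-PC) via eth0") might look something like this - the field names other than "dhcpmessagetype" and "hwaddr" are my own placeholders:

grok {
  type => "syslog"
  pattern => [ "%{WORD:dhcpmessagetype} on %{IP:dhcp_assigned_ip} to %{MAC:hwaddr} \(%{DATA:dhcp_hostname}\) via %{WORD:dhcp_interface}" ]
  add_tag => "got_dhcp"
}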

Even outside the network domain, such a tool could make our lives much easier. Although a grok pattern for log2timeline forensic data doesn't exist yet (that I know of), creating one would not be difficult and could provide cross-system artifact correlation and other helpful benefits. As the amount of source evidence and other data increases, we'll need to incorporate and adapt smart tools like Logstash into the forensic and investigative workflows.


Announcement

Stories

Paul's Stories

Larry’s Stories

They found my beer fridge - [Larry] - Neat. Doing RF analysis to hunt down interference to help others with their service. In one case it found a poorly behaving beer fridge... This is also neat for trying to find bad insulators on transmission and distribution power lines, as the small arcs create low-band AM RF signals - just like the old ham radio transmitters, spark-gap transmitters! Yes, more SIGINT stuff.

Tracking radio signals (http://www.engadget.com/2013/06/13/traq-quadricopter-traces-source-radio-signals/) - [Larry] - This comes on the heels of Brad's recent interview on SIGINT. This might be nice to use to find rogue non-WiFi signals in your environment that are bypassing your DLP. Light details on the project, but it seems like a good start. I think a swarm of drones would be helpful here.

PRISM (http://erratasec.blogspot.com/2013/06/nsa-is-wrong-not-evil.html) - [Larry] - Discuss.

Stealing User Certificates with Meterpreter Mimikatz Extension - [Larry] YAY CARLOS. Nice. I can see where I might like to steal user certs, for say wireless authentication that is based on user certs (instead of machine certs). I bet that this could be used for machine certs as well...

Jack’s Stories

  1. IE gets updates but not for the latest bug from Tavis Ormandy. Please sir, may I have more DRAMA?

Allison's Stories

Patrick's Stories