Tech Segment: Metagoofil: Google, Document Metadata and You
A little about Metadata
Metadata is, in a nutshell, additional "hidden" data embedded in many formatted documents (PDF, DOC, JPG). Much of it is added to the document by default with no interaction from the end user, and most users don't even know that it is there. In most cases, metadata is not revealed to the user during document creation or printing.
Typically, metadata contains information about the program that created the file, including name, type, version, user and licensee, potentially some keywords describing the document's contents, and even data deleted during revisions!
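To make this concrete, here is a minimal Python sketch that pulls a few fields out of a PDF's document information dictionary. The sample bytes and the `pdf_info` helper are made up for illustration, and the regex only handles simple, uncompressed literal strings, so treat it as a demonstration of where the data lives rather than a real parser.

```python
import re

# Keys commonly found in a PDF document information dictionary.
INFO_KEYS = ("Author", "Creator", "Producer", "Title")

def pdf_info(raw):
    """Return simple literal-string Info entries found in raw PDF bytes.

    Hypothetical helper: it ignores escaped parentheses, hex strings
    and compressed object streams.
    """
    found = {}
    for key in INFO_KEYS:
        # Matches entries shaped like: /Author (some text)
        match = re.search(rb"/" + key.encode() + rb"\s*\(([^()]*)\)", raw)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found

# Made-up fragment standing in for a downloaded document.
sample = (b"1 0 obj\n<< /Author (jsmith) /Creator (Microsoft Word 10.0) "
          b"/Producer (Acrobat Distiller 6.0) >>\nendobj\n")
info = pdf_info(sample)
print(info)
```

Even this toy fragment gives up a likely username, the authoring application and its version.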
So, you may be asking why this is important to either an attacker or a pen tester. There are a number of applications on both sides of the fence. Metadata can:
- Reveal the creator of a document, and even a possible network username. This is half of the challenge of a brute force password attempt. Even with just a name, one could potentially derive a username, based on a few common methods.
- Reveal the application that created the document. This may be helpful in determining appropriate client side attacks.
- Reveal the version of the software that created the document. This can narrow down the attack even more.
- Reveal creation date. This can be helpful in determining how relevant an attack may be. Oh, the document was only created 2 days ago, with an old vulnerable version? How convenient!
So, now we have a name, a possible username, the application used by that individual and the software version. Now we can search available exploits and deliver a directed attack against something that we can reasonably assume is installed on that individual's workstation!
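As a rough illustration of deriving a username from a name, the Python sketch below generates a handful of candidates. The `username_candidates` function and the specific patterns are my own assumptions about common naming conventions, not a list taken from any particular tool.

```python
def username_candidates(full_name):
    """Build a few guessable usernames from a person's full name.

    Hypothetical helper; the patterns are assumed conventions.
    """
    parts = full_name.lower().split()
    if not parts:
        return []
    first, last = parts[0], parts[-1]
    return [
        first + last,        # first name followed by last name
        first + "." + last,  # dotted form
        first[0] + last,     # first initial plus last name
        first + last[0],     # first name plus last initial
        last + first[0],     # last name plus first initial
    ]

print(username_candidates("John Smith"))
```

Feeding the author names recovered from document metadata through something like this gives a short list to try against a login prompt.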
You may remember several months ago, we discussed a story that described some social engineering, and backtracking to a computer hacker through EXIF metadata included in a JPG...
Auditing your public Metadata exposure using Google
I've discovered this neat tool called Metagoofil, by Christian Martorella of Edge Security. It is a Python application that will query Google for several types of documents on a specified site, download them and examine them for metadata. Metagoofil does require 'extract', and there are some great instructions on how to install extract in the README.
Using Metagoofil is quite simple. We'll fire it off against a site that we want to audit, for all supported file types with:
./metagoofil -d domain.com -l 100 -f all -o domain.com.html -t domain.com-temp
Let's review the command line options:
-d: the domain we wish to search Google for
-f: filetype to examine/download (all,pdf,doc,xls,ppt)
-l: limit the number of results (the default is 100)
-o: the desired output file for metadata analysis, which will be in html format
-t: specifies the target directory to download files for analysis
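To give a feel for the kind of search involved, here is a small Python sketch that builds Google queries from a domain and file type using the `site:` and `filetype:` search operators. I have not checked that Metagoofil builds its queries exactly this way; `build_queries` and `SUPPORTED` are hypothetical names used only for this example.

```python
# File types listed for the -f option above.
SUPPORTED = ("pdf", "doc", "xls", "ppt")

def build_queries(domain, filetype="all"):
    """Return one Google query string per requested file type.

    Hypothetical helper, not Metagoofil's actual code.
    """
    types = SUPPORTED if filetype == "all" else (filetype,)
    return ["site:%s filetype:%s" % (domain, t) for t in types]

for query in build_queries("domain.com"):
    print(query)
```

You can paste any of these into Google by hand to get a quick preview of what documents a site is exposing.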
Once metagoofil is complete, open the output html file, and review the results. You may be surprised what you see! Now go address any potential problems...
Please note that the downloads that take place with this tool are DIRECT CONNECTIONS from the box running it to the site in question. One could certainly tunnel this app through something...
Now, this tool isn't perfect, but it is an awesome start. I'd love to see some additional file type support (such as jpg for exif data). It is in Python, so it can be modified, and I'm going to see if I can contribute back on this one. I'm also going to look into adding proxy support so that it can use Tor.
In a couple of weeks (maybe next week?) we'll be discussing some tools to help mitigate metadata exposure on some common file types.
Stories For Discussion
Oracle's stupid programming errors - [Larry] - After the upgrade to 11g, Alexander Kornbrust said he's found a bunch of problems that allow bypass of the security features - most related to stupid programming errors. This, from the "most secure database" on the planet. Hrm. I thought some of the numbers from the article on patching were astounding:
"Citing the example of one German company that has 8,000 Oracle databases, Kornbrust said rolling out a single patch can require 32,000 hours of labor, or four hours per database. That translates into 60 full-time database administrators and doesn't take into account the time and expense required for testing the patch on each database..."
Let's talk about some of the challenges, and how we resolve some of them...
Security Cartoon: "It's okay to write passwords down and tape them on your router" - [PaulDotCom] - Okay, let's get one thing straight: it's not okay to tape a password to the device that is associated with that password. And, if you keep your password in your wallet, and someone steals your wallet, they can figure out your f-ing username in under 5 seconds!!!!!!!
Newsflash - "Dangling Pointers" have been a problem for quite some time - [PaulDotCom] - More technical information about the vulnerabilities can be found here and we discussed it in the [Interview with Ivan and Futo http://pauldotcom.com/wiki/index.php/Ivan-FutoInterview], who referenced a vuln/exploit from the early 90s. Languages with garbage collection tend not to be vulnerable to these exploits, since memory is not freed while something still references it.
Finding Situational Awareness - [Larry] - I think that situational awareness is a good thing to talk about for a pen tester and from a security perspective. Mrs. Santarcangelo tells us a story about where she discovered it was important to her, personally. I think that we can take the same concepts and apply them to security...
Threatstop DNS blocklist - [PaulDotCom] - I don't like this one bit. I give my firewall a DNS name, which resolves to multiple IP addresses, which I then block. What happens if my DNS server gets ARP cache poisoned? Also, Richard Bejtlich points out that the list is long and exceeds the 512 byte max for UDP DNS messages, so the lookup switches to TCP. If you are blocking DNS over TCP, you won't be, well, blocking anything.
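For a rough sense of how quickly a list hits that 512 byte ceiling, here is a back-of-the-envelope Python sketch. It assumes plain DNS over UDP without EDNS0, a response carrying only A records with compressed names, and no authority or additional sections. The hostname `blocklist.example.com` is a made-up placeholder.

```python
DNS_HEADER = 12   # fixed DNS header size in bytes
UDP_LIMIT = 512   # classic DNS-over-UDP message limit (no EDNS0 assumed)
# One A record: 2-byte compressed name pointer, 2 type, 2 class,
# 4 TTL, 2 rdlength, 4 bytes of IPv4 address.
A_RECORD = 2 + 2 + 2 + 4 + 2 + 4

def question_size(name):
    """Bytes used by the question section for the given hostname."""
    labels = name.rstrip(".").split(".")
    # Each label costs its length plus one length byte, a zero byte
    # ends the name, then QTYPE and QCLASS take 2 bytes each.
    return sum(len(label) + 1 for label in labels) + 1 + 4

def max_a_records(name):
    """How many A records fit before the response must go to TCP."""
    return (UDP_LIMIT - DNS_HEADER - question_size(name)) // A_RECORD

print(max_a_records("blocklist.example.com"))
```

Under those assumptions only a few dozen addresses fit in one UDP response, so any serious blocklist is going to need TCP.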
Tor Exit nodes sniffed to reveal US Government passwords - [PaulDotCom] - I wonder if it smelled like onions? Seriously, whether you are using Tor, an open wireless connection or sitting in the comfort of your own home or office, ENCRYPT YOUR FREAKING PASSWORDS! SSL SMTP/POP/IMAP is not that hard, neither is SSH, use it.
Bastille Linux at new site - [Larry] - Those damned domain squatters took away Jay Beale's site. While Bastille-linux.org was always a redirector to the sourceforge project site, the new site is Bastille-unix.org. The domain squatter wants $10K for the domain.
Patriots Caught Spying on other team - [PaulDotCom] - Whatever happened to the good ole days of football, where they wore barely a helmet and beat the crap out of each other and the toughest team won? Not the case in today's NFL, where my favorite football team (okay, the best football team *ever*, f-the Dolphins) was caught using a video camera to obtain the other team's signals. This happens in our world all the time: hackers spy on us (there is no sharing of IDS signatures on UNISOG), and we spy on hackers (see the numerous cases where whitehats participated in online auctions of exploits). To me, this is just par for the course. While you may be able to obtain the other team's signals, what are you going to do with that information? Only the very best teams know how to interpret those signals and adjust their game play to win. Same as with our field...
Owning hospital computers - [Larry] - Whoa, so many concerns with the issues outlined in this article, so let's discuss. The part that really scares me is that this individual seems to be admitting to a "crime", and to not having permission.
"Hacking The White House" What? Hardly... - [PaulDotCom] - In this article the CSO of a wireless company that will remain unnamed does a war walk around the White House. He finds many wireless networks, duh. It does underscore many of the vulnerabilities that go unnoticed in many organizations, such as 1) The printer that is plugged into the wired network and offers open wireless 2) Many organizations that still use WEP 3) People using an open wireless network, but still plugged into their own organization's wired network. All food for thought...
Ethics and Policy - [Larry] - This is why, in my opinion, I agree with Mr. Northcut. Basically, it boils down to security professionals being held to a code of conduct. A sec. pro. may say child porn is bad, and so does policy, so they send the violators up the river. But while policy may also say unlicensed software is prohibited, sec. pros. may turn a blind eye for a while, or forever, depending on the user, application and so on.
US-CERT warns of insecure cookies - [PaulDotCom] - Uhm, did they make the same announcement when Dug Song released Dsniff, which came with a tool called webmitm that "transparently proxies and sniffs HTTP / HTTPS traffic redirected by dnsspoof(8), capturing most "secure" SSL-encrypted webmail logins and form submissions"? Hamster claims: "Sidejacking is the process of sniffing cookie information, then replaying them against websites in order to clone a victim’s session. We use the term “sidejacking” to distinguish this technique from man-in-the-middle hijacking." I fail to see how this is a revolutionary technique. Point and click, big whoop. And I have to use Windows? And no source code? Yuk... see the next story for something better.
Wifizoo - Not just a hamster on the wheel - [PaulDotCom] - Now, here is something we can work with. It does the same thing as dsniff and ferret, except its source is available, it runs on Linux, and it uses standard open libraries such as scapy. What a great kismet plugin this would make!
MS changes files without notification - [Larry] - Even with Windows Update turned off, MS delivered an update. This scares me! Why? Well, what about all of those machines (in networks that are maybe not as secure as they should be) connected to medical equipment, controller equipment, security devices... What happens when one of these updates crashes and causes issues for a patient or system, or causes a security breach?
Other Stories Of Interest
RFID implants linked to cancer - [Larry] - Heh, cancer in lab animals. Oxygen causes cancer in lab animals! Oh noes, doez I haz teh cancurz?
Porn industry hard up for solutions to piracy problem - [PaulDotCom] - So, not only did they use "hard" and "porn" in the same sentence, get this: "the industry could start to pair pirateable material—the movies—with nonpirateable material, such as t-shirts and other items to make legit sales more attractive." I want my Jenna Jameson bottle opener, oh, you know how it works, right now! Where can I order it! :)