Metadata can make the difference between success and failure on a penetration test. These small bits of information embedded in externally facing documents, usually in the form of usernames and software versions, are invaluable to a penetration tester. Most modern productivity software automatically inserts this information into documents to support features such as collaboration. However, if it is not removed before a document is published to a website, metadata can put an organization at risk.
Several strategies can be used to locate office documents belonging to a particular company. “Google hacking” is a popular technique for quickly finding files from a particular organization by using search directives such as “site:” and “filetype:”. Once the search results are returned, the tedious task of manually downloading each file and checking it for metadata begins. This process can take a long time. Thankfully, there is a more efficient solution.
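As a rough illustration of the search directives described above, the following Python sketch builds one query per document type for a target domain. The function name, the list of file types, and the example.com domain are assumptions made for this example; they are not part of any particular tool.

```python
from urllib.parse import quote_plus

# File types that commonly carry author and software metadata
# (an illustrative list, not an exhaustive one).
DOC_TYPES = ["pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx"]


def build_queries(domain, filetypes=DOC_TYPES):
    """Return one search query per file type for the given domain."""
    return [f"site:{domain} filetype:{ext}" for ext in filetypes]


def encode_query(query):
    """URL-encode a query so it can be placed in a search URL parameter."""
    return quote_plus(query)


if __name__ == "__main__":
    # example.com is a placeholder domain.
    for query in build_queries("example.com"):
        print(query, "->", encode_query(query))
```

Each query can be pasted into a search engine as is. The encoded form is only needed when the query is placed in a URL.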
Most pentesters’ tool of choice for locating metadata during the reconnaissance phase of a pentest is FOCA. FOCA (Fingerprinting Organizations with Collected Archives) is an amazing tool with several features, but obtaining and analyzing metadata is where it really shines. This free tool locates documents using search directives, as mentioned above, and builds a list of every document it finds on a particular site. From there, downloading every file and extracting its metadata takes only a few clicks. The results include the usernames, folder paths, software, and other details found in each of the documents.
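To make the idea of extracting metadata concrete, here is a minimal Python sketch that reads the author and software fields from a single Office Open XML file (.docx, .xlsx, or .pptx), which is a ZIP archive containing docProps/core.xml and docProps/app.xml. This is not how FOCA itself is implemented; it uses only the standard library, the function name is hypothetical, and it does not handle legacy .doc files or PDFs.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Office Open XML property parts.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def _text(root, path):
    """Return the text of the first matching element, or None if absent."""
    node = root.find(path, NS)
    return node.text if node is not None else None


def read_ooxml_metadata(path):
    """Collect author and software fields from an OOXML file (path or file object)."""
    meta = {}
    with zipfile.ZipFile(path) as zf:
        names = set(zf.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(zf.read("docProps/core.xml"))
            meta["creator"] = _text(core, "dc:creator")
            meta["last_modified_by"] = _text(core, "cp:lastModifiedBy")
        if "docProps/app.xml" in names:
            app = ET.fromstring(zf.read("docProps/app.xml"))
            meta["application"] = _text(app, "ep:Application")
            meta["app_version"] = _text(app, "ep:AppVersion")
    return meta
```

Calling read_ooxml_metadata("report.docx") on a downloaded file (the filename here is a placeholder) returns a dictionary whose values are None for any field the document does not contain.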
The most common way to log in to a system or application is to provide a username and password. After obtaining a username from the metadata of a document, a pentester has already won half of the battle. While guessing one specific user’s password is unlikely to succeed, it is far more likely that at least one user out of many has chosen a weak, guessable password. When a username is obtained through metadata, a pentester doesn’t just learn of one valid username; they learn the likely username schema used within the organization.
The most common username schema is the first initial of the first name followed by the last name, such as “jdoe” for a person named John Doe. Using this example, a pentester can take a list of common last names, prefix each with a letter, and use a single static password to attempt to log in to various external services hosted by the organization. Commonly, these will be Microsoft’s Outlook Web Access, Citrix, or a VPN solution. Because only one password is tried against each account, this approach also helps avoid account lockouts. With this attack, it is common to get a few successful logins, which is a significant advantage for a pentester.
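The username-generation step described above can be sketched in a few lines of Python. This example only builds the candidate list for the first-initial-plus-last-name schema; the function name and the sample last names are assumptions made for illustration.

```python
from string import ascii_lowercase


def candidate_usernames(last_names, initials=ascii_lowercase):
    """Prefix each last name with each initial, e.g. 'j' + 'doe' -> 'jdoe'."""
    seen = set()
    candidates = []
    for last in last_names:
        last = last.strip().lower()
        if not last:
            continue  # skip blank entries in the input list
        for initial in initials:
            username = initial + last
            if username not in seen:  # drop duplicates, keep first-seen order
                seen.add(username)
                candidates.append(username)
    return candidates


if __name__ == "__main__":
    # Sample last names; a real list would come from a common-surnames file.
    for name in candidate_usernames(["Doe", "Smith"], initials="js"):
        print(name)
```

With the default of 26 initials, each last name produces 26 candidates, so the list grows quickly. Restricting the initials to the most common ones keeps it manageable.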
The software and version information disclosed in metadata is useful to attackers and pentesters alike when performing client-side attacks. Knowing which software is in use within the organization helps determine which exploits are likely to succeed in a phishing attack. Adobe Reader and Microsoft Word are examples of software commonly exploited in client-side attacks. After determining the version of the software in use at an organization, a pentester can craft a malicious PDF or Word document designed to compromise a system or steal credentials once a user is convinced to open it. This document would then most likely be delivered to the user via an email-based phishing attack.
There are several tools available for sanitizing documents, but some productivity applications have this functionality built in. In Microsoft Word 2007, for example, just click the Office Button and go to Prepare > Inspect Document; in later versions, the same feature is under File > Info > Check for Issues > Inspect Document. From an organizational standpoint, it is recommended to develop policies and procedures for sanitizing documents and for validating that the metadata has been removed before documents are hosted online.
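The validation step can be automated as a check that runs before a document is published. The Python sketch below reports any author-related fields that still hold a value in an Office Open XML file (.docx, .xlsx, or .pptx). The function names and the choice of fields are assumptions for this example; a real policy would likely cover more fields and more file formats.

```python
import zipfile
import xml.etree.ElementTree as ET

CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"

# Fields in docProps/core.xml that typically reveal usernames.
SENSITIVE_TAGS = {
    "creator": f"{{{DC}}}creator",
    "lastModifiedBy": f"{{{CP}}}lastModifiedBy",
}


def leftover_metadata(path):
    """Return the author-related fields that still contain a value."""
    with zipfile.ZipFile(path) as zf:
        if "docProps/core.xml" not in zf.namelist():
            return {}
        root = ET.fromstring(zf.read("docProps/core.xml"))
    found = {}
    for label, tag in SENSITIVE_TAGS.items():
        node = root.find(tag)
        if node is not None and node.text and node.text.strip():
            found[label] = node.text.strip()
    return found


def is_safe_to_publish(path):
    """True when none of the checked fields contain a value."""
    return not leftover_metadata(path)
```

A publishing workflow could call is_safe_to_publish on each file and refuse to upload any document for which it returns False.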