In past presentations, I’ve discussed sourcing methods including robots, spiders and URL manipulation, and suggested that recruiters who use these methods sometimes face high legal risk. The courts interpret the Computer Fraud and Abuse Act (CFAA) on a case-by-case basis to determine when “accessing a protected computer” is considered “hacking.” The problem is that many lawyers and computer experts say the CFAA is outdated and overbroad in scope. In some cases, prosecutors go after minor uses of the Internet, like downloading lists or sharing information by email.
An important case pending in the US Court of Appeals for the Third Circuit deals with the use of automation to manipulate URLs and scrape email addresses, and Sourcers need to pay attention to it. The highly controversial ruling now on appeal tells us that, under the CFAA, the fact that a website is publicly accessible and does not require a password does not always mean you have permission to access it and use or collect the information you find.
In 2010, Andrew Auernheimer (a.k.a. Weev) found a security flaw in an AT&T server that allowed him to collect 114,000 email addresses belonging to iPad 3G users. Auernheimer and a fellow “hacker” created a tool to manipulate URLs and flood an AT&T website with made-up iPad IDs. When the tool correctly guessed an ID, the site displayed the owner’s email address and the tool scraped it. Only email addresses were obtained: names and passwords were not collected, and no accounts were actually accessed. Auernheimer turned over the scraped information to the gossip site Gawker, which posted some partially redacted addresses, prompting an FBI investigation. He was indicted in January 2011 and ultimately found guilty of (1) conspiracy to access a computer without authorization under the CFAA; and (2) fraud in connection with personal information under 18 U.S.C. § 1028(a)(7).
On March 19, 2013, he was sentenced to 41 months in prison followed by three years of supervised release. The court ordered him to pay $73,167 in restitution. Although an appeal has been filed, Auernheimer is currently serving out his sentence in federal prison.
Automation May be the Fatal Flaw
The recent CFAA cases have one thing in common: the defendants used some sort of unauthorized tool to exploit a website and pull out information in bulk. Most of them wrote their own scripts, but plenty of Firefox extensions allow people with very little technical knowledge to do the same things. One tool presented at Sourcecon in Fall 2011 is fully capable of doing what Auernheimer’s tool did when he collected iPad emails, with no coding required.
As those of you who have used tools like this know, most of them allow you to set a “click speed” or “download rate.” The makers include these settings because they know that a server is highly unlikely to flag a person who types in URLs and visits one page at a time, so slowing a tool down makes its traffic look like manual browsing. Don’t let this fool you. Know what your tools are doing: if a tool runs a guess-and-check brute-force attack, exceeds usage limits where they exist, spoofs URLs or in any other way “circumvents a technological barrier” to collect information, you may be subject to criminal prosecution under the CFAA.
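One way to “know what your tools are doing” is to look at the list of URLs a tool actually requested. The sketch below is only an illustration, written in Python: the function name audit_requests, the log format (a list of timestamp and URL pairs) and both thresholds are my own assumptions, not legal standards or anything defined by the CFAA or by a particular site. It flags two of the patterns described above: a request rate no person could produce by hand, and many different values tried for the same URL parameter (the guess-and-check pattern).

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def audit_requests(log, max_per_minute=30, max_values_per_param=20):
    """Return warnings about a tool's traffic.

    log is a list of (timestamp_in_seconds, url) pairs exported from the tool.
    Both thresholds are hypothetical defaults chosen for illustration only.
    """
    warnings = []

    # Pattern 1: far more requests per minute than manual browsing produces.
    if len(log) >= 2:
        times = sorted(t for t, _ in log)
        elapsed = max(times[-1] - times[0], 1.0)  # avoid dividing by zero
        rate = len(log) / (elapsed / 60.0)
        if rate > max_per_minute:
            warnings.append(
                f"rate: about {rate:.0f} requests/minute exceeds {max_per_minute}"
            )

    # Pattern 2: the same parameter on the same page tried with many values,
    # which is what a guess-and-check run looks like in a log.
    values = defaultdict(set)
    for _, url in log:
        parts = urlsplit(url)
        for name, value in parse_qsl(parts.query):
            values[(parts.netloc, parts.path, name)].add(value)

    for (host, path, name), seen in sorted(values.items()):
        if len(seen) > max_values_per_param:
            warnings.append(
                f"enumeration: {len(seen)} different values tried for "
                f"'{name}' on {host}{path}"
            )

    return warnings


# Example: 50 lookups in about five seconds, each guessing a different ID.
guessing_log = [
    (i * 0.1, f"https://example.com/lookup?id={1000 + i}") for i in range(50)
]
for warning in audit_requests(guessing_log):
    print(warning)
```

An empty result does not mean a tool is safe to use. Slowing the tool down would silence the first warning without changing what it is doing, and a site’s terms of service may prohibit automation at any speed.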
Terms of Service/Use for Job boards and Social Media Sites
For private networks that require an account to access pages, the definition of “hacking” sometimes hinges on the terms and conditions of service. Dice.com, for example, does not allow the use of “any robot, spider, site search/retrieval application, or other manual or automatic device or process to retrieve, index, ‘data mine,’ or in any way reproduce or circumvent the navigational structure or presentation of the Site or its contents.” In other words, if you are using a method other than the search and navigation methods provided by Dice to navigate the site, Dice considers it “hacking.” Dice.com tracks the username, IP address and all activity during the session. Actions in violation of this Code of Conduct may be cause for civil and/or criminal liability under the CFAA.
Facebook also restricts the use of automation to crawl or explore the site unless you are using an application with express permission to do so. However, Facebook’s “Statement of Rights and Responsibilities” does not seem to include language restricting URL manipulation. This is likely because editing a URL does not expose anything a user could not already access; anything more requires an authentication token (like the special permissions provided when you authorize an App). For example, all of the information provided in Graph Search is ultimately either public or accessible to the logged-in user. It is also available via the API with a standard developer access token. Facebook has even published guidance on how to “hack the graph.”
One Facebook user was recently sent a cease and desist letter for scraping public phone numbers using the API. Facebook has a White Hat Bounty program that allows users to submit security flaws and bugs, but it requires that you follow the “terms of service,” and it offers test accounts for demonstrating vulnerabilities. Facebook recently refused a bounty to one user after he exposed an issue that allowed him to post to other people’s walls. They told him it was not an issue, so he used the security hole to post a status update as Mark Zuckerberg. Based on these recent situations, it seems that when it comes to Facebook “hacks” you should ask the security team whether they are aware of what you can do. If it is a known “feature” and not a bug, then feel free to search as allowed by the “terms of service.”
Should I be worried as a Sourcer?
Have you ever edited a URL to see information you would not normally be able to access via standard navigation? Have you ever used any type of automation to take advantage of a search result “goldmine” and harvest personal information off the web? More importantly, were you exploiting a security hole while collecting the information? Did you circumvent any technological barriers or engage in a “brute-force attack,” guessing possible URLs or email addresses via automation until you got a hit?
For the first time, it is becoming very clear that some of the tools and tricks used by the industry’s leading sourcers could very well be considered “hacking” under the CFAA.