Web scraping is legal, US appeals court reaffirms

Good news for archivists, academics, researchers and journalists: Scraping publicly accessible data is legal, according to a U.S. appeals court ruling.

The landmark ruling by the U.S. Ninth Circuit Court of Appeals is the latest in a long-running legal battle brought by LinkedIn aimed at stopping a rival company from scraping personal information from users’ public profiles. The case reached the U.S. Supreme Court last year but was sent back to the Ninth Circuit for the appeals court to review the case again.

In its second ruling on Monday, the Ninth Circuit reaffirmed its original decision and found that scraping data that is publicly accessible on the internet is not a violation of the Computer Fraud and Abuse Act, or CFAA, which governs what constitutes computer hacking under U.S. law.

The Ninth Circuit’s decision is a major win for archivists, academics, researchers and journalists who use tools to mass collect, or scrape, information that is publicly accessible on the internet. Without a ruling in place, long-running projects to archive websites that are no longer online, and to use publicly accessible data for academic and research studies, had been left in legal limbo.

But there have been egregious cases of scraping that have sparked privacy and security concerns. Facial recognition startup Clearview AI claims to have scraped billions of social media profile photos, prompting several tech giants to file lawsuits against the startup. Several companies, including Facebook, Instagram, Parler, Venmo and Clubhouse, have had users’ data scraped over the years.

The case before the Ninth Circuit was originally brought by LinkedIn against hiQ Labs, a company that uses public data to analyze employee attrition. LinkedIn said hiQ’s mass scraping of LinkedIn user profiles was against its terms of service, amounted to hacking and was therefore a violation of the CFAA. LinkedIn first lost the case against hiQ in 2019 after the Ninth Circuit found that the CFAA does not bar anyone from scraping data that’s publicly accessible.

On its second pass of the case, the Ninth Circuit said it relied on a Supreme Court decision last June, in which the top U.S. court took its first look at the decades-old CFAA. In its ruling, the Supreme Court narrowed what constitutes a violation of the CFAA to gaining unauthorized access to a computer system, rather than adopting a broader interpretation of exceeding existing authorization, which the court argued could have attached criminal penalties to “a breathtaking amount of commonplace computer activity.” Using a “gate-up, gate-down” analogy, the Supreme Court said that when a computer or website’s gates are up, and information is therefore publicly accessible, no authorization is required.

The Ninth Circuit, in referencing the Supreme Court’s “gate-up, gate-down” analogy, ruled that “the concept of ‘without authorization’ does not apply to public websites.”

LinkedIn, which brought the case, did not respond to a request for comment.
