
The Ultimate Guide to Legal and Ethical Web Scraping in 2022


The following article serves as a guide on how to extract data from the web in a completely legal and ethical way. Read below to find out how!
The popularity of web scraping is growing at an accelerated pace. Not everyone has the technical knowledge to scrape, so many people rely on APIs instead, such as a news API to fetch news or a blog API to fetch blog-related data. As web scraping grows, it is almost impossible not to get conflicting answers when the big question arises: is it legal? If you are browsing the internet for a legitimate answer that best suits your needs, you have come to the right place.

Minimize the risks.

Spoiler alert: the question of whether web scraping is legal has no unequivocal, definitive answer. The answer depends on many factors, and some of them vary with the laws and regulations of each country. But first, let’s briefly define what web scraping is for those unfamiliar with the concept before we dive deeper into the legalities.

Web scraping is the automated collection and organization of public information available on the Internet. The result is usually a structured data set stored in a table, such as an Excel spreadsheet, which displays the extracted data in a “readable” format. This practice requires a software agent that automatically downloads the desired information by mimicking your browser’s interaction. This “robot” can access multiple pages at the same time, saving you from wasting valuable time copying and pasting data. To do this, the web scraper sends many more requests per second than any human could. That said, your scraping engine must remain anonymous to avoid detection and blocking.

If you want to learn more about how to avoid getting left behind on the data side, I recommend reading this article before choosing a web scraping provider. Now that we have an overview of what a web scraping tool can do, let’s find out how to use it and keep you sleeping soundly at night.
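To make the definition of web scraping above concrete, here is a minimal sketch of the "organize into a table" step, using only Python's standard library. It assumes the page has already been downloaded; `SAMPLE_HTML`, `TableExtractor`, `extract_rows`, and `rows_to_csv` are hypothetical names made up for this illustration, and the rainfall figures are invented sample values, not real measurements.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this HTML instead.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Rainfall (mm)</th></tr>
  <tr><td>Lisbon</td><td>12.4</td></tr>
  <tr><td>Oslo</td><td>30.1</td></tr>
</table>
"""


class TableExtractor(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left unclosed


def extract_rows(html):
    """Return the table in `html` as a list of rows."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows to CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_rows(SAMPLE_HTML)))
```

The download step is left out on purpose, so the sketch runs offline and sends no requests to any site.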
Using a web scraper to collect data from the Internet is not a criminal act in and of itself. Scraping a website is often perfectly legal, while the way you intend to use that data may be illegal. Several factors, depending on the situation, determine the legality of the process.

Let’s talk about different types of data and how to handle them gracefully. Because data such as rainfall or temperature measurements, demographic statistics, prices, and ratings are not protected by copyright, they appear to be perfectly legal to scrape. They are also not personal information. However, if the information comes from a website whose terms and conditions prohibit scraping, you may be in trouble.

So, to better understand how to scrape smartly, let’s look at each of the two types of sensitive data:

Any type of data that could be used to identify a specific individual is considered personal data (personally identifiable information, or PII, in more technical terms).
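One practical way to act on that definition is to screen scraped text for obvious identifiers before storing it. The sketch below is only a rough heuristic under stated assumptions: `EMAIL_RE`, `PHONE_RE`, `find_pii`, and `redact_pii` are hypothetical names, the two patterns are simplified, and they will miss many kinds of personal data (names and addresses, for instance). It is not a compliance tool.

```python
import re

# Simplified, assumed patterns: they catch common email and phone formats only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def find_pii(text):
    """Return the email-like and phone-like strings found in `text`."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


def redact_pii(text):
    """Replace email-like and phone-like strings with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


if __name__ == "__main__":
    # Invented sample record; example.com is a reserved documentation domain.
    sample = "Contact Jane at jane.doe@example.com or +1 555 010 9999. Rainfall: 12.4 mm."
    print(find_pii(sample))
    print(redact_pii(sample))
```

Note that the name "Jane" passes through untouched, which is exactly why pattern matching alone cannot decide whether a data set is free of personal data.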
