THE DELICATE DANCE BETWEEN ARTIFICIAL INTELLIGENCE AND PRIVACY: ITALIAN DPA’S INVESTIGATION INTO WEBSCRAPING.

The Italian Data Protection Authority has launched an investigation into how websites protect themselves against so-called webscraping, the technique of automatically extracting data from web pages. Public and private websites will be subject to the Authority’s scrutiny.

In the ever-evolving world of artificial intelligence (AI), the Data Protection Authority is working to protect personal information in an increasingly complex digital environment. The Authority has recently launched an in-depth investigation into the practice of webscraping, focusing on the massive online collection of personal data by third parties for the training of AI algorithms.

But what is webscraping? Webscraping is a technique of automatically extracting data from web pages. In other words, it consists of using scripts or software to analyze the HTML code of a page and extract specific information, such as text, images, or links. This process can be used to collect data from websites in an efficient and automated way. However, it is critical to comply with data use policies and privacy regulations when webscraping, as these activities can raise ethical and legal issues regarding unauthorized collection of information.
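To make the description above concrete, the following is a minimal sketch, in Python, of how a script can analyze the HTML code of a page and pull out links, images, and text. It is an illustration only: the class name, the function name, and the sample page (including the fictitious name and address in it) are assumptions made for this example and do not describe any tool referred to by the Authority. The sketch relies solely on the `html.parser` module of the Python standard library and works on a hard-coded page rather than a live website.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Hypothetical helper: collects links, image sources, and visible text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []
        self.text_chunks = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("href"):
            self.links.append(attr_map["href"])
        elif tag == "img" and attr_map.get("src"):
            self.images.append(attr_map["src"])

    def handle_data(self, data):
        # Keep only non-empty text found between tags
        stripped = data.strip()
        if stripped:
            self.text_chunks.append(stripped)


def scrape(html):
    """Parse an HTML string and return the extracted items."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {
        "links": parser.links,
        "images": parser.images,
        "text": parser.text_chunks,
    }


# Fictitious sample page, used in place of a real website
SAMPLE_PAGE = """
<html><body>
<h1>Staff directory</h1>
<p>Contact: <a href="mailto:jane.doe@example.org">Jane Doe</a></p>
<img src="/photos/jane.jpg" alt="Portrait">
</body></html>
"""

if __name__ == "__main__":
    result = scrape(SAMPLE_PAGE)
    print(result["links"])
    print(result["images"])
    print(result["text"])
```

Even this short script retrieves a name and an e-mail address from the sample page, which illustrates why the automated extraction of publicly available pages can amount to a collection of personal data.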

The fact-finding investigation initiated by the Data Protection Authority aims to assess the effectiveness of the security measures taken by public and private websites to prevent webscraping of personal data. The focus is on entities operating in Italy or offering services in the Italian territory, with particular attention to those that make data freely accessible, including data captured by the “spiders” of AI algorithm producers.

Numerous AI platforms currently collect data on a massive scale through webscraping. These platforms, for a variety of purposes, capture huge amounts of personal data available on websites operated by public and private entities. The information, originally published for specific purposes such as news reporting or administrative transparency, is thus reused for algorithm training.

The Data Protection Authority invites trade associations, consumers, experts, and academic representatives to share their comments and input on the security measures already adopted, and on those that could be adopted, to counter the massive collection of personal data through webscraping. Opinions can be sent to webscraping@gpdp.it within 60 days of the publication of the consultation notice on the Authority’s website.

The Authority’s investigation highlights the importance of implementing adequate security measures to protect personal data. Possible countermeasures include the use of advanced encryption techniques, the strengthening of firewalls, and the promotion of more stringent security standards for sites handling sensitive data.

In conclusion, the Authority’s investigation is a significant step toward balancing innovation in AI with safeguarding individual privacy. Cooperation among authorities, industry, and academics is essential to establish standards and protocols that ensure a harmonious coexistence between the growing power of AI and respect for privacy. The Authority, following the investigation, stands ready to take the necessary measures, including emergency action, to ensure the protection of personal data in an increasingly interconnected world.

DISCLAIMER: This article merely provides general information and does not constitute legal advice of any kind from Macchi di Cellere Gangemi, which assumes no liability whatsoever for the content and correctness of the newsletter. The author or your contact in the firm will be happy to answer any questions you may have.