Data security: artificial intelligence and the challenge of pseudonyms

Businesses now face far more cyberattacks than they did five or ten years ago.

In the face of this explosion of cybercrime, artificial intelligence and machine learning are often put forward as defenses. These technologies, however, need readable data to work properly, and that is where things get complicated: the trend is towards data encryption. The right balance must therefore be found between the needs of artificial intelligence and machine learning and the strongest possible protection of the data.


Back to 2020: computer attacks have exploded, and businesses find themselves in a sorry state. The bill for these attacks is hefty, at more than 6 trillion in losses. Despite all this, companies remain negligent, far too focused on developing new services and tools. Computer security becomes secondary; it is seen as an obstacle to development of artificial intelligence, for the reasons mentioned above. A terrible mistake, though.

"The whole problem for companies is to balance the speed of execution of their business with IT security. But are the two compatible? Today, companies such as Zoom, Doctolib, Alan and Google (for its messaging) have changed the way we approach cybersecurity by integrating end-to-end encryption, the first step towards zero trust, a concept that places the principle of least privilege at the center of the design of new products and architectures. Described as early as 2003 by the Jericho Forum and then by Forrester in 2010, zero trust starts from the observation that the traditional approach to computer security, placing a great 'barrier' around information systems, has always been fallible. Zero trust therefore advocates protecting data at every step, integrating this protection directly at the application level rather than blindly trusting the infrastructure. Responsibility is thus shifted to the developers of applications. Technically, the strictest data protection is achieved by putting in place end-to-end encryption. With this technology, the data can be read only by the sender and the recipient of the information. This is how WhatsApp (or its French competitor Olvid) can assure its users that no conversation can be read by a third party. But while this end-to-end encryption makes the information unreadable, it also rules out any possibility of running ML or AI on that data, or even searching it. As a result, companies can no longer use these technologies, which are essential to their competitiveness and their speed to market. Hence their reluctance to put all their data under lock and key," he adds.
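To see concretely why encrypted data defeats search (and, by extension, ML pipelines), here is a deliberately toy sketch: a hash-derived XOR keystream stands in for a real cipher. This is for illustration only, not actual cryptography; in practice an authenticated cipher from a vetted library would be used. Once the bytes are enciphered, a plaintext substring search finds nothing.

```python
import hashlib

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy XOR 'cipher' with a SHA-256-derived keystream.
    Illustration only -- NOT real cryptography."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    # XOR each byte of the data with the keystream (zip truncates to len(data))
    return bytes(d ^ s for d, s in zip(data, stream))

record = b"patient: Alice, diagnosis: flu"
ciphertext = toy_encrypt(b"shared-key", record)

# The ciphertext no longer matches the plaintext, so naive search breaks...
assert ciphertext != record
# ...but sender and recipient, who share the key, can still recover it
# (XOR with the same keystream is its own inverse).
assert toy_encrypt(b"shared-key", ciphertext) == record
```

Any analytics, indexing, or model training on `ciphertext` would be working on bytes that carry no usable structure, which is exactly the tension the article describes.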

So the question revolves around a central point: how to find the right balance, deploying artificial intelligence without compromising the security of the data. Is that possible? To this question, Timothée Rebours replies: "Beyond so-called 'homomorphic' encryption techniques, which are still in their infancy, this difficulty can in fact be circumvented by sorting the data according to its intended purpose. For data useful to ML and AI projects, companies can pseudonymize it, i.e. replace the identifying data with a unique, random 'pseudonym'.

The original identifying data is then stored in a table of pseudonyms, itself protected by end-to-end encryption. This process therefore amounts to splitting the data into two categories: on one side the pseudonyms, which can be processed in an automated way, and on the other the end-to-end encrypted data. In other words, this applies the principle of least privilege (the foundation of zero trust), and it is in line with the GDPR which, it should be remembered, requires 'implementing the appropriate technical and organisational measures to ensure a level of security adapted to the risk' and cites in particular 'the pseudonymization and encryption of personal data' as means to be used."
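The pseudonymization step described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function name and record layout are invented for the example, and in the real scheme the returned lookup table would itself be stored under end-to-end encryption while the pseudonymized records remain usable for ML and AI.

```python
import secrets

def pseudonymize(records, id_fields):
    """Replace each identifying field with a random pseudonym.

    Returns the pseudonymized records plus the lookup table mapping
    pseudonyms back to the original values. Per the scheme above, that
    table would be kept end-to-end encrypted; only the pseudonymized
    records are exposed to automated processing.
    """
    lookup = {}  # pseudonym -> original identifying value
    out = []
    for rec in records:
        clean = dict(rec)
        for field in id_fields:
            token = secrets.token_hex(8)  # random pseudonym
            while token in lookup:        # ensure uniqueness (vanishingly rare)
                token = secrets.token_hex(8)
            lookup[token] = rec[field]
            clean[field] = token
        out.append(clean)
    return out, lookup

# The name is identifying; the age stays readable for analytics.
patients = [{"name": "Alice", "age": 34}, {"name": "Bob", "age": 51}]
pseudonymized, table = pseudonymize(patients, ["name"])
```

The split into `pseudonymized` (freely processable) and `table` (encrypted end to end) is exactly the two-category division the quote describes, and the least-privilege principle falls out of it: the analytics pipeline never needs the key that protects the table.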
