New Law #2: The Privacy Engineer

Who they are, what they do and their role in modern data protection compliance

One of the keys to the effective implementation of a privacy programme within an organisation is ‘the company buy-in’; getting stakeholders to believe in the programme and contribute to its success. To do this, some may cite the fines imposed for non-compliance, particularly under the GDPR. However, with such an approach, the impression that one may develop of data protection is that it is just another cumbersome compliance project which is only important because the lawyers said so.

Such a view is short-sighted. Rather than a burden, “data and analytics leaders should increase awareness of how better business outcomes can arise from changing how their organization handles personal data.” In other words, data protection compliance can be a blessing in disguise:

[P]rivacy, as a field, should probably be renamed. There is no sense of urgency or value in the word privacy, a problem that has plagued the field and will one day be addressed by shrewd marketers. Therein lies the beauty of privacy engineering: not only do data that have been “privacy engineered” comply with rules and regulations, they are also ready for exploitation, thereby transforming a legal burden into an opportunity for value creation.¹

In order to capture this hidden value, organisations must be willing to embrace those professionals who understand how to not just comply with the law but also use it to produce value. Those professionals are privacy engineers.

Privacy engineers are those who facilitate the development of data protection requirements and their implementation in the products, services or operations of organisations that process personal data. Put simply, they give effect to privacy-by-design. Such individuals can therefore be described as “privacy-savvy technologists that can translate policy into practical terms”.

As a result, privacy engineers act as the medium between various business stakeholders and legal or compliance teams. They therefore work closely with software engineers, product developers, designers, cybersecurity teams and others. They effectively act as the translators between these different domains to help build viable data protection compliance solutions that still enable innovation and growth. Jane Horvath, Apple’s Privacy Officer, has stated that every new product that the company develops has a privacy engineer and privacy lawyer on the team “even in at the imminent beginning design phases”.

From a theoretical standpoint, these kinds of professionals are a form of ‘expert-generalist’ (also known as ‘T-shaped individuals’): they have deep knowledge in at least one area combined with a good level of knowledge spread across a number of neighbouring areas. Thus, privacy engineers have expertise in privacy and data protection law whilst also having other complementary knowledge and skills, such as data science, information security and project management. This is what enables them to convey law and policy to various stakeholders within an organisation and work towards practical solutions to achieve compliance.

A New Science

Jamie Susskind, author of Future Politics, outlines the dominant question of the 21st century for society: to what extent should we let powerful digital systems control our lives, and on what terms?² In doing so, Susskind emphasises the importance of cultivating ‘philosophical engineers’ to navigate this new challenge:

Today, the most important revolutions are taking place not in philosophy departments, nor even in parliaments and city squares, but in laboratories, research facilities, tech firms, and data centres. Yet these extraordinary advances are taking place in a climate of alarming cultural and intellectual isolation…Political philosophy and social policy rarely appear in degree programmes for science, technology, engineering, and mathematics…In tech firms themselves, few engineers are tasked with thinking hard about the systemic consequences of their work.³

Thus, it is important not to allow technologists to operate in silos and remain ignorant of the ethical and societal implications of their creations. This is especially the case in the context of the data age. In her book The Age of Surveillance Capitalism, Zuboff explains how the likes of Google and others have developed capabilities to infer and deduce the thoughts, feelings, intentions, and interests of individuals and groups, often without their knowledge, awareness or consent.⁴ This enables companies engaged in surveillance capitalism to gain privileged access to ‘behavioural data’. However, rather than privacy being eroded, Zuboff argues that privacy is redistributed; instead of people having the ability to control what personal information they disclose, such rights are claimed by surveillance capitalists for commercial ends not always beneficial to the end-user.

The efforts to resolve these injustices of the modern digital world often result in an unhelpful dynamic that stifles meaningful progress:

At times the debate seems as if technologists somehow wish (or believe) they could escape the norms of general social, cultural, and legal discourse. Simply by designing ever-more complex systems and protocols that “need” increasing levels of sensitive information to work, technologists’ actions (or creations) seem to deny the basic requirement to respect data about people. Lawyers have come trooping in en masse to write similarly complex terms and conditions and hope to paper over the problem or find a cozy loophole in unholy legislative agendas. Investors search in vain for beans to count. Everyone else finds privacy boooooooorrrrring… until their own self interests are compromised.⁵

The GDPR is one piece of legislation that aims to reverse this trend and allow people to take back control of their data. Among its various provisions, Article 25, which stipulates both data protection-by-design (DPbD) and data protection-by-default (DPbDf), is one of the most prominent. It mandates two distinct obligations: (i) the execution of the appropriate technical and organisational measures designed to implement data protection principles in an effective manner with necessary safeguards to meet the requirements of the GDPR, and (ii) the implementation of appropriate technical and organisational measures designed to ensure that only data necessary for the processing purposes are processed, taking into account the amount of data processed, the extent of that processing, the period of storage and their accessibility. These obligations are designed to ensure that products and services allow users to more easily protect their data and control how it may be processed.

Privacy engineers are a crucial part of complying with Article 25. This is because they understand, firstly, what the law requires and, secondly, how those requirements can be put into practice. Hence, privacy engineers are the bridge between legal experts and technologists who can achieve the ethical awareness promoted by Susskind in order to prevent the trampling of rights highlighted by Zuboff, all while still encouraging innovation and growth within an organisation. This is particularly helpful when balancing the contrasting perceptions of technologists and lawyers. Technologists prefer to work in a more deterministic environment where issues are black or white.⁶ Lawyers, on the other hand, tend to deal in the grey areas that lack definitive answers.⁷ Privacy engineers avoid this clash by managing the perceptions of both groups to find the solutions that satisfy compliance and commercial objectives.

By working in this way, privacy engineers earn the status of being business-enablers rather than merely guardians. Instead of treating data protection as a box-ticking exercise, they make it a market differentiator and thus a competitive advantage, since good data protection compliance earns the trust and confidence of users (an increasing number of people believe that the treatment of their data is indicative of how they will be treated as a customer). Privacy engineers, in the digital age, can therefore help boost profits whilst moderating the impact of technological progress on society.

The ABCs of Privacy

The work of a privacy engineer will essentially focus on implementing privacy-by-design. This will mean ensuring that privacy requirements are integrated into products, services and operations from the beginning and throughout the lifecycle (hence Apple’s inclusion of privacy experts in teams working on new projects).

To be able to impart their expertise to achieve privacy-by-design, privacy engineers will need to use an appropriate workflow. Such a workflow should take into account the objectives of the organisation (in particular those of the product, service or operation in question), the objectives of the legal framework and the expectations of the end-user. The workflow should therefore encompass an extensive approach that analyses the project as a whole and its specific individual aspects.

For the GDPR, or any data protection legislation, this workflow should be based on the organisation’s privacy policy, for it will play “a key role in guiding how privacy engineering is applied”.⁸ The policy should be the key starting point for the commencement of any new project that will involve the processing of personal data. Furthermore, Article 24 stipulates that data controllers must be able to demonstrate compliance with the GDPR. That provision further provides that controllers should implement appropriate data protection policies as one way to demonstrate compliance.

In particular then, the privacy policy should, among other things, address three issues. Firstly, it should address the organisation’s commitment to the protection of personal data. Secondly, there should be a detailed description of the processing purposes, specifying the “legitimate business purposes for which personal data is collected and processed”.⁹ Thirdly, the policy should “reiterate the principles for processing personal data (as stated in Article 5(1))”¹⁰ as these will be fundamental to the privacy engineering workflow and demonstrating compliance with the GDPR. Thus, using the privacy policy and the Article 5 principles, a privacy engineering workflow can be developed.

The first part of the workflow should be about expressing the purpose of the data processing. This should take into consideration the wider aims of the product, service or operation being built and why the processing of personal data is required to achieve those aims. Various stakeholders will need to be consulted to identify the processing purposes along with legal experts to ensure that those purposes, and the legal basis of the processing, are legitimate. This would be in compliance with the purpose limitation principle (Article 5(1)(b)).

The second part should be about limiting the type and amount of personal data to that needed for the aforementioned processing purpose. This should also keep the duration of the processing and any storage to a minimum. Those stakeholders who will be involved in the processing of the personal data required should ensure that the appropriate measures are implemented to meet these requirements. This would be in compliance with the data minimisation and storage limitation principles (Articles 5(1)(c) and (e)).
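As a rough illustration of this second part, the sketch below applies a field allow-list (data minimisation) and a retention period (storage limitation) to incoming records. The field names, the 30-day period and the function names are all hypothetical; in practice they would be derived from the documented processing purpose rather than hard-coded.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical allow-list: the only fields the stated purpose needs.
ALLOWED_FIELDS = {"user_id", "email", "collected_at"}
# Hypothetical retention period for this purpose.
RETENTION = timedelta(days=30)


def minimise(record: dict) -> dict:
    """Keep only the fields on the allow-list (data minimisation)."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}


def purge_expired(records: list[dict], now: datetime) -> list[dict]:
    """Drop records held longer than the retention period (storage limitation)."""
    return [r for r in records if now - r["collected_at"] <= RETENTION]


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
raw = [
    {"user_id": 1, "email": "a@example.com", "date_of_birth": "1990-01-01",
     "collected_at": now - timedelta(days=5)},
    {"user_id": 2, "email": "b@example.com", "device_model": "X",
     "collected_at": now - timedelta(days=45)},
]

# Unneeded fields are stripped before storage; stale records are purged.
stored = purge_expired([minimise(r) for r in raw], now)
```

Here the date of birth and device model never reach storage, and the record collected 45 days earlier is removed once it exceeds the retention period.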

The third part should be about ensuring that the accuracy of the data is maintained throughout the data lifecycle. This would involve the necessary processes to affirm the quality of personal data and prevent alterations to it through unauthorised access. This would be in compliance with the accuracy principle (Article 5(1)(d)).

The fourth part should be about ensuring that the means for data processing are lawful, fair and transparent. Procedures will need to be set up to assess user expectations in relation to the processing of their data and to ensure that the legal basis for the processing remains legitimate. Data protection impact assessments will be an important part of this exercise. This would be in compliance with the principle of lawfulness, fairness and transparency (Article 5(1)(a)).

The fifth part should be about preserving the processing purpose over time. This will have two constituent elements. Firstly, the processing purpose should be constantly assessed during development of the product, service or operation to confirm its legitimacy. Secondly, once that product, service or operation has been deployed, and thus begins processing personal data, the processing purpose should be routinely assessed to ensure that it is still necessary and legitimate. This would be in compliance with the purpose limitation principle (Article 5(1)(b)).

The sixth part should be about keeping the personal data, and the processing methods, secure. This will mean encrypting data both in transit and in storage, as well as implementing read-only access and other controls where necessary. This would be in compliance with the principle of integrity and confidentiality (Article 5(1)(f)).

The seventh part should be about making the processing activities as transparent as possible. This will consist of producing meaningful information for users via privacy notices. Such information should be easily accessible and comprehensive for individuals so that they can gain a full understanding of how their data is being processed. This would be in compliance with the principle of lawfulness, fairness and transparency (Article 5(1)(a)).

The eighth part should be about enabling individuals to exercise their rights. This will include those rights detailed in the GDPR, including the right to be forgotten and the right to data portability. This is an unavoidable requirement of complying with the Regulation, and any product, service or operation must be developed in a way that respects the rights of data subjects.
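To make this eighth part concrete, the following is a minimal sketch of a data store that exposes erasure and portability as first-class operations. The `UserStore` class and its method names are hypothetical; a real system would also need to reach backups, logs and third-party processors, and to account for the exemptions the Regulation provides.

```python
import json


class UserStore:
    """Hypothetical in-memory store keyed by user ID."""

    def __init__(self) -> None:
        self._records: dict[int, dict] = {}

    def save(self, user_id: int, data: dict) -> None:
        self._records[user_id] = dict(data)

    def export(self, user_id: int) -> str:
        """Portability: return the user's data in a machine-readable format."""
        return json.dumps(self._records[user_id], sort_keys=True)

    def erase(self, user_id: int) -> bool:
        """Erasure: delete the user's data and report whether anything was removed."""
        return self._records.pop(user_id, None) is not None

    def has(self, user_id: int) -> bool:
        return user_id in self._records


store = UserStore()
store.save(7, {"email": "c@example.com", "language": "en"})
exported = store.export(7)  # JSON string the user can take elsewhere
erased = store.erase(7)     # the record is gone after this call
```

Designing these operations in from the start is usually far easier than retrofitting them once personal data has spread across several systems.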

The ninth and final part should be about implementing measures to demonstrate compliance with the data protection principles. Any product, service or operation processing personal data must be constantly monitored, assessed and audited to ensure that the processing remains lawful. This is especially important where that product, service or operation may change in the future. This would be in compliance with the accountability principle (Article 5(2)).

Each part of this workflow will require the implementation of certain controls to ensure that the relevant data protection requirement is met. These may be process-related, which may determine who should process personal data or even how it should be processed. Such controls may also be system-related and technical in nature, such as the use of encryption. These controls will therefore consist of both organisational and technical measures, as required under the GDPR.

Privacy engineers will be involved in developing the appropriate organisational and technical measures to achieve each part of the workflow. For example, when it comes to privacy notices, privacy engineers might make use of design principles to make such notices more user-friendly and accessible. This may involve collaborating with UI designers and software developers. In such a scenario, the privacy engineer would interpret the requirements under Articles 12 and 13 (relating to privacy notices) and then collaborate with other stakeholders to build a technical or organisational measure to meet the legal requirements in an effective manner. This would address the seventh part of the workflow and the principle of lawfulness, fairness and transparency.

Privacy-enhancing technologies (PETs) should certainly be used as part of the workflow as well. The European Union Agency for Network and Information Security (ENISA) defines PETs as “software and hardware solutions, ie systems encompassing technical processes, methods or knowledge to achieve specific privacy or data protection functionality or to protect against risks of privacy of an individual or a group of natural persons”. There are two particular PETs that are increasingly being utilised to achieve privacy-by-design: federated learning and differential privacy.

Federated Learning

This is a technique deployed for systems that use machine learning. It is where the devices of data subjects “collaboratively learn a shared prediction model while keeping all the training data on a device, decoupling the ability to do machine learning from the need to store the data in the cloud”.

The process begins with a mobile device downloading a central prediction model from the cloud. The device then improves this model by learning from the data generated and stored locally on that device. Once the model has been modified with the local data from the mobile device, the changes made to that model are summarised and condensed into an update. That update is then encrypted and transmitted back into the cloud where the central model is updated. That central model is constantly modified with updates submitted by various mobile devices of different data subjects. Once modified, the devices download the new updated central model and the process repeats.
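The loop described above can be sketched in a few lines. This is a toy illustration only, not Google's implementation: it assumes a one-parameter model y = w * x, invents three devices with made-up local data, averages the updates with equal weight, and omits the encryption of updates entirely. The names `local_update` and `federated_round` are hypothetical.

```python
def local_update(global_w: float, local_data: list[tuple[float, float]],
                 lr: float = 0.05, epochs: int = 5) -> float:
    """Train on-device and return only the change to the weight, not the data."""
    w = global_w
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in local_data) / len(local_data)
        w -= lr * grad
    return w - global_w


def federated_round(global_w: float,
                    devices: list[list[tuple[float, float]]]) -> float:
    """Average the devices' updates into the shared central model."""
    updates = [local_update(global_w, data) for data in devices]
    return global_w + sum(updates) / len(updates)


# Three hypothetical devices, each holding private samples of y = 3x.
devices = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5)],
    [(1.0, 3.0), (3.0, 9.0)],
]

w = 0.0  # initial central model
for _ in range(50):
    w = federated_round(w, devices)  # download, train locally, upload, merge
```

The server only ever sees the weight changes returned by `local_update`; the (x, y) samples never leave the list that stands in for each device.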

By using this technique, all the training data stays on the data subject’s device and no individual updates are stored in the cloud (because all the updates are combined together to improve the shared central model in the cloud). Thus, not only does this technique allow for better machine learning, but it also ensures greater privacy as the model development, training and evaluation can take place “with no direct access to or labelling of raw data”. Google uses this technique for its predictive texting feature on its mobile devices:

When Gboard shows a suggested query, your phone locally stores information about the current context and whether you clicked the suggestion. Federated Learning processes that history on-device to suggest improvements to the next iteration of Gboard’s query suggestion model.

Federated learning helps with the second part of the privacy engineering workflow, which is about limiting the type and amount of data to that needed to fulfil the processing purpose and storing data only where necessary. Only summarised updates are shared with the cloud, devoid of any personal identifiers and aggregated with other updates to modify the central model. Google can therefore train its model without having to extract personal data from mobile devices. Privacy engineers can work with software engineers and data scientists to put in place a system that executes federated learning during the development of products, services or operations that involve machine learning.

Differential Privacy

Another PET which has been increasingly used in recent years is differential privacy. The aim is to learn nothing about any individual within a database while still being able to learn about the group or population as a whole. This is achieved by injecting random noise so that someone observing the output of an algorithm cannot tell whether a “specific individual’s information was used in the computation”.
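One common way to inject such noise is the Laplace mechanism, sketched below for a simple counting query. This is an illustrative example rather than the method of any particular company: the survey data, the privacy parameter `epsilon` and the function names are assumptions made for the sketch.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values: list[bool], epsilon: float, rng: random.Random) -> float:
    """Counting query released with the Laplace mechanism.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    return sum(values) + laplace_noise(1 / epsilon, rng)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical survey: 10,000 people, 30% of whom answer "yes".
answers = [i % 10 < 3 for i in range(10_000)]
noisy = private_count(answers, epsilon=0.5, rng=rng)
```

A smaller `epsilon` means more noise and stronger protection for each individual, while the aggregate count remains close to the true figure because the noise is small relative to the size of the population.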

Apple makes use of this technique when gathering data on how people use its devices. Firstly, users must consent to sending their data to Apple’s servers. If a user does consent, then their data is collected locally on their device (for example, a user typing an emoji). Differential privacy is then applied to that local data before it is transmitted to the server through an encrypted channel. Any device identifiers are also stripped away when the data is transmitted. The data is then processed and aggregated statistics are shared with the relevant teams within Apple. The aim of this technique is to make it almost impossible to identify which personal data belong to which individuals:

Differential privacy ensures that the same conclusions, for example, smoking causes cancer, will be reached, independent of whether any individual opts into or opts out of the data set. Specifically, it ensures that any sequence of outputs (responses to queries) is “essentially” equally likely to occur, independent of the presence or absence of any individual.
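The word “essentially” in this description has a precise meaning in the standard formal definition, which is given here for reference. The symbols are the conventional ones and do not appear in the sources quoted above: a randomised mechanism M is ε-differentially private if, for every pair of datasets D and D′ that differ in one individual's record and every set S of possible outputs,

```latex
\[
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S]
\]
```

The smaller the value of ε, the closer the two probabilities must be, and so the less any output can reveal about whether a given individual's data was included.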

Furthermore, differential privacy need not unduly compromise the usefulness of the outputs derived from a database:

A surprising property of differential privacy is that it is mostly compatible with, or even beneficial to, meaningful data analysis despite its protective strength. Within empirical science, there is the threat of overfitting data to ultimately reach conclusions that are specific to the dataset, and lose accuracy when predictions are generalized to the larger population. Differential privacy also offers protection from such overfitting, its benefits thus go even beyond data security.

Thus, differential privacy can be used to achieve two parts of the workflow. Firstly, it can be used to keep the personal data, and the means of processing it, secure. Secondly, it can be used to ensure that the accuracy of the data is maintained throughout the data lifecycle. Accordingly, differential privacy can be used to address the third and sixth parts of the privacy engineering workflow.

Build Better

In her book, Zuboff partly attributes the rise of surveillance capitalism to the lack of regulation of the internet, especially during its inception. Google consistently advocated this position; it laid claim “to unprecedented social territories that were not yet subject to law”.¹¹ This was combined with fast-paced technological innovation that left lawmakers and governments behind. On this view, such activity would only be inhibited by regulation, not encouraged. Zuboff further notes the unwelcome repercussions that this kind of innovation has yielded:

Google and Facebook vigorously lobby to kill online privacy protection, limit regulations, weaken or block privacy-enhancing legislation, and thwart every attempt to circumscribe their practices because such laws are existential threats to the frictionless flow of behavioural surplus…Regulatory interference…would only undermine competitive diversity.¹²

But with the advent of privacy engineers, there is a chance to buck this trend, since such professionals will work to find the seemingly equivocal value in data protection compliance. This can encourage the creation of what Emerald de Leeuw, a privacy and data protection specialist, calls “ethical technology”, which is underpinned by a corporate responsibility that does not sacrifice innovation and growth. Ultimately, privacy engineers are the way to build a better future for society.

Sources:

[1] Michelle Finneran Dennedy, Jonathan Fox and Thomas R Finneran, The Privacy Engineer’s Manifesto: Getting From Policy to Code to QA to Value (2014 Apress), 15.

[2] Jamie Susskind, Future Politics: Living Together in a World Transformed By Tech (2018 OUP), 2.

[3] Ibid, 7-8.

[4] Shoshana Zuboff, The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power (2019 Profile Books).

[5] Dennedy et al (n 1), 27.

[6] Ibid, 230.

[7] Ibid.

[8] Ibid, 174.

[9] Mary Pothos, ‘Accountability Requirements’ in Eduardo Ustaran, European Data Protection Law and Practice (2018 IAPP), 278.

[10] Ibid.

[11] Zuboff (n 4).

[12] Ibid.

David Epstein, Range: How Generalists Triumph in a Specialized World (2019 Macmillan, Main Market Edition).

European Data Protection Board, Guidelines 4/2019 on Article 25 Data Protection by Design and by Default (13 November 2019)

Deidentification versus anonymization

Why Privacy Engineering is Critical with IAPP’s Senior Privacy Fellow Caitlin Fennessy

New frontiers in lawtech innovation and data protection

GDPR: From compliance headache to business opportunity

Privacy Engineer Sample Job Description

Integrating data is getting harder, but also more important

‘Differential Privacy,’ or How Apple Finds the Most Popular Emojis Without Reading Your Texts