FIA Principal Investigators' Meeting

November 17-18, 2014
Washington, DC




This workshop was the eighth in a series of investigator meetings for the NSF Future Internet Architecture program. The plan for this meeting was to use specific use cases or scenarios to better understand the implications of the different architectures in the FIA project. The use cases have been generated by the Values in Design Council. The ViD Council is composed of experts from a number of disciplines, primarily the social sciences and law, chosen to provide a range of non-technical perspectives on network design choices.


Monday, November 17

8:30 – 9:00 Registration and Continental Breakfast

9:00 – 9:30 Introduction and review of meeting objectives (David Clark and Darleen Fisher) and (Re)introduction to Values in Design (Helen Nissenbaum)
9:30 – 10:00 Presentation of VID scenarios: Scenario A – Incidental Data/Metadata (Michael Zimmer and Luke Stark) and Scenario C – Online Identities (Seda Gürses and Katie Shilton)

10:00 – 10:30 Project-based breakouts

10:30 – 10:45 Break

10:45 – 12:15 Breakout groups work on scenarios A and B

12:15 – 1:45 Working lunch and capsule VID talks (Katie Shilton, “Building Values Discussions Into Development Work” and Seda Gürses, “Privacy Research Paradigms in Computer Science”)

1:45 – 2:45 Breakout groups report back on scenarios A and B

2:45 – 3:00 Break

3:00 – 3:30 Presentation of VID scenarios: Scenario B – Access and Approval (Geof Bowker and Daniel Howe) and Scenario D – The Internet of Things (Phoebe Sengers and Natasha Schüll)

3:30 – 4:30 Catch up time, project discussion, questions to ViD group.

4:30 – 5:30 Presentations on FIA and IP (kc claffy, Eben Moglen)

5:30 Adjourn

6:00 Reception, Dinner, and VID talks at the NYU Torch Club (Helen Nissenbaum, “Privacy in Context” and Finn Brunton, “Values in Cryptocurrency”)

Tuesday, November 18

8:30 – 9:00 Breakfast

9:00 – 9:30 Recap of Day One/Day Two Framing (David Clark and Helen Nissenbaum)
9:30 – 11:30 Breakout groups work on scenarios C and D

(10:30 – Working break)

11:30 – 1:00 Working lunch and capsule VID talks (Geof Bowker, “It's Values all the Way Down: The Case of Big Data” and Phoebe Sengers, Title TBD)

1:00 – 2:00 Breakout groups report back on scenarios C and D

2:00 – 3:00 Plenary Discussion: VID Questions and Answers (David Clark and Helen Nissenbaum)

3:00 Adjourn



Scenario A: Incidental data / metadata
Michael Zimmer and Luke Stark

Use Case A: A small team of developers associated with a libertarian think tank wants to create a technology company that supports a user application making its users as impervious to tracking and surveillance by outside parties as current technologies and regulations allow. The team is particularly concerned by recent media stories about the Snowden affair, Apple geolocation tracking, and controversies around Facebook’s research practices. The developers are curious as to which team’s architecture can best support this libertarian approach, which they believe will also be the most lucrative. The scenario addresses privacy, logs, and the consequences of inadvertent data mining for system robustness.

Questions for the teams:

  • How does your architecture divide “data” from “metadata” conceptually? What does that division look like in technical terms (e.g. what information is contained within packet headers?)


  • Could your architecture support location-based advertising without obtaining any data about individual users?
  • How granular could individual users make their profiles? Does your architecture allow the new social network to let its users make these decisions?


  • Where will data and metadata be stored in your architecture: in local servers, remotely, somewhere in between? How will the team, and its customers, have access to it?
  • How worried should the team be about the NSA getting access through your architecture: by technical means, legal injunctions, or both?


  • If the NSA wanted to do metadata analysis to try to identify trends, identify groups, identify people - how might the architecture support/hinder this?
  • If an advertising network wanted to profile particular types of people based on usage data or metadata, how does the architecture enable/hinder this?


Scenario B: Access and Approval
Daniel Howe and Geof Bowker

A startup, Satellite Images, is producing a fleet of satellites which will provide daily high-resolution photos of every part of the earth's surface.  The founders are driven both by a healthy profit motive and by a commitment to open source and the use of their data for the greater good: it could be made available during humanitarian crises, such as tracking passable roads during fires or after earthquakes, or for charting the effects of climate change in real time.  They have a principle of making their data open to the public after one year, with profits driven by users who need real-time data and by the set of algorithms they are designing for data analytics.  First, the service accepts anonymous payments (in Bitcoin or variants). Second, they keep no logs, or at least only the minimum required by local law. Third, they are net-neutral, which means that for a given bandwidth their service feels faster than the services of the large telecom providers.

  • Use Case A: Upstream ISPs (those from which smaller ISPs purchase some or all of their service) refuse to sell bandwidth to Satellite Images. Worse, legal and even criminal penalties are threatened against the company's owners, who are allegedly aiding terrorist networks and concealing this information (by not saving it) from law enforcement.  When they try to challenge this, they are told that the alleged aid cannot be discussed in open court proceedings.


  • Use Case B: They are approached by an international agency which wants access to their data in order, it says, to help co-ordinate the delivery of humanitarian aid after a devastating flood.  However, this same agency has ties to a particular superpower’s military, which is supporting a clandestine war in the same region.  Is there anything they can do to permit the ‘good’ use of their data whilst blocking the ‘bad’ use? And, as a first step, how do they track downstream uses of their data?
  • Use Case C: A paedophile sitting at home on his computer one day decides to browse through the publicly released data for his wealthy neighborhood, which features houses with high fences around them.  He notices that three neighbors in his area (whose high fences give them an expectation of privacy) have swing sets and assorted toys in their back yards, not visible from the street.  The same data he is using is being deployed by multiple not-for-profits for the social good – and if the company blocks every potential ‘bad’ use of its data, there won’t be enough left over for ‘good’ uses.


Scenario C: Online Identities
Seda Gürses and Katie Shilton

Use Case A: There has been growing media attention to “real name” policies and other identification measures which challenge people who wish to maintain multiple pseudonymous identities on online services (e.g. keeping separate home and work selves, enabling membership in particular subcultures, etc). There is a growing consensus that it is good and right to allow people to use multiple identities online. While it’s easy to argue that this is an application problem, applications like Facebook would increasingly like to use the network to identify those subverting or avoiding “real name” policies. In this scenario, a social media provider seeks to force users to follow a real-name identity policy. A set of users hope to continue to use the service without revealing their real names.

Questions for the teams:

  • In your future Internet architecture, to what degree are individuals and groups required to identify themselves to the network?


  • What conflicts over identity and identification does the architecture resolve, what does it leave to others?
  • What are the second-order consequences of decisions about identification in your network?


  • Is your identification scheme especially geared towards identifying users to services, or is it possible for peers to identify each other without a third party?
  • How expensive or time-consuming is it for individuals to hide their identity from the network, or to use multiple identities, or publish anonymously?


  • What second-order means might be used to identify users (e.g. is the protocol susceptible to fingerprinting?)
  • Can users have clean slates, or are their identities tracked over time?


  • Are there obfuscation methods that could be used to obscure users’ identity on the network?
  • Whom do your network’s identification requirements empower? Whom do they disadvantage?


  • How susceptible are the protocols to what is called “fragmentation” of the internet (e.g., data hosts or namespaces being matched to national borders), and how likely is it that this will bring about an identity scheme that resembles passports and physical border crossings?

Use Case B: Anonymous participation will likely continue to be of importance to the future Internet. Anonymous participation might include making use of services ranging from crisis support to alcoholics anonymous, as well as participation in civic activities like voting, petitions, or seeking sensitive information from government websites. Similarly, people challenging government or corporate power may have strong desires to access or publish information anonymously. Each of these forms of participation has differing anonymity and information integrity requirements. This use case focuses on a corporate whistleblower, as this is a situation with a very diverse (and powerful) adversary model. A corporate whistleblower needs to verify that potentially incriminating data is what it purports to be. She has an interest in keeping the integrity and authenticity of any published documents intact (e.g., the adversary may argue that the whistleblower has produced false documents to discredit her). The whistleblower may also need to cover her tracks internally to avoid exposure; she may also want to communicate with journalists anonymously and publish documents anonymously.

Questions for the teams:

  • Are there elements in the architecture that shift the balance of power between current stakeholders (a single individual, perhaps backed by journalists or allies, vs. a corporation or government)?


  • Are there centralized elements in the architecture that are vulnerable to threats that would make this scenario impossible?
  • Can the data be authenticated without exposing the source?


  • Can the whistleblower have anonymous rendezvous with a group of journalists? What does anonymous video chat look like in these networks? What traces would it leave behind that could be used to monitor or break the integrity of exchanges?
  • Suppose the whistleblower is working in a locally-routable namespace or data host. How does she share data within a globally-routable space?


  • Does sensitive data lose its provenance if it’s transferred to and shared from the New York Times servers or namespace?

Scenario D: The Internet of Things
Phoebe Sengers and Natasha Schüll

Use Case A: A 63-year-old professional woman, Julia Stilinski, suffers from heart arrhythmia and has an Implantable Cardioverter Defibrillator (ICD) that constantly tracks her heart, will intervene with a shock should she need one, and also communicates with her doctor. She uses a wearable movement-tracking device that is integrated with her smartphone and its geolocation capacities. Julia tracks her mood and energy levels via an application that monitors her email and online shopping activity as well as data from a collection of wearable sensors (the movement-tracking device, an electronic patch that continuously tracks heart rate and perspiration, another that tracks her posture). The mood/energy tracker is integrated with her home sensors (for lighting, fireplace, temperature, stereo playlist), so that her home can sense through the GPS in her car and phone when she is coming home and can dynamically respond to her current state – by dimming the lights, warming the air, and putting relaxing music on if she needs calming down from a stressful day, or by putting on peppy music and brighter lights if she’s had a slow and boring day and needs a more stimulating environment. This tracker is provided free of charge by her credit card company, in return for the right to use the collected data to adjust its internal model of her credit rating. The scenario raises a range of questions involving net neutrality and loci of control.

Questions for the teams:

  • Which of these diverse streams of data will take priority, and how will they communicate and sync with each other if certain streams – e.g. medical – are given access to a “faster lane” on the network?


  • Will the woman have the option of “paying extra” to keep her various sensors communicating at the same speed, whether entertainment or home security or medical?
  • What if it turns out that the upbeat music her mood/energy app gives her is medically dangerous in that it can overstimulate her, inspire a little dance, and trigger an arrhythmia? Would this mean that the ICD should be able to talk to this app and override its decisions? Or that the app should be considered a medical one that has access to the ICD’s data on the heart’s activity (far more granular than the data it collects through its heart rate patch)? How will the network differentiate between the different streams of data and determine which should take precedence?  How will this play out at the level of authentication?


  • If Julia wishes to go in and customize or adjust the default settings for these various sensor systems, will she be able to do that across the board – or will certain sensors be more “closed” to her adjustments (e.g. her pacemaker) while others (e.g. her stereo system) are more open to her control?
  • The idea here is that the network infrastructure – in this case, where the different control points are and how open they are to user access – carries values that need to be thought through. That is, it is a value statement to give precedence to medical devices over others. Who decides which devices should be granted higher speed? Who decides which the user should be able to access and tweak, and which are kept secure and locked against user adjustment?


  • If Julia decides she doesn’t want to share confidential client emails with the tracker system, can she block certain classes of data proactively from being considered, or is her only choice a blanket opt-in or opt-out?





Hosted by the Computer Science and Artificial Intelligence Laboratory at MIT under agreement with NSF.
Contact David Clark <ddc@csail.mit.edu> with questions or comments.
Content of this page extracted from the NSF FIA solicitation and press release.