sailing the data ocean

What if access to data could be authorized everywhere? If you were authorized to access a piece of data, you could get to it wherever it happened to be located.

This is certainly not the way things work at the moment, but if it could be made to work in a convenient way, what should it be like?

When the first web browsers arose some 20+ years ago, static HTML pages and other media and document files were available by calling a URL over the HTTP protocol. Security was added, at first pretty coarse-grained: if you were logged in you could access pretty much anything. It got better.
Comprehensive tools became available to centrally manage all web access to any document, with the finest granularity. The writing of the rules about who should be allowed to do what, when and where, could be delegated to those who actually were in a position to know.
But crucially, these tools could only manage access to stuff directly under their control, often operating in a reverse-proxy mode and intercepting HTTP traffic. APIs were available through which other applications could tap into them and use the access control rules they contained to do their own authorization. In this way the data under the control of a unified set of access control rules could be made corporate-wide, with access to all of the data in a corporation governed by rules maintained in one place. Everyone would play together in the same data security pool.
In practice this never happened. (Re-)writing applications to take advantage of the API of the chosen security software platform was too expensive. Other tools emerged to export the security rules from one software platform to another, leaving the receiving platforms to do their own enforcement through their own rule infrastructures. This didn’t work very well because it was too complicated. Rules are fundamentally about meaning, and meaning doesn’t translate easily. Nevertheless, this was an attempt to federate authorization.

Data protected by the same access control rule infrastructure is part of the same pool. A database is a single pool. It has its own internal security governing access to the individual pieces of data contained in it, but has no reach outside. The database maintains its own list of who gets to access which columns in which tables.
A server has its own internal arrangement for governing access to the data in its own file systems. It may also have access to remote file systems. Some remote file systems would be on other servers, which would govern access themselves (NFS, FTP, Samba etc.), and would therefore not be part of the server’s own pool.

If authorizations could be federated between pools, all data would exist in one big virtual pool: a virtual pool made of multiple physical pools, such as individual databases, file servers etc.
At present this is difficult. There may be user federation between some data pools, but each pool has its own authorization, its own way to enforce access rules. The rules in one pool are not known, or directly enforceable, in another. There is no federated authorization.

Let’s further suppose that any piece of data in this virtual data pool (a data ocean, really) is accessible over TCP with a URI. The URI may have various formats depending on what type of physical pool is being addressed.
For example, this would be the syntax of a URI accessing a directory (LDAP) store:

ldap[s]://hostname:port/base_dn?attributes?scope?filter

And this would be used to access an individual file, using HTTP(S):

https://host:port/path

Access to one of the secure web reverse-proxies mentioned above would look like this too.
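To make the two URI shapes concrete, here is a minimal sketch of how a client or gateway might pull them apart before routing a request to the right kind of pool. The function name `describe_uri` and the example host names are made up for illustration; only the standard library's `urllib.parse` is used.

```python
from urllib.parse import urlsplit


def describe_uri(uri: str) -> dict:
    """Split a data-ocean URI into the parts needed to route it to a pool.

    Hypothetical helper, not part of any existing product.
    """
    parts = urlsplit(uri)
    info = {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }
    if parts.scheme in ("ldap", "ldaps"):
        # LDAP URLs carry attributes, scope and filter after the base DN,
        # each separated by '?'. Pad so missing fields become empty strings.
        fields = parts.query.split("?") + ["", "", ""]
        info["base_dn"] = parts.path.lstrip("/")
        info["attributes"] = [a for a in fields[0].split(",") if a]
        info["scope"] = fields[1]
        info["filter"] = fields[2]
    return info


ldap_info = describe_uri(
    "ldaps://directory.example.com:636/dc=example,dc=com?cn,mail?sub?(objectClass=person)"
)
web_info = describe_uri("https://files.example.com:8443/reports/q1.pdf")
```

Neither URI carries a user name or password; the credentials travel separately from the address.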

There would be many others. Note that no username or password appears in these URIs. There would not be any prompting for this information either.
Access control would be through PAML tokens, passed in the headers. An SSL handshake would take place to establish the requesting entity’s authorization to use the tokens presented.
Each physical pool is defined by the entity that controls access to it, and all of these entities, be they the LDAP server or the file/web server in the URI examples above, must be equipped to handle PAML tokens in order to verify the authorization for a request. Through their acceptance of PAML tokens the pools together form a virtual data ocean. Any application can call on data anywhere else and present PAML tokens for authorization.
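As a rough illustration of tokens travelling in the headers, the sketch below builds the header list for a request. The header name `X-PAML-Token` and the opaque token strings are assumptions made up for this example; nothing here defines how PAML tokens are really encoded on the wire.

```python
# Hypothetical header name; the actual PAML wire format is not defined here.
PAML_HEADER = "X-PAML-Token"


def build_request_headers(tokens: list[str]) -> list[tuple[str, str]]:
    """Return one header per token. No username or password is included."""
    return [(PAML_HEADER, token) for token in tokens]


# Placeholder token values, standing in for real serialized tokens.
headers = build_request_headers(["token-for-report", "token-for-directory"])
```

The same header list could be attached to an HTTPS request or adapted to whatever mechanism another protocol offers for carrying extra request data.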

This leaves quite a bit of scope for application architecture. The use of a PAML token requires access to the private key of the user to whom the token was issued. This means that if a user is engaged in a transaction with an application, and the application needs access to data kept somewhere else on behalf of the user, the application can only present its own PAML tokens; it cannot simply forward those it has received from the user. The user must, at a minimum, contact this other data store directly and engage in an SSL handshake. This way the user’s ownership of the public key is established for the benefit of the data store. The application can then pass the PAML tokens received from the user on to the data store, and the store would now know that the tokens are OK to use. Alternatively, the user could retrieve the data directly and pass it to the application that needs it. Sort of like a data federation.
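The forwarding rule can be sketched as a small model. Everything here is an assumption for illustration: the `Token` and `DataStore` classes, the key identifiers, and `record_handshake`, which merely stands in for a real SSL handshake in which the peer proves it holds the private key.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    issued_to: str  # identifier of the key pair the token was issued to
    resource: str   # URI of the data the token covers


@dataclass
class DataStore:
    # Key holders that have completed a handshake with this store.
    proven_holders: set = field(default_factory=set)

    def record_handshake(self, key_id: str) -> None:
        """Stand-in for a handshake proving possession of the private key."""
        self.proven_holders.add(key_id)

    def accepts(self, token: Token, presenter: str) -> bool:
        """Accept a token only if both presenter and token owner are known."""
        if presenter not in self.proven_holders:
            return False
        return token.issued_to in self.proven_holders


store = DataStore()
user_token = Token("user-key", "https://files.example.com/reports/q1.pdf")

store.record_handshake("app-key")
forwarded_too_early = store.accepts(user_token, presenter="app-key")

store.record_handshake("user-key")  # the user contacts the store directly
forwarded_after_handshake = store.accepts(user_token, presenter="app-key")
```

In this model the application's attempt to forward the user's token fails until the user has made contact with the store, which matches the minimum requirement described above.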

Note that PAML tokens are tied to data, not to any particular host environment. Among other things, this means that the requesting client may send the server a considerable number of tokens in order to establish authorization for all the required data. The server will grant access to the union of what all these tokens authorize.
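The union rule is easy to picture with sets. In this sketch a token is represented as a plain dictionary with a `resources` list; that shape, and the function names, are assumptions for illustration only.

```python
def granted_resources(tokens):
    """Return the union of the resource URIs covered by all presented tokens."""
    granted = set()
    for token in tokens:
        granted |= set(token["resources"])
    return granted


def is_authorized(tokens, required):
    """True if every required resource is covered by at least one token."""
    return set(required) <= granted_resources(tokens)


presented = [
    {"resources": ["https://files.example.com/a.pdf", "https://files.example.com/b.pdf"]},
    {"resources": ["ldap://directory.example.com/dc=example,dc=com"]},
]
needed = [
    "https://files.example.com/a.pdf",
    "ldap://directory.example.com/dc=example,dc=com",
]
all_covered = is_authorized(presented, needed)
partly_covered = is_authorized(presented[:1], needed)
```

With both tokens presented the request is fully covered; with only the first, the directory entry is missing and the check fails.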
