
Federated Authorization

For a while now the use of federated userids has been the norm. A site, called the Service Provider (SP), enters into an agreement with another, called the Identity Provider (IP), to trust that the userid the IP sends it has been properly authenticated and accurately identifies the user.
There are a number of ways this can be done; passing a SAML token from the IP to the SP is a popular one. The SAML token contains the userid, and the SP trusts it because the token is signed by the IP.
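
As a rough sketch of how the SP side can check this, assuming plain Java (the JDK's built-in XML signature API) and a SAML 2.0 assertion already parsed into a DOM document: verify the IP's signature first, and only then read out the userid (the NameID). The class and method names are my own for the illustration; a real deployment would also check audience, validity period and signature reference scoping.

    import javax.xml.crypto.dsig.XMLSignature;
    import javax.xml.crypto.dsig.XMLSignatureFactory;
    import javax.xml.crypto.dsig.dom.DOMValidateContext;
    import java.security.PublicKey;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;

    // Sketch: accept the federated userid only after the IP's signature verifies.
    public final class SamlCheck {
        public static String verifiedUserId(Document assertion, PublicKey ipKey) throws Exception {
            NodeList sigs = assertion.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature");
            if (sigs.getLength() == 0) throw new SecurityException("assertion is not signed");
            DOMValidateContext ctx = new DOMValidateContext(ipKey, sigs.item(0));
            XMLSignature sig = XMLSignatureFactory.getInstance("DOM").unmarshalXMLSignature(ctx);
            if (!sig.validate(ctx)) throw new SecurityException("signature does not verify");
            NodeList ids = assertion.getElementsByTagNameNS(
                    "urn:oasis:names:tc:SAML:2.0:assertion", "NameID");
            if (ids.getLength() == 0) throw new SecurityException("no NameID in assertion");
            return ids.item(0).getTextContent();   // the userid the SP now trusts
        }
    }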

So much for users and their userids, but what about what they have access to: the authorization?
Userids are tied to revenue in significant ways. In businesses that derive revenue from gathering information about their users, the userid is clearly of primary concern. More generally, in business applications it is not the userid itself that is central, but what the user has access to and how that access control is enforced.

Consider a standard access control deployment with federated users: the user logs in somewhere else (at the IP) and clicks a link to (return to) the application in question, with the userid being passed along. This is a federated login. The application (the SP) still examines the userid and refers to its own access control infrastructure to determine what the user is authorized to do. Could this step be federated too? Should it?
Just as a login and authentication infrastructure costs money, so does maintaining an access control infrastructure. Federated login cuts the cost of the first; Federated Authorization (FA) could cut the second.
From a cost point of view, FA makes sense.
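
To make the idea concrete, here is a minimal sketch of one way FA could look, assuming the IP puts the user's entitlements into the signed token as attributes and the SP enforces directly on those, instead of consulting an access control store of its own. The attribute name and the action strings are invented for the illustration.

    import java.util.Map;
    import java.util.Set;

    // Sketch: the SP enforces on entitlements asserted by the IP, not on its own database.
    public final class FederatedAuthz {
        // attributes: claims lifted out of the already-verified, IP-signed token
        public static boolean mayPerform(Map<String, Set<String>> attributes, String action) {
            Set<String> entitlements = attributes.getOrDefault("entitlements", Set.of());
            return entitlements.contains(action);   // e.g. "invoice:approve"
        }
    }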

Can it be done, and does it make sense from a policy and governance point of view? It is one thing to let someone else identify the user on your behalf. Digital certificates have been around for a while, so the issues surrounding outsourcing the work of identifying users are not new, and the risks are largely accepted. But authorization cuts close to the essentials of governance: what are people permitted to do? True, if the user is misidentified, having a thorough access control system does not make much difference.
One significant issue relates to change. A userid doesn't change much, if at all. What a user has access to does, sometimes with great urgency. Having direct control of authorization is then desirable.

The day-to-day problem with access control is not primarily the enforcement, though that can be tricky enough. It is the creation of and updates to policy: the rules governing access, who gets to decide them, and how those decisions can be captured in an enforceable way.
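
As one illustration of "captured in an enforceable way", a rule can be recorded as data that the enforcement point evaluates directly and that also records who made the decision and when. This is only an assumed, minimal shape; the field names are mine.

    import java.time.Instant;

    // Sketch: a policy rule as data, both enforceable and auditable. Field names are illustrative.
    public record AccessRule(String role, String resource, String action,
                             String decidedBy, Instant decidedAt) {
        public boolean permits(String userRole, String res, String act) {
            return role.equals(userRole) && resource.equals(res) && action.equals(act);
        }
    }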

Big Personal Data

Reading this article in a New York Times blog today, I kept thinking about what was not being said in it: data governance.
The assumption, no doubt valid, is that ever larger pools of data will be amassed and, suitably analyzed, will yield solutions to existing problems and fantastic new possibilities.
I don't doubt the potential of "Big Data" and look forward to its exploitation with interest, but also with some trepidation. With great power comes great scope for abuse. This is as true for technology as it is for people; necessarily so, since it is people who design and use the technology.
The article mentions car sensor data and its collection. But what is this data, and who gets to use it?
It goes on to mention ever higher transfer speeds allowing larger quantities of data to be exchanged ever faster.

The article uses the term Data Lake to describe GE's accumulation of sensor data, citing that it sped up GE's error detection by a factor of 2,000.

With ever larger quantities gathered, ever faster transfer speeds, and distributed analytical tools, all the world's data eventually becomes available in a Data Sea, ready for analysis.

The article argues that this is not just technically possible but also both the trend and desirable. The first two are clearly correct; the problems with differing data structures and quality are surmountable. Whether it is desirable is less clear.
That there are advantages to be had is obvious. It is also probable that the advantages are great enough to justify the expense involved; indeed, it would not happen in the commercial world unless there was money to be made. Governments have other objectives, such as social control including law enforcement, where expense is not the primary concern, only technical feasibility.

In engineering, technical feasibility is a function of resources expended, and the Big Data field is certainly seeing a great deal of resources expended on it, whatever the motivation. So what is possible in the future will certainly expand beyond what is possible today.

Data governance has become a political issue of late. The relentless surveillance by governments of their citizens is becoming more widely known. Not all governments have the resources of the US or the determination of the Chinese, but that is a temporary relief. The capabilities will expand and the costs will decline; in due course, everyone can aspire to total surveillance.
At present only the richest and most powerful can access the whole data sea; others can only access the part of it they gathered themselves. The article implies that this limitation will fall away and everything will be accessible to those who can pay. But who gets the money? For now it is those who accumulate the data who get the money. They incur the expenses: servers, storage and bandwidth cost money. The source of the data gets nothing; certainly no money, though in some cases a service.
Social media run on expensive server platforms, but the user can use them "for free" in exchange for surrendering all their data and any other information the platform can gather about them. Many consider this a fair exchange. Even if it is, it is still in the user's interest to control where the data is going, who gets to use it, and for what. The stories of social media posts that circulated outside their intended audience and caused embarrassment are legion. The columnist Kathleen Parker advocated self-censorship to a degree inconsistent with civilization.
It is unclear whether she honestly meant what she wrote or had simply disengaged her mental faculties before writing the piece. Bill Maher's response was withering and to the point: we can't live in a world where the only privacy we have is inside our heads. Privacy matters also for those less exalted among us. Scanning social media before a hire is now common practice, and individuals do not get to decide what they need to "hide": anything that may cause offense or raise the slightest question can cause later inconvenience or worse. This is made even more intractable by ever-changing social conventions.
Expect to see much, much more on this subject.

Some gatherers of user data have policies in place and tell the user explicitly that while they gather as much data as possible, they do not pass it on to anyone else; or if they do, it is in some lesser, anonymized form.
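
As a minimal sketch of what "some lesser, anonymized form" often amounts to in practice, assume the gatherer replaces the userid with a salted hash before the record leaves the house. This is pseudonymization rather than true anonymization, and the names below are invented for the illustration.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.HexFormat;

    // Sketch: pseudonymize a record before sharing by replacing the userid with a salted hash.
    // Weaker than true anonymization: whoever holds the salt can still re-identify the user.
    public final class Pseudonymize {
        public static String token(String userId, byte[] secretSalt) throws Exception {
            MessageDigest d = MessageDigest.getInstance("SHA-256");
            d.update(secretSalt);
            d.update(userId.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(d.digest());   // stable token, no direct identifier
        }
    }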

That the data is passed along at all suggests convincingly that it has commercial value. As interconnections improve and data sales increase, it is also clear that the sources of this data are being underpaid; the data gatherer can increase its revenue per data item without incurring any additional source costs.

The persons who are the source of this data therefore have both a financial and a data governance (read: privacy) issue at stake.

Social media users voluntarily submit to being guinea pigs for marketers. Their data has a clear commercial value, and this is what they use to pay for the service.

In due course, platforms will appear that allow users to pay in forms other than their own personal information and the data they provide themselves: "Pay a small monthly fee and we will not sell your emails to your local grocery chain."
Pay-for-service is available in social media, particularly in the more specialized subsets like dating, but it is always either a pay or a spy business model, never a commingling of the two on the same platform. This is in most cases driven by technical considerations: it can be tricky to adequately safeguard the user data that is not available for resale. But see

The Internet of Things is the next step. How are we going to handle that data? The Big Data providers have the datastores ready to take it on and the tools to analyze it. But where are the tools to safeguard the individuals?

computing when you don’t trust the platform

This appears to be virtually a contradiction in terms. In the IT security racket we have always stressed the importance of having physical control of the hardware: anyone with physical access to the device could gain root access to the system and from then on do pretty much anything they wanted.
By extension, we could only trust the application, and the safety of our data within it, if we also trusted the physical security of the hardware.
The Microsoft verdict is interesting in this regard. If the cloud vendor is a US company, a US company owns the hardware, and it is thus subject to US jurisdiction regardless of where it is located. At the moment we do not know whether this verdict will survive a Supreme Court appeal. But in any eventuality the conundrum remains: can we have trusted computing on an untrusted computing platform?

I’m going to try to demonstrate that this is possible for a server application.

Clearly, a cloud-based web application constructed to work on an untrusted platform would be very different in its internal workings from how things are generally done now.
The working model that I propose to pursue is a web service with thick clients. The web service will run on a cloud vendor subject to US jurisdiction, AWS. I will demonstrate that the sensitive application data sits on AWS in encrypted form, is at no time decrypted on AWS, and that the encryption keys are at no point present on AWS in the clear.
The thick client will be an Android app.
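
A minimal sketch of the client side of this model, assuming standard Java/Android crypto (AES-GCM from the JCA): the app generates and keeps the key on the device and only ever uploads ciphertext, so neither the data in clear nor the key is present on AWS. Class and method names are mine; a real app would also deal with key storage and key sharing between a user's devices.

    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;
    import java.security.SecureRandom;

    // Sketch: encrypt on the client before upload, so the untrusted server only sees ciphertext.
    public final class ClientCrypto {
        private static final SecureRandom RNG = new SecureRandom();

        // The key is generated and kept on the device; it never leaves the client.
        public static SecretKey newKey() throws Exception {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(256);
            return kg.generateKey();
        }

        // Returns IV || ciphertext; this blob is what gets stored on the server.
        public static byte[] seal(SecretKey key, byte[] plaintext) throws Exception {
            byte[] iv = new byte[12];
            RNG.nextBytes(iv);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] ct = c.doFinal(plaintext);
            byte[] blob = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, blob, 0, iv.length);
            System.arraycopy(ct, 0, blob, iv.length, ct.length);
            return blob;
        }
    }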

In the event that I am successful using a thick client (app), I will attempt the same using a thin client (web browser).

the case for a virtual application server

Virtual hardware has been around for decades. Mainframes have long been able to host multiple operating systems, and IBM does good business selling boxes capable of hosting thousands of Linux instances. The economics of such co-location are very compelling.

Each such virtual OS instance thinks it is alone but is in fact sharing the hardware with others. Done well, this reduces the amount of hardware sitting idle waiting for work, something that was distressingly prevalent back when every single server application needed its own hardware.

Times have moved on, to the regret of Sun Microsystems, which was a leading casualty of the move from physical to virtual hardware. But all the box vendors have taken a hit.

But it is still the case that applications tend to get their own separate virtual box. The argument springs from several points. Developers develop against particular versions of software libraries and so depend on exact versions of application servers. A virtual OS instance tends to have only one instance of an application server, and so you end up with one OS instance per application, plus more for redundancy and DR.

The upshot is that to run your application you end up also running a stack of other software, software which is not part of your business logic. Without doing any useful work, all of it still needs to hum along and consume precious hardware resources. And that's with virtualization hopefully packing everything efficiently onto the hardware so as not to waste it.

Then there are all those system instances that need to be configured and maintained.

The virtualization of hardware saved a lot of space and service work in the server room. I recall having to go in there to load CDs and reboot servers. With virtualization this is all gone.

Still, I have to work with operating systems and application servers.

I want to find a way to get rid of those too.

Just as hardware was virtualized so that multiple applications could easily run on the same hardware, it is time for application servers to be virtualized too.

How should something like that work?

A virtual application server should be able to run any number of different applications simultaneously. Currently it is good practice to have a separate AS instance per application; for a virtual AS this no longer makes sense.

Physical hardware can run any number of virtual OSes (resources permitting) so the virtual AS should be able to run any number of applications.

This would be something quite different from a Java application running on Tomcat: a collection of Java classes, supported by libraries. But which versions of those libraries? The developer depends on very specific versions. It is not accidental that one Tomcat instance per application is preferred; the whole programming model demands it. So something new is needed.

One can argue that, with hyperlinks, the whole internet is such a virtual application server, in that all functions are available everywhere over URLs. Web services essentially work along these lines. But that still leaves the problem of all the application server instances, one for every web service. What if we could have one application server that could act as any web service, simultaneously?
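
A minimal sketch of that idea, using the JDK's built-in HTTP server: one process acting as several "web services" at once, dispatched by URL. The application names and paths are invented for the illustration, and a real virtual application server would of course need isolation, deployment and versioning on top of this.

    import com.sun.net.httpserver.HttpExchange;
    import com.sun.net.httpserver.HttpServer;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;

    // Sketch: one server process acting as several web services at once, dispatched by URL.
    public final class OneServerManyApps {
        public static void main(String[] args) throws IOException {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/invoicing", ex -> reply(ex, "invoicing app"));
            server.createContext("/inventory", ex -> reply(ex, "inventory app"));
            server.start();   // both "applications" share the same server instance
        }

        private static void reply(HttpExchange ex, String body) throws IOException {
            byte[] bytes = body.getBytes();
            ex.sendResponseHeaders(200, bytes.length);
            try (OutputStream os = ex.getResponseBody()) { os.write(bytes); }
        }
    }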

This ties into an objection I have to AWS. Setting up an application in Amazon's cloud infrastructure requires configuring OS and application server instances, and selecting the size and version of those instances. This suggests to me more scope for virtualization. True, it gives me a precise level of control, much like what I'd have if I hosted everything myself. But I don't want it.

When I drive a car I don't want to care about the surface of the road. I just want to get from point A to point B by car.

There is another trend going on now: thick clients, or apps, in the form most of us encounter them. Web services consumed by other applications rather than by thin clients have been around for a while, but the massive profusion of apps on tablets and cellphones is recent.

Many of these apps are stand-alone; there is no server component to their functionality. An app version of Solitaire can be solitary in every sense. But many, perhaps most, do have one. So to create an app is also to create a server-side application to go with it. Possibly one server application for every app, maybe more; though in many cases one server application can serve multiple apps, Google's APIs being an example. In any case a lot of server applications are required, with functionality tailored to the requirements of the app. App designers can avail themselves of APIs, like Google's mentioned above, or of other more or less generic web services; but in all those cases the app designer is constrained by someone else's design. To not be limited, app designers have to have a server application completely of their own design.

Here the virtual application server can come into its own.