Category Archives: Uncategorized

Steganographic applications

Steganography is the practice of concealing something in something else, for instance hiding a text message inside a larger body of text. There is no cryptography; the security of the message, such as it is, comes from obscurity. That may be enough. Cryptography is computationally expensive and requires management of encryption keys, and it is no more secure than the keys employed. It appears that the better cryptographic algorithms are quite good, so attacks tend to focus on the keys.
Attacking steganography would be different. Rather than asking “where is the key?”, one would ask “is there anything there at all?”
There are applications for conducting steganography, capable of embedding a message inside an image, for example. But this is not about conducting steganography.

What I am after here is for the application itself to be steganographic. To conceal the application, not inside another application, like a trojan, but altogether.

What would it mean for an application itself to be steganographic?

Imagine an application as a book, and using the application as the equivalent of reading the book from cover to cover.
Let’s further imagine that we cut the spines off a number of books and shuffle the loose pages together like so many decks of cards. Assuming uniformity of paper quality, page size, layout and font, and no metadata on the page, having a loose page tells you little about which book it came from, or whether it came from any book at all.
If you have access to the full text of the original books that were shuffled together, you could do a text comparison and quickly identify which book the page came from. But in the absence of that there would be little to go on. Analyzing the text on the pages would yield some clues, and more or less probable guesses could be made, particularly if the books had comparatively few pages, i.e. if any one page contained a large fraction of the complete text.
The obfuscation could be further improved by cutting the pages up into individual lines of text.
Identifying a line as belonging to only one particular book would be difficult in many cases.

On a side note: what difference does it make whether we cut the pages up along the lines or across them, for the purpose of putting the original page back together? Again assuming uniformity of paper, layout and font, I venture that a line of text is harder to match than a column of text. There is a certain balance here. While a line contains more meaningful information, perhaps a whole sentence, it also has fewer markers, fragments of words, that would match it to the strips of paper on either side of it. You gain information in one way and lose it in another, specifically the anchoring within the larger body.

But the analogy we’re pursuing here is application functionality as reading a complete text: whole book to page to line.

Alright, so we have a large pile of individual strips of paper. Now, how do we read the book?

The premise is that the application, the book in our analogy, is concealed among a large body of code, where only those knowing exactly what to look for can find it. The secret in steganography is not the decryption key but how to find what you are looking for: the detection key, if you will.
A steganographic application would have to have a detection key, with which you can locate the application and without which you cannot. A link to the first page, in the book analogy.

In a book you can just turn the pages, but here all the pages are separated from one another. How do you get from page to page, or from one line to the next, in the more fine-grained analogy?

Each page has a link to the next page, but that link is only usable to someone who has a key; otherwise anyone stumbling on a page would discover the whole book. The idea is that a page does not guarantee a book. Given a pile of pages you can’t know how many books are in it, or even if there are any at all. All you have are disjointed pages. Or code.
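
As a sketch of how such keyed links could work in code: below, each fragment is filed under a label derived from a secret detection key with an HMAC, so holders of the key can walk the chain of fragments while everyone else sees an unordered pile. The names and the storage scheme are hypothetical, just to make the idea concrete.

import hashlib
import hmac

def fragment_label(key: bytes, book_id: bytes, index: int) -> str:
    # Derive the storage label for fragment `index` from the detection key.
    # Without the key the labels look random, so a pile of fragments reveals
    # neither ordering nor which book, if any, a fragment belongs to.
    msg = book_id + index.to_bytes(8, "big")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

def store_book(store: dict, key: bytes, book_id: bytes, pages: list) -> None:
    # Scatter the pages of one book into a shared store under keyed labels.
    for i, page in enumerate(pages):
        store[fragment_label(key, book_id, i)] = page

def read_book(store: dict, key: bytes, book_id: bytes) -> list:
    # Recover the pages in order; only possible with the detection key.
    pages, i = [], 0
    while (label := fragment_label(key, book_id, i)) in store:
        pages.append(store[label])
        i += 1
    return pages

# Usage: two books shuffled into one store; each key finds only its own book.
store = {}
store_book(store, b"key-A", b"book-A", ["page one", "page two"])
store_book(store, b"key-B", b"book-B", ["first", "second", "third"])
print(read_book(store, b"key-A", b"book-A"))   # ['page one', 'page two']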

What will the STORK bring us?

STORK is a federated security initiative from the EU. The acronym is ridiculously forced, being short for “Secure idenTity acrOss boRders linKed.” Less than auspicious.
It is supposed to secure interoperability between electronic identity (eID) schemes in different countries, with a few extra bits thrown in on top. Interoperability sounds very nice. Yes, let’s have some of that. I applaud the effort.

However, I am dubious about the stuff “under the hood”. Can it be made to work as promised, and even if it can, do we want it or need it?
I’ll examine this STORK thing more closely and report back here.

sailing the data ocean

What if there was a way for access to data to be authorized everywhere? If you were authorized to access a piece of data, you could get access to it wherever it happened to be located.

This is certainly not the way things work at the moment, but if it could be made to work in a convenient way, what should it be like?

When the first web browsers arose some 20+ years ago, static HTML pages and other media and document files were available by calling a URL over the HTTP protocol. Security was added, at first pretty coarse-grained: if you were logged in you could access pretty much anything. It got better.
Comprehensive tools became available to centrally manage all web access to any document, with the finest granularity. The writing of the rules of who should have access to do what, when and where, could be delegated out to those who actually were in a position to know.
But crucially, these tools could only manage access to stuff directly under their control, often operating in a reverse-proxy mode intercepting HTTP traffic. APIs were available through which other applications could tap into them and take advantage of the access control rules contained in them to do their own authorization. In this way the data under the control of a unified set of access control rules could be made corporate-wide, with access to all of the data in a corporation governed by rules maintained in one place. Everyone would play together in the same data security pool.
In practice this never happened. (Re-)writing applications to take advantage of the API of the chosen security software platform was too expensive. Other tools emerged to export the security rules from one software platform to another, leaving each to do its own enforcement through its own rule infrastructure. This didn’t work very well because it was too complicated. Rules are fundamentally about meaning, and meaning doesn’t translate easily. Nevertheless, this was an attempt to federate authorization.

Data protected by the same access control rule infrastructure is part of the same pool. A database is a single pool. It has its own internal security governing access to the individual pieces of data contained in it, but no reach outside. The database maintains its own list of who gets to access which column in what tables.
A server has its own internal arrangement for governing access to the data in its own file systems. It may also have access to remote file systems. Some remote file systems would be on other servers, which would govern access themselves (NFS, FTP, Samba etc.) and would therefore not be part of the server’s own pool.

If authorizations could be federated between pools, all data would exist in one big virtual pool.
A virtual pool made up of multiple physical pools: individual databases, file servers etc. At present this is difficult, as there may be user federation between some data pools, but each pool has its own authorization, its own way of enforcing access rules. The rules in one are not known, or directly enforceable, in another. There is no federated authorization.

Let’s further suppose that any piece of data in this virtual data pool, a data ocean really, is accessible over TCP with a URI. The URI may have various formats depending on what type of physical pool is being addressed.
For example, this would be the syntax of a URI accessing a directory (LDAP) store:

ldap[s]://hostname:port/base_dn?attributes?scope?filter

And this to access an individual file, using HTTP(S):

https://host:port/path

Access to one of the secure web reverse-proxies mentioned above would look like this too.

There would be many others. Note that the username and password do not appear; nor would there be any prompting for this information.
Access control would be through PAML tokens, passed in the headers. An SSL handshake would take place to establish the requesting entity’s authorization for the tokens presented.
Each physical pool is defined by the entity that controls access to it, and all of these entities, be they LDAP servers or file/web servers as in the URI examples above, must be equipped to handle PAML tokens to verify the authorization for a request. Through the acceptance of these PAML tokens the pools together form a virtual data ocean. Any application can call on data anywhere else and present PAML tokens for authorization.
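
As a sketch, a request against the HTTPS URI form above might look like this in Python. The header name, the token strings and the client-certificate setup are all assumptions for illustration; the text only specifies that tokens travel in the headers and that an SSL handshake proves the caller holds the matching private key.

import requests

# Hypothetical: the "X-PAML-Token" header name and token encoding are
# invented here; only "tokens passed in the headers" comes from the text.
paml_tokens = ["<base64 token 1>", "<base64 token 2>"]

response = requests.get(
    "https://host:443/path",                       # HTTPS URI form from above
    headers={"X-PAML-Token": ", ".join(paml_tokens)},
    cert=("client.crt", "client.key"),             # client side of the SSL
)                                                  # handshake (assumed setup)
print(response.status_code)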

This leaves quite a bit of scope for application architecture. The use of a PAML token requires access to the private key of the user to whom the token was issued. This means that if a user is engaged in a transaction with an application, and the application needs access to data kept somewhere else on behalf of the user, the application can only present its own PAML tokens, not forward those it has received from the user. The user must at a minimum contact this other data store directly and engage in an SSL handshake; this establishes the user’s ownership of the public key for the benefit of the data store. The application can then pass the PAML tokens received from the user on to the data store, and the store now knows that those tokens are OK to use. Alternatively, the user could make the data retrieval directly and pass the data to the application that needs it. Sort of like a data federation.

Note that PAML tokens are tied to data, not to any particular host environment. Among other things this means that the requesting client may send the server a considerable number of tokens in order to establish authorization for all required data. The server will grant the union of all these tokens.
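
A minimal sketch of that union, assuming a hypothetical token format in which each verified token carries a list of (resource, action) permissions:

def verify_signature(token: dict) -> bool:
    # Placeholder for checking the data owner's signature on the token.
    return token.get("signed", False)

def granted_permissions(tokens: list) -> set:
    # The server grants the union of the permissions in all valid tokens.
    union = set()
    for token in tokens:
        if verify_signature(token):
            union |= {(p["resource"], p["action"]) for p in token["permissions"]}
    return union

# Usage: two tokens together authorize both resources.
tokens = [
    {"signed": True, "permissions": [{"resource": "ldaps://hostname:636/base_dn", "action": "read"}]},
    {"signed": True, "permissions": [{"resource": "https://host:443/path", "action": "read"}]},
]
print(granted_permissions(tokens))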

If there was a pay version of Facebook, how much would it cost?

Social media platforms, broadly defined, come in two forms, “free” and for-pay… And never the twain shall meet. Or is that so?
Users of Facebook have no choice: the “free” version is all they’ve got. Yahoo Mail has both a “free” version and a pay version, but it is not clear that the pay version lacks the drawbacks of the free version: relentless surveillance and selling of the gathered information to third parties. Sometimes the privacy policy states that information is not passed to outside companies. This often means that the information is given to “partnerships”, i.e. to the same outside company, but payment is taken in a different form. Perhaps somewhat anonymized, but still valuable to the marketing department. Maybe I am being a little cynical here, but in business it is the more prudent path.

New social media outfits pop up left and right. Lately some have tried other revenue models. Ello caught my attention the other day. Like Facebook, they try to fake-start exclusivity by being by-invitation-only to start with. That will end soon enough. Apparently they are without advertising; “you are not the product” is their slogan. From the premise that those who pay are the customers, those who don’t are the product. That assumes cash is the only form of payment. Some people, indeed a great many, are quite content to pay with their (digital) life.

It will be interesting to see how Ello progresses. I wish them every good fortune, but I suspect that business/greed will get the better of them. Once there is information about users that can be monetized, it eventually will be.
Getting back to the original question: how much does a social media platform like Facebook cost to run? Add a suitable profit margin and we have the price facing the users. A dollar (US) a month? Two dollars? Five?
The problem can be broken down. There are storage costs, network costs, the cost of computing power, and development costs (and profits, of course).
The first three are falling by the day. The development costs should scale nicely too: more users, lower development cost per user. I am assuming the current level of customer care and support will remain the same, and estimate zero costs here. These do not add up to a definite estimate of the price, but they suggest that whatever it is at the moment, it will be lower in future. Economies of scale can be misleading in social media. There is no special value in having absolutely everyone on the same platform, only enough, i.e. those you want to be “social” with, or more particularly use the platform to be in touch with. LinkedIn is a special case where there is a premium on getting in touch with the people you DON’T already know. That being said, bigger is better.
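
As a back-of-envelope illustration of that breakdown, in Python. Every figure below is an assumption invented for the exercise, not a sourced number; the point is only that the plausible per-user cost lands well under the dollar-a-month mark.

# Hypothetical inputs; none of these numbers come from the text.
users = 500_000_000
storage_cost = 2.0 * 0.02         # 2 GB/user at $0.02 per GB-month
network_cost = 0.03               # per user per month
compute_cost = 0.05               # per user per month
development = 50_000_000 / users  # $50M/month spread over all users

cost_per_user_month = storage_cost + network_cost + compute_cost + development
price = cost_per_user_month * 1.5  # add a 50% margin, also an assumption
print(f"cost: ${cost_per_user_month:.2f}/month, price: ${price:.2f}/month")
# -> cost: $0.22/month, price: $0.33/month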

And that leads to my next point: switching costs. Building up a profile takes time and effort. Moving to another platform will in general mean starting afresh. Not quite with a blank slate, as most platforms allow you to import contact lists from various other applications, like Gmail and Yahoo Mail, and use those lists to find which of your contacts are already on the platform. Which in the case of a new platform is unlikely to be very many. Then there is any other data you have provided to the platform: pictures, writings etc. The data export features are unlikely to be very helpful. But some may find the loss of self-provided data to be the very reason to switch platforms and start fresh; it is not a bug but a feature.

Nevertheless, switching costs increase the platform owner’s pricing power, and data portability reduces it. Which means that all social media platforms have an interest in keeping the costs of leaving high and the costs of joining low. The winner-takes-all aspect of social media is well known. There is no prize for second place. Yet new “platforms” keep emerging. It appears that however much one platform seems to be in the lead, a new one can still emerge. Facebook supplanted MySpace but still felt compelled to buy WhatsApp. Now there is Snapchat. Many people are betting a great deal of money on what is going to be the next big thing. Few doubt that there will be one.
All of which suggests that while people have significant investments in whichever social media applications they happen to be using, they can and will move. This places an upper bound on what users will pay (in cash or in privacy). If the terms are too exorbitant, the users will move on that much more quickly.

XACML made portable with PAML

XACML has been pronounced dead. Repeatedly. And in truth it has never been much used. But I think it still has potential. The standard has been around for years (version 2.0 in 2005) and allows for quite a bit of flexibility, both role-based and attribute-based. Wikipedia provides a decent rundown on XACML; xacmlinfo.org is a superior resource for all things XACML.

Key for our purposes is the separation between decision and enforcement in XACML: the decision is made in one place and enforced somewhere else. This permits the portability we’re looking for. There is nothing in XACML directly mandating online services. A PAML token should be usable for an extended period of time, and XACML allows this.

An XACML policy sample:

<Policy xmlns="urn:oasis:names:tc:xacml:3.0:core:schema:wd-17" PolicyId="medi-xpath-test-policy" RuleCombiningAlgId="urn:oasis:names:tc:xacml:1.0:rule-combining-algorithm:first-applicable" Version="1.0">
  <Description>XPath evaluation is done with respect to content element and check for a matching value. Here content element has been bounded with custom namespace and prefix</Description>
  <PolicyDefaults>
    <XPathVersion>http://www.w3.org/TR/1999/REC-xpath-19991116</XPathVersion>
  </PolicyDefaults>
  <Target>
    <AnyOf>
      <AllOf>
        <Match MatchId="urn:oasis:names:tc:xacml:1.0:function:string-regexp-match">
          <AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">read</AttributeValue>
          <AttributeDesignator MustBePresent="false" Category="urn:oasis:names:tc:xacml:3.0:attribute-category:action" AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id" DataType="http://www.w3.org/2001/XMLSchema#string"></AttributeDesignator>
        </Match>
      </AllOf>
    </AnyOf>
  </Target>
  <Rule RuleId="rule1" Effect="Permit">
    <Description>Rule to match value in content element using XPath</Description>
    <Condition>
      <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:any-of">
        <Function FunctionId="urn:oasis:names:tc:xacml:1.0:function:string-equal"></Function>
        <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:string-one-and-only">
          <AttributeDesignator Category="urn:oasis:names:tc:xacml:1.0:subject-category:access-subject" AttributeId="urn:oasis:names:tc:xacml:1.0:subject:subject-id" DataType="http://www.w3.org/2001/XMLSchema#string" MustBePresent="false"></AttributeDesignator>
        </Apply>
        <AttributeSelector MustBePresent="false" Category="urn:oasis:names:tc:xacml:3.0:attribute-category:resource" Path="//ak:record/ak:patient/ak:patientId/text()" DataType="http://www.w3.org/2001/XMLSchema#string"></AttributeSelector>
      </Apply>
    </Condition>
  </Rule>
  <Rule RuleId="rule2" Effect="Deny">
    <Description>Deny rule</Description>
  </Rule>
</Policy>

The enforcement point examines the incoming request and creates an XACML request, which may look something like this:

<Request xmlns="urn:oasis:names:tc:xacml:3.0:core:schema:wd-17" ReturnPolicyIdList="false" CombinedDecision="false">
  <Attributes Category="urn:oasis:names:tc:xacml:1.0:subject-category:access-subject">
    <Attribute IncludeInResult="false" AttributeId="urn:oasis:names:tc:xacml:1.0:subject:subject-id">
      <AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">bob</AttributeValue>
    </Attribute>
  </Attributes>
  <Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:resource">
    <Content>
      <ak:record xmlns:ak="http://akpower.org">
        <ak:patient>
          <ak:patientId>bob</ak:patientId>
          <ak:patientName>
            <ak:first>Bob</ak:first>
            <ak:last>Allan</ak:last>
          </ak:patientName>
          <ak:patientContact>
            <ak:street>51 Main road</ak:street>
            <ak:city>Gampaha</ak:city>
            <ak:state>Western</ak:state>
            <ak:zip>11730</ak:zip>
            <ak:phone>94332189873</ak:phone>
            <ak:email>asela@gmail.com</ak:email>
          </ak:patientContact>
          <ak:patientDoB>1991-05-11</ak:patientDoB>
          <ak:patientGender>male</ak:patientGender>
        </ak:patient>
      </ak:record>
    </Content>
  </Attributes>
  <Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:action">
    <Attribute IncludeInResult="false" AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id">
      <AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">read</AttributeValue>
    </Attribute>
  </Attributes>
</Request>

The request is compared to the policy, and the request is allowed or denied accordingly.
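
To make the comparison concrete, here is a toy decision function in Python that mimics rule1 and rule2 from the sample policy. It is not a real XACML engine; it hard-codes the policy’s logic (permit when the subject-id equals the patientId in the resource content, otherwise deny) and uses a simplified form of the policy’s XPath.

import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xacml:3.0:core:schema:wd-17",
      "ak": "http://akpower.org"}
SUBJECT_CAT = "urn:oasis:names:tc:xacml:1.0:subject-category:access-subject"

def evaluate(request_xml: str) -> str:
    # Toy decision mimicking the sample policy above; not a real PDP.
    root = ET.fromstring(request_xml)
    # Pull the subject-id value out of the access-subject attributes.
    subject = None
    for attrs in root.findall("x:Attributes", NS):
        if attrs.get("Category") == SUBJECT_CAT:
            value = attrs.find(".//x:AttributeValue", NS)
            subject = value.text if value is not None else None
    # Simplified stand-in for the policy's XPath into the resource content.
    patient = root.find(".//ak:patientId", NS)
    if subject is not None and patient is not None and subject == patient.text:
        return "Permit"  # rule1, first-applicable
    return "Deny"        # rule2 catches everything else

# For the sample request above, evaluate(...) returns "Permit":
# subject-id "bob" matches the patientId "bob" in the content.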

The enforcement point must have the capability to create an XACML request from the actual request, and be able to compare it to the applicable XACML policy. This is where PAML tokens come in, as they can link the request with the policy that governs it, by placing the XACML policy inside the PAML tokens. PAML tokens are issued to users, and the user is responsible for sending a token (or possibly more than one) containing an XACML policy that will allow the request. The issuer of the PAML token owns the data and includes in the token the XACML policy containing the access control rules the data owner wants enforced.
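
What might such a token look like? PAML is the concept sketched in these posts, so the serialization below is invented purely for illustration; the firm requirements from the text are only that the token embeds the XACML policy and is signed by the data owner who issued it.

import json

# Hypothetical PAML token layout; field names are invented for this sketch.
paml_token = {
    "issuer": "data-owner.example.com",           # the data owner who signs
    "subject": "bob",                             # user the token was issued to
    "subject_public_key": "<base64 public key>",  # bound via the SSL handshake
    "not_after": "2016-01-01T00:00:00Z",          # tokens can be long-lived
    "xacml_policy": "<Policy ...>...</Policy>",   # e.g. the sample policy above
    "signature": "<issuer signature over all of the above>",
}
print(json.dumps(paml_token, indent=2))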

opening statement

This blog will be mostly about IT security and distributed data issues. And what better time to start than when the NSA’s pervasive monitoring of people on the web is getting some public attention. The NSA’s activities have been well known in the IT security industry for many years and broadly taken as a given; only with PRISM has the press more generally given this activity some attention.

However, the ethics and politics of government surveillance of individuals will not be a major focus of this blog, though there may be some incidental connection when it comes to technical details: businesses’ commercial and architectural decisions for their IT infrastructure may have some bearing on third parties’ ability to conduct surveillance more generally.

Some posts will be quite high-level and some very low-level. Occasionally there will be nitty-gritty implementation details; other times, more general exploratory posts concerning business and technical practices.