Showing posts with label search. Show all posts

02 February 2014

Interview: Eben Moglen - "surveillance becomes the hidden service wrapped inside everything"

(This was originally published in The H Open in March 2010.)

Free software has won: practically all of the biggest and most exciting Web companies like Google, Facebook and Twitter run on it.  But it is also in danger of losing, because those same services now represent a huge threat to our freedom as a result of the vast stores of information they hold about us, and the in-depth surveillance that implies.

Better than almost anyone, Eben Moglen knows what's at stake.  He was General Counsel of the Free Software Foundation for 13 years, and helped draft several versions of the GNU GPL.  As well as being Professor of Law at Columbia Law School, he is the Founding Director of the Software Freedom Law Center.  And he has an ambitious plan to save us from those seductive but freedom-threatening Web service companies.  He explained what the problem is, and how we can fix it.

GM: So what's the threat you are trying to deal with?

EM:  We have a kind of social dilemma which comes from architectural creep.  We had an Internet that was designed around the notion of peerage -  machines with no hierarchical relationship to one another, and no guarantee about their internal architectures or behaviours, communicating through a series of rules which allowed disparate, heterogeneous networks to be networked together around the assumption that everybody's equal. 

In the Web the social harm done by the client-server model arises from the fact that logs of Web servers become the trails left by all of the activities of human beings, and the logs can be centralised in servers under hierarchical control.  Web logs become power.  With the exception of search, which is a service that nobody knows how to decentralise efficiently, most of these services do not actually rely upon a hierarchical model.  They really rely upon the Web  - that is, the non-hierarchical peerage model created by Tim Berners-Lee, and which is now the dominant data structure in our world.

The services are centralised for commercial purposes.  The power that the Web log holds is monetisable, because it provides a form of surveillance which is attractive to both commercial and governmental social control.  So the Web with services equipped in a basically client-server architecture becomes a device for surveilling as well as providing additional services.  And surveillance becomes the hidden service wrapped inside everything we get for free.

The cloud is a vernacular name which we give to a significant improvement in the server-side of the web side - the server, decentralised.  It becomes instead of a lump of iron a digital appliance which can be running anywhere.  This means that for all practical purposes servers cease to be subject to significant legal control.  They no longer operate in a policy-directed manner, because they are no longer iron subject to territorial orientation of law. In a world of virtualised service provision, the server which provides the service, and therefore the log which is the result of the hidden service of surveillance, can be projected into any domain at any moment and can be stripped of any legal obligation pretty much equally freely.

This is a pessimal result.

GM:  Was perhaps another major factor in this the commercialisation of the Internet, which saw power being vested in a company that provided services to the consumer?

EM:  That's exactly right.  Capitalism also has its architectural Bauplan, which it is reluctant to abandon.  In fact, much of what the network is doing to capitalism is forcing it to reconsider its Bauplan via a social process which we call by the crappy name of disintermediation.  Which is really a description of the Net forcing capitalism to change the way it takes.  But there's lots of resistance to that, and what's interesting to all of us I suspect, as we watch the rise of Google to pre-eminence, is the ways in which Google does and does not - and it both does and does not - wind up behaving rather like Microsoft in the course of growing up.  There are sort of gravitational propositions that arise when you're the largest organism in an ecosystem. 

GM:  Do you think free software has been a little slow to address the problems you describe?

EM:  Yes, I think that's correct.  I think it is conceptually difficult, and it is to a large degree difficult because we are having generational change.  After a talk [I gave recently], a young woman came up to me and she said: I'm 23 years old, and none of my friends care about privacy.  And that's another important thing, right?, because we make software now using the brains and hands and energies of people who are growing up in a world which has been already affected by all of this.  Richard or I can sound rather old-fashioned.

GM:  So what's the solution you are proposing?

EM:  If we had a real intellectually-defensible taxonomy of services, we would recognise that a number of the services which are currently highly centralised, and which count for a lot of the surveillance built in to the society that we are moving towards, are services which do not require centralisation in order to be technologically deliverable.  They are really the Web repackaged. 

Social networking applications are the most crucial.  They rely in their basic metaphors of operation on a bilateral relationship called friendship, and its multilateral consequences.  And they are eminently modelled by the existing structures of the Web itself. Facebook is free Web hosting with some PHP doodads and APIs, and spying free inside all the time - not actually a deal we can't do better than. 

My proposal is this: if we could disaggregate the logs, while providing the people all of the same features, we would have a Pareto-superior outcome.  Everybody – well, except Mr Zuckerberg - would be better off, and nobody would be worse off.  And we can do that using existing stuff.

The most attractive hardware is the ultra-small, ARM-based, plug it into the wall, wall-wart server, the SheevaPlug.  An object can be sold to people at a very low one-time price, and brought home and plugged into an electrical outlet and plugged into a wall jack for the Ethernet, or whatever is there, and you're done.  It comes up, it gets configured through your Web browser on whatever machine you want to have in the apartment with it, and it goes and fetches all your social networking data from all the social networking applications, closing all your accounts.  It backs itself up in an encrypted way to your friends' plugs, so that everybody is secure in the way that would be best for them, by having their friends holding the secure version of their data.

And it begins to do all the things that we assume we need in a social networking appliance.  It's the feed, it maintains the wall your friends write on - it does everything that provides feature compatibility with what you're used to. 

But the log is in your apartment, and in my society at least, we still have some vestigial rules about getting into your house: if people want to check the logs they have to get a search warrant. In fact, in every society, a person's home is about as sacred as it gets.

And so, basically, what I am proposing is that we build a social networking stack based around the existing free software we have, which is pretty much the same existing free software the server-side social networking stacks are built on; and we provide ourselves with an appliance which contains a free distribution everybody can make as much of as they want, and cheap hardware of a type which is going to take over the world whether we do it or we don't, because it's so attractive a form factor and function, at the price. 

We take those two elements, we put them together, and we also provide some other things which are very good for the world.  Like automatically VPNing everybody's little home network place with my laptop wherever I am, which provides me with encrypted proxies so my web searching, wherever I am, is not going to be spied on.  It means that we have a zillion computers available to the people who live in China and other places where there's bad behaviour.  So we can massively increase the availability of free browsing to other people in the world.  If we want to offer people the option to run onion routeing, that's where we'll put it, so that there will be a credible possibility that people will actually be able to get decent performance on onion routeing networks.

And we will of course provide convenient encrypted email for people - including putting their email not in a Google box, but in their house, where it is encrypted, backed up to all their friends and other stuff.  Where in the long purpose of time we can begin to return email to a condition - if not being a private mode of communication - at least not being postcards to the secret police every day.

So we would also be striking a blow for electronic civil liberties in a way that is important, which is very difficult to conceive of doing in a non-technical way.

GM:  How will you organise and finance such a project, and who will undertake it?

EM:  Do we need money? Yeah, but tiny amounts.  Do we need organisation? Yes, but it could be self-organisation.  Am I going to talk about this at DEF CON this summer, at Columbia University? Yes.  Could Mr Shuttleworth do it if he wanted to? Yes.  It's not going to be done with clicking heels together, it's going to be done the way we do stuff: somebody's going to begin by reeling off a Debian stack or Ubuntu stack or, for all I know, some other stack, and beginning to write some configuration code and some glue and a bunch of Python to hold it all together. From a quasi-capitalist point of view I don't think this is an unmarketable product.  In fact, this is the flagship product, and we ought to all put just a little pro bono time into it until it's done.

GM:  How are you going to overcome the massive network effects that make it hard to persuade people to swap to a new service?

EM:  This is why the continual determination to provide social networking interoperability is so important. 

For the moment, my guess is that while we go about this job, it's going to remain quite obscure for quite a while.  People will discover that they are being given social network portability.  [The social network companies] undermine their own network effect because everybody wants to get ahead of Mr Zuckerberg before his IPO.  And as they do that they will be helping us, because they will be making it easier and easier to do what our box has to do, which is to come online for you, and go and collect all your data and keep all your friends, and do everything that they should have done.

So part of how we're going to get people to use it and undermine the network effect, is that way.  Part of it is, it's cool; part of it is, there are people who want no spying inside; part of it is, there are people who want to do something about the Great Firewall of China but don't know how.  In other words, my guess is that it's going to move in niches just as some other things do.

GM:  With mobile taking off in developing countries, might it not be better to look at handsets to provide these services?

EM:  In the long run there are two places where we can conceivably put your identity: one is where you live, and the other is in your pocket.  And a stack that doesn't deal with both of those is probably not a fully adequate stack.

The thing I want to say directed to your point “why don't we put our identity server in our cellphone?”, is that our cellphones are very vulnerable.  In most parts of the world, you stop a guy on the street, you arrest him on a trumped-up charge of any kind, you get him back to the station house, you clone his phone, you hand it back to him, you've owned him.

When we fully commoditise that [mobile] technology, then we can begin to do the reverse of what the network operators are doing.  The network operators around the world are basically trying to eat the Internet, and excrete proprietary networking.  The network operators have to play the reverse if telephony technology becomes free.  We can eat proprietary networks and excrete the public Internet.  And if we do that then the power game begins to be more interesting.

31 August 2011

Welcome to Moody's Microblog Daily Digest

I joined Twitter on 1st January 2010 as an experiment. I wanted to see whether this trendy thing had any real merit, or was simply the latest fad that would come and go. I was soon disabused of my prejudices about it being just for posting about what you had eaten for breakfast. Indeed, I discovered that the presence or absence of such culinary info was a very quick way of deciding whether someone should be unfollowed or not.

I was particularly impressed at the many different ways that people used Twitter. For some, it was truly an online diary, recording what they did, often in exhaustive (and exhausting) detail. For others, it was a way of passing on news far faster than traditional outlets. And for some it was evidently a real microblog – a way of publishing extremely short pieces of information with optional comments.

This turned out to be the way that I felt Twitter was most useful, and my own use soon conformed to this model. I realised that it solved a problem with blogging that I had been wrestling with for a while. I frequently came across stories that warranted passing on, but which looked decidedly thin when posted to one of my blogs. What I wanted was a quick way of saying: “hey, take a look at this – it's good/bad/stupid/funny/horrible” without needing to come up with anything more detailed in terms of analysis. What I wanted, it turned out, was Twitter.

As my followers there (and later on identi.ca and Google+) will know, I soon lost control completely, and started posting dozens of microblog posts a day. Indeed, I have had several people unfollow me because they say I post too many interesting links, which stops them working....

But for all that I feel my microblogs work well on their own terms, there is one huge problem. I have apparently posted some 43,000 of them in the last 20 months (really? How posts fly by when you're having fun...). Quite a few of them have useful information that I like to refer to. But it is a truth universally acknowledged that Twitter's search function is pretty useless. Even though I have supplemented this with bit.ly, which has its own search feature, it frequently happens that I can't find that super important link I posted a few months ago.

This is not just frustrating, it is becoming a serious problem. It means that the not inconsiderable effort that I put into choosing my links and commenting on them is effectively going down the digital drain.

So, in an attempt to preserve at least some of the more interesting posts, I have set up a new blog called, with stunning originality, “Moody's Microblog Daily Digest.” As its name suggests, each day this will provide a digest of those microblog posts that I think are worth keeping. These will be posted in an entirely minimal format, simply a paste of the microblog content – don't look for any prettiness here.

This will, I hope, have two advantages.

First, it will allow Google's not inconsiderable search engine capabilities to index stuff on the new site. That means any post should be retrievable by me and anyone who feels the need. Secondly, it offers an alternative way to deal with the Moody flood: not only will it be a pared-down list of microblog posts, but it will be one-per-day (I aim to update it during the day, and then close it at the end, although I'm not sure if that will mean multiple appearances in RSS readers...) This might help those who find that you can have too much of a good thing....

Obviously, I'll be reviewing how things go, and would appreciate any comments along the way as this latest experiment progresses.

Follow me @glynmoody on Twitter or identi.ca, and on Google+

17 September 2009

Analogue or Digital? - Both, Please

Recently, I bought the complete works of Brahms. Of course, I was faced with the by-now common problem of whether to buy nostalgic CDs, or evanescent MP3s. The price was about the same, so there was no guidance there. Of course, ecologically, I should have gone for the downloads, but in the end I chose the CDs - partly for the liner stuff you never get with an MP3, and partly because I have the option of degrading the CD bits to lossy MP3, which doesn't work so well the other way.

So imagine my surprise - and delight - when I discovered after paying for said CDs that the company - Deutsche Grammophon - had also given me access to most of the CDs as streams from its Web site, for no extra cost (I imagine the same would have been true of the MP3s). This was a shrewd move because (a) it made me feel good about the company, even though it cost them very little, and (b) I'm now telling people about this fact, which is great publicity for them.

But maybe my delight is actually a symptom of something deeper: that having access to both analogue and digital instantiations of information is getting the best of both worlds.

This struck me when I read the following story:

Google will make some 2 million out-of-copyright books that it has digitally scanned available for on-demand printing in a deal announced Thursday. The deal with On Demand Books, a private New York-based company, lets consumers print a book in about 10 minutes, and any title will cost around $8.

The books are part of a 10 million title corpus of texts that Google has scanned from libraries in the U.S. and Europe. The books were published before 1923, and therefore do not fall under the copyright dispute that pits Google against interests in technology, publishing and the public sector that oppose the company's plans to allow access to the full corpus.

That, in itself, is intriguing: Google getting into analogue goods? But the real importance of this move is hinted at in the following:

On Demand already has 1.6 million titles available for print, but the Google books are likely to be more popular, as they can be searched for and examined through Google's popular engine.

That's true, but not really the key point, which is that as well as being able to search *for* books, you can search *through* them. That is, Google is giving you an online search capability for the physical books you buy from them.

This is a huge breakthrough. At the moment, you have to choose between the pleasure of reading an analogue artefact, and the convenience of its digital equivalent. With this new scheme, Google will let you find a particular phrase - or even word - in the book you have in your hands, because the latter is a physical embodiment of the one you use on the screen to search through its text.
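
To make the "search through" idea concrete, here is a minimal sketch of how a phrase could be mapped back to page numbers in a scanned text. It is purely illustrative and says nothing about how Google actually does it: the function names `build_index` and `find_phrase`, the three-page sample text, and the simple rule for splitting text into words are all assumptions made for the example.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Build a positional index: word -> set of (page_number, word_position).

    `pages` is a list of page texts; page numbers start at 1, as in a
    printed book. Splitting on \\w+ is a simplifying assumption.
    """
    index = defaultdict(set)
    for page_no, text in enumerate(pages, start=1):
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].add((page_no, pos))
    return index


def find_phrase(index, phrase):
    """Return the sorted page numbers on which the words of `phrase`
    appear consecutively (phrases spanning two pages are not found)."""
    words = re.findall(r"\w+", phrase.lower())
    if not words:
        return []
    pages_found = set()
    for page_no, pos in index.get(words[0], ()):
        # Every following word must sit at the next position on the same page.
        if all((page_no, pos + i) in index.get(word, ())
               for i, word in enumerate(words[1:], start=1)):
            pages_found.add(page_no)
    return sorted(pages_found)


# Hypothetical three-page "book" used only to exercise the functions.
sample_pages = [
    "It was the best of times",
    "it was the worst of times",
    "best wishes of the season",
]
sample_index = build_index(sample_pages)
print(find_phrase(sample_index, "of times"))
```

With the printed copy in hand, the page numbers returned by a lookup like this are what would let you turn straight to the passage.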

The trouble is, of course, that this amazing facility is only available for those books out of copyright that Google has scanned. Which gives us yet another reason for repealing the extraordinarily stupid copyright laws that stop this kind of powerful service being offered for *all* text.

Follow me @glynmoody on Twitter or identi.ca.

03 March 2009

What the Hashtag?!

One of the reasons Twitter has taken off and become so popular (at least amongst sad people such as myself with nothing better to do) is that a rich ecosystem has sprung up around it, with all kinds of serious and silly services that build on its content. Here's one of the better ones, What the Hashtag?!:

Welcome to What the Hashtag?!, the user-editable encyclopedia for hashtags found on Twitter

What's a hashtag?

Hashtags are a community-driven convention for adding additional context and metadata to your tweets. They're like tags on Flickr, only added inline to your posts. Hashtags can be created by anyone simply by prefixing a word with a hash symbol: #myhashtag. Hashtags were developed as a means to create groupings of related content on Twitter.

This is an interesting way to access and index content, and adds an extra level of usefulness to Twitter.
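
As a rough illustration of how hashtags can serve as an index, the sketch below groups a handful of posts by the tags they contain. It assumes a hashtag is simply a hash symbol followed by letters, digits or underscores, which is a simplification (it would also pick up fragments in URLs); the name `index_by_hashtag` and the sample posts are made up for the example and do not reflect how Twitter or What the Hashtag?! work internally.

```python
import re
from collections import defaultdict

# Assumption: a hashtag is '#' followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")


def index_by_hashtag(posts):
    """Map each lower-cased hashtag to the posts that contain it,
    keeping the posts in their original order."""
    index = defaultdict(list)
    for post in posts:
        # Use a set so a tag repeated within one post is counted once.
        for tag in {t.lower() for t in HASHTAG.findall(post)}:
            index[tag].append(post)
    return dict(index)


# Hypothetical sample posts.
sample_posts = [
    "Reading about #opensource and #Search",
    "More #search news, still #search",
    "no tags here",
]
sample_tags = index_by_hashtag(sample_posts)
print(sorted(sample_tags))
```

Lower-casing the tags is a design choice so that #Search and #search land in the same group.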

24 April 2008

Is Cheating in Microsoft's DNA?

Seems so:

I was looking to see what search sites might have a particular bug that I (ahem) came across and was trying the search for the number 0 in various places. There is a pretty good Wikipedia page about zero. Zero has a rich and interesting history and there are many other potentially reasonable results.

But I was surprised to see MSN search had demoted their good results below some crappy ones from MSDN

18 May 2007

Google Enters the Fourth Dimension

It's a bit rudimentary at the moment, but Google's new Timeline view for searches is quite entertaining. (Via Vecosys.)

20 April 2007

Google Web History: Fantastically...What?

This looks really cool:

Web History: All the web sites you visit, at your fingertips.

* View your web activity.
* Search the full text of pages you've visited.
* Get personalized search results and more.

But frankly, I'm far too frightened to install it. The idea of not just giving all this data to Google (based in the US, remember, with that nice Mr. Bush in charge), but authorising it to track my every move online....Nein Danke. (Via Vecosys.)

14 December 2006

Google Does Patents...

...in the nicest possible way, with Google Patent Search:

As part of Google’s mission to organize the world’s information and make it universally accessible and useful, we’re constantly working to expand the diversity of content we make available to our users. With Google Patent Search, you can now search the full text of the U.S. patent corpus and find patents that interest you.

26 November 2006

Petard, Meet Australian Government

Welcome back to the dark ages, Australia:

Plugging a word or phrase into a search engine may soon give you fewer results if proposed new Australian copyright laws are adopted, according to Internet giant Google.

The laws could open the way for Australian copyright owners to take action against search engines for caching and archiving material, Google says in a submission to a Senate committee considering the legislation.

This could potentially limit the scope of the search engine results, which the Internet company describes as effectively "condemning the Australian public to the pre-Internet era".

This is what kowtowing to intellectual monopolies gives you. (Via Boing Boing.)

14 November 2006

Google: Is That the Sound of Crying?

Google search is useful - my day revolves around it. But you'd be hard-pushed to claim it was cool anymore. On the contrary, it's archetypally a tool that you use and forget about.

But this is cool:

OWL multimedia has launched an audio similarity search engine stocked with 10,444 CC-licensed tracks from ccMixter and Magnatune, with many more to come from other CC supporting sound repositories.

You can search OWL via search.creativecommons.org but its real power is finding new music through music. Drag an mp3 into the OWL interface and you will be shown tracks that sound similar to the mp3 you provided. You can select a segment of a track to search on and of course you can limit your search to tracks with licenses that permit uses you require, e.g., commercial or derivative use.

Google, are you listening?

10 October 2006

Going Down a FlickrStorm

I'm a big fan of Flickr, even if I don't have much call to use it. Perhaps one reason for that is that it's a bit of a pain finding stuff: tags are only approximate at the best of times. I think I might start using it some more thanks to FlickrStorm (update: now Wunderstock), a kind of search engine plus:

It works by looking for more than what you enter to find related and more relevant images... Be surprised!

When I gave it a whirl, I won't say I was deeply surprised, but maybe pleasantly so, on the basis of both the images it found, and the rather cool way it displayed them, with a scrollable set of thumbnails on the left that bring up the main photo on the right remarkably quickly. Worth taking a look. (Via OpenBusiness.)

05 July 2006

Wikifying Search with Swickis

Swickis are an interesting idea. As their mother-ship, Eurekster, explains:

A swicki is a new kind of search engine that allows anyone to create deep, focused searches on topics you care about. Unlike other search engines, you and your community have total control over the results and it uses the wisdom of crowds to improve search results. This search engine, or swicki, can be published on your site. Your swicki presents search results that you're interested in, pulls in new relevant information as it is indexed, and organizes everything for you in a neat little customizable widget you can put on your web site or blog, complete with its very own buzz cloud that constantly updates to show you what are hot search terms in your community.

If you want to see one in action, try archival, which helps you "find texts, images, audio, art, public-domain images and information, electronic books, and archival media." The interesting bit is that once you have done a search, you can suggest re-orderings of the results - just mouse over the entry, and use the options that appear to the right.