Showing posts with label lycos. Show all posts

19 November 2007

What's a Paglo?

That was my first question to Brian de Haaff, CEO of the eponymous company. This is what he said (more or less):

Francisco Paglo was a virtually unknown Italian explorer who first set sail as a lookout on Cadamosto's expedition to the Gambia River in 1455. Upon completion of a distance learning course in creative writing, he published a stirring account of the exploration from his viewpoint in the crow's nest, which was widely published throughout Europe. It ultimately caught the eye of Prince Henry the Navigator who was a Portuguese royal prince, soldier, and patron of explorers. Prince Henry summoned Paglo, and thanks to his generous funding, sent him on an expedition around Africa's Cape of Good Hope in 1460 to trade for spices in India. A storm pushed him off his target, and he finally dropped anchor in what is now known as New Zealand.

He never did set foot in India, but in New Zealand he remains a hero for bringing the country its first sheep, and his birthday (April 1) is celebrated every year with giant mutton pies. A growing movement has petitioned the government to officially establish the day as a national holiday — Dandy Mutton Day, in reverent appreciation for Paglo. On the eve of March 31 each year, children leave tiny bales of hay in their family rooms, hoping for the safe return of his ghost to their home and a flock of sheep for their family. Those who have been good the preceding year and have prepared fresh bales receive a bowl of lamb stew and freshly-knit wool socks and sweaters from their parents. But poor behavior and unkempt bales is frowned upon as a sign of disrespect, and these unfortunate kids receive a clump of manure.

And this is what the company does:

Paglo is a search engine for IT that specializes in searching the complex and varied data of IT networks, and in returning rich data reports in table and chart formats, as well as simple text hit lists.

As someone who was smitten with search engines ever since the early days of Lycos, WWWW and Inktomi, I was naturally highly receptive to this approach. Search has become the optic through which we see the digital world; applying it not just to traditional information, but also to corporate IT data is eminently sensible.

Things only got better when I found out that the search engine crawler was open source (GNU GPL to be precise). This makes a lot of sense. It means that people can add extra features to it to allow discovery of all kinds of new and whacky hardware and software through the use of plugins; it also means that people are more likely to trust it to wander around their intranets, gathering a lot of extremely sensitive information.

That information is sent back to Paglo, encrypted, where it is stored on their servers as a searchable index of your IT assets that can be interrogated. Now, obviously security is paramount here. I also worry about people turning up with a subpoena: after all, those search indexes will provide extremely useful information about unlicensed copies of software etc.; Paglo, not surprisingly, doesn't think this will be a problem.

There are other interesting aspects of Paglo, including its use of what it calls "social solving":

We do this by allowing all users to save their search queries and publish them for anyone’s use. The elegance here is that you can immediately access any query that’s been saved and made public, and run it against your own data. (Only the query syntax is published. The data itself, of course, is private to each user.) This is especially helpful when you need a query that searches out a complex relationship – such as between users and the applications they have installed on their desktops – and you do not know where to start. The permutations are endless, but since the core concept is the same, any saved query can be used against any set of network data.

But in many ways, the most interesting aspect of Paglo is its business model:

We are maniacally focused on delivering the most value, for the most users, as quickly as possible. To achieve this, we are removing barriers to getting started (like complex installation and cost) and making the service convenient to use. Our experience and the history of the Internet tells us that lots and lots of thrilled users of a free service are much more valuable than a handful of paying customers. If we are successful, you will love Paglo, use it daily, and tell your colleagues and friends.

Yup, that means that they don't have one, but they're really, really sure that if everyone uses them, they can find one. Of course, that's precisely what Google did, so there are precedents - but no guarantees. Let's hope the final business plan proves more credible than the explanation of the company name.

12 April 2007

Searching for an Answer

It was the arrival of the first-generation search engines like Yahoo and Lycos in the mid-1990s that turned a collection of disparate online data into a usable source of information. Today, Google's pivotal role in online activity is even more pronounced.

So it's no surprise that people are working on search engines for Second Life - the thinking being that once you can find anything there, it will be even more useful as a tool. But in virtual worlds, it's not so simple:

Second Life isn't the same as the World Wide Web (at least in how its users perceive it), and probably shouldn't be treated the same way as web pages, routinely scanned by search-engine bots. I'm pretty sure that Linden Lab would prefer to that Second Life be as permeable and open as the WWW, but it's got to take a definitive step in this direction. Currently, there is no true public data in Second Life: Linden Lab owns the data comprising the world, including user avatars and objects. On the other hand, the company's Terms of Service indicate that invasions of privacy are prohibited (section 4.1). I don't understand how user-privacy even exists in a world owned by one private entity. Any shift in resident privacy-expectations Second Life is ultimately up to Linden Lab, which hasn't seemed to have decided whether Second Life is a country or an internet--whether it is a government presiding over population of residents, or a service-provider to hundreds of thousands of users.

The problem is that most people put stuff on the Web because they want others to find it: there is a conscious act of exposing stuff there. In Second Life, people (naively) assume that it's "like" real life, in the sense that virtual objects are private unless explicitly exposed. Alas, no: anything in Second Life is just data, and as such susceptible to being farmed by search bots. As the post above points out, people must now decide how much privacy needs to be built into the system. Where the dividing line should be drawn between private and public in the virtual world is not at all obvious.

10 April 2006

Webaroo - Yawnaroo

Convincing proof that Web 2.0 is a replay of Web 1.0 comes in the form of Webaroo. As this piece from Om Malik explains, this start-up aims to offer users a compressed "best of the Web" that they can carry around on their laptops and use even when they're offline.

Sorry, this idea was invented back in 1995, when Frontier Technologies released its SuperHighway Access CyberSearch, a CD-ROM that contained a "best of the Web" based on Lycos - at the time, one of the best search engines. As I wrote in September 1995:

Not all of the Lycos base has been included: contained in the 608 Mbytes on the disc is information on around 500,000 pages. The search engine is also simplified: whereas Lycos possesses a reasonably powerful search language, the CyberSearch tool allows you to enter just a word or phrase.

Only the scale has changed....

27 March 2006

Searching for an Answer

I have always been fascinated by search engines. Back in March 1995, I wrote a short feature about the new Internet search engines - variously known as spiders, worms and crawlers at the time - that were just starting to come through:

As an example of the scale of the World-Wide Web (and of the task facing Web crawlers), you might take a look at Lycos (named after a spider). It can be found at the URL http://lycos.cs.cmu.edu/. At the time of writing its database knew of a massive 1.75 million URLs.

(1.75 million URLs - imagine it.)

A few months later, I got really excited by a new, even more amazing search engine:

The latest pretender to the title of top Web searcher is called Alta Vista, and comes from the computer manufacturer Digital. It can be found at http://www.altavista.digital.com/, and as usual costs nothing to use. As with all the others, it claims to be the biggest and best and promises direct access to every one of 8 billion words found in over 16 million Web pages.

(16 million pages - will the madness never end?)

My first comment on Google, in November 1998, by contrast, was surprisingly muted:

Google (home page at http://google.stanford.edu/) ranks search result pages on the basis of which pages link to them.

(Google? - it'll never catch on.)

I'd thought that my current interest in search engines was simply a continuation of this story, a historical relict, bolstered by the fact that Google's core services (not some of its mickey-mouse ones like Google Video - call that an interface? - or Google Finance - is this even finished?) really are of central importance to the way I and many people now work online.

But upon arriving at this page on the OA Librarian blog, all became clear. Indeed, the title alone explained why I am still writing about search engines in the context of the opens: "Open access is impossible without findability."

Ah. Of course.

Update: Peter Suber has pointed me to an interesting essay of his looking at the relationship between search engines and open access. Worth reading.