Scoundrel: A New Concept For Searching P2P

2001-03-06 01:12:24

One problem still haunts the peer-to-peer (P2P) world: how the hell do you find anything? Traditional search engines are impractical because by the time a P2P network has been spidered, the makeup of the network and the content will probably have changed. Real-time keyword searches are too slow. Yahoo-style index pages don't cut it. We need some crazy new ideas. Enter Scoundrel...

Searching P2P networks sucks. With systems such as Napster and Gnutella you put in keywords, or partial or full titles or artist names, and you get a big list of crap back. There are always tons of redundant results to sift through, and you have to decide which file to download. Once you finally decide on what to snarf, the file transfer bombs out halfway through, so you have to try again. This all assumes that you know exactly what you're searching for. Usually there is no cross-indexing.

Then there's the "catch as catch can" problem. Because P2P nodes connect to and disconnect from the network at unpredictable times, the availability of resources is constantly in flux. You might be able to find something one day, but the next day the same thing might not be available. If you really want to find something, you may need to search for it over and over again.

Some P2P systems barely have any way to search at all. To discover what's on Freenet, for instance, you generally have to look through big text files full of "keys" (Freenet's version of hyperlinks).

So what can be done about this?

The Scoundrel Project has developed a system for automating the process of searching, re-searching, and downloading things from P2P networks. The core idea is to have what the author of Scoundrel calls a "linkless index." This index, rather than being an index of what actually is on the network, is an index of what theoretically, or ideally, should be on the network. It's "linkless" because the actual index entries do not directly point to resources on the network.

The user can browse this index and select things that he or she would like to retrieve, regardless of whether or not the files are actually available on the network at that particular time. Later on, an agent (a bot, or what have you) retrieves the files. The agent does all the shit work of searching, re-searching, selecting the right file, downloading, retrying, etc., all in the background, or while you're away doing something fun. After all, that's what computers should be doing -- tedious drudge work.
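
The wish-list-plus-agent idea can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions, not Scoundrel's actual code: the names WantedItem, run_agent, search, and download are all made up for illustration, and the network is faked by passing in plain functions.

```python
from dataclasses import dataclass


@dataclass
class WantedItem:
    """One entry in the linkless index: what we want, not where it lives."""
    artist: str
    title: str
    done: bool = False
    attempts: int = 0


def run_agent(wanted, search, download, max_passes=3):
    """Search, re-search, and retry until every item is fetched or we give up.

    search(item)     -> list of candidate sources (hypothetical callback)
    download(source) -> True on success, False if the transfer bombs out
    """
    for _ in range(max_passes):
        pending = [item for item in wanted if not item.done]
        if not pending:
            break
        for item in pending:
            item.attempts += 1
            # Try each candidate in turn; stop at the first good transfer.
            for source in search(item):
                if download(source):
                    item.done = True
                    break
    return [item for item in wanted if item.done]
```

A real agent would presumably sleep between passes and keep going for hours, since whatever is missing now may show up when different peers come online.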

This idea has some interesting ramifications. Because the index is disconnected from the actual files on the network, no checks need to be performed to ensure that the index accurately reflects the contents of the network. Thus the index can be more complete, be maintained independently, and be kept up to date more conveniently. And with a big, comprehensive index, it will be easier to have good cross-referencing.

What's more, the user is free to browse the index at high speeds, selecting things willy-nilly like a kid in a candy store, without waiting for a search to complete, downloads to finish, or being disappointed when a real-time search turns up an empty result set. Everything is ultra-responsive, and it's a better user experience.

Another implication of this strategy is that the network doesn't have to be fast. Some P2P systems are blecherously slow right now. This will certainly change as the software develops and the systems get more users, but at the moment the wait to get a file can be maddening. But who cares if you're not doing all the waiting yourself?

There've also been strange and evil rumblings from the Digital Millennium Copyright Act (DMCA) people about how innocent hyperlinks to net resources containing copyrighted material will be considered some kind of horrible copyright infringement in themselves, punishable by hanging and whatnot. This could put an ugly chill on the whole Internet. A linkless index steps around this issue quite handily.

Although this linkless index strategy can be used for all kinds of data, it is obviously well-suited for digital music trading (e.g. MP3s), which seems to be the focus of many of the P2P projects right now. So to build its proof-of-concept application, the Scoundrel Project decided to use an existing, highly-developed database for its linkless index: Amazon.com. Amazon's music index is huge, cross-indexed, chock-full of user reviews, and has all sorts of handy features which make it great for browsing music titles.

Here's how the Scoundrel program works: You fire it up and configure it to know about several OpenNap servers -- the open source clone of the Napster system. Currently, Scoundrel only works with OpenNap. Next, you use Scoundrel's built-in Web browser to navigate Amazon's music section. Scoundrel watches as you browse, and when you visit the description page for a CD, Scoundrel automatically picks up the title, artist, and track listings. You are given an opportunity to review the list of stuff that Scoundrel has created, and to modify and delete items. When you are ready to have Scoundrel go to work for you, you hit the "get'em" button, and it crawls the various OpenNap servers looking for MP3s of the music you want. Then you can minimize Scoundrel and play some Nethack or whatever, or continue to browse Amazon for even more goodies.
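
The step from "title, artist, and track listings" to "MP3s of the music you want" boils down to building keyword queries and then deciding which of the returned file names really is the right track. Here is a minimal sketch of that matching step, assuming search results come back as plain file names. The function names (normalize, build_queries, matches, pick_file) are hypothetical, and nothing here is taken from Scoundrel's source or from the OpenNap protocol.

```python
import re


def normalize(text):
    """Lowercase and collapse every run of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def build_queries(artist, tracks):
    """One keyword query per track: normalized artist plus track title."""
    return [f"{normalize(artist)} {normalize(track)}" for track in tracks]


def matches(filename, artist, track):
    """True if the file name mentions both the artist and the track title."""
    name = f" {normalize(filename)} "
    return f" {normalize(artist)} " in name and f" {normalize(track)} " in name


def pick_file(results, artist, track):
    """Return the first matching MP3 from a list of file names, or None."""
    for filename in results:
        if filename.lower().endswith(".mp3") and matches(filename, artist, track):
            return filename
    return None
```

Matching on whole normalized words keeps punctuation and underscores in file names from getting in the way, but it is still crude: misspelled or oddly tagged files will slip past it.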

This works surprisingly well. As a test, I chose a few CDs from Amazon's "Top Sellers" list, set Scoundrel loose, and went to breakfast. When I came back there were at least two complete CDs in MP3 form on my hard drive, and several partial CDs. And Scoundrel was still out busting ass for me. Anyone who has ever spent all night on Napster trying to put together an entire track list of MP3s knows how cool that is.

Scoundrel isn't perfect, and neither is the linkless index idea. Even though there are copious widgets and screens ostensibly indicating what the program is doing, it's hard to figure out. Sometimes it seems to just hang, and sometimes it doesn't seem to search for all of the things you tell it to. But after all, it is just a proof of concept. The author calls it a "technology preview." And while a comprehensive index of music files already exists, and there are other databases for things such as movies (e.g. IMDB), how do we deal with P2P resources that aren't already in a nice tidy index somewhere? And how would you create an index for that stuff?

Another thing is that Scoundrel only runs on Windows. It's an open source project under the GNU GPL, but it's written in some sick language like Delphi (which may be excusable considering that it's only a prototype).

Despite these concerns, I give a big warm Beaujolais to the Scoundrel Project!

There is one last intriguing thing to say about Scoundrel. The author of the program is a mystery man. He remains anonymous to this day. On March 1st, just after releasing the latest incarnation of Scoundrel, he posted a message on the Scoundrel home page announcing that he is abandoning all work on the project, and will never be heard from again, although he hopes that others will continue work on the project. This is from the Scoundrel web page:

Well, so much for what Scoundrel has and has not done. As of today, March 1st, 2001, I will no longer be able to continue development on Scoundrel. I'll be disappearing from the face of the earth and will not be reachable. I will not go into the reasons behind this.

Could it be that the big media companies got to him too? Is the RIAA playing hardball behind the scenes? Will we ever know?

In the meantime, give Scoundrel a whirl.

Check it out yourself

Over. End of Story. Go home now.

ozzyluvr@pigdog.org
