Google Blog Search


Ages ago Google stopped providing excerpts for LiveJournal searches, but they’ve now made up for it with Google Blog Search, which lets you search the blogosphere or a particular blog with full excerpts (and some intelligence in terms of figuring out individual authors and blogs by name). It seems to work pretty well, although it appears you need to use the “http://www.livejournal.com/users/USERNAME” URLs even for paid users.
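
For anyone who wants to script that, here is a minimal sketch of normalizing a journal address to the “/users/USERNAME” form before handing it to the search. It assumes paid users' journals live at a USERNAME.livejournal.com subdomain; the function name `to_users_url` and the username `exampleuser` are made up for illustration, and nothing here talks to Google or LiveJournal.

```python
from urllib.parse import urlparse

USERS_PREFIX = "http://www.livejournal.com/users/"


def to_users_url(url_or_username: str) -> str:
    """Return the http://www.livejournal.com/users/USERNAME form.

    Accepts a bare username, a /users/USERNAME URL, or (by assumption)
    a paid-user style USERNAME.livejournal.com URL.
    """
    text = url_or_username.strip()

    # A bare username has neither a scheme nor a dot in it.
    if "://" not in text and "." not in text:
        return USERS_PREFIX + text

    parsed = urlparse(text if "://" in text else "http://" + text)
    host = (parsed.hostname or "").lower()
    path_parts = [part for part in parsed.path.split("/") if part]

    if host in ("www.livejournal.com", "livejournal.com"):
        # Already the long form: /users/USERNAME[/...]
        if len(path_parts) >= 2 and path_parts[0] == "users":
            return USERS_PREFIX + path_parts[1]
        raise ValueError("no /users/USERNAME path in " + text)

    if host.endswith(".livejournal.com"):
        # Assumed paid-user form: USERNAME.livejournal.com
        return USERS_PREFIX + host[: -len(".livejournal.com")]

    raise ValueError("not a LiveJournal address: " + text)


print(to_users_url("http://exampleuser.livejournal.com/"))
```

Anything that is not recognizably a LiveJournal address raises `ValueError` rather than guessing, since a wrong URL would just quietly return no results.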

(Cue doomsayers who will now predict that Google will begin excluding blogs from its regular search.)

The great part about this, if you ask me, is this:

[Search form: a Google search box with three scope options: “mendel’s journal”, “all of LiveJournal”, and “all blogs”]

Whee!


14 responses to “Google Blog Search”

  1. The funny thing is I wrote this just yesterday. It’s not obsolete…yet. Google certainly has the processing power to parse all the articles for keyword searches on their servers.

    It uses a lot of bandwidth, though. I ran it from 7:35 to 9:35 Pacific this morning, and it used almost 30MB of bandwidth. It’ll go up as more North Americans wake up and start posting.

  2. I can’t tell from their about page if they’re getting feeds or if they’re spidering. This certainly would mesh with what Brad mentioned in the post where he talked about the stream you’re using there, though, about companies asking for pings for all entries.

  3. I stole this and made it a “backdated” post for December 31, 2037 at 23:59 (the last date/time combo LJ currently accepts), so that it’s always the first post for people looking at my journal but doesn’t show up in friends view.

    Thanks for this!

  4. Ah, that makes sense. Was Google one of the ones you wrote the push feed thingy for?

    A user preference to choose between publishing the domain-alias URL, paid user URL, or free user URL would be neat, though. I suppose all I’d need to do is ping weblogs.com with a domain-alias URL whenever I posted and it wouldn’t know that it wasn’t LJ, though. (If I used the domain-alias feature for something other than a livegerbil.com URL, of course.)

  5. Oh hey, that’s a good idea. I just stuck it on my userinfo. I suppose the ideal way would be to integrate it into the style somehow. I really should get around to picking up S2.

  6. Now, what would be cool is being able to add Google (which happens to belong to someone already, pfft!) to your LJ Friends list, which would in turn allow the Googlebot to spider your Friends-locked entries for searching. Google’s search results obviously shouldn’t show an excerpt for these particular entries in the overall results, nor allow folks to view a cached version of the page … but it would at least provide a means of getting the entries into your own search results.

  7. Does updates.sixapart.com also provide feeds if someone has checked the preference to block search spiders? I’ve seen some (unverified) complaints from people who found their LJs on Google Blog Search when they thought they’d blocked Google by denying robots. The mechanism for supplying the data is entirely different, of course, but I expect many people will have assumed that their preference carried forward to this newer setup.