Showing posts with label tools. Show all posts

Saturday, January 31, 2009

Black swans and Google failure {Ramble}

I've been re-reading this article from last year in the Times (via George Dvorsky and BoingBoing) concerning Nassim Nicholas Taleb and black swans.

As I write, Google seems to be going through a bit of a crisis, flagging every site its searches return with "this site may harm your computer" (it even spawned a Twitter hashtag, #googmayharm) and redirecting you to an interstitial page:


Anyway, it occurs to me that Google going bankrupt, or suffering some huge failure that wipes out all the data stored in, say, Blogger, would be a black swan event: high impact and widely unpredicted.

This is exactly why I've taken to making local backups of my blog with Blogger Backup. In the event of Google getting fubared I can be back up and running within two shakes of a WordPress template.

I've also been doing something similar with my Delicious account, using this website to create a local XML copy of all my Delicious bookmarks.

Incidentally Delicious is now becoming really useful: it's got to the stage (with 2096 tags and 2009 saved URLs) where it acts as a sort of private search engine of stuff I know I'll already be interested in.
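
The same sort of lookup can be run offline against a local XML backup. Here is a minimal sketch in Python, assuming the backup looks like a Delicious `posts/all` export (a `<posts>` root holding `<post>` elements with `href`, `description` and space-separated `tag` attributes). The function name `find_by_tag` and the sample data are made up for illustration, so adjust them to whatever your backup file really contains:

```python
import xml.etree.ElementTree as ET


def find_by_tag(xml_text, wanted):
    """Return (description, url) pairs for bookmarks carrying the tag `wanted`."""
    root = ET.fromstring(xml_text)
    matches = []
    for post in root.iter("post"):
        # Assumed layout: all tags sit space-separated in one attribute.
        tags = post.get("tag", "").lower().split()
        if wanted.lower() in tags:
            matches.append((post.get("description", ""), post.get("href", "")))
    return matches


# Made-up sample standing in for a real backup file.
SAMPLE = """<posts user="example">
  <post href="http://example.com/wget" description="wget notes" tag="tools backup"/>
  <post href="http://example.com/r" description="R intro" tag="statistics"/>
</posts>"""

print(find_by_tag(SAMPLE, "backup"))
```

Matching is on whole tags and ignores case, which is about as far as a private search engine needs to go for a couple of thousand bookmarks.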

But my obsession with long-term data storage (and I mean for Long Now values of long term) has since been piqued by this article by Charles Stross. Stross is talking about data formats being essentially forgotten after a few decades and data stored in those formats becoming inaccessible.

But the value of something like my Delicious XML backups may also change for me, because the bookmarked websites themselves might drop off the web.

So I'm currently looking for some software that will save all the pages associated with the URLs in my Delicious account as local HTML files.

LATER: Well, Google is working properly again, and I found something like what I just described (a means to acquire local copies of all my Delicious sites): the wget tool on UNIX-based systems.

wget actually looks really awesome.

I know I should bite the bullet and switch to either Apple or Linux but I haven't got round to it yet, so in the meantime I'm looking for something similar to wget but for Windows...
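
In the meantime, the basic idea is easy enough to sketch in Python, which runs on Windows as well as UNIX. This is only a rough illustration, not a replacement for wget: it assumes the Delicious backup is an XML file of `<post href="...">` elements, and the names `bookmark_urls`, `local_name`, `save_all` and the `delicious_pages` folder are all invented for the example.

```python
import pathlib
import re
import urllib.request
import xml.etree.ElementTree as ET


def bookmark_urls(xml_text):
    """Pull every href out of the backup (assumed <post href="..."> layout)."""
    posts = ET.fromstring(xml_text).iter("post")
    return [p.get("href") for p in posts if p.get("href")]


def local_name(url):
    """Turn a URL into a flat, filesystem-safe file name ending in .html."""
    stem = re.sub(r"^[a-z]+://", "", url.lower())       # drop the scheme
    stem = re.sub(r"[^a-z0-9]+", "_", stem).strip("_")  # squash everything else
    return stem[:100] + ".html"


def save_all(xml_text, out_dir="delicious_pages"):
    """Fetch each bookmarked page and write it into out_dir."""
    out = pathlib.Path(out_dir)
    out.mkdir(exist_ok=True)
    for url in bookmark_urls(xml_text):
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                (out / local_name(url)).write_bytes(resp.read())
        except Exception as err:
            # One dead link should not stop the whole run.
            print("skipped", url, err)
```

Calling `save_all(open("delicious.xml", encoding="utf-8").read())` (with `delicious.xml` standing in for wherever the backup lives) would fill the folder with one file per bookmark. It saves only the HTML of each page, not its images or stylesheets, and two URLs that differ only in case or punctuation would overwrite each other.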

Wednesday, January 28, 2009

The nolinkvisit rule of worthwhile Internet stuff

Like all denizens of the Interwubz I have, over the course of several years, collected records of everything I've read, touched, looked at, linked to, discussed, commented on, or visited - the sticky trail of webby spoor that will undoubtedly follow me for the rest of my life.

I back up bookmarks at Delicious, Google Bookmarks, and locally, both in my browser and in big HTML gloms in my weekly and monthly backups.

The other day when I did this I realised that the blog of Scottish science fiction writer Ken MacLeod was not included in any of my bookmarks! Over the years I've been reading his excellent weblog I've simply got into the habit of clicking through to it whenever I visit Charles Stross' similarly excellent weblog. (I've never liked RSS aggregators: I've tried FeedDemon but found it deeply unsatisfying somehow.)

It occurs to me that this behaviour can act as a kind of litmus test for genuinely excellent online stuff.

If you care enough about something to remember it without creating a link on your desktop or browser then it's almost certainly worthwhile.

Call it the nolinkvisit rule.

Thursday, January 08, 2009

R programming language

R, a programming language designed specifically for data mining and statistics, is discussed in the New York Times:

“R has really become the second language for people coming out of grad school now, and there’s an amazing amount of code being written for it,” said Max Kuhn, associate director of nonclinical statistics at Pfizer. “You can look on the SAS message boards and see there is a proportional downturn in traffic.”




[via Slashdot] [image from R Project Website]

Friday, November 28, 2008

Wikipedia visualisation tool

There's an interesting Wikipedia visualisation tool here, called WikiDashboard, from PARC [via Magical Nihilism]. It gives a real insight into who edits what, and when, in Wikipedia: