30 Jul 2010

Every Doctor Who Villain 1964-2010 Wordled

Author: will | Filed under: data, reuse, television

Exactly what it says on the tin. Here is a pretty wordle (OK, black and white, it should be in the new colours, but I’ll go with the 1963 style) of all the villains’ appearances in the TV series Doctor Who, all prettied up.
The Guardian created the Google spreadsheet (crowd sourced by a lot of fans naturally) of all the villains in Doctor Who and why they did what they did. Take over the universe is a fairly popular reason. Something to remember for your end of year review.

I took a much simpler subset of their spreadsheet: simply the name of the villain and the number of episodes they have been in. Wordle has an advanced setting where, instead of counting the words themselves, you can format the list as word-and-weight pairs in an Alpha:10 Beta:20 type format instead of having to type Alpha out ten times. It works fairly well for visualising your analytics as to how people find a site. It’s not pure data visualisation. I still want to drop the spreadsheet in to Pivot and do some of the “why” analytics.
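For anyone wanting to try the same trick, here is a minimal sketch of turning a name-and-count table into that paired format. The function name and the sample data are made up for illustration (the Alpha and Beta values are just the placeholders from the paragraph above, not real episode counts), and putting one pair per line is my assumption about what the advanced input box accepts.

```python
def to_wordle_weights(counts):
    """Format {name: count} as 'name:count' lines, heaviest first.

    Hypothetical helper: the 'word:weight' shape follows the
    Alpha:10 Beta:20 format described in the post; one pair per
    line is an assumption.
    """
    lines = []
    # Sort by descending count, then by name so the output is stable.
    for name, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        # The colon is the separator, so remove any colon from the name itself.
        safe_name = name.replace(":", " ").strip()
        lines.append(f"{safe_name}:{count}")
    return "\n".join(lines)


# Placeholder data only, taken from the example format in the text.
sample = {"Alpha": 10, "Beta": 20}
print(to_wordle_weights(sample))
```

Paste the printed lines in to the advanced box rather than the normal text box, otherwise the numbers get counted as words.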

This time it is simply a homage as to how often the Daleks have been on Doctor Who.
Wordle: Doctor Who villain since 1963 Click through for much bigger

And no, I’m not a rabid fan boy. I’d need to work on it more. Any other data sets I can play with?

5 Apr 2010

iPad musings

Author: will | Filed under: data, format, humour

Is the “iPad for Dummies” book going to be released in iBook format or just for the Kindle and Sony Reader?

Part of this post is for a Tuesday Push that is, namely Decisions for Heroes, and partly for a push that should be, Kildare Street.

Decisions for Heroes is a project that Robin Blandford has been working on for a while. And talking about it. In fact I assumed that the product had been launched a few months back. I was wrong; today is launch day.

And he’s built something amazing – technology that will help rescue teams save more lives. It’s essentially a project management tool combined with an incident reporting mechanism that’s able to monitor team histories and readiness and raise alarms for expiration or under-manning conditions.

What makes it different is that it is designed for a particular niche: rescue teams. Are the exercises and training reflecting the actual calls? Or the actual locations? Are there enough cliff climbers on-call this weekend? Are there certifications that are about to run out? This kind of thing actually saves lives. It’s been studied: over 1,800 rescuers from Ireland, UK, USA, Greece, and Australia helped to trial and shape the development of the software. But one stands out. Robin is a volunteer member of the Irish Coast Guard (a cliff rescue climber to be precise) so he has seen first hand what is needed, and what is the most useful way to get that information across.

I’m sure that the basis of D4H can be used in more business-like settings, or indeed in logistics-based industries.

And from saving lives, we move to a performance management technology that may cost the careers of a few politicians.

Created by John Handelaar, Kildare Street is, almost simply, a database. A database of what is being said in both Houses of the Oireachtas, by whom, when, how often and the complete text of what they say so it can be parsed for content. Based off the UK project, theyworkforyou.com, you can keep an eye on your favourite politician, or all the politicians in a constituency, or even when a particular word or phrase is spoken in the Dáil or Seanad Éireann debates or in written answers or questions to the Dáil.

There are a few bugs still in the system (it is a beta, and since Irish addresses are vague it can misidentify a constituency, particularly when one side of a road is in one constituency and the other side is in another. It happens), and there is up to a 24 hour delay between the speech in the chambers and the text of the speech hitting the system (not a fault with the system but with the source, debates.oireachtas.ie).

It’s useful for finding out which TD or Senator has stayed quiet all along (the records go back to 2004), and finding out how they actually voted on subjects of concern to you. Then you can challenge them when they call around asking for your vote.

Do challenge them. Right now, I’m wondering if there is a version for the MEPs.

Two people who should be praised for being heroes and making a difference.

Will Knott

8 Apr 2009

A budget in the clouds

Author: will | Filed under: 2009, Ireland, Irish, data, format, investment, mixing, technology

Well the budget is over, and the government released the text of the Financial Statement (the budget preamble if you will) made by Brian Lenihan yesterday.

So I ran it through Wordle to see what patterns emerged.

wordle Statement of the Minister for Finance Mr Brian Lenihan, T.D. 7 April 2009

We have a big “government”, a large “tax” and a much smaller “payment”, “pay” and “spending”. Oddly, “Public” is almost as big as Government.

Now then, it’s unlikely that a Minister would put the text of their speech in to something like Wordle, but if they did, the resulting speech might be, well, interesting.

They have the statements from other years too. I think I might have to play with comparison tools (once I’m on a more powerful machine).

take care,
Will Knott

Given yesterday’s post, a history lesson is in order..

“History of the Internet” is an animated documentary explaining the inventions from time-sharing to filesharing, from Arpanet to Internet.
The history is told using the PICOL icons on picol.org. PICOL is a project for providing free and open icons for electronic devices. The aim is to find a common pictorial language for electronic communication.


History of the Internet from PICOL on Vimeo.

Some feed readers may need to click through to see the video.

take care,
Will Knott


1 Jan 2009

On the 6th day

Author: will | Filed under: 2009, Barcamp, Cork, Dublin, Ireland, conference, data, personal information

Well I’m going to be running silent for a few days (I could automate a few posts, but I’d rather be honest).

Eireprenure and Will King above it all

Things are changing in my life. Firstly I have a contract in Dublin. Great stuff you say. Yes it starts tomorrow. And secondly, on Monday I have exams for a module in my Masters. In Cork.

Slight geography problem.

Given my job hunt has been going for a little while (and my new employers don’t mind losing me for the exam day) I’m afraid to say that the opportunity of paid employment wins this particular day.

This means that I have to have a little conversation with the course head to talk about the possibilities of dropping out, putting things in limbo, transferring to another college or seeing what can be done remotely.

In the meantime, I have a few days of intensive study (I’m not cramming, I’m refining) ahead of me. So silent running until Tuesday 6th of January.

In the meantime, I’ll direct you to a forthcoming event.

TeenCamp Ireland is due to take place in January in Filmbase in Dublin. TeenCamp Ireland is a gathering of the techies/bloggers/fanboys age 13+ in Ireland to give talks, meet others, share ideas and have a laugh. TeenCamps are organised/planned/run by teens for other teens. So I’m probably a little too old to go (I might show up and embarrass a few faces).

4 days to my exams and 16 days until TeenCamp.

Good luck one and all,

Will Knott

The idea struck when Carl availed of a cheap flight to Cork and took his Sat Nav GPS device with him. He also took a camera so there will be more photoblogging soon. Given that I was going to meet him a few days later at TechLudd he lent me the device for the trip.

Cork airport rocket. Image by Will Knott via Pix.ie

This Cork-Dublin trip was the first time I had actually used a GPS Satellite Navigation system the way it was intended and I’m wondering if I might suggest a few things (and this might warrant giving away a business model, but I think only Apple could use it. If so, I want one). If this already exists… tell me about it please.

Driving along with both a Sat Nav and an MP3 player (full of podcasts) talking at me it occurred to me that having an integrated unit which would pause the audio to give directions would be wonderful. I think it could work in much the same way that the iPhone or some Nokia devices pause playback when a telephone call comes through the device; integrating a hands free phone would be useful for pedestrian use around a city anyway.

In fact it might make sense to integrate a hands free phone in to the device (see the Nokia and Apple angle folks?). Given the cost of these things, and how tempting they are to thieves, it makes sense to have a reason to put it in your pocket.

However the nasty part of owning a Sat Nav is not the annoying voice (it can be fixed) but, as Adam explains, the cost of the maps. So how about incremental updates via “the cloud” and wireless access?

Imagine the scenario. You want to drive from Cork (say for the Learning Festival this fortnight) to the Ice Cream Ireland book launch at the end of April. Later to the 3D Camp event in Limerick in June. Then to the Open Coffee Club BBQ in Terryglass, Co. Tipperary in either June or July. And maybe even a quick trip to London in July, or San Francisco in April. If you want up-to-date Sat Nav maps you would have to check for an update, and buy a map for the entire country (if it’s available). But what about an incremental update?

In April you plan your trip from its starting point (Cork) to your destination (Killarney) and the Sat Nav calculates both the route and what map data is available. It could then use that calculated route to check if there has been an update in that area (say within 20 miles of the calculated route) by checking in to the “iTunsMaps” or “KlubNocia” online store to see if there is an incremental map update (or 2), and offer to sell you the map updates on the calculated route for 1.99 (be it Dollars or Euro).

You are not getting the whole map, but a single route. Similar to buying a single track and not the album. And you are choosing to update the route based on the age of the stored route (and the update naturally).
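To make the idea a bit more concrete, here is a rough sketch of the check the device would do. Everything in it is hypothetical: the map is assumed to be cut into named tiles with version numbers, the route is assumed to arrive as a list of tile names that already includes the 20 mile corridor, and the function names, tile names and flat 1.99 price are placeholders rather than anything a real Sat Nav or store exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """One map tile and the newest version the store offers (hypothetical model)."""
    tile_id: str
    version: int


def tiles_needing_update(route_tile_ids, local_versions, store_versions):
    """Return the tiles along the route where the store has something newer.

    route_tile_ids: tile names along the calculated route, in travel order.
    local_versions: {tile_id: version} for the maps already on the device.
    store_versions: {tile_id: version} as reported by the online store.
    """
    stale = []
    seen = set()
    for tile_id in route_tile_ids:
        if tile_id in seen:
            continue  # a route can cross the same tile more than once
        seen.add(tile_id)
        local = local_versions.get(tile_id, 0)       # 0 means no map data at all
        latest = store_versions.get(tile_id, local)  # store may not list the tile
        if latest > local:
            stale.append(Tile(tile_id, latest))
    return stale


def build_offer(stale_tiles, flat_price=1.99):
    """Bundle the stale tiles into a single per-route offer, or None if up to date."""
    if not stale_tiles:
        return None
    return {"tiles": [t.tile_id for t in stale_tiles], "price": flat_price}


# Made-up tiles for the Cork to Killarney trip.
route = ["cork", "midway", "killarney"]
on_device = {"cork": 3, "midway": 2, "killarney": 2}
in_store = {"cork": 3, "midway": 3, "killarney": 4}
print(build_offer(tiles_needing_update(route, on_device, in_store)))
```

The point of pricing the bundle rather than each tile is the single track idea: you pay once for the route you are about to drive, however many tiles it happens to cross.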

In addition, a pedestrian probably wouldn’t need the whole country map while he or she is in one city (the quick trip to San Francisco). So downloading a city Sat Nav map (with points of interest, an event guide for that week and free wi-fi hotspots) could cost the same as a single album would.

Straining the point, the current map sales model is akin to being forced to buy a 50 year anthology every time you want to hear a single track.

Given the advances in mapping database systems (yes, geographic databases of geography do exist, it’s Sat Nav data) such modular updating could be useful. You could even automate it for a road trip (and hope the connection lasts to let you know that the old road is now a dead-end with a wall across it).

So guys. Think it could work and is a viable model for a business?

take care,
Will

I have the nasty feeling that I have more questions than answers but here goes…

The old days

Before technology, life in the office was simple. You had documents, and you filed them away. They were big, bulky and paper based (once stone, vellum and papyrus had their days). Sometimes documents got lost (down the back of the filing cabinet), sometimes documents were destroyed (blessed be the shredder despite projects to restore shredded documents using software). Rarely did physical documents end up in the hands of the wrong person (but it happened). Then came easy duplication. And then came electronic records.

Electronic records, or data to give it an even more generic name, are everywhere. Data can be automatically collected and stored. When I first raised “data loss” I simply assumed I would stay on simple technical grounds such as “hard disk crash” or indeed losing the financial data of 25 million people in the post. Some of the issues are technical, some are legal, but all are social.

Never enough

Disk drives get larger to cope with the torrent of data. Much in the same way that “you can never be too rich” it’s true that “you can never have too much disk space”. However… As data volume grows, our ability to weed out the wheat from the chaff declines. It’s easy to say ‘never throw out anything, in case it’s needed’. It also lets you avoid the boring (and possibly compromising) task of deleting data you don’t need. However, then your operational budget bloats – it costs as much to look after useless data as expensive data. If it goes on long enough, you can’t do anything about it; it’s possible you won’t ever remember what most of it is.

This is where one part of the legal framework stands. If you are, say, automatically collecting all the web sites that a certain IP address connects to, how long should you hang on to it? How long is it legally useful for? And worth keeping for? (Digital Rights Ireland have a few things to say on this.) There is also a technical problem… If an Internet access node is unsecured, is the owner of the node liable for something posted using it? At the moment, yes, but that is because it hasn’t been tested in an Irish Court.

Sealed with a click

Another part of this is content. Google have an archive of a precursor to the web, called Usenet. This is data. Public data? Well everything was considered public at the time. So this archive is publicly available.

But what about your diary? Not your blog, but your diary. Currently you have automatic copyright protection on everything you write. The contents of your diary become public domain 75 years after your death. Does the same apply to your e-mail? Private musings are supposed to become public domain after a time. If you turn out to be a famous person (at the time of your death) someone will hang on to every scrap of paper in the hopes that it will be worth something.
However every e-mail you write is technically protected under copyright, and replying or, worse, forwarding an e-mail is technically in breach of a dozen copyright laws. When should your e-mail become public domain? If that data is on your hard drive, there is some hope that it will be forgotten about, but as the Microsoft anti-trust cases showed, e-mail has a habit of copying itself in other places than your drive. After all, there are the recipients, and all the servers in between (and a few that shouldn’t have gotten it in the first place).
When should this mail become public domain? 75 years after your and every contributor’s death? Something like that is impractical. 100 years after the message is sent? 50 years? And what if the message contains still confidential information (like the secret recipe for Snickerdoodles & Chocodoodles)?

Silly idea? Old medical records do go “public”, but these are usually stored in archives of interest to few (usually medical students and researchers who would be qualified to have access to the information in the first place).
“Would it be morally right to give public access to email & messaging accounts 100 years after they were last accessed? How interested would the historians of the future be in a copy of bebo.com from 2005? Or the contents of the mailbox of a famous serial killer 50 years after they died? I don’t think we have the option of letting that sort of data lapse. It will be the clearest echo of society’s global digital consciousness.”

This is the first time that the general public have had their personal messages (not just information) stored. Should it be retained for your grandchildren (but hidden from your prospective employer)? When should an e-mail be considered an orphaned work?

Backing away

Along with the problem of how long data should be retained, let’s look at the actual retention problem. If you ‘never throw out anything, in case it’s needed’, you have an increased storage problem. I hear the call of “backups”?

“As data volumes grow, you either have to put all your eggs in one basket, or have multiple baskets. From experience, it’s so tempting to try consolidate your data in one place, to reduce admin overhead. Hopefully that one system won’t have a buggy motherboard that’s silently corrupting everything it writes. And it’s really painful if someone accidentally deletes a few petabytes of data – copying from backups takes ages, for a start.”
Or “bugs in archival software (“Yup, that’s archived. Oh, wait. No..it isn’t. The machine had a bad disk, software crashed, and reported ‘everything OK’ when it restarted…”) and freaky network instability (guys doing rewiring, restarting cluster routers and maybe some dodgy cables) resulting in more than one machine reporting as being the ‘one true repository’ for a certain type of data.”

So the backups might be a problem….
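One hedge against the “reported ‘everything OK’” failure above is not to trust the backup tool’s own word for it, but to record a checksum for every file when the backup is made and re-check the copies later. The sketch below is only an illustration of that idea: the manifest layout (relative path mapped to a SHA-256 digest) and the function names are my own assumptions, not how any particular archival product works.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(manifest, backup_dir):
    """Compare a backup directory against a manifest made at backup time.

    manifest: {relative_path: expected_sha256_hex} (hypothetical layout).
    Returns a list of (relative_path, problem) tuples; empty means all good.
    """
    problems = []
    for relative_path, expected in manifest.items():
        candidate = Path(backup_dir) / relative_path
        if not candidate.is_file():
            problems.append((relative_path, "missing"))
        elif sha256_of(candidate) != expected:
            problems.append((relative_path, "checksum mismatch"))
    return problems
```

It won’t save you from a scrapped tape drive, but it does catch the silently corrupting motherboard before you need the restore.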

But let’s assume that the backups are valid. Then you have 2 format problems.
We don’t have the hardware which can read the tapes anymore.
This actually happened to me professionally. I remembered when the archives were made, and indeed the data was found, documented in place A where the off-site storage utility had the backups. However, the tape drives had been scrapped years before.
And those of you that remember the Domesday project know that the BBC fell in to a similar problem.

But let’s assume that the archaic backup archive tape could get its contents loaded on to a system you can use… can you read the data format?

Earlier this year, Microsoft released a service pack which purposefully disabled older file formats. So your carefully restored data might be unreadable to the world, and worse, yourself. In a business case, the original specifications (or recipe) might be needed. Or your great grandfather’s proposal on an on-line forum to the woman you’ve come to know as your great grand aunt.

Is there a “fix” for this? Well making the older formats fall in to the public domain would help. After all, if you’re not using them…

So who deserves the credit, and who deserves the blame

So the disk has crashed, who do you sue? It should be simple, but it ain’t. Much like a delayed or canceled air flight is not the cause of refunds if the cause of the problem is beyond the control of the airline, there are ways a disk can go. Legally.

Usually a hard disk will crash in infancy (within a day or two of starting life), meaning little if anything has been lost and it’s under warranty of the manufacturer. Or the disk will die as it approaches the end of its predicted life (well after warranty). The fact that the computer is usually obsolete long before you take it out of the box isn’t something to be considered.

And while I’m sure that back-up software and hardware have warranties, the legal click through probably covers some lost data. But since the cost of a new hard disk is usually less than the cost of the backup measures… home backing up is rare.

In a corporate setting, the party that loses the data should be held liable, but I don’t know of any cases in Irish law on data crashes. Data going missing however…

It’s a steal, it’s a loss

Credit card data gets stolen. It’s an identifiable crime. Who (other than the criminals) is liable?
Well was a reasonable attempt made to protect the data? If so, was it reasonable enough? Can you sue for loss of data? (and given the ability to reconstruct shredded credit card bills (cited at the start) are you the cause of the data breach?)

Apparently no. If data is lost (in the post) or stolen, there is no case until the data is used and a victim can be shown to have damages (or have lost money) from the act. If personal data goes missing, is there a lawsuit? Libel or slander is not applicable since the data suggests, if not proves, that the information about the victim is true. There are privacy charges, but currently there is no privacy law in Ireland. Direct financial damages are possible, but the cost of the case is usually more than the loss. And there is the time it takes…

In the case of the recent UK financial data loss a lot of the data is personal data pertaining to minors. In fact everything needed for identity theft for when the minor becomes an adult. So someone sitting on the data would wait 10 to 18 years to strike. Is there a statute of limitations (or similar) for data theft? Or in this case, identity stolen almost a generation ago?

Well, I have asked more questions than I’ve answered…

Anyone able to answer some of these too?

Take care,
William Knott

With kind thanks to John Looney of Google (for the tech and social angles) and Simon McGarr of Tuppenceworth.ie (for the legal questions and answers)


When someone talks about losing their data, the assumption is a hard disk crash. But not always. Sometimes it’s something more.

Sometimes it’s business. Sometimes it’s personal. So it’s not always technical.

The thought sprung to mind when I was driving to Waterford with Security Now playing through the radio. This was also after the Facebook/Scoble/Plaxo incident around Christmas, so ownership of data was in my mind. Barcamp Waterford was where Ellybabes presented Death and Divorce in the Digital World almost a year ago. (The slides from the presentation are in the link)

At the session few, if any, of the questions were answered, so I’m going to take a stab at it over the next few weeks. The schedule I’m planning is…

24-Jan-08 D 1. Data losses (from a head crash to lost in the mail)
31-Jan-08 D 2. Death (You’re gone, but your data will live on)
07-Feb-08 D 3. Demotion (You’re fired, and I own your info)
14-Feb-08 D 4. Divorce (I didn’t choose the date to be sarcastic)

I don’t have all the answers, or even all the questions. So if there is anything you want me to cover (or corrections after the fact) please let me know either through e-mail or comment.

take care,
William Knott


14 Nov 2007

What I want

Author: will | Filed under: data, information loss, invention, science, what if

Well you can never be too rich or have too much data? And today’s Science Week question is What invention do you want to see most in the future? I’m sure that there will be flights of fancy, but I want something very simple to make… I might even make one myself.

Oddly enough, what I want is something which could be made today. Findable data. Physically findable data.

Back when I was a kid, there was a tacky toy keyring thing which beeped when you whistled. It’s one of those things which seem like a good idea but no one ever really praised it.

Nowadays we have USB memory sticks which hold most of our data.

And I’ve lost 2 in the past few days.

Can someone combine these two things and make a whistle and beep USB memory stick?
Please?

And if someone finds a USB stick with my CV and interesting pictures from Mash-up camp… Contact me.

Will
