About Graham White

You can read my Eightbar bio.

Tackling Cancer with Machine Learning

For a recent Hack Day at work I spent some time working with one of my colleagues, Adrian Lee, on a little side project to see if we could detect cancer cells in a biopsy image.  We've only spent a couple of days on this so far but already the results are looking very promising with each of us working on a distinctly different part of the overall idea.

We held an open day in our department at work last month and I gave a lightning talk on the subject which you can see on YouTube:

There were a whole load of other talks given on the day that can be seen in the summary blog post over on the ETS (Emerging Technology Services) site.

Machine Learning Course

Enough time has passed since I undertook the Stanford University Natural Language Processing Course for me to forget just how much hard work it was and to start all over again.  This year I decided to have a go at the Coursera Machine Learning Course.

Unlike the 12 week NLP course last year which estimated 10 hours a week and turned out to be more like 15-20 hours a week, this course was much more realistic in estimation at 10 weeks of 8 hours.  I think I more or less hit the mark on that point spending about 1 day every week for the past 10 weeks studying machine learning - so around half the time required for the NLP course.

The course was written and presented by Andrew Ng who seems to be rather prolific and somewhat of an academic star in his fields of machine learning and artificial intelligence.  He is one of the co-founders of the Coursera site which, along with its main rival Udacity, has brought about the popular rise of Massive Open Online Courses.

The Machine Learning Course followed the same format as the NLP course from last year which I can only assume is the standard Coursera format, at least for technical courses anyway.  Each week there were one or two main topic areas to study which were presented in a series of videos featuring Andrew talking through a set of slides on which he's able to hand write notes for demonstration purposes, just as if you're sitting in a real lecture hall at university.  To check your understanding of the content of the videos there are questions which must be answered on each topic against which you're graded.  The second main component each week is a programming exercise which for the Machine Learning Course must be completed in Octave - so yet another programming language to add to your list.  Achieving a mark of 80% or above across all the questions and programming exercises results in a course pass.  I appear to have done that with relative ease for this course.

The 18 topics covered were:

  • Introduction
  • Linear Regression with One Variable
  • Linear Algebra Review
  • Linear Regression with Multiple Variables
  • Octave Tutorial
  • Logistic Regression
  • Regularisation
  • Neural Networks Representation
  • Neural Networks Learning
  • Advice for Applying Machine Learning
  • Machine Learning System Design
  • Support Vector Machines
  • Clustering
  • Dimensionality Reduction
  • Anomaly Detection
  • Recommender Systems
  • Large Scale Machine Learning
  • Application Example Photo OCR

The course served as a good revision of some maths I haven't used in quite some time: lots of Linear Algebra, for which you need a pretty good understanding, and lots of calculus which you don't really need to understand if all you care about is implementing the algorithms rather than working out how they're derived or proven.  Being quite maths based, the course used matrices and vectorisation very heavily rather than using the loop structures that most of us would use as a go-to framework for writing complex algorithms.  Again, this was some good revision as I've not programmed in this fashion for quite some time.  You're definitely reminded of just how efficient you can make complex tasks on modern processors if you stand back from your algorithm for a bit and work out how best to utilise the hardware (via the appropriately optimised libraries) you have.

The major thought behind the course seems to be to teach as many different algorithms as possible.  There really is a great range, starting off simply with linear algorithms and progressing right up to the current state-of-the-art Neural Networks and the ever fashionable map-reduce stuff.

I didn't find the course terribly difficult.  I'm no expert in any of the topics but have studied enough maths not to struggle with that side of things and don't struggle with programming either.  I didn't need to use the forums or any of the other social elements offered during the course so I don't really have a feel for how others found the course.  I can certainly imagine someone finding it a real struggle if they don't have a particularly deep background in either maths or programming.

There was, as far as I can think right now, one (or maybe two depending on how you count) omission from the course.  Most of the programming exercises were heavily frameworked for you in advance, you just have to fill in the gaps.  This is great for learning the various different algorithms presented during the course but does leave a couple of areas at the end of the course you're not so confident with (aside from not really having a wide grasp of the Octave programming language).  The omission of which I speak is that of storing and bootstrapping the models you've trained with the algorithm.  All the exercises concentrated on training a model, storing it in memory, using it and as the program terminates then so your model disappears.  It would have been great to have another module on the best ways to persist models between program runs, and how to continue training (bootstrap) a model that you have already persisted.  I'll feed that thought back to Andrew when the opportunity arises over the next couple of weeks.
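
As a rough illustration of the kind of thing I mean, here is a minimal sketch in Python (not Octave, and not anything from the course) that trains a tiny linear model, saves it to disk, then reloads it in a "later run" and carries on training.  The data, the file name and the JSON format are all assumptions made up for the example.

```python
import json
import os
import tempfile


def train(points, weights, steps=2000, rate=0.05):
    """Gradient descent for y = w0 + w1 * x, starting from the given weights."""
    w0, w1 = weights
    n = len(points)
    for _ in range(steps):
        # Gradient of half the mean squared error with respect to w0 and w1.
        g0 = sum((w0 + w1 * x - y) for x, y in points) / n
        g1 = sum((w0 + w1 * x - y) * x for x, y in points) / n
        w0 -= rate * g0
        w1 -= rate * g1
    return [w0, w1]


def save_model(path, weights):
    """Persist the learned weights so they survive the end of the program."""
    with open(path, "w") as f:
        json.dump({"weights": weights}, f)


def load_model(path):
    """Read back weights written by save_model."""
    with open(path) as f:
        return json.load(f)["weights"]


# Hypothetical data lying on the line y = 1 + 2x.
first_batch = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
later_batch = [(3.0, 7.0), (4.0, 9.0)]

model_path = os.path.join(tempfile.mkdtemp(), "model.json")

# First run: train from scratch and persist the result.
save_model(model_path, train(first_batch, [0.0, 0.0]))

# Later run: reload the stored weights and continue training on old plus new data.
resumed = train(first_batch + later_batch, load_model(model_path))
save_model(model_path, resumed)
print(resumed)
```

For a model this small the weights are the whole model, so saving them is enough.  Larger models would also need things like feature scaling parameters stored alongside.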

The problem going forward won't so much be applying what has been offered here but working out what to apply it to.  The range of problems that can be tackled with these techniques is mind-blowing, just look at the rise of analytics we're seeing in all areas of business and technology.

Overall then, a really nice introduction into the world of machine learning.  Recommended!

Speech to Text

Apologies to the tl;dr brigade, this is going to be a long one... 

For a number of years I've been quietly working away with IBM research on our speech to text programme. That is, working with a set of algorithms that ultimately produce a system capable of listening to human speech and transcribing it into text. The concept is simple, train a system for speech to text - speech goes in, text comes out. However, the process and algorithms to do this are extremely complicated from just about every way you look at it – computationally, mathematically, operationally, evaluationally, time and cost. This is a completely separate topic and area of research from the similar sounding text to speech systems that take text (such as this blog) and read it aloud in a computerised voice.

Whenever I talk to people about it they always appear fascinated and want to know more. The same questions often come up. I'm going to address some of these here in a generic way and leaving out those that I'm unable to talk about here. I should also point out that I'm by no means a speech expert or linguist but have developed enough of an understanding to be dangerous in the subject matter and that (I hope) allows me to explain things in a way that others not familiar with the field are able to understand. I'm deliberately not linking out to the various research topics that come into play during this post as the list would become lengthy very quickly and this isn't a formal paper after all, Internet searches are your friend if you want to know more.

I didn't know IBM did that?
OK so not strictly a question but the answer is yes, we do. We happen to be pretty good at it as well. However, we typically use a company called Nuance as our preferred partner.

People have often heard of IBM's former product in this area called ViaVoice for their desktop PCs which was available until the early 2000s. This sort of technology allowed a single user to speak to their computer for various different purposes and required the user to spend some time training the software before it would understand their particular voice. Today's speech software has progressed beyond this to systems that don't require any training by the user before they use it. Current systems are trained in advance in order to attempt to understand any voice.

What's required?
Assuming you have the appropriate software and the hardware required to run it on then you need three more things to build a speech to text system: audio, transcripts and a phonetic dictionary of pronunciations. This sounds quite simple but when you dig under the covers a little you realise it's much more complicated (not to mention expensive) and the devil is very much in the detail.

On the audio side you'll need a set of speech recordings. If you want to evaluate your system after it has been trained then a small sample of these should be kept to one side and not used during the training process. This set of audio used for evaluation is usually termed the held out set. It's considered cheating if you later evaluate the system using audio that was included in the training process – since the system has already “heard” this audio before it would have a higher chance of accurately reproducing it later. The creation of the held out set leads to two sets of audio files, the held out set and the majority of the audio that remains which is called the training set.
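
The split itself is simple to sketch.  The following Python is only an illustration: the file names, the 10% fraction and the fixed random seed are assumptions for the example, not values from any real system.

```python
import random


def split_held_out(recordings, held_out_fraction=0.1, seed=0):
    """Return (training_set, held_out_set) with no recording in both."""
    shuffled = list(recordings)
    # A fixed seed keeps the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n_held_out = max(1, int(len(shuffled) * held_out_fraction))
    return shuffled[n_held_out:], shuffled[:n_held_out]


# Hypothetical recording names.
recordings = [f"utt_{i:03d}.wav" for i in range(50)]
training_set, held_out_set = split_held_out(recordings)
print(len(training_set), len(held_out_set))
```

The important property is that the two sets never overlap, so the evaluation only ever uses audio the trainer has not heard.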

The audio can be in any format your training software is compatible with but wave files are commonly used. The quality of the audio both in terms of the digital quality (e.g. sample rate) as well as the quality of the speaker(s) and the equipment used for the recordings will have a direct bearing on the resulting accuracy of the system being trained. Simply put, the better quality you can make the input, the more accurate the output will be. This leads to another bunch of questions such as but not limited to “What quality is optimal?”, “What should I get the speakers to say?”, “How should I capture the recordings?” - all of which are research topics in their own right and for which there is no one-size-fits-all answer.

Capturing the audio is one half of the battle. The next piece in the puzzle is obtaining well transcribed textual copies of that audio. The transcripts should consist of a set of text representing what was said in the audio as well as some sort of indication of when during the audio a speaker starts speaking and when they stop. This is usually done on a sentence by sentence basis, or for each utterance as they are known. These transcripts may have a certain amount of subjectivity associated with them in terms of where the sentence boundaries are and potentially exactly what was said if the audio wasn't clear or slang terms were used. They can be formatted in a variety of different ways and there are various standard formats for this purpose from an XML DTD through to CSV.

If it has not already become clear, creating the transcription files can be quite a skilled and time consuming job. A typical industry expectation is that it takes approximately 10 man-hours for a skilled transcriber to produce 1 hour of well formatted audio transcription. This time plus the cost of collecting the audio in the first place is one of the factors making speech to text a long, hard and expensive process. This is particularly the case when put into context that most current commercial speech systems are trained on at least 2000+ hours of audio with the minimum recommended amount being somewhere in the region of 500+ hours.
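
To put some numbers on that, here is the arithmetic as a tiny Python sketch using only the figures quoted above (10 hours of transcription effort per hour of audio, and corpora of 500 and 2000 hours).

```python
# Rule of thumb quoted above: hours of skilled effort per hour of audio.
TRANSCRIPTION_HOURS_PER_AUDIO_HOUR = 10


def transcription_effort(audio_hours):
    """Estimated hours of transcription work for a corpus of the given size."""
    return audio_hours * TRANSCRIPTION_HOURS_PER_AUDIO_HOUR


for audio_hours in (500, 2000):
    print(audio_hours, "hours of audio ->", transcription_effort(audio_hours), "hours of transcription")
```

So even the minimum recommended corpus implies around 5,000 hours of transcription effort, and a typical commercial one around 20,000.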

Finally, a phonetic dictionary must either be obtained or produced that contains at least one pronunciation variant for each word said across the entire corpus of audio input. Even for a minimal system this will run into tens of thousands of words. There are of course, already phonetic dictionaries available such as the Oxford English Dictionary that contains a pronunciation for each word it contains. However, this would only be appropriate for one regional accent or dialect without variation. Hence, producing the dictionary can also be a long and skilled manual task.
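
One way to picture the dictionary is as a mapping from each word to one or more pronunciations.  The Python below is a sketch under that assumption; the phone symbols and the example words are invented for illustration and don't come from any particular dictionary format.

```python
# Hypothetical phonetic dictionary: word -> list of pronunciation variants.
pronunciations = {
    "tomato": [
        ["t", "ah", "m", "aa", "t", "ow"],
        ["t", "ah", "m", "ey", "t", "ow"],
    ],
    "speech": [["s", "p", "iy", "ch"]],
    "to": [["t", "uw"]],
    "text": [["t", "eh", "k", "s", "t"]],
}


def missing_words(transcripts, dictionary):
    """Words used in the transcripts that have no pronunciation yet."""
    words = {word for line in transcripts for word in line.lower().split()}
    return sorted(words - set(dictionary))


transcripts = ["Speech to text", "tomato soup"]
print(missing_words(transcripts, pronunciations))
```

A check like this over the whole corpus gives the list of words somebody still has to write pronunciations for.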

What does the software do?
The simple answer is that it takes audio and transcript files and passes them through a set of really rather complicated mathematical algorithms to produce a model that is particular to the input received. This is the training process. Once the system has been trained the model it generates can be used to take speech input and produce text output. This is the decoding process. The training process requires lots of data and is computationally expensive but the model it produces is very small and computationally much less expensive to run. Today's models are typically able to perform real-time (or faster) speech to text conversion on a single core of a modern CPU. It is the model and software surrounding the model that is the piece exposed to users of the system.

Various different steps are used during the training process to iterate through the different modelling techniques across the entire set of training audio provided to the trainer. When the process first starts the software knows nothing of the audio, there are no clever bootstrapping techniques used to kick-start the system in a certain direction or pre-load it in any way. This allows the software to be entirely generic and work for all sorts of different languages and quality of material. Starting in this way is known as a flat start or context independent training. The software simply chops up the audio into regular segments to start with and then performs several iterations where these boundaries are shifted slightly to match the boundaries of the speech in the audio more closely.
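
The "chop it up into regular segments" starting point can be sketched very simply.  This Python is an illustration only: it assumes the audio has already been divided into a number of fixed-length frames and that we know the sequence of sounds in the utterance, and it does not show the later iterations that shift the boundaries.

```python
def flat_start_alignment(n_frames, phones):
    """Give each phone an equal share of the frames, in order.

    Returns (phone, start_frame, end_frame) tuples, end exclusive.
    """
    boundaries = [round(i * n_frames / len(phones)) for i in range(len(phones) + 1)]
    return [(phone, boundaries[i], boundaries[i + 1]) for i, phone in enumerate(phones)]


# Hypothetical utterance: the word "speech" spread over 100 frames.
alignment = flat_start_alignment(100, ["s", "p", "iy", "ch"])
for phone, start, end in alignment:
    print(phone, start, end)
```

Every sound gets the same amount of audio to begin with, which is obviously wrong, but it gives the trainer something to improve on.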

The next phase is context dependent training. This phase starts to make the model a little more specific and tailored to the input being given to the trainer. The pronunciation dictionary is used to refine the model to produce an initial system that could be used to decode speech into text in its own right at this early stage. Typically, context dependent training, while an iterative process in itself, can also be run multiple times in order to hone the model still further.

Another optimisation that can be made to the model after context dependent training is to apply vocal tract length normalisation. This works on the theory that the audibility of human speech correlates to the pitch of the voice, and the pitch of the voice correlates to the vocal tract length of the speaker. Put simply, it's a theory that says men have low voices and women have high voices and if we normalise the wave form for all voices in the training material to have the same pitch (i.e. same vocal tract length) then audibility improves. To do this an estimation of the vocal tract length must first be made for each speaker in the training data such that a normalisation factor can be applied to that material and the model updated to reflect the change.

The model can be thought of as a tree although it's actually a large multi-dimensional matrix. By reducing the number of dimensions in the matrix and applying various other mathematical operations to reduce the search space the model can be further improved upon in terms of accuracy, speed and size. This is generally done after vocal tract length normalisation has taken place.

Another tweak that can be made to improve the model is to apply what we call discriminative training. For this step the theory goes along the lines that all of the training material is decoded using the current best model produced from the previous step. This produces a set of text files. These text files can be compared with those produced by the human transcribers and given to the system as training material. The comparison can be used to inform where the model can be improved and these improvements applied to the model. It's a step that can probably be best summarised by learning from its mistakes, clever!

Finally, once the model has been completed it can be used with a decoder that knows how to understand that model to produce text given an audio input. In reality, the decoders tend to operate on two different models: the audio model (more commonly called the acoustic model), for which the process of creation has just been roughly explained, and a language model. The language model is simply a description of how language is used in the specific context of the training material. It would, for example, attempt to provide insight into which words typically follow which other words via the use of what natural language processing experts call n-grams. Obtaining information to produce the language model is much easier and does not necessarily have to come entirely from the transcripts used during the training process. Any text data that is considered representative of the speech being decoded could be useful. For example, in an application targeted at decoding BBC News readers, articles from the BBC news web site would likely prove a useful addition to the language model.
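
To give a flavour of the n-gram idea, here is a minimal Python sketch of a bigram model (n = 2) that counts which words follow which.  The three-sentence corpus is made up, and the probabilities are plain unsmoothed counts, which a real language model would not rely on.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, how often each other word follows it."""
    following = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for previous, current in zip(words, words[1:]):
            following[previous][current] += 1
    return following


def probability(following, previous, current):
    """Estimate P(current | previous) from the raw counts."""
    total = sum(following[previous].values())
    return following[previous][current] / total if total else 0.0


# Hypothetical text considered representative of the speech to decode.
corpus = ["the news at ten", "the news at six", "the weather at ten"]
model = train_bigrams(corpus)
print(probability(model, "the", "news"))
print(probability(model, "at", "ten"))
```

The decoder can use numbers like these to prefer word sequences that are likely in the target domain when the audio alone is ambiguous.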

How accurate is it?
This is probably the most common question about these systems and one of the most complex to answer. As with most things in the world of high technology it's not simple, so the answer is the infamous “it depends”. The short answer is that in ideal circumstances the software can perform at near human levels of accuracy which equates to in excess of 90% accuracy levels. Pretty good you'd think. It has been shown that human performance is somewhere in excess of 90% and is almost never 100% accuracy. The test for this is quite simple: you get two (or more) people to independently transcribe some speech and compare the results from each transcriber. Almost always there will be a disagreement about some part of the speech (if there's enough speech that is).
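
Comparing two transcripts can be sketched as follows.  I haven't said how accuracy is measured, so this Python makes an assumption: it counts the word-level edits (substitutions, insertions and deletions) needed to turn one transcript into the other and reports one minus that count divided by the length of the reference.  The two example transcripts are invented.

```python
def word_edit_distance(reference, hypothesis):
    """Minimum number of word substitutions, insertions and deletions."""
    ref, hyp = reference.split(), hypothesis.split()
    # previous[j] is the distance between the ref words seen so far and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def word_accuracy(reference, hypothesis):
    """One minus the edit distance divided by the reference length."""
    return 1 - word_edit_distance(reference, hypothesis) / len(reference.split())


# Hypothetical transcripts of the same audio from two sources.
first = "i would like to recognise speech today"
second = "i would like to wreck a nice beach today"
print(word_edit_distance(first, second))
print(round(word_accuracy(first, second), 2))
```

The same comparison works whether the second transcript comes from another person or from the decoder, which is what makes the human figure a useful yardstick.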

It's not often that ideal circumstances are present or can even realistically be achieved. Ideal would be transcribing a speaker with a similar voice and accent to those which have been trained into the model and they would speak at the right speed (not too fast and not too slowly) and they would use a directional microphone that didn't do any fancy noise cancellation, etc. What people are generally interested in is the real-world situation, something along the lines of “if I speak to my phone, will it understand me?”. This sort of real-world environment often includes background noise and a very wide variety of speakers potentially speaking into a non-optimal recording device. Even this can be a complicated answer for the purposes of accuracy. We're talking about free, conversational style, speech in this blog post and there's a huge difference between recognising any and all words and recognising a small set of command and control words for when you want your phone to perform a specific action. In conclusion then, we can only really speak about the art of the possible and what has been achieved before. If you want to know about accuracy for your particular situation and your particular voice on your particular device then you'd have to test it!

What words can it understand? What about slang?
The range of understanding of a speech to text system is dependent on the training material. At present, the state of the art systems are based on dictionaries of words and don't generally attempt to recognise new words for which an entry in the dictionary has not been found (although these types of systems are available separately and could be combined into a speech to text solution if necessary). So the number and range of words understood by a speech to text system is currently (and I'm generalising here) a function of the number and range of words used in the training material. It doesn't really matter what these words are, whether they're conversational and slang terms or proper dictionary terms, so long as the system was trained on those then it should be able to recognise them again during a decode.

Updates and Maintenance
For the more discerning reader, you'll have realised by now a fundamental flaw in the plan laid out thus far. Language changes over time, people use new words and the meaning of words changes within the language we use. Text-speak is one of the new kids on the block in this area. It would be extremely cumbersome to need to train an entire new model each time you wished to update your previous one in order to include some set of new language capability. The models produced are able to be modified and updated with these changes without the need to go back to a full standing start and training from scratch all over again. It's possible to take your existing model built from the set of data you had available at a particular point in time and use this to bootstrap the creation of a new model which will be enhanced with the new materials that you've gathered since training the first model. Of course, you'll want to test and compare both models to check that you have in fact enhanced performance as you were expecting. This type of maintenance and update to the model will be required for any and all of these types of systems as they're currently designed as the structure and usage of our languages evolve.

OK, so not necessarily a blog post that was ever designed to draw a conclusion but I wanted to wrap up by saying that this is an area of technology that is still very much in active research and development, and has been so for at least 40-50 years or more! There's a really interesting statistic I've seen in the field that says if you ask a range of people involved in this topic the answer to the question “when will speech to text become a reality” then the answer generally comes out at “in ten years' time”. This question has been asked consistently over time and the answer has remained the same. It seems then, that either this is a really hard nut to crack or that our expectations of such a system move on over time. Either way, it seems there will always be something new just around the corner to advance us to the next stage of speech technologies.

Going Back to University

A couple of weeks ago I had the enormous pleasure of returning to Exeter University where I studied for my degree more years ago than seems possible.  Getting involved with the uni again has been something I've long since wanted to do in an attempt to give back something to the institution to which I owe so much having been there to get good qualifications and not least met my wife there too!  I think early on in a career it's not necessarily something I would have been particularly useful for since I was closer to the university than my working life in age, mentality and a bunch of other factors I'm sure.  However, getting a bit older makes me feel readier to provide something tangibly useful in terms of giving something back both to the university and to the current students.  I hope that having been there recently with work it's a relationship I can start to build up.

I should probably steer clear of saying exactly why we were there but there was a small team from work some of which I knew well such as @madieq and @andysc and one or two I hadn't come across before.  Our job was to work with some academic staff for a couple of days and so it was a bit of a departure from my normal work with corporate customers.  It's fantastic to see the university from the other side of the fence (i.e. not being a student) and hearing about some of the things going on there and seeing a university every bit as vibrant and ambitious as the one I left in 2000. Of course, there was the obligatory wining and dining in the evening which just went to make the experience all the more pleasurable.

I really hope to be able to talk a lot more about things we're doing with the university in the future.  Until then, I'm looking forward to going back a little more often and potentially imparting some words (of wisdom?) to some students too.

Hursley Celebration

Today is one of those great days in Hursley when everyone lifts their head and gets away from their desk for a little while…

Car Fair

OK, so excuse the quality of that picture as it’s just a quick snap from my phone. Every few years we have a classic car fair on site, there seems to be no rhythm to when they’re held, possibly it’s just long enough since we’ve all forgotten about the cars we saw at the same show last time round – but I’m sure there are some different ones too.

Today’s celebration is under the guise of an Olympic celebration so in addition to the car show there’s a big quiz taking place, a careers fair, several different “sporting” events (such as egg and spoon race and the like) so it’s as much a summer fair as anything else; and it’s not raining which is always a bonus. The real draw of course is the free cookie or scone and drink, but however you look at it, to have these sorts of events on site (and such a lovely site on a summer’s day) is absolutely brilliant. It’s a great chance for us all to take a little time away from the desk in the afternoon, catch up with friends, see what’s going on while enjoying ourselves and having a bit of fun.

<edit>More pictures are coming in of the event on Twitter…</edit>
Reproduced with permission from Simon Maple
Delorean at Hursley by Simon Maple

Reproduced with permission from Peter Anghelides
Looks like the IBM Hursley car park is full again

Natural Language Processing Course

Over the first few months of this year I have been taking part in a mass online learning course in Natural Language Processing (NLP) run by Stanford University.  They publicised a group of eight courses at the end of last year and I didn't hesitate to sign up to the Natural Language Processing course knowing it would fit very well with things I'm working on in my professional role where I'm doing more and more with text analytics and continuing my work in speech to text.  There were others I could easily have signed up for too, things like security or machine learning, more or less all of them are relevant for something I'm doing.  However, given the time commitment required I decided to fully commit to one course and the NLP one was to be it.

I passed the course with a grade of 85% which was well above the required 70% pass mark.  However, the effort and time required to get there was way more than I was expecting and quite a lot more than the expected time the lecturers (Chris Manning and Dan Jurafsky) had said.  From memory it was an 8 week course with 10 hours a week required effort to complete the work. As it went on the amount of time required went up significantly, so rather than the 80 hours total I think I spent more like 1½ times that at over 120 hours!

There were four of us at work (that I know of) who embarked on the course but due to the commitment of time I've mentioned above only myself and Dale finished.  By the way, Dale has written an excellent post on the structure and content of the course so I'd suggest reading his blog for more details on that stuff, there's little point in me re-posting it as he's written such a good summary.

In terms of the participants on the course, it seems to have been quite a success for Stanford University - this is the first time they have run courses in this way it seems.  The lecturers gave us some statistics at a couple of strategic points throughout the course and it seems there were around 40,000 people registering an interest, of which around 5000 were watching the lecture material and around 2000 completed the course having taken part in the homework assignments.

I'm glad I committed as much as I did.  If I were one of the 5000 just watching the lectures and not doing the homework material I don't think I would have got as much out of it, but the added time required to complete the homework was significant so perhaps there's a trade-off here?  It's certainly the first time I've committed this much of my own personal time (it took over the lives of myself and Dale for quite a few weeks) as I was too busy at work to spend many business hours working on the course so it was all done in evenings and weekends.  That's certainly one piece of feedback I gave at the end of the course, Stanford could make the course timing more flexible but also allow more time for the course to be completed.

My experience with the way the assignments were marked was a little different to the way Dale has described in his post.  I was already very familiar with the concepts of test, development and held-out sets (three different sets of data used when training NLP systems) so wasn't surprised to see that the modules in the course didn't necessarily have an exact answer to them or, more precisely, that the code you wrote to perfectly analyse some data on your local system may not get full marks as it was marked against a different data set.  This may seem unfair but is common practice in all NLP system training that I know of.

All in all, an excellent course that I'm glad I did.  From what I hear of the other courses, they're not as deeply involved as the NLP course so I may well give another one a go in the future but for now I need to get a little of my life back and have a well earned rest from education.

Hursley Emerging Tech on the News

Kevin Brown who also featured in my previous eightbar post appears to be increasing his level of fame after appearing on Channel 4 news last night.

Kevin has done a lot of work with HCI (Human Computer Interfaces) and is leading the way in the Hursley Emerging Technology Services department. He has a huge interest and wealth of knowledge on the topic but the bleeding-edge HCI device catching people’s attention again at the moment is the brain reading headset from Emotiv. Kevin has been working with this device for quite some time already, having for example used it with hospital patients, and a wealth of other uses too including driving cars. This gives a good indicator of how far ahead of the curve our emerging tech team can be at times.

The Channel 4 news clip focuses on using the headset to drive cars and puts this in the context of Google’s self-drive car too, here’s the video:

Emerging Technology Services Interviews

The British Computer Society recently came to Hursley to interview some of the members of Emerging Technology Services about some of the work we’ve been doing recently. The results, as ever in ETS, are really interesting so here is the set of video interviews reposted for all you Eightbar subscribers out there.

To kick things off we have Bharat Bedi, IBM Master Inventor, talking about his work on the Universal Information Framework. This is an innovative idea that allows secure interactions that could benefit, for example, banks:

Another piece from Bharat Bedi but this time talking about his work on the Living Safe project which runs in Bolzano, Italy to help older residents who live by themselves:

Now something a little different from Kevin Brown, IBM Senior Inventor, talking about his work using a mind-reading headset. Here he gets Brian Runciman from the BCS to drive a car with his brain and trains him to run a brain wave reading headset:

Next up we have Dominic Harries, IBM Emerging Technologies Specialist, talking about some of his work using a multi-user multi-touch surface. Here Dominic is demonstrating the use of a business application on the multi-touch table:

Last, but not least we have Helen Bowyer, Emerging Technologies Manager, talking about her work on Automatic Sign Language. Helen explains and demonstrates the Say It, Sign It (SiSi) project which uses an avatar to translate spoken English into sign language.

The original content can be found at http://www.bcs.org/content/conWebDoc/44430.

Failing to Invent

We IBM employees are encouraged, indeed incented, to be innovative and to invent.  This is particularly poignant for people like myself working on the leading edge of the latest technologies.  I work in IBM emerging technologies which is all about taking the latest available technology to our customers.  We do this in a number of different ways but that's a blog post in itself.  Innovation is often confused with, or used interchangeably with, invention but they are different: invention for IBM means patents, patenting and the patent process.  That is, if I come up with something inventive I'm very much encouraged to protect that idea using patents and there are processes and help available to allow me to do that.

This comic strip really sums up what can often happen when you investigate protecting one of your ideas with a patent.  It struck me recently while out to dinner with friends that there's nothing wrong with failing to invent as the cartoon above says Leibniz did.  It's the innovation that's important here and unlucky for Leibniz that he wasn't seen to be inventing.  It can be quite difficult to think of something sufficiently new that it is patent-worthy and this often happens to me and those I work with while trying to protect our own ideas.

The example I was drawing upon on this occasion was an idea I was discussing at work with some colleagues about a certain usage of your mobile phone [I'm being intentionally vague here].  After thinking it all through we came to the realisation that while the idea was good and the solution innovative, all the technology was already known, available and assembled in the way we were proposing, but used somewhere completely different.

So, failing to invent is no bad thing.  We tried and on this particular occasion decided we could innovate but not invent.  Next time things could be the other way around but according to these definitions we shouldn't be afraid to innovate at the price of invention anyway.

Where are they now?

Ian Hughes/Epredator

As part of the reorganisation of the Eightbar site recently I’ve been catching up with some of the honored past Eightbar members. We say past in the loosest sense of course, as Eightbar was set up with the principle that “Once you’re Eightbar, you’re always Eightbar”. Here, I manage to muscle in on some of Ian Hughes’ (a.k.a. epredator) time as he’s kindly answered some questions for us. What follows is a 10 question interview style post where I talk to Ian about life after IBM – in more than 140 characters. I think it’s a really interesting read, enjoy…!

Ian, you worked for IBM for a long time (somewhere around 20 years!) before making the big decision to leave and form your own start-up at Feeding Edge nearly 3 years ago!

1. What have you found are the main things keeping you busy now?

Just as when I was at IBM my work life is very varied. Living and working with technology and social changes, and being a bit of a polymath I find myself mixing a lot of skills.

Sometimes I am coding or combining code, usually on open source platforms but often in Unity3d. Building some game elements for a startup. Other times I am on the conference circuit helping people to see the future by showing examples of how various things have changed already and how they link together to form a disruptive future. i.e. carrying on as an evangelist.

Much of this is still related to virtual worlds because they form a social and technical glue that still surprises many people only just getting to grips with Twitter and Facebook.

2. We’ve seen your continued rise to stardom on the ITV programme The Cool Stuff Collective, how did that come about?

Stardom is a very strong word 🙂 It was an ambition I had tucked away to do some more TV work. Like many things though it was serendipity that brought that about.

As I still blog many of my ideas and things about interesting advances many of my friends still read that. A good friend and IBMer Scotty (Kevin Scott / @starbase37) had told his friend John Marley / @marleyman007 who runs a TV production company Archie Productions about all the stuff I was talking about. Games, 3d printing, virtual worlds etc. So we got connected and had a meeting about a new show John was looking to start.

The aim of the meeting was really a friendly catchup and for me to give John a list of things that he could put on his show. Somewhere in the conversation he said “and then you will come on set and explain that to camera and the other presenter?” Which I still thought he meant he wanted me to be tech advisor for the kids show. Then it clicked and I realised I was being thrown in at the deep end. It was one of the few shows ITV/CITV has commissioned over the past few years.

So really because I have always shared what I know, used the web and social media to explain and offer a kind of open source advice I ended up with a character and role on the show. Which we have done 3 series of too!

Cue Showreels 🙂 TV Showreel

3. You must enjoy being the CSC resident g33k and teaching the viewers, what do you learn from them?

It has been the most fun and rewarding thing I have done. The third series in particular we moved from a studio and just the crew to being on location with schools in a Top Gear style. Whilst we were making a show for a mass audience it became even more important to be able to reach kids directly. I learned, and re-learned that the willingness to go with the flow on some ideas because they just are cool is still a magical thing. The things I say on the show are the same things I say in boardrooms and at conferences. The kids put many adults to shame though in not worrying straight away about ROI or marketing blurb. They get the idea and then fly with it.

It was also great to be able to reclaim geek/g33k. In a few schools the kids who were the tech geeks were suddenly allowed to be cool too. After all there was a bloke off the telly they could talk to.

We always had questions at the end of my future tech slot and I often didn’t get to know what they were up front, they were their questions and they were always taking me by surprise with their new angles or just the depth of understanding they showed. Once again putting many adults to shame.

4. Your time on Eightbar was mainly filled with Virtual Worlds work, what’s going down with the 3D Internet now, has it progressed as you thought?

It’s interesting as in many ways parts of the metaverse are now so mainstream, yet still not so much in the “business” world as you may have expected. We know that people tend to have to evolve through things, hence the struggling to understand the power of connection in social media is still a struggle for many decision makers in business. In a time of global recession with restricted travel it seems that the obvious use for communication and understanding via virtual environments is still not being exploited. Much of this is due to people being risk averse when they think their jobs are on the line. I find that many of the things we do and talk about are still reaching an audience who then say wow I didn’t think of it like that.

When they are used in their various forms they have a huge impact. Imperial College have some of the best examples, even with just a simple Opensim environment to help people plan a particular event it showed up real world procedures needed fixing after the first 5 minutes which saved more than money.

Lots of companies have floundered who were virtual world providers, but equally lots of their code is now open source. At the same time though lots of the games industry has been turned on its head by the arrival of Minecraft. Which is a “game” but that uses co creation tools live in the environment. It has done a lot to help the games industry (who also did not understand virtual worlds of this sort) to look and say “oh! that’s what it’s all about”.

So none of it has gone away. It hit the usual Gartner trough of disillusionment after the confused hype and now is ploughing up the right slope.

Regular business will get hit with a Minecraft moment though. A game changer in the same way open source software hit the IT industry, or Amazon hit retail. It’s just about being prepared to go with it when it arrives.

Another great development has been the ability to self build game tech environments with products like unity3d (a huge nod to Rob Smart for spotting unity3d way back too!) and have socket servers like photon and smartfoxserver.

I should also mention gamification, a horrible word, another thing for people to misunderstand, yet it covers the principles of applying both gaming and game technology into places it has not been before. It is often used in a lazy fashion slapping badges on things and giving out points, however at its heart the elements of playing with identity and expression online with a virtual environment in a business context provide way more benefit.

5. What has the past 3 years done for 3d printing, another of your interest areas?

3d printing has gone from strength to strength. It is appearing in more places and often more people have seen something about it when I talk about it. It is linked to the virtual worlds work as when you consider that a virtual environment is often about distributing digital assets from one place to another, you bolt a 3d printer on the end of that and you get digital design and distribution of physical product and the world changes.

The increase in open source builds like the RepRap make the hobby end of this accessible (around £400 of bits to build one). Makerbot provide some very cheap, but clever printers too that were featured heavily at CES 2012 (Consumer Electronics Show) note the Consumer in that 🙂 ! Services that print for you, like Shapeways, initially funded by Philips, have grown and moved to New York.

It is still something that when someone has never seen it they think it is witchcraft, somewhat like Google used to seem to people 🙂 That magic is nice to share, but then applying the extrapolation of the change to the entire world economy and manufacturing business as it moves on then scares and excites in equal measure.

6. What would you like to see Eightbar doing more/less of after the departure of Andy Piper from Hursley recently?

When we all set up eightbar it was an antidote to the west coast US tech bloggers getting all the kudos. We’re doing some great things over here too 🙂 Just as tech blogging has evolved I would love to see eightbar carrying on as a mini brand and a voice of that same attitude wherever it needs to be.

7. Looking back at IBM, any regrets about leaving? Things you miss?

I miss all the people, well nearly all 😉 Though in reality much of the work was with people all over the world, having a base of people in the same timezone and same place eating lunch in the same canteen provides an anchor. As does having to battle the same corporate resilience to change, or political short sightedness. There are still a great many sparky, slightly subversive but for the right reasons, renegade thought leaders under the radar at IBM.

Oh and the regular pay 🙂

8. What’s been the best thing about moving on?

Diversity of experiences and freedom to explore them. Like the TV work, it was just because of being open minded and master of my own calendar. I like to link everything, let one piece of work and ideas flow with another. That is tricky in a billable utilisation environment when you are not in control of the finances and the workload. It is why big corporations will keep getting side swiped by very small fast moving organisations with huge world connectivity at their finger tips.

I have also had to learn a lot about the various forms and processes needed to run even the smallest Ltd company. It’s an odd and archaic system, but they are the rules 🙂 It has also been fun picking various ideas and developing them getting people with the money to get interested. It gets all very Dragon’s den.

Freedom also allows me to try and pick things based on if I think they are beneficial in some way, not purely just because they are there. I have always prided myself on trying to act honourably in everything and with positive principles. So now it is up to me to stick to that and help others try and do the same.

9. Your personal life and work-life balance must have adjusted, what does a day in the life of epredator look like now you’re self-employed?

Aha! I called myself self employed once and my accountant was quick to point out I am not 🙂 This is part of what I was saying about companies and rules. As Feeding Edge is a limited company it is a legal entity in its own right that I happen to be a director of. At the same time there is a person on its official payroll, an employee… me 🙂 So as many twist and turns in business language as in any piece of tech 🙂

My day is much more thinly sliced than ever. I get up, check a few streams of information, spot anything urgent, then do the school run, back home for a 45 minute workout on UFC trainer on the Kinect, do some calls afterwards whilst cooling down. Most of the day is spent talking to the US and or my other biz partners around the gaming startup we have, building some code, pitching how bizarre the idea is. This is usually interspersed with some contacts from previous conferences getting in touch or some BCS animation and Games Dev SG business. Several times a month I pop along to a convention or meeting to talk about Tech, usually with Cool Stuff Collective as a backdrop. So the cycle continues.

Then there are the ad hoc conversations around other possible TV shows, or helping other startup businesses who are focussed on using new tech with some connections or ideas.

Evenings are a mix of cooking for the family, putting the kids (predlets) to bed, some gaming, heading to a Choi Kwang Do class or late night calls with US west coast for an interview or in Second Life.

However there is no start or end to a working day, a tweet on the way back from the school run may lead to something as much as a scheduled Skype call at 2pm. The emphasis is still on talking and sharing online.

10. Finally, give us a plug for Feeding Edge, who might I be if I were your customer and what might you be able to do for me?

Feeding Edge is a vehicle for people to get help from me, consulting or hands on development. As I say I am taking a bite out of technology so you don’t have to. All the years of experience with corporate tech and now several years out in the wild having to use what I talk about gives me a view on the world that many people don’t have time to consider, in person, in writing, on the TV, on stage, in the lab. I cover how technology feels and changes your life as much as the more obvious version x with version y tech.

In conferences I am usually the one put there to shake everybody up. So if you need a jolt of inspiration and a view of the future, well, that’s Feeding Edge and epredator. Cue show reels again 🙂

Well that’s it from Ian again for now. It’s really good to hear him talking in a wider context again, reading about the mix of drawing inspiration from such a wide variety of sources is really refreshing. It’s certainly reminded me to go “heads up” more often than I generally manage to do, so easy is it to keep too narrow a view on your immediate work tasks.

Thanks Ian, it’s been a pleasure – as always!