Grow Your Business with Machine Learning

Whether your goal is to voice-enable your applications or simply "solve intelligence", there has never been a better time to get started with machine learning. I wrote recently about how 7.5 million traditional economy jobs in the UK will be displaced by AI automation over the next 20 years.

Recent advances in machine learning have been nothing short of amazing. Tasks that appeared impossible only a few years ago are now almost commonplace:

  • The addition of sounds to silent video
  • Automatic handwriting recognition and generation
  • Automatic language translation
  • Classification of objects in photographs
  • Colourisation of black and white images

Now is the time for your business to start planning for that future. In this article, you will discover how machine learning can help your business grow.

However, in my experience, the one thing businesses have in abundance is text. It is everywhere! Be it in process documentation, sales literature or customer feedback - the list is endless. Luckily, we now have proven techniques to build infrastructure and develop algorithms that bring natural language processing (NLP) to the enterprise.

Driven by recent breakthroughs in deep learning research, machines can now understand and analyse human language better than ever before. Building on these advances, we can create technology and products to help businesses extract value from data.

Machines are exceptional at classifying text, extracting structured data, translating between languages and having meaningful conversations with customers. The business applications of such advances are limitless. By automating mundane tasks currently done by humans, we can free our workforce to focus on activities that add greater value.

Here are some examples to get you started:

Sentiment Analysis
Use sentiment analysis to detect positive or negative feelings in text. Monitor what customers are saying about your products on social media, or prioritise angry customer service inquiries. Until recently, sentiment analysis techniques were based on dictionary methods that could not deal with negation or information across sentences. We can now predict sentiment based on the true meaning of text, leading to much better accuracy.
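To see why the dictionary approach falls short, here is a minimal sketch (the word list and scores are purely illustrative):

using System;
using System.Collections.Generic;
using System.Linq;

class DictionarySentiment
{
    // Illustrative lexicon - real dictionary methods used large curated word lists.
    static readonly Dictionary<string, int> Lexicon = new Dictionary<string, int>
    {
        { "good", 1 }, { "great", 1 }, { "love", 1 },
        { "bad", -1 }, { "poor", -1 }, { "hate", -1 }
    };

    // Sum the scores of any words found in the lexicon.
    static int Score(string text)
    {
        return text.ToLower().Split(' ').Sum(w => Lexicon.ContainsKey(w) ? Lexicon[w] : 0);
    }

    static void Main()
    {
        // Prints 1 (positive), even though the sentence is negative - the negation
        // is invisible to a bag-of-words lexicon. Meaning-based models get this right.
        Console.WriteLine(Score("this product is not good"));
    }
}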

Conversational Interfaces
Allow customers to deal with your company through conversational chat interfaces, or chatbots. For example, a chatbot could handle customer service inquiries, fulfil orders or make product recommendations.

Information Extraction
Useful information is often hidden in unstructured text, but most software can only analyse structured data. Use NLP techniques to automatically extract facts from unstructured text, build a database or knowledge graph and feed the resulting structured data into standard applications like spreadsheets or data visualisation tools.

Machine Translation
Building machine translation systems used to take years of engineering effort and a combination of sophisticated machine learning techniques. Using the vast amount of multilingual data available today, we can now train a neural network to translate automatically between languages, saving cost and engineering effort while improving performance and accuracy.

Text Classification
Classify reviews, customer service conversations, news articles, sales emails and other types of text into useful categories based on the meaning of the text.

Information Retrieval
Find the most relevant text passages, advertisements or other items for a given user query or question. Instead of relying on keyword matching and word frequency measures, we can now score the semantic relatedness of documents, find relevant items even if they contain no exact matches, and build more intelligent search engines.
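As a flavour of semantic scoring, here is a minimal sketch that ranks toy documents by cosine similarity between embedding vectors (producing those vectors - with word2vec, say - is a separate step):

using System;
using System.Linq;

class SemanticSearch
{
    // Cosine similarity: the dot product of two vectors divided by their magnitudes.
    static double Cosine(double[] a, double[] b)
    {
        double dot = a.Zip(b, (x, y) => x * y).Sum();
        return dot / (Math.Sqrt(a.Sum(x => x * x)) * Math.Sqrt(b.Sum(x => x * x)));
    }

    static void Main()
    {
        // Toy three-dimensional embeddings - real systems use hundreds of dimensions.
        double[] query = { 0.9, 0.1, 0.0 };
        double[][] docs = { new[] { 0.8, 0.2, 0.1 }, new[] { 0.0, 0.1, 0.9 } };
        for (int i = 0; i < docs.Length; i++)
            Console.WriteLine("Document {0}: {1:0.000}", i, Cosine(query, docs[i]));
    }
}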

This is just a taster of how machine learning can help your business grow - contact me to discover more.

Follow me on Twitter for more updates like this.


AI Automation Will Save The UK Economy Billions

I was watching a TV programme recently about the BMW Mini manufacturing plant in Oxford employing apprentices. I thought, how quaint, but surely most of the construction is now automated? Then it became apparent that these apprentices were not there to build cars (at least not directly), but instead their job was to maintain the robots that did all the work!

That same night, I saw another programme about the introduction of the Boris Bus in London. This is a very high tech machine, sporting a hybrid diesel-electric drive-train and a 75 kWh lithium iron phosphate battery pack. The "classic Routemaster" mechanics at the bus garage were complaining that you couldn't work on these new vehicles without a degree in computer science. Most were reluctant to even consider retraining...

We hear a lot of doom and gloom in the press about how machines will steal jobs. But I see it another way - machines will release us from drudgery to do more interesting things, just as they did during the industrial revolution. Nobody would want to go back to a time before electricity and, in a similar vein, we will not want to go back to a time before mass AI automation.

So, I decided to calculate exactly how much AI automation will contribute to the UK economy. According to the Office for National Statistics, the current UK workforce is composed as follows:

Sector | People | %
Agriculture, forestry and fishing | 381,000 | 1.1%
Mining and quarrying | 64,000 | 0.2%
Manufacturing | 2,684,000 | 7.8%
Electricity, gas, steam and air conditioning supply | 128,000 | 0.4%
Water supply, sewerage, waste and remediation activities | 212,000 | 0.6%
Construction | 2,301,000 | 6.7%
Wholesale and retail trade | 5,026,000 | 14.6%
Transport and storage | 1,588,000 | 4.6%
Accommodation and food service activities | 2,316,000 | 6.7%
Information and communication | 1,409,000 | 4.1%
Financial and insurance activities | 1,148,000 | 3.3%
Real estate activities | 548,000 | 1.6%
Professional scientific and technical activities | 2,998,000 | 8.7%
Administrative and support service activities | 2,927,000 | 8.5%
Public admin and defence | 1,484,000 | 4.3%
Education | 2,953,000 | 8.6%
Human health and social work activities | 4,279,000 | 12.5%
Arts, entertainment and recreation | 965,000 | 2.8%
Other service activities | 921,000 | 2.7%
Total | 34,332,000 | 100.0%

This is where I begin to make some pretty big assumptions! Proceed with caution!!

Given that the average gross salary in the UK is £26,260, we can get a feel for how much is spent on employment in each sector. And, using my "patented" back-of-an-envelope calculation, we can see how much AI will impact each sector over the next twenty years - in terms of the number and financial cost of jobs that will be displaced. There is a method in this madness - honest!

Sector | People | % | Salary (£m) | AI Penetration | Jobs Displaced | Saving (£m)
Agriculture, forestry and fishing | 381,000 | 1.1% | £10,005 | 23.9% | 90,897 | £2,387
Mining and quarrying | 64,000 | 0.2% | £1,681 | 21.6% | 13,794 | £362
Manufacturing | 2,684,000 | 7.8% | £70,482 | 37.4% | 1,004,637 | £26,382
Electricity, gas, steam and air conditioning supply | 128,000 | 0.4% | £3,361 | 9.7% | 12,368 | £325
Water supply, sewerage, waste and remediation activities | 212,000 | 0.6% | £5,567 | 11.3% | 23,987 | £630
Construction | 2,301,000 | 6.7% | £60,424 | 21.0% | 482,409 | £12,668
Wholesale and retail trade | 5,026,000 | 14.6% | £131,983 | 24.7% | 1,243,521 | £32,655
Transport and storage | 1,588,000 | 4.6% | £41,701 | 31.7% | 503,939 | £13,233
Accommodation and food service activities | 2,316,000 | 6.7% | £60,818 | 17.3% | 400,393 | £10,514
Information and communication | 1,409,000 | 4.1% | £37,000 | 25.7% | 362,321 | £9,515
Financial and insurance activities | 1,148,000 | 3.3% | £30,146 | 42.4% | 486,615 | £12,779
Real estate activities | 548,000 | 1.6% | £14,390 | 36.3% | 199,101 | £5,228
Professional scientific and technical activities | 2,998,000 | 8.7% | £78,727 | 9.4% | 280,476 | £7,365
Administrative and support service activities | 2,927,000 | 8.5% | £76,863 | 28.0% | 819,722 | £21,526
Public admin and defence | 1,484,000 | 4.3% | £38,970 | 13.7% | 203,954 | £5,356
Education | 2,953,000 | 8.6% | £77,546 | 16.1% | 474,694 | £12,465
Human health and social work activities | 4,279,000 | 12.5% | £112,367 | 17.6% | 754,240 | £19,806
Arts, entertainment and recreation | 965,000 | 2.8% | £25,341 | 4.2% | 40,984 | £1,076
Other service activities | 921,000 | 2.7% | £24,185 | 17.1% | 157,600 | £4,139
Total | 34,332,000 | 100.0% | £901,558 | 22.0% | 7,555,651 | £198,411
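For transparency, the arithmetic behind each row is trivial. Here it is as a minimal sketch, using the Manufacturing row as a check (small differences are down to rounding):

using System;

class BackOfEnvelope
{
    static void Main()
    {
        const double averageSalary = 26260;   // Average UK gross salary (£).
        const int people = 2684000;           // Manufacturing workforce.
        const double penetration = 0.374;     // My assumed AI penetration over 20 years.

        double salaryBill = people * averageSalary / 1e6;  // Sector salary bill (£m).
        double displaced = people * penetration;           // Jobs displaced.
        double saving = salaryBill * penetration;          // Salary saving (£m).

        Console.WriteLine("Salary bill: £{0:N0}m", salaryBill);     // ~£70,482m
        Console.WriteLine("Displaced:   {0:N0} people", displaced); // ~1,004,000
        Console.WriteLine("Saving:      £{0:N0}m", saving);         // ~£26,360m
    }
}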

As you can see, I predict that financial services, manufacturing and real estate will be hit hardest, closely followed by transportation and administrative support. Overall, I predict 7.5 million jobs and £200 billion of salaries will disappear from the traditional economy over the next twenty years - that's 22% of our workforce.

So, why is this good news?

Well, firstly, a £200 billion saving per year would have a huge impact on the UK's GDP. In 2015, the UK's current account deficit was £96.2 billion (5.2% of GDP) at current market prices, so a saving on this scale puts us firmly in the black very quickly. That money can be reinvested in health care, environmental protection, crime reduction and, most importantly, education - all factors that improve quality of life.

Secondly, the revolution won't happen overnight. Innovation of this sort generally follows a normal distribution or bell curve, which is summarised nicely by the diffusion of innovations theory. The first few years will show relatively little progress, with a surge towards the mid-term and finally slowly mopping up the laggards towards the end. We can use the slow ramp-up phase to ensure that people who lose their jobs to AI are being retrained. This requires patience, recognising that it will take time for these workers to be reemployed in higher skilled jobs.

To complement this, our workforce and young people still in education should be directed towards what have become known as 21st century skills. These skills are designed to foster creativity and global engagement:

  • Critical thinking and problem solving
  • Creativity and innovation
  • Cross-cultural understanding
  • Communications, information and media literacy

There will still be a demand for people, just as there was following the industrial revolution. Forward-thinking companies, like the BMW plant mentioned at the beginning, are already planning for this future. Even the White House has published its paper "Preparing for the Future of Artificial Intelligence". I have faith in the human race and the free market to work out the kinks.

Obviously, this story is not unique to the UK - all economically developed countries will be feeling a similar effect. In stark contrast to the recent Brexit victory and "I'm gonna build a wall" Trump, I believe in a world without borders. The scientific community that will drive this revolution is global and, as the new world economy evolves, so too should its workforce.

As I've written before, there's a revolution happening in plain sight. And it's going to be great!

Follow me on Twitter for more updates like this.


Generating Aesop's Fables One Character at a Time...

Note: The supporting code for this blog post can be found on GitHub.

I must admit, I'm a fan of Aesop's Fables. These are a collection of fables credited to Aesop, a slave and storyteller believed to have lived in ancient Greece.

They are a fun read and form the basis of many modern phrases and sayings. Unfortunately, Aesop won't be writing more anytime soon...

I thought, wouldn't it be fun to write a generative language model that could automatically create new fables in the style of Aesop?

Obviously, this is a tall order, but character level generators do exist - see this awesome post by Andrej Karpathy: The Unreasonable Effectiveness of Recurrent Neural Networks.

In fact, Andrej's post inspired me to see what I could do using C#.

Andrej shows how Recurrent Neural Networks (RNNs) can be used to generate complex sequences with long-range structure, simply by predicting one step at a time. Long Short-Term Memory (LSTM) cells are a special kind of RNN, capable of learning long-term dependencies and specifically designed to avoid the vanishing gradient problem. Scouring the internet, I could find virtually no attempts to build an LSTM using C# - challenge accepted!

I won't go into depth about how LSTMs work, but if you want to learn more, check out this excellent article by Christopher Olah: Understanding LSTM Networks. Or, for the really technical stuff, check out Alex Graves homepage.

Alex states in his paper Generating Sequences With Recurrent Neural Networks: "RNNs can be trained for sequence generation by processing real data sequences one step at a time and predicting what comes next. Assuming the predictions are probabilistic, novel sequences can be generated from a trained network by iteratively sampling from the network's output distribution, then feeding in the sample as input at the next step. In other words by making the network treat its inventions as if they were real, much like a person dreaming."
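In code, that sampling loop looks something like the sketch below - the forward delegate stands in for the trained network (previous character index in, probability distribution over the next character out):

using System;
using System.Text;

static class Sampler
{
    // Generate 'length' characters by iteratively sampling from the model's predicted
    // distribution and feeding each sample back in as the next input.
    public static string Generate(Func<int, double[]> forward, char[] vocab, int seed, int length)
    {
        var rnd = new Random();
        var sb = new StringBuilder();
        int c = seed;
        for (int t = 0; t < length; t++)
        {
            double[] probs = forward(c);          // Distribution over the next character.
            double r = rnd.NextDouble(), sum = 0;
            for (int i = 0; i < probs.Length; i++)
            {
                sum += probs[i];
                if (r <= sum) { c = i; break; }   // Sample, then feed back as input.
            }
            sb.Append(vocab[c]);
        }
        return sb.ToString();
    }
}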

Now that we have selected LSTMs as our tool of choice, let's generate some new fables...!

Concatenating all of Aesop's Fables I could find on the internet, I accumulated 185KB of plain text to train the network on - absolutely tiny by machine learning standards, but big enough to prove the point.

Given my background, most of my subscribers have an affinity towards C#, so I thought it would be a good idea to write an LSTM text generator from scratch in C#, sharing the code on GitHub. There's also a vanilla RNN bundled in too, if you want to compare the difference.

My network uses two LSTM layers, each with 256 memory cells, and a single SoftMax output layer - approximately 867k parameters in total. I use a cross-entropy loss function to calculate the gradients, which are backpropagated through 24 time steps. I found that Root Mean Square Propagation (RMSProp) dramatically improves training times.
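For reference, the RMSProp update itself is tiny - each parameter gets an adaptive step size based on a moving average of its squared gradients. A minimal sketch (the decay and epsilon values are typical defaults, not necessarily the ones in my code):

using System;

static class Optimiser
{
    // RMSProp: keep a per-parameter moving average of squared gradients and
    // divide the learning rate by its square root.
    public static void Update(float[] param, float[] grad, float[] cache,
        float learningRate = 0.001f, float decay = 0.9f, float epsilon = 1e-8f)
    {
        for (int i = 0; i < param.Length; i++)
        {
            cache[i] = decay * cache[i] + (1 - decay) * grad[i] * grad[i];
            param[i] -= learningRate * grad[i] / (float)(Math.Sqrt(cache[i]) + epsilon);
        }
    }
}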

The code uses Parallel.For loops to make efficient use of all available CPU cores. If you are running on less than four cores, this will be painfully slow!

Whilst this is a fully functional example, I recommend treating it as purely educational. If you want to build a deep network for production, I strongly suggest using an optimised deep learning framework such as Torch or Caffe.

After each training iteration, the generator produces some sample text. Let's see how the samples evolve as the model learns. At the end of the first iteration, it's basically generating gibberish with the odd short word interspersed:

A "famen the by mous, Rhe I liog sI contoplepis al ane neon eo, betk, on uut the giind, mhemaed in lo dinghis fiting the oRod dedingof. ApocumFib ther is awthe Rhe mion him is mis bropn gA the Laoc the camok mand cimis po; baticg thile ip Cous ha cro et, in I thim and fous to is sifk ind bimke lothe ipit milre hat thed toouf ffifir ifiribwthel thed macoun. Qo curhey riff.

After 20 iterations, it's beginning to look a lot more like English, but is still littered with spelling mistakes.

A Wolf quickly of time in his fight of abmain. The dother pleaget, and hake a smmeroutaridained that well said. He had heppled their procairing of a shepherd in her beeped his spig, and shuaking ann desfiess precespence of his resent for the Ast, but is upwain, the dreasuress an Kight a Streich repleding from this careing in abriectens, as the beast tome a happent.

After 50 iterations, the spelling mistakes have gone and it's even learnt to open and close quotation marks - but it still doesn't make a huge amount of sense.

A Wolf and his state, saw to her for me, but clothing and to sell, and asked forth those harder, bent in these lift imitation as to hear at mush. "After you most believed you?" I reached the stream but often interrupted! Why was easily were in their fate. Of the plain desire and not wish heat matter The Snake, therefore shards. His Mother in reply benefit.

So the moral of the story - you can generate some pretty interesting results simply by sampling one character at a time. More data would have helped, but running this without a GPU would have taken forever. As I said before, this was an academic exercise to explore what could be coded by hand. The recent deep learning revolution has been fuelled by huge data sets and GPUs.

Check out my code on GitHub, but if you really want to do anything serious, get Torch!

In summary, it's been fun...


Athena GPU now on GitHub C#

Last month, I announced the release of Athena on GitHub.

Athena is a C# word embedding program based on the original paper Efficient Estimation of Word Representations in Vector Space published by Tomas Mikolov in January 2013. It provides a full environment to manage a large text corpus and subsequently learn and query word embeddings.

This month, I have released an updated GPU accelerated version, which uses Cudafy.NET. This works with either CUDA or OpenCL devices and will auto-select depending on your hardware. A reference to Cudafy.NET is required, which can be downloaded from GitHub.

The GPU version provides a five- to ten-fold speed-up over the pure CPU version, depending on your hardware setup. To the best of my knowledge, this is the only implementation of word2vec on a GPU using C# - please correct me if I'm wrong!

If you have already downloaded the CPU version, I strongly suggest you upgrade to the GPU accelerated version which is now available on GitHub: https://github.com/robosoup/Athena


The Revolution Hidden in Plain Sight

For better or worse, machine learning will have a huge impact on humanity by the end of this century.

Hollywood loves to dramatise the emergence of a malevolent artificial general intelligence (AGI) – think Skynet – but it could go the other way. In theory, an AGI will be able to perform any intellectual task that a human can. And, whilst the creation of AGIs is still a long way off, it is also an inevitable outcome given the unrelenting advances in computer hardware and software.

A more immediate impact on humanity, especially its middle classes, will be the displacement of repetitive white collar work by machine learning. To compensate, we are likely to see the emergence of new companies and markets that currently do not exist.

Google and Tesla are well known for their advances in self-driving cars. However, Uber may be the first company with a ready-made business model to exploit the technology. Uber has announced it will be bringing its autonomous cars to the roads of Pittsburgh in the coming weeks. If Uber (and ultimately its competitors) can globalise this idea, they will require a fleet of cheap and energy efficient vehicles.

To date, there has been little investment by startups into the underlying chip architectures that support the creation of machine learning systems. Most self-driving cars tend to use NVIDIA GPUs (built for computer gaming and graphics processing), which have not been optimised specifically for machine learning. NVIDIA may grow into this space, but there is plenty of room for hardware innovation from competitors to emerge.

Self-driving vehicles will disrupt the multi-billion global transportation market. Today’s generation of self-driving cars and trucks are very expensive to build and resemble a lab experiment. Who will supply this next generation of vehicles? Major vehicle manufacturers must innovate or die. Inevitably, self-driving vehicles will result in the loss of millions of jobs and a fundamental transformation of our society. The industrial revolution took 150 years – the self-driving revolution may be finished within the next twenty.

Medicine is another area where we’re likely to see a huge disruption by machine learning, radically changing the way diseases are diagnosed and treated. Machine learning has the potential to augment or replace major aspects of medical care. Imagine if anyone with a smart phone could access the machine equivalent of the world's best doctors, at low cost, from anywhere. Already, IBM Watson has made significant progress in oncology. And, only in the past week Google’s DeepMind announced a partnership with University College London Hospital to use machine learning for the treatment of patients with head and neck cancers.

Transportation and medicine are just two areas – there are so many more. Hollywood may continue to predict the destruction of humanity (it sells films), but I believe the impact will be positive. There will be bumps along the way, it may be uncomfortable at times - the way we work will change, our global society will change – but ultimately it will be better for everyone.


Athena Open Sourced on GitHub C#

In my last blog post I wrote about my implementation of Tomas Mikolov's word2vec algorithm. I subsequently received a lot of interest in the code, so decided to open source it on GitHub - the full repository can be found at https://github.com/robosoup/Athena

Athena is a word embedding program based on the original paper "Efficient Estimation of Word Representations in Vector Space" published by Tomas Mikolov in January 2013.

This is a C# implementation, which provides a full environment to manage a large text corpus and subsequently learn and query word embeddings.

To get started, load a large text corpus into the same directory as the compiled application – I use a 6GB full text dump of Wikipedia - this file must be called "corpus.txt".

Athena then converts the corpus file to lower case, standardises diacritics and converts numerics to standard tokens - the resulting text will be saved as "corpus_0.txt".
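As an aside, a minimal sketch of that kind of clean-up might look like this (illustrative only - the <num> token and the exact rules are assumptions, not Athena's actual code):

using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

static class Normaliser
{
    public static string Clean(string line)
    {
        // Lower case, then strip diacritics via Unicode decomposition (é -> e).
        var decomposed = line.ToLowerInvariant().Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder();
        foreach (var c in decomposed)
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        // Replace runs of digits with a standard token.
        return Regex.Replace(sb.ToString(), @"\d+", "<num>");
    }
}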

Next, Athena will identify recurring terms and concatenate these into phrases – this file will be saved as "corpus_1.txt".

The preceding steps, which will take several hours to run, need only be executed once - now the training can begin.

By selecting the training option, Athena will use the clean corpus file to create a word embedding model - this will be stored in "model.bin". If training has taken place before, Athena will attempt to load the existing model and continue training from its current state. If not, Athena will learn the vocabulary, build a seeded but untrained model file and start training from there.

To query the model, simply select the load option and type in a word or phrase.

For example, typing in “london” will return cities similar to London.

You can also perform vector subtraction by appending a colon ":" to the word you want to negate. For example "france: paris italy" is the equivalent of asking Athena "France is to Paris as Italy is to...?" – this should return Rome.

Let me know how you get on...


Word2Vec Lightweight Port C#

In January 2013, Tomas Mikolov and a team from Google published a paper titled Efficient Estimation of Word Representations in Vector Space. This proposed two new architectures for computing continuous vector representations of words from very large data sets. Whilst word embedding (as the technique is more generally known) was nothing new, the approach taken by the team demonstrated large improvements in accuracy at much lower computational cost.

Later that year, the word2vec C language source code supporting the paper was open sourced and is now available on Google code.

The word2vec tool is generally trained on a very large text corpus and subsequently learns vector representations (or embeddings) of words. The resulting word vectors can be used as features in natural language processing or machine learning applications.

The word2vec code has been hugely popular and as a result been ported to other languages including Python and Java. However, to the best of my knowledge there has not been a lightweight C# port of word2vec – so I decided to make one!

For my own purposes I chose to implement a Continuous Bag of Words model, rather than Skip-gram, which works just fine for my needs.

The code below contains three classes:

  • Word2vec.cs – this is where the vector representations are learned.
  • Model.cs – a simple class showing how to query the word vectors.
  • Program.cs – a console application tying it all together.

I train my model on a 6GB extract from Wikipedia, which yields really nice results.

A simple way to visualise the learned representations is to list the closest words for a user input. The console application provided displays the closest words and their cosine similarity to the user input.

For example, if you enter 'france', you should see an output similar to this:

Input> france
1.000  france
0.681  spain
0.679  belgium
0.661  netherlands
0.654  italy
0.642  england
0.627  switzerland
0.611  luxembourg
0.569  portugal
0.560  russia
0.542  germany

Once words are represented as vectors it’s easy to perform standard vector operations, such as addition and subtraction. Research has shown that word vectors capture many linguistic regularities. A couple of famous examples often cited are:

vector('paris') - vector('france') + vector('italy') is close to vector('rome')

vector('king') - vector('man') + vector('woman') is close to vector('queen').

I’ve not included this in the code, but it’s really easy to implement and I leave it for my readers to do, should they wish - one possible shape is sketched below.
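For anyone who wants to try, here is one untested shape it might take, written as an extra method inside the Model class shown below (it needs a using System.Linq; directive adding to Model.cs):

// A possible addition to Model.cs: vector(b) - vector(a) + vector(c),
// e.g. a = "man", b = "king", c = "woman" should land near "queen".
public Dictionary<string, float> Analogy(string a, string b, string c, int count)
{
    float[] va = WordVector(a), vb = WordVector(b), vc = WordVector(c);
    if (va == null || vb == null || vc == null) return new Dictionary<string, float>();
    var query = new float[Dimensions];
    for (var i = 0; i < Dimensions; i++) query[i] = vb[i] - va[i] + vc[i];
    Normalise(query);
    // Score every word by cosine similarity to the query vector, skipping the inputs.
    var scores = new Dictionary<string, float>();
    foreach (var key in model.Keys)
    {
        if (key == a || key == b || key == c) continue;
        var dist = 0f;
        for (var i = 0; i < Dimensions; i++) dist += query[i] * model[key][i];
        scores[key] = dist;
    }
    return scores.OrderByDescending(p => p.Value).Take(count)
                 .ToDictionary(p => p.Key, p => p.Value);
}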

word2vec.cs

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class Word2Vec
{
    public static int MinCount = 10;
    private const float sample = 1e-3f;             // Subsampling threshold for frequent words.
    private const float starting_alpha = 0.05f;     // Starting learning rate.
    private const int dimensions = 75;              // Word vector dimensions.
    private const int exp_table_size = 1000;
    private const int iter = 5;                     // Training iterations.
    private const int max_exp = 6;
    private const int negative = 5;                 // Number of negative examples.
    private const int window = 5;                   // Window size.
    private Dictionary<string, float[]> syn0 = new Dictionary<string, float[]>();
    private Dictionary<string, float[]> syn1 = new Dictionary<string, float[]>();
    private Dictionary<string, int> vocab = new Dictionary<string, int>();
    private float[] expTable = new float[exp_table_size];
    private long train_words = 0;
    private Random rnd = new Random();
    private string[] roulette;

    public void Train(string train_file, string model_file)
    {
        BuildExpTable();
        LearnVocab(train_file);
        InitVectors();
        InitUnigramTable();
        TrainModel(train_file);
        WriteVectorsToFile(model_file);
    }

    // Precompute the sigmoid function over [-max_exp, max_exp].
    private void BuildExpTable()
    {
        for (int i = 0; i < exp_table_size; i++)
        {
            expTable[i] = (float)Math.Exp((i / (double)exp_table_size * 2 - 1) * max_exp);
            expTable[i] = expTable[i] / (expTable[i] + 1);
        }
    }

    private void InitVectors()
    {
        foreach (var key in vocab.Keys)
        {
            syn0.Add(key, new float[dimensions]);
            syn1.Add(key, new float[dimensions]);
            for (int i = 0; i < dimensions; i++)
                syn0[key][i] = (float)rnd.NextDouble() - 0.5f;
        }
    }

    private void WriteVectorsToFile(string output_file)
    {
        using (BinaryWriter bw = new BinaryWriter(File.Open(output_file, FileMode.Create)))
        {
            bw.Write(vocab.Count);
            bw.Write(dimensions);
            foreach (var vec in syn0)
            {
                bw.Write(vec.Key);
                for (int i = 0; i < dimensions; i++)
                    bw.Write(vec.Value[i]);
            }
        }
    }

    private void LearnVocab(string train_file)
    {
        using (StreamReader sr = new StreamReader(train_file))
        {
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                foreach (var word in line.Split(' '))
                {
                    if (word.Length == 0) continue;
                    train_words++;
                    if (train_words % 100000 == 0) Console.Write("\r{0}k words read", train_words / 1000);
                    if (!vocab.ContainsKey(word)) vocab.Add(word, 1);
                    else vocab[word]++;
                }
            }
        }
        Console.WriteLine();
        // Discard words that occur fewer than MinCount times.
        var tmp = (from w in vocab
                   where w.Value < MinCount
                   select w.Key).ToList();
        foreach (var key in tmp)
            vocab.Remove(key);
        Console.WriteLine("Vocab size: {0}", vocab.Count);
    }

    // Negative samples are drawn in proportion to the 3/4 power of word frequency.
    private void InitUnigramTable()
    {
        List<string> tmp = new List<string>();
        foreach (var word in vocab)
        {
            int count = (int)Math.Pow(word.Value, 0.75);
            for (int i = 0; i < count; i++) tmp.Add(word.Key);
        }
        roulette = tmp.ToArray();
    }

    private void TrainModel(string train_file)
    {
        float alpha = starting_alpha;
        float[] neu1 = new float[dimensions];
        float[] neu1e = new float[dimensions];
        int last_word_count = 0;
        int sentence_position = 0;
        int word_count = 0;
        List<string> sentence = new List<string>();
        long word_count_actual = 0;
        DateTime start = DateTime.Now;
        for (int local_iter = 0; local_iter < iter; local_iter++)
        {
            using (StreamReader sr = new StreamReader(train_file))
            {
                while (true)
                {
                    if (word_count - last_word_count > 10000)
                    {
                        word_count_actual += word_count - last_word_count;
                        last_word_count = word_count;
                        int seconds = (int)(DateTime.Now - start).TotalSeconds + 1;
                        float prog = (float)word_count_actual * 100 / (iter * train_words);
                        float rate = (float)word_count_actual / seconds / 1000;
                        Console.Write("\rProgress: {0:0.00}%  Words/sec: {1:0.00}k", prog, rate);
                        // Decay the learning rate linearly over the whole training run.
                        alpha = starting_alpha * (1 - word_count_actual / (float)(iter * train_words + 1));
                        if (alpha < starting_alpha * 0.0001) alpha = starting_alpha * 0.0001f;
                    }
                    if (sentence.Count == 0)
                    {
                        if (sr.EndOfStream)
                        {
                            word_count_actual = train_words * (local_iter + 1);
                            word_count = 0;
                            last_word_count = 0;
                            sentence.Clear();
                            break;
                        }
                        sentence.Clear();
                        sentence_position = 0;
                        string line = sr.ReadLine();
                        foreach (var key in line.Split(' '))
                        {
                            if (key.Length == 0) continue;
                            if (!vocab.ContainsKey(key)) continue;
                            word_count++;
                            // Randomly subsample frequent words.
                            if (sample > 0)
                            {
                                double ran = (Math.Sqrt(vocab[key] / (sample * train_words)) + 1) * (sample * train_words) / vocab[key];
                                if (ran < rnd.NextDouble()) continue;
                            }
                            sentence.Add(key);
                        }
                    }
                    if (sentence.Count == 0) continue;
                    string word = sentence[sentence_position];
                    for (int i = 0; i < dimensions; i++) neu1[i] = 0;
                    for (int i = 0; i < dimensions; i++) neu1e[i] = 0;
                    // Average the vectors of the context words around the current position.
                    int cw = 0;
                    for (int w = 0; w < window * 2 + 1; w++)
                    {
                        if (w != window)
                        {
                            int p = sentence_position - window + w;
                            if (p < 0) continue;
                            if (p >= sentence.Count) continue;
                            string last_word = sentence[p];
                            float[] tmp0 = syn0[last_word];
                            for (int i = 0; i < dimensions; i++) neu1[i] += tmp0[i];
                            cw++;
                        }
                    }
                    if (cw > 0)
                    {
                        for (int i = 0; i < dimensions; i++) neu1[i] /= cw;
                        // Negative sampling: one positive target plus 'negative' random targets.
                        for (int w = 0; w < negative + 1; w++)
                        {
                            string target;
                            int label;
                            if (w == 0)
                            {
                                target = word;
                                label = 1;
                            }
                            else
                            {
                                target = roulette[rnd.Next(roulette.Length)];
                                if (target == word) continue;
                                label = 0;
                            }
                            float a = 0;
                            float g = 0;
                            float[] tmp1 = syn1[target];
                            for (int i = 0; i < dimensions; i++) a += neu1[i] * tmp1[i];
                            if (a > max_exp) g = (label - 1) * alpha;
                            else if (a < -max_exp) g = (label - 0) * alpha;
                            else g = (label - expTable[(int)((a + max_exp) * (exp_table_size / max_exp / 2))]) * alpha;
                            for (int i = 0; i < dimensions; i++) neu1e[i] += g * tmp1[i];
                            for (int i = 0; i < dimensions; i++) tmp1[i] += g * neu1[i];
                        }
                        // Propagate the accumulated error back to the context word vectors.
                        for (int w = 0; w < window * 2 + 1; w++)
                        {
                            if (w != window)
                            {
                                int p = sentence_position - window + w;
                                if (p < 0) continue;
                                if (p >= sentence.Count) continue;
                                string last_word = sentence[p];
                                float[] tmp0 = syn0[last_word];
                                for (int i = 0; i < dimensions; i++) tmp0[i] += neu1e[i];
                            }
                        }
                    }
                    sentence_position++;
                    if (sentence_position >= sentence.Count)
                    {
                        sentence.Clear();
                        continue;
                    }
                }
            }
        }
        Console.WriteLine();
    }
}

model.cs

using System;
using System.Collections.Generic;
using System.IO;

public class Model
{
    public int Dimensions;
    private Dictionary<string, float[]> model = new Dictionary<string, float[]>();
    private int wordCount;

    // Return the 'count' words whose vectors have the highest cosine similarity to
    // the given word (the vectors are unit length, so a dot product is sufficient).
    public Dictionary<string, float> NearestWords(string word, int count)
    {
        var vec = WordVector(word);
        if (vec == null) return new Dictionary<string, float>();
        var bestd = new float[count];
        var bestw = new string[count];
        for (var n = 0; n < count; n++) bestd[n] = -1;
        foreach (var key in model.Keys)
        {
            var dist = 0f;
            for (var i = 0; i < Dimensions; i++) dist += vec[i] * model[key][i];
            // Insert into the sorted top-'count' list.
            for (var c = 0; c < count; c++)
                if (dist > bestd[c])
                {
                    for (var i = count - 1; i > c; i--)
                    {
                        bestd[i] = bestd[i - 1];
                        bestw[i] = bestw[i - 1];
                    }
                    bestd[c] = dist;
                    bestw[c] = key;
                    break;
                }
        }
        var result = new Dictionary<string, float>();
        for (var i = 0; i < count; i++) result.Add(bestw[i], bestd[i]);
        return result;
    }

    public float[] WordVector(string word)
    {
        if (!model.ContainsKey(word)) return null;
        return model[word];
    }

    public void LoadVectors(string model_file)
    {
        using (var br = new BinaryReader(File.Open(model_file, FileMode.Open)))
        {
            wordCount = br.ReadInt32();
            Dimensions = br.ReadInt32();
            for (var w = 0; w < wordCount; w++)
            {
                var word = br.ReadString();
                var vec = new float[Dimensions];
                for (var d = 0; d < Dimensions; d++) vec[d] = br.ReadSingle();
                Normalise(vec);
                model[word] = vec;
            }
        }
    }

    // Scale a vector to unit length.
    private void Normalise(float[] vec)
    {
        var len = 0f;
        for (var i = 0; i < Dimensions; i++) len += vec[i] * vec[i];
        len = (float)Math.Sqrt(len);
        for (var i = 0; i < Dimensions; i++) vec[i] /= len;
    }
}

program.cs

using System;

class Program
{
    static void Main(string[] args)
    {
        Console.Write("Train? {Y/N} ");
        if (Console.ReadKey().Key == ConsoleKey.Y) Train();
        Console.WriteLine();
        // Load model from file.
        var model = new Model();
        model.LoadVectors("model.bin");
        // Get 10 nearest words to user input.
        while (true)
        {
            Console.Write("Input> ");
            var word = Console.ReadLine();
            foreach (var item in model.NearestWords(word, 10))
                Console.WriteLine("{0:0.000}\t{1}", item.Value, item.Key);
            Console.WriteLine();
        }
    }

    private static void Train()
    {
        // Train vector model and save to file.
        var word2vec = new Word2Vec();
        word2vec.Train("corpus.txt", "model.bin");
    }
}

Fisher Iris Dataset Classification with Torch

Note: The code supporting this post can be found on GitHub.

For people interested in deep learning, there's never been a better selection of open-source frameworks available. I recently took some time to review the top five (Caffe, CNTK, TensorFlow, Theano & Torch) and my favourite by far is still Torch.

Torch is a collection of flexible and powerful neural network and optimisation libraries. It sits on top of the Lua programming language, so has the potential to be super portable. It scales really well too, making full use of your CPU and GPU architectures.

However, for me, the most powerful reason to choose Torch is its large and ever growing community. This platform is here to stay, with active research happening in machine learning, computer vision, signal processing and parallel processing, amongst others. Already, some big names like Facebook, Google and Twitter have adopted Torch and actively contribute to the community.

A few years ago, when I was still hand-crafting neural nets, I wrote a blog post about classifying the UCI Fisher Iris dataset using backpropagation. I thought I'd update this example, showing how I would do the same thing today with Torch. I hope you find the comparison useful.

 

trainSet = {}
testSet = {}

function trainSet:size()
  return trainCount
end

-- Download data if not local.
if not paths.filep('iris.data') then
  print("Getting data...")
  data = "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
  os.execute('wget ' .. data)
end

-- Load data.
math.randomseed(os.time())
trainCount = 0; testCount = 0
file = io.open('iris.data')
for line in file:lines() do
  if (string.len(line) > 0) then

    -- Read line from file.
    x1, x2, x3, x4, species = unpack(line:split(","))
    input = torch.Tensor({x1, x2, x3, x4});
    output = torch.Tensor(3):zero();

    -- Set output based on species.
    if (species == "Iris-setosa") then
      output[1] = 1
    elseif (species == "Iris-versicolor") then
      output[2] = 1
    else
      output[3] = 1
    end

    -- Keep 20% of data aside for testing.
    if (math.random() > 0.2) then
      table.insert(trainSet, {input, output})
      trainCount = trainCount + 1
    else
      table.insert(testSet, {input, output})
      testCount = testCount + 1
    end

  end
end

-- Initialise the network.
require "nn"
inputs = 4; outputs = 3; hidden = 10;
mlp = nn.Sequential();
mlp:add(nn.Linear(inputs, hidden))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(hidden, outputs))

-- Train the network.
criterion = nn.MSECriterion()
trainer = nn.StochasticGradient(mlp, criterion)
trainer.learningRate = 0.01
trainer.maxIteration = 25
trainer:train(trainSet)

-- Test the network.
correct = 0
for i = 1, testCount do
  val = mlp:forward(testSet[i][1])
  out = testSet[i][2]
  z = val:add(out)
  if (torch.max(z) > 1.5) then
    correct = correct + 1
  end
end
print(string.format("%.2f%s", 100 * correct / testCount, "% correct"))


Baidu Open-Sources AI Secrets

Baidu (the Chinese Google) has just released its WARP-CTC library on GitHub under an open-source Apache licence.

CTC (Connectionist Temporal Classification) is an objective function that can be used for the supervised training of sequence prediction models. The original CTC approach was developed in 2006 at the Swiss AI lab IDSIA by Alex Graves, Santiago Fernandez, Faustino Gomez and Jurgen Schmidhuber.

Baidu decided to develop something new because their engineers found that existing implementations of CTC generally required significant amounts of memory and were very slow. Warp-CTC can be used to solve supervised problems that map an input sequence to an output sequence, such as speech recognition.


Warp-CTC was developed to improve the scalability of CTC models trained for Baidu’s Deep Speech system. The chief scientist at Baidu responsible for Warp-CTC is Andrew Ng, who is noted for his research on neural networks running on GPUs.

Ng has taught classes on machine learning, robotics and other topics at Stanford University. He also co-founded the massive open online course start-up Coursera.

Specifically, Warp-CTC is an open source implementation of the CTC algorithm for CPUs and NVIDIA GPUs. Baidu are releasing Warp-CTC as a C library along with integration for Torch, a scientific computing framework.

Baidu have said they want to make end-to-end deep learning easier and faster so researchers can make more rapid progress. A lot of open source software for deep learning exists, but previous code for training end-to-end networks for sequences has been too slow.

The CTC approach builds on recurrent neural networks, an increasingly common approach used in deep learning solutions. Recurrent neural networks are powerful sequence learners that are well suited to applications such as speech recognition.

One of the major challenges Deep Speech faces is handling variable length audio snippets. In computer vision tasks, images may be rescaled to a fixed size without changing the overall content. But speech data can’t be normalised in the same way. The underlying data model needs to shrink or grow automatically. To do this, Deep Speech uses a recurrent neural network, where an audio sample is sliced into time steps of equal size and a neural network is applied to each time step. Thus, the network learns not only from the input of the current time step but also from the output of the previous time step.

Baidu’s approach to open-source is not new - Facebook, Google and Microsoft have also open-sourced their AI software. In November last year, Google released TensorFlow on GitHub, also under an open-source Apache license.

Baidu was founded in 2000 by Internet pioneer Robin Li, with the mission of providing the best way for people to find what they’re looking for online (a search engine??). Over the next five years, financial analysts that follow this company are expecting it to grow earnings at an average annual rate in excess of 23%. Baidu currently operate almost exclusively in China, but clearly have aspirations – Google, watch out!


RANSAC Line Feature Extraction from 2D Point Cloud C#

I have recently been researching several types of laser range finder with the goal of developing an effective robot navigation capability within an internal space.

There are so many laser range finders available on the market that it can be difficult to know what to choose. Just browsing Google shopping, there’s a huge list, which includes:

  • Parallax 15-122cm Laser Rangefinder (£86)
  • Hokuyo URG-04LX-UG01 Scanning Laser Rangefinder (£880)
  • SF10-C Laser Rangefinder (£430)
  • LIDAR-Lite 2 Laser Rangefinder (£90)
  • RoboPeak RPLIDAR 360° Laser Scanner (£250)

I really like the RPLIDAR 360° laser scanner – it sits at the lower end of the price range and is a 360-degree 2D scanner (LIDAR) solution, with more than 6 metres of effective range detection. The 2D point cloud data it creates can easily be consumed by a simultaneous localisation and mapping (SLAM) algorithm.

Incidentally, you’ll find a very similar laser range finder in a Neato XV robot vacuum cleaner – so if you’re a hacker at heart, you could tear one down, get the sensor, plus harvest a host of other useful parts.

Due to environmental conditions, a point cloud generated from any laser scanning device will contain noise - so how can we clean the data and establish reliable landmarks? Landmarks are features which can easily be distinguished from the environment and repeatedly reacquired, allowing the robot to localise itself.

I decided to implement a derivative of the RANSAC algorithm. Random sample consensus (RANSAC) is an iterative method to estimate parameters of a mathematical model from a set of observed data which contains outliers. For the point cloud data I’ve been looking at, I decided to drop the random element of the algorithm, which gives me more consistent results.

Given a dataset whose data elements contain both inliers and outliers, RANSAC uses a voting scheme to find the optimal fitting result. Data elements in the dataset are used to vote for one or more models. This voting scheme rests on two assumptions: that the noisy features will not vote consistently for any single model, and that there are enough features to agree on a good model.

The code below firstly creates a noisy data set, simulating a laser scan point cloud for a square room. Then it applies my hacked implementation of the RANSAC algorithm to correctly identify walls from the noisy data.

 

using System;
using System.Drawing;
using System.Windows.Forms;

public class Ransac : Form
{
    const int MAX_LINES = 4;
    const int SAMPLE_SIZE = 10;
    const int RANSAC_CONSENSUS = 50;        // Minimum points required to accept a line.
    const double RANSAC_TOLERANCE = 7.5;    // Maximum distance for a point to fit a line.
    const double D2R = Math.PI / 180.0;     // Degrees to radians.
    double[] LinesA = new double[MAX_LINES];
    double[] LinesB = new double[MAX_LINES];
    double[] PointCloud = new double[360];
    bool Draw = false;
    int LineCount = 0;

    [STAThread]
    static void Main()
    {
        Application.Run(new Ransac());
    }

    public Ransac()
    {
        Width = 500;
        Height = 525;
        Text = "RANSAC Demo";
        CreateNoisyPointCloud();
        ExtractLines();
        Draw = true;
    }

    // Simulate a 360 degree laser scan of a square room, with added noise.
    private void CreateNoisyPointCloud()
    {
        Random rnd = new Random();
        for (int i = 0; i < 90; i++)
        {
            double length = 150 / Math.Cos((i - 45) * D2R);
            PointCloud[i] = length + rnd.NextDouble() * 16 - 8;
            PointCloud[i + 90] = length + rnd.NextDouble() * 16 - 8;
            PointCloud[i + 180] = length + rnd.NextDouble() * 16 - 8;
            PointCloud[i + 270] = length + rnd.NextDouble() * 16 - 8;
        }
    }

    private void ExtractLines()
    {
        int[] pointStatus = new int[360];
        for (int seed = 0; seed < 360; seed++)
        {
            // Fit a line to a contiguous sample of points starting at the seed.
            int[] sample = new int[SAMPLE_SIZE];
            for (int i = 0; i < SAMPLE_SIZE; i++) sample[i] = (seed + i) % 360;
            int[] fitPoints = new int[360];
            int fitCount = 0;
            double a = 0;
            double b = 0;
            LeastSquaresFit(PointCloud, sample, SAMPLE_SIZE, ref a, ref b);
            for (int i = 0; i < 360; i++)
            {
                if (pointStatus[i] == 0)
                {
                    // Convert scan vectors to cartesian coordinates.
                    double x = Math.Cos(i * D2R) * PointCloud[i];
                    double y = Math.Sin(i * D2R) * PointCloud[i];
                    // Claim points close to sample line.
                    if (DistanceToLine(x, y, a, b) < RANSAC_TOLERANCE)
                        fitPoints[fitCount++] = i;
                }
            }
            if (fitCount > RANSAC_CONSENSUS)
            {
                // Refresh line and add to collection.
                LeastSquaresFit(PointCloud, fitPoints, fitCount, ref a, ref b);
                LinesA[LineCount] = a;
                LinesB[LineCount] = b;
                LineCount++;
                // Update point cloud status.
                for (int i = 0; i < fitCount; i++) pointStatus[fitPoints[i]] = LineCount;
            }
        }
    }

    // Standard least squares fit of the selected points to the line y = a*x + b.
    private void LeastSquaresFit(double[] pointCloud, int[] selection, int count, ref double a, ref double b)
    {
        double sumX = 0;
        double sumY = 0;
        double sumXX = 0;
        double sumYY = 0;
        double sumXY = 0;
        for (int i = 0; i < count; i++)
        {
            double x = Math.Cos(selection[i] * D2R) * pointCloud[selection[i]];
            double y = Math.Sin(selection[i] * D2R) * pointCloud[selection[i]];
            sumX += x;
            sumXX += Math.Pow(x, 2);
            sumY += y;
            sumYY += Math.Pow(y, 2);
            sumXY += x * y;
        }
        a = (count * sumXY - sumX * sumY) / (count * sumXX - Math.Pow(sumX, 2));
        b = (sumY * sumXX - sumX * sumXY) / (count * sumXX - Math.Pow(sumX, 2));
    }

    // Perpendicular distance from point (x, y) to the line y = a*x + b.
    private double DistanceToLine(double x, double y, double a, double b)
    {
        double ao = -1.0 / a;
        double bo = y - ao * x;
        double px = (b - bo) / (ao - a);
        double py = ((ao * (b - bo)) / (ao - a)) + bo;
        return Math.Sqrt(Math.Pow(x - px, 2) + Math.Pow(y - py, 2));
    }

    protected override void OnPaint(PaintEventArgs e)
    {
        if (Draw)
        {
            Draw = false;
            Graphics g = e.Graphics;
            Pen pen = new Pen(Color.Gray, 4);
            for (int i = 0; i < LineCount; i++)
            {
                int x1 = 0;
                int y1 = 250 + (int)(LinesA[i] * -250 + LinesB[i]);
                int x2 = 500;
                int y2 = 250 + (int)(LinesA[i] * 250 + LinesB[i]);
                g.DrawLine(pen, x1, y1, x2, y2);
            }
            pen = new Pen(Color.Red, 4);
            for (int i = 0; i < 360; i++)
            {
                int x = 250 + (int)(Math.Cos(i * D2R) * PointCloud[i]);
                int y = 250 + (int)(Math.Sin(i * D2R) * PointCloud[i]);
                g.DrawEllipse(pen, x, y, 1, 1);
            }
        }
    }
}
  130. }