Jump start your Artificial Intelligence



This article is intended for all audiences. All views expressed are entirely mine [Emmanuel Giyoh Ngala].

The first part covers the impact of Artificial Intelligence while the second part closely looks at technical details of how artificial intelligence works. The second part is intended for a technical audience.


Brief History

Like everything in computing, Artificial Intelligence is divided into software and hardware. The bulk of the software, or at least the algorithms, has been around since the 1940s, but the hardware has only caught up recently, allowing Artificial Intelligence to gain prominence.

Decades before Artificial Intelligence, there were already many different ways to achieve automation in industrial systems where automation was desired.

The use of microelectronics and micro-controllers was mainstream, and the popular PID [Proportional-Integral-Derivative] algorithms were predominantly used. They had significant limitations: they had to be programmed to a strict mathematical model [Laplace Transforms and Z Transforms] and had limited ability to handle unforeseen scenarios. Once programmed, however, they could run on small pieces of hardware with little demand for resources.

Then the need to monitor and control more sophisticated systems without human intervention increased, leading to the development of the techniques of Fuzzy Logic, where a decision or the state of any variable at any time is determined by a combination of many different variables. The concepts of Fuzzy Logic had been introduced in the 1920s by mathematicians but had little application then; they were revisited in 1965. Fuzzy Logic had the advantage of being able to handle more complex scenarios beyond what was foreseeable by the programmer, but it also had notable disadvantages: it introduced complexity and required a lot of pre-programming, although it was very useful once programmed. With it came requirements for more capable hardware, leading to implementations in FPGAs [Field Programmable Gate Arrays], where the hardware could be reprogrammed to adjust to changing conditions, so still requiring human intervention.

In the 1940s, Warren McCulloch and Walter Pitts designed a model based on the human nervous system which uses the techniques of Fuzzy Logic but defines a threshold at which a signal is triggered. They called it the McCulloch-Pitts neuron, analogous to a human neuron. Like the human nervous system, many neurons are interconnected like a mesh, and they form the basis of the Artificial Neural Network (ANN). ANNs could imitate the characteristics of the human nervous system, the ability to perceive the environment and take action, and automate systems even when not explicitly programmed to do so, because like the human nervous system they have the ability to learn. So they could, in the real sense, be autonomous: learn from the environment and act or take decisions in scenarios they were not explicitly programmed to handle.

Like the human nervous system, they need to be trained, and the more training they receive, the more clever they become.

This was the birth of what we know today as Artificial Intelligence, except that in the 1940s the McCulloch-Pitts neuron based Artificial Neural Network was computationally expensive and there was no hardware that could run it efficiently. A few decades later, when computer processors had become faster and more complex, attempts to train Artificial Neural Networks on advanced computers still did not cope. Then GPUs [Graphics Processing Units] were introduced, enabling Artificial Neural Networks (ANN) to be trained efficiently, at which point they could acquire Artificial Intelligence. More on GPUs later.


The impact of Artificial Intelligence


Automation is a mixed bag. Depending on who you talk to, it could mean a lot of jobs being eliminated, or it could mean a lot of profit for organizations that become more efficient. Whatever perspective you hold about Artificial Intelligence, one thing everyone agrees on is that it makes a difference, and in my opinion it is here to stay.

I believe Artificial Intelligence is transforming the way things are done. Jobs will be eliminated, jobs will be created and new skills will need to be learned. Some organizations will benefit financially and excel however others will go out of business.

For instance, if an autonomous car is involved in an accident, how does insurance cover that? Is it the owner of the car, who was not driving, or the manufacturer of the car who is liable? The car is learning and knows things that the manufacturer does not. Is current law sufficient to cover these kinds of situations? Questions like these will multiply rapidly and will need answers, which means new ways of doing things will have to be established, and you can apply this to any domain affected by Artificial Intelligence, which is practically every domain.

It is anticipated that capitalist-leaning societies will become more skewed towards greater profits at the expense of social welfare, while socialist-leaning societies will only be able to keep a balance if the high profits from automation are used to maintain social welfare. This is not an article about sociology, but Artificial Intelligence will have that kind of far-reaching impact.

The question of ethics comes to the forefront. There will be good and bad uses for Artificial Intelligence, as with any other technology.

Generally, technology is meant to improve life, advance efficiency and reduce errors, but there is always the other side where it can be used for malicious purposes, and especially so here because Artificial Intelligence has the ability to learn and be autonomous. AI may learn things that it was not intended to. This raises the questions: how independent do we want our machines to be? Who takes responsibility for a system that is completely independent? The answer may come back to ethics again: back to how the machines learn, and back to how they are made to learn.

If the machines learn the difference between right and wrong, then they can ‘be encouraged’ to make the right choices and so serve the right purpose. This is Reinforcement Learning: we can guide the machines in how they learn.

We have the responsibility to ensure ethical standards so that Artificial Intelligence is not used maliciously and that machines do not learn the wrong things. This leads us to the concept of learning, specifically Machine Learning. Machine Learning is not the same thing as Artificial Intelligence: Artificial Intelligence depends on Machine Learning, but it is only after learning that the machine becomes intelligent.

We must also separate Artificial Intelligence from the ability to think. These are not the same thing. It is likely machines will be able to think in the near future, but Artificial Intelligence is currently centered around cognitive skills: the ability to see and recognize, the ability to hear and interpret, the ability to touch and so forth. The ability to initiate a thought process is the next step, where ethics will play an even more critical role.


Jump Start Your Artificial Intelligence


This part of the article is relevant to readers who are technically inclined.

Given that GPUs facilitate training Artificial Neural Networks and have propelled Artificial Intelligence, what are GPUs and how do they affect training?

CPUs [Central Processing Units] are fairly well understood: they have cores, and each core has threads. Beyond that they have a lot of other components, such as the ALU [Arithmetic and Logic Unit], registers, caches, buses, etc.

An Intel CPU in 2018 may have up to 24 cores, each core having 2 threads. A SPARC CPU in 2018 may have up to 32 cores, each core having 8 threads. In either architecture, each core comes with at least one ALU. The ALU mainly handles mathematical operations like multiplication, addition, bitwise operations and floating-point computation on behalf of the processor, and is sometimes referred to as the co-processor. It is specialized for this kind of task.

On the other hand, GPUs were initially designed to display images on computer screens and other displays. Each GPU also has cores built around ALUs to handle pixel manipulation: the larger the screen and the greater the resolution, the more cores the GPU needs. The GPU cores calculate pixel colors in parallel to display an image. GPUs in 2018 may have up to 4000 cores.

If we compare this to the 24 or 32 cores on a CPU, it is evident that the application of GPUs can go beyond graphics display and expand to applications such as gaming and training. We can see why it became easier to train Artificial Neural Networks on GPUs.

So let’s look at training Artificial Neural Networks on GPUs.

To train we need data to learn from. Let’s say there is some data in a tabular format with rows and columns; this can be anything from a database table to a picture whose pixels are represented in tabular form, such as a bitmap, JPEG or other image format. The important thing is that we have data in a tabular format with rows and columns and we want to use this data to train an ANN.

Training needs to go through all the data in the training set. The straightforward approach is to go row by row and, for each row, process all of that row’s columns. This works as a nested loop. The problem is that if the data set is large, we will be stuck in that loop for a very long time, and in practical scenarios the data set is always large: in the order of hundreds or thousands of columns and millions of rows.
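
As a rough illustration of this naive approach, here is a minimal sketch of a nested-loop pass over a tabular data set. The data and the weight-update rule are purely hypothetical placeholders; the only point is that every row, and every column of every row, is visited one at a time.

```python
# A minimal sketch of the naive nested-loop approach, assuming a purely
# illustrative update rule. 'dataset' is a list of rows, each row a list of
# column values, and 'weights' holds one value per column.

def train_nested_loop(dataset, weights, learning_rate=0.01):
    """Visit every row and, within it, every column (hypothetical update rule)."""
    for row in dataset:                      # millions of rows in practice
        for j, value in enumerate(row):      # hundreds or thousands of columns
            # Placeholder update: nudge each weight towards the observed value.
            weights[j] += learning_rate * (value - weights[j])
    return weights

# Tiny toy run: 4 rows, 3 columns.
data = [[1.0, 2.0, 3.0],
        [2.0, 1.0, 0.0],
        [0.5, 0.5, 0.5],
        [3.0, 2.0, 1.0]]
print(train_nested_loop(data, weights=[0.0, 0.0, 0.0]))
```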

If a nested loop is applied to this kind of data set, it will perform billions of iterations, each of which may involve a complex computation. This is time consuming, and training may take months or years to complete. Even upon completion of the training, learning is not guaranteed, just as students may complete a course and yet not all of them learn.

In conclusion the nested loop approach is not practical.

The more practical approach is to go back to mathematics: matrices. We treat the data set as a matrix, say X, we treat the Artificial Neural Network that needs to learn as another matrix, say W, and the output, or what the Artificial Neural Network will produce after learning, is another matrix, say Y. Mindful that all the laws and principles of matrices apply, we can express this as

[Y] = [X][W]

And our need to train the Artificial Neural Network can be expressed mathematically as the need to find the matrix W:

[W] = [X⁻¹][Y] ~ [X⁻¹] being the inverse matrix of [X]

Our need to train an ANN has been reduced to a need to perform matrix inversion and matrix multiplication, exactly what ALUs are built for, and GPUs have thousands of ALUs that can execute in parallel. This is the perfect match.
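
As a minimal sketch of this idea, the snippet below uses NumPy on small synthetic data. Since a real data matrix is rarely square, the pseudo-inverse (a least-squares solution) stands in for the plain matrix inverse, and NumPy stands in for the GPU-accelerated libraries that would do the same matrix work at scale.

```python
import numpy as np

# Synthetic data: 1000 rows (records) and 20 columns (features).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
true_W = rng.normal(size=(20, 1))   # the weights we pretend not to know
Y = X @ true_W                      # [Y] = [X][W]

# "Training" reduces to matrix operations: [W] = [X^-1][Y], here via the
# pseudo-inverse because X is not square.
W = np.linalg.pinv(X) @ Y
print(np.allclose(W, true_W))       # True: the weights are recovered
```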

If we go back to our data set with thousands of columns and millions of rows and consider a single GPU with about 4000 cores working in parallel, our computational problem shrinks to the order of thousands of parallel steps as opposed to billions of sequential loop iterations. If we add a few more GPUs, we can train on the same data set in a matter of hours or days as opposed to months or years.

The short story is that GPUs facilitate training.

The algorithms used are rooted in mathematical and statistical principles and have been around as long as the mathematical theories themselves, so what really held Artificial Intelligence back from taking off was computer hardware.

Given the availability of hardware and given that we can train let’s consider the different ways machines can learn.

Machine Learning

Machine Learning is not the same thing as Artificial Intelligence. Machine Learning comes before Artificial Intelligence: Artificial Intelligence consumes the knowledge gained from Machine Learning and determines an appropriate course of action or makes a decision. As such, there is no Artificial Intelligence without Machine Learning, but there can be Machine Learning without Artificial Intelligence.

The most common forms of Machine Learning are Supervised Learning, Unsupervised Learning and Reinforcement Learning;

Supervised Learning is used to train on labelled data. For example, to train a computer to recognize a dog, many pictures of dogs are shown to the computer [the algorithm] and are labelled ‘dog’, so the algorithm learns to associate that four-legged object with a dog.

Popular supervised learning algorithms include Regression, Classification, Kernels, Support Vector Machines, Naïve Bayes Classifiers, Collaborative Filtering, Decision Trees, etc.
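
As a minimal supervised-learning sketch, the snippet below uses scikit-learn (one of the libraries mentioned later in this article) to fit a classifier on synthetic labelled data; the two numeric ‘features’ standing in for dog pictures are entirely hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic labelled data: label 1 = "dog", label 0 = "not a dog".
rng = np.random.default_rng(42)
X_dog = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(50, 2))
X_not = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(50, 2))
X = np.vstack([X_dog, X_not])
y = np.array([1] * 50 + [0] * 50)   # the labels are what make this supervised

# Classification, one of the supervised algorithms listed above.
model = LogisticRegression().fit(X, y)
print(model.predict([[1.8, 2.2], [-1.9, -2.1]]))   # expected: [1 0]
```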

Unsupervised Learning is used to train on unlabelled data. For example, data is presented to the algorithm without any further information, and the algorithm needs to identify common characteristics within the data and perhaps group the records according to those characteristics.

Popular unsupervised learning algorithms include Clustering, Anomaly Detection, Expectation Maximization, Principal Component Analysis, Singular Value Decomposition, etc.
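
For comparison, here is a minimal unsupervised-learning sketch: the same scikit-learn library clusters synthetic, unlabelled points purely by their own characteristics, with no labels supplied at any point.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two blobs of unlabelled points; the algorithm is told nothing about them
# except how many clusters to look for.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
X = np.vstack([group_a, group_b])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:5], kmeans.labels_[-5:])   # two distinct groups emerge
```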

In Reinforcement Learning, the algorithm rewards a particular action or learning outcome, making it more likely that the algorithm will be inclined towards that outcome again.

Examples of reinforcement learning techniques are the Markov Decision Process, Monte Carlo methods, etc.
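
As a minimal reinforcement-learning sketch, the snippet below runs tabular Q-learning on a made-up five-state corridor; the environment, actions and reward are hypothetical, and the only point is that rewarded actions become more likely to be chosen again.

```python
import random

# A toy corridor: states 0..4, with a reward for reaching state 4.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                  # move left or move right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(200):
    s = 0
    while s != GOAL:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s_next == GOAL else 0.0
        # Q-learning update: reinforce actions that lead towards the reward.
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned policy now moves right (+1) from every non-goal state.
print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)])
```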

While some of the learning above can be achieved with techniques other than Artificial Neural Networks, in Artificial Intelligence, ANN shines when it comes to imitating natural intelligence.

Deep Learning is focused on Artificial Neural Networks and their applications. Some of the popular ANN architectures and their applications are listed below;

Convolutional Neural Networks [CNN] – Image recognition and Computer Vision

Recurrent Neural Network [RNN] – Robot Control

Deep Belief Networks [DBN] – Speech Recognition

Long Short Term Memory [LSTM] – Natural Language Processing

Boltzmann Machine – Improve Speech Recognition

Markov Chain – Search and Page Rank

A lot of these algorithms can be implemented from scratch with a good understanding of the mathematical and statistical principles that underlie them, but much of that work is already available in a number of tools and libraries which can be leveraged. Some of the more popular ones include Theano, TensorFlow, Caffe, Torch, Keras, scikit-learn, Neon, etc.
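
As a minimal sketch of putting one of these libraries to work, the snippet below trains a tiny neural network with TensorFlow/Keras on synthetic data. The data and the architecture are illustrative only, and TensorFlow will use a GPU automatically when one is available.

```python
import numpy as np
import tensorflow as tf

# Synthetic tabular data: 1000 rows, 20 columns, with a simple learnable rule.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 20)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

# A small fully connected network built with Keras.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)   # runs on a GPU if present
print(model.evaluate(X, y, verbose=0))                # [loss, accuracy]
```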

Put these libraries together on a GPU [Powerful GPUs are available on the Oracle Cloud] with some sample data and train intelligent models. Collect these trained models into an agent that will make intelligent decisions.

The time is now. Jump start your Artificial Intelligence.


DaniGeek - Emmanuel Giyoh Ngala

 