One of today’s most overused buzzwords is “Artificial Intelligence”. Both the technical and the general press are full of articles about machines that drive autonomous cars and invent new languages. Many also describe intelligent machines as a threat to humanity. Machine Learning is an essential part of the AI puzzle, and Deep Learning is one of the most popular approaches to implementing Machine Learning. Interestingly, Deep Learning is not new. Geoffrey Hinton and his co-authors demonstrated the use of back-propagation of errors for training multi-layer neural networks in 1986, more than 30 years ago. Even earlier, in the 1960s, Kelley, Bryson and Ho published research papers on dynamic optimization which many consider the basis for back-propagation. Generations of researchers have shown that, given enough data, neural networks can be trained to recognize things. This training consists of slow, iterative adjustments that allow the network to progressively configure itself to produce the desired answer. Deep Learning is not new, but it recently became popular because of the availability of GPU/TPU/VPU architectures, which offer some level of parallelism and therefore deliver acceptable performance for some applications.
What are the key differences between Deep Learning and RBF-based solutions?
Online vs Offline learning
Deep Learning is an offline learning process. The learning phase and the execution (inference) phase are separate and, very often, do not even run on the same machine. Typically, the learning phase happens in a data center. A massive data set is crunched to generate a neural network. This takes huge computing resources and can take days, depending on the size of the data set and the number of layers in the network. Once the network has been generated, it can be executed to perform the required recognition tasks. Inference can sometimes run on relatively low-power devices (Intel-Movidius or Nvidia Jetson are good examples of embedded inference platforms that are not capable of embedded learning). More often, powerful PCs with GPU accelerators are used, leading to significant cost and power consumption. Moreover, as the training dataset grows, there is no guarantee that the target hardware will remain sufficient, and users may have to upgrade their inference hardware to run a newly generated network properly. In a way, this is similar to the PC world, where you have to upgrade your hardware regularly if you want to run the newest games. This continuous and fast upgrade cycle drives a healthy consumer business but is unacceptable in an industrial environment.
The most important limitation of this approach is that new training data cannot be incorporated directly and immediately into the executable knowledge. In a fairly static environment where the training data does not change often, this may not be a problem. For example, speed limit road signs are always the same, so you don’t need to learn new ones dynamically. However, in an industrial environment, novelty is very common. New components, new suppliers and new configurations appear almost every day, and it’s critical to be able to train an industrial machine dynamically, on the job, just like we train operators. In fact, modern manufacturing techniques tend to encourage novelty, with smaller volumes per product and a higher level of customization.
Our approach, which uses a neuromorphic technology, solves that problem. Training can happen online, at any time, dynamically. Also, unlike Deep Learning networks, RBF networks are free of convergence problems, and they can be easily mapped onto hardware because the structure of the network does not change with the learned data. This ability to map the complete network onto specialized hardware allows RBF networks to reach unbeatable performance in terms of speed and power dissipation, both for learning and for recognition. In contrast, any other neural network solution that learns by back-propagation of errors needs to be mapped (and remapped after each learning process) onto programmable hardware (CPU, GPU or FPGA, with or without specific hardware assist), which is a lot costlier in terms of complexity and power dissipation. Deep Learning is fundamentally a software technology which requires powerful, expensive and power-hungry hardware to achieve reasonable levels of performance. It often also requires a fair amount of hand coding and tuning to deliver useful performance on the target hardware and is therefore not easily portable.
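To make online learning concrete, here is a minimal sketch of additive training in a prototype-based RBF classifier, in the style of the Restricted Coulomb Energy (RCE) scheme. It is an illustration under stated assumptions, not the actual hardware or vendor implementation: the class `OnlineRbfMemory`, its `max_radius` and `min_radius` parameters, and the choice of Manhattan distance are all hypothetical. Each call to `learn` handles one example immediately. It shrinks the influence field of any conflicting prototype and commits a new prototype only if the example is not already recognized.

```python
from dataclasses import dataclass, field


@dataclass
class Prototype:
    center: list    # stored example vector
    label: str      # category taught with that example
    radius: float   # influence field: inputs closer than this "fire" the prototype


def l1_distance(a, b):
    """Manhattan distance between two equal-length vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


@dataclass
class OnlineRbfMemory:
    """Hypothetical RCE-style prototype memory, for illustration only."""
    max_radius: float = 10.0
    min_radius: float = 0.5
    prototypes: list = field(default_factory=list)

    def learn(self, vector, label):
        """Teach one example; return True if a new prototype was committed."""
        covered = False
        new_radius = self.max_radius
        for p in self.prototypes:
            d = l1_distance(vector, p.center)
            if p.label == label:
                if d < p.radius:
                    covered = True  # an existing prototype already recognizes it
            else:
                if d < p.radius:
                    # A prototype of another category fires on this example:
                    # shrink its influence field (never below min_radius).
                    p.radius = max(d, self.min_radius)
                # The new prototype must not reach centers of other categories.
                new_radius = min(new_radius, d)
        if covered:
            return False
        self.prototypes.append(
            Prototype(list(vector), label, max(new_radius, self.min_radius))
        )
        return True


memory = OnlineRbfMemory()
memory.learn([0, 0], "good")   # first example: committed with max_radius
memory.learn([6, 0], "bad")    # conflicts with "good": shrink it, commit "bad"
memory.learn([1, 0], "good")   # already covered: no new prototype is added
print(len(memory.prototypes))  # 2
```

Nothing learned earlier is discarded in this sketch: teaching a new example only adds a prototype or narrows an existing influence field.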
Local vs Remote learning
Another issue with Deep Learning is that data is crunched in a data center, which usually means that it is handled on someone else’s computer. This may create confidentiality or security issues. Many industrial customers prefer to keep their precious data local. The data used by industrial customers is very sensitive because it may contain their process and quality secrets. Ownership and control of this data are, more often than not, critical to their business. With our approach, precious data stays local. It is learned and then recognized on the same machine, in a totally controlled environment. This allows domain experts to train the machines themselves, without having to outsource the training process to IT specialists who do not necessarily understand the meaning behind the data. The domain expert has a lot more control over the training of the machine and has full control over the qualification of that machine and its release to production.
Additive learning vs Forget-and-Learn-from-scratch learning
When a Deep Learning based system needs to learn something new, it has to forget everything it knew before and learn from scratch, based on the new dataset. In a way, this is similar to the “old manufacturing” style, in which you have to break the old mold and build a new one if you want a different plastic casing. In our modern world of additive manufacturing and flexible production chains, it is paradoxical to introduce a machine learning technology which is not additive in nature. Besides the lack of flexibility, this creates another potential problem. When a Deep Learning based system “batch-learns” from a new, incrementally better dataset, there is no guarantee that previous results will be maintained. In an inspection system, parts that were good before may be bad now, and vice versa.
Unlike Deep Learning, RBF learning is an additive process. Deep Learning also requires a lot of training data to produce acceptable results, whereas the RBF classifier, even with minimal training, will output the closest match along with a confidence factor. It is also capable of pinpointing uncertainties and unknowns, which enables dynamic learning. Redundant RBF classifiers can also work in parallel, using different features, to produce more robust decisions. The ability to detect unknown situations is essential for the implementation of anomaly detection in predictive maintenance applications.
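The recognition behaviour described above can be sketched as follows. This is a hedged illustration, not a product API: the `(center, label, radius)` prototype layout, the Manhattan distance, the three status names, and the confidence formula (one minus distance divided by radius) are all assumptions made for the example.

```python
def l1_distance(a, b):
    """Manhattan distance between two equal-length vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def recognize(prototypes, vector):
    """Return (status, label, confidence) for one input vector.

    prototypes is a list of (center, label, radius) tuples, a hypothetical
    layout chosen for this sketch.
    """
    # Keep only the prototypes whose influence field contains the input.
    firing = []
    for center, label, radius in prototypes:
        d = l1_distance(vector, center)
        if d < radius:
            firing.append((d, label, radius))

    if not firing:
        # No prototype fires: the input is reported as unknown (a novelty).
        return ("unknown", None, 0.0)

    # The closest firing prototype gives the best match.
    d, label, radius = min(firing, key=lambda f: f[0])
    confidence = 1.0 - d / radius  # assumed formula: 1.0 at the center

    # If firing prototypes disagree on the category, flag the uncertainty.
    labels = {f[1] for f in firing}
    status = "identified" if len(labels) == 1 else "uncertain"
    return (status, label, confidence)


prototypes = [([0, 0], "good", 6.0), ([6, 0], "bad", 5.0)]
print(recognize(prototypes, [0, 0]))    # one prototype fires
print(recognize(prototypes, [2, 0]))    # both fire and disagree
print(recognize(prototypes, [20, 20]))  # nothing fires
```

An "unknown" or "uncertain" result is the natural trigger for asking an operator to label the sample and teaching it to the classifier, or for raising an anomaly alert in a maintenance application.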
Predictable recognition latency
For all industrial applications, low and constant latency is a very desirable feature because it guarantees high and predictable productivity. With Deep Learning, latency varies. Typically, the more the system learns, the slower it gets. This is due to the von Neumann bottleneck found in all conventional computers, which are sequential by nature. Even the most modern multi-core architectures, and even the best GPU/TPU/VPU architectures, are limited in their parallelism because some resources (cache, external memory access bus, …) are shared between the cores. The neuromorphic architecture goes beyond the von Neumann paradigm and, thanks to its in-memory processing and fully parallel nature, does not slow down when the training dataset grows.
In addition, the shallow nature (three layers) of RBF networks is not a disadvantage for such applications, as researchers have shown that three layers are sufficient to solve any pattern classification problem. The quality of the recognition is therefore not compromised.
Deep Learning is an exciting field of research, and it has produced amazing results in many Cloud-based applications where its limitations are not critical. However, in an industrial, real-time environment that demands high productivity, high predictability and high flexibility, we consider that Deep Learning is not the best approach to the machine learning problems the market is facing in inspection, monitoring, maintenance and robotics applications. In fact, any environment which needs dynamic on-the-job learning, fast and predictable latency, and easy auditing of decisions is likely to be better served by RBF neural networks than by Deep Learning neural networks.
Philippe Lambinet is a senior executive in the semiconductor industry, with a proven track record in developing successful large businesses and leading global teams, now engaged in a new adventure. With a few friends, he created Cogito Instruments to deliver embedded machine intelligence. Philippe believes that cognitive processing can and should be done inside the machines, at the edge of the Industry 4.0 network.