An artificial neural network (NN for short) is a classifier. In supervised machine learning, classification is one of the most prominent problems. The aim is to assort objects into classes (terminology not to be confused with Object Oriented programming). Classification has a broad domain of applications, for example:
- In image processing we may seek to distinguish images depicting different kinds (classes) of objects (e.g. cars, bikes, buildings etc) or different persons,
- In natural language processing (NLP) we may seek to classify texts into categories (e.g. distinguish texts that talk about politics, sports, culture etc),
- In financial transactions processing we may seek to decide if a new transaction is legitimate or fraudulent.
The term “supervised” refers to the fact that the algorithm is previously trained with “tagged” examples for each category (i.e. examples whose classes are made known to the NN) so that it learns to classify new, unseen examples in the future.
In simple terms, a classifier accepts a number of inputs, which are called features and collectively describe an item to be classified (be it a picture, text, transaction or anything else as discussed previously), and outputs the class it believes the item belongs to. For example, in an image recognition task, the features may be the array of pixels and their colors. In an NLP problem, the features are the words in a text. In finance several properties of each transaction such as the daytime, cardholder’s name, the billing and shipping addresses, the amount etc.
It is important to understand that here we assume that there is an underlying real relationship between the characteristics of an item and the class it belongs to. The goal of running a NN is: Given a number of examples, try and come up with a function that resembles this real relationship (Of course, you’ll say: you are geeks, you are better with functions than relationships!) This function is called the predictive model or just the model because it is a practical, simplified version of how items with certain features belong to certain classes in the real world.
Get comfy with using the word “function” as it comes up quite often, it is a useful abstraction for the rest of the conversation (no maths involved). You might be interested to know that a big part of the work that Data Scientists do (the dudes that work on such problems) is to figure out exactly which are the features that better describe the entities of the problem at hand, which is similar to saying which characteristics seem to distinguish items of one class from those of another. This process is called feature selection.
- Posted by Udit Agarwal
- November 16, 2017