A visual introduction to machine learning | SHIFT*: Digital Capability Acceleration

Machine learning is a scientific discipline that uses algorithms to find patterns in data and learn from them. It is a practical tool for complex prediction tasks, such as predicting whether a house will sell above or below its asking price. The algorithms learn from a dataset, a collection of many individual ‘items’. Each item has a set of ‘features’, measurable properties whose values vary from item to item.
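As a rough illustration of these terms, the sketch below represents a tiny dataset in Python. The house records, the feature names (`neighbourhood`, `size_m2`, `rooms`) and all of the values are invented for this example and do not come from the source article.

```python
# Hypothetical toy dataset: every value here is made up for illustration.
# Each dictionary is one item (a house); each key is a feature.
houses = [
    {"neighbourhood": "Riverside", "size_m2": 85, "rooms": 3},
    {"neighbourhood": "Old Town", "size_m2": 60, "rooms": 2},
    {"neighbourhood": "Hillcrest", "size_m2": 120, "rooms": 5},
]

# The feature names are shared by all items; the values vary across items.
feature_names = sorted(houses[0])
sizes = [house["size_m2"] for house in houses]

print(f"{len(houses)} items, features: {feature_names}")
print(f"size_m2 ranges from {min(sizes)} to {max(sizes)}")
```

A real dataset would have many more items and usually more features, but the shape is the same: one record per item, one field per feature.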

In the context of housing, features could include the neighbourhood, size, or number of rooms. The algorithm learns how these features relate to an ‘outcome’, such as the house price, from examples where the outcome is already known. This process is known as ‘training’ the model. The model’s accuracy is then measured on a separate set of data that it did not see during training, which shows how well its predictions generalise.
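The train-then-test idea can be sketched with a deliberately tiny model. The example below is an assumption-laden illustration, not the method used in the source article: the data are invented, the only feature is size, the outcome is whether the house sold above asking (1) or not (0), and the "model" is a single size threshold chosen by the hypothetical helper `fit_threshold`.

```python
# Invented data: (size_m2, sold_above_asking). Not taken from the source article.
train = [(50, 0), (60, 0), (70, 0), (80, 1), (90, 1), (100, 1)]
test = [(55, 0), (72, 1), (85, 1), (95, 1)]  # held out, never used for fitting


def accuracy(threshold, rows):
    """Fraction of rows where the rule 'size > threshold' matches the label."""
    correct = sum((size > threshold) == bool(label) for size, label in rows)
    return correct / len(rows)


def fit_threshold(rows):
    """'Training': try midpoints between neighbouring sizes, keep the best one."""
    sizes = sorted(size for size, _ in rows)
    candidates = [(a + b) / 2 for a, b in zip(sizes, sizes[1:])]
    return max(candidates, key=lambda t: accuracy(t, rows))


threshold = fit_threshold(train)            # learned from training data only
train_accuracy = accuracy(threshold, train)
test_accuracy = accuracy(threshold, test)   # measured on unseen data

print(f"threshold={threshold}, train={train_accuracy:.2f}, test={test_accuracy:.2f}")
```

In this toy case the rule fits every training example but misclassifies one held-out house, which is why accuracy on unseen data is the figure that matters.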

Machine learning models can be simple, such as a single decision tree, or more complex, such as a random forest, which combines the predictions of many decision trees. The complexity of a model affects its performance. If the model is too simple, it may miss important patterns (underfitting); if it is too complex, it may overfit, performing well on the training data but poorly on new data. Finding the balance between simplicity and complexity is a key challenge in machine learning.
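The trade-off can be made concrete with three toy models of increasing flexibility. This is a sketch under stated assumptions: the data are invented (two training labels are deliberately noisy, and the test labels follow a clean size rule by construction), and the models are stand-ins rather than the decision trees and random forests named above. A majority-class guess plays the too-simple model, a single size threshold the balanced one, and a nearest-neighbour lookup that memorises the training set the too-complex one.

```python
# Invented data: (size_m2, sold_above_asking). The labels at 55 and 90 are
# deliberate noise; the test labels follow the clean rule "size > 75".
train = [(45, 0), (50, 0), (55, 1), (60, 0), (65, 0), (70, 0),
         (80, 1), (85, 1), (90, 0), (95, 1), (100, 1)]
test = [(52, 0), (57, 0), (62, 0), (67, 0), (72, 0),
        (78, 1), (83, 1), (88, 1), (92, 1), (97, 1)]


def accuracy(predict, rows):
    """Fraction of rows for which predict(size) equals the label."""
    return sum(predict(size) == label for size, label in rows) / len(rows)


def fit_majority(rows):
    """Too simple: ignore the feature and always predict the commonest label."""
    labels = [label for _, label in rows]
    majority = max(set(labels), key=labels.count)
    return lambda size: majority


def fit_threshold(rows):
    """Balanced: one cut point on size, chosen to maximise training accuracy."""
    sizes = sorted(size for size, _ in rows)
    candidates = [(a + b) / 2 for a, b in zip(sizes, sizes[1:])]

    def rule(t):
        return lambda size: int(size > t)

    best = max(candidates, key=lambda t: accuracy(rule(t), rows))
    return rule(best)


def fit_nearest_neighbour(rows):
    """Too flexible: copy the label of the closest training item, noise included."""
    return lambda size: min(rows, key=lambda row: abs(row[0] - size))[1]


fitters = {
    "majority": fit_majority,
    "threshold": fit_threshold,
    "nearest_neighbour": fit_nearest_neighbour,
}
results = {}
for name, fit in fitters.items():
    model = fit(train)
    results[name] = (accuracy(model, train), accuracy(model, test))
    print(f"{name:18s} train={results[name][0]:.2f} test={results[name][1]:.2f}")
```

The memorising model scores perfectly on the training data yet does worse on the test set than the single threshold, because it has also learned the two noisy labels. The majority guess does poorly on both.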

Go to source article: http://www.r2d3.us/visual-intro-to-machine-learning-part-1/