Data scientists need a variety of algorithms in their toolkit. First, Linear Regression is a statistical method for predicting a continuous dependent variable from one or more independent variables. It’s a must-know for anyone in the field. Similarly, Logistic Regression, used when the dependent variable is categorical (most commonly binary), is another essential tool.
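To make the idea of predicting one variable from another concrete, here is a minimal sketch of simple linear regression using the closed-form least-squares solution. The function names and the toy data are illustrative assumptions, not something taken from the source article.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the dependent variable for a new value of x."""
    return intercept + slope * x


# Toy data (made up for illustration) that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_linear_regression(xs, ys)
prediction = predict(slope, intercept, 5.0)
```

On this toy data the fitted line recovers a slope of 2 and an intercept of 1. Real data would be noisy, and a library implementation would usually be preferable in practice.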
Decision Trees, which use a tree-like model of decisions, are also crucial. They’re simple to understand and interpret, making them a popular choice. Random Forests, an ensemble learning method, improve upon single Decision Trees by averaging many trees trained on random subsets of the data, which reduces overfitting.
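A decision tree is built by repeatedly choosing the split that best separates the classes. The sketch below shows that single step for one numeric feature, using Gini impurity as the split criterion. Gini is one common choice and is assumed here; the helper names and the toy data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best rule `value <= threshold`."""
    best = (None, float("inf"))
    sorted_vals = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(sorted_vals, sorted_vals[1:]):
        threshold = (lo + hi) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data (made up): hours studied versus exam outcome.
hours = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
passed = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(hours, passed)
```

A full tree would apply `best_split` recursively to each side of the split. A Random Forest would then train many such trees on resampled data and combine their votes.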
K-Nearest Neighbours (KNN) is a simple algorithm for both classification and regression. It’s easy to implement, but prediction can be computationally expensive on large datasets because each query is compared against the stored training points. The Support Vector Machine (SVM) algorithm, used for classification and regression tasks, is another important tool. It’s effective in high-dimensional spaces and, with kernel functions, can handle problems whose classes are not linearly separable.
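The following is a minimal sketch of KNN classification, assuming Euclidean distance and a simple majority vote. The function name, the default `k=3`, and the toy points are illustrative assumptions. It also shows where the cost comes from: every prediction measures the distance to every training point. `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Distance from the query to every training point: O(n) work per prediction.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data (made up): two well-separated groups in the plane.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]

near_first_group = knn_predict(points, labels, (2, 2))
near_second_group = knn_predict(points, labels, (8.5, 8.5))
```

For large datasets, spatial indexes such as k-d trees are commonly used to avoid scanning every point, at the price of a more complicated implementation.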
There’s also the K-Means algorithm, an unsupervised learning method used for clustering. It’s simple and efficient, making it widely used. Finally, the Naïve Bayes classifier, based on Bayes’ theorem, is essential due to its simplicity and efficiency on high-dimensional datasets. These algorithms form the bedrock of any data scientist’s toolkit.
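As a rough illustration of how K-Means clusters unlabelled data, here is a minimal sketch of the standard assign-then-update loop (often called Lloyd's algorithm). It assumes the caller supplies the starting centroids, which keeps the example deterministic. The function signature and the toy data are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters


# Toy data (made up): two obvious groups, with one starting centroid in each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, [(1.0, 1.0), (9.0, 8.0)])
```

In practice the result depends on the starting centroids, so library implementations typically run several random initialisations and keep the best one.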
Go to source article: http://www.quora.com/What-are-the-top-algorithms-that-every-data-scientist-should-have-in-their-toolbox