ML Engineer Job Prep
Jun 23, 2022
# Computer Science Fundamentals and Programming
- Data structures: Lists, stacks, queues, strings, hash maps, vectors, matrices, classes & objects, trees, graphs, etc.
- Algorithms: Recursion, searching, sorting, optimization, dynamic programming, etc.
- Computability and complexity: P vs. NP, NP-complete problems, big-O notation, approximation algorithms, etc.
- Computer architecture: Memory, cache, bandwidth, threads & processes, deadlocks, etc.
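As one small illustration of the recursion, dynamic programming, and big-O items above, here is a minimal sketch that computes Fibonacci numbers three ways. The function names (`fib_naive`, `fib_memo`, `fib_table`) are made up for this example; the point is only how caching subproblems changes the running time.

```python
from functools import lru_cache


def fib_naive(n: int) -> int:
    """Plain recursion: exponential time, because subproblems are recomputed."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n: int) -> int:
    """Top-down dynamic programming: each subproblem is solved once, O(n) time."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


def fib_table(n: int) -> int:
    """Bottom-up dynamic programming: O(n) time, O(1) extra space."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

The naive version is fine for small `n` but becomes impractical quickly, while the two dynamic programming versions handle `n = 50` immediately.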
# Probability and Statistics
- Basic probability: Conditional probability, Bayes' rule, likelihood, independence, etc.
- Probabilistic models: Bayesian networks, Markov decision processes, hidden Markov models, etc.
- Statistical measures: Mean, median, mode, variance, population parameters vs. sample statistics, etc.
- Proximity and error metrics: Cosine similarity, mean-squared error, Manhattan and Euclidean distance, log-loss, etc.
- Distributions and random sampling: Uniform, normal, binomial, Poisson, etc.
- Analysis methods: ANOVA, hypothesis testing, factor analysis, etc.
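The proximity and error metrics listed above are short enough to write out by hand, which can be a useful exercise. The following is a sketch in plain Python, assuming equal-length inputs and non-zero vectors for cosine similarity; the function names are illustrative, and `log_loss` here is the binary (two-class) form only.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b (assumes neither is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def log_loss(y_true: Sequence[int], p_pred: Sequence[float], eps: float = 1e-15) -> float:
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(y_true)
```

For example, `euclidean_distance([0, 0], [3, 4])` is the hypotenuse of a 3-4-5 triangle, while `manhattan_distance` on the same points adds the two legs.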
# Data Modeling and Evaluation
- Data preprocessing: Munging/wrangling, transforming, aggregating, etc.
- Pattern recognition: Correlations, clusters, trends, outliers & anomalies, etc.
- Dimensionality reduction: Eigenvectors, Principal Component Analysis, etc.
- Prediction: Classification, regression, sequence prediction, etc.; suitable error/accuracy metrics.
- Evaluation: Training-testing split, sequential vs. randomized cross-validation, etc.
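To make the sequential vs. randomized cross-validation distinction above concrete, here is a rough sketch of both splitting schemes over sample indices. The helper names (`randomized_k_fold`, `sequential_splits`) and the expanding-window scheme are assumptions chosen for illustration; libraries offer their own variants.

```python
import random
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def randomized_k_fold(n_samples: int, k: int, seed: int = 0) -> Iterator[Split]:
    """Shuffle indices, then hold out each of k folds in turn."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def sequential_splits(n_samples: int, k: int) -> Iterator[Split]:
    """Expanding window: train on the past, test on the next block.

    Assumes n_samples >= k + 1 so every block is non-empty.
    """
    fold_size = n_samples // (k + 1)
    for i in range(1, k + 1):
        train = list(range(0, i * fold_size))
        test = list(range(i * fold_size, (i + 1) * fold_size))
        yield train, test
```

The sequential version never lets a training index come after a test index, which is what keeps future information out of the training set for time-ordered data. The randomized version makes no such guarantee and suits data without a meaningful order.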
# Applying Machine Learning Algorithms and Libraries
- Models: Parametric vs. non-parametric, decision tree, nearest neighbor, neural net, support vector machine, ensemble of multiple models, etc.
- Learning procedure: Linear regression, gradient descent, genetic algorithms, bagging, boosting, and other model-specific methods; regularization, hyperparameter tuning, etc.
- Tradeoffs and gotchas: Relative advantages and disadvantages, bias and variance, overfitting and underfitting, vanishing/exploding gradients, missing data, data leakage, etc.
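As a worked example of the learning procedures above, the sketch below fits a one-variable linear regression with batch gradient descent on mean-squared error. The function name `fit_linear_gd` and the default learning rate and epoch count are assumptions picked for this toy data, not recommended settings.

```python
from typing import Sequence, Tuple


def fit_linear_gd(
    xs: Sequence[float],
    ys: Sequence[float],
    lr: float = 0.05,
    epochs: int = 2000,
) -> Tuple[float, float]:
    """Fit y ~ w * x + b by minimizing mean-squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should recover w = 2, b = 1.
w, b = fit_linear_gd([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
```

Too large a learning rate makes the updates diverge, and too small a rate makes convergence slow, which is one reason hyperparameter tuning appears in the same list.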
# Software Engineering and System Design
- Software interface: Library calls, REST APIs, data collection endpoints, database queries, etc.
- User interface: Capturing user inputs & application events, displaying results & visualization, etc.
- Scalability: Map-reduce, distributed processing, etc.
- Deployment: Cloud hosting, containers & instances, microservices, etc.
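The map-reduce item under scalability can be sketched on a single machine as a word count, the usual introductory example. This is only a toy model of the three phases, with hypothetical function names; a real framework would run the map and reduce steps on separate workers and handle the shuffle over the network.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework would between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Combine each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(pairs))
```

Because each document is mapped independently and each key is reduced independently, both phases can be spread across machines without changing the result.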