Graph Neural Networks

[Download this notebook](13 - Graph Neural Networks.ipynb)

In this lesson you’ll learn:

- how molecules can be represented as graphs.
- how basic Graph Neural Networks work.
- how to write a Graph Neural Network as a PyTorch class.

Graph Neural Networks are still a relatively new class of algorithms. Intuitively, molecules can be represented very easily as a (mathematical) graph: the atoms of a molecule correspond to the nodes of the graph and the bonds to the edges.
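To make the atom/node and bond/edge correspondence concrete, here is a minimal sketch in plain Python. The example molecule (ethanol with implicit hydrogens), the variable names, and the helper `adjacency_matrix` are illustrative assumptions and are not taken from the notebook.

```python
# Ethanol (CH3-CH2-OH) with hydrogens left implicit:
# atoms become nodes, bonds become undirected edges.
atoms = ["C", "C", "O"]      # node labels, indexed 0..2
bonds = [(0, 1), (1, 2)]     # edges as pairs of atom indices


def adjacency_matrix(n_atoms, bonds):
    """Build a symmetric 0/1 adjacency matrix from a list of bonds."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        matrix[i][j] = 1
        matrix[j][i] = 1  # a bond has no direction
    return matrix


adjacency = adjacency_matrix(len(atoms), bonds)

# Neighbor lists are what a message-passing layer would iterate over.
neighbors = {
    i: [j for j, bonded in enumerate(row) if bonded]
    for i, row in enumerate(adjacency)
}
```

The adjacency matrix and the neighbor lists carry the same information; which one is more convenient depends on how the network aggregates information from neighboring atoms.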

How to Speed up Pandas?

Pandas is a frequently used Python package, and every aspiring Python data scientist should have some familiarity with it. However, Pandas has many quirks that are not obvious even to advanced users. One of them is that there are several different ways to apply a given function to every row of a data set. As we will see here, the choice between them can make an enormous speed difference of a factor of 1000.
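As a rough illustration of two such methods, the sketch below computes the same column sum once row by row with `DataFrame.apply(..., axis=1)` and once as a vectorized column operation. The data frame and column names are made up for this example, and the actual speed-up depends on the data and the function applied.

```python
import pandas as pd

# Hypothetical data: two integer columns with 1000 rows each.
df = pd.DataFrame({"a": range(1000), "b": range(1000, 2000)})

# Row-wise: the Python function is called once per row.
row_wise = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Vectorized: one operation on whole columns.
vectorized = df["a"] + df["b"]

# Both approaches produce the same values.
same_result = bool((row_wise == vectorized).all())
```

Timing both variants (for example with `timeit`) on a larger frame shows how far apart they are on a given machine.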

Introduction to cheminformatics using rdkit

[Download this notebook](03 - Cheminformatics.ipynb)

In this lesson you’ll learn:

- how to read SMILES using rdkit.
- how to manipulate and visualize molecules.
- how to calculate molecule descriptors.
- how to calculate the similarity of molecules using fingerprints.

Today’s notebook is about the use of Python in cheminformatics. As a case study, you will be looking for an alternative to Sorafenib. Sorafenib is a kinase inhibitor used mainly to treat advanced kidney cancer.
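The idea behind fingerprint similarity can be sketched without any chemistry library. The example below assumes fingerprints are given as sets of "on" bit indices and uses the Tanimoto (Jaccard) coefficient, a commonly used measure. Whether the notebook uses this exact measure is an assumption, and the two fingerprints are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical fingerprints, not derived from real molecules.
fp_reference = {1, 4, 7, 9, 12}
fp_candidate = {1, 4, 8, 9, 15}

similarity = tanimoto(fp_reference, fp_candidate)  # 3 shared bits out of 7 distinct bits
```

A value of 1 means the two fingerprints are identical, and a value of 0 means they share no bits.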

Introduction to Statistics

[Download this notebook](04 - Linear Regression.ipynb)

Today we are going to look at some basics of statistics. Statistics can help us to describe and explain data in a simple way.

In this lesson you’ll learn:

- how to calculate the mean, variance, and standard deviation in Python.
- the difference between a regression and a classification.
- how a linear regression works and the meaning of its coefficients.
- about the Mean Squared Error and the loss function.
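The quantities listed above can be computed with the standard library alone. This is a minimal sketch on invented data: the helpers `predict` and `mse` are illustrative names, and the population (not sample) variance is used as an assumption.

```python
import statistics

# Invented data that follows y = 2 * x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

mean_y = statistics.fmean(ys)
var_y = statistics.pvariance(ys)   # population variance
std_y = statistics.pstdev(ys)      # population standard deviation


def predict(xs, slope, intercept):
    """Predictions of a linear model y = slope * x + intercept."""
    return [slope * x + intercept for x in xs]


def mse(y_true, y_pred):
    """Mean Squared Error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


mse_good = mse(ys, predict(xs, slope=2.0, intercept=0.0))  # model matches the data
mse_bad = mse(ys, predict(xs, slope=1.0, intercept=0.0))   # slope too small
```

The better the coefficients describe the data, the smaller the Mean Squared Error, which is why it can serve as a loss function.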

k-armed bandit: $\varepsilon$-greedy and $\varepsilon$-decay strategies

This tutorial will focus on the multi-armed (k-armed) bandit problem and two solution strategies, namely $\varepsilon$-greedy and $\varepsilon$-decay strategies.

Author: Oliver Mai

Problem setup

The multi-armed bandit problem can be imagined as playing a game of slot machines, where there are multiple arms to pull (either because one bandit has multiple arms or because there are multiple bandits). The goal of the game is then to maximize the rewards obtained by pulling on any of the $k$ arms, without knowing how likely you are to receive a reward from each individual arm.
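The following sketch shows one possible way to implement such a strategy for arms that pay a reward of 1 with a fixed hidden probability. The function name, the reward probabilities, and the multiplicative `decay` factor are assumptions made for this example; the tutorial may define $\varepsilon$-decay differently.

```python
import random


def run_bandit(true_probs, epsilon, steps, decay=1.0, seed=0):
    """Play a k-armed Bernoulli bandit with an epsilon-greedy policy.

    With decay < 1, epsilon shrinks after every step (one simple form of
    epsilon-decay); with decay == 1 this is plain epsilon-greedy.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    counts = [0] * k      # how often each arm was pulled
    values = [0.0] * k    # running estimate of each arm's mean reward
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                          # explore
        else:
            arm = max(range(k), key=lambda a: values[a])    # exploit
        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total_reward += reward
        epsilon *= decay

    return values, counts, total_reward


values, counts, total_reward = run_bandit(
    true_probs=[0.2, 0.5, 0.8], epsilon=0.1, steps=2000
)
```

Comparing `total_reward` for different values of `epsilon` and `decay` gives a feel for the trade-off between exploring unknown arms and exploiting the best estimate so far.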

K-Fold Cross Validation

You may already be familiar with the idea of splitting data into training and test data: you train your model only on the training data and then evaluate it on the unknown test data to see how well it deals with completely new data. Often, you also see a validation data set that is known to the Machine Learning engineer, but not known to the model during the training process.
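K-fold cross validation repeats this split several times so that every sample is used for validation exactly once. The sketch below builds the index sets for each fold in plain Python. The helper `k_fold_indices` is an illustrative name, and the data is not shuffled, which is a simplifying assumption.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, validation_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    folds = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, k=3)
```

A model would then be trained once per fold on `train` and evaluated on `validation`, and the scores averaged over all folds.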

Linear Algebra

[Download this notebook](06 - Linear Algebra for NN.ipynb)

In this lesson you’ll learn:

- about vectors and matrices and how to do simple calculations with them in Python.
- how to calculate the derivative of simple functions.
- how the chain rule works and why it is so useful for neural networks.

Today we will explain the essential mathematical principles for neural networks. The first essential mathematical concept is the vector. A vector represents a point in a space that is described by several values.
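As a small worked example of the chain rule, the sketch below differentiates the composition $h(x) = f(g(x))$ analytically and compares the result with a numerical estimate. The functions $f(u) = u^2$ and $g(x) = 3x + 1$ are chosen purely for illustration.

```python
def g(x):
    return 3 * x + 1


def f(u):
    return u ** 2


def h(x):
    """Composition h(x) = f(g(x)) = (3x + 1)^2."""
    return f(g(x))


def h_prime(x):
    """Chain rule: h'(x) = f'(g(x)) * g'(x) = 2 * g(x) * 3."""
    return 2 * g(x) * 3


def numerical_derivative(func, x, eps=1e-6):
    """Central finite difference as an independent check."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)


x0 = 2.0
analytic = h_prime(x0)
numeric = numerical_derivative(h, x0)
```

A neural network is a long composition of simple functions, so the same rule, applied layer by layer, yields the gradients needed for training.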

Logistic Regression

A typical task solved using machine learning algorithms is the assignment of instances to different classes, a so-called classification. Despite the somewhat misleading name, logistic regression is a method for handling classification problems, in particular two-class classification problems. The classification is based on the calculated probability that an instance belongs to a certain class. Stepping through this notebook, you will become familiar with the fundamental concepts of logistic regression.
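The step from probability to class can be sketched for a single feature as follows. The weight, bias, and threshold of 0.5 are hypothetical values chosen for illustration; in practice the parameters are learned from data.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weight, bias):
    """Probability that an instance with feature value x belongs to class 1."""
    return sigmoid(weight * x + bias)


def predict_class(x, weight, bias, threshold=0.5):
    """Assign class 1 if the probability reaches the threshold, else class 0."""
    return 1 if predict_proba(x, weight, bias) >= threshold else 0


# Hypothetical parameters; the decision boundary lies at x = 2.
weight, bias = 2.0, -4.0

p_boundary = predict_proba(2.0, weight, bias)
class_above = predict_class(3.0, weight, bias)
class_below = predict_class(1.0, weight, bias)
```

Instances on either side of the decision boundary receive probabilities above or below 0.5 and are assigned to the corresponding class.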

Model Selection and Collinearity

This notebook uses a very simple dataset and model to show that problems can arise if the different features in your dataset are collinear or correlated with each other. Although the setup here is deliberately simple, the same problem can also occur in much more complex, high-dimensional data and models and lead to very wrong interpretations of the model coefficients. import pandas as pd import numpy as np import matplotlib.
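Why collinearity makes coefficients hard to interpret can be shown with a tiny constructed example, which is not the dataset used in the notebook. Here the second feature is exactly twice the first, so very different coefficient pairs produce identical predictions.

```python
# Constructed data: x2 is perfectly collinear with x1.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0 * v for v in x1]


def predict(w1, w2):
    """Predictions of the linear model y = w1 * x1 + w2 * x2."""
    return [w1 * a + w2 * b for a, b in zip(x1, x2)]


# Three different coefficient pairs that all satisfy w1 + 2 * w2 == 3.
pred_a = predict(3.0, 0.0)
pred_b = predict(1.0, 1.0)
pred_c = predict(-1.0, 2.0)
```

Because all three models fit the data equally well, the data alone cannot tell which coefficients are "right", and even the sign of a coefficient can flip.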

OpenAI Gym Example: CartPole

This notebook introduces the Python package gym from OpenAI and employs a basic search strategy for finding a policy in the frequently used environment “CartPole-v1”.

Author: Oliver Mai

First we import the relevant packages:

    import gym  # the package that supplies us with environments and useful tools
    import numpy as np  # for later array manipulation

    seed = 42069  # general seed for reproducibility
    np.random.seed(seed)

Environment

Now we import the CartPole-v1 environment and take a random action to have a look at it and how it behaves.