Spark introduction
This lecture will be an abstract overview; we will discuss:
- Spark
- Spark vs MapReduce
- Spark RDDs
- Spark DataFrames
When data gets too large to be handled in memory (most computers usually have at most 32 GB of RAM), it is possible to use a distributed system.
In this lecture, we learn the basics of how to perform unsupervised and supervised link prediction. We will cover the following techniques:
“But it must be recognized that the notion ‘probability of a sentence’ is an entirely useless one, under any known interpretation of this term.”
As per the MXNet documentation:
Word2Vec rocks!
Much like the Fourier transform expresses periodic functions as a sum of sines and cosines, the wavelet transform expresses signals as a weighted sum of a special kind of functions, wavelets. Both use some inner product (scalar product and convolution) of an input signal and a given kernel / mask. The difference lies in the kernel of the transformation.
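In symbols (a sketch in standard notation, which may differ from the notation used later): both are inner products of the signal $f$ with a kernel, but the Fourier kernel is a complex exponential while the wavelet kernel is a scaled ($a$) and shifted ($b$) copy of a mother wavelet $\psi$:

$$\hat{f}(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{-i\omega t}\, dt, \qquad W_{f}(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{+\infty} f(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt$$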
In image processing, a Gabor filter (named after Dennis Gabor) is a linear filter used for texture analysis.
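One common parameterization of the (real) Gabor kernel - a Gaussian envelope modulating a sinusoidal carrier - is the following (the symbols $\lambda, \theta, \psi, \sigma, \gamma$ denote wavelength, orientation, phase, envelope width and aspect ratio; this is the textbook form, not necessarily the one used later):

$$g(x, y) = \exp\!\left(-\frac{x'^{2} + \gamma^{2} y'^{2}}{2\sigma^{2}}\right) \cos\!\left(2\pi \frac{x'}{\lambda} + \psi\right), \qquad x' = x\cos\theta + y\sin\theta, \quad y' = -x\sin\theta + y\cos\theta$$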
Overview of supervised learning, the missing piece, and the drawbacks of BOW (bag of words).
Word2Vec rocks!
The problem of image analysis and understanding has gained high prominence over the last decade, and has emerged at the forefront of signal and image processing research (read more here). As a consequence, my first post on computer vision will deal with the basics of image understanding.
This post deals with the basics of filtering (a family of methods), which is extremely useful in computer vision.
In a model-free setting, the transition probabilities are unknown and the agent must interact with the environment. An additional challenge is the possibility that the control policy is different from the one being estimated; this is called off-policy learning, but we will focus on the on-policy case in this blog post. We will leverage a simulator and a policy $\pi$ - coupled with our knowledge of $S, A, \gamma$ - to run episodes and improve the policy from sampled data.
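As a rough sketch of the on-policy evaluation step (the `simulator` and `policy` interfaces below are assumptions for illustration, not the post's code): run whole episodes with $\pi$ and average the sampled discounted returns to estimate $V^{\pi}$.

```python
from collections import defaultdict

def run_episode(simulator, policy, max_steps=100):
    """Sample one episode [(s, a, r), ...] by following the policy."""
    state = simulator.reset()
    trajectory = []
    for _ in range(max_steps):
        action = policy(state)
        next_state, reward, done = simulator.step(state, action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory

def monte_carlo_evaluation(simulator, policy, gamma, n_episodes=1000):
    """Every-visit Monte Carlo estimate of V^pi from sampled returns."""
    returns = defaultdict(list)
    for _ in range(n_episodes):
        G = 0.0
        # Walk the episode backwards to accumulate discounted returns.
        for state, action, reward in reversed(run_episode(simulator, policy)):
            G = reward + gamma * G
            returns[state].append(G)
    return {s: sum(gs) / len(gs) for s, gs in returns.items()}
```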
Reinforcement learning finds its roots in several scientific fields, such as Deep learning, Psychology, Control and Statistics (but is not limited to these!). It typically consists of taking suitable actions to maximize reward in a particular situation. Below is a common illustration of its core idea:
Dynamic Programming is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving (often recursively) each of those subproblems just once, and storing their solutions using a memory-based data structure (array, map, etc.).
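A tiny, self-contained illustration of the "solve each subproblem once and store it" idea (Fibonacci is only a stand-in example, not something from the post):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """n-th Fibonacci number: each subproblem is solved once, then cached."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025, using O(n) subproblems instead of O(2^n) calls
```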
A major limitation of vanilla Neural Networks and Convolutional Neural Networks is that their API is rather constrained:
This blog post is the first of a series whose aim is both to introduce what Decision Modeling is and to build the corresponding mathematical framework. Decision models are very important - particularly in business - as they reduce stress and deal with uncertainty. Supported by ever-increasing amounts of data and sophisticated algorithms, their growing power has captured plenty of C-suite attention in recent years. From very accurate predictions to guiding knotty optimization choices, decision models are essential and worthy of interest.
Social choice theory is an established field and a cornerstone of countless others: Economics, Political science, Computer science, Applied mathematics and Operational Research. As for AI applications, it turns out to be really useful for developing multi-agent systems.
A collective decision problem
Multiple-criteria decision analysis (MCDA) is a sub-discipline of operations research that explicitly evaluates multiple conflicting criteria in decision making (both in daily life and in settings such as business, government and medicine).
Deep learning has been all the rage for the last few years. Powered by ever-increasing compute, memory and bigger datasets, neural networks are easier to train than before. However, despite the tremendous success of certain applications (computer vision, Go and Dota…), the theoretical understanding of what a neural net actually does is taking more time to burgeon.
An embedding is a representation of an object (word, image) formulated as a continuous vector. Embeddings are constructed so that similar objects have similar embeddings (metric learning). Usually, embeddings are not the final goal but are rather used as features (feature learning).
Goal: Use word embeddings to embed larger chunks of text!
First-order methods: gradient descent and variants.
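As a minimal sketch of the basic update $x_{k+1} = x_{k} - \eta \nabla f(x_{k})$ (the quadratic objective below is only a toy example, not one taken from the post):

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.1, n_steps=100):
    """Iterate x_{k+1} = x_k - lr * grad(x_k) for a fixed number of steps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - lr * grad(x)
    return x

# Toy example: minimize f(x) = ||x - 3||^2, whose gradient is 2 * (x - 3).
x_star = gradient_descent(lambda x: 2 * (x - 3), x0=[0.0, 0.0])
print(x_star)  # close to [3., 3.]
```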
This post covers the history of Deep Learning, from the Perceptron to the Multi-Layer Perceptron Network.
The notion of community structure captures the tendency of nodes to be organized into communities, where members within a community are more similar to each other.
Hubs are encountered in most real networks. They represent a signature of a deeper organizing principle that we call the scale-free property.
Hubs represent the most striking difference between a random and a scale-free network.
Network science aims to build models that reproduce the properties of real networks, as most networks we encounter are irregular and look like they were spun randomly.
Naïve Bayes is a generative learning algorithm for discrete-valued inputs. In particular, it is known to work great on text classification tasks like spam detection.
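The core of the model, in standard notation (a sketch; the post's notation may differ): apply Bayes' rule and assume the features $x_{1}, \dots, x_{n}$ are conditionally independent given the class $y$, so that the prediction reduces to

$$\hat{y} = \arg\max_{y} \; p(y) \prod_{i=1}^{n} p(x_{i} \mid y).$$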
Discriminative learning algorithms are algorithms that try to learn $p(y \mid x)$ directly (such as logistic regression), or algorithms that try to learn mappings directly from the space of inputs $X$ to the labels $\{0, 1\}$ (such as the perceptron algorithm).
In layman's terms, a linear regression consists in predicting a continuous dependent variable $Y$ as a linear combination of the independent variables $X = \{x_{1}, \dots, x_{n}\}$. The contribution of each variable $x_{i}$ is expressed by a parameter $\beta_{i}$. Altogether, it is a simple weighted sum:
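In one standard formulation (the intercept $\beta_{0}$ and noise term $\varepsilon$ are added here for completeness; the original post's notation may differ):

$$y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \dots + \beta_{n} x_{n} + \varepsilon$$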
LDA is closely related to PCA, as both are linear methods.
TL;DR: Logistic Regression is a simple but oftentimes efficient algorithm that tackles binary classification.
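Concretely, it squashes a weighted sum of the inputs through the sigmoid to estimate the probability of the positive class (standard notation, not necessarily the post's):

$$p(y = 1 \mid x) = \sigma(\beta^{\top} x) = \frac{1}{1 + e^{-\beta^{\top} x}}$$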
Gentle introduction to convexity, derivatives and the taxonomy of problems in optimization.
Goal: Make use of the Lagrangian and other methods to accommodate constraints.
Main assumption: $f: \mathbb{R}^{n} \rightarrow \mathbb{R}$ is $\mathcal{C}^{1}$ or $\mathcal{C}^{2}$.
In a nutshell, Machine Learning is a sub-field of Artificial Intelligence whose aim is to convert experience into expertise or knowledge. Born in the 1960s, it quickly grew into a separate field because of a focus shift away from decisional AI (the logical and knowledge-based approach); its aim is not General Artificial Intelligence but rather to tackle solvable problems of a practical nature. Ever since the 1990s, Machine Learning has been progressing and flourishing rapidly. This success can largely be attributed to two factors:
MapReduce is:
Networks are a central aspect of many systems, and Graph theory - a branch of Mathematics - is fundamental to grasping and representing those networks. From degrees to degree distributions and from paths to distances, we learn to distinguish weighted, directed and bipartite networks.