Our education system is the result of hundreds of years of evolution, and thanks to it, research today is growing at an astonishing pace. Now we are making machines learn, and new robust, optimized models are trained day after day: from plain neural networks to CNNs to ViT. So, if we consider DL models as the students of the machine education system, one could ask: is ViT a Ph.D. student? This talk presents an analogy between the human education system and the deep learning system. Furthermore, different techniques dedicated to training transformers on small and mid-sized datasets, alongside a novel hybrid ViT-CNN model, are presented.
Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small gap between training and test performance. Conventional wisdom attributes small generalization error either to properties of the model family or to the regularization techniques used during training. Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth-two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points, as it usually does in practice. We interpret our experimental findings by comparison with traditional models. We supplement this republication with a new section at the end summarizing recent progress in the field since the original version of this paper.
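The finite-sample expressivity claim can be illustrated with a classic interpolation construction (a sketch in the spirit of the paper's argument, not its exact construction): a width-n, depth-two ReLU network f(t) = Σⱼ aⱼ·relu(t − bⱼ) can fit any n points with distinct 1-D inputs — including completely random labels — by solving a triangular linear system.

```python
import numpy as np

def fit_relu_net(x, y):
    """Width-n ReLU net f(t) = sum_j a_j * relu(t - b_j) interpolating n points."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    n = len(x)
    b = np.empty(n)
    b[0] = x[0] - 1.0          # any value strictly below the smallest input
    b[1:] = x[:-1]             # each later unit "switches on" after the previous point
    a = np.empty(n)
    for i in range(n):         # forward substitution on the triangular system
        acc = sum(a[j] * (x[i] - b[j]) for j in range(i))
        a[i] = (y[i] - acc) / (x[i] - b[i])
    return a, b

def relu_net(t, a, b):
    """Evaluate the depth-two ReLU network at a scalar input t."""
    return np.sum(a * np.maximum(t - b, 0.0))

rng = np.random.default_rng(0)
x = rng.permutation(8).astype(float)           # distinct inputs
y = rng.integers(0, 2, size=8).astype(float)   # a "random labeling"
a, b = fit_relu_net(x, y)
pred = np.array([relu_net(t, a, b) for t in x])
print(np.allclose(pred, y))  # True: the tiny net fits the random labels exactly
```

With n hidden units (2n parameters) the system is always solvable for distinct inputs, which is why parameter count exceeding data count suffices for perfect memorization.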
Image-to-image translation is a computer vision task aiming to learn the mapping from an input image in one domain to an output image in another domain, following the style or characteristics of the target domain. It can be applied to a wide range of applications, such as collection style transfer, object transfiguration, season transfer, and photo enhancement. Hue-Net is a deep learning framework for intensity-based image-to-image translation. It introduces a differentiable representation of (1D) cyclic and (2D) joint histograms and uses them for defining loss functions based on cyclic Earth Mover's Distance (EMD) and Mutual Information (MI). The strength of Hue-Net has been demonstrated on color transfer problems, where the aim is to paint a source image with the colors of a different target image.
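The idea of a differentiable histogram can be sketched with soft binning: each pixel contributes to all bins through a smooth kernel rather than a hard assignment, so gradients flow through the binning step. The Gaussian kernel, bandwidth, and the non-cyclic 1-D EMD below are illustrative assumptions, not Hue-Net's exact formulation.

```python
import numpy as np

def soft_histogram(values, n_bins=16, bandwidth=0.05):
    """Differentiable histogram on [0, 1]: Gaussian soft-assignment to bin centers."""
    centers = (np.arange(n_bins) + 0.5) / n_bins
    # weight of each value for each bin; each row is normalized to sum to 1
    w = np.exp(-0.5 * ((values[:, None] - centers[None, :]) / bandwidth) ** 2)
    w = w / w.sum(axis=1, keepdims=True)
    return w.mean(axis=0)                      # normalized histogram, sums to 1

def emd_1d(p, q):
    """1-D Earth Mover's Distance between histograms (plain, non-cyclic variant)."""
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()

rng = np.random.default_rng(0)
img = rng.uniform(0.0, 1.0, size=1000)         # flattened intensity image
h = soft_histogram(img)
print(h.sum())                                 # ~1.0: a valid smooth histogram
```

Because every operation above is differentiable in `values`, such a histogram can sit inside a loss function and be optimized end-to-end.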
The Graph Attention Network (GAT) is a Graph Neural Network (GNN) that uses attention to represent the relative importance of neighboring nodes in a graph. GNNs are a special kind of neural network operating on graphs. At first glance, they do not seem very suitable for image analysis, but after an in-depth look at the Graph Attention Network, we will see an example of applying such a model to superpixel image classification.
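The core of GAT is the attention coefficient α_ij = softmax_j(LeakyReLU(aᵀ[Whᵢ ‖ Whⱼ])), computed only over a node's neighbors. A minimal single-head NumPy sketch (dense adjacency, random weights; a real layer would batch this and add multi-head concatenation):

```python
import numpy as np

def gat_attention(H, W, a, adj):
    """Single-head GAT layer: alpha_ij = softmax_j(LeakyReLU(a^T [Wh_i || Wh_j]))."""
    Z = H @ W                                   # (n, f') transformed node features
    n = Z.shape[0]
    e = np.full((n, n), -np.inf)                # -inf masks non-edges in the softmax
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                s = a @ np.concatenate([Z[i], Z[j]])
                e[i, j] = s if s > 0 else 0.2 * s   # LeakyReLU, slope 0.2
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)   # rows sum to 1 over neighbors
    return alpha @ Z, alpha                     # attention-weighted aggregation

rng = np.random.default_rng(0)
adj = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])   # tiny graph with self-loops
H = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 2))
a = rng.normal(size=4)
out, alpha = gat_attention(H, W, a, adj)
print(out.shape)  # (3, 2)
```

For superpixel classification, each node would be a superpixel (e.g. its mean color and centroid) and `adj` its spatial adjacency.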
Dynamic neural networks are an emerging research topic in deep learning. Compared to static models, which have fixed computational graphs and parameters at the inference stage, dynamic networks can adapt their structures or parameters to different inputs, leading to notable advantages in terms of accuracy, computational efficiency, adaptiveness, etc. In this survey, we comprehensively review this rapidly developing area by dividing dynamic networks into three main categories: 1) instance-wise dynamic models that process each instance with data-dependent architectures or parameters; 2) spatial-wise dynamic networks that conduct adaptive computation with respect to different spatial locations of image data; and 3) temporal-wise dynamic models that perform adaptive inference along the temporal dimension for sequential data such as videos and texts. The important research problems of dynamic networks, e.g., architecture design, decision-making schemes, optimization techniques, and applications, are reviewed systematically. Finally, we discuss the open problems in this field together with interesting future research directions.
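Instance-wise dynamics in its simplest form: a per-input gate decides whether to run an expensive branch at all. The norm-based gating rule below is an illustrative assumption (real models learn the gate, e.g. with early-exit classifiers), but it shows how the computational graph changes per sample.

```python
import numpy as np

def cheap_block(x):
    return np.maximum(x, 0.0)                  # light path: a single ReLU

def expensive_block(x, W1, W2):
    return np.maximum(x @ W1, 0.0) @ W2        # heavy path: a small MLP

def dynamic_forward(x, W1, W2, threshold=1.0):
    """Instance-wise dynamic inference: route each sample through one of two paths."""
    # the gate here is just the input norm; learned gates replace this in practice
    if np.linalg.norm(x) < threshold:
        return cheap_block(x), "cheap"
    return expensive_block(x, W1, W2), "expensive"

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 4))
y1, path1 = dynamic_forward(np.zeros(4), W1, W2)       # small input -> cheap path
y2, path2 = dynamic_forward(np.ones(4) * 10, W1, W2)   # large input -> expensive path
print(path1, path2)  # cheap expensive
```

Spatial-wise and temporal-wise variants apply the same idea per pixel region or per time step instead of per instance.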
The ability of an observer to perform a specific task on images produced by a given medical imaging system defines an objective measure of image quality. If the observer is “numerical”, can deep learning methods “do the job”? What do we find in the literature? Some papers raise this issue and propose approximating the Ideal Observer for detection and localization tasks.
“The synergy between the large datasets in the cloud and the numerous computers that power it has enabled remarkable advancements in machine learning, especially in DNNs. […] That changed in 2013 when a projection showed that Google users searching by voice for three minutes per day using speech recognition DNNs would double Google datacenters’ computation demands.” This presentation will introduce the concepts behind the hardware architectures used to support the current growth in machine learning, including GPUs and TPUs.
Traveling at the time of coronavirus is difficult with the restrictions set by governments all around the world, and that is why most international meetings and conferences are held online. On the other hand, deep learning has grown significantly in the past few years, especially for vision applications. Different architectures and models, from CNNs to Transformers, have been proposed. In this talk, we will not present another model; instead, we will list different techniques, layers, loss functions, and optimizers that can improve the performance of your model. Also, an analogy between travel and deep learning is presented at the beginning.
CNNs are now widely used, so it is necessary to implement them efficiently. To do so, CNNs are most commonly implemented on GPUs, and to a lesser extent on FPGAs. In this talk, without going into the details, we will list some problems arising when implementing CNN inference, especially on FPGAs. We will also link these problems to the CNN models themselves and highlight a few general recommendations extracted from the following papers.
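One recurring FPGA concern is that floating-point arithmetic is expensive in logic resources, so weights and activations are usually quantized to fixed-point before deployment. A minimal sketch of signed fixed-point quantization (the 8-bit format with 5 fractional bits is an illustrative choice, not a recommendation from the papers):

```python
import numpy as np

def quantize(x, n_bits=8, frac_bits=5):
    """Round to signed fixed-point: step 2^-frac_bits, saturating at the type's range."""
    step = 2.0 ** -frac_bits
    lo = -(2 ** (n_bits - 1)) * step           # most negative representable value
    hi = (2 ** (n_bits - 1) - 1) * step        # most positive representable value
    return np.clip(np.round(x / step) * step, lo, hi)

w = np.array([0.37, -1.2, 3.9, -5.0])
wq = quantize(w)
print(wq)  # [ 0.375  -1.1875  3.90625 -4.     ] (last value saturates)
```

The rounding error and the saturation of out-of-range values (here −5.0 clips to −4.0) are exactly the kinds of model/hardware interactions such talks examine.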
From ResNet and Highway Networks to DenseNet, adding more inter-layer connections besides the direct connections between adjacent layers has emerged as a popular approach to strengthen feature propagation among different layers. However, dense connections cause much redundancy, especially in the case of DenseNet. Another issue is that, with many dense connections from previous layers, the role played by the mainstream module is unclear. To address these issues, the authors introduce a gating mechanism, inspired by SENet, to model the layer relationships in densely connected blocks.
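The SENet-style gating being referred to can be sketched in a few lines: squeeze each channel to a scalar by global average pooling, pass the descriptor through a two-layer bottleneck, and use sigmoid outputs to rescale the channels. This is a generic SE block sketch, not the paper's exact placement of gates over dense connections.

```python
import numpy as np

def se_gate(x, W1, W2):
    """Squeeze-and-Excitation gating: squeeze (global pool), excite (2 FCs), rescale."""
    s = x.mean(axis=(1, 2))                    # squeeze: one descriptor per channel (C,)
    z = np.maximum(s @ W1, 0.0)                # excitation: bottleneck FC + ReLU
    g = 1.0 / (1.0 + np.exp(-(z @ W2)))        # sigmoid gates in (0, 1), one per channel
    return x * g[:, None, None]                # reweight each channel's feature map

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4, 4))                 # (channels, H, W) feature maps
W1 = rng.normal(size=(8, 2))                   # reduction to 2 hidden units
W2 = rng.normal(size=(2, 8))
y = se_gate(x, W1, W2)
print(y.shape)  # (8, 4, 4)
```

Applied to a densely connected block, such gates let the network learn how much each earlier layer's features should contribute, rather than concatenating all of them with equal weight.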
Nowadays, it is well established that ConvNets are able to achieve incredible performance on complex vision tasks such as classification, object recognition, or semantic segmentation. A common belief is that both humans and ConvNets solve these tasks by learning increasingly complex representations of object shapes. However, recent studies show that humans and ConvNets actually follow very different strategies and are not biased towards the same information in images. To this end, the authors propose a stylized version of ImageNet, making it easier for ConvNets to learn the shape-based image representations used by humans.
Image-to-image translation is a field aiming at transposing images from one representation to another, like generating an aerial map of a region based on a photograph. Results in this field have greatly improved since the arrival of GAN models in 2014. GANs (Generative Adversarial Nets) are neural networks specialized in sample generation. When applied to images, these models are able to generate convincing samples that are similar to images from a reference dataset while remaining completely original.
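The adversarial training behind GANs can be stated as the minimax game from the original 2014 formulation, where a discriminator D is trained to tell real samples from generated ones while a generator G is trained to fool it:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Image-to-image translation models build on this game, typically conditioning G on the input image and adding reconstruction or cycle-consistency losses on top of the adversarial term.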
BERT, which stands for Bidirectional Encoder Representations from Transformers, was published by a Google AI team in 2018. It has been presented as a new cutting-edge model for Natural Language Processing (NLP). Based on the Transformer architecture, it is designed to learn bidirectional representations by considering both the left and right contexts in all its layers. While being initially introduced for NLP tasks, it has recently been used to model other tasks such as action recognition.
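The bidirectional representations come from BERT's masked-language-modeling objective: roughly 15% of input tokens are selected for prediction, and of those, 80% are replaced by [MASK], 10% by a random token, and 10% are left unchanged. A minimal sketch of that corruption step (tokenization and the model itself are omitted):

```python
import random

def mask_tokens(tokens, vocab, p_select=0.15, seed=0):
    """BERT-style masking: of selected tokens, 80% -> [MASK], 10% random, 10% kept."""
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < p_select:
            targets.append(tok)                 # the model must predict the original
            r = rng.random()
            if r < 0.8:
                masked.append("[MASK]")
            elif r < 0.9:
                masked.append(rng.choice(vocab))
            else:
                masked.append(tok)              # kept, but still predicted
        else:
            targets.append(None)                # no loss at unselected positions
            masked.append(tok)
    return masked, targets

tokens = "the quick brown fox jumps over the lazy dog".split()
masked, targets = mask_tokens(tokens, vocab=tokens)
print(masked)
```

Because the model sees the uncorrupted tokens on both sides of each masked position, every layer can attend to left and right context simultaneously, unlike a left-to-right language model.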
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. It has been shown that this reliance on CNNs is not necessary and a pure transformer applied directly to sequences of image patches can perform very well on multiple image tasks.
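The key preprocessing step of such a pure-transformer approach is turning an image into a sequence: the image is cut into fixed-size non-overlapping patches, and each patch is flattened into a token vector (which is then linearly projected and given a position embedding, omitted here). A minimal sketch:

```python
import numpy as np

def patchify(img, patch=4):
    """Split an (H, W, C) image into flattened non-overlapping patch tokens."""
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0
    x = img.reshape(H // patch, patch, W // patch, patch, C)
    x = x.transpose(0, 2, 1, 3, 4)              # (nH, nW, patch, patch, C)
    return x.reshape(-1, patch * patch * C)     # sequence of patch "tokens"

img = np.arange(8 * 8 * 3, dtype=float).reshape(8, 8, 3)
seq = patchify(img)
print(seq.shape)  # (4, 48): 2x2 patches, each 4*4*3 values
```

From this point on, the transformer encoder treats the patch sequence exactly as it would a sequence of word embeddings, with no convolutions involved.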