Cover Photo (Left: Kano-san, Univ. of Tokyo & RIKEN / Middle: Du, GROUND / Right: Kashiwa-san, Ads Agency)
The 10th Asian Conference on Machine Learning (ACML 2018) was held on 14-16 Nov. 2018 at Beijing Jiaotong University (https://jim-zhenxue.github.io/ACML-2018/). Many speakers and presenters came from mainland China, and many others came from other countries and regions, including Hong Kong, Macau, Taiwan, Japan, South Korea, Singapore, Australia, Turkey and the United Kingdom. The hottest topic was without doubt Deep Learning, especially for images, but many other topics and applications were also presented, such as kernel methods and Gaussian processes.
■ Learning Objectives
- Latest Image Processing techniques
- Feature extraction and clustering techniques for SKU grouping, which we use in our AI development
- Applications of Neural Networks to combinatorial optimization, which arises frequently in warehouse operations
■ Keynote Speeches
<Clustering - what both Theoreticians and Practitioners are Doing Wrong>
The speaker warned that even very capable analysts often choose a clustering algorithm without being aware of how much that choice matters. As the sayings *no free lunch* and *no silver bullet* suggest, every clustering algorithm has its merits and drawbacks. The speaker gave some guidance on how to choose one: users should first prioritize properties such as scale invariance, antichain richness and local consistency, and then select an algorithm that satisfies the properties they care about most. The speaker also noted that there is no *completely unsupervised learning*. One needs to incorporate some domain-specific prior knowledge into the clustering, and therefore needs to set certain hypotheses or conditions before diving into an algorithm.
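To make one of these properties concrete, here is a minimal sketch of a scale invariance check. It is not taken from the talk: the toy data, the `threshold_clusters` helper and the cutoff value are all assumptions made for illustration. The helper links any two points closer than a fixed cutoff, so its output changes when the data are merely rescaled, which is exactly the kind of behaviour one should know about before picking an algorithm.

```python
import numpy as np

def threshold_clusters(points, cutoff):
    """Link points closer than `cutoff`; each connected component is a cluster.

    Returns the number of clusters. Hypothetical helper for illustration only.
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    parent = list(range(n))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i, j] < cutoff:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

# Hypothetical toy data: two well separated groups on a line.
points = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])

clusters_original = threshold_clusters(points, cutoff=1.0)
# Same data expressed in different units (multiplied by 100).
clusters_rescaled = threshold_clusters(points * 100.0, cutoff=1.0)

print(clusters_original, clusters_rescaled)
```

With a fixed cutoff the grouping depends on the units of measurement, so this procedure is not scale invariant. Whether that matters is a domain decision, which is the prior knowledge the speaker was referring to.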
<Something Old, Something New, Something Borrowed, ...>
The speaker explained that there is now a movement to combine new techniques, including Deep Learning, with classic methods such as ensemble learning and Bayesian modeling. He said the success of Deep Learning comes not from a theoretical advantage but from recently developed hardware and computational techniques, including multicore CPUs, GPUs, accelerated SGD algorithms and automatic differentiation (autograd). Thus the next frontier should be applying those techniques to classical models. This movement is already under way: several Bayesian modeling frameworks have recently emerged that use autograd to run Hamiltonian MC or variational learning, in some cases on GPUs, such as Stan, TensorFlow Probability and Edward. The speaker also briefly introduced a new modeling framework, currently under development, that can also handle discrete distributions.
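The role autograd plays in such frameworks can be illustrated with a very small sketch. It does not use any of the frameworks named above, and it replaces Hamiltonian MC and variational learning with something much simpler: plain gradient ascent to a MAP estimate. The `Dual` class is a hand-rolled forward-mode differentiator, and the data, prior variance and step size are hypothetical values chosen for illustration.

```python
class Dual:
    """A value together with its derivative (forward-mode autodiff)."""

    def __init__(self, value, grad=0.0):
        self.value, self.grad = value, grad

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.value + other.value, self.grad + other.grad)
    __radd__ = __add__

    def __sub__(self, other):
        other = self._lift(other)
        return Dual(self.value - other.value, self.grad - other.grad)

    def __rsub__(self, other):
        return self._lift(other) - self

    def __mul__(self, other):
        other = self._lift(other)
        return Dual(self.value * other.value,
                    self.grad * other.value + self.value * other.grad)
    __rmul__ = __mul__


def log_posterior(mu, data, prior_var=10.0, noise_var=1.0):
    """Unnormalized log posterior of a Gaussian mean with a N(0, prior_var) prior."""
    log_prior = -0.5 * (mu * mu) * (1.0 / prior_var)
    log_lik = 0.0
    for x in data:
        diff = x - mu
        log_lik = log_lik + (-0.5) * (diff * diff) * (1.0 / noise_var)
    return log_prior + log_lik


data = [1.8, 2.1, 2.4, 1.9]  # hypothetical observations
mu = 0.0
for _ in range(200):
    # Seeding grad=1.0 makes .grad the derivative with respect to mu.
    grad = log_posterior(Dual(mu, 1.0), data).grad
    mu += 0.1 * grad  # gradient ascent step (assumed step size)

# Closed-form posterior mode for this conjugate model, for comparison.
closed_form = sum(data) / (len(data) + 1.0 / 10.0)
print(mu, closed_form)
```

The model code in `log_posterior` never mentions derivatives. That separation between writing the model and obtaining its gradient is what lets gradient-based inference be applied to classical Bayesian models.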
<Machine Learning in Autonomous Systems: Theory and Practice>
The speaker briefly showed several demonstrations of how complex modern autonomous systems (self-driving cars, robots) are, and explained the importance of model robustness and versatility, such as invariance to pose and lighting conditions. The most common recent approach to this problem is to explicitly add rotated, cropped or noised data, which is known as data augmentation. Alternatively, finding a low-dimensional manifold representation has also been studied for a long time. The speaker gave some theoretical background on manifolds and information capacity, and an analysis of Neural Networks from the viewpoint of manifold learning. He also presented a recent algorithm that learns an optimal cutting plane to separate two manifolds, which is a generalization of SVM.
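As a rough illustration of data augmentation, the sketch below produces a rotated, a cropped and a noised copy of one image with NumPy. The `augment` helper, the image size, the crop size and the noise level are assumptions for this example, not details from the talk.

```python
import numpy as np

def augment(image, rng, crop=24, noise_std=0.05):
    """Return rotated, randomly cropped and noised copies of a 2-D image.

    Hypothetical helper: real pipelines usually apply such transforms
    on the fly during training.
    """
    # Rotate by 90, 180 or 270 degrees.
    rotated = np.rot90(image, k=int(rng.integers(1, 4)))

    # Take a random crop of size crop x crop.
    top = int(rng.integers(0, image.shape[0] - crop + 1))
    left = int(rng.integers(0, image.shape[1] - crop + 1))
    cropped = image[top:top + crop, left:left + crop]

    # Add Gaussian pixel noise and keep values inside [0, 1].
    noised = np.clip(image + rng.normal(0.0, noise_std, image.shape), 0.0, 1.0)
    return rotated, cropped, noised

rng = np.random.default_rng(0)
image = rng.random((32, 32))  # stand-in for a grayscale image in [0, 1]
rotated, cropped, noised = augment(image, rng)
print(rotated.shape, cropped.shape, noised.shape)
```

Each copy keeps the label of the original image, so the model is pushed towards predictions that do not change under these transformations.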
<Dual Learning: Algorithms, Applications and Challenges>
Dual Learning is a new paradigm for training models and can be seen as a kind of multi-task problem. It tries to solve both a primal task (p-task) and the corresponding dual task (d-task) together, in order to improve accuracy or even to train a supervised model in an unsupervised way. Examples are as follows.
Machine Translation:
- p-task: translate language A to B
- d-task: translate language B to A
Image Style Conversion:
- p-task: convert a real picture to a comic
- d-task: convert a comic picture to a real one
Sentence Classification and Generation:
- p-task: sentence classification
- d-task: sentence generation
The speaker showed many applications of dual learning to both supervised and unsupervised problems. Among them, the most impressive was unsupervised translation. The speaker showed a way to combine autoencoder models and dual learning to achieve it. The basic idea is as follows:
The first two are ordinary autoencoder reconstruction errors, while the last two are the new trick introduced by dual learning.
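The terms themselves are not reproduced in this report. As a rough sketch only, and not the speaker's exact formulation, four loss terms of this kind could take the following shape. All symbols are assumptions introduced for illustration: $x_A$ and $x_B$ are sentences in languages A and B, $\mathrm{enc}$ is a shared encoder, $\mathrm{dec}_A$ and $\mathrm{dec}_B$ are language-specific decoders, $\ell$ is a reconstruction loss, and $T_{A \to B}(x) = \mathrm{dec}_B(\mathrm{enc}(x))$ is the translation obtained by decoding into the other language (similarly for $T_{B \to A}$).

```latex
\begin{aligned}
L_1 &= \ell\bigl(x_A,\ \mathrm{dec}_A(\mathrm{enc}(x_A))\bigr)
      && \text{(autoencoder reconstruction, language A)} \\
L_2 &= \ell\bigl(x_B,\ \mathrm{dec}_B(\mathrm{enc}(x_B))\bigr)
      && \text{(autoencoder reconstruction, language B)} \\
L_3 &= \ell\bigl(x_A,\ T_{B \to A}(T_{A \to B}(x_A))\bigr)
      && \text{(round trip A to B to A)} \\
L_4 &= \ell\bigl(x_B,\ T_{A \to B}(T_{B \to A}(x_B))\bigr)
      && \text{(round trip B to A to B)}
\end{aligned}
```

Under these assumptions, $L_3$ and $L_4$ need no parallel sentences: each sentence is translated by the primal task, translated back by the dual task, and compared with itself.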
The speaker also presented Dual Inference by which one can improve the inference accuracy of two conjugate models without any further training.
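The report does not give the formula that was presented. One plausible way to picture the idea, assumed here for illustration, is to score each candidate output with both the primal model and the dual model and pick the best combined score. In the sketch below the probability tables, the candidate labels and the weight `alpha` are all hypothetical. The dual score uses Bayes' rule, $\log P(x \mid y) + \log P(y)$, dropping $\log P(x)$ because it is the same for every candidate.

```python
import math

def dual_inference(x, candidates, p_y_given_x, p_x_given_y, p_y, alpha=0.5):
    """Pick the candidate y with the best weighted primal + dual log score.

    alpha = 1.0 falls back to ordinary inference with the primal model only.
    Hypothetical helper for illustration; both models are already trained
    and are only queried here.
    """
    best, best_score = None, -math.inf
    for y in candidates:
        primal = math.log(p_y_given_x[x][y])
        dual = math.log(p_x_given_y[y][x]) + math.log(p_y[y])
        score = alpha * primal + (1.0 - alpha) * dual
        if score > best_score:
            best, best_score = y, score
    return best

# Hypothetical toy tables for one ambiguous input.
p_y_given_x = {"bank": {"finance": 0.55, "river": 0.45}}      # primal model
p_x_given_y = {"finance": {"bank": 0.2}, "river": {"bank": 0.9}}  # dual model
p_y = {"finance": 0.5, "river": 0.5}                           # prior over outputs

primal_only = dual_inference("bank", ["finance", "river"],
                             p_y_given_x, p_x_given_y, p_y, alpha=1.0)
combined = dual_inference("bank", ["finance", "river"],
                          p_y_given_x, p_x_given_y, p_y, alpha=0.5)
print(primal_only, combined)
```

In this toy case the primal model alone slightly prefers one answer, while the dual model strongly supports the other, so the combined decision changes without retraining either model.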
■ Paper Session
There were many interesting paper presentations. Deep Learning and Neural Networks were the hottest topics, and the papers targeted not only simple classification but also feature extraction. There was also a paper that tries to solve a combinatorial optimization problem with a Neural Network. Although Deep Learning seems to be overemphasized outside academia, I saw here that a lot of classical ML work, such as kernel methods, Bayesian models and generic Markov decision processes, is still being actively studied.