Bingo.H's blog

Don't panic!


Notes on Convex Optimization (2): Local Methods, Especially First-Order Methods

Posted on 2017-06-13 | In Optimization |

The simplest goal of general nonlinear optimization is to find a local minimum of a differentiable function. The majority of general nonlinear optimization methods are based on relaxation, i.e., generating a decreasing sequence $\{ f(x_k) \}_{k = 0}^\infty$ such that
$$
f(x_{k+1}) \leq f(x_k), ~ k = 0, 1, \cdots.
$$

To implement the idea of relaxation, we employ another fundamental principle: approximation. Usually we build a local approximation from the derivatives of the nonlinear function. These local methods are discussed in a general setting, without convexity assumptions.
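To make the scheme concrete, here is a minimal gradient-descent sketch (my own illustration, not code from the post; the quadratic objective, step size, and stopping tolerance are assumptions):

```python
import numpy as np

def gradient_descent(grad_f, x0, step=0.1, tol=1e-8, max_iter=1000):
    """Relaxation via the first-order local model: step along -grad_f to decrease f."""
    x = x0
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:  # approximate stationarity reached
            break
        # For smooth f and a small enough step, this gives f(x_{k+1}) <= f(x_k).
        x = x - step * g
    return x

# Example: f(x) = ||x||^2 / 2, so grad_f(x) = x; the minimizer is the origin.
x_star = gradient_descent(lambda x: x, np.array([3.0, -4.0]))
```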

Read more »

Notes on Convex Optimization (3): Optimal Method for Smooth Convex Optimization

Posted on 2017-06-13 | In Optimization |

We have seen that in the general situation we cannot do much; for example, it is impossible to guarantee convergence even to a local minimum. Thus we have to restrict our class of functions. In chap. 2, (sufficiently) smooth functions are considered, and (strongly) convex functions are introduced. We can derive lower complexity bounds for these objectives and propose optimal gradient-based algorithms.
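For a taste of what "optimal" means here, the following is a minimal sketch in the spirit of Nesterov's accelerated gradient method for $L$-smooth convex functions (my own illustration with an assumed quadratic objective and a FISTA-style momentum schedule, not necessarily the book's exact scheme):

```python
import numpy as np

def accelerated_gd(grad_f, x0, L, n_iters=100):
    """Accelerated gradient method for an L-smooth convex f.
    Attains the optimal O(1/k^2) rate, vs. O(1/k) for plain gradient descent."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iters):
        x_next = y - grad_f(y) / L  # gradient step at the extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # momentum / extrapolation
        x, t = x_next, t_next
    return x

# Example: f(x) = x^T A x / 2 with A = diag(1, 100), so L = 100.
A = np.diag([1.0, 100.0])
x_min = accelerated_gd(lambda x: A @ x, np.array([1.0, 1.0]), L=100.0)
```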

Read more »

Stochastic Variance Reduced Gradient (SVRG)

Posted on 2017-06-08 | In Optimization |

This is a non-state-of-the-art read-through of the Stochastic Variance Reduced Gradient (SVRG) method [1]. Gradient descent and stochastic gradient descent (SGD) play the most important role in the optimization of machine learning problems. With large-scale datasets, especially in deep learning applications, SGD and its variants may be the only practical choice. The paper proposes to accelerate the convergence of SGD by reducing the variance of the gradient estimate introduced by random sampling. This work has since been extended to many other problems, such as non-convex optimization, sparse learning, etc., so this post is just a latecomer's note.
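For reference, here is a minimal sketch of the SVRG update for a finite-sum objective $f(w) = \frac{1}{n} \sum_{i=1}^n f_i(w)$ (my own paraphrase of the method in [1]; the least-squares example, step size, and epoch length are assumptions):

```python
import numpy as np

def svrg(grad_i, n, w0, step=0.01, epochs=20, inner=None):
    """SVRG: each epoch snapshots w and its full gradient mu, then runs SGD
    with the variance-reduced estimate grad_i(w) - grad_i(snapshot) + mu."""
    inner = inner or 2 * n
    rng = np.random.default_rng(0)
    w = w0.copy()
    for _ in range(epochs):
        w_snap = w.copy()
        mu = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)  # full gradient, once per epoch
        for _ in range(inner):
            i = rng.integers(n)
            g = grad_i(w, i) - grad_i(w_snap, i) + mu  # unbiased, with vanishing variance
            w -= step * g
    return w

# Example: least squares, f_i(w) = (x_i^T w - y_i)^2 / 2.
rng = np.random.default_rng(1)
X, w_true = rng.normal(size=(100, 5)), np.arange(5.0)
y = X @ w_true
w_hat = svrg(lambda w, i: (X[i] @ w - y[i]) * X[i], n=100, w0=np.zeros(5))
```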

Read more »

Notes on Convex Optimization

Posted on 2015-03-17 | In Optimization |

I have recently started studying convex optimization. Although as a math student I once took a numerical optimization course, at the time I learned it "in a muddle" (in my boss's words). Back then I also did not appreciate the applications of numerical optimization; having since turned into a junior programmer working in ML, I have come to realize how practical optimization is (mainly, of course, from the countless papers published every year), so I am relearning convex optimization from the ground up. I am currently reading Yurii Nesterov's textbook Introductory Lectures on Convex Optimization: A Basic Course, and I am taking notes so that I do not forget everything as soon as I finish reading.

What’s in this book

  • General optimization problems (Unfortunately “unsolvable”)
  • Smooth convex optimization
  • Nonsmooth convex optimization
  • Convex minimization problems with explicit structure

PS: Goals of this course (from the Introduction section):

  • Show that, despite their attraction, the proposed "solutions" of general optimization problems very often cannot satisfy the expectations of a naive user. The main fact, which should be known to anyone dealing with optimization models, is that, in general, optimization problems are unsolvable. (Reality is always cruel.)
  • Discuss methods for solvable nonlinear models, namely convex optimization problems.
Read more »

Notes on Automata - Week 2

Posted on 2014-09-19 | In Programming |

In week 2 of the Automata course, regular expressions are introduced, along with some of their applications. Not surprisingly, regular expressions describe exactly the regular languages, the same class as finite automata. The properties of regular languages are also introduced, including decision properties and closure properties. The latter indicate that the regular class is closed under operations such as union, intersection, and difference, while a decision property for a class of languages is an algorithm that takes a formal description of a language (e.g., a DFA) and tells whether or not some property holds. In particular, the pumping lemma is useful for telling whether a language is regular or not.
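As a standard worked example of the pumping lemma (my own illustration, not taken from the course material): consider $L = \{ 0^n 1^n : n \geq 0 \}$. If $L$ were regular with pumping length $p$, the word $w = 0^p 1^p \in L$ would decompose as $w = xyz$ with $|xy| \leq p$ and $|y| \geq 1$, forcing $y = 0^k$ for some $k \geq 1$; but then $x y^2 z = 0^{p+k} 1^p \notin L$, a contradiction. Hence $L$ is not regular.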

Read more »

Notes on Automata - Week 1

Posted on 2014-09-08 | In Programming |

Last semester I took a theory of computation course, but I ended up attending only a handful of lectures and barely avoided failing. In fact, the theory of computation is an important foundation of what makes CS a science, and it is also one of the most interesting parts of CS. So I enrolled in the Automata course on Coursera, taught by Professor Jeff Ullman of Stanford. The professor is, of course, a legend: an author of the legendary Dragon Book (Compilers: Principles, Techniques, and Tools). As a latecomer to the field I have never actually read it, but the cover really is great.

What’s in this course

  • Regular languages and their descriptors
    • Finite Automata, nondeterministic finite automata, regular expressions
    • Algorithms to decide questions about regular languages, e.g., is it empty?
    • Closure properties of regular languages
  • Context-free languages and their descriptors
    • Context-free grammars, pushdown automata
    • Decisions and closure properties
  • Recursive and recursively enumerable languages
    • Turing machines, decidability of problems
    • The limit of what can be computed
  • Intractable problems
    • Problems that (appear to) require exponential time
    • NP-completeness and beyond

PS: Professor Ullman mentioned the following fact: a survey of Stanford grads five years out asked which of their courses they used in their jobs.

  • Basics like intro-programming took the top spots, of course.
  • But among optional courses, CS154 (Automata and Complexity Theory) stood remarkably high.
Read more »