Convex sets, functions, and optimization

Agenda today

Reading:

Why do we care about convex problems?

Convex sets

Definition: \(C\) is a convex set if for any \(x_1, x_2 \in C\) and any \(\theta \in [0,1]\), \(\theta x_1 + (1 - \theta)x_2 \in C\).

Example: Affine sets

An affine set, i.e. the solution set of a system of linear equations \(\{x : Ax = b\}\), is a convex set.

Proof: check the definition: if \(x_1\) and \(x_2\) are both solutions to \(Ax = b\), then so is any convex combination, since \(A(\theta x_1 + (1 - \theta)x_2) = \theta A x_1 + (1 - \theta) A x_2 = \theta b + (1 - \theta) b = b\).

Example: Hyperplanes and half spaces

Example: Norm balls

A norm is a function \(\| \cdot \|\) that satisfies \(\|x\| \ge 0\) with equality only for \(x = 0\), \(\|\alpha x\| = |\alpha| \, \|x\|\), and the triangle inequality \(\|x + y\| \le \|x\| + \|y\|\). The norm ball \(\{x : \|x - x_c\| \le r\}\) is a convex set.

Example: Positive semidefinite cone

Example: Polyhedra

A polyhedron is the solution set to a finite number of linear inequalities and equalities: \[ \{x : Ax \preceq b, Cx = d\} \] where \(\preceq\) means component-wise inequality, \(A\) and \(C\) are matrices.

A polyhedron can also be thought of as the intersection of a finite number of halfspaces and hyperplanes.
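As a quick sketch (the matrices \(A, b, C, d\) below are made up for illustration), checking membership in a polyhedron, and that convex combinations of members stay inside, takes only a handful of linear operations:

```python
import numpy as np

# Hypothetical polyhedron {x : Ax <= b, Cx = d} in R^2:
# the unit square intersected with the line x1 = x2.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])   # encodes 0 <= x1 <= 1 and 0 <= x2 <= 1
C = np.array([[1.0, -1.0]])
d = np.array([0.0])                  # encodes x1 = x2

def in_polyhedron(x, tol=1e-9):
    """Check the component-wise inequalities and equalities defining the set."""
    return bool(np.all(A @ x <= b + tol) and np.allclose(C @ x, d, atol=tol))

x1, x2 = np.array([0.2, 0.2]), np.array([0.9, 0.9])
theta = 0.3
# Both points are in the set, and so is their convex combination.
print(in_polyhedron(x1), in_polyhedron(x2), in_polyhedron(theta * x1 + (1 - theta) * x2))
```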

Operations that preserve convexity of sets

Convex functions

Definition: \(f: \mathbb R^n \to \mathbb R\) is convex if its domain is a convex set and \[ f(\theta x + (1-\theta) y) \le \theta f(x) + (1 - \theta) f(y) \] for all \(x, y \in \text{dom}(f)\) and \(\theta \in [0,1]\)
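As a rough numerical sanity check of this definition (a sketch only; it uses log-sum-exp, a standard convex function, and random sampling, so it can expose a violation but cannot prove convexity):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """Log-sum-exp, a standard example of a convex function on R^n."""
    return np.log(np.sum(np.exp(x)))

# Sample random pairs (x, y) and weights theta, and check
# f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y).
for _ in range(10_000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    theta = rng.uniform()
    lhs = f(theta * x + (1 - theta) * y)
    rhs = theta * f(x) + (1 - theta) * f(y)
    assert lhs <= rhs + 1e-12
print("no violations of the convexity inequality found")
```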

First-order condition

Suppose \(f\) is differentiable, and let \(\nabla f(x) = \left(\frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \cdots, \frac{\partial f(x)}{\partial x_n}\right)\).

The 1st-order condition states that \(f\) is convex iff \(f\) has a convex domain and \[ f(y) \ge f(x) + \nabla f(x)^T (y - x) \quad \forall x, y \in \text{dom}(f) \]

Interpretation: the first-order Taylor approximation of \(f\) is a global underestimator.
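A small sketch of this underestimator property for the convex quadratic \(f(x) = \tfrac{1}{2} x^T Q x\) with \(Q \succeq 0\), so that \(\nabla f(x) = Qx\) (the particular \(Q\) is made up):

```python
import numpy as np

rng = np.random.default_rng(1)

# Any Q of the form M'M is symmetric positive semidefinite,
# so f(x) = 0.5 * x' Q x is convex.
M = rng.normal(size=(4, 4))
Q = M.T @ M

f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x

# The tangent plane at x lies below the graph of f everywhere:
# f(y) >= f(x) + grad(x)' (y - x).
for _ in range(10_000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    assert f(y) >= f(x) + grad(x) @ (y - x) - 1e-9
print("first-order condition held at all sampled pairs")
```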

Second-order condition

Suppose \(f\) is twice differentiable: the Hessian \(\nabla^2 f(x)\), with entries \(\nabla^2 f(x)_{ij} = \frac{\partial^2 f(x)}{\partial x_i \partial x_j}\), exists for every \(x \in \text{dom}(f)\).

The 2nd-order condition states that if \(f\) is twice differentiable and has a convex domain, then \(f\) is convex if and only if \(\nabla^2 f(x) \succeq 0\) for all \(x \in \text{dom}(f)\).
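As a sketch, for log-sum-exp the Hessian has the closed form \(\nabla^2 f(x) = \text{diag}(p) - p p^T\), where \(p\) is the softmax of \(x\); its smallest eigenvalue can be checked numerically (it should be nonnegative up to roundoff):

```python
import numpy as np

rng = np.random.default_rng(2)

def lse_hessian(x):
    """Hessian of log-sum-exp: diag(p) - p p^T, with p the softmax of x."""
    p = np.exp(x - np.max(x))
    p /= p.sum()
    return np.diag(p) - np.outer(p, p)

# The smallest eigenvalue over many random points should be >= 0 (up to roundoff),
# consistent with log-sum-exp being convex.
min_eig = min(np.linalg.eigvalsh(lse_hessian(rng.normal(size=5))).min()
              for _ in range(1_000))
print(f"smallest Hessian eigenvalue seen: {min_eig:.2e}")
```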

Restriction of convex function to a line

\(f : \mathbb R^n \to \mathbb R\) is convex iff the function \(g : \mathbb R \to \mathbb R\), \[ g(t) = f(x + tv), \quad \text{dom}(g) = \{t : x + tv \in \text{dom}(f)\} \] is a convex function of \(t\) for any \(x \in \text{dom}(f)\) and any \(v \in \mathbb R^n\)

This equivalence lets you check the convexity of \(f\) by checking convexity of a function of one variable.
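A sketch of this trick for \(f(X) = -\log\det X\) on positive definite matrices (a classic example): restrict \(f\) to the line \(X + tV\) and verify numerically that the resulting one-dimensional function has nonnegative second differences. The particular \(X\) and \(V\) below are made up.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 4
X = np.eye(n)                       # a point in the domain (positive definite)
V = rng.normal(size=(n, n))
V = (V + V.T) / 2                   # an arbitrary symmetric direction ...
V /= np.linalg.norm(V, 2)           # ... scaled so X + tV stays positive definite below

def g(t):
    """g(t) = -log det(X + t V), the restriction of f to a line."""
    sign, logabsdet = np.linalg.slogdet(X + t * V)
    assert sign > 0, "left the domain: X + tV is not positive definite"
    return -logabsdet

# On a uniform grid, second differences of a convex function are nonnegative.
ts = np.linspace(-0.5, 0.5, 201)
vals = np.array([g(t) for t in ts])
second_diffs = vals[:-2] - 2 * vals[1:-1] + vals[2:]
print("smallest second difference:", second_diffs.min())
```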

Operations that preserve convexity

Positive weighted sum/composition with affine function

Examples:

Pointwise maximum

If \(f_1, \ldots, f_m\) are convex, then \(f(x) = \max\{f_1(x), \ldots, f_m(x)\}\) is convex
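The argument is a short chain of inequalities: apply convexity of each \(f_i\), then bound each \(f_i\) by the maximum.

\[ \begin{align*} f(\theta x + (1-\theta) y) &= \max_i f_i(\theta x + (1-\theta) y) \\ &\le \max_i \left[ \theta f_i(x) + (1-\theta) f_i(y) \right] \\ &\le \theta \max_i f_i(x) + (1-\theta) \max_i f_i(y) = \theta f(x) + (1-\theta) f(y) \end{align*} \]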

Example:

Composition

Suppose \(g : \mathbb R^n \to \mathbb R\) and \(h : \mathbb R \to \mathbb R\), and define \(f\) as \[ f(x) = h(g(x)) \]

\(f\) is convex if either: \(h\) is convex and nondecreasing and \(g\) is convex, or \(h\) is convex and nonincreasing and \(g\) is concave.

Proof for differentiable \(g\), \(h\) by checking the second-order conditions
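For instance, with \(n = 1\) and \(g\), \(h\) twice differentiable, the chain rule gives

\[ f''(x) = h''(g(x))\, g'(x)^2 + h'(g(x))\, g''(x) \]

If \(h\) is convex and nondecreasing (\(h'' \ge 0\), \(h' \ge 0\)) and \(g\) is convex (\(g'' \ge 0\)), both terms are nonnegative, so \(f'' \ge 0\); the convex-nonincreasing/concave case is analogous.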

Examples

Convex optimization problem

In a convex optimization problem, we minimize a convex function over a convex set.

Standard form for an optimization problem is:

\[ \begin{align*} \text{minimize} \quad &f_0(x) \\ \text{subject to}\quad &f_i(x) \le 0, \quad i = 1,\ldots, m\\ &a_j^T x = b_j, \quad j = 1,\ldots, p \end{align*} \]
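This is a convex optimization problem when \(f_0, \ldots, f_m\) are convex (the equality constraints are already affine). As a sketch of how such a problem can be specified and solved (assuming the cvxpy package; the problem data is made up):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(4)
n = 5
x = cp.Variable(n)

# Convex objective f_0, one convex inequality constraint f_1(x) <= 0,
# and one affine equality constraint a_1^T x = b_1.
objective = cp.Minimize(cp.sum_squares(x - rng.normal(size=n)))
constraints = [cp.norm(x, 2) <= 1.0,
               np.ones(n) @ x == 0.5]
problem = cp.Problem(objective, constraints)
problem.solve()
print(problem.status, problem.value)
```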

Regression

The standard least squares problem is convex in \(\beta\): \[ \text{minimize} \|y - X \beta\|_2^2 \]
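Because the objective is a convex quadratic in \(\beta\), setting the gradient to zero gives the normal equations \(X^T X \beta = X^T y\); a sketch with simulated data:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 100, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Solve the normal equations X'X beta = X'y
# (np.linalg.lstsq would also work and is more numerically robust).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)   # should be close to beta_true
```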

Regularized regression

Any "regularization" with a convex function \(P\) will still be convex:

\[ \text{minimize} \|y - X \beta\|_2^2 + P(\beta) \]
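For example, the lasso penalty \(P(\beta) = \lambda \|\beta\|_1\) is convex but not differentiable, and the regularized problem can still be handed to a generic convex solver. A sketch, again assuming cvxpy, with simulated data and a made-up \(\lambda\):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(6)
n, p = 100, 10
X = rng.normal(size=(n, p))
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=n)   # only two active features

beta = cp.Variable(p)
lam = 1.0                                               # regularization strength
loss = cp.sum_squares(y - X @ beta)
problem = cp.Problem(cp.Minimize(loss + lam * cp.norm(beta, 1)))
problem.solve()
print(np.round(beta.value, 3))   # most coefficients should be (near) zero
```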

Covariance estimation

Let \(S\) denote the sample covariance and \(\Theta\) be the inverse covariance matrix.

Up to constant factors, the log-likelihood of the data under a Gaussian model, viewed as a function of \(\Theta\), is

\[ \log \det \Theta - \text{tr}(S \Theta) \]

We estimate \(\Theta\) by maximizing this log-likelihood, or equivalently minimizing the negative log-likelihood. Since \(\log \det \Theta\) is concave and \(\text{tr}(S \Theta)\) is linear in \(\Theta\), the negative log-likelihood is convex, so this is a convex optimization problem.
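A sketch of this estimator, again assuming cvxpy (which supports \(\log\det\) directly), with simulated data; without constraints or penalties the minimizer should match the closed-form MLE \(S^{-1}\):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(7)
n, p = 500, 4
data = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
S = np.cov(data, rowvar=False)

# Minimize the (convex) negative log-likelihood tr(S Theta) - log det(Theta).
Theta = cp.Variable((p, p), symmetric=True)
problem = cp.Problem(cp.Minimize(cp.trace(S @ Theta) - cp.log_det(Theta)))
problem.solve()

# The unconstrained maximum-likelihood estimate is S^{-1}; compare.
print(np.max(np.abs(Theta.value - np.linalg.inv(S))))
```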

Summing up

If you can show an optimization problem is convex, you can very likely solve it efficiently

Many statistical estimation problems are naturally convex

You have a couple of options for checking convexity: