Planet Plone

This is where developers and integrators write about Plone, and is your best source for news and developments from the community.

December 18, 2014

Six Feet Up: Turning a static theme into a Diazo theme through the web

by Chrissy Wainwright at 2014-12-18T12:00:00Z

Overview

  1. Find/Create a static theme
  2. Create a new Diazo theme in the Theming control panel
  3. Copy in files from theme
  4. Write/Build the rules.xml

Detailed Steps

  1. First you’ll need a static theme. You can create one yourself, or find one on a free template site (make sure to read the terms of use!). The great thing is the theme doesn’t need to know anything specific about Plone, but you should still make sure all Plone elements that you want are displayed in the theme.

  2. Create a new Diazo theme

    • Go to Site Setup > Theming (or directly to @@theming-controlpanel)
    • Click the ‘New theme’ button. You will have the option to immediately enable the new theme. Note: I do not recommend this for a live site.
  3. You are taken to a theme editor where you can upload all files from your theme. Keep the folder structure the same. The only files you’ll need in addition to your static files are rules.xml and manifest.cfg.

  4. Write the rules. This is a single file that replaces the static content in your theme with dynamic content from Plone, based on the rules you define. The theme editor has tools built in to help you write the rules.

Show inspectors - This turns on a couple frames in the window, showing you both the static theme and unthemed Plone site. Hovering or clicking on the elements will give you a selector for that element.


Build rule - This is a wizard that writes rules for you based on settings you enter and elements you select. This is very helpful for learning Diazo syntax.


Help - This displays the manual for - the product that turns your Diazo theme into a Plone theme.

Once you have built all the rules, you can then Activate your theme in the Theming Control Panel to apply it to the entire site.


December 17, 2014

Daniel Nouri: Using convolutional neural nets to detect facial keypoints tutorial


This is a hands-on tutorial on deep learning. Step by step, we'll go about building a solution for the Facial Keypoint Detection Kaggle challenge. The tutorial introduces Lasagne, a new library for building neural networks with Python and Theano. We'll use Lasagne to implement a couple of network architectures, talk about data augmentation, dropout, the importance of momentum, and pre-training. Some of these methods will help us improve our results quite a bit.

I'll assume that you already know a fair bit about neural nets. That's because we won't talk much about the background of how neural nets work; there are a few good books and videos for that, like the Neural Networks and Deep Learning online book. Alec Radford's talk Deep Learning with Python's Theano library is a great quick introduction. Make sure you also check out Andrej Karpathy's mind-blowing ConvNetJS Browser Demos.


You don't need to type the code and execute it yourself if you just want to follow along. But here are the installation instructions for those who have access to a CUDA-capable GPU and want to run the experiments themselves.

I assume you have Python 2.7.x, numpy, pandas, matplotlib, and scikit-learn installed. Lasagne is still waiting for its first proper release, so for now we'll install it straight from GitHub. To install Lasagne and all the remaining dependencies, run these commands:

pip install -r
pip install -r

(Note that for sake of brevity, I'm not including commands to create a virtualenv and activate it. But you should.)

If everything worked well, you should be able to find the src/lasagne/examples/ directory in your virtualenv and run the MNIST example. This is sort of the "Hello, world" of neural nets. There are ten classes, one for each digit from 0 to 9, and the inputs are 28x28 grayscale images of handwritten digits.

cd src/lasagne/examples/

This command will start printing out stuff after thirty seconds or so. The reason it takes a while is that Lasagne uses Theano to do the heavy lifting; Theano in turn is a "optimizing GPU-meta-programming code generating array oriented optimizing math compiler in Python," and it will generate C code that needs to be compiled before training can happen. Luckily, we have to pay the price for this overhead only on the first run.

Once training starts, you'll see output like this:

Epoch 1 of 500
  training loss:            1.352731
  validation loss:          0.466565
  validation accuracy:              87.70 %
Epoch 2 of 500
  training loss:            0.591704
  validation loss:          0.326680
  validation accuracy:              90.64 %
Epoch 3 of 500
  training loss:            0.464022
  validation loss:          0.275699
  validation accuracy:              91.98 %

If you let it run long enough, you'll notice that after about 75 epochs, it'll have reached a test accuracy of around 98%.

(If any of the instructions in this tutorial do not work for you, submit a bug report here.)

The data

The training dataset for the Facial Keypoint Detection challenge consists of 7,049 96x96 gray-scale images. For each image, we're supposed to learn to find the correct position (the x and y coordinates) of 15 keypoints, such as left_eye_center, right_eye_outer_corner, mouth_center_bottom_lip, and so on.

An example of one of the faces with three keypoints marked.

An interesting twist with the dataset is that for some of the keypoints we only have about 2,000 labels, while other keypoints have more than 7,000 labels available for training.

Let's write some Python code that loads the data from the CSV files provided. We'll write a function that can load both the training and the test data. These two datasets differ in that the test data doesn't contain the target values; it's the goal of the challenge to predict these. Here's our load() function:

# file
import os
import numpy as np
from pandas import read_csv
from sklearn.utils import shuffle
FTRAIN = '~/data/kaggle-facial-keypoint-detection/training-cleaned.csv'
FTEST = '~/data/kaggle-facial-keypoint-detection/test.csv'
def load(test=False, cols=None):
    """Loads data from FTEST if *test* is True, otherwise from FTRAIN.
    Pass a list of *cols* if you're only interested in a subset of the
    target columns.
    """
    fname = FTEST if test else FTRAIN
    df = read_csv(os.path.expanduser(fname))  # load pandas dataframe
    # The Image column has pixel values separated by space; convert
    # the values to numpy arrays:
    df['Image'] = df['Image'].apply(lambda im: np.fromstring(im, sep=' '))
    if cols:  # get a subset of columns
        df = df[list(cols) + ['Image']]
    print(df.count())  # prints the number of values for each column
    df = df.dropna()  # drop all rows that have missing values in them
    X = np.vstack(df['Image'].values) / 255.  # scale pixel values to [0, 1]
    X = X.astype(np.float32)
    if not test:  # only FTRAIN has any target columns
        y = df[df.columns[:-1]].values
        y = (y - 48) / 48  # scale target coordinates to [-1, 1]
        X, y = shuffle(X, y, random_state=42)  # shuffle train data
        y = y.astype(np.float32)
    else:  # the test data has no targets
        y = None
    return X, y
X, y = load()
print("X.shape == {}; X.min == {:.3f}; X.max == {:.3f}".format(
    X.shape, X.min(), X.max()))
print("y.shape == {}; y.min == {:.3f}; y.max == {:.3f}".format(
    y.shape, y.min(), y.max()))

It's not necessary that you go through every single detail of this function. But let's take a look at what the script above outputs:

$ python
left_eye_center_x            7034
left_eye_center_y            7034
right_eye_center_x           7032
right_eye_center_y           7032
left_eye_inner_corner_x      2266
left_eye_inner_corner_y      2266
left_eye_outer_corner_x      2263
left_eye_outer_corner_y      2263
right_eye_inner_corner_x     2264
right_eye_inner_corner_y     2264
mouth_right_corner_x         2267
mouth_right_corner_y         2267
mouth_center_top_lip_x       2272
mouth_center_top_lip_y       2272
mouth_center_bottom_lip_x    7014
mouth_center_bottom_lip_y    7014
Image                        7044
dtype: int64
X.shape == (2140, 9216); X.min == 0.000; X.max == 1.000
y.shape == (2140, 30); y.min == -0.920; y.max == 0.996

First it's printing a list of all columns in the CSV file along with the number of available values for each. So while we have an Image for all rows in the training data, we only have 2,267 values for mouth_right_corner_x and so on.

load() returns a tuple (X, y) where y is the target matrix. y has shape n x m with n being the number of samples in the dataset that have all m keypoints. Dropping all rows with missing values is what this line does:

df = df.dropna()  # drop all rows that have missing values in them

The script's output y.shape == (2140, 30) tells us that there are only 2,140 images in the dataset that have all 30 target values present. Initially, we'll train with these 2,140 samples only. That leaves us with many more input dimensions (9,216) than samples, an indicator that overfitting might become a problem. Let's see. Of course it's a bad idea to throw away 70% of the training data just like that, and we'll talk about this later on.

Another feature of the load() function is that it scales the intensity values of the image pixels to be in the interval [0, 1], instead of 0 to 255. The target values (x and y coordinates) are scaled to [-1, 1]; before they were between 0 and 95.
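As a quick sanity check of that scaling, here's a minimal sketch of the two transformations, plus the inverse that we'll need later when plotting predictions. The helper names (scale_pixels, scale_targets, unscale_targets) are made up for this illustration; load() does the same arithmetic inline on numpy arrays.

```python
def scale_pixels(values):
    """Map raw intensities 0..255 to the interval [0, 1]."""
    return [v / 255.0 for v in values]


def scale_targets(coords):
    """Map pixel coordinates (roughly 0..95) to roughly [-1, 1]."""
    return [(c - 48) / 48.0 for c in coords]


def unscale_targets(scaled):
    """Inverse of scale_targets: back to pixel coordinates."""
    return [s * 48 + 48 for s in scaled]


coords = [0.0, 48.0, 95.0]
scaled = scale_targets(coords)
print(scaled)
print(unscale_targets(scaled))
```

Note that a coordinate of 48 lands exactly on 0, so the center of the image becomes the origin of the target space.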

First model: a single hidden layer

Now that we're done with the legwork of loading the data, let's use Lasagne and create a neural net with a single hidden layer. We'll start with the code:

# add to
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
net1 = NeuralNet(
    layers=[  # three layers: one hidden layer
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    # layer parameters:
    input_shape=(128, 9216),  # 128 images per batch times 96x96 input pixels
    hidden_num_units=100,  # number of units in hidden layer
    output_nonlinearity=None,  # output layer uses identity function
    output_num_units=30,  # 30 target values
    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,
    regression=True,  # flag to indicate we're dealing with regression problem
    max_epochs=400,  # we want to train this many epochs
    verbose=1,  # print out information during training
    )
X, y = load()
net1.fit(X, y)

We use quite a few parameters to initialize the NeuralNet. Let's walk through them. First there are the three layers and their parameters:

    layers=[  # three layers: one hidden layer
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    # layer parameters:
    input_shape=(128, 9216),  # 128 images per batch times 96x96 input pixels
    hidden_num_units=100,  # number of units in hidden layer
    output_nonlinearity=None,  # output layer uses identity function
    output_num_units=30,  # 30 target values

Here we define the input layer, the hidden layer and the output layer. In parameter layers, we name and specify the type of each layer, and their order. Parameters input_shape, hidden_num_units, output_nonlinearity, and output_num_units are each parameters for specific layers; they refer to the layer by their prefix, such that input_shape defines the shape parameter of the input layer, hidden_num_units defines the hidden layer's num_units and so on. (It may seem a little odd that we have to specify the parameters like this, but the upshot is it buys us better compatibility with scikit-learn's pipeline and parameter search features.)
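To make the prefix convention concrete, here's a minimal sketch of how keyword arguments could be grouped by layer name. The function split_layer_params is hypothetical and written only for this illustration; it is not how NeuralNet is actually implemented.

```python
def split_layer_params(layer_names, **kwargs):
    """Group 'name_param' keyword arguments by layer name.

    Hypothetical helper for illustration only; not nolearn's code.
    """
    per_layer = {name: {} for name in layer_names}
    other = {}
    for key, value in kwargs.items():
        prefix, _, param = key.partition('_')
        if prefix in per_layer and param:
            per_layer[prefix][param] = value
        else:
            other[key] = value  # not addressed to any layer
    return per_layer, other


per_layer, other = split_layer_params(
    ['input', 'hidden', 'output'],
    input_shape=(128, 9216),
    hidden_num_units=100,
    output_nonlinearity=None,
    output_num_units=30,
    max_epochs=400,
)
print(per_layer['hidden'])
print(other)
```

Keeping everything as flat keyword arguments is what lets tools that expect simple named parameters, such as a parameter search, vary a single layer setting.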

We'll discuss batch iterators later on. For now you'll have to be aware that we use mini batches with 128 samples in each batch, and we define the first dimension of input_shape accordingly.

We set the output_nonlinearity to None explicitly. Thus, the output units' activations become just a linear combination of the activations in the hidden layer.

The default nonlinearity used by DenseLayer is the rectifier, which is simply max(0, x). It's the most popular choice of activation function these days. By not explicitly setting hidden_nonlinearity, we're choosing the rectifier as the activation function of our hidden layer.
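Put together, the forward pass of such a net is short enough to write out in plain numpy. This is only a sketch with made-up random weights, meant to show where the rectifier and the linear output come in; the function names are my own, and this is not what Lasagne executes internally.

```python
import numpy as np


def rectify(x):
    """The rectifier: element-wise max(0, x)."""
    return np.maximum(0, x)


def forward(X, W_hidden, b_hidden, W_out, b_out):
    """Forward pass: rectified hidden layer, then a linear output layer."""
    hidden = rectify(X.dot(W_hidden) + b_hidden)
    return hidden.dot(W_out) + b_out  # no nonlinearity on the output


rng = np.random.RandomState(0)
X = rng.rand(4, 9216).astype(np.float32)          # 4 fake "images"
W_hidden = rng.uniform(-0.01, 0.01, (9216, 100))  # made-up weights
W_out = rng.uniform(-0.01, 0.01, (100, 30))
y_pred = forward(X, W_hidden, np.zeros(100), W_out, np.zeros(30))
print(y_pred.shape)
```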

The neural net's weights are initialized from a uniform distribution with a cleverly chosen interval. That is, Lasagne figures out this interval for us, using "Glorot-style" initialization.
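One common form of Glorot-style uniform initialization draws the weights from the interval [-a, a] with a = sqrt(6 / (fan_in + fan_out)). The sketch below assumes exactly this formula with no extra gain factor; the function name is my own, and Lasagne's initializer may differ in details.

```python
import numpy as np


def glorot_uniform(fan_in, fan_out, rng):
    """Sample a (fan_in, fan_out) weight matrix from U(-a, a)."""
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_in, fan_out)), a


rng = np.random.RandomState(42)
W, a = glorot_uniform(9216, 100, rng)  # input -> hidden layer of net1
print(W.shape, round(a, 4))
```

The interval shrinks as layers get wider, which keeps the scale of the activations roughly stable from one layer to the next.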

There are a few more parameters. All parameters starting with update parametrize the update function, or optimization method. The update function will update the weights of our network after each batch. We'll use the nesterov_momentum gradient descent optimization method to do the job. There are a number of other methods that Lasagne implements, such as adagrad and rmsprop. We choose nesterov_momentum because it has proven to work very well for a large number of problems.

    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,

The update_learning_rate defines how large we want the steps of the gradient descent updates to be. We'll talk a bit more about the learning_rate and momentum parameters later on. For now, it's enough to just use these "sane defaults."
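To get a feel for what the two numbers do, here's a sketch of one common formulation of the Nesterov momentum update, applied to a one-dimensional toy loss. The function name and the toy problem are made up for this illustration; treat it as a picture of the idea rather than as Lasagne's exact code.

```python
def nesterov_step(param, velocity, grad, learning_rate=0.01, momentum=0.9):
    """One Nesterov momentum update for a single scalar parameter."""
    velocity = momentum * velocity - learning_rate * grad
    # Look ahead along the new velocity, then apply the gradient step:
    param = param + momentum * velocity - learning_rate * grad
    return param, velocity


# Toy loss f(w) = w ** 2 with gradient 2 * w and its minimum at w = 0:
w, v = 5.0, 0.0
for _ in range(500):
    w, v = nesterov_step(w, v, grad=2 * w)
print(abs(w) < 1e-3)
```

The learning rate scales each gradient step, while the momentum decides how much of the previous step's direction carries over into the next one.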

Comparison of a few optimization methods (animation by Alec Radford). The star denotes the global minimum on the error surface. Notice that stochastic gradient descent (SGD) without momentum is the slowest method to converge in this example. We're using Nesterov's Accelerated Gradient Descent (NAG) throughout this tutorial.

In our definition of NeuralNet we didn't specify an objective function to minimize. There's again a default for that; for regression problems it's the mean squared error (MSE).

The last set of parameters declare that we're dealing with a regression problem (as opposed to classification), that 400 is the number of epochs we're willing to train, and that we want to print out information during training by setting verbose=1:

    regression=True,  # flag to indicate we're dealing with regression problem
    max_epochs=400,  # we want to train this many epochs
    verbose=1,  # print out information during training

Finally, the last two lines in our script load the data, just as before, and then train the neural net with it:

X, y = load()
net1.fit(X, y)

Running these two lines will output a table that grows one row per training epoch. In each row, we'll see the current loss (MSE) on the train set and on the validation set and the ratio between the two. NeuralNet automatically splits the data provided in X into a training and a validation set, using 20% of the samples for validation. (You can adjust this ratio by overriding the eval_size=0.2 parameter.)

$ python
  InputLayer          (128, 9216)             produces    9216 outputs
  DenseLayer          (128, 100)              produces     100 outputs
  DenseLayer          (128, 30)               produces      30 outputs
 Epoch  |  Train loss  |  Valid loss  |  Train / Val
     1  |    0.105418  |    0.031085  |     3.391261
     2  |    0.020353  |    0.019294  |     1.054894
     3  |    0.016118  |    0.016918  |     0.952734
     4  |    0.014187  |    0.015550  |     0.912363
     5  |    0.013329  |    0.014791  |     0.901199
   200  |    0.003250  |    0.004150  |     0.783282
   201  |    0.003242  |    0.004141  |     0.782850
   202  |    0.003234  |    0.004133  |     0.782305
   203  |    0.003225  |    0.004126  |     0.781746
   204  |    0.003217  |    0.004118  |     0.781239
   205  |    0.003209  |    0.004110  |     0.780738
   395  |    0.002259  |    0.003269  |     0.690925
   396  |    0.002256  |    0.003264  |     0.691164
   397  |    0.002254  |    0.003264  |     0.690485
   398  |    0.002249  |    0.003259  |     0.690303
   399  |    0.002247  |    0.003260  |     0.689252
   400  |    0.002244  |    0.003255  |     0.689606

On a reasonably fast GPU, we're able to train for 400 epochs in under a minute. Notice that the validation loss keeps improving until the end. (If you let it train longer, it will improve a little more.)

Now how good is a validation loss of 0.0032? How does it compare to the challenge's benchmark or the other entries in the leaderboard? Remember that we divided the target coordinates by 48 when we scaled them to be in the interval [-1, 1]. Thus, to calculate the root-mean-square error, as that's what's used in the challenge's leaderboard, based on our MSE loss of 0.003255, we'll take the square root and multiply by 48 again:

>>> import numpy as np
>>> np.sqrt(0.003255) * 48

This is a reasonable proxy for what our score would be on the Kaggle leaderboard; at the same time it's assuming that the subset of the data that we chose to train with follows the same distribution as the test set, which isn't really the case. My guess is that the score is good enough to earn us a top ten place in the leaderboard at the time of writing. Certainly not a bad start! (And for those of you that are crying out right now because of the lack of a proper test set: don't.)
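Since we'll be doing this conversion again for the later models, it can be wrapped in a tiny helper. The name rmse_in_pixels is my own; the arithmetic is the square root and multiplication by 48 described above.

```python
import math


def rmse_in_pixels(mse_scaled, scale=48):
    """Convert an MSE on [-1, 1]-scaled targets to an RMSE in pixels."""
    return math.sqrt(mse_scaled) * scale


print(round(rmse_in_pixels(0.003255), 4))  # validation loss of net1
```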

Testing it out

The net1 object actually keeps a record of the data that it prints out in the table. We can access that record through the train_history_ attribute. Let's draw those two curves:

from matplotlib import pyplot
train_loss = np.array([i["train_loss"] for i in net1.train_history_])
valid_loss = np.array([i["valid_loss"] for i in net1.train_history_])
pyplot.plot(train_loss, linewidth=3, label="train")
pyplot.plot(valid_loss, linewidth=3, label="valid")
pyplot.legend()  # show the "train" and "valid" labels
pyplot.ylim(1e-3, 1e-2)
pyplot.show()

We can see that our net overfits, but it's not that bad. In particular, we don't see a point where the validation error gets worse again, thus it doesn't appear that early stopping, a technique that's commonly used to avoid overfitting, would be very useful at this point. Notice that we didn't use any regularization whatsoever, apart from choosing a small number of neurons in the hidden layer, a setting that will keep overfitting somewhat in control.
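One simple way to check whether early stopping would have helped is to look for the epoch with the lowest validation loss; if it's the last one, stopping earlier would only have hurt. Here's a sketch using a few rows copied from the table above. The 'epoch' key and the helper name are assumptions made for this example; only 'valid_loss' appears in the code above.

```python
def best_epoch(history):
    """Return (epoch, valid_loss) for the lowest validation loss."""
    best = min(history, key=lambda row: row['valid_loss'])
    return best['epoch'], best['valid_loss']


# A few rows copied from the training table of net1:
history = [
    {'epoch': 1, 'valid_loss': 0.031085},
    {'epoch': 200, 'valid_loss': 0.004150},
    {'epoch': 399, 'valid_loss': 0.003260},
    {'epoch': 400, 'valid_loss': 0.003255},
]
print(best_epoch(history))
```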

What do the net's predictions look like, then? Let's pick a few examples from the test set and check:

def plot_sample(x, y, axis):
    img = x.reshape(96, 96)
    axis.imshow(img, cmap='gray')
    axis.scatter(y[0::2] * 48 + 48, y[1::2] * 48 + 48, marker='x', s=10)
X, _ = load(test=True)
y_pred = net1.predict(X)
fig = pyplot.figure(figsize=(6, 6))
fig.subplots_adjust(
    left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)
for i in range(16):
    ax = fig.add_subplot(4, 4, i + 1, xticks=[], yticks=[])
    plot_sample(X[i], y_pred[i], ax)
pyplot.show()

Our first model's predictions on 16 samples taken from the test set.

The predictions look reasonable, but sometimes they are quite a bit off. Let's try and do a bit better.

Second model: convolutions

The convolution operation. (Animation taken from the Stanford deep learning tutorial.)

LeNet5-style convolutional neural nets are at the heart of deep learning's recent breakthrough in computer vision. Convolutional layers are different to fully connected layers; they use a few tricks to reduce the number of parameters that need to be learned, while retaining high expressiveness. These are:

  • local connectivity: neurons are connected only to a subset of neurons in the previous layer,
  • weight sharing: weights are shared between a subset of neurons in the convolutional layer (these neurons form what's called a feature map),
  • pooling: static subsampling of inputs.

Illustration of local connectivity and weight sharing. (Taken from the tutorial.)

Units in a convolutional layer actually connect to a 2-d patch of neurons in the previous layer, a prior that lets them exploit the 2-d structure in the input.

When using convolutional layers in Lasagne, we have to prepare the input data such that each sample is no longer a flat vector of 9,216 pixel intensities, but a three-dimensional matrix with shape (c, 0, 1), where c is the number of channels (colors), and 0 and 1 correspond to the x and y dimensions of the input image. In our case, the concrete shape will be (1, 96, 96), because we're dealing with a single (gray) color channel only.

A function load2d that wraps the previously written load and does the necessary transformations is easily coded:

def load2d(test=False, cols=None):
    X, y = load(test=test, cols=cols)  # pass cols through to load()
    X = X.reshape(-1, 96, 96, 1)
    X = X.transpose(0, 3, 1, 2)
    return X, y
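If the reshape followed by a transpose looks roundabout, here's a small check, on made-up data, that it produces the same array as reshaping directly into (samples, channels, rows, columns):

```python
import numpy as np

flat = np.arange(2 * 96 * 96, dtype=np.float32).reshape(2, 9216)

# The two-step version used in load2d:
a = flat.reshape(-1, 96, 96, 1).transpose(0, 3, 1, 2)
# A direct reshape into (samples, channels, rows, columns):
b = flat.reshape(-1, 1, 96, 96)

print(a.shape, b.shape)
print(np.array_equal(a, b))
```

With a single channel the two are interchangeable; the transpose only starts to matter for data stored with several interleaved color channels.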

We'll build a convolutional neural net with three convolutional layers and two fully connected layers. Each conv layer is followed by a 2x2 max-pooling layer. Starting with 32 filters, we double the number of filters with every conv layer. The densely connected hidden layers both have 500 units.

There's again no regularization in the form of weight decay or dropout. It turns out that using very small convolutional filters, such as our 3x3 and 2x2 filters, is again a pretty good regularizer by itself.

Let's write down the code:

# use the cuda-convnet implementations of conv and max-pool layer
Conv2DLayer = layers.cuda_convnet.Conv2DCCLayer
MaxPool2DLayer = layers.cuda_convnet.MaxPool2DCCLayer
net2 = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('conv1', Conv2DLayer),
        ('pool1', MaxPool2DLayer),
        ('conv2', Conv2DLayer),
        ('pool2', MaxPool2DLayer),
        ('conv3', Conv2DLayer),
        ('pool3', MaxPool2DLayer),
        ('hidden4', layers.DenseLayer),
        ('hidden5', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    input_shape=(128, 1, 96, 96),
    conv1_num_filters=32, conv1_filter_size=(3, 3), pool1_ds=(2, 2),
    conv2_num_filters=64, conv2_filter_size=(2, 2), pool2_ds=(2, 2),
    conv3_num_filters=128, conv3_filter_size=(2, 2), pool3_ds=(2, 2),
    hidden4_num_units=500, hidden5_num_units=500,
    output_num_units=30, output_nonlinearity=None,
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,
    regression=True,
    max_epochs=1000,
    verbose=1,
    )
X, y = load2d()  # load 2-d data
net2.fit(X, y)
# Training for 1000 epochs will take a while.  We'll pickle the
# trained model so that we can load it back later:
import cPickle as pickle
with open('net2.pickle', 'wb') as f:
    pickle.dump(net2, f, -1)

Training this neural net is much more computationally costly than the first one we trained. It takes around 15x as long to train; those 1000 epochs take more than 20 minutes on even a powerful GPU.

However, the patient is rewarded with what's already a much better model than the one we had before. Let's take a look at the output when running the script. First comes the list of layers with their output shapes. Note that the first conv layer produces 32 output images of size (94, 94), that's one 94x94 output image per filter:

InputLayer            (128, 1, 96, 96)        produces    9216 outputs
Conv2DCCLayer         (128, 32, 94, 94)       produces  282752 outputs
MaxPool2DCCLayer      (128, 32, 47, 47)       produces   70688 outputs
Conv2DCCLayer         (128, 64, 46, 46)       produces  135424 outputs
MaxPool2DCCLayer      (128, 64, 23, 23)       produces   33856 outputs
Conv2DCCLayer         (128, 128, 22, 22)      produces   61952 outputs
MaxPool2DCCLayer      (128, 128, 11, 11)      produces   15488 outputs
DenseLayer            (128, 500)              produces     500 outputs
DenseLayer            (128, 500)              produces     500 outputs
DenseLayer            (128, 30)               produces      30 outputs
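The sizes in this list follow from simple arithmetic. Here's a sketch that reproduces them, assuming convolutions with stride 1 and no padding, and non-overlapping 2x2 pooling; the helper names are made up for this example.

```python
def conv_output_size(size, filter_size):
    """Output width/height of an unpadded convolution with stride 1."""
    return size - filter_size + 1


def pool_output_size(size, ds):
    """Output width/height of non-overlapping pooling."""
    return size // ds


size = 96
sizes = []
for filter_size in (3, 2, 2):  # conv1, conv2, conv3
    size = conv_output_size(size, filter_size)
    sizes.append(size)
    size = pool_output_size(size, 2)
    sizes.append(size)
print(sizes)  # [94, 47, 46, 23, 22, 11]
```

Multiplying by the number of filters gives the output counts in the listing, for example 32 x 94 x 94 = 282752 for the first conv layer.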

What follows is the same table that we saw with the first example, with train and validation error over time:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
     1  |    0.111763  |    0.042740  |     2.614934
     2  |    0.018500  |    0.009413  |     1.965295
     3  |    0.008598  |    0.007918  |     1.085823
     4  |    0.007292  |    0.007284  |     1.001139
     5  |    0.006783  |    0.006841  |     0.991525
   500  |    0.001791  |    0.002013  |     0.889810
   501  |    0.001789  |    0.002011  |     0.889433
   502  |    0.001786  |    0.002009  |     0.889044
   503  |    0.001783  |    0.002007  |     0.888534
   504  |    0.001780  |    0.002004  |     0.888095
   505  |    0.001777  |    0.002002  |     0.887699
   995  |    0.001083  |    0.001568  |     0.690497
   996  |    0.001082  |    0.001567  |     0.690216
   997  |    0.001081  |    0.001567  |     0.689867
   998  |    0.001080  |    0.001567  |     0.689595
   999  |    0.001080  |    0.001567  |     0.689089
  1000  |    0.001079  |    0.001566  |     0.688874

Quite a nice improvement over the first network. Our RMSE is looking pretty good, too:

>>> np.sqrt(0.001566) * 48

We can compare the predictions of the two networks using one of the more problematic samples in the test set:

sample1 = load(test=True)[0][6:7]
sample2 = load2d(test=True)[0][6:7]
y_pred1 = net1.predict(sample1)[0]
y_pred2 = net2.predict(sample2)[0]
fig = pyplot.figure(figsize=(6, 3))
ax = fig.add_subplot(1, 2, 1, xticks=[], yticks=[])
plot_sample(sample1[0], y_pred1, ax)
ax = fig.add_subplot(1, 2, 2, xticks=[], yticks=[])
plot_sample(sample1[0], y_pred2, ax)

The predictions of net1 on the left compared to the predictions of net2.

And then let's compare the learning curves of the first and the second network:

This looks pretty good; I like the smoothness of the new error curves. But we do notice that towards the end, the validation error of net2 flattens out much more quickly than the training error. I bet we could improve that by using more training examples. What if we flipped the input images horizontally; would we be able to improve training by doubling the amount of training data this way?

Data augmentation

An overfitting net can generally be made to perform better by using more training data. (And if your unregularized net does not overfit, you should probably make it larger.)

Data augmentation lets us artificially increase the number of training examples by applying transformations, adding noise etc. That's obviously more economical than having to go out and collect more examples by hand. Augmentation is a very useful tool to have in your deep learning toolbox.

We already mentioned batch iterators briefly. It is the batch iterator's job to take a matrix of samples and split it up into batches, in our case of size 128. While it does the splitting, the batch iterator can also apply transformations to the data on the fly. So to produce those horizontal flips, we don't actually have to double the amount of training data in the input matrix. Rather, we will just perform the horizontal flips with 50% chance while we're iterating over the data. This is convenient, and for some problems it allows us to produce an infinite number of examples, without blowing up the memory usage. Also, transformations to the input images can be done while the GPU is busy processing a previous batch, so they come at virtually no cost.
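Stripped of everything else, the idea looks roughly like the generator below. This is a simplified sketch with names of my own choosing, not nolearn's BatchIterator class; it ignores the targets and only shows the splitting and the on-the-fly transformation.

```python
import numpy as np


def iterate_batches(X, batch_size, transform=None):
    """Yield consecutive batches of X, optionally transformed on the fly."""
    for start in range(0, len(X), batch_size):
        Xb = X[start:start + batch_size]
        if transform is not None:
            Xb = transform(Xb)
        yield Xb


def random_flip(Xb, rng=np.random):
    """Flip a random half of the images in the batch horizontally."""
    Xb = Xb.copy()  # leave the original data untouched
    bs = Xb.shape[0]
    indices = rng.choice(bs, bs // 2, replace=False)
    Xb[indices] = Xb[indices, :, :, ::-1]
    return Xb


X = np.zeros((300, 1, 96, 96), dtype=np.float32)
batch_sizes = [len(Xb) for Xb in iterate_batches(X, 128, random_flip)]
print(batch_sizes)  # [128, 128, 44]
```

Because each batch is transformed just before it is used, the full matrix X never has to hold the flipped copies.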

Flipping the images horizontally is just a matter of using slicing:

X, y = load2d()
X_flipped = X[:, :, :, ::-1]  # simple slice to flip all images
# plot two images:
fig = pyplot.figure(figsize=(6, 3))
ax = fig.add_subplot(1, 2, 1, xticks=[], yticks=[])
plot_sample(X[1], y[1], ax)
ax = fig.add_subplot(1, 2, 2, xticks=[], yticks=[])
plot_sample(X_flipped[1], y[1], ax)

Left shows the original image, right is the flipped image.

In the picture on the right, notice that the target value keypoints aren't aligned with the image anymore. Since we're flipping the images, we'll have to make sure we also flip the target values. To do this, not only do we have to flip the coordinates, we'll also have to swap target value positions; that's because the flipped left_eye_center_x no longer points to the left eye in our flipped image; now it corresponds to right_eye_center_x. Some points like nose_tip_y are not affected. We'll define a list flip_indices that holds the information about which columns in the target vector need to swap places when we flip the image horizontally. Remember the list of columns was:

left_eye_center_x            7034
left_eye_center_y            7034
right_eye_center_x           7032
right_eye_center_y           7032
left_eye_inner_corner_x      2266
left_eye_inner_corner_y      2266

Since left_eye_center_x will need to swap places with right_eye_center_x, we write down the tuple (0, 2). Also left_eye_center_y needs to swap places with right_eye_center_y. Thus we write down (1, 3), and so on. In the end, we have:

flip_indices = [
    (0, 2), (1, 3),
    (4, 8), (5, 9), (6, 10), (7, 11),
    (12, 16), (13, 17), (14, 18), (15, 19),
    (22, 24), (23, 25),
    ]
# Let's see if we got it right:
df = read_csv(os.path.expanduser(FTRAIN))
for i, j in flip_indices:
    print("# {} -> {}".format(df.columns[i], df.columns[j]))
# this prints out:
# left_eye_center_x -> right_eye_center_x
# left_eye_center_y -> right_eye_center_y
# left_eye_inner_corner_x -> right_eye_inner_corner_x
# left_eye_inner_corner_y -> right_eye_inner_corner_y
# left_eye_outer_corner_x -> right_eye_outer_corner_x
# left_eye_outer_corner_y -> right_eye_outer_corner_y
# left_eyebrow_inner_end_x -> right_eyebrow_inner_end_x
# left_eyebrow_inner_end_y -> right_eyebrow_inner_end_y
# left_eyebrow_outer_end_x -> right_eyebrow_outer_end_x
# left_eyebrow_outer_end_y -> right_eyebrow_outer_end_y
# mouth_left_corner_x -> mouth_right_corner_x
# mouth_left_corner_y -> mouth_right_corner_y
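Another cheap check is that flipping the targets twice gives back the original vector. The sketch below applies the flip to a single made-up 30-element target vector; flip_targets is a hypothetical helper, and it assumes the targets are already scaled to [-1, 1], so that mirroring an x coordinate is just a change of sign.

```python
import numpy as np

FLIP_INDICES = [
    (0, 2), (1, 3),
    (4, 8), (5, 9), (6, 10), (7, 11),
    (12, 16), (13, 17), (14, 18), (15, 19),
    (22, 24), (23, 25),
]


def flip_targets(y, flip_indices=FLIP_INDICES):
    """Mirror one 30-element target vector horizontally."""
    y = y.copy()
    y[::2] *= -1  # x coordinates sit at the even positions
    for a, b in flip_indices:
        y[a], y[b] = y[b], y[a]  # e.g. left eye <-> right eye
    return y


y = np.linspace(-0.9, 0.9, 30)  # made-up target values
flipped = flip_targets(y)
print(flipped[0] == -y[2], flipped[1] == y[3])
print(np.allclose(flip_targets(flipped), y))
```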

Our batch iterator implementation will derive from the default BatchIterator class and override the transform() method only. Let's see what it looks like when we put it all together:

from nolearn.lasagne import BatchIterator
class FlipBatchIterator(BatchIterator):
    flip_indices = [
        (0, 2), (1, 3),
        (4, 8), (5, 9), (6, 10), (7, 11),
        (12, 16), (13, 17), (14, 18), (15, 19),
        (22, 24), (23, 25),
        ]
    def transform(self, Xb, yb):
        Xb, yb = super(FlipBatchIterator, self).transform(Xb, yb)
        # Don't flip images if we're in 'test' mode:
        if not self.test:
            # Flip half of the images in this batch at random:
            bs = Xb.shape[0]
            indices = np.random.choice(bs, bs // 2, replace=False)
            Xb[indices] = Xb[indices, :, :, ::-1]
            if yb is not None:
                # Horizontal flip of all x coordinates:
                yb[indices, ::2] = yb[indices, ::2] * -1
                # Swap places, e.g. left_eye_center_x -> right_eye_center_x
                for a, b in self.flip_indices:
                    yb[indices, a], yb[indices, b] = (
                        yb[indices, b], yb[indices, a])
        return Xb, yb

To use this batch iterator for training, we'll pass it as the batch_iterator argument to NeuralNet. Let's define net3, a network that looks exactly the same as net2 except for these lines at the very end:

net3 = NeuralNet(
    # ...
    batch_iterator=FlipBatchIterator(batch_size=128),
    max_epochs=3000,
    )

Now we're passing our FlipBatchIterator, but we've also tripled the number of epochs to train. While each one of our training epochs will still look at the same number of examples as before (after all, we haven't changed the size of X), it turns out that training nevertheless takes quite a bit longer when we use our transforming FlipBatchIterator. This is because what the network learns generalizes better this time, and it's arguably harder to learn things that generalize than to overfit.

So this will take maybe an hour to train. Let's make sure we pickle the model at the end of training, and then we're ready to go fetch some tea and biscuits. Or maybe do the laundry:

net3.fit(X, y)
import cPickle as pickle
with open('net3.pickle', 'wb') as f:
    pickle.dump(net3, f, -1)

$ python
 Epoch  |  Train loss  |  Valid loss  |  Train / Val
   500  |    0.002238  |    0.002303  |     0.971519
  1000  |    0.001365  |    0.001623  |     0.841110
  1500  |    0.001067  |    0.001457  |     0.732018
  2000  |    0.000895  |    0.001369  |     0.653721
  2500  |    0.000761  |    0.001320  |     0.576831
  3000  |    0.000678  |    0.001288  |     0.526410

Comparing the learning with that of net2, we notice that the error on the validation set after 3000 epochs is indeed about 5% smaller for the data augmented net. We can see how net2 stops learning anything useful after 2000 or so epochs, and gets pretty noisy, while net3 continues to improve its validation error throughout, though slowly.

Still seems like a lot of work for only a small gain? We'll find out if it was worth it in the next section.

Changing learning rate and momentum over time

What's annoying about our last model is that it already took an hour to train, and it's not exactly inspiring to wait that long for an experiment's results. In this section, we'll talk about a combination of two tricks to fix that and make the net train much faster again.

An intuition behind starting with a higher learning rate and decreasing it during the course of training is this: As we start training, we're far away from the optimum, and we want to take big steps towards it and learn quickly. But the closer we get to the optimum, the lighter we want to step. It's like taking the train home, but when you enter your door you do it by foot, not by train.

On the importance of initialization and momentum in deep learning is the title of a talk and a paper by Ilya Sutskever et al. It's there that we learn about another useful trick to boost deep learning: namely increasing the optimization method's momentum parameter during training.

Remember that in our previous model, we initialized learning rate and momentum with a static 0.01 and 0.9 respectively. Let's change that such that the learning rate decreases linearly with the number of epochs, while we let the momentum increase.
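
As a rough illustration of what such a schedule looks like, here is a minimal sketch that computes evenly spaced values between a start and a stop value, one per epoch. The helper linear_schedule is hypothetical, and the five epochs are chosen only to keep the output short:

```python
def linear_schedule(start, stop, num_epochs):
    """Return num_epochs values evenly spaced from start to stop inclusive."""
    if num_epochs == 1:
        return [start]
    step = (stop - start) / float(num_epochs - 1)
    return [start + i * step for i in range(num_epochs)]


# Illustrative values only; five epochs keeps the output readable.
learning_rates = linear_schedule(0.03, 0.0001, 5)
momentums = linear_schedule(0.9, 0.999, 5)
for epoch, (lr, mom) in enumerate(zip(learning_rates, momentums), 1):
    print("epoch %d: learning rate %.5f, momentum %.4f" % (epoch, lr, mom))
```

The learning rate shrinks by the same amount every epoch while the momentum grows by the same amount.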

NeuralNet allows us to update parameters during training using the on_epoch_finished hook. We can pass a function to on_epoch_finished and it'll be called whenever an epoch is finished. However, before we can assign new values to update_learning_rate and update_momentum on the fly, we'll have to change these two parameters to become Theano shared variables. Thankfully, that's pretty easy:

import theano
def float32(k):
    return np.cast['float32'](k)
net4 = NeuralNet(
    # ...
    # ...

The callback or list of callbacks that we pass will be called with two arguments: nn, which is the NeuralNet instance itself, and train_history, which is the same as nn.train_history_.
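
To show the calling convention in isolation, here is a small sketch with a stand-in class. FakeNet is not NeuralNet and this is not how the library's training loop is actually written; it only mimics the one behaviour described above, calling each handler with the net and its history after every epoch, and the losses are made up:

```python
class FakeNet(object):
    """Stand-in for NeuralNet, only to show the calling convention."""

    def __init__(self, on_epoch_finished, max_epochs):
        self.on_epoch_finished = on_epoch_finished
        self.max_epochs = max_epochs
        self.train_history_ = []

    def fit(self):
        for epoch in range(1, self.max_epochs + 1):
            # A real net would train here; we record a made-up loss.
            self.train_history_.append(
                {'epoch': epoch, 'valid_loss': 1.0 / epoch})
            for callback in self.on_epoch_finished:
                callback(self, self.train_history_)


def report(nn, train_history):
    last = train_history[-1]
    print("epoch %d of %d, valid loss %.3f" % (
        last['epoch'], nn.max_epochs, last['valid_loss']))


FakeNet(on_epoch_finished=[report], max_epochs=3).fit()
```

Any callable with this two-argument signature works as a handler, which is why an object with a __call__ method can be used in place of a plain function.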

Instead of working with callback functions that use hard-coded values, we'll use a parametrizable class with a __call__ method as our callback. Let's call this class AdjustVariable. The implementation is reasonably straight-forward:

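class AdjustVariable(object):
    def __init__(self, name, start=0.03, stop=0.001):
        self.name = name
        self.start, self.stop = start, stop
        self.ls = None

    def __call__(self, nn, train_history):
        if self.ls is None:
            # build the schedule once, one value per epoch
            self.ls = np.linspace(self.start, self.stop, nn.max_epochs)
        epoch = train_history[-1]['epoch']
        new_value = float32(self.ls[epoch - 1])
        # the parameter is a Theano shared variable, so set it in place
        getattr(nn, self.name).set_value(new_value)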

Let's plug it all together now and then we're ready to start training:

net4 = NeuralNet(
    # ...
    # ...
    # batch_iterator=FlipBatchIterator(batch_size=128),
    on_epoch_finished=[
        AdjustVariable('update_learning_rate', start=0.03, stop=0.0001),
        AdjustVariable('update_momentum', start=0.9, stop=0.999),
        ],
    # ...
    )

X, y = load2d()
net4.fit(X, y)
with open('net4.pickle', 'wb') as f:
    pickle.dump(net4, f, -1)

We'll train two nets: net4 doesn't use our FlipBatchIterator, net5 does. Other than that, they're identical.

This is the learning of net4:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
    50  |    0.004216  |    0.003996  |     1.055011
   100  |    0.003533  |    0.003382  |     1.044791
   250  |    0.001557  |    0.001781  |     0.874249
   500  |    0.000915  |    0.001433  |     0.638702
   750  |    0.000653  |    0.001355  |     0.481806
  1000  |    0.000496  |    0.001387  |     0.357917

Cool, training is happening much faster now! The train error at epochs 500 and 1000 is half of what it used to be in net2, before our adjustments to learning rate and momentum. This time, generalization seems to stop improving after 750 or so epochs already; looks like there's no point in training much longer.

What about net5 with the data augmentation switched on?

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
    50  |    0.004317  |    0.004081  |     1.057609
   100  |    0.003756  |    0.003535  |     1.062619
   250  |    0.001765  |    0.001845  |     0.956560
   500  |    0.001135  |    0.001437  |     0.790225
   750  |    0.000878  |    0.001313  |     0.668903
  1000  |    0.000705  |    0.001260  |     0.559591
  1500  |    0.000492  |    0.001199  |     0.410526
  2000  |    0.000373  |    0.001184  |     0.315353

And again we have much faster training than with net3, and better results. After 1000 epochs, we're better off than net3 was after 3000 epochs. What's more, the model trained with data augmentation is now about 10% better with regard to validation error than the one without.


Dropout

Introduced in 2012 in the Improving neural networks by preventing co-adaptation of feature detectors paper, dropout is a popular regularization technique that works amazingly well. I won't go into the details of why it works so well; you can read about that elsewhere.
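
For orientation, the mechanism itself is small. Here is a minimal sketch, assuming the common "inverted" formulation in which surviving units are scaled up during training so that nothing needs to change at test time. It is a generic illustration on plain lists, not necessarily what Lasagne's DropoutLayer does line for line, and the function name and arguments are made up:

```python
import random


def dropout(activations, p, train=True, rng=random):
    """Inverted dropout on a list of activations.

    During training each unit is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p); at test time the input is
    returned unchanged.
    """
    if not train or p == 0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed, only to make the run repeatable
hidden = [1.0] * 10
print(dropout(hidden, p=0.5, rng=rng))       # some units are zeroed
print(dropout(hidden, p=0.5, train=False))   # unchanged at test time
```

Because a different random subset of units is silenced on every pass, no single unit can rely on a specific other unit being present.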

Like with any other regularization technique, dropout only makes sense if we have a network that's overfitting, which is clearly the case for the net5 network that we trained in the previous section. It's important to remember to get your net to train nicely and overfit first, then regularize.

To use dropout with Lasagne, we'll add DropoutLayer layers between the existing layers and assign dropout probabilities to each one of them. Here's the complete definition of our new net. I've added a # ! comment at the end of those lines that were added between this and net5.

net6 = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('conv1', Conv2DLayer),
        ('pool1', MaxPool2DLayer),
        ('dropout1', layers.DropoutLayer),  # !
        ('conv2', Conv2DLayer),
        ('pool2', MaxPool2DLayer),
        ('dropout2', layers.DropoutLayer),  # !
        ('conv3', Conv2DLayer),
        ('pool3', MaxPool2DLayer),
        ('dropout3', layers.DropoutLayer),  # !
        ('hidden4', layers.DenseLayer),
        ('dropout4', layers.DropoutLayer),  # !
        ('hidden5', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    input_shape=(128, 1, 96, 96),
    conv1_num_filters=32, conv1_filter_size=(3, 3), pool1_ds=(2, 2),
    dropout1_p=0.1,  # !
    conv2_num_filters=64, conv2_filter_size=(2, 2), pool2_ds=(2, 2),
    dropout2_p=0.2,  # !
    conv3_num_filters=128, conv3_filter_size=(2, 2), pool3_ds=(2, 2),
    dropout3_p=0.3,  # !
    hidden4_num_units=500,
    dropout4_p=0.5,  # !
    hidden5_num_units=500,
    output_num_units=30, output_nonlinearity=None,
    # ...
    on_epoch_finished=[
        AdjustVariable('update_learning_rate', start=0.03, stop=0.0001),
        AdjustVariable('update_momentum', start=0.9, stop=0.999),
        ],
    # ...
    )

Our network is sufficiently large now to crash Python's pickle with a maximum recursion error. Therefore we have to increase Python's recursion limit before we save it:

import sys
sys.setrecursionlimit(10000)
X, y = load2d()
net6.fit(X, y)
import cPickle as pickle
with open('net6.pickle', 'wb') as f:
    pickle.dump(net6, f, -1)
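
If you'd rather not leave the higher limit in place for the rest of the program, one option is to raise it only around the pickling step. The following is a sketch of that idea; the helper recursion_limit is hypothetical and the value 10000 is an arbitrary choice:

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the interpreter's recursion limit."""
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        # Restore the old limit even if pickling fails.
        sys.setrecursionlimit(previous)


with recursion_limit(10000):
    # pickle.dump(...) of a deeply nested object would go here.
    print(sys.getrecursionlimit())
print(sys.getrecursionlimit())
```

Keep in mind that a very high limit can let runaway recursion crash the interpreter instead of raising an error, so it's worth restoring the default once the model is saved.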

Taking a look at the learning, we notice that it's become slower again, and that's expected with dropout, but eventually it will outperform net5:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
    50  |    0.004619  |    0.005198  |     0.888566
   100  |    0.004369  |    0.004182  |     1.044874
   250  |    0.003821  |    0.003577  |     1.068229
   500  |    0.002598  |    0.002236  |     1.161854
  1000  |    0.001902  |    0.001607  |     1.183391
  1500  |    0.001660  |    0.001383  |     1.200238
  2000  |    0.001496  |    0.001262  |     1.185684
  2500  |    0.001383  |    0.001181  |     1.171006
  3000  |    0.001306  |    0.001121  |     1.164100

Also overfitting doesn't seem to be nearly as bad. Though we'll have to be careful with those numbers: the ratio between training and validation has a slightly different meaning now since the train error is evaluated with dropout, whereas the validation error is evaluated without dropout. A more comparable value for the train error is this:

from sklearn.metrics import mean_squared_error
print mean_squared_error(net6.predict(X), y)
# prints something like 0.0010073791
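
For reference, the quantity being printed is just the average squared difference over all target values. Here is a pure-Python sketch of that computation on two made-up samples; the function name mse is mine, and this is only meant to show the arithmetic, not to replace the scikit-learn function:

```python
def mse(y_true, y_pred):
    """Average of squared differences over every target value."""
    total, count = 0.0, 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += (t - p) ** 2
            count += 1
    return total / count


# Two made-up samples with two target values each.
y_true = [[0.0, 0.5], [1.0, -1.0]]
y_pred = [[0.0, 0.0], [0.5, -1.0]]
print(mse(y_true, y_pred))  # prints 0.125
```

Since the differences are squared, it makes no difference which of the two arguments holds the predictions.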

In our previous model without dropout, the error on the train set was 0.000373. So not only does our dropout net perform slightly better, it overfits much less than what we had before. That's great news, because it means that we can expect even better performance when we make the net larger (and more expressive). And that's what we'll try next: we increase the number of units in the last two hidden layers from 500 to 1000. Update these lines:

net7 = NeuralNet(
    # ...
    hidden4_num_units=1000,  # !
    hidden5_num_units=1000,  # !
    # ...

The improvement over the non-dropout net is now becoming more substantial:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
    50  |    0.004756  |    0.007043  |     0.675330
   100  |    0.004440  |    0.005321  |     0.834432
   250  |    0.003974  |    0.003928  |     1.011598
   500  |    0.002574  |    0.002347  |     1.096366
  1000  |    0.001861  |    0.001613  |     1.153796
  1500  |    0.001558  |    0.001372  |     1.135849
  2000  |    0.001409  |    0.001230  |     1.144821
  2500  |    0.001295  |    0.001146  |     1.130188
  3000  |    0.001195  |    0.001087  |     1.099271

And we're still looking really good with the overfitting! My feeling is that if we increase the number of epochs to train, this model might become even better. Let's try it:

net12 = NeuralNet(
    # ...
    max_epochs=10000,
    # ...
    )

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
    50  |    0.004756  |    0.007027  |     0.676810
   100  |    0.004439  |    0.005321  |     0.834323
   500  |    0.002576  |    0.002346  |     1.097795
  1000  |    0.001863  |    0.001614  |     1.154038
  2000  |    0.001406  |    0.001233  |     1.140188
  3000  |    0.001184  |    0.001074  |     1.102168
  4000  |    0.001068  |    0.000983  |     1.086193
  5000  |    0.000981  |    0.000920  |     1.066288
  6000  |    0.000904  |    0.000884  |     1.021837
  7000  |    0.000851  |    0.000849  |     1.002314
  8000  |    0.000810  |    0.000821  |     0.985769
  9000  |    0.000769  |    0.000803  |     0.957842
 10000  |    0.000760  |    0.000787  |     0.966583

So there you're witnessing the magic that is dropout. :-)

Let's compare the nets we trained so far and their respective train and validation errors:

 Name  |   Description    |  Epochs  |  Train loss  |  Valid loss
 net1  |  single hidden   |     400  |    0.002244  |    0.003255
 net2  |  convolutions    |    1000  |    0.001079  |    0.001566
 net3  |  augmentation    |    3000  |    0.000678  |    0.001288
 net4  |  mom + lr adj    |    1000  |    0.000496  |    0.001387
 net5  |  net4 + augment  |    2000  |    0.000373  |    0.001184
 net6  |  net5 + dropout  |    3000  |    0.001306  |    0.001121
 net7  |  net6 + epochs   |   10000  |    0.000760  |    0.000787

Training specialists

Remember those 70% of training data that we threw away in the beginning? Turns out that's a very bad idea if we want to get a competitive score in the Kaggle leaderboard. There's quite a bit of variance in those 70% of data and in the challenge's test set that our model hasn't seen yet.

So instead of training a single model, let's train a few specialists, with each one predicting a different set of target values. We'll train one model that only predicts left_eye_center and right_eye_center, one only for nose_tip and so on; overall, we'll have six models. This will allow us to use the full training dataset, and hopefully get a more competitive score overall.

The six specialists are all going to use exactly the same network architecture (a simple approach, not necessarily the best). Because training is bound to take much longer now than before, let's think about a strategy so that we don't have to wait for max_epochs to finish, even if the validation error stopped improving much earlier. This is called early stopping, and we'll write another on_epoch_finished callback to take care of that. Here's the implementation:

class EarlyStopping(object):
    def __init__(self, patience=100):
        self.patience = patience
        self.best_valid = np.inf
        self.best_valid_epoch = 0
        self.best_weights = None
    def __call__(self, nn, train_history):
        current_valid = train_history[-1]['valid_loss']
        current_epoch = train_history[-1]['epoch']
        if current_valid < self.best_valid:
            self.best_valid = current_valid
            self.best_valid_epoch = current_epoch
            self.best_weights = [w.get_value() for w in nn.get_all_params()]
        elif self.best_valid_epoch + self.patience < current_epoch:
            print("Early stopping.")
            print("Best valid loss was {:.6f} at epoch {}.".format(
                self.best_valid, self.best_valid_epoch))
            # restore the best weights seen so far before stopping
            nn.load_weights_from(self.best_weights)
            raise StopIteration()

You can see that there's two branches inside the __call__: the first where the current validation score is better than what we've previously seen, and the second where the best validation epoch was more than self.patience epochs in the past. In the first case we store away the weights:

          self.best_weights = [w.get_value() for w in nn.get_all_params()]

In the second case, we set the weights of the network back to those best_weights before raising StopIteration, signalling to NeuralNet that we want to stop training.

          nn.load_weights_from(self.best_weights)
          raise StopIteration()
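
To see the patience rule at work without training anything, here is a sketch that applies the same two comparisons to a plain list of validation losses. The function early_stopping_epoch and the loss values are made up for illustration:

```python
def early_stopping_epoch(valid_losses, patience):
    """Return (stop_epoch, best_epoch, best_loss) for a loss sequence.

    Epochs are numbered from 1. stop_epoch is None if patience never
    runs out.
    """
    best_loss, best_epoch = float('inf'), 0
    for epoch, loss in enumerate(valid_losses, 1):
        if loss < best_loss:
            # First branch: a new best, remember where it happened.
            best_loss, best_epoch = loss, epoch
        elif best_epoch + patience < epoch:
            # Second branch: the best epoch is too far in the past.
            return epoch, best_epoch, best_loss
    return None, best_epoch, best_loss


# Made-up validation losses: the best value is reached at epoch 3.
losses = [0.9, 0.7, 0.5, 0.6, 0.55, 0.58, 0.61]
print(early_stopping_epoch(losses, patience=2))  # prints (6, 3, 0.5)
```

Note that with patience=2 training stops at epoch 6, not 5, because the comparison is strict: the best epoch must be more than patience epochs in the past.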

Let's update the list of on_epoch_finished handlers in our net's definition and use EarlyStopping:

net8 = NeuralNet(
    # ...
    on_epoch_finished=[
        AdjustVariable('update_learning_rate', start=0.03, stop=0.0001),
        AdjustVariable('update_momentum', start=0.9, stop=0.999),
        EarlyStopping(patience=200),
        ],
    # ...
    )

So far so good, but how would we go about defining those specialists and what they should each predict? Let's make a list for that:

SPECIALIST_SETTINGS = [
    dict(columns=(
            'left_eye_center_x', 'left_eye_center_y',
            'right_eye_center_x', 'right_eye_center_y',
            ),
        flip_indices=((0, 2), (1, 3))),
    dict(columns=(
            'nose_tip_x', 'nose_tip_y',
            ),
        flip_indices=()),
    dict(columns=(
            'mouth_left_corner_x', 'mouth_left_corner_y',
            'mouth_right_corner_x', 'mouth_right_corner_y',
            'mouth_center_top_lip_x', 'mouth_center_top_lip_y',
            ),
        flip_indices=((0, 2), (1, 3))),
    dict(columns=(
            'mouth_center_bottom_lip_x', 'mouth_center_bottom_lip_y',
            ),
        flip_indices=()),
    dict(columns=(
            'left_eye_inner_corner_x', 'left_eye_inner_corner_y',
            'right_eye_inner_corner_x', 'right_eye_inner_corner_y',
            'left_eye_outer_corner_x', 'left_eye_outer_corner_y',
            'right_eye_outer_corner_x', 'right_eye_outer_corner_y',
            ),
        flip_indices=((0, 2), (1, 3), (4, 6), (5, 7))),
    dict(columns=(
            'left_eyebrow_inner_end_x', 'left_eyebrow_inner_end_y',
            'right_eyebrow_inner_end_x', 'right_eyebrow_inner_end_y',
            'left_eyebrow_outer_end_x', 'left_eyebrow_outer_end_y',
            'right_eyebrow_outer_end_x', 'right_eyebrow_outer_end_y',
            ),
        flip_indices=((0, 2), (1, 3), (4, 6), (5, 7))),
    ]

We already discussed the need for flip_indices in the Data augmentation section. Remember from section The data that our load_data() function takes an optional list of columns to extract. We'll make use of this feature when we fit the specialist models in a new function fit_specialists():

from collections import OrderedDict
from sklearn.base import clone
def fit_specialists():
    specialists = OrderedDict()
    for setting in SPECIALIST_SETTINGS:
        cols = setting['columns']
        X, y = load2d(cols=cols)
        model = clone(net)
        model.output_num_units = y.shape[1]
        model.batch_iterator.flip_indices = setting['flip_indices']
        # set number of epochs relative to number of training examples:
        model.max_epochs = int(1e7 / y.shape[0])
        if 'kwargs' in setting:
            # an option 'kwargs' in the settings list may be used to
            # set any other parameter of the net:
            vars(model).update(setting['kwargs'])
        print("Training model for columns {} for {} epochs".format(
            cols, model.max_epochs))
        model.fit(X, y)
        specialists[cols] = model
    with open('net-specialists.pickle', 'wb') as f:
        # we persist a dictionary with all models:
        pickle.dump(specialists, f, -1)

There's nothing too spectacular happening here. Instead of training and persisting a single model, we do it with a list of models that are saved in a dictionary that maps columns to the trained NeuralNet instances. Now despite our early stopping, this will still take forever to train (though by forever I don't mean Google-forever, I mean maybe half a day on a single GPU); I don't recommend that you actually run this.
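
The line that sets max_epochs deserves a quick worked example: dividing a fixed budget by the number of training examples means that specialists with fewer examples get proportionally more epochs. In this sketch the example counts are invented, since the real ones depend on how many rows have all of a specialist's columns filled in, and epoch_budget is a hypothetical helper:

```python
from collections import OrderedDict


def epoch_budget(num_examples, total_updates=1e7):
    """Epochs to train so that examples * epochs stays near total_updates."""
    return int(total_updates / num_examples)


# Hypothetical example counts per specialist. The keys are tuples of
# column names, which is what lets them serve as dictionary keys.
example_counts = OrderedDict([
    (('left_eye_center_x', 'left_eye_center_y'), 7000),
    (('nose_tip_x', 'nose_tip_y'), 5000),
    (('left_eyebrow_inner_end_x', 'left_eyebrow_inner_end_y'), 2000),
])

for cols, n in example_counts.items():
    print("%s...: %d examples -> %d epochs" % (cols[0], n, epoch_budget(n)))
```

With these made-up counts the budgets come out as 1428, 2000 and 5000 epochs, so every specialist sees roughly the same total number of training examples.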

We could of course easily parallelize training these specialist nets across GPUs, but maybe you don't have the luxury of access to a box with multiple CUDA GPUs. In the next section we'll talk about another way to cut down on training time. But let's take a look at the results of fitting these expensive-to-train specialists first:

Learning curves for six specialist models. The solid lines represent RMSE on the validation set, the dashed lines errors on the train set. mean is the mean validation error of all nets weighted by number of target values. All curves have been scaled to have the same length on the x axis.

Lastly, this solution gives us a Kaggle leaderboard score of 2.17 RMSE, which corresponds to the second place at the time of writing (right behind yours truly).

Supervised pre-training

In the last section of this tutorial, we'll discuss a way to make training our specialists faster. The idea is this: instead of initializing the weights of each specialist network at random, we'll initialize them with the weights that were learned in net6 or net7. Remember from our EarlyStopping implementation that copying weights from one network to another is as simple as using the load_weights_from() method. Let's modify the fit_specialists method to do just that. I'm again marking the lines that changed compared to the previous implementation with a # ! comment:

def fit_specialists(fname_pretrain=None):
    if fname_pretrain:  # !
        with open(fname_pretrain, 'rb') as f:  # !
            net_pretrain = pickle.load(f)  # !
    else:  # !
        net_pretrain = None  # !
    specialists = OrderedDict()
    for setting in SPECIALIST_SETTINGS:
        cols = setting['columns']
        X, y = load2d(cols=cols)
        model = clone(net)
        model.output_num_units = y.shape[1]
        model.batch_iterator.flip_indices = setting['flip_indices']
        model.max_epochs = int(4e6 / y.shape[0])
        if 'kwargs' in setting:
            # an option 'kwargs' in the settings list may be used to
            # set any other parameter of the net:
            vars(model).update(setting['kwargs'])
        if net_pretrain is not None:  # !
            # if a pretrain model was given, use it to initialize the
            # weights of our new specialist model:
            model.load_weights_from(net_pretrain)  # !
        print("Training model for columns {} for {} epochs".format(
            cols, model.max_epochs))
        model.fit(X, y)
        specialists[cols] = model
    with open('net-specialists.pickle', 'wb') as f:
        # this time we're persisting a dictionary with all models:
        pickle.dump(specialists, f, -1)

It turns out that initializing those nets not at random, but by re-using weights from one of the networks we learned earlier has in fact two big advantages: One is that training converges much faster; maybe four times faster in this case. The second advantage is that it also helps get better generalization; pre-training acts as a regularizer. Here's the same learning curves as before, but now for the pre-trained nets:

Learning curves for six specialist models that were pre-trained.

Finally, the score for this solution on the challenge's leaderboard is 2.13 RMSE. Again the second place, but getting closer!


There's probably a dozen ideas that you have that you want to try out. You can find the source code for the final solution here to download and play around with. It also includes the bit that generates a submission file for the Kaggle challenge. Run python to find out how to use the script on the command-line.

Here's a couple of the more obvious things that you could try out at this point: Try optimizing the parameters for the individual specialist networks; this is something that we haven't done so far. Observe that the six nets that we trained all have different levels of overfitting. If they're not or hardly overfitting, like for the green and the yellow net above, you could try to decrease the amount of dropout. Likewise, if it's overfitting badly, like the black and purple nets, you could try increasing the amount of dropout. In the definition of SPECIALIST_SETTINGS we can already add some net-specific settings; so say we wanted to add more regularization to the second net, then we could change the second entry of the list to look like so:

    dict(columns=('nose_tip_x', 'nose_tip_y'),
         flip_indices=(),
         kwargs=dict(dropout2_p=0.3, dropout3_p=0.4),  # !
         ),

And there's a ton of other things that you could try to tweak. Maybe you'll try adding another convolutional or fully connected layer? I'm curious to hear about improvements that you're able to come up with in the comments.

Daniel Nouri is the founder of Natural Vision, a company that builds cutting edge machine learning solutions.

December 16, 2014

UW Oshkosh How-To's: How to embed video on a page

by ledwell at 2014-12-16T22:52:26Z

The problem

When you use TinyMCE to embed video code from Kaltura (or any other video service), TinyMCE strips out and reworks the code so that the video will not display in Plone 4.3. Adjusting the settings under "site setup > HTML Filtering" does not help: TinyMCE continues to strip out the code. You can turn off TinyMCE and edit the code without the WYSIWYG editor; the code is then not modified on save and the video displays as expected. The problem with this solution is that someone else might come along who just wants to fix or change some text. If they use TinyMCE on that page, the code is reworked on save and the video that worked before stops working.

You could try using;

David's option looks like it would work, but you'd need more knowledge of monkeypatching and rights to access the file server, neither of which I have.

For this to work you will need to have


Create a Snippet

  1. Activate "Snippets" (activating this product will create a folder called ".snippets" at the root level of your site)
  2. Copy the embed code for the video you want to display
  3. Navigate to the root of your site and go into the /.snippets folder
  4. Choose "add new > page"
  5. Change "Text Format" to "textile" (this is mandatory! We do NOT want to use TinyMCE in any way here).
  6. Paste your embed code
  7. IMPORTANT! Please give your file a very simple title. Do NOT use spaces or any goofy characters, only a-zA-Z0-9_ (if you name it incorrectly you'll get an error when you try to insert the snippet; if that happens, just go back and rename the snippet).
  8. Save
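
If you create many snippets, the naming rule from step 7 can be checked up front. This is only a sketch of such a check in Python, assuming that "simple" means exactly the characters listed there; the function name is made up and nothing here is part of the Snippets product:

```python
import re

# Assumption: a "simple" title means only ASCII letters, digits and
# underscores, as listed in step 7.
VALID_TITLE = re.compile(r'\A[A-Za-z0-9_]+\Z')


def is_safe_snippet_title(title):
    """Return True if the title uses only a-z, A-Z, 0-9 and underscore."""
    return bool(VALID_TITLE.match(title))


for title in ('intro_video', 'Intro Video', 'intro-video!'):
    print("%r -> %s" % (title, is_safe_snippet_title(title)))
```

Only the first of the three titles passes; the other two contain a space and punctuation respectively.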

Add your snippet to a page using TinyMCE

  1. Navigate to a page where you want your video (now a snippet) to appear
  2. Choose the "insert snippets" icon, which looks like this: {{}}
  3. Choose the "browse" button
  4. Choose your new Snippet 
  5. Choose the "Select" button
  6. If you're happy with the preview, choose the "insert" button

Positive Side Effects

This is a snippet that can be inserted in multiple places on your website. If you ever go to change the video for the snippet then all pages using that snippet get the new video as well.

Plone Conference Bucharest October 2015


Plone Conference 2015 will take place October 14-16, 2015, in Bucharest, Romania. The schedule will include pre-conference trainings and post-conference sprints.

December 15, 2014

CodeSyntax: zest.releaser and some add-ons


For some time we have been using a script found in some plone svn repository (currently unreachable) to do proper egg releases, bump the egg version and upload them to pypi or our custom repository. But now we are moving to use zest.releaser for both public and private eggs. We have written two add-ons for it to ease our move from our previous hand-made-scripts.

BubbleNet: Control setuptools version installed by buildout bootstrap

by Godefroid Chapelle at 2014-12-15T11:20:00Z

A few weeks ago, I got mad for the xth time because my CI pipeline was broken again by zc.buildout bootstrap installing a version of setuptools that is incompatible with other parts of the pipeline.

With a few lines of code, I got a fix for this issue.

You can now find on github master branch a version of that accepts the --setuptools-version parameter.

To finely control which versions of packages are actually installed at bootstrap time, you can now use something like:

python --setuptools-version=7.0 --version=2.2.5

This might be useful for those who were bitten by setuptools 8.0 and cannot wait until all bugs are sorted out.


December 12, 2014

Tom Gross: Workaround setuptools 8.0 bug with zc.buildout

by Tom at 2014-12-12T23:00:00Z

Buildout always fetches the latest version of setuptools for bootstrapping, no matter what is defined in versions.cfg. It is possible to set the version of zc.buildout when bootstrapping, but not that of setuptools.

This behavior is hardcoded in

ez = {}
exec(urlopen(''
            ).read(), ez)
if not options.allow_site_packages:

Unfortunately there are some incompatible changes (a bug?) in setuptools 8.0 which prevent zc.buildout from bootstrapping. It fails with the following error:

tom@localhost:~/demobuildout> python2.7
Extracting in /tmp/tmp_34LbA
Now working in /tmp/tmp_34LbA/setuptools-8.0
Building a Setuptools egg in /tmp/tmpeA5PHB
Traceback (most recent call last):
  File "", line 145, in <module>
    if _final_version(distv):
  File "", line 131, in _final_version
    for part in parsed_version:
TypeError: 'Version' object is not iterable

I found a quite easy workaround to use a different version of setuptools until this issue is fixed. Setuptools 7.0 seems to work fine. Do the following:

  1. Create a directory and change to it:

    $ mkdir setuptools-workaround
    $ cd setuptools-workaround
  2. Download

    $ wget
  3. Edit and change the setuptools version to be used:

    DEFAULT_VERSION = "7.0"
    DEFAULT_URL = ""
  4. Start a python webserver in the directory:

    $ python -m SimpleHTTPServer

    The server does not daemonize itself. The following actions need to be done in a new terminal.

  5. Now change the line where it downloads in your file to use the patched

    ez = {}
    exec(urlopen('http://localhost:8000/'
                ).read(), ez)
    if not options.allow_site_packages:
  6. You are ready to start your working buildout.

    tom@linux-zoc2:~/demobuildout> python
    Extracting in /tmp/tmp82Jp3m
    Now working in /tmp/tmp82Jp3m/setuptools-7.0
    Building a Setuptools egg in /tmp/tmpgDuB3k
    Generated script '/home/tom/demobuildout/bin/buildout'.
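
A note on step 4: the SimpleHTTPServer module only exists in Python 2; in Python 3 the same functionality lives in http.server (python -m http.server on the command line). If you prefer to start and stop the server from a script instead of a second terminal, the following Python 3 sketch shows one way to do it. The helper names serve_directory and fetch are made up, and the demo serves a throwaway directory rather than the patched file:

```python
import functools
import http.server
import os
import tempfile
import threading
import urllib.request


def serve_directory(directory, port=0):
    """Serve `directory` on localhost in a background thread.

    port=0 lets the operating system pick a free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory)
    server = http.server.HTTPServer(('127.0.0.1', port), handler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    return server


def fetch(url):
    """Fetch a URL directly, ignoring any proxy settings."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=10) as response:
        return response.read().decode('utf-8')


# Demo: serve a temporary directory, fetch one file, then shut down.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'hello.txt'), 'w') as f:
        f.write('served from a local directory\n')
    server = serve_directory(tmp)
    try:
        port = server.server_address[1]
        print(fetch('http://127.0.0.1:%d/hello.txt' % port))
    finally:
        server.shutdown()
        server.server_close()
```

To match the steps above you would pass port=8000 and the setuptools-workaround directory, and keep the server running while the bootstrap script downloads from it.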

This works for me. Hope this bug is fixed soon anyway.

Reinout van Rees: Naming things: don't use reserved words

by Reinout van Rees at 2014-12-12T20:17:00Z

Update: added postfixed-with-underscore examples.

I'm currently cleaning up some code. Some people just cannot spell "p-e-p-8" even if their lives depended on it, apparently. Luckily I'm civilized so I only threaten with dismemberment.

Look at this gem I just uncovered:

tg_after = instance.filter.technischegegevensfilter_set.exclude(

Don't mind about the Dutch in there. Just look at those two filter words in the first two lines. They're even nicely underneath each other. At least they are now, I first had to fit the 159 characters long line within 78 characters, of course.

In Django, you do sql filtering with .filter(some_condition=42). That's not what's happening in the first line, though. There's a foreign key called filter there! So the first filter is the name of a foreign key and the second filter is the filter method that is used everywhere in Django.

Big confusion. And big chance that someone else that reads the code messes it up somehow.

So... steer clear of common words used in your programming language or framework or whatever. Some examples:

  • Don't use type as a name. Use customer_type or station_type or whatever. Only use type by itself if you really mean the Python built-in. Alternatively you can postfix it with an underscore, so type_
  • Don't use class. Either use the often-used klass or class_ alternative if you want to keep it short. But why not use css_class if you want to return a css class, for instance?
  • Don't use filter for Django models. Even if you're modelling filters that are used for groundwater extraction (as in this case). Call them WaterFilter or so.
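
The first two kinds of clash can be checked mechanically. Here is a small sketch using only the standard library; the function name naming_problem is made up, and the check knows nothing about framework-specific names:

```python
import keyword

try:
    import builtins                   # Python 3
except ImportError:
    import __builtin__ as builtins    # Python 2


def naming_problem(name):
    """Describe why `name` is a risky identifier, or return None."""
    if keyword.iskeyword(name):
        return "keyword: cannot be used as a name at all"
    if hasattr(builtins, name):
        return "built-in: legal, but shadows the original"
    return None


for name in ('class', 'type', 'filter', 'class_', 'customer_type'):
    print("%-14s %s" % (name, naming_problem(name) or "fine"))
```

Here filter is flagged only because it also happens to be a Python built-in. A name that clashes with nothing but a Django method would slip through, so that kind of clash still needs a human eye.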

So... you can now go and fix your code. You've got about 45 minutes before I'm done sharpening my axe.

Plone Foundation Board Elects Officers for 2014-2015


Meet your new Plone Foundation officers: Paul Roeland has been named President, Cris Ewing Vice-President, Steve McMahon Secretary and Jen Myers Treasurer.

December 11, 2014

Martijn Faassen: A Review of the Web and how Morepath fits in

by Martijn Faassen at 2014-12-11T16:26:00Z

I've spent a bit of time writing the somewhat grandiosely titled A Review of the Web, and I think it's a neat addition to the documentation of the Morepath web framework.

It attempts to do two things:

  • Help those developers who aren't as familiar yet with the details of web technology to get a handle on various concepts surrounding web frameworks and web services.
  • Show to developers who are more familiar with these concepts how Morepath fits in as a web framework.

Does this document fulfill either purpose? Let me know!

UW Oshkosh: Plone community wisdom: a personal take on the Bristol Plone 2020 open space

by nguyen at 2014-12-11T15:00:11Z

This is a repost of - please add your comments and feedback there.



In the lead up to the Bristol Plone Conference, there was much chatter on the attendee mailing list about Andreas Jung's provocatively entitled presentation, "Why Plone is Going to Die". 

Another mailing list thread centered around the need to make Plone easier to use for non-developers. 

Some email participants were pushing for a complete rewrite of Plone, to address concerns about the complexity of Plone's underpinnings and the rapidly changing nature of web user interfaces going to pure JavaScript. 

There was much angst and predictions of doom if we did not make radical changes to Plone-the-software.


Andreas has been part of Plone and Zope for as long as most of us can remember. He has been a huge contributor in many parts of the software we all use and deploy, and he has been ever present in online forums and chat rooms.  Not a person who shies from controversy, he makes it his business to speak forthrightly and say the things that others may think but will not say out loud. 

This has gotten him into trouble, but his well deserved reputation as a contributor, organizer, and doer -- not to mention his intentionally provocative presentation title -- led to his much awaited presentation on Day 1 to be standing room only... and it was a big room. 

Andreas' slides:

Andreas's talk on video:

Andreas' main points were:

  • growing developer pain
  • growing integrator pain
  • rising & unpredictable project costs
  • developer frustration


Andreas and his iceberg slide

As he went through his 48 slides, I realized that Andreas had not just raged on with a litany of reflexive complaints (where would I get that idea?), but he had come up with a list of items that, yes, we could work on!  His pain points were a roadmap to improving the quality of the code base that he had run into problems using. 

In fact, during the sprints two days later, an entire table of developers tackled some of the migration pain Andreas had described with the new Dexterity-based content types.

Other code-centric examples he brought up:

  • z3c.form
  • the Pythonic (or not) nature of Dexterity
  • the proliferation of small plone.* eggs
  • Zope Component Architecture
  • TinyMCE and other JavaScript migration issues
  • Zope 2 and CMF
  • incompatible add-ons
  • plone.api


At least one audience member pointed out that it may not always be necessary to upgrade a site, especially since upgrading across major releases is, by definition, significant. Andreas admitted that he doesn't automatically upgrade sites either. Many of us have taken the same position: unless the client has a strong need to go to a major new version, there may not be a compelling reason to, especially given Plone's stability and security.

Andreas also presented the results of a survey he circulated prior to the conference, attempting to look at the demographics of Plone developers: age (mostly in their 30s, followed by 40s), gender (95% male), geographical distribution (72% Euro-centric, 12% North American).  He also addressed a question we have all been wondering about: is the CMS market shrinking? Is the Plone market shrinking? Are there more legacy Plone projects than new ones?  The numbers don't show a drastic change in Plone's presence.

Someone suggested that interest in CMS development has dropped in recent years, which is perhaps true of all CMSs except maybe WordPress.  Compared with 2001, or with the heyday of Plone in the mid-2000s, I see a lot more interest in mobile applications and other (non-CMS) web sites and services. These draw new, young developers away from the large, established projects for complex CMSs such as Plone and Drupal.  We have all encountered youth who are eager to write their own CMS because Plone seems too complicated to learn... true enough, but after your first quick wins with your DIY CMS, when you are asked to implement security and maybe rudimentary workflow, that's when you go "um, ok" and, if you have enough self-confidence, you allow yourself to wonder if perhaps it would have been ok to spend a little bit of time learning to use Plone instead of reinventing it...

Andreas then had a wonderful slide (#39) in which he listed all the reasons why Plone is still really good:

  • enterprise CMS
  • fine-grained security model
  • outstanding security record
  • flexible workflows
  • ZODB is great (but time to move on...)


Andreas and slide of good things


and then he gave us his roadmap for making Plone great for tomorrow:

  • get rid of old code
  • do not overengineer code (e.g. portlets and his punching bag, z3c.form)
  • consistent APIs everywhere
  • consistent type checking
  • better / mo' explicit error messages
  • better search engine
  • more scalable database back end
  • coherent/consistent architecture


The key takeaway here is that these are all things we can and DO agree with.

Somewhat more debatable were his prescriptions to:

  • kill Zope 2
  • kill CMF
  • kill ZCA
  • orphan ZODB
  • use Pyramid
  • use Python 3
  • use a new (pluggable?) persistence API or layer
  • evaluate the market for a better database back end
  • don't call the new Plone "Plone"


but, unbeknownst to all of us, those were to be addressed in the Plone 2020 open space two days later...


As Days 1 and 2 of the conference unfolded, the open spaces (Day 3) signup charts filled up quickly, with the morning booked for both the Plone 2020 and the "Growing Plone" topics.

(Sally Kleinfeldt's notes from that combined open space are here: )

It was standing room only, and Martin Aspeli was kind enough (perhaps he was volunteered?) to moderate.  He started off by asking us to write down on post-it notes what it is that each of us loves about Plone.  We passed our notes up to the front of the room, where the notes were clustered by a handful of helpers.

Martin during the open space

After a few minutes, Martin reported that the #1 "most valuable thing" about Plone was Community, followed by Features, Security, Technology, and Making a Living.  He asked us to keep those important things in mind for the rest of the morning, and whatever we proposed or decided to do should protect and not threaten those most valuable aspects of Plone.

This was a brilliant opening, because it focused our thinking on constructive ways to improve Plone.

Martin then asked us "What are the top one or two items that need to be improved?" and had us pass our post-it notes up to the front again. After a few minutes' wait, the results came back:

  • APIs, hide complexity
  • Integrator usability (TTW, deployments, training)
  • End user usability
  • Code improvements (simplifying code base)
  • Strategy (roadmap, communications, marketing, funding, growth)


The discussion then centred around three issues: improving the front end of Plone, reimplementing the back end of Plone, and reducing the learning hurdles for new Plone developers.

On the front end, the proposal that seemed to get the most traction was to go with a pure JavaScript user interface that would render entirely within the browser and would give us the flexibility to reimagine how to create a beautiful and modern user experience. This would require that the Plone back end serve, essentially, just JSON data.

On the back end, there was much discussion about throwing out Zope 2 and CMF and replacing them with something like Pyramid or Substance D. Attendees went back and forth on the pros and cons, given the stability of the existing stack, and tried to gauge how much work it would take to rebuild functionality we take for granted using one of these newer frameworks.

On the topic of new Plone developers, keeping in mind that all complex enterprise software requires a significant amount of learning on the part of new developers, we agreed that plone.api was good and that by continuing to add to it and improve it we would ensure that new developers had one canonical way of interacting with Plone.  Adding a JSON layer on top of plone.api would make it possible for Plone to serve JSON data to a pure JavaScript front end. (Whenever someone says "JavaScript" I keep expecting Rok to appear).


Having a feature-complete plone.api in the middle of our stack solves three problems (maybe more) in one fell swoop:

  • those pushing for a rewrite of the back end could do so without affecting add-on developers, integrators, and front end developers
  • those pushing for a rewrite of the front end could do so without affecting add-on developers, integrators, or back end developers
  • new developers (as well as, ahem, existing ones) would have a much easier time working with the Plone code base

But that wasn't the truly mind-blowing thing. We had gathered in Bristol, in that particular open space, worried about potentially radical things that were being proposed, or that might be proposed, to address strategic issues facing Plone. At that critical juncture, we instead found that this broad representation of the Plone community had come together on a way forward that everyone present in the room agreed with, and that didn't rule out potentially big changes in the code base and in the user experience.  By proceeding first with a major push to flesh out plone.api, we would make it possible for the ambitious among us (some might say "crazy") to revamp the back end, or the front end, or both, while keeping all other developers, integrators, and end users along for the ride.  For the proverbial truck barreling down the highway, it would be possible to swap out the tires, wheels, engine, and cabin without slowing it down.
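
To make the "stable API in the middle" idea concrete, here is a toy sketch in plain Python. It is not plone.api and none of these class or function names exist in Plone; they are made up purely to illustrate how a fixed middle layer lets the storage behind it change while a JSON-consuming front end sees the same data.

```python
import json


class InMemoryBackend:
    """Hypothetical stand-in for the current storage stack."""

    def __init__(self, items):
        self._items = dict(items)

    def load(self, path):
        return self._items[path]


class RewrittenBackend:
    """Hypothetical stand-in for a rewritten back end with a different layout."""

    def __init__(self, rows):
        # Rows are (path, title, state) tuples instead of dicts.
        self._rows = {path: (title, state) for path, title, state in rows}

    def load(self, path):
        title, state = self._rows[path]
        return {"title": title, "review_state": state}


class ContentAPI:
    """The stable layer in the middle: callers only ever talk to this."""

    def __init__(self, backend):
        self._backend = backend

    def get(self, path):
        data = self._backend.load(path)
        return {
            "path": path,
            "title": data["title"],
            "review_state": data["review_state"],
        }


def to_json(api, path):
    """JSON view of one item, as a JavaScript front end might request it."""
    return json.dumps(api.get(path), sort_keys=True)


old = ContentAPI(InMemoryBackend({"/news": {"title": "News", "review_state": "published"}}))
new = ContentAPI(RewrittenBackend([("/news", "News", "published")]))

# The front end cannot tell which back end answered.
print(to_json(old, "/news") == to_json(new, "/news"))
```

The point of the design is that only `ContentAPI` needs to stay fixed: either side of it can be swapped as long as the middle layer keeps returning the same shape of data.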


Although I mentioned a few people by name (Andreas, Martin, Sally, Rok), there is one person in particular who has worked very hard at raising awareness of these strategic issues and at leading discussions about them: Philip "That's Konferenz with a K" Bauer.  For that, among the many other things he has done for Plone, we salute him!

Philip relaxing

All of you who were in these discussions (whether at Brasilia, Sorrento, Oshkosh, Cologne, Munich, or in the room in Bristol) deserve all our thanks for your committed, constructive participation.  Plone would not be here if it weren't for you.


I would be remiss if I didn't mention Eric Bréhault's rebuttal and Bristol lightning talk in which he explained why the issues we have been seeing within the Plone community are, in fact, common to all CMSs:

Eric's blog post:

Eric's lightning talk (starts 15 minutes in):



Six Feet Up: Stupid ZMI Tricks

by Rob McBroom at 2014-12-11T12:00:00Z

The Zope Management Interface (ZMI) is the web interface for interacting with the Zope framework, which Plone runs on. Here are some of the things I've found useful within the ZMI as a new Plone developer.

When do you use the ZMI?

The short answer is "when you can't do something in Site Setup", but there's more to it than that. For instance, it's a good place to look up the names and locations of things when writing code.

How do you access the ZMI?

To access the ZMI, you can simply add /manage to the end of any URL in Plone (though you’ll typically want to do it at the top of the site). Alternatively, you can go into Site Setup and select Zope Management Interface from the Plone Configuration section.


GenericSetup Changes

If you want to automate something using GenericSetup and aren't sure exactly what to create, or where, you can often find out this way:

  1. Go to portal_setup and click the Snapshots tab.
  2. Create a snapshot.
  3. Make the desired changes using the Plone interface.
  4. Return to portal_setup and create another snapshot.
  5. While still in portal_setup, click the Comparison tab.
  6. Select the two snapshots and click Compare.

This will hopefully show the path to the XML files you need to create or modify in your policy package, as well as the changes you need to make there.
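
The comparison itself happens inside portal_setup, but the underlying idea is an ordinary text diff between two exports of the same file. The following is a minimal sketch of that idea using Python's standard difflib, not the code GenericSetup runs; the file name and XML content are invented for illustration.

```python
import difflib


def compare_snapshots(before, after, name):
    """Return unified-diff lines between two versions of one exported file."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before/" + name,
            tofile="after/" + name,
            lineterm="",
        )
    )


# Hypothetical export content, before and after changing a setting in Plone.
before = """<object name="site">
 <property name="title">Old title</property>
</object>"""

after = """<object name="site">
 <property name="title">New title</property>
</object>"""

diff = compare_snapshots(before, after, "example.xml")
for line in diff:
    print(line)
```

Lines starting with `-` and `+` point at what changed, which is the part you would carry over into the corresponding XML file of your policy package.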

Explore the Catalog

Under the portal_catalog, the Catalog tab will allow you to explore containers and content objects in your site. This is a good way to discover what properties are available, or to confirm that something exists.

The Indexes tab is useful when you're attempting to create custom collections. You can browse the indexes to see if criteria are being assigned to objects as expected.


Exploring Types

The portal_types section is very helpful for learning about the built-in types, and probably even more useful when creating your own types in code.

portal types screenshot

Undoing Changes

In many areas of the ZMI, you'll see an "Undo" tab at the top of the page. This allows you to back out if a change ends up causing a problem. It also lets you see who made configuration changes and when.

ZMI Undo Tab

Repeating Import Steps

If you go to the "Import" tab under portal_setup, you'll be able to repeat some things that typically only happen when a package is loaded for the first time. Select a profile from the drop-down and you'll be presented with a list of available import steps.

Select the steps you want to repeat, or click "Import all steps" to go through the entire profile. This can be really handy when you're working on a new package and making frequent changes.

import tab screenshot
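
The same thing can be scripted. The sketch below assumes you already have the portal_setup tool in hand and uses its `runImportStepFromProfile` and `runAllImportStepsFromProfile` methods; the profile name `my.package:default` is hypothetical, and the recording stub exists only so the example runs outside Plone.

```python
def rerun_import_steps(setup_tool, profile, step_ids=None):
    """Re-run selected import steps of a profile, or all of them.

    setup_tool is expected to behave like Plone's portal_setup tool.
    profile is a name such as "my.package:default" (hypothetical here).
    """
    profile_id = "profile-" + profile  # GenericSetup prefixes profile ids
    if step_ids is None:
        setup_tool.runAllImportStepsFromProfile(profile_id)
        return
    for step_id in step_ids:
        setup_tool.runImportStepFromProfile(profile_id, step_id)


class RecordingSetupTool:
    """Stub that records calls; a stand-in for portal_setup in this sketch."""

    def __init__(self):
        self.calls = []

    def runImportStepFromProfile(self, profile_id, step_id):
        self.calls.append((profile_id, step_id))

    def runAllImportStepsFromProfile(self, profile_id):
        self.calls.append((profile_id, "<all steps>"))


tool = RecordingSetupTool()
rerun_import_steps(tool, "my.package:default", ["typeinfo", "workflow"])
rerun_import_steps(tool, "my.package:default")
print(tool.calls)
```

In a real site you would pass the actual tool instead of the stub, for example the object found at portal_setup in the site root.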

More Information

This article offers basic information and tips for the Zope Management Interface in Plone. For more detailed information and documentation visit


eGenix: eGenix mxODBC Zope DA 2.2.0 GA



The eGenix mxODBC Zope DA allows you to easily connect your Plone CMS or Zope installation to just about any database backend on the market today, giving you the reliability of the commercially supported eGenix product mxODBC and the flexibility of the ODBC standard as middle-tier architecture.

The mxODBC Zope Database Adapter is highly portable, just like Zope itself and provides a high performance interface to all your ODBC data sources, using a single well-supported interface on Windows, Linux, Mac OS X, FreeBSD and other platforms.

This makes it ideal for deployment in ZEO Clusters and Zope hosting environments where stability and high performance are a top priority, establishing an excellent basis and scalable solution for your Plone CMS.



We are pleased to announce our new version 2.2.0 of the mxODBC Zope/Plone Database Adapter product.

In this release, we have upgraded the adapter to mxODBC 3.3.1 and added compatibility with the latest Plone releases and ODBC drivers/managers.

Feature Enhancements:

  • Compatible with Plone 4.0 - 4.3.
  • Compatible with the upcoming Plone 5.0.

Enhanced Support for Stored Procedures

  • Added documentation on how to call stored procedures from Plone / Zope.
  • Added support for input, output and input/output parameters to the .callproc() method for calling stored procedures.
  • Added documentation on how to use External Methods to access and use the mxODBC Zope DA connection objects.
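
The announcement does not show what such a call looks like. As a rough sketch only, assuming the adapter's cursors follow the Python DB-API 2.0 (PEP 249) convention, `callproc` returns a copy of the parameter sequence in which output and input/output values have been filled in. The procedure name `add_surcharge` and the `FakeCursor` class below are invented so the example runs without a database; how to obtain a real cursor from the Zope DA connection object is described in the product documentation and is not shown here.

```python
def call_add_surcharge(cursor, amount, percent):
    """Call a hypothetical procedure add_surcharge(IN amount, IN percent, OUT total).

    Follows the DB-API 2.0 convention: the returned sequence mirrors the
    input parameters, with the OUT slot replaced by the computed value.
    """
    placeholder = None  # value passed in the OUT position is ignored
    result = cursor.callproc("add_surcharge", (amount, percent, placeholder))
    return result[2]


class FakeCursor:
    """Stand-in for a database cursor, used only to make the sketch runnable."""

    def callproc(self, procname, parameters=()):
        if procname != "add_surcharge":
            raise ValueError("unknown procedure: " + procname)
        amount, percent, _ = parameters
        total = amount + amount * percent // 100
        # Input parameters come back unchanged, the output slot is filled in.
        return (amount, percent, total)


total = call_add_surcharge(FakeCursor(), 100, 20)
print(total)
```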

Fast Cursor Types

  • Switched to forward-only cursor types for all database backends, since this provides much better performance for MS SQL Server and IBM DB2 drivers.

Updated mxODBC API

Easier Installation

For the full list of features, please see the mxODBC Zope DA feature list.

Driver Compatibility Enhancements:

  • ODBC driver compatibility updated. By upgrading to the latest mxODBC 3.3 release, we are bringing all compatibility enhancements added to mxODBC 3.3 to the mxODBC Zope DA. This includes updated support for Oracle, MS SQL Server, Sybase ASE, IBM DB2, PostgreSQL and MySQL. See the mxODBC 3.3.0 and 3.3.1 release announcements for full details.
  • ODBC manager compatibility updated. Built against unixODBC 2.3.2, iODBC 3.52.8, DataDirect 7.1.2 on Unix. Built against the MS Windows ODBC Manager on Windows. Built against iODBC 3.52.8 on Mac OS X.

The complete list of changes is available on the mxODBC Zope DA changelog page.


Users are encouraged to upgrade to this latest mxODBC Plone/Zope Database Adapter release to benefit from the new features and updated ODBC driver support. We have taken special care not to introduce backwards incompatible changes, making the upgrade experience as smooth as possible.

For major and minor upgrade purchases, we will give out 20% discount coupons going from mxODBC Zope DA 1.x to 2.2 and 50% coupons for upgrades from mxODBC 2.x to 2.2. After upgrade, use of the original license from which you upgraded is no longer permitted. Patch level upgrades (e.g. 2.2.0 to 2.2.1) are always free of charge.

Please contact the Sales Team with your existing license serials for details on obtaining an upgrade discount coupon.

If you want to try the new release before purchase, you can request 30-day evaluation licenses by visiting our web-site or by writing to us, stating your name (or the name of the company) and the number of eval licenses that you need.


Please visit the eGenix mxODBC Zope DA product page for downloads, instructions on installation and documentation of the packages.

If you want to try the package, please jump straight to the download instructions.

Fully functional evaluation licenses for the mxODBC Zope DA are available free of charge.


Commercial support for this product is available directly from

Please see the support section of our website for details.

More Information

For more information on eGenix mxODBC Zope DA, licensing and download instructions, please write to

Enjoy!

Marc-Andre Lemburg,

December 09, 2014

Andreas Jung: Announcing the "XML-Director" XML CMS project


XML-Director will be a new-generation XML content management system based on the Plone 5 CMS with either eXist-db or BaseX as backend. Additional components will provide DOCX to XML and XML/HTML to PDF/EPUB conversion, as well as support for desktop and web-based XML editors.