Job opportunity with Shogun and the ATI

We, the community around the Shogun [1] open-source machine learning library, are looking for a developer for a paid six-month pilot project (October 2018 – March 2019) on improving meta-learning capabilities (openml, coreml). The ideal candidate is a highly motivated MSc/PhD/postdoc with the desire to get involved in the open-source movement, who

  • is based in London
  • is able to start working in October
  • is flexible enough to spend full-time or at least 50% on the project
  • has a background in designing software in C++ (gcc, valgrind, C++11, etc)
  • (optional) has knowledge of openml [2] and coreml [3]
  • (optional) has experience with build management and dev-ops tools (git, cmake, travis, buildbot, linux, docker, etc)
  • (optional) has experience in computational sciences (ml, stats, etc)
  • (optional) has contributed to open-source before

The project is funded by the Alan Turing Institute, and at least part of the work will be located there. You will be supervised by the Shogun core development team, partly in person and partly remotely. This is a great opportunity to get involved in one of the oldest ML libraries out there, get your hands dirty on a huge code-base, and tap into the open-source community.

The project is currently in the planning stage. After a successful pilot, there is the option for an extension. If you are interested, please get in touch via the developers or the mailing list, or, even better, read how to get involved [4] and send us a pull request for an entrance task [5] on GitHub. See our website for contact details.

Shogun is a library aiming to offer unified and efficient machine learning methods. Its core is written in C++ and it interfaces to a large number of modern computing languages. The Shogun community is vibrant, diverse, and international. Shogun is a fiscally sponsored project of NumFOCUS, a nonprofit dedicated to supporting the open source scientific computing community.


DS3 summer school in Paris

I had the pleasure of running a practical (actually two) on “representing and comparing probabilities using kernels” at the DS3 summer school at the Polytechnique in Paris, following a lecture by Arthur Gretton. Thanks to Zoltan Szabo for organising the session.

We covered the implementation basics of two-sample testing, independence testing, and goodness-of-fit testing, with examples including testing the quality of GAN samples, detecting dependence across translated documents, and more. I even managed to sneak Shogun into the practical 😉 Good fun overall!
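
For readers who did not attend, here is a rough idea of what the two-sample testing part looks like in code. This is a minimal NumPy sketch of a kernel two-sample test based on the (biased) maximum mean discrepancy with a permutation null. It is not taken from the practical notebook and does not use Shogun; the function names, the Gaussian kernel, the fixed bandwidth, and the number of permutations are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd2_biased(k, n):
    """Biased MMD^2 estimate from the kernel matrix of the pooled sample.

    The first n rows/columns of k belong to the first sample,
    the remaining ones to the second sample.
    """
    k_xx = k[:n, :n]
    k_yy = k[n:, n:]
    k_xy = k[:n, n:]
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


def mmd_permutation_test(x, y, bandwidth=1.0, num_permutations=200, seed=0):
    """Return the MMD^2 statistic and a permutation p-value."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = len(x)

    # Compute the pooled kernel matrix once; permuting sample labels
    # then amounts to permuting its rows and columns jointly.
    k = gaussian_kernel(pooled, pooled, bandwidth)
    observed = mmd2_biased(k, n)

    null_samples = np.empty(num_permutations)
    for i in range(num_permutations):
        perm = rng.permutation(len(pooled))
        null_samples[i] = mmd2_biased(k[np.ix_(perm, perm)], n)

    # Add-one correction so that the p-value is never exactly zero.
    p_value = (1 + np.sum(null_samples >= observed)) / (1 + num_permutations)
    return observed, p_value
```

Calling `mmd_permutation_test(x, y)` on two arrays of shape `(n, d)` and `(m, d)` returns the statistic and a p-value; a small p-value suggests the samples come from different distributions. In practice the bandwidth matters a lot, and a fixed value of 1.0 is only a placeholder.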

Slides 1, slides 2, and practical session notebook.

Deep Self-Organization: Interpretable Discrete Representation Learning on Time Series

I got mildly involved in a cool project with the ETHZ group, led by Vincent Fortuin and Matthias Hüser, along with Francesco Locatello, myself, and Gunnar Rätsch. The work is about building a variational autoencoder with a discrete (and thus interpretable) latent space that admits topological neighbourhood structure by using a self-organising map. To represent latent dynamics (the lab is interested in time series modelling), there is also a built-in Markov transition model. We just put a version on arXiv.
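
To give a feel for the ingredients, here is a toy NumPy sketch of the three pieces mentioned above: snapping continuous encodings to the nearest node of a 2D grid of embeddings, looking up a node's grid neighbours, and summarising a sequence of discrete states with a Markov transition matrix. This is not the code or the training procedure from the paper; all function names are hypothetical, there is no encoder, decoder, or gradient-based learning here, and the count-based transition estimate is a simple stand-in for a learned transition model.

```python
import numpy as np


def assign_to_som(z_e, embeddings):
    """Map each encoding to the grid coordinates of its nearest embedding.

    z_e:        array of shape (T, d), e.g. encoder outputs over time
    embeddings: array of shape (H, W, d), one embedding per grid node
    Returns an integer array of shape (T, 2) with (row, col) per time step.
    """
    h, w, d = embeddings.shape
    flat = embeddings.reshape(h * w, d)
    sq_dists = ((z_e[:, None, :] - flat[None, :, :]) ** 2).sum(axis=-1)
    nearest = sq_dists.argmin(axis=1)
    return np.stack(np.unravel_index(nearest, (h, w)), axis=1)


def grid_neighbours(row, col, h, w):
    """Up/down/left/right neighbours of a node that lie inside the grid."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < h and 0 <= c < w]


def transition_matrix(states, num_states, smoothing=1.0):
    """Count-based estimate of Markov transition probabilities.

    states is a sequence of integer state indices in [0, num_states).
    The additive smoothing keeps rows of unvisited states well defined.
    """
    counts = np.full((num_states, num_states), float(smoothing))
    for current, following in zip(states[:-1], states[1:]):
        counts[current, following] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)
```

Grid coordinates can be flattened to state indices with `row * W + col` before being passed to `transition_matrix`. The neighbourhood lookup is what gives the discrete states their topology: in a self-organising map, updates to one node are shared with its neighbours, so nearby nodes end up representing similar inputs.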