Yeah! Shogun this week got accepted to be an organisation participating in the 10th Google Summer of Code. This year, besides mentoring a few projects, I am one of the three project administrators. I am curious how this will be. One first thing to do was to write the application for Shogun – I’m glad it worked! I also will spend a little more time organising things. Apart from trying to find mentors (which requires a lot of talking people into it), I also want to make Shogun (and the students) having more from the program. Last year, I pushed the team to ask all students
- to write a project report in the form of IPython notebooks (link). These are absolutely great for talking about the GSoC work, impressing people, and having a final piece of work to show for the students.
- To fully unit-test every module of their algorithm/framework. This is absolutely essential in order to not loose the student’s work a few years later when a re-factoring change breaks their code and nobody knows how to fix it. Those tests already saved lots of life since last year.
- To peer-review each other in pairs of students. This improved documentation here and there and solved some bugs. I want to emphasise this more this year as I think it is a great way of enabling synergistic effects between students.
In addition, we will again screen all the applicants via a set of entrance tasks on our github page (link). I just wrote a large number of such smaller or larger tasks that get students started on a particular project, fix bugs in Shogun, or prepare some larger change. In order to get the students started a bit more easily (contributing to Shogun these days is a non-trivial task), I wrote a little how-to (link) that is supposed to point out our expectations, and what are the first steps towards participating in GSoC.
Finally, I wrote descriptions for quite a few possible projects, some of them with a number of interesting co-mentors. The full list is here (link). If you are a talented student interested in any of those topics, consider working with us during the summer. It’s usually very fun!
- Variational Learning for Recommendation with Big Data. With Emtiyaz Khan, who I met at last year’s workshop for latent Gaussian models. Matrix factorisation and Gaussian Processes, ultra-cool project.
- Generic Framework for Markov Chain Monte Carlo Algorithms and Stan Interface. With Theo Papamarkou, who I know from my time at UCL Statistics. It’s about a modular representation of MCMC within Shogun and a possible interface to STAN for the actual sampling. This would be a major step of Shogun towards probabilistic models.
- Testing and Measuring Variable Interactions With Kernels. With Dino, who is post-doc at Gatsby and co-author of our optimal kernel for MMD paper. This project is to implement all kernel based interaction measures in Shogun in a unified way. We’ll probably use this for research later.
- A Meta-Language for Shogun examples. With Sören. Write example once, press button to generate in any modular language binding. This would be so useful to have in Shogun!
- Lobbying Shogun in MLPACK’s automatic benchmarking system. Joint project with Ryan from MLPACK. He already can compare speed of different toolboxes. Now let’s compare results.
- Shogun Missionary & Shogun in Education. With Sören. Write high quality notebooks and eye-candy examples. Very different project as this is about creative technical writing and illustrating methods on cool data rather than hacking new algorithms. I would be very excited if this happened!
Some of the other projects involve cool buzzwords such as Deep Learning, Structured Output, Kernel, Dual solvers, Cluster backends, etc. Join us! 🙂