Code quality flows from good tools

Delivering high quality code stands on two pillars: the developer's wisdom to write code well, and tools to inform and guide the developer towards better practice. Developers are clever, and will make poor tools work, but the benefits of great tools go beyond making the developers' lives easier, and actively promote higher quality code.

Here are my picks for sharp tools that improve not just developer productivity but code quality.

Version Control Hosting

Going beyond just the benefits of keeping code in version control, tools like Rhodecode or Gitorious (both self-hostable) or Github or Bitbucket (SaaS) allow developers to create new repositories so that unwieldy projects can be split, or new tools and supporting apps can be kept disentangled from the existing code.

You really don't want developers to be bound by the architectural decisions made long ago and codified in pre-created repositories that are hard to get changed.

Code Review

The best code review tools let you show uncommitted changes to other developers, provide high-quality diffs that make it easy to read and understand the impact of a change, and let the other developers give detailed feedback on multiple sections of code. With this feedback developers can rapidly turn patches around and resubmit until they are perfect. Pre-commit review means that the committed history doesn't contain unnecessary clutter; each commit will do exactly one thing in as good code as can be achieved.

Code review can catch issues such as potential error cases, security weaknesses, duplicated code, or missing tests or documentation. However, the benefits of code review go far beyond the direct ability to catch problem code. Once working with code review, developers learn that to get their code through review they should adapt their coding style to be clearer and more legible, and pre-empt the criticisms that will be levelled by the reviewers. Code review also facilitates mutual learning: more people pay more attention to the new features that go into the codebase, and so understand the codebase better; inexperienced developers also get guidance from the more experienced developers about how their code could be improved.

Some hosted version control systems (eg. Github) have code review built in, or there are self-hosted tools such as ReviewBoard or SaaS tools like Atlassian Crucible.

Linters/Code Style checkers

The earliest time you can get feedback about code quality to developers is when the code is being edited. (If you're not a Pythonista, you'll have to translate this to your own language of choice.)

Linters like Pyflakes can be run in the editor to highlight potential problems, while style checkers like pep8 highlight coding style violations. Many IDEs will ship with something like this, but if yours doesn't then plugins are usually available.

Pyflakes is good at spotting undeclared and unused variables, and produces relatively few false positives; on the occasions I've tried PyLint I found it pedantic and plain wrong whenever anything vaguely magical happens. You can tailor it back with some configuration but in my opinion it's not worth it. pep8 is valuable and worth adopting, even if your own coding style is different (though don't change if your project already has a consistent style). The style promoted by pep8 is pleasantly spaced and comfortable to read, and offers a common standard across the Python community. I've found even the controversial 80-column line length limit useful - long lines are less readable, both when coding and when viewing side-by-side diffs in code review or three-way diff tools.

You might also consider docstring coverage checkers (though I've not seen one integrated with an editor yet). I find docstrings invaluable for commenting the intention that the developer had when they wrote the code, so that if you're debugging some strange issue later you can identify bits of code that don't do what the developer thought they did.

With Python's ast module it isn't hard to write a checker for the kind of bad practice that comes up in your own project.
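For example, here's a minimal ast-based checker that flags bare `except:` clauses. This is a hypothetical sketch, not a tool from any real project:

```python
import ast


class BareExceptChecker(ast.NodeVisitor):
    """Record the line number of every 'except:' with no exception type."""

    def __init__(self):
        self.problems = []

    def visit_ExceptHandler(self, node):
        if node.type is None:
            self.problems.append(node.lineno)
        self.generic_visit(node)


def check(source):
    """Return the line numbers of bare except clauses in the source."""
    checker = BareExceptChecker()
    checker.visit(ast.parse(source))
    return checker.problems
```

The same pattern works for any project-specific rule: subclass `ast.NodeVisitor`, override the `visit_*` method for the node type you care about, and collect line numbers.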

Test Fixture Injection

Test code has a tendency to sprawl, with some developers happy to copy-and-paste code into dozens of test cases, suites and modules. A big test codebase becomes slow, unloved and hard to maintain. Of course, you can criticise these practices in code review, but it's an uphill challenge unless you can provide really good alternatives.

The kind of test fixtures your application will need will of course depend on your problem domain, but regardless of your requirements it's worth considering how developers can create the data their tests will depend on easily and concisely - without code duplication.
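As a sketch of the idea, a fixture factory can give each record sensible defaults that individual tests override, so each test states only what it actually cares about. The field names here are purely illustrative:

```python
def make_user(**overrides):
    """Build a user record with sensible defaults, overridable per test.

    (Hypothetical helper; the field names are illustrative only.)
    """
    user = {
        'name': 'alice',
        'karma': 0,
        'active': True,
    }
    user.update(overrides)
    return user


# Each test then says only what matters to it:
admin = make_user(name='root', karma=100)
newbie = make_user()
```

Because the defaults live in one place, adding a field to the schema means changing one function rather than dozens of copy-pasted test setups.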

There are off-the-shelf fixture creation frameworks like factory_boy, which focuses on populating ORM fixtures, and integrated frameworks like Django have test fixture management tools.

However where these are not appropriate, it can be valuable to write the tools you need to make succinct, easily maintained tests. In our project we populate our object database using test objects loaded from YAML. You could also do this for in-memory objects if the code required to create them is more complicated or slower than just describing the state they will have when created.

Another approach also in use in our project is to create a DSL that allows custom objects to be created succinctly. A core type in our project is an in-memory tabular structure. Creating and populating these requires a few lines of code, but for tests where tables are created statically rather than procedurally we construct them by parsing a triple-quoted string of the form:

| user | (int) karma | description |
| dave | 5           | hoopy frood |
| bob  | 0           | None        |
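A parser for this kind of table DSL needn't be complicated. The sketch below is hypothetical (our real implementation differs); it converts such a triple-quoted string into a list of dicts, honouring the `(int)` type annotation and the `None` literal:

```python
def parse_table(text):
    """Parse a pipe-delimited table into a list of dicts.

    Column headers may carry a type annotation like '(int) karma';
    the literal cell value 'None' becomes Python None.
    (Hypothetical sketch of the DSL described above.)
    """
    converters = {'int': int, 'float': float}
    lines = text.strip().splitlines()
    rows = [[cell.strip() for cell in line.strip().strip('|').split('|')]
            for line in lines]
    header, data = rows[0], rows[1:]

    # Resolve each column to a (name, converter) pair.
    columns = []
    for name in header:
        if name.startswith('('):
            type_name, name = name[1:].split(') ')
            columns.append((name, converters[type_name]))
        else:
            columns.append((name, str))

    result = []
    for row in data:
        record = {}
        for (name, convert), cell in zip(columns, row):
            record[name] = None if cell == 'None' else convert(cell)
        result.append(record)
    return result
```

A test can then build a table in three readable lines instead of a dozen constructor calls.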

This kind of approach has not only simplified our tests but has made them faster, more comprehensive, and more stable.

What tools are most effective at promoting code quality in your projects? Leave a comment below.


Pyweek 18 announced

Pyweek 18 was announced last week, to run from the 11th May to 18th May 2014, midnight to midnight (UTC).

Pyweek is a bi-annual games programming contest in which teams or individuals compete to develop a game, in Python, from scratch, in exactly one week, on a theme that is selected by vote and announced at the moment the contest starts.

The contest offers the opportunity to program alongside other Python programmers on a level playing field, with teams diarising their progress via the site, as well as chatting on IRC (#pyweek on Freenode).

Games are scored by other entrants, on criteria of fun, production and innovation, and it's a hectic juggling act to achieve all three in the limited time available.

It's open to all, and very beginner friendly. You don't need a team, you don't need finely honed artistic ability, and you don't need to set aside the whole week - winning games have been created in less than a day. I'd encourage you to take part: it's a great opportunity to explore your creative potential and learn something new.

Browse (and play) the previous entries at the site.

Pyweek 18 kicks off with the theme voting starting at 2014-05-04 00:00 UTC.


Python imports

Though I've been using Python for 10 years I still occasionally trip over the magic of the import statement. Or rather the fact that it is completely unmagical.

The statement

import lemon.sherbet

does a few simple things, effectively:

  1. Unless it's already imported, creates a module object for lemon and evaluates lemon/ in the namespace of that module object.
  2. Unless it's already imported, creates a module object for sherbet, evaluates lemon/ in the namespace of that module object, and assigns the sherbet module to the name sherbet in lemon.
  3. Assigns the lemon module to the name lemon in __main__.

(Obviously, I'm omitting a lot of the details, such as path resolution, sys.modules or import hooks).
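The fact that the statement binds the top-level name, while the machinery returns the leaf module, can be seen with the standard library's json package:

```python
import importlib
import sys

# importlib.import_module returns the leaf module...
decoder = importlib.import_module('json.decoder')
assert decoder is sys.modules['json.decoder']

# ...but the *statement* 'import json.decoder' binds the top-level
# package name in the current namespace, not the leaf:
import json.decoder
assert json is sys.modules['json']

# The leaf module was assigned as an attribute of its parent package:
assert json.decoder is decoder
```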

This basic mechanism has some strange quirks. Suppose the full source tree contains:

├── curd_machine.py
└── lemon
    ├── __init__.py
    ├── curd.py
    └── sherbet.py

And curd_machine.py contains

import lemon.curd

At first glance, I find it odd that this code works:

import curd_machine
import lemon.sherbet

  1. I can access lemon, but I didn't explicitly import it. Of course, this happens because the import lemon.sherbet line ultimately puts the lemon module into my current namespace.
  2. I can also access lemon.curd without explicitly importing it. This is simply because the module structure is stateful. Something else assigned the lemon.curd module to the name curd in the lemon module. I've imported lemon, so I can access lemon.curd.
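The second quirk is easy to demonstrate with the standard library's logging package:

```python
import sys
import logging

# Immediately after 'import logging', the submodule may not be loaded,
# so attribute access can fail (unless something else imported it first):
loaded_before = hasattr(logging, 'handlers')

import logging.handlers

# Now the attribute exists, because importing the submodule assigned it
# to the name 'handlers' on the logging package object:
assert logging.handlers is sys.modules['logging.handlers']

# Any other module that merely did 'import logging' now sees
# logging.handlers too - the package object is shared, stateful data.
assert sys.modules['logging'].handlers is logging.handlers
```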

I'm inclined to the view that relying on either of these quirks would be relatively bad practice, resulting in more fragile code, so it's useful to be aware of them.

The former of these quirks also affects Pyflakes. Pyflakes highlights in my IDE variables that I haven't declared. But it fails to spot obvious mistakes like this:

import lemon.sherbet
print(lemon.soda)

which when run will produce an error:

AttributeError: 'module' object has no attribute 'soda'

There's still nothing mysterious about this; Pyflakes only sees that lemon is defined, and has no idea whether lemon.soda is a thing.

I think the reason that this breaks in my mind is a leaky abstraction in my working mental models. I tend to think of the source tree as a static tree of declarative code, parts of which I can map into the current namespace to use. It isn't that, though; it is an in-memory structure being built lazily. And it isn't mapped into a namespace: the namespace just gets the top-level names, and my code traverses through the structure.

Maybe I formed my mental models long ago when I used to program more Java, where the import statement does work rather more like I've described. I wonder if people with no experience of Java are less inclined to think of it like I do?


CRC Cards

A lot of the software I've written has never been through any formal design process. Especially with Python, because of the power of the language to let me quickly adapt and evolve a program, I have often simply jumped in to writing code without thinking holistically about the architecture of what I'm writing. My expectation is that a good architecture will emerge, at least for the parts where it matters.

This approach may work well if you are programming alone, but it is hampered if you are practising (unit) test-driven development, or are working in a team. Unit tests disincentivise refactoring components, or at least slow the process down. I would point out that if unit tests are resolutely hard to write then your code may be badly architected.

Working as a team reduces your ability to have perfect knowledge of all components of the system, which would be required to spot useful refactorings.

In practice I've found that if we don't do any up-front design, we won't ever end up writing great software: some bits will be good, other bits will be driven by expedience and stink, and won't get refactored, and will be a blight on the project for longer than anyone expected.

Class-responsibility-collaboration (CRC) Cards are a lightweight technique for collaboratively designing a software system, which I've used a few times over the past couple of years and which seems to produce good results.

The technique is simple: get the team in a room, write down suggested classes in a system on index cards on a table, then iterate and adapt the cards until the design looks "good". Each card is titled with the name of the class, a list of the responsibilities of the class, and a list of the other classes with which the class will collaborate. The cards can be laid out so as to convey structure, and perhaps differently coloured cards might have different semantics.

One of the original CRC cards drawn by Ward Cunningham.

CRC cards are founded on object-oriented principles, and I don't want our code to be unnecessarily objecty, so I'm quick to point out that not every card will correspond to a Python class. A card may also correspond to a function, a module, or an implied schema for some Python datastructure (eg. a contract on what keys will be present in a dict). I think of them as Component-responsibility-collaboration cards. The rules are deliberately loose. For example, there's no definition of what is "good" or how to run the session.

Running a CRC design session is perhaps the key art, and one that I can't claim to have mastered. Alistair Cockburn suggests considering specific scenarios to evaluate a design. In CRC sessions I've done, I've tried to get the existing domain knowledge written down at the start of the session. If there's an existing architecture, write that down first: that's an excellent way to start, because then you just need to refactor and extend it. You could also write down fixed points that you can't or don't want to change right now, perhaps on differently coloured cards.

It does seem to be difficult to get everyone involved in working on the cards. Your first CRC session might see people struggling to understand the "rules", much less contribute. Though it harks back to the kind of textbook OO design problems that you encounter in early university courses, even experienced developers may be rusty at formal software design. However, once you get people contributing, CRC allows the more experienced software engineers to mentor the less experienced team members by sharing the kind of rationale they are implicitly using when they write software a certain way.

I think you probably need to be methodical about working through the design, and open about your gut reactions to certain bits of design. Software architecture involves a lot of mental pattern matching as you compare the design on the table to things that have worked well (or not) in the past, so it can be difficult to justify why you think a particular design smells. So speak your mind and suggest alternatives that somehow seem cleaner.

The outcome of a CRC design session is a stack of index cards that represent things to build. With the design fixed, the building of these components seems to be easier. Everyone involved in the session is clear on what the design is, and a summary of the spec is on the card so less refactoring is needed.

I've also found the components are easier to test, because indirection/abstraction gets added in the CRC session that you might not add if you were directly programming your way towards a solution. For example, during design someone might say "We could make this feature a new class, and allow for alternative implementations". These suggestions are added for the elegance and extensibility of the design, but this naturally offers easier mock dependency injection (superior to mock.patch() calls any day).

CRC cards seem to be a cheap way to improve the quality of our software. Several weeks' work might be covered in an hour's session. We've not used CRC as often as we could have, but where we have I'm pleased with the results: our software is cleaner, and working on cleaner software makes me a happier programmer.


2013 In Review

I'd like to close 2013 with a retrospective of the year and some thoughts on what I'd like to achieve in 2014.


Vertu

In March 2013 I decided to leave my contract at luxury phone manufacturer Vertu and take up a contract at Bank of America Merrill Lynch. The two years I spent at Vertu spanned the period where they separated from Nokia and were sold. As part of this separation I was involved in putting in place contemporary devops practices, datacentres, development tools and CI, and leading a team to build exclusive web apps and web services. We got to play with cool new technologies and turn them to our advantage, to deliver, fast.

For example, I spent January and February developing a new version of Vertu's lifestyle magazine Vertu Life using Django. Using ElasticSearch instead of Django's ORM was a great choice: I was not only able to build strong search features but get more value out of the content by adding "More like this" suggestions in many pages. Though Vertu Life is just a magazine, the site allows some personalisation. All such writes went to Redis, so the site was blazingly fast.

Bank of America Merrill Lynch

Joining Bank of America meant moving from Reading to London, and I handed over duties as the convenor of the Reading Python Dojo to Mark East (who has since also joined Bank of America, coincidentally).

Bank of America's big Python project Quartz is a Platform-as-a-Service for writing desktop banking apps and server-side batch jobs, and I joined a team maintaining some of the Quartz reconciliation technology components. Quartz is a complex platform with a lot of proprietary components, and it all seems very alien to software developers until you start to understand the philosophy behind it better.

This was an interesting project to join because it was a somewhat established application with reams of what everyone likes to call "legacy code". Coming into this, I had to learn a lot about how the code works and how Quartz works before being able to spot ways to improve this.

Banking is also a very technical industry, and this presents challenges around communication between bankers and software engineers like me. Agile adoption is in its infancy at Bank of America, but it has buy-in at the senior management level, which is exciting and challenging.

Quartz is not only a project; it's a large internal community (2000+ developers), so the challenges we face are not just technical but social and political. I've learned that collaboration in a project the size of Quartz requires putting more effort into communication than smaller projects do. The natural tendency is towards siloisation and fragmentation. We have got better about doing things in a way that they could be more easily re-used, then talking and blogging about them.


Devopsdays

There were Devopsdays conferences in London in March and November, and I look forward to more in 2014. As well as talks covering technical approaches to improving software development and operations, and talks on how to improve cross-business collaboration, Devopsdays offers plenty of opportunities to network, to discuss problems you are tackling and share experiences about approaches that have worked and have not.

In March I gave this talk. I also wrote a blogpost about DevopsDays in November.


Europython

Though I'm excited about going to Berlin in 2014, I'm very sorry that Europython 2013 was the last in Florence. Florence is full of beautiful art and architecture, but it is also a place to relax in the sunshine with great food and great company, and talk about interesting things (not least, Python, of course).

After two years of lurking at Europython, this year I was organised enough to offer a talk on Programming physics games with Python and OpenGL. People have told me this was well received, though I think I could do with practice at giving talks :)

After Europython, I took a week driving around Tuscany with my girlfriend. Tuscany is beautiful, both the Sienese hill towns and the Mediterranean beach resorts, and the food and wine is excellent. I recommend it. Though perhaps I wouldn't drive my own car down from London again. Italy is a long way to drive.

Pycon UK

At Pycon UK I gave a talk on "Cooking up high quality software", in full chef's whites and in my best dodgy french accent. Hopefully my audience found this humorous and perhaps a little bit insightful. I was talking exclusively in metaphors - well, puns - but I hope some people took away some messages.

I think if I had to sum up those messages, I was encouraging developers to think beyond just the skills involved in cooking a dish, to the broader picture of how the kitchen is organised and, indeed, everything else that goes on in the restaurant.

Several of the questions were about my assertion that the "perfect dish" requires choosing exactly the right ingredients - and may involve leaving some ingredients out. I was asked if I mean that we should really leave features out. Certainly I do; I think the key to scalable software development is in mitigating complexity and that requires a whole slew of techniques, including leaving features out.

Pycon UK was also notable for the strong education track, which we at Bank of America sponsored, and which invited children and teachers to come in and work alongside developers for mutual education.


PyWeek

PyWeek is a week-long Python games programming contest that I have been entering regularly for the last few years.

This year I entered both the April and the September PyWeek with Arnav Khare, who was a colleague at Vertu.

Our entry in PyWeek 16 (in April) was Warlocks, a simple 2D game with a home-rolled 3D engine and lighting effects. I was pleased with achieving a fully 3D game with contemporary shaders in the week, but we spent too much time on graphical effects and the actual game was very shallow indeed, a simple button-mashing affair where two wizards face each other before hurling a small list of particle-based spells at each other.

I was much happier with our PyWeek 17 entry, Moonbase Apollo, which was a deliberately less ambitious idea. We wanted to add a campaign element to a game that was a cross between Asteroids and Gravitar. A simple space game is easy to write and doesn't require very much artwork. This strategy allowed us to have the bulk of the game mechanics written on day 1, so we had the rest of the week to improve production values and add missions.

We were relatively happy with the scores we got for these but neither was a podium finish :-(


So what will I get up to in 2014?

I'm keen to do more Python 3. Alex Gaynor has blogged about the lack of Python 3 adoption, and I regret that I haven't done much to move towards using Python 3 in my day-to-day coding this year. Bank of America is stuck on Python 2.6. I still feel that Python 3 is the way forward, perhaps now more than ever given that Django runs under Python 3, but I tend to pick Python 2 by default. I did consider opting for Python 3 as our core technology when the decision arose at Vertu, but at that time some of the libraries we really needed were not available on Python 3. So I made the safe choice. I think today, I might choose differently.


Load Balancer Testing with a Honeypot Daemon

This is a write up of a talk I originally gave at DevopsDays London in March 2013. I had a lot of positive comments about it, and people have asked me repeatedly to write it up.


At a previous contract, my client had over the course of a few years outsourced quite a handful of services under many different domains. Our task was to move the previously outsourced services into our own datacentre as both a cost saving exercise and to recover flexibility that had been lost.

In moving all these services around, there evolved a load balancer configuration that consisted of

  • Some hardware load balancers managed by the datacentre provider that mapped ports and also unwrapped SSL for a number of the domains. These were inflexible and couldn't cope with the number of domains and certificates we needed to manage.
  • Puppet-managed software load balancers running
    • Stunnel to unwrap SSL
    • HAProxy as a primary load balancer
    • nginx as a temporary measure for service migration, for example, dark launch

As you can imagine there were a lot of moving parts in this system, and something inevitably broke.

In our case, an innocuous-looking change passed through code review that broke transmission of accurate X-Forwarded-For headers. The access control to some of our services was relaxed for certain IP ranges as transmitted with X-Forwarded-For headers. Only a couple of days after the change went in we found the Googlebot had spidered some of our internal wiki pages! Not good! The lesson is obvious and important: you must write tests of your infrastructure.

Unit testing a load balancer

A load balancer is a network service that forwards incoming requests to one of a number of backend services:


A pattern for unit testing is to substitute mock implementations for all components of a system except the unit under test. We can then verify the outputs for a range of given inputs.


To be able to unit test the Puppet recipes for the load balancers, we need to be able to create "mock" network services on arbitrary IPs and ports that the load balancer will communicate with, and which can respond with enough information for the test to check that the load balancer has forwarded each incoming request to the right host with the right headers included.


The first incarnation of tests was clumsy. It would spin up dozens of network interface aliases with various IPs, put a webservice behind those, then run the tests against the mock webservice. The most serious problem with this approach was that it required slight load balancer configuration changes so that the new services could come up cleanly. It also required tests to run as root to create the interface aliases and bind the low port numbers required. It was also slow. It only mocked the happy path, so tests could hit real services if there were problems with the load balancer configuration.

I spent some time researching whether it would be possible to run these mock network services without significantly altering the network stack of the machine under test. Was there any tooling around using promiscuous mode interfaces, perhaps? I soon discovered libdnet and from there Honeyd, and realised this would do exactly what I needed.

Mock services with honeyd

Honeyd is a service intended to create virtual servers on a network, which can respond to TCP requests etc., for network intrusion detection. It does all this by using promiscuous mode networking and raw sockets, so that it doesn't require changes to the host's real application-level network stack at all. The honeyd literature also pointed me in the direction of combining honeyd with farpd so that the mock servers can respond to ARP requests.

More complicated was that I needed to create scripts to provide the mock TCP services themselves. I needed my mock services to send back HTTP headers, IPs, ports and SSL details so that the test could verify these were as expected. To create a service, Honeyd requires you to write programs that communicate on stdin and stdout as if these were the network socket (this is similar to inetd). While it is easy to write this for HTTP and a generic TCP socket, it's harder for HTTPS, as the SSL libraries will only wrap a single bi-directional file descriptor. I couldn't find a way of treating stdin and stdout as a single file descriptor. I eventually solved this by wrapping one end of a pipe with SSL and proxying the other end of the pipe to stdin and stdout. If anyone knows of a better solution for this, please let me know.
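For illustration, a minimal inetd-style HTTP responder for honeyd might look like the sketch below. This is hypothetical, not the script from the project; the example IP in the comment is illustrative, and the argument names simply mirror the honeyd configuration shown in this post:

```python
#!/usr/bin/env python
"""Minimal inetd-style mock HTTP service for honeyd (sketch).

honeyd runs the script with the connection details as arguments and
wires the TCP socket to stdin/stdout; we echo those details back in
the response body so a test can verify which backend address the
load balancer actually chose.
"""
import sys


def respond(argv, stdin, stdout):
    # honeyd passes pairs such as: --IP --PORT 80
    params = dict(zip(argv[::2], argv[1::2]))
    request_line = stdin.readline().strip()
    stdout.write("HTTP/1.0 200 OK\r\n")
    stdout.write("Content-Type: text/plain\r\n\r\n")
    stdout.write("backend=%s:%s\n" % (params.get('--IP'),
                                      params.get('--PORT')))
    stdout.write("request=%s\n" % request_line)


if __name__ == '__main__':
    respond(sys.argv[1:], sys.stdin, sys.stdout)
```

Because the "socket" is just stdin/stdout, the responder is trivially testable in-process with StringIO, no networking required.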

With these in place, I was able to create a honeyd configuration that reflected our real network:

# All machines we create will have these properties
create base
set base personality "Linux 2.4.18"
set base uptime 1728650
set base maxfds 35
set base default tcp action reset

# Create a standard webserver template
clone webserver base
add webserver tcp port 80 "/usr/share/honeyd/scripts/ --REMOTE_HOST $ipsrc --REMOTE_PORT $sport --PORT $dport --IP $ipdst --SSL no"

# Network definition
bind webserver
bind webserver

This was all coupled with an interface created with the Linux veth network driver (after trying a few other mock networking devices that didn't work). With Debian's ifup hooks, I was able to arrange it so that bringing up this network interface would start honeyd and farpd and configure routes so that the honeynet would be seen in preference to the real network. There is a little subtlety in this, because we needed the real DNS servers to be visible, as the load balancer requires DNS to work. Running ifdown would restore everything to normal.

Writing the tests

The tests were then fairly simple BDD tests against the mocked load balancer, for example:

Feature: Load balance production services

    Scenario Outline: Path-based HTTP backend selection
        Given the load balancer is listening on port 8001
        When I make a request for http://<domain><path> to the loadbalancer
        Then the backend host is <ip>
        And the path requested is <path>
        And the backend request contained a header Host: <domain>
        And the backend request contained a valid X-Forwarded-For header

        Examples:
        | domain             | path      | ip             |
        |      | /         | 42.323.167.197 |
        | | /api/cfe/ |      |
        | | /api/ccg/ |       |

The honeyd backend is flexible and fast. Of course it was all Puppetised, as a single Puppet module that added all the test support; the load balancer recipe was applied unmodified. While I set this up as a virtual network device for use on development load balancer VMs, you could also deploy it on a real network, for example for continuous integration tests or for testing hardware network devices.

As I mentioned in my previous post, having written BDD tests like this it's easier to reason about the system, so the tests don't just catch errors (protecting against losing our vital X-Forwarded-For headers) but give an overview of the load balancer's functions that makes it easier to understand and adapt in a test-first way as services migrate. We were able to make changes faster and more confidently and ultimately complete the migration project swiftly and successfully.


Experiences with BDD

What is BDD?

Behaviour Driven Development (BDD) is a practice where developers collaborate with business stakeholders to develop executable specifications for pieces of development that they are about to start. Test Driven Development (TDD) says that tests should be written before development, but it doesn't say how tests should be written.

BDD builds on TDD by proposing that the first tests should be functional/acceptance tests written in business-oriented language. Using a business-oriented language rather than code allows stakeholders to be involved in verifying that a feature satisfies the business's requirements before work on that feature even commences. You might then do TDD at the unit test level around individual components as you develop.


The tools for BDD have generally come to revolve around Gherkin, a simple structure for natural language specifications.

My favourite description of Gherkin-based tools is given by the Ruby Cucumber website:

  1. Describe behaviour in plain text
  2. Write a step definition
  3. Run Cucumber and watch it fail
  4. Write code to make the step pass
  5. Run Cucumber again and see the step pass

To summarise, a feature might be described in syntax like:

Feature: Fight or flight
    In order to increase the ninja survival rate,
    As a ninja commander
    I want my ninjas to decide whether to take on an
    opponent based on their skill levels

    Scenario: Weaker opponent
        Given the ninja has a third level black-belt
        When attacked by a samurai
        Then the ninja should engage the opponent

You then write code that binds this specification language to a test implementation. Thus the natural language becomes a functional test.
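To illustrate the binding, here is a toy step registry. Real tools such as lettuce or behave provide richer @given/@when/@then decorators; this is only a sketch of the mechanism, and the ninja rule it encodes is invented for the example:

```python
import re

STEPS = []


def step(pattern):
    """Register a function as the implementation of a Gherkin step."""
    def decorator(func):
        STEPS.append((re.compile(pattern + '$'), func))
        return func
    return decorator


def run_step(text, context):
    """Find the step definition matching this line and execute it."""
    for regex, func in STEPS:
        match = regex.match(text)
        if match:
            return func(context, *match.groups())
    raise AssertionError("No step definition for: %r" % text)


# Bindings for the ninja feature above:
@step(r'the ninja has a (\w+) level black-belt')
def given_belt(context, level):
    context['level'] = level


@step(r'attacked by a (\w+)')
def when_attacked(context, opponent):
    # Toy rule: a third level black-belt engages any opponent.
    context['engage'] = context['level'] == 'third'


@step(r'the ninja should engage the opponent')
def then_engage(context):
    assert context['engage']
```

The Gherkin lines stay stable while the regexes and functions evolve with the code, which is exactly the three-tier separation described below.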

This results in three tiers:

  1. The specification language
  2. The specification language code bindings
  3. The system(s) under test

Python Tools

In Python there are a few tools that implement Gherkin, such as lettuce and behave.

Of these I've only had experience with lettuce (we hacked up an internal fork of lettuce with HTML and modified xUnit test output), but outwardly they are similar.

Experiences of implementing BDD

A complaint I've heard a couple of times about BDD as a methodology is that it remains difficult to get the business stakeholders to collaborate in writing or reviewing BDD tests. This was my experience too, though there is a slightly weaker proposition of Specification by Example where the stakeholders are asked just to provide example cases for the developers to turn into tests. This doesn't imply the same bi-directionality and collaboration as BDD.

If you don't get collaboration with your stakeholders there are still benefits to be had from BDD techniques if you put yourself in the shoes of the stakeholder and develop the BDD tests you would want to see. It gives you the ability later to step back and see the software at a higher level than as a collection of tested components. You may find this level is easier to reason at, especially for new starters and new team members.

Another complaint is that it seems like more work, with its two-step process - first write the natural language, then work out how to implement those tests - but in fact I found it makes tests much easier to write in the first place. Where in TDD you have to start by thinking what the API looks like, in BDD you start with a simple definition of what you want to see happening. You soon build up a language that completely covers your application domain, and the programming work required to create new tests continues to drop.

Another positive observation is that with the three tiers, your tests are protected from inadvertent change as the project develops. While your code might change, and the corresponding specification language code bindings might change with it, well-written Gherkin features will not need to change. Without using BDD I have encountered situations where functionality was broken because the tests that would have caught it were changed or removed at the same time as the implementation. BDD protects against that.

The natural language syntax is helpful in ensuring that tests are written at a functional level. Writing tests in natural language makes it much more visible when you're getting too deep into implementation detail, as you start to require weirdly specific language that the business users would not understand.


There are a couple of pitfalls that I encountered. One is simply that the business stakeholders won't be good at writing tests, so the challenge of collaborating to develop the BDD tests is hard to solve. Just writing something in natural language isn't enough; you need to get into the habit of writing tests that take advantage of existing code bindings and that describe eminently testable scenarios.

Another pitfall was that you need to ensure that the lines of natural language really are implemented in the code binding by a piece of code that does what it says. Occasionally I saw code that tested not the desired behaviour but some proxy for it: assuming that if x then y, the binding tests x, because x is easier to test. You really need to test y, or the BDD tests will erroneously pass when that assumption breaks.


Battleships AIs    Posted:

Of the projects we have tackled at the Reading Python Dojo one of my favourites was programming AIs for the game Battleships, which came up in January 2013.

The dojo is not usually a competition, but in this case we waived that principle and split into two teams to create an AI that could play off against the other. I set down the basic battleships rules that we would compete under:

  • The grid is 10x10.

  • Each team has the following ships:

    Ship              Length
    Aircraft carrier  5
    Battleship        4
    Submarine         3
    Destroyer         3
    Patrol boat       2

The teams were not tasked with drawing a board or placing the ships. We simply drew the grids up on the whiteboard, manually placed the ships, and then had the computers call the moves. The computers were given feedback on whether each shot had hit, missed, or hit and sunk a ship.

Team A's AI was extremely deterministic, sweeping the grid in a checkerboard pattern from the bottom-right corner to the top-left until it scored a hit, at which point it would strafe along each possible orientation of the ship in turn until it was sunk. It would then resume the sweep from where it had left off.
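This isn't the dojo code, but the sweep order can be sketched in a few lines. Visiting only one colour of the checkerboard is enough, because any ship of length 2 or more must cover squares of both colours:

```python
def checkerboard_sweep(size=10):
    """Yield grid squares from the bottom-right corner towards the
    top-left, visiting only one colour of the checkerboard - enough
    to hit any ship of length 2 or more."""
    for y in range(size - 1, -1, -1):
        for x in range(size - 1, -1, -1):
            if (x + y) % 2 == 0:
                yield (x, y)

squares = list(checkerboard_sweep())  # 50 squares on a 10x10 grid
```

The square (0, 0) comes last, which is why a ship tucked into the far corner takes the full 50 moves to find.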

Team Alpha's AI was more stochastic, choosing grid squares at random until it scored a hit, then working outward like a flood-fill to completely carpet-bomb the ship. If at any point a square was completely surrounded by misses, then it could not contain a ship, and the AI would not pick this square.

On the night, Team A finally won after some astonishingly unlucky misses from Team Alpha. Team Alpha benefitted from Team A's worst-case performance by luckily placing a ship in the top-left corner of the grid, where it would take the maximum 50 moves for Team A's sweep to find. The randomness of Team Alpha's AI injected a tension that at any point it could stumble across Team A's last ship and win, even as Team A's AI swept inexorably towards that final ship.

After the Dojo I began to wonder just how often Team A's AI would beat Team Alpha's. Team Alpha could get lucky and find all of Team A's battleships more rapidly than Team A could sweep the grid. To answer the question I wrote BattleRunner, which runs the unmodified AI scripts as subprocesses over thousands of games, albeit with a simple random ship placement strategy. It was actually my first Twisted program! While I normally use gevent for async I/O, I hit a snag very early on with Python's output buffering and wondered if switching to Twisted would solve it. It didn't; the solution was to call python with -u (or modify the AI scripts, which I was keen not to do).
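BattleRunner itself isn't reproduced here, but the core of driving an unmodified AI script over pipes looks something like this sketch, with a one-liner standing in for the AI script; note the -u flag that solved the buffering snag:

```python
import subprocess
import sys

# A stand-in "AI" that calls one move, reads the referee's feedback,
# and acknowledges it. A real run would launch the team's script file.
child_code = 'print("A1"); feedback = input(); print("ack " + feedback)'

# -u disables Python's output buffering so each move arrives as soon
# as the AI prints it, rather than when the pipe's buffer fills.
ai = subprocess.Popen([sys.executable, "-u", "-c", child_code],
                      stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                      text=True)

move = ai.stdout.readline().strip()   # the AI calls its move
ai.stdin.write("miss\n")              # referee replies: hit/miss/sunk
ai.stdin.flush()
reply = ai.stdout.readline().strip()
ai.wait()
```

A referee loop built around this, plus random ship placement, is enough to run thousands of games between two such processes.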

The answer is that Team A beats Team Alpha about 64% of the time; Team Alpha wins 36% of the time.

BattleRunner also let me test improvements to the AIs; I was able to add an optimisation to improve how Team A's AI detects a ship's orientation. The Team A+ AI beats the original AI 54% of the time (and loses 46% of the time) - a small but significant improvement.

Perhaps you can do better than Team A's AI! There are lots of optimisations left to be had. Why not clone the repo and give it a try?


Poetry Generators at the London Python Dojo    Posted:

Last week's London Python Dojo at OneFineStay - Season 5 Episode 3 in case anyone is counting - was on the theme of poetry generators.

The theme was proposed as poetry generators using Markov chains, but as always at the Dojo many of the teams strove to take more "unique" approaches to the problem. Markov chains have been seen many times at Dojos, and produce output that fools only a cursory glance:

Others seem to do not thy
Horse with his prescriptions are
String sweet self almost thence
Glazed with my way for crime
Lay but those so slow they
Sea and is all things turns
Enjoys it was thy hours but
Followed it is in chase thee
Receiving nought but health from
Bent to hear and see again
Boughs which this growing comes
Lets so long but weep to
Slumbers should that word doth
Enjoy'd no defence save in a

(From Team 3's generator)
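Team 3's code isn't reproduced here, but a minimal first-order, word-level Markov chain of the kind the dojo teams usually build takes only a few lines:

```python
import random
from collections import defaultdict

def build_chain(words):
    """Map each word in the corpus to the list of words that follow it."""
    chain = defaultdict(list)
    for a, b in zip(words, words[1:]):
        chain[a].append(b)
    return chain

def babble(chain, start, length, rng=random):
    """Walk the chain from a start word, picking each successor at
    random; stops early if a word has no recorded followers."""
    word, out = start, [start]
    while len(out) < length:
        followers = chain.get(word)
        if not followers:
            break
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)
```

Fed a large corpus the output is locally plausible but globally meaningless, which is exactly the "fools only a cursory glance" effect above.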

My team spent a while at the start of the programming time designing a different approach. I was keen to try to generate rhyming verse and I had an idea for how one might go about it.

I had investigated the possibility of detecting rhymes a few years ago when I had the idea for a gamified chat forum. In this forum users would have RPG-style 'classes' and each class would confer special capabilities when users level up. The 'Bard' class would be rewarded for using rhymes and alliteration. I never got as far as creating the forum, but I did research how I might go about detecting rhymes.

Words rhyme if they share their last vowel sound and trailing consonants. "Both" rhymes with "oath" because they share the ending 'oh-th' sound. The spelling is useless to detect rhymes, as words are not spelled phonetically in English: "both" does not rhyme with "moth". It may be a bit more complicated than this to find really satisfying rhymes, but this approach is good enough to start with.

I eventually discovered the CMU Pronouncing Dictionary, which contains US English pronunciations for 133,000 English words.

If we look up the pronunciation of a word in the CMU data and take the last few phonemes (from the last vowel sound onwards), we get a key that corresponds to a unique rhyme. This key allows us to partition words or phrases into groups that all rhyme. "Both" and "oath" might be part of one group, while "moth" and "sloth" would be in another.
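That lookup can be sketched directly, with a tiny hand-rolled excerpt of ARPAbet pronunciations standing in for the full 133,000-entry dictionary:

```python
# A hand-rolled excerpt of CMU-style (ARPAbet) pronunciations;
# the real dictionary has around 133,000 entries.
PRONUNCIATIONS = {
    "both":  ["B", "OW1", "TH"],
    "oath":  ["OW1", "TH"],
    "moth":  ["M", "AO1", "TH"],
    "sloth": ["S", "L", "AO1", "TH"],
}

def rhyme_key(word):
    """The phonemes from the last vowel sound onwards. In ARPAbet,
    vowel phonemes end in a stress digit (0, 1 or 2), which is how
    we spot the last vowel."""
    phonemes = PRONUNCIATIONS[word.lower()]
    for i in reversed(range(len(phonemes))):
        if phonemes[i][-1].isdigit():
            return tuple(phonemes[i:])
    return tuple(phonemes)
```

So "both" and "oath" share the key ('OW1', 'TH'), while "moth" and "sloth" fall into a separate ('AO1', 'TH') group.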

Another idea that came up in discussion, suggested by Hans Bolang, was to use lines of existing poetry and remix them rather than generating rhyming gibberish. Nicholas Tollervey immediately suggested we source these lines from Palgrave's Golden Treasury, which is available on Project Gutenberg. The Golden Treasury contains thousands of lines of poetry that are a perfect input to the algorithm.

Our poem generator, then, simply classifies all the lines in the Golden Treasury by the rhyme key of their last word, and then picks groups of lines to fit a given rhyme scheme.
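A minimal sketch of that classify-and-pick loop; the last_word_key argument stands in for a CMU-based rhyme key function, and the names are mine rather than our dojo code:

```python
import random
from collections import defaultdict

def build_groups(lines, last_word_key):
    """Bucket lines of verse by the rhyme key of their final word."""
    groups = defaultdict(list)
    for line in lines:
        key = last_word_key(line)
        if key is not None:
            groups[key].append(line)
    return dict(groups)

def generate(scheme, groups, rng=random):
    """Fill a rhyme scheme such as 'AABBA' with mutually rhyming lines."""
    pools, used, poem = {}, set(), []
    for letter in scheme:
        if letter not in pools:
            # Pick an unused rhyme group with enough lines for this letter.
            needed = scheme.count(letter)
            keys = [k for k, g in groups.items()
                    if len(g) >= needed and k not in used]
            key = rng.choice(keys)
            used.add(key)
            pools[letter] = rng.sample(groups[key], needed)
        poem.append(pools[letter].pop())
    return poem
```

Note that this sketch shares the self-rhyme bug discussed below: two lines ending in the same word share a rhyme key, so they can be paired.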

For example, a poem to fit the AABBA rhyme scheme of limericks:

That fillest England with thy triumphs' fame
I long for a repose that ever is the same.
Bosom'd high in tufted trees,
For so to interpose a little ease,
Tell how by love she purchased blame.

Or rhyming couplets (AA BB CC DD):

My Son, if thou be humbled, poor,
The short and simple annals of the poor.

With uncouth rhymes and shapeless sculpture deck'd,
And now I fear that you expect?

But now my oat proceeds,
Lilies that fester smell far worse than weeds?

And strength by limping sway disabled,
When the soundless earth is muffled!

The last example demonstrates a known bug: we rhyme a word with itself. This could easily be fixed.

All in all I'm pleased with our result. The lines of the Treasury all sound profound and sometimes forlorn, and so come together rather well. The lines may have been written by great poets, but here they're brought together in new combinations that sometimes almost seem to tell a story.

Our code is available on Github. Photos of the event are up on Flickr.


Devopsdays London November 2013    Posted:

The second Devopsdays of 2013 in London wrapped up this afternoon after a packed schedule of talks, openspaces and socialising. As at the last event in March, there was plenty of food for thought, although as my current contract is primarily dev-centric the practical takeaways for me are the social aspects of process improvement and dev+ops collaboration rather than any specific technologies.

Drawing just a few threads from the notes that I took:

  • Mark Burgess kicked off the talks by suggesting that rather than reacting to faults, it is better to proactively build fault tolerance into your infrastructure and applications. During the ignite talks someone's slide included a relevant quote: "Failure is the inability to handle failure."
  • There were very varied ideas on how to become more collaborative between silos, including an openspace on how to roll out devops and a talk by Jeffrey Fredrick about the psychology and pitfalls of becoming more collaborative. One new idea I took away was the suggestion of making a business case to begin cross-function collaboration by demonstrating problems that stem from a lack of collaboration alongside business goals that can only be tackled through greater collaboration. Indeed, collaboration doesn't just need to be between dev and ops. We should collaborate with HR and IT departments too.
  • I noted several discussions about the future of configuration management. Mark Burgess' talk mentioned the idea of managing infrastructure systems as a whole rather than acting at the level of individual nodes. The view was expressed that solid orchestration should be the backbone of the next generation of configuration management tools rather than a value-added bonus. However, others commented that the orchestration-based tools (Ansible) are not yet on a par with the more node-centric tools (Puppet and Chef).
  • Some of the openspaces focused on wellbeing. It's easy to forget that technology should be about humans, not just about the cool things we might build. Someone touched on the idea of a "People Leader" role, looking out for the welfare of team members.

To sum up, it was a great conference, and while I'm not currently in a position to contribute new experiences to the technical openspaces, or apply those of others, I always find it very stimulating to be in a group of people who are deeply interested in finding ways to improve their working practices with both technological approaches and by improving "soft skills".

I look forward to the next devopsdays!