Splitting a Sequence into Subsequences with Python

I’m writing a load balancing algorithm for this cluster I’m working on, and have a bunch of things in a Python list sequence. So came the issue, how to divide the list into sublists for each node in the cluster to compute? I came up with the algorithm:

def split_seq_stride(l, n):
  """Splits a sequence l into a list of subsequences containing at least n
  elements each, not preserving order. The first few subsequences may contain
  n+1 elements, containing the last few elements. (Algorithm is good for load
  balancing)"""
  r=[]
  k=len(l)/n
  [r.append(l[i::k]) for i in range(k)]
  return r

which behaves like:


>>> l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> print split_seq_stride(l, 3)
[[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]

Assuming each task is an equal amount of work, this algorithm is a good load balancing algorithm. it prevents any node from having to do any less significant work than the others.

Probably useful for someone: what about splitting up a sequence in-order, with the last subsequence containing fewer elements? This code segment does just that:


def split_seq(l, n):
  """Splits a sequence l into a list of subsequences containing at most n
  elements each, preserving order. The last subsequence may contain less
  than n elements."""
  r=[]
  [r.append(l[s:s+n]) for s in range(0, len(l), n)]
  return r

which behaves like:


>>> l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> print split_seq(l, 3)
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

Thanks to yason and others on EFnet’s #python for pointers.

Like this article? Please support my writing! Flattr my blog (see my thoughts on Flattr), tip me via PayPal, or send me an item from my Amazon wish list.