Fun Stuff

NOTICE

I’m gradually converting the content here into a LaTeX/PDF package for viewing/printing. Download the newest version <link to come>.

A Wild pyQuine Appears!

If you ever want to flex your programming muscles, pick a language and write a quine in it. A quine is a computer program that performs a single function: It simply prints its own source code and then terminates. There are many ways to write a quine; techniques differ by language, and some are definitely more hackish (or outright cheating) than others.

I sat down and wrote a quine in Python today, and it took me about 20 minutes. It was a genuinely rewarding experience, and imagine having a program that lets you do this:

$ python quine.py > out1 && python out1 > out2 && python out2 > out3 && python out3 > out4 && ….

It simultaneously rewards and boggles the mind! I strongly recommend that you attempt to write your own quine in Python (or your language of choice) before viewing mine or anyone else’s. See if you can come up with your own!
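
If you would rather see one anyway (spoiler warning!), a common minimal Python quine, not necessarily the one described above, looks like this:

s = 's = %r\nprint(s %% s)'
print(s % s)

Running it prints exactly those two lines, so piping its output back through Python, as in the chain above, reproduces the same file every time.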

Sorting Algorithms

Any student of any field in computer science knows about sorting. The sorting problem is this: Given a homogeneous list A of n elements, on whose elements a < relation is defined, output a sorted list. That is to say, the output list should be such that for any two indices x and y in the list, x < y => A[x] <= A[y]. The sorting problem is in P. A bit of handwaving about notation: You cannot index lists directly, but when I do, the meaning should be clear.
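
To make that definition concrete, here is a tiny Python predicate for it (the name is_sorted is my own; checking adjacent pairs is equivalent to the condition above):

def is_sorted(A):
    # If A[i] <= A[i + 1] for every adjacent pair, then x < y => A[x] <= A[y].
    return all(A[i] <= A[i + 1] for i in range(len(A) - 1))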

A number of algorithms exist to sort an array, and some are as fast as O(n log n):

  • O(n²): Insertion sort, bubble sort, selection sort, quick sort, et al.
  • O(n log n): Mergesort, heap sort, binary tree sort, et al.

Researchers have found that certain tricks (e.g. specialized hardware, or restricting the possible inputs) can yield faster runtimes, even linear time, but without such restrictions the worst-case runtime for general-purpose sorting is widely considered to be O(n log n)… until now! I’d like to present two sorting algorithms below whose worst-case runtimes are even faster than O(n log n):

  • Drop sort: This produces a sorted list in linear time, worst case:

def drop_sort(L):
    # Step through the list once, keeping only elements that do not violate the ordering.
    kept = []
    for x in L:
        if not kept or kept[-1] <= x:
            kept.append(x)
        # else: x violates the ordering, so it is simply dropped
    return kept

Drop sort works by stepping through the list elements, one by one, and dropping those elements that violate the ordering. As long as the drop operation takes constant time (and there’s no reason it shouldn’t), this algorithm produces a sorted list in worst (and best and average) case O(n) time. A friend of mine spotted this algorithm online somewhere, and now we can’t find it. If someone happens upon it, inform me and I’ll update this page to include a link and proper attribution.
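
A quick sanity check of the sketch above, on an arbitrary example list:

>>> drop_sort([3, 1, 4, 1, 5, 9, 2, 6])
[3, 4, 5, 9]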

  • Instant sort: This produces a sorted list in constant time:

def instant_sort(L):
    return L[:1]    # a one-element (or empty) list is trivially sorted

Instant sort, a novel sorting algorithm as far as I can tell, returns just the head of the list. Note that instant sort does, indeed, return a sorted list. The returned list meets the definition of “sorted” above only trivially, but it meets it exactly nonetheless; no tricks involved! Instant sort runs in O(1) time.
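
And the same sanity check for the instant sort sketch, using the is_sorted predicate from earlier:

>>> instant_sort([3, 1, 4, 1, 5, 9, 2, 6])
[3]
>>> is_sorted(instant_sort([3, 1, 4, 1, 5, 9, 2, 6]))
True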

The astute reader will undoubtedly point out that these algorithms operate destructively; that is, sorting the list may return a list missing some of the original data. That’s a valid observation, but consider that we often accept data loss, or decide that having the list sorted matters more than preserving every element. In fact, there’s a strong analogy between these destructive sorting algorithms and lossy media compression (“lossy” means that data is lost irreversibly in the compression process). Digital media compression formats like MP3, M4A (iTunes music store purchases), JPEG, and MPEG store the media in less space by eliminating some of it.

Yes, these popular compression algorithms throw away some of the data in the media in order to save space. People who are aware of this either avoid these formats or decide that the data loss is acceptable. Several audio and image file formats compress data in a way that keeps all of the information, 100% of it. These are called “lossless” compression algorithms (or “lossless” formats). On the audio side they include FLAC, ALAC, WMA Lossless, and WAV (which is simply uncompressed). The same goes for image formats (PNG is fantastic: It compresses losslessly and supports transparency and interlacing); in fact, there is a lossless JPEG specification, but people most frequently use the lossy version. Purists, archivists, audiophiles, and music enthusiasts tend to insist on lossless encoding, and for good reason, especially considering the abundance of affordable storage available today.