
December 10, 2010

Something nice about every language I've used

Inspired by this post (and more great answers on the reddit comment thread), here's one nice thing about each of the languages I can remember using in any meaningful capacity:

Java: I learned to program in this. It led to the creation of the JVM, and while Java isn't my favorite language, the JVM is a pretty sexy piece of technology that enabled a number of other languages (Scala and Clojure most notably) to flourish.

C: The closest thing imperative programming has to "sparse beauty," a la Scheme. Shows you really don't need many bells and whistles to do a job, and do it well.

C++: Back when everyone was using C, it's kind of a technical miracle that Bjarne could create a proper superset on top of C with the features it has. Further, it's still blazing fast; without it we wouldn't have all the video games we love today ^_^.

PHP: Incredibly easy to learn, and no-hassle to set up on a server. One of the matches that lit the web on fire.

Racket (and Scheme applies here too): The language that never lets up. The most delicious learning curve I've ever tasted. Like Wagner's music is jokingly said to be "better than it sounds," Racket is more fun and fulfilling than it deserves to be.

Erlang: Industry-proven functional programming with more concurrency love than 1000 suns. Also the top language for gaining hipster-programmer cred. When you drop this name, suddenly everyone looks at you like "that guy" (you can decide if this is what you want or not).

Haskell: A wolf in sheep's clothing, the most modern, practical, and supported language with features I think we'll see as necessary in the future. Another candidate for the learning curve that keeps on giving.*

SML: A really sick module system for programming in the large. While not my favorite for "programming in the small," an understanding of SML's module system makes you pity that it never really took off.

Ruby: One of the most beautifully designed, fully-realized languages I know. Shows you can make a language that is simple, with practical value for Herp Derp programmers without sacrificing power and flexibility for the craftsmen as well.

Objective-C: The real "C with objects." Message passing with named parameters (and the much more sensible #import rather than #include); this is one time where I'm highly in favor of Mr. Jobs' stubbornness.

Prolog: 10-second youtube video. I mean this in a good way.

Max/MSP: Probably the most fully-realized and pleasing graphical programming environment I know, as well as an example of a damn fine DSL.

Javascript: I don't have to worry about compiler errors! No seriously, prototypal objects for the masses.

Flapjax: Functional reactive programming! An excellent example of functional languages and concepts attacking problems from the language level. Very innovative workaround for the horrors of client-side programming of the time.

Actionscript: Adobe makes it! Like Java, but better (Flash Platform > Swing/most Java GUIs).

*= Small qualification on the learning curve lines: virtually any language takes years of work over dozens of programs to "master." But many times mastering a language means compensating for its weaknesses, not discovering new strengths. That's what Haskell and Racket have given me more than most other languages.

May 6, 2010

Social Golfer problem

One summer that I spent working with a professor, the father of a friend heard my plans and remarked (to my friend, not to me):

I can't think of anything less intellectually stimulating than writing computer code for 7 hours a day.

After some thinking, my guess was they were probably confusing programming with data entry. Moments like these make me realize how much I take computer literacy for granted, especially among older folk. The Supreme Court makes similar mistakes, to much lulz.

I frequently find it helpful then to describe a problem that we work on, since lots of them can be easily described without mathematical notation. One of these has been a real hair-puller these last few days, called the Social Golfer Problem.

Problem Statement


The premise is this: you're organizing a golf tournament with 9 people in it. They play every week in groups of 3, and want to play for three weeks.

Here's the catch: they are social golfers, meaning if possible, they never want to play with the same people twice. Your job is to organize the tournament; you have to arrange who plays with whom every week.

So rather than use names, we'll just use numbers. The first week could look something like this:


         Group 1    Group 2    Group 3
Week 1   1  2  3    4  5  6    7  8  9


Since players 1, 2, and 3 are playing together in the first group, next week you need to organize the tournament a little differently to make sure 1 isn't in a group with 2 or 3, 5 isn't in a group with 4 or 6, and so on. A complete tournament schedule might look like this:

         Group 1    Group 2    Group 3
Week 1   1  2  3    4  5  6    7  8  9
Week 2   1  4  7    2  5  8    3  6  9
Week 3   1  5  9    2  6  7    3  4  8

So here's the question: could you organize a tournament like this with 100 people, in groups of 10, for 10 weeks?

Because if you could, you could get a computer science paper published: whether or not it's even possible to schedule such a tournament is an open problem.

For Solving Hard Problems, we have to write a program that takes three numbers:

  • g - the number of groups in a week.

  • s - the size of each group.

  • w - the number of weeks.


and either prints a schedule for a social golfing tournament, or tells you it's impossible.

It's a really cute problem. Hopefully in a few days, when it's done and gone, I can write about how I wrote the program. In the meantime, how would you go about it?
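
If you want something to poke at, here's a minimal brute-force sketch in Haskell of how a search could go (to be clear: the names choose, partitions, pairs, and solve are all mine, invented for this post, and this is not the program I'm handing in). It builds each week as a partition of the players into groups of size s, and backtracks whenever a week would put two players together a second time:

import qualified Data.Set as Set

type Group = [Int]
type Week  = [Group]

-- All ways to choose k elements from a list, preserving order.
choose :: Int -> [a] -> [[a]]
choose 0 _      = [[]]
choose _ []     = []
choose k (x:xs) = map (x:) (choose (k - 1) xs) ++ choose k xs

-- All ways to split the players into groups of size s. Forcing the
-- first remaining player into the next group avoids generating the
-- same partition in several different orders.
partitions :: Int -> [Int] -> [Week]
partitions _ []     = [[]]
partitions s (p:ps) =
  [ (p : g) : rest
  | g    <- choose (s - 1) ps
  , rest <- partitions s (filter (`notElem` g) ps) ]

-- Every pair of players who share a group in a given week.
pairs :: Week -> [(Int, Int)]
pairs week = [ (a, b) | grp <- week, a <- grp, b <- grp, a < b ]

-- Backtracking search: extend the schedule one week at a time,
-- rejecting any week that repeats an already-seen pair.
solve :: Int -> Int -> Int -> Maybe [Week]
solve g s w = go w Set.empty
  where
    players = [1 .. g * s]
    go 0 _    = Just []
    go n seen =
      case [ week : rest
           | week <- partitions s players
           , let new = Set.fromList (pairs week)
           , Set.null (seen `Set.intersection` new)
           , Just rest <- [go (n - 1) (seen `Set.union` new)] ] of
        (sched : _) -> Just sched
        []          -> Nothing

For the 9-player example, solve 3 3 3 spits out a schedule like the table above almost immediately; solve 10 10 10, of course, will outlive us all.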

April 8, 2010

Software and Evolution

I think software is growing, and will continue to grow, the way lifeforms have grown and evolved on Earth. By this I mean we started with a single ancestor, likely a few proteins or perhaps a single cell, only to become a planet housing humans, echidnas, sponges, fungi, insects, trees, and more.

This mostly comes to mind when I look at essays like this series, by Mike Taylor, on how so much of coding these days is just playing plumber between various libraries, fixing leaks and disasters that occur when the piping isn't perfect. The argument is stated well by jdietrich commenting on the story (where else?) on Hacker News:

My biggest gripe with modern programming is the sheer volume of arbitrary stuff I need to know. My current project has so far required me to know about Python, Django, Google App Engine and its datastore, XHTML, CSS, JQuery, Javascript, JSON, and a clutch of XML schema, APIs and the like.

Don’t get me wrong, I’m grateful for all of it, but it just doesn’t seem like what I was promised when I followed SICP for the first time. It just feels like I spend most of my time scouring through documentation and trying to remember umpteen different sets of syntax and class names rather than actually thinking in code.

Back in ye olden days, most programming tasks I performed felt quite natural and painless, just a quiet little chat between me and the compiler. Sometimes longwinded, sometimes repetitive, but I just sat and thought and typed and software happened. The work I do these days feels more like being a dogsbody at the tower of babel. I just don’t seem to feel fluent in anything much any more.

We talk about ‘flow’ quite a lot in software and I just have to wonder what’s happening to us all in that respect. Just like a conversation becomes stilted if the speakers keep having to refer to their phrasebooks and dictionaries, I wonder how much longer it will be possible to retain any sort of flowful state when writing software. Might the idea of mastery disappear forever under a constant torrent of new tools and technologies?

I happen to agree with most of the posts, but they're symptomatic of something that's been on my mind: our code is really inefficient. But more importantly: that's okay, and further, we will have to live with it in order to reach software at the level that humans are at biologically.

Allow me to clear up the mapping. When we started with computers, we wrote in raw, unadulterated binary. Every machine instruction was treasured and coddled, and several amazingly clever hacks were developed so operations could use minimal resources.

This was a necessity! We had to! But then we moved up to assembly, then the Capital Languages (FORTRAN, COBOL), and so on, until computers got powerful enough that we could afford ourselves some abstractions. What level of abstractions? Imagine how Mel the Real Programmer and other hackers of the binary era must feel when we're using languages with immutable strings, and someone writes code like:

String container = "";
for (String suffix : suffixes)
    container += suffix;   // each += allocates a brand-new String
return container;

In which every iteration of the loop allocates a new string! And the code doesn't render the program unusable!

How does Mel feel? Probably how a bacterium (or other single-celled organism) would feel when I scratch an itch, and kill or damage hundreds of skin cells ostensibly for nothing.

Single-celled organisms are still with us, and will almost certainly outlast us. We still have them in programming as well. To this day, if you really want to bust out the performance, you still gain lots by living close to the metal: I know a student in the introductory graphics class who implemented his linear algebra package by inlining x86 assembly in his C. And almost all projects for my combinatorial optimization class are done in C only because, true or not, we believe "it's the fastest" (it is really fast).

The truth is, while people are still busting out assembly and squeezing whatever gains they can out of the hardware, most of us can now get away with being pretty wasteful. And it's the only way we can build the truly large, monolithic systems people pay big money for.

What am I trying to communicate with this metaphor?

First, stop arguing that speed should be the limiting factor in a language or technology's eventual success. Every abstraction we use today (structured programming, object-orientation) was painfully slow when it was introduced, but it will be one of these abstractions that proves to be the key to the next step in software evolution.

I recognize there are many good arguments against the use of functional programming, logic programming, and other alternate paradigms. But pointing at speed comparable to today's other non-C languages and calling it slow is not one of them.

Second, the diversity of software will persist and propagate. Bacteria, fungi, plants, and eagles all live in radically different ways. Learn this and love it. Saying 'my form of programming is the real way' is like saying fungi are a real life form, but plant life isn't. Embedded systems have different needs than white-collar users of 'enterprise' software, whose needs differ in turn from those of logicians.

Finally, as it relates to Mike Taylor's article, what we are seeing now with library hell is the bad mutations of software evolution, the ones that will die out until we figure out how to do it right. If software at this point is at the jellyfish level, our attempts at sorting out library and framework programming are the failed experiments to grow bones, gills, feet, and wings. One of them will work eventually, but lots and lots of our software will die until one does.

December 28, 2009

On-Demand Prime Generation

Many Project Euler problems are made easier if you have a good facility for generating and detecting prime numbers. Usually when I see such a problem, I immediately switch to Haskell, but the other day one took about a minute to solve (way too long), so I ported the functionality to C.

In both, I'm doing simple trial division: a number is prime if no number other than 1 and itself divides it. A quick glance at Wikipedia shows you only have to test divisors up to and including the square root of the number you're testing, and the nature of primes means you really only need to test divisibility by other primes.

---

In Haskell, I achieved this by adapting a former exercise from my CS173 class that highlighted laziness as a language feature. First, we declare our list of primes:


primes :: [Int]
primes = filter isPrime [1..]


And then we define the predicate to determine if a number is prime or not, on which our list depends:


isPrime :: Int -> Bool
isPrime 1 = False
isPrime 2 = True
isPrime n = checkFactors (takeWhile (\ x -> x <= (floor $ sqrt (fromIntegral n))) primes) n

checkFactors :: [Int] -> Int -> Bool
checkFactors [] _ = True
checkFactors (x:xs) num = (num `mod` x) /= 0 && checkFactors xs num


For those who aren't Haskellites (even those who are, since my Haskell probably isn't very pretty or idiomatic), the code is doing this:

  • The list primes is the list of all integers beginning at 1, with all the non-primes filtered out, where non-primes are determined by the function isPrime.

  • The function isPrime tests a number n for primality by testing divisibility by all primes at or below the square root of n, where those primes come from the list primes.


Huh? You might have read over that a few times trying to piece it together and couldn't, as primes depends on isPrime and isPrime depends on primes. But not to worry, this is laziness at work :)

Laziness means that Haskell will only perform the computations it needs as it needs them, so as long as isPrime only needs as many numbers as primes has already computed, this works. Alternatively, so long as primes only needs to generate numbers isPrime can check, we're golden.

Imagine a snake that is three feet long eating a foot of itself from its tail, but growing a foot and a half longer as a result. It can do this continuously and grow to indefinite size!

More concretely, suppose I want to check if 9 is prime. A call to isPrime 9 causes the following to happen:

  • The takeWhile in isPrime will take all numbers from primes that are less than or equal to 3.

  • The language will compute values for primes by running each integer through isPrime, and will do this while those values are less than or equal to 3.

  • isPrime 1 will fail, isPrime 2 will pass, and isPrime 3...

  • ... we're not sure, so we repeat the steps above for all primes at or below the square root of 3. This gives us the empty list, which by the definition of checkFactors tells us 3 is prime.

  • So takeWhile hands checkFactors the list [2, 3] and stops there (the next prime, 5, exceeds 3). 9 isn't divisible by 2, but 9 mod 3 is 0, so 9 is not prime.


Next to shell languages, Haskell is the only language I can think of in wide use (if you consider Haskell in wide use...) that employs laziness by default, and it buys you a powerful abstraction like this one. Laziness lets us define the list of all primes and use it freely in the code like any other list; the program will only calculate as many primes as it needs, and it can calculate arbitrarily many. Furthermore, the primes are comparatively fast to compute, since there are no 'wasted' divisibility tests.
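
To make that concrete, here's the sort of thing the list buys you at the GHCi prompt (with the definitions above in scope; the second one-liner is the typical Project Euler move):

ghci> take 10 primes
[2,3,5,7,11,13,17,19,23,29]
ghci> sum (takeWhile (< 100) primes)
1060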

---

And so I solved a number of Project Euler problems in Haskell just so I can use this facility. That being said, the speed of the language itself left me wanting, so I ported a somewhat similar abstraction to C. We'll do this in the opposite order, first defining an isPrime:

#include <math.h>   /* for sqrt() */

BOOL
isPrime(int num)
{
  int curr_prime, index = 0;
  double limit = sqrt(num);
  /* Walk the primes in order; any prime past sqrt(num) can't be a factor. */
  while ((curr_prime = takePrime(&index)) <= limit) {
    if (num % curr_prime == 0) return FALSE;
  }
  return TRUE;
}

This relies on a takePrime function, which can be called repeatedly to fetch successive primes, advancing an index value as it goes. It looks like:

unsigned int
takePrime(int* indx)
{
  unsigned int val = prime_ptr[*indx];
  if (val != UNCOMPUTED) {
    /* Already computed: hand it back and advance the index. */
    ++(*indx);
    return val;
  }
  else {
    /* Not yet computed: trial-divide upward from the last known prime.
       (init() must seed prime_ptr[0], or this read would be out of bounds.) */
    unsigned int last_prime = prime_ptr[(*indx)-1];
    unsigned int next_prime;
    for (next_prime = last_prime + 1; ; ++next_prime) {
      if (isPrime(next_prime)) break;
    }
    prime_ptr[*indx] = next_prime;
    ++(*indx);
    return next_prime;
  }
}

Here prime_ptr is an array of unsigned integers, memset to a sentinel value UNCOMPUTED (I've left out most of the header information, as well as the init() and finished() calls that make this all work).

This takePrime preserves previously computed values (that's the purpose of the first condition), so if you ask for a prime at an index you've already computed, it comes straight back out of the array. Otherwise, you take the last value you computed, and increment through the integers above it until you find the next prime. When you do, store it in the array and return it.

The C abstraction has a number of shortcomings compared to the Haskell version. Namely, you can't pass in an arbitrary index and receive a prime number (last_prime only checks the slot immediately behind the one you are trying to compute, so if you've only computed 6 primes and you ask for the 9th, you segfault). You also lose the list abstraction (prime_ptr is static and this set of functions is #included, so I chose not to 'export' it), and you can't calculate arbitrarily many primes (the array has a fixed size).

All that being said, changes to correct those are pretty easy to implement; I've never needed them. But this allowed me to brute-force a few problems that might have had more elegant solutions. While I hate on C pretty frequently, gcc/g++ produce some pretty slick executables.