Test Driven Development in Python

TDD is a mindset

Test driven development is a way of designing as much as it is a way of developing.  I have been trying to use it on and off for a several years now without much success.  I think I understand the process, but not necessarily the mindset.  So, I am taking on a challenge in hopes of learning the mindset as well as the process.  The transition to TDD is going to take practice and I intend to get that practice using the Project Euler problems that I started on a couple of years ago.  One of the added benefits of using the Euler problems is that they are simple enough usually (or I should say, “So far”) that they can be solved using a single class or even a single method.  This makes writing tests for the design a little more direct and therefore simpler.

I have decided to use Python, for now.  Python is a simple language to use because it is interpreted rather than compiled.  I am also taking this opportunity to learn to use VIM a little better.  I have setup VIM as my python ide, and am pretty happy with it so far.  And when I am away from my main machine I can use my Koding virtual machine to work on a problem, but that is another post.  Plus I have always wanted to know more python, and so far I am really enjoying using it, it was super easy to get up and running with TDD.  Writing a unit test in python is very easy, as you will see.  So, despite the fact that I write C# all day everyday at work, I am trying to use an easier setup to get some practice in with Test Driven Development.

The Problem

I am currently on problem number 6 at Project Euler, which is title: Sum Square Difference.  The statement of the problem is this:

The sum of the squares of the first ten natural numbers is,

12 + 22 + … + 102 = 385

The square of the sum of the first ten natural numbers is,

(1 + 2 + … + 10)2 = 552 = 3025

Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 − 385 = 2640.

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

This led me to do some research.  The past couple of problems I have tackled with more of a brute-force tack and so I decided that this time I would do some research first.  I knew there were some formulas that could help me with this one.  I basically needed a formula for the following:

  1. The sum of the squares for 1..n
  2. The square of the sums of 1..n

With formulas for these two tasks I could then simply diff the result for my answer.  I have sample values from the problem statement to use in my unit tests of the formulas.  Here are the formulas I will use in solving the problem, both were readily available through Google or Wikipedia searches.

Sum of Integers

Sum of Integers

Sum of Squares

Sum of Squares Identity

The Test Class (and the first test)

With test driven development, rather than writing some code for the problem first, of course I need to write a test.  So what test?  Well I look at the problem and see that I need to find the difference between the sum of the squares of a list of numbers and the square of the sum of those numbers.  Therefore I will need to be able to get those values.  My first test will be for a correct sum of squares value (the second formula above).

I know that I am supposed to dive right in and write a test for a method that does what I need.  This is where you actually do some designing as well.  I wrote the following test as the first code for this problem:

Test for a sum of squares method

Test for a sum of squares method

And here is the output for that test:

Output from run of first test.

Output from run of first test.

As you can see the test failed with an ‘ImportError:  No module named Problem6’.  That is because I need to add a module for Problem6, which is a new file named Problem6.py.  In order to maintain the strict tenets of test driven development I need to only to the minimum required to get past this error.  So I will add a new file/module named Problem6.py, and then re-run the test.  Here is the output from the next run:

Output from 2nd test run

Output from 2nd test run

Now we failed with an AttributeError: ‘module’ object has no attribute ‘Problem6Solution’ and this is because the new module we just added is empty and has no class.  So, I will add a class to the module and name it ‘Problem6Solution’ and a method named sumOfSquares that accepts an input number.  This should be enough to get through this error.  Here is the first look at the solution class:

Solution Class

Solution Class

Now we can run the test again and see what we get this time.  It should be an assertion error because we are not yet performing the formula calculation.  Here is the result:

Result from pass 3 of test 1

Result from pass 3 of test 1

Ah, there is my assertion error.  Now I am finally ready to add some code to execute the formula and hopefully pass the test.  The formula is simple but as I am a bit of a noobie with python I looked up the math.pow method to help me execute the power calls.  Here is the code for the sumOfSquares method complete:

Test1 - Passing!

Test1 – Passing!

Repeat for Remaining Tests

The rest of the process is a little smoother in that now we have our classes and files and beginnings completed.  We have a passing test and are under way!  The next test will be for another piece of our solution’s puzzle, the square of sums formula, where we sum 1..n and then square the result.  We have the formula for summing 1..n so it should be fairly simple.  Let us see what the test looks like:

Test 2 for a square of sums method

Test 2 for a square of sums method

Of course we will run the test without writing any new code to see it fail.  Then we will refactor until it passes.  I am seeing though, at this point that in writing a test first I am letting the desired output dictate the design of the class.  I am starting with what I need and only writing the minimum code in order to get it.  Resulting in a fairly concise bit of coding.  I am still only utilizing this at a very simple level, but I can see the benefits none the less.  What follows if the test result and the code written to make it pass:

Output from Test 2

Output from Test 2

Square of Sums method

Square of Sums method

And now our test 2 passes.

Test 2 - Passing!

Test 2 – Passing!

For the last test, where we test a solve method that takes the difference of our two sums, I will show the resulting classes in their entirety.  You will notice there will now be a third test and a third method in the solution class, called ‘solve’.

The complete test class

The complete test class

Complete solution class

Complete solution class

Summary

I have added a main method at the end of the solution class that actually shows the answer and the elapsed time to calculate it.  That, in my mind, makes this a complete solution that does provide the output desired.  And, thanks to Test Driven Development, without anything extra.  Of course I could further refactor the formula methods to shorten them, specifically using the sum() method rather than the += operator.  However, I have left the code the way it is because I feel it is more readable this way.  And in my book, readability counts for something.

Lastly here is the output, both the passing tests and the printed answer!

All Tests Passing!

All Tests Passing!

Results!

Results!

Advertisements

Project euler – again

I am back working on Project Euler again.  It has been a great long while, maybe 4 or 5 years, not sure, since I started on it originally, using ruby.  Now I am using a mixture, starting for Python for a bit, and possibly switching back to ruby.  I say that after solving this last problem (#4) and then going thru some of the other folks solutions, the most elegant usually are the ruby ones, in my humble opinion. 

There are of course smaller, or rather more concise, ones but they are too cryptic.  Solutions is in statistical languages such as J or K are frightfully illegible.  I have to share, just for kicks.  First, the problem was to find the largest numeric palindrome that is the product of two 3-digit numbers.  Similar to 9009 being the largest that is the product of two 2-digit numbers, 99 and 91.  So just for giggles, here are a couple of super concise (and un-readable INHO) solutions:

1. This is in J (http://www.jsoftware.com)

   1: >([:{: ]#~ (=|.&.>)) <@":"0 /:~(0:-.~[:,>:/**/)~(i.100)-.~i.1000

and

2. This one is in K (http://www.kx.com/developers/documentationkdb.php)

   1: |/m@&{x~|x}'10_vs'm:*/',/n,/:\:n:100+!900

Please realize that I am in awe of the guys that posted these solutions and maybe someday I will be able to learn a language such as one of these, but for now…

I am going to stick to C# and Javascript, and Ruby and Python.  They are sufficient to accomplish the tasks I perform on a daily basis at work.  I am just learning Python again and also planning to brush up on Ruby.  I spent a year or so messing with ruby and ruby on rails, but only at home in my spare time so naturally I didn’t get super far.

So, to the issue at hand.  How to find the largest numeric palindrome that is a product of two 3-digit numbers.  And, for me, more importantly how do I do this in Python using Test-Driven-Development, only using VIM as my development environment.  Pow!  Throw that one in there at the last minute.  Ha, Ha!  Vim super rocks and I am learning more and more about how to mack it out and make it easier to use.  Death to my mouse!

Vim setup

I decide to use VIM because I have been really wanting an excuse to get better with it.  I use it at work sometimes but when I am in a hurry I end up going back to Notepad++ or SublimeText (which is super nice in its own right, closest thing you can find to Textmate for windows).  Anyways I figured this would be a good chance to spend some time with VIM.  I am using gVIM by the way on Windows 7 and I have python 2.7 installed as well and in my PATH variable. 

I did some research on VIM and found out that I needed to add a ~\_vimrc file with some settings in it.  Over the next day or two I slowly acquired a pretty good selection of customizations for VIM that make it much nicer to use.  Things like syntax highlighting and default font and color scheme presets.  One of the coolest things I figured out from this wonderful article, Turning VIM into a modern Python IDE, which gave me numerous tips, was to use the vertical screen split to show two buffers (files being edited) at once.  This allowed me to work on my class in on one side and my unit tests on the other.  Like so:

vim_split_screen

Once I had this setup I was rolling pretty good, really helped with the TDD workflow of:

  1. Write a test
  2. See the test fail
  3. Refactor code to pass test

This was also aided by the VIM settings for executing my scripts using the python command and seeing the output in a new command window.  The following additions to my _vimrc file allow me to simply hit ‘F5’ to execute the code in the current buffer.  And, conveniently, the *.py filter ensures that it is only set up to do so for Python files.  These autocmd lines take advantage of the fact that you can execute any shell command from the VIM command line by prepending it with the ‘!’ character.  The last setting maps the ‘F5’ key to the command output of “python “ + the current filename (represented by the ‘%’ output placeholder).  Super convenient.

vim_settings

The settings file allowed me to correct the weird backspace key behavior that VIM was displaying.  I couldn’t be more pleased, VIM is seemingly endlessly configurable.

The Problem

The goal was to find the correct answer to the 4th problem on Project Euler which is to find the largest numeric palindrome that is a product of two 3-digit numbers.  After a bit of thought I decided to start at 999 and work down with any loop I would use, that way I would get to the solution faster than if I started at 100 and went up.  But the very first thing I needed to do was to write a test.  So I went for an isPalindrome method test, as I knew I would need a test for my products as I iterated through the numbers.  

I turns out that there are numerous ways to solve the problem, of course, including using pen and paper and some algebra.  I must admit that it never occurred to me to use algebra to solve the problem.  Brute force was immediately where my thoughts went and where they stayed.  I guess years of coding have taught me that using the simplest and most direct approach is usually the best.  There were many interesting solutions posted in the forum for the problem though, and I encourage everyone to check out the many different bits of code there, but only after you solve the problem yourself.

TDD

Before I could write a test I had to figure out exactly how unit testing worked with python, so I did some research and came up with what I would guess is the most direct solution.  There is an included unit test module called, unittest and so it was there I began.  No configuration necessary, only the use of the very sweet vertical split screen in VIM, so I could look at my tests and my code as I wrote them both.

My first code that I wrote was a class that derived from the base unittest.TestCase class and I named it TestProblem4Solution.  I then did a little setup, a valid palindrome value and an invalid palindrome value that I figured I would need in my tests of an isPalindrome method.  Then straight to my first test, which was:

   1: def test_isPalindrome_ReturnsFalse(self):

   2:        result = self.cut.isNumericPalindrome(self.invalid_palindrome)

   3:        self.assertFalse(result)

Naturally this test failed when I ran it, so I had some code to write in my Problem4Solution class, which I needed to create as well.  Here is a shot of what I ended up with for making the first test pass:

problem4_firstmethod

This was enough to make my second test pass as well, test_isPalindrome_ReturnsTrue which is just a positive result test to go with the first one which tested the negative result.

Moving on to some product work, I knew I would want to test products that were produced from a loop, so I wrote a test called test_highestPalindromeProduct_ReturnsCorrect that would test a method that would take a starting number and multiply it by a set of numbers beginning with itself and stepping down by –1 until a palindrome product was found.  Of course there was no such method so the test failed and I was back to my solution class to fix the problem.  The resulting method, after a few tweaks and quick trips to the python online docs came out looking like this:

problem4_secondmethod

I then added a test for a solve method that I figured would use the two helper methods and find my solution.  But, before I finished that I realized that I might want to solve this problem for multiple length numbers.  The sample given in the problem was for tow 2-digit numbers and the challenge was for two 3-digit numbers so right off I needed to handle multiple length start numbers (I was using the two 2-digit as test values because I already had the correct answer).  Therefore I stepped back and wrote a test for a buildStartNumber method that would take a length up to 5 and return the highest number of that length.  I wanted to get back 999 for an input of 3.  Here is that test:

   1: def test_buildStartNumber_ReturnsCorrectNumber(self):

   2:         result = self.cut.buildStartNumber(3)

   3:         self.assertEqual(self.expectedStartNumber, result)

Again, fairly straight forward, I am just testing a result produced by calling the method.  The expected value is hanging off of the class instance (self.expectedStartNumber) and is 999 which is what I wanted back.  The assertEqual test statement just checks that value against the actual result.

The method to make this test pass was the simplest one in the solution, but only because I did some Googling first to learn some more about what python is capable of…

   1: def buildStartNumber(self, length):

   2:        start = ''.join("9" for x in range(length))

   3:        return int(start)

You can see that I am using the join function to join the “9” string together for x times in a range of the provided length. The range function is very handy.

That brought me to the solve method which would get me my answer.  I first wrote a test that used the known example solution for two 2-digit numbers which is 99 * 91 = 9009.  These values ensured that my solve method produced correct results.  Sort of!  I did get an incorrect answer the first time because I had assumed that if I counted down instead of up the first solution I came to would be the correct one but it wasn’t.  I had to store the results and continue on down until I found them all and then keep only the highest one.  Counting down from 999 and looking for palindrome products you come first to 995 * 583 which is 580085, a palindrome but not the highest palindrome.  You have to keep going to get to 993 * 913 which is 906609 and the correct answer.  Now, if I had limited my attempts with a bottom end of say 900 then I would have come to the correct answer first, but I didn’t.  I let it go all the way down to 100 until it found a result.  Oh well, live and learn.

Back to the TDD, the next test looked like this:

   1: def test_solve_ReturnsCorrectValue(self):

   2:         result = self.cut.solve(2)

   3:         self.assertEqual(self.solution, result)

Another simple test, where the expected value of self.solution (defined in the test setup method) is an array of [99,91,9009] which is the correct answer for two 2-digit numbers, including the product itself.  I knew I would want my solve method to provide me all three values for a complete answer, even though technically all I needed was the palindrome itself to submit to the answer page at project euler.

And the code that I added to make the test pass:

problem4_solvemethod

This completed the problem.  I submitted the solution, which thanks to my handy ‘F5’ key mapping was just a click away after adding a small main() method to the class that would print out the result of the solve method to the screen.  The final output looked like this:

problem4_finaloutput

And the answer you can see printed out as [993, 913, 906609], nice and neat.  The final code is actually shown at the beginning of the post in the screen shot of VIM with the vertical split.  You can see both the Problem4Solution class on the left and the TestProblem4Solution class on the right.  If you have any questions just post a comment!


Project Euler Journey Begins

What, Project Euler?

Project Euler Website

Project Euler Website

This post is an introduction to what I intend to be a series of posts that detail my journey through the problems of Project Euler.  I am hoping to post as I solve, or attempt to solve each of these problems.  My goal is, of course, to solve all of them, but I expect them to become very challenging very quickly so I may find myself stumped early on.  However, this is an exercise in learning so I will have to prepare to be challenged, that is really the whole point.

The intent is for me to use the problems on Project Euler(see previous link) to accomplish several goals I have, or rather have had for a while and am just now attempting.  I can list them as three distinct learning goals.  I say learning because they all have to do with me wanting to either learn something knew or to increase my knowledge/skills in an area. 

My Goals for the Journey

The first goal is to learn how to use a unix shell.  I want to learn how to program for the bash shell as well as how to get around and get things done using the shell command prompt.  I would eventually like to be able to incorporate shell usage into my daily work as much as possible as I really think it is an efficient way of getting tasks done. I have always felt that anything that keeps my hand off of the mouse is a step toward increasing productivity.  Being able to key tasks at a command prompt is always faster than moving your hand to the mouse to search through dropdown menus and lists of icons, atleast it is in most cases.  I do not think that the mouse is unnecessary by any means, I just prefer to use the keyboard as much as possible, or as much as is practical, while I am working.

The second thing I have always wanted to learn more about is higher math.  I never took calculus in my brief stint at university, nor when I was in high school.  I always loved algebra and I am hoping that calculus as well as other higher mathematics areas will be just as interesting.  I also hope they are not beyond me though.  I expect I will have to learn quite a bit in order to solve the problems on Euler, even after the first couple I am already looking up stuff on Wikipedia and learning new things, exactly my purpose.

Lastly, but certainly not least, I want to learn a new language, maybe more than one.  I am going to start with Perl because I have always felt that not knowing perl was a drawback as a professional developer.  I will be attempting to solve all of the problems using Perl as the coding language and consequently have to learn it as I go.  I have already used perl to solve the first two problems.  I also really want to learn Ruby so I may do some problems in Ruby as I go along.  There are plenty of problems so I may be able to learn both languages as I go.

More about Project Euler

Euler is considered to be the preeminent mathematician of the 18th century and one of the greatest of all time. He is also one of the most prolific; his collected works fill 60–80 quarto volumes.

Euler is considered to be the preeminent mathematician of the 18th century.

I stumbled on to project euler somehow, though I cannot remember how for the life of me.  At the time I bookmarked it in my GMarks and went on my way keeping it in the back of my mind as something to look at later.  Well later came and I went to the site and read about it.  

Project Euler is a series of challenging mathematical/computer programming problems.  They involve both the math aspect as well as the coding aspect which is used to perform the calculations needed to get the solution in a timely fashion.  In fact all solutions are supposed to adhere to the one minute rule meaning that they should take no more than a minute to execute and produce the result.  The site’s about page states that its intent is

“to provide a platform for the inquiring mind to delve into unfamiliar areas and learn new concepts in a fun and recreational context”

which is what it does, quite well in my opinion.  There are several cool features about the site that make it fun as well as challenging.  It is also well kept and updated often, which makes it feel like you are participating in something more so than if it was a dead site.  In fact, every time I logon to the site there are dozens of people logged in already, as I write this there are 88 members logged in, so it is definitely an active site.

Some of the Features

There are a few features on the site that appealed to me and that made me want to participate as well as making me think that this was an excellent way for me to accomplish the goals I mentioned above.  Of course the activity level of the site and the fact that it is kept current and that they are still adding new problems, etc. all made it attractive to me.  But, there are a couple of specific features that make it even cooler.

One of those features is the simple fact that you can register and login.  This makes possible the scoring and tracking stats features per user (and per problem).  Once you register and login you start on the problems, I am doing them in the suggested order that they are presented in, but I suppose you don’t have to do it that way.  When you solve a problem or rather, when you think you have solved a problem you can submit your answer.  There is a place to do that for each problem, then you are told whether you are right or not.  I have been right the first time on the first two problems so I am not sure what happens when you are wrong.  When you are right you will see a different view in the problem list from then on.  You now see links to an overview pdf, which explains the solutions and has notes about the math/theory behind the problem.  This lets you see if you went about solving the problem in the most efficient way possible.  You also see a link to the forum thread for that problem where others have discussed the problem in greater detail.  

Screenshot of the list of problems.

Screenshot of the list of problems.

All this extra info is quite informative and contributes greatly to the learning experience.   If not for this extra info and participation by the community I wouldn’t know that although I correctly solved the first two problems, there were better ways to solve them than the ways that I used.  While not wrong, because I got the right answer, I discovered that there were simpler ways, that included more math theory that I did not have, that would find the solution more efficiently.  These extra notes and posts helped me learn much more than I would have by just submitting my answer and finding out it was correct.  I plan to post separately about each problem where I will go into this point in more detail.

Some tidbits about Project Euler

  • Here are some cool facts/details about project euler that I thought might add to this post and were interesting:
  • There are 228 problems so far, all of which have been solved by someone.
  • The first problem has been solved by 48,882 people (including me) and the 228th problem has been solved by 76 people.
  • You can sort the problems in descending ID, ascending Difficult, or ascending Difficulty.  I am solving them in order by ID.
  • From the Stats page, there are 52,922 users whom have submitted a total of 892,601 solutions, an average of 16.9 problems per user.
  • I count 165 countries listed on the stats page w/atleast one solution submitted next to their flag.

The top five(5) languages in number of users are: 

    1.  C/C++ [5798]
    2. Python [5121]
    3. Java [3231]
    4. C# [1758]
    5. Haskell [1255]
    6. Ruby [1185] — i added a sixth because the last two were so close 

      The top five(5) languages in User’s Rating are:

        1. 40% – Magma — (?? never heard of this one, have to look it up)
        2. 39% – Pari/GP — (?? never heard of this one either)
        3. 23% – APL/J/K — (?? okay I must be an idiot)
        4. 22% – Mathematica — (finally one i have heard of)
        5. 20% – GAP — (?? sigh)

          Okay, so there you go, a bunch of stuff about Project Euler and why I am gonna be working on it.  I hope I don’t get discourage right away.  I have little doubt that this is going to get real tough pretty soon, but I am equally sure that I will learn a great deal if I can do it… j