Category Archives: C++

Joining the MacResearch Team

I’ve been a regular visitor to the www.macresearch.org web site pretty much since its inception, so I was very pleased to be able to accept an invitation to join the Executive Committee.

MacResearch is a web site that targets the Mac-using Scientist. It provides a wide range of services, including news feeds, software reviews, how-to articles, forums, a script repository, and — most recently — access to a 4-node Xserve computational cluster.

But one of the more important roles that MacResearch has taken on is that of mediator: polls are held regularly, and the results are summarized in a report that is communicated directly to Apple and released to the community at large. If you want to know more, either visit the site, or check out the new webcast, in which Ivan Judson and Joel Dudley explain it all much more eloquently than I ever could.

What will my role be at MacResearch? To be honest, it’s a bit too early to say. I will certainly contribute content, most probably related to scientific software development in Cocoa, Python, C++, and Fortran. I also have some ideas for applications of Xgrid, but I can’t say much more than that until I find out what the existing MacResearch team have in mind. Whatever happens, I’m sure it will be an interesting ride…


Filed under C++, Cocoa, Fortran, Mac, Personal, Python, Scientific Programming

Bruce Eckel on Ruby, Python, Java, etc

Bruce Eckel is one of the better authors of programming books around. He is famous for his ‘Thinking in …’ books on Java and C++, and also has an appreciation of — and enthusiasm for — dynamic scripting languages like Python, which is quite unusual for those developing in statically-typed languages.

Bruce has just written an interesting blog entry on Ruby on Rails, with comparisons to Java and Python. Interesting stuff. Check it out.


Filed under C++, Python

Don’t Be So Direct

After years of tossing up the idea, I finally got around to putting together a short course on advanced programming concepts for scientists. The research group I belong to has grown considerably in the last few years, and there seems to be more interest in programming than ever before, so the time seemed right.

The course is called ‘Programming Paradigms for Scientific Developers’. It consists of just three lectures: the first covers Procedural and Structured Programming; the second, Object Oriented Programming; and the third, Generic Programming. The goal is not to teach people how to write an if branch in Fortran or define a function in Python, but to address the concepts that transcend language-level details.

Scientists are not like most developers — they have a nasty habit of wanting to know why. They don’t usually accept advice unless it is accompanied by solid reasoning. Preaching inheritance and polymorphism to Scientists perfectly content with common blocks and implicit typing will get you about as far as a Ballet Dancer in a Mangrove Swamp. To get through to them, you have to be able to rationalize the concepts you are advocating, and that means a lot of soul searching.

One of the concepts I use throughout my course is that of indirection. I don’t simply mean the term as it is often used in C programming to describe the role of a pointer; I mean it in a much broader sense. Indirection relates to how directly something is represented in a piece of software. For example, a function provides a means of indirectly executing a series of instructions. The alternative to a function is directly inserting the function body into the code wherever it is required.
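To make that concrete, here is a small fragment written the ‘direct’ way; the array names are my own, purely for illustration. The same series of instructions is inserted at every point where it is needed:

! The norm of an array, computed by inserting the same instructions
! directly at both places they are required.
! (force and velocity are assumed to be filled in elsewhere.)
real, dimension(100) :: force, velocity
real                 :: forceNorm, velocityNorm
integer              :: i

forceNorm = 0.0
do i = 1, 100
  forceNorm = forceNorm + force(i)**2
end do
forceNorm = sqrt(forceNorm)

velocityNorm = 0.0
do i = 1, 100
  velocityNorm = velocityNorm + velocity(i)**2
end do
velocityNorm = sqrt(velocityNorm)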

Because indirection occurs at every level of software development, I have been able to use it as a base from which to describe new techniques. I begin by demonstrating the role of indirection in the development techniques the Scientists are already familiar with, and then show how the more advanced programming paradigms facilitate other forms of indirection not possible in procedural programming.

An important form of indirection in Procedural Programming is that introduced by a procedure (i.e., a subroutine or function). A procedure allows the programmer to avoid duplication of code by invoking the same code indirectly via a call. Reducing duplication is an important theme in software development, and techniques that seek to introduce indirection are inevitably also designed to reduce duplication — the one facilitates the other.
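Returning to the fragment above (again, the names are mine, not from the course), a procedure removes the duplication; the instructions are written once and reached indirectly via a call:

! Written once, in a module or after a contains statement,
! the instructions are now reached indirectly through a call.
real function norm(vector, n)
  integer, intent(in)            :: n
  real, dimension(n), intent(in) :: vector
  integer                        :: i
  norm = 0.0
  do i = 1, n
    norm = norm + vector(i)**2
  end do
  norm = sqrt(norm)
end function norm

! At the call sites, the duplicated loops become two simple calls:
forceNorm    = norm(force, 100)
velocityNorm = norm(velocity, 100)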

A procedure also places an interface between the calling code and the procedure body. Interfaces are the means by which indirection is realized. By introducing an interface, code becomes more flexible, because as long as the interface is fixed, the code behind the interface is free to vary independently of the calling code. (In object oriented terms, code behind an interface is known as the implementation.)
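As a hypothetical illustration (a sketch of my own, with invented names), the calling code below depends only on the interface of integrate; everything behind it is implementation, and could be swapped, say from the trapezoidal rule to Simpson’s rule, without touching a single caller:

module Integration
  implicit none
contains

  ! The interface callers depend on: integrate(f, a, b, n).
  ! The body is implementation, free to vary (here, the trapezoidal
  ! rule) as long as the interface stays fixed.
  real function integrate(f, a, b, n)
    interface
      real function f(x)
        real, intent(in) :: x
      end function f
    end interface
    real, intent(in)    :: a, b
    integer, intent(in) :: n
    real                :: h
    integer             :: i

    h = (b - a) / n
    integrate = 0.5 * ( f(a) + f(b) )
    do i = 1, n - 1
      integrate = integrate + f(a + i*h)
    end do
    integrate = integrate * h
  end function integrate

end module Integration

A caller simply writes area = integrate(f, 0.0, 1.0, 100); whether a smarter quadrature rule is used inside next month is of no concern to it.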

In summary, different forms of indirection are designed to eliminate different types of duplication by introducing different sorts of interfaces. Once you realize this, the techniques introduced in each programming paradigm make a lot more sense.

Consider Fortran’s user defined types (UDTs), which are equivalent to C’s structs. UDTs are to variables what procedures are to expressions: a form of indirection that relieves the developer from duplicating data declarations. And take inheritance in object oriented programming (OOP); it is a means of indirectly including the variables and methods of one class in another class.
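To illustrate the first half of that (the names here are my own invention), a UDT lets a group of related declarations be written once and reused indirectly through the type name:

! Declared once ...
type Atom
  real, dimension(3) :: position
  real, dimension(3) :: velocity
  real               :: mass
  integer            :: atomicNumber
end type Atom

! ... and reused wherever the grouped data is needed, without
! repeating the individual declarations.
type (Atom)                :: hydrogen
type (Atom), dimension(50) :: cluster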

Polymorphism is one of the more difficult concepts to grasp for those new to OOP. It makes more sense, however, when you recognize it as yet another incarnation of indirection, one in which interfaces are introduced to free a given piece of code from making direct reference to a particular concrete data type.

Reuse of code in procedural programs — which do not make use of polymorphism — usually involves conditional branches, with one branch for each data type used. Consider the following Fortran 90 example:


integer, parameter                :: QN_OPTIMIZER = 1
integer, parameter                :: CG_OPTIMIZER = 2

type (QuasiNewtonOptimizer)       :: qn
type (ConjugateGradientOptimizer) :: cg
integer                           :: opt

read(5,*) opt

! Construct the optimizer chosen at run time
select case (opt)
  case (QN_OPTIMIZER)
    call new(qn)
  case (CG_OPTIMIZER)
    call new(cg)
end select

! Take an optimization step
select case (opt)
  case (QN_OPTIMIZER)
    call takeStep(qn)
  case (CG_OPTIMIZER)
    call takeStep(cg)
end select

! Clean up the optimizer
select case (opt)
  case (QN_OPTIMIZER)
    call delete(qn)
  case (CG_OPTIMIZER)
    call delete(cg)
end select

This code could form the basis of an optimization engine that includes several different types of optimizers. What you will hopefully notice is that there is a subtle form of duplication occurring in the branching structure. Exactly the same form of select block is being used for initializing the optimizers, taking a step, and deleting them again. Wouldn’t it be good if you could employ a single branching block, and somehow ‘remember’ which branch was followed, so that the rest of your code would not be polluted by duplicated blocks?

Polymorphism is in effect exactly that: a means of storing branching decisions. Consider the following rewrite of the above example:


integer, parameter                :: QN_OPTIMIZER = 1
integer, parameter                :: CG_OPTIMIZER = 2

type (QuasiNewtonOptimizer)       :: qn
type (ConjugateGradientOptimizer) :: cg
type (Optimizer)                  :: optimizer
integer                           :: opt

read(5,*) opt

! Choose implementation
select case (opt)
  case (QN_OPTIMIZER)
    call new(qn)
    optimizer = qn
  case (CG_OPTIMIZER)
    call new(cg)
    optimizer = cg
end select

! Generic type remembers the choice
call takeStep(optimizer)
call delete(optimizer)

In this case, a new ‘generic’ type called Optimizer has been added. It effectively stores the branch chosen when an optimizer is initialized. The generic type is a polymorphic pointer: when the takeStep and delete methods of the Optimizer object are invoked, it ‘looks up’ the stored concrete optimizer type, and invokes the appropriate subroutine. Naturally, the lookup is a form of indirection, freeing the calling code from direct knowledge of the executed code, including the concrete type of the optimizer.

Standards of Fortran prior to 2003 do not directly support polymorphism of this type, but it is easy enough to fudge — the code above is real working Fortran 90. To find out more about what lies behind the Optimizer generic type, you can download my course slides, which go into plenty of detail.
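For a flavor of what such a fudge can look like, here is a minimal sketch of my own; the module and procedure names are assumptions, and the slides may well take a different route. It presumes the concrete optimizer modules provide new, takeStep and delete routines, and that the concrete variables (qn and cg in the example above) are declared with the target attribute so that the stored pointer remains valid.

module OptimizerModule
  ! Assumed modules providing the concrete types and their
  ! new/takeStep/delete routines.
  use QuasiNewtonModule
  use ConjugateGradientModule
  implicit none

  ! The generic type is just a pair of pointers; only the one
  ! corresponding to the branch taken is ever associated.
  type Optimizer
    type (QuasiNewtonOptimizer),       pointer :: qn
    type (ConjugateGradientOptimizer), pointer :: cg
  end type Optimizer

  ! Defined assignment is what lets "optimizer = qn" store the branch.
  interface assignment(=)
    module procedure assignQN, assignCG
  end interface

  ! Extend the generic names so calls on an Optimizer dispatch here.
  interface takeStep
    module procedure takeStepOptimizer
  end interface

  interface delete
    module procedure deleteOptimizer
  end interface

contains

  subroutine assignQN(self, concrete)
    type (Optimizer), intent(out)                   :: self
    type (QuasiNewtonOptimizer), target, intent(in) :: concrete
    self%qn => concrete
    nullify(self%cg)
  end subroutine assignQN

  subroutine assignCG(self, concrete)
    type (Optimizer), intent(out)                         :: self
    type (ConjugateGradientOptimizer), target, intent(in) :: concrete
    self%cg => concrete
    nullify(self%qn)
  end subroutine assignCG

  ! The 'look up': forward each call to whichever pointer is set.
  subroutine takeStepOptimizer(self)
    type (Optimizer), intent(inout) :: self
    if ( associated(self%qn) ) then
      call takeStep(self%qn)
    else
      call takeStep(self%cg)
    end if
  end subroutine takeStepOptimizer

  subroutine deleteOptimizer(self)
    type (Optimizer), intent(inout) :: self
    if ( associated(self%qn) ) then
      call delete(self%qn)
      nullify(self%qn)
    else
      call delete(self%cg)
      nullify(self%cg)
    end if
  end subroutine deleteOptimizer

end module OptimizerModule

With a module along these lines in place, the calling code above needs nothing more than a use statement and the target attribute on the declarations of qn and cg.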

Someday, I would like to incorporate automated creation of generic types into the Forpedo preprocessor, but I’ll leave that for another time …


Filed under C++, Fortran, Python, Scientific Programming