Daniel Watrous on Software Engineering

A Collection of Software Problems and Solutions

Posts tagged object oriented programming

Software Engineering

Override Java Methods on Instantiation

Most Java programmers are very familiar with the mechanism to extend a class. To do this, you simply create a new class and specify that it extends another class. You can then add functionality not available in the original class, and you can also override any functionality that already existed. Imagine a simple class:

public class Message {
    private String message;
 
    public Message(String message) {
        this.message = message;
    }
 
    public void showMessage() {
        System.out.println(message);
    }
}

Suppose you want these messages sent to the log too. You could extend the class as follows:

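import java.util.logging.Logger;

public class LoggedMessage extends Message {
    // Message.message is private, so the subclass keeps its own copy for logging
    private final String message;

    public LoggedMessage(String message) {
        super(message);
        this.message = message;
    }

    @Override
    public void showMessage() {
        Logger.getLogger(LoggedMessage.class.getName()).info(message);
        super.showMessage();
    }
}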

Now you could create an instance of LoggedMessage rather than Message and have all messages logged in addition to just displaying them. Pretty neat, but there might be a better way.

Override on instantiation

Imagine that you only needed messages logged in one case. In other words, the Message class is sufficient in all cases except one. In that case, you can simply create an instance of Message that overrides this functionality. This is also referred to as an anonymous subclass.

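// Message has no no-argument constructor and its field is private,
// so pass the text in and capture it from the enclosing scope
final String text = "Something worth logging";
Message loggedMessage = new Message(text) {
    @Override
    public void showMessage() {
        Logger.getLogger(MyClass.class.getName()).info(text);
        super.showMessage();
    }
};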

Why is this great? With this approach, you avoid adding another class with an uncommon use case to your software. You still have only one Message class, yet you get the modified behavior of a subclass.

Another use case is when you want to use Message, but showMessage needs to behave differently in many places. Without this technique you would end up with many subclasses to accommodate all the use cases.

When to extend

If you find yourself copying and pasting this same override in many places, you should probably create a subclass one time. You may also find that each override at instantiation differs only by a parameter that could be injected. In that case, you should simply define an injection point in your class and let your DI framework provide you with a Message object that is configured the way you need it.
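As a rough sketch of that last point, assume the only thing that varies between overrides is where the text ends up. The class name ConfigurableMessage and its Consumer<String> constructor parameter are hypothetical and not part of the Message class above; a DI framework (or plain hand-written wiring) would supply the consumer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical variant of Message: the varying behavior is injected
// instead of being overridden at each instantiation.
class ConfigurableMessage {
    private final String message;
    private final Consumer<String> sink;

    ConfigurableMessage(String message, Consumer<String> sink) {
        this.message = message;
        this.sink = sink;
    }

    void showMessage() {
        // Delegate to whatever destination was injected
        sink.accept(message);
    }
}

class ConfigurableMessageDemo {
    public static void main(String[] args) {
        // Plain display, as in the original Message class
        new ConfigurableMessage("hello", System.out::println).showMessage();

        // A different sink, here a list standing in for a log
        List<String> log = new ArrayList<>();
        new ConfigurableMessage("logged", log::add).showMessage();
        System.out.println(log.size());
    }
}
```

With this shape there is still only one class, and each caller decides what showMessage does by choosing the sink.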

Using Java to work with Versioned Data

A few days ago I wrote about how to structure version details in MongoDB. In this and subsequent articles I’m going to present a Java based approach to working with that revision data.

I have published all this work as an open source repository on GitHub. Feel free to fork it:
https://github.com/dwatrous/mongodb-revision-objects

Design Decisions

To begin with, here are a few design rules that should direct the implementation:

  1. Program to interfaces. Choice of datastore or other technologies should not be visible in application code
  2. Application code should never deal with versioned objects. It should only deal with domain objects

Starting with rule 1 above, I came up with a design involving only five interfaces. The Person class is managed using VersionedPerson, Person, HistoricalPerson and PersonDAO. A fifth interface, DisplayMode, is used to facilitate display of the correct versioned data in the application. Here’s what the Person interface looks like:

public interface Person {
    PersonName getName();
    void setName(PersonName name);
    Integer getAge();
    void setAge(Integer age);
    String getEmail();
    void setEmail(String email);
    boolean isHappy();
    void setHappy(boolean happy);
    public interface PersonName {
        String getFirstName();
        void setFirstName(String firstName);
        String getLastName();
        void setLastName(String lastName);
    }
}

Note that there is no indication of any datastore related artifacts, such as an ID attribute. It also does not include any specifics about versioning, like historical meta data. This is a clean interface that should be used throughout the application code anywhere a Person is needed.

During implementation you’ll see that using a dependency injection framework makes it easy to write application code against this interface and provide any implementation at run time.

Versioning

Obviously it’s necessary to deal with the versioning somewhere in the code. The question is where and how. According to rule 2 above, I want to conceal any hint of the versioned structure from application code. To illustrate, let’s imagine a bit of code that would retrieve and display a person’s name and email.

First I show you what you want to avoid (i.e. DO NOT DO THIS).

Person personToDisplay;
VersionedPerson versionedPerson = personDao.getPersonByName(personName);
if (displayMode.isPreviewModeActive()) {
    personToDisplay = versionedPerson.getDraft();
} else {
    personToDisplay = versionedPerson.getPublished();
}
System.out.println(personToDisplay.getName().getFirstName());
System.out.println(personToDisplay.getEmail());

There are a few problems with this approach that might not be obvious based on this simple example. One is that by allowing the PersonDAO to return a VersionedPerson, it becomes necessary to include conditional code everywhere in your application that you want to access a Person object. Imagine how costly a simple change to DisplayMode could be over time, not to mention the chance of bugs creeping in.

Another problem is that your application, which deals with Person objects, now has code throughout that introduces concepts of VersionedPerson, HistoricalPerson, etc.

In the end, all of those details relate to data access. In other words, your Data Access Object needs to be aware of these details, but the rest of your application does not. By moving all these details into your DAO, you can rewrite the above example to look like this.

Person personToDisplay = personDao.getPersonByName(personName);
System.out.println(personToDisplay.getName().getFirstName());
System.out.println(personToDisplay.getEmail());

As you can see, this keeps your application code much cleaner. The DAO has the responsibility to determine which Person object to return.
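Here is a minimal sketch of what "the DAO decides" can look like, using an in-memory map instead of a real datastore. SimplePerson, VersionedRecord and InMemoryPersonDao are hypothetical stand-ins invented for this illustration, and DisplayMode is reduced to the one method used in the example above; this is not the code from the repository.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the article's interfaces (assumptions for illustration)
interface SimplePerson {
    String getEmail();
}

interface DisplayMode {
    boolean isPreviewModeActive();
}

// Holds both versions; only the DAO ever sees this type
class VersionedRecord {
    final SimplePerson draft;
    final SimplePerson published;

    VersionedRecord(SimplePerson draft, SimplePerson published) {
        this.draft = draft;
        this.published = published;
    }
}

// Hypothetical in-memory DAO: the draft/published decision lives here, once
class InMemoryPersonDao {
    private final Map<String, VersionedRecord> store = new HashMap<>();
    private final DisplayMode displayMode;

    InMemoryPersonDao(DisplayMode displayMode) {
        this.displayMode = displayMode;
    }

    void put(String name, VersionedRecord record) {
        store.put(name, record);
    }

    SimplePerson getPersonByName(String name) {
        VersionedRecord record = store.get(name);
        if (record == null) {
            return null;
        }
        // Application code never sees this conditional
        return displayMode.isPreviewModeActive() ? record.draft : record.published;
    }
}
```

If the rules for DisplayMode change later, only this one method has to change.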

DAO design

Let’s have a closer look at the DAO. Here’s the PersonDAO interface:

import java.util.List;

public interface PersonDAO {
    void save(Person person);
    void saveDraft(Person person);
    void publish(Person person);
    // PersonName is nested in Person, so qualify it (or import Person.PersonName)
    Person getPersonByName(Person.PersonName name);
    Person getPersonByName(Person.PersonName name, Integer historyMarker);
    List<Person> getPersonsByLastName(String lastName);
}

Notice that the DAO only ever receives or returns Person objects and search parameters. At the interface level, there is no indication of an underlying datastore or other technology. There is also no indication of any versioning. This encourages application developers to keep application code clean.

Despite this clean interface, there are some complexities. The MongoDB document stores published, draft and history as nested documents in a single document, so there is only one ObjectID that identifies all versions of the Person. That means that the ObjectID exists at the VersionedPerson level, not the Person level. That makes it necessary to pass some information around with the Person that will identify the VersionedPerson for write operations. This comes through in the implementation of the MorphiaPersonDAO.
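One way to carry that identifying information without changing the domain interface is to keep it on the implementation class, where only the DAO looks for it. The sketch below shows the idea under that assumption; BasicPerson, StoredPerson and IdAwareDao are hypothetical names, and this is not necessarily how MorphiaPersonDAO does it.

```java
// Domain interface, reduced to a single property for brevity (assumption)
interface BasicPerson {
    String getEmail();
    void setEmail(String email);
}

// Hypothetical implementation class: carries the id of the enclosing
// versioned document, but that id is not part of the BasicPerson interface
class StoredPerson implements BasicPerson {
    private final String versionedDocumentId;
    private String email;

    StoredPerson(String versionedDocumentId, String email) {
        this.versionedDocumentId = versionedDocumentId;
        this.email = email;
    }

    @Override
    public String getEmail() {
        return email;
    }

    @Override
    public void setEmail(String email) {
        this.email = email;
    }

    // Package-private on purpose: meant for the data access layer only
    String getVersionedDocumentId() {
        return versionedDocumentId;
    }
}

class IdAwareDao {
    // Returns the id of the versioned document to update, or null when the
    // person did not come from this DAO and has to be inserted as new
    String resolveDocumentId(BasicPerson person) {
        if (person instanceof StoredPerson) {
            return ((StoredPerson) person).getVersionedDocumentId();
        }
        return null;
    }
}
```

Application code keeps working with BasicPerson and never needs to know the id exists.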

Download

You can clone or download the mongodb-revision-objects code and dig in to the details yourself on GitHub.

Refactoring to interfaces

The information below was delivered to one of my programmers as direction for how to implement a rather big change in an existing software product that I sell. I thought it was potentially useful to a broader audience, so I’m posting it here:


…The rest of this is rather complicated to explain online. I’ll do my best. I’m going to look at this in a simplistic way and let you work through the details.

First imagine that we have an Authorize.net processing class based largely on their API.

class AuthnetProcessAIMPayment {
	protected $apiKey;
	protected $transactionKey;
	protected $x_first_name;
	protected $cardNumber;
	protected $expDate;
	protected $ccv;
 
	function __construct($apiKey, $transactionKey) {
		// initialize object instance
	}
 
	// getter and setter functions
	function setX_first_name ($x_first_name) {
		$this->x_first_name = $x_first_name;
	}
	function getX_first_name () {
		return $this->x_first_name;
	}
	...
 
	function execute () {
		// process payment here
	}
}

I could use this in a script like authnet_process.php

// get $myApiKey and $myTransactionKey values
$payment = new AuthnetProcessAIMPayment($myApiKey, $myTransactionKey);
$payment->setX_first_name("Daniel");
...
$payment->execute();

Now there’s a problem with this approach when it comes to extending our software. You’re about to create an integration with payflowpro. Let’s imagine that the payflowpro class looks like this.

class PayPalPayflowProPayment {
	protected $securityToken;
	protected $paypalEmailAddress;
	protected $name_first;
	protected $cardNumber;
	protected $expDate;
	protected $ccv;
 
	function __construct($securityToken, $paypalEmailAddress) {
		// initialize object instance
	}
 
	// getter and setter functions
	function setName_first ($name_first) {
		$this->name_first = $name_first;
	}
	function getName_first () {
		return $this->name_first;
	}
	...
 
	function execute () {
		// process payment here
	}
}

Here is what you’ll be tempted to do in authnet_process.php (but it’s a mistake)

$paymentProcessor = get_option("authnet_payment_processor");
if ($paymentProcessor == "authnet") {
	// get $myApiKey and $myTransactionKey values
	$payment = new AuthnetProcessAIMPayment($myApiKey, $myTransactionKey);
	$payment->setX_first_name("Daniel");
	...
	$payment->execute();
} else if ($paymentProcessor == "paypal") {
	// get $mySecurityToken and $myPaypalEmailAddress values
	$payment = new PayPalPayflowProPayment($mySecurityToken, $myPaypalEmailAddress);
	$payment->setName_first("Daniel");
	...
	$payment->execute();
} else {
	// error out, invalid payment processor
}

There are a few problems with this. First, you now have several nearly duplicate lines of code to set name, cardNumber, etc. Second, each new payment processor (like bluepay coming up) adds more conditionals. Conditionals increase the chances of bugs and they make the code less clear. Third, the names for common values are likely to differ between implementations. For example, authnet may call it “x_first_name” and paypal might call it “name_first”. Even though you’re providing the same value, each payment processor class receives it a different way. This divergence can make it difficult to identify bugs and to know which token to search for when you’re making changes.

The ideal that we’re looking for in this case is a solution that will allow authnet_process.php to remain the same regardless of how many payment processors we implement. In answer to one of your questions, we also want our settings interface to be as insulated as possible. In other words, we don’t want to have a different settings mechanism for each payment processor. In even other words, we want each payment processor to accommodate as many of the settings that we’ve already implemented, so that the user has the most seamless experience possible.

You can think of this as being similar to your autoresponder integration in subscription mate. That was done very well, and it didn’t bleed into other settings. You could change the autoresponder provider without requiring changes anywhere else.

So, how do we do this?

Wouldn’t it be great if we could just give a different processing object to authnet_process.php and it knew what to do with it? That’s exactly what we can do with interfaces. Dependency Injection (DI) can be a huge benefit too, but PHP isn’t wired for DI like other languages, so let’s start with the interface.

An interface cannot be instantiated. Instead, all it does is say that anyone implementing the interface must provide certain functions. Mind you it doesn’t dictate how those functions should be implemented, but you must provide an implementation, even if it does nothing (e.g. a function that just adds a log entry or simply returns an empty value).

The mapping that you were tempted to do in the conditionals above, meaning mapping the client’s first name on to one member function for authnet and another for payflowpro, should instead happen one time in the interface implementation.

Here’s what an interface for the above case might look like.

interface OneTimePayment {
	public function setName($name);
	public function getName();
	...
	public function execute();
}

The payment processing classes above would now look like this (I’m showing only the lines that would change).

class AuthnetProcessAIMPayment implements OneTimePayment {
	...
	// getter and setter functions
	function setName ($x_first_name) {
		$this->x_first_name = $x_first_name;
	}
	function getName () {
		return $this->x_first_name;
	}
	...
}

and for payflow pro

class PayPalPayflowProPayment implements OneTimePayment {
	...
	// getter and setter functions
	function setName ($name_first) {
		$this->name_first = $name_first;
	}
	function getName () {
		return $this->name_first;
	}
	...
}

Now that the method names to set and get a name are identical, the code within the conditionals really is duplicate. authnet_process.php can now be simplified to this:

$paymentProcessor = get_option("authnet_payment_processor");
if ($paymentProcessor == "authnet") {
	// get $myApiKey and $myTransactionKey values
	$payment = new AuthnetProcessAIMPayment($myApiKey, $myTransactionKey);
} else if ($paymentProcessor == "paypal") {
	// get $mySecurityToken and $myPaypalEmailAddress values
	$payment = new PayPalPayflowProPayment($mySecurityToken, $myPaypalEmailAddress);
} else {
	// error out, invalid payment processor
}
// once you create the object, everything after this 
// point can assume it's dealing with a OneTimePayment
// instead of a specific payment processor
$payment->setName("Daniel");
...
$payment->execute();

At this point, even if that conditional grows to accommodate additional payment processors, each one adds only a handful of lines of code to get authentication values and create the object. None of the work to set values and execute transactions will ever need to change, since it treats everything like a OneTimePayment.

How could DI make this even better?

I haven’t seen any good DI frameworks for PHP. In some ways it goes a bit against the grain for PHP development. If there were one, this is roughly how authnet_process.php might look.

// $diFramework is a hypothetical container object, not a real library
$payment = $diFramework->get('OneTimePayment');
$payment->setName("Daniel");
...
$payment->execute();

Elsewhere in the DI framework you would define what a concrete instance of OneTimePayment should be. In other words, something like this might be in an XML file (Spring like) or a module (Guice like).

bind(OneTimePayment).to(AuthnetProcessAIMPayment);

Now, anytime you ask for a OneTimePayment, you’ll get an AuthnetProcessAIMPayment object. Similar injection can be used to provide the authentication values, so that in authnet_process.php you really only ask for a OneTimePayment and you get back a functional, ready to use object.
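To make the bind-then-ask idea concrete, here is a small hand-rolled sketch in Java, the language of the other posts on this page. TinyContainer is a hypothetical class written for this illustration and is not the API of Spring or Guice. FakeAuthnetPayment is a stand-in that does no real processing, and having execute() return a boolean is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

interface OneTimePayment {
    void setName(String name);
    String getName();
    boolean execute();
}

// Stand-in processor; a real one would call the payment gateway in execute()
class FakeAuthnetPayment implements OneTimePayment {
    private String name;

    public void setName(String name) { this.name = name; }
    public String getName() { return name; }
    public boolean execute() { return name != null; }
}

// Hypothetical, hand-rolled container: maps an interface to a provider
class TinyContainer {
    private final Map<Class<?>, Supplier<?>> bindings = new HashMap<>();

    <T> void bind(Class<T> type, Supplier<? extends T> provider) {
        bindings.put(type, provider);
    }

    <T> T get(Class<T> type) {
        Supplier<?> provider = bindings.get(type);
        if (provider == null) {
            throw new IllegalStateException("No binding for " + type.getName());
        }
        // Safe because bind() only accepts providers of a matching type
        return type.cast(provider.get());
    }
}

class TinyContainerDemo {
    public static void main(String[] args) {
        TinyContainer container = new TinyContainer();
        // Configuration: decided in one place
        container.bind(OneTimePayment.class, FakeAuthnetPayment::new);

        // Calling code: only knows about the interface
        OneTimePayment payment = container.get(OneTimePayment.class);
        payment.setName("Daniel");
        System.out.println(payment.execute());
    }
}
```

Switching processors then means changing the single bind call, while the calling code stays the same.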

Refactoring

Refactoring is a key aspect of developing software. In this case, where we started with code that creates instances of Authnet classes and uses them directly, changes to authnet_process.php will be necessary. Changes will also be necessary to the Authnet classes to conform with the new interface that we come up with. That may sound like a lot of work.

It is a lot of work, and may actually be more work than just creating a payflowpro class and adding the conditional that I showed at the top. However, there are some concrete benefits and gains we get from doing this additional work.

First is that once the work is done and authnet_process.php uses the interface based calls, all future changes for new payment processors will be very small and won’t functionally endanger any working code for other payment processors.

Second is that there’s a clear scaffolding for adding new payment processors. You can still add as many private internal functions as are necessary and desired to create an “execute” function, but you know that you need one and when it’s done it will work anywhere in your app that you call execute.

Third, unittests can more easily test new payment processors without duplicating a lot of code. You simply provide a different OneTimePayment implementation and run the same test. Test coverage stays high and maintenance remains low.
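The "same test, different implementation" idea can be sketched like this, again in Java to match the other examples on this page. The two stub classes are hypothetical and only mirror the differing internal field names from the PHP classes; the boolean return from execute() is an assumption, and a real project would use its unit test framework rather than a main method.

```java
import java.util.Arrays;
import java.util.List;

interface OneTimePayment {
    void setName(String name);
    String getName();
    boolean execute();
}

// Stub that stores the name the way the Authorize.net class does internally
class StubAuthnetPayment implements OneTimePayment {
    private String xFirstName;

    public void setName(String name) { this.xFirstName = name; }
    public String getName() { return xFirstName; }
    public boolean execute() { return xFirstName != null; }
}

// Stub that stores the name the way the Payflow Pro class does internally
class StubPayflowPayment implements OneTimePayment {
    private String nameFirst;

    public void setName(String name) { this.nameFirst = name; }
    public String getName() { return nameFirst; }
    public boolean execute() { return nameFirst != null; }
}

class OneTimePaymentContract {
    // The same checks run against every implementation of the interface
    static void verify(OneTimePayment payment) {
        payment.setName("Daniel");
        if (!"Daniel".equals(payment.getName())) {
            throw new AssertionError("getName should return the value passed to setName");
        }
        if (!payment.execute()) {
            throw new AssertionError("execute should succeed once the name is set");
        }
    }

    public static void main(String[] args) {
        List<OneTimePayment> implementations =
                Arrays.asList(new StubAuthnetPayment(), new StubPayflowPayment());
        for (OneTimePayment payment : implementations) {
            verify(payment);
        }
        System.out.println("all implementations passed");
    }
}
```

Adding a third processor to the test is one more entry in the list.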

Where do you start?

The best place to start is by coming up with the interface. You want to look at all the values that the current implementation needs and ask if it can be generalized. For example, “x_first_name” might prompt you to design for “first_name” in your interface. A quick check with the documentation for the other payment processor(s) will help you arrive at a comprehensive interface.

Next, make the changes to the Authnet classes so that they implement the new interface(s). Finally work with your code until the unittests pass.

Now you’re ready to implement the interface for payflowpro. As soon as you have done that, you run your unittests and provide an instance of your payflowpro OneTimePayment object.

Once you’re done there and have passing unittests, move on to authnet_process.php.

At some point you’ll need to modify the settings page so that you can provide payflowpro credentials rather than authnet. I really like how you did multiple autoresponder vendors in subscription mate, so I would suggest that as a possible approach for this.

Feel free to create a branch and be daring. You have the security of unittests and the isolation of a branch in svn. If you end up with a mess the first time through, you can start over. I personally think you’ll do great.

Java, Wicket and Hibernate on EC2 (pre-interview project)

Over the weekend I put together a project as a precursor to an interview. I really like interviews where I have a chance to solve a problem that’s more meaningful than generating a random number efficiently.

The pre-interview question came in the form of a sketch of the application. This worked out great since I suggest always starting with a sketch drawn by hand. Here’s what they wanted:

Choice of technology

The instructions indicated that I could use any technology that I was familiar with, as long as I included the libraries necessary to compile the code. Since I’ve been interested in Wicket lately and wanted to get into the latest version of Hibernate (it’s been a few years), I chose Java, Wicket and Hibernate.

For the development IDE I chose NetBeans due to its native support of Maven. Wicket quickstart projects base their build on a Maven pom.xml.

Development

I started my project using the Wicket quickstart. This makes use of Maven archetypes which require Maven2+. Some development benefits that come along with the quickstart include tests that are verified at each compile of the project. This ensures, at a minimum, that your markup and Java files agree.

I used Mercurial (hg) to create a local repository to manage revision control. Even for small projects revision control is a key element and reduces the risk of big refactorings and other explorations. The hg repository can also be ‘pushed’ to anyone else that wants to collaborate on the project.

In some ways, the story that the revision history tells is as important (maybe more so) than the finished product. The revisions themselves provide valuable insight into the way a developer approaches his work.

Database

For the database I chose HyperSQL (HSQLDB). This is a pure Java database with an in memory mode. Hibernate abstracts the database access and makes it easy to move to a more robust production database at some point in the future. HSQLDB makes development easy since the database is initialized each time I restart the jetty server.

Documentation and collaboration

TRAC is my preferred artifact tracking system for software projects. It integrates directly with Mercurial. It provides roadmap, wiki, timeline and other reporting and collaboration devices. Once I got the project to a stable point, I pushed the repository up to a public location with an integrated TRAC instance.

You can check out the code or view it online using the URL below:

http://danielwatrous.repositoryhosting.com/hg_public/danielwatrous/favorite-movies

To check out the code using the URL above just enter this command:

hg clone http://danielwatrous.repositoryhosting.com/hg_public/danielwatrous/favorite-movies favorite-movies

Once you have the code cloned to your local system as shown above, run the commands below to compile and run the application (this requires that you have Maven 2+).

mvn compile
mvn jetty:run

You should now be able to view the application at this URL:

http://localhost:8080

Additional development

Next steps in the development of this application may include additional tests, better encapsulation and refined access mechanisms.

For example, aside from the default tests that are a result of the Wicket quickstart, I haven’t added any additional unittests. If the complexity of the application increased it might be worthwhile to add unittests to the Movie and RatingModel classes.

At some point data access could be encapsulated into a DAO for the Movie class. This might be beneficial if access to Movie objects spread to additional pages and those pages duplicated the code required to access those objects.

Deployment

I’ve been keen to play around with Amazon’s EC2 service for a while. This seemed like a perfect opportunity, so I added EC2 to my Amazon Web Services account and created an instance. I chose 64 bit Amazon Linux. I chose Tomcat as the web server.

I added the Tomcat server running on my EC2 instance to the Maven files, which made it possible to build and deploy in a single step from the command line:

mvn tomcat:deploy

As development continues I can redeploy easily using this command:

mvn tomcat:redeploy

Resources

Download the source

favorite-movies.zip

The following resources were helpful during the development of this application:

Hibernate

http://stackoverflow.com/questions/3345816/hibernate-projects-and-building-with-maven
http://docs.jboss.org/hibernate/core/4.0/quickstart/en-US/html_single/
http://wicketinaction.com/2009/06/wicketspringhibernate-configuration/

Enum support in Java/Wicket

http://yeswicket.com/index.php?post/2009/09/24/Enums-internationalization-with-Wicket
http://blog.armstrongconsulting.com/?p=163
http://stackoverflow.com/questions/3224244/wicket-resource-string-not-found

Amazon EC2

http://coenraets.org/blog/2011/11/set-up-an-amazon-ec2-instance-with-tomcat-and-mysql-5-minutes-tutorial/
http://www.mkyong.com/maven/how-to-deploy-maven-based-war-file-to-tomcat/

Wicket users list

I also found the wicket users list very helpful, as usual:
http://apache-wicket.1842946.n4.nabble.com/form-processing-for-multiple-objects-td4321129.html
http://apache-wicket.1842946.n4.nabble.com/AJAX-Rating-extension-multiple-on-a-page-td4317346.html
http://apache-wicket.1842946.n4.nabble.com/guestbook-application-with-database-update-td4316943.html

Software licensing: The value of good books

I have a large budget for books (but thanks to Amazon it doesn’t have to be as big as it could be). Sure it’s true that most of the information in programming books is online and available for free. There may even be substance to the argument that most books are out of date as soon as they hit the shelf because technology moves so fast. Oh well.

I get huge value from books. They save me many hours of time that I might spend scouting around for a snippet here or an explanation there. One of my favorite publishers of technology books is O’Reilly.

For this project I purchased Programming Google App Engine by O’Reilly. It’s a fantastic book so far and covers a lot of ground. Bookmarking, highlighting and so on gives me a quick path back to bits that I’ve learned.

Another book I purchased for this project and for my shelf is Thinking in Java (4th Edition) by Bruce Eckel. I previously read the free downloadable version of his 3rd edition. He provides uncommon depth in his approach, tying Java back into the other languages that inspired it. That context is extremely valuable!

If you have a hard time spending $100 or more on books for a project, just ask yourself how many hours you would have to save in order to justify the cost. At today’s contractor rates that might only be two or three hours to hit the break even point. Across the life of a project, a well written and edited book from a trusted publisher can save you many more hours than that.

Roadmap to Become an Expert Object Oriented Programmer

Programming has evolved in very significant ways over the last few decades. There have been some significant strides forward in terms of language structure, reduced complexity and programmer productivity. One of these shifts was from procedural style programming using a language like C to object oriented programming using a language like C++.

Most modern languages support objects, inheritance and other object oriented constructs. However, not all programmers use these the right way.

As a matter of course, most introductory material in programming is procedural in style (a linear sequence of commands). It’s important as a programmer to move beyond this procedural style when the software calls for it (yes, there are times when a procedural approach is preferable).

The best place to start is with The UML, then on to object oriented design, relational databases and finally to object-relational mappers. Here’s a roadmap with some links out to resources.

The UML

Years ago Borland published a tutorial for the UML that is still a great introduction to the topic:

http://edn.embarcadero.com/article/31863

I’ve also read Sams Teach Yourself UML in 24 Hours. It was published in the early 2000s and I found it helpful. I think the most effective UML training has come to me while reading books like the ones I mention below.

ArgoUML is an amazing free tool that you can use to create and explore UML models. In practice I’ve found that UML modeling is most effective in two cases. The first case is where a functional code base already exists and requires modifications. The second case is where the bulk of the design has been done long hand on paper. I very rarely (almost never) start with a tool like ArgoUML for initial design. I’ve also found that modeling one component, then implementing it before modeling other components is useful. I think this is called an Agile approach nowadays.

http://argouml.tigris.org/

Object Oriented Programming/Design (OOP/OOD)

This would be a good starting point to learn about Object Oriented Programming (typically abbreviated as OOP):

http://www.google.com/search?aq=2&oq=object+oriented+programming+tutorial&sourceid=chrome&ie=UTF-8&q=object+oriented+programming+tutorial+python

While the types of tutorials that you find using a query like the one shown above may be helpful, they only scratch the surface of what OOP really is and how it is to be done. In order to really understand it you’ll need a good book (or a few, but it’s good to take it a step at a time). Some books that were extremely influential for me include

Thinking in Java 3rd edition by Bruce Eckel. You can get this for free in electronic format:

http://www.mindview.net/Books/TIJ/

Here’s the download page: http://mindview.net/Books/DownloadSites/

Obviously this book discusses Java, but it’s important to recognize that in the evolution of programming languages, Java really descends from C (C >> C++ >> Java). At least Sun had in mind to arrive at a language that would improve programmer productivity and eliminate some of the common pitfalls in C++. In large measure they accomplished this. Bruce Eckel discusses some of this in the Preface. The book also comes with code samples, solutions, etc. You can use the Eclipse IDE to explore the examples.

Patterns of Enterprise Application Architecture by Martin Fowler:

In this book he offers a very practical view of how to implement enterprise software. Enterprise software isn’t necessarily different than any other software except that the problems it solves are more commonly found in the enterprise. Many of his examples in that book deal with financial and transactional processing. Dave currently has my copy of this book.

Of course there’s the seminal work by the Gang of Four simply called Design Patterns. The examples are written in C++ and Smalltalk. Despite the age of those languages, the patterns are still very pertinent to software design.

Data Persistence

The use of databases to persist data is widespread. There are various types of databases, but the most common by far is the Relational Database Management System (RDBMS). Examples of this type of system include MS SQL Server, Oracle, MySQL and PostgreSQL, to name a few. There are many many more. There have been many efforts to create object oriented databases, but these failed to gain traction due to speed and complexity. Recently, however, there has been a new movement in the direction of object databases (notice I didn’t say object oriented). An object database stores data as an object, but doesn’t necessarily provide mapping between objects.

I’ve read and can recommend two books on database design.

Data Model Patterns: Conventions of Thought by David C. Hay. This is to database design what Fowler’s PEAA is to object oriented software design. He masterfully details mature database models that accommodate a wide range of domains. In the final chapter he provides effective clarification on some points about nomenclature and design approach. This is helpful when extending or adapting the data model patterns found throughout the book.

Database Design by Ryan Stephens and Ronald Plew. This is an overall view of database design, including normalization. This book also covers the process of domain identification and some nuts and bolts of how SQL and RDBMS’s work. The chapters dealing with analyzing and modeling business components are the most useful and carry over into object oriented design (OOD).

Object-Relational Mapping (ORM)

Once you have a good grasp on OOD and RDBMS’s, the most logical next step is to explore the various object-relational mapping tools. Fowler explores Data Mapping patterns in PEAA. My first exposure to object-relational mapping was with Hibernate: http://www.hibernate.org/. I originally purchased and read Hibernate in Action by Christian Bauer and Gavin King back in 2004. While the project has come a long way since then, the core principles as defined in that book haven’t changed. PEAA is probably the best place to explore object-relational mapping.

Obviously there’s a lot more to becoming an expert object oriented programmer than just these concepts, but this list of books will push you in the right direction. Next comes practice, practice, practice.