Code Refactoring Case Study

Legacy code is stinky. Every decent developer would like to refactor it, and to refactor it safely, one ideally needs a suite of unit tests to prevent regressions. However, writing unit tests for legacy code is not easy: the code is usually a big mess. To write effective unit tests against legacy code, you probably need to refactor it first; and to refactor it, you need unit tests to ensure you are not breaking anything. It is a chicken-and-egg situation. This article describes a methodology for safely refactoring legacy code by sharing a real case I once worked on.

Problem Statement

In this article, I use a real case to illustrate the pragmatic practices in testing and refactoring legacy systems. This case was written in Java, but the practices should be applicable in other languages. I have changed the scenario to protect the innocent, and simplified it to make it more understandable. The practices introduced in this article were instrumental in refactoring the legacy system I recently worked on.

This article is not about the basic skills of unit testing and refactoring. You can learn much more about those topics from books such as Refactoring: Improving the Design of Existing Code by Martin Fowler and Refactoring to Patterns by Joshua Kerievsky. Rather, this article describes some of the complexities often encountered in real life and provides, I hope, useful practices to tackle them.

The example case study I will use is a fictional resource management system, in which a resource refers to a person who can be assigned to some tasks. Let us say a resource can be assigned to a ticket—an HR ticket or an IT ticket. A resource can also be assigned to a request—an HR request or an IT request. The resource manager can log the time a resource is estimated to spend on the task. The resource can log how much time he or she actually works on the ticket or request.

The utilization of resources can be shown in a bar chart, which shows both the estimated and actual time spent.

Not complicated? Well, in the real system, resources can be assigned many types of tasks. Still, technically it is not a complex design. However, when I first read the code, I felt like I was reading a fossil: I could see how the code had evolved (or rather de-evolved). At first the system was only able to deal with requests; functionality to deal with tickets and other kinds of tasks was added later. The first engineer wrote code to deal with requests: extracting data from the database and assembling it to be shown in the bar chart. He didn’t even bother to organize the information into proper objects:

class ResourceBreakdownService {
    public Map search(Session context) throws SearchException {
        // omitted twenty or so lines of code to pull search criteria
        // out of context and verify them, such as the below:
        if (resourceIds == null || resourceIds.size() == 0) {
            throw new SearchException("Resource list is not provided");
        }
        if (resourceIds != null && resourceIds.size() > 0) {
            resourceObjs = resourceDAO.getResourceByIds(resourceIds);
        }

        // get workload for all requests
        Map requestBreakDown = getResourceRequestsLoadBreakdown(resourceObjs, startDate, finishDate);
        return requestBreakDown;
    }
}

I am sure you were immediately repulsed by the stink of this code. For example, you probably noticed that search is not a meaningful name, that it should use the Apache Commons CollectionUtils.isEmpty() to test the collection, and you were probably questioning the returned Map.
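For illustration, a null-safe empty check like Commons Collections' CollectionUtils.isEmpty() collapses that two-part guard into one call. Here is a minimal plain-Java equivalent; the class and method names are my own, not code from the system under discussion:

```java
import java.util.Collection;

// A null-safe "is this collection empty?" check, equivalent to Apache
// Commons Collections' CollectionUtils.isEmpty(). Illustrative sketch only.
class CollectionGuards {
    static boolean isEmpty(Collection<?> c) {
        return c == null || c.isEmpty();
    }
}
```

With such a helper, the guard clause shrinks to a single `if (isEmpty(resourceIds))` test with no risk of a NullPointerException.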

But wait, the stink continued to build up. A second engineer came along and followed suit and dealt with tickets in the same way, so in the code, you have:

// get workload for all tickets
Map ticketBreakdown = getResourceRequestsLoadBreakdown(resourceObjs, startDate, finishDate, ticketSeverity);
Map result = new HashMap();
for (Iterator i = resourceObjs.iterator(); i.hasNext();) {
    Resource resource = (Resource) i.next();
    Map requestBreakdown2 = (Map) requestBreakdown.get(resource);
    List ticketBreakdown2 = (List) ticketBreakdown.get(resource);
    Map resourceWorkloadBreakdown = combineRequestAndTicket(requestBreakdown2, ticketBreakdown2);
    result.put(resource.getId(), resourceWorkloadBreakdown);
}
return result;

Forget about the naming, the unbalanced structure, and other aesthetic concerns; the stinkiest thing about this code is the returned Map. A Map is a black hole: it sucks in everything and gives you few clues about what it contains. Only by writing some debugging code to recursively print out the contents of the map was I able to find out its data structure.

In this example, {} means a Map, => means key value mapping and [] means a collection:

{resource with id 30000 => [
    SummaryOfActualWorkloadForRequestType,
    SummaryOfEstimatedWorkloadForRequestType,
    {30240 => [
        ActualWorkloadForRequestWithId_30240,
        EstimatedWorkloadForRequestWithId_30240],
    ...

With such a terrible data structure, the data marshalling and un-marshalling logic was virtually unreadable, extremely long, and tedious.
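The recursive-printing trick mentioned above can be sketched roughly as follows. This is my own reconstruction, not the original debugging code; none of these names appear in the system:

```java
import java.util.Map;

// Recursively render a nested Map/Collection structure with indentation:
// the quickest way to discover what a "black hole" Map actually contains.
// Illustrative sketch, not the original helper.
class MapDumper {
    static String dump(Object value, int depth) {
        String indent = "  ".repeat(depth);
        StringBuilder sb = new StringBuilder();
        if (value instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                sb.append(indent).append(e.getKey()).append(" =>\n");
                sb.append(dump(e.getValue(), depth + 1));
            }
        } else if (value instanceof Iterable) {
            for (Object item : (Iterable<?>) value) {
                sb.append(dump(item, depth + 1));
            }
        } else {
            sb.append(indent).append(value).append("\n");
        }
        return sb.toString();
    }
}
```

Feeding the search result through a helper like this is how the {…=>[…]} structure shown above was recovered.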

Integration Test

I hope by now I have convinced you that this code was truly complex. If I had started out by unraveling the mess and understanding every bit of the code before refactoring, I might have gone crazy. To save my sanity, I decided that to understand the code logic, it would be better to use a top-down approach. That is, instead of reading the code and deducing the logic, it would be better to play with the system, debug it, and understand the big picture.

I used the same approach in writing tests. Conventional wisdom is to write small tests to verify every code path; if every piece is well tested, then when you put all the code paths together, there is a good chance the whole will work as expected. But that didn’t apply here. ResourceBreakdownService was a God class. If I had started out by breaking it apart armed with only a big-picture understanding, I would have made a lot of mistakes; there could be many secrets hidden in every nook and corner of a legacy system.

I wrote this simple test, which reflected my understanding of the big picture:

public void testResourceBreakdown() {
    Resource resource = createResource();
    List requests = createRequests();
    assignRequestToResource(resource, requests);
    List tickets = createTickets();
    assignTicketToResource(resource, tickets);
    Map result = new ResourceBreakdownService().search(resource);
    verifyResult(result, resource, requests, tickets);
}

Notice the verifyResult() method. I first found out the structure of result by recursively printing out its contents; verifyResult() uses that knowledge to verify that the map contains the right data:

private void verifyResult(Map result, Resource rsc, List<Request> requests, List<Ticket> tickets) {
    // in this simple test case, actual workload is empty
    UtilizationBean emptyActualLoad = createDummyWorkload();
    List resourceWorkLoad = (List) result.get(rsc.getId());
    UtilizationBean scheduleWorkload = calculateWorkload(rsc, requests);
    Map requestDetailWorkload = (Map) resourceWorkLoad.get(3);
    for (Request request : requests) {
        UtilizationBean scheduleWorkload0 = calculateWorkload(rsc, request);
        // omit code to check each request's workload
    }
    // omit code to check tickets
}

Walk Around Obstacles

The above test case seemed easy, but there were many complexities. For a start, ResourceBreakdownService().search is tightly woven into the runtime; it needs to access the database, other services, and who knows what else. And, like many legacy systems, this system had no unit test infrastructure in place. To access the runtime services, the only option was to start up the whole system, which was very expensive and inconvenient.

The class that started up the server side of the system was ServerMain. ServerMain was also like a fossil, from which you could tell how the system had evolved. This system was written more than 10 years ago; at that time there was no Spring, no Hibernate, just a little bit of JBoss and a little bit of Tomcat. The brave pioneers of that era had to do a lot of DIY, so they created a home-made cluster, a cache service, and a connection pool, amongst other things. Later they somehow plugged into JBoss and Tomcat, but unfortunately they left their DIY components behind, so the code was stuck with two transaction-managing mechanisms and three types of connection pools.

I decided to copy ServerMain to TestServerMain. Invoking TestServerMain.main() failed with:

org.springframework.beans.factory.BeanInitializationException: Could not load properties; nested exception is class path resource [] cannot be opened because it does not exist

OK, that was easily fixable! I grabbed a file, dropped it into the test class path, and ran it again. This time, this exception was thrown:

java.io.FileNotFoundException: .\server.conf (The system cannot find the file specified)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(...)
    ...

I could have dropped a server.conf file somewhere to make the error go away, but that approach made me uncomfortable. Just by writing this single test case, I had immediately exposed a problem in the code: HibernateUtil, as its name suggests, should only care about database information, which should be supplied by a Hibernate properties file. Why should it access server.conf, which configures server-wide setups? Here is a clue that code is smelly: if you feel like you are reading a detective novel, constantly asking “why”, the code is usually bad. I could have spent much time reading through ServerConfigFactory, HibernateUtil, and ServerConfigAgent, trying to figure out how to point HibernateUtil at the right properties, but at this point I was too anxious to get a running server. Besides, there was a way to walk around it. The weapon was AspectJ:

void around(): call(public static void ServerConfigFactory.initServerConfig()) {
    System.out.println("bypassing");
}

For folks who do not understand AspectJ, in simple English the above means: when the runtime is at the point of calling ServerConfigFactory.initServerConfig(), print a message and return without going into the method itself. This may feel like hacking, but it is very cost effective. Legacy systems are full of riddles and problems; at any given moment, you need to pick your fights strategically. At that moment, the most rewarding thing for me to do, in terms of customer satisfaction, was to fix the defects in the resource management system and improve its performance. Straightening up things in other areas was not on my radar. However, I made a mental note to come back later and sort out the mess in ServerMain.

Then, at the point where HibernateUtil read the required information from server.conf, I pointed it to read from a test properties file instead:

String around(): call(public String ServerConfigAgent.get*(..)) {
    // code omitted, reading from the test properties file instead
}

You can probably guess the rest by now: walk around obstacles when it is more convenient or natural to do so, but if there are ready-made mocks, reuse them. For example, at one point TestServerMain.main() failed with:

- Factory name: java:comp/env/hibernate/SessionFactory
- JNDI InitialContext properties: {}
- Could not bind factory to JNDI
javax.naming.NoInitialContextException: Need to specify class name in environment or system property, or as an applet parameter, or in an application resource file: java.naming.factory.initial
    at javax.naming.spi.NamingManager.getInitialContext(...)
    at javax.naming.InitialContext.getDefaultInitCtx(...)

This was because the JBoss naming service was not started. I could have used the same hacking technique to walk around this, but InitialContext is a big javax interface containing many methods, and I didn’t want to chase down and hack every one of them; that would be very tedious. A quick search revealed that Spring already has a mock class, SimpleNamingContextBuilder, so I dropped it into the test:

SimpleNamingContextBuilder builder = new SimpleNamingContextBuilder();
builder.bind("java:comp/env/hibernate/SessionFactory", sessionFactory);
builder.activate();

After a few rounds, I was able to make TestServerMain.main() run successfully. Compared to ServerMain, it was much simpler: it mocked out a lot of JBoss services, and it didn’t entangle itself with cluster management.

Build the Building Blocks

TestServerMain connected to a real database. Legacy systems can hide surprising logic in stored procedures, or even worse, in triggers. Following the same big-picture thinking, I decided it was not wise at that time to try to understand the mysteries inside the database and fake one up, so I had my test cases access the real database.

The test cases would need to run repeatedly, to ensure every small change I made to the product code passed them. At each run, the tests would create resources and requests in the database. Contrary to conventional unit-testing wisdom, you sometimes do not want to clean up the battlefield after each run by destroying the data created by the test cases. So far, the testing and refactoring exercise had been a fact-finding exploration: you learn the legacy system by trying to test it. You might want to inspect the data generated by test cases in the database, or use that data in the running system, to confirm everything is working as expected. This means test cases have to create unique entities in the database to avoid stepping on each other. There should also be some utility classes to easily create such entities.

Here is such a simple builder block for building resources:

public class ResourceBuilder {
    private String userName;
    private String roleName;

    public static ResourceBuilder newResource(String userName) {
        ResourceBuilder rb = new ResourceBuilder();
        rb.userName = userName + UnitTestThreadContext.getUniqueSuffix();
        return rb;
    }

    public ResourceBuilder assignRole(String roleName) {
        this.roleName = roleName + UnitTestThreadContext.getUniqueSuffix();
        return this;
    }

    public Resource create() {
        ResourceDAO resourceDAO = new ResourceDAO(UnitTestThreadContext.getSession());
        Resource rs;
        if (StringUtils.isNotBlank(userName)) {
            rs = resourceDAO.createResource(this.userName);
        } else {
            throw new RuntimeException("must have a user name to create a resource");
        }
        if (StringUtils.isNotBlank(roleName)) {
            Role role = RoleBuilder.newRole(roleName).create();
            // code omitted: assign the role to the resource
        }
        return rs;
    }

    public static void delete(Resource rs, boolean cascadeToRole) {
        Session session = UnitTestThreadContext.getSession();
        ResourceDAO resourceDAO = new ResourceDAO(session);

        if (cascadeToRole) {
            RoleDAO roleDAO = new RoleDAO(session);
            List roles = rs.getRoles();
            for (Object role : roles) {
                // code omitted: delete each role via roleDAO
            }
        }
        // code omitted: delete the resource itself
    }
}

ResourceBuilder is an implementation of the Builder and Factory design patterns; you can use it in a “train-wreck” style, for example:

ResourceBuilder.newResource("Tom").assignRole("Developer").create();
It also contains a battlefield-cleaning method: delete(). In the early stages of this refactoring exercise, I didn’t invoke delete() very often, as I often started up the system, pulled in the test data, and checked whether the bar chart was correct.

The class that was very useful here was UnitTestThreadContext. It stores a thread-specific Hibernate Session object and provides unique strings to be appended to the names of the entities you intend to create, thus ensuring the uniqueness of those entities.

public class UnitTestThreadContext {
    private static ThreadLocal<Session> threadSession = new ThreadLocal<Session>();
    private static ThreadLocal<String> threadUniqueId = new ThreadLocal<String>();
    private final static SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd HH_mm_ss_S");

    public static Session getSession() {
        Session session = threadSession.get();
        if (session == null) {
            throw new RuntimeException("Hibernate Session not set!");
        }
        return session;
    }

    public static void setSession(Session session) {
        threadSession.set(session);
    }

    public static String getUniqueSuffix() {
        String uniqueId = threadUniqueId.get();
        if (uniqueId == null) {
            uniqueId = "-" + dateFormat.format(new Date());
            threadUniqueId.set(uniqueId);
        }
        return uniqueId;
    }
}


Wrap It Up

Now I could start up a minimum running infrastructure and run my simple test case:

protected void setUp() throws Exception {
    // code omitted: start the minimum running infrastructure built around TestServerMain
}

public void testResourceBreakdown() {
    Resource resource = createResource();  // use ResourceBuilder to build unique resources
    List requests = createRequests();      // use RequestBuilder to build unique requests
    assignRequestToResource(resource, requests);
    List tickets = createTickets();        // use TicketBuilder to build unique tickets
    assignTicketToResource(resource, tickets);
    Map result = new ResourceBreakdownService().search(resource);
    verifyResult(result, resource, requests, tickets);
}

protected void tearDown() throws Exception {
    // use TicketBuilder.delete() to delete tickets
    // use RequestBuilder.delete() to delete requests
    // use ResourceBuilder.delete() to delete resources
}

I continued to write more complex test cases and refactored the product code and test code along the way.

Armed with these test cases, I broke down the God class ResourceBreakdownService little by little. I won’t bore you with the details; there are a lot of books that teach you how to refactor safely. For the sake of completeness, the resulting structure after refactoring is described below.

The terrible Map_Of_Array_Of_Map_Of… data structure is now organized into ResourceLoadBucket, which uses the Composite design pattern. It contains the estimated and actual effort at a certain level, and the next level’s efforts can be aggregated up with its aggregate() method. The resulting code was much cleaner and performed better. It even exposed some defects that had been hidden by the complexity of the original code. And of course, I evolved my test cases along the way.
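Only the ResourceLoadBucket name and its aggregate() method come from the system described here; as a rough sketch of the Composite idea, with fields and accessors that are my own assumptions, it might look like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Composite described above. Each bucket holds
// estimated and actual effort at one level; aggregate() rolls the next
// level's efforts up into this one.
class ResourceLoadBucket {
    private double estimatedEffort;
    private double actualEffort;
    private final List<ResourceLoadBucket> children = new ArrayList<>();

    ResourceLoadBucket(double estimated, double actual) {
        this.estimatedEffort = estimated;
        this.actualEffort = actual;
    }

    void addChild(ResourceLoadBucket child) {
        children.add(child);
    }

    // Recursively aggregate child efforts into this bucket.
    void aggregate() {
        for (ResourceLoadBucket child : children) {
            child.aggregate();
            estimatedEffort += child.getEstimatedEffort();
            actualEffort += child.getActualEffort();
        }
    }

    double getEstimatedEffort() { return estimatedEffort; }
    double getActualEffort() { return actualEffort; }
}
```

A per-request bucket nests under a per-request-type bucket, which nests under a per-resource bucket, replacing the nested Maps with one uniform structure.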

Throughout this exercise, the key principle I adhered to was big-picture thinking. I picked my fight and stayed focused on it, walked around obstacles that were not important to the task at hand, and built up a minimal viable testing infrastructure, which my team continued to use to refactor other areas. Some “hacking” pieces still remain in that testing infrastructure, because it doesn’t make much business sense to clean them up. I was not only able to refactor a very complex functional area, but also gained deeper knowledge of the legacy system. Treating a legacy application as if it were fragile china doesn’t bring you more safety; boldly charging in and refactoring it is how legacy applications survive into the future.

About The Author 

Chen Ping lives in Shanghai, China, and graduated with a Master’s degree in Computer Science in 2005. Since then she has worked for Lucent and Morgan Stanley. Currently she works for HP as a development manager. Outside of work, she likes to study Chinese medicine.



