An IDE tool for data structure changes

A few days ago I was writing some unit tests for my code, when I realized the data structure my code was based on was incorrect.  I had, as one of the variants of my data type, RegExBase char, where I wanted RegExBase string instead.  This is because I wanted to be able to express empty string as a valid regular expression (other strings can just be built up from RegExConcat (regex * regex)).

This change was a fairly minor one, and the only real code changes I needed to make were in the pretty printer… and in the unit tests.  The changes I needed to make for the unit tests were fairly easy (search and replace ‘ for “).  However, it did highlight a critical flaw that many people who are unwilling to adopt unit testing bring up.  Adding in unit tests increases development time, and makes refactoring more difficult.  Increasing development time is up for debate, I saw a recent study that showed that TDD increased initial development time by ~20%, but caused there to be ~20% fewer bugs (I could be wrong on those percentages).  Of course, unit tests aren’t merely for decreasing bugs right there, serve as documentation for future users about the intention of the code.  However, refactoring is certainly more difficult for a user who knows the codebase.  It makes refactoring easier for future developers, as they can understand the code, but for the original author of the code, who understands how everything fits together, the unit tests are merely a barrier in the way of refactoring.

However, the IDE has a wealth of information available to it during these transformations.  Not only does it see the before and after state of the data structure, it also sees how the actual code is changed to work with the new data structure, and it knows it wants all previously failing tests to continue failing, and all previously succeeding tests to continue succeeding.  The IDE can use this information to make (or give suggestions) at automated changes to the unit tests.

But it can go a step further.  While it can automatically synthesize new unit tests, perhaps it can go the reverse direction, and a user needs to change the unit tests, and the code already changes.  This then begins to look like a normal input-output synthesis problem, but with the additional information about how the code was changed in the unit tests.  And it can then go even further, where maybe just one or two unit tests are changed, and it tries to generalize to the normal code and the rest of the unit tests.  Or maybe only one function is changed, and it tries to generalize to the rest of the code and all the unit tests.  This could make refactoring significantly easier when there are simple changes like changing a character parameter to a string parameter, especially when working with a large codebase, where large amounts of time could be spent chasing where the build failures occur.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s