
Project Comprehension: Understanding Java Projects Efficiently

September 21, 2017


Getting new devs up to speed is imperative. Here is an approach to help new and old devs alike improve project comprehension and build a cohesive team.
Let’s start with a bit of theory.
A modern Java application is a complex system that frequently operates as a node in a larger enterprise network. By the time a new developer joins the team, the project will likely have been in development for a couple of years and contain code contributions from dozens of developers, most of whom left the project long ago. Documentation is not always up-to-date and accurate, and only a few team members may have a comprehensive picture of the project. You’ll have to catch them for short Q&As in between meetings, code reviews, and emergency deployments.
If the described problem sounds familiar, it should. Project comprehension (or, more generally, program comprehension) is one of the most overlooked areas in software development. It’s rarely a topic for water cooler conversations. However, it is significant enough to constitute a separate branch of scientific research dating back to the 1970s (e.g. “Using a behavioral theory of program comprehension”). There are even dedicated international conferences on the subject (e.g. the IEEE International Conference on Program Comprehension), with its facets spanning theoretical math, computer science, psychology and brain physiology (yes, they even used MRI to study our brains: “Measuring Neural Efficiency of Program Comprehension”).
Why is project comprehension so difficult?
Writing code from your mind: intentions are clear, and you work on one abstraction level at a time.
Reading and navigating someone else’s code: intentions are not clear and have to be deduced, and you jump between various abstraction levels.
For product owners and management, the biggest concern should be associated costs. Time is money. Every workday spent by the developer figuring out how the thing works and why it was designed that way is an expense.
Share of developers’ time spent on understanding code, as reported by industry studies:
IBM (Corbi, 1989): over 50% of time
Bell Labs (Davison, 1992): 60-80% of time for new project members, dropping to 20% as one gains experience
National Research Council in Canada (Singer, 2006): over 25% of time either searching for or looking at code
Microsoft (Hallam, 2006): an equal amount of time as design and test
Microsoft (LaToza, 2007): over 70% of time
Microsoft (Cherubini, 2007): about 95% call it a significant part of the job; over 65% do it at least once a day; over 25% do it multiple times a day
If you imagine what these percentages mean in terms of software development budgets, it’s easy to see that project comprehension is a very real problem in the industry.
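To make the arithmetic concrete, here is a minimal Java sketch. The team size and the yearly cost per developer are hypothetical values picked purely for illustration, and the class and method names are my own. Only the shares (25%, 50%, 70%) echo the ranges reported above.

```java
public class ComprehensionCost {

    /**
     * Yearly cost of the time a team spends on understanding code,
     * in the same currency as costPerDevYear.
     */
    static double yearlyComprehensionCost(int developers,
                                          double costPerDevYear,
                                          double comprehensionShare) {
        if (comprehensionShare < 0.0 || comprehensionShare > 1.0) {
            throw new IllegalArgumentException("share must be between 0 and 1");
        }
        return developers * costPerDevYear * comprehensionShare;
    }

    public static void main(String[] args) {
        int developers = 10;                // hypothetical team size
        double costPerDevYear = 100_000.0;  // hypothetical fully loaded yearly cost
        double[] shares = {0.25, 0.50, 0.70};

        for (double share : shares) {
            double cost = yearlyComprehensionCost(developers, costPerDevYear, share);
            System.out.printf("%.0f%% of time -> %.0f per year%n", share * 100, cost);
        }
    }
}
```

Plug in your own team size and rates; the point is only that even the lowest reported share turns into a noticeable budget line.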
Moving from theory to practice, below I offer a structured list of questions (a template) that may help you build a mental model of an unfamiliar Java project.
In general, projects should be studied using the top-down approach and starting from the business aspect first.
Deducing such information from documentation or the codebase can be very time-consuming and inaccurate, so it’s a perfect subject for an introductory Q&A session, which has to be done as early as possible. Ideally, these questions should be answered by at least two team members with the most experience in the project. This ensures better accuracy of such information and also gives project old-timers the rare opportunity to identify and resolve differences in their understanding of the project’s basic premises.
After building valuable context with information gained during the introductory Q&A session, it’s time to dive into the purely technical side of the project. However, don’t start analyzing source code or debugging just yet. Try approaching the matter from a higher level. The questions above stopped at the component level. You can continue studying the project on your own starting from the module level:
At this point, you’ve reached the level of individual source files. Even here you can start by collecting valuable general information.
Whether you are studying the project at the component level, module level, or class level, it’s important that you understand the logic behind the names of the units. Sometimes, this can be an eye-opener.
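One rough way to surface naming conventions at the class level is to count the last word of every class name and see which ones dominate (Service, Controller, Repository and so on). The sketch below is only an illustration: the class name NameSuffixSurvey, the helper methods, and the default src/main/java path (a Maven-style layout) are my assumptions, and it treats each .java file name as a class name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

public class NameSuffixSurvey {

    /** Returns the last camel-case word of a name, e.g. "OrderService" -> "Service". */
    static String lastWord(String className) {
        int i = className.length() - 1;
        if (i < 0) {
            return className;
        }
        if (Character.isUpperCase(className.charAt(i))) {
            // Trailing acronym such as "DTO": take the whole upper-case run
            while (i > 0 && Character.isUpperCase(className.charAt(i - 1))) {
                i--;
            }
        } else {
            // Regular word: walk back to the upper-case letter that starts it
            while (i > 0 && !Character.isUpperCase(className.charAt(i))) {
                i--;
            }
        }
        return className.substring(i);
    }

    /** Counts how often each last word occurs among the .java files under sourceRoot. */
    static Map<String, Integer> countSuffixes(Path sourceRoot) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(sourceRoot)) {
            paths.filter(Files::isRegularFile)
                 .map(p -> p.getFileName().toString())
                 .filter(name -> name.endsWith(".java"))
                 .map(name -> name.substring(0, name.length() - ".java".length()))
                 .forEach(name -> counts.merge(lastWord(name), 1, Integer::sum));
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: a Maven-style source folder; pass another path as the first argument
        Path root = Paths.get(args.length > 0 ? args[0] : "src/main/java");
        countSuffixes(root).entrySet().stream()
            .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
            .forEach(e -> System.out.println(e.getValue() + "\t" + e.getKey()));
    }
}
```

The most frequent suffixes usually hint at the architectural roles the team had in mind. The rare ones are good candidates for your follow-up questions.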
It’s really useful to study the organization of project files and folders. This organization might differ from what you see in your IDE. By default, some IDEs display the project with logical grouping and nesting of files and folders according to the IDE’s notions of “project”, “module”, “dependency”, “library”, etc. If your IDE does that, make sure you understand the logic that it’s applying.
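If you want to see the on-disk layout without any IDE grouping, a small program that prints the directory tree to a limited depth is enough. This is a minimal sketch, not a replacement for your file manager: the class name ProjectTree, the list of skipped folders, and the default depth of 3 are all my assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ProjectTree {

    // Assumed noise folders (VCS metadata, IDE settings, build output); adjust for your project
    private static final Set<String> SKIPPED =
        new HashSet<>(Arrays.asList(".git", ".idea", "target", "build", "node_modules"));

    /** Renders the tree under root as indented lines, descending at most maxDepth levels. */
    static List<String> render(Path root, int maxDepth) throws IOException {
        List<String> lines = new ArrayList<>();
        renderInto(root, 0, maxDepth, lines);
        return lines;
    }

    private static void renderInto(Path dir, int depth, int maxDepth, List<String> lines)
            throws IOException {
        List<Path> children;
        try (Stream<Path> stream = Files.list(dir)) {
            children = stream.sorted().collect(Collectors.toList());
        }
        for (Path child : children) {
            String name = child.getFileName().toString();
            if (SKIPPED.contains(name)) {
                continue;
            }
            boolean isDir = Files.isDirectory(child);
            lines.add(indent(depth) + name + (isDir ? "/" : ""));
            if (isDir && depth + 1 < maxDepth) {
                renderInto(child, depth + 1, maxDepth, lines);
            }
        }
    }

    /** Two spaces per nesting level. */
    static String indent(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        render(root, 3).forEach(System.out::println);
    }
}
```

Comparing this output with the project view in your IDE quickly shows which groupings exist on disk and which ones the IDE invented.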
Finally, if your project uses persistent storage, study the general organization of data in that storage, if reasonably possible (i.e. there aren’t hundreds or thousands of tables). For starters, just check which relational tables (or the equivalent structures in non-relational storage) exist and how they relate to one another. Many database GUI clients allow you to easily generate diagrams. It gets a bit more complex with NoSQL storage because it often promotes a schemaless design, so the only place you can see the data model is in the code itself.
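If no GUI client is at hand, the standard JDBC metadata API can give you a similar first overview for a relational database. The sketch below is a minimal example under a few assumptions: the class name SchemaOverview and its helper methods are mine, the JDBC URL is passed as the first argument, a suitable JDBC driver is on the classpath, and the driver actually reports tables and foreign keys through DatabaseMetaData (not all of them do so completely).

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class SchemaOverview {

    /** Formats one foreign-key relationship as a single readable line. */
    static String describeRelation(String fkTable, String fkColumn,
                                   String pkTable, String pkColumn) {
        return fkTable + "." + fkColumn + " -> " + pkTable + "." + pkColumn;
    }

    /** Lists the tables and, for each one, the foreign keys pointing to other tables. */
    static List<String> describeSchema(Connection connection) throws SQLException {
        DatabaseMetaData meta = connection.getMetaData();

        // Collect table names first so only one metadata result set is open at a time
        List<String> tables = new ArrayList<>();
        try (ResultSet rs = meta.getTables(null, null, "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                tables.add(rs.getString("TABLE_NAME"));
            }
        }

        List<String> lines = new ArrayList<>();
        for (String table : tables) {
            lines.add("table " + table);
            try (ResultSet fks = meta.getImportedKeys(null, null, table)) {
                while (fks.next()) {
                    lines.add("  " + describeRelation(
                        fks.getString("FKTABLE_NAME"), fks.getString("FKCOLUMN_NAME"),
                        fks.getString("PKTABLE_NAME"), fks.getString("PKCOLUMN_NAME")));
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws SQLException {
        // args[0]: JDBC URL of the database to inspect (credentials, if any, included in the URL)
        try (Connection connection = DriverManager.getConnection(args[0])) {
            describeSchema(connection).forEach(System.out::println);
        }
    }
}
```

Run it against a development copy of the database, not production. The output is a flat list, but it is usually enough to spot the central tables that most others reference.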
Now, if you were curious enough (and you should be on a new project), even after all the research described above, you should be able to come up with additional points to discuss during a follow-up Q&A session. If you’re going to bother someone with questions, prepare and structure them thoroughly, ask them early and ask them in batches. Throw in your assumptions about the project (they may be wrong). Write down everything. The more details you collect, the better. You can always cross out irrelevant or unimportant items later, but if you don’t write them down immediately you risk missing something valuable. Sum up and write down all the answers. Memorizing so much heterogeneous information is a bad idea. Don’t rely on the initial feeling of complete clarity as it can be illusory.
Having read this article, you may ask why bother the busiest developers on the team with long Q&A sessions, and not once, but twice?
I have yet to see a more time-efficient (cost-efficient) way of transferring general knowledge about a project (context) than having someone who knows the project hands-on explain things in plain words while you are able to rephrase any question or clarify any response as you go.