Design Guidelines for Developing Frameworks and Class Libraries

by Vitalii Tsybulnyk 26. August 2010 13:21
After spending the last couple of months designing a class library for the Windows Azure engineering infrastructure, I realized that design principles for frameworks and class libraries are not exactly the same as those for 'off the shelf' or enterprise applications and systems. The fundamental difference is that, in the case of applications, customers don't care about your code at all, so you can use whatever techniques you want to make your design elegant and easy to maintain. In the case of a framework or library, quite the contrary is true: your code is in some ways a user interface, which customers see, use, and care about a lot. Believe it or not, this difference significantly influences your architecture and OOD decisions in ways you may not expect.

In this post, I've collected the advice I'd give to framework/library designers. Some of these suggestions are based on the sources listed at the end [1-2]; however, most of them come from my own experience. Some of this advice might contradict traditional design principles, so be careful and apply it to public APIs and frameworks only.

Fundamentals

1. Framework designers often make the mistake of starting with the design of the object model (using various design methodologies) and then writing code samples based on the resulting API. The problem is that most design methodologies (including the most commonly used object-oriented design methodologies) are optimized for the maintainability of the resulting implementation, not for the usability of the resulting APIs. They are best suited for internal architecture designs, not for designs of the public API layer of a large framework.

When designing a framework, you should start by producing a scenario-driven API specification. This specification can either be separate from the functional specification or be part of a larger specification document. In the latter case, the API specification should precede the functional one in location and time.
The specification should contain a scenario section listing the top 5-10 scenarios for a given technology area and show code samples that implement these scenarios.

2. Common scenario APIs should not use many abstractions but rather should correspond to physical or well-known logical parts of the system. As noted before, standard OO design methodologies are aimed at producing designs that are optimized for maintainability of the code base. This makes sense, as maintenance is the largest chunk of the overall cost of developing a software product. One way of improving maintainability is through the use of abstractions, so modern design methodologies tend to produce a lot of them. The problem is that frameworks with a lot of abstractions force users to become experts in the framework architecture before they can implement even the simplest scenarios. But most developers don’t have the desire or business justification to become experts in all of the APIs such frameworks provide. For simple scenarios, developers demand APIs simple enough to be used without having to understand how entire feature areas fit together. This is something that the standard design methodologies are not optimized for, and never claimed to be optimized for.

Naming Guidelines

3. The code samples should be in at least two programming languages. This is very important, as code written in different languages sometimes differs significantly. It is also important that these scenarios be written using the different coding styles common among users of each particular language (using language-specific features). The samples should be written using language-specific casing. For example, VB.NET is case-insensitive, so samples should reflect that. Think about different languages even when you name classes: don't repeat the mistake of the NullReferenceException class, which can be thrown by VB code even though VB uses Nothing, not null.
Avoid using identifiers that conflict with keywords of widely used programming languages.

4. The simplest, but also most often missed, opportunity for making frameworks self-documenting is to reserve simple and intuitive names for the types that developers are expected to use (instantiate) in the most common scenarios. Framework designers often “burn” the best names on less commonly used types, with which most users do not have to be concerned. For example, a type used in mainline scenarios to submit print jobs to print queues should be named Printer, rather than PrintQueue. Even though technically the type represents a print queue and not the physical device (printer), from the scenario point of view Printer is the ideal name, as most people are interested in submitting print jobs and not in other operations related to the physical printer device (such as configuring the printer). If you need to provide another type that corresponds, for example, to the physical printer to be used in configuration scenarios, the type could be called PrinterConfiguration or PrinterManager.

Similarly, names of the most commonly used types should reflect usage scenarios, not the inheritance hierarchy. Most users use the leaves of an inheritance hierarchy almost exclusively, and are rarely concerned with the structure of the hierarchy. Yet API designers often see the inheritance hierarchy as the most important criterion for type name selection. For example, naming the abstract base class File and then providing a concrete type NtfsFile works well only if the expectation is that all users will understand the inheritance hierarchy before they start using the APIs. If the users do not understand the hierarchy, the first thing they will try to use, most often unsuccessfully, is the File type.
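A minimal sketch of that situation, under stated assumptions: only the type names File and NtfsFile come from the text above. The namespaces, the Path and Describe members, and the FileBase alternative are hypothetical and exist only to make the example compile.

```csharp
using System;

namespace HierarchyFirst
{
    // Design described in the text: the intuitive name goes to the abstract base.
    public abstract class File
    {
        protected File(string path) { Path = path; }
        public string Path { get; }
        public abstract string Describe();
    }

    // The type users actually have to discover and instantiate.
    public sealed class NtfsFile : File
    {
        public NtfsFile(string path) : base(path) { }
        public override string Describe() => "NTFS file at " + Path;
    }
}

namespace ScenarioFirst
{
    // Hypothetical alternative: the abstraction takes the less prominent name...
    public abstract class FileBase
    {
        protected FileBase(string path) { Path = path; }
        public string Path { get; }
        public abstract string Describe();
    }

    // ...and the type used in mainline scenarios keeps the intuitive one.
    public sealed class File : FileBase
    {
        public File(string path) : base(path) { }
        public override string Describe() => "file at " + Path;
    }
}

public static class NamingDemo
{
    public static void Main()
    {
        // What most users try first; it does not compile because File is abstract:
        // var f = new HierarchyFirst.File("notes.txt");

        // Users must first learn that NtfsFile is the concrete leaf.
        HierarchyFirst.File viaLeaf = new HierarchyFirst.NtfsFile("notes.txt");
        Console.WriteLine(viaLeaf.Describe());

        // With scenario-driven naming, the first guess simply works.
        var direct = new ScenarioFirst.File("notes.txt");
        Console.WriteLine(direct.Describe());
    }
}
```

Both namespaces expose the same capabilities; the only difference is which type receives the name a newcomer reaches for first.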
While this design works well in the object-oriented design sense (after all, NtfsFile is a kind of File), it fails the usability test, because “File” is the name most developers would intuitively think to program against.

Classes vs. Interfaces

5. In general, classes are the preferred construct for exposing abstractions. The main drawback of interfaces is that they are much less flexible than classes when it comes to allowing for evolution of APIs. Once you ship an interface, the set of its members is fixed forever. The only way to evolve interface-based APIs is to add a new interface with the additional members. A class offers much more flexibility.

6. Abstract types version much better and allow for future extensibility, but they also burn your one and only base type. Interfaces are appropriate when you are really defining a contract between two objects that is invariant over time. Abstract base types are better for defining a common base for a family of types.

7. When a class is derived from a base class, I say that the derived class has an IS-A relationship with the base. For example, a FileStream IS-A Stream. However, when a class implements an interface, I say that the implementing class has a CAN-DO relationship with the interface. For example, a FileStream CAN-DO disposing.

Methods vs. Properties

There are two general styles of API design in terms of usage of properties and methods: method-heavy APIs, where methods have a large number of parameters and the types have fewer properties, and property-heavy APIs, where methods have a small number of parameters and the types have more properties to control the semantics of the methods.

8. All else being equal, the property-heavy design is generally preferable.

9. Properties should look and act like fields as much as possible, because library users will think of them and use them as though they were fields.

10. Use a method, rather than a property, in the following situations:
 - The operation is orders of magnitude slower than a field access would be. If you are even considering providing an asynchronous version of an operation to avoid blocking the thread, it is very likely that the operation is too expensive to be a property. In particular, operations that access the network or the file system (other than once for initialization) should likely be methods, not properties.
 - The operation returns a different result each time it is called, even if the parameters don’t change. For example, the Guid.NewGuid method returns a different value each time it is called.
 - The operation has a significant and observable side effect. Note that populating an internal cache is not generally considered an observable side effect.
 - The operation returns an array. Properties that return arrays can be very misleading. Usually it is necessary to return a copy of the internal array so that the user cannot change the internal state, and callers who treat the property like a field may trigger that copy on every access. This may lead to inefficient code.

Events

11. Consider using a subclass of EventArgs as the event argument, unless you are absolutely sure the event will never need to carry any data to the event handling method. If you ship an API using EventArgs directly, you will never be able to add any data to be carried with the event without breaking compatibility. If you use a subclass, even one that is initially completely empty, you will be able to add properties to the subclass when needed.

Enums

12. Use enums if a member would otherwise have two or more Boolean parameters. Enums are much more readable when it comes to books, documentation, source code reviews, etc. Consider a method call that looks as follows:

    FileStream f = File.Open("foo.txt", true, false);

This call gives the reader no context with which to understand the meaning behind true and false.
The call would be much more usable if it were to use enums, as follows:

    FileStream f = File.Open("foo.txt", CasingOptions.CaseSensitive, FileMode.Open);

Some would ask why we don’t have a similar guideline for integers, doubles, etc. Should we find a way to “name” them as well? There is a big difference between numeric types and Booleans. You almost always use constants and variables to pass numeric values around, because it is good programming practice and you don’t want to have “magic numbers”. However, if you take a look at real-life source code, this is almost never true of Booleans. 80% of the time a Boolean argument is passed in as a literal constant, and its intention is to turn a piece of behavior on or off. We could alternatively try to establish a coding guideline that you should never pass a literal value to a method or constructor, but I don’t think it would be practical. I certainly don’t want to define a constant for each Boolean parameter I’m passing in.

Methods with two Boolean parameters, like the one in the example above, allow developers to inadvertently switch the arguments, and neither the compiler nor static analysis tools can help you. Even with just one parameter, I tend to believe it's still somewhat easier to make a mistake with Booleans: let's see, does true mean "case insensitive" or "case sensitive"?

Sources

1. MSDN, 'Design Guidelines for Developing Class Libraries'
2. Krzysztof Cwalina, Brad Abrams, 'Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries'
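Returning to guideline 12, the contrast can be sketched in a few lines. This is an illustration under assumptions, not an existing .NET API: FileLookup and its Describe method are hypothetical, the FileMode enum here is a local stand-in rather than System.IO.FileMode, and the reading of the two Booleans as "case sensitive" and "create if missing" is a guess, since the original call does not say what they mean.

```csharp
using System;

namespace EnumSketch
{
    public enum CasingOptions { CaseSensitive, CaseInsensitive }

    // Local stand-in for illustration only, not System.IO.FileMode.
    public enum FileMode { Open, Create }

    public static class FileLookup
    {
        // Boolean-heavy signature: call sites read as Describe("foo.txt", true, false),
        // and swapping the two Booleans still compiles.
        public static string Describe(string name, bool caseSensitive, bool createIfMissing)
        {
            return Describe(
                name,
                caseSensitive ? CasingOptions.CaseSensitive : CasingOptions.CaseInsensitive,
                createIfMissing ? FileMode.Create : FileMode.Open);
        }

        // Enum-based signature: every argument states its meaning at the call site,
        // and the two enum arguments cannot be swapped without a compile error.
        public static string Describe(string name, CasingOptions casing, FileMode mode)
        {
            return name + ": " + casing + ", " + mode;
        }
    }

    public static class Program
    {
        public static void Main()
        {
            Console.WriteLine(FileLookup.Describe("foo.txt", true, false));
            Console.WriteLine(FileLookup.Describe("foo.txt", CasingOptions.CaseSensitive, FileMode.Open));

            // Does not compile: the arguments are in the wrong order.
            // FileLookup.Describe("foo.txt", FileMode.Open, CasingOptions.CaseSensitive);
        }
    }
}
```

Both calls in Main produce the same description, but only the second one can be understood without opening the documentation.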


Software Architecture | Software Development

The Human Face of Software Architecture

by Vitalii Tsybulnyk 25. December 2009 13:51
Over the last two decades, the human aspects of software engineering have become a central topic among researchers and practitioners. It appears that software construction has more in common with a social activity than with technical engineering [Cockburn04]. This is partly because "software is soft" [Fowler98], which means that the established practices of civil engineering don't work well for its dynamic requirements and rapidly changing environments. But an even more important issue is that software is the product of human thinking processes, which makes the 'people factor' a key aspect for future investigation in the understanding and improvement of software construction. The result of this 'human-centric' era is that most practitioners already regard some important parts of software development as not only an engineering subject but also a social one. This list includes software architecture.

'Software architecture' is a terribly overloaded and ambiguous term, even by our industry's standards. There are a few formal definitions (for example, "architecture is the highest level concept of a system in its environment" or "architecture is the set of design decisions that must be made early in a project"), and researchers and practitioners argue with such definitions [Fowler00]. Ultimately, there is still no better definition than Ralph Johnson's: "Architecture is about the important stuff. Whatever that is." But 'importance' is a very subjective quality, and the people factor plays a huge role in the definition of software architecture. Even if you consider software architecture to be a 'set of design decisions' (such as design pattern usage, the object model, etc.), a significant number of human factors are involved in these decisions. In this article, I would like to give attention to those factors and try to summarize the main human aspects of software architecture.

Design Patterns and Best Practices Are Always Used With Bias

Design patterns are not concrete, especially regarding where and when they should be used. Alistair Cockburn, in his brilliant article [Cockburn96], shows that even the use of well-known design patterns is significantly influenced by social issues, as well as by the designer's personal bias in designing a solution. Design decisions are balance points between intent, external forces, principles, and counter-forces, so this balance is unique for every designer and project environment. Cockburn presents 15 related patterns, which show social issues driving architectural decisions. It appears from this article that identifying the social background influencing design decisions is significantly more important for successful software architecture than the engineering background of these decisions.

Architecture Is Often Built Evolutionarily

Another very important quality of architecture is agility. Software architecture first appeared in the early 1970s and inherited much from civil architecture. As a result, the methods of software architecture were quite formal, and the whole concept of architecture presupposed construction 'beforehand'. However, as a planned design stage was included in a wider and wider range of projects, it became apparent that careful and formal architecture, developed before the beginning of coding, doesn't work well for all software projects. That was the beginning of the agile era in software development. The idea of agile-style evolutionary architecture is illustrated in Martin Fowler's article [Fowler00]. Fowler proposes starting the architecture with the simplest possible solution to the current requirements (the YAGNI principle) and then 'growing' the architecture through the project's lifecycle as requirements change, using refactoring techniques for safe incremental changes.
Even for database design, which is traditionally considered to be the most 'fundamental' part of software architecture, Fowler uses refactoring and other agile techniques to build it evolutionarily [Sadalage03]. The evolutionary nature of architecture is a very important development from the point of view of human aspects in architecture. It means that real-world architecture is not a static concept with formal academic methods; instead, it is a dynamic, subjective, non-linear process with many unknown parameters. So the software architecture of some projects at the current stage is closer to handicraft, or even art, than to an industrial engineering process.

I spent over five years of my professional career working as a developer and architect in small to medium-sized projects (SMPs), some of them startups. In my own experience, carefully planned architecture almost never helps to make a project successful. On the other hand, projects with mediocre architecture quite often become successful, both technically and financially. This happens because SMPs and startups almost never have adequate resources or pre-defined long-term requirements, so careful up-front architecture leads to project failure, either because it overspends time and human resources or because the architecture rapidly becomes out of date due to changing requirements. Only human intuition and the use of an evolutionary approach allow technical leaders or architects to find a balance between all these hostile factors and bring such projects to success.

Software Architects Are Human Beings

There is also an essential fuzziness in the definition of the 'software architect' role in the software construction process. Some managers have a difficult time hiring for an 'architect' position because they don't know exactly what an architect's duties and responsibilities are.
However, if the right person appears in a team, everybody can easily recognize who the architect of the project is, even without a clear consensus on the meaning of this role. Martin Fowler distinguishes the roles of 'Architectus Reloadus' (the person who makes all the important decisions, because a single mind is needed to ensure a system’s conceptual integrity and, perhaps, because the architect doesn’t think that the team members are sufficiently skilled to make those decisions) and 'Architectus Oryzus' (the person who is very aware of what’s going on in the project, looks out for important issues and tackles them before they become serious problems, collaborates, programs with a developer, and participates in requirements sessions, helping explain to the requirements people the technical consequences of some of their ideas in nontechnical terms, etc.) [Fowler03]. However, these two types of architects could also be explained as just two different sorts of people playing the architect's role, approaching the problems in different ways to compensate for different natures and skills.

Luke Hohmann separates architects into 'tarchitects' (technical architects) and 'marchitects' (marketing architects), according to whether they take the technical or the business perspective on a system's architecture [Hohmann03]. However, this bias in an architect's role could also be explained by the individual strengths and personal spheres of interest of two different people playing the role of the software architect. Basically, 'software architect' is not just a role in a development process; it is a very concrete person with a very concrete set of human characteristics, such as knowledge, skills, experience, and communication abilities. Moreover, an architect almost never works on his own. He works in a team, so the team's skills, biases, and intra-team communications all contribute to successful design development.
In my own experience, projects are more successful if an architect is able to explain important design decisions to developers using just a sheet of paper with clumsy boxes than if an architect can draw perfect UML diagrams (which nobody can easily understand).

To summarize, the characteristics of people have a first-order effect on software development, not a lower-order effect [Cockburn99]. Social issues, human intuition, the reaction to rapid changes in a project's dynamic environment, personal bias, and the unique set of skills and experience of the team: all of these human factors influence the 'big picture' of the project and its key technical decisions, even more than all the engineering factors put together (knowledge of design patterns, optimal object hierarchy, detailed UML schemas, etc.). Engineering techniques are just good instruments, which are used by human beings with all the logical consequences.

References

[Cockburn04] Alistair Cockburn: “The end of software engineering and the start of economic-cooperative gaming”, Humans and Technology Technical Report, January 2004
[Fowler98] Martin Fowler: “Keeping Software Soft”, Distributed Computing, December 1998
[Fowler00] Martin Fowler: “Is Design Dead?”, XP 2000, July 2000
[Cockburn96] Alistair Cockburn: “The Interaction of Social Issues and Software Architecture”, Communications of the ACM, Vol. 39, No. 10, October 1996
[Sadalage03] Martin Fowler, Pramod Sadalage: “Evolutionary Database Design”, martinfowler.com, January 2003
[Fowler03] Martin Fowler: “Who Needs an Architect?”, IEEE Software, July/August 2003
[Hohmann03] Luke Hohmann: “The Difference between Marketecture and Tarchitecture”, IEEE Software, July/August 2003
[Cockburn99] Alistair Cockburn: “Characterizing people as non-linear, first-order components in software development”, Humans and Technology, October 1999


Software Architecture
