I wanted to share a trick I learned a while back, that is pretty useful. Personally, I’m not a fan of including code in your executable/bytecode that is never used, because it has the risk of exposing additional implementation, and generally makes for larger DLLs/Jars. One nice thing I can say about C# is that they did not completely remove the idea of conditional compilation. In C#, you can do the following:
#define LETS_DANCE
...
#if LETS_DANCE
// ... do something ...
#endif
However, C#’s conditional compilation is completely restricted to only being able to define a boolean, where the existence of the preprocessor symbol means it exists, and otherwise, it doesn’t. Makes sense. Conditional compilation without all the nasty misuses of them you see in C++ programs. Seems to be the best of both worlds and an obvious leg up on Java, right?
Well, actually not really. Turns out that Java has supported something similar since day one, also. In Java, the equivalent is this:
public static final boolean LETS_DANCE = false;
...
if (LETS_DANCE)
{
    // ... do something ...
}
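Why does this work without including the unreachable code? It turns out there is an important rule in Java. When the condition of an if statement is a compile-time constant that evaluates to false, the compiler can tell the body will never run, and javac leaves it out of the generated bytecode. For that to happen, the flag must be a constant the compiler can see: a final variable (typically static final) of a primitive type, here boolean, initialized with a constant expression. So, there you go. Definitely a useful thing.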
Dynamic: The New C# Keyword That Has Abuse Written All Over It
0 Comments | Posted by softwarepurist in C#
Dynamic is a new keyword that was introduced in C# 4.0. Thankfully, it hasn’t gotten a ton of press, but I can see it paving the way for all sorts of new types of abuse. By now, you’re probably familiar with my thoughts on the var keyword. Dynamic takes this one step further, by dynamically binding the class type at runtime, and determining if method calls are legal, also at runtime. Here’s an example:
public void MyMethod(dynamic param)
{
    param.Call();
}
So, let’s say you pass in an instance of a class which has the method named Call. All is well, Call() will be called, albeit with a performance penalty.
Now, consider what happens if you pass in an instance of a class for which Call() doesn’t exist. It will, instead, throw an exception, which essentially states that Call() doesn’t exist for this type.
So, what’s the problem? Very simple. You’re taking a situation where you can enlist the compiler’s help, and instead pushing it off to runtime, which entails discovering it later than necessary, in some cases, much later. This increases risk and potentially increases the time to fix, since we know that the later a bug is discovered, the more expensive it is to fix. Problem 2 is that there is absolutely a performance penalty. In a lot of cases, these penalties are not so noticeable, but spread out through an entire application, they can make a serious impact. I have seen poorly written C# applications which made both heavy and improper usage of reflection that easily can operate an order of magnitude slower than a well-written one which takes advantage of the type system.
Avoid dynamic in most cases. Use it for cases where it really matters. If you find yourself needing a lot of dynamic behavior, you are more well-suited to using a scripting language like Python or Ruby, where the syntax will be much cleaner and you don’t risk so much paradigm mix, therefore creating more understandable code. Otherwise, if you’re determined to use C#, then use it as the type-safe language that it can be, and don’t treat the type-system as optional, even if Microsoft got overzealous in feature overload.
What’s the solution? Instead, make the method look like this:
public interface ICall
{
    void Call();
}
public void MyMethod(ICall param)
{
    param.Call();
}
Using generics is also a possible solution. Just remember that when you violate the type system, you are losing the advantages of using a type-safe language.
Var/Auto is Ugly and in Some Cases, Downright Evil.
2 Comments | Posted by softwarepurist in C#, C++, STL
In this post, I’m going to describe a set of keywords, which effectively serve the same purpose. In C#, there is the var keyword. In C++, there is the auto keyword. Effectively, what they let you do is to automatically infer the type of a variable from the expression on the right hand side. Here’s an example, in C#:
var myVar = new MyClass();
and an example in C++:
auto myVar = MyClass();
The problem I have is, while this allows you a tremendous amount of flexibility, in saving redundant typing, it also potentially nullifies some of the benefits of using a language which supports type safety, because you can do some pretty nasty things which destroy the readability and can potentially introduce some subtle bugs. Worse still, tools like ReSharper encourage potentially poor usages of var. Here’s another example of the var problem:
var myVar = new MyClass().DoThis().DoThat().DoSomethingElse().NowGet();
Ok, what’s the type? Don’t know? Me neither. I find this problematic. As such, I’d like to establish some guidelines for better usage of var/auto, taking some common use cases. I’ll probably switch back and forth with examples from C++ and C#, just to illustrate the point. Here we go:
1) Usage case 1:
var x = new Y();
I consider this usage a minor evil, even though some like it a lot. Here’s my major problem: You have type inference on the wrong side. The point of using a language which supports type safety, is, well, you support the type system and let it help you. If you don’t want that, may I suggest a language that is dynamically typed? It would be nice, if var, instead worked this way:
Y x = new infer-this-type();
As a general rule, we would prefer to infer the type that’s on the right hand side, not the left. This feature is not available in a lot of languages, so for now, I would simply suggest not using var. Go with the old:
Y x = new Y();
2) Usage case 2:
var x = GetY();
Again, with this one, I prefer not to use var, for the same reason as before, with an added caveat. First of all, you should avoid using var just to avoid typing a simple type. Secondly, you can’t even figure out what the type is from reading the code. Not good. On a scale of 1 to evil, I consider this a significant evil.
3) Usage case 3:
for (typename std::vector<T>::iterator iter = container.begin(); iter != container.end(); ++iter)
{
    ...
}
Becomes:
for (auto iter = container.begin(); iter != container.end(); ++iter)
{
    ...
}
In this case, I can justify using var/auto, so I consider using auto a minor evil. However, C++ has a better mechanism for doing this. It’s called typedef. For example:
typedef typename std::vector<T>::iterator MyIter;
for (MyIter iter = container.begin(); iter != container.end(); ++iter)
{
    ...
}
So, in C++, I have trouble finding ANY usage, where I really like auto. However, in C#, there is no typedef, so I’m fine with usage of var, in the case of complicated nested classes.
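To make the comparison concrete, here is a minimal sketch, assuming a C++11 compiler. The function names sumWithTypedef and sumWithAuto are made up for illustration. Both loops use exactly the same iterator type; the static_assert in the second one spells out what auto deduced.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical helper: sums a vector using an explicitly named iterator type.
template <class T>
T sumWithTypedef(const std::vector<T>& container)
{
    typedef typename std::vector<T>::const_iterator MyIter;
    T total = T();
    for (MyIter iter = container.begin(); iter != container.end(); ++iter)
    {
        total += *iter;
    }
    return total;
}

// Same loop, letting the compiler deduce the iterator type.
template <class T>
T sumWithAuto(const std::vector<T>& container)
{
    T total = T();
    for (auto iter = container.begin(); iter != container.end(); ++iter)
    {
        // The deduced type is exactly the one the typedef spells out.
        static_assert(std::is_same<decltype(iter),
                                   typename std::vector<T>::const_iterator>::value,
                      "auto deduces the const_iterator type");
        total += *iter;
    }
    return total;
}
```

The typedef version tells the reader the type once, at the top; the auto version leaves the reader to work it out from container.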
Conclusion
Concluding, as you figured, I don’t like the usage of var or auto much. In a lot of cases, it is a minor evil. However, there are some major concerns: Readability and subverting some of the redundant checking that the type system supports. In C#, it is sometimes useful to save a lot of typing, due to the lack of the typedef keyword, while in C++, it is rarely useful. In a future article, I will tackle C#’s dynamic keyword, which I dislike far more than var. Stay tuned.
Is a Programming Language Merely Just Syntax?
0 Comments | Posted by softwarepurist in C#, C++, Developers, Programming, Threading, XML
I was interviewing a candidate recently, and we were talking about some of the programming languages he claimed to know, which was mainly focused around C++ and C#. We started to discuss some of the types of problems he finds easier to solve in each when he said something I found very misleading: “A programming language is really just syntax”. As we talked more, I started to ponder what a shallow understanding of a language this really was. It’s kind of like questioning whether a dollar is a dollar is a dollar. Smart financial people know that the source of the dollar is very important, and so the utter simplification is misleading.
Before I get too far into this, I will note that if you’re using .Net, you can get a lot closer to making the premise of the initial argument, because the .Net languages all can support the same APIs. That wasn’t the case for this individual, as he was using C++, not C++/CLI. With that being said, let’s get into some of the real substantial differences:
1) The standard libraries that the language supports. I find this to be one of the most underrated aspects of working with a language. I think part of the reason is that when you use some languages (C++) there’s a large population of programmers who don’t really make good use of the standard libraries. In my experience, this is often because they sometimes fall into using non-portable APIs, such as Microsoft’s ATL or MFC instead of STL containers, or libraries that may be considered more suitable for an embedded environment. Anyway, I think this is a critical feature, because, again consider the example of C++: Something as commonly used as XML is not standard. This is almost unfathomable for programmers of languages like Java or C#. Meanwhile, the flipside is .Net, which is immense and becomes very difficult for a developer to be proficient in all of it.
2) Third party libraries the language supports. I would consider this almost as important as the standard libraries issue. Unlike the standard libraries, there’s a much better chance that the third party libraries have API interfaces to be used in multiple languages. This will be one of the top things you would consider when choosing the appropriate language for a task.
3) Language Paradigms: This is another feature which gets underrated at times. Does the language properly support Object-Oriented Programming? Template Meta-Programming? Functional? How easily does it support multi-threading? Again, all of these are critical differences between languages which go well beyond syntax.
4) Static Typing or Dynamic Typing or Both (coughC#cough)? With static typing, you can potentially find a lot of structural errors during compile time and need less code coverage during your unit tests. With dynamic typing, you don’t have these luxuries, but have much better support for rapid development.
5) The tools that are supported for your language.
6) Syntax. I consider this one of the least important differences between languages. It’s very easy for a professional programmer to adjust to a new syntax, assuming it isn’t completely nonsensical.
In the end, I think of this like deciphering any sort of spoken or written language. At the end of the day, if you aren’t able to communicate effectively, then your words are meaningless. This goes far beyond sentence structure. When discussing programming languages, it goes far beyond syntax.
I recently came across this post: http://reprog.wordpress.com/2010/04/19/are-you-one-of-the-10-percent/ . I thought this was a very interesting topic, not because a binary search is a particularly interesting algorithm, but because of the realization that only 10% of programmers could get this right on the first try. To test, I did it myself, on paper. I was able to work it out on paper before officially testing it, but I can also see how most people can miss it. It appears to work, according to my tests. Here’s the solution I came up with:
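template <class TCollection>
size_t binarySearch(const TCollection& p_collection, const typename TCollection::value_type& p_value)
{
    if (p_collection.empty())
    {
        return size_t(-1); // nothing to search; also keeps size() - 1 from wrapping around
    } // end if
    size_t oldMax = p_collection.size() - 1;
    size_t oldMin = 0;
    size_t newIndex = (oldMax + oldMin) / 2;
    while (oldMin != oldMax)
    {
        if (p_collection[newIndex] > p_value)
        {
            if (oldMax == newIndex)
            {
                break;
            } // end if
            oldMax = newIndex;
            newIndex = (oldMax + oldMin) / 2;
        }
        else if (p_collection[newIndex] < p_value)
        {
            if (oldMin == newIndex)
            {
                break;
            } // end if
            oldMin = newIndex;
            newIndex = (oldMax + oldMin + 1) / 2;
        }
        else
        {
            return newIndex;
        } // end if
    } // end while
    // For one- and two-element collections the loop can stop before both
    // ends of the range have been compared, so check them here.
    if (p_collection[oldMin] == p_value)
    {
        return oldMin;
    } // end if
    if (p_collection[oldMax] == p_value)
    {
        return oldMax;
    } // end if
    return size_t(-1);
} // end binarySearch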
void test()
{
    std::vector<int> collection;
    collection.push_back(1);
    collection.push_back(2);
    collection.push_back(3);
    collection.push_back(5);
    collection.push_back(6);
    collection.push_back(7);
    collection.push_back(9);
    collection.push_back(10);
    assert(binarySearch(collection, 0) == size_t(-1));
    assert(binarySearch(collection, 1) == 0);
    assert(binarySearch(collection, 2) == 1);
    assert(binarySearch(collection, 3) == 2);
    assert(binarySearch(collection, 4) == size_t(-1));
    assert(binarySearch(collection, 5) == 3);
    assert(binarySearch(collection, 6) == 4);
    assert(binarySearch(collection, 7) == 5);
    assert(binarySearch(collection, 8) == size_t(-1));
    assert(binarySearch(collection, 9) == 6);
    assert(binarySearch(collection, 10) == 7);
    assert(binarySearch(collection, 11) == size_t(-1));
} // end test
It’s easy to miss it, because there are edge cases, a truncation problem, and multiple termination conditions. I only discovered these cases when I worked through a very small sample set in my head. What worried me about the initial find is that I wasn’t shocked by the result of only 10% of developers getting this right. I think what’s concerning to me is that many people wouldn’t have tested it in isolation, just felt that they had a solution that worked, and moved on. But, then I can’t blame developers for this. I think this is the fault of working in an industry where being thorough isn’t as rewarded as it should be.
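One way to cross-check a hand-written search is against the standard library. The sketch below wraps std::lower_bound and keeps the same size_t(-1) convention for "not found". The name binarySearchStd is made up for this example, and it assumes a sorted std::vector. It is worth running any hand-rolled version against something like this on empty, one-element and two-element collections, which the eight-element test above doesn’t exercise.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical cross-check: returns the index of p_value in a sorted vector,
// or size_t(-1) when it is absent.
template <class T>
std::size_t binarySearchStd(const std::vector<T>& p_collection, const T& p_value)
{
    // lower_bound finds the first element that is not less than p_value.
    typename std::vector<T>::const_iterator it =
        std::lower_bound(p_collection.begin(), p_collection.end(), p_value);
    if (it != p_collection.end() && !(p_value < *it))
    {
        return static_cast<std::size_t>(it - p_collection.begin());
    }
    return std::size_t(-1);
}
```

Because the search range is half-open (begin to end), the empty collection needs no special case here.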
The more I work, the more I notice this problem. There is so much focus on short-term goals and results, that it’s easy to cut corners. “You need to finish feature X by COB today.” Well, in order to accomplish this, I notice most people will skip otherwise essential things such as unit testing, and sometimes not even test at all. The design phase is also often skipped, perhaps even more often than testing. When you’re given ambitious or impossible deadlines, it’s natural to skip designing and start coding right away. Worse is that if you do spend time designing and testing the software, you often aren’t rewarded for your efforts; Nobody notices.
Ultimately, what I’m noticing is that a lot of developers remove as much as 50% of the engineering effort, for 80% of the result. While maybe not the 80/20 principle, it’s close enough to feel like it. The problem is that I don’t believe software can follow the 80/20 principle and be effective. That loss of 20% is a lot of bugs… many relatively benign, but I also feel like that remaining 20% also encompasses most of the difficult to reproduce bugs; The bugs we are forced to write off and sometimes never successfully fix. At best, we pull out a roll of duct tape, use enough tape to keep it stable enough to hold together, hoping nobody notices.
As a developer, I can fully admit that when I’ve been pressed with unreasonable schedules, I’ve occasionally responded by cutting corners, just like most people would have handled it. This rarely resulted in a better product. It’s also almost impossible to later go back and rework the foundation when the tower is in danger of toppling over. In other industries, proceeding in this manner would be seen as ludicrous, but in software, it’s accepted. But, it’s unfair to blame developers for this; I blame short-sighted business models and managers who focus mostly on CYOA.
Anyway, coming full circle, I wouldn’t be surprised if there’s a search algorithm in your code base that doesn’t quite work for all cases. If you find a broken search in your code base, let me know and maybe I’ll even post your findings on this website.
Hi all. In this edition, I want to discuss a C++ logging system a little bit. I’ve been through the process of developing loggers many times in my career. Each time they get iteratively better, and so I feel this would be an interesting topic to discuss. Now, of course, you could just use a logger from an already existing library, but then there would be no point in this article. So, let’s play along for my sake.
First Approach
Now, obviously, the first thing a logger should do is… log! So, a basic first step might have our logger looking like this:
class Logger
{
public:
    void log(const std::string& p_output)
    {
        file_ << p_output;
    }
private:
    std::ofstream file_; // the log file; opening it is omitted here
};
Logger& operator<<(Logger& p_logger, const std::string& p_output)
{
    p_logger.log(p_output);
    return p_logger;
}
Simple enough, and effective. However, there are obviously some improvements we could make:
- Logging of additional information: file, method name, line.
- Scoped logging.
- Conditional logging. The point of this would be to log only if a message is important enough to log.
- Effect on application performance. Asynchronous logging.
- Registering different output sources (Difficulty: Intermediate-Hard. Mostly because it hits up a part of the standard most people avoid. Unfortunately, out of the scope of this article, but can write a follow-up if there’s interest.)
Logging of Additional Information
It’s always very helpful if we can get more information when we’re logging. A log message by itself is often useless, since it may have been called from a variety of different points or the message may be repeated. By providing the right information in the right places, we can help alleviate the first issue and remove any doubt for the second. By putting logging in areas of interest, you might have 5 log messages which happen to precede the call location where the error message you’re really interested in is occurring. Of course, a full stack trace in C++ is more effective, but this, while doable, is way beyond the scope of this article.
The output method might be changed to this:
class Logger
{
public:
    void log(const char* p_pFile, const char* p_pMethod,
             size_t p_line, const std::string& p_output)
    {
        file_ << p_pFile << ":" << p_pMethod << ":" << p_line << ": " << p_output;
    }
};
// Called like this:
logger.log(__FILE__, __FUNCTION__, __LINE__, ... msg ... );
This is certainly starting to get a bit verbose, but you could potentially cover this with a macro, if it’s a win in this case. Unfortunately, operator<< is eliminated. There is some trickery that would let you keep it, but it’s probably fine not to have it in this case. Besides, log() is more readable than << for C++ newbies.
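As a rough sketch of the macro idea: the Logger below is a stand-in that writes to any std::ostream (an assumption, so the example is self-contained and easy to check), and LOG_MSG is a hypothetical macro name. It uses __func__, which is the standard C++11 spelling; __FUNCTION__ is a common compiler extension.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Minimal stand-in for the Logger above; it writes to any std::ostream
// (an assumption made so the sketch is easy to test).
class Logger
{
public:
    explicit Logger(std::ostream& p_out) : out_(p_out) {}

    void log(const char* p_pFile, const char* p_pMethod,
             std::size_t p_line, const std::string& p_output)
    {
        out_ << p_pFile << ":" << p_pMethod << ":" << p_line << ": "
             << p_output << '\n';
    }

private:
    std::ostream& out_;
};

// Hypothetical convenience macro: fills in the call site automatically.
// It has to be a macro, because a function would report its own location.
#define LOG_MSG(logger, msg) \
    (logger).log(__FILE__, __func__, __LINE__, (msg))
```

A call then shrinks to LOG_MSG(logger, "something happened"); and still records the file, function and line of the caller.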
Scoped Logging
This is my favorite and most important feature for any logging system. In C++, it’s useful to log not only when logging of a scope started, but also when it ended. If you can scope the log calls, then information will be shared between the constructor and destructor of this auto log object and the destructor is guaranteed to be called. Here’s an example:
class AutoLogger
{
public:
    AutoLogger(Logger& p_logger, const char* p_file,
               const char* p_function, size_t p_line, const std::string& p_msg) :
        logger_(p_logger), file_(p_file), function_(p_function),
        line_(p_line), msg_(p_msg)
    {
        logger_.log(file_, function_, line_, msg_);
    }
    ~AutoLogger()
    {
        logger_.log(file_, function_, line_, msg_);
    }
private:
    Logger& logger_;
    const char* file_;
    const char* function_;
    size_t line_;
    std::string msg_; // stored by value so it is still valid in the destructor
};
AutoLogger autoLog(logger, __FILE__, __FUNCTION__, __LINE__, ... msg ...);
... do stuff ...
// destructor automatically called, so it logs at exiting the scope.
This has a lot of benefits. The main one is that the exit message is guaranteed to be logged, and you save the typing of repeating the same log call on every exit path. You also potentially gain the ability to have the scoped version log additional information, such as time deltas, memory deltas, etc… this can even be a weak form of profiling.
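Here is a minimal sketch of the time-delta idea, assuming C++11 and writing to a plain std::ostream rather than the Logger class. ScopedTimerLog is a made-up name. The destructor reports how long the scope was alive.

```cpp
#include <chrono>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical scope logger: writes one line on entry and one on exit,
// the latter with the elapsed time in microseconds.
class ScopedTimerLog
{
public:
    ScopedTimerLog(std::ostream& p_out, const std::string& p_scope) :
        out_(p_out), scope_(p_scope),
        start_(std::chrono::steady_clock::now())
    {
        out_ << "enter " << scope_ << '\n';
    }

    ~ScopedTimerLog()
    {
        const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
            std::chrono::steady_clock::now() - start_);
        out_ << "exit " << scope_ << " after " << elapsed.count() << "us\n";
    }

    // Non-copyable: the destructor should log exactly once per scope.
    ScopedTimerLog(const ScopedTimerLog&) = delete;
    ScopedTimerLog& operator=(const ScopedTimerLog&) = delete;

private:
    std::ostream& out_;
    std::string scope_;
    std::chrono::steady_clock::time_point start_;
};
```

steady_clock is used rather than system_clock because it never jumps backwards, which matters when you are subtracting two readings.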
Conditional Logging
You could add an optional log type which is a conditional. When provided, it will check the logger’s conditional and see if it’s allowed. If not, it won’t log. This is purely meant as an efficiency measure, preventing log files from getting ridiculously large, while keeping old log messages intact and not having to change otherwise working code. As an example:
class Logger
{
public:
    void log(const char* p_pFile, const char* p_pMethod, size_t p_line,
             LogCondition p_logCondition, const std::string& p_output)
    {
        if (p_logCondition is allowed) // pseudocode for the check
        {
            file_ << p_pFile << ":" << p_pMethod <<
                ":" << p_line << ": " << p_output;
        }
    }
};
// Called like this:
// The following line gets logged
logger.log(__FILE__, __FUNCTION__, __LINE__, MUST_LOG, ... msg ... );
// This one, however, is not logged
logger.log(__FILE__, __FUNCTION__, __LINE__, TRIVIAL, ... msg ... );
Conditional logging holds the advantages that I mentioned above and is very easy to create. Just fill in with the types you need. In this case, I mentioned MUST_LOG and TRIVIAL.
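One possible way to fill in the "is allowed" check is a simple threshold. This is only a sketch: the ordering of the enum, the extra NORMAL level, the ConditionalLogger class and its isAllowed method are all assumptions made for illustration, and the file/method/line arguments are left out to keep it short.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical severity levels, ordered from least to most important.
enum LogCondition
{
    TRIVIAL = 0,
    NORMAL = 1,
    MUST_LOG = 2
};

class ConditionalLogger
{
public:
    ConditionalLogger(std::ostream& p_out, LogCondition p_minimum) :
        out_(p_out), minimum_(p_minimum)
    {
    }

    // A message is allowed when it is at least as important as the threshold.
    bool isAllowed(LogCondition p_logCondition) const
    {
        return p_logCondition >= minimum_;
    }

    void log(LogCondition p_logCondition, const std::string& p_output)
    {
        if (isAllowed(p_logCondition))
        {
            out_ << p_output << '\n';
        }
    }

private:
    std::ostream& out_;
    LogCondition minimum_;
};
```

With the threshold set to NORMAL, a MUST_LOG message is written and a TRIVIAL one is silently skipped, without touching the call sites.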
Asynchronous Logging
Now, if we implement what we’ve gone through so far, we’ve got a decent logging system. Unfortunately, put enough logging in place, and your application is slowed down to the speed of the hard disk, at best. This is absolutely unacceptable, of course. So, the obvious solution is to create a thread or pool of threads to handle disk writes to the system. When you make a log call, it actually pushes the “request” on the queue and then allows the logging system to take care of the request when it is free. This prevents your application being slowed down due to logging (if done with care). Without this, any multithreaded application would become a single-threaded application which is basically synchronized on files. Now, at this point, you’re probably thinking that we’ve fixed the last major issue. Almost…
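A minimal sketch of the queue-plus-worker idea, assuming C++11 threads. AsyncLogger is a made-up class, it uses a single worker rather than a pool, and it writes to a std::ostream instead of a file so the result is easy to inspect. Note the queue here is unbounded.

```cpp
#include <condition_variable>
#include <mutex>
#include <ostream>
#include <queue>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical asynchronous logger: log() only enqueues; a single worker
// thread performs the (slow) writes.
class AsyncLogger
{
public:
    explicit AsyncLogger(std::ostream& p_out) :
        out_(p_out), done_(false), worker_(&AsyncLogger::run, this)
    {
    }

    // Drains whatever is still queued, then stops the worker.
    ~AsyncLogger()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    void log(const std::string& p_output)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(p_output);
        }
        ready_.notify_one();
    }

private:
    void run()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;)
        {
            ready_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty())
            {
                return; // done_ is set and nothing is left to write
            }
            std::string message = queue_.front();
            queue_.pop();
            lock.unlock(); // write without holding the lock
            out_ << message << '\n';
            lock.lock();
        }
    }

    std::ostream& out_;
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> queue_;
    bool done_;
    std::thread worker_; // declared last so the other members exist first
};
```

The caller only pays for a lock and a push; the write itself happens on the worker, outside the lock, so other threads can keep logging while the disk is busy.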
Maximum Queue Size
This hit me once when working on a server application. Basically, the story behind it is you almost always won’t be logging a ton of information in your regular application. You only start logging at key points and if things are starting to look “fishy” (e.g.: unexpected value in method Blah::Blah2()). However, on a server application, things scale quickly with load and then are limited based on hardware. Ultimately, a load test with 2000 simulated users worked in a testing environment. However, at around 800 users in a live environment, we started to hit what appeared to be increasing memory leaks, which is never good. To make a long story short, it turned out that the live system was using RAID and the testing environment wasn’t. So, because the hard disk was considerably slower, it hit hard disk write limits much quicker. This issue was easily corrected by providing a maximum queue size. If you’re queueing, say, 10k messages, you then hit a decision path for how you’d like to handle: toss away messages until the queue shrinks to say, 8k, or start having the logger sync with the app. The first solution is easier to code, so we went with that one.
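The "toss away messages" policy could look something like the sketch below. This is one interpretation, not the code from that server: once the queue hits a high-water mark, new messages are discarded until the writer has drained it down to a low-water mark. BoundedLogQueue and its method names are hypothetical, and the class is not thread-safe on its own.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical bounded message queue (not thread-safe on its own; a real
// logger would guard it with the same mutex as its other queue operations).
class BoundedLogQueue
{
public:
    BoundedLogQueue(std::size_t p_highWater, std::size_t p_lowWater) :
        highWater_(p_highWater), lowWater_(p_lowWater),
        dropping_(false), dropped_(0)
    {
    }

    // Returns false when the message was thrown away.
    bool push(const std::string& p_message)
    {
        if (dropping_ && messages_.size() <= lowWater_)
        {
            dropping_ = false; // the writer has caught up enough
        }
        if (!dropping_ && messages_.size() >= highWater_)
        {
            dropping_ = true; // queue is full: start discarding
        }
        if (dropping_)
        {
            ++dropped_;
            return false;
        }
        messages_.push_back(p_message);
        return true;
    }

    // Called by the writer; returns false when there is nothing to write.
    bool pop(std::string& p_message)
    {
        if (messages_.empty())
        {
            return false;
        }
        p_message = messages_.front();
        messages_.pop_front();
        return true;
    }

    std::size_t size() const { return messages_.size(); }
    std::size_t dropped() const { return dropped_; }

private:
    std::size_t highWater_;
    std::size_t lowWater_;
    bool dropping_;
    std::size_t dropped_;
    std::deque<std::string> messages_;
};
```

With the numbers from the story you would construct it as BoundedLogQueue(10000, 8000). Keeping a count of dropped messages is worthwhile, so the log can at least say that something is missing.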
Ultimately, this article discussed writing a basic logger from scratch with example code. Hopefully you found it useful.
Inheritance vs. Composition
3 Comments | Posted by softwarepurist in Beginner, C++, Programming
If you’re familiar with Object-Oriented programming, and potentially, UML, these terms may be familiar to you. If not, I will present them below.
Inheritance is a relationship between two classes, where there is a parent-child relationship, in the sense that the parent appears higher on the tree than the children and a parent may have multiple children. It follows from this relationship that the children inherit all of the parent’s state and behavior and may additionally add their own. I’ll demonstrate an example with UML, done with SmartDraw, and some pseudo-code:

class Shape
{
public:
    // Virtual destructor so a Circle or Square can be deleted through a Shape*.
    virtual ~Shape() {}
    virtual void draw() = 0;
};
class Circle : public Shape
{
public:
    Circle(const Pos2f& p_pos, float p_radius) :
        pos_(p_pos), radius_(p_radius)
    {
    }
    virtual void draw()
    {
        // perform draw operation based on position and radius
    }
private:
    Pos2f pos_;
    float radius_;
};
class Square : public Shape
{
public:
    Square(const Pos2f& p_topLeftPos, const Dim2f& p_dimensions) :
        topLeftPos_(p_topLeftPos), dimensions_(p_dimensions)
    {
    }
    virtual void draw()
    {
        // perform draw operation based on top-left position, width and height
    }
private:
    Pos2f topLeftPos_;
    Dim2f dimensions_;
};
This relationship makes sense as a class hierarchy because there’s a simple guideline to determine whether a class should be inherited from another. If you can justify that a class “is-a” more specific type of another type, then semantically, it makes sense to inherit, if not it doesn’t. Applying it to this case, a Circle is-a type of Shape. A Square is-a type of Shape. So, therefore, inheritance is appropriate.
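To show what the is-a relationship buys you, here is a small self-contained sketch. It simplifies the classes above: draw() takes a std::ostream and just writes a description (an assumption, so the effect can be observed), the Pos2f/Dim2f types are replaced with a single float, and drawAll is a made-up helper.

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <vector>

class Shape
{
public:
    virtual ~Shape() {}
    // Assumption: "drawing" just writes a description to a stream.
    virtual void draw(std::ostream& p_out) const = 0;
};

class Circle : public Shape
{
public:
    explicit Circle(float p_radius) : radius_(p_radius) {}
    void draw(std::ostream& p_out) const override
    {
        p_out << "circle r=" << radius_ << '\n';
    }
private:
    float radius_;
};

class Square : public Shape
{
public:
    explicit Square(float p_side) : side_(p_side) {}
    void draw(std::ostream& p_out) const override
    {
        p_out << "square side=" << side_ << '\n';
    }
private:
    float side_;
};

// Works with any Shape: the caller never needs to know the concrete type.
void drawAll(const std::vector<std::unique_ptr<Shape> >& p_shapes,
             std::ostream& p_out)
{
    for (const auto& shape : p_shapes)
    {
        shape->draw(p_out);
    }
}
```

drawAll never mentions Circle or Square, so adding a Triangle later means writing one new class and touching nothing else.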
This brings up the question, what do we do when inheritance isn’t appropriate? There is also object composition. Object composition basically means that a type or class is “composed” of other types or classes. If you make use of object composition it looks more like this:
Here’s the UML diagram, done with SmartDraw:

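// Bill and Feather come first: Duck holds them by value, so the compiler
// needs their full definitions before it sees Duck.
class Bill
{
    ...
};
class Feather
{
    ...
};
class Duck
{
public:
    void quack()
    {
        // logic to quack
    }
    void fly()
    {
        // logic to fly
    }
private:
    Bill duckBill_;
    std::vector<Feather> feathers_;
};
class Pond
{
    ...
private:
    Duck* pDuck_;
};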
This example demonstrates what it might look like for code that involves a Duck. A duck has-a bill, and has-some feathers. When a relationship between two classes is a has-a relationship, then you should use composition. There are two types of composition, which are demonstrated in the code and diagram above. The more general form of composition implies full ownership, including responsibility for an object’s lifetime. This is true for the bill of a duck and its feathers, for without an instance of a duck object, neither the feathers nor bill exist. (Side note: This is assuming we can’t pluck the feathers of the duck or remove its bill, which would get into a concept called Transfer of Ownership, but that is beyond the scope of this article.) The second type of composition is called aggregation, which implies that while an object has another object, it doesn’t own it. In the example I showed above, the Pond has this sort of relationship with the Duck. An instance of a Pond object may keep a reference (or pointer) to the Duck object while the Duck is in the Pond, but isn’t responsible for managing the lifetime of the Duck object, so it is an aggregation relationship.
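The lifetime difference can be sketched in a few lines. The method names here (featherCount, enter, leave, hasDuck) are made up for illustration, and the classes are stripped down to just what the example needs.

```cpp
#include <cstddef>
#include <vector>

class Feather {};
class Bill {};

class Duck
{
public:
    explicit Duck(std::size_t p_featherCount) : feathers_(p_featherCount) {}
    std::size_t featherCount() const { return feathers_.size(); }
private:
    Bill duckBill_;                 // composition: lives and dies with the Duck
    std::vector<Feather> feathers_; // composition
};

class Pond
{
public:
    Pond() : pDuck_(nullptr) {}
    void enter(Duck& p_duck) { pDuck_ = &p_duck; }
    void leave() { pDuck_ = nullptr; }
    bool hasDuck() const { return pDuck_ != nullptr; }
private:
    Duck* pDuck_; // aggregation: non-owning, never deleted by the Pond
};
```

Pond has no destructor that touches pDuck_, which is the whole point: when the Pond goes away the Duck carries on, whereas the Bill and Feathers are destroyed automatically along with their Duck.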
Now that I’ve clarified inheritance vs. composition a bit, it brings up some additional questions. In the case of the Duck, a common design problem I’ve seen before is that in the case of a one-to-one relationship, like there is between the Duck and the Bill, you could theoretically derive Duck from the Bill class. The argument I’ve heard is, “This violates the is-a rule, but really what is the big deal? I was going to promote the methods of Bill to be public on Duck anyway? This saves typing.” The problem is semantic understanding. This type of inheritance is generally meant to be a polymorphic relationship. Polymorphism requires an is-a relationship to make sense. Now, there are cases where it makes sense to inherit when there isn’t an is-a relationship, but generally you should avoid it, when it’s obviously wrong. A codebase that violates is-a/has-a rules is much more difficult to understand than one that follows them. Ultimately, it just flows more naturally, and you wind up avoiding some dead ends that code that violates this runs into.
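For what it’s worth, the typing saved by inheriting is usually a single forwarding function per promoted method. A minimal sketch, with a made-up peck() method standing in for whatever Bill actually exposes:

```cpp
#include <string>
#include <type_traits>

class Bill
{
public:
    std::string peck() const { return "peck"; }
};

class Duck
{
public:
    // Forwarding function: one line of typing, and Duck is still not a Bill.
    std::string peck() const { return duckBill_.peck(); }
private:
    Bill duckBill_;
};

// Nobody can accidentally pass a Duck where a Bill is expected.
static_assert(!std::is_convertible<Duck*, Bill*>::value,
              "Duck has-a Bill; it is not a Bill");
```

Had Duck been derived publicly from Bill, that static_assert would fail, and any function taking a Bill& would quietly accept a Duck.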
In a future article I will discuss some of the more grey area cases. Hopefully, you found this to be a good introduction.
Purism vs. Pragmatism
3 Comments | Posted by softwarepurist in Boost, C++, Programming, STL, Software Development, SoftwarePurist.com
So, one thing you might be wondering is why I titled my blog, The Software Purist. One friend even surmised that it had to do with some cleansing program where I would annihilate all the impure programmers. While a humorous suggestion, it wasn’t what I was aiming for.
The long and short of it is that at a previous job, two different managers used to refer to me as a C++ purist, for my approach to tackling issues when programming in the C++ language. It was generally meant more as a compliment, because I believe they respected me “sticking to my guns”, so to speak. My general mentality at the time was that all problems can be solved by using well-defined methods, best practices and always maintaining a preference for anything standard or that has the possibility of becoming standard in the future (such as some of the Boost libraries). So, if there was an approach, my general methodology was a question of, “How would Scott Meyers solve this problem?” It’s difficult to get more pure than following Scott Meyers’ advice in Effective C++, at least at the time.
Now that we’ve been through that intro, perhaps some definitions, from my perspective, would be useful. There are two camps of extremes in software development. First, there are the purists. Purism is about following language standards, following best practices, avoiding legacy features, ignoring language loopholes and using the language as intended by its authors. For example, a purist, in C++, might completely avoid and ignore C++ macros, except when necessary for things such as header guards. A C++ purist might also prefer std::copy to memcpy, even when either would work and performance is equal, simply because memcpy is considered outdated. A C++ purist would use std::string instead of MFC’s CString or Qt’s QString. I know, because I did this and generally stick to this advice, unless it gets in the way.
Pragmatism is exactly the opposite. Pragmatism dictates that the most important thing is getting the job done. If you’re a pragmatist, your goal is generally to get the product out the door. So, if macros streamline your process and make it quicker to code something, this is more important than following a set of recommendations, because you can get your product out the door faster. Likewise, memcpy is a few characters less typing than std::copy and you’re more used to it, so you use it over std::copy, even though iterators are considered “safer” than pointers. Finally, you might use CString, because it gives you direct access to the internal memory, so you don’t have to wrestle through an abstraction and you can use a C-style API if you choose.
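To put the std::copy vs. memcpy example side by side, here is a small sketch. The wrapper names copyWithMemcpy and copyWithStdCopy are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Pragmatist's version: raw bytes, correct only for trivially copyable types.
void copyWithMemcpy(const int* p_source, int* p_destination, std::size_t p_count)
{
    std::memcpy(p_destination, p_source, p_count * sizeof(int));
}

// Purist's version: same result, but works with any iterator and element type.
template <class TInput, class TOutput>
void copyWithStdCopy(TInput p_first, TInput p_last, TOutput p_destination)
{
    std::copy(p_first, p_last, p_destination);
}
```

For an array of int the two do the same job. The difference shows up when the element type changes to something like std::string, where copying raw bytes is no longer valid and only the std::copy version stays correct.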
Now, these are all fair and valid views. Yes, that’s right. Both views are fair, both are valid. Both are also extreme. We know from many aspects of life, that extremes are generally bad. A temperature of too hot or too cold is uncomfortable. Driving at a speed of too fast or too slow is uncomfortable. And so on… The same holds true for software. Ultimately, we all evolve as developers. I have a theory that many developers start out as purists and start to migrate towards gaining more pragmatism each year, once they become more seasoned with more business experience. Anyway, as most developers know, it can be a problem to be either too pure or too close to pragmatism.
If you’re too pure, you will probably think every pragmatist’s code is terrible, even to the point of saying that they’re “under-engineering”. I know, because I was there, a long time ago. In some situations, what’s called for is simply getting the job done. Businesses have a need to have a product ship and a product that doesn’t ship is a failure, even if the code was beautiful. Purism often has a large focus on long term goals, which is definitely beneficial. The secret knowledge that purists don’t want to let out is that by following purist methodology: 1) The coding becomes very mechanical and repetitive, which is great for developing, because if you can reuse patterns of development, it gets easier and becomes more natural each time. 2) The purist has a keen sense and sensitivity to maintaining the code later and they know if they take a shortcut now, they will be grunting and groaning when they have to look at it later. The truth is that these are really valid points, and in a lot of situations, this is the right approach. Of course, there are some items that have to be compromised in the name of getting things done. On the flip side…
If you’re too pragmatic, you will probably think every purist is overengineering. Why build a packet class for every packet to the server? You can more easily code it inline, in one large source file, right? The problem with this approach is when you need to maintain it later, such as putting it in a test harness, all of this hardcoded logic becomes a mess to fit in. It’s difficult to extract the 80 out of 200 lines of a large function that you actually want to test, while stubbing out the other 120; this necessitates refactoring. Both extremes find that refactoring is time consuming. Extreme purists find that in reality, it’s difficult to find time to refactor, so they try to code in such a way to avoid this step. Extreme pragmatists find that it’s difficult to find time to ever refactor, so they just never bother with it and the code is messy forever. Refactoring is one of those concepts that is good in theory, but unless you get everyone to buy in, it doesn’t happen in practice. Extreme pragmatists often don’t buy into refactoring; they got used to the mess, and have developed a mental map of the shortcuts, so they can often work effectively, despite the challenges. Extreme pragmatism creates a potentially difficult work environment for coworkers, because the code becomes a minefield for the uninformed to trip over.
Ultimately, as we know, any sort of extremist view should generally be avoided. There is rarely a single answer to any problem. Development has to be done with care and the beauty of the code is important. However, don’t lose sight of actually shipping the product. There must be a balance. If you feel like you are in the middle and you are accused of either overengineering or underengineering, it’s very possible that the person you’re talking to is an extremist. As for The Software Purist, my general approach now is to stay somewhere in between, but I still lean a bit towards the purist side, because ultimately, I have a high appreciation for the beauty of code.
Programming Paradigms
1 Comment | Posted by softwarepurist in Actionscript, Ada, C++, Java, Programming
If you’re an experienced programmer, this probably won’t be new information, but I hope to at least present it in a new way. When developing software, there are different ways to think about the problems you’re trying to solve, and they affect the entire process, from the initial design to how it’s coded and even how it’s tested. I discuss a few of these in this article.
Unstructured
These days, unstructured styles of programming are generally frowned upon. In the old days, you might have programmed unstructured in older dialects of Fortran, COBOL, Basic, etc…, and used GOTO to move between sections of code. All variables were global to the program and so you had to be very careful about the usage and naming. This type of code was simple to code at first, but very difficult to read. In addition, it didn’t scale well at all. As programs got larger, it became exponentially more difficult to maintain. There isn’t much to talk about here, because coding in this manner is rare nowadays, except in specialized fields. You can imagine a program looking very sequential, though. Something like this:
if my height is less than 4' goto short
elif my height is greater than 6' goto tall
else goto normal
short:
    print "You are too short to go on this ride"
tall:
    print "You must wear a helmet."
    ... offer helmet ...
    if accept goto getonride
    else print "No dice..."
normal:
    goto getonride
getonride:
    print "Welcome onto the ride."
exit:
    print "You must leave. NOW!"
As you can see, this can quickly get out of control. Reuse was almost non-existent.
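For readers who want something they can compile, here is a rough C++ sketch of the same flow, using goto and global variables on purpose. The names (g_heightInches, g_acceptsHelmet, g_output, runRide) are made up for illustration, and the explicit jumps to the exit label are an assumption about what the pseudocode intends, since as written it falls through from one label to the next.

```cpp
#include <string>

// Everything is global, as it would have been in an unstructured program.
// All names here are hypothetical.
int g_heightInches = 0;
bool g_acceptsHelmet = false;
std::string g_output;

void runRide()
{
    g_output.clear();
    if (g_heightInches < 48) goto too_short;   // under 4 feet
    if (g_heightInches > 72) goto tall;        // over 6 feet
    goto get_on_ride;

too_short:
    g_output += "You are too short to go on this ride\n";
    goto leave;   // assumed: without this jump, control falls into "tall"

tall:
    g_output += "You must wear a helmet.\n";
    if (g_acceptsHelmet) goto get_on_ride;
    g_output += "No dice...\n";
    goto leave;   // assumed: a refused helmet means no ride

get_on_ride:
    g_output += "Welcome onto the ride.\n";

leave:
    g_output += "You must leave. NOW!\n";
}
```

Even at this size, following one rider through the function means tracing jumps by hand, and forgetting a single goto changes the behavior of a different branch.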
Procedural Programming/Imperative Programming
Procedural programming was the first type of structured programming and it started to become widespread in the late 60s and early 70s. It was probably the first major step towards the programming we do today. Structured programming is still used quite a bit and is the basis for some of the later programming paradigms. It is also responsible for mostly eliminating the widespread use of GOTO. This methodology was more commonly taught at the time I was in college, as Object-Oriented programming was newer at the time, and not very well understood outside of a smaller community. The main concept behind it is that any task can be broken up into sub-tasks. Emphasis is placed on functionality and data structures. With this, it became easy to break down a workflow with direct relationships and traceability to a functional specification, often provided by a customer. The specification can be derived directly into software functional requirements, then into software code, and then into tests. Because all of the emphasis is on functionality, and the code is structured in that manner, it provides a lower barrier to entry than newer techniques such as Object-Oriented Programming. Due to this, figuring out where a piece of code is implemented is often simply a matter of using grep (on Linux/Unix) or find (on Windows). Data definitions can be provided by systems engineers, because once the functionality is defined, the data required is also easily derived.
Some of the common procedural programming languages are Ada-83, Algol, Pascal and C. Of course, at different points, many of the procedural programming languages later gained Object-Oriented features with new revisions (Ada-95, C++, etc…). One of the main problems with structured programming concerns reuse. You can successfully meet functional requirements, but later notice that different components of the system have 95% similar functionality. It becomes difficult to directly express these relationships in your code. To reuse sections of code across different types, you wind up with large if and/or switch statements. The problem then becomes that for each new type that supports this relationship, you wind up modifying working code, which is always risky. It also makes it difficult to supply a stable library, because elements in the library will often be changed.
As an example, let’s take the case of a vehicle and then provide multiple types of vehicles, a car and a bus. Example in C:
void drive(int type, void* obj)
{
    switch (type)
    {
        case CAR:
        {
            Car* car = (Car*)obj;
            // ... Logic to accelerate car
        } break;
        case BUS:
        {
            Bus* bus = (Bus*)obj;
            // ... Logic to accelerate bus
        } break;
        default:
        {
        } break;
    }
}
Later, we provide a boat:
void drive(int type, void* obj)
{
    switch (type)
    {
        case CAR:
        {
            Car* car = (Car*)obj;
            // ... Logic to accelerate car
        } break;
        case BUS:
        {
            Bus* bus = (Bus*)obj;
            // ... Logic to accelerate bus
        } break;
        case BOAT:
        {
            Boat* boat = (Boat*)obj;
            // ... Logic to accelerate boat
        } break;
        default:
        {
        } break;
    }
}
As you can probably see, it’s relatively easy to figure out where to insert the code, but the maintenance burden grows quickly once you take into account that every function, such as drive, park, accelerate, addFuel, etc…, would need this sort of switch statement. You would wind up changing a lot of working code.
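To make the cost concrete, here is a small sketch of two more functions written in the same style. The enum, the empty structs and the returned strings are placeholders invented for illustration; the point is only that adding BOAT meant opening and editing each function separately.

```cpp
#include <string>

// Hypothetical tags and types, standing in for the ones drive() assumes.
enum VehicleType { CAR, BUS, BOAT };
struct Car {};
struct Bus {};
struct Boat {};

// Same dispatch shape as drive(); returns a description instead of real work.
std::string park(int type, void* obj)
{
    (void)obj;  // a real version would cast obj to the matching struct
    switch (type)
    {
        case CAR:  return "car parked";
        case BUS:  return "bus parked";
        case BOAT: return "boat docked";     // added when Boat arrived
        default:   return "unknown vehicle";
    }
}

// ... and the identical switch again, for a different operation.
std::string addFuel(int type, void* obj)
{
    (void)obj;
    switch (type)
    {
        case CAR:  return "car refueled";
        case BUS:  return "bus refueled";
        case BOAT: return "boat refueled";   // added when Boat arrived
        default:   return "unknown vehicle";
    }
}
```

Forget the new case in just one of these functions and the code still compiles; the boat simply lands in the default branch at runtime.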
Object-Oriented Programming
Object-Oriented programming could be considered the next phase in the evolution of programming languages. It largely gained popularity due to C++ (formerly C with Classes). Object-Oriented development changes the emphasis. The emphasis is no longer on defining the functionality of the system and its data. Instead, you start out by identifying the objects in the system. So, imagine a game, such as the original Super Mario Bros. You could identify objects such as your main character (Mario, Luigi), the enemies in the world (Goombas, Koopa Troopas, Bowser, etc…), the blocks, the pipes, moving platforms, and even the world itself. The functionality is tied in when the objects communicate with each other. In technical terms, this communication is called messaging. The functions are owned by objects, and are called methods, instead of functions. This ownership is based on what an object is able to do. For example, Mario can jump, so Mario might have a method called jump(). Mario can also shoot fireballs, so he would have a method called shoot(). Since Mario and Luigi behave the same, they might simply be two separate object instances of the same class called Player. The enemies have some similarities, so they could be structured with a base class called Enemy and derived classes, which implement the different functionality. It’s a different way of thinking about things.
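As a rough sketch of that description, the classes might start out like this. Player, Enemy, jump() and shoot() come from the paragraph above; the move() method, the Goomba and KoopaTroopa class names and all of the returned strings are assumptions added only so the example has something to show.

```cpp
#include <string>

// One class, two instances: Mario and Luigi differ only in their data.
class Player
{
public:
    explicit Player(const std::string& name) : name_(name) {}
    std::string jump() const  { return name_ + " jumps"; }
    std::string shoot() const { return name_ + " shoots a fireball"; }
private:
    std::string name_;
};

// Shared interface for every enemy; move() is a hypothetical method.
class Enemy
{
public:
    virtual ~Enemy() {}
    virtual std::string move() const = 0;
};

class Goomba : public Enemy
{
public:
    virtual std::string move() const { return "Goomba walks forward"; }
};

class KoopaTroopa : public Enemy
{
public:
    virtual std::string move() const { return "Koopa Troopa walks, then hides in its shell"; }
};
```

Creating Player mario("Mario") and Player luigi("Luigi") gives two objects that respond to the same messages, while code that only holds an Enemy reference can call move() without knowing which enemy it has.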
Now what I’ve described so far might not make sense if you’re not proficient with Object-Oriented programming, so let me go back to the Vehicle example. Here’s what it would look like in OO-terms:
class Vehicle
{
public:
    virtual void drive() = 0;
};
class Car : public Vehicle
{
public:
    virtual void drive()
    {
        // ... Logic to accelerate car
    }
};
class Bus : public Vehicle
{
public:
    virtual void drive()
    {
        // ... Logic to accelerate bus
    }
};
class Boat : public Vehicle
{
public:
    virtual void drive()
    {
        // ... Logic to accelerate boat
    }
};
In its simplest form, OO is just a reorganization of code. It is obviously much more than this, and this very simple example doesn’t touch the depths of how far things can go, but I think it is a fine place to start. When you say that a Car is derived from a Vehicle, you are effectively saying that a Car is-a Vehicle. This is the basis for this type of inheritance. You should only derive if you can logically say that something is something else. For example, you shouldn’t have Bus derive from Boat, because a bus is-not-a boat.
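What the reorganization buys you shows up in the calling code. The sketch below repeats the Vehicle hierarchy in compact form, with two changes made purely for illustration: drive() returns a string so the result is visible, and Vehicle gets a virtual destructor. The driveAll() helper is a hypothetical name, not something from a library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Vehicle
{
public:
    virtual ~Vehicle() {}
    virtual std::string drive() = 0;
};

class Car : public Vehicle
{
public:
    virtual std::string drive() { return "car accelerates"; }
};

class Bus : public Vehicle
{
public:
    virtual std::string drive() { return "bus accelerates"; }
};

class Boat : public Vehicle
{
public:
    virtual std::string drive() { return "boat accelerates"; }
};

// No switch and no type tag: each object picks its own drive().
std::vector<std::string> driveAll(const std::vector<Vehicle*>& vehicles)
{
    std::vector<std::string> log;
    for (std::size_t i = 0; i < vehicles.size(); ++i)
    {
        log.push_back(vehicles[i]->drive());
    }
    return log;
}
```

Adding a new kind of vehicle now means writing one new class. driveAll() and every other function that works through Vehicle stay untouched, which is exactly the working code the switch-based version forced you to edit.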
So, if you look at the above example, I think you can see how this flows really nicely for things like GUIs. That’s why you can have a framework where every visual element might be derived from Control, Widget, or even Window (Side note: except in Actionscript, where for legacy reasons, everything is nonsensically derived from MovieClip). There’s logic behind this. A button is-a control. A push button is-a type of button. A list box is-a type of widget or control. And so on… It gets a little trickier when the base class is called Window (or CWnd in MFC), but if using this type of framework, you can try to accept the notion that each control could be considered a window, even though there were better name choices.
Object-oriented programming suffers from a set of problems of its own, even though it is an improvement over procedural programming. The first is that there’s more typing, at least upfront. As developers, we try to reduce typing, but OO can often be more verbose than necessary. Of course, wizards and newer programming languages aim to reduce this overhead compared to languages like C++ or Java, which can often be overly verbose. The OO theory, though, is that through reuse, you avoid much of this typing as you develop higher-level things, because you’re basically picking from a toolbox. OO also suffers from not having the same traceability that you have in procedural programming, because functional requirements no longer map directly into design, code or tests. When doing Object-Oriented Design, Object-Oriented Programming and Object-Oriented Testing, the traceability becomes less direct, and so newer processes try to take this into account a bit more.
Visual Studio Helpers: VS Plugins Edition
0 Comments | Posted by softwarepurist in C++, Visual C++, Visual Studio
I wanted to write about some of my favorite Visual Studio tools that I find useful in my everyday work.
VSFileFinder has a very simple concept: You have a lot of files, and it takes a while to navigate to the project they’re in. Not so with VSFileFinder. This helpful tool lets you type a few letters of a file name, or even use wildcards in your search, and helps you find the files you’re looking for. The premise is simple, but this tool definitely meshes well with my workflow.
CppDoc comment maker gives you an easy-to-fill-out template for generating a new comment that conforms to the JavaDoc standard. Again, there isn’t much to say here, other than that it streamlines the workflow, making this process easier. I will admit, I haven’t used this one as much lately, since I discovered Visual Assist X (below).
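For anyone who hasn’t seen the style, a JavaDoc-style comment on a C++ function looks roughly like the sketch below. This is a hand-written illustration, not the actual template the tool generates, and clampToRange is a made-up function.

```cpp
/**
 * Limits a value to an inclusive range.
 *
 * @param value the value to limit
 * @param low   the smallest allowed result
 * @param high  the largest allowed result; assumed to be >= low
 * @return value if it lies within [low, high], otherwise the nearer bound
 */
int clampToRange(int value, int low, int high)
{
    if (value < low)
    {
        return low;
    }
    if (value > high)
    {
        return high;
    }
    return value;
}
```

The @param and @return tags are the part that documentation generators pick up, which is why a fill-in template saves time.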
Visual Assist X is really a lifesaver, especially if you do a lot of C++ work. It has an incredible number of features, including syntax highlighting, improved Intellisense and refactoring. The syntax highlighting in Visual Assist X is more intuitive and easier to follow than the default in Visual Studio. The Intellisense performs better, and actually seems to do more than Visual Studio’s Intellisense typically does. I find it’s less prone to getting “confused”. Finally, the refactoring is a godsend. Refactoring tools in C++ are almost non-existent. So, while Visual Assist X doesn’t provide as many refactoring capabilities as you get in Java IDEs, it does a good job. Furthermore, you can edit and configure the snippets to further customize your refactoring. For this reason, I actually don’t use CppDoc Comment Maker so much, because I can use Visual Assist X to generate a CppDoc-style comment once I edit the template. At a later date, I plan to share some of the Visual Assist X snippets that I find useful.
