Monday, 31 March 2014

Difference between Software Architecture & Design

A lot of people confuse what architecture means with what design means for software. Even I got confused at times early in my career, as I worked only on parts of a big system (homogeneous, if big), not the whole system.

What do they mean?

Architecture defines the big picture. It describes structure: what is done by the system, where it is done, and why it is done that way (strategy). If you are given a problem statement and asked to solve it, you will do architecture to arrive at the best solution. Architecture is thus essential for sub-dividing the complexity of a big system into smaller parts (the "where" part). Architecture is externally visible. Architecture is a strategic (stable over the long term) activity. It can cover goals, scope, choice of frameworks, high-level methodologies, etc. Architecture is therefore free from finer implementation details.

Design, on the other hand, is the small picture and talks about how things are done in the system. If you are given a solution to a problem, then you will carry out design to move further. Design is a tactical (or short-term) activity dealing with code organization, interfaces between parts (layers, modules, classes, etc.), design patterns, programming idioms and refactoring steps. Design is not visible externally. If a design has no implementation details, then it is not design, because it does not help you write the software.

Both architecture and design can have patterns, such as the LAMP stack, MVC and N-tier web architecture for architecture, and Factory, Proxy, etc. for design.

If you are in the process of writing a framework, toolkit or SDK for a particular application and you do not know what level of abstraction (boundary) you will provide, then that decision-making process, however small the software is, has architectural underpinnings. If the API level is clear (like providing a parsing interface, a transaction-based or session-based interface, or just limiting to XXX functionality), then all you need is a design step. At my previous employer, we used to follow this principle to decide whether or not to have a separate architecture phase in the project lifecycle.

Getting an architecture wrong is a sure-fire step to product disaster, while if you get design wrong (usually it is limited to parts or sub-systems) you at least have a chance to learn and correct yourself at the expense of some schedule slippage. Architects thus need more business, domain and general software skills (width), while designers need to be good at writing software (depth). I believe you can be a good architect without necessarily being great at software design. A person aiming to be a good software architect needs to work on many different sub-systems, domains and technologies to get the necessary width of knowledge and interest for solving a problem.

It should be clear that the order in the development lifecycle is architecture followed by design, whether you are doing Waterfall, Agile, Rational or something else.

When are they different?

When developing a very large system or piece of software. A concept of the system has to be defined before it can be constructed, and that's why it requires a separate architecture definition step before a design can even be attempted. You need to know what you have to build before you build it.

When are they the same?

When developing a small piece of software where the purpose is clear from just the name of the software, component or project, or a project where the scope, language and functionality are already decided and provided. A separate architecture hardly conveys much by itself beyond the obvious. In such cases, a small architecture section could be the first section of the design documentation.

Hope this discussion will restore order. I owe much of the description of these ideas to the remarks, definitions and phrases coined by the folks at www.stackoverflow.com in a similar discussion. Acknowledging all of them here. One comment I found particularly intriguing:

"...long time ago in a faraway place philosophers worried about the distinction between the one and the many. Architecture is about relationship, which requires the many. Architecture has components. Design is about content, which requires the one. Design has properties, qualities, characteristics. We typically think that design is within architecture. Dualistic thinking gives the many as primordial. But architecture is also within design. It's all how we choose to view what is before us - the one or the many."

One can also refer to Part I (about 20 crisp pages) of the book "Software Architecture for Developers" by Simon Brown on this topic.

Sunday, 30 March 2014

Aim Small, Miss Small - How to engineer high quality and competitive software

One of the most common issues while developing a big system is the lack of focus on the "-ilities", or quality attributes, which are characterised as non-functional requirements (NFRs). Development starts with a 100% focus on functional requirements (FRs), with NFRs only an afterthought. Yet in this competitive environment, almost all competitors are able to implement the FRs, leaving differentiation only in the NFR implementation. In many cases, it is what differentiates a great product from a me-too one.

There are 60+ of these quality attributes as per Wikipedia, and not every one is important to every product. Therefore one needs to devise a quality model specific to the product (it may even be common across a business unit, with some tailoring for each product). This is the first step and serves as a table of contents for the NFR specification. Then each section is elaborated into requirements and requirements analysis. Usually each NFR has to be measurable (quality is), and that is one BIG difference from FRs; it also puts NFRs at a much higher risk, both technically and effort-wise. What is also different here is that the techniques and tools applied to FR capture and analysis (UML) may not apply (and usually don't) to NFR analysis. The overall list can get quite daunting if completed and can push people towards a BDUF (big design up front) approach, which is un-agile, or waterfall. I am also in favour of not doing the design and coding of functional and non-functional features together, as the focus on NFRs is then usually not enough to achieve desirable results as far as NFR implementation is concerned.

But there is also a problem with this unbundled design approach. Some requirements, like performance, are very tightly coupled to vast tranches of code, and ignoring them is suitable only for a prototype (research project, proof-of-concept or testing that we are building the right product), not product development. There is a difference between refactoring (small transformations) and rework (massive transformation). My guiding principle is the "Open-Closed" principle, which states that software (module, sub-system, or whole) should be open for extension (adding new functionality) but closed for modification (deleting or modifying existing functionality). We should be refactoring software to accommodate new things or improve the code, not reworking the algorithms, design and data structures completely. It's a tough ask to expect developers to do a big project all over again and be enthusiastic about the task.
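As a minimal sketch of the Open-Closed idea, assuming a hypothetical report exporter (the class and function names below are illustrative only and do not come from any real product): a new output format is added by writing a new class, while the existing function that produces the report is left untouched.

```python
from abc import ABC, abstractmethod


class Exporter(ABC):
    """Extension point: every new output format is a new subclass."""

    @abstractmethod
    def export(self, rows: list) -> str:
        ...


class CsvExporter(Exporter):
    def export(self, rows: list) -> str:
        if not rows:
            return ""
        header = ",".join(rows[0].keys())
        lines = [",".join(str(v) for v in row.values()) for row in rows]
        return "\n".join([header] + lines)


class KeyValueExporter(Exporter):
    # Added later as an extension; nothing above had to be modified.
    def export(self, rows: list) -> str:
        return "\n".join(
            ";".join(f"{k}={v}" for k, v in row.items()) for row in rows
        )


def write_report(rows: list, exporter: Exporter) -> str:
    # Closed for modification: this function does not change
    # when a new format is introduced.
    return exporter.export(rows)
```

Adding a third format here would be a refactoring-sized change (one new class), not a rework of `write_report` or of the existing exporters.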

So what do I suggest for the development steps? Something like this:

(1) The first build (comprising multiple iterations or one waterfall project, as you may decide) focuses only on functional requirements plus the key NFRs which have a tight dependency on functional requirement implementation. The architecture and design should be cohesive, loosely coupled and show all the other attributes of good design and architecture, so that this skeleton is very receptive to enhancements. Again the Open-Closed principle applies entirely here. Ideally I would prefer that the architecture team does the requirement analysis and produces the story cards/SRS requirements to implement, along with the software design.

(2) Next, while the implementation of the first build is going on, we can work in parallel on the NFRs one by one and produce NFR specifications and Design for the "-ilities" (DFx - the title of this blog). The architects can do this work in parallel with development of the first build. The focus is very clear for the architects and designers: the method is tailored to the quality attribute in question, and the identified NFRs are measurable as far as possible. NFRs and "-ilities" need system thinking, not modular thinking, to be effective.

(3) After the first development build is over and is being tested by users, we can implement the NFRs in a series of small builds (iterations). I have seen too many times logs being written by developers based on some API specification and finally being rendered useless in the market, because they are excessive and load the system if enabled (which masks the problem under study), or are not effective in isolating the problem in the field or in backend technical support.

Good quality is not a cliche. It's not an afterthought. On the ground it's a very hard engineering challenge to accomplish, especially when the opinion of the customer is what matters most for proclaiming victory. Focus is everything in producing the results that the customer wants to see.

Aim Small, Miss Small !!!

Software Design for Adaptability

Software design for adaptability is important in deployment scenarios where a lot of environment-related dependencies can change. It's a common requirement encountered during the software maintenance lifecycle. In fact, IEEE calls it adaptive maintenance, one of the four types of software maintenance.

As a general rule, software can depend on one or more of the following:

(1) A Target hardware (CPU) architecture or family (unless you are writing Java code)

(2) OS Services

(3) Middleware services (like Database, Logging, Tracing, Alarms, etc)

(4) 3rd Party ISV supplied libraries or software modules

(5) Open Source software

(6) Other internal software from within your organization which belongs to a different business unit and which you simply do not have the competency to develop (you are only a user)

(7) Network and Peers

I will assume that you are programming in C/C++, which are particularly sensitive to environment changes, being closer to the hardware and OS, though (3)-(6) are language-independent issues.

These dependencies change over time, and well-written software is able to foresee this change (on the basis of probability or even domain experience) and safeguard the core functionality against it. The agile folks would argue against catering to a non-existent requirement, and I am fine with that if you are consciously taking on this debt and are prepared to pay it down the line when required, whatever the cost may be. Otherwise, read ahead. Some strategies which can mitigate the impact of change are:

(1) Do not use assembly language snippets in code. Keep your code endian-aware for data formats or, better, make it endian-independent. Avoid depending on CPU-specific optimizations in such a way that they become critical to the operation of the software. If possible the code should be 64-bit ready (32-bit is slowly dying out, and 64-bit is now invading the mobile space too as DRAM sizes increase). Avoid excessive optimization for a particular CPU architecture (it may not work equally well on others).

(2) Previously the sheer multitude of OSes was mind-boggling (UNIXes, Windows, Linux, RTOSes). Each had its own APIs, with minor or major differences between them. Now they have rationalized to Linux and Windows (UNIX and RTOSes are dying out). Try sticking to POSIX APIs if you are making OS calls directly. Avoid using kernel-specific features (like Linux kernel synchronization options), as they are non-POSIX and will make your code dependent on the Linux kernel. Consider wrapping all OS APIs with a wrapper layer (or Proxy), so that the business logic is kept independent of the OS. Have a dedicated test suite for the adaptation layer. When porting, just change and test the adaptation layer first, and this will get rid of the big issues. Take care of instances where the API signature is the same but parameter values and/or behavior are slightly different (ideally the OS adaptation layer should mask signature, parameter range and behavior to be truly an adaptation layer). Similarly, your build scripts (makefiles) should be written so that they can work for any OS, CPU and compiler toolchain.

(3) The database is one candidate which changes regularly (licensing, performance, customer preference, etc. are typical reasons), and therefore it makes sense to have an adaptation layer for database access. Similarly, in another deployment scenario the middleware components of SMP like log, trace, stats, etc. might be different, and therefore the adaptation layer will help shield the business logic components from this change.

(4) 3rd party ISV software is another risk item. The company may shut down due to a changing business environment (or get acquired), making support impossible, or it may change licensing terms/prices, which in turn will create a demand to move to another ISV's component or to OSS. It may not support the new OSes you want to use. It's equally likely that the component plays a critical part in your product's competitive advantage and there is a new ISV component which is much better than what you are using, again triggering a MUST-MIGRATE situation to retain the competitive advantage.

(5) Open source software can also be substituted by better OSS which you would like to use, or the open source community for that piece of software may shut down, refuse to develop the component in the direction you want, or simply be bought over and go commercial. It could also happen that as you move to a different OS/CPU environment, this OSS component is not available on that platform, triggering a substitution in the worst case.

(6) This business unit may shut down or downsize such that it is not able to meet your requirements for enhancement and support in future. It will leave you the option of either taking over and doing those changes yourself (if you can and have the bandwidth to do it) or, in the worst case, going for a replacement.

(7) Network and Peers - There could be cases where the same software-intensive system interacts with multiple peers over varying types of networks, which can upset the application's timers. A tuning facility, if provided (even masked under "slow peer" or "slow network") on a per-peer or per-network-interface basis, can help the product adapt well to the environment. If we extend this to any other type of service configuration (ports, behavior, etc.), these can also be looked upon as items that aid adaptability (hard-coding in general restricts adaptability). The idea is to have a power-user or advanced configuration hidden from the general user, but available to support staff or tweakers ...
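The adaptation layer idea from strategies (2) and (3) can be sketched roughly as follows. This is only an illustration under stated assumptions: it is written in Python rather than C/C++ for brevity, and the interface `SubscriberStore`, its two adapters and the `check_adapter` contract test are all hypothetical names. The business logic sees only the abstract interface, and the same small test suite is run against every adapter when porting.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class SubscriberStore(ABC):
    """Adaptation layer: the only storage API the business logic may see."""

    @abstractmethod
    def put(self, key: str, value: str) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> Optional[str]:
        ...


class InMemoryStore(SubscriberStore):
    """Adapter for a deployment without any database."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class SqliteStore(SubscriberStore):
    """Adapter for SQLite; another database would get its own adapter."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)"
        )

    def put(self, key: str, value: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )
        self._conn.commit()

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def register_subscriber(store: SubscriberStore, subscriber_id: str,
                        profile: str) -> Optional[str]:
    # Business logic: it does not know which backend is plugged in.
    store.put(subscriber_id, profile)
    return store.get(subscriber_id)


def check_adapter(store: SubscriberStore) -> bool:
    """Contract test to run against every adapter when porting."""
    store.put("k", "v1")
    store.put("k", "v2")  # overwriting must behave the same everywhere
    return store.get("k") == "v2" and store.get("missing") is None
```

When the database changes, only a new adapter has to be written and passed through `check_adapter`; `register_subscriber` stays as it is.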

If we extend Murphy's pessimism to change, it will just read "Anything that can change (and need to be adapted) will probably change" or, in the worst form, "everything will change (and need to be adapted)". So take your own call on where and how much you see the risk and implement accordingly.

By no means is the above an exhaustive list of things that may need to be adapted or actions that can be taken. But the idea should be clear. If your design and coding take care of these types of change patterns, you are designing your software for the "Adaptability" quality attribute. What it does is allow the system to adapt and evolve independently in pieces by keeping a componentized architecture without necessarily resorting to a component programming framework. It's implementing loose coupling between components. It's an empirical observation that the more (in places and volume) the software changes, the higher the risk that defects will be injected. Minimizing this also has a positive impact on the reliability of future versions facing adaptive changes.
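As one small illustration of strategy (1), here is a sketch of an endian-independent data format, again in Python for brevity (in C/C++ the same idea is usually expressed with `htonl`/`ntohl` or explicit byte shifts). The message header layout is a made-up example; the point is only that the byte order is fixed by the format, not by the CPU the code happens to run on.

```python
import struct

# "!" selects network byte order (big-endian) with standard sizes,
# so the encoding is the same on every CPU architecture.
# Hypothetical header: 16-bit message type followed by 32-bit length.
HEADER_FORMAT = "!HI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def pack_header(msg_type: int, length: int) -> bytes:
    """Encode the header into a fixed, platform-independent layout."""
    return struct.pack(HEADER_FORMAT, msg_type, length)


def unpack_header(data: bytes) -> tuple:
    """Decode the header from the first HEADER_SIZE bytes of data."""
    return struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
```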

Saturday, 29 March 2014

The economics of smartphone power consumption

I read recently that there would be about 1.75 billion smartphones in the world (roughly a quarter of the world's total population) in 2014. Let's assume that each one of them has a 2000 mAh battery. Some observations:

Assuming no loss and charging at the 5 V USB voltage, it takes around 10 Wh of energy (2 Ah x 5 V) to charge a single smartphone.

And if we assume a 20% loss in the charger and cables, we come to around 12 Wh of energy per charge of a single smartphone.

If we focus on improving energy efficiency and double battery life, we save about 6 Wh per phone per day (one 12 Wh charge every two days instead of every day). Hardly a big deal or a worthwhile target to chase. But we charge each smartphone every day (that's the typical life of a battery charge), and therefore we need 12 x 1.75 billion Wh, or 21,000 MWh (21 GWh), of energy every day from the worldwide electricity grids put together. It is big, but a very small fraction of the overall electricity generation on the planet per day.

We could save more if each of these 1.75 billion smartphone users switched off an electric bulb for an hour. Energy-wise, it does not seem to be a target worth chasing at the moment if we consider the big picture.
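The back-of-the-envelope numbers can be reproduced with a few lines of code. This is only a sketch: the 5 V charging voltage and the 60 W bulb are my assumptions, and the 20% loss is applied as a simple 1.2x factor on the energy per charge.

```python
PHONES = 1.75e9          # smartphones worldwide in 2014 (figure quoted above)
BATTERY_AH = 2.0         # 2000 mAh battery
CHARGE_VOLTAGE_V = 5.0   # assumption: USB charging voltage
CHARGER_LOSS = 0.20      # 20% loss in charger and cables
BULB_W = 60.0            # assumption: one ordinary incandescent bulb


def energy_per_charge_wh() -> float:
    """Energy drawn from the socket for one full charge, in Wh."""
    return BATTERY_AH * CHARGE_VOLTAGE_V * (1 + CHARGER_LOSS)


def daily_energy_gwh(charges_per_day: float = 1.0) -> float:
    """Energy for all phones per day, in GWh."""
    return energy_per_charge_wh() * charges_per_day * PHONES / 1e9


def average_power_mw() -> float:
    """Daily energy spread evenly over 24 hours, in MW."""
    return daily_energy_gwh() * 1e3 / 24


def bulb_hour_gwh() -> float:
    """Energy saved if every user switches off one bulb for an hour, in GWh."""
    return BULB_W * 1.0 * PHONES / 1e9
```

Under these assumptions the functions give about 12 Wh per charge, about 21 GWh per day for all phones (an average load of roughly 875 MW), and about 105 GWh for the one-hour bulb switch-off, which is why the bulb comparison wins.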

Alternatively, imagine a new battery and charger technology (Panasonic is rumoured to be working on such stuff) which could charge a phone from 0-100% in 5-10 minutes. Would we care about terminal software's energy efficiency then? This is a hardware solution to a software problem, but it might be more effective. It will surely solve the problem of frequently having to run to charging points and waiting up to an hour to charge your device, which is the bigger symptom of the problem.

USB charging is ubiquitous today. Smartphones, tablets, MP3 players, etc. all use it. I see every device coming with its own USB charger (except iPod and Kindle). Many are interchangeable. Do we really need that? Output voltage is fixed at 5-5.2 V while current draw goes up to 2.4 A. I would prefer to see the next generation of electric sockets have a built-in USB charger port (over the earth plug) that can supply up to 5 A or 15 A based on the type of socket. At least it will save the production of millions of chargers in times to come. Manufacturers can make the charger an optional accessory which is very liberally priced.

Are the boundaries between software infrastructure for handheld devices and server side software changing again?

Around 10 years back, I was the lead for a team that took on a big (perhaps bloated) communication software component, with the purpose of re-engineering it such that we (a) retain existing functionality, (b) implement new concerns of distribution for the server side, (c) retain performance characteristics for the server side, and (d) target mobile devices with a reduced footprint. In other words, turn the space-time tradeoff on its head.

I cannot say we were entirely unsuccessful or successful. We could retain performance and functionality, and address the new non-functional concerns. We could cut the footprint to 1/10th, but by the time we finished (a year or so), the requirement on the mobile device footprint had come down to 1/4th of what we had achieved. There was enough grinding of teeth all around. But we came to the realization that server side software and software for handheld devices had to be different, as the requirements were so divergent. All we achieved was that we got rid of bad coding and gained insight into how to optimize code for size. And we started another separate project to create a new codebase of the component for mobile devices.

That was 10 years ago. Server side memory was in GBs and handheld device memory in MBs. In a volume-driven device business, nobody was willing to take the decision to increase RAM and ROM size to accommodate common software, as the selling price was fixed (by competition), and if input costs go up, they only eat into profits. Profits that are calculated by multiplying by millions of units sold. We had learnt our lesson.

This is 2014. Smartphone memory of 2 GB is common and 3-4 GB may be about to go mainstream. Though the number of apps has exploded beyond imagination, memory management has kept pace. I frequently see simple apps with sizes in excess of 10 MB. It forces me to think whether our work was far too ahead of its time or we just jumped to the wrong conclusion. Today my 1/10th footprint, or what we achieved (400K), would be easy to accommodate, and application footprint would not be a concern. So can we now say that server side software and handheld terminal software have converged?

If we look at size, the answer may be "probably yes" in most cases. But there is a new challenge: energy efficiency. Battery life is the single biggest problem facing terminal side software. Every bit of tuning is useful to prolong it. Though energy efficiency is important on the server side too, it's nowhere near the top, and the big focus there is to save the additional server hardware that is used to scale out the application. Of course, the programming language of choice is different on the server (C/C++) and the terminal (Java, Objective-C or Java-like).

What do others feel ?

How to select an open source component for your commercial product development

I have touched upon this topic loosely in my previous posts, but I will try to put together a more comprehensive list here. First of all, many techniques for evaluating a 3rd party ISV's software building block and Open Source Software (OSS) remain the same. They differ in licensing, in how you judge the licensing organization's long-term stability, etc. Anyway, here goes:

(1) Fitment for purpose - Before you consider anything else, the reused FOSS component must be fit for your use. You can't fit a square peg in a round hole if its corners fall outside the circular boundary. You should have a list of functional requirements and non-functional quality attributes (which are probably the same as those you would have for evaluating a 3rd party ISV). It is better to have a priority list for further selection.

(2) Go through the licensing terms and IPRs very carefully. If its policies are a problem for your commercial software distribution, then evaluate the workarounds and whether they are acceptable for your software management (like separate distribution in the case of GPL, opening up the code that uses it, etc.). You may have to compromise here.

(3) Check whether the community has the support (sponsorship) of big companies. In such a case it is likely that the open source will have a fairly long usable lifetime and will not be shut down entirely, go commercial or be bought over, cancelling the license. In any case, not giving step-motherly treatment to the adopted child will help you handle this scenario.

(4) Check the number of developers working on it. If the list is large, the open source is likely to stay active and be well supported in future. Lack of interest would imply risks to medium-to-long term stability.

(5) Check the bug tracking tool for the list of open bugs, dates of filing and closure of historical defects, defect trend, etc. Check how many new features are being added in every release. This will give you an idea of the stability of the software, the activity on the codebase and the support that you may get for bug fixes. It's like judging whether in-house software is reaching a stage that is fit for shipping. Frequency of releases was earlier used as an additional metric to test for stability, but that metric is now slightly archaic in the Agile world and should likely be done away with, as long as only a small number of enhancements is being done in each release.

(6) Check references of other adopters in your business segment (or even outside it). The more adopters there are, the more likely it is that the software is fit for use.

The items in (2)-(6) will refine the priority list you identified in (1), but if your team is prepared to understand, modify and support the adopted open source, then that in itself will be a considerable hedge against the risks. And you should be prepared to contribute back to the community in any case.
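One possible way to turn the six checks into a priority list is a simple weighted score, sketched below. Everything specific here is an assumption for illustration: the weights, the 1-5 rating scale, the fitness threshold and the candidate names are made up, and a real evaluation would set them per product.

```python
# Hypothetical weights for criteria (1)-(6); they must be tuned per product.
CRITERIA_WEIGHTS = {
    "fitness": 0.35,      # (1) fit for purpose
    "licensing": 0.25,    # (2) licensing terms and IPRs
    "sponsorship": 0.10,  # (3) backing by big companies
    "developers": 0.10,   # (4) number of active developers
    "bug_trend": 0.10,    # (5) defect and release history
    "adopters": 0.10,     # (6) references from other adopters
}
MIN_FITNESS = 3  # assumption: ratings are 1 (poor) to 5 (excellent)


def score(ratings: dict) -> float:
    """Weighted sum of the ratings of one candidate."""
    return sum(w * ratings[name] for name, w in CRITERIA_WEIGHTS.items())


def rank(candidates: dict) -> list:
    """Names of the fit candidates, best score first.

    Criterion (1) acts as a gate: anything unfit for purpose is dropped
    before the other criteria are even considered.
    """
    fit = {n: r for n, r in candidates.items() if r["fitness"] >= MIN_FITNESS}
    return sorted(fit, key=lambda n: score(fit[n]), reverse=True)
```

The score only orders the shortlist; a licensing problem that cannot be worked around should still be treated as a blocker regardless of the total.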

Also, since this is an external dependency, consider wrapping it with an abstract enough interface and let your proprietary source use the adaptation APIs. Sometimes decisions go wrong or are not in our control, or there could be a new smart kid on the block whom you would like to use. At that time this abstract interface implemented over the open source APIs will return your investment in gold.

Benefits of Reading Open Source Software Code

The FOSS ecosystem is strong today and covers most areas of software infrastructure. If one wants to build expertise in a new area, e.g. for carrying out technical research and prototyping, a good option is to read open source code rather than just the available documentation (which may be scarce, incomplete or even obsolete). In the agile world more useful software and less internal software documentation is being produced, and therefore the art of reading and quickly comprehending other people's code can greatly benefit an engineer in the coming times. I believe that this skill today is equally (if not more) important than learning to write great or big code.

For example, you can learn the SIP protocol from the SIP RFCs, but one step better would be to read the code of FOSS SIP implementations. It tells you how that software is written, what it supports and how good its quality is (in case you want to reuse it for your prototype/development project). Plus it can tell you what the key challenges in implementing SIP are. Another example would be caching technology.

However, there is a difference between data and information. If you read 5 open source caches and, at the end of 1 month, come out with a list of key design elements and what functionality each supports and doesn't, it is not the most productive outcome. One should be able to identify the key design techniques and question them using the 5-why principle, asking why things are done that way, to bring out the core problems and challenges. For example, the information that a cache uses its own custom overlay filesystem is of less use than the knowledge that disk I/O is the key performance and scalability challenge in a cache implementation. It will lead you to consider using SSDs and high-RPM disks, or a 64-bit implementation for big RAM disks, apart from doing a special filesystem for massive I/O on a regular disk. This is system-level knowledge, not just a software issue, and the problem could even be solved in hardware as a design alternative.

Of course, regularly reading code helps you develop as a software designer and developer by learning from the good and bad practices (even algorithms) in other people's code. I repeat that going ahead this will be a key skill for developers to acquire in a fast-paced product development environment.

Open source adoption in commercial closed-source software development

As I discussed in my previous post, there has been considerable progress in the quality of FOSS as well as an increase in the openness and maturity of ISVs in adopting and contributing to open source. There are no rules cast in stone, but one must be aware of the possible challenges that we can encounter in this endeavour. That is what this post is about.

When we start any large scale software application/product/system development, we have the following construction options for its building blocks:

1. Develop in-house

2. Buy COTS software from a 3rd party ISV

3. Adopt FOSS ecosystem building blocks

4. Get into a joint development with another vendor

5. Outsource to an off-shore/near-shore/on-shore software development service provider

These are 5 different kinds of software project, each requiring tailored project management and R&D management. But before we spawn these projects we need a rather rigorous decision-making process, which can drive any architect to insanity. I have personally been involved in 1-3 as an employee of a Tier-I telecommunication equipment manufacturer.

For now, our interest is point 3. You can use FOSS software in two ways:

(a) Use it as is, i.e. without modifications.

(b) Modify and tailor to your needs.

We will come back to this later. Let me throw some light on the issues involved (in no particular order of importance):

(1) Cost - There is no doubt that if we are able to reuse a FOSS component, our CAPEX costs (or development and stabilisation costs) and time-to-market are cut drastically, especially if the reused building block is reasonably large. However, the cost is NOT ZERO: we incur OPEX costs and delays during the lifecycle of its usage, which many project managers ignore.

First, we need to spend some effort of an experienced engineer to select the right FOSS component to use. If you don't do that properly, you are going to have support issues, unstable software, security problems and conflicts between the future development direction of the community (including closing the project) and your organization's interests.

Second, we need to get our in-house developers to thoroughly understand the FOSS code so that they can use it properly, debug and support it, and also modify it (bug fixes, enhancement, adaptation, improvement). This again requires an investment of effort. Even if you don't touch the source code [point (a) above], you need to understand it to debug it in case things don't work properly. You can't turn a blind eye (or a black one) to any adopted thing. The FOSS community is not going to do it for you and expects you to do it yourself. I have seen instances where teams start sweating if a core dump or fault points to something deep inside the open source. Such teams have only understood the APIs (and that too from another implementation), not the internals of the FOSS component. As it turned out, the fault was in the usage of the FOSS component, not its implementation. But it took quite a while and a lot of anxious faces to get things back in order (or so we think !!!).

Third, in case we need to modify the software, we need to have the skill and effort for that. The community is not going to do it for you; it expects you to do it and contribute back to the community (if it is agreeable to them).

(2) Licensing - FOSS components are also copyrighted and are governed by copyright laws, though the licensing terms are specified pretty clearly by the license that the open source uses. For example, GPL would require you to open source all components that use it, including your business logic. But LGPL will allow you the flexibility of dynamically linking to the FOSS component without requiring you to open your business logic components. The MIT license will allow you to use and modify free of almost any commitment (you only have to keep the copyright and license notice). So this is the most important criterion when selecting an open source component to adopt in your product. You do not want to mess with the law, and at the same time you want to protect your intellectual property and innovations (which anyway manifest in code, not design documents or patents).

(3) Value First, Cost Second - There is a big trend towards the use of FOSS software infrastructure. In future, non-business-logic components or basic platform infrastructure will be open source: OS, middleware, SDKs and frameworks. But one must also trade off principles against the value and competitive advantage that can be generated. Sometimes the contribution of infrastructure to overall product quality attributes (like performance, scalability and reliability) is very high. You should evaluate the open source's qualities very carefully, and if you can do much better than open source with a radically different engineering approach, then you should go for in-house development. Otherwise consider using open source, tweaking it for improvement and contributing back to the community. It will buy you a lot of goodwill among the community members, who may return the favour by supporting (or even joining) you in future.

For example, SIP performance is key in IMS equipment for per-port density. If I can develop an implementation that has radically higher performance than what FOSS can offer me (FOSS stacks are anyway not designed for this but for client side User Agents), then I will have my in-house SIP stack and CSCF/AS frameworks. For a VoiceXML platform, JavaScript performance is key on the server side for port density, and most open source engines are developed keeping the client in mind and can't meet high performance, scalability and reliability characteristics. Therefore there could be significant value in developing an in-house JavaScript engine tailored for server side implementations. Similarly, for applications like configuration, there is no point in writing custom XML parsers with the greatest parsing speed and lowest footprint, as the activity is neither in the critical path of device function nor part of its business logic. You could consider reusing any open source in this case as long as it is reasonably stable and meets the functional needs. Substitution will also be easy if something goes wrong. Play the value game first and then the cheap game.

(4) Modification of Open Source - Once you have decided that you will use open source and which project you will use, you are likely to encounter shortcomings for your use case. You will be tempted to make those changes and move ahead, but a word of caution is advised here. Some organizations are still not mature enough to allow you to commit your changes back to the community, because policy forbids it. If you go overboard with changes that you cannot merge into the main branch of the community repository, then you face the constant inconvenience of having to merge (or even redo) your changes every time you upgrade to a new open-source version for improved functionality, reliability or another quality attribute. The strict non-disclosure policy is archaic and hypocritical. It allows you to use others' work, but not to return the favour by contributing to it. It's parasitic. It has to go. Thankfully most organizations have matured to this reality. Next, even if you can commit your changes, you should make them in a manner that can be accepted and merged into the main branch, and you need proper collaboration with the community for this. Everybody should move in a common direction, otherwise you may be left with your own version to fully maintain and support. When we work with FOSS, we have to follow and live by the rules of that community, not the other way round.

(5) Perceived Quality & Security Risks of Open Source - Because it is open and written by people in their spare time or for very little pay, open source is often thought to be of poor quality. While this was true initially, things have progressed in the last five-odd years. Quality levels are high. The number of developers working on a project is beyond what a single organization could ever afford to put into such a project, whether developed in-house, in partnership or by outsourcing. And these developers work and contribute with passion and heart, not because they are being paid for it. The results are surprisingly good. If an open-source project has the backing of many companies and developers, it is likely to mature very fast, if it has not already (this is one key point to consider when selecting the open source to use).

Open-source licenses usually require you to disclose their usage when used commercially. So you are prone to security risks if some developer leaks your usage and hackers exploit a security flaw in the open source (one which is known to them and not published). This is a real risk. The community is usually very prompt in fixing such issues on an urgent basis. However, you could mitigate this further by treating your adopted child as your own and extending the same care to it as you would to your own blood child. That means actively reviewing and testing it to find security holes and defects, and committing the fixes back to the community. If everyone does this and brings his skills to the table, the software will mature faster than your in-house developed one. Everybody benefits from this. And you get tremendous goodwill in the community.

As you can see, open source is a double-edged sword. You could hurt yourself with it as easily as you can hurt the competition around you. Yet adoption of open software infrastructure is the trend and the way forward. The Linux Foundation, in a recent (March 2014) survey report:

http://storage.pardot.com/6342/105392/lfwp_collabdevtrends_v3.pdf

highlighted the move towards collaborative open source software development by big companies across industries to cut costs and push innovation. And it involves upcoming technologies. We should tag along, not fight this change.

Friday, 28 March 2014

The apocalypse for the COTS software developers and ISVs

I used to work in a business vertical which focused on developing proprietary middleware and application frameworks for the telecom industry. The challenges at that time were:

There were many target hardware platforms, from x86 or SPARC servers to ARM, MIPS and PowerPC on the embedded side.
The OS landscape was equally fragmented, from multiple flavours of Unix (& Linux) to RTOSes like VxWorks, pSOS, ThreadX, etc.
The open source revolution was just starting out, and open-source quality was relatively immature when we consider the demands of carrier-grade commercial software. FOSS was not up to the mark of COTS.
There was also no clear strategy in organisations to adopt and contribute to open source and to share software infrastructure (not business logic) such as components (libraries), platforms and frameworks.
Standards were more important for interoperability than reference implementations. A consequence of the pre-Agile world, in my opinion.

The open source revolution was largely driven on Linux/x86, and ports were available in some cases for other general-purpose OSes like Unix and Windows. These ports were just proofs of concept rather than optimised software. It was rare to find open-source ports for an RTOS. In such a situation, if I could write high-quality communication and networking software which was portable and adaptable to multiple hardware and OS platforms, then I had highly reusable IP which saved development costs and created competitive advantage (on the basis of quality) for the product using it. In fact it spawned a small ecosystem of ISVs (Trillium, Hughes, Radvision, etc.) who were practically gun-running the communication software wars, while OEMs tried their best to develop things in-house to save the licensing costs of ISV software while maintaining high quality.

However, in the previous 5-7 years the above challenges faced disruptive trends:

The target hardware tended to consolidate around ARM for clients and x86 for servers and network gear. Maybe I have over-simplified this, but it is true that the number of hardware platforms to develop for decreased drastically.
Linux became so ubiquitous that it now spans everything from clients/terminals (Android) to servers and pipe equipment. It displaced the UNIXes on the server side and is increasingly eliminating RTOSes on the embedded side.
Open source increased in popularity, the community of *skilled* developers supporting it grew and so did the usage. The result was that quality improved in all respects (functionality, performance, reliability, scalability, etc.). Optimal use of multicore and improving compilers and tools all contributed to making this happen. FOSS became a serious challenger to COTS. And it was "FREE".
Organisations became more aware of the benefits of open source adoption in their commercial products. Infrastructure took a backseat to business logic, and therefore there was more eagerness to adopt open source. Value was seen in contributing back, and any policy obstacles were done away with.
People developing standards for interoperability (especially corporates who wanted to make their work a standard for the business segment) started putting more focus on developing reference implementations, open-sourcing them and licensing them free of cost to other players, rather than writing bulky specification documents and then letting other players struggle to develop an implementation before bringing their product to market. This was agile, as it shortened time to market and drove rapid prototyping and adoption.

So now we have turned 180 degrees. From a position where people were reluctant to use FOSS, we are in an environment which embraces FOSS. The value of COTS diminished. A lot of closed-source ISVs disappeared or diversified into other businesses. A small industry was killed, but mankind benefitted.

For a platform developer this makes little difference. Where he was writing closed, proprietary software, he now uses the same skills to contribute to open source projects. The only new skill he needs to develop is to quickly read, browse, evaluate and modify somebody else's work, rather than do everything himself from scratch. Components, stacks, platforms and frameworks are still there; they are just open source, and the development teams are massive (the community). Those who diversified into writing applications now have to master the skills of selecting, using and supporting open source in their products, rather than their own or third-party ISV components. Not so much difference either.

Depending on how you look at it, the cheese has moved (or not moved) !!!

Friday, 14 March 2014

Goto Is not ALWAYS harmful

Dr. Dobb's Journal recently carried an article:

Is goto Still Considered Harmful?

It's about a security flaw in Apple's SSL library (the very thing that is supposed to secure everything else) that is loosely related to the usage of the "goto" statement in C. The actual bug has nothing to do with goto; it was possibly a copy-paste or editing problem, compounded by a lack of proper unit tests. The question in the title is a very common one, because many software company guidelines discourage the usage of goto, calling it simply "unstructured" programming in a "structured" programming language.

However I have found usage of goto to be beneficial in two scenarios:

(1) Critical (performance-sensitive and very frequently used/hotspot) parser code that implements a small state machine in one (somewhat large) function. This is done purely for performance and space efficiency reasons.

(2) Error handling, to avoid repetition of error handling code within a function, thereby enhancing maintainability and reducing footprint (like the code in Apple's SSL library). In large C language projects the footprint of error handling code is sometimes as much as 20%, and any optimization helps in improving readability and maintainability.
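To illustrate the second scenario, here is a minimal sketch of the single-exit cleanup idiom. The `record` type and the `record_create`/`record_free` functions are hypothetical names invented for this example; they are not taken from Apple's library or any other real code base.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
struct record {
    char *name;
    unsigned char *payload;
    size_t payload_len;
};

/* Releases a fully or partially built record. Safe to call with NULL. */
void record_free(struct record *r)
{
    if (r == NULL)
        return;
    free(r->payload);
    free(r->name);
    free(r);
}

/* Builds a record with a copy of `name` and a zeroed payload buffer.
 * Returns 0 on success and -1 on failure. Every failure path jumps to
 * the same label, so the cleanup code is written exactly once. */
int record_create(const char *name, size_t payload_len, struct record **out)
{
    struct record *r = NULL;
    size_t name_len;

    if (name == NULL || out == NULL || payload_len == 0)
        goto fail;

    r = calloc(1, sizeof *r);
    if (r == NULL)
        goto fail;

    name_len = strlen(name);
    r->name = malloc(name_len + 1);
    if (r->name == NULL)
        goto fail;
    memcpy(r->name, name, name_len + 1);

    r->payload = calloc(payload_len, 1);
    if (r->payload == NULL)
        goto fail;
    r->payload_len = payload_len;

    *out = r;
    return 0;

fail:
    record_free(r);
    return -1;
}
```

Without goto, each of the four checks would need its own copy of the cleanup calls, and every new resource added later would have to be released in each of them.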

Maybe there are more cases which make the usage of goto attractive.
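Returning to the first scenario, the sketch below shows what a goto-based state machine in a single function can look like. The function `is_decimal` and its grammar (`[-]digits[.digits]`) are assumptions made up for illustration. Each label is one state and each goto is one transition, so there is no state variable and no switch dispatch in the loop. Whether this is actually faster than a switch-based version depends on the compiler and must be measured.

```c
#include <stddef.h>

/* Returns 1 if `s` matches  [-]digits[.digits]  and 0 otherwise.
 * Hypothetical example: labels are states, gotos are transitions. */
int is_decimal(const char *s)
{
    if (s == NULL)
        return 0;

    /* Start state: optional sign, then at least one digit is required. */
    if (*s == '-')
        s++;
    if (*s < '0' || *s > '9')
        goto reject;

int_part:               /* inside the integer digits */
    s++;
    if (*s >= '0' && *s <= '9')
        goto int_part;
    if (*s == '.')
        goto frac_first;
    if (*s == '\0')
        goto accept;
    goto reject;

frac_first:             /* just consumed '.', a digit must follow */
    s++;
    if (*s >= '0' && *s <= '9')
        goto frac_part;
    goto reject;

frac_part:              /* inside the fractional digits */
    s++;
    if (*s >= '0' && *s <= '9')
        goto frac_part;
    if (*s == '\0')
        goto accept;
    goto reject;

accept:
    return 1;
reject:
    return 0;
}
```

For instance, `is_decimal("12.5")` and `is_decimal("-7")` return 1, while `is_decimal("1.")` and `is_decimal(".5")` return 0.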

The overall idea should not be to clamp down outright on the usage of goto (as many QA departments tend to do), but to be more prudent in its usage and to ensure that the test safety net is especially strong in areas where goto is used.

Using rarely used compiler options to generate real business value for software

Recently an article appeared in Dr. Dobb's Journal on the usage of some rarely used compilation options for MS VC++ Compiler:

The Most Underused Compiler Switches in Visual C++

The question is fairly generic and can be asked in the context of gcc or any other compiler. I have encountered it many times myself, and so have other developers, especially in the context of performance optimization, where it is quite tempting to enable these options and squeeze out whatever performance we can.

Generally these compiler options are of the following types:

(1) Syntactical checking Options, which usually do not have any bearing on the generated output.

(2) Optimization options which influence the generated output

(a) Generic target-independent optimizations like loop unrolling, strength reduction, etc.

(b) Optimization Options which target a particular CPU architecture/family (like X86, MIPS, ARM, etc)

(c) Optimization options which target a particular CPU micro-architecture family

(d) Optimization options that tune the generated code to a particular CPU

The first question to ask is "Should we enable these options?". The answer is not so simple. My opinion is that we should be circumspect. If the compiler switch is new or not well tested, it is very risky to use. Compilers also have bugs, including bugs in the generated machine code. Developers struggle enough to debug high-level language source code. Having to debug generated machine/assembly code is beyond the skill of most, and not something you bargained for when you wrote the program in a high-level language like C/C++. I have faced problems with gcc 2.7x versions where some of my C code worked fine with the -O2 option but *randomly* crashed in one specific place with the -O3 option, which I had sent out in the production version because it ran 20% faster and the customer wanted as much throughput as possible. Ultimately I got someone skilled in hex debugging to look at the generated code, and he was able to pinpoint the faulty generated code. It wasted 1/2 weeks of time before that gentleman rescued me. This was my first lesson learned. It's not that we shouldn't use the latest and greatest switch out there, but we should ensure that it is mature enough (widely used in the market, well tested, not many bugs, etc.). Beyond this I will let your instinct guide your decision. That is influenced by the domain you work in: whether safety/reliability is more important than squeezing out absolutely the last bit of performance.

Another, unrelated mitigation that may help is to ensure that the production build (not the debug build) is what goes through your unit, functional, performance and other tests. We should always test what we will ship, not a variant of it ...
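One way to make "test what we ship" checkable is to let the binary report how it was built, so a test log shows whether the optimized build was really the one under test. The sketch below is only an illustration under assumptions: `build_description` is a hypothetical helper, and it relies on the `__OPTIMIZE__` and `__OPTIMIZE_SIZE__` macros that GCC and Clang predefine when optimization is enabled (other compilers may not define them), plus the standard `NDEBUG` convention for assertions.

```c
#include <stdio.h>

/* Writes a short description of this translation unit's build settings
 * into buf and returns the snprintf result (length of the description).
 * Hypothetical helper; __OPTIMIZE__ and __OPTIMIZE_SIZE__ are GCC/Clang
 * predefined macros and may be absent on other compilers. */
int build_description(char *buf, size_t size)
{
#if defined(__OPTIMIZE_SIZE__)
    /* Checked first: a size-optimized build defines __OPTIMIZE__ as well. */
    const char *opt = "optimized for size";
#elif defined(__OPTIMIZE__)
    const char *opt = "optimized";
#else
    const char *opt = "unoptimized";
#endif

#if defined(NDEBUG)
    const char *asserts = "assertions off";
#else
    const char *asserts = "assertions on";
#endif

    return snprintf(buf, size, "%s, %s", opt, asserts);
}
```

A test harness could print this string at start-up and refuse to run the release test suite if it reports an unoptimized build.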

The next question one asks is which of the above 5 options one can use. The answer, IMO, lies in your software deployment & distribution model:

(a) You can use (1) as far as possible to improve your code quality. It depends on your company and business domain policy. It's just a static code analyzer which helps you and future developers of the code avoid pitfalls. It is simply safe programming, if you are into this. And if you do not require your code to be compiled by pre-ANSI compilers, there is no need to use legacy-environment-preserving flags like -traditional (in the case of older gcc versions).

(b) My opinion to the first question covers the use of (2)(a).

(c) If you are shipping a library or a pure general-purpose software application (for Windows or Linux, say) which will run on a particular architecture (like x86 or ARM), then the underlying CPU or micro-architecture where the software will execute is not in your control. You can try (2)(b), keeping in mind the answer to the first question. If you use the more specific options, you will end up with many library variants, which complicates your software management.

(d) If you are shipping the software for a particular CPU micro-architecture family (e.g., a very tightly related product line), then you could add (2)(c) too, because you know the user (product line) will only use CPUs of a particular micro-architecture. And you could still keep the answer to the first question in mind, if you do not want to lose sleep.

(e) And finally, if you are shipping an embedded system where the software and hardware (exact CPU) are tightly coupled and distributed together, then (2)(d) could also be used.

The overall idea is that as your flexibility of target CPU reduces, the flexibility of using more and more optimization options increases.

And what Knuth said ("Premature optimization is the root of all evil") applies even to compiler-driven optimization. There has to be a pressing reason to use the above optimizations. If the performance goals of the application/library can be met without these options, and no competitive advantage or business value for the application/equipment can be derived from using them, you can always leave them alone. The risks would by far outweigh the benefits in this case.

Thursday, 6 March 2014

How to design a software system for performance

Performance is a key requirement (implicit or explicit) for most server-side software. It might also manifest itself in some way for client-side software. The root of the problem lies in the requirement itself not being clear or completely specified, which leads to a demand raised after initial development, which in turn leads to a major, costly re-engineering project. In a way performance is an intrinsic part of functional design, and it's hard to separate out or realize as an independent concern.

I have faced these questions in most projects I have been associated with as a system engineer and a software architect in a leading telecommunication equipment firm:

(1) The performance requirement is not there

(2) The performance requirement is there but I am not sure whether I have specified it completely

(3) I am not sure if my design can ensure that the performance goals will be met

I am sure others have too. Ignorance shown towards these jeopardizes the usage of the developed software commercially. The only solution is we need to ask these questions and solve them to the best of our abilities when we begin the development.
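As a rough illustration of questions (2) and (3), a performance requirement can be written down as data so that gaps and violations are checked mechanically. This is only a sketch: the `perf_requirement` and `perf_measurement` types, their field names, and the choice of fields (workload, response time at a percentile, CPU budget) are assumptions for this example, not a standard or a complete specification.

```c
#include <stddef.h>

/* Hypothetical requirement record. A value of 0 means "not specified". */
struct perf_requirement {
    double workload_rps;        /* offered load the limits apply to */
    double max_latency_ms;      /* response-time limit at that load */
    double latency_percentile;  /* percentile the limit refers to, e.g. 95 */
    double max_cpu_percent;     /* resource budget at that load */
};

/* Hypothetical measurement taken from a test run. */
struct perf_measurement {
    double workload_rps;        /* load actually applied */
    double latency_ms;          /* latency at the required percentile */
    double cpu_percent;         /* CPU utilisation observed */
};

/* Question (2): counts fields that are missing or out of range. */
int perf_requirement_gaps(const struct perf_requirement *r)
{
    int gaps = 0;
    if (r->workload_rps <= 0.0)
        gaps++;
    if (r->max_latency_ms <= 0.0)
        gaps++;
    if (r->latency_percentile <= 0.0 || r->latency_percentile > 100.0)
        gaps++;
    if (r->max_cpu_percent <= 0.0 || r->max_cpu_percent > 100.0)
        gaps++;
    return gaps;
}

/* Question (3): returns 1 only if the run applied at least the required
 * load and stayed within both the latency and the CPU limits. */
int perf_requirement_met(const struct perf_requirement *r,
                         const struct perf_measurement *m)
{
    if (perf_requirement_gaps(r) != 0)
        return 0;               /* an incomplete requirement cannot be met */
    return m->workload_rps >= r->workload_rps
        && m->latency_ms <= r->max_latency_ms
        && m->cpu_percent <= r->max_cpu_percent;
}
```

A requirement such as "the server must be fast" would leave every field at zero here, which makes its incompleteness visible before any design work starts.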

The best description of what a performance requirement means, how it is written, whether it is completely specified, how to ensure that the design meets it, etc. is beautifully captured in the first 3 chapters of the book

"Improving .NET Performance and Scalability"

from Microsoft. It's plain practical wisdom and experience dished out. And the principles are language agnostic as well as independent of the .NET platform. Over to the wonderful people who wrote this book (follow the hyperlink).

Software Design for Excellence