Thursday, September 29, 2016

JavaScript for the whole application? The case against

The sixth edition of the JavaScript specification - ECMAScript 2015 - was approved by Ecma in mid-June 2015. As more JavaScript development takes place on the server it's not just about getting nice-to-have features or additional built-ins. Conveniences and sugar are nice but what about language features that compete with other server-side languages like C#, Java, Python, Ruby and PHP? Things like modules, classes and asynchronous programming are needed to bring a robust and consistent programming model when considering developing the whole application in JavaScript. Does the evolution of JavaScript starting with ECMAScript 2015 finally deliver the features needed to develop large-scale applications completely in JavaScript?

In this post I'll lay out what I believe is the case against using JavaScript for the whole application and why many organizations use other languages on the server. In part 2 of this series, I'll take the opposing view and present the case for why building the whole application in JavaScript can provide significant cost savings to development organizations.

The need for strongly typed languages

The computing world evolved strongly typed languages because there was an immense need to build highly reliable and maintainable software systems after many system failures. This need resulted in new approaches to managing the software development life cycle, the use of formal system analysis and design as well as the use of structured programming techniques using strongly typed languages. The goal was to promote modularity and reusability, clean up interfaces and reduce side effects to create highly durable software systems that are easily modified.

Using well-defined interfaces for system modules and components including the use of strong types allows a development team to reuse code and easily coordinate with one another. Types are not just for allocating memory on the heap and stack in compiled languages; types are used to structure code, to formalize interfaces and to make components reusable which ultimately reduces human error. When working on large-scale projects with many modules things like type safety, type signatures, well defined interfaces, IntelliSense and the ability to refactor code are paramount to team productivity and project quality.

Strong types allow a developer to ask questions like "is versionNumber a string or a number?" or "what fields can be set for this invoice object?". Without strong types a developer would have to either read documentation, ask another developer or refer to the implementation code. Types make constants, variables, properties, parameters and return values explicit and well defined. They can help prevent things like trying to pass a 64-bit integer to something that expects a 16-bit integer or accidentally passing in a customer object to the invoice parameter of a function. Using a strongly typed language on a large team significantly reduces human error and does so very early in the development cycle.
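As a minimal sketch of the versionNumber question, assuming a hypothetical nextVersion helper that is not part of any library, the JavaScript below shows how the same call quietly produces different results depending on whether the caller passes a number or a string:

```javascript
"use strict";

// Hypothetical helper: its author assumed versionNumber is a number.
function nextVersion(versionNumber) {
    return versionNumber + 1;
}

var fromNumber = nextVersion(2);   // numeric addition
var fromString = nextVersion("2"); // string concatenation, no error raised

console.log(fromNumber); //3
console.log(fromString); //"21"
```

Nothing in the function declaration tells the caller which of the two was intended.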

Type signatures define the inputs and outputs of functions, methods or subroutines. A well defined type signature can include parameter names, parameter types, return types and errors generated. Not all languages are so explicit in defining type signatures. In C/C++ parameter types and return types alone are sufficient to declare a type signature - called a prototype. Saving a few characters of typing may be convenient for the creator of the interface but having less information, like not specifying the parameter names, makes a type signature less explicit and more error prone as the following C/C++ example code demonstrates:

void assignManager(Employee*, Employee*);

//versus

void assignManager(Employee* manager, Employee* subordinate);

In the above example where parameter names are not specified in the first type signature, a developer using this function for the first time would need to stop and do further research to ascertain the intent of each parameter. This research could involve referring to documentation, referring to the implementation code or, worse, calling or emailing other developers. The few seconds the originating developer saved by typing a few fewer characters can result in a lot of work for the other developers utilizing the type signature. Compare the same type signature expressed in Java where the parameter names and even the exceptions that can be thrown are well specified:

public void assignManager(Employee manager, Employee subordinate) throws EmployeeNotManager
{ 
... 
}

A group of type signatures and other data types can be used to define a programming interface for the purposes of modularizing and decoupling system components and services. Sometimes developers cite that a benefit of using a weakly typed language is that it decouples components and makes for easy dependency free programming. However, this form of decoupling whereby changes to interfaces and types can take place and things seem to work is often coincidental and can result in side-effects that are not being tested. It is also an example where what is easy and convenient during development for one developer or a small team is bad practice for large teams and future maintenance. Decoupling via weak interfaces means that potential changes in the interfaces and types can go undiscovered until later in the development cycle. Using the definition of strong interfaces and types can permit decoupling while also ensuring strict adherence to the interfaces early in the development cycle. It also does so at a very negligible cost to the development cycle.
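A small sketch of how an interface change can go undiscovered in a weakly typed language, assuming a hypothetical createInvoice producer whose total property was renamed to amount and a hypothetical addTax consumer that was never updated:

```javascript
"use strict";

// Hypothetical producer: originally returned { total: ... }, later renamed to amount.
function createInvoice() {
    return { number: 20, amount: 150 };
}

// Hypothetical consumer, written against the old shape and never updated.
function addTax(invoice) {
    return invoice.total * 1.1; // invoice.total is now undefined
}

var result = addTax(createInvoice());
console.log(result); //NaN, and no error is thrown
```

The code still runs, so the break only surfaces if a test or a user happens to look at the computed value.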

IntelliSense makes coding easier and developers more productive by not having to repeatedly reference documentation or source code. It also makes developers more accurate by not having to rely on memory or outdated documentation. Large systems today use numerous libraries and components that are not easily committed to memory, not to mention newly created modules by other developers on the same team. IntelliSense minimizes mental context switching between code and documentation and is perhaps one of the fastest ways to learn a new library or framework. Today, developers working with strongly typed languages rely on tutorials and how-to articles to understand concepts but rarely use reference documentation to determine types and call signatures.

Refactoring allows developers to make accurate structural changes to code where scope and intent are taken into account. Rather than using error prone string search and replace, a developer can make large sweeping changes to code quickly with confidence and accuracy using refactoring techniques. Refactoring is an important part of structuring and restructuring code to improve quality and maintainability.

Type safety, well defined interfaces, type signatures, IntelliSense and the ability to refactor code are best implemented using strongly typed languages rather than weakly typed languages and the benefits are realized early when developers are editing or compiling code. Breaking an application early after making changes - especially if developers are working with hundreds of entities - should be a desired outcome. The compiler is a tool that can be used to discover errors early and it should be used to do so for as many areas of the application as possible: database code, web APIs, URL creation, forms, reports and so on.

Code and strong types should be used to enforce application convention not only at runtime but at compile time whenever possible. Then when convention changes there is a greater chance to fix code that breaks convention early, even before running a single unit test. Developers identifying and fixing errors during the edit or compile phase do so at the earliest, most efficient and least costly point in the software development lifecycle.

Developing without types is like developing without source control or automatic memory management: it may be possible but it is very difficult and a lot more error prone. A strongly typed language provides enormous benefits to software development, from catching errors early to productivity enhancements like IntelliSense and refactoring. A strongly typed language is a key part of developing highly maintainable and robust software systems that significantly reduce system life cycle costs.

Developing in JavaScript

In JavaScript today, there are many ways to create and reuse components and modules - especially on the browser client. From the definition to the use of 3rd party libraries to instantiation, the code can look very different. The well-defined constructs for component reusability that the language lacks can be simulated using various code patterns or libraries. There is no shortage of competing techniques and libraries that simulate modules, classes, inheritance, asynchronous programming and other advanced constructs. A development shop can attempt to force uniformity but once third party libraries are included it's easy to start seeing a mixture of coding styles and techniques.
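To illustrate how different two such techniques can look, here is a minimal sketch of the same counter component written twice. CounterA and createCounterB are hypothetical names chosen for this example; neither comes from a particular library.

```javascript
"use strict";

// Technique 1: constructor function plus prototype methods.
function CounterA() {
    this.count = 0;
}
CounterA.prototype.increment = function () {
    this.count += 1;
    return this.count;
};

// Technique 2: factory function with state hidden in a closure.
function createCounterB() {
    var count = 0;
    return {
        increment: function () {
            count += 1;
            return count;
        }
    };
}

var a = new CounterA();   // must be created with new
var b = createCounterB(); // must be called without new

console.log(a.increment()); //1
console.log(b.increment()); //1
console.log(a.count);       //1
console.log(b.count);       //undefined - state is private to the closure
```

Both are common, both work, and a consumer has to know which one a given library chose before writing the first line of calling code.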

It is not always apparent what the reusable components are in a given JavaScript library. Perhaps there are global functions that can be called. Perhaps some of those global functions are constructors for creating new objects. Maybe there is a global variable holding a function or object from where everything else is accessible, as with jQuery and Underscore. Perhaps the library can be loaded using require to get back a function, an object or a simple value. In JavaScript it is usually best to read the documentation to get started with a library rather than attempting to look for public reusable components or any other starting point.

Side effects are a major area of concern when developing in JavaScript. Side effects can consume an enormous amount of time both in development and in production as these types of subtle issues are usually the most difficult to diagnose and fix. Common side effects that can result from using JavaScript include: unintentionally setting the wrong properties and variables; passing in the wrong number of arguments or argument types; re-declaring variables; hoisting related side effects; replacing built-ins; automatic semicolon insertion; automatic coercion; improper use of object-wrapped primitives; and using normal equality instead of strict equality. The list of potential side effects is quite large and without equal when compared to other popular server-side languages. Many side effects can be reduced with strict mode but not all. Even with strict mode it is still possible to define a function in JavaScript that can have a totally meaningless signature to the caller as the following example demonstrates:

"use strict";

function test() {
    switch (arguments.length) {
        case 0:
            return 0;
        case 1:
            return "Hey there";
        case 2:
            return true;
        case 3:
            return {};
    }
}

console.log(test()); //0
console.log(test(1)); //"Hey there"
console.log(test(1, "2")); //true
console.log(test(1, "2", true)); //{}

In JavaScript, a function or method signature may or may not be an indication of what parameters are required and what arguments can be passed to those parameters. The test function in the sample code above can be written using no parameters, one parameter, two parameters or virtually any number of parameters. To be absolutely positive of the possibilities, a developer must either read documentation or read the implementation code. Of all the shortcomings in JavaScript, the lack of a clearly defined and enforced function or method signature is the biggest. Not only is this shortcoming highly error prone, it causes a developer to slow down and question many functions and methods before use, especially given that the side effect is exploited to implement function and method overloading.
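The same lack of enforcement applies when parameters are declared. The sketch below assumes a hypothetical area function with two declared parameters and shows that calls with missing, extra or wrongly typed arguments are all accepted without an error:

```javascript
"use strict";

// Hypothetical function that declares exactly two parameters.
function area(width, height) {
    return width * height;
}

var declared = area.length;   // number of declared parameters
var missing = area(2);        // height is undefined
var extra = area(2, 3, 4);    // the third argument is silently ignored
var coerced = area("2", "3"); // strings are coerced to numbers

console.log(declared); //2
console.log(missing);  //NaN
console.log(extra);    //6
console.log(coerced);  //6
```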

Perhaps one of the most confusing aspects of JavaScript is the use of this. Often a developer must read and re-read a thesis on what this means in different contexts and how to change its value to achieve the desired effect. In other languages, this is usually the object instance from which the method was invoked. In JavaScript things are not so simple and understanding the value of this can get complicated:

//"use strict";

var obj = {
  me: "obj",
  func: function (cb) {
    cb(); 
    cb.call(this);  //Contrived to see what happens
    console.log(this.me); 
  }
}

var obj2 = {
  me: "obj2",
  func1: function() {
    obj.func(function() {
      console.log(this.me);
    })
  },
  func2: function () {
    obj.func((function () {
      console.log(this.me);
    }).bind(this))
  }
}
 
obj2.func1(); //undefined , obj , obj
obj2.func2(); //obj2 , obj2 , obj

Today there are several ways to change the value of this using various implicit and explicit techniques. Many developers store away its value in pseudonyms like This or that or self for use in callbacks. Below is one trivial code example in which a method of one object is called but the value of this is changed to a different object causing the first object to access the internal implementation of the second object.

"use strict";

var objA = {
    name : "objA",
    internalMethod : function() {
        console.log(this.name);
    },
    externalMethod : function() {
        this.internalMethod();
    }
};

var objB = {
    name : "objB",
    internalMethod : function() {
        console.log(this.name);
    }
};

objA.externalMethod.apply(objB); //objB

In JavaScript the value of this depends on where or how a function is called rather than on the object instance that owns the method. A caller can therefore change it, which presumes that the caller has internal knowledge of the function being called. This versatility permits interesting reuse scenarios like Function Borrowing but it comes at the expense of clarity and information hiding.
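The pseudonym technique mentioned earlier, and the arrow functions introduced in ECMAScript 2015, can be sketched side by side. The report object and its method names are hypothetical and exist only for this illustration:

```javascript
"use strict";

var results = [];

var report = {
    name: "report",

    // Plain callback: in strict mode this is undefined inside it.
    plain: function (items) {
        items.forEach(function () {
            results.push(this === undefined ? "undefined" : this.name);
        });
    },

    // Pseudonym: capture this in a local variable the callback closes over.
    alias: function (items) {
        var self = this;
        items.forEach(function () {
            results.push(self.name);
        });
    },

    // ES2015 arrow function: this is taken from the enclosing method.
    arrow: function (items) {
        items.forEach(() => {
            results.push(this.name);
        });
    }
};

report.plain([1]);
report.alias([1]);
report.arrow([1]);

console.log(results); //["undefined", "report", "report"]
```

Three callbacks that look almost identical resolve this in three different ways, which is the kind of detail a developer has to keep in mind on every call.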

JavaScript common practices

Common coding practices in JavaScript can lead to bad coding practices. Take for example function overloading: pass in no arguments, get back a string; pass in a string, get back an object; pass in an option object, get back an object. This type of overloading in other languages is usually done with distinct method signatures via strong parameter types. With JavaScript a developer would have to read the documentation, rely on a parameter naming convention or refer back to the code to be positive of the possibilities.
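A rough sketch of that overloading style, using a hypothetical title function invented for this example, might look like the following. The single declared parameter says nothing about the four behaviors hidden behind it:

```javascript
"use strict";

var currentTitle = "Untitled";

// Hypothetical overloaded function: behavior depends on what is passed in.
function title(arg) {
    if (arguments.length === 0) {
        return currentTitle;                      // no arguments: returns a string
    }
    if (typeof arg === "string") {
        currentTitle = arg;
        return { title: currentTitle };           // string: returns an object
    }
    if (arg !== null && typeof arg === "object") {
        currentTitle = String(arg.text);
        return { title: currentTitle, options: arg }; // option object: returns an object
    }
    return undefined;                             // anything else is silently accepted
}

var r1 = typeof title();              //"string"
var r2 = typeof title("Invoice");     //"object"
var r3 = typeof title({ text: "X" }); //"object"
var r4 = typeof title(42);            //"undefined"

console.log(r1, r2, r3, r4);
```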

Even properties can be assigned different types of values. For example, in Node, module.exports can be assigned just about anything: simple values, functions, constructors, objects. This approach is very convenient for the module creator but creates a highly inconsistent interface for the module consumer. To start using a module a developer must start with the module documentation because one can never be certain of the intended use of calling require.

Another bad practice is that of passing in different argument types to a function to get entirely different functionality. For example in jQuery passing in a string queries the browser DOM; passing in an object creates a jQuery object wrapping the object; and passing in a function creates a document ready handler. Even passing in a string that is an HTML tag creates an HTML element wrapped in a jQuery object. There is a very good chance that the wrong functionality of the function is inadvertently invoked without any errors as the following simple example of counting elements based on a search string demonstrates:

"use strict";

$(function () { //creates document ready handler
    
    //Test HTML
    //<html><body><div>Test</div></body></html> 

    var searchString = "div";
    console.log($(searchString).length); //1

    searchString = "xyz";
    console.log($(searchString).length); //0

    searchString = "<b>";
    console.log($(searchString).length); //1 - creates DOM element
});

These types of functions and properties, where parameters and return values can be anything and the functionality invoked can be anything, have horrible coupling and cohesion and represent the antithesis of structured programming techniques. Such functions and properties are created for programming convenience yet a JavaScript developer must live by documentation or by referring to the original code because nothing is concrete. Convenience in declaration and convenience of usage should not outweigh the need to write properly structured and maintainable code. In JavaScript, it is entirely too easy to shoot yourself in the foot, and creating functions and properties that further promote this style of programming only exacerbates the potential for very subtle errors.

Another common coding practice that is an area of concern is poor code structure via over-the-top use of closures and anonymous functions even when it is not necessary to do so. This type of code often looks like deeply indented HTML but in endless run-on JavaScript functions and objects that boggle the mind. The start of a new anonymous function appears which happens to be the third argument to some function that is being called 100 lines above, with the first and second arguments passed as object literals containing even more anonymous functions and object literals. It's a heaping helping of parenthesis-braces-comma spaghetti code without a single goto. This terse form of coding, common practice in popular libraries and emulated by many, is very difficult to untangle or refactor to make changes. Terse code is very difficult to understand and debug, is usually non-reusable and very expensive to maintain.

//...
   })
  })
 }
}, {
     method: 'DELETE',
     path: '/invoices/{id}',
        handler: function (req, reply) {
            loadInvoices(null, function (invoices) {
                invoices.remove({_id: id(req.params.id)}, function (e, result) {
                    if (e) return reply(e) 
                    reply((result === 1) ? {msg: ' success'} : {msg: ' error'})
                })
            })
     }
}])

Arguments for JavaScript

JavaScript is often cited as a misunderstood language. If the language were separated from the browser and developers focused on the good parts and avoided or turned off the bad parts then what would emerge is an elegant and beautiful language. This may be true but such an argument can be made for many languages. Remove global functions, global variables and macros from C++ and perhaps C++ would be transformed into a cleaner, more elegant language.

Languages get messy over time as they evolve and mostly from a compatibility standpoint between old and new. As new features such as generics and asynchronous programming constructs are added to a language the necessity to stay backward compatible results in a mixture of old and new paradigms that never get cleaned up. One cannot separate a language from its environments, implementations, libraries and common practices then evaluate it from a pristine viewpoint unless one is doing so as an academic exercise. All these factors need to be taken into consideration when evaluating the language for real world use.

Some argue that many of the structural features in other languages can be simulated in JavaScript using various techniques. In fact JavaScript is so versatile that it can be much more powerful than other languages because objects and their prototypes can be easily manipulated to achieve things like Multiple Inheritance, Mixins and Polyfills. These powerful features are achieved in a way that is similar to what C developers tried to do in the past by implementing object oriented techniques using macros, pointers and casting. The versatility of pointers in C, like the versatility of objects in JavaScript, permits numerous reusability paradigms. The problem is that without well defined constructs these advanced features require the use of very esoteric, error prone coding patterns or the use of competing external libraries implementing those patterns in a simplified interface. Such is the case currently with asynchronous programming in JavaScript where there are many different techniques and numerous libraries for asynchronous programming.
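As one illustration of the prototype manipulation described above, a mixin can be sketched in a few lines using the ES2015 Object.assign built-in. The canLog and canSave objects and the Invoice constructor are hypothetical names used only for this example:

```javascript
"use strict";

// Two hypothetical behaviors to mix into one type.
var canLog = {
    log: function () { return "log:" + this.name; }
};
var canSave = {
    save: function () { return "save:" + this.name; }
};

function Invoice(name) {
    this.name = name;
}

// Copy the methods of both sources onto the prototype.
Object.assign(Invoice.prototype, canLog, canSave);

var inv = new Invoice("INV-1");
console.log(inv.log());  //"log:INV-1"
console.log(inv.save()); //"save:INV-1"
```

This is only one of several ways to get the effect. If two sources define a method with the same name, the later one silently overwrites the earlier one, which is the kind of subtlety a language-level construct would normally surface.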

Some have argued that the use of JavaScript lint tools and thorough unit testing can obviate the benefits provided by strongly typed languages. A developer does get added value from using a lint tool but lint is not a substitute for a strongly typed language. It is not possible to eliminate type information from code and deduce the same type safe intent of explicitly typed code. Even with Duck Typing techniques the code below could be considered type safe but it is not the same type safe intent as code written using explicit types.

"use strict";

var sendCustomerInvoice = function (customer, invoice, originalEstimate) {
    console.log("sending customer " + customer.number + " invoice " + invoice.number);
}

var customer = {
    number : 10
}

var invoice = {
    number: 20,
    date: new Date(),
    expires: new Date("2015-03-25")
    //... line items, etc.
}

var estimate = {
    number: 30,
    date: new Date(),
    expires: new Date("2015-03-25")
    //... line items, etc.
}

sendCustomerInvoice(customer, estimate, invoice); //sending customer 10 invoice 30

Unit testing is critical in software development but developers need to carefully balance how much test code they need to write. Test code is more code that needs to be maintained and the more test code developers write the longer it takes to change the system. The use of strongly typed function parameters and return types creates a well defined signature that reduces assumptions. In a strongly typed language, a parameter having an unsigned 16-bit integer type does not need to be checked for null or that it is in fact a number and is between 0 and 65535. There is also no reason to write unit tests to test all these different cases. Types reduce assumptions that must otherwise be eliminated using code - code that when written should be tested using more code in the form of unit tests.
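To make that cost concrete, here is a sketch of the guard code a JavaScript function needs in order to approximate an unsigned 16-bit integer parameter, assuming a hypothetical setQuantity function and a small rejects helper written for this example:

```javascript
"use strict";

// Hypothetical function whose parameter is meant to be an unsigned 16-bit integer.
function setQuantity(quantity) {
    if (typeof quantity !== "number" || !Number.isInteger(quantity)) {
        throw new TypeError("quantity must be an integer");
    }
    if (quantity < 0 || quantity > 65535) {
        throw new RangeError("quantity must be between 0 and 65535");
    }
    return quantity;
}

// Helper for this example: reports whether a value is rejected by the guards.
function rejects(value) {
    try {
        setQuantity(value);
        return false;
    } catch (e) {
        return true;
    }
}

console.log(setQuantity(100)); //100
console.log(rejects(null));    //true
console.log(rejects("100"));   //true
console.log(rejects(1.5));     //true
console.log(rejects(65536));   //true
```

Each guard is code to maintain, and each rejected case is a unit test that a typed parameter would make unnecessary.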

Some cite that a benefit of JavaScript is the elimination of "impedance" by using JSON between the client and server. There is a perception that JSON is JavaScript and therefore everything from client code to the server code and the communication format between the two is JavaScript. This simply is not true: JSON is text and not 100% JavaScript compatible code. Additionally, how objects are represented in memory to how they are JSON serialized to how they are stored in cache and even persisted in a database will likely be different nullifying the benefit of any generic JSON serialization mechanism. Developers not only have to think about how objects are JSON serialized in other languages, they must do so in JavaScript as well. The biggest benefit in using JavaScript for JSON serialization is type and object literal compatibility.

"use strict";

var a = {};
var b = {};

a.child = b;
b.parent = a;

var nodes = [a, b];

var s = JSON.stringify(nodes); //TypeError: Converting circular structure to JSON

console.log(s);
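Circular references are not the only mismatch. As a further sketch, using a hypothetical object made up for this example, a round trip through JSON.stringify and JSON.parse changes or drops values that are perfectly valid in JavaScript:

```javascript
"use strict";

var original = {
    created: new Date(0), // a Date object
    discount: undefined,  // an undefined property
    ratio: NaN            // not-a-number
};

var copy = JSON.parse(JSON.stringify(original));

console.log(typeof copy.created); //"string" - no longer a Date
console.log("discount" in copy);  //false - the property was dropped
console.log(copy.ratio);          //null - NaN has no JSON representation
```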

There are developers who believe many of the issues encountered in JavaScript can be avoided by developing shallow single purpose modules. This argument has a lot of credence but not in all situations. These modules are perfect for middleware services that are called upon to provide single purpose functionality such as compression, logging, authentication and other HTTP services. This approach however suffers for certain types of complex components like a browser grid control or server side charting engine. Often the attempt to use this middleware approach for these types of components results in passing in very deep typeless initialization or configuration objects with numerous properties and complex callbacks that are not any simpler than using a well-designed reusable class hierarchy. When one looks at the history of various languages, most have evolved from being shallow to becoming deep. They evolved from being flat like C to becoming object-oriented like C++ because there was a need to give ever more complex systems (e.g. OS GUIs) structure and organization that simplified programming while maximizing reusability and maintainability.

An application paradigm being promoted in the JavaScript community today is breaking up the application into numerous completely share-nothing independent services and using a message queuing system to communicate between the different services. This type of architecture is excellent for creating asynchronous, loosely coupled services that can be easily scaled and upgraded independently. There are drawbacks such as how to relate and query data together efficiently, referential integrity, increased communication and coordinating transactions across services, which usually require more elaborate services to solve. From an implementation standpoint, these services may be deployed and operate independently but they are still dependent on common messaging contracts. One of the best ways for a developer working on one service to know the message contract required by other services is through the use of strong type definitions.

Proponents often cite how easy it is to change code and hit the Go button without having to wait for typed code to compile, yet a typical JavaScript project today incorporates the use of many tools such as linting, minification, concatenation, and source maps. Throw in TypeScript and the process starts to look a lot like working with a statically typed compiled language. If the end result of minification is compact, optimized but completely non-readable code why not just generate byte-code? Concatenation seems like static module linking and source maps feel a lot like generating symbolic debug info. All these tools move JavaScript development closer to how a developer works with a strongly typed compiled language.

Typing JavaScript

What about using strongly typed JavaScript alternatives like TypeScript, CoffeeScript or Dart? The problem with these alternatives is that they further fragment JavaScript development and usually suffer from one or more deficiencies. The two most glaring deficiencies include the lack of third party support and the ability to debug the original code. There is also increased project risk from the lack of readily experienced developers and risk from adopting one of these alternatives only to find out later it has not garnered enough mindshare and is quickly becoming obsolete. CoffeeScript and Dart are two such examples but history has shown that the many alternatives to JavaScript have been short-lived.

Manipulating something like the browser DOM requires dynamic programming that is easily provided by JavaScript and attempting to do it with a strongly-typed language would be an exercise in frustration. However, this type of dynamic code should be limited to the local scope of a function or method. Dynamic and weakly typed code should not be extended to type signatures and object definitions because it makes reusability difficult and dramatically increases the likelihood of error.

It is quite possible that it is too late to add types to JavaScript based on how JavaScript code is written today, especially with the large body of existing reusable libraries. There is not a lot of value in specifying that most function parameters and return values are "any" or a set of potential types in terms of type safety. JavaScript is dynamic and weakly typed and attempting to type existing code will result in many parameters and return values marked as "any" as this jQuery TypeScript type definition file demonstrates. The attempt to externally type a JavaScript library results in numerous cases of Abstraction Leakage while at the same time creating elaborate interfaces, types and signatures that only partially solve the problem. Defining types in an external type library by a third party is a highly unique and proprietary endeavor that can quickly get out of sync with the target type-less JavaScript library. Types, if defined, should be part of the same project as the JavaScript library or framework and maintained by the project members modifying the code.

Conclusion

JavaScript started out as a weakly typed dynamic light-weight interpreted language. In the early days the language was used to manipulate the browser DOM of an HTML page and support functionality like client-side validation and drag-n-drop. Over the years JavaScript has been forced to support large-scale projects using unorthodox and out-of-band techniques. Today, projects such as Single Page Applications (SPA) and Node applications on the server-side can have large codebases yet something as basic as defining or using a module is not part of the JavaScript language but can be implemented using one of several popular techniques.

JavaScript evolved very slowly over the years mostly as a result of large software vendors not cooperating to move the standard forward or attempting to develop proprietary alternatives. Before ECMAScript 2015, the last update to the standard was ECMAScript 5.1 back in 2011. Even the good features and patterns popularized by JavaScript like closures, anonymous functions and promises have made their way into other languages. Many have tried to "fix" JavaScript without actually fixing JavaScript by cross compiling entirely different languages or by pre-compiling annotated JavaScript. These efforts were carried out by both open source projects and large commercial organizations that include Google and Microsoft. The reason for these efforts was to fill in glaring deficiencies and fix a host of issues in JavaScript that make it a weak language for use in large development organizations and large-scale software projects.

Unfortunately in software, like other technology fields, technical excellence and market adoption do not always coincide. JavaScript epitomizes this divergence. JavaScript is really the only option for client-side web development thanks to its ubiquity across browser implementations. JavaScript also has the development community mindshare for asynchronous server-side programming using Node. A development shop should consider JavaScript and Node for the server-side development of web user interfaces including WebSocket driven user interfaces. Doing so would at least consolidate all web UI development - both on the client and the server - to a single language and hopefully permit some modicum of code reuse. A significant benefit both in terms of cost and code reuse would be if mobile user interface development could also be included using a framework like Cordova. Everything else - business APIs, data models, storage code and business processing - should be developed using a strongly typed language and implemented as services that can be accessed from JavaScript where needed.