Tomek on Software

Tuesday, April 8, 2014

Mac, Windows, Ubuntu cross-platform development

“Grand[m|p]a” (have to be careful with pronouns this year), “what did you use to write cross-platform software back in 2014? No, really… do tell… Wow, amazing… It is, like, they really did not have […]?”

This post describes a cross-platform development setup that proved efficient for me over the course of several cross-platform projects (most notably Edge.js). Like anything in technology, this is a point-in-time snapshot: it has as much practical value for my contemporary engineers as it is going to have entertainment value for my daughter.

Cutting to the chase:

image

I am using a MacBook Pro 13” (my shoulders are getting too old to drag along the 15”) with a 512GB SSD (my ears are too old to listen to the HDD hum) and 16GB RAM (my nerves are too strung to wait for Windows to do its thing).

I am running Windows 8.1 and Ubuntu 12.04 in VMs using VMware Fusion. Including the MacOS host, this captures most of my x-platform targets.

I share the home folder on the MacBook Pro with both the Ubuntu and Windows VMs. This is the quickest way to share files across the host and guest OSes.

I use Git[Hub] for sharing public artifacts between my fellow developers and my own development machines.

I use OneDrive for sharing private artifacts between my development machines. I suppose you could use Dropbox, but I am psychologically biased towards OneDrive.

I use Sublime Text for a uniform code editing experience across platforms. It is sublime. And text. Plus, Commodore is so much better than Atari. Bottom line it works x-platform and has all these fancy colors, unlike vi. Could not resist.

I use Visual Studio for those infrequent tasks that require Windows-specific work. It also comes in really handy when profiling code. Turns out some of the code that is slow on Windows is also slow on *nix and MacOS. Yes, at the end of the day everything boils down to E=mc^2.

I use Windows Live Writer to write this post.

Friday, January 10, 2014

Workers on a shoestring in Windows Azure Web Sites

Hosting web apps in Azure, by the book

Many web apps consist of web, worker, and storage components. The web component handles HTTP traffic from clients which results in new work items (e.g. uploaded pictures that need to be resized). The worker component performs the actual work independently from interactions between the client and web component. Web and worker components exchange state using some form of external storage (e.g. a database or a queue).

image

In the general case, the three components are deployed to separate server farms to accommodate different scalability, reliability, computing resource, and process lifetime requirements.

For web applications like this hosted in Windows Azure, there is a natural mapping of the web, worker, and storage components onto Windows Azure concepts. The web component would be running in Windows Azure Web Sites, which is by far the most convenient way of hosting web tier code in Azure. The worker component would run as a Hosted Service or a Virtual Machine. The storage component is not something you want to run yourself these days, unless you have a very compelling reason not to use one of the many hosted storage solutions available in Azure (MongoHQ, MongoLab, Azure Blob, Azure Table, SQL Azure, etc.). The details of storage farm management are abstracted away from you, and your app perceives storage as an endpoint sticking out of a black box.

Given all that, a web application hosted in Azure would look like this:

image 

Problem in paradise

While using Azure to develop a web application like the one above, the experience gap between working on the web tier hosted in Windows Azure Web Sites and the worker tier running in Hosted Services becomes apparent and annoying very quickly.

Web Sites support code deployment in seconds using git. Hosted Services take minutes to update code and require it to be done from VS or Windows-only command line tools. Web Sites provide a very convenient streaming logging feature. Getting logs out of a Hosted Service is brittle.

As a developer, I would love to have a worker tier development experience match that offered by Windows Azure Web Sites. I want quick, git-based deployment for both web and worker code. I want to deploy from Mac or Windows without discrimination. I want my streaming logs available for both web and worker, or perhaps even unified.

Let’s break some rules

To achieve my ideal development experience, I am going to run both web and worker tier code in Windows Azure Web Sites:

image

In general this is a big no-no, but there is a class of web applications for which having a single deployment container for both web and worker code is not entirely unreasonable. Below are some guidelines to decide if this is a good fit for your app.

The resource consumption profile of both web and worker tier should be sufficiently similar. Web tier workloads are typically IO bound: they accept HTTP requests, do some minimal processing, turn around and exchange some data with the storage tier, then respond to the client. Worker tier profiles vary from CPU bound, memory bound, to IO bound. It is reasonably safe to combine a web tier with a worker tier that is also IO bound. For example, your worker tier may be implementing a long running IO orchestration, coordinating processes across several distributed systems. If the resource consumption profiles of web and worker tiers were different, chances are high one or more classes of resources would go underutilized when the system is scaled out to handle the traffic.

The worker tier must be implemented in a way that is compatible with the process management of your web tier. In Azure Web Sites, processes are running under IIS. They are only activated when HTTP requests arrive, and the recycling policy will terminate them in pre-configured circumstances, e.g. within 15 minutes of lack of HTTP activity. You must design your worker tier to be robust enough to withstand this recycling policy. You must also mitigate the lack of control over process activation (more on this in the next section).

The benefits of running both web and worker in Windows Azure Web Sites are numerous and particularly relevant during the active development phase:

  • Simplicity: there is only one artifact to deploy and manage.
  • Logging: streaming logging from both web and worker components is available in a unified form.
  • Deployment: git-deploy both web and worker code in seconds, and make the deployment atomic between web and worker tier.
  • Cross-platform: deploy from Mac or Windows.
  • Configuration: quickly update configuration settings of web and worker using the same mechanism (app settings in Windows Azure Web Sites propagated as environment variables to web and worker processes).

Workers on a shoestring, the practice

There are a number of considerations for hosting worker code in Windows Azure Web Sites that must be addressed.

Initializing your worker process and keeping it running

Processes running in Windows Azure Web Sites are managed by IIS. IIS itself can be configured to start up a process on system startup and keep it always running. However, the configuration of IIS in Windows Azure Web Sites is different and locked: processes are only activated when an HTTP request arrives that targets a particular application. As a corollary, without an HTTP request the process will never run.

Moreover, IIS in Windows Azure Web Sites is configured to terminate web processes for which no HTTP requests were received during a specific period (15 minutes by default, but the application has no control over this value). A new process will only be created when another HTTP request arrives.

To have a worker process initialized and running most of the time in this environment one must:

  • Create the worker process as soon as the web process is initialized by IIS. While technically you can run the worker logic from within the web process, it is a good idea to have a process boundary between web and worker. This reduces cold startup latency of the initiating HTTP request, and also helps keep web and worker logic encapsulated in case you need to split worker from web tier later. Note that if you spawn a worker process from within a web process, they are still going to run in the same Windows job object and therefore be bound by the same process lifetime policy that IIS imposes. If IIS decides to terminate the web process given its recycling policy, the worker process will be terminated with it, no questions asked.
  • Send an HTTP request to the web application periodically to ensure the web process (and the worker process spawned by it) are running. You can use an external system to send these periodic HTTP requests, but since we are implementing workers on a shoestring, let’s hack another Windows Azure feature to do the job for us for free: Health Monitoring endpoints. Every Windows Azure Website can be configured with a Health Monitoring endpoint that Azure will periodically invoke to measure and report on latency of calls originating from various places in the world:

    image

    As it happens, Azure invokes these endpoints every 5 minutes:

    image

    Given that you can define up to 2 monitoring endpoints per web application in Windows Azure Web Sites, and each of these endpoints can be called from up to 3 worldwide locations for monitoring purposes, the combined frequency of periodic HTTP calls to your web site should be sufficient to reduce the risk of your worker process being down at any point.
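The two steps above can be sketched in Node.js as follows. This is a minimal illustration, not the code Mobile Chapters runs: the names ensureWorker and createServer, the /ping path you would register as the monitoring endpoint, and the worker.js entry point mentioned below are all hypothetical.

```javascript
var http = require('http');
var spawn = require('child_process').spawn;

var worker = null; // handle of the worker process, if one is running

// Spawn the worker as a separate node process unless one is already running.
// `args` are the arguments passed to node, e.g. ['worker.js'] (hypothetical).
function ensureWorker(args) {
  if (!worker) {
    worker = spawn(process.execPath, args, { stdio: 'inherit' });
    // Forget the handle when the worker goes away so it can be respawned.
    worker.on('exit', function () { worker = null; });
    worker.on('error', function () { worker = null; });
  }
  return worker;
}

// Build the web server. The worker is started as soon as the web process
// initializes, and every request (including the periodic monitoring pings
// to the hypothetical /ping path) checks that it is still running.
function createServer(workerArgs) {
  ensureWorker(workerArgs);
  return http.createServer(function (req, res) {
    ensureWorker(workerArgs);
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(req.url === '/ping' ? 'pong' : 'hello');
  });
}
```

The web entry point would then call createServer(['worker.js']).listen(process.env.PORT || 3000). Keeping the worker in its own process preserves the process boundary recommended above, at the cost of one extra node process per web instance.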

Dealing with recycling

If you run your worker code in Windows Azure Web Sites, you have no control over when your process is recycled. This should be no huge issue from the reliability standpoint, since your worker logic should be implemented to properly handle unexpected failures anyway (recycling is no different than any other unexpected event that causes your process to terminate).

In practice, however, worker logic is often optimized for certain assumptions around typical process lifetime. For example, you may run for 30 minutes before committing in-memory results to durable storage if you assume failures are infrequent and you otherwise control the process lifetime.

Given that you know your worker process is likely to be terminated by IIS more frequently, you should design around this assumption. Make your “transactions” smaller and commit often. This way when a worker process is created anew after being recycled, it can pick up from where it left off without losing much work.
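One way to read "commit often" is sketched below, under stated assumptions: the work is a list of items handled synchronously one by one, and store is a hypothetical wrapper around your durable storage exposing load(callback) and save(position, callback). Neither name comes from a real library.

```javascript
// Process `items` starting at the last committed position and commit
// progress after every `batchSize` items, so a recycled worker loses at
// most one batch of work. `handleItem` is called synchronously per item.
function processInBatches(items, batchSize, store, handleItem, done) {
  store.load(function (error, position) {
    if (error) return done(error);
    var next = position || 0; // resume where the previous process stopped

    function step() {
      if (next >= items.length) return done(null, next);
      var end = Math.min(next + batchSize, items.length);
      for (var i = next; i < end; i++) {
        handleItem(items[i]);
      }
      // Commit the new position before moving on to the next batch.
      store.save(end, function (error) {
        if (error) return done(error);
        next = end;
        setImmediate(step);
      });
    }

    step();
  });
}
```

The batch size is the knob: smaller batches mean more storage round trips but less repeated work after a recycle.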

Dealing with unexpected worker termination

What should happen when your worker process unexpectedly terminates? Since it was spawned by the web tier process, that situation must be handled by the web tier code itself. You can either implement your own worker process lifetime policy within the web process code, or you can rely on the IIS policy for handling unexpected application process failures. Most of the time you probably don’t want to roll your own process lifetime management mechanism where one already exists. Instead, when a web process detects termination of the worker process it spawned, the web process should just terminate itself and let IIS handle this situation. When the next HTTP request arrives, the web/worker process combo will be created anew.
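A minimal sketch of that policy, assuming the worker was started with child_process.spawn. The function name exitWithWorker is made up for illustration, and the exit parameter exists only so the behavior can be exercised without killing the calling process.

```javascript
// Terminate the web process when the worker it spawned goes away, so that
// IIS recreates both on the next HTTP request. `worker` is the ChildProcess
// returned by child_process.spawn; `exit` defaults to process.exit.
function exitWithWorker(worker, exit) {
  exit = exit || process.exit;
  worker.on('exit', function (code, signal) {
    console.error('worker terminated (code ' + code + ', signal ' + signal +
      '), terminating web process');
    exit(1);
  });
}
```

In the web process you would call exitWithWorker(worker) right after spawning the worker.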

Keep web and worker code separate

Once you grow out of the shoestring solution described here, you will need to separate your web and worker components into separate containers. To make this easy, it is best to minimize any interaction or shared state between the web and worker processes even though they run on the same machine. Having the durable storage be the only way for web and worker to exchange data makes it so much easier to separate them when the time comes.

The only on-machine interaction between web and worker processes should be scoped to the web process spawning the worker process, and web process terminating itself upon unexpected worker process termination.

Limitations of scalability

The scalability mechanism of Windows Azure Web Sites really prevents reliable use of this shoestring mechanism on deployments involving more than 1 instance.

When your worker logic needs to be scaled out to handle the workload, you must be able to say “I need 5 instances of workers now” and have all of the 5 instances running concurrently. This is not how Windows Azure Web Site scalability works. When you say “I need 5 web instances now”, Azure really interprets it as “up to 5 instances”. The actual number of instances that will be running depends on the incoming HTTP traffic. So unless your worker scalability needs are always proportional to the number of incoming HTTP requests, you are likely to run into a situation where worker processes cannot keep up with outstanding work.

Workers on a shoestring, Mobile Chapters case study

I have successfully used the shoestring approach to run worker processes as part of the Mobile Chapters web application.

At the core, the web application accepts a book manuscript upload, stores the file in a durable store, and lets a worker process asynchronously convert the manuscript into mobile applications for iOS, Android, and Windows Phone. The overall conversion process can take between seconds and minutes, depending on the complexity and size of the manuscript. The process is mostly IO bound, coordinating data flow and state transitions between PhoneGap Build, Azure Blob Storage, and MongoDB.

Another job the worker process performs is to periodically refresh data the mobile applications can later fetch by calling out to external services. This is scheduled to happen every 15 minutes or so, and according to logs from loggly it works like clockwork. So the mechanism described here also lends itself well to the implementation of lightweight web schedulers.

image

The important part is the shoestring approach provides me as a developer with a superior experience compared to what I would have to endure if I hosted worker code in a Hosted Service, without compromising the functionality of the web application.

Enjoy!

Monday, December 16, 2013

Secure by default with SSL in Windows Azure Web Sites

Windows Azure Web Sites (WAWS) allow your web applications to be exposed over HTTP and HTTPS. After you configure your Azure web site with an SSL certificate, by default your endpoints will be reachable over both HTTP and HTTPS. For some apps you may want to prevent the use of HTTP and require callers to always use HTTPS. How do you do this within a web app deployed to Windows Azure Web Sites?

A simple practice to promote your site’s security is to detect if a call is made over HTTP and redirect the caller to the corresponding HTTPS endpoint instead. Implementation of this mechanism in Windows Azure Web Sites requires understanding of how HTTPS is implemented in WAWS.

When a caller makes an HTTPS request to your site, the request first arrives at the WAWS router, which is based on the ARR technology. This is where the SSL connection from the client is terminated. After performing the SSL handshake on your site’s behalf, ARR forwards the client request to the actual server running your application code over an unsecured HTTP connection. (This is OK, since this traffic is internal to an Azure data center.) However, it raises the question of how your web application can detect if the original client call was made over HTTP or HTTPS. It turns out ARR attaches a special HTTP request header to every request that arrives over HTTPS. The name of the header is x-arr-ssl and its value contains information about the SSL server certificate that was used to secure the TCP connection between the client and ARR.

image

An application deployed to Windows Azure Web Sites can check for the presence of the x-arr-ssl header when deciding whether to redirect the client’s call to an HTTPS endpoint or continue processing it. This approach can be implemented with any web application technology. The example below shows a simple Connect middleware for Node.js applications deployed to WAWS that allows redirecting all traffic to HTTPS endpoints:

function ensureHttps(redirect) {
    return function (req, res, next) {
        if (req.headers['x-arr-ssl']) {
            next();
        }
        else if (redirect) {
            res.redirect('https://' + req.host + req.url);
        }
        else {
            res.send(404);
        }
    }
}

The middleware will detect HTTPS requests and continue processing them. If an HTTP request arrives, it can be either redirected to a corresponding HTTPS endpoint, or flat out rejected with an HTTP 404 response, depending on how the middleware is configured. As a rule of thumb, if the request contains sensitive information and it was received over plain HTTP, it should be rejected. Otherwise, it is OK to redirect it to a corresponding HTTPS endpoint. Here is how you can use this middleware in configuring endpoints of an Express application:

app.get('/',
    ensureHttps(true), // This is a home page, redirect HTTP to HTTPS
    routes.home);

app.get('/account',
    ensureHttps(false), // Authenticated endpoint, reject HTTP with a 404
    authenticate(),
    routes.account);

You can see the redirection from HTTP to HTTPS in action when you navigate to http://mobilechapters.com.

image

Enjoy!

Friday, September 6, 2013

Access Windows Azure Cache Service from Node.js and Express

The Windows Azure Cache Service is a great mechanism for scaling out web applications deployed to Windows Azure Web Sites. It allows you to externalize and very quickly access session state from any instance of your web application.

With the azurecache module you can now access Windows Azure Cache Service from Node.js applications. The azurecache module allows you to connect to the cache service directly, but it also provides an implementation of a session store that can be used in any session-enabled Express application.

The azurecache module uses Edge.js to call into the .NET Windows Azure Cache Service client that ships as a NuGet package. As such the module only works on Windows.

Using Windows Azure Cache Service to store Express session state

First create your Windows Azure Cache Service instance following instructions at Scott Guthrie's blog. You will end up with an endpoint URL of your cache service (e.g. tjanczuk.cache.windows.net) and an access key (a long Base64 encoded string).

Then install the azurecache and express modules:

npm install azurecache
npm install express

Next author your Express application that uses the azurecache module to store Express session state in the Windows Azure Cache Service:

var express = require('express')
    , AzureCacheStore = require('azurecache')(express);

var app = express();

app.use(express.cookieParser());
app.use(express.session({ store: new AzureCacheStore(), secret: 'abc!123' }));

app.get('/inc', function (req, res) {
    req.session.counter = (req.session.counter + 1) || 1;
    res.send(200, 'Increased sum: ' + req.session.counter);
});

app.get('/get', function (req, res) {
    res.send(200, 'Current sum: ' + req.session.counter);
});

app.listen(process.env.PORT || 3000);

Lastly set some environment variables and start your server:

set AZURE_CACHE_IDENTIFIER={your_azure_cache_endpoint_url}
set AZURE_CACHE_TOKEN={your_azure_cache_access_key}
node server.js

Every time you visit http://localhost:3000/inc in the browser you will receive an ever increasing counter value. When you visit http://localhost:3000/get you will receive the current counter value. The value of the counter is stored as part of the Express session state in the Windows Azure Cache Service with a default TTL of one day. You can now scale out the application to several instances since the session state is externalized to the Windows Azure Cache Service.

Deploying Node.js apps using Azure Cache Service to Azure Web Sites

If you are not familiar with deploying Node.js applications to Windows Azure Web Sites, this walkthrough will explain the process.

Deploying an Express application that uses the azurecache module to store session state requires that module dependencies are declared in the package.json file:

{
    "name": "azurecachetest",
    "version": "0.1.0",
    "dependencies": {
        "express": "3.3.8",
        "azurecache": "0.1.0"
    }
}

Once you deploy a Node.js application consisting of the package.json and server.js above to Windows Azure Web Sites, you still need to provide it with the credentials for the Windows Azure Cache Service. Just as you did with environment variables before, you can now set the application settings of your web site using the Windows Azure management portal:

image

You can also use the management portal to scale out your Express application to multiple instances, now that the session state is externalized to Windows Azure Cache Service:

image

After saving the changes, you can navigate to your site and see the azurecache module in action:

image

How fast is the cache?

What is the latency of accessing Windows Azure Cache Service from a Node.js application using the azurecache module? To find out, let’s deploy a simple latency test to Azure Web Sites. The HTTP server will execute 1000 sequential puts against the cache and return the average latency in milliseconds as an HTTP response:

var http = require('http')
    , cache = require('azurecache').create();

http.createServer(function (req, res) {
    var start = Date.now();
    var count = 1000;
    function one() {
        cache.put('puttest', { first: 'Tomasz', last: 'Janczuk' }, function (error) {
            if (error) throw error;
            if (--count === 0) {
                res.writeHead(200, { 'Content-Type': 'text/plain' });
                res.end('' + ((Date.now() - start) / 1000));
            }
            else {
                one();
            }
        });
    }
    one();
}).listen(process.env.PORT || 3000);

Save, deploy to Azure Web Sites, and send a request:

image

The average latency of inserting into the cache is just a notch over 1 millisecond. Converting the test to measure the latency of getting data from the cache is trivial. Here is the result:

image

Similarly to put, a get is around 1 millisecond.
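For readers who want to reproduce both measurements, the timing loop can be factored into a helper that accepts any Node-style asynchronous operation. This is a sketch rather than the code used for the numbers above; the name averageLatency is hypothetical, and the get call shown in the comment assumes the azurecache client exposes get(key, callback) mirroring put, which you should verify against the module's documentation.

```javascript
// Run `operation` `count` times sequentially and report the average
// latency in milliseconds. `operation` takes a Node-style callback.
function averageLatency(operation, count, callback) {
  var start = Date.now();
  var remaining = count;

  function one() {
    operation(function (error) {
      if (error) return callback(error);
      if (--remaining === 0) {
        return callback(null, (Date.now() - start) / count);
      }
      one();
    });
  }

  one();
}

// Hypothetical usage with the cache client from the test above
// (assumes cache.get(key, callback) exists):
//
// averageLatency(function (cb) { cache.get('puttest', cb); }, 1000,
//   function (error, ms) { console.log(error || ms); });
```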

Note that you can only achieve such low latency for Node.js applications deployed to Windows Azure, since locality of data is a major factor in caching. If you run the same performance test by hosting the Node.js server on your developer machine, your latencies will be much higher (in my case they were around 50ms) since every call to the Windows Azure Cache Service needs to go from your developer machine to a Windows Azure data center.

So, go forth and scale out!

Monday, July 29, 2013

Debug Node.js applications in Windows Azure Web Sites

This post explains how you can remotely debug your Node.js application deployed to Windows Azure Web Sites using the node-inspector debugger.

Node.js applications deployed to Azure use the iisnode module to run. One of the features of iisnode is an integrated debugging experience based on node-inspector. In order to enable the node-inspector debugger in Azure, a few steps need to be followed.

image

Configuration

To enable node-inspector debugging for Node.js apps deployed to Azure, you must currently ensure the settings in your iisnode.yml and web.config files are correct. We are working on streamlining this experience in future releases of Windows Azure Web Sites.

In iisnode.yml, you must enable debugging by setting the debuggingEnabled property to true:

debuggingEnabled: true

In web.config, you must configure URL rewriting rules which allow iisnode to distinguish HTTP requests that target the node-inspector debugger from requests that target your application. Assuming the entry point to your Node.js application is the server.js file, your web.config could look as follows:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <handlers>
      <add name="iisnode" path="server.js" verb="*" modules="iisnode"/>
    </handlers>
    <rewrite>
      <rules>
        <rule name="NodeInspector" patternSyntax="ECMAScript" stopProcessing="true">
          <match url="^server.js\/debug[\/]?" />
        </rule>
        <rule name="Application">
          <action type="Rewrite" url="server.js"/>
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>

Using the debugger

After your application has been re-deployed with the changes to web.config and iisnode.yml described above, you are ready to start debugging.

To open the debugger for your application, navigate to http://yourapp.azurewebsites.net/server.js/debug. This should bring up the familiar node-inspector interface for your application, which allows you to set breakpoints, inspect code, etc. In a separate browser window you can invoke an endpoint in your application, e.g. http://yourapp.azurewebsites.net/apis/myapi. If any of the breakpoints you set in node-inspector were hit, you should see execution paused in the browser instance running node-inspector.

When you are done with the debugging session or simply want to start from a clean slate, navigate to http://yourapp.azurewebsites.net/server.js/debug?kill. This will terminate the debugger and debuggee processes in Azure.

When you have finished debugging, remember to disable the feature by setting debuggingEnabled to false in iisnode.yml. The URL rewrite rules in web.config can remain in place.

You can read more about the node-inspector integration with iisnode here.

Advanced configuration

The iisnode debugger integration requires that part of the URL space of your Windows Azure Web Site is reserved for use by node-inspector. However, you have control over the URL path segment value used for that purpose. By default the value of the segment is debug, and so you navigate to the node-inspector debugger by visiting http://yourapp.azurewebsites.net/server.js/debug. You can modify this value using the debuggerPathSegment setting in iisnode.yml, e.g.:

debuggerPathSegment: 6534adw287dgx552

When changing the debugger path segment, a corresponding change must be made in the URL rewrite rules in web.config. Once the path segment has been changed, you can navigate to the node-inspector debugger using the http://yourapp.azurewebsites.net/server.js/6534adw287dgx552 URL.

Note that customizing the URL path segment reserved for debugging is important for the security of your site. As long as debugging is enabled in iisnode.yml (it is disabled by default), anyone who knows the value of the debugger path segment can interfere with your application. It is therefore wise to set the value to a cryptographically secure string before enabling debugging for your site.
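One simple way to produce such a value is Node's built-in crypto module. The sketch below prints a line you could paste into iisnode.yml; the function name generateDebuggerPathSegment is made up for illustration, and the 16-byte length is an arbitrary choice.

```javascript
var crypto = require('crypto');

// Generate a hard-to-guess path segment for the debugger URL.
// 16 random bytes are rendered as 32 hexadecimal characters, which are
// safe to use in a URL path without escaping.
function generateDebuggerPathSegment() {
  return crypto.randomBytes(16).toString('hex');
}

console.log('debuggerPathSegment: ' + generateDebuggerPathSegment());
```

Remember to use the same value in the NodeInspector rewrite rule in web.config.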

Enjoy!

My name is Tomasz Janczuk. I am currently working on my own venture - Mobile Chapters (http://mobilechapters.com). Formerly at Microsoft (12 years), focusing on node.js, JavaScript, Windows Azure, and .NET Framework.