Tomek on Software

Thursday, June 6, 2013

Access SQL Azure from a Node.js app deployed to Windows Azure Web Sites

In this post I will show how you can access SQL Azure data from an Express.js Node.js application deployed to Windows Azure Web Sites, and expose that data over HTTP in 31 lines of Node.js and T-SQL code.

This post demonstrates collaboration between several technologies: in-process interop between Node.js and ADO.NET using Edge.js and Edge-sql, SQL Azure, Windows Azure Web Sites, and ADO.NET.

Prerequisites

You need a Windows Azure subscription. You can get started for free here if you don’t have one.

You need to know how to deploy a simple Node.js hello, world application to Widows Azure Web Sites. If you have not done it before, here is a walkthrough. You don’t need a Windows development machine to follow through, it works from Mac, *nix, and Windows just as well.

Create SQL Azure database

If you don’t have a SQL Azure database you could experiment with, create one now through the Windows Azure Management portal. Once created, go to the management dashboard of the database and open the T-SQL console:

image

In the T-SQL console, create a simple Products table and populate it with three entries:

image

Your SQL Azure database should now have three rows in the Products table:

edgewaws-products

The last step is to capture the ADO.NET connection string to your SQL Azure database. Again this can be done from the Windows Azure Management dashboard for SQL:

image

Make sure to capture the ADO.NET connection string, and replace the {your_password_here} placeholder with the actual SQL Server password you specified when creating the SQL Server:

image

The Windows Azure Web Site

I am going to assume you have a simple Node.js application already running in the Windows Azure Web Sites, and that you know how to update it. If you don’t, follow the walkthrough.

Before you create an Express.js application that exposes SQL Azure data over HTTP, you will need to pass the connection string to that application using a Windows Azure Web Sites mechanism of app settings. App settings are key/value pairs configured on the management portal of a Windows Azure Web Site. In the context of a Node.js application, these settings are propagated as environment variables to the Node.js process running your application. As such, they provide a convenient mechanism for passing secrets, like database connections strings, to the application.

Go to the management dashboard of your Windows Azure Web Site Node.js application, navigate to the Configuration tab, and set the EDGE_SQL_CONNECTION_STRING app setting to the value of the ADO.NET connection string of the SQL Azure database you captured in the previous step:

edgewaws-connectionstring

After you enter the value, make sure to click on Save below to reset the Node.js application and enable it to access the new app setting via an environment variable.

The Node.js application

The Node.js application is going to use Express.js framework to expose two HTTP endpoints for accessing the data in the Products table of the SQL Azure database:

  • One endpoint, /products, will return a JSON array containing all products from the Products table,
  • Another endpoint, /products/<id>, will return JSON representing the product with ProductID equal to <id>.

The application is going to use the Edge.js module for Node.js to enable in-process introp with .NET. It is also going to use Edge-sql extension of Edge.js which allows scripting T-SQL inside of a Node.js application using .NET Framework’s ADO.NET client. This is the package.json of the application which declares the dependencies:

image

The actual Express.js application is using Edge.js and Edge-sql to communicate with the SQL Azure database and expose the results as JSON:

image

Lines 4-7 use Edge.js to create the getProducts JavaScript function. The function uses Edge-sql extension of Edge.js to provide in-process introp with ADO.NET functionality that executes the specified T-SQL query over a SQL database. The connection string to the SQL database is provided through the EDGE_SQL_CONNECTION_STRING environment variable set in the Node.js process as a result of configuring the Windows Azure Web Site app setting in the previous step.

Lines 9-13 create a similar getProductById JavaScript function. This function is created over a parameterized T-SQL query. The actual value of the @productId parameter is provided based on a HTTP url segment of the corresponding HTTP GET call in lines 24-25.

Lines 17-22 expose the /products endpoint which uses the getProducts function to obtain a list of all products in the Products table of the SQL Azure database identified with the EDGE_SQL_CONNECTION_STRING connection string. The list of products is returned to the caller of the HTTP API as a JSON array:

edgewaws-productsall

Lines 24-29 expose the /products/<id> endpoint which uses the getProductById function parameterized with the product ID value passed as a URL segment of the HTTP GET request to obtain information about that particular product. The information is returned to the caller of the HTTP API as JSON:

edgewaws-productsone

What’s next

The Edge.js module for Node.js allows in-process interop beween Node.js and .NET code. The Edge-sql extension of Edge.js enables executing T-SQL scripts embedded within a Node.js application using asynchronous ADO.NET running in-process with Node.js code. Therefore it enables access to MS SQL databases, including but not limited to SQL Azure databases. The Edge-sql extension currently supports the four basic CRUD operations: select, insert, update, and delete.

Enjoy. Collaboration welcome: https://github.com/tjanczuk.edge.

Saturday, May 25, 2013

How to save $5 million running a Node.js application

Prerequisites

Before you can save $5 million, you must be spending more than that. One way to spend over $5 million is to run a large Node.js web application. For example, an application that requires 10,000 CPU cores to handle its traffic. For the purpose of this napkin math, let’s assume this Node.js application is deployed on 1,250 extra large (8 core CPU) Windows VMs in Windows Azure at an annual cost of $6,912,000 (1250 x 12 x $460.80):

image

The web application

With the problem of spending more than $5 million out of the way, let’s talk about the application itself. There is a class of web applications for which horizontal data partitioning is the most sensible approach of scaling out. Many relay applications are in this class. Imagine a web chat application, which allows several browser clients to connect to a chat room on the server to exchange data in real time. There may be millions of chat rooms handled by the web application at a time, which requires a total of 10,000 CPU cores. However, individual chat rooms typically have only a handful of participants and does not consume a lot of resources. Given that we can horizontally partition the chat rooms such that all calls to a particular chat room are always handled by a specific process on a one of the servers. This allows us to keep chat room state in-process and therefore improve the response latency compared to an approach with externalized state.

The Node.js process is single threaded, so a single process can only fully utilize one CPU core. On a VM with 8 CPU cores, running 8 Node.js processes will ensure that the overall CPU is fully utilized.

To summarize, the deployment of the application requires 1,250 extra large (8 CPU core) servers to accommodate the traffic. Each server runs 8 Node.js processes. Each Node.js process handles many chat rooms:

image

To support horizontal partitioning the system must have a routing logic in place that knows how to route all requests targeting a specific chat room to the Node.js process that keeps the chat room’s state. To build such routing system, individual chat rooms must be addressable, for example using HTTP URL path segments as follows:

http://megachat.com/{server_id}/{process_id}/{chatroom_id}

or a combination of a DNS name and HTTP URL path segment:

http://{server_id}.megachat.com/{process_id}/{chatroom_id}

In this post I am going to assume that the problem of server level routing had already been solved: when the system receives an HTTP or WebSocket request from the client, a router consistently routes the request to the server identified with the {server_id}. Once a request is received by the appropriate server, additional routing mechanism is necessary to dispatch that request to the Node.js process identified by the {process_id} URL segment. And a good way of doing that is the 5 million dollar question this post is about.

The natural choice: Application Request Routing

When a request of the form http://megachat.com/{process_id}/{chatroom_id} is received by a server, there must be a mechanism in place to route it to a particular Node.js process based on the {process_id} segment of the URL. This is typically accomplished with an HTTP reverse proxy that routes requests to other processes that listen on distinct TCP ports, for example:

image

In the example above, the {process_id} segment of the URL is an ordinal number of a Node.js process running on the server, an integer between 1 and 8. Given that value, the reverse proxy chooses to forward the request to a TCP port number in the range 8081-8088.

There are many HTTP reverse proxy technologies that allow this sort of configuration, from Nginx, to Application Request Routing (ARR) in Internet Information Services (IIS), to several solutions built with Node.js (e.g. node-http-proxy). Since the application is running on Windows, the natural choice would be to use Application Request Routing in IIS. Let’s assume ARR is what the web application is using currently.

The alternative: using HTTP.SYS port sharing to implement HTTP reverse proxy logic

Part of the Windows operating system is a kernel level HTTP stack called HTTP.SYS. One of the many interesting features of HTTP.SYS is port sharing. With port sharing, several processes on the machine can register HTTP listeners on the same TCP port but distinct URL path segments. For example, one process can listen for all messages sent to http://{server}:80/foo (and subordinate URLs), while another process can listen for all messages sent to http://{server}:80/bar (and subordinate URLs). This mechanism would allow the HTTP reverse proxy logic to be implemented at the kernel level as long as we can access HTTP.SYS functionality from Node.js:

image

The httpsys module enables use of HTTP.SYS in Node.js applications running on Windows. With the httpsys module an application can register an HTTP listener directly with HTTP.SYS and use the kernel mode HTTP stack on Windows instead of the user mode HTTP stack that Node.js ships with. The httpsys module preserves the server side HTTP APIs of Node.js, so minimal changes are required in your application code when switching from the built-in HTTP stack to HTTP.SYS. Here is a Hello, world application that uses the httpsys module:

image

Compared to a standard Hello, world sample for Node.js, there are two differences. In line 1, the httpsys module is used instead of the built-in http module. In line 6, the HTTP server starts listening on a URL prefix string instead of TCP port number. The server in this example will only receive HTTP requests that arrive on port 80 and whose first segment of the URL is equal to /1/. You can read more about the port sharing feature of httpsys module here.

Performance

Implementing the HTTP reverse proxy logic required for horizontal partitioning using kernel level port sharing with HTTP.SYS offers much better performance than doing the same using Application Request Routing in IIS. In addition, the raw performance of HTTP.SYS is superior to the performance of the HTTP stack in Node.js.

To demonstrate this, I measured and compared the throughput of the ARR and HTTP.SYS solution. The measurement was done as follows:

  • The server machine was an 8-core Intel Xenon W3550 @ 3GHz
  • In the ARR variant, I created 8 Node.js processes listening on ports 8081-8088 using the built-in Node.js HTTP stack and configured ARR on port 8080 to reverse proxy incoming requests across these ports based on the value of the first segment of the URL path of the request.
  • In the HTTP.SYS variant, I created 8 Node.js processes using the httpsys module to listen on URL prefix strings of the form http://*:8080/{n} where n was an integer between 1 and 8.
  • The Node.js applications were returning a simple Hello, world response to all requests.
  • I used 2 client machines running WCAT to achieve 100% CPU utilization of the server during the measurement to ensure the results are normalized. Each measurement had a 30 second warm-up period followed by 30 second period of measurement.
  • WCAT was configured to issue requests evenly distributed across the eight Node.js processes.

Here are the results:

image

It looks like kernel level port sharing with HTTP.SYS offers 6.1x better performance than ARR with IIS when used as a mechanism to implement a reverse proxy logic in this measurement. Time to go back to the napkin math and convert this to dollars.

Show me the money: saving the $5 million

By replacing the ARR/IIS reverse proxy mechanism in our web application with one based on HTTP.SYS port sharing, we can increase the capacity of the application by the factor of 6.1. Conversely, we can maintain the current capacity of the web application and reduce the deployment size by the factor of 6.1. Instead of using 1,250 extra large VMs we started with, we can now use 205 machines and still serve the same traffic. This translates to a drastic reduction in annual infrastructure cost:

image

Switching from ARR/IIS to HTTP.SYS port sharing saved $5,778,432. It is somehow larger saving than the $5 million originally promised by this post. You could spend the extra $778,432 in savings on a Bavaria Cruiser 56 with still enough change left to take you comfortably on a cruise around the world. If you do, please send me a post card.

Wednesday, May 15, 2013

Multi-line strings in JavaScript and Node.js

When writing Node.js or JavaScript applications, you sometimes need to embed multi-line strings in your code. It may be a snippet of HTML, a fragment of textual template, a piece of XML (remember XML?), or code in another programming language.

JavaScript has no built-in way of representing multi-line strings. If you need to embed a longer non-JavaScript text in your application the natural options are limited to concatenating several one-line JavaScript strings or using external files. Unless, of course, you use a little known trick:

var html = (function () {/*
<!DOCTYPE html>
<html>
<body>
<h1>Hello, world!</h1>
</body>
</html>
*/}).toString().match(/[^]*\/\*([^]*)\*\/\}$/)[1];

What happens here? An anonymous function is created with a function body consisting only of a multi-line comment. The comment itself is the very text you want to embed in your application. The function is then serialized to a string using toString(). Interestingly, the call preserves the function signature along with the function body and the comments within. Last, a regular expression is run over the serialized form of the function to extract the comment hidden inside. The end result assigned to the html variable contains the HTML content within the comment.

That pattern can be applied to a variety of types of multi-line text. For example, I am using the pattern liberally in the Edge.js project to embed (and later run) fragments of C#, F#, Python, or PowerShell code in a Node.js application:

fs.ps.py.cs.js

Enjoy!

Tuesday, March 12, 2013

Run C# and node.js code in-process with no compilation

The edge.js project allows running .NET and node.js code in-process. With the edge@0.7.0 release,  you can now embed C# code in directly in the node.js application, without the hassle of projects, compilation, and DLLs. Edge.js will compile the C# code automatically in-memory before running it.

Note: the edge.js project was previously called “owin”. If you know it by that name, this explains why it was renamed.

image

Edge.js provides a prescriptive, asynchronous model for calling .NET code from node.js and node.js code from .NET. Edge.js module takes care of marshaling data between V8 and CLR and reconciling the threading models. And with edge@0.7.0 you no longer have to deal with C# compilation and CLR DLLs as edge.js compiles C# sources for you on the fly.

Everything you need to get started is covered at http://tjanczuk.github.com/edge. Poll requests and feedback are welcome.

For specific topics check out previous posts:

Run node.js and .NET code in-process
Access MS SQL from node.js using OWIN
Implement CPU-bound computations on CLR threads within node.js process
Hosting .NET code in node.js process using OWIN

Friday, March 8, 2013

Run node.js and .NET code in-process

The owin project allows running node.js and .NET code in-process. It provides an asynchronous mechanism for calling .NET code from node.js and node.js code from .NET. The owin module takes care of marshaling the data between V8 and CLR and reconciling the threading models.

With owin you can:

  1. Use 24,000+ npm modules for node.js and 11,000+ nuget packages and .NET Framework within a single application.
  2. Combine the benefits of single-threaded node.js and multi-threaded CLR to run applications composed of IO-bound workloads and CPU-bound computations in-process.
  3. Write node.js extensions in C# and .NET Framework instead of C/C++/Win32.
  4. Use excellent CLR debugging tools (e.g. Visual Studio) to debug .NET code in your application.

Owin is based on a prescriptive pattern of a fully asynchronous interop interface shown below. It combines the essential aspects of event-based, async node.js programming model with the modern, TPL based async model that .NET offers:

clr2v8-2

Check out the owin project on GitHub for in-depth description of the features it offers. Below is simple example that illustrates the gist of the idea.

Hello, world

You need Windows, node.js v0.8.x (tested with 0.8.19), and .NET Framework 4.5 on the machine.

Implemenent a .NET function to be called from node.js in Startup.cs as follows:

using System.Threading.Tasks;

namespace Owin.Sample
{
public class Startup
{
public Task<object> Invoke(object input)
{
return Task.FromResult<object>(".NET welcomes " + input.ToString());
}
}
}

Compile it to Owin.Sample.dll with:

csc /target:library /out:Owin.Sample.dll Startup.cs

Install owin:

npm install owin

Implement the node.js application that will call into the .NET code in server.js:

var owin = require('owin');

var helloWorld = owin.func('Owin.Sample.dll');

helloWorld('JavaScript', function (error, result) {
if (error) throw error;
console.log(result);
});

Run the node.js application and enjoy the response generated by .NET code displayed to the console from the node.js callback function:

C:\projects\barebones>node sample.js
.NET welcomes JavaScript

The sample shows calling a .NET function from node.js, passing a string parameter to it, and receiving a string result via a callback in node.js.

More

Visit the owin project on GitHub for in-depth documentation. Contributions and derived work welcome!

Check out the previous reincarnation of the owin@0.4.0 project to get a better idea of the scenarios that informed the project in the first place.

Enjoy!

About Me

My Photo
I am helping to develop and ship software for Microsoft. My current focus is node.js, JavaScript, and Azure. Before that I worked on .NET Framework and Silverlight, in particular web service technologies: Windows Communication Foundation (WCF), SOAP, WS-*, REST, AJAX.

Search This Blog

Loading...