Aptana Jaxer 0.9 Beta Release Documentation

Jaxer-Web Server Protocol

At least three entities are relevant to this protocol: the web server, Jaxer, and Jaxer Manager. Jaxer Manager and Jaxer act together as the “server” (Jaxer Server), and the web server acts as the “client”. The first part of this document describes the protocol between the “server” (Jaxer Server) and the “client” (the web server). The second part explains how Jaxer Manager and Jaxer handle the “server”-side communications.

Network protocol between Web Server and Jaxer Server
Communication between a web server and Jaxer Server takes place over TCP/IP connections. There may be multiple such connections, because a connection can handle only a single request at any given time. A connection may be reused for another request once the prior request has completed, and web servers are encouraged to maintain a pool of idle connections to avoid the overhead of setting up and tearing down connections.

Blocks are sent over a connection in both directions. A block has a header and body. The header has two fields, the block type (1 byte) and length (2 bytes). The body ranges from zero to 65535 bytes in length, the contents depending on the block type.
All fixed-sized, multi-byte numbers are in network byte order, most significant byte first.
Strings are not zero-byte delimited unless stated otherwise.
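
The framing rules above can be sketched in a few lines of Python. This is a minimal illustration, not Jaxer's own code: the function names are hypothetical, and it assumes the header fields appear on the wire in the order listed (type first, then length).

```python
import struct

# Block type codes as listed in this document's section headings.
BEGIN_REQUEST, HEADER, ENVIRONMENT, DOCUMENT = 1, 2, 3, 4
REQUIRE_POST_DATA, POST_DATA, END_REQUEST, ERROR = 5, 6, 7, 8

# Assumed wire order: type (1 byte), then body length (2 bytes, network order).
_BLOCK_HEADER = struct.Struct("!BH")
MAX_BODY = 0xFFFF  # 65535 bytes


def encode_block(block_type: int, body: bytes = b"") -> bytes:
    """Prefix a body with its 3-byte block header."""
    if len(body) > MAX_BODY:
        raise ValueError("block body exceeds 65535 bytes")
    return _BLOCK_HEADER.pack(block_type, len(body)) + body


def decode_block(buf: bytes):
    """Return (block_type, body, remaining_bytes), or None if buf does not
    yet hold one complete block."""
    if len(buf) < _BLOCK_HEADER.size:
        return None
    block_type, length = _BLOCK_HEADER.unpack_from(buf)
    end = _BLOCK_HEADER.size + length
    if len(buf) < end:
        return None
    return block_type, buf[_BLOCK_HEADER.size:end], buf[end:]
```

Returning None for an incomplete buffer lets a caller keep appending bytes read from the connection until a whole block is available.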

Block type BeginRequest (1)

When a web server needs Jaxer Server, it takes an idle connection and sends over it a BeginRequest block. The first two bytes are a 16-bit integer designating the protocol version used by the web server. The next byte specifies the type of request.

Currently there is only one request type: HTMLDocument.

  • HTMLDocument is defined as 0.

Jaxer Server will send back a BeginRequest block as a response. The first two bytes are a 16-bit integer designating the protocol version used by the Jaxer Server. The next byte specifies whether Jaxer Server can handle the request (1 means yes, 0 means no). In the case of 0 (declining the request), two optional fields might be returned: a 16-bit integer specifying an error code, optionally followed by a string explaining the reason. The string, if present in the response, is prefixed with its length as a two-byte number.
The error codes that Jaxer Server might return are:

  • 1 – Web server specified a protocol version that is lower than the supported version.
  • 2 – Web server specified a protocol version that is higher than the supported version.
  • 3 – Web server specified a protocol version that is different from the supported version.
  • 4 – Web server specified an invalid type of request.

The Web server should terminate the request when the Jaxer Server indicates that it cannot handle the request.
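
As a rough sketch of the handshake just described, the following Python builds the body of a BeginRequest block and parses the body of Jaxer Server's reply. The function names are hypothetical, and decoding the reason string as UTF-8 is an assumption; the document does not state a character encoding.

```python
import struct

HTML_DOCUMENT = 0  # the only request type currently defined


def build_begin_request_body(protocol_version: int,
                             request_type: int = HTML_DOCUMENT) -> bytes:
    """Body of the web server's BeginRequest: version (2 bytes), type (1 byte)."""
    return struct.pack("!HB", protocol_version, request_type)


def parse_begin_response_body(body: bytes) -> dict:
    """Parse the body of Jaxer Server's BeginRequest reply."""
    version, accepted = struct.unpack_from("!HB", body, 0)
    error_code = None
    reason = None
    offset = 3
    # The error code and the reason string are both optional on decline.
    if accepted == 0 and len(body) >= offset + 2:
        (error_code,) = struct.unpack_from("!H", body, offset)
        offset += 2
        if len(body) >= offset + 2:
            (length,) = struct.unpack_from("!H", body, offset)
            offset += 2
            # Assumption: the reason text is UTF-8.
            reason = body[offset:offset + length].decode("utf-8", errors="replace")
    return {
        "version": version,
        "accepted": accepted == 1,
        "error_code": error_code,
        "reason": reason,
    }
```

A web server using something like this would abandon the request whenever "accepted" is false.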

Block type Header (2)

The HTTP request headers are sent via Header blocks. Header blocks must be sent after BeginRequest and before Environment. Header blocks collectively form a stream in which block boundaries do not matter. The stream starts with a two-byte number indicating the number of headers that follow. Each header is a pair of strings, the header name followed by its value. Each string is prefixed by a two-byte number indicating its length. A header may appear multiple times. Jaxer Server does not respond to these blocks.

Block type Environment (3)

Following the Header blocks are Environment blocks, which contain any environment variables that the web server wishes Jaxer Server to have, per the usual CGI conventions. The encoding is the same as for headers, except that duplicates are not permitted. Jaxer Server does not respond to these blocks.

Block type Document (4)

Following the Environment blocks are Document blocks, which contain the HTML document to be processed by Jaxer Server. Nothing in a Document block indicates that it is the final block; this is implied by the subsequent EndRequest block.
Although Jaxer Server does not specifically respond to Document blocks, a web server that is sending them must be ready to receive a RequirePostData block from Jaxer Server and respond to it as soon as possible.
In the case where there is no document (e.g., when Jaxer Server is configured as the handler/content generator), an empty document must be sent to Jaxer Server.

Block type RequirePostData (5)

While processing script blocks in a document, Jaxer Server may discover that it needs the data that was sent via an HTTP POST request, and it informs the web server via RequirePostData. Jaxer Server cannot resume parsing HTML until it receives this data, but it must continue to receive Document blocks and set them aside so that it can eventually receive the PostData blocks.

Block type PostData (6)

In response to receiving a RequirePostData block, the web server suspends sending Document blocks and starts sending PostData blocks. It will always send at least one PostData block, even if it means sending a zero-length body. Once all post data have been sent, the web server resumes sending Document blocks. Jaxer Server knows it has received all post data when it receives a non-PostData block.

Block type EndRequest (7)

A web server sends the EndRequest block when it has nothing more to send as part of the current request. Jaxer Server will then complete work on the request and begin sending the response. For HTMLDocument requests, the response takes the form of a sequence of (response) Header blocks followed by Document blocks followed by EndRequest. The response headers will include at a minimum status, content-type and content-length. When the web server receives EndRequest from Jaxer Server, the connection is once again idle. The web server will then, either immediately or after some passage of time, close the connection or start a new request via a BeginRequest block.
It is possible that the web server will send EndRequest before Jaxer Server discovers it needs the post data, especially in the case of short documents. The web server must still look for RequirePostData and respond. It will resend EndRequest to mark the end of the post data.
Jaxer Server must send RequirePostData (if needed) before sending the response header.
The response header and document have the block types of Header and Document respectively.
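
The ordering rules for Document, PostData, and EndRequest blocks on the web server side can be modelled with a small Python function. It is only a sketch: the function name and the require_at parameter are hypothetical, no networking is involved, and a RequirePostData that arrives after the last Document block is treated as arriving after EndRequest.

```python
DOCUMENT, POST_DATA, END_REQUEST = 4, 6, 7


def plan_web_server_blocks(doc_chunks, post_chunks, require_at):
    """Return the ordered (block_type, body) list a web server would send.

    require_at is the number of Document blocks already sent when
    RequirePostData arrives. None means it never arrives. A value of
    len(doc_chunks) or more models arrival after EndRequest was sent.
    """
    def post_blocks():
        # At least one PostData block is always sent, even if it is empty.
        return [(POST_DATA, chunk) for chunk in post_chunks] or [(POST_DATA, b"")]

    sent = []
    answered = False
    for index, chunk in enumerate(doc_chunks):
        if require_at is not None and not answered and require_at <= index:
            sent.extend(post_blocks())      # suspend Document, send post data
            answered = True
        sent.append((DOCUMENT, chunk))      # resume (or continue) Document
    sent.append((END_REQUEST, b""))
    if require_at is not None and not answered:
        # RequirePostData arrived late: answer it, then resend EndRequest
        # to mark the end of the post data.
        sent.extend(post_blocks())
        sent.append((END_REQUEST, b""))
    return sent
```

With doc_chunks=[b"a", b"b"], post_chunks=[b"x=1"], and require_at=1, the plan is Document, PostData, Document, EndRequest. When there is no document, the caller would pass [b""] so that an empty Document block is still sent.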

Block type Error (8)

Jaxer Server informs a web server of an error condition by sending an Error block. The contents are simply a message string. The connection will then be closed.

Communications between Jaxer Manager and Jaxer

As far as the web server is concerned, it is talking to a server (the Jaxer Server) on the other side of the socket according to the aforementioned protocol. Jaxer Manager and Jaxer work together to fulfill the web server's request.
Jaxer Manager, as its name indicates, manages the individual Jaxers:

  • Starts them.
  • Terminates them.
  • Asks a Jaxer to perform a specific task.

It also listens for requests from the web server, and when a request arrives, it delegates the work to a Jaxer.
A Jaxer waits for a request from Jaxer Manager, completes it, and notifies Jaxer Manager when it has completed the task. All actual requests from the web server are handled by Jaxers.
Jaxer Manager and the individual Jaxers are separate processes (rather than a single process with multiple threads). Because Jaxer Manager needs to get the request (over the socket) and ask a Jaxer to complete it, the actual communications between Jaxer Manager and Jaxer differ between Windows and Unix platforms.

Unix Platforms (Linux & Mac for now)

Jaxer Manager and Jaxer communicate over a Unix domain socket; a socket pair is created when Jaxer Manager creates (forks) a new Jaxer process. When Jaxer Manager accepts a connection from the web server, it peeks at the incoming data to determine the request type (only Hello or BeginRequest), and then sends a short message to the Jaxer (child) process. The message carries the actual connection (so that Jaxer can talk to the web server directly) and the type of task that Jaxer should perform (do a Hello or perform a request). (Jaxer Manager can ask Jaxer to perform two other types of task that do not involve a web server connection: a) get a configuration setting from Jaxer Manager for Jaxer to use internally, and b) exit.)

When Jaxer gets the request, it extracts the socket connection from the message. It then talks directly over the connection to the web server to fulfill the request. When it has completed the request, it sends a single byte back to Jaxer Manager through the unix domain socket to signal it has completed its task.
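
The hand-off described above relies on passing an open descriptor across a Unix domain socket. The Python sketch below illustrates that general technique with socket.send_fds and socket.recv_fds (Python 3.9 or later, Unix only). It is not Jaxer's actual message format: the one-byte task codes and the completion byte value are assumptions, and for brevity both roles run in one process instead of a forked child.

```python
import socket

# Hypothetical one-byte task codes; the real values are not given here.
TASK_HELLO, TASK_REQUEST = b"H", b"R"
DONE = b"\x01"  # assumed value of the single completion byte


def manager_dispatch(channel, web_conn, task):
    """Manager side: send the task code plus the web server connection."""
    socket.send_fds(channel, [task], [web_conn.fileno()])


def jaxer_receive(channel):
    """Jaxer side: receive the task code and rebuild a socket from the fd."""
    data, fds, _flags, _addr = socket.recv_fds(channel, 1, 1)
    return data, socket.socket(fileno=fds[0])


def jaxer_done(channel):
    """Jaxer side: signal completion with a single byte."""
    channel.sendall(DONE)


def demo():
    # Channel between manager and Jaxer (created at fork time in reality).
    manager_end, jaxer_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # Stand-in for a connection the manager accepted from the web server.
    web_server_side, accepted = socket.socketpair()

    manager_dispatch(manager_end, accepted, TASK_REQUEST)
    task, conn = jaxer_receive(jaxer_end)
    conn.sendall(b"hello from jaxer")   # Jaxer talks to the web server directly
    jaxer_done(jaxer_end)

    reply = web_server_side.recv(64)
    done = manager_end.recv(1)
    for s in (conn, accepted, web_server_side, manager_end, jaxer_end):
        s.close()
    return task, reply, done
```

Calling demo() shows the reply reaching the "web server" end without passing through the manager, which only sees the completion byte.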

Windows

Since Windows does not provide a way to pass a socket connection between processes, Jaxer Manager has to read the entire request from the web server and forward the data to Jaxer, and likewise forward the data from Jaxer to the web server.

Jaxer Manager and Jaxer communicate via a named pipe. Jaxer Manager constructs a unique pipe name and passes it to the created Jaxer process through an environment variable.

Jaxer Manager uses asynchronous reads and writes to handle the communications between the web server and Jaxer. When a read from the web server completes, it checks the type of the request. If it is a Hello or BeginRequest, it tells Jaxer the type of request before forwarding the actual data from the web server. Otherwise (in the middle of a request), it simply forwards the data to Jaxer.

When Jaxer sends data back to Jaxer Manager, it always prefixes the data with the message length (on top of the protocol described earlier). A zero-length message indicates the end of the request, so Jaxer Manager knows when to return the Jaxer to its idle pool for new requests. For non-zero-length messages, Jaxer Manager simply forwards the message to the web server (after stripping the length prefix).
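
The length-prefix handling on the Jaxer Manager side could be sketched as follows in Python. The width and byte order of the prefix are not specified in this document; the 4-byte, big-endian prefix below is purely an assumption, as are the function names.

```python
import struct

# Assumption: a 4-byte big-endian length prefix. The real width is not stated.
_PREFIX = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Jaxer side: prefix a message with its length."""
    return _PREFIX.pack(len(payload)) + payload


def unwrap(buf: bytes):
    """Manager side: strip prefixes from buffered pipe data.

    Returns (payloads_to_forward, request_done, unconsumed_bytes).
    """
    forward = []
    done = False
    offset = 0
    while len(buf) - offset >= _PREFIX.size:
        (length,) = _PREFIX.unpack_from(buf, offset)
        if length == 0:
            # Zero-length message: the request is finished.
            done = True
            offset += _PREFIX.size
            break
        start = offset + _PREFIX.size
        if len(buf) - start < length:
            break  # wait for the rest of this message
        forward.append(buf[start:start + length])
        offset = start + length
    return forward, done, buf[offset:]
```

Each payload in the first element of the result would be written to the web server as-is, and a true second element is the cue to return the Jaxer to the idle pool.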