Signs of Triviality

Opinions, mostly my own, on the importance of being and other things.

Just-in-time translation of user-provided LESS via NodeJS - Yikes!

[Image: screenshot of the author's 'homepage' from around 1997]

Imagine, if you will, a Content Management System (CMS) that uses Cascading Style Sheets (CSS) to make things look pretty. I'm told they are all the rage these days; even I wrote something using CSS with what called itself DHTML back in... 1997 or so.

Anyway, so web developers used to focus on making pretty things, but in doing so came to realize that it'd be useful to apply certain logic constructs, transformations and a little of the usual this'n'that. And thus, following the general rule that another layer of abstraction cannot possibly hurt, people came up with CSS metalanguages, such as SASS and LESS. That is, after processing the input these "languages" generate valid CSS as output. (Some people like to call this "compiling", but in my funny little world "compilation" has to do with translation of one language into a lower-level language such as assembly or machine code. Oh well.)

LESS was apparently originally written in Ruby, but nowadays is released as JavaScript. That is, you can provide the less.js script to your clients and have the translation occur in the browser. This has the benefit of pushing the execution completely to the client, which of course is also its distinct disadvantage: first you have to shuffle the LESS "compiler" to the client, then burden the client with the execution and finally, well, guess what, you're relying on the client to be able to run JavaScript. Alternatively, you can just perform the translation beforehand, generate the resulting CSS and simply serve that to your clients.

The general development/deployment process then seems naturally to be something along the lines of:

  1. write LESS (haha, wonderful, get it?)
  2. translate LESS into CSS
  3. ship .css files
Seems like a no-brainer, right?

[Image: I like pretty things.]

Well, here come the web developers again, with their desire to make pretty things. Turns out, it's frequently not entirely unreasonable to wish to perform the LESS=>CSS translation on the fly, allowing your developers and designers to build dynamic templates and such that get your approved-prettiness stamp of approval. And so you start down the road of dynamic translation (or "Just-In-Time (JIT) compilation").

There are a number of different ways that you can perform this JIT translation; two common ways are to use either Rhino or Node.js. If, to come up with an arbitrary example, you were running your web application inside of tomcat, then choosing Rhino would seem to make sense, since you already have the JVM loaded.

[Image: Perfect stocking stuffer]

Turns out, Rhino sucks donkey balls when it comes to performance. Running a node.js web server alongside your tomcat and having your application talk to node.js over HTTP is faster by an order of magnitude. So your process flow is then:

  1. a user makes a request to your web site
  2. your tomcat web app determines what LESS to use
  3. your tomcat web app makes an HTTP request to your local node.js service to translate this LESS code into CSS
  4. your node.js performs the translation on the fly
  5. your tomcat web app serves CSS to the client
Alright, so far so good. Seems a bit convoluted, but hey, if it makes things pretty...
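
To make the hand-off in steps 3 and 4 a little more concrete, here is a rough sketch in Python (picked only for brevity; the real pieces would be a Java web app and a node.js service). The handler class is a stand-in that merely prepends a comment instead of running LESS, and the /translate path, the 5-second timeout and all names are assumptions made for illustration, not anything taken from a real deployment.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StandInTranslator(BaseHTTPRequestHandler):
    """Plays the role of the local node.js service. It does NOT run LESS."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        less_source = self.rfile.read(length)
        # Step 4: a real service would hand less_source to the LESS
        # translator here. Whatever that input contains is now being
        # processed on the server, not in anybody's browser.
        css = b"/* translated by stand-in */\n" + less_source
        self.send_response(200)
        self.send_header("Content-Type", "text/css")
        self.send_header("Content-Length", str(len(css)))
        self.end_headers()
        self.wfile.write(css)

    def log_message(self, *args):
        pass  # keep the example quiet


def translate_via_service(less_source, url):
    """Step 3: the web app POSTs LESS to the local service and reads back CSS."""
    request = urllib.request.Request(
        url, data=less_source.encode("utf-8"), method="POST"
    )
    # Talk to the local service directly, ignoring any proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.read().decode("utf-8")


def demo():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), StandInTranslator)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/translate" % server.server_port
        # Step 5 would send this string on to the user's browser.
        return translate_via_service("a { color: red; }", url)
    finally:
        server.shutdown()
        server.server_close()
```

Note that the web app forwards the LESS text verbatim; nothing in this flow cares who wrote it.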

But now things are getting interesting: people realize that it would be wonderful to let your users supply their own stylesheets. Let them monkey around with the layout to their heart's content - what's the harm? It's just CSS, static text, right?

Look back at the process flow above and think about where content comes from and where it executes. If you allow users to supply their own LESS, you then end up executing user-provided code on your machine!

But, but, but... LESS only allows you to do a few minor things, it's not more than a bit of syntactic sugar on top of CSS, right? Well, I'm afraid the answer is: no, that is not quite correct. Here are a few things to be careful about in this context:

  • size limitations -- it is conceivable that a user could create a very large LESS file to upload, which could take significant resources to process
  • intensive processing/calculations -- LESS offers the ability to perform some rudimentary mathematical calculations; it is conceivable that a user could perform calculations that would take significant resources to process
  • importing -- LESS allows you to @import libraries; it is conceivable that users may attempt to use this feature to either load additional files or to probe for the existence of files on the server. Information about your environment might be exposed in this manner.
  • JavaScript evaluation -- LESS allows you to evaluate JavaScript code via backticks. This feature can be abused to then perform virtually everything and anything that can be accomplished with JavaScript. What's even more concerning is that since the JavaScript runtime in this case is not the browser sandbox, but the nodejs process running on your server, one can write LESS code that not only exposes significant amounts of information about the web server (for example via nodejs's process module), but can even access or influence other processes (for example by sending signals via process.kill), (attempt to) load dynamically linked libraries into the runtime, etc.
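
The size and importing concerns can at least be reduced with a check that runs before the input ever reaches the translator. The following Python sketch rejects (rather than rewrites) oversized input and anything that mentions @import. The 64 KiB limit and all names are hypothetical, the pattern match is deliberately crude and will also reject harmless input (say, a comment containing the word), and it does nothing about backticks or expensive calculations.

```python
import re
from typing import List

# Hypothetical ceiling; pick whatever your stylesheets realistically need.
MAX_LESS_BYTES = 64 * 1024

# Matches "@import" as well as variants such as "@import-once".
IMPORT_RULE = re.compile(r"@import\b", re.IGNORECASE)


def preflight_problems(less_source: str) -> List[str]:
    """Return the reasons to refuse this input; an empty list means none found."""
    problems = []
    if len(less_source.encode("utf-8")) > MAX_LESS_BYTES:
        problems.append("input exceeds %d bytes" % MAX_LESS_BYTES)
    if IMPORT_RULE.search(less_source):
        problems.append("@import is not allowed in user-supplied LESS")
    return problems
```

An empty result only means these two checks found nothing; it is not a statement that the input is safe to translate server-side.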

Of the above concerns, the JavaScript evaluation really is the single roadblock; the other issues can be dealt with. The options to address the evaluation problem are:

  • fork LESS and rip out backtick support; this means you then incur the notable cost of maintaining your own fork of the software and of communicating to your users that this feature is not available; you also have to be careful that this feature is not accidentally added back in (at upgrade/merge from upstream or by developers wishing to take advantage of this feature)
  • attempt to "sanitize" user input; this is notoriously unreliable and usually fails spectacularly
  • perform LESS=>CSS translation in the browser sandbox client-side; you can restrict this behaviour to user-provided LESS input only and still have your own, "trusted" LESS code generated on the fly. Or, even better, you rework your application such that all CSS code is generated pre-deployment (and user-provided code remains client-generated).
Letting users run code on your machine makes me a sad panda.

(It should be noted that it's quite likely that the same problem currently exists with the Rhino implementation. There, it's just a question of what functionality is exposed by Rhino and what can be done within the Rhino context. Since Rhino runs within the tomcat/jvm context, that might actually be even scarier, but I have not looked into this.)

Well, I for one know what I'd prefer, but it's rare that I get what I want. I really miss the days when an HTTP server was an HTTP server and served you a static file that you could hunt down and inspect on the server's filesystem...

February 27, 2012
