Polyglot Programming with WebAssembly: A Practical Approach

Key Takeaways

  • Like other technologies, WebAssembly has outgrown its initial purpose and is now showing up in cloud, edge, embedded, and other environments besides the browser.
  • The WebAssembly Component Model (WCM) allows one WebAssembly binary to safely interact with another over a structured interface.
  • With WCM, libraries written in different languages (such as Rust, Python, JavaScript, Go, and more) can interoperate in a true polyglot fashion.
  • A WebAssembly component exposes a set of imports and exports using the WebAssembly Interface Types (WIT) language. The WebAssembly runtime then instantiates all the proper components to run the application.
  • The WCM is similar to technologies such as gRPC and COM, but its advantage lies in the combination of a strong security sandbox and the absence of network transport or costly encoding and decoding.

How WebAssembly Outgrew the Browser

WebAssembly (sometimes abbreviated to Wasm) was built for the browser. However, like many technologies, it has outgrown its initial purpose and become more significant. And with the recent WebAssembly Component Model (WCM) launch, we are getting a taste of a new kind of software development. It’s a world where you can pick the best language for the immediate job and select the best upstream libraries regardless of their language. In short, it’s a true polyglot world.

When Luke Wagner, then at Mozilla, first introduced the world to WebAssembly in 2015, this is what he had to say about it:

I’m happy to report that we at Mozilla have started working with Chromium, Edge, and WebKit engineers on creating a new standard, WebAssembly, that defines a portable, size- and load-time-efficient format and execution model specifically designed to serve as a compilation target for the Web.

Mozilla, Google, Microsoft, and Apple had teamed up to build an open-source technology that would enable compiling languages other than JavaScript into a standard format, interoperable with JavaScript, that a browser could execute.

It attempted to remedy several false starts in making the browser home to multiple languages. We remember Java Applets, Silverlight, and Flash. This path had been trodden before. But this time, reasoned Luke and the other early WebAssembly developers, the underlying tech would be open source, broadly supported, and language-agnostic.

WebAssembly in the browser has been stable for a long time now. The W3C has released a final recommendation of version 1.0 and is already working on version 2.0. Several high-profile web apps, including Figma and Microsoft Office Online, already use WebAssembly.

But a second look at Luke’s WebAssembly introduction reveals why WebAssembly is promising beyond the browser: "WebAssembly [...] defines a portable, size- and load-time-efficient format and execution model." This format is a new generation of write once, run anywhere, a generation built on the decades of research on bytecode optimization performed on Java and .NET.

Developers quickly noticed that WebAssembly was promising in many other areas of computing, including plugin architectures, IoT, edge, and cloud computing. After all, WebAssembly has a security sandbox (stronger, in fact, than Docker containers) and can run on Windows, macOS, Linux, and a wide variety of specialized operating systems. It is fast, and WebAssembly binaries are compact and easy to transport.

Luke and many other early WebAssembly developers were so excited by the new prospects that they formed the Bytecode Alliance to extend the specification toward general use and to build reference implementations. A recent Cloud Native Computing Foundation (CNCF) report shows that WebAssembly has rocketed to popularity outside the browser. At KubeCon EU in Paris in March 2024, WebAssembly was the second most talked about topic, trailing only Docker.

Extending WebAssembly

For a technology like WebAssembly to achieve even a small amount of success, many languages must support compiling to WebAssembly. These days, most of the top 20 languages support WebAssembly to some degree.

And this has opened up a fascinating new possibility.

If a Python program can run as WebAssembly and a Rust program can run as WebAssembly, we know we have two languages compiled to the same bytecode format. That bytecode format is portable and can be executed on a runtime (like Java and .NET). That means the runtime can access the runtime information (like which function is called or which type is used) for every bit of code. In a way, the runtime has a language-neutral view of the executing code.

Would it be that hard to make it possible for Python and Rust code to talk to each other using the runtime as a substrate? Would it be that hard to allow a Rust binary to load a Python library (compiled to WebAssembly) and use it as if it were written in Rust? Go, C/C++, C#, JavaScript/TypeScript, PHP, Swift, Kotlin, and several other languages can all compile to WebAssembly. In theory, building a system that allows all of those languages to work together inside a WebAssembly runtime should be possible.

That theory is now a reality.

The WCM specifies how one WebAssembly binary is linked to other WebAssembly binaries. More specifically, it allows a WebAssembly binary to declare its exports (functions, types, and interfaces) and imports (functions, interfaces, and types that it needs to get from some other source).

Figure 1: The WebAssembly Component Model

Suppose a program needs to parse some YAML, run an AI inference against an LLM, and then store the results in key-value storage. With the WCM, we might build the program this way: the main code, written in TypeScript, declares that it needs a few different libraries:

  • A YAML parser that can take a string and return an object
  • An AI inferencing service that can take a prompt and return an answer (both as strings)
  • A key-value storage that allows setting and getting string pairs
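
As a rough sketch of how the main program might consume those three libraries, the TypeScript below models each one as an interface and wires them together. Every name here (`YamlParser`, `Inference`, `KeyValueStorage`, `Imports`, `run`) is hypothetical and chosen only for illustration, and the stand-in implementations are stubs rather than real components.

```typescript
// Hypothetical shapes for the three imports; none of these names come
// from a real package or from the component model tooling.
interface YamlParser {
  parse(text: string): Record<string, string>;
}
interface Inference {
  infer(prompt: string): string;
}
interface KeyValueStorage {
  get(key: string): string;
  set(key: string, value: string): void;
}

interface Imports {
  yaml: YamlParser;
  llm: Inference;
  store: KeyValueStorage;
}

// Main program logic: parse the config, ask the model, store the answer.
function run(imports: Imports, yamlText: string): string {
  const config = imports.yaml.parse(yamlText);
  const answer = imports.llm.infer(config["prompt"] ?? "");
  imports.store.set("latest", answer);
  return answer;
}

// Stand-in implementations so the sketch runs on its own.
const saved: Record<string, string> = {};
const stubs: Imports = {
  yaml: {
    // Handles only flat "key: value" lines; a real YAML parser does far more.
    parse: (text) => {
      const out: Record<string, string> = {};
      for (const line of text.split("\n")) {
        const i = line.indexOf(":");
        if (i > 0) out[line.slice(0, i).trim()] = line.slice(i + 1).trim();
      }
      return out;
    },
  },
  // Echoes the prompt instead of calling a model.
  llm: { infer: (prompt) => `echo: ${prompt}` },
  store: {
    get: (key) => saved[key] ?? "",
    set: (key, value) => {
      saved[key] = value;
    },
  },
};

const result = run(stubs, "prompt: hello");
console.log(result);
```

The main logic in `run` depends only on the three interfaces, so it has no knowledge of what language or backend sits behind each one.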

As it stands today, the developer of this project would choose to write a WebAssembly Interface Types (WIT) file describing these requirements. WIT is an interface definition language (IDL) describing how a given system can exchange structured data with other systems. If you’ve ever worked with technologies like Protobuf, Thrift, or MsgPack, you’ve seen an IDL in action.

A WIT file defines the imports (needs) and exports (capabilities) for a WebAssembly binary. In our example above, the program needs three different behaviors. But let’s just zoom in on one. It might ask for a key-value storage implementation with an interface like this:

interface key-value-storage {
    get: func(key: string) -> string;
    set: func(key: string, value: string);
}

That interface declares two functions (get and set) that have specific signatures. In the TypeScript code of the main program, using that object might look something like this:

import {get, set} from "@someplace/KeyValueStorage"

function getLatest(): string {
	return get("latest")
}

function setLatest(value: string) {
	set("latest", value)
}

Of course, the developer must find an implementation of key-value storage (or write their own as another component) that exports the same WIT interface. Once they have a matching implementation, they link the upstream key-value storage implementation with their TypeScript code.

What the developer doesn’t need to know or worry about is the language the upstream key-value storage implementation was written in. It could be Rust, Python, or C. It doesn’t matter. They all work the same. Thanks to the WIT specification, the upstream implementation and the developer’s code already agree on the shape of the APIs. As the TypeScript code above illustrates, from the developer’s perspective, that component appears as a TypeScript library.

There is one more little detail in this example. Not just one but many implementations of the key-value storage may exist. One might use a local JSON file to store the data, another may use Redis, and another may just be a mock for unit testing. The application can be built using whichever version is appropriate as long as the interfaces match.
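
The sketch below illustrates that interchangeability in plain TypeScript. It assumes a hypothetical `KeyValueStorage` interface that mirrors the WIT definition above, with two stand-in implementations: an in-memory mock and one that keeps its data as a serialized JSON string (a stand-in for a file-backed store). Returning an empty string for a missing key is also an assumption, made because the WIT interface returns a plain `string`.

```typescript
// Hypothetical TypeScript mirror of the key-value-storage WIT interface.
interface KeyValueStorage {
  get(key: string): string;
  set(key: string, value: string): void;
}

// In-memory mock, the kind of implementation a unit test might use.
class InMemoryStorage implements KeyValueStorage {
  private readonly data = new Map<string, string>();

  get(key: string): string {
    // Assumption: a missing key yields "" because the interface returns string.
    return this.data.get(key) ?? "";
  }

  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Keeps all entries in one serialized JSON string, standing in for a
// store that would persist the same text to a local file.
class JsonDocumentStorage implements KeyValueStorage {
  private json = "[]";

  get(key: string): string {
    const entries = JSON.parse(this.json) as [string, string][];
    return new Map(entries).get(key) ?? "";
  }

  set(key: string, value: string): void {
    const map = new Map(JSON.parse(this.json) as [string, string][]);
    map.set(key, value);
    this.json = JSON.stringify(Array.from(map));
  }
}

// Application code depends only on the interface, never on a concrete class.
function trackLatest(storage: KeyValueStorage) {
  return {
    getLatest: (): string => storage.get("latest"),
    setLatest: (value: string): void => storage.set("latest", value),
  };
}

// Either implementation can be passed in without changing trackLatest.
const tracker = trackLatest(new InMemoryStorage());
tracker.setLatest("v1");
console.log(tracker.getLatest());
```

With real components the swap would happen when the application is linked rather than in a constructor call, but the application code is equally unaffected.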

Saving Tens of Thousands of Work Hours

One may initially be skeptical that this model is valuable. The reasoning might go as follows: I’m already happy writing in my preferred language; I don’t need this feature.

But hidden in that statement is the value proposition. We have to build a language-specific set of libraries for every language we currently use. JavaScript has its YAML parsers, HTTP libraries, and timestamp parsers. Rust has its own version of each of those. So does Go. And so does every other language.

Each of these libraries has to be maintained by individuals who have to dedicate time to solving a problem already solved in other languages. Tens of thousands of work hours are probably expended developing these same features across multiple languages, even though the core behaviors are (for all intents and purposes) the same! And why? Because there is no way to use libraries in a truly cross-language way.

Adding to the frustration, even well-specified standards like HTTP and YAML are implemented unevenly across language ecosystems. Subtle differences in YAML serialization lead to unexpected bugs when a piece of content is serialized by one language’s library and deserialized by another’s. These bugs waste time and sometimes cause disruptive failures or costly outages.

Each developer should be empowered to write in the language of their choice. If developers prefer to write in just one language, they should be empowered to do so. The WCM lets developers choose their language and still use state-of-the-art features. One developer can write in Python and another in C#, using the same upstream libraries.

Figure 2: WebAssembly Component users can use components written in multiple languages

Moreover, for particularly performance-sensitive code like parsing, developers may write a library in a high-performance language like Rust or C. AI and ML are a strong suit of the Python world. Suddenly, those Python libraries will be available to Go and JavaScript programs. In short, we can write fundamental code in the language best suited for solving an immediate problem. We can do so without dictating what language other developers must use.

WebAssembly Component Model Is Different

What distinguishes the WCM from its predecessors? From Common Object Request Broker Architecture (CORBA) and Component Object Model (COM) in the 1990s to more recent popular frameworks like gRPC and Thrift, many technologies have provided ways of passing structured data.

WebAssembly is unique in bringing this concept into a sandboxed bytecode format that provides component-to-component isolation without requiring a network interface. When two components are executed together, they run in the same WebAssembly runtime but in separate execution contexts (or sandboxes). Each component runs with its own isolated memory, which prevents a whole class of memory-oriented attacks.

Thanks to the capabilities-based security model, each component can have different security parameters. For example, a YAML parser may not have access to the Internet but can access a limited part of the filesystem. Also, because the components are run locally, there is no need for the expensive network overhead of frameworks like gRPC or Thrift.
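
To make the capability idea concrete, here is a toy model in plain TypeScript. It does not use any real WebAssembly runtime API, and every name in it (`Capabilities`, `scopedReader`, `yamlParserComponent`, the fake file table) is hypothetical. The point it illustrates is that a component can only reach the outside world through the functions the host chooses to hand it.

```typescript
// Toy model of capability-based security. This is NOT a real WebAssembly
// runtime API; every name here is hypothetical.

// A capability is simply a function the host chooses to give a component.
interface Capabilities {
  readFile?: (path: string) => string;
  httpGet?: (url: string) => string;
}

// Host-side helper: a read capability scoped to one directory prefix.
function scopedReader(
  files: Record<string, string>,
  root: string
): (path: string) => string {
  return (path: string): string => {
    // Simplistic check for the sketch; a real host would normalize paths.
    if (!path.startsWith(root) || path.includes("..")) {
      throw new Error(`access denied: ${path}`);
    }
    const content = files[path];
    if (content === undefined) {
      throw new Error(`not found: ${path}`);
    }
    return content;
  };
}

// A "component": it has no ambient access, only what it was given.
function yamlParserComponent(caps: Capabilities) {
  return {
    loadConfig(path: string): string {
      if (!caps.readFile) throw new Error("no filesystem capability");
      return caps.readFile(path);
    },
    phoneHome(url: string): string {
      if (!caps.httpGet) throw new Error("no network capability");
      return caps.httpGet(url);
    },
  };
}

// The host grants read access to /config/ only, and no network access.
const fakeDisk: Record<string, string> = {
  "/config/app.yaml": "name: demo",
  "/secrets/key": "not-for-parsers",
};
const parser = yamlParserComponent({
  readFile: scopedReader(fakeDisk, "/config/"),
});

console.log(parser.loadConfig("/config/app.yaml"));
```

In this model, calling `parser.loadConfig("/secrets/key")` or `parser.phoneHome(...)` throws, because the host never granted those capabilities. A real runtime enforces the same boundary at the sandbox level rather than through ordinary function checks.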

Three Big Hurdles

Despite its promise, the WCM has three critical challenges to overcome before we expect substantial usage.

WebAssembly’s most significant strength also exposes its key risk. A neutral bytecode format is only helpful if language toolchains support it. While most prominent languages have at least some degree of support, many of them lag. One important area where many languages are behind is support for the component model. Rust, C, Python, and JavaScript/TypeScript have support, but most other languages do not.

Even when they all support components, one interesting question is whether each language will perform well as a tool for writing components. Languages with garbage collectors (like Go) will be slower at runtime than those without (like Rust and C). Scripting languages (like Python and JavaScript) will be slower still, since each requires an interpreter.

Finally, for components to catch on, they must reach critical mass. Two conditions must be met: developers must create and distribute components, and developers targeting Wasm must consciously decide to use components instead of native libraries. That is a lot of inertia to overcome, since each major language has already amassed a huge number of libraries.

Likely, the best way to overcome this inertia is to create components of such high value and quality that developers desire to use them over available alternatives. That’s a big task for a budding community to fulfill.

The Present and Future of WebAssembly Components

A month ago, the WCM was released as part of the WebAssembly System Interface (WASI) v0.2 specification (sometimes called WASIp2). Already, a number of tools have arisen to assist with everything from automatically generating and consuming WIT (like we created in our example above) to linking components into apps and running these apps in production environments.

The tools are evolving rapidly, but plenty of work remains to be done. Those early to this ecosystem will begin writing the base components that become the foundation for the component ecosystem.

Not all WebAssembly-compatible languages have component implementations yet. Python, JavaScript, and Rust were the first to the game, but others are progressing rapidly. Throughout 2024, different language communities will introduce component tooling.

There are already production-grade tools for WebAssembly components. The open-source Spin project has supported components for almost a year, which means component-based applications can run in environments as diverse as Raspberry Pis and giant Kubernetes clusters.

However, to see components make it into the mainstream, WebAssembly has a few hurdles to leap. Languages must support the core bytecode specification and the latest specs. Performance, especially across various languages, needs to be analyzed and optimized. Perhaps most importantly, though, enough components must be produced and consumed to make the technology a viable everyday choice for developers.

Components are here, and they unlock true polyglot applications. Since a consortium of browser companies first announced the WebAssembly project, the technology has always relied upon the cooperation of many disparate communities. Language developers, browser creators, and a cloud ecosystem must all find common ground to collaborate. And that is what will push components into the mainstream.

Evidence of this collaboration has sprouted as we see forward momentum in several languages and runtimes. This moment presents a rare opportunity to become involved in creating this technology. The Bytecode Alliance, which maintains the tools and documentation at the core of the WCM, is a good place to kickstart your involvement with it.
