RFC: fluent babel plugin #625

KevinMind · 2024-01-19T11:22:54Z

Summary

I've been investigating fluent for addons-frontend along with other popular localization "frameworks" (collections of packages used for l10n).

What I've found is that fluent seems to be the clear winner in terms of syntax features, but lacks some support in terms of usability and runtime features. I've done a pretty deep comparison of both the localization syntaxes and the libraries available for actually rendering text on a page, and I've drawn the conclusion that fluent needs a babel plugin (potentially multiple).

Background

In addons-frontend we have been using the gettext API in our source code and .po as a format for storing translations. One of the killer features the team has grown to depend on is extracting translations from source code.

Instead of this:

{l10n.getString('my-message-id')}

We do this:

{l10n.extractString(`This is the english text for my message`)}

I won't go into why you might want to do that as I think it is a debate in its own right, but for the sake of concision, just assume this is a valid use case and we can debate its validity in the comments.

To make this work, we need to run a slightly more complex build pipeline that includes an additional step to extract the translation from the source code. (babel and eslint enter the room)

Using babel/eslint to parse source code enables additional use cases like:

validate that a message exists for every ID you call (similar proposal)
validate the arguments passed are present and of the right type
obfuscate message IDs to prevent collision
and a super awesome bonus use case I'll get to in the options... ✨ fluent-in-js ✨

Proposal

I propose we implement babel/eslint plugin(s) to unlock these use cases. There is a lot of wiggle room in terms of exactly which use cases we could/want to cover, but they would all include something like:

a visitor to parse the code and generate a fluent AST. (this is the "current state" of your code base in terms of what you are consuming in your build)
a comparison algorithm that takes the above AST, parses your existing .ftl files into a second AST, then walks both and diffs them.
from this we can build use case specific hooks to implement all of the above use cases.

All options are somewhat independent and we could implement any combination of them. The core proposal here is to build the baseline visitor and walker components and expose an API that each option could extend to implement the given set of requirements.
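
The shape of that shared API could be sketched roughly as below. Every name here (MessageUsage, Diff, Hook, diffMessages, runHooks) is hypothetical and not part of fluent.js; in a real plugin the usages would come from a babel visitor and the defined IDs from parsing .ftl files, whereas this sketch takes both as plain inputs.

```typescript
// Hypothetical shapes: none of these names exist in fluent.js today.
interface MessageUsage {
  id: string;   // message ID referenced in source code
  file: string; // file where the reference was found
}

interface Diff {
  missing: MessageUsage[]; // used in code, not defined in any .ftl file
  unused: string[];        // defined in .ftl, never referenced in code
}

interface Diagnostic {
  severity: "error" | "warning";
  message: string;
}

// A hook turns the shared diff into use-case specific diagnostics.
type Hook = (diff: Diff) => Diagnostic[];

// Baseline comparison: look at both sides and record what does not line up.
function diffMessages(usages: MessageUsage[], definedIds: string[]): Diff {
  const defined = new Set(definedIds);
  const used = new Set(usages.map((usage) => usage.id));
  return {
    missing: usages.filter((usage) => !defined.has(usage.id)),
    unused: definedIds.filter((id) => !used.has(id)),
  };
}

// Run every registered hook against the same diff and collect the results.
function runHooks(diff: Diff, hooks: Hook[]): Diagnostic[] {
  const diagnostics: Diagnostic[] = [];
  for (const hook of hooks) {
    diagnostics.push(...hook(diff));
  }
  return diagnostics;
}

// Two example hooks, one per use case.
const requireDefined: Hook = (diff) =>
  diff.missing.map((usage) => ({
    severity: "error" as const,
    message: `${usage.file}: message "${usage.id}" is not defined`,
  }));

const warnUnused: Hook = (diff) =>
  diff.unused.map((id) => ({
    severity: "warning" as const,
    message: `message "${id}" is never used`,
  }));

const diagnostics = runHooks(
  diffMessages([{ id: "msg-one", file: "App.tsx" }], ["message-one"]),
  [requireDefined, warnUnused],
);
console.log(diagnostics);
```

Keeping the diff separate from the hooks is what would let each option below plug in its own checks without re-parsing anything.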

Option 1 Validate all messages exist

message-one = One

<Localized id="msg-one" />

Pros

prevent errors in the build
better DX with visualized errors and autocomplete

Cons

Requires a static and predictable location of .ftl files relative to where they are used.

We can fail the build, surface a red squiggly in your IDE and even support autocomplete features in your code.
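
As a rough illustration of the check itself, the sketch below finds the msg-one reference from the example above that has no definition. It uses regular expressions as a stand-in for a real babel visitor and a real fluent parser, so it only handles the simplest cases; the function names are made up for this example.

```typescript
// Collect message IDs defined in fluent source. This is a deliberately small
// stand-in for a real parser: it only recognizes "identifier =" at line start.
function definedMessageIds(ftl: string): Set<string> {
  const ids = new Set<string>();
  const pattern = /^([a-zA-Z][\w-]*)\s*=/gm;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(ftl)) !== null) {
    ids.add(match[1]);
  }
  return ids;
}

// Collect IDs referenced via <Localized id="..."> or getString('...').
// A real plugin would visit JSX attributes and call expressions in the AST.
function referencedMessageIds(source: string): string[] {
  const ids: string[] = [];
  const pattern = /<Localized\s+id="([^"]+)"|getString\(\s*['"]([^'"]+)['"]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    ids.push(match[1] || match[2]);
  }
  return ids;
}

// Every referenced ID that has no definition is a build error candidate.
function findMissingMessages(source: string, ftl: string): string[] {
  const defined = definedMessageIds(ftl);
  return referencedMessageIds(source).filter((id) => !defined.has(id));
}

const exampleFtl = "message-one = One\n";
const exampleSource = '<Localized id="msg-one" />';
console.log(findMissingMessages(exampleSource, exampleFtl));
```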

Option 2 Validate message arguments

# $arg (Number) this is a number 
message-one = One { $arg }

<Localized id="message-one" vars={{foo: true, $arg: '2' }} />

TypeScript-enabled IDEs like VS Code would be able to tell you that '2' is not a Number and suggest a change. We could also determine that foo is not a valid arg for this translation.

Doing this would likely require auto-generated types that map each message ID to an interface describing its defined variables. We would need a structured way of parsing comments that can map variable names to a type.
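
A minimal sketch of that type generation, assuming the comment convention from the example above ("# $name (Type) description" directly before the message). Both the convention and the names collectVariableTypes, generateTypes and MessageArgs are assumptions made for illustration.

```typescript
type FluentVarType = "Number" | "String";

// Map the type names used in the comment convention to TypeScript types.
const tsTypes: Record<FluentVarType, string> = {
  Number: "number",
  String: "string",
};

interface MessageVars {
  [variable: string]: FluentVarType;
}

// Read "# $name (Type) description" comments and attach them to the message
// definition that directly follows. A blank line detaches pending comments.
function collectVariableTypes(ftl: string): Map<string, MessageVars> {
  const result = new Map<string, MessageVars>();
  let pending: MessageVars = {};
  for (const line of ftl.split("\n")) {
    const comment = /^#\s*\$([\w-]+)\s*\((Number|String)\)/.exec(line);
    const message = /^([a-zA-Z][\w-]*)\s*=/.exec(line);
    if (comment) {
      pending[comment[1]] = comment[2] as FluentVarType;
    } else if (message) {
      result.set(message[1], pending);
      pending = {};
    } else if (line.trim() === "") {
      pending = {};
    }
  }
  return result;
}

// Emit one interface that maps every message ID to its variables.
function generateTypes(types: Map<string, MessageVars>): string {
  const lines: string[] = ["interface MessageArgs {"];
  types.forEach((vars, id) => {
    const fields = Object.keys(vars)
      .map((variable) => "$" + variable + ": " + tsTypes[vars[variable]])
      .join("; ");
    lines.push(`  "${id}": { ${fields} };`);
  });
  lines.push("}");
  return lines.join("\n");
}

const exampleTypes = generateTypes(
  collectVariableTypes("# $arg (Number) this is a number\nmessage-one = One { $arg }\n"),
);
console.log(exampleTypes);
```

With an interface like this generated, a typed wrapper around the lookup call could constrain the vars argument by message ID, which is what would let the IDE flag '2' and foo in the example above.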

Pros

prevent invalid arguments from being passed to translations
improved translator experience with stricter variables
simplifies debugging
simplifies refactoring/renaming

Cons

only applies to TypeScript users and/or TypeScript-enabled IDEs

Option 3 Extract translations

This is where things start to get wild so strap in.

Given this react pseudo code.

{l10n.createMessage(`Hello ${name}`, {description: "$name is the user's full name"})}

I could actually generate the fluent source translation file

# $name(String)
message-sdfkj = Hello { $name }

And I could transform the source into valid fluent.js

{l10n.getString('message-sdfkj', {$name: name})}

This option using the new "createMessage" api is implemented in the linked POC.
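
To make the transform concrete, here is a simplified sketch of the extraction step. It is not the code from the POC: extractMessage, hashText and the ID scheme are assumptions, and only plain identifiers inside the template literal are handled. It emits the description as the fluent comment rather than an inferred type.

```typescript
interface ExtractedMessage {
  id: string;          // generated message ID
  ftl: string;         // fluent source to append to the .ftl file
  replacement: string; // source code that replaces the createMessage call
}

// Small deterministic string hash (djb2 variant) so the same text always
// yields the same ID. A real plugin might prefer a cryptographic hash.
function hashText(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash.toString(36);
}

// `quasis` are the static text chunks of the template literal and
// `expressions` the source text of each ${...}. Babel exposes template
// literals in a similar quasis/expressions shape.
function extractMessage(
  quasis: string[],
  expressions: string[],
  description?: string,
): ExtractedMessage {
  let pattern = quasis[0];
  expressions.forEach((expr, i) => {
    pattern += "{ $" + expr + " }" + quasis[i + 1];
  });
  const id = "message-" + hashText(pattern);
  const comment = description ? "# " + description + "\n" : "";
  const vars = expressions.map((expr) => "$" + expr + ": " + expr).join(", ");
  return {
    id,
    ftl: comment + id + " = " + pattern,
    replacement: "l10n.getString('" + id + "', {" + vars + "})",
  };
}

const extracted = extractMessage(
  ["Hello ", ""],
  ["name"],
  "$name is the user's full name",
);
console.log(extracted.ftl);
console.log(extracted.replacement);
```

Deriving the ID from the message text means an unchanged string keeps its ID across builds, while an edited string gets a new one.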

Pros

(arguably) improved DX
much simpler API. You can pass expressions directly in the source code, including react elements. You don't really need .getElement with this in place
ties your translations to their usage in the code, which makes pruning trivial

Cons

additional complexity required to transform source code correctly
significantly different API surface could be confusing to those used to the existing api

Option 4 Full on fluent in js

I could generate fluent in typescript, with almost the same syntax but fully type aware.

We would have a fluent.ftl.{j|t}s file format for babel to parse.

import ftl, {selectorRef} from '@fluent/somewhere';
import {Gender} from '@types/gender';

export const simple = ftl.message`Hello World!`;

// ftl-description: The name of the user
export const name = ftl.term`World`;

// ftl-description: $ref: A specific reference
export const withVariableReference = ftl.message<{$ref: string}>`Hello ${({$ref}) => $ref}!`;

export const withTermReference = ftl.message`Hello ${name}!`;

export const genderedStream = ftl.selector<Gender>({
  male: `her stream ${name}`,
  female: 'his stream',
  other: `their stream ${selectorRef}`,
});

export const withSelectorReference = ftl.message<{
  $gender: Gender,
  $user: string,
  $photoCount: number,
}>((props) =>
  `${props.$user} added ${props.$photoCount} photos to ${genderedStream(props.$gender)}`
);
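
One way a tagged template like this could map the variable-reference arrow functions to fluent placeables is a recording proxy, sketched below. Note that this evaluates the template at runtime instead of parsing it with babel, so it is a different mechanism from the one proposed here and only illustrates the mapping. ftlMessage and placeableRecorder are hypothetical names, not an existing @fluent API.

```typescript
// Every name below is hypothetical; this is not an existing @fluent API.
type VariableRef<Vars> = (vars: Vars) => unknown;

// Calling a reference with this proxy reveals which variable it reads:
// any property access returns the matching fluent placeable as a string.
function placeableRecorder<Vars extends object>(): Vars {
  return new Proxy({} as Vars, {
    get: (_target, property) => `{ ${String(property)} }`,
  });
}

// Tagged template that serializes its content to a fluent pattern.
// Function interpolations become placeables, strings are inserted as-is.
function ftlMessage<Vars extends object = {}>(
  strings: TemplateStringsArray,
  ...refs: Array<VariableRef<Vars> | string>
): string {
  const recorder = placeableRecorder<Vars>();
  let pattern = strings[0];
  refs.forEach((ref, i) => {
    const value = typeof ref === "function" ? ref(recorder) : ref;
    pattern += String(value) + strings[i + 1];
  });
  return pattern;
}

const serialized = ftlMessage<{ $ref: string }>`Hello ${({ $ref }) => $ref}!`;
console.log(serialized); // Hello { $ref }!
```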

And we can import and reference the messages like this

import * as React from 'react';

import * as messages from './fluent.ftl';

export function Component() {
  return (
    <div>
      <h1>{messages.simple}</h1>
      <p>{messages.withVariableReference({ $ref: 'World'})}</p>
      <p>{messages.withTermReference}</p>
      <p>{messages.withSelectorReference({$photoCount: 23, $gender: 'female', $user: 'Elaine'})}</p>
    </div>
  );
}

This code would result in the following fluent

simple = Hello World!

# $term-name-2s23k (String) - The name of the user
-name = World

# $ref (String) - A specific reference
with-variable-reference = Hello { $ref }!

with-term-reference = Hello { -name }!

gendered-stream =
  { $selector-fdk23 ->
    [male] her stream { -name }
    [female] his stream
   *[other] their stream { $selector-fdk23 }
  }

with-selector-reference = { $user } added { $photo-count } photos to { $gendered-stream($gender) }

It would also produce the following transpiled fluent.ftl.js file, which is used by anyone importing messages from this resource.

const __GENERATED_RESOURCE_ID__ = 'sdflkjsdlkj';
export async function loadResource() {
   return import(`${__FLUENT_LOADER_SOURCE__}/${__GENERATED_RESOURCE_ID__}`)
}

And finally the bundled component file

import * as React from 'react';

// this is a generated import that allows us to pass our loadResource method
// to the context provider to lazy load the translations.
import {loadResource} from './fluent.ftl';
import {useLocalization, useL10Loader} from '@fluent/somewhere';

export function Component() {
  const {l10n} = useLocalization();
  useL10Loader({
    loader: loadResource,
    ...opts // additional opts could be injected for specific handling
  });
  return (
    <div>
      <h1>{l10n.getString('simple')}</h1>
      <p>{l10n.getString('with-variable-reference', { $ref: 'World'})}</p>
      <p>{l10n.getString('with-term-reference')}</p>
      <p>{l10n.getString('with-selector-reference', {$photoCount: 23, $gender: 'female', $user: 'Elaine'})}</p>
    </div>
  );
}

The examples are meant to show the high level architecture of how we would transform the code. The actual runtime would require a different approach to useLocalization that could be a completely different hook in order to maintain the existing api.

To break this down further, what's happening here is:

fluent.ftl.ts files produce a fluent resource file. Babel knows where to store it based on user-defined config.
The user renders their app with a special provider that stores loaded strings, similar to the existing l10n provider but with internally controlled loading of translations via the useL10Loader hook.
The fluent.ftl.ts file is imported by N components that want to render translated strings. Since we know where the generated fluent file is, we can insert its location and how to load it into the code.
The components importing messages are transpiled to inject a call to the provider instructing it to load the resource. We can do this lazily via Suspense or with explicit fallback content, or we can instead gather all loaders and load up front. This could be configurable so users decide their loading strategy.
The references to messages are replaced with calls to the existing getString API.
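
The provider side of these steps could look something like the sketch below: a store that remembers which resources were requested, so that N components importing the same file trigger a single load. ResourceStore is a hypothetical name and the loader here resolves inline instead of using a dynamic import; this is not how the existing fluent.js provider works.

```typescript
type Messages = Record<string, string>;
type Loader = () => Promise<Messages>;

// Hypothetical store a provider could own. It caches the pending request per
// resource ID so repeated load calls share one loader invocation.
class ResourceStore {
  private pending = new Map<string, Promise<Messages>>();
  private loaded: Messages = {};

  load(resourceId: string, loader: Loader): Promise<Messages> {
    const existing = this.pending.get(resourceId);
    if (existing) return existing;
    const request = loader().then((messages) => {
      Object.assign(this.loaded, messages);
      return messages;
    });
    this.pending.set(resourceId, request);
    return request;
  }

  getString(id: string): string {
    // Fall back to the ID while the resource is still loading.
    const has = Object.prototype.hasOwnProperty.call(this.loaded, id);
    return has ? this.loaded[id] : id;
  }
}

let loaderCalls = 0;
// Stand-in for the generated loadResource function.
const loadResource: Loader = () => {
  loaderCalls += 1;
  return Promise.resolve({ simple: "Hello World!" });
};

const store = new ResourceStore();
const firstRequest = store.load("sdflkjsdlkj", loadResource);
const secondRequest = store.load("sdflkjsdlkj", loadResource);
firstRequest.then(() => console.log(store.getString("simple"), loaderCalls));
```

Whether load is called lazily from each component or gathered up front is then only a question of who calls it and when, which fits the idea of a configurable loading strategy.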

Pros

probably the best DX. You are writing source code that looks and feels almost exactly like fluent, but with full ts support
composable. You get the benefits of extraction, but decoupling translation definition from usage also makes reusability possible
Automatic loading of locales, as lazy as you want.
Combines all of the use cases into something that could be implemented in a single component

Cons

definitely the hardest to implement
getting the best UX requires newer react primitives like Suspense and lazy
