I love internationalisation (i18n). It’s one of those domain problems that is sufficiently easy, beneficial, and satisfying to solve early in a project, and it can be very expensive to solve late. It also has side benefits not often considered: I’ve seen pluralisation problems solved like count === 1 ? 'apple' : 'apples' all too often, and I’m telling you there’s a better way.
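To make the complaint concrete, here is a minimal sketch of one alternative, using the Intl.PluralRules API built into JavaScript runtimes. The message table and the pluralise helper are made up for illustration; they are not part of this site's code, and a full solution would more likely use a message format library.

```typescript
// The plural categories defined by CLDR and returned by Intl.PluralRules.
type PluralCategory = "zero" | "one" | "two" | "few" | "many" | "other";

// Hypothetical shape: one noun form per plural category, "other" is required.
type PluralForms = Partial<Record<PluralCategory, string>> & { other: string };

// Hypothetical message table for illustration only.
const apples: Record<string, PluralForms> = {
  en: { one: "apple", other: "apples" },
  // Polish distinguishes "one", "few" and "many", which a ternary cannot express.
  pl: { one: "jabłko", few: "jabłka", many: "jabłek", other: "jabłka" },
};

function pluralise(locale: string, count: number, forms: PluralForms): string {
  // Ask the runtime which plural category this count falls into for the locale.
  const category = new Intl.PluralRules(locale).select(count) as PluralCategory;
  return `${count} ${forms[category] ?? forms.other}`;
}

console.log(pluralise("en", 1, apples.en));
console.log(pluralise("pl", 5, apples.pl));
```

The point is that the number of plural forms is a property of the language, not of the code, so the branching belongs in locale data rather than in a hard-coded ternary.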
For this site I have the goal of using AI to translate the content into as many languages as the LLM will properly handle. This is for two reasons: first to fulfill a love of i18n, and second to have a chance to work with and familiarise myself more with AI and LLMs. With this goal in mind, we can expect English source mdx files to output to locale-specific folders, each containing the same mdx files but translated.
Translating mdx files isn’t the only concern, however. There are still the other general textual elements of the website, e.g. the text in category tags (like general/log/adventures), as noted during the initial implementation.
Astro’s out-of-the-box internationalisation support focuses on locale detection and locale routing: e.g. allowing you to place English content under the url /en/, French content under /fr/, and so on, as well as automatically sending the user to the appropriate route based on their browser locale.
However, one aspect missing from Astro’s internationalisation documentation page is how to create Astro pages that import and render content from the locale-specific content folder for the current locale. There are breadcrumbs and hints on other pages in this regard, but bringing it all together, the key elements are as follows.
Firstly, we need to declare the locales that Astro should support in astro.config.mjs by adding e.g.:
i18n: {
  defaultLocale: 'en',
  locales: ['en', 'fr'],
  routing: {
    prefixDefaultLocale: true,
  },
},
Secondly, we need to add an Astro page inside a [lang] folder, such as [lang]/index.astro. Because this path contains a route parameter, this index.astro will need to export getStaticPaths (when in SSG mode) to tell the router all the possible paths that will be valid. We’re going to need access to the locale list provided in astro.config.mjs. Of note is that the locale list is actually a shorthand for the url paths generated for the supported locales. For example, en-US and en-NZ would both map to en if configured like:
locales: [{ path: "en", codes: ["en-US", "en-NZ"] }];
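To spell out that shorthand, here is a small self-contained sketch of how an entry in the locales array relates to a url path. The LocaleEntry type and both helpers are hypothetical and written only for illustration; this is not Astro's implementation.

```typescript
// Mirrors the two shapes used in the locales array: a plain string, or an
// object with a url `path` and the locale `codes` served under that path.
type LocaleEntry = string | { path: string; codes: string[] };

// Hypothetical helper: the url segment that a locale entry produces.
function localeUrlPath(entry: LocaleEntry): string {
  return typeof entry === "string" ? entry : entry.path;
}

// Hypothetical helper: find the url segment that serves a given locale code.
function pathForCode(locales: LocaleEntry[], code: string): string | undefined {
  const match = locales.find((entry) =>
    typeof entry === "string" ? entry === code : entry.codes.includes(code),
  );
  return match === undefined ? undefined : localeUrlPath(match);
}

const locales: LocaleEntry[] = [{ path: "en", codes: ["en-US", "en-NZ"] }, "fr"];

console.log(pathForCode(locales, "en-NZ"));
console.log(pathForCode(locales, "fr"));
```

In a real Astro project the getPathByLocale function from astro:i18n does this lookup against the actual config, so a helper like this is only useful for reasoning about the mapping.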
There’s no built-in Astro function to get the list of locale url paths directly, but using getRelativeLocaleUrlList from astro:i18n we can get what we need:
import { getRelativeLocaleUrlList } from "astro:i18n";

// Strip the slashes from each relative locale url, so a url like "/en/" becomes "en".
const pathSlugs = getRelativeLocaleUrlList().map((url) =>
  url.replaceAll("/", ""),
);

export function getStaticPaths() {
  return pathSlugs.map((locale) => ({
    params: { lang: locale },
  }));
}
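One caveat with stripping every slash: it assumes the relative url contains nothing but the locale segment. If the site sets a base path, the base would be merged into the slug. A slightly more defensive sketch takes only the last path segment. The localeSlug helper and the example urls below are assumptions for illustration, not output captured from a real build.

```typescript
// Hypothetical helper: take the last non-empty path segment as the locale slug.
function localeSlug(url: string): string {
  const segments = url.split("/").filter((segment) => segment.length > 0);
  return segments[segments.length - 1] ?? "";
}

// Example inputs shaped like relative locale urls, the last one with a base path.
const urls = ["/en/", "/fr/", "/docs/en/"];
const slugs = urls.map(localeSlug);

console.log(slugs);
```

With these inputs, stripping all slashes from the last url would give docsen, whereas taking the final segment still gives en.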
Thirdly, we had our posts in the contents/posts folder, but now we need to move them to locale-specific folders like contents/posts/en, contents/posts/fr, and so on. While Astro collections allow folders in the tree like this, collections don’t understand locales, so we must update our index.astro file once again to show only locale-appropriate posts. This is unfortunately a little bit manual:
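import { getCollection } from "astro:content";
import { getPathByLocale } from "astro:i18n";

// Keep only the posts whose id starts with the current locale's folder.
const currentLocalePosts = await getCollection(
  "posts",
  ({ id }) =>
    id.startsWith(
      `${getPathByLocale(Astro.currentLocale)}/`,
    ),
);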
It would be more polished if Astro could understand the relationship between locale url subfolders in collections, and check consistency between locales (i.e. locale codes), paths (i.e. locale urls), folder names, and collection filtering. Unfortunately, as it stands, this is a detail we’re required to manage ourselves!
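Since the consistency checking is left to us, a small check can at least be scripted. The sketch below is a hypothetical helper, not something Astro provides: it assumes entry ids look like en/hello-world, with the locale folder as the first segment, and compares those folders against the configured locale paths.

```typescript
interface LocaleReport {
  // Top-level folders in the collection that match no configured locale path.
  unknownFolders: string[];
  // Configured locale paths that have no entries at all.
  emptyLocales: string[];
}

// Hypothetical check: compare configured locale paths with the first folder
// segment of each collection entry id.
function checkLocaleFolders(localePaths: string[], entryIds: string[]): LocaleReport {
  const folders = new Set(entryIds.map((id) => id.split("/")[0] ?? id));
  return {
    unknownFolders: Array.from(folders)
      .filter((folder) => !localePaths.includes(folder))
      .sort(),
    emptyLocales: localePaths.filter((path) => !folders.has(path)).sort(),
  };
}

// Made-up ids for illustration: a stray "de" folder and no French posts yet.
const report = checkLocaleFolders(["en", "fr"], ["en/first-post", "en/second-post", "de/first-post"]);

console.log(report);
```

Something like this could run in a build script or a test, fed with the ids returned by getCollection and the slugs used for getStaticPaths.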
We’ve solved the first item on our list, displaying translated content at appropriate urls, but we still need to internationalise the site’s textual aspects.
Here, I search around briefly, and there doesn’t seem to be any obvious popular library that solves this problem. The Astro docs use the Starlight library, which handles this aspect manually without any obvious library. Content-driven web development is sometimes a little down and dirty.
Of course, I would mostly just love to use the same API as react-intl, so for now I decide to go it alone and implement a basic FormattedMessage component, which I have the custom of just calling M because it’s so frequently used. I add a src/messages folder with en.json & fr.json files inside it, each containing key-value pairs, and an index file that just exports the json from each. Then my M.astro file just becomes:
---
import msgs from "messages/index";

// Messages are keyed first by locale, then by message id.
const messages = msgs as Record<
  string,
  Record<string, string>
>;
---

{
  messages[Astro.currentLocale][
    Astro.props.id
  ]
}
If you’re astute, you may wonder how that solves the very first pluralisation problem, and indeed you’re right: it doesn’t! I’ll leave adding FormatJS message syntax support to our M component for later, but presumably it’s just a matter of passing values in to some formatter function.
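As a rough idea of what "passing values in to some formatter function" could look like, here is a minimal sketch. It only does a locale lookup with a fallback and simple {placeholder} substitution; it is not FormatJS message syntax and has no plural or select support. The formatMessage function and the message tables are hypothetical stand-ins for en.json and fr.json.

```typescript
type Messages = Record<string, Record<string, string>>;

// Hypothetical message tables standing in for en.json and fr.json.
const messages: Messages = {
  en: { greeting: "Hello, {name}!", "tag.log": "log" },
  fr: { greeting: "Bonjour, {name} !" },
};

function formatMessage(
  locale: string,
  id: string,
  values: Record<string, string | number> = {},
  fallbackLocale = "en",
): string {
  // Fall back to the default locale, then to the id itself, so a missing
  // translation shows up visibly instead of rendering as empty output.
  const template = messages[locale]?.[id] ?? messages[fallbackLocale]?.[id] ?? id;
  // Replace simple {placeholder} tokens; unknown placeholders are left untouched.
  return template.replace(/\{(\w+)\}/g, (token, key: string) =>
    key in values ? String(values[key]) : token,
  );
}

console.log(formatMessage("fr", "greeting", { name: "Ada" }));
console.log(formatMessage("fr", "tag.log"));
```

The fallback order is a design choice: showing the English string, or the raw id, makes a missing translation easy to spot during development, whereas the bare lookup in M.astro would render nothing.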
With that, we have translated tags and in theory translated content! Except that we haven’t actually translated the content yet. For that we will need an LLM translation pipeline which you can read about in the next post :).