How to create a sitemap with Next.js app directory

To get the most out of SEO, we should provide a sitemap for our sites. Search engines like Google use sitemaps to crawl our sites more efficiently.

Create a sitemap

We could simply create a sitemap like the following and put it in the public directory of our app:

public/sitemap.xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://sdorra.dev/</loc>
  </url>
  <url>
    <loc>https://sdorra.dev/posts/2022-11-15-sitemaps-with-appdir</loc>
  </url>
</urlset>

But this is very error-prone. Every time we create a new blog post or add another page to the site, we have to modify the sitemap manually.

I think we can do better.

Generate sitemap

We could generate our sitemap at build time by traversing the app directory and writing the result to the public directory afterwards:

scripts/sitemap.mjs
import { writeFile } from "fs/promises";
import { globby } from "globby";

const PAGE = "https://sdorra.dev";

const createPath = (p) => {
  const path = "/" + p.replace("page.tsx", "");
  if (path.endsWith("/") && path.length > 1) {
    return path.substring(0, path.length - 1);
  }
  return path;
};

const collectPaths = async () => {
  const paths = await globby("./**/page.tsx", {
    cwd: "app",
  });
  return paths.map(createPath);
};

const createSitemap = (routes) => {
  return `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${routes
  .map((route) => {
    return `  <url>
    <loc>${PAGE}${route}</loc>
  </url>`;
  })
  .join("\n")}
</urlset>
`;
};

(async () => {
  const paths = await collectPaths();
  const sitemap = createSitemap(paths);
  await writeFile("./public/sitemap.xml", sitemap, { encoding: "utf-8" });
})();

The snippet above uses globby to find every page.tsx in the app directory (collectPaths). After the file paths are collected, each path is transformed into its URL path (createPath). The last step is to create the sitemap (createSitemap) and write it to the public directory.
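To make the transformation concrete, here is a small sketch of what createPath produces for a few typical globby results (the input paths below are made-up examples, not taken from a real project):

```javascript
// Same helper as in the script: strips "page.tsx" and the trailing slash
const createPath = (p) => {
  const path = "/" + p.replace("page.tsx", "");
  if (path.endsWith("/") && path.length > 1) {
    return path.substring(0, path.length - 1);
  }
  return path;
};

console.log(createPath("page.tsx")); // "/"
console.log(createPath("posts/page.tsx")); // "/posts"
console.log(createPath("posts/archive/page.tsx")); // "/posts/archive"
```

Only the root route keeps its trailing slash; every nested route has it removed so the generated URLs stay canonical.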

This works well until we introduce dynamic routes to our app.

Dynamic routes

A dynamic route can be identified by the square brackets around a variable, e.g. app/posts/[slug]/page.tsx. Handling those routes in the sitemap generator can become tricky.

I will use a simple example to show how it could be implemented if the site uses Contentlayer:

scripts/sitemap.mjs
import { writeFile } from "fs/promises";
import { globby } from "globby";
import { allPosts } from "../.contentlayer/generated/index.mjs";

const PAGE = "https://sdorra.dev";

const expandPath = (p) => {
  if (p === "/posts/[slug]") {
    return allPosts.map((post) => `/posts/${post._raw.flattenedPath}`);
  }
  return [p];
};

const createPath = (p) => {
  const path = "/" + p.replace("page.tsx", "");
  if (path.endsWith("/") && path.length > 1) {
    return path.substring(0, path.length - 1);
  }
  return path;
};

const collectPaths = async () => {
  const paths = await globby("./**/page.tsx", {
    cwd: "app",
  });
  return paths.map(createPath).flatMap(expandPath);
};

// rest of scripts/sitemap.mjs

The first interesting line in this snippet is the import of the posts generated by Contentlayer (allPosts). I had to specify the whole path including the extension, otherwise TypeScript threw errors at me. The expandPath function checks each path and, if it is our dynamic route, replaces it with the slugs of all posts. This function is finally used by collectPaths to create our flat array of paths.
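As a quick sanity check, here is a sketch of how expandPath fans out the dynamic route. The allPosts array and its _raw.flattenedPath fields below are hand-crafted stand-ins for the Contentlayer import, used only for illustration:

```javascript
// Stand-in for "../.contentlayer/generated/index.mjs" — made up for this example
const allPosts = [
  { _raw: { flattenedPath: "2022-11-15-sitemaps-with-appdir" } },
  { _raw: { flattenedPath: "2022-11-20-another-post" } },
];

const expandPath = (p) => {
  if (p === "/posts/[slug]") {
    return allPosts.map((post) => `/posts/${post._raw.flattenedPath}`);
  }
  return [p];
};

// Static routes pass through unchanged; the dynamic route becomes one entry per post
console.log(["/", "/posts/[slug]"].flatMap(expandPath));
// [ "/", "/posts/2022-11-15-sitemaps-with-appdir", "/posts/2022-11-20-another-post" ]
```

Because expandPath always returns an array, flatMap merges the expanded dynamic routes and the untouched static routes into a single flat list.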

warning

This method covers only the basics. If you use something like route groups, or the old pages directory is used in addition to the app directory, you will have to go a few extra miles.

Build

Now it is time to add our generator to the build:

package.json
{
  "scripts": {
    "build": "next build && node scripts/sitemap.mjs"
  }
}

Whenever our page is built with the build script, our sitemap is generated afterwards. The order is important, because Contentlayer has to run first in order to generate our content types.

robots.txt

We can specify the URL of our sitemap in the robots.txt, which makes it easier for search engines to find it. If we use the default path /sitemap.xml, most search engines should find it even without the entry in the robots.txt. However, the entry could look like the following:

public/robots.txt
User-Agent: *
Allow: /
Sitemap: https://sdorra.dev/sitemap.xml

Git

Since our sitemap is generated, we should not include it in our repository, so we ignore it:

.gitignore
/node_modules
/.next/
.contentlayer
public/sitemap.xml

If we did not, we would see changes to sitemap.xml every time we changed the structure of the page.

Posted in: nextjs, sitemap, contentlayer