# Canonicalizing a URL path using std::filesystem::canonical

More specifically, a file path from which we are trying to remove the .. and . components.

So, if you have “projects/vectorization/../gitstuff/./../../something/index.html”, canonicalizing the string would reduce it to “something/index.html”.

We have a couple of options to do this.

## Option 1 – use std::filesystem
Since C++17, you can take a path and canonicalize it using std::filesystem::canonical. The path has to exist, though, otherwise canonical reports an error.

For paths that do not exist, or if you just want to mess around, you can use std::filesystem::weakly_canonical to remove .. and . instead.

It’s really simple to work with, and here’s an example of how to use it:

#include <filesystem>
#include <iostream>
#include <string>

namespace fs = std::filesystem;

void option_one(const std::string& path)
{
    auto fs_path   = fs::path(path);
    auto canonical = fs::weakly_canonical(fs_path);
    std::cout << canonical << '\n';
}
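
One thing worth knowing: weakly_canonical looks at the real file system for the leading part of the path, so if a directory such as projects happens to exist, the result may come back as an absolute path with symlinks resolved. If all you want is the string-level clean-up, a minimal sketch along the following lines could work. The helper names normalize_lexically, try_canonical and check_lexical_example are made up for illustration. The first relies on std::filesystem::path::lexically_normal, which never touches the disk (and therefore does not resolve symlinks); the second uses the std::error_code overload of std::filesystem::canonical so that a missing path does not throw.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: purely lexical clean-up, never touches the disk.
std::string normalize_lexically(const std::string& path)
{
    // generic_string() keeps forward slashes on every platform
    return fs::path(path).lexically_normal().generic_string();
}

// Hypothetical helper: strict canonicalization that returns an empty
// string instead of throwing when the path cannot be resolved.
std::string try_canonical(const std::string& path)
{
    std::error_code ec;
    const fs::path resolved = fs::canonical(fs::path(path), ec);
    return ec ? std::string() : resolved.generic_string();
}

// Usage sketch: the example path from the top of the post.
void check_lexical_example()
{
    assert(normalize_lexically(
               "projects/vectorization/../gitstuff/./../../something/index.html")
           == "something/index.html");
}
```

Which one to pick depends on whether the path is supposed to exist: try_canonical is for real files, normalize_lexically is for strings that only look like paths.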

## Option 2 – use a stack
We can use a stack to hold the path components that we find while walking through the string from left to right.

This is a lot more involved than Option 1, but if C++17 is not available in your project, then:

#include <iostream>
#include <stack>
#include <string>

/**********************************************************/
std::string option_two(const std::string& path)
{
    // Don't work on an empty path
    if (path.empty())
        return "";

    // Use a stack to hold each path component
    auto path_components = std::stack<std::string>();

    // Use 2 variables to hold the beginning and the end of a path component string,
    // initialized to the beginning of the string, and the first slash found
    std::size_t beginning = 0;
    auto end = path.find('/', beginning);

    // Now, walk through the string, gathering each path component
    while (end != std::string::npos)
    {
        // Check if the path component is a .. or .
        const auto item = path.substr(beginning, end - beginning);

        // If it's a .., pop the stack (a .. with nothing left to pop is dropped);
        // otherwise ignore . and only add a real path component
        if (item == "..")
        {
            if (!path_components.empty())
                path_components.pop();
        }
        else if (item != ".")
            path_components.push(item);

        // Set our variables to just past the current slash, and the next found slash
        beginning = end + 1;
        end = path.find('/', beginning);
    }

    // Handle the last path component, if we have a trailing one
    if (beginning < path.length())
    {
        const auto last = path.substr(beginning);
        if (last == "..")
        {
            if (!path_components.empty())
                path_components.pop();
        }
        else if (last != ".")
            path_components.push(last);
    }

    // Reverse the stack so the components come out in their original order
    std::stack<std::string> rpath_components;
    while (!path_components.empty())
    {
        rpath_components.push(path_components.top());
        path_components.pop();
    }

    // Append the path components to our string, delimited with /
    std::string canonical;
    while (!rpath_components.empty())
    {
        canonical += rpath_components.top();
        rpath_components.pop();
        canonical += "/";
    }

    // Remove the last trailing forward slash
    if (!canonical.empty())
        canonical.pop_back();

    std::cout << canonical << "\n";
    return canonical;
}
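
If the reversal step feels clunky, the same idea can be sketched with a std::vector standing in for the stack, since push_back and pop_back give you stack behaviour while still letting you read the components front to back. This is only a sketch under a few assumptions of my own: the names normalize_with_vector and check_vector_example are made up, empty components (from // or a trailing /) are skipped, a leading / is kept, and a .. with nothing left to pop is simply dropped.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of option_two: a vector doubles as the stack,
// so no second stack is needed to restore the original order.
std::string normalize_with_vector(const std::string& path)
{
    const bool absolute = !path.empty() && path[0] == '/';
    std::vector<std::string> components;

    // Split on '/' and process each component as it is read
    std::istringstream stream(path);
    std::string item;
    while (std::getline(stream, item, '/'))
    {
        if (item.empty() || item == ".")
            continue;                    // skip "//", a trailing "/", and "."
        if (item == "..")
        {
            if (!components.empty())
                components.pop_back();   // step up one level
            continue;                    // assumption: ".." above the top is dropped
        }
        components.push_back(item);
    }

    // Join the surviving components with '/', keeping a leading slash
    std::string result = absolute ? "/" : "";
    for (std::size_t i = 0; i < components.size(); ++i)
    {
        if (i != 0)
            result += '/';
        result += components[i];
    }
    return result;
}

// Usage sketch: the example path from the top of the post.
void check_vector_example()
{
    assert(normalize_with_vector(
               "projects/vectorization/../gitstuff/./../../something/index.html")
           == "something/index.html");
}
```

Nothing here needs C++17, so it should fit the same projects that Option 2 is aimed at.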

So, you can see there are at least 2 ways of canonicalizing a path, and lots more if you put your mind to it.

Happy coding!
