Numerals, Romanticised


You found a Bag of Holding (III)…

In case you aren’t familiar with Roman numerals and how they relate to Arabic numerals: Roman numerals are written as letters, each of which represents a numeric value, and adding the values in the sequence together gives you the result.

I = 1
V = 5
X = 10
L = 50
C = 100
D = 500
M = 1000

These seven are the only Roman numeral symbols. By combining them in a particular order, from greater to lesser values, you can represent any whole number from 1 to 3,999 in standard notation (there is no symbol for zero, and larger numbers need extensions). The readability of Roman numerals can suffer greatly, though, depending on the number you want to represent.

MDCCCLXXXVIII (1888) is rather unwieldy, but can still be parsed with a consistent set of steps and you’ll quickly arrive at the result.

There are also some exceptions that we need to be aware of, based on the left-to-right parsing we’ll be performing on the Roman numerals. Let’s focus for a moment on the first ten Roman numerals, written naively by addition alone, and highlight these exceptions:

I = 1
II = 2
III = 3
IIII = 4
V = 5
VI = 6
VII = 7
VIII = 8
VIIII = 9
X = 10

We can read Roman numerals such as VII as 5 + 1 + 1 = 7.

Similarly, VIII can be thought of as 5 + 1 + 1 + 1 = 8.

But following that line of thinking to write 4 as IIII (1 + 1 + 1 + 1) or 9 as VIIII (5 + 1 + 1 + 1 + 1) is not correct.

The rule here is that the same letter can appear no more than three times in a row. So, to represent 9 in Roman numerals, we instead look to the next higher value (V, X, L, etc.) and subtract from it. In this case, the next higher value from V is X, so we say 10 - 1, or one before ten, and 9 is written as IX.

An easy way to spot these exceptions is to look for letters that appear out of order. Roman numerals are written from left to right with higher-value letters before lower-value ones, so any letter that appears before a higher-value letter tells you that it falls under the “one before” exception.

IV is not 1 + 5, it is 1 before 5, or -1 + 5 = 4.

XL is not 10 + 50, it is 10 before 50, or -10 + 50 = 40.

Rewriting the first ten numbers to take into account this rule, we now have:

I = 1
II = 2
III = 3
IV = 4
V = 5
VI = 6
VII = 7
VIII = 8
IX = 9
X = 10

History Lesson Over, On To The Code

First we need to set up some data to be able to relate Arabic numerals to their Roman counterparts.

- arabic: 1000
  roman: M
- arabic: 900
  roman: CM
- arabic: 500
  roman: D
- arabic: 400
  roman: CD
- arabic: 100
  roman: C
- arabic: 90
  roman: XC
- arabic: 50
  roman: L
- arabic: 40
  roman: XL
- arabic: 10
  roman: X
- arabic: 9
  roman: IX
- arabic: 5
  roman: V
- arabic: 4
  roman: IV
- arabic: 1
  roman: I

You’ll notice that I have included the two-letter combinations that represent the “one before” exception described above. This is important for converting numerals accurately, as the value of the pair does not equal the sum of the two letters on their own (IX is 9, not 1 + 10 = 11).
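
To check that the data is wired up, the table can be printed from any page. This is a minimal sketch that assumes the list above is saved as _data/numerals.yml; the snippets further down read it as site.data.numerals, but the exact file name and extension are my assumption.

```liquid
{% comment %} Assumes the list above lives in _data/numerals.yml {% endcomment %}
{% for numeral in site.data.numerals %}
{{ numeral.roman }} = {{ numeral.arabic }}
{% endfor %}
```

The entries should come out in the order they appear in the file, highest value first, which is the order the conversion loops rely on.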

Let’s step through converting an Arabic numeral (e.g. 1569) to Roman numerals (MDLXIX). To do this we loop through our numeral conversion data from highest to lowest values. If our Arabic numeral is greater than or equal to the Arabic value in the data, we subtract that value from our Arabic numeral, append the Roman value to our output string (which starts out empty), and start looping through the data from the top again. If our Arabic numeral is less than the Arabic value in the data, we continue looping and comparing the data to our Arabic numeral. For 1569, that gives M (569 left), D (69 left), L (19 left), X (9 left), and finally IX (0 left).

{% assign input = include.value %}
{% assign output = '' %}
{% for c in (1..999) %}
    {% for numeral in site.data.numerals %}
        {% if input >= numeral.arabic %}
            {% assign input = input | minus: numeral.arabic %}
            {% assign output = output | append: numeral.roman %}
            {% break %}
        {% endif %}
    {% endfor %}
    {% if input == 0 %}
        {% break %}
    {% endif %}
{% endfor %}
{{ output }}

Because we’re subtracting values from our Arabic numeral as we loop and convert Arabic values to Roman numerals, we will know when we’re done because our Arabic numeral will equal 0.

The same can be done for going from Roman numerals to Arabic. To do so, we check each character in sequence and tally up the values to arrive at the Arabic value. Like before, we have to watch out for the “one before” exceptions, so instead of just checking each single character, we also check whether the next two characters in the Roman numeral match an exception, and if so, use that value instead. Because the data lists each two-letter combination before its leading letter, the pair is always matched first.

{% assign input = include.value %}
{% assign output = 0 %}
{% for c in (1..999) %}
    {% assign slice_double = input | slice: 0, 2 %}
    {% assign slice_single = input | slice: 0 %}
    {% for numeral in site.data.numerals %}
        {% if slice_double == numeral.roman or slice_single == numeral.roman %}
            {% assign input = input | replace_first: numeral.roman %}
            {% assign output = output | plus: numeral.arabic %}
            {% break %}
        {% endif %}
    {% endfor %}
    {% if input == '' %}
        {% break %}
    {% endif %}
{% endfor %}
{{ output }}

In this case we know we’re done converting when we’ve run out of Roman characters to parse.

You’ll also notice, in both cases, that there’s a loop that goes from 1 through 999. Liquid has no while loop, so this bounded loop stands in for one: each pass handles one entry from the data, and the break ends the loop as soon as the conversion is done. While a loop of this size would normally be a performance concern, the limit should never be reached unless the value being converted is extremely long, in which case Roman numerals would be poorly suited to represent it anyway.
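
To get a feel for how few passes are actually needed, here is a rough sketch that counts them for 3888, one of the longest numerals below 4,000. It reuses the Arabic-to-Roman loop from above; the passes variable and the hard-coded input are my additions for illustration, and it assumes the same site.data.numerals data.

```liquid
{% comment %} passes is a hypothetical counter added for illustration {% endcomment %}
{% assign input = 3888 %}
{% assign output = '' %}
{% assign passes = 0 %}
{% for c in (1..999) %}
    {% for numeral in site.data.numerals %}
        {% if input >= numeral.arabic %}
            {% assign input = input | minus: numeral.arabic %}
            {% assign output = output | append: numeral.roman %}
            {% break %}
        {% endif %}
    {% endfor %}
    {% comment %} Remember how far the outer loop got {% endcomment %}
    {% assign passes = c %}
    {% if input == 0 %}
        {% break %}
    {% endif %}
{% endfor %}
{{ output }} took {{ passes }} passes
```

This should render MMMDCCCLXXXVIII took 15 passes, one pass per letter, so the 999 ceiling is far out of reach for everyday values.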

Copy This Part

{%- assign input = include.value | times: 1 -%}
{%- if input != 0 -%}
    {%- assign output = '' -%}
    {%- for c in (1..999) -%}
        {%- for numeral in site.data.numerals -%}
            {%- if input >= numeral.arabic -%}
                {%- assign input = input | minus: numeral.arabic -%}
                {%- assign output = output | append: numeral.roman -%}
                {%- break -%}
            {%- endif -%}
        {%- endfor -%}
        {%- if input == 0 -%}
            {%- break -%}
        {%- endif -%}
    {%- endfor -%}
{%- else -%}
    {%- assign input = include.value -%}
    {%- assign output = 0 -%}
    {%- for c in (1..999) -%}
        {%- assign slice_double = input | slice: 0, 2 -%}
        {%- assign slice_single = input | slice: 0 -%}
        {%- for numeral in site.data.numerals -%}
            {%- if slice_double == numeral.roman or slice_single == numeral.roman -%}
                {%- assign input = input | replace_first: numeral.roman -%}
                {%- assign output = output | plus: numeral.arabic -%}
                {%- break -%}
            {%- endif -%}
        {%- endfor -%}
        {%- if input == '' -%}
            {%- break -%}
        {%- endif -%}
    {%- endfor -%}
{%- endif -%}
{{ output }}

And there you have it. Should you need a covers-all-bases solution in Liquid for converting to and from Roman numerals, this will do the trick.
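
For completeness, here is how the include might be called. The file name numerals.html is a placeholder of mine, since the snippet above is not given a name; save it under _includes/ with whatever name you like and adjust the calls to match.

```liquid
{% comment %} numerals.html is a hypothetical name for the include above {% endcomment %}
{% include numerals.html value=1569 %}
{% include numerals.html value="MDLXIX" %}
```

The first call should render MDLXIX and the second 1569. A string that isn’t a number becomes 0 after the times: 1 filter, which is what sends Roman input down the else branch.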

Note from Friday, 02 November 2018