Markdown functions

In this section, we provide an example of a package that facilitates the creation of HTML files from Markdown documents. Markdown is a popular format for technical documentation due to its focus on clean, readable text, easy integration of code examples, and support for images and hyperlinks. We have integrated Markdown conversion functions into the ACF language to simplify the process of generating HTML documents.

The result is fast from end to end: the generated HTML files load quickly and look good, and the conversion functions themselves are fast. We processed the 50 documents in this manual in less than 0.5 seconds in total. With the CSS files, you can style the documents in any way you want.

Background

For our documentation project, we've chosen to create HTML documents from Markdown source files. To achieve this, we use a FileMaker application that includes:

A preferences table where path configurations and other fixed settings are stored.
A document table to manage manual documents.
A category table to organize document categories.
A template for new documents to streamline the creation process.

Workflow

The Markdown to HTML conversion process involves three phases:

Pre-processing: This step involves copying images from Markdown documents to the HTML/images folder and updating image URLs in the Markdown documents to point to their new locations.
Markdown to HTML Conversion: The Markdown documents are converted to HTML using ACF functions.
Post-processing: After conversion, we add a header and a left column navigation bar to the HTML documents.

Pre-processing

In the pre-processing phase, we perform the following tasks:

Extract image tags from Markdown documents.
Rewrite image URLs to point to their new locations.
Copy image files to the desired HTML/images folder.

The Markdown image tag has this format:

![image 1](../../../Desktop/regex-explanation.png)

We use a regular expression to extract image tags from the Markdown document. The pattern !\[[^\]]*\]\(((.+?\/)?([^\/)]+))\) matches each image tag and captures three groups: the full path (group 1), the directory path (group 2), and the file name (group 3).

The pattern breaks down as follows (screenshot from regex101.com):
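To make the capture groups concrete, here is a minimal sketch in Python (not ACF, purely for illustration) that applies the same pattern to the example tag above. The helper name `extract_image_tags` is hypothetical, and ACF's `regex_extract` may return its results in a different shape.

```python
import re

# Same pattern as in the text; forward slashes need no escaping in Python.
IMAGE_TAG = re.compile(r"!\[[^\]]*\]\(((.+?/)?([^/)]+))\)")

def extract_image_tags(markdown):
    """Return (full_tag, full_path, directory, file_name) for each image tag."""
    return [
        (m.group(0), m.group(1), m.group(2) or "", m.group(3))
        for m in IMAGE_TAG.finditer(markdown)
    ]

tags = extract_image_tags("See ![image 1](../../../Desktop/regex-explanation.png) here.")
print(tags[0][1])  # ../../../Desktop/regex-explanation.png
print(tags[0][2])  # ../../../Desktop/
print(tags[0][3])  # regex-explanation.png
```

The directory group is optional, so a tag that references a bare file name still matches and simply yields an empty directory.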

Here are the different paths we need to take care of:

Original Image path when dragged into the Markdown document from the Desktop:  
../../../Desktop/imageab.png

Relative Image path for its new location as seen from the Markdown document: 
 ../html_root_folder/images/imageab.png

Resulting image path in the HTML document: 
images/imageab.png

The absolute path of the document folder is: 
/Users/ole/Projects/ACFmanual/mdDocs/

The absolute path of the HTML root folder is: 
/Users/ole/Projects/ACFmanual/html_root_folder/

Now we have some filesystem work to do: picking these paths apart and copying the files. This is done in the MoveImages function seen below.
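As an illustration of how the paths listed above relate to each other, the following Python sketch (not ACF) derives them from one image reference. The function name `plan_image_paths`, the dictionary keys, and the document name `example.md` are hypothetical. The sketch only computes paths and does not touch the filesystem.

```python
import posixpath

def plan_image_paths(source_file, html_root, relpath, image_ref):
    """Work out where one image comes from and where it should go.

    source_file: absolute path of the Markdown document
    html_root:   absolute path of the HTML root folder
    relpath:     prefix from the Markdown folder to the HTML root
    image_ref:   the path found inside the Markdown image tag
    """
    doc_folder = posixpath.dirname(source_file)
    file_name = posixpath.basename(image_ref)
    # A relative reference is resolved against the Markdown document's folder.
    if posixpath.isabs(image_ref):
        absolute_source = image_ref
    else:
        absolute_source = posixpath.normpath(posixpath.join(doc_folder, image_ref))
    return {
        "copy_from": absolute_source,
        "copy_to": posixpath.join(html_root, "images", file_name),
        "markdown_ref": relpath + "images/" + file_name,
        "html_ref": "images/" + file_name,
    }

plan = plan_image_paths(
    "/Users/ole/Projects/ACFmanual/mdDocs/example.md",  # hypothetical document name
    "/Users/ole/Projects/ACFmanual/html_root_folder/",
    "../html_root_folder/",
    "../../../Desktop/imageab.png",
)
for key, value in plan.items():
    print(key, "=", value)
```

With these inputs, `copy_from` resolves to `/Users/ole/Desktop/imageab.png`, and `markdown_ref` and `html_ref` match the relative paths shown above.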

Markdown to HTML Conversion

After the pre-processing stage, all referenced images are available in the HTML/images folder. We use the built-in ACF function markdown2html to perform the Markdown to HTML conversion. It takes the source Markdown file, the style and code-style theme names as one comma-separated string, and the destination HTML file to be created.

Once this function is executed, we obtain a bare HTML file that contains only the text and images from the Markdown document, without any navigation, headers, or additional elements. To transform this into a comprehensive manual, we require additional elements, which are handled by the post-processing function.

Post-processing

The post-processing function's role is to integrate the generated HTML into a complete page that includes a left sidebar, top bar, navigation, and bottom section.

The left navigation menu needs to be generated. We retrieve all the documents from a table named ACF_Documents, extracted from the database using an SQL query. The left navigation menu is then constructed in a loop, organizing documents into category sections and using HTML ul/li tags. CSS styling is applied to enhance its appearance.
The original HTML file is read and divided into the HEAD and BODY sections.
The content for the top bar is sourced from user preferences.
All these components, including DIV tags to structure the complete document, are placed into an array.
We utilize the implode function to merge all the parts within this array to create the final document.
A simple substitution removes the image path prefix (the removeText parameter) so that the image URLs become relative to the HTML root folder.
We iterate through all the documents again, replacing titles within the document with links leading to their respective pages.

Finally, the fully assembled HTML file is written back to disk, completing the process.
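The navigation and assembly steps above can be sketched as follows. This is an illustrative Python version, not the ACF code. The function names and the sample rows are hypothetical, and the rows are assumed to arrive already sorted by category, as the SQL query would deliver them.

```python
def build_left_nav(rows):
    """rows: (title, slug_filename, category) tuples, sorted by category."""
    parts = []
    current = None
    for title, slug, category in rows:
        if category != current:
            if current is not None:
                parts.append("</ul>")  # close the previous category list
            parts.append("<h4>" + category + "</h4>")
            parts.append('<ul class="linklist">')
            current = category
        # Plain titles here; they are turned into links in a later step.
        parts.append("<li>" + title + "</li>")
    if current is not None:
        parts.append("</ul>")
    return "\n".join(parts)

def assemble_page(head_part, top_bar, left_nav, body_part):
    """Join the page parts, in order, into one HTML document."""
    parts = [
        "<!DOCTYPE html>\n<html>\n<head>", head_part,
        '<link rel="stylesheet" type="text/css" href="include/css/style.css">',
        "</head>\n<body>",
        '<div class="hd">', top_bar,
        '</div><div class="content_band"><div class="left">', left_nav,
        '</div><div class="content">', body_part,
        "</div></div>", "</body>\n</html>\n",
    ]
    return "\n".join(parts)

rows = [  # hypothetical sample data
    ("Introduction", "introduction.html", "Basics"),
    ("Syntax", "syntax.html", "Basics"),
    ("Markdown functions", "markdown_functions.html", "Examples"),
]
nav = build_left_nav(rows)
page = assemble_page("<title>Demo</title>", "<h1>ACF manual</h1>", nav, "<p>Hello</p>")
print(page)
```

Collecting the fragments in a list and joining them once keeps the page layout readable in one place, which is the same idea as the array and implode approach described above.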

Summary of the ACF functions

Here's a breakdown of the functions involved in Markdown conversion:

The Package Header:

package MarkDownFunctions "Markdown functions for the documentation project...";

The Function for Preprocessing the Markdown Document and Copying Image Files:

This function extracts image tags from the Markdown document, copies image files to the HTML/images folder, and updates image URLs.

```
function MoveImages(string SourceFile, string html_root, string relpath)
```

The Post-processing Function to Add Header and Left Column to the Document:

This function adds a header and a left column navigation bar to the generated HTML document.
```
function post_processing(string htmlfile, string removeText)
```

The Markdown Document Conversion Function:

This function orchestrates the entire Markdown to HTML conversion process.

function ConvertMarkdown(string sourcefile, string htmlfile, string html_root, string style, string codestyle, string removeText)

A Utility Function to Create HTML Filenames:

This function generates HTML filenames based on the Markdown file's name.
```
function createHTML_Filename(string file, string html_path)
```

The full listing:

The Package Header:

package MarkDownFunctions "Markdown functions for the documentation project...";

Here is the MoveImages function:

function MoveImages ( string SourceFile, string html_root, string relpath ) 

    print "MoveImages enters...."; 
    int x = open ( SourceFile, "r"  ); 
    string md = read (x ) ; 
    close ( x ) ; 
    print "Pre-processing"; 
    string regex = "!\[[^\]]*\]\(((.+?\/)?([^\/)]+))\)"; // used to extract image tags...
    // gr 0 = full match, gr 1 = full path, gr 2 = directory path, gr 3 = file name.
    // getValue below counts from 1, so value 1 is the full match and value 4 is the file name.
    array string tags = regex_extract ( regex, md ) ; 
    int no = sizeof ( tags ) ; 
    if ( no == 0 ) then
        print "Pre-processing - no tags"; 
        return "";      
    end if

    // Getting the Markdown document folder where relative images start from.
    string docFolder = regex_replace ( "^(.+\/)[^\/]+$", SourceFile, "\1" ) ; 

    int i, cnt = 0; 
    string fulltag, fullpath, fileFullPath, dir, file, newPath, newRelPath, newtag, res; 

    // Loop all tags, and process each one of them.
    for ( i= 1, no)
        print "\n" + tags[i]; 
        fulltag = getValue ( tags[i], 1);
        fullpath = getValue ( tags[i], 2);
        fileFullPath = fullpath; 
        dir = getValue ( tags[i], 3);
        file = getValue ( tags[i], 4);
        if ( file_exists ( fullpath ) == 0 ) then
             // probably a relative path. Prefix with doc folder to get absolute path. 
            fullpath = docFolder + fullpath; 
        end if
        if ( file_exists ( fullpath ) ) then
            newRelPath = relpath + "images/" + file; 
            newPath = html_root + "/images/" + file; 
            if ( file_exists ( newPath ) == 0) then
                res = copy_file ( fullpath, newPath ) ; 
                if ( res == "OK") then
                    newtag = substitute ( fulltag, fileFullPath, newRelPath ) ; 
                    md = substitute ( md, fulltag, newtag ) ; 
                    cnt++; 
                end if
            end if
        end if
    end for
    if ( cnt > 0 ) then
        // save the modified Markdown document. Take a backup first, in case we did something nasty...
        res = copy_file ( SourceFile, SourceFile+".bak" ) ; 
        // Write back to file. 
        x = open ( SourceFile, "w"  ); 
        write ( x, md ) ; 
        close ( x ) ;
    end if
    return "OK"; 
end
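The tag-rewriting part of MoveImages can be illustrated in isolation. The Python sketch below (not ACF) leaves out the file copying and only rewrites tags for files that were reported as copied, mirroring the `res == "OK"` check in the listing. The function name `rewrite_image_tags` and the `copied` parameter are hypothetical.

```python
import re

IMAGE_TAG = re.compile(r"!\[[^\]]*\]\(((.+?/)?([^/)]+))\)")

def rewrite_image_tags(markdown, relpath, copied):
    """Point each image tag whose file name is in `copied` at relpath + "images/"."""
    def replace(match):
        full_tag = match.group(0)
        full_path = match.group(1)
        file_name = match.group(3)
        if file_name not in copied:
            return full_tag  # leave the tag alone when the copy did not happen
        return full_tag.replace(full_path, relpath + "images/" + file_name)
    return IMAGE_TAG.sub(replace, markdown)

before = "Intro\n![a](../../../Desktop/one.png)\n![b](two.png)\n"
after = rewrite_image_tags(before, "../html_root_folder/", {"one.png"})
print(after)
```

Only the first tag changes here, because `two.png` is not in the set of copied files.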

The Post-processing function to add header and left column to the document:

function post_processing (string htmlfile, string removeText)

/* Called from ConvertMarkdown 
    Open the generated HTML document, that is a plain page of the markdown doc, 
    and put on a header, and a left navigation bar.
   */

    print "post processing"; 
    int x = open ( htmlfile, "r"  ); 
    string html = read (x ) ; 
    close ( x ) ; 
    string body_part = between ( html, "<body>", "</body>" ) ;
    string head_part = between ( html, "<head>", "</head>" ) ;
    string stylesheet = '<link rel="stylesheet" type="text/css" href="include/css/style.css">';
    string top_bar = Preferences::topp_nav;

    // Build the left nav
    string left_nav = Preferences::Left_nav;

    // Fetch all documents with their category: "||" separates fields, "|*|" separates rows.
    string doclist = @ExecuteSQL ( "SELECT Title, slug_filename, Category FROM ACF_Documents LEFT OUTER JOIN ACF_Categories as c ON c.PrimaryKey = Category_UUID ORDER BY Sorting, Category" ; "||" ; "|*|")@;

    print doclist;  // Debug print to console.

    array string docs = explode ("|*|", doclist ), flt ;
    int noDocs = sizeof ( docs );
    int i;
    string curCat = "noone";
    for (i = 1, noDocs)
        flt = explode ( "||", docs[i]);
        // flt[1] = title, flt[2] = HTML file name, flt[3] = category
        if (flt [3] != curCat ) then
            if ( curCat != "noone") then
                left_nav += "</ul>\n";
            end if
            left_nav += "<h4>" + flt[3] + "</h4>\n";
            left_nav += '<ul class="linklist">\n';
            curCat = flt[3];
        end if
        left_nav += "<li>" + flt[1] + "</li>\n";
    end for
    left_nav += "</ul>\n";

    // Merge the parts
    array string fileparts = {"<!DOCTYPE html>\n<html>\n<head>", head_part, stylesheet, "</head>\n<body>",
        '<div class="hd">', top_bar,
        '</div><div class="content_band"><div class="left">', left_nav,
        '</div><div class="content">', body_part,
        "</div></div>", "</body>\n</html>\n" };

    string result = implode ( "\n", fileparts ) ;

    // Remove Image path prefixes
    result = substitute (result, removeText, "");

    // substitute all titles in the documents with a link to that page.
    for (i = 1, noDocs)
        flt = explode ( "||", docs[i]);
        // result = substitute (result, flt[1], '<a href="'+flt[2]+'">' + flt[1]+"</a>");
        // Improved: Use regex_replace instead to substitute titles matching whole words
        // and not parts of words, which looks a bit strange.
        result = regex_replace ( "\b"+flt[1]+"\b", result, '<a href="'+flt[2]+'">' + flt[1]+"</a>");
        
    end for
    
    // Write back to file. 
    x = open ( htmlfile, "w"  ); 
    write ( x, result ) ; 
    close ( x ) ;
    print "End-post-processing"; 
    return "OK"; 
end
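The whole-word title linking at the end of post_processing can be tried out on its own. The Python sketch below (not ACF) shows the effect of the `\b` word boundaries. It also escapes the title before using it as a pattern, which is an addition for this sketch and not something the ACF listing does. The function name `link_titles` and the sample data are hypothetical.

```python
import re

def link_titles(html, docs):
    """docs: (title, slug_filename) pairs. Wrap whole-word title matches in links."""
    for title, slug in docs:
        # re.escape makes characters such as "." or "(" in a title match literally.
        pattern = r"\b" + re.escape(title) + r"\b"
        link = '<a href="' + slug + '">' + title + "</a>"
        # A function replacement keeps backslashes in the title from being read as escapes.
        html = re.sub(pattern, lambda m, link=link: link, html)
    return html

text = "<p>JSON is supported. JSONPath is not linked.</p>"
linked = link_titles(text, [("JSON", "json.html")])
print(linked)
# <p><a href="json.html">JSON</a> is supported. JSONPath is not linked.</p>
```

One thing to watch for with this approach is that each pass runs over the output of the previous one, so a title that also appears inside an earlier title's link text would be linked again.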

The markdown document conversion function:

The function below uses the two functions above to complete the full conversion.

function ConvertMarkdown (string sourcefile, string htmlfile, string html_root, string style, string codestyle, string removeText)

    print "Convert markdown enters....\n"; 
    set_markdown_html_root ( html_root ) ; 
    string sf = sourcefile, df = htmlfile; 
    $$SourceFile = ""; 
    if (sf == "") then
        sf = select_file ("Select a Markdown file?"); 
        $$SourceFile = sf; 
    end if
    if (sf != "") then
        if (df == "") then
            df = save_file_dialogue ("Save file as?", "xx.html", html_root);
        end if
        if (df != "") then
            // Pre-processing - Handle images in document.
            // removeText doubles as the relative prefix written into the image tags;
            // post_processing strips it again from the final HTML.
            string res = MoveImages ( sf, html_root, removeText );
            // Main conversion
            res = markdown2html (sf, style+","+codestyle, df);
            // Post-processing...
            res = post_processing (df, removeText);
        end if
    end if
    return "OK";
end

A small utility function to produce HTML filenames:

function createHTML_Filename ( string file, string html_path ) 

    string newfile = substitute ( lower(file), " ", "_"); 
    newfile = substitute ( newfile, ":", "_"); 
    
    if (right(html_path, 1) != "/") then
        html_path += "/"; 
    end if
    return html_path + newfile + ".html";
end
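For readers who want to check what file names this produces, here is the same logic as a small Python sketch (not ACF). The function name `create_html_filename` and the sample title are hypothetical stand-ins.

```python
def create_html_filename(file, html_path):
    """Lower-case the name, replace spaces and colons, and append .html."""
    new_file = file.lower().replace(" ", "_").replace(":", "_")
    if not html_path.endswith("/"):
        html_path += "/"
    return html_path + new_file + ".html"

name = create_html_filename(
    "Markdown Functions",
    "/Users/ole/Projects/ACFmanual/html_root_folder",
)
print(name)
# /Users/ole/Projects/ACFmanual/html_root_folder/markdown_functions.html
```

The trailing slash check means the HTML path can be passed with or without a final "/".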