On November 22nd, GitLab announced the release of 14.5, in which you can personalise your profile with a README. This README gives you scope to spice up your GitLab profile page quite a bit. For instance, you could:

Add more details about yourself, beyond what you can fit into GitLab's small biography field
Link to other pages, or to projects you would like to feature, turning your profile into a portfolio
Include formatting with any of the markup formats which GitLab can understand

This weekend, I decided to turn my personal GitLab profile into something a bit more personalised, and include an automatically updated list of blog posts from this blog's RSS feed.

This new GitLab feature was proposed in July 2020, and now that the feature is live, the proponent's own profile is looking pretty smart! It looks like he ported it directly from his GitHub profile. I like the username by the way: tonka3000 🚜.

To begin with, I wanted my own profile to be something that I would actually enjoy sharing with people. The idea is that it should showcase my technical skills through my public projects on GitLab, and that it should make use of GitLab's features to do it.

Making a README project

To add a profile README you need a public project, with the same name as your username.

Mine is https://gitlab.com/milohax/milohax/.

The README.* file in this project will be shown on my public GitLab profile, between the contribution graph, and the activity history and projects lists.

Put something in the README

The GitLab documentation (currently) says that you should populate the README with Markdown, but any markup which GitLab can understand should work, and I've tested Org-mode on my work profile. Just be sure to name the file README with an extension matching the markup format you choose to use: README.md for Markdown, or README.asc for Asciidoc, and so on.

I stuck with Markdown for this project, and added something fairly simple for a first draft. After a bit of playing around, by the end of 7 December I had a profile which looked like this.

It's a pretty good start, but I can do better.

Make a plan

I thought it would be pretty cool to include my latest blog posts into the profile too.

I've seen someone do this on GitHub. By looking at this example of how it is done, we can see that there's a community GitHub Action called gautamkrishnar/blog-post-workflow, which reads the specified RSS feed, and inserts links to the articles into the README. This Action is quite sophisticated, and porting it to GitLab would be a lot of work for me. The good news is that it is also much more flexible and capable than I need.

So I made a plan. I would figure out how to read my web site's RSS feed and then put this into the README at the right place.

For a bonus, I'd learn some Ruby doing it.

After that, I'd work out scheduling the script to run in GitLab CI, similar to the example GitHub Action.

Learning how to read RSS with Ruby

It's about time that I learnt to write some Ruby, especially since it is the main programming language we use at work! Ruby is a pretty, friendly, handy, and expressive language, with features a lot like Python and Scheme, but with a more Pascal-style syntax rather than LISP or the C-style you see in Java/JavaScript/Go/Rust. It's also like Perl except that you can read it after you write it. I've dabbled in Ruby through the musical program Sonic Pi: it has some nice syntax elements, and it also shows a good mixture of functional and object-oriented programming.

Ruby has been around since 1995, making it as old as Java and nearly as old as Python. I am a bit embarrassed to admit that I've never actually given it much attention before, especially as I learn more about the Rails framework at work (itself dating from 2004, a year before I began this blog). Ruby has a lot of cool libraries, especially for coding on the web, and it feels like an old friend when I write in it.

It turns out that doing RSS in Ruby is a piece of cake! There is a standard RSS module, and URI.open (from the open-uri library) can handle web requests, much like Python's Requests library.

Here's a short interactive session, which I captured in my notes:

❯ pry
[1] pry(main)> require 'rss'
=> true
[2] pry(main)> url = 'https://milosophical.me/authors/michael-lockhart.xml'
=> "https://milosophical.me/authors/michael-lockhart.xml"
[3] pry(main)> URI.open(url) do |rss|
[3] pry(main)*   feed = RSS::Parser.parse(rss)
[3] pry(main)*   puts "|Blog post | Published|"
[3] pry(main)*   puts "|--|--|" 
[3] pry(main)*   feed.items.each do |item|
[3] pry(main)*     puts "|[#{item.title}](#{item.link}) | #{item.pubDate}|" 
[3] pry(main)*   end  
[3] pry(main)* end;nil

This prints the newest 10 posts from my blog into a nice table. It stops after the 10th because that is how Nikola (my blog engine) made the RSS XML. This is fine: I don't want more than 10 posts anyway. I did need to make a few adjustments:

Put it all in a string, to be inserted later: just use string concatenation (+=) instead of puts
Use the Time class's .iso8601 method to format the publish date as ISO-8601 (the only sensible date format)
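
As a small offline illustration of those two adjustments, here is a minimal sketch which builds the table as a string and formats the date with .iso8601. The Post struct, the markdown_table method, and the example.com link are made up for the illustration; they stand in for the items which RSS::Parser returns, so that nothing needs to be fetched.

```ruby
require 'time' # adds Time#iso8601

# Hypothetical stand-in for an RSS item: only the three fields the table needs
Post = Struct.new(:title, :link, :pubDate)

# Build the Markdown table as one string instead of printing it line by line
def markdown_table(items)
  table = "|Post | Published|\n"
  table += "|--|--|\n"
  items.each do |item|
    table += "|[#{item.title}](#{item.link}) | #{item.pubDate.iso8601}|\n"
  end
  table
end

# Made-up post, just to show the output format
posts = [Post.new('Hello', 'https://example.com/hello', Time.utc(2021, 12, 11, 0, 42, 0))]
puts markdown_table(posts)
# |Post | Published|
# |--|--|
# |[Hello](https://example.com/hello) | 2021-12-11T00:42:00Z|
```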

Then all I needed was to learn some Ruby file handling and string substitution, and I was set:

require 'rss'
require 'open-uri' # provides URI.open, used below to fetch the feed
require 'time'

url = 'https://milosophical.me/authors/michael-lockhart.xml'
begin_blogs = "<!---BEGIN-BLOG--->\n"
end_blogs = "<!---END-BLOG--->"
README = "README.md"

# Get the blogs

feed_string = begin_blogs
feed_string += "|Post | Published|\n"
feed_string += "|--|--|\n"
URI.open(url) do |rss|
  feed = RSS::Parser.parse(rss)
  feed.items.each do |item|
    feed_string += "|[#{item.title}](#{item.link}) | #{item.pubDate.iso8601}|\n" 
  end
end
feed_string += end_blogs

# Load the README
readme_buffer = File.read(README)

# Insert the blogs
File.write(README, 
           readme_buffer.gsub(/<!---BEGIN-BLOG--->(.*)<!---END-BLOG--->/im,
                              feed_string))

(blog-read.rb)

So now I have a script which can read my list of recent blog posts and insert it into my README between the markers I have placed for it.
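
The marker substitution can be tried on its own, without touching any files. The following is a sketch under a few assumptions of mine rather than a copy of blog-read.rb: the names BEGIN_MARK, END_MARK, MARKED_BLOCK and replace_block are hypothetical, the match is non-greedy, and the replacement is given as a block so that any backslashes in the new text are inserted literally.

```ruby
BEGIN_MARK = '<!---BEGIN-BLOG--->'
END_MARK = '<!---END-BLOG--->'

# /m lets . match newlines, so the pattern can span the whole marked block
MARKED_BLOCK = /#{Regexp.escape(BEGIN_MARK)}.*?#{Regexp.escape(END_MARK)}/m

# Replace whatever sits between the markers with body, keeping the markers
def replace_block(readme, body)
  readme.sub(MARKED_BLOCK) { "#{BEGIN_MARK}\n#{body}#{END_MARK}" }
end

readme = "# Hello\n\n<!---BEGIN-BLOG--->\nold list\n<!---END-BLOG--->\n\nFooter\n"
puts replace_block(readme, "|Post | Published|\n")
```

Because the markers survive each substitution, running it again with the same body gives the same result, which is what lets a scheduled job re-run safely.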

Of course, I didn't just arrive at this code straight away. You can read all my explorations, false steps, and discoveries in the commit history and in the issue. I really like to use GitLab in this way: it facilitates Rule 4 perfectly. Rather than me writing more about this project in my blog, I encourage you to read through my notes, and the project's commit log, directly. For this project, the journey is as important as the blog at the end, and probably more so.

Running in a GitLab pipeline

Next I wanted this to update automatically (I don't want to manually go edit this file after every blog post I write).

The easiest way to do that is to schedule a job to run every day and pull down the list. Since I don't blog very frequently, this should be a good balance between timeliness of updates and working too often for no purpose.

GitLab can schedule pipelines very easily. The main work in this step is to write the actual pipeline. It has to do three things:

Run the Ruby script to fetch and update the RSS into README.md
Add the changed README back into the project repository
Not run again on the git push, else it would cause an endless pipeline loop

The first step when writing a pipeline is to work out what the operating environment is going to be. GitLab can use Docker containers, so I thought the standard Ruby container from Docker Hub would be a good place to start. Since my own computer where I wrote the script uses Ruby 2.7 still, I specified this:

image: ruby:2.7

But I knew that I would need the git command as part of the image, too. After some more experimentation, and reading the container's documentation, I discovered there is a ruby-alpine variant:

image: ruby:2.7-alpine

Using this image, I can apk add git in the job's script, so that the git command will be available to push changes to the repository, as well as ruby and all the standard libraries I need to run the script.

It took me a few runs over December 11th to work out exactly how to push back to the repository (final result is below). While doing this on GitLab through trial-and-error, it occurred to me that debugging CI pipelines in YAML, by making a change and then viewing the job logs, feels a lot like programming a mainframe job in JCL (though admittedly the syntax is less confusing):

One makes a code change, submits the job, and waits for it to run
Then it fails, so inspect the entrails (the CI Job Log in GitLab, or the ABEND log on a mainframe) for clues about what went wrong
Then one tries again, and waits to see the next outcome in the logs

Probably the biggest factor is that the feedback is not immediate, unlike when hacking away on a local computer.

Anyway, here is how to add changes back into the repository, from a CI job:

      git clone https://oauth2:${SCHEDULE_TOKEN}@gitlab.com/${CI_PROJECT_NAMESPACE}/${CI_PROJECT_NAME}
      cp README.md ${CI_PROJECT_NAME}
      cd ${CI_PROJECT_NAME}
      git config user.email "${GITLAB_USER_EMAIL}"
      git config user.name "${GITLAB_USER_NAME}"
      git add README.md
      git commit -m "Update blogs list"
      git push

This deserves some explanation:

First, even though the Runner clones the repository automatically, it does so with read-only access. So I created a special Personal Access Token (PAT) with the read_repository and write_repository scopes, and saved it in a private CI/CD pipeline variable called SCHEDULE_TOKEN
Clone the repository using the PAT, into a subdirectory on the GitLab Runner (the machine running the pipeline)
Copy in the changes to the working copy of the clone
I had to tell git who the author is by configuring the user email and name
Add the changes, commit with a message
Push it (push it good…)
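
One thing to watch with the commit step: git commit exits with a non-zero status when there is nothing to commit, which can fail the job on days when the feed has not changed. A possible refinement, which is an assumption and not something taken from the project's pipeline, is to have the Ruby side report whether the README really changed, so that the job can skip the commit and push. The write_if_changed helper below is hypothetical.

```ruby
require 'tmpdir'

# Hypothetical helper: write content only if it differs from what is on disk.
# Returns true when the file was written, false when it was left alone.
def write_if_changed(path, content)
  return false if File.exist?(path) && File.read(path) == content

  File.write(path, content)
  true
end

# Try it in a throwaway directory
Dir.mktmpdir do |dir|
  readme = File.join(dir, 'README.md')
  puts write_if_changed(readme, "first\n")  # true: the file is new
  puts write_if_changed(readme, "first\n")  # false: same content
  puts write_if_changed(readme, "second\n") # true: content differs
end
```

How the job then acts on that result (an exit status, or a flag file, say) is left open here.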

There can be only one!

The penultimate part of the puzzle was to make sure that the job only runs for scheduled pipelines, because normally a push would trigger a new pipeline, causing an endless loop:

  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"

Finally, set a schedule to run the job once a day, at 00:42 UTC (just because 42 is the answer to the ultimate question).

Conclusion

So, there we have it: a personalised GitLab profile page which calls out projects that I'd like special attention on, as well as giving an overview about me. It also includes an automatically up-to-date list of my newer blog posts, made by a daily scheduled job which adds the blog's RSS feed.

For my next post, I'll show how this can be made more efficient by only triggering the profile update after actually making a post on my blog project.

Happy Hacking!

Links in this blog post

There are a lot of links in this post! To make it easier to find the reference which you might find helpful, I've listed the interesting ones below.

Personal README profiles in GitLab 14.5 release announcement
Add Details to your Personal Profile with a README (GitLab documentation)
GitLab Supported Markup Languages (GitLab documentation)
User profile customization via README.md in special repo (gitlab-org/gitlab issue 232157)
Testing Org-mode on my work profile
tonka3000@gitlab.com (example GitLab profile)
tonka3000@github.com (example GitHub profile)
milohax@gitlab.com (my GitLab profile)
https://gitlab.com/milohax/milohax/ (my GitLab profile project)
My notes for adding a profile with blog posts
My profile project's commit log
My blog-read.rb script
GitLab CI Pipeline schedule rules
My Stack Overflow answer for updating a GitLab repository from within a pipeline (see Use git inside your job)
- Personal Access Token (GitLab documentation)
- CI/CD pipeline variable (GitLab documentation)
Ruby container documentation (on Docker Hub)
Pipeline schedules (GitLab documentation)

Also, please up-vote My GitLab Songbook MR which adds Push It ;-)